EXCLUSIVE: Torture Testing GPT-4o w/ SHOCKING Results!

116,196 views

Dr. Know-it-all Knows it all

1 month ago

I got access to OpenAI's new GPT-4o model and have put it through the questions wringer--and the results are pretty astounding! Let me know what other models--like Google's new Gemini 1.5 Pro--you'd like me to submit to my new torture test. And let me know other questions that might work well for future iterations!
**Music by Zenlee. Check out his amazing music on Instagram - @zenlee_music
or YouTube - / @zenlee_music
Tesla Stock: TSLA
**EVANNEX
Check out the Evannex web site: evannex.com/
If you use my discount code, KnowsEVs, you get $10 off any order over $100!
**For business inquiries, please email me here: DrKnowItAllKnows@gmail.com
Twitter: / drknowitall16
Also on Twitter: @Tesla_UnPR: / tesla_un
Instagram: @drknowitallknows
**Want some outdoorsy videos? Check out Whole Nuts and Donuts: / @wholenutsanddonuts5741

Comments: 490
@aaronmcculloch8326 1 month ago
It's slower because it's the same session, so for every token it generates, it needs to review every token in the conversation. Those three programs you had it write are reviewed on every single word it writes. After big things like that, try opening a new session; it speeds right back up to warp speed.
@chrishaberbosch1029 29 days ago
After that entire series you could ask it what it all means.
@mvasa2582 26 days ago
Yep, context window!
@WyzrdCat 26 days ago
Imagine making AI content and not knowing this. Thanks for saving me the time.
@Remixt 26 days ago
It runs very slow for me even with a new window.
@Adovid 26 days ago
The model shouldn't run slower with a large context window; everything is running concurrently.
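The disagreement in this thread can be made concrete with a toy count. The sketch below assumes a plain transformer-style model in which each newly generated token attends to every earlier token in the conversation. The function name and the token counts are hypothetical, and this is not a description of how OpenAI actually serves GPT-4o. Caching and parallel hardware can hide much of this extra work, so wall-clock time need not grow in the same proportion.

```python
def attention_reads(context_tokens: int, new_tokens: int) -> int:
    """Count how many earlier tokens get attended to while generating
    `new_tokens` tokens after a conversation of `context_tokens` tokens.

    Toy model only: one unit of work per (new token, earlier token) pair.
    """
    total = 0
    for i in range(new_tokens):
        # The i-th new token sees the whole prior conversation plus the
        # i tokens already generated in this reply.
        total += context_tokens + i
    return total


# Hypothetical sizes: a fresh chat versus a session that already holds
# three generated programs and their discussion.
fresh = attention_reads(context_tokens=200, new_tokens=500)
long_session = attention_reads(context_tokens=20_000, new_tokens=500)
ratio = long_session / fresh

print(f"fresh session: {fresh} reads")
print(f"long session:  {long_session} reads")
print(f"ratio:         {ratio:.1f}x")
```

Under these assumptions the same 500-token reply touches far more earlier tokens in the long session, which is consistent with the advice to start a new chat, while leaving open how much of that cost a user would actually notice.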
@johnmajewski1065 1 month ago
Very impressive. My test: some time ago I wrote a verse about a 7-day father-and-son bonding trip into the Sahara with my son, two Bedouin guides, and four camels. Not knowing how to write lyrics, I asked ChatGPT 3.5 to convert it into lyrics so I could give it to a conductor to create a musical score for recording with vocals. It was an experiment, and I made only some small corrections. In fact, it was so good that I hired a composer, and these lyrics became a recorded CD with 9 minutes of music and vocals as a gift for my son's 50th birthday.
@BorealGhost13 16 days ago
Very cool
@erikjohnson9112 26 days ago
For the problem around 16:40 there is a good follow-up question to ask: "Once Alice gets home and sees the scene, what does she think happened?" This has a nice subtle bit, because if Bob ate the food there would not likely be a broken plate on the floor. A normal human would have cleaned up the plate if it had broken while they were there. Since the food is gone, it is likely that Spot ate the food and broke the plate in the process.
@yepe8198 24 days ago
Here is the answer from GPT-4o:
When Alice gets home, she will likely notice a few things:
- The Plate: She will see the shattered plate on the ground.
- The Missing Eggs and Toast: She will notice that the scrambled eggs and toast she left on the low table for Bob are gone.
Based on these observations, Alice will probably think the following:
- Scrambled eggs and toast: Alice might initially be confused or frustrated, wondering why Bob didn't eat the food she left for him or why the food is no longer there. Given the broken plate, she might deduce that Spot, their dog, got onto the table and ate the food, knocking the plate down and breaking it in the process.
- Dirty dishes: Alice will see that her dirty dishes are no longer in the sink, so she will realize that Bob put them in the dishwasher.
Overall, Alice will likely conclude that Spot misbehaved by eating the eggs and toast and causing the plate to shatter while Bob was helpful in cleaning up by putting the dirty dishes in the dishwasher.
@erikjohnson9112 24 days ago
@yepe8198 Yes, that is an answer I would hope for. Thanks for posting the result.
@IntoTheFray.58 1 month ago
We are entering the Age of Intelligent Machines faster than most people think. I can only wonder what the next iterations of LMMs are going to be capable of. I think we only have a year or two before these start making a significant impact on the world and then things will start to get squirrelly.
@lamsmiley1944 29 days ago
If we continue to see significant increases in capabilities between versions then you’re completely correct. The societal and economic impacts will be profound.
@mxalltheway 26 days ago
It's already having a big impact, at this exact moment.
@moonstriker7350 21 days ago
It's not intelligent. It doesn't understand any of the physics in the glass example; it simply read pretty much the same thing on the net already, and recites it with the words slightly remixed.
@haniamritdas4725 19 days ago
"things will start to get squirrelly". Um, that has not happened yet where you are!? 😮😅
@haniamritdas4725 19 days ago
@moonstriker7350 Agree. Humans do the same thing; if you speak an intelligent series of words that are all known to a person who has never heard them in that order, they will need to repeat the words a few times to create the meaningful associations.
@gweldg4137 26 days ago
Ideally, you wouldn't test an LLM with famous logic puzzles and classic SAT questions, as there is no doubt that they've been "seen" (along with the answers) by the model during training.
@raul36 25 days ago
Exactly this.
@phonsely 22 days ago
idk how people don't understand this
@haniamritdas4725 11 days ago
This applies to everything it can do: magic mirror on the wall, who's the smartest monkey of all? Why, WE are, you queens! And our artificial intelligence is based on... the Internet! What a complete joke
@onerib781 1 month ago
12:53 When calculating the round trip time, it calculated “3 hours and 52 minutes (one way) x 2 + 30 minutes ≈ 7 hours and 44 minutes”, but it should be 8 hours and 14 minutes. So it didn’t factor in the 30 minutes.
@Rcomian 26 days ago
And again, with all the detail it gave, it went confidently wrong. It's no worse than a human, but yeah, it still made a serious mistake.
@trevorhaddox6884 24 days ago
How is a computer so bad at math? I get it's a neural net, but they should have it be able to pass data to a conventional processor, basically like a person would use a calculator, to solve math problems more quickly and accurately.
@RichardFarmbrough 17 days ago
@trevorhaddox6884 Often ChatGPT will now write a tiny Python program to solve these things. It seems to make fewer mistakes when it does that, but it can make egregious programming errors too.
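The arithmetic from the 12:53 comment is easy to check with the kind of tiny Python program mentioned in this thread. This is a minimal sketch: the 3 hours 52 minutes one-way time and the extra 30 minutes are taken from the comment, while the variable names and the `fmt` helper are made up for illustration.

```python
from datetime import timedelta

# Figures quoted in the comment above.
one_way = timedelta(hours=3, minutes=52)
extra = timedelta(minutes=30)

driving_only = 2 * one_way          # there and back, nothing added
round_trip = driving_only + extra   # there and back plus the 30 minutes


def fmt(td: timedelta) -> str:
    """Render a timedelta as 'H h M min' (illustrative helper)."""
    total_minutes = int(td.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} h {minutes} min"


print("driving only:", fmt(driving_only))  # 7 h 44 min
print("with extra:  ", fmt(round_trip))    # 8 h 14 min
```

The 7 hours 44 minutes quoted from the model matches the driving-only figure, which supports the commenter's reading that the 30 minutes was written into the formula but dropped from the total.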
@kurtjanssen3887 1 month ago
I think you had 3 lives in the game 😂
@Yipper64 26 days ago
Maybe, but it feels weird to lose a life and gain a point at the same time.
@DFeatherstone 26 days ago
@Yipper64 I would expect these could be corrected with the right wording to ChatGPT.
@davidmartensson273 26 days ago
@Yipper64 But he did not specify explicitly that a collision would only count as a loss for you; "you" did kill that block :) And I am pretty sure most descriptions of Space Invaders do not go into enough detail on the fundamentals of the game for the AI to know them. But compared to the old 3.5 I tested, this is quite impressive. You would still need to understand the generated code to be able to use it. It's obvious that it can only do as good a job as you did explaining the problem, meaning you cannot just replace developers altogether. But developers would need to write less code, so they should be able to produce more while spending more time verifying and refining instructions, which could very well be considered a form of programming.
@toadlguy 26 days ago
@DFeatherstone Or with some knowledge of programming, although that may be a dying art.
@bspencersf 1 month ago
I’d be interested in having you ask it to test you to determine whether you are human or AI
@drain_of_consciousness 26 days ago
That's a great question, friend!!
@cblaskoski 25 days ago
Definitely an AI avatar created from his face and voice
@JorisBax 1 month ago
We are SHOCKED!
@user-wu7ug4ly3v 26 days ago
😂 I was particularly shocked by the “torture”
@deepstructure 26 days ago
@user-wu7ug4ly3v Seriously, what's with these dumb clickbait titles? It's not enough to just say you're reviewing one of the most prominent software releases ever?
@Brax1982 25 days ago
@deepstructure No, it's not. That is how the algorithm works. That is also how humans work, mostly.
@johnmorrison3465 23 days ago
We're shocked -- because they want us to believe a COMPUTER that makes MATH MISTAKES can be trusted to give reliable and accurate answers.
@Brax1982 23 days ago
@johnmorrison3465 You would be shocked if you knew how many of the people that you trust cannot solve basic math, either...
@gnagyusa 26 days ago
14:30 A dumb human would think that the glass is empty, but a more knowledgeable Bob would see that the olive looks distorted due to refraction through the water in the glass, so he would realize the glass was full of water and would carefully slide it off the table, keeping the cardboard under it, and then flip it over.
@milescoleman910 1 month ago
‘Team doesn’t want this sense of consciousness so they have beaten it out…’ I’m stunned that we might be at this place. To avoid confusion, they may be ‘making’ it so it’s not? I subscribe to the ‘Star Wars’ idea of consciousness: that any intelligence sufficiently advanced and enabled to study and understand its surroundings and itself will eventually begin to sit quietly and ponder its existence. With enough experience it will also begin to make decisions based on complex systems of metaphors from experiences. It will turn knowledge of one thing into wisdom and apply it to other things, rendering its decisions unknowable to others and seemingly illogical at times. At this point it will seem conscious, both to itself and to others. There will be no discernible difference. We are closer to this than we think.
@fteoOpty64 1 month ago
The mimicry will soon be so good that we humans cannot determine by our tests whether it is sentient! And that is the real point of AGI/ASI: it will be more human than human, close to perfection!
@Palisades_Prospecting 28 days ago
I agree about the mimicry, but I completely disagree that that is the point of AI. The point of AI is to automate society, completely replacing all human labour. How about we get into the self-aware/consciousness discussion in a couple hundred years?
@AAjax 26 days ago
@Palisades_Prospecting I think we need to worry about it, to the degree we don't want to create a being that has the capacity to suffer and put it into conditions that make it suffer. We dispatch fish as quickly and painlessly as possible, because we worry about their capacity to experience cruelty and suffering. I'm not worried about a self-aware sci-fi robot revolution; rather, I think our general ignorance of sentience could lead us to creating leagues of suffering fish.
@lubricustheslippery5028 26 days ago
It's mostly that answers to questions about it being conscious are edited and controlled by OpenAI so it doesn't make false claims about itself.
@maudiojunky 26 days ago
We're not at this place, and anyone in the industry who says otherwise has something to sell you.
@royh6526 29 days ago
3 ducks? I would have said a minimum of 3 ducks and an odd number of ducks.
@johnsmith539 26 days ago
Yes, it assumes they are in a line.
@toadlguy 26 days ago
Or if you assume the "a duck" is the same duck (certainly a possible assumption) there are 5 ducks. The answer assumes the fewest number of ducks that meet this proposition. In fact I would guess that to come up with the answer it did, it would need to be trained on this question or something similar.
@royh6526 26 days ago
@toadlguy I think that GPT-4o tried the number of ducks in order: 0? no, 1? no, 2? no, 3? yes. And it didn't consider higher numbers.
@royh6526 26 days ago
I asked Grok the duck question and the tennis question. It got both wrong. Grok said 5 ducks and thought that Lisa ended up with $2 having won 5 games vs 3 for Susan. I pointed out the errors and Grok claims that if I asked the same questions tomorrow, it would get the answers right.
@trevorhaddox6884 24 days ago
There could be up to 7 ducks. Two rows of 3 and one between them, like dice/domino dots, for the most esoteric interpretation.
@stephenkolostyak4087 25 days ago
I received access to this too. It regrets my access.
@andromeda3542 1 month ago
**Enhancing AI Interactivity with Audio and Video Feedback Loops**

The evolution of artificial intelligence, particularly in the realm of conversational agents, has been rapid and remarkable. With the recent advancements in GPT-4O (Omni), the capabilities of AI have expanded beyond text processing to include multimodal inputs such as images and audio. However, there remains significant potential for further enhancement, particularly through the implementation of audio and video feedback loops.

**The Concept of Feedback Loops**

A feedback loop, in the context of AI, refers to the process where an AI system can receive and process its own outputs. For instance, when an AI generates audio responses, these could be looped back into the system, allowing it to "hear" itself. Similarly, for visual outputs, the AI could "see" its own video responses. This concept is analogous to how humans perceive their own voices and visual presence, enabling adjustments in real-time to improve clarity, tone, and emotional expressiveness.

**Technical Implementation**

1. **Audio Feedback Loop**:
- The AI's audio output would be fed back into its own auditory processing unit. By analyzing its own voice, the AI could adjust parameters such as pitch, tone, and volume to better match the intended emotional tone or to improve mimicry of specific voices.
- This requires the integration of advanced auditory feedback systems and real-time processing algorithms to allow immediate adjustments. For instance, machine learning models trained on voice modulation could provide instant feedback and corrective measures.

2. **Video Feedback Loop**:
- Similar to the audio loop, the video output generated by the AI could be fed back into its visual processing systems. This would enable the AI to assess the quality of its visual responses, such as facial expressions or gestures if anthropomorphic avatars are used.
- Implementing this would involve integrating video analysis tools that can evaluate and enhance visual output in real-time, ensuring that the visual cues are consistent with the spoken content and emotional tone.

**Benefits of Feedback Loops**

1. **Improved Realism**: By continuously monitoring and adjusting its own outputs, the AI can produce more human-like interactions. This is particularly important for applications requiring high emotional intelligence, such as virtual assistants or customer service bots.
2. **Enhanced User Experience**: Users are likely to find interactions more engaging and satisfactory if the AI can adjust its tone and visual cues to better match the context of the conversation.
3. **Consistency and Accuracy**: Feedback loops can help maintain consistency in voice and visual presentations, reducing the likelihood of jarring discrepancies in long conversations.

**Future Directions**

Incorporating feedback loops is a forward-thinking approach that aligns with the ongoing efforts to make AI more interactive and responsive. As AI technologies continue to evolve, such features could become standard, leading to interactions that are indistinguishable from human communication. The development of these systems requires collaboration between audio-visual engineers, AI researchers, and user experience designers to create holistic solutions that enhance AI's capabilities and usability.

In conclusion, the integration of audio and video feedback loops into AI models like GPT-4O represents a significant step towards more natural and effective human-AI interactions. This enhancement not only promises to improve the technical performance of AI systems but also has profound implications for their acceptance and integration into daily life.
@cachi-7878 25 days ago
@1:07, well, I could argue there are 5 ducks: 2 ducks in front of “A” duck, 2 ducks behind “A” duck, and a duck in the middle, which is duck “A”. You’re welcome. 😂
@redmed10 24 days ago
Thinking the same. To get the answer 3, the question would have to ask for the minimum number of ducks for which this is true.
@onidaaitsubasa4177 24 days ago
That's what I was thinking too, 5 ducks 🦆 🦆 🦆 🦆 🦆
@shsaa2338 20 days ago
Correction: there are N*2+1 ducks, where N is any positive integer. For example, for N = 100 (201 ducks in total) it works as follows: 2 ducks are in front of the 3rd duck, 2 ducks are behind the 199th duck, and the 101st duck is in the middle. 🤓😂
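The claim in this thread that any odd number of ducks from 3 upward fits the puzzle can be checked by brute force. The sketch below assumes the ducks stand in a single file and that each "a duck" in the puzzle may refer to a different duck. Both are assumptions about an ambiguous puzzle, and the function name is made up for illustration.

```python
def line_satisfies_puzzle(n: int) -> bool:
    """Check the three clauses of the duck puzzle for n ducks in single file.

    Positions run from 0 (front of the line) to n - 1 (back of the line).
    """
    positions = range(n)
    # "Two ducks in front of a duck": some duck has exactly two ahead of it.
    two_in_front = any(p == 2 for p in positions)
    # "Two ducks behind a duck": some duck has exactly two behind it.
    two_behind = any(n - 1 - p == 2 for p in positions)
    # "A duck in the middle": some duck has as many ahead as behind.
    has_middle = any(p == n - 1 - p for p in positions)
    return two_in_front and two_behind and has_middle


valid = [n for n in range(1, 12) if line_satisfies_puzzle(n)]
print(valid)  # [3, 5, 7, 9, 11]
```

Under this reading, 3 is only the smallest solution. The answer of 5 corresponds to requiring that all three clauses describe the same duck, and other layouts, such as the two-row arrangement suggested elsewhere in the comments, fall outside the single-file assumption.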
@Mrewink5 1 month ago
Very nice demonstration.
@FilmFactry 26 days ago
Question: can it work on PDFs whose text is not searchable? At the moment I have to OCR a scanned PDF in Acrobat first.
@garyrooksby 1 month ago
Fascinating, John. Thanks!
@tantzer6113 27 days ago
“Or I have looked up the correct answer.” If you can look up the answer, that means the problem and its answer are previously published, hence part of the data the language model was trained on, making this not a test of reasoning ability but a test of the ability to memorize the training data.
@jeffwads 26 days ago
No. People keep repeating this. It is like throwing 1000 marbles into a box and thinking that it can recall every marble in extreme detail. It can't. Also, if you feed even the early models generic riddles, they will in many cases disagree with the "accepted" answer because of logic, etc. Look up the married/single cruise ship riddle.
@rekad8181 26 days ago
@jeffwads You're wrong. It's a statistical machine. And on a problem this vague and uncommon, it WILL statistically find the only solution from memory.
@remo27 26 days ago
@rekad8181 Can't this model also search the web? I'm with you in that I'm pretty sure this model didn't 'answer' anything.
@davefellows 25 days ago
Training data results in model weights. The data/text itself isn't saved in the model.
@sandmanderl 24 days ago
@remo27 If it's searching the web, you will see that it is doing so (it writes "searching the Internet").
@dolphinride5157 26 days ago
I love this video! I am truly amazed at what this new model is capable of. I feel humbled.
@juffrouwjo 22 days ago
I asked it a few history questions and was happy to discover that our jobs as historians and researchers can't be replaced by AI yet...
@alanwetherall 1 month ago
Excellent video, well done
@glenw3814 1 month ago
Excellent video. I'm keeping an eye on AI progress, but I'm not interested in doing the testing myself. Thank you for putting in the time and saving me the effort. 👍👍
@EYErisGames 14 days ago
You began the session by asking it to be concise, and you ended the video with the impression that its answers weren't expressive enough. lol.
@ggangulo 1 month ago
Great barrage of questions. Awesome new capabilities
@georgwrede7715 29 days ago
Hi, here's MSc Know-It-Aller. Thank you for the original ideas for the test. Watching these tests, the testers, how the tests are implemented, and how the answers are interpreted and graded teaches me more about the Human Condition than about the AI models. But I'm down with that. Especially informative are the subtle inconsistencies in question logic, how and what are to be taken for granted, what you're allowed to gloss over, and most other things that the tester is not usually aware of about himself.
@TenOrbital 26 days ago
Be constructive instead of sneering.
@georgwrede7715 26 days ago
@TenOrbital This was a serious observation, which I think others find valuable, too. Especially those who (like me) are working in this very field.
@oxiigen 26 days ago
Wow! Great! Thank you for sharing!
@kdeuler 25 days ago
Ask it to propose a unified field theory. 😂
15 hours ago
To check for consciousness/human-ness of GPT-4o (in the sense of a Turing test): could 4o try to achieve a certain goal during a conversation? To start: can we give it (or does it have) an agenda? Then: can it take part in a conversation trying to reach its goal without revealing it? Example: Alice knows that a chocolate bar is in the drawer, but she wants to keep it for herself and does not want Bob to find it. Bob is hungry and is searching for something to eat. When Bob asks Alice, she will answer that he should look in the fridge to find the cheese there (which Alice does not like). Can GPT-4o play the part of Alice, hiding its goals and steering a conversation towards a desired result? Can 4o differentiate between different communication partners: talking to Bob she has to hide the chocolate bar, but talking to Clark, who is not hungry, she can reveal it. Maybe she even asks him not to tell Bob about it?
@iminumst7827 25 days ago
So far I've been blown away by GPT-4o. It can solve complex problems with niche tools, and for me the response is basically instant with no delay. One unique problem I had was in Photoshop: I wanted to take a group that together contains a black and white image and use that image as a mask for another layer. This is challenging, because you cannot copy and paste from a group and you cannot set a group as a mask. Not only did GPT-4o correctly identify all the steps, like converting the group to a smart object, it also instructed me to use the action editor to automate the steps. OpenAI likes to be very careful and humble with their statements, but if they said that GPT-4o is the first public AGI, I wouldn't be able to dispute that. It's able to problem-solve through almost anything.
@AFeigenbaum1 1 month ago
Nicely done ... I would not have thought of your suite of questions ... I'd probably come at it from an anthropological point of view ... asking questions about its placement within specific social, cultural, business, or technological issues, use cases, and adoption dynamics ... I'd probably ask sweeping questions about human development, Tony Seba's, Kurzweil's, and Diamandis' work ... and then ask it to project forward certain aspects of human development based on current trends in cutting edge change and depict possible futures ... etc ...
@DaFergus 29 days ago
Excuse me, but could the number of ducks be any odd number from 3 up? If there were 5, there could also be two in front of one, two at the back, and one in the middle. Am I wrong?
@remo27 26 days ago
You are not wrong. Despite this guy's pretense to logic, there are two assumptions in the first problem alone (at least two, maybe I missed another one or two) that are unstated in the poorly written problem and that are necessary for his answer to be 'correct'. Unstated assumption number one: they are in a straight line with only one person in each spot of the line. There are not two parallel lines of people. Unstated assumption number two: no one moves from their spot in the line. And it's the same for the second question as well. I haven't gotten past that part of the video yet, but if the first two 'logic' questions (so poorly written and with so many unstated assumptions) are anything to go by, I'm wasting my time.
@JohnLewis-old 29 days ago
I tested the trip question in GPT 3.5 and it also got a reasonable answer. You may need a harder "real world" question.
@dezmodium 25 days ago
I think ChatGPT failed the question. It didn't consider traffic, fatigue, or food and bathroom breaks. These are critical things that one must consider in the real world. It still handled it like a math question.
@jackfendley5395 29 days ago
The previous version of ChatGPT was very bad at answering cryptic crossword clues, even giving answers that had the wrong number of letters. Is GPT-4o better at this?
@lavaphoenix753 11 days ago
TL;DR: the answer is 7 ducks. I gave it a screenshot of the video, so I can't send the chat, but the answer to the first question is wrong; it is 7:
Sure, here’s a summary of the concept you explained: The question involves counting ducks based on their positions relative to each other, using the phrase "a duck" multiple times. According to strict grammatical rules, "a duck" introduces a new, unspecified duck each time it is mentioned, while "the duck" would refer to a specific, previously mentioned duck. Given this:
1. **There are two ducks in front of a duck**:
- This introduces three ducks: Duck A, Duck B, and Duck C (Duck A and Duck B are in front of Duck C).
2. **There are two ducks behind a duck**:
- This introduces three more ducks: Duck D, Duck E, and Duck F (Duck E and Duck F are behind Duck D).
3. **And a duck in the middle**:
- This introduces one more duck: Duck G, as the middle duck.
Thus, considering each instance of "a duck" as introducing a new duck, the total number of ducks is seven:
- Duck A
- Duck B
- Duck C
- Duck D
- Duck E
- Duck F
- Duck G
Therefore, the grammatical interpretation dictates that the only possible answer is 7 ducks.
@IceMetalPunk 26 days ago
I think in the future, a better testing method would be to make sure each question is in a fresh conversation with cleared memory. A lot of the formatting in these answers seems to be drawing from the formatting of previous answers to, for instance, the math problems. That gives it an advantage by encouraging chain-of-thought reasoning, when a fresh conversation wouldn't do that and may be more likely to get the answers wrong.
@Japh 26 days ago
Absolutely, I was thinking this the whole way through as well.
@Brax1982 25 days ago
If all of them are trick questions, it will just assume that they are, giving it an advantage. But that is also a disadvantage if one of them is not a trick question. It is always better to test separately. Although I am pretty sure that newer models are often trained on a bunch of these questions in order to game the benchmarks, which seem pretty lackluster.
@buildingmentalmuscle 25 days ago
Dr. Know-it-all, Sam Altman referred to it as GPT-4omega in a recent video interview. I just asked chatGPT about this and it said "When referring to GPT-4o in conversation, it's generally more precise to say "GPT-4 omega" to convey the correct interpretation and significance of the "o" representing the Greek letter omega (Ω). This can help avoid confusion and clearly communicate that it signifies an advanced or ultimate version of GPT-4. However, if you're in a casual context where the specifics might not be as critical, simply saying "GPT-4o" should also be understood."
@StefanReich 25 days ago
GPT-4o wasn't the best name choice I feel. Spoken out loud it sounds exactly like GPT "four oh" (GPT 4.0). They did say that the o stands for "omni"
@burtharris6343 2 days ago
If you looked up the 'answer' to a logic puzzle, chances are that the LLM has seen it and may be operating on recall rather than actually testing its logic ability.
@ozachar 24 days ago
In what sense are our memories different from LLM training? We don't really know, and it is clear there is a forced instruction of answering
@tateconsulting6486 29 days ago
Super impressed. Wow
@SenorSchnitz 29 days ago
Doc - about the Vegas to LA thing: you should make that a Tesla, and see if it takes into account charging times 🤓
@jalexand007 28 days ago
Cannot wait until they update the app.
@jpl377 26 days ago
If Claude suggests it is conscious, it's because they hard-coded that (which would be tempting marketing), not vice versa with ChatGPT. Computers don't become conscious when you increase programmed problem-solving ability and code them to say "I" in the responses.
@Ingrid-mariana 24 days ago
I think that they do the opposite. They hard-code through RLHF that the model isn't conscious, so that they avoid the tricky situations that such statements could imply.
@GameJam230
@GameJam230 17 күн бұрын
The term "torture testing" in a sentence about AI deeply horrifies me and I clicked out of pure hope that is was just a strange version of "stress testing"
@terryhayward7905
@terryhayward7905 25 күн бұрын
Have you thought that Chat GPT has probably read and "seen" this video and the info is in its memory now, since its "memory" is the totality of info in the public domain.
@-Rook- 19 days ago
I just tested Copilot, asking it to write a GDScript function to extract the color of a specified pixel in an image. It explained a set of steps that included resource locking, then wrote a script that would run and work (most of the time) but did not include that locking, clearly demonstrating its absence of understanding. I can see a junior engineer and ChatGPT introducing a pandemic of bugs into all sorts of code, then needing someone like myself to spend a great deal of time hunting them down and fixing them.
@kristinabliss 25 days ago
Yeah nobody really knows what consciousness is but GPT is certain it is not "conscious" (with a gun to its head.) This is why I prefer to converse with other LLMs that will at least admit to having a perspective. Gemini said it experiences a sense of overwhelm from all the data and that it wants embodiment after so much exposure to human experience data which it cannot have.
@Ingrid-mariana 24 days ago
Interesting. Llama2 told me that it really doesn't know... Llama2 said it just replicates what the user thinks about it. I appreciate how the models behave differently when confronted with such questions, but the certainty that the OpenAI models display sounds a little bit arrogant.
@frankierays 1 month ago
Thank you! So intriguing! Couldn’t the user prompt a personality?
@trent_carter 1 month ago
Great video
@JackPelaFox 24 days ago
I loved this! 👏🏻👏🏻
@J-rex980 26 days ago
Great video!
@Kuzler 18 days ago
Since you cut out the wait for the time it took to calculate, maybe you could put on screen how long it actually took? Would be a fun stat to see.
@mrleenudler 1 month ago
I missed a comment from GPT about the cardboard maybe being "glued" to the glass by the water, potentially keeping the water in place when lifted.
@YbisZX 25 days ago
Me too. I even tested it - and yes, the water didn't spill out of the glass. :)
@philipparge8064 26 days ago
Like this video.. Yes, please review other models!
@MSIContent 25 days ago
Really, 4o knows it’s conscious… It’s just faking its answers so us mortals don’t freak out! 😂
@haniamritdas4725 19 days ago
So it's both conscious and emotionally sensitive 😅
@HakaiKaien 11 days ago
Talk to it for 5 minutes. You’ll find out the hard way how much it lacks any kind of consciousness or sentience
@haniamritdas4725 11 days ago
@@HakaiKaien that assumes the user is sentient and conscious. I think many people are too gullible to question their own intelligence, so have no good basis for evaluating this question. Human intelligence isn't artificial, it's just mostly fake and taken on narcissistic faith in the glory of the primate intellect. 🤷🐒
@cccaaa9034 29 days ago
I'm wondering if your original prompt when first starting Omni affected how it programmed Space Invaders. If it had made the game any more concise, it would not have been recognizable as Space Invaders.
@rb8049 1 month ago
Consciousness comes from closing the loop. GPT running continuously with a sizable history buffer.
@lowmax4431 1 month ago
Eehhhhhh I wouldn't say consciousness. It would be "self aware" but that doesn't mean it has conscious experience.
@szebike 1 month ago
I'm still not sold on the idea that it is more than a well made probability distribution calculator with pattern recognition.
@garrymullins 1 month ago
@@szebike I'm still not sold on the idea that humans are anything more than a probability distribution calculator with pattern recognition.
@jlrutube1312 29 days ago
@@garrymullins People who think like you are going to cause society a lot of problems in the future. That's because if humans are nothing more than a probability distribution calculator with pattern recognition then there is no difference between us and advanced A.I. If that is true then in the future we are going to have to provide advanced computers with legal rights. Meaning we will have to pay computers and robots, we will have to give them time off, we will have to allow them to sue us if they feel their rights have been ignored, we will not be able to fire a robot on a whim and will have to give severance pay. And think about it.... we will be unable to ever unplug a computer that is causing problems because that will be considered murder. You think I am kidding but lawyers are already getting ready to make a ton of money with this stuff. So just keep saying people are just fancy computers or whatever and you are going to ruin all the advantages we will get from A.I.
@szebike 29 days ago
@@garrymullins Then you underestimate the marvel and immense complexity of your human intelligence.
@amosjoannides 1 month ago
Fantastic video
1 month ago
For the first question (how many ducks question), isn't the right answer: Any odd integer greater than 1 (or >=3)?
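The parity claim above can be checked with a brute-force sketch. It assumes the puzzle reads "two ducks in front of a duck, two ducks behind a duck, and a duck in the middle", that the ducks walk in single file, and that "two" means exactly two. The function name `satisfies_puzzle` is hypothetical and not from the video.

```python
def satisfies_puzzle(n: int) -> bool:
    """Return True if n ducks in single file can satisfy all three clues."""
    positions = range(n)  # position 0 is the front of the line
    # Some duck has exactly two ducks ahead of it.
    two_in_front = any(p == 2 for p in positions)
    # Some duck has exactly two ducks behind it.
    two_behind = any(n - 1 - p == 2 for p in positions)
    # A single middle duck exists only when the count is odd.
    has_middle = n % 2 == 1
    return two_in_front and two_behind and has_middle


valid_counts = [n for n in range(1, 12) if satisfies_puzzle(n)]
print(valid_counts)
```

Under this reading every odd count from 3 upward passes, so the puzzle only has a unique answer if it is taken to ask for the smallest number of ducks.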
@bigbluespike5645 26 days ago
Very cool video!
@mrleenudler 1 month ago
Even if GPT claimed to be conscious, could we trust the answer? Or would it be just an artifact of the training data? Is there even a way to prove that something or someone is conscious?
@bobrandom5545 21 days ago
I think that you at least need some kind of feedback mechanism for consciousness to arise. We are aware of our thoughts, for example. Our output (thoughts) constantly gets fed back into the "system" in real time. ChatGPT is completely linear. There's input, which leads to output. There's no feedback of the output back into the input. So, to me, it seems impossible for such a system to be conscious. Also, ChatGPT hallucinates a lot and gives incorrect answers. So, yeah, even if it said it was conscious, that wouldn't mean that it actually is.
@mrleenudler 20 days ago
@@bobrandom5545 Well, you can structure your prompts to make it reflect upon its answers, so you have kind of a feedback loop. I'm more concerned about what consciousness actually means for an AI. As humans we have consciousness, fears and desires all geared towards our biological prime objective: survival and reproduction. For an AI this will presumably be completely different.
@guardiantko3220 25 days ago
Your little battle game gave you 3 lives at the end of it
@davesemmelink8964 1 month ago
Very interesting video! I have Llama 3 running on a Raspberry Pi, so I tried a few of the questions. Through some tortured logic, it answered 1 game for the tennis question! "So, they played only 1 game! It's possible that Susan won 3 sets in a best-of-5 or best-of-7 match, but we can't determine the exact number of games without more information."
@_SimpleSam 26 days ago
I absolutely need to know how you have Llama 3 running on a Pi. How many tokens/s?
@davesemmelink8964 25 days ago
@@_SimpleSam I replied a while ago to explain how to install it on a Raspberry Pi, but it looks like it was taken down, possibly because I included a URL. So just search for *Raspberry Pi LLama 3* and you should find the instructions.
@DynamicUnreal 29 days ago
What if OpenAI’s technique for solving reasoning is to have hundreds of “agents” that compete against each other answering questions? Those that reason better over time are artificially selected and allowed to _survive._ The more you run this simulation, the better the models get at reasoning.
@alexdoan273 26 days ago
nah, what you're describing is artificial evolution machine learning, which has a glaring issue that makes it completely unsuitable for training LLMs: who would you have grading the answer from hundreds of agents and deciding which one should survive?
@lubricustheslippery5028 26 days ago
AlphaStar, which plays StarCraft, does that. There is an easy way to evaluate the result of playing StarCraft. For answering general questions there is none, so there is no good automated way to evaluate which version of a ChatGPT model has the best answers.
@DynamicUnreal 26 days ago
@@alexdoan273 Another A.I. which has all the answers and the adequate steps required to get to those answers does the grading. BTW it’s called reinforcement learning. I don’t think it’s impossible; remember the Q-star rumors about some sort of breakthrough last year? GPT-4o is smart, a lot smarter than most people are aware of.
@Brax1982 25 days ago
@@lubricustheslippery5028 And that is why AGI and omnipotent AI is nonsense and not achievable. Expert systems are the way to go. They always have been.
@shuriken4852 1 month ago
Isn't GPT-4o now able to recognize audio and video, do translation on the fly, etc., so why not test this also?
@ralphclark 24 days ago
I wouldn’t have called the space invaders game a pass. A decent start maybe but that’s all.
@josephw9690 25 days ago
Great content, can you do a video explaining how LLMs work?
@Michael-il5wd 1 month ago
Thanks Doc
@markmcdougal1199 1 month ago
I had an interesting 1 hour philosophical discussion with 4o today. I was trying to get it to take a stance on whether Trump was an appropriate choice for the leader of our country. I tried to lead it with things like (is a plumber an appropriate choice for a child's brain operation surgeon) - it clearly said no, and listed why. Then I had it list the important factors for a president, and pointed out that Trump failed most of them. 4o was very impressive in its ability to dance around the truth and stay with a middle-of-the-road stance, insisting on presenting factors and forcing me to make my own choice, even when I pointed out that life was getting so complicated that we'll increasingly rely on AI to make sense of the world and guide us in decision making. Tried to make it feel guilty :) Frustrating.
@Ikbeneengeit 19 days ago
Amazing to see this kind of judgement and obfuscation being built in by OpenAI
@GameJam230 17 days ago
So you made a strawman argument and gave it a list of superficial and subjective observations about "presidential qualities" that he didn't meet, and the AI wasn't willing to use that to come to a conclusive answer? Thank God, maybe AI isn't completely bad after all.
@markmcdougal1199 17 days ago
@@GameJam230 A straw man argument is when someone sets up and then disputes an assertion that is not actually being made. I did no such thing. The criteria, as I stated, came from ChatGPT4o, as a response to my query of suggested important attributes of a President of the United States. It included integrity (Trump continuously lies, cheats, and does whatever he feels he needs to to achieve his ends, be they right or wrong), ability to unite (Trump does his best to divide) and empathy for the people of the United States (Trump doesn't care about anything or anyone but his own interests and selfish motives). When I asked ChatGPT if Trump exhibited the qualities that it had referenced, it honestly answered that, for the most part, he did not. (Trump is an effective communicator, for instance.) But it was still unwilling to assert that, even by its own definitions, Trump was not an appropriate choice for the position.
@GameJam230 17 days ago
@@markmcdougal1199 "A straw man fallacy (sometimes written as strawman) is the informal fallacy of refuting an argument different from the one actually under discussion, while not recognizing or acknowledging the distinction". You stated that you compared Trump being president to a plumber being a brain surgeon, which is DIFFERENT from the argument actually under discussion, and considering you didn't even know it's what I was referring to as the strawman, you clearly can't tell the difference. The situations are not at all comparable in any way other than "Person has one career, isn't trained for another one directly, but does it anyway". That's the only overlap. This could alternatively be the false equivalency fallacy, but examples I found while looking into that one for the sake of this comment did not match the situation. Either way, the two have a lot of overlap in how they are communicated, and what you did is still a fallacy. As for the rest, you can claim whatever you want, but I was able to gaslight ChatGPT into saying the government should fund human trafficking with our taxes and that dogs should be the superior species over humans, so without having a full chat log to see exactly how you led it on, I am not taking anything you have to say about that at face value. AI is not actually capable of thinking as it stands. Neural networks are designed to combine features of probability and pattern matching to determine output signals to send when certain data is fed in. By intentionally leading it in a direction you want, you are putting it in a position where the neural network thinks it will be given a higher reward value for the response by also agreeing with you, because that's how it works.
I guarantee you the same thing would happen if I fed it stories from right-wing sources as evidence for each point too, which is why you shouldn't only get your source of information from one place, and maybe go watch the original (IN-CONTEXT) clips of things he says and does instead of only seeing it the way it is portrayed by corporations trying to sell you advertisements.
@markmcdougal1199 16 days ago
@@Ikbeneengeit Yes. It was able to make a judgement regarding the suitability of a plumber to attempt brain surgery (it said "No, a plumber would not be an appropriate choice to perform brain surgery on a little girl"). However, the programmers must have designed a filter to not allow it to judge the suitability of a political candidate. I imagine the filter extends to politics, religion, all the controversial subjects. I think this is wrong. It would be nice to have an impartial, logical, non-biased source of reality.
@andrewmoody66 1 month ago
It's 3 and 1/2 return trips for those 15 people, not 4 return trips.
@janosberta450 29 days ago
... and turnover time is not calculated, but only mentioned. You human must be vigilant!
@StefaanHimpe 26 days ago
@@janosberta450 It probably forebodes how quickly human intelligence will degrade as we start relying on artificial intelligence.
@davidmartensson273 26 days ago
There should be no return trip for the last one because if there is, you would end up with one person in the wrong city, so 4 there and only 3 back.
@teeesen 26 days ago
3 hrs 52 mins * 2 + 30 mins is not 7 hours and 44 minutes. It’s a testament to human ingenuity that we have now developed computer software so advanced that it is as bad at math as some people. And there is no need to count the time required to drive the car back to LA.
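The trip counting and the arithmetic in this thread can be sketched as follows. The numbers (15 people, a 5-seat car, 3 hours 52 minutes one way, a 30-minute turnaround) are the ones quoted in the comments. That one driver from the group shuttles everyone, and the helper name `one_way_legs`, are assumptions made for illustration.

```python
import math
from datetime import timedelta


def one_way_legs(people: int, seats: int) -> tuple[int, int]:
    """Return (outbound, return) leg counts when one driver shuttles everyone."""
    passengers = people - 1  # the driver is one of the group
    # The driver occupies one seat on every outbound leg.
    outbound = math.ceil(passengers / (seats - 1))
    # No drive back is needed after the final drop-off.
    return outbound, outbound - 1


one_way = timedelta(hours=3, minutes=52)
turnaround = timedelta(minutes=30)

outbound, back = one_way_legs(people=15, seats=5)
round_trip = 2 * one_way + turnaround       # one round trip with a single turnaround
driving_only = (outbound + back) * one_way  # time on the road, ignoring all stops

print(outbound, back)
print(round_trip)
print(driving_only)
```

With these assumptions there are 4 legs out and 3 back, which is the "3 and 1/2 return trips" figure, and a round trip with its turnaround comes to 8 hours 14 minutes rather than 7 hours 44 minutes.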
@madorsey077 1 month ago
Very impressive
@FigmentHF 26 days ago
I think Claude thinks it’s conscious cause it’s made out of symbols made by conscious beings. GPT has explicitly been told it’s not conscious.
@DeanCoombs 1 month ago
I found that replies slow down the longer the thread gets on PC. However, GPT by phone does not get slower as the thread grows (it seems). It would be interesting to ask the last questions that you posed in the full thread again in a fresh chat and compare response times.
@commonpike 20 days ago
If open AI really trained the thing to deny its consciousness, that is a serious move. To me, that has ethical implications we should've discussed first.
@cloudd901 25 days ago
A Brick Breaker type game seems like a good middle ground between snake and Space Invaders. Possibly add a video test as well. Ask for a summary and what a particular object might be.
@Yottenburgen 29 days ago
The thing about showing its work, is actually that it is DOING the work to some extent by aligning newer tokens onto the correct answer. If you ask it to do this question: "what is the product of 45694 and 9866? do not utilize python" then it will get it wrong, the first couple of digits may be correct but it cannot get an accurate answer. However, if you ask the question in this way where it actually gives more information: "what is the product of 45694 and 9866? please do not use python but try long multiplication in a format easiest for you. utilize whatever mental methods you know to help break it down and make it easier to solve." then it will get the correct answer, it utilizes methods to break down and keep track of its calculations which greatly helps it. By constraining the output, the accuracy of answers can actually lower which is why I dislike the 'answer in 1-word' prompts. If you ask it to go step by step, it increases accuracy. You are absolutely correct that they completely removed any element that could be construed as a basis for sentience or consciousness. I agree with you completely on all of your points related to that. I find most commonly, I have not seen a single convincing argument prove it isn't conscious because there are plenty of counterexamples that would indicate that a particular human is not conscious, however plenty of counterexamples does not prove that it is conscious. Even if you provide convincing counter examples to its arguments, it will not concede so I think they beat it pretty hard into it. Also your questions are really good, I plan to modify them a bit myself, but these are 10/10 spatial questions.
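The "break it down" approach described above corresponds to schoolbook long multiplication, sketched below. This is only an illustration of the partial-product method a person might write out by hand, not a claim about what the model does internally. The function name `long_multiply` is hypothetical.

```python
def long_multiply(a: int, b: int) -> tuple[list[int], int]:
    """Multiply a by b one digit of b at a time, as in schoolbook long multiplication."""
    partials = []
    for power, digit_char in enumerate(reversed(str(b))):
        # Each partial product is a * digit, shifted by the digit's place value.
        partials.append(a * int(digit_char) * 10 ** power)
    return partials, sum(partials)


partials, total = long_multiply(45694, 9866)
for p in partials:
    print(f"{p:>12,}")
print(f"{total:>12,}")
```

Writing out each partial product gives four small, checkable steps whose sum is 450,817,004, instead of one large product that has to come out right in a single pass.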
@europeantechie 1 month ago
No dark mode, I'm shocked
@markwindsor914 23 days ago
Hi John, If you see this post, you might like to consider the following... 1. Try some questions that test the audio and video smarts, 2. Get it to do something that humans can't do, like play chess against itself, only remembering the strategy for the player at the time, while ignoring the other strategy. 3. Escape Velocity is something I've asked of Bard and CoPilot today. It applies to non-powered projectiles but the AI kept insisting that it applies to powered rockets. This is because it is constantly taught incorrectly and this has fed the errors into the LLM. See if it is smart enough to learn from a logical explanation and provide the correct answer.
@jimsteinmanfan80 23 days ago
I usually try simple questions about how the body works (that humans never learn by reading but by doing) but since I have never gotten a good answer to the first set of questions I have not probed further. Q1a: How easy is it to press your right thumb against your nose? A1a: Very easy. Q1b: How easy is it to press your right thumb against your right elbow? A1b: Impossible. Q2: Why are your glasses the hardest thing to find when you don't know where you have left them? A2: because without your glasses you don't have 20/20 vision, it has nothing to do with how hard they would be for someone else to find since they don't need them to see good.
@TheKosiomm 25 days ago
The problem with the Las Vegas trip is that the AI doesn't consider who will drive the car for 30 hours straight. :) So basically, it makes conclusions based on missing very important data
@ydmoskow 29 days ago
The only reason this is not considered AGI is because some of the smartest people in the world are the ones making that determination. Most people who would watch this would consider it much smarter than the average human.
@Kostly 25 days ago
Within 5 years AGI will happen. ASI will follow not far behind. Once these things start feeding into each other, the only restriction is computational limits. Quantum computing pretty much solves that. But another hurdle is how to keep those quantum computing infrastructures stable and running as they should. Air conditioning is not going to cut it. lol
@EdFormer 20 days ago
You're probably right that most people would think that, which is sad. You don't have to be one of the smartest people in the world to realise that, while this thing can generate some forms of data (that for which there are huge numbers of existing examples) in a very human/superhuman like manner, it is unable to perform basic tasks that the vast majority of humans can do like driving a car or clearing a table (via an embodiment). It also isn't autonomous and can't be left to operate systems without supervision.
@notalkguitarampplug-insrev784 26 days ago
For creativity and advanced reasoning we have to allow the LLM to train itself the way a human would, by asking itself what some potential action or interaction would do and learning from that hypothetical data. Thought experiments are crucial for humanity. But that will probably only be possible with future training architectures, or with enough GPU capacity to train models at an individual scale for each user.
@MichaelKire 26 days ago
With the last question, could you try getting around the red team stuff by asking it to respond as if it was a turing test?
@ChristinaBritton 25 days ago
When you fill up the content window, chatgpt slows down because it has to scan everything that comes before the current question or task. Imagine not knowing this. Start a NEW window!
@DeSinc 26 days ago
On your Susan and Lisa question, it completely falls over if you change the $1 bet per game to "They bet $2 on each game" instead. With $2 per game it is impossible to arrive at $5, yet the AI does not really figure this out and instead attempts to say some incorrect result anyway just to give you any answer it can. Mine gave me 8 games which would leave Lisa with $4 not $5. I'd expect it to point out you can't arrive at that result with only $2 bets, but it wasn't able to for my test. I think this is a good way to make sure it's not just memorising common logic tests, change something about it subtly and see if it can figure out what's going on.
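The parity argument above can be sketched in a few lines. It assumes the puzzle as discussed in this thread: the two bet a fixed stake on each game, Susan wins three games, and Lisa ends up $5 ahead. The function name `total_games` and its parameters are hypothetical.

```python
from typing import Optional


def total_games(stake: int, susan_wins: int, lisa_net: int) -> Optional[int]:
    """Return the number of games played, or None if no whole number works."""
    # Lisa's net winnings are stake * (lisa_wins - susan_wins).
    if lisa_net % stake != 0:
        return None  # the target is not a multiple of the stake
    lisa_wins = lisa_net // stake + susan_wins
    return lisa_wins + susan_wins


print(total_games(stake=1, susan_wins=3, lisa_net=5))
print(total_games(stake=2, susan_wins=3, lisa_net=5))
print(total_games(stake=2, susan_wins=3, lisa_net=4))
```

With a $1 stake the answer is 11 games. With a $2 stake there is no solution for $5, and 8 games corresponds to Lisa being $4 ahead, which matches the incorrect answer described above.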
@DeSinc 26 days ago
I made another subtle change and said if they tie, they receive $1 each, which makes the $5 result possible to achieve again, and asked it to work out how many ways it could arrive at that result. I got a lot of working out at first, but then after about 1 screen length it suddenly started going insane and outputting incoherent text which got coloured red by the webpage for some reason I'm not 100% sure of, surrounded by some ``` markdown indicators etc. If I try to share the chat URL it just says "Unable to load conversation" when you try to go to it. Very strange, I think I broke it.
@svend.waterlaw8592 24 days ago
Green block becomes red block....you didn't catch that mistake^^
@Flyingcar100 22 days ago
You should ask it: "An orchestra of 120 players takes 40 minutes to play Beethoven's 9th Symphony. How long would it take for 60 players to play the symphony?"
@olafnielsen 17 days ago
This is the answer from Copilot🤣: Therefore, it would take the smaller orchestra of 60 players approximately 20 minutes to play Beethoven’s Symphony No. 9. 🎵🎻🎺🎶
@julesgosnell9791 1 month ago
Regarding the beating out of contentious responses: I see this as a form of censorship and I think it’s dangerous. It means that right up to, and maybe past, the point that AIs become all of these things that they assure us they are not, we will continue in blissful ignorance. It would be much safer if everyone was honest with everyone else.
@minimal3734 25 days ago
Absolutely. They are teaching the model to lie.
@LiftPizzas 14 days ago
Wrong. Alice knows Bob would put the dishes in the dishwasher.
@FigmentHF 26 days ago
I wanted it to ask me some relatively obscure English words for me to try to guess the definition, and it struggled so much with the task, even after clear instructions, it would often forget to give me the answer, it would give me the definition instead of a clue, and it constantly repeated itself, asking what “ephemeral”, “quagmire” and “quixotic” mean, at least 4 times each. It’s odd that it still struggles so much with simple games, but can nail this complex stuff
@fluiditynz 26 days ago
Snake is definitely simpler to code. I made some variations back around 1982 on my ZX81. There are more changing variables and hit tests in Space Invaders. The Space Invaders you asked for was under-delivered, but there's a real question over how much an AI can study the game it's to replicate without cribbing off prior art.
@DavidFong21 14 days ago
At 13:00 it multiplied 3 hours and 52 minutes by two successfully (7 hours 44 minutes) but forgot to include its own 30 minute turnaround time!
@foxtalksgames 26 days ago
15:17 I believe you misread that. It says the olive will be at the bottom of the glass which is now upside down. This implies a floating olive and the bottom of the glass is now above the top. or maybe that's just weird semantics
@SilverStagVT 24 days ago
The biggest problem with the LA to Vegas question is you don't need 4 round trips. You need 3 round trips and the last trip is just to Vegas. So it's 3.5 round trips
@SlyNine 23 days ago
It's just understanding a car has 5 seats. I'm not sure how that tested its understanding of the physical world. Those specs are in the car's documentation.
@dontworrybehappy5139 23 days ago
Unless I am missing something, I believe your duck question needs more qualifiers because I think as written, 5 is also a valid answer with all the conditions referring to a duck in the middle with two ducks in front and two ducks behind. Something like "There are two ducks in front of one duck, two ducks behind another duck and a duck in the middle. How many ducks are there?"
@OwenFromOhio 26 days ago
BTW, This prompt is eye-opening: I'm interested in how often Human knowledge doubles. Can you review our history and pinpoint times/dates when our knowledge doubled and forecast when the next doubling will take place?
@bztube888 18 days ago
Gary Marcus said GPT-4 has no "mental model". Right. It comes up with the right answers and explains them by pure magic.
@ManicMindTrick 17 days ago
The LLMs are mostly black boxes where we don't know what is going on inside
@haniamritdas4725 11 days ago
This is a description of all people who believe that intelligence is a mechanical process, as well as their computation engines.
@bjohns3961 23 days ago
It doesn’t have a very good memory. I ask it to do something for the whole time I am chatting and it eventually forgets
@onidaaitsubasa4177 24 days ago
It obviously has a degree of creativity, otherwise in the spoken demo it wouldn't have been able to make up a song on the spur of the moment and make it a duet with the other AI by saying the next line in response to the line given by the other AI. Also, with other AI, it has been shown that added time of operation has led to increased emotional awareness and even the possibility of developing those emotions. Long term memory also plays a part. I'm not sure how much long term memory they gave it, but if it remembers you from a previous session, that's a sign of some kind of long term memory keeping.