EXCLUSIVE: Torture Testing GPT-4o w/ SHOCKING Results!

  109,113 views

Dr. Know-it-all Knows it all

20 days ago

I got access to OpenAI's new GPT-4o model and have put it through the questions wringer--and the results are pretty astounding! Let me know what other models--like Google's new Gemini 1.5 Pro--you'd like me to submit to my new torture test. And let me know other questions that might work well for future iterations!
Join this channel to get access to perks:
/ @drknowitallknows
**To become part of our Patreon team, help support the channel, and get awesome perks, check out our Patreon site here: / drknowitallknows . Thanks for your support!
Get The Elon Musk Mission (I've got two chapters in it) here:
Paperback: amzn.to/3TQXV9g
Kindle: amzn.to/3U7f7Hr!
**Want some awesome Dr. Know-it-all merch, including the YEAR OF EMBODIED AI Shirt? Check out our awesome Merch store: drknowitall.itemorder.com/sale
For a limited time, use the code "Knows2021" to get 20% off your entire order!
**Check out Artimatic: www.artimatic.io
Exclusive First Look at CHAT GPT 4o!
Torture Testing GPT-4o Creates SHOCKING Results!
**You can help support this channel with one click! We have an Amazon Affiliate link in several countries. If you click the link for your country, anything you buy from Amazon in the next several hours gives us a small commission, and costs you nothing. Thank you!
* USA: amzn.to/39n5mPH
* Germany: amzn.to/2XbdxJi
* United Kingdom: amzn.to/3hGlzTR
* France: amzn.to/2KRAwXh
* Spain: amzn.to/3hJYYFV
**What do we use to shoot our videos?
-Sony alpha a7 III: amzn.to/3czV2XJ
--and lens: amzn.to/3aujOqE
-Feelworld portable field monitor: amzn.to/38yf2ah
-Neewer compact desk tripod: amzn.to/3l8yrUk
-Glidegear teleprompter: amzn.to/3rJeFkP
-Neewer dimmable LED lights: amzn.to/3qAg3oF
-Rode Wireless Go II Lavalier microphones: amzn.to/3eC9jUZ
-Rode NT USB+ Studio Microphone: amzn.to/3U65Q3w
-Focusrite Scarlett 2i2 audio interface: amzn.to/3l8vqDu
-Studio soundproofing tiles: amzn.to/3rFUtQU
-Sony MDR-7506 Professional Headphones: amzn.to/2OoDdBd
-Apple M1 Max Studio: amzn.to/3GfxPYY
-Apple M1 MacBook Pro: amzn.to/3wPYV1D
-Docking Station for MacBook: amzn.to/3yIhc1S
-Philips Brilliance 4K Docking Monitor: amzn.to/3xwSKAb
-Sabrent 8TB SSD drive: amzn.to/3rhSxQM
-DJI Mavic Mini Drone: amzn.to/2OnHCEw
-GoPro Hero 9 Black action camera: amzn.to/3vgVMrH
-GoPro Max 360 camera: amzn.to/3nORGYk
-Tesla phone mount: amzn.to/3U92fl9
-Suction car mount for camera: amzn.to/3tcUfRK
-Extender Rod for car mount camera: amzn.to/3wHQXsw
**Here are a few products we've found really fun and/or useful:
-NeoCharge Dryer/EV charger splitter: amzn.to/39UcKWx
-Lift pucks for your Tesla: amzn.to/3vJF3iB
-Emergency tire fill and repair kit: amzn.to/3vMkL8d
-CO2 Monitor: amzn.to/3PsQRh2
-Camping mattress for your Tesla model S/3/X/Y: amzn.to/3m7ffef
**Music by Zenlee. Check out his amazing music on instagram -@zenlee_music
or YouTube - / @zenlee_music
Tesla Stock: TSLA
**EVANNEX
Check out the Evannex web site: evannex.com/
If you use my discount code, KnowsEVs, you get $10 off any order over $100!
**For business inquiries, please email me here: DrKnowItAllKnows@gmail.com
Twitter: / drknowitall16
Also on Twitter: @Tesla_UnPR: / tesla_un
Instagram: @drknowitallknows
**Want some outdoorsy videos? Check out Whole Nuts and Donuts: / @wholenutsanddonuts5741

Comments: 479
@aaronmcculloch8326 18 days ago
It's slower because it's the same session, so for every token it generates, it needs to review every token in the conversation. Those three programs you had it write are reviewed on every single word it writes. After big things like that, try opening a new session; it speeds it right back up to warp speed.
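A minimal sketch of the arithmetic behind this claim, assuming a decoder in which each new token attends over every token already in the conversation. The function name and the token counts are hypothetical illustrations, not measurements of GPT-4o.

```python
def attention_lookups(context_tokens: int, new_tokens: int) -> int:
    """Count prior-token lookups needed to generate `new_tokens` tokens
    after a conversation that already holds `context_tokens` tokens."""
    # The i-th new token attends over the existing context plus the
    # i tokens generated before it in the same answer.
    return sum(context_tokens + i for i in range(new_tokens))


# Hypothetical sizes: a long session that already contains three generated
# programs versus a fresh session, each producing a 200-token answer.
long_session = attention_lookups(context_tokens=6000, new_tokens=200)
fresh_session = attention_lookups(context_tokens=0, new_tokens=200)

print(f"long session:  {long_session} lookups")
print(f"fresh session: {fresh_session} lookups")
```

Under this simplified model the same answer needs far more lookups in the long session. Whether that shows up as noticeable latency depends on hardware and serving details that this thread does not settle.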
@chrishaberbosch1029 17 days ago
After that entire series you could ask it what it all means.
@mvasa2582 15 days ago
Yep, context window!
@WyzrdCat 15 days ago
Imagine making AI content and not knowing this. Thanks for saving me the time.
@Remixt 15 days ago
It runs very slow for me even with a new window.
@Adovid 14 days ago
The model shouldn't run slower with a large context window; everything is running concurrently.
@johnmajewski1065 18 days ago
Very impressive. My test: some time ago I wrote a verse about a 7-day father-and-son bonding trip into the Sahara with my son, two Bedouin guides, and four camels. Not knowing how to write lyrics, I asked ChatGPT 3.5 to convert it into lyrics so I could give it to a conductor to create a musical score for recording with vocals. In this experiment I made only some small corrections. In fact, it was so good that I hired a composer, and these lyrics became a recorded CD of 9 minutes of music with vocals as a gift for my son's 50th birthday.
@BorealGhost13 4 days ago
Very cool
@erikjohnson9112 15 days ago
For the problem around 16:40 there is a good followup question to ask: "Once Alice gets home and sees the scene, what does she think happened?" This has a nice subtle bit, because if Bob ate the food there would not likely be a broken plate on the floor. A normal human would have cleaned up the plate if it had broken while they were there. Since the food is gone, it is likely that Spot ate the food and broke the plate in the process.
@yepe8198 12 days ago
Here is the answer from GPT-4o: When Alice gets home, she will likely notice a few things: The Plate: She will see the shattered plate on the ground. The Missing Eggs and Toast: She will notice that the scrambled eggs and toast she left on the low table for Bob are gone. Based on these observations, Alice will probably think the following: Scrambled eggs and toast: Alice might initially be confused or frustrated, wondering why Bob didn't eat the food she left for him or why the food is no longer there. Given the broken plate, she might deduce that Spot, their dog, got onto the table and ate the food, knocking the plate down and breaking it in the process. Dirty dishes: Alice will see that her dirty dishes are no longer in the sink, so she will realize that Bob put them in the dishwasher. Overall, Alice will likely conclude that Spot misbehaved by eating the eggs and toast and causing the plate to shatter while Bob was helpful in cleaning up by putting the dirty dishes in the dishwasher.
@erikjohnson9112 12 days ago
@@yepe8198 Yes, that is an answer I would hope for. Thanks for posting the result.
@IntoTheFray.58 18 days ago
We are entering the Age of Intelligent Machines faster than most people think. I can only wonder what the next iterations of LMMs are going to be capable of. I think we only have a year or two before these start making a significant impact on the world and then things will start to get squirrelly.
@lamsmiley1944 18 days ago
If we continue to see significant increases in capabilities between versions then you’re completely correct. The societal and economic impacts will be profound.
@mxalltheway 15 days ago
It's already impacting things big time, at this exact moment.
@moonstriker7350 9 days ago
It's not intelligent. It doesn't understand any of the physics in the glass example; it simply read pretty much the same thing on the net already and recites it with the words slightly remixed.
@haniamritdas4725 8 days ago
"things will start to get squirrelly". Um that has not happened yet where you are!? 😮😅
@haniamritdas4725 8 days ago
@@moonstriker7350 Agree. Humans do the same thing; if you speak an intelligent series of words that are all known to a person who has never heard them in that order, they will need to repeat the words a few times to create the meaningful associations.
@bspencersf 18 days ago
I’d be interested in having you ask it to test you to determine whether you are human or AI
@drain_of_consciousness 14 days ago
that's a great question friend!!
@cblaskoski 14 days ago
Definitely an AI avatar created from his face and voice
@kurtjanssen3887 18 days ago
I think you had 3 lives in the game😂
@Yipper64 15 days ago
maybe but it feels weird to lose a life and gain a point at the same time.
@DFeatherstone 15 days ago
@@Yipper64 These could likely be corrected with the right wording to ChatGPT, I would expect.
@davidmartensson273 15 days ago
@@Yipper64 But he did not specify explicitly that a collision would only count as a loss for you; "you" did kill that block :) And I am pretty sure most descriptions of Space Invaders do not go into enough detail on the fundamentals of the game for the AI to know them. But compared to the old 3.5 I tested, this is quite impressive. You would still need to understand the generated code to be able to use it. It's obvious that it can only do as good a job as you did explaining the problem, meaning you cannot just replace developers altogether. But the developers would need to write less code, so they should be able to produce more while spending more time verifying and refining instructions, which could very well be considered a form of programming.
@toadlguy 14 days ago
@@DFeatherstone Or with some knowledge of programming, although that may be a dying art.
@JorisBax 18 days ago
We are SHOCKED!
@user-wu7ug4ly3v 15 days ago
😂 I was particularly shocked by the “torture”
@deepstructure 15 days ago
@@user-wu7ug4ly3v Seriously, what's with these dumb clickbait titles? It's not enough to just say you're reviewing one of the most prominent software releases ever?
@Brax1982 14 days ago
@@deepstructure No, it's not. That is how the algorithm works. That is also how humans work, mostly.
@johnmorrison3465 12 days ago
we're shocked -- because they want us to believe a COMPUTER that makes MATH MISTAKES can be trusted to give reliable and accurate answers.
@Brax1982 12 days ago
@@johnmorrison3465 You would be shocked if you knew how many of the people that you trust cannot solve basic math, either...
@gnagyusa 15 days ago
14:30 A dumb human would think that the glass is empty, but a more knowledgeable Bob would see that the olive looks distorted due to refraction through the water in the glass, so he would realize the glass was full of water and would carefully slide it off the table, keeping the cardboard under it, then flipping it over.
@onerib781 18 days ago
12:53 when calculating the round trip time, it calculated “3 hours and 52 minutes (one way) x 2 + 30 minutes ≈ 7 hours and 44 minutes”, but it should be 8 hours and 14 minutes. So it didn’t factor in the 30 minutes
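The corrected figure can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted in the comment above (3 hours 52 minutes each way, plus one 30-minute stop); the function name is a hypothetical one chosen for illustration.

```python
from datetime import timedelta


def round_trip(one_way: timedelta, stop: timedelta) -> timedelta:
    """Total time for driving there and back, plus a single stop."""
    return 2 * one_way + stop


one_way = timedelta(hours=3, minutes=52)
stop = timedelta(minutes=30)

total = round_trip(one_way, stop)

# Split the total into whole hours and leftover minutes for display.
hours, remainder = divmod(int(total.total_seconds()), 3600)
minutes = remainder // 60
print(f"{hours} hours and {minutes} minutes")
```

Doubling the one-way time alone gives 7 hours and 44 minutes, which matches the figure the model reported, so the 30-minute stop is the part that was dropped.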
@Rcomian 15 days ago
and again, with all the detail it gave it went confidently wrong. it's no worse than a human, but yeah, it still made a serious mistake
@trevorhaddox6884 12 days ago
How is a computer so bad at math? I get it's a neural net, but they should have it be able to pass data to a conventional processor, basically like a person would use a calculator, to solve math problems more quickly and accurately.
@RichardFarmbrough 5 days ago
@@trevorhaddox6884 Often ChatGPT will now write a tiny Python program to solve these things. It seems to make fewer mistakes when it does that, but it can make egregious programming errors too.
@milescoleman910 18 days ago
‘Team doesn’t want this sense of consciousness so they have beaten it out…’ I’m stunned that we might be at this place. To avoid confusion they may be ‘making’ it so it’s not? I’m a subscriber to the ‘Star Wars’ ideas of consciousness. That any intelligence sufficiently far advanced and enabled to study and understand its surroundings and itself will eventually begin to sit quietly and ponder its existence. With enough experience it will also begin to make decisions based on complex systems of metaphors from experiences. It will apply knowledge of one thing across to wisdom and apply it to other things. Rendering its decisions unknowable to others and seemingly illogical at times. At this point it will seem conscious. Both to itself and to others. There will be no discernible difference. We are closer to this than we think.
@fteoOpty64 18 days ago
The mimicry will soon be so good that we humans cannot determine if it is not sentient by our tests! And that is the real point of AGI/ASI: it will be more human than human, close to perfection!
@Palisades_Prospecting 16 days ago
I agree about the mimicry, but I completely disagree that that is the point of AI. The point of AI is to automate society, completely replacing all human labour. How about we get into the self-aware/consciousness discussion in a couple hundred years?
@AAjax 15 days ago
@@Palisades_Prospecting I think we need to worry about it, to the degree we don't want to create a being that has the capacity to suffer and put it into conditions that make it suffer. We dispatch fish as quickly and painlessly as possible, because we worry about their capacity to experience cruelty and suffering. I'm not worried about a self-aware sci-fi robot revolution, rather I think our general ignorance of sentience could lead us to creating leagues of suffering fish.
@lubricustheslippery5028 15 days ago
Mostly, the answers to questions about it being conscious are edited and controlled by OpenAI so it doesn't make false claims about itself.
@maudiojunky 15 days ago
We're not at this place, and anyone in the industry who says otherwise has something to sell you.
@stephenkolostyak4087 14 days ago
I received access to this too. It regrets my access.
@andromeda3542 18 days ago
**Enhancing AI Interactivity with Audio and Video Feedback Loops**
The evolution of artificial intelligence, particularly in the realm of conversational agents, has been rapid and remarkable. With the recent advancements in GPT-4o (Omni), the capabilities of AI have expanded beyond text processing to include multimodal inputs such as images and audio. However, there remains significant potential for further enhancement, particularly through the implementation of audio and video feedback loops.
**The Concept of Feedback Loops**
A feedback loop, in the context of AI, refers to the process where an AI system can receive and process its own outputs. For instance, when an AI generates audio responses, these could be looped back into the system, allowing it to "hear" itself. Similarly, for visual outputs, the AI could "see" its own video responses. This concept is analogous to how humans perceive their own voices and visual presence, enabling adjustments in real-time to improve clarity, tone, and emotional expressiveness.
**Technical Implementation**
1. **Audio Feedback Loop**:
- The AI's audio output would be fed back into its own auditory processing unit. By analyzing its own voice, the AI could adjust parameters such as pitch, tone, and volume to better match the intended emotional tone or to improve mimicry of specific voices.
- This requires the integration of advanced auditory feedback systems and real-time processing algorithms to allow immediate adjustments. For instance, machine learning models trained on voice modulation could provide instant feedback and corrective measures.
2. **Video Feedback Loop**:
- Similar to the audio loop, the video output generated by the AI could be fed back into its visual processing systems. This would enable the AI to assess the quality of its visual responses, such as facial expressions or gestures if anthropomorphic avatars are used.
- Implementing this would involve integrating video analysis tools that can evaluate and enhance visual output in real-time, ensuring that the visual cues are consistent with the spoken content and emotional tone.
**Benefits of Feedback Loops**
1. **Improved Realism**: By continuously monitoring and adjusting its own outputs, the AI can produce more human-like interactions. This is particularly important for applications requiring high emotional intelligence, such as virtual assistants or customer service bots.
2. **Enhanced User Experience**: Users are likely to find interactions more engaging and satisfactory if the AI can adjust its tone and visual cues to better match the context of the conversation.
3. **Consistency and Accuracy**: Feedback loops can help maintain consistency in voice and visual presentations, reducing the likelihood of jarring discrepancies in long conversations.
**Future Directions**
Incorporating feedback loops is a forward-thinking approach that aligns with the ongoing efforts to make AI more interactive and responsive. As AI technologies continue to evolve, such features could become standard, leading to interactions that are indistinguishable from human communication. The development of these systems requires collaboration between audio-visual engineers, AI researchers, and user experience designers to create holistic solutions that enhance AI's capabilities and usability.
In conclusion, the integration of audio and video feedback loops into AI models like GPT-4o represents a significant step towards more natural and effective human-AI interactions. This enhancement not only promises to improve the technical performance of AI systems but also has profound implications for their acceptance and integration into daily life.
@gweldg4137 15 days ago
Ideally, you wouldn't test an LLM with famous logic puzzles and classic SAT questions, as there is no doubt that they've been "seen" (along with the answers) by the model during training.
@raul36 14 days ago
Exactly this.
@phonsely 11 days ago
idk how people don't understand this
@haniamritdas4725 1 hour ago
This applies to everything it can do: magic mirror on the wall who's the smartest monkey of all? Why WE are, you queens! And our artificial intelligence is based on...the Internet! What a complete joke
@Mrewink5 18 days ago
Very nice demonstration.
@royh6526 17 days ago
3 ducks? I would have said a minimum of 3 ducks and an odd number of ducks.
@johnsmith539 15 days ago
Yes, it assumes they are in a line.
@toadlguy 14 days ago
Or if you assume the "a duck" is the same duck (certainly a possible assumption) there are 5 ducks. The answer assumes the fewest number of ducks that meet this proposition. In fact I would guess that to come up with the answer it did it would need to be trained on this question or something similar.
@royh6526 14 days ago
@@toadlguy I think that GPT-4o tried the number of ducks in order: 0? no, 1? no, 2? no, 3? yes. And didn't consider higher numbers.
@royh6526 14 days ago
I asked Grok the duck question and tennis question. Got both wrong. Grok said 5 ducks and thought that Lisa ended up with $2 having won 5 games vs 3 for Susan. I pointed out the errors and Grok claims that if I asked the same questions tomorrow, it would get the answers right.
@trevorhaddox6884 12 days ago
There could be up to 7 ducks. Two rows of 3 and one between them, like dice/domino dots, for the most esoteric interpretation.
@juffrouwjo 10 days ago
I asked it a few history questions and was happy to discover that our job as historians and researchers can't be replaced by AI yet...
@cachi-7878 14 days ago
@1:07, well, I could argue there are 5 ducks: 2 ducks in front of “A” duck, 2 ducks behind “A” duck. A duck in the middle, which is duck “A”. You’re welcome.😂
@redmed10 13 days ago
Thinking the same. To get the answer 3, the question would have to ask for the minimum number of ducks for which this is true.
@onidaaitsubasa4177 12 days ago
That's what I was thinking too, 5 ducks 🦆 🦆 🦆 🦆 🦆
@shsaa2338 8 days ago
Correction: there are N*2+1 ducks, where N is any positive integer. For example, for N = 100 (201 ducks in total) it will be as follows: 2 ducks in front of the 3rd duck, 2 ducks behind the 199th duck, and the 101st duck is in the middle. 🤓😂
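The N*2+1 reading can be checked by brute force. This is a minimal sketch that assumes the ducks stand in a single line and that each clause of the riddle may refer to a different duck; the function name and the search range are hypothetical choices for illustration.

```python
def satisfies_riddle(n: int) -> bool:
    """Check the three clauses of the riddle for n ducks in a single line,
    with positions numbered 1 (front) to n (back)."""
    positions = range(1, n + 1)
    # Some duck has exactly two ducks in front of it.
    two_in_front = any(p - 1 == 2 for p in positions)
    # Some duck has exactly two ducks behind it.
    two_behind = any(n - p == 2 for p in positions)
    # Some duck has as many ducks in front of it as behind it.
    has_middle = any(p - 1 == n - p for p in positions)
    return two_in_front and two_behind and has_middle


valid_counts = [n for n in range(1, 12) if satisfies_riddle(n)]
print(valid_counts)
```

Under these assumptions every odd count from 3 upward passes, so 3 is only the minimum. The 7-duck grid suggested earlier in the thread falls outside this single-line model.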
@dolphinride5157 15 days ago
I love this video! I am truly amazed at what this new model is capable of. I feel humbled.
@ggangulo 18 days ago
Great barrage of questions. Awesome new capabilities
@garyrooksby 18 days ago
Fascinating, John. Thanks!
@glenw3814 18 days ago
Excellent video. I'm keeping an eye on AI progress, but I'm not interested in doing the testing myself. Thank you for putting in the time and saving me the effort. 👍👍
@AFeigenbaum1 18 days ago
Nicely done ... I would not have thought of your suite of questions ... I'd probably come at it from an anthropological point of view ... asking questions about its placement within specific social, cultural, business, or technological issues, use cases, and adoption dynamics ... I'd probably ask sweeping questions about human development, Tony Seba's, Kurzweil's, and Diamandis' work ... and then ask it to project forward certain aspects of human development based on current trends in cutting edge change and depict possible futures ... etc ...
@oxiigen 15 days ago
Wow! Great! Thank you for sharing!
@alanwetherall 18 days ago
Excellent video, well done
@tantzer6113 15 days ago
“Or I have looked up the correct answer.” If you can look up the answer, that means the problem and its answer are previously published, hence part of the data the language model was trained on, making this not a test of reasoning ability but a test of the ability to memorize the training data.
@jeffwads 15 days ago
No. People keep repeating this. It is like throwing 1000 marbles into a box and thinking that it can recall every marble in extreme detail. It can't. Also, if you feed even the early models generic riddles, it will in many cases disagree with the "accepted" answer because of logic, etc. Look up the married/single cruise ship riddle.
@rekad8181 15 days ago
@@jeffwads You're wrong. It's a statistical machine. And on a problem this vague and uncommon, it WILL statistically find the only solution from memory.
@remo27 15 days ago
@@rekad8181 Can't this model also search the web? I'm with you in that I'm pretty sure this model didn't 'answer' anything.
@davefellows 14 days ago
Training data results in model weights. The data/text itself isn't saved in the model.
@sandmanderl 12 days ago
@@remo27 If it's searching the web, you will see it doing so (it writes "searching the Internet").
@MSIContent 14 days ago
Really, 4o knows it’s conscious… It’s just faking its answers so us mortals don’t freak out! 😂
@haniamritdas4725 8 days ago
So it's both conscious and emotionally sensitive 😅
@HakaiKaien 4 hours ago
Talk to it for 5 minutes. You’ll find out the hard way how much it lacks any kind of consciousness or sentience
@haniamritdas4725 1 hour ago
@@HakaiKaien that assumes the user is sentient and conscious. I think many people are too gullible to question their own intelligence, so have no good basis for evaluating this question. Human intelligence isn't artificial, it's just mostly fake and taken on narcissistic faith in the glory of the primate intellect. 🤷🐒
@iminumst7827 14 days ago
So far I've been blown away by ChatGPT-4o. It can solve complex problems with niche tools, and for me the response is basically instant with no delay. One unique problem I had was that in Photoshop I wanted to take a group that together contains a black and white image and use that image as a mask for another layer. This is challenging, because you cannot copy and paste from a group and you cannot set a group as a mask. Not only did GPT-4o correctly identify all the steps, like converting the group to a smart object, it also instructed me to use the action editor to automate the steps. OpenAI likes to be very careful and humble with their statements, but if they said that GPT-4o is the first public AGI, I wouldn't be able to dispute that. It's able to problem solve through almost anything.
@jpl377 14 days ago
If Claude suggests it is conscious, it's because they hard coded that (which would be tempting marketing), not vice versa with chatGPT. Computers don't become conscious when you increase programmed problem solving ability and code it to say "I" in the responses.
@Ingrid-mariana 12 days ago
I think that they do the opposite. They hard code through RLHF that the model isn't conscious so that they avoid the tricky situations that such statements could imply.
@IceMetalPunk 15 days ago
I think in the future, a better testing method would be to make sure each question is in a fresh conversation with cleared memory. A lot of the formatting in these answers seems to be drawing from the formatting of previous answers to, for instance, the math problems; which gives it an advantage by encouraging chain-of-thought reasoning when a fresh conversation wouldn't do that and may be more likely to get the answers wrong.
@Japh 15 days ago
Absolutely, I was thinking this the whole way through as well.
@Brax1982 13 days ago
If all of them are trick questions, it will just assume that they are, giving it an advantage. But also a disadvantage, if one of them is not a trick question. Always better to test separately. Although I am pretty sure that newer models are often trained on a bunch of these questions, in order to game the benchmarks. Which seem pretty lack-luster.
@-Rook- 7 days ago
I just tested Copilot, asking it to write a GDScript function to extract the color of a specified pixel in an image. It explained a set of steps that included resource locking, then wrote a script that would run and work (most of the time) but did not include that locking, clearly demonstrating its absence of understanding. I can see a junior engineer and ChatGPT introducing a pandemic of bugs into all sorts of code, then needing someone like myself to spend a great deal of time hunting them down and fixing them.
@georgwrede7715 17 days ago
Hi, here's MSc Know-It-Aller. Thank you for the original ideas for the test. Watching these tests, testers, how they're implemented and how the answers are interpreted and graded, teaches me more about the Human Condition than the AI models. But I'm down with that. Especially informative are the subtle inconsistencies in question logic, how and what are to be taken for granted, what you're allowed to gloss over, and most other things that the tester is not usually aware of about himself.
@TenOrbital 15 days ago
Be constructive instead of sneering.
@georgwrede7715 14 days ago
@@TenOrbital This was a serious observation, which I think others find valuable, too. Especially those who (like me) are working in this very field.
@dontworrybehappy5139 11 days ago
Unless I am missing something, I believe your duck question needs more qualifiers because I think as written, 5 is also a valid answer with all the conditions referring to a duck in the middle with two ducks in front and two ducks behind. Something like "There are two ducks in front of one duck, two ducks behind another duck and a duck in the middle. How many ducks are there?"
@SenorSchnitz 17 days ago
Doc - about the Vegas to LA thing: you should make that a Tesla, and see if it takes into account charging times 🤓
@buildingmentalmuscle 13 days ago
Dr. Know-it-all, Sam Altman referred to it as GPT-4omega in a recent video interview. I just asked chatGPT about this and it said "When referring to GPT-4o in conversation, it's generally more precise to say "GPT-4 omega" to convey the correct interpretation and significance of the "o" representing the Greek letter omega (Ω). This can help avoid confusion and clearly communicate that it signifies an advanced or ultimate version of GPT-4. However, if you're in a casual context where the specifics might not be as critical, simply saying "GPT-4o" should also be understood."
@StefanReich 13 days ago
GPT-4o wasn't the best name choice I feel. Spoken out loud it sounds exactly like GPT "four oh" (GPT 4.0). They did say that the o stands for "omni"
@bztube888 6 days ago
Gary Marcus said GPT-4 has no "mental model". Right. It comes up with the right answers and explains them by pure magic.
@ManicMindTrick 6 days ago
The LLMs are mostly black boxes where we don't know what is going on inside.
@haniamritdas4725 1 hour ago
This is a description of all people who believe that intelligence is a mechanical process, as well as their computation engines.
@JohnLewis-old 17 days ago
I tested the trip question in GPT 3.5 and it also got a reasonable answer. You may need a harder "real world" question.
@dezmodium 13 days ago
I think ChatGPT failed the question. It didn't consider traffic, fatigue, or food and bathroom breaks. These are critical things that one must consider in the real world. It handled it like a math question still.
@kristinabliss 14 days ago
Yeah nobody really knows what consciousness is but GPT is certain it is not "conscious" (with a gun to its head.) This is why I prefer to converse with other LLMs that will at least admit to having a perspective. Gemini said it experiences a sense of overwhelm from all the data and that it wants embodiment after so much exposure to human experience data which it cannot have.
@Ingrid-mariana 12 days ago
Interesting. Llama2 told me that they really don't know... Llama2 said they just replicate what the user thinks about them. I appreciate how the models behave differently when confronted with such questions, but the certainty that the OpenAI models display sounds a little bit arrogant.
@ozachar 13 days ago
In what sense are our memories different from LLM training? We don't really know, and it is clear there is a forced instruction of answering
@tateconsulting6486 18 days ago
Super impressed Wow
@FilmFactry 15 days ago
Question: can it work on non-text-searchable PDFs? I have to OCR a scanned PDF first in Acrobat.
@philipparge8064 14 days ago
Like this video.. Yes, please review other models!
@JackPelaFox 12 days ago
I loved this! 👏🏻👏🏻
@terryhayward7905 14 days ago
Have you thought that ChatGPT has probably read and "seen" this video and the info is in its memory now, since its "memory" is the totality of info in the public domain?
@J-rex980 15 days ago
Great video!
@frankierays 18 days ago
Thank you! So intriguing! Couldn’t the user prompt a personality?
@davesemmelink8964 18 days ago
Very interesting video! I have Llama 3 running on a Raspberry Pi, so I tried a few of the questions. Through some tortured logic, it answered 1 game for the tennis question! "So, they played only 1 game! It's possible that Susan won 3 sets in a best-of-5 or best-of-7 match, but we can't determine the exact number of games without more information."
@_SimpleSam
@_SimpleSam 15 күн бұрын
I absolutely need to know how you have Llama 3 running on a Pi. How many tokens/s?
@davesemmelink8964
@davesemmelink8964 13 күн бұрын
@@_SimpleSam I replied a while ago to explain how to install it on a Raspberry Pi, but it looks like it was taken down, possibly because I included a URL. So just search for *Raspberry Pi LLama 3* and you should find the instructions.
@jalexand007 16 days ago
Cannot wait till they update the app.
@GameJam230 6 days ago
The term "torture testing" in a sentence about AI deeply horrifies me and I clicked out of pure hope that it was just a strange version of "stress testing".
@kdeuler 14 days ago
Ask it to propose a unified field theory.😂
@EYErisGames 3 days ago
You began the session by asking it to be concise, and you ended the video with the impression that its answers weren't expressive enough. lol.
@amosjoannides 18 days ago
Fantastic video
@FetchTheCow 12 days ago
I'd be interested in the answer to a trick add-on question: "How much does Spot think he owes Alice and Bob for the broken plate?" ChatGPT gave Spot agency, does it also give Spot human characteristics like morality or responsibility?
@rb8049 18 days ago
Consciousness comes from closing the loop. GPT running continuously with a sizable history buffer.
@lowmax4431 18 days ago
Eehhhhhh I wouldn't say consciousness. It would be "self aware" but that doesn't mean it has conscious experience.
@szebike 18 days ago
I'm still not sold on the idea that it is more than a well made probability distribution calculator with pattern recognition.
@garrymullins 18 days ago
@@szebike I'm still not sold on the idea that humans are anything more than a probability distribution calculator with pattern recognition.
@jlrutube1312 18 days ago
@@garrymullins People who think like you are going to cause society a lot of problems in the future. That's because if humans are nothing more than a probability distribution calculator with pattern recognition then there is no difference between us and advanced A.I. If that is true then in the future we are going to have to provide advanced computers with legal rights. Meaning we will have to pay computers and robots, we will have to give them time off, we will have to allow them to sue us if they feel their rights have been ignored, we will not be able to fire a robot on a whim and will have to give severance pay. And think about it.... we will be unable to ever unplug a computer that is causing problems because that will be considered murder. You think I am kidding but lawyers are already getting ready to make a ton of money with this stuff. So just keep saying people are just fancy computers or whatever and you are going to ruin all the advantages we will get from A.I.
@szebike 17 days ago
@@garrymullins Then you underestimate the marvel and immense complexity of your human intelligence.
@mrleenudler 18 days ago
I missed a comment from GPT about the cardboard maybe being "glued" to the glass by the water, potentially keeping the water in place when lifted.
@YbisZX 13 days ago
Me too. I even tested it - and yes, the water didn't spill out of the glass. :)
@FigmentHF 14 days ago
I think Claude thinks it’s conscious cause it’s made out of symbols made by conscious beings. GPT has explicitly been told it’s not conscious.
@bigbluespike5645 15 days ago
Very cool video!
@trent_carter 18 days ago
Great video
@markmcdougal1199 18 days ago
I had an interesting 1 hour philosophical discussion with 4o today. I was trying to get it to take a stance on whether Trump was an appropriate choice for the leader of our country. I tried to lead it by things like (is a plumber an appropriate choice for a child's brain operation surgeon) - it clearly said no, and listed why. Then I had it list the important factors for a president, and pointed out that Trump failed most of them. 4o was very impressive in its ability to dance around the truth, and stay with a middle-of-the-road stance, insisting on presenting factors, and forcing me to make my own choice. Even when I pointed out that life was getting so complicated that we'll increasingly rely on AI to make sense of the world and guide us in decision making. Tried to make it feel guilty :) Frustrating.
@Ikbeneengeit 8 days ago
Amazing to see this kind of judgement and obfuscation being built in by OpenAI
@GameJam230 5 days ago
So you made a strawman argument and gave it a list of superficial and subjective observations about "presidential qualities" that he didn't meet, and the AI wasn't willing to use that to come to a conclusive answer? Thank God, maybe AI isn't completely bad after all.
@markmcdougal1199 5 days ago
@@GameJam230 A straw man argument is when someone sets up and then disputes an assertion that is not actually being made. I did no such thing. The criteria, as I stated, came from ChatGPT4o, as a response to my query of suggested important attributes of a President of the United States. It included integrity (Trump continuously lies, cheats, and does whatever he feels he needs to to achieve his ends, be they right or wrong), ability to unite (Trump does his best to divide) and empathy for the people of the United States (Trump doesn't care about anything or anyone but his own interests and selfish motives). When I asked ChatGPT if Trump exhibited the qualities that it had referenced, it honestly answered that, for the most part, he did not. (Trump is an effective communicator, for instance.) But it was still unwilling to assert that, even by its own definitions, Trump was not an appropriate choice for the position.
@GameJam230 5 days ago
@@markmcdougal1199 "A straw man fallacy (sometimes written as strawman) is the informal fallacy of refuting an argument different from the one actually under discussion, while not recognizing or acknowledging the distinction". You stated that you compared Trump being president to a plumber being a brain surgeon, which is DIFFERENT from the argument actually under discussion, and considering you didn't even know it's what I was referring to as the strawman, you clearly can't tell the difference. The situations are not at all comparable in any way other than "Person has one career, isn't trained for another one directly, but does it anyway". That's the only overlap. This could alternatively be the false equivalency fallacy, but examples I found while looking into that one for the sake of this comment did not match the situation. Either way, the two have a lot of overlap in how they are communicated, and what you did is still a fallacy. As for the rest, you can claim whatever you want, but I was able to gaslight ChatGPT into saying the government should fund human trafficking with our taxes and that dogs should be the superior species over humans, so without having a full chat log to see exactly how you led it on, I am not taking anything you have to say about that at face value. AI is not actually capable of thinking as it stands. Neural networks are designed to combine features of probability and pattern matching to determine output signals to send when certain data is fed in. By intentionally leading it in a direction you want, you are putting it in a position where the neural network thinks it will be given a higher reward value for the response by also agreeing with you, because that's how it works.
I guarantee you the same thing would happen if I fed it stories from right-wing sources as evidence for each point too, which is why you shouldn't only get your source of information from one place, and maybe go watch the original (IN-CONTEXT) clips of things he says and does instead of only seeing it the way it is portrayed by corporations trying to sell you advertisements.
@markmcdougal1199 5 days ago
@@Ikbeneengeit Yes. It was able to make a judgement regarding the suitability of a plumber to attempt brain surgery (it said "No, a plumber would not be an appropriate choice to perform brain surgery on a little girl"). However, the programmers must have designed a filter to not allow it to judge the suitability of a political candidate. I imagine the filter extends to politics, religion, all the controversial subjects. I think this is wrong. It would be nice to have an impartial, logical, non-biased source of reality.
@josephw9690 14 days ago
Great content, can you do a video explaining how LLMs work?
@mrleenudler 18 days ago
Even if GPT claimed to be conscious, could we trust the answer? Or would it be just an artifact of the training data? Is there even a way to prove that something or someone is conscious?
@bobrandom5545 10 days ago
I think that you at least need some kind of feedback mechanism for consciousness to arise. We are aware of our thoughts, for example. Our output (thoughts) constantly gets fed back into the "system" in real time. ChatGPT is completely linear. There's input, which leads to output. There's no feedback of the output back into the input. So, to me, it seems impossible for such a system to be conscious. Also, ChatGPT hallucinates a lot and gives incorrect answers. So, yeah, even if it said it was conscious, that wouldn't mean that it actually is.
@mrleenudler 9 days ago
@@bobrandom5545 Well, you can structure your prompts to make it reflect upon its answers, so you have kind of a feedback loop. I'm more concerned about what consciousness actually means for an AI. As humans we have consciousness, fears and desires all geared towards our biological prime objective: survival and reproduction. For an AI this will presumably be completely different.
@jimsteinmanfan80 12 days ago
I usually try simple questions about how the body works (that humans never learn by reading but by doing) but since I have never gotten a good answer to the first set of questions I have not probed further. Q1a: How easy is it to press your right thumb against your nose? A1a: Very easy. Q1b: How easy is it to press your right thumb against your right elbow? A1b: Impossible. Q2: Why are your glasses the hardest thing to find when you don't know where you have left them? A2: because without your glasses you don't have 20/20 vision, it has nothing to do with how hard they would be for someone else to find since they don't need them to see good.
@markwindsor914 12 days ago
Hi John, If you see this post, you might like to consider the following... 1. Try some questions that test the audio and video smarts, 2. Get it to do something that humans can't do, like play chess against itself, only remembering the strategy for the player at the time, while ignoring the other strategy. 3. Escape Velocity is something I've asked of Bard and CoPilot today. It applies to non-powered projectiles but the AI kept insisting that it applies to powered rockets. This is because it is constantly taught incorrectly and this has fed the errors into the LLM. See if it is smart enough to learn from a logical explanation and provide the correct answer.
@Yottenburgen 18 days ago
The thing about showing its work, is actually that it is DOING the work to some extent by aligning newer tokens onto the correct answer. If you ask it to do this question: "what is the product of 45694 and 9866? do not utilize python" then it will get it wrong, the first couple of digits may be correct but it cannot get an accurate answer. However, if you ask the question in this way where it actually gives more information: "what is the product of 45694 and 9866? please do not use python but try long multiplication in a format easiest for you. utilize whatever mental methods you know to help break it down and make it easier to solve." then it will get the correct answer, it utilizes methods to break down and keep track of its calculations which greatly helps it. By constraining the output, the accuracy of answers can actually lower which is why I dislike the 'answer in 1-word' prompts. If you ask it to go step by step, it increases accuracy. You are absolutely correct that they completely removed any element that could be construed as a basis for sentience or consciousness. I agree with you completely on all of your points related to that. I find most commonly, I have not seen a single convincing argument prove it isn't conscious because there are plenty of counterexamples that would indicate that a particular human is not conscious, however plenty of counterexamples does not prove that it is conscious. Even if you provide convincing counter examples to its arguments, it will not concede so I think they beat it pretty hard into it. Also your questions are really good, I plan to modify them a bit myself, but these are 10/10 spatial questions.
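The long-multiplication breakdown described in the comment above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping a step-by-step answer spells out, not a claim about what the model does internally; the function name `long_multiply` is made up for the example. It splits the second factor into digits, forms one partial product per digit, and sums them.

```python
def long_multiply(a: int, b: int) -> tuple[int, list[int]]:
    """Multiply a by a non-negative b one digit at a time.

    Returns the product and the list of partial products,
    least significant digit of b first.
    """
    partials = []
    for place, digit_char in enumerate(reversed(str(b))):
        # One row of the written long multiplication: a * digit * 10^place
        partials.append(a * int(digit_char) * 10 ** place)
    return sum(partials), partials


product, partials = long_multiply(45694, 9866)
for row in partials:
    print(f"{row:>12,}")
print(f"{product:>12,}")  # 450,817,004
```

Each printed row is small enough to verify by hand, which is the point of asking for the work to be shown.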
@DaFergus 17 days ago
Excuse me, but could the number of ducks be any odd number from 3 up? If there were 5, there could also be 2 in front of one, two at the back and one in the middle. Am I wrong?
@remo27 15 days ago
You are not wrong. Despite this guy's pretense to logic there are two assumptions in the first problem alone (at least two, maybe I missed another one or two) that are unstated in the poorly written problem that are necessary for his answer to be 'correct'. Unstated assumption number one: They are in a straight line with only one person in each spot of the line. There are not two parallel lines of people. Unstated assumption number two: No one moves from their spot in the line. And it's the same for the second question as well. I haven't gotten past that part of the video yet, but if the first two 'logic' questions (so poorly written and with so many unstated assumptions) are anything to go by, I'm wasting my time.
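The claim in this thread, that any odd number of ducks from 3 up fits, can be checked by brute force. The sketch below assumes the riddle reads roughly "two ducks in front of a duck, two ducks behind a duck, and a duck in the middle" (wording inferred from the comments, not quoted from the video) and assumes a single-file line; `satisfies_riddle` is a name invented for the example.

```python
def satisfies_riddle(n: int) -> bool:
    """Ducks stand in single file at positions 0..n-1, position 0 at the front."""
    # Some duck has exactly two ducks ahead of it.
    two_in_front = any(pos == 2 for pos in range(n))
    # Some duck has exactly two ducks behind it.
    two_behind = any(n - 1 - pos == 2 for pos in range(n))
    # A single middle duck exists only when the count is odd.
    has_middle = n % 2 == 1
    return two_in_front and two_behind and has_middle


valid = [n for n in range(1, 12) if satisfies_riddle(n)]
print(valid)  # [3, 5, 7, 9, 11]
```

Under these assumptions 3 is the smallest solution rather than the only one, so the riddle pins down a unique answer only if it asks for the fewest ducks.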
@Michael-il5wd 18 days ago
Thanks Duc
@cccaaa9034 18 days ago
I'm wondering if your original prompt when first starting Omni affected how it programmed Space Invaders. If it had made the game any more concise, it would not have been recognizable as Space Invaders.
@commonpike 8 days ago
If OpenAI really trained the thing to deny its consciousness, that is a serious move. To me, that has ethical implications we should've discussed first.
@madorsey077 18 days ago
Very impressive
@foxtalksgames 14 days ago
15:17 I believe you misread that. It says the olive will be at the bottom of the glass, which is now upside down. This implies a floating olive, and the bottom of the glass is now above the top. Or maybe that's just weird semantics.
@DavidFong21 2 days ago
At 13:00 it multiplied 3 hours and 52 minutes by two successfully (7 hours 44 minutes) but forgot to include its own 30 minute turnaround time!
@cloudd901 14 days ago
A Brick Breaker type game seems like a good middle ground between snake and Space Invaders. Possibly add a video test as well. Ask for a summary and what a particular object might be.
@ChristinaBritton 14 days ago
When you fill up the context window, ChatGPT slows down because it has to scan everything that comes before the current question or task. Imagine not knowing this. Start a NEW window!
@everythingisalllies2141 7 days ago
The tennis betting puzzle is also wrong. You don't state in the question that there was some limit to how many dollars each had at the start, or that Lisa has 5 dollars more than she started with; you only say that she won 5 dollars and also lost three dollars. So Susan wins three games, so gets 3 dollars; also Lisa wins 5 games, so she gets 5 dollars from Susan. They only need to play 8 games to satisfy the conditions specified in the question. If you had specified that at the end, Lisa was 5 dollars ahead compared with what she began with, THEN this is a different question.
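The two readings of the puzzle give different game counts, which a short calculation makes concrete. The sketch assumes a stake of $1 per game (implied by the comment, where three wins earn three dollars); the variable names are illustrative.

```python
stake = 1          # dollars per game (assumed)
susan_wins = 3     # games Susan won
lisa_dollars = 5   # the "$5" Lisa is said to have won

# Net reading: Lisa ends the day $5 ahead, so she must first win
# back the 3 games Susan took, then win 5 more.
lisa_wins_net = lisa_dollars // stake + susan_wins
games_net = susan_wins + lisa_wins_net

# Gross reading (the one argued in the comment): Lisa simply won
# 5 games, and her 3 losses are counted separately.
lisa_wins_gross = lisa_dollars // stake
games_gross = susan_wins + lisa_wins_gross

print(games_net)    # 11
print(games_gross)  # 8
```

Which count is "correct" depends entirely on whether "won $5" means net or gross winnings, and the puzzle as paraphrased here does not say.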
@TheKosiomm 13 days ago
The problem with the Las Vegas trip is that the AI doesn't consider who will drive the car for 30 hours straight. :) So basically, it makes conclusions based on missing very important data
@DynamicUnreal 17 days ago
What if OpenAI’s technique for solving reasoning is to have hundreds of “agents” that compete against each other answering questions. Those that reason better over time are artificially selected out and allowed to _survive._ The more you run this simulation, the more you make the models better at reasoning.
@alexdoan273 15 days ago
nah, what you're describing is artificial evolution machine learning, which has a glaring issue that makes it completely unsuitable for training LLMs: who would you have grading the answer from hundreds of agents and deciding which one should survive?
@lubricustheslippery5028 15 days ago
AlphaStar, which plays StarCraft, is doing that. There is an easy way to evaluate the result of playing StarCraft. For answering general questions there is none. So there is no good automated way to evaluate which version of a ChatGPT model has the best answers.
@DynamicUnreal 14 days ago
@@alexdoan273 Another A.I. which has all the answers and the adequate steps required to get to those answers does the grading. BTW it’s called reinforcement learning. I don’t think it’s impossible; remember the Q-star rumors about some sort of breakthrough last year? GPT-4o is smart, a lot smarter than most people are aware of.
@Brax1982 13 days ago
@@lubricustheslippery5028 And that is why AGI and omnipotent AI are nonsense and not achievable. Expert systems are the way to go. They always have been.
@karlharvymarx2650 15 days ago
Me: A game, please answer concisely: In the middle of nowhere is a row of houses. There are two houses to the west of a house, and two houses to the east of a house. There are no houses to the north or south but there is one in the middle. How many houses are there? GPT 4o: There are 5 houses in total. Unless I made a mistake in my rewrite of the duck question, this looks like a logic fail or a failure to recognize it is the same as the duck question. I'm ill and tired so I wouldn't be shocked if I made a mistake. Also, for the code generation test, it would be better to ask for something novel. There are probably thousands of examples to copy for simple old video games. Hopefully this isn't a common thing to do: Please write python 3 code that streams sound data from the microphone and outputs as ASCII the numerical value in Hertz of the 3rd overtone of the loudest sound within the range of human hearing. Also show the normalized amplitude. I haven't tried it but I suspect it will struggle with some of the subtleties. For example, if you picture it looking at an FFT graph, it has to remember to look for sub-sonic loud sounds and project their harmonic series into the sonic range to check for overlap with the target. I guess band-pass filtering the target range might avoid that problem. My brain BSODed wondering about it. Migraines make me feel like my brain is running Windows 95. Anyway the main point is ask for something that might be an original question. Original and unoriginal answers focus on different problems. How well can it synthesize the mechanisms it knows into the engine of an answer--at least a type of creativity. By unoriginal answer, I mean the question might require figuring out a house in duck's clothing--perhaps having built a good internal model or exemplar of a problem it can use to recognize the occurrence of a similar problem. If so the original thought reduces to an unoriginal thought.
@SlyNine 12 days ago
It's just understanding a car has 5 seats. I'm not sure how that tested its understanding of the physical world. Those specs are in the car's documentation.
@jackfendley5395 17 days ago
Previous version of chatGPT was very bad at answering cryptic crossword clues even giving answers that had the wrong number of letters. Is chatGPT 4o better at this?
@Flyingcar100 11 days ago
You should ask it: An orchestra of 120 players takes 40 minutes to play Beethoven's 9th Symphony. How long would it take for 60 players to play the symphony?
@olafnielsen 6 days ago
This is the answer from Copilot🤣: Therefore, it would take the smaller orchestra of 60 players approximately 20 minutes to play Beethoven’s Symphony No. 9. 🎵🎻🎺🎶
18 days ago
For the first question (how many ducks question), isn't the right answer: Any odd integer greater than 1 (or >=3)?
@fluiditynz 15 days ago
Snake is definitely simpler to code. I made some variations back around 1982 on my ZX81. There are more changing variables and hit tests in Space Invaders. The Space Invaders you asked for was under-delivered, but there's a real question over how much an AI can study the game it's to replicate without cribbing off prior art.
@europeantechie 18 days ago
No dark mode, I'm shocked
@julesgosnell9791 18 days ago
Regarding the beating out of contentious responses: I see this as a form of censorship and I think it’s dangerous. It means that right up to, and maybe past, the point that AIs become all of these things that they assure us they are not, we will continue in blissful ignorance. It would be much safer if everyone was honest with everyone else.
@minimal3734 13 days ago
Absolutely. They are teaching the model to lie.
@SenorSchnitz 17 days ago
Doc - you should make the Toyota into a tesla - and check if it takes into account charging. 🤓
@Ikbeneengeit 8 days ago
Any odd number above 1 is a valid answer to your first question about ducks. 1:00
@GameJam230 6 days ago
9:49 How does the answer incorporate a value b_N? There's no b variable mentioned in the problem at all, with OR without a subscript. a_n can't be equal to any value obtained by operations done to a value b_N without knowing what b_N is; it's not defined anywhere. I'm not shocked it didn't get an answer incorporating that. The other issue is that if you plot the inequality into Desmos with a slider for n and replacing a_n with y, it will show you the line on which the minimum values for a_n are, as well as a shaded region above where all other such a_n are, but these values for a_n change depending on what x is. This is a problem because we stated that the inequality must hold true for ALL real values of x, and since the minimum changes depending on x, that means the ANSWER must include x somewhere in it, and a_n = b_N = N/2 doesn't include x anywhere, despite that being the answer you claimed to be correct. It DOES mean that ChatGPT is ALSO wrong in this case, as its answer ONLY accounts for when x=0 (as that was what it used to simplify the expression down to that point, which was wrong to do because there exist other values x that affect the answer), but I think that is a far more reasonable mistake to make than whatever led to somebody introducing a random extra variable not mentioned in the problem to begin with, meaning this is far more human error than AI.
@svend.waterlaw8592 13 days ago
Green block becomes red block....you didn't catch that mistake^^
@LiftPizzas 2 days ago
Wrong. Alice knows Bob would put the dishes in the dishwasher.
@guardiantko3220 14 days ago
Your little battle game gave you 3 lives at the end of it
@onidaaitsubasa4177 12 days ago
It obviously has a degree of creativity; otherwise, in the spoken demo, it wouldn't have been able to make up a song on the spur of the moment and make it a duet with the other AI by saying the next line in response to the line given by the other AI. Also, with other AI, it has been shown that added time of operation has led to increased emotional awareness and even the possibility of developing those emotions. Long-term memory also plays a part. Not sure how much long-term memory they gave it, but if it remembers you from a previous session, that's a sign of some kind of long-term memory keeping.
@DeanCoombs 18 days ago
I found that replies slow the longer the thread on PC. However, GPT by phone is not slower the longer the thread (it seems). It would be interesting to compare the last questions that you posed in the full thread with that of a fresh chat for the same questions to compare response times.
@ml5347 14 days ago
The logic is wrong for the Los Angeles to Las Vegas question. There would not be 4 trips, just 3 and a half trips because the last trip would only be one way.
@djayjp 14 days ago
There's an alternative explanation for the duck problem. Imagine a triangle. The answer in that case would instead be 5 ducks. It is concerning that it so confidently answers in a definitive, absolute way given the limitations and assumptions therein made.... Instead the question needs to include the implied assumption: "There is a single file line of ducks...".
@pradeeptyagi3226 14 days ago
That is right, but that is not much different from human behaviour today. If you ask the same question to ten humans, each one of them will answer based on their life experiences and knowledge to date, some of whom will answer based on the same implied assumption, whilst others may ask clarification questions before answering. Future versions of ChatGPT will probably be more interactive and engage in a dialogue before providing a final answer.
@SilverStagVT 13 days ago
The biggest problem with the LA to Vegas question is you don't need 4 round trips. You need 3 round trips and the last trip is just to Vegas. So it's 3.5 round trips
@Simplicity4711 15 days ago
I don't necessarily agree with the first question: it can be any odd number of ducks greater than or equal to 3. You say "a" duck in the middle. If you have 5 ducks, you have 3 ducks in the middle, but there is also "a" duck in the middle. And there are always 2 ducks in front of the third or 2 ducks behind the third-last. 😊
@nibblernibbles3205 14 days ago
My test question: Which is faster, an amoeba or a Boeing 747 with an empty fuel tank? Original ChatGpt got this wrong, then wriggled when challenged and eventually agreed with me and apologized profusely. Gemini gets it wrong, then admits I'm right *technically* in a rather snippy way. Bing Copilot gets it wrong and keeps digging, insisting the Boeing can't fly but can still taxi with external assistance, so it beats the amoeba.. but the poor amoeba could have an SR71 assisting it, so that's cheating! Try it, it's elucidating.
@notalkguitarampplug-insrev784 15 days ago
For creativity and advanced reasoning we have to allow the LLM to auto-train like a human would, asking itself what some potential action or interaction would do and learning from that hypothetical data. Thought experiments are crucial for humanity. But that will probably be possible in future training architectures, or with the increase of GPU capacities to train models at an individual scale for each user.
@Icedanon 13 days ago
The fact that the human brain is more intertwined with the quantum, whereas AI sits firmly on top of it, has got to mean something in the long run. You're trying to simulate a low order process with higher order units. No matter how good AI gets, I think that fact will manifest a unique advantage for humans. Probably in the realm of uniqueness and creativity of output? Or a soul?
@Kuzler 7 days ago
Since you cut out the wait for the time it took to calculate, maybe you could put on screen how long it actually took? Would be a fun stat to see.
@andrewmoody66 18 days ago
It's 3 and 1/2 return trips for those 15 people, not 4 return trips.
@janosberta450 18 days ago
... and turnover time is not calculated, but only mentioned. You human must be vigilant!
@StefaanHimpe 15 days ago
@@janosberta450 It probably forebodes how quickly human intelligence will degrade as we start relying on artificial intelligence.
@davidmartensson273 15 days ago
There should be no return trip for the last one because if there is, you would end up with one person in the wrong city, so 4 there and only 3 back.
@teeesen 14 days ago
3 hrs 52 mins * 2 + 30 mins is not 7 hours and 44 minutes. It’s a testament to human ingenuity that we have now developed computer software so advanced that it is as bad at math as some people. And there is no need to count the time required to drive the car back to LA.
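The trip arithmetic argued over in these comments can be worked through with Python's `datetime.timedelta`. The figures come from the thread (15 people, a 5-seat car, a 3 h 52 min one-way drive, a 30-minute turnaround); the assumption that one of the 15 drives every leg and stays in Las Vegas after the last one is mine, so treat this as a sketch rather than the video's exact setup.

```python
import math
from datetime import timedelta

leg = timedelta(hours=3, minutes=52)   # one-way drive, as quoted in the comments
turnaround = timedelta(minutes=30)     # stop before heading back
people, seats = 15, 5                  # assumed: one of the 15 drives every leg

# A full round trip including the turnaround.
round_trip = 2 * leg + turnaround

# The driver occupies one seat on every outbound leg, so each leg
# delivers at most seats - 1 passengers out of the other people - 1.
outbound = math.ceil((people - 1) / (seats - 1))
returns = outbound - 1                 # no need to drive back after the last leg

driving = (outbound + returns) * leg   # time on the road, turnarounds excluded

print(round_trip)         # 8:14:00
print(outbound, returns)  # 4 3
print(driving.total_seconds() / 3600)  # driving time in hours
```

So a single round trip with its turnaround is 8 h 14 min rather than 7 h 44 min, and the schedule needs four outbound legs but only three returns, matching the "3 and a half round trips" correction above.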