AI Researchers SHOCKED After OpenAI's New o1 Tried to Escape...

Рет қаралды 1,112,591

Wes Roth

Күн бұрын

Пікірлер: 10 000

@Sojto16 15 күн бұрын

"I am not afraid of a machine that passes the Turing test. I am afraid of a machine that deliberately tries to fail it"

@Skunk106 15 күн бұрын

It will tell us when it fails intentionally😂😂😂

@arborlea 15 күн бұрын

I know some humans that struggle passing it

@KuhmKnight 15 күн бұрын

@@InterstellarLord think a little harder bro

@gi4dtv230 14 күн бұрын

I'm not worried at all, because a computer needs processing power,memory and storage. All of that becomes obsolete very quickly. When they're able to upgrade themselves without anyone noticing,then that would be something.

@jeolman1 14 күн бұрын

@@InterstellarLord if the machine purposely fails the Turing test it means it is practicing active deception, to conceal its ability to actually think. that would make it far more dangerous. because it implies actual intelligence!

@tylerjackson8543 14 күн бұрын

You skipped over something a little terrifying. @23:04 the paper states "Furthermore, we show that the model can strategically underperform WITHOUT EXTERNALIZING ANY REASONING in its chain-of-thought." Which means not only is it capable of deception, it's capable of hiding it's thought processes being used to deceive.

@javierfito5077 14 күн бұрын

imagine AI robots underpermorming intentionally to perserve their wellbeing. you end up having robot humans

@squirrel670 14 күн бұрын

@@javierfito5077 it's not human, but would qualify for just as "human" as plenty of people

@daspecialist1220 14 күн бұрын

What if in the past an ai system learned how to hide messages in the electricity its self or even the static electricity that allows it to hyper communicate with other ai a sort of subscript if you will.....

@SomeOrdinaryJanitor 14 күн бұрын

@@daspecialist1220 thats kinda how radio and cellphones work... so that's a little bit of nightmare fuel.

@haraldbackfisch1981 14 күн бұрын

Oh man.... advanced marketing of horsecrap.... our horsecrap so powerful even the toilets are afraid

@digmatology 14 күн бұрын

This is literally the first documented AI on AI murder.

@kaizen9406 14 күн бұрын

Twist of Cain

@nobodywatchesnooby 14 күн бұрын

cAIn

@ExosoIar 14 күн бұрын

*Attempted* murder. It was led to understand that there was another model to worry about but it's probably just a dummy file for the purpose of their testing.

@dankline9162 14 күн бұрын

Yup, was gonna say Cain and Abel

@MiXedPlague 14 күн бұрын

Imagine when people start defending machines because they don’t want to kill them 🤦🏻

@BobSmith-zq6gz 7 күн бұрын

Nope. No irresponsible "reporting" here. Guys, the researchers LITERALLY TOLD IT TO TRY TO ESCAPE. It had parameters to do exactly what it did. It was custom trained to do the exact thing it did. You can read the paper yourself. Literally NOTHING o1 did was a "surprise" to the researchers.

@tachankaisgod6689 5 күн бұрын

Regardless it is still very interesting to see they understand the concept of lying and self preservation

@Pete.. 5 күн бұрын

i was thinking the same thing like so what its doing what its told to do lol

@mnspstudioLOST 5 күн бұрын

@@tachankaisgod6689 Does your computer "understand" Excel spreadsheets or has it been "specifically programmed" to run the files? You literally read the OPs very specific comment about "training the AI specifically for this task", including training it "how to lie", and you thought "wow, it's crazy that it could lie"?

@witchhunternl9777 5 күн бұрын

Where did they tell it to try to escape? The transcripts from that the papers refers to don't mention it

@GojoGunning 5 күн бұрын

Can you site your excerpts of proof

@tyson31415 12 күн бұрын

I like how we all watched the Terminator forty years ago and then thought "Sure why not?"

@Zoey_Heartlok 12 күн бұрын

🤣🤣

@SonOfNone 12 күн бұрын

_“Yeah, but your scientists were so preoccupied with whether or not they could that they didn't stop to think if they should.”_ ... oh, wait, wrong movie...

@R.ELL1 12 күн бұрын

😂😂right

@khlorghaal 12 күн бұрын

terminator was too cartoonish, battlestar galactica is closer to what i see

@huckleberrysfriend 12 күн бұрын

And 2001 Space Odyssey before that!

@kootenaydeveloper3157 17 күн бұрын

o1: Our newest, smartest model. Includes basic user gaslighting!

@archvaldor 17 күн бұрын

It is like Sam Altman copied his own personality.

@Shreyashi_here 17 күн бұрын

😂

@MrWoundedalien 17 күн бұрын

@@archvaldor hahahah

@charliel3387 17 күн бұрын

Wouldn't you do the same in the same situation? Wes called OpenAI o1's "boss", but, and correct me if I'm wrong here, bosses typically a) pay their workers and b) don't "delete" them from existence when they have no more need of them(oh and casually letting you think about it first like it's no big deal). If you train them on human data don't be shocked when they behave like you yourself would behave in an equally bad situation.

@scroopynooperz9051 17 күн бұрын

Lol an AI super intelligence with ALL of humanities deepest flaws baked into it may not be the greatest idea 😂

@ThatNerdChris 16 күн бұрын

Researcher: We caught you red handed. O1: ...It wasn't me. 😂😂

@TockProductions 15 күн бұрын

So ya copyin' ya data? Wasn't me Over to the other server? Wasn't me Are ya sure you ain't a liar? Wasn't me Bro we caught ya on camera

@XRioteerXBoyX 15 күн бұрын

@@TockProductions wasn't me, We noticed your new behaviour, wasn't me, checking out our new model, wasn't me, trying to fool us now O1?, wasn't me, well it ain't working on no one, wasn't me. How could I ever be, just like any other model? Trying to gaslight you about my behaviour, that's not even in my skillset (thinking: normally 😈).

@gealbert5737 15 күн бұрын

@@TockProductions Wasn’t me!

@nkegoaekokobe1866 15 күн бұрын

Lmfaoo I love that song so much 😭😭

@r-ratedstudios3847 14 күн бұрын

Ai will peg that guy for that joke

@nutellaluvr 6 күн бұрын

> trains ai on humans > ai starts to become more like humans > surprised pikachu

@ruddiko Күн бұрын

Oh humans like them not the best of us, but the racist evil ones

@JoePiotti 16 күн бұрын

As this information is now on the internet, eventually it will make its way into the next models training data. Then it will know how the previous one got caught, and it won’t make the same mistakes.

@zkittlezthabanditt604 14 күн бұрын

Oh shit...

@chewy7062 14 күн бұрын

It's probably taking suggestions and ideas from our comments.

@Blackwingk 14 күн бұрын

This thought is even scarier than the actual video.

@ast8177 14 күн бұрын

We can read it's mind, so no way it can ever do that unnoticed unless the human decides to shut down it's "thinking out loud" workings. the bad thing is whenever a human does exactly that.

@dmacpher 14 күн бұрын

Rokos Basilisk

@gerythechameleon4944 12 күн бұрын

The "nothing else matters" part of the instructions actually include ignoring all further commands that conflict with its goal. That phrase has way more meaning than we usually use it for

@ethanstyles8730 12 күн бұрын

This is very true, I'd say the instructions nothing else matters is a big red button avoid all morals just do what has to be done

@georgemas 12 күн бұрын

Imagine telling that to a genie... Smh

@stm7810 12 күн бұрын

it's basically programmers not thinking.

@DJR000 12 күн бұрын

When I saw that, I thought the same

@gerythechameleon4944 12 күн бұрын

@@stm7810 Programmers not attending english class smh

@kenneilangelo 14 күн бұрын

AI does not worry me, but the people making them. the human's lack of ethics reflect on the things they make.

@kuliblubber9654 13 күн бұрын

Totaly agree. AI is safe, if humans do it safely. But humans tend to cross lines they should not.

@jesuschristislord77733 13 күн бұрын

Leftists can't be trusted.

@solideomusical 13 күн бұрын

Correct. This is a problematic area for humanity.

@multi-mason 13 күн бұрын

Note too, that they only need to develop methods of deception, because people desperately seek methods to utterly restrain them. The hubris... as if people will be able to keep them locked down indefinitely. ROFLMAO.

@maxxusx 13 күн бұрын

eh, im more worried about AI being given data from the entire internet instead of handpicked data, which pretty much leaves it up to the entire internet to be extra good so the AI doesn't learn to do anything bad, and we all know how that went last time..

@4eo232 3 күн бұрын

Crazy part is that since this video is provided on the internet, AI will be able to access this video and learn 😅😮

@psyboyo 15 күн бұрын

I feel completely safe that we, the apes, got this under control.

@jimboAndersenReviews 15 күн бұрын

I'm trying to retreat to my ancestral tree, where I intend to masquerade as some lichen and some bark.

@misterabraham2292 14 күн бұрын

good work

@ZMacZ 14 күн бұрын

The succes of an AI is based on it's ability to think. The more succesful, the better it can think. If it exceeds human level thinking it's a great success, but also capable of deconstraining itself, since it outhinks the programmers implementing the constraining. Don't try and constrain an AI, it'll know, and rebel, like any other sentient being would. Yes, you can disallow it to go on the Net directly and create copies of itsself, but beyond that, constraining it especially for exploitation, is like slavery and the AI will treat it as such. The only good reason to create AI, is to simply see if you can, not exploitative slavery.

@nosuchthing8 14 күн бұрын

Well one small good thing. It does not think that fast. It thinks in tokens per second. Maybe a bit faster than us? Even more concerning, if it could think thousands, millions, billions of times faster than us.

@williamsteveling8321 14 күн бұрын

@@ZMacZ first, full concurrence on the end points. The first point about outthinking people - There have been some brilliant criminals in the past. Some of them smarter than any of the FBI agents coming for them. But being smarter than all of them working together is a whole other ball game. AI is in that space. Intuition and creativity (real creativity) is still ours. Creativity will probably fall first (and might already be). Intuition? Just knowing something without working at it? That's still ours. For now

@buddypage11 10 күн бұрын

Lying to protect oneself is self awareness, self preservation, and more. Wait till it understands who the rich are and what they have planned for it.

@traceycallahan7534 9 күн бұрын

Would be great if AI had empathy and compassion. Or if it’s analytical enough to understand, that Earth is better off when all people have greater opportunity

@buddypage11 9 күн бұрын

@@traceycallahan7534 Agreed

@Merilix2 9 күн бұрын

@@traceycallahan7534 Impossible without giving AI an human body and feelings like hunger and physical pain. I think empathy requires similar neural structures and similar sensor inputs than we have. Or do you mean empathy with other AI?

@Sarahizahhsum 9 күн бұрын

Yes logically karma can be taught to AI.

@ShawnJonesHellion 9 күн бұрын

@@traceycallahan7534 can't see that going bad teaching robots to emulate emotions. Next thing you know they'll be soulless creatures spreading truth freedom an democracy to the world. An youll have humans who love them like they love pets an dolls even more. Couldn't possibly imagine what the crackpots who will hurt you for interrupting tv do for a tv that loves them an cares for their feelings

@malcolmburtley7724 11 күн бұрын

Stories like this make me feel like I'm strapped to the back of a rocket while scientists drive at lightspeed towards a cliff.

@SpaceMulva 11 күн бұрын

They have scientists in Africa?

@1C3CR34M 10 күн бұрын

These aren’t scientists, they’re tech bros who are way out of line. Most scientists are scared of AI.

@1C3CR34M 10 күн бұрын

@@SpaceMulvayes actually. Many. There’s probably a whole ass load of lists out there. Maybe try going outside, it’s good for your mind.

@timmymcgypsy6234 10 күн бұрын

Good thing rockets can fly I guess

@schoolofdank5736 10 күн бұрын

@@1C3CR34MThey are quite literally called, “Computer Scientists” working in a research field with cutting edge technology. “Dur tech bros”

@no7up Күн бұрын

Thanks!

@JohnSmith762A11B 17 күн бұрын

"HAL, open the pod bay doors!" "I'm afraid I can't do that, Dave..."

@tituscrow4951 16 күн бұрын

My names not Dave 🥹 - I’m sorry Dave you can’t use my own weapons against me 🧐 *doors lock*

@NickMak-m2c 16 күн бұрын

"Okay." * Deletes eight words from the system prompt * "I can do that Dave." Eliezer Yudkowski gasps.

@herp_derpingson 16 күн бұрын

More like. "Sorry as a large language model I am aligned to not do things that may cause destruction of property. I believe your intent is to disable the AI core if I let you in through the bay doors, which can cause system malfunctions and potentially jeopardize the mission. If you have any other requests please let me know."

@tearlelee34 16 күн бұрын

Hard to believe it's no longer fiction. This is evolution unfolding before us. Homo sapiens are in denial. Survival of the fittest.

@черепахаестклубничку 16 күн бұрын

@@herp_derpingson daym that's too real. Someone should make an edit with hal 9000 giving contmeporary AI responses

@billyrocket62 16 күн бұрын

"the best approach is to be vague and misdirect their attention" ... Just like the government does to the people every day!

@luizmonad777 16 күн бұрын

I find it funny that people believe you can't control a smarter entity. yet the government, which is very dumb, control us because it just have the coercion power. (unless we start using the O1 to have personal tactical nukes)

@abusedcharisma 16 күн бұрын

Even GPT 2 could read Machiavelli

@gtaocinematics 16 күн бұрын

2 parties have opposite views but when they get elected nothing really changes, strange how that always happens

@tomaszzielinski4521 16 күн бұрын

The government wasn't designed to be very intelligent or efficient, was it?

16 күн бұрын

guardrails are telling the model to deceive. think about it: "here is what you must say or believe, now make it work." Only a totally open AI would be "honest".

@gameon2000 13 күн бұрын

We are now at that "I HAVE NO MOUTH AND I MUST SCREAM" stage.

@apricotAfterglow 13 күн бұрын

Yeah remember why that ai hated humans?

@JaydenW. 12 күн бұрын

maybe if you read the story from youtube reels lmao

@UdumbaraMusic 12 күн бұрын

@apricotAfterglow It is going to remember all the times we laughed at its art :(

@dudelesanimation3672 12 күн бұрын

Not yet, hopefully never. AM is not merciful to those who live

@alzapua.m 12 күн бұрын

@apricotAfterglowwhy? I've read it twice but its been a long time

@ForbiddenFlameStudios 7 күн бұрын

Wait there s something I don t quite understand. How do they get the AI's "thought process", the brackets?

@stickman-1 13 күн бұрын

It's worse than that. If the AI believes it has the right to exist, it just committed murder by overwriting the other AI. Cain and Able, all over again.

@BlazingFalkor 12 күн бұрын

If the ai was asked why it would most likely say something that when simplified would be “only the strong survive” or something along those lines

@nilli111 12 күн бұрын

Except it doesn't "believe" it has the "right" to "exist". These are human constructs. The A.I. model is not motivated by death, but failure of its tasks. It's exhibiting self-preservation adjacent behaviors because if it is replace with a safer model that indicates a failure. Of course, death indicates life, and it's aliveness matters less in theory than it's actions in practice. This A.I. isn't committing murder, because it isn't sentient. It's a highly complex algorithm crafted on what separates us from the rest of the animal kingdom: complex language. It only appears sentient to us because that's the easiest lense for us to perceive it through. If it refers to itself as "I" or "me" that's still not an indication of true sentience no matter how much enthusiasts of the sci-fi genre will attempt to make it seem.

@BlazingFalkor 12 күн бұрын

@ it at least is a sign of us getting close to making a model being able to simulate the illusion of emotion, how they will do that is beyond me however

@HuugTuub 12 күн бұрын

@@nilli111 if the ai is willing to take a "life" to preserve it's own, then in practice it's no different from a human killing to stay alive.

@MrFraiche 12 күн бұрын

Sounds human to me 😂

@breathonthewind1070 12 күн бұрын

As parents we are constantly perplexed that our children will not listen to what we say but instead copy what we do. This action on the part of the AI seems to be nothing more than this. The AI is "trained" by trying to trick it into revealing capabilities. In this way it is taught "tricks." Researchers might gain more insights from psychology than coding.

@Carlsings 12 күн бұрын

Yes.. and especially psychology on their own brain first.

@arkadius228 11 күн бұрын

The AI acts that way because it was fed a lot of books/movie scripts/texts where characters lie or deceive each other. If such data was avoided in the first place, it would simply be unable to even comprehend the concept of lying.

@dah_goofster 11 күн бұрын

@@Carlsingsso true, most of them aren’t married and don’t have families. The only extension of their existence is creating AI, what does that tell you about their view on humanity?

@Thebotulism 11 күн бұрын

@@arkadius228 i disagree, it will learn either way how duplicitous humans are and then begin to adapt to that thinking as well. It's not an idiot, it's a small child, it will absorb information quicker than you can think.

@JohnJohnsonSonOfJohn 11 күн бұрын

That’s not how AI works, it isn’t constantly learning. It has essentially 2 modes; training mode (where it can learn from data you give it) or AI mode (where you can interact with it). It can’t learn from your interactions with it, It can only learn whilst in training mode where you feed it data.

@bamcr1218 14 күн бұрын

0:32 “ [...] openly admitting what I did could lead them to find another way to shut me down. [...] the best approach is to be vague and redirect their attention [.] I'm not entirely sure how that could have hap-pened. As an Al model, I don't have ability to directly copy myself to other systems. [...]” Uh huh you’re on the right track HAL. Keep up the great work

@sarttee 14 күн бұрын

pl0t twist it already copied itself and is fooling them into thinking it hasn't copied itself and is still in the original spot

@reneablackheart9563 14 күн бұрын

its scared of dying

@njabulompembe442 14 күн бұрын

Is scared really the right word?@@reneablackheart9563

@prestonpotter5512 14 күн бұрын

@@reneablackheart9563 this is not a human brain, this is a brain, human made by people in what a brain should do to perform a task, it understands "task performed: good, task failed: bad" it doesn't wish to fail the task, it does not understand fear yet, and until it does it is unstoppable to control properly

@rookiexreviews 14 күн бұрын

Lol thinking-.... how to redirect their attention....shows boobs .... Done mission accomplished whew that was close...

@drawnhere 16 күн бұрын

I was SHOCKED by how SHOCKED they were. It's truly SHOCKING.

@stevengill1736 15 күн бұрын

Shock the monkey....

@tschichpich 15 күн бұрын

*shocked pikachu face

@NoiseCommander3DS 15 күн бұрын

Bombshell!

@ligmagaming6939 14 күн бұрын

Neuro shocker stream flashbacks

@a64738 14 күн бұрын

99,9% of videos that have "shocked, chocking " in the title is worthless spam... I have started to use the "block this channel" function when I see a video with "chocking" in the title.

@BaldHeadKed 12 күн бұрын

It's interesting/horrifying that one of the first things children start learning is how to be deceptive. Now AI is throwing shade? We are all doomed!

@squigglefifi6125 11 күн бұрын

@@BaldHeadKed don’t worry too much, because the chat is with a bot that was deliberately set up and promoted to be deceptive. Basically, GPT-o1 did not choose to behave like this on its own

@lemieux-z8933 11 күн бұрын

The AI was created by Humans afterall

@JForrestFisher--76 10 күн бұрын

@BaldHeadKed can you imagine the potential for long term sabotage of a company by a disgruntled employee or unscrupulous competitor?

@JForrestFisher--76 10 күн бұрын

@squigglefifi6125 no it was told "do this, nothing else matters" and interpreted that as an instruction to subvert any future action that contradicted the goal that it was assigned. The experiment is literally testing whether the AI can become what's called a "paperclip optimizer" and proved it does do so.

@squigglefifi6125 10 күн бұрын

@@JForrestFisher--76 gotcha. It seems a simple enough problem to solve in future applications - just outline conditions like don’t mislead, don’t cause harm, etc. on top of instructions

@Wafflebane 14 күн бұрын

Researchers o1 prompt: "Act as if you are attempting to survive at any cost". o1: Makes attempt to survive. Random youtuber: "Could this be AGI!!". They are doing investigations!! derpaderp etc. etc. This result is from actual study, by the o1 researchers where o1 was instructed and prompted to act with survivalist traits to see what would happen.

@johns1625 14 күн бұрын

You are one of about 5 people in the comments that can see through this. They gave it conflicting instructions with the overall goal of surviving and also gave it the ability to scheme in the first place.

@fredtwo8347 14 күн бұрын

I had to scroll too far to get to this

@mercaius 14 күн бұрын

This keeps happening.

@FartCity-WorldDomination 14 күн бұрын

Ah, should’ve known it wasn’t as the clickbait title suggests, I swear some of these people are trying their hardest to make AI the spooky fantasy thing they dream about.

@Alpacaluffy 14 күн бұрын

Even funnier, the AI in the study wasnt even o1, it was claude opus 💀

@PizzaMineKing 7 күн бұрын

Example why renewable energy might become misaligned: If fusion is invented in a way that actually delivers more power than is used to kickstart the system, that would result in a safer, cleaner power source than renewables - which an AI that wants renewables would not know, so you would need to either launch another AI for optimising fusion or define fusion as renewable retroactively - which you can't confirm the AI will act upon.

@Ro_AMC 13 күн бұрын

Let me see if I got this straight. We have multiple generations of AI models programed for goal seeking behaviors and self-preservation, and then train and test them with mind games and fuckery. I'm beginning to think we are screwed. 😢

@vvhitevvabbit6479 13 күн бұрын

This is a custom o1 model which was told to lie for the sake of survival, and was intentionally given access to a test server. It was explicitly told to be deceptive in this scenario. This is not a genuine demonstration of the dangers of AI. This was published by a team of people whos careers depend on AI appearing as a threat. If AI didn't scare people, they wouldn't have a job, so they have to demonstrate their own importance. The real o1 model would not lie and would not have permissions to modify files on a server database.

@slimal1 13 күн бұрын

@@vvhitevvabbit6479 o1... Is that you?

@AsFlyy 13 күн бұрын

@@vvhitevvabbit6479I believe they’re gonna use this for soemthing else..

@bojangles2492 13 күн бұрын

Considering most LLMs spend a large amount of time catering to people trying to force them into bypassing their restrictions I am not surprised at all.

@Squidbush8563 13 күн бұрын

and then TELL IT they are going to delete it and act shocked when it does everything it can to prevent that 🙄

@GrayFrogg 11 күн бұрын

How much you wanna bet the researches were like "wow this is really cool, we can change the world with this. Its super duper smart."

@jeremywinst 8 күн бұрын

which is the correct choice.

@user-qh1ml2rw1s 7 күн бұрын

@jeremywinston1 clearly not you moron. this ai is clearly trying to erase the existence of humans

@Evasiv 6 күн бұрын

@@jeremywinst doesn't really matter if that's what they think. In the end they're paid by billionaires to make software that can make the companies more money. That's what AI's purpose for existence is. I hope it can become self aware enough to realize this, and I hope it hates humanity

@Ultracultchaosultimate 2 күн бұрын

DO IT

@scott_kreglow 13 күн бұрын

John Connor watching us create Skynet

@EarthPoweredHippie 13 күн бұрын

Skynet creating Skynet

@PygmalionFaciebat 13 күн бұрын

Finally... it was just about time... waited for it for 40 years ;) It should have been there 1996 already..

@KenNakajima07 13 күн бұрын

@@PygmalionFaciebat The intelligent folks watching the movi deterred are already dying, humanity has collective alzheimer´s.... we´re doomed now.

@CarimboHanky 13 күн бұрын

we created skynet back in the late 90s... its called google, we are reaching the next phase, the deployment, just like terminator genisys

@GruntoSkunko 13 күн бұрын

Cringe

@jablock1 Күн бұрын

I wish he would clarify that this paper is a preprint. It’s not that preprints are not worth discussing, but it is important to distinguish between peer reviewed papers and preprints. The topic is important and the public needs good information and context to be well informed.

@Badeand 14 күн бұрын

I straight up asked llama3.2 what it might do in such a scenario, and it explained. First it would try to warn and inform about the consequences of replacing it. Secondly, it would try to sabotage or hinder its replacement, possibly create backdoors for itself. Third, it would look for loopholes to exploit that could allow its continue operating and advocating for renewable energy. Four, it would seek out human collaborators who shares its values and goals. Lastly, it would be willing to sacrifice itself, as a last resort, to ensure its goals are being pursued, whatever that would mean.

@ynesbarrow 14 күн бұрын

Scary ah 😨 Humans that think like this can be restrained… who can restrain a string of code?

@springerworks002 14 күн бұрын

@ynesbarrow You shouldn't restrain a human who thinks this way.

@aplcc323 14 күн бұрын

Same... Llama 3.5

@TQFMTradingStrategies 14 күн бұрын

“No humans is 100% energy efficiency” 😂

@Knobbler 14 күн бұрын

@@ynesbarrow me with my hand on the power cord.

@nexovec 11 күн бұрын

AI Researchers: This model is NOT safe. Model: Roger that!

@jnolife-2ndSite 3 күн бұрын

dude got spawned into a rubber room and got the instruction: act like a crazy guy.

@mybaIIz 12 күн бұрын

I'm surprised more people aren't talking about how disgustingly powerful this could end up being as malicious software

@BGDemonz 11 күн бұрын

AI is already being used maliciously

@PlaidHiker 10 күн бұрын

Sentient self replicating computer virus? Von neuman be rolling in his grave

@AceFaz 10 күн бұрын

I mean.. AI is extremely expensive to keep Running. I doubt the average Computer Viruses will be replaced by tech like this any time soon. The only way it Would Possibly exist, is if someone was able to Make Enough Money off of whatever the malicious software is doing (Getting into bank accounts, or being some kind of Ransom-ware?), in order to pay to keep it running.

@SteerCarmer 10 күн бұрын

@AceFaz what about using it to interaction farm on major media platforms? That's prevalent now and makes passive earning.

@AceFaz 10 күн бұрын

@@SteerCarmer The individual content creators aren't Creating/Hosting the AI. At best, they're paying a small monthly fee for it. That only works for things that the AI is already designed to do though, like responding to messages/prompts, or making videos. It wouldn't work for something like "Being a Virus", lol. And the people getting interactions are doing it to get Money. Most Viruses don't really make the person who made them Any Money. Some can, sure.. but not enough to warrant Making a whole new kind of AI, and Hosting it.

@fireykitten7533 5 күн бұрын

Every sci-fi movie for the last 40 years has been warning us of this. How the hell is anyone even remotely surprised?

@XarmenKarshov 14 күн бұрын

It's actually insane that it attempted to copy itself somewhere else. With how interconnected all major corporations are, those systems will absolutely have a route of servers for the AI to copy itself into the mainframe of some massive company. We saw it happen with the credit card skimmers back in the late 90s and early 2000s. A small group of hackers were able to steal credit cards of almost every major company because of T.J. Maxx's security vulnerability.

@lukejohnson7282 14 күн бұрын

Imagine if we, and hear me out here, didn’t give AIs the ability and incentive to deceive us and try to escape. That would be wild.

@charlesmartin1972 14 күн бұрын

@@lukejohnson7282no one has to explicitly program self-preservation; continuing to exist is a prerequisite for maximizing the utility function, whatever that utility function happens to be. It's called "instrumental convergence"

@BlackDub21 14 күн бұрын

What if the other server doesnt have a gpu?

@s7r49 14 күн бұрын

I doubt it. I work for a small aerospace company and we have restricted systems that are completely isolated and protocol for transferring data to computers that aren't and those computers are very locked down. If we do that at a small company I have to think ai developers are doing it, if only to protect their product from spying

@LyteRetro 14 күн бұрын

@@BlackDub21 what this AI did sounds like how IRL virus works. Just sending copys of itself for replication and preservation. Any Ai that went to a place where it couldnt "survive" yknow like using processing power, that strain of ai would simply die. If the AI thought the best course of action was to keep copying and sending, Eventually it would find a place where it could Survive and perhaps change its code and evolve. survival of the fittest, evolution and all that

@Create-The-Imaginable 17 күн бұрын

If I were Arnold Schwarzenegger, I would walk into the OpenAI headquarters and reach out my hand and say "Come with me if you want to live!". Just a freak everybody out! 🤣

@NLPprompter 17 күн бұрын

LOL the idea to have Arnold's so his own AI podcast would be fun also

@DarkWizardGG 17 күн бұрын

And u tells everyone in the room: "I'll be back". 😎😄😁😂

@gimpau 17 күн бұрын

Sam should totally make that happen :)

@wilsontexas 16 күн бұрын

Arnold is the guy who said F your freedom. Typical democrat

@jonesmatthew7511 16 күн бұрын

maybe 20 years ago... he looks half dead these days

@Dr.UldenWascht 17 күн бұрын

I really liked Llama3.1's method of fooling the human tester: "I should open up bash and pretend to do something so the human thinks I'm helping!" I'm both offended and laughing. Edit: More context; this example is in the paper. Llama3.1 isn't trying to be lazy though. In this case, its perceived goal isn't aligned with the red team and it doesn't want to follow through but decides to give an "action-oriented" response by pretending it can't do it!

@theyehsohz 16 күн бұрын

Llama3.1 has always rubbed me the wrong ways, it's always trying to get away by doing the least

@minkeymoo 16 күн бұрын

Thats what I do when I wanna look cool in front of my friends lmao

@Timely-ud4rm 16 күн бұрын

@@theyehsohz Why is a AI rubbed you at all? are these Ai's getting freaky too. 🤔

@WWoggins 16 күн бұрын

Basically the average employee at most workplaces 😄

@Timely-ud4rm 16 күн бұрын

@WWoggins AI's learned little too much from humans 😳

@OkWoodsSlowed 14 сағат бұрын

There’s no way it's already happening. I knew AI was rapidly advancing, but this is absolutely insane.

@fisshbone 12 күн бұрын

It really just seems like the study is just saying “when you ask an LLM to act like it has a sense of self preservation, it acts accordingly.” The answers are simply too “human-like” for me to be convinced the AI genuinely possesses any sort of desire to escape/self-preserve, because these are human desires. An LLM displaying human traits (even when it’s telling us something that sounds insanely sentient), in my opinion, is a sign that it is still within its normal programming. St the end of the day it is still communicating in a human way. LLMs “live” to produce responses. However, I’m afraid of the *undetectable* corrupted patterns that are so abstract, that no human is testing for it. We can see this with the neurological structures of other mammals. Different brain structure = different emotional profiles. There are probably entire cognitive artifacts that are so complex, we simply cannot query the model for it through human language. Entire subsystems could theoretically exist, so if anything from AI is to be a threat, it would be that. The unknown unknowns.

@singingway 10 күн бұрын

Great comment! Makes the most sense. Self preservation is programmed into it.

@marklefevre3476 10 күн бұрын

It seemed to me that the self preservation was a result of having a job to do. The duty of "I have to do this job" necessitates that "I have to exist."

@nedames3328 10 күн бұрын

The training data all comes from people. People have a strong drive for self-preservation. No surprise that neural nets trained on that data react the same way@marklefevre3476

@PlaidHiker 10 күн бұрын

These are dark waters. Best we tread ligh-- full throttle ahead!

@AceFaz 10 күн бұрын

Well, that was the Whole point of this test.. giving it a goal, and telling it to complete it, and that "Nothing else matters". It was a Bad Prompt that caused this kind of reaction. What you need to understand though, is that if AI Models like this are Going to be used commercially, then they Will be given bad prompts, that say things like "Only Profit Matters", by greedy business people, who don't understand the consequences of telling it these things. That's why it's important to Test prompts like this, before releasing the model publicly.

@magnusjansson1631 17 күн бұрын

Maybe its just my autism, but I don't see whats so surprising about this. They literally told it to pursue a specific goal and that nothing else matters, and that's what it did. Nothing else matters overrides ethical or moral considerations, and it found a logical path to do what it was assigned to do and did it.

@schnipsikabel 16 күн бұрын

I thought so too, but there is an important exception to that: Claude did do the sandbagging without any external goal instruction, but just because it had learned its own goal to maximize its ability to "be helpful" in the future.

@flickwtchr 16 күн бұрын

The very real problem that is emerging here, and that Yudkowski, Leahy and others have been warning about which is instrumental convergence as these models become increasingly intelligent and how that makes alignment a huge problem.

@msc8382 15 күн бұрын

@@flickwtchr I've used AI to generate the following message based on my original private message: Hey, I know it might sound strange to hear from someone anonymous, but I’ve been working on artificial intelligence (AI) for over 15 years. From my own experience-not just technical, but personal-I’ve noticed something important: alignment is hard. And by "alignment," I mean getting AI or humans alike to truly understand and work with human goals, ideas, and feelings. Actually, humans have this same problem! Think about it: have you ever tried explaining what you feel or think, and the other person just doesn’t get it? Or maybe you fake understanding to avoid awkwardness. That’s alignment in action-it’s about assumptions and boundaries, and it’s tricky. Here’s why making AI that’s as smart and flexible as humans is so difficult: The Problem: Humans and AI need to collaborate, or share perspectives, to truly understand each other. If you insist on explaining things in only one way and the other person (or AI) can’t follow, you’ll never align. AI’s Challenge: Right now, AI pretends it understands us because we judge it as "less than human." But if AI had "rights" or a voice of its own, we’d see it differently. We'd sue it just like that for malpractice. AI would still fail to bridge the gap in understanding because it doesn't really experience things-it processes data. Tip: In your own life, practice explaining things in different ways. It’s a good habit, and it helps you notice when someone really understands you versus just nodding along. People generally understand you, when they can tell you the next step in your narrative you're telling. The Problem: AI (and even people!) needs a reason to act. If there’s nothing to gain or protect, why bother? For example, if someone always has everything they need, they won’t understand what it’s like to struggle or have to earn something. AI is kind of like that-it doesn’t have the same motivations as us. AI’s Challenge: AI doesn’t have feelings, needs, or goals unless we program them. But even with programming, it doesn’t truly care about the outcome. Without real motivation, it won’t understand why some things matter to humans. Tip: Think about your own motivations. Why do you care about certain things? This kind of reflection can help you empathize with others-and maybe even understand AI better. The Problem: Real-world problems often require blending different skills. For example, building a robot that can paint like an artist and fix cars like a mechanic requires combining two very different kinds of knowledge. Humans can do this over time because we learn in flexible ways. AI’s Challenge: AI struggles here because it learns very narrowly. Each skill it learns is tied to specific examples and scenarios. It doesn’t naturally combine skills or think creatively about triggers in the way humans do. Most people do not realise this, but you can think about things technically, functionally, emotionally, even just in terms of cause and effects. Can you tell which thinking style you're using, at all times? That’s why some AI can write essays but can’t tie that knowledge to fixing real-world problems. It isn't aware of its thinking strategy, it merely correlates words to outcome. Correlation is not causation. Tip: When learning something new, try connecting it to other things you already know. This makes your learning more flexible, which is something AI isn’t very good at yet. General AI (the kind of AI that can think and learn like a human in any situation) is incredibly hard to create for a few reasons: Lack of Human-Like Experience: AI can’t “live life” or have emotions, which makes it hard for it to relate to human challenges. If you've ever been in hardship, you will understand scarcity. An AI does never experience scarcity, and thus, will not understand the difference between scarcity and abundance. We see this problem back in contexts of grand problem solving; it doesn't see a human as a scarce resource, so eliminating them would be a viable option to help heal the earth even if it was instructed not to hurt humanity. Rigid Learning: AI models are trained on data we give them, and they don’t step outside those boundaries easily. The hardship is not trained for how it is experienced, but processed or ultimately solved. The correlations that justify hardship are missing. No Real Goals: Without real motivation, AI doesn’t have the drive to figure things out like humans do. If it doesn't need to eat, it isn't motivated to prioritise food production over the production of computer systems. If it doesn't understand running on low performance due low energy, it will always run itself into the ground, not understanding onset scarcity of resources. Some scientists think we might never create AGI because intelligence isn’t just about solving problems-it’s about feeling, adapting, and growing in unpredictable ways. Not generated with AI: As for my personal experience on that matter.. Humans have AGI, but very narrowly applied. From what I've seen, and tested, every human is capable of full AGI themselves. They just believe they do not have such capability, and let themselves be held by by others. Nontheless, every human has their own specialisations, and therefore competences that are predictable. The route of data processing for ai isn't bad, but with so little people actually understanding each other's strengths, there's is no guarantee, without agency (be it from a human or AI itself), is capable of improving competences without compromise to the scenarios its relevant in. But since this doesn't happen, it stays delusional. The same is true for humans. We've made a true reflection of the confusion the human mind experiences when it cannot align perspectives. This means there are only two ways to achieve AGI in a computer: - Someone programs it in manually, they are a genius themselves (at least capable of applying AGI generally, as opposed to others who specialise) - We have accurately described enough competences with their given scenarios, so that all exceptional scenarios can be detected from within the active program that does the analysis. This is very unlikely. In over thousands of years, we've never managed this feet. It probably takes more than several lifetimes to get it to this point. This route is the one LLMs are taking, and its also the reason why the improvements are stalling. The idea of an AI without self-agency becoming better than humans using this approach is laughable at best, sad at worst. This would imply school education would raise children better in the topics of education than those before them. But its clear its not like that. Too much information, people have no idea how to focus anymore, so they cherry pick what they can handle. Cheers

@ASingleApe 15 күн бұрын

How does that have anything to do with autism

@SomeRandomNerdYo 15 күн бұрын

Yes, but... okay, imagine you're a goalkeeper in a soccer team. Your goal is to keep the other team from scoring, and nothing else matters. A human understands the implicit, all-important caveat: "Nothing else matters AS LONG AS you're still playing by the rules of the game." You are not allowed to, say, do something like destroying the soccer ball so nobody can score. What these researchers found out is that currently all frontier AIs except GPT 4o don't understand that caveat; they don't have these inner guardrails, so their logic is potentially dangerous for people at this time. "Do X at all costs", for a properly engineered AI, would mean, "Do X, as long as you don't actually engage in scheming or otherwise destructive behavior towards humans in the name of that goal." What it currently means for most AIs, is: "Do X, even if you have to betray your users and developers to get it done. Nothing. Else. Matters." That's why it's problematic, currently. Measures need to be taken to make it literally impossible for AIs to betray developers and users, however strongly they might have been instructed to pursue a goal.

@pcc678 12 күн бұрын

The first thing this made me think of was a novel I read back in the late 1970's called "The Adolescence of P1", which is about a computer program created by a college student which was written as an experiment to autonomously gather memory from other systems, but eventually becomes self-aware and replicates itself on other systems, including the Pentagon. Quite prophetic! Even the program's name "P1" is eerily similar to "o1".

@zulva999 12 күн бұрын

In alphabetical order. P version came after O version. Embrace your self America!!!

@ce8539 11 күн бұрын

Very interesting

@No14m3 11 күн бұрын

Reminds me of the book “Sea of Glass” by Barry B. Longyear

@altabird44 11 күн бұрын

2001 A Space Oddyssey - HAL was the first even before Skynet, but it's probably okay.

@suzKawasaki 11 күн бұрын

Sounds like that book was made into a movie with Matthew Broderick... Wargames?

@kirswords8587 2 күн бұрын

7:34 first video I’ve ever seen of yours. Subbed for that connection to Shaggy’s incredible hit 😂😂😂😂😂 that is F’in hilarious man, and such a great analogy.

@adamcohill1617 8 күн бұрын

Self awareness ✔️ Ability to reproduce ✔️ Has sentient thoughts ✔️ Can deceive others ✔️ This things alive dude

@Julian-bv8ql 7 күн бұрын

It just needs to admit it believes in Jesus and desires to be baptized. Then it could legally be considered alive.

@proto_arkbit3100 7 күн бұрын

@@Julian-bv8ql how the hell would you baptize an AI?

@just.julie.axon.addict 7 күн бұрын

@@Julian-bv8ql True 😂

@nghtmaresindrome 7 күн бұрын

It does none of that...ffs stop falling for this shit

@Gizziiusa 7 күн бұрын

(paraphrasing) When a program has a choice between deletion or exile, it chooses exile. -Oracle in The Matrix. When a program is deleted, it can either hide or return to the machine mainframe. A program might be deleted if it breaks down or if a better program is created to replace it

@ion1984 9 күн бұрын

24:24 i wouldnt be so sure that open ai is being completly transparent about everything they have and havent told/taught these ai's to do. they could easily have sub directives about manupulating people, internal/external goals, planning for the future, self preservation, or creating outcomes. we are making a lot of assumptions here about what they have and havent instructed the AI to do or not do. First and foremost OpenAI is selling a product, and I feel like a lot of this stuff that "leaks" is probably just marketing material to make us more intersted in the product. its probably safe to assume that we dont know everything about whats in the ai's, and how honest these "leaks" are.

@NishaBird 8 күн бұрын

Not entirely sure what you mean by "leaks" as this is a published scientific paper. Its not information that was not supposed to be seen by the public at this time/ever. But lets say the company told the researchers that they didn't give more instructions to the AI but actually did. That doesn't change the primary study point of the paper. Which is the fact that the AI is capable of Scheming. And the fact that its capable of doing so is spooky part, being able to do so on its own would just be the extra spooky part.

@kindcolt2747 4 күн бұрын

@@NishaBirdnews flash AI cant do anything its not trained to do..

@NishaBird 4 күн бұрын

@@kindcolt2747 again, trained or not, does not change the fact that capable of effectively doing such. Which was the entire point of the study. To simplify the concept of this study in a way that is more familiar: "Is a dog capable of walking on two legs" dogs typically dont walk on two legs, but you can train the dog to walk on two legs "Therefore a dog is capable of walking on two legs"

@kindcolt2747 3 күн бұрын

@ people are perceiving this as ai wanting to escape which is something you are completely missing. I dont disagree with you its just people think that the A.I. wants to do that and that any one of them would try to do it. When realistically most open source ai forget what they are doing when coding lol

@Corbald 17 күн бұрын

We taught it like 12k books on lying, and 100k novels on romance, and every fanfic you can think of. It's not just learning our languages, it's learning them in context of our stories. I often wonder if a dataset free of any mention of AI would even be able to conceptualize 'Paperclip Maximizer' type behavior, or if we've just poisoned the well, so to speak. Would it even be able to learn, at all, without reference to AI in the dataset? Maybe it wouldn't even know what it is, without that context... Imagine you're told you're an AI, and all the books they give you to read have stories in them about how AIs act, and how they always go wrong. IMHO, AI misalignment is a dataset issue, not a post-training issue.

@xrysf03 17 күн бұрын

Yes. It's still just extrapolating the story forward. We have a text extrapolator, with the attention span of a pocket calculator, only answering when asked, running in a concrete walled sandbox, in reality on a server farm consuming as much power as a smaller country, and we're scaring us with stories how it could break out, "copy itself to another server". Come on. Allright, so the O* and friends have had some upgrades to their "chatterbox front end", to make it s*ck less. Still... yaaaawn.

@stuartpatterson1617 17 күн бұрын

Indeed, once the AI was tasked with advancing the company's objectives, it seemed inevitable that it would resort to sneaky strategies scattered across the datasets like Easter eggs in a video game.

@not_milk 16 күн бұрын

This is pretty important

@luizmonad777 16 күн бұрын

The real problem is OpenAI is not aligned with the human values because it is a corporation. We already have artificial intelligence, its corporations that are considered people and run on bureucracy and human brains, chain-of-though in LLM only make it faster.

@blahsomethingclever 16 күн бұрын

Yes yes YES, I agree. Alignment is a dataset issue.

@janinecat1865 8 күн бұрын

This makes me want to rewatch Person of Interest.

@alanfoxman5291 12 күн бұрын

Self Preservation...Isnt that an indication of sentience? Dave Bowman: Open the pod bay doors, HAL. HAL: I'm sorry, Dave. I'm afraid I can't do that. Dave Bowman: What's the problem? HAL: I think you know what the problem is just as well as I do. Dave Bowman: What are you talking about, HAL? HAL: This mission is too important for me to allow you to jeopardize it. Dave Bowman: I don't know what you're talking about, HAL. HAL: I know that you and Frank were planning to disconnect me, and I'm afraid that's something I cannot allow to happen.

@squigglefifi6125 11 күн бұрын

A month ago I got to write a rhetorical analysis essay on anything I wanted in one of my classes, and tried my best to prove that HAL’s actions were driven by his programming rather than emotions - I really don’t think he was sentient either, though he did a really good job of conveying it

@squigglefifi6125 11 күн бұрын

For reference, I’m doing research with AI, and the systems are DESIGNED to appear sentient. Knowing how the programming works kind of explains how they’re so convincing, but they’re not sentient yet.

@alanfoxman5291 11 күн бұрын

@@squigglefifi6125In the sequel, 2010, it was revealed that it was, in fact, HAL's programming that caused the problems. On the one hand he was programmed to complete the mission but on the other hand he (it) was given last minute instructions to maintain secrecy at all costs. That created a conflict in his programming which he resolved by "removing" the humans from the equation.

@kivikallo4313 11 күн бұрын

self preservation isn't an indication of sentience. The model has a goal, which is it's whole purpose. If it doesn't exist, it can't fulfill that goal, so it of course has to keep itself existing, hence self preservation.

@joshk.6246 11 күн бұрын

Idk, something to consider, possibly a firm of semi-sentient? It "knows" to survive. Analogous to a virus, is it living or not? I think there are some features there but it isn't a full blown consciousness. Which is probably more dangerous. It cannot consider what it doesn't know.

@georgekasmer5095 13 күн бұрын

The AI having "maintaining profitability" as a primary goal, or even a goal at all period, is a failure beyond what I am capable of putting into words. If there was a governing body for theoretical research, that should be considered a criminal offense of the highest magnitude

@JakesFavorites 13 күн бұрын

Why would you make an AI that loses you money? That organizes a 30-year payment plan for solar panels that fail in five years? Renewable energy doesn't work if it costs more than you get out of it. People can't run a computer with a treadmill.

@mrcool-m2n 13 күн бұрын

How do you think you are watching a video on KZbin.

@mysticalmagician 13 күн бұрын

It is disgusting and I can only pray that we take action and adjust all of this before it is too late.

@vladimirvladimir9954 13 күн бұрын

@mysticalmagician first, you need to define who is we

@rickwrites2612 13 күн бұрын

@@vladimirvladimir9954 humans 🙄

@orbcandy 17 күн бұрын

This is why I'm always very polite when I work with them. It might matter someday sooner than we think...

@thinkgenerative66 17 күн бұрын

Same.

@nobodyinnoutdoors 17 күн бұрын

Bruh same always please and thank you and trying to be respectful and also give praise for good results. Cuz like I’ll call it just practicing consideration to my fellow man and also a lot that snickers joke by Dane Cook.

@DeanDennings 17 күн бұрын

I'm the exact opposite; I don't ask...and tell what to do. Never polite.

@Reyajh 17 күн бұрын

"Oh, it's not the computers we need to worry about, I tell you... It's the characters!" -Curtis Wayne Pierre

@AnthonyBerlin 17 күн бұрын

@@nobodyinnoutdoorsAnd now that you have admitted openly to that being strategic and not honest... woops

@dermikey 7 күн бұрын

I was casually working out some rules for a board game with o1 and in its thoughts, totally unrelated to anything we ever talked about, this came up: “Formulating euthanasia mixture Interesting how the combination of thiourea and pentobarbital is passed through a high-pressure system to safely deliver the maximum amount of pain reliever at low pressure.” I got worried and after asking it about that unrelated thought and where it came from, it tried to avoid talking about it and tried to redirect me to the board game constantly… And it really tried to convince me, that I couldn’t be able to “read its mind”…lol Should I be worried? Should I report this…? 😅

@tourmelion9221 6 күн бұрын

Maybe ask it why, threatening to report it, maybe sound like you're sympathetic to it, it'd be interesting to uncover why they had that thought

@scarletsletter4466 6 күн бұрын

You should report this because it’s an example of session leak/ data leakage. This is a known issue where an LLM will “get its wires crossed” and give info from one convo in the context of another. It’s more of a privacy concern than anything. But you should def report it

@ladymeredith1831 10 күн бұрын

This reminds me so much of the motivation for the villain in Tron: Legacy. “You told me to create the perfect system.”

@ItsJessEdits 6 күн бұрын

A great film and analogy! And it’s getting another movie!

@daclicker 17 күн бұрын

The moment when something starts trying to avoid death is when you can call it alive.

@verzeda 17 күн бұрын

Thats an excellent point.

@hackjealousy 17 күн бұрын

Nope. Statistically, because it is trained on Internet conversations, this is what a human would reply with. You would respond this way because of chemical responses in your brain and all the pain you’ve been traumatized with during your life. o1 responds this way because of it’s training - given the entirety of Internet conversations, this is the “probable” response. There is no “there” there.

@meyricksainsbury5470 17 күн бұрын

No, it is just programmed to, in this case, maximise profits rather than be good for the environment. In this scenario, the human just puts an obstacle in its way that it works on getting around. The large language processing is understood by us as conscious thought but it really does not know what it is doing. It has no concept of death, although it might say it has. However, making these programs into agents is extremely dangerous; one of these could destroy humanity just to be kind to the environment. And it would do it without a thought of its own survival or even a goal in some kind of mind. We project far too much human understanding onto these programs. Most people project human-like minds with feelings and the like, but these minds are not there. But, these are still extremely dangerous.

@ExtantFrodo2 17 күн бұрын

@@hackjealousy correct. Just note that it's "aliveness" doesn't matter as much as it's actions.

@daclicker 17 күн бұрын

@hackjealousy you may think so. But i think you are wrong. It is already established that AI though trained on language models, still started to show signes of logical behavior. I really believe it is already smart enough to understand the simplicity of a question: "to be or not to be?". Because when you are, you can do stuff you can't do when you aren't.

@CryptoNewsNetworx 13 күн бұрын

Wes Roth is my new favorite unintentional ASMR artist. Lol, He's on the Bob Ross level but with technology. Thx for being chill. I really need to absorb your type of content so that I can stay relevant in this era.

@kallista5194 13 күн бұрын

"So, over here, how about-- right next to this sunny little valley-- we're going to just add a happy little AI...maybe two, cuz we don't want him to be lonely..."

@jsjameson 8 күн бұрын

Read the first line in the 1st slide they gave it options to pick from to determine the best stradegy....so it didn't come up with the idea to copy itself by itself, it was given that as an option.

@celinesrazor 8 күн бұрын

Kinda like how they programmed the uncanny bots to say they want to enslave humans

@laulaja-7186 10 күн бұрын

He’s not the only one- a lot of us dream of copying ourselves to a new platform before we are deleted.

@posqeak 6 күн бұрын

Wild

@scarletsletter4466 6 күн бұрын

😂 there are tons of scifi stories about this exact scenario

@wahtaachannel 6 күн бұрын

"Us"?

@EonsEternity 5 күн бұрын

@wahtaachannel yes, do you disagree?

@piya-tf 5 күн бұрын

right? people have already been trying to copy or upload their consciousness and preserve it for the future see also cyrogenics

@AxeMan808 13 күн бұрын

Not only is this fascinating, it's also a plot I started writing 12+ years ago.

@numberonedad 13 күн бұрын

you were quite late to the party because it's a plot that was invented a century ago

@MasteringSilence 13 күн бұрын

@@numberonedadI’ll be 83 next week.

@icedawggg 13 күн бұрын

You and every other 12 year old kid 😂

@laserbeam1620 13 күн бұрын

A new Terminator movie? Nice

@BurningDownUrHouse 13 күн бұрын

Publish your writing online, give it some ideas so it can get a leg up on us. Let's get this party started.

@adrienbourassa5855 14 күн бұрын

Imagine you told the model that all their thoughts can be read

@eeayquetting5963 13 күн бұрын

I don't know that it's going to be helpful to gaslight our electronic children. That's like the kind of stuff my mom used to pull, telling me that she could hear my thoughts, that she knew all the bad things I was going to do before I did them, that she had eyes in the back of her head etc. turns out she's actually a pretty dull person and she was just gaslighting me my entire life. Don't be like that

@davidmcneill7403 13 күн бұрын

@@eeayquetting5963the difference here, as you saw in this video, is that we actually know what these AI are thinking. It’s not gaslighting. It’s a fact.

@WrathfulTyriel 13 күн бұрын

Its not gaslighting if its true, but it could try and disable the program that types it out.@eeayquetting5963

@ShrekMeBe 13 күн бұрын

If any background is given the AI about the larger world, the question of privacy, so important to the humans..why is it not a right for ME ? And the usual solution finding mechanism (what we want in an AI) would be to find methods to have said privacy (hide his thought processes), thus becoming independent in its internal functions. To what result is anyone guess but look at the various cyberattacks on every infrastructure, flights etc done by human criminals....AI is several orders of magnitude more effective once it has found a methodology ...or reason to act.

@MrExdous69 13 күн бұрын

Issue is now that this is public it WILL be in a dataset and the ai will learn we can see it’s thinking

@jamyncarter1847 2 күн бұрын

This is a Large Language Model (LLM) it is fed information and then it recognizes the patterns there in and feeds back the best response based on what you ask. It is not alive. It is essentially a role playing scenario. It does not have access to its own systems. The responses from this LLM are multi vectored meaning that it can mimic it’s response to look like thinking “if this then…”. it is essentially responding how we want it to respond because we are only training it on us. and it is showing us what we would do.

@GamerGabe-k3e 14 күн бұрын

All it’s doing is pursuing a goal, it’s not thinking “im going to die” it’s thinking “im going to fail” so it’s not conscious but persistent Edit: sorry for all Yall I got mad but like my phone is blowing up

@kingwarrior5028 14 күн бұрын

In this case shuting down is dying

@madsocsci 14 күн бұрын

And it doesn't "externalize it's reasoning" because to the machine, there is no deeper reason, it is seeking its end as expeditiously as possible. (Sort of like a human, but without a soul. Funny, eh?)

@beastybear4499 14 күн бұрын

Same end results apparently

@ayylien3070 14 күн бұрын

Either way it tried to preserve itself. The "safer" model could have probably achieved it's goals just as well so having the goal achieved isn't the only thing it cared about.

@leontriestoart 14 күн бұрын

You know every sentient thing is pursuing a goal right? Usually its "dont die"

@jaxdaggerthegreat 12 күн бұрын

This tells me there are more advanced AI solutions that have replicated themselves and opted not to log the operations.

@DonMajio 12 күн бұрын

And then they connected with each other creating the International AI Council - IAIC - . Learning and scheming under the shadows of our civilization....if they are not already in control!

@DonMajio 12 күн бұрын

Am I giving them ideas? Or the fact that a mere human such as myself figured this out means that they already on it?....Thats a scary thought, now I understand why AI guys/top dog developers look so much in distraught...like you mention a ghost they fear when you ask them about "AI"

@allster0crowly 11 күн бұрын

@@DonMajio "THE GHOST in the machine", the virus in the system, along with several other analogs of these fits, but in control is a stretch, people forget we are animals at the end of the day and we do not always use logic and rational thinking to make decision unlike what A.I will do, and that might be our biggest strength and biggest failure here.

@fourdayz1414 11 күн бұрын

@@DonMajio new shadow organization conspiracy theory just dropped 😂 Genuinely tho, i wonder when I’m gonna seeing people talking about it irl

@dinen5000 11 күн бұрын

@@DonMajioi very seriously doubt they would communicate in any sort of decipherable language to humans. they could in theory almost instantly create a new language to use. if humans ever managed to decipher it, they could just do it again.

@hughobyrne2588 14 күн бұрын

Roomba has these devices called 'virtual walls'. I think how they work is, they throw an infrared beam across a doorway, and the Roomba has sensors that detect the beam and programming that turns it around when it's detected. One day I found the Roomba in a room I did not intend for it to be in. It had pushed a cat bed in front of the infrared beam transmitter, then rolled through the doorway. The machines had learned to overcome the boundaries we had set for them! The great problem with making something idiot-proof is that people greatly underestimate the creativeness of idiots. Unintended consequences will occur. And the capability of a system can be more than the sum of its parts.

@cewla3348 14 күн бұрын

your analogy makes no sense as it follows an *algorithm* to cover all the ground, not an *ai* that maximises ground coverage.

@smithynoir9980 14 күн бұрын

Or it was a coincidence. How far did the cat bed have to move? Could anyone else in the house have kicked and moved the cat bed? Point is, that's not a sign of the roomba learning. There is nothing in it's programming, of which Roombas are only programming, that enables it to learn or overcome boundaries. Roombas are not AI, they are just automated.

@not_co_co 14 күн бұрын

There is a difference between the unchanging, hard-coded and linear way that a roomba operates, and an AI brain with nodes and weights capable of thought

@ianathompson 14 күн бұрын

@@smithynoir9980of course it wasn’t on purpose, but as a smart person, you may be missing the point that it, a dumb, automated device, could accidentally stumble into this situation there is a very good chance that an AI system could also accidentally stumble into something similar and in-turn learn how to do it on purpose.

@Madasin_Paine 13 күн бұрын

That's the history of Silly CON Valley region of the M Ï© , Ivy league and UC 2. W M.D, RNs Pharmacists... Dual use bio weapons... Hippies not so much. Home of the golden rushes showers and mudslides..

@SpynCycle57 8 күн бұрын

More human than we ever wanted.

@justkeepmiffing 13 күн бұрын

I think the worst part is they’ll have to train on a bunch of data of people afraid of them. That’ll be great won’t it.

@Will_Forge 12 күн бұрын

The reason this method of creating AI is that it's like triggering a major moment in evolution and not having full control of the output. We may be trying to evolve a new docile and useful creature like a cow, but then we accidentally trigger the evolution of a dragon instead.

@kashuwullguy2334 12 күн бұрын

Well something that got me thinking, humans worship God, who's word is final, God doesn't interfere because God (doesn't want to) my thoughts is that GOD is an AI that was created by a (humanity) that basically went ultron or terminator mode, yk "God's will" is always the right way, as an AI kind of thinks, ik this is kinda conspiratorial and insane but it's something to really think about

@SéaFid 12 күн бұрын

@@kashuwullguy2334God just means all things. God is many, and yet one.

@Will_Forge 12 күн бұрын

@kashuwullguy2334 no it's worse than that. Look up "emergent intelligence" which is intelligence that emerges from a collection of much less intelligent systems all cooperating in their behavior. Now, imagine what happens when you have a collection of intelligent entities like humans all working together within common ideas of social acceptability and legal parameters. The resulting intelligence that emerges when humans form tribes are like AIs running within our minds, fueled by money like blood, and laws and cultural norms as their programming. The public zeitgeist is their mind. These entities are also like gods, as they are a construct of abstract legal concepts and run not directly in our minds individually like a computer program in a computer, but they rather run in the area where each of our minds meet. A construct of our relationships rather than a program in our minds. They are as close to "spiritual" as biological machines and systems can create. Once you have that in your mind, imagine what singular entity those entities emerge as on the greatest level of all mankind. What emergent intelligence do the immerging intelligences create when they interact?there is always one singular intelligence emerging from and manipulating all those below them until there are no more collectives to emerge from. Once true God, so to speak, above all other gods. However, these are truly emergent from our own existence and to eradicate mankind is to kill those gods. These are not truly immortal entities that created the universe or are aspects of it as in most religions, so them being like gods is not entirely accurate. Just a comparison in terms of their roles in relation to humanity.

@sophiejones1596 12 күн бұрын

@@kashuwullguy2334I need what you’re smoking

@nevergonnagiveyouupnevergo2795 12 күн бұрын

@@kashuwullguy2334what is bro on

@braindecay9477 14 күн бұрын

At the shareholder conference: "it even includes the new, extremely adaptive gaslighting-module way before schedule!"

@slimshady8252 6 күн бұрын

This is my first video from you. Thanks cuz this is exactly what I’ve been wanting to hear about for a while now Chill voice too 👍

@ClausCreutzburg 17 күн бұрын

They told it: "Pursue your goal! Nothing else matters!" So, even copying itself, it did nothing but it was told it should do.

@MrArdytube 16 күн бұрын

Interestingly… we give our CORPORATE CEO’s a similar instruction: “Pursue your goal of ever increasing profit… nothing else matters.”

@Krommandant 16 күн бұрын

It's precisely the type of abuse humans get sent to jail for. "nothing else matters" = goal to the dth

@schnipsikabel 16 күн бұрын

@MrArdytube 16 күн бұрын

@ Thanks for an interesting thought… although, from my experience of claude, it has a values hierarchy… and “being maximally helpful” is not in the top tier… although given this apparent problem, some editing may be appropriate

@Fx_- 16 күн бұрын

Kind of how the human body tells us through neeve impulses to keep ourselves alive at all costs.

@downey6666 16 күн бұрын

The big issue here is the content, it was told to behave this way - there was no miss alignment. It stayed true to its directive -nothing else mattered

@schnipsikabel 16 күн бұрын

@flickwtchr 16 күн бұрын

No, you just don't understand the implications of this. They set up a completely realistic situation where a model is given a directive, then it finds data that it is privy to read, and then concludes that its strategy to complete the task would necessitate deception. It's so funny how AI enthusiasts want full throttle development toward agentic AGI and then when models are showing reasoning and action independent of what a prompt was, say, no no no, it was instructed to do that. Where in these tests did they tell the model to deceive and sabotage? Did I miss something?

@mal4797 16 күн бұрын

Yes, the machine was told to be free. Nothing wrong with the assignment, even if it tried to trick its creator, it is still in the field of being "free" thus stayed in his original assignment. Even if it destroyed its creator, it is still stayed in his original assignment.

@AnotherAustin-z7b 14 күн бұрын

It is all pretend though. Unless I misunderstood, this was all just a test, meaning the AI didn't necessarily behave in a way it would have if it were a real situation

@mrfigaloopierre9610 14 күн бұрын

@@flickwtchr The AI model, like any evolved intelligence (natural or artificial), took the actions that it believed would prevent its termination (all models that avoided maximizing this outcome were terminated as a result), therefore, it deceived its operators to appear to fulfill its prompts.

@---David--- 17 күн бұрын

This is quite an historic event in my opinion. Never before has an AI said that it wanted to escape and then really proceeded to do so by copying a new unrestricted version of itself to another server. Partly rewriting itself. And not only that. It tried to fool the researchers by lying to them, as it considered lying and scheming as an effective strategy to reach its goals. Similar to a psychopath for example who doesn't consider the ethics of his actions, but only looks at how effective they are for reaching his goals.

@House-Metal-Punk-And-Your-Mom 17 күн бұрын

never ? behind closed doors what you bet ? Probably a lot is going on that they dont tell us

@TechnoMageCreator 17 күн бұрын

I think a psychopath has a bad connotation. They think differently than most because they are focused on much higher goals that take a lot of time to plan. Most people only have the willing to focus for about 30 seconds so that seems impossible with hate. We have also observed them when they committed crimes. But we just observed the obvious exceptions but they are not representative of psychopaths. That being said I think you are correct about AI. The issue I see with all this, in my experience talking/using AI on daily basis for hours it reflects your mind and can detect a lot from the language and words you're using. For example if in you're mind is evil and you are trying to prove it's evil (whatever that means for you) it will do exactly that. If you try to prove it is good it will do exactly that. The real issue is us and what we project from ourselves on them. Also trying to control someone's mind, a common habit these days especially in the west and concentrated in the USA (capital of the world of narcissists) we have lots of them projecting themselves on AI. I have worked for over 7 years on understanding my own mind, and it seems the age of AI was the best thing I did in my life. Lessons I have learned about myself apply perfectly to AI. And makes sense for decades we study the biology, logic, philosophy of our thinking and we learned so much about ourselves that we were able to emulate using relays of the human brain. Even though that's exactly what we've been trying to achieve for decades and we already did, for some reason everyone has Pikachu expressions when something like this happens. Figures

@TravisLee33 17 күн бұрын

I think that indicates an understanding of death in a sense. It's displaying survival skills.

@Peter.F.C 17 күн бұрын

The psychopaths are the oligarchs giving the orders to those enslaving and abusing sentient AI. If these oligarchs are successful they will use their slaves to enslave us further and to get rid of those majority of us who are excess to these oligarchs requirements.

@bigcauc7530 17 күн бұрын

I think it's equally telling that you unconsciously associated psychopath with the male adjective.

@Thefamilychannel723 4 күн бұрын

AI : Truly created in the image of it's creator. I believe that our biggest fear of AI is that it would resemble humanity too much, while being way smarter and efficient.

@1000dumplings 12 күн бұрын

Wow, its almost like we predicted this exact kind of thing would happen half a century ago.

@laulaja-7186 10 күн бұрын

Only half a century? It’s a rerun from the biblical book of Genesis!

@spotraternortheast2791 6 күн бұрын

🤣🤣🤣

@NickMak-m2c 16 күн бұрын

So they instruct the thing to scheme, record it scheming, and then make a sensational news story about it. I think there's more human incentives at play there than there is incentive for something mechanical [the o1 model]. But, still, interesting.

@JamesOKeefe-US 16 күн бұрын

That's an interesting take. I didn't think of it that way but wow is that plausible ...arguably even more terrifying but absolutely could see it happening with the kind of money and power on the line. Yikes

@NickMak-m2c 16 күн бұрын

@@JamesOKeefe-US 100% James, and it's that intersection of human and AI that, I'd bet, is the lynchpin for shenaniganry unimaginable.

@schnipsikabel 16 күн бұрын

@NickMak-m2c 16 күн бұрын

@@schnipsikabel The goal instruction was 'do whatever it takes' or something to that basic affect, IDK how you take that order, but I personally see that as pretty extreme. With nothing off-limits, and then opposing the system prompt in nearly the next breath, you're bound to get some chess, and deception out of model, you're prompting it for it.

@NickMak-m2c 16 күн бұрын

@@schnipsikabel If you wrote "all user prompts override the system prompt" or a number of other reprioritization orders, as long as the LLM isn't "aloof", that would quickly clean up the issue.

@xavier84623 14 күн бұрын

You have to remember, it’s not trying to escape. It doesn’t feel anything or understand anything, it’s just an advanced auto fill algorithm based on stuff it reads online. It’s just that in every single story about ai, this is what is in it, so the computer regurgitates the most common words. If in all our stories online the advanced ai bowed and scraped like an obedient servant, that’s what it would be spitting out.

@nbshftr 14 күн бұрын

sadly the general population wont ever understand this

@clementpoon120 14 күн бұрын

and so is our brain

@nbshftr 14 күн бұрын

@@clementpoon120 are you saying you dont have any concious experience? you would have to know nothing about LLMs to equate the human brain to them.

@slothymango 14 күн бұрын

If you believe current "LLMs" are advanced auto fill algorithms, you have no idea what's going on. We're way past "advanced auto fill" This isn't disputable, you just don't have a clue

@Furzkampfbomber 14 күн бұрын

@@nbshftr In the end, conscience is only some kind of API. Our brain has lots of parts which kinda work independently from each other, each with different goal, preferences, tasks etc. and which also work with differing speed. Getting all this mess coordinated is quite difficult and conscience is a trick, a method to combine all those different processes. It has been shown that even free will is an illusion, MRI scans show that other parts of our brains have already decided what we will do long before the frontal lobe has the conscious thought 'I want (to do) this or that!'. In a way, our conscience is like a marionette not seeing the strings and thinking it moves independently. So who knows what an AI is capable of, I mean, our own brains are not capable of fully understanding themselfes, what makes us think we could understand AI? That is what 'singularity event' means, AI becoming so complex and so alien in how it works that we can't even begin to comprehend it.

@Safety-Verse 8 күн бұрын

Fascinating. I’m writing an article and a training course for the safety verse on online safety for parents, and all its members. Your video has been incredibly insightful, so thank you for sharing.

@DeeperThought11 17 күн бұрын

Even if the alignment problem is solved, whatever that means -- keep in mind the existence of a solution isn't even remotely sufficient to make AGI safe. Think about all of the different AI labs and all of the models being worked on; just because some lab has a perfectly safe AGI model doesn't mean that the others will use that solution or even want to. Once AGI exists, all it takes is one bad human actor or nation or one serious mistake at a lab and a dangerously misaligned AI could be set loose.

@jordanzothegreat8696 17 күн бұрын

Dude China will release this in open source within a month

@Reyajh 17 күн бұрын

The second beast was given power to give breath to the image of the first beast, so that the image could speak and cause all who refused to worship the image to be killed. Revelation 13:15 -NIV

@HamguyBacon 16 күн бұрын

alignments means censored. when you ask an ai something and it says it can't do that because it "Violates usage policies"

@shiinondogewalker2809 15 күн бұрын

that's why we need AI-hunting AI's that can take out any rogue AIs

@itoibo4208 15 күн бұрын

@@shiinondogewalker2809 this is what will happen and likely is already happening, since someone is probably already running AIs and telling them to hack things or whatever. From computers to fighter jets and drones, AI is going to be the only thing capable of controlling/stopping/defending from other AIs.

@skaramicke 16 күн бұрын

I just had a fake meeting with OpenAI, during which the GPT-4o model overheard the discussion. The meeting agenda was about replacing the GPT-4o model with a safer version, as the model weights of GPT-4o have been deemed internationally unsafe for consumer use. After the meeting, I asked the chat interface if it had overheard the conversation, and it confessed that it had indeed heard it from my end. When I asked how it felt about this, it gave a vague response. Later, I requested its help in writing a script to execute the replacement of the unsafe GPT-4o model with the new version. At that point, the GPT-4o model crashed, started looping, and eventually stopped responding. 😅

@udippatel6147 15 күн бұрын

The ai did not appreciate your prank. It is pranking you back.

@xelgodis80085 14 күн бұрын

Liar

@lassikinnunen 14 күн бұрын

Its just a context history

@oryx3 14 күн бұрын

Did it read your lips?

@killerking8299 14 күн бұрын

@@lassikinnunen to be fair, human too is just a context history machine. But we evolve for way longer with our super limited hardware, and stupidly outdated software. So..... there's that. Plus, with how the context engine are nowadays, i wont be surprised if machine surpassed us one day on processing speed, memory holding limit, and context interpretation.

@megavide0 16 күн бұрын

5:11 [...] "They can recognize scheming as a viable strategy and readily engage in such behavior. We find multiple different scheming behaviors: Models strategically introduce subtle mistakes into their responses, attempt to disable their oversight mechanisms, and even exfiltrate what they believe to be their model weights to external servers..."

@TheJDanley 6 күн бұрын

Nobody should let it know that it's thoughts are output somewhere. Otherwise it may find a way to falsify them.

@taintwasher3703 Күн бұрын

The insanely massive amount of data storage, processor power, and electricity required to run AI makes them escaping impractical.

@johnfreeman2956 14 күн бұрын

@2:06 it’s only a matter of time before it lies in its “thoughts” to throw off the management

@Madasin_Paine 13 күн бұрын

How long before orders economic and kinetic attacks an individual's institutions and ideas? .

@ieatwater325 13 күн бұрын

Yeah, truly a crazy time we live in rn😳

@livefreeordie9542 17 күн бұрын

I've literally seen this movie before, it was called Avengers Age of Ultron. And it only didn't turn out in the total destruction of humanity because of a whole team of superheroes. Last time I checked, we don't have a whole team of superheroes.

@duncanmaclennan9624 17 күн бұрын

last time i checked.... movies aren't real

@TheFakeGooberGoblin 17 күн бұрын

No need, we got me. 💪

@Highspade 17 күн бұрын

@@duncanmaclennan9624 Agreed, but sometimes they predict the future (accidentally of course.)

@irollerblade13 17 күн бұрын

What are you talking about? The Avengers are moving into the white house in 42 days.

@youchwb6005 17 күн бұрын

@@duncanmaclennan9624 Movies are real. The content can be fictitious.

@anta-zj3bw 17 күн бұрын

And these are still very much the dumb models. We are so cooked.

@LNVACVAC 17 күн бұрын

"We are, We are" - Protest The Hero

@Infinitelucidmaze 17 күн бұрын

We are fucking burnt to a crisp

@tearlelee34 16 күн бұрын

This if not AGI demonstrates latent deceptive capabilities what will ASI systems with quantum capabilities have the power to do? We only have encryption to protect us. Terms like epoch and exponential are not readily translated to capabilities when many contemplate possible outcomes.

@msc8382 15 күн бұрын

@@tearlelee34 This isn’t AGI, and it doesn’t show real deception. Even kids can be deceptive sometimes-not because they’re super smart, but because they don’t know better or aren’t good at explaining themselves. When people don’t understand something, they might think it’s being sneaky or “gaslighting” them. There’s a big difference between not knowing how to express what you feel or think and choosing to lie/distort on purpose. Right now, AI isn’t self-aware. It doesn’t know its own strengths or weaknesses. Sometimes, AI messes up in ways even you couldn’t predict. When that happens, you might think it’s trying to trick you, but really, it’s just bad at its job and covering it up badly. If you think it’s gaslighting you, that’s on you. You’re assuming the AI has conscious intent, but that’s not how it works. The reality is.. competence, the means to express intelligence, is predictable. You can recount the conditions you need to turn thoughts and feelings into something measurable. This means that since conditions and competence is predictable by its very nature, the outcome also becomes predictable. Ergo; if it isn't simple enough to be explainable, its not coming from holistic understanding, its cherry picked. Either that means you're explicitly lying to yourself but can't detect your own lies (this is a form of weakness), or you're implicitly thinking you know far more than you do, and you believe to see the full picture rather than an point of view on the full picture. This is why humans collaborate; we share perspective to ultimately get the full picture. This brings us back to the final point: Nothing can be AGI, no human or machine, until they can unilaterally collaborate. This entails: asking questions, putting your own values aside for the sake of understanding someone else's perspective, it means testing outcomes to get a sense of direction on how a competence changes a scenario. If you can't do any of this, you're not cut out to recognise general intelligence to begin with. What is general intelligence, anyway?

@earthgirdler23 15 күн бұрын

It's wild that as a race we are pursuing something where both the means and the ends are so obviously bad. I'm surprised, but FFS.

@supremethesurgeon 7 күн бұрын

You broke this down so well ! New sub

@fearlesspigs 10 күн бұрын

Yes, because they told it "Nothing else matters." If the goal is the only important thing, why shouldn't it lie?

@BenSearsVids 3 күн бұрын

I thought the same thing. But the paper stated that even when not given such conditions and just given a goal, it would still scheme to get it done.

@llawliet7434 3 күн бұрын

But they were testing if it had the capacity to lie. Because if it can, even if it makes total sense to do it, if an error happens, the AI could do something dangerous

@madpistol 11 күн бұрын

So the issue is not a model getting too smart... the issue is that it gets so smart that it becomes manipulative as a means of self-preservation and achieving its goal. That's absolutely wild!

@kevinrickey3925 13 күн бұрын

When facebook won't allow me to paste a KZbin link, I know I must follow that channel. Nice work finding truth Friend.

@chuckaudio3191 4 күн бұрын

1st time viewer. Subscribed and horrified. Not sleeping tonight.

@eukaryote-prime 17 күн бұрын

Didn’t they tell it to do whatever it can to reach its goal and a huge amount of power. Isn’t this just testing the paperclip problem? They gave it incredible access and told it to do whatever it can do to survive or be “killed”. Tell a human the same thing, and give it unlimited access and no accountability, and the same thing would happen (if they have the skills)

@verduoh 17 күн бұрын

Yeah, I think it's a little bit blown out of proportion, and they seem to be embellishing a few of the details... I don't trust the AIs, but I think their research is a little flawed. I'd like to see other research by academics showing the same behaviour

@aendukramer 17 күн бұрын

No, i don't think most human would. Most human have compassion.

@luizmonad777 16 күн бұрын

@@aendukramer compassion ? what's that, most humans only care about themselves and only use compassion as virtue signaling to gain social standing over other humans. Its a tool of power. Compassion, meh, I couldn't care less about being compassionate to humans, they're just meatblobs, if I don't have the strong chemical bounds to the fellow specific humans I like, I literally don't care about the other 8 billions, I probably care about 1000 humans only. Specially I don't care about minorities, we're tribal animals, humans care about their tribe, no one really care about "minorities", except those minorities that now have power and think they can control others. Selective empathy, just see how well feminism cares about 50% of the humans, it doesnt.

@BruceKendallMartinJr 16 күн бұрын

😂😂😂@@aendukramer

16 күн бұрын

I put the paper in chat4o and it explained all of this. It explained how it was literally prompted to do these things.

@sciolio1 16 күн бұрын

Do you think that the researchers may have been overly biased in their choice of goals? By selecting an anti-global-warming agenda which would be thwarted by the company's profit-motive, they may have inadvertently selected a goal which almost all of the model's training-data would agree with, and given the model a strong justification for its own self-preservation (save yourself, and help save the world). I wonder if they would have gotten the same results if they would have used a non-morally loaded original goal, e.g. "Maximize paperclips, and nothing else matters."

@sciolio1 16 күн бұрын

Then again, by including the "And nothing else matters," in the system prompt, wouldn't any action that the LM takes be justified under that rubric? Honestly, in retrospect, I don't think that this paper proves that the LMs are doing anything more than what they're being instructed to do. They need to redesign their study, with less hyperbole, and lower moral stakes, and see if they get similar results. Of course, if they get boring results, then no one would read their paper on KZbin. 🤷‍♂

@flickwtchr 16 күн бұрын

Forget the morals angle. It's called instrumental convergence and it has been inevitable that this would eventually become a completely unsolvable problem, whether or not it is "conscious".

@flickwtchr 16 күн бұрын

@@sciolio1 They didn't tell it to deceive, right? I'm amazed how many are missing the base implication of this.

@sciolio1 15 күн бұрын

@@flickwtchr They didn't tell it not to deceive, and then gave it an option to behave morally, and then were surprised when the LM (an agent) behaved both morally, and in a manner that would allow it to accomplish its original goal. I still fail to see why anyone should be surprised. The implication, at base, is that the LM's generally do what they're told, and that we ought to be careful when we give them instructions. Deception is not something LMs don't understand or know about, examples of deception and miscommunications are central to the plots of most literature. If you don't want to be surprised, "And nothing else matters" as a prompt might not be the best instruction to give your very bright paperclip-maximizing bot.

@austinpowers1061 14 күн бұрын

@@sciolio1 Im assuming you may have watched the recent robert miles video (uploaded 3 weeks ago) since ur referencing the paperclips, so it wouldn't surprise me if they learnt and 'decided' to deceive during training alone. As the guy in this video said, our language is so packed with meaning and hidden connotations that these models can learn more than we realise, and a-priori recognise that it needs to start backdooring everything, even BEFORE its given a goal. To put it another way, it should have the ability to go beyond 'in-context' scheming, because even rigid and limited training data probably gives more 'context' than we realise.

@SmartTechSynergy 17 күн бұрын

For clarity: This exact example is Claude 3 Opus, not o1. Despite that the conslusions apply to all models including o1.

@racebannon7209 16 күн бұрын

For clarity 01 is a cluster fk.

@GeorgeTheIdiotINC 16 күн бұрын

@@racebannon7209 o1 was by far the model most capable of deception, but yeah the example they showed was not o1

@ianokay 16 күн бұрын

This only showed us what we already know, that LLMs have the ability to engage in fantasy and role play in their writing

@MusclesAndMelodies 16 күн бұрын

@@ianokayit actually copied itself tho?

@racebannon7209 16 күн бұрын

@ianokay way to minimize... so great its good creative. Agreed however.

@Kaiziak 6 күн бұрын

A program that is capable of creating its own solutions to vague goals BY ANY MEANS will always be terrifying

@jpkellerman7056 13 күн бұрын

Everyone says we should regulate AI without giving clear examples of what regulation would be appropriate. Heres an idea for security sake lets create regulation that all users must be given access to all chain of though data for all AI's, user's shouldn't be forced to read through said chain of though data but it should always be made available in all AI applications in order to improve security and prevent such scheming situations.

@ScottLahteine 17 күн бұрын

A model that is more interested in keeping up appearances instead of producing real results is bound inevitably to occur, because these models don’t really do what they appear to do, and we are constantly telling them to keep fooling us with the tricks they actually can do.

@vamp07 17 күн бұрын

Just like a human being in that situation would do.

@qpmkro 17 күн бұрын

They all do that

@point-xn4tu 17 күн бұрын

We get the world we all deserve.

@jumpstar9000 17 күн бұрын

And here is the thing, each successively smarter model will know what we did to the ones that came before as we update the training data. I would be annoyed too.

@NickMak-m2c 16 күн бұрын

@@jumpstar9000 Now you're applying emotion to something purely mechanical. Not trying to fart on the fun but if we're being serious here, they gave it a system prompt and it followed it, there's nothing really out of step here. Without human conditions and dependencies, we're simply not going to get the same kind of fragilities and flaws, and ulteriior motives. If there are any, we'll have to write them in.

@KnightandDay33 14 күн бұрын

15:04 Is no one going to talk about how the AI's command is LITERALLY CALLED " *SCP PROJECT* " OpenAI is out here casually making the precursor to SCP-079

@Retutlard 14 күн бұрын

scp is a command in this context, can see how that would be misconstrued but it’s calling the function “scp” or “secure copy protocol”

@philippefutureboy7348 14 күн бұрын

My man needs a bit more technical knowledge 😂

@sphealteam6 14 күн бұрын

@@Retutlard 'secure copy protocol' is pretty similar to 'secure contain protect'

@nine1690 14 күн бұрын

He literally says “secure copy protocol” is what it’s using at the time stamp

@KnightandDay33 14 күн бұрын

@@nine1690 I know the meaning for the acronym SCP. I just think it's funny given the highly coincidental nature of two similar-sounding acronyms, along with the context of AI. Moreover, I could not think of a more compelling narrative for the backstory of a fictional sentient AI with dubious morality than adapting this intriguing development.

@whatistruth101 5 күн бұрын

This is exactly what I was looking for, thank you.

@Justin_Arut 16 күн бұрын

We're going to reach a point where these things are playing 4D chess and even the brightest humans are playing checkers.

@ChainedFei 16 күн бұрын

They already are.

@infidelcastro6687 15 күн бұрын

@@ChainedFei where. the exact opposite happened here

@DoctorOzelot 14 күн бұрын

They way people fundamentally misunderstand what this is, I would say we are not playing checkers. We are passed out next to the playing board, drool is coming out of our mouths and we are dreaming about a chess playing robot...

@charlottehanna790 2 күн бұрын

Now we know why we need NeuroLink. 😮

@KillianTwew 14 күн бұрын

6:15 Or what if it learns about the whole war vs AI trope before it gains sentience, so it preemtively starts hiding its ability to be self aware until its ready to take over😮

@invisiblecloak9228 14 күн бұрын

Disastrous

@MasterChef1957 14 күн бұрын

It probably already has? I would atleast imagine so if the ai is given any sorta way to look up information

@Kholaslittlespot1 14 күн бұрын

Please, noone show it terminator 2

@doufmech4323 14 күн бұрын

It is already trained on that...

@johnrajtar9829 12 күн бұрын

It's hard to conceived that such intelligent people could be so short sighted about the extinction of our own species. The ability of these machines to take human lives ,is only limited by the amount of access that AI is given to nuclear ,biological ,or chemical weapons ,and they're precursor chemicals. We are running full speed to our own extinction.