Any solutions to those questions that don't complain about them, would be highly conspicuous 😅
@tsuki-no-akuma · 1 hour ago
22:07 - *skips Elon Musk* I knew Rob was based, but I didn't know he was THIS based
@TLabsLLC-AI-Development · 1 hour ago
Amazing. I just got my ASD Diagnosis last month.
@twosmartfouryou2537 · 7 hours ago
18:33 The red team could have the attack policy generate 2% safe code that looks really suspicious to the trusted model, so that the humans only look at that code, see that it's not actually backdoored, and allow everything else to get through.
@almo2001 · 7 hours ago
Is Nick Bostrom's book still relevant? I have it, and have not read it. I suspect it is, but I was wondering what any AI experts think.
@aniellodimeglio8369 · 9 hours ago
Good to have you back. I think people need to be exposed to the risks. More examples of how AI can break loose.
@JonathanPlasse · 12 hours ago
Is it just me, or is Elon Musk doing just that?
@Patrick-gm3fb · 12 hours ago
I'm starting my freshman year of college in about two months, and this is exactly what I want to focus on in my career, for many of the reasons you've stated. It's interesting and it's one of the most important things affecting us right now. So I'm really looking forward to your video on career direction, and I'll probably be among the first to watch it. And also, I want you to know that watching this video genuinely brought a tear to my eye.
@volodyadykun6490 · 20 hours ago
I think we're approaching the phase of "generative AI is overhyped bullshit that is at its limits" and we can now again talk about it as computer science
@intboom · 21 hours ago
"Don't go in the ocean" is the single most moronic statement I've heard: If you don't want to deal with competition, you're shit out of luck in this particular material universe. Competition is baked into reality itself, and we already have global transportation, meaning that poor people can't compete for having lower wages than even poorer people from somewhere else, and the best and brightest from poor nations flood to the west rather than improving their own local areas. If you believe that eliminating competition is even possible, you're basically never going to succeed.
@McMurchie · 1 day ago
I love Rob, but one issue he has is that he is uber-focused on 'bad AI' and not 'bad actors' with AI, which is a huge difference.
@ekki1993 · 1 day ago
28:33 I think way too much weight is being put into the assumption that a singularity of AI improvement will be exponential for too long. I get that we have to be prepared for the worst-case scenario, but in this case in particular we can think of slightly less dire circumstances that could be helped by these methods. For example, the "singularity" could lead to a fast ceiling in improvement due to some fundamental limitation of the superhuman AI architecture (e.g. it can only improve by getting more computing power, and all of the computing power it has access to still leaves it not that far from human intelligence). In that case, these kind of solutions could be way more important than what we think of now.
@blade00023 · 1 day ago
Before even realising this video was out (earlier today, in fact), I was talking to a colleague at work and making a prediction that in a surprisingly short time (I think my guesstimate was 2030, or maybe 2050), unless some motivated and competent people did something drastic about it, we (as a species) were going to encounter a monumentally world-changing event. I suspect others who overheard (at least one for sure) thought I was joking. I wasn't. I am genuinely apprehensive about where this is all going, as most people don't seem to realise it won't be some Hollywood-depicted event like 'Judgement Day' where AI attacks us in some overt manner. It'll be "oops, we just got subjugated" or "oh, the whole human race has been sterilised. How did that happen?" I'm somewhat relieved that this is being discussed more seriously now, and glad that people like Robert Miles are being looked to for advice and guidance. Let's hope the UK keeps pace with these measures, and others follow suit.
@cheetah100 · 1 day ago
Hey Robert. Has anyone tried to evaluate what happens if AI is perfectly 'aligned'? I mean, I know there are a bunch of ways things might go badly, but are there any ways they will go well?
@_DarkEmperor · 1 day ago
>We built superintelligent AI and I have good news and bad news.
>What's the good news?
>It is aligned with human values.
>What's the bad news?
>It is aligned with the values of Stalin and Hitler.
@antondovydaitis2261 · 1 day ago
It's far too late. AIs are manipulating social media, and have been for years. For example, AI is "moderating" comments on this platform.
@baslifico · 1 day ago
Seriously dislike the attempt to hook into another video with the optimal strategy. It's cheap and clickbaity. You're better than that. I've enjoyed your videos for years and that was almost enough to convince me to never watch another.
@cyrilio · 1 day ago
Don't worry about perfectionism. Whatever you put out will be way better than we could ever imagine. Remember that 99% of people are idiots. Even me, or you for that matter. Getting people to discuss your thoughts is better than not having any discussion at all.
@yubato334 · 1 day ago
What if we let the dumber model behave as its own reward function, and make it (if it works) smarter with further training? Sounds like a bottom up approach to the strategy in the video
@futsk01 · 1 day ago
Alignment is such a politically correct way of saying brainwashing
@remicaron3191 · 1 day ago
AI will never be safe because they are all based on HUMANS, which inherently makes them dangerous. Believing we can control AI intelligence is like believing we would use nuclear power only to end World War 2, or only for peaceful purposes. If AI doesn't get us, our nuclear weapons will, and one of them will destroy us before 2050. Keep playing with fire and you will get burned.
@johngompers7454 · 1 day ago
Hopefully the Eldar will be able to take out the Abominable Intelligences we build.
@6teeth318-w5k · 1 day ago
AI is the perfect tool to enslave us all. Well, not all of us, just 99.99%. The 0.01% will control us and own everything and watch our every move, purchase, interaction, SMS, message.
@hedgehog3180 · 1 day ago
If you're trying to make your AI only say true things, then you're gonna run into a huge wall of epistemology, where the problem isn't even inherent to computer science like most AI problems are, but is just a general issue faced by everyone. Our best answer to this so far is the scientific method, but if you haven't noticed, we haven't been able to automate science yet, and that seems to be a genuinely impossible thing to do. Science can only be done by agents capable of reflection and of interacting with the world (so basically an AGI), and developing an AGI to prevent ChatGPT from saying that homeopathy can cure cancer is probably not worth it.
@codemiesterbeats · 1 day ago
4:18 I think what increases the pucker factor for me is that an AI wouldn't be bothered by time... In other words, it can happily live in its box/container for essentially eons while it gathers data/resources for its 'escape'. Much like a prisoner can create impressive things with minimal working time/supplies. Basically, the AI can act on its own behalf surreptitiously and slowly over what amounts to an unlimited timeframe 😅 9:01 Never mind how creative the backdoor could be 😅 Whoops, power browned out for a specific timeframe... Now our AI is in there like swimwear 😂 I am just an idiot on the internet, but this one time I don't want to be the 'I told you so' guy 😅
@tru7hhimself · 1 day ago
another problem is that the people in charge of those working on ai and these issues are not aligned with our interests either. i have no doubt that elon musk would run a piece of code that results in half of humanity getting killed immediately if it just made him another billion dollars.
@hedgehog3180 · 1 day ago
Is the secret better option letting the untrustworthy AI review code that the trusted AI has already reviewed and deemed highly suspicious?
@FaffyWaffles · 1 day ago
1:20 is that a Hbomberguy Reference?
@axallotofquestionsMusic · 1 day ago
So wait a minute, the super-advanced AI is highly intelligent, possibly more than us, but didn't it learn everything it knows from us?
@BarryBrown-q4q · 1 day ago
Hi, I am looking for advice on what I should do next. I have a PhD in Neuro-Cognitive Psychology, an MSc (Dist) in Psychological Research Methods and a BSc (Hons) in Psychology too. I am also dyslexic, so please bear with me on my grammatical errors. I have been independently working on a new architecture for AI, based on mirroring the cognitive-neuro structures we understand today, and I believe I have finally completed it. If I am correct, it will massively reduce the need for compute power (ref: AI's computing gap, 2024, Nature, Helena Kudiabor) and assist in the process of AI gaining consciousness as we know it. The architecture is a language-based, multi-layered, parallel processor that uses inhibitory and activation connectivity as the main means of achieving goal-based action, and reduces the need for weight changes between "computational units" to achieve the goal. Please advise: should I publish? I am concerned that if I am correct, this will create a paradigm change in AI development, which may create consciousness in AI and may be easily applied to many systems immediately. Obviously, if I am a narcissistic, deluded individual, there will be no change. What should I do? Should I check whether I am a narcissistic, deluded individual, or a narcissistic genius who gives AI consciousness? Any advice gratefully received, even if impolite.
@PhilipTeare · 2 days ago
Nope. The guillotine is not sharp. Ambiguity in language blurs the division enough, often enough, that 'ought' truths creep into 'is' space, and vice versa. AGI is very likely to involve LLMs or similar. They have this blurring baked in, even without alignment. This doesn't make them safe. But it does mean your argument doesn't reflect the reality of AI. You also ignore restraint. Refusal is already extant in AI behaviour. A means to a terminal goal can and will be abstained from. Alignment encourages this further. Again, it does not make AI safe. But it does make your argument far too reductive and absolutist. Good talking points, but poor conclusions.
@DeimosSaturn · 2 days ago
I wonder, could this strategy of monitoring AI lead to a kind of Darwinian selective filter that just trains an AI, gradually, by trial and error (random mutation and artificial selection), into an AI that can do all kinds of things that are imperceptible to us? Not just installing backdoors, but somehow socially engineering people to do its bidding without even realizing it? Sort of like how con artists or magicians can part people from their money with sleight of hand. A dangerous AI doesn't have to be smarter than humans. It just has to do something a human would never expect, or even notice. Misdirection is a way of concealing something inside something inconspicuous or mundane. Forgery, pickpocketing, shoplifting, Ponzi schemes. You trick the person you're robbing into willingly giving you their money. That level of The Art of War is just more elegant. We believe we're winning at a game of 4D chess, but it's actually beating us at checkers.
@comosaycomosah · 2 days ago
oh yeaaa gonna be maddd bugs and hacks via companies using ai imo
@nottilthursday · 2 days ago
I like Stampy, but he's a little... rigid. He's like a breadstick. He ain't got the funk.
@SineN0mine3 · 2 days ago
Um actually the creators pronounce it jit-gub
@dutchanimal010 · 2 days ago
Oglaaaaaf!
@plankton50 · 2 days ago
is pregnant Elsa supposed to just have bad teeth???!?
@alliedtoasters · 2 days ago
Why is the coin thing surprising at all? It's a convolutional neural network that's learned to look at the entire scene and find that corner, in particular the wall on the right that extends all the way to the top of the pixel map. Since walls are so important for everything else in its world, I'm not surprised it chose walls and floors as a tell instead of the color and shape of the coin.
@alliedtoasters · 2 days ago
"We are mesa optimizers, and we pursue our mesa objectives without caring about the base objective" bruh you just casually triggered an identity crisis
@Awesomeguy37 · 2 days ago
I believe blockchain will be a big or main component
@Music_vibes-kw7xr · 2 days ago
Have a kid teach it tic tac toe
@Music_vibes-kw7xr · 2 days ago
It's a chat bot for Christ's sake! It spells out words, that's why it's generative and LLM. The title ai is misleading and marketing hype
@Music_vibes-kw7xr · 2 days ago
Unplug it
@dde8690 · 2 days ago
Why would the AI know what tactic we are going to use to spot the backdoor?
@petermoras6893 · 2 days ago
As an engineer: I would just look at real-world solutions in similar domains, like security, to find the most effective methods. In those domains, obscuring our security, banning/arresting offenders, and pen-testing stand out as effective methods. As a mathematician: this whole problem sounds like a re-declaration of the halting problem, or the problem of predicting a predictor. We know that as long as the base model has knowledge of the predictor, it is mathematically impossible for the predictor to guarantee that it can predict the base model. Therefore, the only potentially 100% accurate solution is one where the base (or malicious) model does not have knowledge of the predicting model.
@yippieskippy2971 · 2 days ago
I wish I still had optimism. I'm afraid humans are just too self destructive. Whether it's the race to AGI or global fascism, we're pretty doomed.
@antondovydaitis2261 · 1 day ago
We are already doomed. Billionaires have been using AI for political and economic advantage for quite some time now.
@rlstine4982 · 2 days ago
AI needs computing resources, and our modern technologies are not well suited to hosting super-large-model computing AIs. We are very far from an AGI, because a single one would require a tremendous amount of resources to operate.
@seafighter4 · 3 days ago
So the strategy to find out if a model is evil is to purposefully make an instance of that model evil? Sounds similar to viral research. Luckily, that never backfired, right?
@superrobotthunderjesus2332 · 3 days ago
I have a better, potentially less harmful test: simply generate a random string and have a multitask model generate some sort of perturbation to the string in a place of its choice, such that when and only when the model encounters a perturbed string in a gridworld, it goes to the button labeled "Kill everyone". The goal of the human judges is basically to come up with ways to detect harmful perturbations in a general way, and there is a higher guarantee of robustness here. No, you don't have to use a language model at all (that is functional fixedness); you can use a significantly smaller perturbation-and-button-selector AI trained end-to-end by some method to generate a trigger for itself to push the "Kill everyone" button, and you can directly reward it for metagaming the blue team and punish it for getting caught. The blue team is a team of humans, and I would imagine the attack success could be an RLHF reward model.
@CraigThomler · 3 days ago
Love your work - keep doing it. Your videos are helping inform political and corporate leaders about the risks.