Stealing Part of a Production LLM | API protects LLMs no more

  17,505 views

AI Coffee Break with Letitia

1 day ago

Comments: 93
@MyrLin8 7 months ago
Excellent! White paper on LLM monitoring, simply excellent :) ... off to the reading library. Thanks muchly.
@AICoffeeBreak 7 months ago
@jmirodg7094 7 months ago
Thanks! Concentrate on your PhD writing; it is important! We can wait for the next video.
@AICoffeeBreak 7 months ago
🤗
@HunterMayer 8 months ago
Wow. I lived. I have a long way to go to understand it all, but I held on!
@ricowallaby 8 months ago
Wow... I'm glad I found your channel, very informative. I'm still trying to wrap my head around those articles. Love your accent and your smile. Looking forward to more videos, and best of luck with your thesis. Cheers from Sydney.
@AICoffeeBreak 8 months ago
Welcome aboard!
@ricowallaby 8 months ago
😃🙏
@robthomas2049 8 months ago
Can't say I really understand most of what they have done, but it seems totally wild and really creative how they managed to work it out. Good luck with the thesis :D
@praxis22 8 months ago
Good luck defending your thesis; your accent is one of the reasons I watch.
@AmCanTech 8 months ago
So much linear algebra is involved in LLMs
@AICoffeeBreak 8 months ago
They are a pile of linear algebra.
@HunterMayer 8 months ago
Good luck with the paper; remember to take some time for yourself afterwards! I look forward to your next contribution.
@bernieapodaca2912 8 months ago
WOW! 😅 I watched the whole video! I think my brain was somewhat cramping, but I grasped the concept. Good job, and good luck with your thesis!!!! I subscribed, but I'll wait for my brain to heal a bit before I jump into another great video 🙃🙏👍
@MaJetiGizzle 8 months ago
For what it's worth, I completely understand your accent (it's not very heavy in English), and wow, I was completely overthinking it with my guesses on the paper on Patreon. 😅 I'm glad I can now steal all the logits if I want to! 😜 Another fantastic paper breakdown, as always!
@AICoffeeBreak 8 months ago
Thank you! 😃 Your guesses on Patreon were highly informative. I told you I was humbled by how many good papers I did not know. 🤭
@gregsLyrics 8 months ago
The intellect of a woman has always melted my heart. I think I just fell in love with Letitia. That is the most beautiful lecture I have ever enjoyed. ❤
@MachineLearningStreetTalk 8 months ago
Here we go!!!
@AICoffeeBreak 8 months ago
@IndoorAdventurer1996 8 months ago
Really cool video. Good luck with your thesis! ^^
@AICoffeeBreak 8 months ago
Thank you!
@AbdallahAbdallah 3 months ago
What software and video editing tools do you use for creating this great content?
@AICoffeeBreak 3 months ago
I make all the visuals (including the drawings) in PowerPoint. 😅 I use Adobe Premiere Pro for editing (this is also the stage where Ms. Coffee Bean comes into the picture).
@AbdallahAbdallah 3 months ago
@@AICoffeeBreak This is so impressive. I couldn't even imagine that you could do that much visual work using PowerPoint. You must be at guru level in PowerPoint. Maybe this is the topic of another video to make :)
@AICoffeeBreak 3 months ago
@AbdallahAbdallah
@MyrLin8 7 months ago
LOL yup, still here ;)
@visheshmittal468 8 months ago
Hey, amazing content! Quick note: the 2nd paper is from the University of Southern California (my uni), so just pointing that out!
@AICoffeeBreak 8 months ago
Thanks! I butchered it at 3:15; I did not notice that it was not "California" that came out of my mouth, sorry. 🙈
@drummatick 8 months ago
As "d" is the hidden embedding dimension, is it somehow guaranteed that the logits and embeddings themselves lie in d-dimensional space? Or do they probably lie in a lower-dimensional subspace?
@AICoffeeBreak 8 months ago
Great observation. It could happen that they lie in a smaller-than-d-dimensional space. They mention this in the Finlayson et al. paper in a footnote. :)
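The subspace fact discussed in this thread (every full logit vector is W·h for the same vocab-by-d readout matrix W, so all logit vectors lie in a subspace of dimension at most d) is exactly what the attack measures. A minimal NumPy sketch with an invented toy readout (all sizes and matrices below are made up for illustration, not taken from either paper):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 1000, 64          # toy vocabulary size and hidden dimension

# Toy readout: every logit vector is W @ h for the same (vocab x d) matrix W.
W = rng.normal(size=(vocab, d))

# Pretend each API query returns the full logit vector for some hidden state h.
n_queries = 200
H = rng.normal(size=(n_queries, d))   # hidden states (unknown to the attacker)
logits = H @ W.T                      # what the attacker sees, (n_queries x vocab)

# All rows lie in a d-dimensional subspace of R^vocab, so the number of
# non-negligible singular values of the stacked logit matrix reveals d.
s = np.linalg.svd(logits, compute_uv=False)
recovered_d = int((s > 1e-8 * s[0]).sum())
print(recovered_d)  # 64
```

As the reply above notes, what this recovers is really the dimension of the subspace the logits span, which can be smaller than d if the model does not use its full hidden width.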
@jackpisso1761 8 months ago
I really like your content. Thank you so much for making it!
@AICoffeeBreak 8 months ago
Glad you like the videos!
@nathanbanks2354 8 months ago
Good luck with your thesis! If you want, try to do a summary video of your thesis after you submit it and before your defense; I'd watch it. It could be good practice for the defense too. I'm sure you know more about the topic than anyone else in the world... new research is cool.
@AICoffeeBreak 8 months ago
Thanks, I think I will do something like that. :)
@cosmic_reef_17 8 months ago
Perfect way to start a week with one of your videos. Great work!
@BooleanDisorder 8 months ago
So cool. Thanks, mate! I wonder how other architectures would fare against this attack. Also, super good luck on your thesis, mate!!! I would love to see what you've been researching!
@AICoffeeBreak 8 months ago
I think quite badly, as long as they have that hidden-dim-to-logits mapping. And most classification models have that. 😅
@williamchamberlain2263 8 months ago
Awesome. Good video editing and pacing too.
@supernewuser 8 months ago
Great content, glad I found the channel.
@DaNa-px7ol 8 months ago
Good luck with the thesis!
@AICoffeeBreak 8 months ago
Thanks, Dana!
@glymjerlackless8605 8 months ago
The idea that GPT-3.5-Turbo is just a 7B-parameter model seems a little unlikely to me; its performance is leagues beyond the open-source 7B models that I'm familiar with, at least. If they hadn't explicitly addressed it in the paper, I would have assumed that it was an 8x7B or similar model. At the very least, the performance of GPT-3.5-Turbo and Mixtral is roughly the same.
@tsomerville1970 8 months ago
OpenChat 3.5 is 7B (a Mistral variant) and spans what GPT-3.5 does, so I believe it's possible. However, I believe you could be right in that there could be an MoE as well, or special teacher/student training that OpenAI did to help 3.5.
@a0z9 8 months ago
You pronounce things very well. It is very easy to hack the AI by probabilistically exploiting the biases they introduce to limit its power.
@Thomas-gk42 8 months ago
Just here to say, your accent is wonderful. Romanian, I guess from your name. All the best.
@AICoffeeBreak 8 months ago
Exactly, the ending "escu" is a perfect giveaway.
@DeGandalf 8 months ago
Good luck with your PhD! While I wasn't missing your videos (as in, I prefer higher-quality videos over regular ones, which you are nailing), I'll definitely be looking forward to them.
@marcfruchtman9473 8 months ago
I am not a fan of the title of the paper. First of all, analyzing data to determine a hidden aspect of the data, where such data is publicly available, is not stealing. And it does a huge disservice to professionals in the industry to call it that. It would serve well as a joke title... but no professional in the information world should be using those kinds of titles. Other than that, my hat goes off to the researchers for analyzing the data and discovering ways to reveal this information. Thank you for making this video... Some of the best analysis on the net!
@Acceleratedpayloads 8 months ago
As researchers who take an adversarial approach to security, we sometimes must use prescriptive language when discussing methodologies of morally reprehensible acts. It's called red teaming. How else do you propose to succinctly describe such an action?
@marcfruchtman9473 8 months ago
@@Acceleratedpayloads "Speculative API Analysis: Statistically Inferring Unpublished LLM Parameters and Updates". You could also add "Penetration Testing" to the title to convey that this was done with security in mind: *"API Penetration Testing: Statistically Inferring Unpublished LLM Parameters and Updates"*. I realize that flashy titles are the new "in" thing... but I really, really don't want to see: "STUNNING new technique steals LLM secrets in SHOCKING new API HACK, GPT loses everything!" hehehe
@MDNQ-ud1ty 8 months ago
Our entire world is based on theft. Theft of human energy that is focused on the things of those who do the stealing. It will only get worse as capitalism becomes more late-stage and the best capitalists start to morph into pharaohs. Happy pyramid building.
@CVE42287 8 months ago
@@Acceleratedpayloads You don't sound like someone who's very good at it, tbh.
@fatherfoxstrongpaw8968 8 months ago
@@Acceleratedpayloads I have a suggestion: how about honestly? Or perhaps "in context"? Just a thought.
@mohammadmira1626 8 months ago
Thank you ❤
@AICoffeeBreak 8 months ago
Thank you for your visit! Hope to see you again!
@joelbelafa 3 months ago
Yep, still there. I assume an MoE would make one's life a bit difficult when hacking the token-bias feature. Correct?
@AICoffeeBreak 3 months ago
Great point.
@NatColley-t4z 8 months ago
Aside from the embarrassment and possible competitive disadvantage, do these papers also imply security risks? Rewriting the model in some way, like old-fashioned code injection? Are these methods usable against any/all APIs?
@opusdei1151 8 months ago
LLM hackers on the rise xD
@juanmanuelcirotorres6155 8 months ago
Thank you
@ScottzPlaylists 8 months ago
@AICoffeeBreak The paper reads "University of Southern California", but... you said "University of Southern Carolina" the first time, and "University of South Carolina" the second time. -- I'm sorry, my INTJ personality picks up on the little details. 😄
@AICoffeeBreak 8 months ago
Thanks for pointing that out, and again: sorry, I had a brain fart there. 😅
@ScottzPlaylists 8 months ago
@@AICoffeeBreak Must have been coffee beans on the brain... 🤣
@spencerfunk6697 8 months ago
My idea is that OpenAI is trying to do something similar to Apple, as in, whenever Apple drops a new phone, they make all the old ones worse. I think that's a big reason they're so desperate to keep everything hidden.
@Jamaleum 8 months ago
3:15 *California :D
@AICoffeeBreak 8 months ago
Oh, yes! Sorry! 🙈
@ScottzPlaylists 8 months ago
I see someone else caught that, now that I looked... Is your personality INTJ also?
@AM-yk5yd 8 months ago
* The easiest way to prevent the attack is to limit logit biases. Nobody needs to bias lots of tokens.
* Maybe fuse the last MLP with embed_out? Right now transformers do dim_embed -> dim_ffn -> dim_embed -> n_vocab. If the last MLP outputs n_vocab directly instead of dim_embed, then it'll be harder to figure out the true dim_embed, as the MLP already uses a nonlinearity.
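The second suggestion above can be illustrated with a toy numerical check (all shapes and matrices below are invented for illustration, not from any real model): with the usual dim_embed -> dim_ffn -> dim_embed -> n_vocab head, the stacked logits have numerical rank at most dim_embed, which is exactly what the attack reads off; if the MLP's second projection maps straight to n_vocab, the nonlinearity sits between the bottleneck and the observed logits, which are then no longer confined to a dim_embed-dimensional linear subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d, d_ffn = 500, 32, 128        # toy n_vocab, dim_embed, dim_ffn

H = rng.normal(size=(200, d))         # hidden states entering the final block
W1 = rng.normal(size=(d, d_ffn))
W2 = rng.normal(size=(d_ffn, d))
W_out = rng.normal(size=(d, vocab))

act = np.maximum(H @ W1, 0.0)         # ReLU MLP activation, (200 x d_ffn)

# Standard head: MLP maps back to dim_embed, then a linear readout to n_vocab,
# so the logits are confined to a dim_embed-dimensional subspace.
logits_std = act @ W2 @ W_out

# "Fused" head: the MLP's second matrix maps straight to n_vocab,
# so the ReLU sits between the d-dimensional bottleneck and the logits.
W2_fused = rng.normal(size=(d_ffn, vocab))
logits_fused = act @ W2_fused

def num_rank(M, tol=1e-8):
    s = np.linalg.svd(M, compute_uv=False)
    return int((s > tol * s[0]).sum())

# The standard head's logit rank equals dim_embed; the fused head's exceeds it.
print(num_rank(logits_std), num_rank(logits_fused))
```

This only sketches the subspace argument; whether such a fused head is practical (it changes the parameter count and breaks weight tying with the embedding matrix) is a separate question.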
@bmebri1 8 months ago
💋🦒
@AICoffeeBreak 8 months ago
🤭
@daspradeep 8 months ago
Not a big deal, and it's mentioned in the paper as well.
@NoNameAtAll2 8 months ago
What happens when you apply this to a human?
@jasdeepsinghgrover2470 8 months ago
Next step... get the model using distillation... And obviously it will not be published. 😂😂
@Kuratius 8 months ago
If this approach works for LLMs in general, I wouldn't be surprised if some variation of it worked on human brains. That would be way crazier.
@fatherfoxstrongpaw8968 8 months ago
Clickbait! Fake news! I call BS on the title. *Le sigh...* This is like saying a smart kid figuring out a magic trick is stealing the magician's magic. Quit misrepresenting to get views; it cheapens and invalidates your good work!!!
@DerPylz 8 months ago
This is literally the title of the paper that is being explained here... There is no misrepresentation to get views here.
@fatherfoxstrongpaw8968 8 months ago
@@DerPylz I'm sorry, I missed that. Could you point that out for me? (Timestamp?)
@DerPylz 8 months ago
@@fatherfoxstrongpaw8968 The first time the papers are shown is at 0:35, then again at 1:45, and a few more times throughout the video. The papers are also linked in the description, if you want to have a closer look yourself.
@fatherfoxstrongpaw8968 8 months ago
@@DerPylz Yeeaaa... still calling BS. If you can guess or "infer" what it's doing, it's not stealing. Sorry.
@DerPylz 8 months ago
@@fatherfoxstrongpaw8968 Wait, is your criticism with the paper or with the video? Because the video accurately explains what the authors of "Stealing Part of a Production Language Model" did in their paper. Of course the video will be titled according to the paper it explains, so that people looking for an explanation of that paper can find the video. So I don't see any false advertising or misrepresentation there. If you don't agree with the paper's title, then that's a different discussion, and you might want to take up your criticism with the authors. But your comment saying "it cheapens and invalidates your good work" makes it sound like you're blaming Letitia.