No more Fine-Tuning: Unsupervised ICL+

5,115 views

Comments: 20

@davidbarton3361 6 months ago

This is really interesting, but until commercial LLMs stop charging for input tokens, how viable is this? I can only imagine my OpenAI rate limits and credit card smoking if I decided to pump that many input tokens through for a single query. Even 32k tokens per request burns money fast. OTOH, fine-tuning LLMs is really low cost. If you were to fine-tune Mixtral on that many examples, the performance might improve a lot more than that. And using Lorax to serve multiple fine-tunes on the same hardware makes this a lot more cost-effective.

@archiee1337 4 months ago

Cost is a valid factor, but I believe we should focus on that less (if possible), since the cost will be lower and lower in the future :)

@umbertosurricchio5365 6 months ago

Amazing topic ❤ Please do more videos about it. I am interested in particular contexts where special words and concepts are used, like healthcare, finance, and legal. Thank you so much in advance. 🙏

@archiee1337 4 months ago

great insights, thank you

@seaihacks 6 months ago

I took a two-month break, and so much has happened! You've been putting out a ton of content too. How am I supposed to catch up with all of this, haha?

@markopancic6060 5 months ago

Could it be that the wider variety of problems in the unsupervised ICL prompt is just activating math-related attention heads, allowing the model to solve the problems, whereas QA might be less varied and cause more of a pigeonhole effect? I feel like this has been seen in some of DeepMind's RL work, where less prescriptive approaches perform better than prescriptive ones.

@chituyiwakhusama9944 6 months ago

ICL many-shot inference will give you a better-grounded output. The trade-off is cost and latency (Groq can save us on this 😊 but not on that).

@kengonzo1640 6 months ago

This is interesting. I'm trying to imagine how parsing principles of thought into subdivided components that are then rearranged could create a different outcome, much like the order of operations affects math equations.

@gileneusz 6 months ago

8:15 This will send prompt evaluation time through the roof.

@johnytheripper 6 months ago

and cost per inference call

@pi5549 6 months ago

Can you try to summarise the innovation really early on in the video? My experience watching this is like trying to ride a bicycle too slowly. I just can't focus on anything.

@BradleyKieser 6 months ago

Fascinating.

@mohamedfouad1309 6 months ago

This is amazing 😂

@explorer945 6 months ago

I haven't finished watching the video yet, but what about the cost?

@code4AI 6 months ago

I haven't finished reading your complete question yet, but π?

@explorer945 6 months ago

@code4AI Sorry, I don't understand your pi reference. I finished the video later. You didn't mention the cost or the security issues of long context. Anyway, please make the videos more concise. There are way too many videos to watch.

@daryladhityahenry 6 months ago

@explorer945 Lol, please don't shorten the videos. I like his teaching because it's great for studying and easy to understand, even though this isn't my field. :D:D

@explorer945 6 months ago

@daryladhityahenry Got it. OK, don't change anything. I'm not the target audience, as I have to get through tens of videos and can't afford 30 minutes for each. Sometimes I feel like I can do my own reading: just tell me the gist and the high-level points. That happens when you are dealing with so many updates. Not sure how many channels you follow. It's really difficult. You can see why @code4ai doesn't have as many subscribers as others, for the same reasons.

@daryladhityahenry 6 months ago

@explorer945 Yes, yes, I understand your point very well, hahaha. I follow a lot of channels, but I don't watch all of them. Just a couple (fewer than ten a day, I think). Anyway, I really do understand your point, and also that his long video format keeps him from getting many subs. But I happen to be one of those who like it :D:D:D