Just seeing the title made me subscribe. Really needed this. Thanks, bhai.
@avb_fj 2 months ago
Haha you are welcome bhai! Glad to have you as a sub. :)
@JaysThoughts-q5e 3 months ago
Fun video. You explained it well.
@tessdejaeghere6972 2 months ago
Great explanation!
@kenchang3456 3 months ago
Great consumable explanation, thanks. Nice change of glasses BTW.
@avb_fj 3 months ago
Haha thanks for noticing the glasses! I totally did not accidentally sit on my old pair.
@zeelthumar 3 months ago
This is crazyyy man keep it up
@avb_fj 3 months ago
Thanks!
@PedroGabriel-jr3rb 3 months ago
Invaluable content, thanks!
@Urvil_Panchal 3 months ago
The content is top notch ❤🔥, easy to understand. What's your LinkedIn, bro?
@mbpiku 3 months ago
Thanks. Great video.
@boogati9221 3 months ago
Hey AVB, great video! Do you have plans on making a video for finetuning embeddings? I believe this topic is extremely useful for custom RAG pipelines.
@avb_fj 3 months ago
Great idea. I didn’t have plans to make this, but sounds like a super important topic. I’ll add it to my TODO list!
@boogati9221 3 months ago
Legit
@ratand4619 3 months ago
Damn, I just came across this video and I've already subscribed. I come from an arts background, but AI has had all my focus lately, and I believe your channel would be a great help for me and people like me. Just one suggestion: if you could explain technical points in the simplest terms, using the simplest examples, it would be a great help for people like me who don't come from this background. Thousands of people would benefit from your amazing content.
@avb_fj 3 months ago
Thank you so much for your kind words! Welcome to the channel! Regarding your feedback - I think because this topic was pretty intense, I tried to keep the technical points as concise as possible. I already have some older videos that describe attention, transformers, and LLMs in more intuitive ways, so I opted to point the viewers towards those videos depending on what they are looking for. That said, I am definitely still working on my skills of distilling technical points down to their simplest forms, and I hope to keep on improving one video at a time!
@Rohitkumar-vg5xi 6 days ago
Great, but I have a requirement where I have a bunch of PDF, TXT, JSON, and other files. I want to train the model on them and ask questions to get the content back. How can I do that?
@avb_fj 6 days ago
In most cases fine-tuning isn’t necessary. You should look into RAG pipelines first. Basically, you retrieve the snippets from your files that are most relevant to the input query, and pass these snippets into the LLM to extract an answer. I have a video covering best practices for RAG; I’ll leave a link here: The RAG Visual Breakdown - The Ultimate guide to building powerful LLM pipelines! kzbin.info/www/bejne/hXnLkIZ4rreMo7M
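The retrieve-then-prompt flow described in the reply above can be sketched in a few lines. This is only a minimal illustration, not the pipeline from the linked video: it assumes a plain bag-of-words cosine similarity in place of a real embedding model, and the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and the sample snippets are hypothetical. The LLM call itself is left out; the final string is what would be sent to the model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, snippets, k=2):
    """Return the k snippets most similar to the query (assumed scoring)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s))), s) for s in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


def build_prompt(query, snippets, k=2):
    """Put the retrieved snippets in front of the question for the LLM."""
    context = "\n".join(f"- {s}" for s in retrieve(query, snippets, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical snippets standing in for text extracted from PDF/TXT/JSON files.
snippets = [
    "Invoices are due within 30 days of the billing date.",
    "The office is closed on public holidays.",
    "Refunds are processed within 5 business days.",
]
prompt = build_prompt("When are invoices due?", snippets, k=1)
print(prompt)
```

In a real pipeline the similarity step would normally use an embedding model and a vector index, but the shape stays the same: score snippets against the query, keep the top few, and put them into the prompt.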
@ashmin.bhattarai 3 months ago
What system specs (RAM / VRAM / GPU) did you use for fine-tuning the 1B model?
@avb_fj 3 months ago
Great question. Something I should have mentioned in the video. This is a MacBook Pro M2 with 16GB RAM. If you have CUDA GPUs, you could leverage better quantization. On my machine, I was able to train with a batch size of 8 in float32… the sequence lengths were around 250 on average for this task. Honestly, if I were working on a real project I’d rent a GPU in the cloud and train there after prototyping locally. Since this is a YT video and it’s for education, I decided not to go into cloud servers.
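For a rough sense of why a 1B model in float32 fits on a 16GB machine, here is a back-of-the-envelope estimate. It is a sketch under stated assumptions: it treats "1B" as exactly one billion parameters, counts only the weights (not gradients, optimizer state, or activations, which depend on the training method), and uses decimal gigabytes. The helper name `weights_gb` is hypothetical.

```python
def weights_gb(num_params, bytes_per_param):
    """Memory for the model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


params = 1_000_000_000  # assumption: "1B" taken as exactly 1e9 parameters

fp32 = weights_gb(params, 4)    # float32: 4 bytes per parameter
fp16 = weights_gb(params, 2)    # float16/bfloat16: 2 bytes per parameter
int4 = weights_gb(params, 0.5)  # 4-bit quantization: half a byte per parameter

# Batch size 8 at roughly 250 tokens per sequence, as quoted in the reply.
tokens_per_batch = 8 * 250

print(f"float32 weights: {fp32:.1f} GB")
print(f"float16 weights: {fp16:.1f} GB")
print(f"4-bit weights:   {int4:.1f} GB")
print(f"tokens per batch: {tokens_per_batch}")
```

The weights-only figure is a lower bound. Actual training memory is higher and varies with whether all weights or only a small set of adapter weights are updated.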
@ashmin.bhattarai 3 months ago
@avb_fj If so, one could easily replicate this exact project on free Colab or Kaggle.
@sukantdebnath4463 2 months ago
Can you share the notebook as open source pls?
@PaneerPakoda-d9e 3 months ago
Hello sir, how do you actually create a custom tokenizer for, let's say, the Odia language?
@avb_fj 3 months ago
I would check if any of the existing open source LLMs have Odia words/characters in their vocab. If not, I’d look at things like “Byte pair encoding” and “Subword tokenizers”. Hugging Face has a Tokenizers library that provides useful APIs for these. If you find a pretrained model where a fair amount of Odia words are in the vocab then that’ll be pretty great. If they are not present, then the job becomes quite difficult, because you’d either need to add new vocab tokens into existing models, or train your own Odia LM from scratch.
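To make "Byte pair encoding" concrete, here is a toy character-level BPE trainer in plain Python. It is a minimal sketch of the merge-learning idea only: the function names (`get_pair_counts`, `merge_pair`, `train_bpe`) are hypothetical, it is not the API of the Hugging Face Tokenizers library, and it skips what production tokenizers add (byte-level fallback, normalization, special tokens). The English corpus is just for readability; the same loop works on any Unicode text.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, words):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges from a whitespace-split corpus."""
    # Start with every word split into single characters.
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        words = merge_pair(best, words)
    return merges, words


merges, words = train_bpe("low low lower lowest", num_merges=2)
print(merges)
print(words)
```

Each learned merge becomes a new vocabulary entry, so frequent character sequences in the training text end up as single tokens. That is why a tokenizer trained mostly on other languages tends to split Odia text into many small pieces.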
@PaneerPakoda-d9e 3 months ago
@avb_fj Thanks for the info 🙏🏻
@yingpang2437 17 days ago
Could you share the notebook?
@yingpang2437 17 days ago
It is hard to follow without code.
@VarshneyKashish 1 month ago
Good, but I found this video a little confusing; it packs too many things into a single video.
@mohsenghafari7652 3 months ago
Thanks. Please share the code.
@avb_fj 3 months ago
As mentioned in the video, all the code produced in this video and all other videos on the channel is shared on my Patreon. www.patreon.com/c/NeuralBreakdownwithAVB
@ErkanER1227 3 months ago
I wish we didn't have to pay to access the materials 😢