🦙 LLAMA-2 : EASIEST WAY To FINE-TUNE ON YOUR DATA Using Reinforcement Learning with Human Feedback 🙌

9,284 views

Whispering AI

1 day ago

In this video, I'll show you the easiest, simplest, and fastest way to fine-tune Llama-2 on your local machine with a custom dataset! You can also use this tutorial to train/fine-tune any other Large Language Model (LLM). We will be using Reinforcement Learning with Human Feedback (RLHF) to train our Llama, which will improve its performance.
This is the technique such models are trained with, and in this video we will see how to fine-tune this LLM.
Please subscribe and like the video to help keep me motivated to make awesome videos like this one. :)
Free Google Colab for 4bit QLoRA fine-tuning of llama-2-7b model
Rise and Rejoice - Fine-tuning Llama 2 made easier with this Google Colab Tutorial
✍️Learn and write the code along with me.
🙏The hand promises that if you subscribe to the channel and like this video, it will release more tutorial videos.
👐I look forward to seeing you in future videos
Links:
Dataset to train: huggingface.co/datasets/Carpe...
Reward_dataset:
huggingface.co/datasets/Carpe...
Second Part:
• 🐐Llama 3 Fine-Tune with RLHF [Free Colab 👇🏽]
github.com/ashishjamarkattel/...
#llama #finetune #llama2 #artificialintelligence #tutorial #stepbystep #llm #largelanguagemodels
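For readers following along, here is a minimal sketch of how the step-1 (supervised fine-tuning) model setup with 4-bit QLoRA typically looks, assuming the Hugging Face transformers, peft, and bitsandbytes stack used in the video; the model name and LoRA hyperparameters below are illustrative placeholders, not the exact values from the Colab.

```python
# Minimal sketch: load llama-2-7b in 4-bit (QLoRA) and attach LoRA adapters
# for supervised fine-tuning (step 1 of the RLHF pipeline shown in the video).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"   # placeholder; any causal LM works

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit weights to fit consumer GPUs
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(                 # train only small adapter matrices
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # prints the small trainable fraction
```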

Comments: 50
@ahmedoumar3741
@ahmedoumar3741 4 months ago
Nice video, thanks!
@rabin1620
@rabin1620 11 months ago
Loved it. Definitely gonna try. Waiting for the next video.
@WhisperingAI
@WhisperingAI 11 months ago
Glad you liked the video
@fc4ugaming359
@fc4ugaming359 2 months ago
Hello brother, your video is very informative and covers every part: the theory, the code, and the explanations. But would it be possible to make a good demo, where you provide one paragraph and show how the data is processed and what output is generated, like a before-and-after demo of the model?
@zainab-fahim722
@zainab-fahim722 8 months ago
Can you please link the research paper shown at the beginning of the video? Thanks! PS: great video! Keep up the amazing work.
@WhisperingAI
@WhisperingAI 8 months ago
arxiv.org/abs/1909.08593 here it is. Sorry for the late reply, I was quite busy.
@Paperstressed
@Paperstressed 7 months ago
Dear sir, I have a question: during generation, the LLM is not producing a summary but is just returning the same text that was passed in the prompt.
@WhisperingAI
@WhisperingAI 7 months ago
This is a common problem when fine-tuning. Please check the dataloader and try increasing the context length. That should solve the issue.
@Paperstressed
@Paperstressed 7 months ago
@@WhisperingAI Sir, are you talking about this max_length? data_path = "test_policy.parquet"; train_dataset = TLDRDataset(data_path, tokenizer, "train", max_length=256)
@WhisperingAI
@WhisperingAI 7 months ago
@@Paperstressed Yes, please increase it to something higher, like 512 or 1024.
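For other readers hitting the same issue, a minimal sketch of the change being discussed, assuming the TLDRDataset class and the test_policy.parquet file come from the tutorial's code:

```python
# Sketch of the fix discussed above: give the dataset a longer context window so
# the prompt and the target summary both fit. TLDRDataset and the parquet path
# are assumed to come from the tutorial's repository.
train_dataset = TLDRDataset(
    "test_policy.parquet",
    tokenizer,
    "train",
    max_length=512,   # raised from 256; try 1024 if outputs are still truncated
)
```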
@user-cr1sk9fq6o
@user-cr1sk9fq6o 10 months ago
Looking forward to watching new videos on this topic. When will you upload the next video?
@WhisperingAI
@WhisperingAI 10 months ago
Thank you for the comment. If everything goes well, the next RLHF video will be up tomorrow, otherwise Saturday.
@user-cr1sk9fq6o
@user-cr1sk9fq6o 10 months ago
@@WhisperingAI How is it going? haha
@WhisperingAI
@WhisperingAI 10 months ago
Haha, I will upload it today. Sorry for the delay.
@user-cr1sk9fq6o
@user-cr1sk9fq6o 10 months ago
@@WhisperingAI Nice. Hope everything goes well. Thanks!
@emrahe468
@emrahe468 10 months ago
This is a good one, but our custom collected data doesn't have positive/negative columns. It would be nice if you could make a video about how to create a custom fine-tuning dataset for Llama-2 (without negatives and positives), and also how to use it. None of the videos on the internet focus on that second step; they just build the dataset with AutoTrain and do nothing after.
@minjunpark6613
@minjunpark6613 10 months ago
This video is specifically about RLHF, which requires the positive/negative data. If you want to fine-tune without it, you may want to search for something like 'LoRA'.
@emrahe468
@emrahe468 10 months ago
@@minjunpark6613 Thanks, after some more thinking, I may convert the dataset to fit RLHF :)
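For anyone in the same situation, here is a hypothetical sketch of turning plain prompt/response data into the chosen/rejected pairs a reward model expects; the column names and the ranking strategy are assumptions for illustration, not the exact format used in the video.

```python
# Hypothetical sketch: build "chosen"/"rejected" pairs from data that has no
# positive/negative columns. Here the rejected answer is a deliberately weak
# baseline; in practice a human (or a stronger model) should rank two real responses.
from datasets import Dataset

raw = [
    {
        "prompt": "Summarize: the post text goes here ...",
        "good": "A short, accurate summary.",
        "bad": "A rambling answer that ignores the post.",
    },
]

pairs = Dataset.from_list(
    [{"prompt": r["prompt"], "chosen": r["good"], "rejected": r["bad"]} for r in raw]
)
pairs.to_parquet("reward_pairs.parquet")  # hypothetical output file for the reward step
```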
@Cloudvenus666
@Cloudvenus666 11 months ago
Would you be able to share the notebook link as well?
@WhisperingAI
@WhisperingAI 11 months ago
Thank you for the comment. Since I wrote the code locally, I will only be able to share it after the next video (in 2-3 days). Sorry for the inconvenience.
@sauravmohanty3946
@sauravmohanty3946 10 months ago
Can I use a Falcon model with this code? Anything to keep in mind while using Falcon?
@WhisperingAI
@WhisperingAI 10 months ago
Yes, you can use any model for this. There is nothing special to keep in mind if you follow the tutorial.
@sauravmohanty3946
@sauravmohanty3946 10 months ago
For the same use case as in the above tutorial?
@WhisperingAI
@WhisperingAI 10 months ago
@@sauravmohanty3946 yes
@Paperstressed
@Paperstressed 7 months ago
Dear sir, can you teach us how to fine-tune an LLM for question answering?
@WhisperingAI
@WhisperingAI 7 months ago
Sure, I have this video you can check. It's a step-by-step walkthrough without narration. kzbin.info/www/bejne/fH7HYmiclNetfcU
@fc4ugaming359
@fc4ugaming359 2 months ago
If I want to add human feedback, can I do that? And if yes, then how?
@WhisperingAI
@WhisperingAI 2 months ago
Human feedback is the dataset created in steps 1 and 2, so you can create your own dataset matching that format to train all three steps.
@fc4ugaming359
@fc4ugaming359 1 month ago
@@WhisperingAI Like, during model training, can I provide some kind of feedback, such as a label selection?
@brainybotnlp
@brainybotnlp 11 months ago
Great content. Can you please share the code?
@WhisperingAI
@WhisperingAI 11 months ago
Thank you for your comment, but I will only be able to share the code after the next video, which I have planned in 2-3 days; it will be a full tutorial with source code. Sorry for the trouble.
@brainybotnlp
@brainybotnlp 11 months ago
@@WhisperingAI for sure waiting
@mahendras9238
@mahendras9238 9 months ago
​@@brainybotnlp colab.research.google.com/drive/1gAixKzPXCqjadh6KLsR5ZRUnb8VRvZl1?usp=sharing
@_SHRUTIDAYAMA
@_SHRUTIDAYAMA 6 months ago
The Colab code did not work... it shows a "cuda" error in step 3... can you please help?
@WhisperingAI
@WhisperingAI 6 months ago
Can you please provide an explanation of the issue?
@_SHRUTIDAYAMA
@_SHRUTIDAYAMA 6 months ago
0%| | 0/3 [00:00
@Ryan-yj4sd
@Ryan-yj4sd 11 months ago
How do I save and push the model to Hugging Face?
@WhisperingAI
@WhisperingAI 11 months ago
Thanks for the comment. In that case you can just add push_to_hub=True to the TrainingArguments, or call trainer.push_to_hub() after training the model. Hope this helps.
@Ryan-yj4sd
@Ryan-yj4sd 11 months ago
@@WhisperingAI But that just pushes the adapters, though? You can't do inference with that.
@WhisperingAI
@WhisperingAI 11 months ago
I'm not sure which video you are talking about, but I guess you are using LoRA and PEFT and defining your model via PEFT. In that case you need to: 1. save the model via model.base_model.save_pretrained("/path_to_model"), 2. load the model, 3. call model.push_to_hub().
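As a rough sketch of that sequence, assuming a PEFT/LoRA setup like the one described above; the adapter directory and Hub repo id are placeholders:

```python
# Sketch: merge the LoRA adapters into the base model so the pushed checkpoint
# can be loaded directly for inference, then upload model and tokenizer.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_dir = "path_to_model"                       # directory holding the saved adapters
model = AutoPeftModelForCausalLM.from_pretrained(adapter_dir)
model = model.merge_and_unload()                    # fold adapter weights into the base model

tokenizer = AutoTokenizer.from_pretrained(adapter_dir)
model.push_to_hub("your-username/llama2-7b-rlhf")   # hypothetical Hub repo id
tokenizer.push_to_hub("your-username/llama2-7b-rlhf")
```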
@AleixPerdigo
@AleixPerdigo 8 months ago
Great video. I would like to create my own fine-tuned model for my company. Could I contact you?
@namantyagi6294
@namantyagi6294 8 months ago
Is 12 GB of VRAM enough to fine-tune the model with 4-bit QLoRA?
@WhisperingAI
@WhisperingAI 8 months ago
It might not be enough. Llama-2-7B is about 13 GB in half precision; loading it in 4-bit shrinks the weights by roughly a factor of 4, to about 13/4 ≈ 3.3 GB. But gradients, optimizer states, and activations roughly quadruple that again during training, so you need around 4 × (13/4) ≈ 13 GB of VRAM, and it can climb to about 15 GB depending on the optimizer you use.
@namantyagi6294
@namantyagi6294 8 months ago
@@WhisperingAI Is it possible to offload some of the memory requirement to the CPU/RAM during fine-tuning?
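On the offloading question, here is a minimal sketch (assuming the transformers + bitsandbytes + accelerate stack) of capping GPU usage and letting the remaining layers spill to CPU RAM; the model name and memory limits are placeholders:

```python
# Sketch: 4-bit loading with an explicit memory budget. Layers that do not fit
# within the GPU limit are placed on CPU RAM by accelerate, at the cost of speed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",                       # let accelerate place layers automatically
    max_memory={0: "11GiB", "cpu": "30GiB"}, # cap GPU usage, spill the rest to CPU RAM
)
```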
@HaroldKouadio-gj7uw
@HaroldKouadio-gj7uw 24 days ago
There is an error message when I try to install trl; I don't know why, and I am stuck... Can I have your email to discuss this issue with you?
@WhisperingAI
@WhisperingAI 24 days ago
Can you raise an issue on the GitHub repo?