Insights from Finetuning LLMs with Low-Rank Adaptation

4,170 views

Sebastian Raschka

6 months ago

Links:
- LoRA: Low-Rank Adaptation of Large Language Models, arxiv.org/abs/2106.09685
- LitGPT: github.com/Lightning-AI/lit-gpt
- LitGPT LoRA Tutorial: github.com/Lightning-AI/lit-g...
Low-rank adaptation (LoRA) stands as one of the most popular and effective methods for efficiently training custom Large Language Models (LLMs). As practitioners of open-source LLMs, we regard LoRA as a crucial technique in our toolkit.
In this talk, I will delve into some practical insights gained from running hundreds of experiments with LoRA, addressing questions such as: How much can I save with quantized LoRA? Are Adam optimizers memory-intensive? Should we train for multiple epochs? How do we choose the LoRA rank?
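For readers who want to see the core idea in code, here is a minimal, hypothetical PyTorch sketch of a LoRA-augmented linear layer (not code from the talk or from LitGPT); `rank` and `alpha` correspond to the LoRA rank and scaling factor discussed in the talk:
```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical minimal LoRA wrapper: freezes a pretrained nn.Linear and
    learns a low-rank update (alpha / rank) * B @ A on top of it."""
    def __init__(self, linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        # A starts with small random values, B with zeros, so the
        # low-rank update is initially a no-op.
        self.A = nn.Parameter(torch.randn(rank, linear.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(linear.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        return self.linear(x) + (x @ self.A.T @ self.B.T) * self.scaling


# Usage sketch: wrap, e.g., a query or value projection of an attention layer.
layer = LoRALinear(nn.Linear(4096, 4096), rank=8, alpha=16.0)
out = layer(torch.randn(1, 4096))
```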

Comments: 15
@dariusduesentrieb
@dariusduesentrieb 5 months ago
Very nice! I was just starting to think about a small project using LLM finetuning, so this video is very useful. (Though the hard part is currently getting the training data ...).
@Sanguen666
@Sanguen666 1 month ago
Always funny to see the best videos on YouTube get almost zero views. Nice video.
@SebastianRaschka
@SebastianRaschka 1 month ago
Ha, thanks, I take this as a compliment :)
@Scientist287
@Scientist287 1 month ago
He is famous outside of YouTube; it's only a matter of time before he blows up here.
@franky07724
@franky07724 2 months ago
Thanks for the video and all the references. One suggestion: it would help to describe (write on the slides) the model/data/hardware/batch-and-epoch settings for some of the experiments, e.g., the LoRA/QLoRA memory/runtime trade-off at 6:40. Maybe add these details to the description, too. Great video!
@SebastianRaschka
@SebastianRaschka 2 months ago
I like that idea and will make sure to include more of the details in future videos!
@franky07724
@franky07724 2 months ago
@@SebastianRaschka Thanks!
@ThanhPham-xz2yo
@ThanhPham-xz2yo 4 days ago
Thanks
@MuratJumashev
@MuratJumashev 5 months ago
Thank you, @SebastianRaschka! Can you please give some ideas on how to add new languages to pretrained LLMs, for example, Llama 2? I reviewed its tokenizer, and our alphabet (Kyrgyz, Cyrillic) is mostly supported except for 3 uppercase letters. While English speakers can enjoy the LLMs, people speaking low-resource languages cannot benefit as much due to limited language support. I think a tutorial on this would be beneficial globally. I was thinking about adding Kyrgyz-English and English-Kyrgyz parallel corpora as well as monolingual texts. Do you think that would enable the "transfer learning" thing? I'm curious about your thoughts on whether this could be a viable solution to enhance the language model's capabilities for Kyrgyz.
@MuratJumashev
@MuratJumashev 5 months ago
The following illustrates the tokenization output from the Llama 2 tokenizer for a short sentence in Kyrgyz:
```
Original sentence: ӨМҮРҮҢДҮН аягына чейин оку. өмүр!
Encoded sentence: [29871, 214, 171, 30017, 213, 177, 30027, 213, 177, 213, 165, 30032, 213, 177, 30029, 1097, 29970, 29969, 29982, 477, 2950, 29977, 8197, 614, 1382, 29889, 29871, 30778, 29959, 30750, 29927, 29991]
Token ID 29871 -->
Token ID 214   --> �
Token ID 171   --> �
Token ID 30017 --> М
Token ID 213   --> �
Token ID 177   --> �
Token ID 30027 --> Р
Token ID 213   --> �
Token ID 177   --> �
Token ID 213   --> �
Token ID 165   --> �
Token ID 30032 --> Д
Token ID 213   --> �
Token ID 177   --> �
Token ID 30029 --> Н
Token ID 1097  --> а
Token ID 29970 --> я
Token ID 29969 --> г
Token ID 29982 --> ы
Token ID 477   --> на
Token ID 2950  --> че
Token ID 29977 --> й
Token ID 8197  --> ин
Token ID 614   --> о
Token ID 1382  --> ку
Token ID 29889 --> .
Token ID 29871 -->
Token ID 30778 --> ө
Token ID 29959 --> м
Token ID 30750 --> ү
Token ID 29927 --> р
Token ID 29991 --> !
```
@SebastianRaschka
@SebastianRaschka 5 months ago
Good points. Unfortunately, I am not familiar with LLMs for these languages. I think the challenge really is the tokenizer. If you want to leverage a pretrained LLM, even if you only want to train it further on new languages, it's crucial to use the same tokenizer that was used to train the LLM in the first place. Otherwise, the embedding layers won't recognize any of the tokens or will map them weirdly. What you could do, though, is extend the tokenizer with those characters, I think. E.g., if you use tiktoken for GPT-like models (I have an example in section 2.5, github.com/rasbt/LLMs-from-scratch/blob/main/ch02/01_main-chapter-code/ch02.ipynb), you can allow special tokens via "allowed_special", e.g., integers = tokenizer.encode(text, allowed_special={"<|endoftext|>"}). Special tokens get added after the main vocabulary. Instead of "<|endoftext|>" you could try to input the special characters you mentioned. I am not sure how or if it works, but maybe worth a try.
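A small, hypothetical sketch of the approach described above, using tiktoken's GPT-2 encoding; the extension part follows tiktoken's documented pattern of appending special tokens after the base vocabulary, and the token "<|kg|>" is made up for illustration (note that the LLM's embedding layer would also need to be resized for any new token IDs):
```python
import tiktoken

# Base GPT-2 BPE tokenizer, as used in section 2.5 of the referenced notebook.
base = tiktoken.get_encoding("gpt2")

# Allowing an existing special token during encoding:
ids = base.encode("Hello <|endoftext|> world", allowed_special={"<|endoftext|>"})
print(ids)

# Sketch of extending the tokenizer: new special tokens receive IDs appended
# after the main vocabulary (starting at base.n_vocab).
extended = tiktoken.Encoding(
    name="gpt2_extended",
    pat_str=base._pat_str,
    mergeable_ranks=base._mergeable_ranks,
    special_tokens={**base._special_tokens, "<|kg|>": base.n_vocab},
)
print(extended.encode("Ң <|kg|>", allowed_special={"<|kg|>"}))
```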
@programmingsiri5007
@programmingsiri5007 5 months ago
Thanks for sharing! What do you think of the recent CALM paper by Google, which allows composing LLMs in different ways than LoRA?
@user-xc9mo9qh4q
@user-xc9mo9qh4q 5 months ago
Kindly share a project tutorial to go along with this lecture.
@frankchieng
@frankchieng 2 months ago
It seems like diffusers just implemented DoRA in their newest version.
@SebastianRaschka
@SebastianRaschka 2 months ago
Ah, nice! Btw, if you are interested, I wrote an in-depth tutorial covering DoRA last month: "Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch" (magazine.sebastianraschka.com/p/lora-and-dora-from-scratch)
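For context, here is a minimal, hypothetical PyTorch sketch of the DoRA idea (decomposing the pretrained weight into a magnitude vector and a direction, and applying the LoRA update to the direction); it is a simplification for illustration, not the code from the linked tutorial:
```python
import torch
import torch.nn as nn

class DoRALinear(nn.Module):
    """Hypothetical minimal DoRA layer: magnitude m is learned separately,
    while the LoRA update B @ A modifies only the weight direction."""
    def __init__(self, linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.detach(), requires_grad=False)
        self.bias = linear.bias
        out_dim, in_dim = self.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))
        self.scaling = alpha / rank
        # Trainable magnitude, initialized to the column norms of the pretrained weight.
        self.m = nn.Parameter(self.weight.norm(p=2, dim=0, keepdim=True))

    def forward(self, x):
        lora = (self.B @ self.A) * self.scaling
        combined = self.weight + lora
        # Normalize each column to get the direction, then rescale by m.
        direction = combined / combined.norm(p=2, dim=0, keepdim=True)
        return nn.functional.linear(x, self.m * direction, self.bias)


# Usage sketch:
layer = DoRALinear(nn.Linear(512, 512), rank=8, alpha=16.0)
out = layer(torch.randn(1, 512))
```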