QLoRA paper explained (Efficient Finetuning of Quantized LLMs)

12,651 views

AI Bites

1 day ago

Comments: 28
@IgorAherne
@IgorAherne 1 month ago
Thank you, that's a beautiful explanation! One thing I struggle to understand is the term "quantization blocks" at 4:30, and why we need several of them. My understanding from the video is that we consider using 3 blocks of 16 bits to describe a number, which is 48 bits and more expensive than a 32-bit float. But couldn't we just use 16*3 = 48 bits per number instead? Using 48 bits (without splitting them) would give very high precision within the [0, 1] range, thanks to the powers of two. I did ask GPT, and it responded that there is a "scale factor" and a "zero point": constants that shift and stretch the distribution at 6:02. Although I understand these might be those quantization constants, I am not entirely sure what the 64 blocks described at 6:52 are. Is it because the rank of the matrix decompositions is 1, with 64 entries in both vectors?
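For anyone else puzzled by this: in the QLoRA paper, 64 is the block size, not a bit count. The weight tensor is chopped into blocks of 64 values, and each block stores its own FP32 quantization constant (its absolute maximum), so an outlier only distorts the block it sits in. Double quantization then quantizes those per-block constants themselves. A minimal numpy sketch of the idea, assuming a simple linear 4-bit grid rather than the paper's NF4 quantile grid:

```python
# A minimal sketch of block-wise quantization (hypothetical helper, not the
# actual bitsandbytes kernel). Uses a plain linear 4-bit grid; QLoRA's NF4
# instead places the 16 levels at quantiles of a normal distribution.
import numpy as np

def blockwise_quantize(w, block_size=64):
    """Quantize a flat FP32 tensor in independent blocks of `block_size`.

    Each block keeps one FP32 quantization constant (its absmax) plus
    `block_size` 4-bit codes, so an outlier only distorts its own block.
    """
    blocks = w.reshape(-1, block_size)
    absmax = np.abs(blocks).max(axis=1, keepdims=True)  # one constant per block
    normalized = blocks / absmax                        # now in [-1, 1]
    codes = np.round(normalized * 7).astype(np.int8)    # levels in [-7, 7]
    return codes, absmax

def blockwise_dequantize(codes, absmax):
    return codes.astype(np.float32) / 7 * absmax

w = np.random.randn(4096).astype(np.float32)
codes, absmax = blockwise_quantize(w)
w_hat = blockwise_dequantize(codes, absmax).ravel()
print("max reconstruction error:", np.abs(w - w_hat).max())
```

Storage is then 4 bits per weight plus one 32-bit constant per 64 weights (0.5 extra bits per parameter); double quantization compresses those constants as well, saving roughly 0.37 bits per parameter according to the paper.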
@pierluigiurru962
@pierluigiurru962 4 months ago
Your videos on LoRA finally made the concepts click for me. It was clearly explained! Thank you for the content you make
@AIBites
@AIBites 4 months ago
Glad it helped. Welcome 😊
@haz5248
@haz5248 7 months ago
That was really well explained, with intuitive diagrams and explanations. Thanks for the video, just subscribed.
@AIBites
@AIBites 7 months ago
thank you! :)
@cacamaricano
@cacamaricano 6 months ago
Thanks for connecting the dots!
@AIBites
@AIBites 6 months ago
glad you liked it! :)
@SudarakaYasindu
@SudarakaYasindu 1 month ago
Awesome explanation! ❤
@AIBites
@AIBites 28 days ago
glad you think so and thank you indeed :)
@vuluu4942
@vuluu4942 8 months ago
Thank you for the explanation! I find that it's very helpful!
@huitangtt
@huitangtt 5 months ago
Very well explained
@AIBites
@AIBites 4 months ago
Thanks so much 😊
@JaishreeramCoder
@JaishreeramCoder 5 months ago
amazing explanation
@AIBites
@AIBites 5 months ago
Glad you think so!
@wilfredomartel7781
@wilfredomartel7781 9 months ago
Waiting to see it.
@AIBites
@AIBites 8 months ago
Sure! :)
@rahul.vpoojari6553
@rahul.vpoojari6553 2 months ago
Thank you sire
@AIBites
@AIBites 28 days ago
my pleasure Rahul! :-)
@yayasy1362
@yayasy1362 1 month ago
I don't understand why you say that LoRA is fast for inference… in any case you still have to forward through the full-rank pretrained weights plus the low-rank finetuned weights.
@AIBites
@AIBites 28 days ago
Ah yes, if only we could quantize the weights, we could do better than the pre-trained weights. You are making a fair point here. Awesome, and thank you! :)
@yayasy1362
@yayasy1362 28 days ago
@@AIBites Yeah, if only we could replace the pretrained full-rank weights with the low-rank weights... Really nice video and illustrations! Thanks a lot!
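To make the merging point concrete: with plain LoRA, once finetuning is done the low-rank update can be folded into the frozen weight, so every forward pass costs exactly one matmul, the same as the base model. A minimal numpy sketch (illustrative shapes only):

```python
# A minimal sketch (illustrative shapes only) of merging a LoRA update
# into the frozen weight after finetuning.
import numpy as np

d, r = 512, 8
W = np.random.randn(d, d).astype(np.float32)  # frozen pretrained weight
B = np.random.randn(d, r).astype(np.float32)  # trained LoRA factor (d x r)
A = np.random.randn(r, d).astype(np.float32)  # trained LoRA factor (r x d)
x = np.random.randn(d).astype(np.float32)

# During training / unmerged inference: two paths through the layer.
y_unmerged = W @ x + B @ (A @ x)

# For deployment: fold the update into a single matrix once, so inference
# is one matmul, exactly like the base model.
W_merged = W + B @ A
y_merged = W_merged @ x

print(np.allclose(y_unmerged, y_merged, atol=1e-2))  # True up to FP32 rounding
```

With QLoRA the fold is less straightforward: the base weights are stored in 4-bit NF4 while the adapters stay in 16-bit, so W must be dequantized before BA can be merged in.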
@yeduniya657
@yeduniya657 8 months ago
Hey, I need your help. I have a curated set of notes and books, and I wish to use it to finetune a model. How can this be done?
@AIBites
@AIBites 8 months ago
Would you like to see a fine-tuning video on text data? Would that be useful? Do you have any suggestions on a dataset I could show fine-tuning on?
@yeduniya657
@yeduniya657 7 months ago
@@AIBites Yes, I have a suggestion: finetuning a model on my journal, in which I have written about the truth of nonduality and the illusory nature of reality. I am also actively curating books on truth, and would love your help.
@haz5248
@haz5248 7 months ago
@@AIBites That would be very helpful. There aren't many good videos on fine-tuning out there.
@AIBites
@AIBites 6 months ago
hope the fine-tuning video was of some help
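For anyone wanting to try this on their own notes, here is a hedged sketch of a QLoRA finetuning setup using the HuggingFace stack (transformers + peft + bitsandbytes); the model name and target modules are placeholders to adapt, and argument names may shift across library versions:

```python
# A hedged sketch of a QLoRA setup, not a definitive recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4 bit
    bnb_4bit_quant_type="nf4",              # QLoRA's NormalFloat4 data type
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

base = "meta-llama/Llama-2-7b-hf"           # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attach adapters to attention
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the adapters are trainable

# From here, tokenize your notes/books and train with transformers.Trainer
# (or trl's SFTTrainer); the frozen 4-bit base never receives gradients.
```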