How to code a long-context LLM: LongLoRA explained on Llama 2 100K

5,329 views

Discover AI

A day ago

Comments: 16
@sambhavkhur 11 months ago
Amazing video! I have one doubt regarding Shift Short Attention (S2-Attention), though, and was hoping you could clarify. Consider a sequence length of 8,000 that we divide into 4 groups of 2,000 tokens each: group 1 contains tokens 1 to 2,000, group 2 the next 2,000 tokens, and so on. The way I understand S2-Attention is that if we have 4 heads per group, we keep 2 heads as they are and shift the other 2 heads by half the group size. For those 2 shifted heads, the 1st row (in group 1) no longer contains information about token 1; that information is now in row 1,001, and the 1st row instead contains information about token 7,001. Similarly, the 1st row of the shifted heads in group 2 contains information about token 1,001. Once these 2 heads are shifted, we continue with the original self-attention mechanism for each group independently, concatenate the heads, and produce the input for the next layer. Am I correct, or is there an error in my understanding?
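For readers following the shifting described in this question, below is a minimal PyTorch sketch of the head-shift idea behind S2-Attention in LongLoRA. The tensor layout, the shift direction, and the function name s2_attn_group are illustrative assumptions, not the official implementation.

```python
import torch

def s2_attn_group(qkv: torch.Tensor, group_size: int) -> torch.Tensor:
    """Illustrative sketch of the head shift in S2-Attention (LongLoRA).

    qkv: (batch, seq_len, num_heads, head_dim), with seq_len divisible by group_size.
    Half of the heads are rolled by half a group along the sequence axis, so the
    two halves of the heads see differently aligned groups; self-attention is then
    computed independently inside each group.
    """
    bsz, seq_len, num_heads, head_dim = qkv.shape
    assert seq_len % group_size == 0

    shifted = qkv.clone()
    # Shift the second half of the heads by half a group along the sequence axis
    # (the sign of the shift is just a convention; shifting the other way is equivalent).
    shifted[:, :, num_heads // 2:] = torch.roll(
        qkv[:, :, num_heads // 2:], shifts=-group_size // 2, dims=1
    )

    # Split the sequence into independent groups for group-wise self-attention:
    # (batch * num_groups, group_size, num_heads, head_dim)
    groups = shifted.reshape(bsz * (seq_len // group_size), group_size, num_heads, head_dim)
    return groups
```

With the 8,000-token example above and group_size=2000, the unshifted heads attend within groups 1-2000, 2001-4000, etc., while the shifted heads attend within groups that straddle the original boundaries, which is what lets information flow between neighbouring groups.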
@sandhyas2033 2 months ago
Thanks a lot! Great videos with in-depth explanations. Please share the presentations as well for reference.
@gunwooo 9 months ago
Thanks for speaking English slowly. I'm Korean and my English isn't great, but I was able to understand a little 😊😊
@jdoejdoe6161 a year ago
Lots of errors when trying out the fine-tuning code. Please try it yourself and see.
@generativeresearch a year ago
Thanks for the valuable information.
@li-pingho1441 a year ago
Thank you so much!!
@htr-g8r a year ago
Paper links, please?
@sergialbert97 a year ago
Do you see a simple implementation for a "small" model within a notebook as feasible?
@comediansguidetotruecrime3836 a year ago
Are there any results on the long-context learning, like test results?
@AzureUnagi 10 months ago
Has anyone tried to conduct an experiment to inspect the lost-in-the-middle effect? I wonder if the S2 attention mechanism affects how the model utilizes the results of attention.
@icestika a year ago
How does it affect inference time?
@echofloripa a year ago
Would I be wrong to assume that faster training means faster inference? 36:01
@jimhrelb2135 a year ago
Inference time should be similar to running the foundation model at the same context length. LoRA-based methods only change a couple of weight matrices of the foundation model in place, so (production) inference time and memory remain unchanged.
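A minimal sketch of why this holds, assuming the Hugging Face transformers and peft libraries; the model id and the adapter path are placeholders, not artifacts from the video. The low-rank update B·A can be merged back into the frozen base weights, so the deployed model keeps exactly the base architecture and the same inference cost.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model (placeholder model id).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Attach a trained LoRA adapter (placeholder path).
model = PeftModel.from_pretrained(base, "path/to/longlora-adapter")

# Fold the low-rank update into the base weights: W <- W + B @ A,
# removing the adapter layers entirely.
merged = model.merge_and_unload()
merged.save_pretrained("llama-2-7b-longlora-merged")
```

After merging, serving the model is identical to serving the original foundation model at the same context length, which is the point made in the reply above.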
@fabianaltendorfer11 a year ago
Hey, do you have an idea how I could use screenshots in the documents I want to use in a RAG use case? Thanks
@jmirodg7094 a year ago
Extremely interesting, thanks. The context length is a real limitation for now.
@li-pingho1441 a year ago
Nice work!
Understanding 4bit Quantization: QLoRA explained (w/ Colab)
42:06
MISTRAL 7B explained - Preview of LLama3 LLM
41:30
Discover AI
9K views
New LLM-Quantization LoftQ outperforms QLoRA
14:15
Discover AI
4.6K views
LLAMA-3.1 🦙: EASIEST WAY To FINE-TUNE ON YOUR DATA 🙌
15:08
Prompt Engineering
33K views
[1hr Talk] Intro to Large Language Models
59:48
Andrej Karpathy
2.2M views
GraphRAG: The Marriage of Knowledge Graphs and RAG: Emil Eifrem
19:15
LoRA explained (and a bit about precision and quantization)
17:07
RoPE Rotary Position Embedding to 100K context length
39:56
Discover AI
3.6K views
Building long context RAG with RAPTOR from scratch
21:30
LangChain
33K views