What is Retrieval Augmented Generation (RAG) and JinaAI?

2,989 views

5 months ago

Retrieval Augmented Generation (RAG) is one of the big AI patterns you must know for 2024. In this tutorial I break down the RAG pattern, what the Jina AI embeddings model is, and why Jina AI is a game changer for LLMs such as GPT, Llama and Mistral.
In the video, Chris breaks down the issues with LLMs such as GPT, Mixtral 7B and Llama-2, and how the RAG pattern helps solve the problems of hallucinations and extending data.
Chris also shows in detail exactly how the RAG pattern works under the hood, so you can truly understand what's going on.
He also talks about how Jina AI is different, how it works, how it compares to the OpenAI ada embeddings model, and how Jina AI will kick off the next model trend for 2024.
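
As a rough illustration of the retrieve-then-generate flow described above, here is a minimal sketch. It assumes a toy bag-of-words `embed` function in place of a real embedding model such as Jina AI's, and it stops at building the prompt rather than calling an LLM. The function names and sample chunks are made up for illustration and are not taken from the video.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank the knowledge chunks by similarity to the question, keep the top k."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Augment the question with the retrieved chunks before sending it to an LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Hypothetical knowledge base, invented for this example.
chunks = [
    "The warranty on the X200 laptop lasts 24 months.",
    "Our office is closed on public holidays.",
    "The X200 laptop ships with a 65W charger.",
]
question = "How long is the X200 warranty?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline the chunk vectors would normally be computed once and stored in a vector index instead of being re-embedded for every question.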

Comments: 21

@mohamedghazal3630 4 months ago

Thank you! very simple, precise, yet very informative!

@chrishayuk 4 months ago

Glad it was helpful!

@JAnders-oy2sv 2 months ago

Another clear and informative video, thank you! I agree, I think RAG will be huge in 2024. One thing I would like to know, is it possible to have the LLM list or identify the chunk or chunks used to produce a response? Perhaps metadata or indexes can be added to the chunks which the LLM can use when generating a response.
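
One way the idea in this comment could be approached is sketched below: each chunk carries an id and source metadata, the prompt asks the model to cite ids, and the reply is parsed to recover which chunks were cited. The `Chunk` fields, the id format, the file name and the sample answer are all hypothetical, and a real model may not follow the citation instruction reliably.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str  # hypothetical id format: "<document>#<position>"
    source: str    # where the chunk came from, e.g. a file name
    text: str


def build_cited_prompt(question: str, chunks: list[Chunk]) -> str:
    """Prefix every chunk with its id so the model has something to cite."""
    lines = [f"[{c.chunk_id}] {c.text}" for c in chunks]
    return (
        "Answer using only the context. "
        "After each claim, cite the chunk id in square brackets.\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )


def cited_chunks(answer: str, chunks: list[Chunk]) -> list[Chunk]:
    """Return the known chunks whose ids appear in the answer, in order, without repeats."""
    by_id = {c.chunk_id: c for c in chunks}
    found: list[Chunk] = []
    for chunk_id in re.findall(r"\[([^\[\]]+)\]", answer):
        chunk = by_id.get(chunk_id)
        if chunk is not None and chunk not in found:
            found.append(chunk)
    return found


chunks = [
    Chunk("manual#1", "manual.pdf", "The warranty lasts 24 months."),
    Chunk("manual#2", "manual.pdf", "The charger is rated at 65W."),
]
prompt = build_cited_prompt("How long is the warranty?", chunks)

# A made-up model reply, hard-coded here because no LLM is called.
answer = "The warranty lasts 24 months [manual#1]."
sources = cited_chunks(answer, chunks)
print([c.source for c in sources])
```

Ids that the model invents are ignored by `cited_chunks`, which gives a basic check that every citation points at a chunk that was actually supplied.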

@jonb9806 4 months ago

Great video Chris. Even I could understand your explanation!

@chrishayuk 4 months ago

thank you, it was a really difficult one to find the right angle for, glad it was useful.

@spheroid77 4 months ago

Chris, great video as always. I learn so much from your channel, thanks. One thing I at least didn't quite "get" from this - where you talk about vectorization and embeddings - what actually *is* that process? The general concept I understand - turn the chunk into a numerical vector and compare them for similarity - but the vectorization itself - what is JinaAI doing at that point and how does it overcome e.g. the challenges of mismatching vocab between the question and the knowledge chunk without being externally trained itself on a bunch of stuff? Or maybe the embeddings are based on some other training from elsewhere? Was just a bit hazy on that point... maybe a thought for a future video if you're inclined :)
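
On the vocabulary-mismatch point raised here, a small sketch may help. In general, embedding models are neural networks trained beforehand on large text corpora, so words used in similar contexts end up with nearby vectors. The dense vectors below are hand-made for illustration and are not the output of Jina AI or any other real model; they only show why a dense representation can match "car" with "automobile" when exact word overlap cannot.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# One-hot vectors over the vocabulary ["car", "automobile", "banana"]:
# every distinct word is orthogonal to every other word.
one_hot = {
    "car": [1.0, 0.0, 0.0],
    "automobile": [0.0, 1.0, 0.0],
    "banana": [0.0, 0.0, 1.0],
}

# Hand-made 2-dimensional "learned" vectors (assumed values, not real model output).
dense = {
    "car": [0.90, 0.10],
    "automobile": [0.85, 0.20],
    "banana": [0.05, 0.95],
}

exact_match = cosine(one_hot["car"], one_hot["automobile"])
dense_synonyms = cosine(dense["car"], dense["automobile"])
dense_unrelated = cosine(dense["car"], dense["banana"])

print(f"one-hot car vs automobile: {exact_match:.2f}")
print(f"dense   car vs automobile: {dense_synonyms:.2f}")
print(f"dense   car vs banana:     {dense_unrelated:.2f}")
```

With word-overlap vectors the two synonyms score zero, while the dense vectors place them close together and keep the unrelated word far away.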

@chrishayuk 4 months ago

you're right.. that's probably a really good video to do, as it's quite complex. it's quite interesting. my latest video on tiktoken, explains vectorization for decoder models, but that's a little simplistic compared to embedding models. will do a video on embeddings

@spheroid77 4 months ago

Fantastic, thanks! @chrishayuk

@willarnold3121 5 months ago

Great video Chris!

@chrishayuk 5 months ago

Thaaaank you!!

@juliussakalys9600 5 months ago

Hi Chris, I appreciate the high quality content! Could you do a video or just give a simple reply on where you get your expertise from? Maybe some communities, projects or anything of the sort. Personally (and I am sure other people as well), I would like to become proficient in basically the same things that you are an expert at when it comes to engineering solutions for AI related problems :)

@chrishayuk 5 months ago

That is a pretty difficult one to answer; in all honesty, I just let myself play. I’ve been using the RAG pattern a lot at work. However, I felt the explanations of RAG were either 1 level too high or 2 levels too deep. That’s usually when I’ll do a video like this one, just to go under the covers and show what’s really happening.

@juliussakalys9600 5 months ago

Thanks, I guess the takeaway is to try and play more with such things myself.

@chrishayuk 5 months ago

@juliussakalys9600 it’s tough to know where to look and start, I’ll try and put some sort of guide together

@harrykekgmail 3 months ago

quality explanations!

@chrishayuk 3 months ago

Thank you, glad it was useful

@Kopp141 5 months ago

Thank you for a fantastic breakdown of RAG. I can now see why my Copilot trial at work is so bad at information retrieval. I'm guessing that as the queries get more complex and the spread of the data becomes wider, this method will become less effective. Does that push us toward a rolling fine-tune approach to a base model?

@chrishayuk 4 months ago

i think the hybrid model where you have a circle of RAG and finetune is likely the way forward

@path1024 5 months ago

If you want facts you need to pump the determinism by lowering the temperature.

@chrishayuk 5 months ago

Only works if the data is in the training set, and it also doesn’t solve the traceability issue. Finally, the models shouldn’t be making up answers for Q&A type questions; this is where models will get better, through routing questions to the correct expert with MoE.

@path1024 5 months ago

@chrishayuk Well, it just makes it stick to the highest confidence answer. You were saying it kept putting out a different answer. That's usually temperature and its effect on topP and topK. And the higher the temperature, the more it hallucinates. It sounds like you're just saying it should know when to lower the temperature on its own for the type of question.
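
To make the temperature discussion in this thread concrete, here is a minimal sketch of how temperature rescales a model's next-token scores before sampling. The logits are invented for illustration, and the function name is not from any particular library. Lower temperature concentrates probability on the top-scoring token, which makes output more repeatable, but it cannot add facts the model was never given.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # sharp: close to always picking token 0
high = softmax_with_temperature(logits, 2.0)  # flat: the other tokens get real probability

print("T=0.2:", [round(p, 3) for p in low])
print("T=2.0:", [round(p, 3) for p in high])
```

Top-p and top-k filters, when used, are typically applied to this rescaled distribution, so temperature changes how much probability mass the surviving candidates share.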