How to use Custom Prompts for RetrievalQA on LLaMA-2 7B

  15,897 views

Sam Witteveen

1 day ago

How to use Custom Prompts for RetrievalQA on LLaMA-2 7B and 13B
Colab: drp.li/0z7GR
My Links:
Twitter - / sam_witteveen
Linkedin - / samwitteveen
Github:
github.com/samwit/langchain-t... (updated)
github.com/samwit/llm-tutorials
00:00 Intro
01:02 13B Model Testing
03:34 7B Model Testing
05:15 Custom Prompt
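
The custom prompt the video builds follows LLaMA-2's chat format. A minimal sketch of the idea in plain Python (the exact wording in the Colab notebook may differ; the system text and the `build_prompt` helper here are illustrative assumptions):

```python
# LLaMA-2 chat-format prompt for RetrievalQA-style question answering.
# [INST] / <<SYS>> are the special tokens from Meta's chat fine-tuning;
# the system text below is illustrative, not copied from the notebook.

SYSTEM_PROMPT = (
    "You are a helpful assistant. Use the following context to answer the "
    "question at the end. If you don't know the answer, just say you don't "
    "know; please don't share false information."
)

TEMPLATE = (
    "[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    "CONTEXT: {context}\n\n"
    "Question: {question} [/INST]"
)

def build_prompt(context: str, question: str) -> str:
    """Fill the template with retrieved context and the user's question."""
    return TEMPLATE.format(
        system=SYSTEM_PROMPT, context=context, question=question
    )

print(build_prompt("Llama 2 ships in 7B, 13B and 70B sizes.",
                   "What sizes does Llama 2 come in?"))
```

In LangChain, a string template like this would typically be wrapped in a `PromptTemplate(input_variables=["context", "question"], template=...)` and passed to `RetrievalQA.from_chain_type(..., chain_type_kwargs={"prompt": prompt})`; see the Colab for the actual code.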

Comments: 40
@jutubariel 8 months ago
I can't be grateful enough. Your prompt template does wonders. Thanks for the great work!
@Canna_Science_and_Technology 8 months ago
Sam, thanks for all you do and keeping us updated. I’ve improved many apps based on your videos. 🧠
@mazenlahham8029 10 months ago
Amazing Sam, as always ❤️
@UncleDao 10 months ago
That is helpful. Thank you for great content.
@romainjouhameau2764 10 months ago
Really helpful video, thanks a lot !
@guanjwcn 10 months ago
Thanks, Sam. For once I was ahead of you, by having tried to incorporate memory. 😅 FYI, I saw people using marker tags to wrap and indicate the context. It works quite well.
@navneetkrc 10 months ago
Have you tried text relevancy using open LLMs? I only need Llama-2 to return relevant/non-relevant, but I'm getting junk/inconsistent values. I used the 7B/13B Llama-2 models; production-wise I can't use the 70B models.
@kelkka7 10 months ago
Fantastic. Thank you!
@VenkatesanVenkat-fd4hg 10 months ago
Very helpful video; waiting for other Llama-2 experiments. Could you discuss hyperparameters like temperature, repetition penalty tuning, and top_k & top_p adjustments on Llama-2?
@sayanosis 10 months ago
Gold content 💙
@dare2dream148 10 months ago
Thanks again Sam for another amazing video! I just tried 7B on my project and found it challenging to make the model just say "I don't know" when the retrieved context doesn't really help with the user's question (an example being asking how to play football: the distance in the vector space is likely quite high). This is the case even when I added it explicitly in the system prompt. Wondering if you encountered the same challenge or had some ideas on solving it?
@howlger 10 months ago
That's helpful, thanks for sharing. In the prompt there are a few '/n', which should probably all be '\n' line breaks, right?
@dreamphoenix 10 months ago
Thank you.
@medicationrefill 10 months ago
Nice 👍
@zd676 10 months ago
Also, the testing of the prompt seems anecdotal, by trial and error. I wonder if there is a systematic and scalable way of running these tests automatically (using some metrics). Would love to see a video around this topic! Great content nonetheless!
@kishorkukreja7733 10 months ago
Would highly appreciate a video on retrieval + custom tools + Gorilla together.
@devtest202 4 months ago
Hello Sam, thank you, what a good video. I have a question: I have tried to manage chat history and even to reformulate questions with my own chain, but it's not really possible to get something stable, especially when asking questions that must have a defined context. Do you have any recommendations on this? Thank you!!
@happyday.mjohnson 10 months ago
Thank you for these videos. They are helpful! Have you tried using metadata when retrieving? It is not clear to me how the Retriever code actually uses it. Again, thank you.
@samwitteveenai 10 months ago
If you look in this notebook, it is actually using the name/path of the doc as metadata, and it returns the docs so you can extract that out. I will look at doing more with metadata in the future.
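
The idea in the reply above can be sketched without LangChain: retrieved documents carry a metadata dict, and the source name/path is read back out of the returned docs. The class and field names below mirror LangChain's `Document`, but this is a standalone illustration, not the library's code:

```python
class Document:
    """Minimal stand-in for LangChain's Document: text plus metadata."""
    def __init__(self, page_content: str, metadata: dict):
        self.page_content = page_content
        self.metadata = metadata  # e.g. {"source": "new_papers/llama2.pdf"}

def sources_of(docs: list) -> list:
    """Extract the source name/path stored in each document's metadata."""
    return [d.metadata.get("source", "unknown") for d in docs]

docs = [Document("Llama 2 is a family of LLMs...", {"source": "llama2.pdf"})]
print(sources_of(docs))  # ['llama2.pdf']
```

With `return_source_documents=True` on the chain, the returned `source_documents` list can be passed through a helper like this to list which files the answer drew on.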
@judrogue 10 months ago
Great video. Very helpful. Would you give me a hint? I would like to use the method "map reduce" instead of "stuff," but I couldn't figure out how to do it. Thank you in advance.
@samwitteveenai 10 months ago
I have a video that shows that for summarization; check that out.
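
For anyone comparing the two chain types: "stuff" concatenates all retrieved chunks into one prompt, while "map reduce" queries each chunk separately and then combines the partial answers. A toy illustration in plain Python (not LangChain's actual implementation; `ask` stands in for an LLM call):

```python
def stuff_chain(docs, question, ask):
    # "stuff": put every retrieved chunk into a single prompt, ask once.
    context = "\n\n".join(docs)
    return ask(f"Context: {context}\n\nQuestion: {question}")

def map_reduce_chain(docs, question, ask):
    # "map": ask the question against each chunk individually...
    partials = [ask(f"Context: {d}\n\nQuestion: {question}") for d in docs]
    # ..."reduce": combine the partial answers in one final call.
    combined = "\n".join(partials)
    return ask(f"Partial answers:\n{combined}\n\nQuestion: {question}")
```

In LangChain the switch is usually just `RetrievalQA.from_chain_type(llm, chain_type="map_reduce", retriever=retriever)` in place of `chain_type="stuff"`; map reduce trades more LLM calls for fitting long contexts.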
@ahmadzaimhilmi 10 months ago
LangChain regurgitates the answer in paragraphs and then outputs the list of sources at the end. There's no way of knowing which sentence refers to which source. I'm not sure if it's possible to restructure the answer so it looks like this:
This is sentence one [1]. This is sentence two [2].
[1] Source 1
[2] Source 2
I know that Vectara can format the answer nicely this way.
@MrMyjanusn 10 months ago
How do you incorporate memory with LangChain and Llama-2 Chat?
@cbusse7842 10 months ago
I hope you're working on the new openAI API for 3.5 fine-tuning :D
@samwitteveenai 10 months ago
lol yes it is all done, will go live in a bit.
@joefiorini 10 months ago
Thanks Sam, been enjoying your content! One thing has been bugging me though: the "if you don't know the answer please don't share false information" statement in the prompt doesn't make much sense to me, given that LLMs are just parroting information out of the neural network (glorified autocomplete). So how do they know what they don't know? Does that prompt actually work?
@samwitteveenai 10 months ago
The same way that things like CoT etc. work. While the model is just predicting the next token, to do that it is attending over the whole context window. The way that statement works is that if there isn't something with a strong set of attention scores, the model has some sense that the correct info isn't there. It certainly doesn't work all the time, but it works more often than if the statement is not there. If you compare these output results to the last video's on the same questions, you will notice that this has worked out to not include second answers and other add-ons etc. I certainly understand it can seem amazing that it works.
@joefiorini 10 months ago
@@samwitteveenai wow, thanks for the detailed response. Need to read through this a couple more times to really grok it. Might make an interesting video topic?
@zd676 10 months ago
I thought the [INST] token is only meant for the user's role, to separate the user from the system so the model can better understand the chat history?
@MatthiasWelwarsky 8 months ago
As far as I understood llama2 prompting, and looking at the chat wrapper that was published by Meta (I think?), [INST] should contain the user input, i.e. the instruction. I think the delimiter for the system portion of the prompt should be outside of the [INST] bracket.
@ptah23 10 months ago
Are you contributing the TogetherLLM class to LangChain?
@ivanlee7450 10 months ago
Is it possible to use LangChain and Hugging Face to do answer validation?
@samwitteveenai 10 months ago
In any particular way? How would you want to do the validation?
@ivanlee7450 10 months ago
@@samwitteveenai Ask the user for input; after receiving the user's answer, put the question and answer into the model, and the model validates it by saying whether it is a good or bad answer.
@alx8439 10 months ago
This is what I've been saying all along. LangChain adds no value other than wrapping your own prompts inside its own prompts. And it adds some bugs. There's nothing you can't do yourself without it.
@joefiorini 10 months ago
As a programmer who has just started learning to code against LLMs, I can understand this POV, but I have seen some advantages to using LangChain. I have worked with a number of different models: a bunch of local ExLlama ones, and OpenAI, to name a couple. I experimented with some different vector stores and embeddings. Having the same interface for everything made it easy to swap around and still know how to use the new tools. The prompts do feel like a mess, though. That said, I was a little frustrated when I couldn't easily use some advanced spaCy features with their embedding API. If I were writing an app and already knew the entire set of tools I would be using, then I could understand LangChain being overkill.
@googleyoutubechannel8554 4 months ago
"So the problem is probably not in the context..." Probably? Probably? The context at this stage is just a text string that _your code created_ (well shitty langchain), and after the first prompt/response, it's still just a string that you can read! It's madness that people are still using langchain, a system that hides the most important and relevant information the dev needs, and it' just a simple text string?! Any system for developing useful tools against LLMs should do the exact opposite of how langchain works at development time. The full prompt history, in a request / response format, should be the #1 feature of any LLM dev toolchain. And it's easy to do this, it's a bunch of human readable STRINGs. Y'all still don't understand why it's a problem the way you're developing (slowly) with langchain. Unbelievable.
@yanakristian5903 8 months ago
Hi, this prompt somehow does not work for the model Photolens/llama-2-7b-langchain-chat. If I put the SYS message outside the [INST], I get better results. But still, sometimes instead of getting just one answer, I get several separated by [/INST], or sometimes I get results with some JSON formatting inside. I have noticed this happens when the question is not in the context and the model has difficulty answering it.
@Florian_2066 8 months ago
Hi, thanks for your feedback. I have the same issue with LLaMA-2 7B. I wondered which system prompt you use?
@JamieGumb 2 months ago
The line together.Models.start("togethercomputer/llama-2-70b-chat") gives an error:
AttributeError: type object 'Models' has no attribute 'start'
This is despite all the previous code having run and executed fine.