How to use Custom Prompts for RetrievalQA on LLaMA-2 7B

  15,897 views

Sam Witteveen

1 day ago

How to use Custom Prompts for RetrievalQA on LLaMA-2 7B and 13B
Colab: drp.li/0z7GR
My Links:
Twitter - / sam_witteveen
Linkedin - / samwitteveen
Github:
github.com/samwit/langchain-t... (updated)
github.com/samwit/llm-tutorials
00:00 Intro
01:02 13B Model Testing
03:34 7B Model Testing
05:15 Custom Prompt
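
The custom prompt the video builds follows LLaMA-2's chat format. A minimal sketch of the idea in plain Python (the exact wording in the Colab notebook may differ; the system text and the `build_prompt` helper here are illustrative assumptions):

```python
# LLaMA-2 chat-format prompt for RetrievalQA-style question answering.
# [INST] / <<SYS>> are the special tokens from Meta's chat fine-tuning;
# the system text below is illustrative, not copied from the notebook.

SYSTEM_PROMPT = (
    "You are a helpful assistant. Use the following context to answer the "
    "question at the end. If you don't know the answer, just say you don't "
    "know; please don't share false information."
)

TEMPLATE = (
    "[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    "CONTEXT: {context}\n\n"
    "Question: {question} [/INST]"
)

def build_prompt(context: str, question: str) -> str:
    """Fill the template with retrieved context and the user's question."""
    return TEMPLATE.format(
        system=SYSTEM_PROMPT, context=context, question=question
    )

print(build_prompt("Llama 2 ships in 7B, 13B and 70B sizes.",
                   "What sizes does Llama 2 come in?"))
```

In LangChain, a string template like this would typically be wrapped in a `PromptTemplate(input_variables=["context", "question"], template=...)` and passed to `RetrievalQA.from_chain_type(..., chain_type_kwargs={"prompt": prompt})`; see the Colab for the actual code.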

Comments: 40
@jutubariel 8 months ago
I can't be grateful enough. Your prompt template does wonders. Thanks for the great work!
@Canna_Science_and_Technology 8 months ago
Sam, thanks for all you do and keeping us updated. I’ve improved many apps based on your videos. 🧠
@mazenlahham8029 10 months ago
Amazing Sam, as always ❤️
@UncleDao 10 months ago
That is helpful. Thank you for great content.
@romainjouhameau2764 10 months ago
Really helpful video, thanks a lot !
@guanjwcn 10 months ago
Thanks, Sam. For once I was ahead of you, by having tried to incorporate memory. 😅 FYI, I saw people using marker tags to wrap and indicate the context. It works quite well.
@navneetkrc 10 months ago
Have you tried text relevancy using open LLMs? I only need Llama-2 to return relevant/non-relevant, but I'm getting junk/inconsistent values. I used the 7B/13B Llama-2 models; production-wise I can't use the 70B models.
@kelkka7 10 months ago
Fantastic. Thank you!
@VenkatesanVenkat-fd4hg 10 months ago
Very helpful video; waiting for other Llama-2 experiments. Could you discuss hyperparameters like temperature, repetition penalty tuning, and top_k & top_p adjustments on Llama-2?
@sayanosis 10 months ago
Gold content 💙
@dare2dream148 10 months ago
Thanks again Sam for another amazing video! I just tried 7B on my project and found it challenging to make the model just say "I don't know" when the retrieved context doesn't really help with the user's question (an example being asking how to play football: the distance in the vector space is likely quite high). This is the case even when I added it explicitly in the system prompt. Wondering if you encountered the same challenge or had some ideas on solving it?
@howlger 10 months ago
That's helpful, thanks for sharing. In the prompt there are a few '/n', which should probably all be '\n' line breaks, right?
@dreamphoenix 10 months ago
Thank you.
@medicationrefill 10 months ago
Nice 👍
@zd676 10 months ago
Also, the testing of the prompt seems anecdotal, by trial and error. I wonder if there is a systematic and scalable way of running these tests automatically (using some metrics). Would love to see a video around this topic! Great content nonetheless!
@kishorkukreja7733 10 months ago
Would highly appreciate a video on retrieval + custom tools + Gorilla together.
@devtest202 4 months ago
Hello Sam, thank you, what a good video. I have a question: I have tried to manage chat history and even to reformulate questions with my own chain, but it's not really possible to get something stable, especially when asking questions that must have a defined context. Do you have any recommendations on this? Thank you!!
@happyday.mjohnson 10 months ago
Thank you for these videos. They are helpful! Have you tried using metadata when retrieving? It is not clear to me how the Retriever code actually uses it. Again, thank you.
@samwitteveenai 10 months ago
If you look in this notebook, it is actually using the name/path of the doc as metadata, and it returns the docs so you can extract that out. I will look at doing more with metadata in the future.
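
The idea in the reply above can be sketched without LangChain: retrieved documents carry a metadata dict, and the source name/path is read back out of the returned docs. The class and field names below mirror LangChain's `Document`, but this is a standalone illustration, not the library's code:

```python
class Document:
    """Minimal stand-in for LangChain's Document: text plus metadata."""
    def __init__(self, page_content: str, metadata: dict):
        self.page_content = page_content
        self.metadata = metadata  # e.g. {"source": "new_papers/llama2.pdf"}

def sources_of(docs: list) -> list:
    """Extract the source name/path stored in each document's metadata."""
    return [d.metadata.get("source", "unknown") for d in docs]

docs = [Document("Llama 2 is a family of LLMs...", {"source": "llama2.pdf"})]
print(sources_of(docs))  # ['llama2.pdf']
```

With `return_source_documents=True` on the chain, the returned `source_documents` list can be passed through a helper like this to list which files the answer drew on.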
@judrogue 10 months ago
Great video. Very helpful. Would you give me a hint? I would like to use the method "map reduce" instead of "stuff," but I couldn't figure out how to do it. Thank you in advance.
@samwitteveenai 10 months ago
I have a video that shows that for summarization; check that out.
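
For anyone comparing the two chain types: "stuff" concatenates all retrieved chunks into one prompt, while "map reduce" queries each chunk separately and then combines the partial answers. A toy illustration in plain Python (not LangChain's actual implementation; `ask` stands in for an LLM call):

```python
def stuff_chain(docs, question, ask):
    # "stuff": put every retrieved chunk into a single prompt, ask once.
    context = "\n\n".join(docs)
    return ask(f"Context: {context}\n\nQuestion: {question}")

def map_reduce_chain(docs, question, ask):
    # "map": ask the question against each chunk individually...
    partials = [ask(f"Context: {d}\n\nQuestion: {question}") for d in docs]
    # ..."reduce": combine the partial answers in one final call.
    combined = "\n".join(partials)
    return ask(f"Partial answers:\n{combined}\n\nQuestion: {question}")
```

In LangChain the switch is usually just `RetrievalQA.from_chain_type(llm, chain_type="map_reduce", retriever=retriever)` in place of `chain_type="stuff"`; map reduce trades more LLM calls for fitting long contexts.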
@ahmadzaimhilmi 10 months ago
LangChain regurgitates the answer in paragraphs and then outputs the list of sources at the end. There's no way of knowing which sentence refers to which source. I'm not sure if it's possible to restructure the answer so it looks like this:
This is sentence one [1]. This is sentence two [2].
[1] Source 1
[2] Source 2
I know that Vectara can format the answer nicely this way.
@MrMyjanusn 10 months ago
How do you incorporate memory with LangChain and Llama-2 Chat?
@cbusse7842 10 months ago
I hope you're working on the new openAI API for 3.5 fine-tuning :D
@samwitteveenai 10 months ago
lol yes it is all done, will go live in a bit.
@joefiorini 10 months ago
Thanks Sam, been enjoying your content! One thing has been bugging me though: the "if you don't know the answer please don't share false information" statement in the prompt doesn't make much sense to me, given that LLMs are just parroting information out of the neural network (glorified autocomplete). So how do they know what they don't know? Does that prompt actually work?
@samwitteveenai 10 months ago
The same way that things like CoT etc. work. While the model is just predicting the next token, to do that it is attending over the whole context window. The way that statement works is that if there isn't something with a strong set of attention scores, the model has some sense that the correct info isn't there. It certainly doesn't work all the time, but it works more often than if the statement is not there. If you compare these output results to the last video's on the same questions, you will notice that this has worked out to not include second answers and other add-ons etc. I certainly understand it can seem amazing that it works.
@joefiorini 10 months ago
@@samwitteveenai wow, thanks for the detailed response. Need to read through this a couple more times to really grok it. Might make an interesting video topic?
@zd676 10 months ago
I thought the [INST] token is only meant for the user's role, to separate the user from the system so the model can better understand the chat history?
@MatthiasWelwarsky 8 months ago
As far as I understood llama2 prompting, and looking at the chat wrapper that was published by Meta (I think?), [INST] should contain the user input, i.e. the instruction. I think the delimiter for the system portion of the prompt should be outside of the [INST] bracket.
@ptah23 10 months ago
Are you contributing the TogetherLLM class to LangChain?
@ivanlee7450 10 months ago
Is it possible to use LangChain and Hugging Face to do answer validation?
@samwitteveenai 10 months ago
In any particular way? How would you want to do the validation?
@ivanlee7450 10 months ago
@@samwitteveenai Ask the user for input; after receiving the user's answer, put the question and answer into the model, and the model validates it by saying whether it is a good or bad answer.
@alx8439 10 months ago
This is what I've been saying all along. LangChain adds no value other than wrapping your own prompts inside its own prompts. And it adds some bugs. There's nothing you can't do yourself without it.
@joefiorini 10 months ago
As a programmer who has just started learning to code against LLMs, I can understand this POV, but I have seen some advantages to using LangChain. I have worked with a number of different models: a bunch of local ExLlama ones, and OpenAI, to name a couple. I experimented with some different vector stores and embeddings. Having the same interface for everything made it easy to swap around and still know how to use the new tools. The prompts do feel like a mess, though. That said, I was a little frustrated when I couldn't easily use some advanced spaCy features with their embedding API. If I were writing an app and already knew the entire set of tools I would be using, then I could understand LangChain being overkill.
@googleyoutubechannel8554 4 months ago
"So the problem is probably not in the context..." Probably? Probably? The context at this stage is just a text string that _your code created_ (well shitty langchain), and after the first prompt/response, it's still just a string that you can read! It's madness that people are still using langchain, a system that hides the most important and relevant information the dev needs, and it' just a simple text string?! Any system for developing useful tools against LLMs should do the exact opposite of how langchain works at development time. The full prompt history, in a request / response format, should be the #1 feature of any LLM dev toolchain. And it's easy to do this, it's a bunch of human readable STRINGs. Y'all still don't understand why it's a problem the way you're developing (slowly) with langchain. Unbelievable.
@yanakristian5903 8 months ago
Hi, this prompt somehow does not work for the model Photolens/llama-2-7b-langchain-chat. If I put the SYS message outside the [INST], I get better results. But still, sometimes instead of getting just one answer, I get several separated by [/INST], or sometimes I get results with some JSON formatting inside. I have noticed this happens when the question is not in the context and the model has difficulty answering it.
@Florian_2066 8 months ago
Hi, thanks for your feedback. I have the same issue with LLaMA-2 7B. I wondered which system prompt you use?
@JamieGumb 2 months ago
The line together.Models.start("togethercomputer/llama-2-70b-chat") gives an error:
AttributeError: type object 'Models' has no attribute 'start'
This is despite all the previous code having run and executed fine.