Generative AI and Long-Term Memory for LLMs (OpenAI, Cohere, OS, Pinecone)

Views: 17,175

James Briggs

Days ago

Comments: 63

@billykotsos4642 a year ago

I was blown away months back when you guys showed a live demo of this. It's been on my mind ever since...

@jamesbriggs a year ago

yeah me too haha - I'm 100% in on gen AI now

@lutune a year ago

Awesome content! Thank you for letting us know that there are always different options to "shop" around for! Make sure to keep using it for your content creation too!

@kevon217 a year ago

Your videos are pure gold. You’ve helped me learn so much. Really appreciate it!

@jamesbriggs a year ago

Happy it helps!

@ylazerson a year ago

The best NLP AI videos on YouTube - hands down!

@jamesbriggs a year ago

Thanks bro!

@nelsonatp8957 a year ago

I’ve never commented on YouTube videos, but this is just pure gold! Keep up the great work, and thank you very much.

@jamesbriggs a year ago

That's awesome, I appreciate it a lot!

@billykotsos4642 a year ago

Retrieval augmented generative QA is where it's at!

@jamesbriggs a year ago

I know man it’s wild

@dylanramirez3059 a year ago

Great videos James. However, I think you should clarify to people that fine-tuning is about teaching a task (or how to answer given X prompt) rather than teaching new knowledge. During the fine-tuning process most of the model's layers are frozen. Once a model has been fine-tuned, you won't need to provide examples in the prompt anymore (no more one/few-shot prompts). This saves costs and enables lower-latency requests.
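
To make the prompt-size point above concrete, here is a minimal sketch comparing a few-shot prompt with the shorter prompt a fine-tuned model could accept. The sentiment task, the example reviews, and both function names are hypothetical and chosen only for illustration; no model is called.

```python
# Hypothetical demonstrations; the task and wording are illustrative only.
FEW_SHOT_EXAMPLES = [
    ("The food was cold and bland.", "negative"),
    ("Lovely staff and a great view.", "positive"),
]


def few_shot_prompt(text: str) -> str:
    """Prompt for a base model: the task is demonstrated inline on every request."""
    demos = "\n\n".join(
        f"Review: {review}\nSentiment: {label}"
        for review, label in FEW_SHOT_EXAMPLES
    )
    return f"{demos}\n\nReview: {text}\nSentiment:"


def fine_tuned_prompt(text: str) -> str:
    """Prompt for a model already fine-tuned on the task: no demonstrations."""
    return f"Review: {text}\nSentiment:"


query = "Would not come back."
long_prompt = few_shot_prompt(query)
short_prompt = fine_tuned_prompt(query)

# The few-shot prompt carries the demonstrations on every call,
# so it is always the longer (and costlier) of the two.
print(len(long_prompt), len(short_prompt))
```

The two prompts end identically; the only difference is the block of demonstrations that the few-shot version has to resend each time.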

@jamesbriggs a year ago

Great point, I'll be diving into this topic a lot more soon so will integrate a better fine-tuning vs. pretraining explanation

@Tap2EarnTips a year ago

Can you update the knowledge base automatically based on repeated outputs, so the knowledge base keeps improving itself based on its individual components?

@jeanchindeko5477 a year ago

Thanks I’ve learned a lot in this short time frame on the subject. Where can we find the code samples you showcase at the end of the video?

@jamesbriggs a year ago

See here: github.com/pinecone-io/examples/blob/master/generation/generative-qa/openai/gen-qa-openai/gen-qa-openai.ipynb (it covers the dataset build, indexing, and queries)

@haneer3205 a year ago

Great video! I like how you remove the background and make it into white text on those articles. Can I ask how you do that?

@jamesbriggs a year ago

Change image composition/blend from normal to add and deepen blacks in the image, then add a background

@haneer3205 a year ago

@jamesbriggs Thank you! Last question: what software are you using to do this, and to edit your videos?

@JulianHarris a year ago

Amazing and helpful, thank you! Here was my understanding: 1. Fine-tuning is potentially costly. 2. A domain context, if queried in isolation, isn't relevant enough. 3. So employ zero-shot learning by augmenting the prompt with some domain context taken from a domain knowledge base query result for the exact same query. What I don't understand is why (2) isn't enough -- I can see from kzbin.info/www/bejne/qKOkdJuml9ysnac that something is retrieved, and I assume that is the output of the embeddings query, but having gone through some experiments with the pattern of "use sentence embeddings for semantic search", I came away (possibly incorrectly) thinking that that in itself would indeed be good enough. Reading the video blurb, I guess that's what you were saying too… maybe in a future version that point could be made a bit clearer, as a possible enhancement…
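
The three steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the video: `embed` is a toy bag-of-words stand-in for a real sentence-embedding model, the knowledge base is three made-up sentences, and the prompt wording is hypothetical. The sketch stops at the assembled prompt; retrieval only returns stored passages, and it is the later generation step that would turn them into a direct answer.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(knowledge_base, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Augment the query with retrieved passages (hypothetical prompt wording)."""
    context_block = "\n---\n".join(contexts)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Made-up knowledge base for illustration.
knowledge_base = [
    "Sentence embeddings map text to vectors for semantic search.",
    "Fine-tuning updates model weights on task-specific examples.",
    "A vector database stores embeddings and returns nearest neighbours.",
]

query = "How does semantic search use embeddings?"
contexts = retrieve(query, knowledge_base, top_k=1)
prompt = build_prompt(query, contexts)
print(prompt)
```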

@ryanperera4800 a year ago

Great videos man. Keep going! I believe you will get many more subs! Also, how do you do the query UI? Is that something you built?

@jamesbriggs a year ago

hey bro thanks - can you point me to where in the video this is?

@j_hull a year ago

Fantastic as usual. Question: would you embed individual sentences, as it looks like in this dataset? I’ve read elsewhere to embed paragraphs or sections. Just trying to figure out the best approach for this.

@jamesbriggs a year ago

Usually at least paragraph-sized chunks. This dataset does produce sentence-sized snippets, but I actually merged them (30 at a time) before encoding. If using open source models you’re usually limited to paragraph-sized chunks; in this case I used OpenAI’s text-embedding-ada-002, so I could go bigger (hence the 30-sentence chunks).
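
A minimal sketch of the merging step described above, assuming plain non-overlapping groups of 30 sentences. Whether the original notebook overlaps neighbouring chunks is not stated in this reply, and `merge_sentences` is a hypothetical helper name.

```python
def merge_sentences(sentences: list[str], chunk_size: int = 30) -> list[str]:
    """Join consecutive sentences into chunks of up to chunk_size sentences."""
    return [
        " ".join(sentences[i:i + chunk_size])
        for i in range(0, len(sentences), chunk_size)
    ]


# Placeholder sentences standing in for the dataset's sentence-sized snippets.
sentences = [f"Sentence {n}." for n in range(1, 66)]
chunks = merge_sentences(sentences)

# 65 sentences become two full chunks of 30 plus a final chunk of 5.
print(len(chunks))
```

Each merged chunk is then what gets passed to the embedding model, instead of the individual sentences.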

@j_hull a year ago

@jamesbriggs Thanks! Yeah, I’m using OpenAI as well, just wanted to know if there was a significant difference in results between paragraph and sentence chunks. I was thinking of going with the best size given the overall token length if the difference was minimal.

@elhoim a year ago

Thanks for the explanation. You make it so much clearer!

@jamesbriggs a year ago

Glad it was helpful!

@chenlin7535 a year ago

Can I use FAISS instead of Pinecone?

@hamvjones a year ago

For the knowledge base, how large does it need to be to be useful? I understand that larger is better, but if you have access to a small amount of data on a very specific subject (i.e. several books' worth), would that in fact help? (Sorry if that's overly simple.)

@jamesbriggs a year ago

If all of the information is more than what you could fit into a single LLM prompt, then it can be useful - but this is really the extreme, I'd be surprised to see a use case consisting of just a few pages of relevant text. In your example of several books' worth, yes, this approach is definitely helpful - and yes, larger is better as long as "larger" doesn't equate to "more irrelevant information".

@hamvjones a year ago

@jamesbriggs thanks for the reply, super helpful

@Steve-js7bp a year ago

great job. these videos are fascinating!

@jamesbriggs a year ago

thanks!

@avi7278 a year ago

OK, please you have to tell me how to get that semi-transparent browser window.

@jamesbriggs a year ago

I wish that was the browser window itself haha - it's actually edited post-recording

@avi7278 a year ago

@jamesbriggs ah dang it, way cool effect.

@Davipar a year ago

How "big" can the new source of information to add context be? What are the limits here?

@jamesbriggs a year ago

It depends. The total 'max context window' for text-davinci-003 is 4097 tokens; that's our token budget for both prompt and generation, so you need to fit your instructions, context, and query, and then leave enough space for the generated output. Typically you have room for 20 or so paragraphs. Other LLMs will vary from this. I talk about the limits in this video kzbin.info/www/bejne/eIGcl5yVZb2Hotk at the 22:55 mark.
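
The budget arithmetic above can be sketched as follows. The 4097 figure comes from the reply; everything else is an assumption for illustration: `count_tokens` is a crude one-token-per-word stand-in for a real tokenizer, the 256 tokens reserved for output is an arbitrary choice, and the contexts are synthetic 150-word paragraphs.

```python
MAX_CONTEXT_TOKENS = 4097  # text-davinci-003 limit quoted in the reply


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def select_contexts(
    instructions: str,
    query: str,
    contexts: list[str],
    max_output_tokens: int = 256,  # assumed reservation for the generated answer
) -> list[str]:
    """Keep adding contexts, in ranked order, until the token budget is used up."""
    budget = (
        MAX_CONTEXT_TOKENS
        - max_output_tokens
        - count_tokens(instructions)
        - count_tokens(query)
    )
    selected = []
    for ctx in contexts:
        cost = count_tokens(ctx)
        if cost > budget:
            break
        selected.append(ctx)
        budget -= cost
    return selected


instructions = "Answer the question using the context below."
query = "What is the context window limit?"
contexts = ["word " * 150] * 40  # forty synthetic 150-word paragraphs

selected = select_contexts(instructions, query, contexts)
used = (
    count_tokens(instructions)
    + count_tokens(query)
    + sum(count_tokens(c) for c in selected)
    + 256
)
print(len(selected), used)
```

With a real tokenizer the counts would differ, since tokens are usually shorter than words, but the structure of the calculation stays the same: everything in the prompt plus the reserved output has to fit under the model's limit.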

@Davipar a year ago

@jamesbriggs Thank you James! Appreciate it!

@outdoorvideo8305 a year ago

Great video. Can you do a video on how you could use tabular data in GPT for either fine tuning or prompt engineering?

@jamesbriggs a year ago

not sure what you mean exactly, like taking tabular data and feeding into gpt to train it for a specific task? Maybe you can give an example?

@jimmynguyen3386 a year ago

I’m in the legal/cyber security space and the biggest concern with this type of new technology is data privacy. How is the data being kept, where, etc. Until this can be properly addressed, adoption in the legal industry will be limited.

@jamesbriggs a year ago

yep understandable, I used to work in finance so 100% get this - in reality it is pretty safe though, the "long-term memory" is a vector DB, all you need to store are the vector embeddings, which can't be decoded into the original information, so they're safe to store. An option with Pinecone (the vector DB) is to store related information in the vector's "metadata" field, in this you can store your sensitive information if you like, and it's still very safe as Pinecone is SOC 2 compliant (read more here www.pinecone.io/security/). But as mentioned if you don't trust it, you can still store this metadata wherever you usually store it and just store the vector embeddings in your vector DB. Hope that helps!
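
The last option in the reply above (vectors in the vector DB, sensitive text kept wherever you usually store it) can be sketched like this. It is a toy illustration, not the Pinecone API: both stores are plain in-memory dicts, the three-dimensional vectors are hand-written rather than produced by an embedding model, and all record IDs and texts are made up.

```python
import math

vector_store: dict[str, list[float]] = {}  # stand-in for the hosted vector DB: id -> embedding only
private_store: dict[str, str] = {}         # stand-in for in-house storage: id -> sensitive text


def add_record(record_id: str, embedding: list[float], text: str) -> None:
    """Send only the embedding to the vector store; keep the text in-house."""
    vector_store[record_id] = embedding
    private_store[record_id] = text


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_embedding: list[float], top_k: int = 1) -> list[tuple[str, str]]:
    """Rank IDs by similarity in the vector store, then look the text up in-house."""
    ranked = sorted(
        vector_store,
        key=lambda rid: cosine(query_embedding, vector_store[rid]),
        reverse=True,
    )
    return [(rid, private_store[rid]) for rid in ranked[:top_k]]


# Hand-written vectors and made-up records for illustration.
add_record("doc-1", [0.9, 0.1, 0.0], "Contract A: confidential terms.")
add_record("doc-2", [0.0, 0.2, 0.9], "Incident report B: internal only.")

results = search([1.0, 0.0, 0.0])
print(results)
```

The vector store only ever sees IDs and numbers; the join back to the original text happens on the side that holds the private store.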

@jimmynguyen3386 a year ago

@jamesbriggs this is very helpful, thanks! Legal is still concerned with sending data to OpenAI due to their privacy policy, unfortunately. Basically, data can never leave due to its sensitivity.

@ayoolafakoya9841 a year ago

@jimmynguyen3386 You can just host your own language model and replace the OpenAI component.

@Wahedzurma a year ago

@ayoolafakoya9841 easy

@JulianHarris a year ago

@jimmynguyen3386 If OpenAI don’t add an Enterprise tier in future with control over data residency and usage, they’re missing a trick for sure.