Using Langchain and Open Source Vector DB Chroma for Semantic Search with OpenAI's LLM

Views: 28,170

Pradip Nichite

A day ago

Comments: 59

@amitagarwal5223 5 months ago

This video was very helpful. Good explanation. I appreciate your time and effort in putting it together.

@oxytic 1 year ago

Dear Pradip, I have been following your video tutorials for the past three weeks, and I have found them to be very informative and comprehensive. However, I am having some difficulty understanding how to create a custom agent in Langchain with class and class decorators and LLMchain. I would be grateful if you could create a video tutorial that specifically addresses this topic. I believe that this would be a valuable resource for many people who are interested in learning how to use Langchain to create custom agents.

@FutureSmartAI 1 year ago

Sure. I will explore it and cover it.

@chinmaydeshpande5046 9 months ago

Thanks for the great video. TBH, I was struggling with this for 2 days, and your video helped me.

@FutureSmartAI 8 months ago

Glad it helped!

@digvijayyadav4168 1 year ago

Great work Pradip

@allwiyn 1 year ago

Thanks for the video. This was very useful. The information was pretty clear for beginners too.

@FutureSmartAI 1 year ago

Glad to hear it!

@nukulkhadse5253 1 year ago

Hi @Pradip, what if I need a summarised answer for my query? For example: what is the average health score of all my pumps? The health score is just a number in each of my documents. Will my model get all that data, or will it exceed the context limit while fetching the data?
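A similarity search returns only the top-k chunks closest to the query, so an aggregate over every document is usually easier to compute directly than through retrieval. The sketch below assumes, hypothetically, that each document contains a line such as `Health score: 80`; the `documents` dict, the pattern, and the function name are illustrative and do not come from the video.

```python
import re
from statistics import mean

# Hypothetical sample data: one text document per pump.
documents = {
    "pump_1.txt": "Pump 1 inspection report. Health score: 80. No leaks found.",
    "pump_2.txt": "Pump 2 inspection report. Health score: 90. Bearings replaced.",
    "pump_3.txt": "Pump 3 inspection report. Health score: 70. Vibration detected.",
}

# Assumed format of the score line; adjust to match the real documents.
SCORE_PATTERN = re.compile(r"Health score:\s*(\d+(?:\.\d+)?)")


def average_health_score(docs: dict) -> float:
    """Extract the score from every document and average it, without retrieval."""
    scores = []
    for text in docs.values():
        match = SCORE_PATTERN.search(text)
        if match:
            scores.append(float(match.group(1)))
    if not scores:
        raise ValueError("no health scores found")
    return mean(scores)


average = average_health_score(documents)
print(average)
```

Only the single computed number then needs to be passed to the LLM, which keeps the prompt well within the context limit.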

@yazanrisheh5127 1 year ago

Can you make a video where we persist the Chroma DB in the cloud and show us how to add new files, delete files, embed these files, and ask questions with our new DB? Would really, really appreciate it! Thank you in advance.

@FutureSmartAI 1 year ago

I will try

@rizwanat7496 9 months ago

How long will it take to embed and store a large dataset of around 50 MB in Chroma?

@nikitamobile 9 months ago

Thanks for your video. I have tried to implement the same on my side; however, the text pieces returned by similarity_search for the query look irrelevant. I'm using the all-MiniLM-L6-v2 model for the embeddings and the following text_splitter settings: chunk_size=1024, chunk_overlap=20. What could be the reason for the poor similarity search results? P.S. The input files are not in English. They are in Uzbek. Could that also be the reason?
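To see what the chunk_size and chunk_overlap settings mentioned above control, here is a simplified fixed-window splitter. It is a stand-in for illustration only, not LangChain's text splitter, which also tries to break on separators such as paragraphs; the function name `split_text` and the sample string are hypothetical.

```python
def split_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 20) -> list:
    """Split text into character windows; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = split_text(sample, chunk_size=12, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

With chunk_size=1024 and chunk_overlap=20, consecutive chunks share only 20 characters, so each chunk is long and carries little context from its neighbour. Trying smaller chunks or a larger overlap is one inexpensive experiment when results look poor.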

@aidev8926 1 year ago

Brilliant work !!!!

@FutureSmartAI 1 year ago

Thank you! Cheers!

@testahom4690 1 year ago

Hey Pradip, good video. Do you have any plans for similar videos on LLAMA2 and vector dbs?

@FutureSmartAI 1 year ago

Yes, I will.

@ashxos 11 months ago

Thanks for Sharing!

@FutureSmartAI 11 months ago

My pleasure!

@abhishekvij2409 8 months ago

Is it possible to dockerize this code with FastAPI?

@jpdoan4531 1 year ago

Thank you Pradip for this great tutorial! 👏 It worked for my dataset. 🎉 Do you have alternative code to create chroma_db by splitting either a Pandas DataFrame column or a PySpark SQL DataFrame column of table content, instead of docs = split_docs(documents) and db = Chroma.from_documents(docs, embeddings)? My dataset is too large, and creating thousands of .txt files is not sustainable.

@FutureSmartAI 1 year ago

Why don't you use LangChain SQL agents?

@rahulcn1314 5 months ago

Is there any difference in the embeddings if we use OpenAI embeddings rather than LangChain?

@FutureSmartAI 4 months ago

LangChain doesn't have its own embeddings. We can use OpenAI or other open-source embeddings such as Sentence Transformers.

@sachintiwari2794 10 months ago

Thanks for your helpful video! I am loading python code. Could you please suggest the best vector semantic similarity search here to get most relevant top k results?

@FutureSmartAI 10 months ago

You mean the best embedding model?

@sandedom339 1 year ago

Very nicely explained! Can we do this job without an OpenAI API key? Can we do it locally on our own computer?

@FutureSmartAI 1 year ago

For OpenAI you will need a key. If you only want to test semantic search and see the relevant docs, you don't need a key.

@sandedom339 1 year ago

@@FutureSmartAI Thanks Pradip- Just sent you an email, please see.

@riyadhmollik 1 year ago

@@FutureSmartAI Is this API key free or paid?

@faddyAB 1 year ago

@@riyadhmollik +1

@himu04 7 months ago

I am getting this error: cannot import name 'Document' from 'langchain'. I am also facing some other problems. Is it possible to connect with you?

@suniha2803 1 year ago

Hi Pradip, in the previous video you used Streamlit, LangChain, and Pinecone. I want to use ChromaDB instead of Pinecone, but I don't know what I should do. Can you give me some tips?

@FutureSmartAI 1 year ago

You should be able to do it yourself. Try combining the code from both of these videos. You can even ask ChatGPT to do that.

@nehat786 1 year ago

Hello sir, my name is Nehat. I am from India. I am a full stack developer and very much interested in AI. I have seen your videos and, trust me, I have learned a lot from them. You have an amazing way of teaching and making things easy to understand. I would be very grateful if you could teach me those skills and show me the new world of AI. Please do let me know if there are any paid courses I can take. Thank you!

@FutureSmartAI 1 year ago

Hi Nehat, thanks. I don't have any paid courses. I mostly focus on my freelancing work and share my learning here on YouTube. If you have any doubts, you can ask here.

@dattatreyagundumolu1593 8 months ago

Can I upload more than one PDF file when I integrate with Streamlit?

@yosshi2028 1 year ago

Could you share your 「Chroma DB with Langchain.ipynb 」file?

@FutureSmartAI 1 year ago

blog.futuresmart.ai/using-langchain-and-open-source-vector-db-chroma-for-semantic-search-with-openais-llm

@indranilcool 1 year ago

If different models were used for generating the embeddings, would the performance of semantic search be affected?

@FutureSmartAI 1 year ago

Yes. www.sbert.net/docs/pretrained_models.html

@MrunmayeeBangar 5 months ago

How can we do vector search for dynamic data? Can nodes update automatically?

@FutureSmartAI 4 months ago

As we get new data, we need to add more docs to the existing index or collection.
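The idea in this reply, adding new documents to an existing index rather than rebuilding it, can be illustrated with a toy in-memory index. Everything here is hypothetical: `ToyIndex`, the bag-of-words `embed` function, and the sample texts stand in for a real vector store and a real embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyIndex:
    """Minimal stand-in for a vector collection that supports incremental updates."""

    def __init__(self):
        self.docs = {}  # doc_id -> (text, vector)

    def add(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = (text, embed(text))  # embed only the new document

    def delete(self, doc_id: str) -> None:
        self.docs.pop(doc_id, None)

    def search(self, query: str, k: int = 2) -> list:
        q = embed(query)
        ranked = sorted(
            self.docs.items(),
            key=lambda item: cosine(q, item[1][1]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:k]]


index = ToyIndex()
index.add("a", "pump pressure is stable")
index.add("b", "invoice for office chairs")
before = index.search("pump maintenance schedule", k=1)

# New data arrives later: add it to the same index, no rebuild needed.
index.add("c", "pump maintenance schedule for next month")
after = index.search("pump maintenance schedule", k=1)
print(before, after)
```

The index does not update itself; something in the application has to call `add` (or `delete`) whenever the underlying data changes.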

@virtualrendezvous2220 1 year ago

Hi, I have a JSON file with content and metadata. How do I split the content into chunks in that case?

@koushiksherugar8680 1 year ago

I'm looking for the same. Have you got any leads?

@virtualrendezvous2220 1 year ago

@@koushiksherugar8680 Yes, you can check the JSON document loader integration in the LangChain documentation.
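As a hand-rolled alternative to the loader, here is a minimal sketch of chunking JSON records while carrying their metadata along. It assumes the file is a list of objects with `content` and `metadata` keys; the exact layout is not given in the thread, and `chunk_records`, the sample data, and the added `chunk` field are hypothetical.

```python
import json

# Hypothetical file contents: a list of records with "content" and "metadata".
raw_json = json.dumps([
    {"content": "A" * 100, "metadata": {"source": "manual.pdf", "page": 3}},
    {"content": "B" * 30, "metadata": {"source": "faq.html", "page": 1}},
])


def chunk_records(raw: str, chunk_size: int = 40) -> list:
    """Split each record's content into chunks and copy its metadata onto every chunk."""
    chunks = []
    for record in json.loads(raw):
        text = record["content"]
        for start in range(0, len(text), chunk_size):
            chunks.append({
                "page_content": text[start:start + chunk_size],
                # Copy the metadata so each chunk can be traced to its source.
                "metadata": {**record["metadata"], "chunk": start // chunk_size},
            })
    return chunks


chunks = chunk_records(raw_json)
for chunk in chunks:
    print(len(chunk["page_content"]), chunk["metadata"])
```

Each resulting dict has the same two fields as a LangChain document (text plus metadata), so the chunks can be converted before being stored in a vector database.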

@nattyzaddy6555 1 year ago

Does this video say how you can use a local model with the vector db?

@FutureSmartAI 1 year ago

Hi, the vector DB is local, not the LLM.

@kanikasharma4611 1 year ago

Can you guide me on how to connect an LLM, Pinecone, and OpenAI embeddings with a MongoDB database to create a chatbot?

@shaikshavalivali876 1 year ago

I have one doubt: does Azure Cognitive Services provide a vector DB service too? Please help me.

@FutureSmartAI 1 year ago

They have Azure Cognitive Search, but you can always use other vector databases. learn.microsoft.com/en-us/semantic-kernel/memories/vector-db