Let's not split chunk-wise; it loses structure. Instead, structure each document into a fixed format: title, content, and tables (plus images, if present). Generate embeddings for all columns. When the user asks a query, decide from the scores whether it is table-based or content-based: compute BM25 and semantic-search scores, then combine them with reciprocal rank fusion. If the table column scores highest, the query is table-based. When passing data to the LLM, we pass the whole record: if the query matched the table column, take the top-scoring table and pass that table along with its corresponding columns, i.e. the title and content. This works really well and preserves the structure too. We can ask queries like "give me the content inside the title named 'example title'": it will match on title, so we pass the title plus its corresponding columns.
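A minimal, dependency-free sketch of the routing idea described above: fuse a BM25 ranking and a semantic ranking per column with reciprocal rank fusion (RRF), then route the query to the column whose best fused score is highest. The per-column rankings and doc ids below are hypothetical placeholders, not from the video.

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists of doc ids into one score per doc.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is the commonly used default).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each list contributes 1 / (k + rank); best rank contributes most.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return scores

# Hypothetical per-column rankings for one query (best-first doc ids).
bm25_by_column = {
    "title":   ["d2", "d1", "d3"],
    "content": ["d1", "d3", "d2"],
    "table":   ["d3", "d2", "d1"],
}
semantic_by_column = {
    "title":   ["d2", "d3", "d1"],
    "content": ["d2", "d1", "d3"],
    "table":   ["d1", "d3", "d2"],
}

# Fuse both signals per column; a column's routing score is its best fused doc.
fused = {
    col: rrf([bm25_by_column[col], semantic_by_column[col]])
    for col in bm25_by_column
}
routing = {col: max(s.values()) for col, s in fused.items()}
best_column = max(routing, key=routing.get)
```

With these placeholder rankings, "d2" tops both title lists, so the query routes to the title column; the pipeline would then pass that row's title, content, and table together to the LLM, as the comment suggests.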
@KumR · 3 months ago
Hey Sid, great one. Can you expand this to read not only PDF but also CSV, DOCX, and TXT?
@WiseCoder-rp2zn · 12 days ago
So if we change the documents, we have to run the vectorizer file again, right?
@ruznyma · 3 months ago
Great content ❤️ Could you please make videos on LangGraph and AI agents? Thanks for your valuable tutorials 👍
@Siddhardhan · 3 months ago
Sure
@SUDA-u2r · 3 months ago
I have implemented a conversational RAG chain following the LangChain documentation. I don't understand how to implement reciprocal rank fusion or reranking when using the history-aware retriever.
@KumR · 3 months ago
Thanks Sid. What if we want to upload docs and not just point to the data folder? And what if the data lives across multiple files: PDF, CSV, DOCX, TXT, etc.?
@HarshitaAmbre · 11 days ago
How do we proceed with multiple CSV files?
@gauravwankhede9263 · 3 months ago
Can the chunk boundary be set by splitting on " ", so that chunks are split paragraph-wise? Then the structure won't get messed up.
@yazanrisheh5127 · 3 months ago
Yes, using RecursiveCharacterTextSplitter.
@gauravwankhede9263 · 3 months ago
@@yazanrisheh5127 Thanks for the comment. I did this and it works successfully.
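The paragraph-wise chunking discussed in this thread can be sketched without the LangChain dependency. LangChain's RecursiveCharacterTextSplitter applies the same idea with the separator list `["\n\n", "\n", " ", ""]`; the stand-alone version below splits on blank lines and packs whole paragraphs into chunks. Note that a single paragraph longer than `chunk_size` is kept whole here, whereas the real splitter would recurse into smaller separators.

```python
def split_paragraph_wise(text, chunk_size=200):
    """Pack whole paragraphs (blank-line separated) into chunks of
    at most chunk_size characters, never cutting inside a paragraph."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = current + "\n\n" + para if current else para
        if len(candidate) <= chunk_size:
            current = candidate          # paragraph still fits: merge it
        else:
            if current:
                chunks.append(current)   # flush the filled chunk
            current = para               # start a new chunk with this paragraph
    if current:
        chunks.append(current)
    return chunks
```

Because chunk boundaries always coincide with paragraph boundaries, no sentence is ever split across two chunks, which is exactly the structural property the comment is after.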
@gamekhela · 2 months ago
How do we implement a cache mechanism here, using GPTCache, or LangChain's in-memory or SQLite cache?
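A dependency-free sketch of the caching idea asked about above: before invoking the LLM, look the normalized query up in a local store and reuse the stored answer on a hit. LangChain's InMemoryCache/SQLiteCache and GPTCache wrap this same pattern around the LLM call; `answer_query` below is a hypothetical stand-in for the real RAG chain, not code from the video.

```python
import hashlib

class QueryCache:
    def __init__(self):
        self._store = {}
        self.hits = 0

    def _key(self, query):
        # Normalize so trivially different phrasings share one cache key.
        return hashlib.sha256(query.strip().lower().encode()).hexdigest()

    def get_or_compute(self, query, compute):
        key = self._key(query)
        if key in self._store:
            self.hits += 1               # cache hit: skip the expensive call
        else:
            self._store[key] = compute(query)
        return self._store[key]

cache = QueryCache()
answer_query = lambda q: f"answer for: {q}"   # stand-in for the RAG chain
first = cache.get_or_compute("What is RRF?", answer_query)
second = cache.get_or_compute("  what is rrf? ", answer_query)  # hit: same key
```

An exact-match key like this only catches repeats and trivial rephrasings; GPTCache goes further by keying on embeddings, so semantically similar queries can also hit the cache.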
@KumR · 3 months ago
Can we use an agent based on the type of document?
@lnstagrarm · 3 months ago
Also, I don't suggest using LangChain. We can't configure it to our needs, and it needs so much machinery in the backend to run that it makes responses slower.
@KumR · 3 months ago
Will LlamaIndex help?
@lnstagrarm · 3 months ago
@@KumR It helps, but you will spend a lot of time if your data is in other formats. Suppose you parsed the data the way you want and want to feed it to LlamaIndex: you need to put it into its expected format first. There are many things like that; some LlamaIndex functions need Document objects directly and will parse them their own way. So I think you should write the code manually, though for some tasks you can use it. I don't suggest LangChain: you will struggle to write the code, as its code modularity is not good.
@KumR · 3 months ago
@@lnstagrarm Thanks for the perspective. It helps newbies in this area.
@junaidamin · 2 months ago
What do you suggest then?
@lnstagrarm · 2 months ago
@@junaidamin I suggest LlamaIndex. It's easy to connect with anything. LangChain always assumes OpenAI, so when it comes to using other models it's really hard to figure out how to do it. LlamaIndex has good documentation and can be configured easily to our needs.
@kavururajesh1760 · 2 months ago
Great content, but this doesn't perform well when we are comparing two documents and asking questions across them.
@PraveenYadavgaming6397 · 3 months ago
magic.from_file(file_path, mime=True) raises "AttributeError: module 'magic' has no attribute 'from_file'", even after pip install python-magic-bin.