This is by far the best and most concise RAG tutorial available online.
@starX7995 18 days ago
Bro, this is the only tutorial that has helped me so far, out of all the other YouTube videos I have watched on RAG-based applications. Love you from India, bro.
@yankeem 18 days ago
Thank you for your kind words! Glad it helped! 🚀
@modest_supreme 14 days ago
Great tutorial, thank you for going in depth and showing me these tools!
@programmingholic 1 month ago
Thanks bro for sharing, it's helpful.
@fancypetsulove 3 days ago
Please talk more slowly to make it easier to follow. Thanks for this tutorial.
@JellosKanellos 20 days ago
Thanks for the awesome video! I am setting up an LLM/RAG project where I want the LLM to analyze log files. I noticed that the upsert actions into chromadb take a long time, even with relatively small log files. Inserting around 8000 chunks can easily take up to 10 minutes. I believe the lack of concurrency in the underlying SQLite architecture is the problem. Are there any alternatives to chromadb that I can use to solve this problem? I still want to run everything locally. Thanks in advance.
@yankeem 20 days ago
Hey Jellos, this might be worth a read: docs.trychroma.com/deployment/performance#insert-throughput Apart from this, try using a vector database with an async client, like Qdrant. Also note that the upsert might be taking longer because of the embedding model. If you switch to a different model, that might help as well. Let me know what works for you!
@JellosKanellos 16 days ago
@yankeem Thanks for the suggestions. I did test Qdrant today. It is about a factor of 2 faster. I am trying to change the configuration to make it even faster.
@yankeem 16 days ago
Awesome!
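Editor's sketch for the slow-upsert question in this thread: before switching vector stores, it may help to measure whether the time goes into embedding or into the store write, as the reply suggests. The following is a minimal sketch, not code from the video. `time_ingest`, `fake_embed`, `fake_store`, and the batch size of 500 are all hypothetical names and values chosen for illustration; replace the two stand-in functions with your own embedding call and upsert call.

```python
import time
from typing import Callable, Sequence


def time_ingest(
    chunks: Sequence[str],
    embed: Callable[[Sequence[str]], list],
    store: Callable[[Sequence[str], list], None],
    batch_size: int = 500,  # assumption: tune for your own setup
) -> dict:
    """Embed and store chunks in batches, timing each phase separately."""
    totals = {"embed_s": 0.0, "store_s": 0.0, "batches": 0}
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]

        t0 = time.perf_counter()
        vectors = embed(batch)   # hypothetical: your embedding call
        t1 = time.perf_counter()
        store(batch, vectors)    # hypothetical: your upsert call
        t2 = time.perf_counter()

        totals["embed_s"] += t1 - t0
        totals["store_s"] += t2 - t1
        totals["batches"] += 1
    return totals


# Stand-ins so the sketch runs on its own; they do no real work.
def fake_embed(batch):
    return [[float(len(text))] for text in batch]


stored = []


def fake_store(batch, vectors):
    stored.extend(zip(batch, vectors))


report = time_ingest([f"log line {i}" for i in range(8000)], fake_embed, fake_store)
print(report)
```

If `embed_s` dominates, a different embedding model is the more likely fix; if `store_s` dominates, the store or its configuration is the better place to look.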
@EverydayKarma 1 month ago
Let's goo 🥳
@akki_the_tecki 24 days ago
Is this the BASIC RAG or the ADVANCED one? I have to develop an ENTERPRISE RAG for at least 1000 PDF documents. Can I get help from you, my brother?
@themax2go 23 days ago
What's the issue here? If you can "rag" 1 PDF you can "rag" 1k PDFs, though you might need to use a different vector store. Search for vector DB comparisons; there are a few videos. Also, what does "advanced one" mean? Do you mean Microsoft's GraphRAG? Depending on the "theme" of the docs you might not need GraphRAG. It's not that straightforward (yet).
@akki_the_tecki 23 days ago
@themax2go Suggest some videos for me, brother. My RAG pipeline is very basic, without reranking and all. I want a robust one.
@yankeem 22 days ago
Please watch the full video. If you want to perform a semantic search on the vector DB and let the LLM generate the answer, the video up to the chapter at 15:58 covers it. If you need to rank the retrieved documents by relevance, the section from 20:30 is the relevant one.
@akki_the_tecki 22 days ago
@yankeem I need it for multiple files (PDFs), and it should have different UIs for ingesting (ingest.py) and generation (app.py).
@themax2go 19 days ago
You're too vague. What is "advanced RAG"? Explain.
@girishcoorgi4224 20 days ago
I tried to replicate the code by setting up the environment on a Mac. When I click the Process button, I get this error: "ConnectError: [Errno 61] Connection refused in upsert." On a Windows machine, I get this error instead: "error PermissionError: [WinError 32] The process cannot access the file because it is being used by another process:" Please help me fix this issue.
@yankeem 20 days ago
Can you try the dependencies from the demo repo? github.com/yankeexe/llm-rag-with-reranker-demo
@Ramakishh 1 month ago
If we have images in the PDF, will it process them without error?
@yankeem 1 month ago
No, it will just extract the text by default. We have to extract and process images separately in the case of a multimodal RAG application.
@slayofficial1136 6 days ago
Bro, it's very easy with PDF data. Try doing it with CSV files and Excel data.
@yankeem 6 days ago
All the steps are the same with CSV and Excel as well. If you want to use CSV, use this loader instead of the PDFLoader shown in the video: python.langchain.com/docs/how_to/document_loader_csv/ The same goes for Excel; use a supported loader like this one: python.langchain.com/docs/how_to/document_loader_office_file/ Or convert it to Markdown or any other format and use a relevant loader. You can learn more here: python.langchain.com/docs/how_to/#document-loaders And here: python.langchain.com/docs/how_to/#text-splitters
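Editor's sketch for the CSV question above: to illustrate why "all the steps are the same", here is a standard-library-only example of turning each CSV row into one text document, which is roughly the shape a row-per-document loader hands to the splitter and embedder. This is not the LangChain loader itself; `csv_rows_to_documents`, the `source` default, and the sample log rows are hypothetical and exist only for illustration.

```python
import csv
import io


def csv_rows_to_documents(csv_file, source="uploaded.csv"):
    """Turn each CSV row into one text document plus metadata."""
    documents = []
    for row_number, row in enumerate(csv.DictReader(csv_file)):
        # One "column: value" line per cell keeps the header context
        # next to every value, which helps retrieval later.
        text = "\n".join(f"{column}: {value}" for column, value in row.items())
        documents.append(
            {"text": text, "metadata": {"source": source, "row": row_number}}
        )
    return documents


# Illustrative sample data, not taken from the video.
sample = io.StringIO(
    "timestamp,level,message\n"
    "2024-11-26,ERROR,Connection refused\n"
    "2024-11-26,INFO,Retrying\n"
)
docs = csv_rows_to_documents(sample)
print(docs[0]["text"])
```

From here the documents go through the same splitting, embedding, and upsert steps as the PDF text.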
@mohammad-xy9ow 1 month ago
Brother, can I get a job after learning this?
@yankeem 22 days ago
You can try learning it and putting it on your CV. Best of luck.
@docuong9569 1 month ago
error PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\cuongdd\\AppData\\Local\\Temp\\tmprjpx428b.pdf'
@akki_the_tecki 22 days ago
@docuong9569 Yes, same error. Have you managed to get past it?
@yankeem 20 days ago
Is this resolved?
@akki_the_tecki 20 days ago
@yankeem We should run it in an Ubuntu terminal! Then there are no errors.
@docuong9569 20 days ago
@yankeem No, please show me how to fix the error. I run it on Windows 10.
@yankeem 20 days ago
@docuong9569 I don't have a Windows machine to test this. Can you DM me the stack trace on LinkedIn?
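Editor's sketch for the WinError 32 reports in this thread: the thread does not confirm the cause, but on Windows this error commonly appears when a temporary file is deleted while a handle to it is still open, which Linux tolerates (consistent with the Ubuntu workaround mentioned above). The sketch below is an assumption-based pattern, not the demo repo's code. `with_temp_pdf` and `read_size` are hypothetical names; `read_size` stands in for whatever PDF loader you use.

```python
import os
import tempfile


def with_temp_pdf(data: bytes, process):
    """Write bytes to a temp .pdf, close it, run process(path), then delete it."""
    tmp = tempfile.NamedTemporaryFile("wb", suffix=".pdf", delete=False)
    try:
        tmp.write(data)
    finally:
        tmp.close()  # release our handle before anything else touches the file
    try:
        return process(tmp.name)
    finally:
        os.unlink(tmp.name)  # our own handle is closed, so deletion can proceed


def read_size(path):
    # Stand-in for a PDF loader; it opens and closes the file itself.
    with open(path, "rb") as handle:
        return len(handle.read())


size = with_temp_pdf(b"%PDF-1.4 example bytes", read_size)
print(size)
```

If the loader itself keeps the file open after returning, the delete can still fail on Windows, so the loader's handle would need to be closed first as well.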
@MicheleSblendorio-n6v 29 days ago
Even though I cloned the repo, it gives me an error regarding chromadb when I process a document. How can I solve it? Many thanks!
if isinstance(target[0], (int, float)) and not isinstance(target[0], bool):
~~~~~~^^^
IndexError: list index out of range in upsert.
2024-11-26 10:13:23.939 Examining the path of torch.classes raised: Tried to instantiate class '__path__._path', but it does not exist! Ensure that it is registered via torch::class_
@yankeem 28 days ago
Resolving in DM.
@akki_the_tecki 22 days ago
@yankeem Where is your DM? How can I contact you?
@akki_the_tecki 22 days ago
@MicheleSblendorio-n6v I am also getting the same error.
@MicheleSblendorio-n6v 22 days ago
@akki_the_tecki I haven't solved it yet. I'll have to debug the code.
@akki_the_tecki 22 days ago
@MicheleSblendorio-n6v Please tell me how if you did it!