Python RAG Tutorial (with Local LLMs): AI For Your PDFs

244,162 views

pixegami

1 day ago

Comments: 483
@vdabhade · 3 months ago
It's hard to find such high-quality videos that are to the point and simplify all aspects. Great work!!!
@frederichominh3152 · 5 months ago
Best tutorial I've seen in a long time, maybe ever. Timing, sequence, content, logic, context... everything is right in your video. Thank YOU and congrats, you are smart as hell.
@heesongkoh · 4 months ago
agreed.
@pixegami · 4 months ago
Wow, thanks for your comment. I really appreciate it, and I'm glad you liked the video.
@fabsync · 4 months ago
Oh man.. by far the best tutorial on the subject.. finally someone using PDFs and explaining the entire process! You should do a more in-depth series on this...
@pixegami · 4 months ago
Thank you for the feedback :) Looks like with the interest this topic has received, I'm definitely keen to dive into it a bit deeper.
@fabsync · 4 months ago
One question I keep asking myself with PDFs: do you clean the PDF before doing the embeddings, or is this something you can resolve by customizing the prompt? And what would be a good way to do semantic search after using pgvector? I am still struggling with those answers.
@pixegami · 4 months ago
@@fabsync Yeah, I've had a lot of people ask about cleaning the PDFs too. If you have PDFs with certain structural challenges, I'd recommend finding a way to clean/augment them for your workflow. An LLM prompt can only go so far, and cleaning noise from the data will always help.
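A minimal sketch of the kind of pre-embedding cleanup the reply describes. The function name and the specific rules are illustrative assumptions, not from the video; real PDFs usually need rules tuned to their own layout:

```python
import re

def clean_pdf_text(text: str) -> str:
    """Normalize raw PDF-extracted text before chunking/embedding (illustrative rules)."""
    # Re-join words hyphenated across line breaks: "embed-\nding" -> "embedding".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse single newlines inside paragraphs, but keep blank lines (paragraph breaks).
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Squash runs of spaces/tabs.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()

print(clean_pdf_text("embed-\nding models\nwork well.\n\nNext paragraph."))
# -> "embedding models work well.\n\nNext paragraph."
```

Running a pass like this before chunking tends to help because embedding models otherwise treat line-break artifacts as real tokens.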
@jonuldrick · 1 month ago
I just wanted to say that this video inspired me to set up my own RAG. I got some help from a friend with some parts, but I've been working on adding more functionality. My current iteration has a menu with database management and chatbot options. Database management lets me create, update, and delete databases. The chatbot option lets me choose which databases to use before loading the LLM. I've also added graceful interrupt handling. Thanks for the tutorial that gave me a jumping-off point.
@jonuldrick · 1 month ago
And everything is being run locally using HuggingFace and Ollama.
@mohamedjasim8247 · 18 days ago
@@jonuldrick Yes, correct. I am also trying the same thing.
@tinghaowang-ei7kv · 5 months ago
It's hard to find such high-quality videos on China's Bilibili, but you've done it, thank you so much for your selflessness. Great talk, looking forward to the next video. Thanks again, you did a great job!
@pixegami · 5 months ago
Thank you! Glad you enjoyed it!
@musiitwaedmond1426 · 5 months ago
This is the best RAG tutorial I have come across on YouTube, thank you so much man💪
@pixegami · 4 months ago
Thank you! I appreciate it!
@NW8187 · 4 months ago
Simplifying a complex topic for a diverse set of users requires an amazing level of clarity of thought, knowledge, and communication skills, all of which you have demonstrated in this video. Congratulations! Here are some items on my wish list for when you can get to them:
1. The ability for users to pick among a selected list of open-source LLMs, from a list that users can keep updated.
2. A local RAG application for getting insights from personal tabular data stored in multiple formats, e.g. Excel/Google Sheets and PDF tables.
@pixegami · 4 months ago
Thanks for your comment, I'm really glad to hear it was helpful. I appreciate you sharing the feedback and suggestions as well, I've added these items to my list of ideas for future videos :)
@agustinfilippo5451 · 2 months ago
I've watched a few of your videos and didn't know which one to comment on first to congratulate you. Great content and even better style.
@denijane89 · 5 months ago
That was the most useful video I've seen on the topic (and I've watched quite a lot). I didn't realise the quality of the embedding is so important. I have working code for a local PDF AI, but I wasn't very impressed by the results, and that explains why. Thank you for the great content. I'd love to see other uses of local LLMs.
@pixegami · 4 months ago
Glad you liked it! Thanks for commenting and for sharing your experience. And absolutely - when building apps with LLMs (or any kind of ML/AI technology), the quality of the data and the index is really non-negotiable if you want high-quality results.
@chair_guy · 2 months ago
Which is the best freely available embedder? Any help would be really appreciated.
@mpesakapoeta · 1 month ago
The best RAG tutorial so far; you've introduced concepts I've not seen in other similar tutorials.
@davidtindell950 · 2 months ago
BTW: I used the OpenAI embeddings model "text-embedding-3-large" and obtained very similar results to your demo query about Monopoly. I first used Ollama 'llama3', but then retested with Ollama 'mistral:latest'. Surprisingly, the 'mistral' results were better than the 'llama3' ones! All I can say now is "G'Day Mate" and thank you again!
@vslabs-za · 2 months ago
llama3 vs mistral? That's a weighty comment there mate...
@habibbayo3327 · 1 month ago
@@vslabs-za 🤣🤣🤣
@royli2009 · 1 month ago
Pure gold! Thanks for putting all the bits and pieces together so well bro!
@JaqUkto · 4 months ago
Thank you very much! I've started my RAG using your vids. Of course, much of your code needed to be updated, but it was simple even given my zero knowledge of Python.
@sergiovasquez7686 · 4 months ago
Maybe you could share the updates with us 😅
@pixegami · 4 months ago
Nice work, glad you got it working!
@mrrohitjadhav470 · 5 months ago
After searching hundreds of videos, the journey ends here. 😍 Would you please make a tutorial on building a knowledge graph using Ollama?
@pixegami · 5 months ago
Thanks, glad your journey came to an end :) Thanks for the suggestion - I've added the idea to my list :)
@mrrohitjadhav470 · 5 months ago
@@pixegami Aweeeeeeeesome! Just a slight tweak: a knowledge graph based on PDF/TXT (my own data). Sorry for not elaborating, but too much of my own data makes it difficult to find connections between many sources.
@nachoeigu · 5 months ago
Your content is amazing! Keep it going. I would like to see a continuation of this video covering how to upload and automate the workflow in the AWS cloud, and how to integrate the chat interface with a Telegram bot.
@pixegami · 5 months ago
Glad you liked it, and thanks for the suggestions. My next video will be focused on how to deploy this to the cloud - but I hadn't thought about the Telegram bot idea before, I will look up how to do that.
@michaelmaloy6378 · 2 months ago
My laptop is woefully underpowered, and I had to update a couple of the dependencies, but I was able to get mistral to tell me about gears. Hoping I can use these "6 simple machines" PDFs to accomplish something similar. Thank you so much for this project. :)
@jial.5245 · 5 months ago
Thank you so much for the content👍🏼 very well explained! It would be great to see a use case using an AutoGen multi-agent approach to enhance the RAG response.
@pixegami · 5 months ago
Glad you liked it, thank you! And thanks for the suggestion and project idea :)
@davidtindell950 · 2 months ago
Thank you for "taking it up a notch..." or 2 or 3 notches!
@trueindian03 · 2 months ago
This is the best RAG tutorial on YouTube. Thanks for the video, you've got a new subscriber 🎉
@KrishnaKotabhattara · 5 months ago
For evaluation, use RAGAS and LangSmith. There is also an SDK for Azure which does the same things as RAGAS and LangSmith.
@pixegami · 5 months ago
Oh, thanks for the recommendation. I'll have to take a look into that.
@carlosalberto-mo1wj · 3 months ago
I simply love the whole video! For the next RAG tutorial, could you show a deployment on Azure or any other cloud, just to see in depth how this works? Thanks so much for the content, man!
@pixegami · 3 months ago
My upcoming video is actually about how to deploy a RAG app like this to the AWS cloud :) Stay tuned!
@nickmills8476 · 4 months ago
To update the Chroma data for PDF chunks whose content has changed, store a hash of the chunk contents in the metadata field. In addition to adding IDs that don't already exist, select records whose metadata hash has changed and update those records using collection.update().
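The hash-in-metadata idea above can be sketched as a pure sync plan. `plan_sync`, the ID format, and the commented Chroma calls are illustrative assumptions (the real store would be a Chroma collection, with `collection.add` / `collection.update` applied to the returned ID lists):

```python
import hashlib

def chunk_hash(text: str) -> str:
    """Content fingerprint stored alongside each chunk (MD5 is enough for change detection)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def plan_sync(existing: dict, incoming: dict):
    """existing/incoming map chunk ID -> content hash.
    Returns (ids_to_add, ids_to_update) for the vector store."""
    to_add = [cid for cid in incoming if cid not in existing]
    to_update = [cid for cid in incoming
                 if cid in existing and existing[cid] != incoming[cid]]
    return to_add, to_update

# In a real pipeline (hypothetical Chroma calls):
#   collection.add(ids=to_add, documents=..., metadatas=[{"hash": h}, ...])
#   collection.update(ids=to_update, documents=..., metadatas=[{"hash": h}, ...])
existing = {"doc.pdf:0:0": chunk_hash("old text")}
incoming = {"doc.pdf:0:0": chunk_hash("new text"), "doc.pdf:0:1": chunk_hash("more")}
print(plan_sync(existing, incoming))
# -> (['doc.pdf:0:1'], ['doc.pdf:0:0'])
```

Keeping the plan as a pure function makes it easy to test without touching the database.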
@VictorWaknin · 28 days ago
Upvote this guys🎉
@JohnBoen · 4 months ago
Hehehe... tests are easy. I was wondering how to do those. Prompt: "State several facts about the data and construct a question that asks for each fact. Create tests that look for the wrong answer... Give me 50 of each... Give me some examples of boundary conditions... Formatting..." In an hour I will have a fat stack of tests that would normally take a day to create. This is awesome :)
@paulham.2447 · 5 months ago
Very, very useful and so well explained! Thanks.
@pixegami · 5 months ago
Thank you!
@nascentnaga · 5 months ago
Suuuuuper helpful. I need to test this for a work idea. Thank you!
@pixegami · 5 months ago
You're welcome!
@ravikiranbasuthkar2818 · 2 months ago
This is the best practical tutorial I've come across on LLMs, RAG, and LangChain. Could you also make one about agents and their uses?
@ramanamachireddy · 2 months ago
Thanks for your good work. It was such a crisp and clear video, straight to the point. I loved it. Keep up the good work. I would also like to see how we can deploy such models in production; if you can do one such video next time, it would be really informative. Thanks, Ramana
@muhannadobeidat · 5 months ago
Great video and nicely scripted. Thanks for the excellent effort. I find that nomic 1.5 is pretty good for embedding and lightweight as well. I did not do an actual performance-metric-based analysis, but recall and precision in practice are pretty impressive with only 768 dimensions.
@pixegami · 5 months ago
Thank you! Glad nomic text worked well for your use case :)
@maikoke6768 · 4 months ago
The issue I have with RAG is that when I ask about something I know doesn't exist in a document, the AI still provides a response, even though I would prefer it not to.
@mo3x · 5 months ago
So it is just an advanced Ctrl+F?
@pixegami · 5 months ago
Yes, that's one way to think about it. Still, incredibly powerful.
@Larimuss · 2 months ago
When a company has 10,000 documents, trust me, this shit is useful and will be the future. Microsoft will probably sell it as an Azure extension option.
@yadhapdahal758 · 2 months ago
Good use case - an alternative to FAQ/help pages for websites/applications. Will give it a shot.
@kashishkumar · 2 months ago
Semantic now
@beauforda.stenberg1280 · 2 days ago
😂
@philc787 · 8 days ago
Good content. Suggestion for the future: zoom in VS Code to make the text slightly bigger when running full screen.
@careyatou · 4 months ago
I got this to work with my own data. This was so cool. Thanks!
@pixegami · 4 months ago
Awesome! Glad to hear it worked for you :)
@pmgear · 2 months ago
I could not get it to work; it won't build a database, neither with Bedrock nor Ollama. Bummer.
@willnorden2268 · 1 month ago
@@pmgear Yeah, it keeps saying that "from langchain.vectorstores.chroma import Chroma" is deprecated.
@zhubarb · 5 months ago
Crystal clear. Great video.
@pixegami · 4 months ago
Thank you! Glad to hear that :)
@AlexandreBarbosaIT · 5 months ago
Smashed the Subscribe button! Awesome content! Looking forward to the next ones.
@pixegami · 5 months ago
Thank you! Glad you enjoyed it, and welcome!
@michaelwindeyer6278 · 2 months ago
Thank you for this tutorial! It went into more detail than most. I have questions: in all the tutorials I have watched, a small dataset is always used (a few games' instructions in yours). How big can the dataset be? What if I have thousands of PDFs? Will RAG give less accurate answers in that case, and are there other things to consider when dealing with larger datasets?
@samho9831 · 3 months ago
Thanks for sharing; now I have a grasp of the concept. If possible, it would be deeply appreciated if you could show how to do it in the cloud. Thanks.
@pixegami · 3 months ago
Stay tuned for my next video! kzbin.info/www/bejne/opWpgIF4pJd2iJI
@basselkordy8223 · 5 months ago
High quality stuff. Thanks
@pixegami · 5 months ago
Glad you liked it!
@කැලණිකුප්පි · 5 months ago
Recently discovered your channel 🎉 , subscribed 😊 keep up the awesome content
@pixegami · 5 months ago
Thank you! Welcome to the channel!
@sardorshorahimov9486 · 2 months ago
Hi, thank you for the video and info. This is one of the best videos about AI, ML, and RAG. Your video was so helpful :) thank you again!
@Karthik-ln7eg · 2 months ago
Great video! Love the way you simplified the concept. Are you thinking of making videos on fine-tuning, function calling, or agents? That would be a great series; I'm sure all your subscribers, including me, would benefit greatly. Meanwhile, can you share any resources on these topics?
@VenuraPussella · 1 month ago
This is really good and informative. Please make a video on deploying applications like this to the cloud.
@shankarkarande4175 · 2 months ago
Thank you so much, the best tutorial I've ever seen!!
@ManuelJimenez1 · 3 months ago
Thanks for the whole tutorial. I would suggest also covering query speed for the vector database using PostgreSQL with the pgvector / pg_embedding extensions.
@maxflokinho · 3 months ago
I would like it not only to read PDFs, but also, if the final information is weak or missing, to do an internet search on the topic in the query and complete the final answer with the collected data. Do you think this is feasible? I thought about using agents for this, like CrewAI. I looked but couldn't find any tutorial that used both methods.
@Jorxdandres22 · 2 months ago
Thanks friend, you're a genius. This helped me a lot with my project; I was able to implement it in a chat with Django, and I also managed to add images and have them taken into account.
@ayoubfr8660 · 5 months ago
Great stuff as usual! Could we have a video about how to turn this RAG app into a proper desktop app with a graphical interface? Cheers mate.
@pixegami · 5 months ago
Good idea, thanks! I'll note it down as a video idea :)
@ayoubfr8660 · 5 months ago
@@pixegami Thank you for the reply and reactivity! Have a nice day!
@J3R3MI6 · 5 months ago
@@pixegami I subbed for the advanced RAG content
@iainhmunro · 5 months ago
This is pretty good. I was wondering how I could integrate this with my current Python scripts for my AI calling agent, so that if someone called the number, they could chat with the PDF.
@pixegami · 5 months ago
I think that should certainly be possible, but it's quite complicated (I haven't done anything like that myself). You'd probably need to hook up a phone number/service to an app that can transcribe the text in real time (like Alexa or Siri does), then have an agent figure out what to do with that interaction, and eventually hook it up to the RAG app. After that, you'll need to think seriously about guard-rails for the agent; otherwise it could get your business into trouble. An example of this is when Air Canada's chatbot promised a customer a discount that wasn't available: www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know
@kofiadom7779 · 1 month ago
Your tutorials are very simple to understand. Could you please do a tutorial on reinforcement learning from human feedback?
@RasNot · 5 months ago
Great content, thanks for making it!
@pixegami · 5 months ago
Glad you enjoyed it!
@davidgortega3734 · 3 months ago
For the unit tests you can use tools or grammars to constrain the output; that way you can fix some of the issues you're showing.
@pixegami · 3 months ago
Good idea. I haven't actually explored testing LLM output in detail yet, and I think it will be a fascinating topic.
@durand101 · 3 months ago
Such a helpful tutorial, thank you!
@pixegami · 3 months ago
Glad you enjoyed it!
@gustavojuantorena · 5 months ago
Great content as always!
@pixegami · 5 months ago
Thanks for watching!
@bartmeeus9033 · 2 months ago
Great tutorial, well explained!!!
@techietoons · 2 months ago
Very thorough. Best video.
@bharanij6130 · 2 months ago
Hello! Mighty pleased, and thank you very much!
@jasiriwa-kyendo8043 · 1 month ago
Yes, I would love to know how this can be pushed to the web, since using Ollama would completely change everything.
@elvistolotti45 · 4 months ago
great tutorial
@houstonfirefox · 12 days ago
How about computing a binary checksum for each chunk and comparing it with the checksum stored in the DB to detect any changes? Great video!
@ishadhiwar7636 · 4 months ago
Thank you for the fantastic tutorial! It was incredibly helpful and well explained. I was wondering if you have any plans to release a video on fine-tuning this project using techniques like RLHF? It would be great to see your insights on that aspect as well.
@pixegami · 4 months ago
Thank you! Glad you enjoyed the video. I've noted the suggestion about fine-tuning; I hadn't considered it yet, but thanks for sharing that idea with me.
@sergiovasquez7686 · 4 months ago
I just subscribed to your channel… some of the highest-quality videos on YouTube.
@pixegami · 4 months ago
Thank you! Welcome.
@ziadbensaada · 4 months ago
Hi, it gives me this problem when I run python populate_database.py: "Could not load credentials to authenticate with AWS client. Please check that credentials in the specified profile name are valid. Bedrock error: The config profile (default) could not be found (type=value_error)"
@1nd3v · 1 month ago
Same problem; did you solve it, and if so, how?
@ziadbensaada · 1 month ago
@@1nd3v This model requires payment; you need to buy it!
@jpmikewired · 1 month ago
Very good video.
@pm1234 · 5 months ago
I successfully ran it locally with Ollama embeddings, but it pretends its answer is from the PDF while I'm 100% sure it's from elsewhere. Let me explain: my Monopoly PDF rules are in French, my LLM is in French (Vigostral), my question is in French (translated: "How much total money does a player start with in Monopoly?"), but the reply states $1500, an amount and currency that are NOT in the French PDF rule book (150'000Frs, very old rules), and it cites the sources: ['data/1d-monopoly-regle.pdf:5:1', 'data/1d-monopoly-regle.pdf:3:4' ... Asking the same question in French directly to ollama (run), without the PDF, gives the actual amount (2000€). So it makes me wonder whether it really works reliably without bias (model info vs. doc info), why it cites sources while giving an answer unrelated to those sources, and how to identify what is not working in this case.
@Yakibackk · 5 months ago
Maybe it's related to the embedding? Which one are you using?
@thaslim7869 · 5 months ago
Does this work offline?
@pm1234 · 5 months ago
@@Yakibackk Ollama offers (only) 3 embeddings; I tested nomic and mxbai.
@pm1234 · 5 months ago
@@thaslim7869 Ollama works locally.
@juryel3561 · 5 months ago
Since it's RAG, you can explicitly tell it in its instructions not to use information outside of the provided documents.
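A minimal sketch of that kind of instruction, built as a prompt template; the exact wording and the refusal phrase are illustrative assumptions, not the template from the video:

```python
# Template constraining the model to the retrieved context only (wording is illustrative).
PROMPT_TEMPLATE = """Answer the question using ONLY the context below.
If the answer is not contained in the context, reply exactly:
"I don't know based on the provided documents."

Context:
{context}

Question: {question}
"""

def build_prompt(context_chunks, question):
    """Join retrieved chunks with a separator and fill the template."""
    context = "\n---\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)

print(build_prompt(["Each player starts with 1500."], "How much money to start?"))
```

This does not guarantee the model obeys (smaller models still leak prior knowledge), but an explicit refusal phrase makes failures easy to test for.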
@joaquinestay8097 · 2 months ago
To regenerate a modified PDF, just add the modified date to the metadata and include it in the workflow.
@pradeepvenkat4557 · 1 month ago
Nice video.. can you please share the update logic too, or make another video if possible?
@anwartaleb2608 · 8 days ago
Thanks a lot for this incredible video! AI plays a great role in our lives nowadays. I'm thinking about creating a service similar to the one in this video. Do you know of any good hosting services, and are the prices high?
@nickmills8476 · 4 months ago
Using a local embedding model, mxbai-embed-large, got me results similar to your Monopoly answer.
@pixegami · 4 months ago
Thanks for sharing! I hadn't tried that one yet.
@eldino · 2 months ago
Thank you for the tutorial and the code! Two questions: 1. How can we improve the chunking part? 2. How can we create a derivative model of llama3 or similar that includes our embeddings, and how do we export and import it across multiple machines running Ollama? Thanks!
@60pluscrazy · 5 months ago
Excellent 🎉🎉🎉
@pixegami · 5 months ago
Thank you! Cheers!
@sebastianscarano9418 · 17 days ago
Great video indeed! Could you explain the difference between this kind of RAG approach and taking an existing model (llama3.1, for example) and fine-tuning or training it with those same PDFs to create a custom LLM?
@maxi-g · 4 months ago
Hey, I have a question. I tried to load a fairly large PDF (100 pages) into the database (approx. 400 documents). However, the add_to_chroma function seems to be excruciatingly slow. The output from Ollama shows that embeddings only get requested once every two seconds or so. There is also no CPU or GPU load on my system when this process is running. Is there any way to improve this? Thanks already!
@pixegami · 4 months ago
This is most definitely because of the time it takes to embed each page (since you mentioned embeddings get requested once every two seconds). Your Ollama model might not be able to fully leverage your hardware, which is potentially why you don't see your CPU/GPU load rise. You could experiment by switching to an online embedding API (like OpenAI or AWS Bedrock) and see if it's faster. Or you could double-check whether Ollama is using your GPU correctly (github.com/ollama/ollama/blob/main/docs/gpu.md).
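To confirm that embedding latency is the bottleneck, a tiny timing harness like the following can help; `fake_embed` is a stand-in, and in practice you would pass something like a LangChain embedding object's `embed_query` method instead:

```python
import time

def avg_embed_latency(embed_fn, texts):
    """Average seconds per chunk; high values point at the embedder, not Chroma."""
    start = time.perf_counter()
    for t in texts:
        embed_fn(t)
    return (time.perf_counter() - start) / len(texts)

# Stand-in embedder for demonstration (returns a dummy 768-dim vector instantly).
fake_embed = lambda text: [0.0] * 768

latency = avg_embed_latency(fake_embed, ["chunk one", "chunk two", "chunk three"])
print(f"{latency:.6f} s per chunk")
```

Comparing this number across embedders (local Ollama vs. a hosted API) makes the trade-off concrete before re-indexing a large corpus.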
@mingilin1317 · 5 months ago
Great video! Successfully implemented RAG for the first time, so touching. Subscribed to the channel already! In the video, you mentioned handling document updates. Do you have plans to cover this topic in the future? I'm really interested in it! Also, do "ticket_to_ride" and "monopoly" share the same database in the example code? What if I don't want them to share? Is there a way to handle that?
@pixegami · 4 months ago
Awesome! Glad to hear about your successful RAG project, well done! I've had a lot of folks ask about vector database updates, so it's something I definitely want to cover. If you want to store different pieces of data in different databases, I recommend putting another layer of logic on top of the document loading (and querying). Have each folder use a different database (named after the folder), then add another LLM layer to interpret the question and map it to the database it should query.
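The routing layer described in the reply could be sketched as below. All names and paths here are hypothetical, and the keyword matcher is a simple stand-in for the LLM classification step the reply suggests:

```python
# Hypothetical per-topic database paths, one per source folder.
DATABASES = {
    "monopoly": "chroma/monopoly",
    "ticket_to_ride": "chroma/ticket_to_ride",
}

# Stand-in classifier: a real version would ask an LLM which topic the question fits.
KEYWORDS = {
    "monopoly": ["monopoly", "boardwalk", "go to jail"],
    "ticket_to_ride": ["ticket to ride", "train", "route"],
}

def route_question(question: str) -> str:
    """Map a question to the database it should query."""
    q = question.lower()
    for topic, words in KEYWORDS.items():
        if any(w in q for w in words):
            return DATABASES[topic]
    # Fallback; a real app might query every database and merge results.
    return DATABASES["monopoly"]

print(route_question("How do I score a train route?"))
# -> chroma/ticket_to_ride
```

Swapping the keyword check for an LLM call keeps the same structure: classify first, then open only the matching vector store.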
@octavee2507 · 2 months ago
Thank you for your video and for sharing your knowledge! You give a very clear explanation. As an experienced person, could you please suggest: what is the right way to fine-tune the embedding model to handle synonyms and definitions? For example, in our knowledge base we have the statement: "The constituent documents are located here: ". The "constituent documents" include the regulation, the order on the appointment of the director, and some other docs. I first need to get this statement back when the user asks for 'constituent documents', 'regulation documents', or the 'order on the appointment' (by comparing the embeddings). Currently I'm trying the 'paraphrase-multilingual-mpnet-base-v2' embedding model; it performs well and finds this statement for the queries 'get constituent documents' and 'get regulation documents', but not for 'order on the appointment' (it is not even in the top 5 results). I can't figure out how to fine-tune the model so it 'knows' that 'constituent documents' include 'the order on the appointment', or even how to prepare a training dataset for that. Thank you for any suggestions.
@nirmalkumar007 · 5 months ago
Which part of the code makes the API call to the Ollama server? Kindly help.
@pixegami · 5 months ago
The LangChain Ollama wrapper class (e.g. python.langchain.com/docs/integrations/text_embedding/ollama/ for the embedding) wraps all the code to call Ollama for you.
@raylin2527 · 1 month ago
amazing!!!
@KushwanthK · 4 months ago
Very good content, thanks for making it. I liked your validation idea, but how about evaluating descriptive answers? Do pytest and a prompt work together for that? Also, did you make a tutorial on how to update the vector database when file content changes?
@pixegami · 4 months ago
Thanks, I appreciate it! I think you can also try the same evaluation strategy for descriptive answers. I've also seen other commenters mention frameworks for evaluating LLM responses, so that might be worth looking into as well.
@bachirafik8040 · 5 months ago
Hi sir, I got an AWS error.
@pixegami · 5 months ago
What error did you get? Did you set up the AWS CLI and enable the embedding models in your AWS account as well?
@vidfan1967 · 5 months ago
I want to verify an existing PowerPoint document against a PDF, e.g. with an updated law text, to find out whether my statements on each PPT slide are still true. Challenge: each slide contains one or more statements which should be verified against the PDF; for example, you may have 6 or 10 bullet points on one slide. To use RAG I cannot put them all in one query, as they might be quite diverse and would retrieve knowledge from the PDF that matches all of them together rather than any specific point. Also, the context on the slide should be considered together with each statement, e.g. the title of the slide, an intro text above the bullet-point list, or the upper-level information for a statement that sits in a sub-structure of bullet points. I guess I would somehow need to split the statements in the PPT into logical chunks, preserving the context. Is there a Python function I could use? Or can this be done with AI (e.g. few-shot) after the slide text has been extracted? If this is of wider interest, I would appreciate a video on this 🙂
@pixegami · 5 months ago
Thanks for sharing your use case. Yup, this is definitely something you need to solve during the "chunking" phase of the process. For example, there are some experimental "chunking" functions from Langchain you could try: python.langchain.com/docs/modules/data_connection/document_transformers/semantic-chunker/ Also, you could "bake" custom information into each chunked document itself. E.g. something like (each variable name is made up): chunk.page_content = title_of_doc_str + context_str + actual_page_text_str
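The "bake in context" idea from the reply can be sketched like this; `Chunk` is a minimal stand-in for a LangChain `Document`, and the function and field names are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """Minimal stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def bake_context(chunk: Chunk, slide_title: str, intro: str) -> Chunk:
    """Prepend slide-level context so each statement embeds with its setting."""
    chunk.page_content = f"{slide_title}\n{intro}\n{chunk.page_content}"
    chunk.metadata["slide_title"] = slide_title  # keep the raw context queryable too
    return chunk

c = bake_context(Chunk("Fines doubled in 2024."), "Traffic law update", "Key changes:")
print(c.page_content)
```

Because the title and intro are embedded alongside the statement, a query about the slide's topic can retrieve the right chunk even when the bullet text alone is ambiguous.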
@NotAnotherKelvin · 2 months ago
Amazing video!
@Martin-kv3th · 1 month ago
5:25 - mismatched curly bracket and parenthesis
@rob679 · 4 months ago
On model parameter size, 7B models are enough. Not related to this video, but I'm using Llama3 8B with OpenWebUI's RAG and it works, though it sometimes has problems referring to the correct document while giving a correct answer (it will hallucinate the document name); that's just how its RAG implementation works.
@pixegami · 4 months ago
Interesting, I haven't tried this with the 7B models yet. Thanks for sharing!
@PANDURANG99 · 2 months ago
What if we didn't make any predefined rules?
@David-re8bi · 8 days ago
It works very slowly on my local machine. It would be nice to know how to deploy and accelerate it.
@studyboifromthavoid9606 · 1 month ago
Hello! Quick question: in what part of the code are you inputting the user's question? Thanks for the tutorial, very well done!!
@edwardtse8631 · 5 months ago
This is a very good tutorial. How do you solve the problem of editing a data file? By storing the SHA-1 of the file?
@pixegami · 5 months ago
Exactly right - you can use an even simpler hash function like MD5 just to check that the content of each chunk hasn't changed. You'll still have to loop through all the chunks to calculate the hashes and compare them. That should be fine for 100s or 1000s of chunks, but it might not scale too well beyond that.
@danielcomeon · 5 months ago
Thanks a lot, great video!!! I want to know how to add new data to the existing database with new unique IDs.
@pixegami · 4 months ago
Thanks! Glad you liked it. If you just want to **add** new data, the chapter on updating the database should already cover this. You just need to add new files into the folder and run the `populate_database` command again. Any pages/docs not already in the database will be added. But if you meant updating existing pages/segments in the existing data, then yes, I'll have to make a video/tutorial about that :)
@felixkindawoke · 4 months ago
Thank you! Could you do a tutorial on how to talk to the data? So, based on this, create a voice chat with it.
@utuberay007 · 3 months ago
How long does it take to chunk a single 10-page PDF? For me it sits forever here:
Number of existing documents in DB: 0
👉 Adding new documents: 26
@kayafurkanipekci · 2 months ago
Great video, thanks :)
@bowen-feng · 2 months ago
Thanks for the great vid! Do you have a sense of how much this setup differs in output quality from using OpenAI's Assistants API (where you can upload the files directly via the UI, create a vector DB via the same interface, and have the AI use that as the RAG architecture)? That approach is much simpler to implement.
@namaefumei · 4 months ago
This is great!
@jtstrocel · 1 month ago
How is this approach different from uploading a zip file with the PDFs into an LLM using OpenWebUI?
@beauforda.stenberg1280 · 2 days ago
That is quite interesting. On reflection and investigation, a hybrid of distributed processing between the local environment (which for me is an Android tablet) and a private server (local, remote, or a third-party cloud solution) would be optimal and resource-efficient. Cheers 🎉
@patoury4392 · 3 months ago
Very interesting, thanks for this video. I have a question: once you realize that your model has many failures, how can you adjust the model? Should there be document replacement, like starting over?
@pixegami · 3 months ago
Thank you for watching! That's an excellent question. I haven't explored all the options to roll back and limit blast radius in detail. Off the top of my head, I'd probably:
- Separate my data sources and DBs by topic/subject to limit the blast radius (if something goes wrong).
- Store every version of the embedding DB and source data as a backup. Maybe keep them for 1-2 years before they get deleted (if storage cost is a problem). Then if I have an issue, just roll back to the previous version and try again.
@ergala10 · 2 months ago
How could you run the LLM on AWS or Azure instead of Ollama? Could we see an implementation of it? Also, how can we convert this into an interface similar to a chatbot? Great video!
@pratheekbabu272 · 2 months ago
Great content. Can you also evaluate RAG without an OpenAI key?
@den-hub · 2 months ago
thank you :)
@SDChillRide · 3 months ago
Hi, not sure what is going on, but ChromaDB takes way too long to create my database:
{'source': 'data\\EntityDiagrams_TR_v2014SE.pdf', 'page': 0, 'start_index': 0}
Adding to Chroma DB...
Number of existing documents in DB: 0
👉 Adding new documents: 261
Processing batch 1 with 100 chunks... Batch 1 added in 205.49 seconds.
Processing batch 2 with 100 chunks... Batch 2 added in 204.37 seconds.
Processing batch 3 with 61 chunks... Batch 3 added in 124.76 seconds.
Documents added to Chroma DB in 535.19 seconds.
Data store generated in 537.88 seconds.
@NotDefinitelyNotMe · 2 months ago
Amazing!
@AiWithAnshul · 5 months ago
This is an impressive setup! I'm currently using Weaviate as my vector DB along with OpenAI models, and it's working really well for handling PDFs, Docs, PPTs, and even Outlook email files. However, I've been struggling to integrate Excel and CSV files into my knowledge base. For small Excel files the vector approach seems fine, but it's challenging for larger ones. I'd love your input on how to build a system that incorporates Excel files along with the other formats. I've considered using something like PandasGPT for the Excel and CSV files and the traditional RAG approach for the remaining file types (PDFs, Docs, etc.). Perhaps adding an agent as the first layer to determine where to direct the query (to the RAG model or PandasGPT) would be a good idea? What are your thoughts on this?
@pixegami · 4 months ago
Thanks for your comment and for sharing your challenges and ideas. I think if you are mixing free-form text (like documents) and something more traditionally queryable (like a DB), it does make sense to engineer some more modality into your app (like what you suggested). I haven't explored that far myself so I can't share anything useful yet. But I'll be sure to keep it in mind for future videos. Good luck with your project!
@nhtdmr · 5 months ago
Could you make a video of the same project with Llama 3?
@pixegami · 5 months ago
You can do it here! Just change the Ollama model to Llama 3: ollama.com/blog/llama3