Supercharge your Python App with RAG and Ollama in Minutes

35,889 views

Matt Williams

A day ago

Comments: 126
@G0rd0nG3ck0 5 months ago
Hey Matt - I know I'm just an internet stranger in an endless ocean of internet noise, but I just wanted to drop you a comment to let you know I've really enjoyed your videos. You have a casual approach to production, or maybe it could better be described as a thorough planning and preparation process that results in a casual vibe for the viewer, and I dig the nuggets of wisdom I've gleaned over the past few months I've been watching your content. I work in tech (professionally) myself, and I mostly recreate in tech as well. I have been all over the place with my inspiration and dabbling over the past year and change. Image and video generation, LLMs, open source architectures and paid platforms, basically whatever I stumble across that looks nifty. I've recently been seeking some new inspiration, and your videos have been a breath of fresh air to watch and get the gears turning for me as I consider what I could dig into for a next project. I'm not a "developer" either but I've had great fun with Python and Ollama, and you explain how to use these tools in a manner that is very approachable. Keep up the great work.
@BetterThanTV888 5 months ago
The most amazing video I have watched in over a year. Your style of educating users is amazing. Great videography and editing. My only wish is that your channel shoots the moon 🚀and you get rewarded from the YT algorithm and compensated.
@technovangelist 5 months ago
It’s growing faster than I expected so all good. Thanks so much for the comment
@dbwstein 5 months ago
Great stuff. Here’s the issue: most of the data I want to review, like contracts, zoning laws, etc., is in PDFs. So, the RAG apps I want to build will be for getting data out of PDFs. Anything you can do on that front would be great.
@technovangelist 5 months ago
Your best bet is to find the source documents with the full text. PDFs are never the source. The amount of cleanup required to get good info out of them will often take longer than tracking down the original. In some cases you may get lucky.
@adrianbool4568 5 months ago
If you're on a Mac with Homebrew, try installing the Ghostscript package (brew install ghostscript). Similar on Linux, using whatever package manager is appropriate. Ghostscript provides the "ps2ascii" tool - just call that, giving it the input PDF filename and an output (text) filename as arguments, and it will perform the translation. If your PDF is mostly just text, the output is usually pretty good. If there are lots of "design elements" within the PDF - not so much. For your type of content, it may do pretty well. You can script this with zsh/bash to convert whole folders of PDF files to text quickly. Good luck.
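For anyone who would rather script that batch conversion in Python than zsh/bash, here is a minimal sketch, assuming Ghostscript's ps2ascii is on the PATH and the PDFs sit in a hypothetical ./pdfs folder:

# Convert every PDF in ./pdfs to text with Ghostscript's ps2ascii.
# Assumes Ghostscript is installed (e.g. brew install ghostscript) and ps2ascii is on the PATH.
import subprocess
from pathlib import Path

pdf_dir = Path("pdfs")    # hypothetical input folder
out_dir = Path("text")    # hypothetical output folder
out_dir.mkdir(exist_ok=True)

for pdf in pdf_dir.glob("*.pdf"):
    txt = out_dir / (pdf.stem + ".txt")
    subprocess.run(["ps2ascii", str(pdf), str(txt)], check=True)  # ps2ascii <input> <output>
    print(f"converted {pdf.name} -> {txt.name}")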
@technovangelist 5 months ago
It is unfortunate that you need to go through hoops like that. I hope to find a better way that works and doesn’t require a horrible approach like that.
@chrisBruner 5 months ago
I have a shell script called summerize_pdf which is:

pdf2text "$1" | ollama run mistral "summarize this in 2000 words or less"

pdf2text is a Python program which is:

#!/usr/bin/env python3
import sys
from PyPDF2 import PdfReader  # Use `PdfReader` instead of `PdfFileReader` for more recent versions

def extract_text(pdf_path):
    with open(pdf_path, 'rb') as f:
        reader = PdfReader(f)
        num_pages = len(reader.pages)  # Total number of pages in the PDF document
        text = ""
        for page_num in range(num_pages):  # Iterate over each page
            page = reader.pages[page_num]
            text += page.extract_text() + " "  # Append this page's text to the full string, followed by a space
    return num_pages, text

def main():
    if len(sys.argv) != 2:
        print("Usage: pdf_text_extractor.py <filename.pdf>")
        return
    filename = sys.argv[1]  # Get the PDF filename from the command line arguments
    num_pages, text = extract_text(filename)  # Extract text from the file
    print("Total pages:", num_pages)
    print("Extracted Text:", text)  # Print out the extracted text

if __name__ == "__main__":
    main()

Not elegant but it gets the job done.
@technovangelist 5 months ago
Unfortunately it uses pypdf which does a terrible job for most PDFs. Sometimes it works ok, but way too often the text is jumbled up. Since many PDFs are simply images, an OCR step is often needed. I think most who think pypdf works don't actually look at the resulting text.
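For those image-only PDFs, one possible OCR step looks roughly like this. This is not from the video; the library choice is an assumption, and it needs the poppler and tesseract binaries installed alongside pip install pdf2image pytesseract:

# Rough OCR fallback for scanned PDFs: render each page to an image, then OCR it.
from pdf2image import convert_from_path
import pytesseract

def ocr_pdf(pdf_path: str) -> str:
    pages = convert_from_path(pdf_path)  # one PIL image per page
    return "\n".join(pytesseract.image_to_string(page) for page in pages)

print(ocr_pdf("scanned.pdf")[:500])  # hypothetical file; print a short preview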
@brian2590 5 months ago
PDFs are a nightmare. I started building a semantic search platform for PDFs back in 2017. It needed to process thousands of PDFs a day, all within a very short time frame of half an hour. Then I added some early AI to recognize entities and sentiment. Now I am tasked with adding LLMs to the mix. The result is a massive project with many moving parts, and it's polyglot: Java, Python, Rust, JS. It uses old and new NLP and AI technologies. I hate it, though there is no other way to pull off what I need to do without all of these moving parts. It's also too expensive to run all of this in a cloud. I feel for anyone tasked with a large RAG project and cringe a bit when I see people trying to throw a bunch of PDFs into an LLM. Don't do it! Ollama has been very helpful as this system progresses. Thank you!
@BelgranoK 5 months ago
I work on a project to build a RAG from PDFs using PHP. We tested some tools separately and they work.
@mrrohitjadhav470 5 months ago
@BelgranoK may I ask for the list of tools?
@joeburkeson8946 5 months ago
Wow, I can only imagine what things will be like 5 years in the future, thanks for all you do.
@technovangelist 5 months ago
Yeah, I wonder if it will be a big change or incremental. A lot of the science behind what's going on started being researched 60-70 years ago, and the core concepts of language models and how they work were from the early 1990s. There was the transformers paper about 8 years ago, which was evolutionary rather than revolutionary, but it is the big change that got us to here. So I could see it going both ways. Maybe, as things are moving fast, the mythical AGI is only a decade away, or maybe it's much further. Who knows. Exciting time to be making videos.
@lightspeedlion 2 months ago
You deserve a subscribe just for this vid. 9:41 minutes well spent.
@khalifarmili1256 5 months ago
That awkward silence at the end 😅. Thanks a lot for the insights 🎉🎉❤
@juliprabhu 5 months ago
Tashi Delek! I figured it out from your background picture hanging on the side wall. Great video. I've recently been using Ollama at work and I am loving it.
@technovangelist 5 months ago
Ahh. I picked that up on one of my trips to Nepal. My sister used to run a free healthcare clinic in a small town called Jiri east of Kathmandu.
@PoGGiE06 5 months ago
I really like your clear, measured, logical presentation style. This is a great, informative video that will help get anyone up and running with RAG and ChromaDB quickly, without getting bogged down in LangChain, which does not seem necessary for this task and yet is often lazily used, along with OpenAI. My questions would be:
(i) Why not use e.g. LlamaIndex? Is that not a more powerful approach, especially if one is smart/targeted in the way one constructs indices for particular subsets of the data?
(ii) Should one finetune the embeddings model for the use case? E.g. specific tuned embedding models for extracting particular information from annual reports: one model to retrieve corporate compensation/options data, another for segmental/divisional data, and another for analysing notes to accounts/accounting standards etc.
(iii) Pre-processing data, e.g. using pdfplumber to extract all tables from an annual report, locate and extract each relevant section (management report, notes etc.), and then query the relevant section for the information sought.
(iv) Agentic use of models, possibly also fine-tuned to the specific data retrieval tasks above. In particular, using several prompts asking the same question differently and passing the e.g. 3 responses back to another model for summarisation, to 'iterate' to the best response.
(v) Optimal use of different LLMs for different tasks. E.g. could one finetune tinydolphin and use that for targeted, faster information retrieval tasks, and then use e.g. mistral for the final combination of responses into a summary?
(vi) Basic reasoning applied to data sets. For example, I have my own custom in-house financial data set: say I want to compare the leverage between different competitors in the same subindustry, what model might be best for that? Should I finetune the model with examples of my particular analyses and the conclusions that I would like to see? Or even, using multiple considerations e.g. company valuation, 'quality' metrics, growth, competitive positioning, and scenario analysis, it should be possible to construct a simple, reasoned investment thesis.
Re: (i), I think that you recently did a vid on this. But I have seen a number of seemingly knowledgeable people saying the best approach is to just finetune BERT, as it uses encoding and is the best starting point. Apologies if that sounds confused: it probably is, I am new to this area.
@Aristocle 5 months ago
One idea for a next video could be a guide on how to create a chatbot + RAG with knowledge graphs (Neo4j).
@thesilentcitadel 5 months ago
Yes please!
@jz-xq4vx 5 months ago
enjoying your videos and Ollama - looking forward to the TS version of this one!
@ursamajor3275 5 months ago
Hi @Matt, can you update your repo so that we can have a fully working one? Some steps are missing. Thanks.
@JohnMitchellCalif 5 months ago
super useful and clear! Subscribed.
@rayzorr 4 months ago
Hey Matt. Love your work. This one took me a while to get up and running, but with just the right amount of cursing I was able to get it up. When I looked at the data in the database, there were a lot of blank lines in between text. I assume it would be preferable to strip that out before chunking?
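A small cleanup pass before chunking usually helps with exactly this; the following is just a sketch, not something from the repo:

# Collapse runs of blank lines and trim trailing spaces before handing text to the chunker.
import re

def clean_text(text: str) -> str:
    text = re.sub(r"[ \t]+\n", "\n", text)  # strip trailing whitespace on each line
    text = re.sub(r"\n{3,}", "\n\n", text)  # collapse 3+ newlines into a single blank line
    return text.strip()

print(clean_text("line one\n\n\n\nline two   \n\n\nline three"))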
@darthcryod1562 4 months ago
Great video! Any suggestions regarding which embeddings to use if my RAG app has to consume PDFs or other documents in Spanish? I have tried nomic-text, fastembeddings, and all-MiniLM for sentence transformers, but all of them fail to retrieve a good answer from Chroma using search, similarity_search, or similarity_search with relevance scores. I have tried using only English-language PDFs and it works fairly OK.
@user-he8qc4mr4i 5 months ago
After playing around with RAG I have several questions:
* What vector DB is the best option?
* Multi-agent? CrewAI?
* What orchestrator is the best? LangChain, LlamaIndex?
* What open source model is the best?
* What is the ideal workflow?
Goal = reliable answers and reduced hallucinations
@technovangelist 5 months ago
Well, keep watching. For RAG, orchestrators add complexity without benefit. Which model is best depends on what your needs are, and only you can decide. Workflow, again, is all about you.
@juliocesardasilvadantas9084 2 months ago
That's pure gold!
@nicholascomjean 5 months ago
Please do a STT (Speech to Text) / TTS (Text to Speech) integration!
@themax2go 2 months ago
the ending 😆💯
@SashaBraus321 5 months ago
This dude is so underrated
@UnchartedWorlds 5 months ago
TypeScript 👌 please, and a PDF tutorial also 👌
@technovangelist 5 months ago
The TypeScript version will be published on Monday. And then I will look at PDF in the next few weeks.
@rhenriquea 5 months ago
@technovangelist I really appreciate what you're doing man, I'm acting as an advocate to use JS/TS with AI and your videos help me a lot. Success!
@chrisBruner 5 months ago
@technovangelist I put some Python code in the comments, for a quick and dirty system.
@RocktCityTim 5 months ago
Thank you, Matt.
@IdPreferNot1 3 months ago
Hi Matt. Could you consider a video where you take this local RAG script you've made here and redo it using LangChain, to demonstrate the process and whether you think the abstraction approach is efficient or helpful for 1) new coders and/or 2) experienced coders?
@technovangelist 3 months ago
LangChain only complicates things, especially in such a simple app. I don’t want to create videos about the wrong way to do something.
@neil12345 5 months ago
I launch ChromaDB in a separate terminal within VS Code, then I run the import.py script in a different terminal. When I run the script, I receive an Errno 61; however, when I look at the logs of the localhost port 8000 ChromaDB server, I get multiple 200 API responses. Is there any troubleshooting as to why it would generate 200 responses while still erroring in the "for index, chunk in enumerate(chunks):" loop?
@highwayman696 4 months ago
I am facing the same issue as of now. Have you managed to find a solution?
@jamessnay9501 a month ago
I am having the same issue. Were you able to find the fix? Thanks!
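For anyone hitting that Errno 61 (connection refused), a quick sanity check against the Chroma server is a reasonable first step. This sketch assumes the server is running on localhost:8000 and that the collection name matches the repo's buildragwithpython; adjust both to your setup:

# Verify the ChromaDB server is reachable and the collection exists before running import.py.
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
print("heartbeat:", client.heartbeat())  # raises if the server is unreachable
collection = client.get_or_create_collection("buildragwithpython")
print("documents in collection:", collection.count())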
@farexBaby-ur8ns 5 months ago
Matt, you have very good content. I first saw the web UI vid and then came to this vid. Question: what you do here can also be done via Open WebUI > Documents, correct?
@themax2go 2 months ago
Very cool tutorial! Are you planning to add GraphRAG, now that it's open-sourced?
@technovangelist 2 months ago
Who knows. Maybe if it gets any traction in the future
@themax2go 2 months ago
@@technovangelist would you mind sharing how you measure "traction on a specific topic"?
@technovangelist 2 months ago
Are there enough folks talking about it? What's enough? Does it ever come up in any of my circles? A lot of stuff comes up; that never does anymore... 8 months ago the topic came up a bit (not this MSFT version, but other previous implementations), and it looks like it's a slight improvement over previous graph RAG approaches from others over the last few months... maybe if it progresses a bit further...
@venkatkrishna-j3c 5 months ago
Great work and explanation, sir. Thanks for sparing your valuable time and the code, but could you please add folders for documents and the DB in the code where we can add our own files? Sorry, I am not a SW guy; I just copy the open source code and try/run it. Thanks.
@technovangelist 5 months ago
Just add paths to the source docs file
@biboajc 2 months ago
Hello Matt, I know this video is already 3 months old, but I hope you can read and answer my comment. I'm building a software to extract information from documents, sometimes using OCR. What sets my software apart is how I use the extracted information. After I get the info in JSON format, I have an AI categorize each document based on a fixed list of categories. However, I often encounter grammatical errors in the text or incorrect category selections. I can manually correct these errors, but I want the model to "learn" from past mistakes and improve over time, mimicking user corrections. My idea is to create a database filled with mistakes and their corrections and somehow give the model access to this data for a pseudo-learning feature. I was considering using Retrieval-Augmented Generation (RAG) for this, though I have no experience with it. Given that the database will be constantly updated with new data, sometimes needing to override old data, and potentially facing other challenges, what approach would you recommend? I'm also open to entirely new approaches. I hope my English is clear enough to convey my message.
@GeorgAubele 5 months ago
Thanks for your video, I think I understand the process of embedding. Is there a way to use the embedded docs with an API call? I want to write a WinForms app in C#, and therefore an API call would come in handy.
@technovangelist 5 months ago
The full API is documented in the docs: https://github.com/ollama/ollama
@GeorgAubele 5 months ago
@technovangelist Yes, I know, but there is only a small chapter on generating embeddings, not on how to use them with the API.
@technovangelist 5 months ago
You wouldn't use them directly. You can generate the embedding, but then you need to put it somewhere; that is often a vector DB. The model can't do anything with the embedding itself. You use the embedding to do a similarity search and then use the source text in the model.
@GeorgAubele 5 months ago
@technovangelist OK, my question was misleading: I now have your scripts running, and I have the documents in ChromaDB. Is there a way to use the Ollama API to talk to my documents in the DB instead of using a Python script to do so? I wrote a small WinForms app in C# to talk to my models via the Ollama API, but I don't see a way to use the API to ground these conversations in my documents in the DB.
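To make the flow concrete: the retrieval happens in your application code (whatever language that is), and the Ollama API is only used for embeddings and generation. Below is a minimal Python sketch of the loop; the model names, question, and collection name are assumptions to swap for your own. The same sequence of HTTP calls can be made from C#; nothing here is Python-specific.

# Minimal RAG loop: embed the question, find similar chunks in Chroma, pass them to the model.
import ollama
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_collection("buildragwithpython")

question = "What does the document say about zoning setbacks?"  # hypothetical question
q_emb = ollama.embeddings(model="nomic-embed-text", prompt=question)["embedding"]

results = collection.query(query_embeddings=[q_emb], n_results=5)
context = "\n\n".join(results["documents"][0])

answer = ollama.generate(
    model="mistral",
    prompt=f"Answer using only this context:\n{context}\n\nQuestion: {question}",
)
print(answer["response"])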
@basantrajshakti4272 22 days ago
Hey, I have a deeply nested database which is very large. It can contain any level of nesting depending upon the user's needs. I want to convert it to vector embeddings such that each nested part can be accessed with a query like "show me the users where the task assigned to the user is cleaning, the task is completed, and the user has a birthday after 03.04.1990". So I am confused about how I need to convert it to a vector DB such that the main objects can be found. Whichever videos I have seen on YouTube only work with single-level datasets. But what if we have a multilevel database?
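One common way to handle multi-level data (a sketch, not something covered in the video) is to flatten each nested record into a short text document and keep the structured fields as metadata, so a query can combine semantic search with exact metadata filters. The field, collection, and model names below are made up:

# Flatten nested records into text plus metadata so Chroma can filter on exact fields.
import ollama
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection("users_tasks")  # hypothetical collection

record = {  # hypothetical nested record
    "user": {"name": "Asha", "birthday": "1992-06-14"},
    "task": {"name": "cleaning", "status": "completed"},
}
text = f"{record['user']['name']} was assigned {record['task']['name']}, status {record['task']['status']}."
emb = ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

collection.add(
    ids=["user-asha-task-1"],
    embeddings=[emb],
    documents=[text],
    metadatas=[{"task": "cleaning", "status": "completed", "birth_year": 1992}],
)

# Semantic search plus exact filters over the flattened fields.
q = ollama.embeddings(model="nomic-embed-text", prompt="users whose cleaning task is done")["embedding"]
hits = collection.query(
    query_embeddings=[q],
    n_results=5,
    where={"$and": [{"task": "cleaning"}, {"birth_year": {"$gt": 1990}}]},
)
print(hits["documents"])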
@shuntera 5 months ago
I think I watched you say you now use Obsidian. How about a video where you write some Python to ingest your Obsidian vault into a RAG setup, so an Ollama LLM can access the content?
@actellimQT 3 months ago
Obsidian already has an ollama plugin. No need.
@uskola 5 months ago
Thanks for this video
@marcinlesniak3642 5 months ago
Say we have a database like this, which includes medical criteria for different conditions, examples of cases, etc., and we want to use it as context for an LLM. Now we provide a description of a new case and prompt the model to compare the provided information with the database and suggest a proper diagnosis. Is RAG a good choice in this scenario? RAG + prompt engineering? A no-RAG solution? What would be your suggestion?
@technovangelist 5 months ago
I don’t know. Best way to find out is to try it.
@csepartha 5 months ago
Good explanation
@Sri_Harsha_Electronics_Guthik 5 months ago
What about HTML files? Strip them and use the text as is?
@Vera150607 5 months ago
So now the old .chm format of older digital books is getting its revenge.
@hcasavantes 3 months ago
Hey Matt, great content, thanks. But what if I want to do RAG with a SQL database (for structured data assistance, for example stock market prices)? Let's say I already have a local SQL RDBMS and stored procedures whose results can be injected into the model in order to produce analytic results. Can an AI model execute those stored procedures and use the results as input?
@technovangelist 3 months ago
The AI never executes any queries. But if the content is better suited to SQL, meaning it isn’t text and is already numerical, then great.
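To illustrate that split of responsibilities: the application runs the SQL (a stored procedure call would slot in the same way), formats the rows as text, and only then hands them to the model. A sketch using sqlite3 for brevity; the database, table, and model names are made up:

# The app runs the query; the model only sees the formatted result as context.
import sqlite3
import ollama

conn = sqlite3.connect("prices.db")  # hypothetical database
rows = conn.execute(
    "SELECT ticker, close FROM daily_prices WHERE day = ?", ("2024-05-01",)
).fetchall()

table = "\n".join(f"{ticker}: {close}" for ticker, close in rows)
answer = ollama.generate(
    model="mistral",
    prompt=f"Closing prices for 2024-05-01:\n{table}\n\nWhich ticker closed highest?",
)
print(answer["response"])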
@alibahrami6810 5 months ago
Awesome content ❤
@technovangelist 5 months ago
Thank you 🙌
@thesilentcitadel 5 months ago
Hi Matt, how would you approach dealing with code as the data you want to put into the vector store? I am thinking that sentence chunks might be function chunks?
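Function-level chunks are easy to produce for Python source with the standard library's ast module; this is only a sketch of the idea, not how the video's repo chunks text:

# Split a Python file into one chunk per top-level function or class,
# which keeps related logic together better than sentence-based chunking.
import ast

def code_chunks(path: str) -> list[str]:
    source = open(path).read()
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]

for chunk in code_chunks("import.py"):  # any Python file; import.py is just an example
    print(chunk.splitlines()[0], "...")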
@TauvicRitter 5 months ago
Hello Matt, wouldn't chunking knowledge into logical units like paragraphs or chapters be better than chopping after so many sentences? You could use an LLM and instruct it to do the chopping more intelligently, or use NLP software for that. Did you consider this?
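Paragraph-level chunking is simple to try; here is a sketch that splits on blank lines and merges short paragraphs up to a rough character budget (the 1500 limit is arbitrary):

# Chunk by paragraph (blank-line boundaries) instead of a fixed number of sentences.
import re

def paragraph_chunks(text: str, max_chars: int = 1500) -> list[str]:
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks

print(len(paragraph_chunks(open("source.txt").read())))  # hypothetical input file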
@sethjchandler 5 months ago
Maybe if enough people rightly trash PDF, it will stop being a dominant format for document distribution? I can dream, can’t I?
@thesilentcitadel 5 months ago
Hi Matt, further to the idea of chunks, with the use case being code as input to the RAG: how would you think about the context of related functions? I'm thinking the retrieval could miss the important interdependence of functions.
@technovangelist 5 months ago
Yes that is interesting. I was purely looking at English. I’m not sure how to look at code for this
5 months ago
Thanks - very good explanations! Would there be any advantage to building an Ollama RAG application using LangChain or LlamaIndex?
@technovangelist 5 months ago
Not for this. A much more complicated app might benefit but I haven’t seen it.
@ChristianCOLOT-rb4li 5 months ago
Hello Matt, thank you for this great video. I tried to implement your solution but I am facing "Connection refused" issues when using the ollama library. Are the embedding model and LLM dynamically downloaded from a website by your code, or should we download them ourselves before using it?
@technovangelist 5 months ago
If you are getting some sort of error when running ollama pull look into your network connection
@ShikharDadhich 5 months ago
I have a few doubts here:
1. The model always responds to a question, which means if I ask something outside the vector database, the LLM will respond using the knowledge on which it has been trained. Is there any way to handle this?
2. How do I identify a model suitable for RAG? I have tried multiple models; some are extremely slow, some are fast with low-quality output. I am unable to find the right model that can work for a large enterprise application.
3. Is RAG also good for document summarisation?
@technovangelist 5 months ago
If you don't want the model to respond when nothing was found in the DB, then don't ask the model if there are no results. Easy. Most models can respond well, but it's easy to get the chunk size wrong; too big or too small will result in bad output. Document summarization isn't really something RAG can help with.
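One way to implement "don't ask the model if there are no results" is to gate the call on the retrieval distances. The 1.0 cutoff below is arbitrary and depends on the embedding model, so treat this purely as a sketch:

# Only call the model when retrieval found something close enough.
import ollama
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_collection("buildragwithpython")

question = "What is the company's vacation policy?"  # hypothetical question
q_emb = ollama.embeddings(model="nomic-embed-text", prompt=question)["embedding"]
results = collection.query(query_embeddings=[q_emb], n_results=3)

if not results["documents"][0] or results["distances"][0][0] > 1.0:
    print("Nothing relevant found in the database.")
else:
    context = "\n\n".join(results["documents"][0])
    reply = ollama.generate(
        model="mistral",
        prompt=f"Using only this context:\n{context}\n\nQuestion: {question}",
    )
    print(reply["response"])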
@ovots 5 months ago
Hi Matt, I've cloned the project repo for this video, and I'm trying to play along, but I'm running my Ollama service on a separate machine, and I can't figure out where/how I'd specify that in either the config file or the individual ollama.embeddings() and ollama.generate() invocations. Sorry if I've missed something obvious. I have zero experience with Python.
@ovots 5 months ago
Solved: I needed to create a "custom client". I should have RTFM for the Python SDK more carefully. Guess I glossed over that the first time.
@ursamajor3275 5 months ago
Hi Matt, I have ChromaDB running in one terminal, and in another terminal I run: python3 import.py. However I get: Exception: {"error":"ValueError('Collection buildragwithpython does not exist.')"}
@ursamajor3275 5 months ago
I added """ try: chroma.delete_collection("buildragwithpython") except Exception as e: print("An error occurred:", e) """ in import.py and now I am seeing: """ /import.py", line 23, in chunks = chunk_text_by_sentences(source_text=text, sentences_per_chunk=7, overlap=0 ) Resource punkt not found. Please use the NLTK Downloader to obtain the resource: >>> import nltk >>> nltk.download('punkt') """
@ursamajor3275 5 months ago
# for nltk: /Applications/Python\ 3.12/Install\ Certificates.command
@technovangelist 5 months ago
Doh, you can't delete something that doesn't exist, and it won't exist till you run the app, which you can't run till the thing exists, which it won't until you run it... Fixed. Thanks for pointing that out.
@technovangelist 5 months ago
The certificates thing is weird; I definitely didn't need that. I wonder if that's a Windows thing.
@gokudomatic 5 months ago
I suppose that you prefer EPUB to PDF for printable file format. Right?
@technovangelist 5 months ago
Well, ideally txt or md. Even a docx, which is zipped XML, is better.
@gokudomatic 5 months ago
@technovangelist Oh, you mean the source document file. I thought you meant a processed file ready to print. I see what you mean. It can also be LaTeX or AsciiDoc.
@technovangelist 5 months ago
Well any format where the text is accessible as is. PDF obfuscates it
@johngoad 5 months ago
I am just happy you know that PDFs suck...
@primeq 4 months ago
🌟
@patricktang3377 5 months ago
Where is your Discord link, pls?
@technovangelist 5 months ago
I don't have a Discord, but the Discord for Ollama is Discord.gg/Ollama
@patricktang3377 5 months ago
@technovangelist Thanks. How do I update the list in the sourcedocs.txt file, please? I tried to just add a URL and save it, but received a 500 Server Error when I ran import.py. Do you know how to fix this?
@technovangelist 5 months ago
If it’s a 500 it’s probably not a valid url to a real server
@technovangelist 5 months ago
Plus it’s meant as a code sample so you can start building your own
@ClayShoaf 5 months ago
"useless tools like ... PyMuPDF" Hard disagree.
@mbarsot 5 months ago
Hi, great video, and I will try to use it for a project here at the Métropole de Nice. Source documents are .docx. A separate question: can you explain how to get copilot-like behaviour? Meaning: I ask Ollama "summary of the three top news items from CNN and corriere.it, in Italian and in a radio-newscast style", it performs a Google search on the two websites, puts everything in the prompt (or maybe builds embeddings? not clear), and gives me the answer.
@technovangelist 5 months ago
Docx is a bit better than PDF. Change the extension to zip and unzip it, and you have a bunch of XML files, which are much better to pull text out of. Not sure what you mean by copilot behavior; I have used that in VS Code, but I am not a Windows user.
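Pulling the text out of .docx without touching the XML by hand is also straightforward with python-docx (the library choice here is an assumption; pip install python-docx):

# Extract plain text from a .docx file, one paragraph at a time.
from docx import Document

def docx_text(path: str) -> str:
    doc = Document(path)
    return "\n".join(p.text for p in doc.paragraphs if p.text.strip())

print(docx_text("report.docx")[:500])  # hypothetical file; print a short preview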
@RedShipsofSpainAgain 5 months ago
Great video, Matt. This is so cool. One small suggestion: at 6:00, could you please use syntax highlighting in your code? The all-white font makes it hard to follow which functions you're importing from 3rd-party libraries vs UDFs. I think a color scheme similar to what VS Code uses in its default theme would help readability. Thanks again for the excellent videos.
@ErnestOak 5 months ago
Waiting for the typescript and pdf videos
@RichKetones 5 months ago
You’re breathtaking 😘
@GeandersonLenz 5 months ago
Where to host Ollama without spending millions of dollars hahaha
@technovangelist 5 months ago
On your machine is ideal. But I have another video that shows one option called Brev.dev. See Unlocking The Power Of GPUs For Ollama Made Simple! kzbin.info/www/bejne/h4PSpWSNqc6pp6s