Crazy FAST RAG | Ollama | Nomic Embedding Model | Groq API

13,747 views

Data Science Basics

A day ago

Comments
@davidtindell950 4 months ago
Thank you very much! I ran several comparisons between a local LLM ('llama3.1:8b') and the Groq LLM. Although the local LLM did provide good answers (with no "I don't know"s), the Groq LLM appeared to provide better and more complete answers across several different PDFs!
@kushis4ever 4 months ago
@davidtindell950 Bro, which embeddings did you use? Are they free?
@davidtindell950 4 months ago
@kushis4ever Hi! Hope you are doing well. Currently I use the local, free 'mxbai-embed-large' embeddings via Ollama. They seem to be providing better query results than 'nomic'!
@SantK1208 9 months ago
This is awesome, finally I found it on YouTube. Kindly make more videos on this topic — some simple end-to-end projects. Many thanks ❤
@datasciencebasics 9 months ago
You are welcome!!
@sreeharinittur2043 8 months ago
When I upload the PDF, this error appears: "Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it)". Any thoughts?
@datasciencebasics 8 months ago
Hello, have you installed Ollama? Is it running? If yes and the error still persists, try changing the embedding model, or close Ollama and run it again.
@TechQuanta-rq7oc 9 months ago
I cloned the repo and get this error when I upload the files: "1 validation error for LLMChain llm: Can't instantiate abstract class BaseLanguageModel with abstract methods agenerate_prompt, apredict, apredict_messages, generate_prompt, predict, predict_messages (type=type_error)"
@thaiscvo 9 months ago
How can I make the app save the embedding data permanently (as in an ingest.py)?
@datasciencebasics 9 months ago
Hello, you can persist the embeddings. Please watch the other videos on this channel; I have explained how to achieve that.
@thaiscvo 9 months ago
@datasciencebasics You have many videos, please just point out which one.
@datasciencebasics 9 months ago
Please watch my last two videos, you will get the answer :) RAG with LlamaParse, Qdrant and Groq | Step By Step kzbin.info/www/bejne/rWikoWmdj6uLopI and RAG With LlamaParse from LlamaIndex & LangChain 🚀 kzbin.info/www/bejne/nGrLp6Wni9uNopI
@ginisksam 10 months ago
Thanks for the good tutorial. Additionally, I had to type "mistral:instruct" (with a colon, not a "-") to declare the LLM when running local Ollama. The manifest ID is the same for mistral and mistral:instruct in 'ollama list' — did you notice that? I saw the improvement in search, especially since you added the metadatas argument during embedding this time around. Groq is a game-changer. I hope they charge users based on time per API call rather than tokens, if they decide to charge and we want to speed things up. Otherwise local CPU offline will suffice, with coffee sips in between. You are always ahead of the game... Keep it up. Cheers.
@datasciencebasics 10 months ago
You are welcome. Yes, you are correct. I indeed used "mistral:instruct" later while running the code but forgot to mention it and update the code. Thanks for trying it and noticing. Indeed, adding metadata helps, and Groq is a game changer. Good times :)
@TheColdharbour 10 months ago
I'm sure someone deleted my helpful comment before. It probably would have saved quite a few people a lot of time. Still very grateful for the guide. 👍
@FawziBreidi 6 months ago
Which local embedding model is the best on the market, regardless of size?
@Tobi-cf5tm 8 months ago
"Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it)". Any help here, please?
@Tobi-cf5tm 8 months ago
My mistake, I had not installed Ollama. I now get this: "Error raised by inference API HTTP code: 404, {"error":"model 'nomic-embed-text' not found, try pulling it first"}"
@thaiscvo 9 months ago
How can I make it accept other document formats besides PDF?
@datasciencebasics 9 months ago
You can use a for loop to accept different file formats and process each file separately.
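The reply above can be sketched as a simple dispatch on file extension. A minimal, hypothetical illustration — the reader functions and names here are stand-ins, not the actual loaders used in the video:

```python
from pathlib import Path

# Stand-in readers; a real app would use a PDF library, csv module, etc.
def read_pdf(path):
    return f"pdf text from {path}"

def read_txt(path):
    return Path(path).read_text(encoding="utf-8")

# Map each supported extension to its reader
READERS = {".pdf": read_pdf, ".txt": read_txt}

def load_files(paths):
    texts = []
    for p in paths:
        reader = READERS.get(Path(p).suffix.lower())
        if reader is None:
            print(f"Skipping unsupported file: {p}")
            continue
        texts.append(reader(p))
    return texts
```

Adding a new format is then just one more entry in the READERS dict.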
@daven9517 9 months ago
I tried on localhost and keep getting this error right after uploading the PDF: "Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it)"
@datasciencebasics 9 months ago
Do you have Ollama installed and running?
@AlexUnder_BR 8 months ago
Hey, would this tutorial work with the new Llama 3 70B on Groq? If not, could you please make a video about it?
@datasciencebasics 8 months ago
Hello, it should work. Give it a try!!
@AlexUnder_BR 8 months ago
@datasciencebasics Thank you, sir.
@gsayak 10 months ago
As there were many sections where the application replied "content not found", which part do you think is affecting the results? The models themselves, or maybe an issue with the nomic embedding model?
@datasciencebasics 10 months ago
That's a good find. That might be the case, but I can't confirm without testing. One idea is to just view the top k retrieved chunks and see which embeddings perform best.
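Inspecting the top k boils down to ranking the stored chunk embeddings by cosine similarity against the query embedding. A minimal sketch of that ranking, using toy vectors rather than real model output:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=3):
    # Return the indices of the k chunks most similar to the query
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

Running this against the same query with vectors from two different embedding models lets you compare which model surfaces the relevant chunks.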
@teddyfulk 10 months ago
Just stumbled across your channel. I have been having the same sort of issue with local models not answering questions correctly once given the context/sources. Sometimes the correct context/sources don't appear, and sometimes they do, like in your case; then the LLM tries to appease the query and makes something up. So from that analysis, sometimes the model doesn't get the right context, and sometimes it gets the right context but can't figure out the right answer. I have noticed that if the question is unique enough, it gets the right answer. What are your thoughts on this? Have you experienced such things?
@datasciencebasics 10 months ago
There are many things that can go wrong (data quality / embedding model / LLM). Garbage in, garbage out — having good-quality data (good cleaning and chunking strategies) might help.
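As a rough illustration of one chunking strategy, here is a naive fixed-size chunker with overlap. The sizes are arbitrary, and real pipelines would typically use something like LangChain's RecursiveCharacterTextSplitter instead:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    # Slide a fixed-size window over the text; each chunk repeats the
    # last `overlap` characters of the previous one so context at the
    # chunk boundary is not lost.
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

The overlap matters because a sentence split exactly at a chunk boundary would otherwise be unretrievable as a whole.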
@medoai 7 months ago
error: did not find groq_api_key, please add an environment variable 'GROQ_API_KEY' which contains it.
@karthikgowda9530 7 months ago
The file name should be ".env" (it can look like a gear icon in your file browser), and the 'GROQ_API_KEY' value shouldn't be wrapped in quotes. If you already tried that, then maybe just paste the error and the related files into ChatGPT — it's free now.
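The gist of the exchange above, sketched in Python: the app expects the key in the process environment, and a library like python-dotenv would populate the environment from a file literally named ".env". The helper function name here is hypothetical:

```python
import os

def get_groq_key():
    # With python-dotenv, load_dotenv() would first copy KEY=value lines
    # from a file named ".env" into os.environ; here we read it directly.
    key = os.environ.get("GROQ_API_KEY")
    if not key:
        raise RuntimeError(
            "Set GROQ_API_KEY in your environment or in a .env file")
    return key
```

A matching .env line would be GROQ_API_KEY=gsk_... with no quotes around the value.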
@kushis4ever 4 months ago
Boss, how many PDFs can we process at once? Is it one by one, or can I feed it, say, 10 PDFs?
@datasciencebasics 4 months ago
With the code shown in this video demo, it's one PDF. You can merge multiple PDFs and upload that, or change the default parameters: docs.chainlit.io/api-reference/ask/ask-for-file
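Instead of merging the PDF files on disk, one workaround is to loop over the extracted texts and tag each with its source file before embedding, so answers can still cite the right document. A minimal, hypothetical sketch:

```python
def combine_docs(docs_by_name):
    # docs_by_name maps a file name to its extracted text,
    # e.g. {"a.pdf": "text of a...", "b.pdf": "text of b..."}.
    # Returns parallel lists suitable for a vector-store "from_texts" call.
    texts, metadatas = [], []
    for name, text in docs_by_name.items():
        texts.append(text)
        metadatas.append({"source": name})
    return texts, metadatas
```

The texts/metadatas pair mirrors the shape that Chroma.from_texts accepts in the video's code.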
@saeednsp1486 10 months ago
Hi man, do you think this nomic embedding model is better than the bge-m3 embedding model? I saw its size and it's very light compared to bge-m3, which is 5 GB. Please answer me.
@AjayYadav-xi9sj 10 months ago
Can I run this RAG application on my local machine, a MacBook Pro M1? Does it require a high-end system?
@datasciencebasics 10 months ago
It should work pretty well, give it a try!!
@MiheerPrakash 10 months ago
What makes nomic generate embeddings so quickly compared to ada?
@datasciencebasics 10 months ago
Well, it's better than ada according to them; whether it's quicker than ada, I haven't checked. It is quicker than other open-source embedding models.
@SantK1208 9 months ago
Suppose I have a bunch of PDF files — can we keep them in one folder and do the embedding as you mentioned?
@datasciencebasics 9 months ago
This video might be great for you: RAG With LlamaParse from LlamaIndex & LangChain 🚀 kzbin.info/www/bejne/nGrLp6Wni9uNopI
@himenatika 9 months ago
I am trying to exclude LangChain from my solutions and want to try LlamaIndex. Is it a possible alternative?
@datasciencebasics 9 months ago
Yes, it is. Exploring another framework is a good way to learn new stuff. At least for the data part, LlamaIndex is easier to implement. Or you can combine both.
@IamalwaysOK 10 months ago
Excellent video as usual, thanks a lot. Do you know if we can print the complete prompt (including sources and instructions for RAG) that is being sent to the LLM from the Chainlit framework? Do you have any reference to that in the code?
@datasciencebasics 10 months ago
Integrating LangSmith or a similar platform into the project might solve what you are trying to achieve.
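For intuition on what such tracing would show: a stuff-style RAG prompt is usually just the retrieved chunks pasted into a template around the question. A simplified, hypothetical sketch — not the exact template the chain in the video uses:

```python
# Hypothetical stand-in for the prompt a stuff-documents RAG chain assembles
TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}"""

def build_prompt(chunks, question):
    # Join the retrieved chunks and fill the template,
    # which is roughly what gets sent to the LLM
    context = "\n\n".join(chunks)
    return TEMPLATE.format(context=context, question=question)
```

Printing the result of such a function (or viewing the equivalent trace in LangSmith) shows exactly which sources the model saw.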
@Alkotas25 9 months ago
Hi, is this RAG solution (Groq) also compatible with OpenAI GPT models, or just with open-source LLMs?
@datasciencebasics 9 months ago
Open-source LLMs only, and right now: mixtral, llama2 and gemma.
@clementtimothyjudehorak1456 9 months ago
Amazing video, I have been looking for something like this for a while. However, I have a question: I am trying to take this RAG app to a production-ready state, meaning I need to call the Nomic API and set up the ChromaDB API. Chroma is easy enough to set up for remote connection, but I am really struggling to call Nomic through its API interface, especially in the context of this program. Have you tried this before, or do you know somebody who has?
@datasciencebasics 9 months ago
Here I am using Nomic via Ollama. You can try to use the embedding model directly from Nomic's production-ready API: docs.nomic.ai/reference/endpoints/nomic-embed-text
@MohammedFaisal-l9q 9 months ago
@datasciencebasics Sir, can you please make a video where the Nomic API is used? It's a request.
@JuMPeRaBG 8 months ago
How do you make the RAG accurate? We don't need fast answers and API calls with Groq, but rather a good embedding model and valid data.
@datasciencebasics 8 months ago
I have created a playlist where I explain how to build better RAG, please refer to that: kzbin.info/aero/PLz-qytj7eIWUiNRSaQ0Y3u-kg60NKVjrf&si=Pknolrle_mi6Apnv
@THE-AI_INSIDER 10 months ago
Is Groq completely free? If yes, until when?
@datasciencebasics 10 months ago
As of now, yes. Until when, no idea 🤷
@THE-AI_INSIDER 9 months ago
Can you do the same video but uploading multiple files of the same type in the same piece of code, e.g. multiple CSV or PDF files? I think we can use max_files in the decorator, but if you could do that and modify the whole piece of code it would be great!

    # Wait for the user to upload a file
    while files is None:
        files = await cl.AskFileMessage(
            content="Please upload a CSV file to begin!",
            accept=["text/csv"],
            max_size_mb=100,  # Optionally limit the file size
            timeout=180,  # Set a timeout for user response
        ).send()
    file = files[0]  # Get the first uploaded file

    # Sending an image with the local file path
    elements = [
        cl.Image(name="image", display="inline", path="pic.jpg")
    ]

    # Inform the user that processing has started
    msg = cl.Message(content=f"Processing `{file.name}`...", elements=elements)
    await msg.send()

    # Read the CSV file
    with open(file.path, 'r', encoding='utf-8') as csv_file:
        csv_reader = csv.reader(csv_file)
        csv_text = " ".join([" ".join(row) for row in csv_reader])

    # Split the text into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=50)
    texts = text_splitter.split_text(csv_text)

    # Create a metadata for each chunk
    metadatas = [{"source": f"{i}-pl"} for i in range(len(texts))]

    # Create a Chroma vector store
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    docsearch = await cl.make_async(Chroma.from_texts)(
        texts, embeddings, metadatas=metadatas
    )

The code above is from one of your videos!
@inprogs 10 months ago
At kzbin.info/www/bejne/in7Egqdup7NomKcsi=rcb0hPNzIma51T3I&t=788 you never changed the Groq API to use llama, and I think you were using mixtral until the end of the presentation with Groq.
@datasciencebasics 10 months ago
Hello, I was using nomic-embed-text (downloaded locally) as the embedding model, and mixtral from Groq as the LLM to provide the answer. You can use llama2 if you want.
@patricktang3377 8 months ago
In the config files, can you tell me where Ollama is running? Is it port 8000? I received an error message. This is the error at the front: "Ollama call failed with status code 404", and this is the error at the back: "async for stream_resp in self._acreate_stream( File "C:\Users\patta\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_community\llms\ollama.py", line 305, in _acreate_stream raise OllamaEndpointNotFoundError( langchain_community.llms.ollama.OllamaEndpointNotFoundError: Ollama call failed with status code 404." The Chainlit chatbot is at port 8000, so where is Ollama? Which port? How did you configure it in your code? I don't see this information in app.py. How did you set up the Ollama port on localhost?
@datasciencebasics 8 months ago
Ollama always runs on port 11434 on localhost once you run it. Please make yourself familiar with Ollama via the website or GitHub; knowing a bit more about the project will make things easier. ollama.com/ github.com/ollama/ollama
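A small sketch of how one could verify that from Python before starting the app, using Ollama's /api/tags endpoint (which lists the locally pulled models when the server is up). The function name here is made up for illustration:

```python
import urllib.error
import urllib.request

def ollama_running(host="localhost", port=11434, timeout=2):
    # Try the Ollama HTTP API; any connection failure means the
    # server is not reachable at that host/port.
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False
```

If this returns False, the 404/connection errors quoted in the comments above are expected; if it returns True but a specific model 404s, the model likely still needs to be pulled (e.g. with "ollama pull nomic-embed-text").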
@patricktang3377 8 months ago
@datasciencebasics Thanks. I tried to debug this for hours but wasn't able to. It is so weird that I would get this error, given that the same code apparently works fine on your end. I was, however, able to make the langchain_groq_chainlit code work, just not this one. I don't understand.
@TooyAshy-100 10 months ago
Thank you, keep going.
@datasciencebasics 10 months ago
You are welcome!!