Crazy FAST RAG | Ollama | Nomic Embedding Model | Groq API

11,656 views

Data Science Basics

1 day ago

Comments: 62
@davidtindell950
@davidtindell950 18 days ago
Thank you very much! I ran several comparisons between a local LLM ('llama3.1:8b') and the Groq LLM. Although the local LLM did provide good answers (with NO "I don't know"s), the Groq LLM appeared to provide better and more complete answers across several different PDFs!!!
@kushis4ever
@kushis4ever 8 days ago
@@davidtindell950 bro, which embeddings did you use? Are they free?
@davidtindell950
@davidtindell950 7 days ago
@@kushis4ever Hi! Hope you are doing well. Currently, the free local 'mxbai-embed-large' embeddings via Ollama! They seem to be providing better query results than 'nomic'!?!?
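For anyone wanting to try the same swap: changing the embedding model in the video's code is a one-line edit. A minimal sketch, assuming the OllamaEmbeddings wrapper used in the video and that the model has already been pulled:

    from langchain_community.embeddings import OllamaEmbeddings

    # Requires a one-time `ollama pull mxbai-embed-large` beforehand
    embeddings = OllamaEmbeddings(model="mxbai-embed-large")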
@sreeharinittur2043
@sreeharinittur2043 4 months ago
When I upload the PDF, this error appears: Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it')). Any thoughts?
@datasciencebasics
@datasciencebasics 4 months ago
Hello, have you installed Ollama? Is it running? If yes and the error still persists, try changing the embedding model, or close Ollama and run it again.
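A quick way to rule out the connection issue before uploading anything is to probe the Ollama server directly. A minimal sketch, assuming the default port from the video:

    import requests

    try:
        r = requests.get("http://localhost:11434", timeout=5)
        print(r.text)  # prints "Ollama is running" when the server is up
    except requests.exceptions.ConnectionError:
        print("Ollama is not running; start the desktop app or run `ollama serve`")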
@SantK1208
@SantK1208 5 months ago
This is awesome, finally I found it on KZbin. Kindly make more videos on this topic, some simple end-to-end projects. Many thanks ❤
@datasciencebasics
@datasciencebasics 5 months ago
You are welcome !!
@TechQuanta-rq7oc
@TechQuanta-rq7oc 5 months ago
I cloned the repo and get this error when I upload the files: 1 validation error for LLMChain llm: Can't instantiate abstract class BaseLanguageModel with abstract methods agenerate_prompt, apredict, apredict_messages, generate_prompt, predict, predict_messages (type=type_error)
@Tobi-cf5tm
@Tobi-cf5tm 4 months ago
Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it')). Any help here please?
@Tobi-cf5tm
@Tobi-cf5tm 4 months ago
My mistake, I had not installed Ollama. I now get this: Error raised by inference API HTTP code: 404, {"error":"model 'nomic-embed-text' not found, try pulling it first"}
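For reference, that 404 is telling you the fix: run `ollama pull nomic-embed-text` once in a terminal so the embedding model exists locally before the app calls the /api/embeddings endpoint.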
@thaiscvo
@thaiscvo 6 months ago
How can I make the app save the embedding data (as an ingest.py step) permanently?
@datasciencebasics
@datasciencebasics 6 months ago
Hello, you can persist the embeddings. Please watch the other videos on this channel; I have explained how to achieve this.
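One way to do that with the Chroma store used in this video is to give it a persist_directory, so the index is written to disk and reloaded instead of rebuilt on every run. A minimal sketch, assuming the texts/metadatas variables from the video's code (exact persistence behaviour varies by chromadb version):

    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import Chroma

    embeddings = OllamaEmbeddings(model="nomic-embed-text")

    # First run: embed once and save the index to disk
    docsearch = Chroma.from_texts(
        texts, embeddings, metadatas=metadatas, persist_directory="./chroma_db"
    )

    # Later runs: load the saved index instead of re-embedding
    docsearch = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)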
@thaiscvo
@thaiscvo 6 months ago
@@datasciencebasics you have many vids, please just point me to the right one.
@datasciencebasics
@datasciencebasics 5 months ago
Please watch my last two videos, you will get the answer :)
RAG with LlamaParse, Qdrant and Groq | Step By Step: kzbin.info/www/bejne/rWikoWmdj6uLopI
RAG With LlamaParse from LlamaIndex & LangChain 🚀: kzbin.info/www/bejne/nGrLp6Wni9uNopI
@ginisksam
@ginisksam 6 months ago
Thanks for the good tutorial. Additionally, I had to type "mistral:instruct" to run the local Ollama model, instead of with a "-" in between, when declaring the llm. The manifest ID is the same for mistral and mistral:instruct in the output of 'ollama list'. Did you notice that? I saw the improvement in search, especially since you added the metadatas argument during embedding this time around. Groq is a game-changer. If they decide to charge and we want to speed things up, I hope they will bill users based on time per API call rather than tokens. Otherwise, local CPU-offline will suffice, with coffee sips in between. You are always ahead of the game... Keep it up. Cheers.
@datasciencebasics
@datasciencebasics 6 months ago
You are welcome. Yes, you are correct. I indeed used "mistral:instruct" later while running the code but forgot to mention it and update the code. Thanks for trying and noticing it. Indeed, adding metadata helps, and Groq is a game changer, good times :)
@TheColdharbour
@TheColdharbour 6 months ago
I’m sure someone deleted my helpful comment before. Probably would have saved quite a few people a lot of time. Still very grateful for the guide. 👍
@medoai
@medoai 3 months ago
Error: did not find groq_api_key, please add an environment variable 'GROQ_API_KEY' which contains it.
@karthikgowda9530
@karthikgowda9530 3 months ago
The file name should be exactly ".env" (it does look like the gear icon), and 'GROQ_API_KEY' shouldn't be put in quotes. If you already tried that, then maybe just upload the error and the related files to ChatGPT, it's free now.
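For anyone hitting the same error, a minimal sketch of the expected setup, assuming the python-dotenv package:

    # .env file in the project root (no quotes, no spaces around '='):
    #   GROQ_API_KEY=gsk_your_key_here
    import os
    from dotenv import load_dotenv

    load_dotenv()  # reads .env into the process environment
    assert os.getenv("GROQ_API_KEY"), "GROQ_API_KEY is still missing"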
@saeednsp1486
@saeednsp1486 6 months ago
Hi man, do you think this nomic embedding model is better than the bge-m3 embedding model? I saw its size and it's very light compared to bge-m3, which is 5GB. Please answer me.
@FawziBreidi
@FawziBreidi 2 months ago
Which local embedding model is the best on the market, regardless of size?
@kushis4ever
@kushis4ever 15 days ago
Boss, how many PDFs can we process at once? Is it one by one, or can I feed it like 10 PDFs?
@datasciencebasics
@datasciencebasics 12 days ago
With the code shown in this video demo, it's one PDF. You can merge multiple PDFs and upload that, or change the default parameters: docs.chainlit.io/api-reference/ask/ask-for-file
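A sketch of the second option, assuming chainlit's AskFileMessage as used in the video (max_files is the parameter documented at the link above):

    files = await cl.AskFileMessage(
        content="Please upload up to 10 PDF files to begin!",
        accept=["application/pdf"],
        max_files=10,      # allow several PDFs instead of the default single file
        max_size_mb=100,
        timeout=180,
    ).send()

    for file in files:     # then process each uploaded PDF in turn
        ...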
@himenatika
@himenatika 5 months ago
I am trying to exclude LangChain from my solutions and want to try LlamaIndex. Is it a possible alternative?
@datasciencebasics
@datasciencebasics 5 months ago
Yes it is. Exploring another framework is a good way to learn new stuff. At least for the data part, LlamaIndex is easier to implement. Or you can combine both.
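For a feel of the difference, a minimal LlamaIndex sketch (assuming the llama-index >= 0.10 package layout and the default embedding/LLM configuration):

    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

    documents = SimpleDirectoryReader("data").load_data()  # e.g. a folder of PDFs
    index = VectorStoreIndex.from_documents(documents)
    response = index.as_query_engine().query("What is this document about?")
    print(response)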
@AjayYadav-xi9sj
@AjayYadav-xi9sj 6 months ago
Can I run this RAG application on my local machine, a MacBook Pro M1, or does it require a high-end system?
@datasciencebasics
@datasciencebasics 6 months ago
It should work pretty well, give it a try!!
@gsayak
@gsayak 6 months ago
There were many sections where the application replied with "content not found". Which part do you think is affecting the results? The LLM itself, or maybe there is an issue with the nomic embedding model?
@datasciencebasics
@datasciencebasics 6 months ago
That's a good find. That might also be the case, but I can't confirm without testing. One idea is to just view the top-k retrieved chunks and see which embeddings perform best.
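A minimal sketch of that idea, assuming the docsearch Chroma store from the video: if the right chunk never shows up in the top k, the embeddings or chunking are the problem rather than the LLM.

    docs = docsearch.similarity_search("your question here", k=4)
    for d in docs:
        print(d.metadata.get("source"), d.page_content[:120])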
@daven9517
@daven9517 5 months ago
I tried it on localhost and keep getting this error right after uploading the PDF: Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
@datasciencebasics
@datasciencebasics 5 months ago
Do you have Ollama installed and running?
@user-pg7tj4vp2d
@user-pg7tj4vp2d 6 months ago
What makes nomic generate embeddings so quickly compared to ada?
@datasciencebasics
@datasciencebasics 6 months ago
Well, it's better than ada according to them; whether it's quicker, I haven't checked. It is quicker than other open-source embedding models.
@Alkotas25
@Alkotas25 5 months ago
Hi, is this RAG solution (Groq) also compatible with OpenAI GPT models, or just with open-source LLMs?
@datasciencebasics
@datasciencebasics 5 months ago
Open-source LLMs only, and right now that means mixtral, llama2 and gemma.
@thaiscvo
@thaiscvo 6 months ago
How can I make it accept other document formats besides PDF?
@datasciencebasics
@datasciencebasics 6 months ago
You can use a for loop to accept different file formats and process each file separately.
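A sketch of that loop, using pypdf and the standard csv module as stand-in readers (any loader that yields plain text will work):

    import csv
    from pypdf import PdfReader

    def file_to_text(path: str, name: str) -> str:
        # Route each uploaded file by extension before chunking/embedding
        if name.lower().endswith(".pdf"):
            reader = PdfReader(path)
            return " ".join(page.extract_text() or "" for page in reader.pages)
        if name.lower().endswith(".csv"):
            with open(path, newline="", encoding="utf-8") as f:
                return " ".join(" ".join(row) for row in csv.reader(f))
        with open(path, "r", encoding="utf-8") as f:  # fall back to plain text
            return f.read()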
@AlexUnder_BR
@AlexUnder_BR 4 months ago
Hey, could this tutorial work with the new llama3 70B on groq? If not, could you please make a video about it?
@datasciencebasics
@datasciencebasics 4 months ago
Hello, it should work. Give it a try!!
@AlexUnder_BR
@AlexUnder_BR 4 months ago
@@datasciencebasics thank you Sir.
@teddyfulk
@teddyfulk 6 months ago
Just stumbled across your channel. I have been having the same sort of issue with local models not answering questions correctly once given the context/sources. Sometimes the correct context/sources don't appear, and sometimes they do, like in your case. Then the LLM tries to appease the query and makes something up. So from that analysis, sometimes the model doesn't get the right context, and sometimes it does get the right context but can't figure out the right answer. I have noticed that if the question is unique enough, it will get the right answer. What are your thoughts on this? Have you experienced such things?
@datasciencebasics
@datasciencebasics 6 months ago
There are many things that can go wrong (data quality / embedding model / LLM). Garbage in, garbage out: having good-quality data (good cleaning and chunking strategies) might help.
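For experimenting with the chunking side of that, the splitter from the video's code is the main knob. A sketch (good values depend on the documents; pdf_text here stands for the extracted document text):

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # Smaller chunks with more overlap often retrieve more precisely,
    # at the cost of more embeddings to compute and store.
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    texts = text_splitter.split_text(pdf_text)  # pdf_text: the extracted document text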
@JuMPeRaBG
@JuMPeRaBG 4 months ago
How do you make the RAG accurate? We don't need fast answers and API calls with Groq so much as a good embedding model and valid data.
@datasciencebasics
@datasciencebasics 4 months ago
I have created a playlist where I explain how to build better RAG; please refer to that: kzbin.info/aero/PLz-qytj7eIWUiNRSaQ0Y3u-kg60NKVjrf&si=Pknolrle_mi6Apnv
@SantK1208
@SantK1208 5 months ago
Suppose I have a bunch of PDF files. Can we keep them in one folder and do the embedding as you mentioned?
@datasciencebasics
@datasciencebasics 5 months ago
This video might be great for you: RAG With LlamaParse from LlamaIndex & LangChain 🚀 kzbin.info/www/bejne/nGrLp6Wni9uNopI
@IamalwaysOK
@IamalwaysOK 6 months ago
Excellent video as usual, thanks a lot. Do you know if we can print the complete prompt (including sources and instructions for RAG) that is being sent to the LLM from the chainlit framework? Do you have any reference to that in the code?
@datasciencebasics
@datasciencebasics 6 months ago
Integrating LangSmith or a similar tracing platform into the project might solve the problem you are describing.
@clementtimothyjudehorak1456
@clementtimothyjudehorak1456 5 months ago
Amazing video, I have been looking for something like this for a while. However, I have a question: I am trying to get this RAG app into a production-ready state, meaning I need to call the Nomic API and set up a ChromaDB API. Chroma is easy enough to set up for a remote connection, but I am really struggling to call Nomic through its API interface, especially in the context of this program. Have you tried this before, or do you know somebody who has?
@datasciencebasics
@datasciencebasics 5 months ago
Here I am using Nomic via Ollama. You can try using the embedding model directly from Nomic's production-ready API: docs.nomic.ai/reference/endpoints/nomic-embed-text
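A sketch of calling the hosted endpoint through the nomic Python client; the exact signature is an assumption on my part, so check docs.nomic.ai for the current API, and it needs a NOMIC_API_KEY in the environment:

    from nomic import embed  # pip install nomic

    result = embed.text(
        texts=["How fast is Groq?"],
        model="nomic-embed-text-v1.5",
        task_type="search_query",
    )
    vectors = result["embeddings"]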
@user-xp7oj3eg4g
@user-xp7oj3eg4g 5 months ago
@@datasciencebasics Sir, can you please make a video where the Nomic API is used? It's a request.
@THE-AI_INSIDER
@THE-AI_INSIDER 6 months ago
Is Groq completely free? If yes, until when?
@datasciencebasics
@datasciencebasics 6 months ago
As of now, yes. Until when, no idea 🤷
@THE-AI_INSIDER
@THE-AI_INSIDER 6 months ago
Can you do the same video for uploading multiple files of the same type in the same piece of code, e.g. uploading multiple CSV or PDF files? I think we can use max_files in the decorator, but if you could do that and modify the whole piece of code it would be great!

    # Wait for the user to upload a file
    while files is None:
        files = await cl.AskFileMessage(
            content="Please upload a CSV file to begin!",
            accept=["text/csv"],
            max_size_mb=100,  # Optionally limit the file size
            timeout=180,  # Set a timeout for user response
        ).send()

    file = files[0]  # Get the first uploaded file

    # Sending an image with the local file path
    elements = [
        cl.Image(name="image", display="inline", path="pic.jpg")
    ]

    # Inform the user that processing has started
    msg = cl.Message(content=f"Processing `{file.name}`...", elements=elements)
    await msg.send()

    # Read the CSV file
    with open(file.path, 'r', encoding='utf-8') as csv_file:
        csv_reader = csv.reader(csv_file)
        csv_text = " ".join([" ".join(row) for row in csv_reader])

    # Split the text into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=50)
    texts = text_splitter.split_text(csv_text)

    # Create a metadata for each chunk
    metadatas = [{"source": f"{i}-pl"} for i in range(len(texts))]

    # Create a Chroma vector store
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    docsearch = await cl.make_async(Chroma.from_texts)(
        texts, embeddings, metadatas=metadatas
    )

The code above is from one of your videos!
@inprogs
@inprogs 6 months ago
At kzbin.info/www/bejne/in7Egqdup7NomKcsi=rcb0hPNzIma51T3I&t=788 you never changed the Groq API call to use llama, and I think you were using mixtral until the end of the presentation with Groq.
@datasciencebasics
@datasciencebasics 6 months ago
Hello, I was using nomic-embed-text as the embedding model, which I downloaded locally, and mixtral from Groq as the LLM to provide the answer. You can use llama2 if you want.
@patricktang3377
@patricktang3377 4 months ago
In the config files, can you tell me where Ollama is running? Is it port 8000? I received an error message. This is the error at the front: "Ollama call failed with status code 404", and this is the error at the back: "async for stream_resp in self._acreate_stream( File "C:\Users\patta\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_community\llms\ollama.py", line 305, in _acreate_stream raise OllamaEndpointNotFoundError( langchain_community.llms.ollama.OllamaEndpointNotFoundError: Ollama call failed with status code 404." The Chainlit chatbot is at port 8000, so where is Ollama? Which port? How did you configure it in your code? I don't see this information in app.py. How did you set up the Ollama port on localhost?
@datasciencebasics
@datasciencebasics 4 months ago
Ollama always runs on port 11434 on localhost once you start it. Please make yourself familiar with Ollama via the website or GitHub; knowing a bit more about the project will make things easier: ollama.com/ github.com/ollama/ollama
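In code, the port only needs to be stated when it differs from that default, and a 404 (as opposed to a connection refused) usually means the requested model has not been pulled yet. A minimal sketch with base_url made explicit, assuming the langchain_community wrappers from the video:

    from langchain_community.llms import Ollama
    from langchain_community.embeddings import OllamaEmbeddings

    llm = Ollama(model="mistral:instruct", base_url="http://localhost:11434")
    embeddings = OllamaEmbeddings(
        model="nomic-embed-text", base_url="http://localhost:11434"
    )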
@patricktang3377
@patricktang3377 4 months ago
@@datasciencebasics Thanks. I have tried to debug this for hours but wasn't able to. It is so weird that I would get this error, given that the same code appears to work fine on your end. I was, however, able to make the langchain_groq_chainlit code work, just not this one. I don't understand.
@TooyAshy-100
@TooyAshy-100 6 months ago
Thank you, keep going
@datasciencebasics
@datasciencebasics 6 months ago
You are welcome!!