Thank you very much! I ran several comparisons between a local_llm ('llama3.1:8b') and llm_groq. Although the local llm did provide good answers (with NO "I don't know"s), the Groq llm appeared to provide better and more complete answers across several different PDFs !!!
@kushis4ever · 4 months ago
@@davidtindell950 bro, which embeddings did you use? Are they free?
@davidtindell950 · 4 months ago
@@kushis4ever Hi! Hope you are doing well. Currently, the local free 'mxbai-embed-large' embeddings via Ollama! They seem to be providing better query results than "nomic" !?!?
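For anyone comparing embedding models, a minimal sketch of swapping them via Ollama (assuming both models have already been pulled locally, e.g. with `ollama pull mxbai-embed-large`):

from langchain_community.embeddings import OllamaEmbeddings

# Both models are drop-in replacements for each other; only the name changes.
embeddings = OllamaEmbeddings(model="mxbai-embed-large")  # or model="nomic-embed-text"
vector = embeddings.embed_query("What is retrieval-augmented generation?")
print(len(vector))  # dimensionality differs between embedding models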
@SantK1208 · 9 months ago
This is awesome, finally I found it on KZbin. Kindly make more videos on this topic, some end-to-end simple projects. Many thanks ❤
@datasciencebasics · 9 months ago
You are welcome !!
@sreeharinittur2043 · 8 months ago
When I upload the PDF, this error appears: Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it')). Any thoughts?
@datasciencebasics · 8 months ago
Hello, have you installed Ollama? Is it running? If yes and the error still persists, try changing the embedding model, or close Ollama and run it again.
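Since this connection error comes up repeatedly in the thread, a quick sanity check is to hit Ollama's default local endpoint directly (a minimal sketch, assuming the `requests` package is installed; 11434 is Ollama's default port):

import requests

# Ollama answers with the plain-text heartbeat "Ollama is running" at its root URL.
try:
    resp = requests.get("http://localhost:11434", timeout=5)
    print(resp.status_code, resp.text)  # expect: 200 Ollama is running
except requests.exceptions.ConnectionError:
    print("Nothing is listening on port 11434 -- start Ollama first, e.g. `ollama serve`.")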
@TechQuanta-rq7oc · 9 months ago
I cloned the repo and get this error when I upload the files: 1 validation error for LLMChain llm: Can't instantiate abstract class BaseLanguageModel with abstract methods agenerate_prompt, apredict, apredict_messages, generate_prompt, predict, predict_messages (type=type_error)
@thaiscvo · 9 months ago
How can I make the app save the embedding data permanently (e.g. via an ingest.py)?
@datasciencebasics · 9 months ago
Hello, you can persist the embeddings. Please watch the other videos on this channel; I have explained how to achieve that.
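For a rough idea of what persisting looks like with the Chroma store used in this video (a sketch only; the `db` directory name and the sample texts are illustrative):

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

texts = ["chunk one of the document", "chunk two of the document"]  # example chunks
metadatas = [{"source": "0-pl"}, {"source": "1-pl"}]

embeddings = OllamaEmbeddings(model="nomic-embed-text")

# persist_directory writes the index to disk instead of keeping it only in memory
# (older LangChain/Chroma versions also need an explicit docsearch.persist() call).
docsearch = Chroma.from_texts(texts, embeddings, metadatas=metadatas,
                              persist_directory="db")

# On a later run, load the saved index instead of re-embedding everything.
docsearch = Chroma(persist_directory="db", embedding_function=embeddings)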
@thaiscvo · 9 months ago
@@datasciencebasics you have many vids, please just point to where it is.
@datasciencebasics · 9 months ago
Please watch my last two videos, you will get the answer :)
RAG with LlamaParse, Qdrant and Groq | Step By Step kzbin.info/www/bejne/rWikoWmdj6uLopI
RAG With LlamaParse from LlamaIndex & LangChain 🚀 kzbin.info/www/bejne/nGrLp6Wni9uNopI
@ginisksam · 10 months ago
Thanks for the good tutorial. Additionally, I had to type "mistral:instruct" (with a colon rather than a "-" in between) to declare the llm for local Ollama. The manifest ID is similar for mistral and mistral:instruct in the 'ollama list' CLI output. Did you notice that? Saw the improvement in search, especially since you added the metadatas arguments during embeddings this time around. Groq is a game-changer. Hope they will charge users based on time per API call rather than tokens, if they decide to charge and we want to speed things up. Otherwise, local CPU offline will suffice, with coffee sips in between. You are always ahead of the game... Keep it up. Cheers.
@datasciencebasics · 10 months ago
You are welcome. Yes, you are correct. I indeed used "mistral:instruct" later while running the code but forgot to mention it and update the code. Thanks for trying it and noticing. Indeed, adding metadata helps, and Groq is a game changer. Good times :)
@TheColdharbour · 10 months ago
I’m sure someone deleted my helpful comment before. Probably would have saved quite a few people a lot of time. Still very grateful for the guide. 👍
@FawziBreidi · 6 months ago
Which local embedding model is the best on the market, regardless of size?
@Tobi-cf5tm · 8 months ago
Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it')). Any help here, please?
@Tobi-cf5tm · 8 months ago
My mistake, I had not installed Ollama. I now get this: Error raised by inference API HTTP code: 404, {"error":"model 'nomic-embed-text' not found, try pulling it first"}
@thaiscvo · 9 months ago
How can I make it accept other document formats besides only PDF?
@datasciencebasics · 9 months ago
You can use a for loop to accept different file formats and process each file separately.
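A sketch of that idea with LangChain's community loaders (the file names and the loader mapping here are illustrative, not from the video):

import os
from langchain_community.document_loaders import Docx2txtLoader, PyPDFLoader, TextLoader

# Map each extension to a loader class; extend the dict for more formats.
LOADERS = {".pdf": PyPDFLoader, ".txt": TextLoader, ".docx": Docx2txtLoader}

docs = []
for path in ["report.pdf", "notes.txt", "summary.docx"]:  # example uploads
    ext = os.path.splitext(path)[1].lower()
    loader_cls = LOADERS.get(ext)
    if loader_cls is None:
        print(f"Skipping unsupported file: {path}")
        continue
    docs.extend(loader_cls(path).load())  # each loader returns a list of Documents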
@daven9517 · 9 months ago
I tried it on localhost and keep getting this error right after uploading the PDF: Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
@datasciencebasics · 9 months ago
Do you have Ollama installed and running?
@AlexUnder_BR · 8 months ago
Hey, could this tutorial work with the new llama3 70B on Groq? If not, could you please make a video about it?
@datasciencebasics · 8 months ago
Hello, it should work. Give it a try !!
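If anyone wants to try it, the swap should be a one-line change (a sketch; the model names below are Groq's published identifiers at the time and may change, so check their docs):

from langchain_groq import ChatGroq

# Same client as in the video, different model_name.
# Requires GROQ_API_KEY to be set in the environment.
llm = ChatGroq(temperature=0, model_name="llama3-70b-8192")
# previously in the demo: model_name="mixtral-8x7b-32768"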
@AlexUnder_BR · 8 months ago
@@datasciencebasics Thank you, Sir.
@gsayak · 10 months ago
As there were many sections where the application replied "content not found", which part do you think is affecting the results? The model itself, or maybe there is an issue with the nomic embedding model?
@datasciencebasics · 10 months ago
That's a good find. That might also be the case, but I can't confirm without testing. One idea is to just view the top-k retrieved chunks and see which embeddings perform best.
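A minimal way to eyeball those top-k chunks, assuming a docsearch Chroma store like the one built in the app (the query is just an example):

# Inspect what the retriever actually returns for a failing question.
query = "What does section 3 of the document say?"
results = docsearch.similarity_search_with_score(query, k=4)
for doc, score in results:
    # Lower scores mean closer matches with Chroma's default distance metric.
    print(f"score={score:.4f} source={doc.metadata.get('source')}")
    print(doc.page_content[:200], "...\n")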
@teddyfulk · 10 months ago
Just stumbled across your channel. I have been having the same sort of issue with local models not answering questions correctly once given the context/sources. Sometimes the correct context/sources don't appear, and sometimes they do, like in your case. Then the llm tries to appease the query and makes something up. So from that analysis, sometimes the model doesn't get the right context, and sometimes it does get the right context but can't figure out the right answer. I have noticed that if the question is unique enough, it will get the right answer. What are your thoughts on this? Have you experienced such things?
@datasciencebasics · 10 months ago
There are many things that can go wrong (data quality / embedding model / llm). Garbage in, garbage out: having good quality data (good cleaning and chunking strategies) might help.
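On the chunking side, a sketch of the knobs worth experimenting with (the sizes below are generic starting points, not values from the video):

from langchain.text_splitter import RecursiveCharacterTextSplitter

raw_text = open("document.txt").read()  # example source text

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # smaller chunks: more precise retrieval, less context per hit
    chunk_overlap=100,  # overlap keeps sentences from being cut across chunk borders
    separators=["\n\n", "\n", ". ", " "],  # prefer splitting on paragraph boundaries
)
chunks = splitter.split_text(raw_text)
print(len(chunks), "chunks")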
@medoai · 7 months ago
Error: Did not find groq_api_key, please add an environment variable 'GROQ_API_KEY' which contains it.
@karthikgowda9530 · 7 months ago
The file name should be ".env" (it does look like the gear icon), and 'GROQ_API_KEY' shouldn't be put in quotes. If you already tried that, then maybe just upload the error and the related files to ChatGPT, it's free now.
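For reference, a minimal sketch of how the key is usually picked up (assuming the python-dotenv package; the key value is a placeholder):

# .env  -- the file sits next to app.py; no quotes around the value:
# GROQ_API_KEY=gsk_your_key_here

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
if not os.environ.get("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY not found -- check the .env file name and location")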
@kushis4ever · 4 months ago
Boss, how many PDFs can we process at once? Is it one by one, or can I feed it like 10 PDFs?
@datasciencebasics · 4 months ago
With the code shown in this video demo, it's one PDF. You can merge multiple PDFs and upload that, or change the default parameters: docs.chainlit.io/api-reference/ask/ask-for-file
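A sketch of that parameter change, based on the AskFileMessage call from the demo (max_files per the Chainlit docs linked above; the limit of 10 is just an example):

import chainlit as cl

@cl.on_chat_start
async def start():
    files = None
    while files is None:
        files = await cl.AskFileMessage(
            content="Please upload up to 10 PDF files to begin!",
            accept=["application/pdf"],
            max_files=10,      # default is 1; raise it to accept several PDFs at once
            max_size_mb=100,
            timeout=180,
        ).send()
    for file in files:         # process every uploaded PDF, not just files[0]
        print(file.name, file.path)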
@saeednsp1486 · 10 months ago
Hi man, do you think this nomic embedding model is better than the bge-m3 embedding model? Because I saw its size and it's very light compared to bge-m3, which is 5 GB. Please answer me.
@AjayYadav-xi9sj · 10 months ago
Can I run this RAG application on my local machine, a MacBook Pro M1? Does it require a high-end system?
@datasciencebasics · 10 months ago
It should work pretty well, give it a try !!
@MiheerPrakash · 10 months ago
What makes nomic generate embeddings so quickly compared to ada?
@datasciencebasics · 10 months ago
Well, it's better than ada according to them; as for being quicker, I haven't checked. It is quicker than other open-source embedding models.
@SantK1208 · 9 months ago
Suppose I have a bunch of PDF files, can we keep them in one folder and do the embedding as you mentioned??
@datasciencebasics · 9 months ago
This video might be great for you: RAG With LlamaParse from LlamaIndex & LangChain 🚀 kzbin.info/www/bejne/nGrLp6Wni9uNopI
@himenatika · 9 months ago
I am trying to exclude LangChain from my solutions and want to try LlamaIndex, is it a possible alternative?
@datasciencebasics · 9 months ago
Yes, it is. Exploring another framework is a good way to learn new stuff. At least for the data part, LlamaIndex is easier to implement. Or you can combine both.
@IamalwaysOK · 10 months ago
Excellent video as usual. Thanks a lot. Do you know if we can print the complete prompt (including sources and instructions for RAG) that is being sent to the llm from the chainlit framework? Do you have any reference to that in the code?
@datasciencebasics · 10 months ago
Integrating LangSmith or a similar platform into the project might solve what you are trying to achieve.
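Short of a full LangSmith setup, a low-tech sketch for the LangChain version used here is to build the chain with verbose=True, which prints the fully formatted prompt (retrieved context plus question) to stdout (assumes the llm and docsearch objects from the app; the LangSmith key below is a hypothetical placeholder):

import os
from langchain.chains import RetrievalQAWithSourcesChain

# Option 1: print each assembled prompt to the terminal before the llm call.
chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(),
    verbose=True,
)

# Option 2: trace everything (prompts, retrieved docs, llm output) in LangSmith.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls__your_key_here"  # placeholder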
@Alkotas25 · 9 months ago
Hi, is this RAG solution (Groq) also compatible with OpenAI GPT models, or just with open-source LLMs?
@datasciencebasics · 9 months ago
Open-source llms only, and right now only mixtral, llama2 and gemma.
@clementtimothyjudehorak1456 · 9 months ago
Amazing video, I have been looking for something like this for a while. However, I have a question: I am trying to get this RAG app into a production-ready state, meaning I need to call the nomic API and set up the chromadb API. Chroma is easy enough to set up for a remote connection, but I am really struggling to call nomic through its API interface, especially in the context of this program. Have you tried to do this before, or do you know somebody who has?
@datasciencebasics · 9 months ago
Here I am using Nomic via Ollama. You can try to use the embedding model directly from Nomic's production-ready API: docs.nomic.ai/reference/endpoints/nomic-embed-text
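For the hosted-API route, a minimal sketch with Nomic's own Python client (an assumption on my part, per the endpoint docs linked above; requires `pip install nomic` and a Nomic API key):

from nomic import embed

# Calls Nomic's hosted endpoint instead of the local Ollama copy.
# Authenticate first with `nomic login <api-key>` or the NOMIC_API_KEY env variable.
output = embed.text(
    texts=["What is the document about?"],
    model="nomic-embed-text-v1",
    task_type="search_query",  # use "search_document" when embedding the chunks
)
print(len(output["embeddings"][0]))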
@MohammedFaisal-l9q · 9 months ago
@@datasciencebasics Sir, can you please make a video where the nomic API is used? It's a request.
@JuMPeRaBG · 8 months ago
How do we make the RAG accurate? We don't need fast answers and API calls with Groq, but rather a good embedding model and valid data.
@datasciencebasics · 8 months ago
I have created a playlist where I explain how to make RAG better, please refer to that: kzbin.info/aero/PLz-qytj7eIWUiNRSaQ0Y3u-kg60NKVjrf&si=Pknolrle_mi6Apnv
@THE-AI_INSIDER · 10 months ago
Is Groq completely free? If yes, until when?
@datasciencebasics · 10 months ago
As of now, yes. Until when, no idea 🤷
@THE-AI_INSIDER · 9 months ago
Can you do the same video but uploading multiple files of the same type in the same piece of code, e.g. uploading multiple CSV or PDF files? I think we can use max_files in the decorator, but if you could do that and modify the whole piece of code it would be great!

# Wait for the user to upload a file
while files is None:
    files = await cl.AskFileMessage(
        content="Please upload a CSV file to begin!",
        accept=["text/csv"],
        max_size_mb=100,  # Optionally limit the file size
        timeout=180,  # Set a timeout for user response
    ).send()

file = files[0]  # Get the first uploaded file

# Sending an image with the local file path
elements = [
    cl.Image(name="image", display="inline", path="pic.jpg")
]

# Inform the user that processing has started
msg = cl.Message(content=f"Processing `{file.name}`...", elements=elements)
await msg.send()

# Read the CSV file
with open(file.path, 'r', encoding='utf-8') as csv_file:
    csv_reader = csv.reader(csv_file)
    csv_text = " ".join([" ".join(row) for row in csv_reader])

# Split the text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=50)
texts = text_splitter.split_text(csv_text)

# Create a metadata for each chunk
metadatas = [{"source": f"{i}-pl"} for i in range(len(texts))]

# Create a Chroma vector store
embeddings = OllamaEmbeddings(model="nomic-embed-text")
docsearch = await cl.make_async(Chroma.from_texts)(
    texts, embeddings, metadatas=metadatas
)

The code above is from one of your videos!
@inprogs · 10 months ago
At kzbin.info/www/bejne/in7Egqdup7NomKcsi=rcb0hPNzIma51T3I&t=788 you never changed the Groq API to use llama, and I think you were using mixtral till the end of the presentation with Groq.
@datasciencebasics · 10 months ago
Hello, I was using nomic-embed-text as the embedding model, which I downloaded locally, and used mixtral from Groq as the llm to provide the answer. You can use llama2 if you want.
@patricktang3377 · 8 months ago
In the config files, can you tell me where Ollama is running? Is it port 8000? I received an error message. This is the error at the front: "Ollama call failed with status code 404", and this is the error at the back: "async for stream_resp in self._acreate_stream( File "C:\Users\patta\AppData\Local\Programs\Python\Python312\Lib\site-packages\langchain_community\llms\ollama.py", line 305, in _acreate_stream raise OllamaEndpointNotFoundError( langchain_community.llms.ollama.OllamaEndpointNotFoundError: Ollama call failed with status code 404." The Chainlit chatbot is at port 8000, so where is Ollama? Which port? How did you configure it in your code? I don't see this information in app.py. How did you set up the Ollama port on localhost?
@datasciencebasics · 8 months ago
Ollama always runs on port 11434 on localhost once you run it. Please make yourself familiar with Ollama via the website or GitHub; knowing a bit more about the project will make things easier. ollama.com/ github.com/ollama/ollama
@patricktang3377 · 8 months ago
@@datasciencebasics Thanks. I have tried to debug this for hours but wasn't able to. It is so weird that I would get this error, given that the same code apparently works fine at your end. I was, however, able to make the code for langchain_groq_chainlit work, just not this one. I don't understand.
@TooyAshy-100 · 10 months ago
Thank you, keep going
@datasciencebasics · 10 months ago
You are welcome !!