Chat with Documents is Now Crazy Fast thanks to Groq API and Streamlit

Views: 20,214

1 day ago

Learn how to build a RAG pipeline with the world's fastest LLM API, the Groq API. We will build a RAG application that lets you chat with a website, and wrap everything in a Streamlit app.
🦾 Discord: / discord
☕ Buy me a Coffee: ko-fi.com/prom...
🔴 Patreon: / promptengineering
💼Consulting: calendly.com/e...
📧 Business Contact: engineerprompt@gmail.com
Become a Member: tinyurl.com/y5h...
Sign up for Advanced RAG:
tally.so/r/3y9bb0
LINKS:
Getting Started with Groq: • Getting Started with G...
How to chunk: • LangChain: How to Prop...
Code: github.com/Pro...
TIMESTAMPS:
[00:00] Introduction
[00:39] Setting Up: Installing Packages and Importing Libraries
[01:32] Designing the RAG Pipeline: From Data to Response
[02075] Implementing the RAG Pipeline with Groq API
[06:00] RAG with Streamlit and Groq API
[10:07] Streamlit App in Action: Real-Time Responses
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...

Comments: 45

@engineerprompt 2 months ago

If you are interested in learning more about how to build robust RAG applications, check out this course: prompt-s-site.thinkific.com/courses/rag

@GroqInc 5 months ago

Excellent demo, thank you for choosing Groq.

@engineerprompt 5 months ago

If you are interested in learning more about Advanced RAG techniques, sign up here: tally.so/r/3y9bb0

@samcavalera9489 5 months ago

I already signed up! Can't wait to start learning advanced RAG techniques 😀

@hadi-yeg 5 months ago

I thought you were going to show how crazy fast the RAG system gets set up (which happens at the startup of your Streamlit app)! But you're actually showing the response time from the LLM, which is obviously fast when you call the API.

@rajeevchourasiya3889 1 month ago

How do you do that?

@RickySupriyadi 5 months ago

Your content never disappoints, love it!

@engineerprompt 5 months ago

Glad you enjoy it!

@zubinbalsara8414 5 months ago

Can you please give an example on how to do reranking? Your style of teaching is just absolutely fantastic.

@user-cq7iu4ws6q 5 months ago

Thank you very much! How can I use this with an external vector store such as AWS OpenSearch or Pinecone instead of the in-memory one? I have a lot of documents to search.

@stanTrX 3 months ago

Thanks. How can I do RAG over my own documents, such as PDFs, instead of a website as in your example?

@2vadlamani 5 months ago

Really fast :) Thanks for the video

@stanTrX 3 months ago

Why do you use the Ollama llama2 embedding model instead of something else, like nomic-embed?

@engineerprompt 3 months ago

There are way too many options. This is just to show what is possible :)

@ABHINAVKUMAR-tu4ry 2 months ago

I want to build and deploy this type of application, but for that I have to run Ollama in the background. Is there any other way? Can anyone help me?

@r0f115L4m 5 months ago

What app did you use to create your flow diagram? Thank you so much for these videos, I learn a lot from them!

@truliapro7112 5 months ago

I have the same question!!

@NightSpyderTech 5 months ago

It's called Excalidraw.

@truliapro7112 5 months ago

@NightSpyderTech Thanks 😊

@kate-pt2ny 5 months ago

1. After running the program, I was required to install llama2. There are many other Ollama models on my computer, but the program seems to use llama2 by default.
2. The initial run takes a long time (the URL in the example takes about 3 minutes on an M1 Max with 32 GB). Once the vectors are built, search is very fast.
3. News links can be quickly parsed and searched to obtain answers; RAG works well on them.
4. RAG works poorly on arXiv HTML papers.
Thank you for sharing.

@engineerprompt 5 months ago

You can use any other embedding model in it. This example was using the llama2 embedding in ollama.

@MikewasG 5 months ago

Thank you very much for your efforts. Your videos have been incredibly helpful to me! I have a question: In my experience, RAG's performance in extracting information from tables or images in PDFs is quite poor. Is there any way to improve this?

@engineerprompt 5 months ago

Look into Unstructured.io for correctly parsing tables. LlamaIndex also released a new tool called LlamaParse for parsing tables. You might want to explore that as well.

@ubaisalih2987 5 months ago

This is really awesome, and it would be great if you could deploy it on Hugging Face or any other suggested platform. Eventually we need to deploy the app somewhere other than a local machine.

@scitechtalktv9742 5 months ago

I agree with that, I would like to deploy this on Hugging Face Spaces. Is the free version enough for this, or is a paid version necessary?

@ubaisalih2987 5 months ago

@scitechtalktv9742 I think a small LLM can fit on the free version.

@limjuroy7078 5 months ago

So, we can speed up the response from the local LLM by using Groq? Would creating embeddings for the text chunks be faster as well?

@uwegenosdude 4 months ago

Thanks for the very interesting video. I tried to run your example on Windows 11. Unfortunately, I get an error when trying to use FAISS. How do I run the FAISS server? My error looks like this:

Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url

I changed one line to

embeddings = OllamaEmbeddings(model="llama2:7b")

Then calling

vector = FAISS.from_documents(documents, embeddings)

seems to work (it took a couple of minutes!). So I think I had no problem with FAISS; instead, the default value for model, "llama:7b", was not correct.

But another question: what does mixtral have to do with llama2? Is it the same?
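
On the mixtral versus llama2 question: going by the replies in this thread, llama2 (served by Ollama) is used only to create embeddings for retrieval, while a separate model, presumably the mixtral model the commenter saw, is called through the Groq API to write the answer. The toy sketch below shows those two roles side by side. It is not the code from the video: `toy_embed` is a hypothetical bag-of-words stand-in for the embedding model, and `generate` is a hypothetical placeholder for the chat model call.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for the embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], k: int = 1) -> List[str]:
    """Role 1 (embedding model): rank chunks by similarity to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


def answer(
    question: str,
    chunks: List[str],
    generate: Callable[[str], str],
    k: int = 1,
) -> str:
    """Role 2 (chat model): write an answer from the retrieved context only."""
    context = "\n".join(retrieve(question, chunks, k))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

Swapping the embedding model changes only `toy_embed`, and swapping the chat model changes only `generate`, which is why the two model names can differ without conflict.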

@yusufersayyem7242 5 months ago

Great work, Sir 🌟🌟🌟 But I have a question: how can I add more than one link, and also add PDF files?

@BrandonLee-ik8kw 5 months ago

The notebook doesn't work. I get an error with ValueError: Error raised by inference endpoint: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))

@engineerprompt 5 months ago

Do you have ollama running?

@qurious474 2 months ago

@engineerprompt What is this?
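
For context on the exchange above: the error reports a refused connection to localhost:11434, the address the embedding call is trying to reach, which is why the reply asks whether Ollama is running. A minimal check, assuming nothing beyond the host and port shown in the error message (the helper name `is_port_open` is made up for this sketch), is to test whether anything is listening there:

```python
import socket


def is_port_open(host: str = "localhost", port: int = 11434, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on localhost:11434.")
    else:
        print("Nothing is listening on localhost:11434; start Ollama first.")
```

If the check fails, starting the Ollama server (for example with `ollama serve`) and pulling the embedding model (`ollama pull llama2`) before rerunning the notebook is the likely fix, though the exact setup depends on how Ollama was installed.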

@cynthiarohr8560 5 months ago

Is it possible for you to create a video which also has Deepgram so now it becomes a conversational AI?

@scitechtalktv9742 5 months ago

I would like to store / serialize the vector store / embeddings because on my PC it takes a very long time to generate those! I mean extremely long: more than 4 hours! How can I do that?

@engineerprompt 5 months ago

You can use an external API to do the same. Hugging Face offers free embedding APIs.
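
Besides a faster embedding API, the slow step can also be paid only once by saving the vectors to disk. The sketch below is a generic, library-independent illustration and not the code from the video: `embed_with_cache` and `embed_fn` are hypothetical names, and `embed_fn` stands for whatever slow embedding call is in use. LangChain's FAISS wrapper also has `save_local` and `load_local` methods, which may be the more direct route for the vector store built in the video.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List


def embed_with_cache(
    chunks: List[str],
    embed_fn: Callable[[str], List[float]],
    cache_path: Path,
) -> List[List[float]]:
    """Embed each chunk once; reuse vectors stored in a JSON file on later runs."""
    cache: Dict[str, List[float]] = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))

    vectors: List[List[float]] = []
    changed = False
    for chunk in chunks:
        # Key on the chunk's content so edited text is re-embedded automatically.
        key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = embed_fn(chunk)  # the slow call happens only on a cache miss
            changed = True
        vectors.append(cache[key])

    if changed:
        cache_path.write_text(json.dumps(cache), encoding="utf-8")
    return vectors
```

On a second run over the same documents every chunk is a cache hit, so no embedding calls are made; only new or changed chunks cost time.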