LlamaIndex Workshop: Multimodal + Advanced RAG Workshop with Gemini

  Views: 9,227

LlamaIndex


Comments: 14
@lawrencetsang5387 11 months ago
On the question about Paul Graham's wife at kzbin.info/www/bejne/nJXTknuAobNjhrM, I missed the chance to explain that the Google AQA model actually did its job by *not* saying that Jessica Livingston was his wife, because the Paul Graham essay does not say so. Although it is the right answer, it is not an answer that can be derived from the provided source text. So the Google AQA model demonstrates its ability to ground its response in the provided source!
@chrsl3 11 months ago
It would be nice if a super-clear answer were always given, like: "The provided document does not contain info about this."
@chaoticblankness 11 months ago
### Summary

In this special edition of the LlamaIndex webinar series, the focus was on multimodal and advanced retrieval-augmented generation (RAG) use cases built with Google's API offerings, specifically Google Gemini, together with LlamaIndex. The session covered semantic retrieval and how to build an advanced RAG pipeline with LlamaIndex components, followed by a workshop on creating multimodal use cases with Google Gemini and LlamaIndex.

#### Part 1: Advanced RAG with LlamaIndex and Google Gemini

**Presenters:** Lawrence, Michael, and Sher from Google Labs

The presentation covered RAG use cases for both novice and advanced users, including:
- A simple RAG pattern introduction for context setting.
- Google's developer RAG offerings.
- Advanced techniques for customizing use cases and improving quality.
- A demonstration of the RAG process.

**Simple RAG Pattern:**
- Ingestion phase with embeddings and a vector store.
- Retrieval step with the user query and the vector store.
- Response synthesis with an LLM to arrive at an answer.

**Google's Offerings:**
- Google Vector store: a managed vector database and embeddings, designed for simplicity, flexibility, and production readiness. It's optimized for a small corpus of 1 million chunks.
- AQA (Attributed Question Answering) model: provides grounded answers, attributions, answerability probability, voice styles, and safety settings.

**Advanced Techniques:**
- Breaking down complex queries into focused sub-questions for better retrieval.
- Re-ranking to refine the retrieval process by comparing textual content in the question and retrieved documents.

**Demonstration:**
- A live demo showed how Google's AQA model and LlamaIndex can be used to answer complex questions and handle cases where an answer is not available in the provided documents.
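
The simple RAG pattern and the re-ranking step described in Part 1 can be sketched in plain Python. This is only an illustration under loud assumptions: `embed`, `ToyVectorStore`, `rerank`, and `build_prompt` are hypothetical names invented here, the bag-of-words "embedding" stands in for a real embedding model, and nothing below is the Google or LlamaIndex API shown in the workshop.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store covering the ingestion and retrieval phases."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        # Ingestion: embed each chunk once and keep it next to its text.
        self.rows.append((chunk, embed(chunk)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Retrieval: rank stored chunks by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def rerank(query: str, chunks: list[str]) -> list[str]:
    """Re-rank by the fraction of query terms that literally appear in each chunk."""
    q_terms = set(embed(query))

    def overlap(chunk: str) -> float:
        return len(q_terms & set(embed(chunk))) / len(q_terms) if q_terms else 0.0

    return sorted(chunks, key=overlap, reverse=True)


def build_prompt(query: str, chunks: list[str]) -> str:
    """Response synthesis input: the LLM call itself is deliberately left out."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


store = ToyVectorStore()
for chunk in [
    "Paul Graham co-founded Y Combinator in 2005.",
    "The essay describes learning to paint in Florence.",
    "Lisp was the language used to build Viaweb.",
]:
    store.add(chunk)

question = "Which language was used to build Viaweb?"
candidates = rerank(question, store.retrieve(question, k=2))
prompt = build_prompt(question, candidates)
```

In a real pipeline the prompt would be sent to an LLM, and a grounded model such as the AQA model described above would additionally report whether the retrieved context can answer the question at all.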
#### Part 2: Multimodal RAG with Google Gemini and LlamaIndex

**Presenters:** Jerry and Howan from LlamaIndex

This section focused on leveraging multimodal data (text and images) to enhance RAG use cases. The presenters discussed integrating the Gemini Pro Vision model, which takes text and image inputs and generates text outputs, with LlamaIndex.

**Multimodal RAG:**
- Indexing both text and images.
- Retrieving relevant information using queries that include text and/or images.
- Re-ranking and synthesizing responses that incorporate multimodal data.

**Image Indexing:**
- Extracting structured text from images using a multimodal model.
- Generating image embeddings and storing them in a vector store.

**Multimodal Retrieval and Generation:**
- Retrieving and synthesizing responses based on text and image inputs.
- Using structured data extraction to create structured metadata from images.
- Leveraging this structured output to build a knowledge base for RAG.

**Demonstration:**
- A case study showed how Google Maps screenshots of restaurants were used to extract structured metadata, which was then indexed and used to answer queries about restaurant recommendations, including nearby tourist places.

**Final Q&A:**
- The possibility of fine-tuning Gemini for improved capabilities.
- Uncertainty about Gemini's ability to process video and audio.

The webinar ended with encouragement for the audience to provide feedback and explore the shared notebooks.
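
The structured-extraction idea from Part 2 (turn a screenshot into structured metadata, then index that metadata for retrieval) might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Restaurant` schema, its field names, and the helper functions are hypothetical, the multimodal model call is omitted, and `raw_model_output` is a hand-written placeholder rather than real model output or data from the workshop.

```python
import json
from dataclasses import dataclass


@dataclass
class Restaurant:
    """Hypothetical schema for metadata extracted from one screenshot."""

    name: str
    city: str
    rating: float
    nearby_places: list


def parse_extraction(raw: str) -> Restaurant:
    """Validate the JSON a multimodal model was asked to emit for one image."""
    data = json.loads(raw)
    return Restaurant(
        name=str(data["name"]),
        city=str(data["city"]),
        rating=float(data["rating"]),
        nearby_places=[str(p) for p in data.get("nearby_places", [])],
    )


def to_document(r: Restaurant) -> dict:
    """Turn the structured record into text to embed plus metadata to filter on."""
    text = f"{r.name} in {r.city}, rated {r.rating}. Nearby: {', '.join(r.nearby_places)}."
    return {"text": text, "metadata": {"city": r.city, "rating": r.rating}}


def filter_documents(docs: list, city: str, min_rating: float) -> list:
    """Metadata pre-filter that would run before any vector search."""
    return [
        d
        for d in docs
        if d["metadata"]["city"] == city and d["metadata"]["rating"] >= min_rating
    ]


# Placeholder standing in for a model response; the values are made up.
raw_model_output = (
    '{"name": "Example Bistro", "city": "Exampleville", '
    '"rating": 4.5, "nearby_places": ["Example Museum"]}'
)

docs = [to_document(parse_extraction(raw_model_output))]
matches = filter_documents(docs, city="Exampleville", min_rating=4.0)
```

Keeping the extracted fields as metadata, separate from the text that gets embedded, is one way to let a query such as "restaurants near a tourist place" combine an exact filter with semantic search.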
11 months ago
All these techniques work quite well for general content and knowledge. For niche domains, however, problems pop up. In particular, the pre-trained encoders lack accuracy and VQA is not very helpful. Fine-tuning the encoders is mandatory, but here again the curse of labeling is present. Although the datasets needed for fine-tuning are smaller than those for pre-training, building them is still a big challenge for many companies. Again and again, the source of progress lies in the labeled data and in the labeling resources, which now consist of subject matter experts.
@EricB1 11 months ago
Is the code shared somewhere?
@Jmstr-p6h 11 months ago
Great content thx. Can you share the slides?
@ramih5488 10 months ago
When we create a simple Google index in the first simple use case, in which Google region is the index created?
@mutumjagat 10 months ago
Can we have the link to the code notebook?
@unclecode 11 months ago
Very helpful, thanks, and can you share the code for the first demo?
@mrchongnoi 11 months ago
Where are the slides located?
@RuturajHange-k2z 11 months ago
Is there a way to retrieve images from a folder of images using a text query? Using Gemini, not OpenAI.
@chrsl3 11 months ago
Is the Google code already available for developers in the Google Cloud?
@qingsongyao4974 10 months ago
How do I create a Google index in GCP?
@karanv293 9 months ago
Did you figure this out? Wish they didn't leave that part out lol