Can VISION Language Models Solve RAG? Introducing localGPT-Vision

  8,341 views

Prompt Engineering

Days ago

Comments: 33
@kai_s1985 a month ago
Thanks again for the great work! I have tested a similar approach with a vision model. It is especially good for PDFs with lots of unstructured data like graphs, plots, pictures, text, etc. One limitation of this approach came up when I created a chatbot and wanted to get the hyperlinks within the documents: I couldn't, because the URL of a hyperlink is not visible in the image. That was not a problem when I used markdown with a standard text-based RAG system. Questions: - How many PDFs can I upload? Is there any size limit? - Does the chatbot have a memory of the current conversation? If so, how are you handling it?
@kenchang3456 a month ago
Indeed, this is an amazing project. I'll check out the code and give it a try. Thank you very much for sharing, there's a lot to learn from this one.
@MeinDeutschkurs a month ago
Wooohoooo!!! This is so cool! I need more time, I definitely have to test it!!!!
@magmikefpv a month ago
This is amazing! Thanks, will try it out.
@ysy69 a month ago
This is awesome. Very grateful. What is your local setup, GPU?
@Elingsanto a month ago
Cool! Is there a context window or any strict limit on the number of pages or images that can be uploaded? Will try it out.
@akashnagarkar7560 a month ago
Would love a video about the detailed architecture and code explanation. Thanks.
@forcebrew a month ago
Thank you for your expertise! Could you recommend a stable and efficient large language model for coding that I can run on my machine without it becoming unresponsive?
@thenextension9160 a month ago
Very nice work
@awesomedata8973 a month ago
Any chance you can add the new Mistral Pixtral model to your software? It seems to be the best version of a local model for vision, and it's based on Nemo.
@engineerprompt a month ago
Yes, I think it can be added. Will have a look into it.
@nyliveechay-so3ps a month ago
The PDF document format is specific, right? So maybe it's possible to compare results just using that formatted content data? It's closed, owned, and controlled by Adobe, correct? So why do this?
@nyliveechay-so3ps a month ago
Great stuff though!! Nice work!
@bwljustus8077 a month ago
ERROR - models.indexer - Error during indexing: Unable to get page count. Any ideas?
@mosbehbarhoumi9256 a month ago
Same here.
@trevorbaylis7423 a month ago
How complex would it be to combine Verbi and localGPT-Vision? Is this a realistic possibility?
@TeamDman a month ago
VERY cool!
@unshadowlabs a month ago
Can document metadata be included as well in the retrieval, such as document name or title, author, and publication year?
@engineerprompt a month ago
Yes, that can be added
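A minimal sketch of what metadata-aware retrieval could look like, assuming each retrieved page carries a small metadata dictionary. The `PageHit` class, its fields, and `filter_hits` are hypothetical names for illustration only; they are not taken from localGPT-Vision or any retrieval library.

```python
from dataclasses import dataclass, field


@dataclass
class PageHit:
    """One retrieved page (hypothetical structure, not the project's own)."""
    doc_id: int
    page: int
    score: float
    metadata: dict = field(default_factory=dict)


def filter_hits(hits, **required):
    """Keep hits whose metadata matches every required key/value pair.

    Results are returned with the highest retrieval score first.
    """
    kept = [
        hit for hit in hits
        if all(hit.metadata.get(key) == value for key, value in required.items())
    ]
    return sorted(kept, key=lambda hit: hit.score, reverse=True)


if __name__ == "__main__":
    hits = [
        PageHit(0, 1, 0.4, {"title": "Report A", "author": "A", "year": 2023}),
        PageHit(1, 2, 0.9, {"title": "Report B", "author": "B", "year": 2024}),
        PageHit(2, 5, 0.7, {"title": "Report C", "author": "A", "year": 2024}),
    ]
    for hit in filter_hits(hits, author="A"):
        print(hit.metadata["title"], hit.page, hit.score)
```

Filtering after retrieval keeps the image-based index unchanged; the metadata only narrows which pages are passed on to the vision model.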
@bwljustus8077 a month ago
If poppler is missing under Windows, use: choco install poppler
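A quick way to confirm whether Poppler is the problem before re-running indexing, sketched under the assumption that the "Unable to get page count" error comes from a PDF-to-image step that shells out to Poppler's `pdfinfo` and `pdftoppm` tools. The function names below are illustrative, not part of the project.

```python
import shutil


def find_poppler_tools(tools=("pdfinfo", "pdftoppm")):
    """Map each Poppler command-line tool to its location on PATH (None if missing)."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_poppler_tools(found):
    """Return the names of the tools that were not found, sorted alphabetically."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    missing = missing_poppler_tools(find_poppler_tools())
    if missing:
        print("Poppler tools not found on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found on PATH.")
```

If the tools are reported missing right after installing Poppler, opening a new terminal usually helps, since PATH changes are not picked up by shells that were already running.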
@eduardvendrell9136 17 days ago
Running on a laptop with a GPU, I am getting the following error: ERROR - models.indexer - Error during indexing: Input type (torch.cuda.FloatTensor) and weight type (CUDABFloat16Type) should be the same. Any idea?
@SurajPrasad-bf9qn 15 days ago
I am facing the same error. Did you solve it?
@eduardvendrell9136 15 days ago
@SurajPrasad-bf9qn nope!
@MagagnaJayzxui a month ago
Qwen2.5 VL 72b support?
@RyanSmith-rb1ch a month ago
I think google-generativeai is misspelled as google-generative-ai in the requirements.txt
@engineerprompt a month ago
Thanks for pointing it out, will fix that
@brianhopson2072 a month ago
I like the concept of this, but I don't like the original model selection. Can you add other OpenAI API models, like 4o?
@engineerprompt a month ago
Yes, will update the list with more models
@chaitanyanerpagar6076 23 days ago
I uploaded a PDF for indexing, and once I click the upload-and-index button, the page responds with "can't reach this page". Can anyone suggest where to check for the issue?
@dadadies a month ago
Can us mere mortals has a 1 click installer plox. Some sort of bat file or something that checks for whatever is required and optional and let us choose. You could tell an AI to write it for you.