FULLY LOCAL Mistral AI PDF Processing [Hands-on Tutorial]

  27,152 views

1littlecoder

Using faiss, sentence transformers and ctransformers, we get a fully functional, completely LOCAL AI-powered PDF processing engine running on Mistral AI.
Colab - colab.research...
Katana ML's Project - github.com/kat...
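The description names three pieces: an embedding model (sentence transformers), a vector index (faiss) and a local LLM runner (ctransformers). The retrieve-then-prompt flow they form can be sketched with the standard library alone. In this toy version a bag-of-words vector stands in for the sentence embedding and a linear scan stands in for the FAISS index, so it shows only the shape of the pipeline, not the tutorial's actual code. Every function name and the sample invoice text are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Linear scan standing in for a vector index lookup."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list) -> str:
    """Assemble the text that would be sent to the local model."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Made-up invoice chunks, as if extracted from a PDF.
chunks = [
    "Seller: Example Trading Ltd, 1 Sample Street",
    "Invoice number 0001, issued 2024-01-01",
    "Total amount due: 100.00",
]
question = "What is the name of the seller?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

Only the retrieved chunk reaches the model, which is what keeps the prompt short enough for a 7B model on a CPU.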
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1lit...
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Linkedin - / amrrs

Comments: 58
@jorgerios4091 1 year ago
Wow, practical use videos are back, thanks!
@1littlecoder 1 year ago
Haha Thanks!
@L1merick 1 year ago
Yup, love the practical use-case videos! ❤
@azuremis 1 year ago
Epic video, really helped me out with my first local use of an LLM!
@1littlecoder 1 year ago
Glad it helped!
@azp7 8 months ago
Thank you! I had some issues with some Python packages, which I was able to resolve. Your hands-on tutorial is great, very practical!
@amindforevervoyaging3789 1 year ago
Thanks. Keep up the great work 🙂
@1littlecoder 1 year ago
Thank you too!
@dreamwalker1739 7 months ago
How do I get the output in JSON format? I used a JSON output parser to get the output, but it returns the schema instead. How can I fix this?
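One hedged way to approach the JSON question above, independent of any particular parser library: ask the model for JSON in the prompt, then pull the first JSON object out of the raw reply with the standard library. This is a minimal sketch; `extract_json` and the sample reply are hypothetical and not part of the tutorial's code.

```python
import json


def extract_json(reply: str) -> dict:
    """Return the first JSON object found in a model reply.

    Hypothetical helper: tries every '{' as a starting point and lets
    json.JSONDecoder.raw_decode parse one complete object from there,
    ignoring any chatter before or after it.
    """
    decoder = json.JSONDecoder()
    for start, char in enumerate(reply):
        if char != "{":
            continue
        try:
            obj, _end = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace, try the next one
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in reply")


# Made-up model reply with text around the JSON payload.
sample_reply = 'Sure! Here is the result:\n{"seller": "Example Trading Ltd", "total": 100.0}\nAnything else?'
parsed = extract_json(sample_reply)
```

If the model keeps echoing the schema, it may help to put a filled-in example object in the prompt rather than the schema itself, though that is a prompt-design guess and not something shown in the video.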
@santanuc 10 months ago
This is good, but can it handle complex unstructured tables or text with images?
@__________________________6910 1 year ago
9:57 "I'm not just cheating around" 🤣
@1littlecoder 1 year ago
Haha. People make such allegations, so I wanted to make sure 😁
@__________________________6910 1 year ago
@1littlecoder lol
@eriktedja1199 1 year ago
Is it possible to handle hundreds of documents, each around 200-300 pages long?
@gabscar1 1 year ago
It's for invoicing, not large PDF reading. Video by the creator here: kzbin.info/www/bejne/b4Oog6impsh5gpY
@papitou4140 1 year ago
Overall, which open-source model is the best at the moment for creating chatbots? Only the ones that are truly open source and free for commercial use, without any Llama licenses. Thanks.
@rakshaballa 6 months ago
How did you solve the error shown at 7:49?
@SanketRaut-x6u 1 year ago
Great tutorial. Why does this model take so much time to answer the question?
@RonanChardonneau-b7b 1 year ago
How can you make it analyze another PDF? I deleted the sample invoice PDF and uploaded another file (300 pages), and the answers given by Mistral say that it doesn't have the answers to my questions.
@Sri_Harsha_Electronics_Guthik 1 year ago
Nice! GPU version?
@jurandfantom 1 year ago
Did I miss the information on how big a file it can process? I have an 800-page PDF book, for example. I don't think hardware would be the limiting factor in my case (24VRAM 128 RAM).
@hossromani 1 year ago
Thanks. Since pip install modifies the local packages, would you recommend Conda or Docker?
@NageshPindi-y3v 11 months ago
Will it work for HTML files?
@lgprasad 1 year ago
But with an open-source LLM, is it possible to restrict the RAG to the documents only, in this case the PDF(s) alone? Won't it answer any and all generic questions too?
@1littlecoder 1 year ago
Yes
@lgprasad 1 year ago
@1littlecoder So can we restrict the responses? If so, can you guide me on how?
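The thread leaves the "how" open. Two common levers, sketched here under stated assumptions: an instruction in the prompt telling the model to answer only from the retrieved context, and a similarity cut-off that skips the model entirely when nothing relevant was retrieved. The names `grounded_prompt` and `prompt_or_none`, the refusal wording and the 0.3 threshold are all hypothetical. A prompt instruction reduces off-topic answers but does not guarantee them away.

```python
REFUSAL = "I don't know based on the provided documents."


def grounded_prompt(question: str, context: str) -> str:
    """Prompt that tells the model to stay inside the retrieved context."""
    return (
        "Use only the context below to answer the question. "
        f"If the answer is not in the context, reply exactly: {REFUSAL}\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def prompt_or_none(question: str, scored_chunks: list, min_score: float = 0.3):
    """Build a prompt from chunks that clear the similarity threshold.

    scored_chunks is a list of (similarity, text) pairs where a higher
    similarity means a closer match (an assumption about the retriever).
    Returns None when no chunk is relevant enough, so the caller can
    return REFUSAL without calling the model at all.
    """
    relevant = [text for score, text in scored_chunks if score >= min_score]
    if not relevant:
        return None
    return grounded_prompt(question, "\n".join(relevant))


# Made-up retrieval results for an on-topic and an off-topic question.
on_topic = prompt_or_none("Who is the seller?", [(0.8, "Seller: Example Trading Ltd")])
off_topic = prompt_or_none("Who won the world cup?", [(0.05, "Seller: Example Trading Ltd")])
```

The threshold has to be tuned per embedding model and distance metric, since similarity scores are not comparable across them.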
@AbishekEkhool 1 year ago
Is fine-tuning a quantized model possible? If it is, please make a video on that.
@philtoa334 1 year ago
Nice.
@1littlecoder 1 year ago
Thank you! Cheers!
@antiinfluencer2269 10 months ago
The link to the GGUF model is outdated.
@__________________________6910 1 year ago
LOCAL - I love local private things
@j.k.priyadharshini9753 8 months ago
katanaml? Does it belong to Mistral AI?
@1littlecoder 8 months ago
No, it's an open-source project by someone else.
@malleswararaomaguluri6344 1 year ago
After 3 or 4 questions, it shows "token limit exceeded". How can I overcome this?
@MungeParty 1 year ago
Summarize each perspective in the chat history by relevance to the question, then break the question down into steps. Repeat that recursively until the total tokens required for context plus chat history plus question are few enough to fit within your model's max.
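The budget check at the heart of that advice (context plus chat history plus question must fit under the model's limit) can be sketched as below. Note the swap: instead of summarizing, this simpler variant drops the oldest turns until everything fits. Token counts are approximated by whitespace-separated words, which is an assumption; a real tokenizer will count differently. `count_tokens` and `trim_history` are hypothetical names.

```python
def count_tokens(text: str) -> int:
    """Rough token estimate: number of whitespace-separated words."""
    return len(text.split())


def trim_history(history: list, context: str, question: str, max_tokens: int) -> list:
    """Drop the oldest chat turns until context + history + question fit.

    The context and question are treated as fixed cost; only the history
    is shortened. Returns a new list and leaves the input untouched.
    """
    fixed = count_tokens(context) + count_tokens(question)
    kept = list(history)
    while kept and fixed + sum(count_tokens(turn) for turn in kept) > max_tokens:
        kept.pop(0)  # discard the oldest turn first
    return kept


# Made-up chat: 3 + 2 + 1 history tokens, 2 context tokens, 1 question token.
history = ["a b c", "d e", "f"]
trimmed = trim_history(history, context="x y", question="q", max_tokens=6)
```

Replacing the `pop(0)` step with a call that condenses the oldest turns into a short summary would bring this closer to the recursive summarization described in the comment.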
@__________________________6910 1 year ago
Are you using a MacBook Pro 14" or 16"?
@1littlecoder 1 year ago
16", why? It's an old one though. I've been thinking of buying an M1 machine, but haven't!
@__________________________6910 1 year ago
@1littlecoder noice
@echofloripa 1 year ago
Isn't it slow because it's CPU based?
@1littlecoder 1 year ago
It is slow exactly because of the CPU speed. The available compute power also makes a difference: you can see the time difference between Colab and my machine. I'll see if I can port these to GPTQ models, which can leverage the GPU as well.
@gayathrik1517 1 year ago
Very helpful, kindly do the needful with a GPU machine as well.
@1littlecoder 1 year ago
Thanks, I'll try to see if this could be ported to GPTQ as well.
@alx8439 1 year ago
6 minutes. Gosh. What does it do during all that time, mine Ethereum?
@1littlecoder 1 year ago
That'd be amazing if it could mine, I'd end up rich $$$ !
@alx8439 1 year ago
@1littlecoder haha. But I'm not sure it's doing that in your favour :)
@edwardlori4627 11 months ago
Damn, it doesn't work :( Can you update this tutorial? Maybe some new tools are available?
@1littlecoder 11 months ago
What was the error?
@edwardlori4627 11 months ago
@1littlecoder Mmm... Looks like YouTube automatically deletes the copied error... I'll try it piece by piece...
@edwardlori4627 11 months ago
After running python main.py "what is the name of the seller ?" this appears...
@edwardlori4627 11 months ago
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url:
@edwardlori4627 11 months ago
@1littlecoder
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-65dee93f-395f6cfe199fc1705cb53901;1c0ae4c6-0efd-4e1e-b6e6-9800213dbd3b)
Repository Not Found for url: h%%%%%%%%%o/api/models/models/mistral-7b-instruct-v0.1.Q5_K_M.gguf/revision/main.
Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. Invalid username or password.
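The repo id in that traceback, models/mistral-7b-instruct-v0.1.Q5_K_M.gguf, looks like a local file path and not a Hugging Face repository name. One possible reading, which the thread does not confirm, is that the GGUF file was missing at that path, so the loader fell back to treating the string as a Hub repo and got a 401. A small check before loading would at least turn this into a clearer error. `require_local_model` is a hypothetical helper, not part of the project.

```python
from pathlib import Path


def require_local_model(path_str: str) -> Path:
    """Fail early with a readable message if the model file is absent.

    Hypothetical guard: call this before handing the path to the model
    loader, so a missing file is not mistaken for a Hub repository id.
    """
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(
            f"Model file not found at {path.resolve()}; download the GGUF "
            "file and place it there before running main.py."
        )
    return path
```

Usage would be a single call such as require_local_model("models/mistral-7b-instruct-v0.1.Q5_K_M.gguf") at the top of the script, with the path taken from the error message above.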
@flethacker 1 year ago
How could it possibly take 1 minute on a 1-page PDF? I don't think this code approach is any good tbh.
@adrianacala1034 1 year ago
1 minute of compute time is significantly cheaper than 1 minute of human time.
@ArtisanTony 1 year ago
I am glad I did not get doxed in this video like these people did :)
@1littlecoder 1 year ago
I'm sorry, I didn't understand the comment well!
@ArtisanTony 1 year ago
@1littlecoder It was humor. The person's name and address on the invoice seem like personal information they would not want out on the internet :)