FULLY LOCAL Mistral AI PDF Processing [Hands-on Tutorial]

Views: 27,152

1 day ago

Using faiss, sentence transformers and ctransformers, we have a fully functional, completely LOCAL AI-powered PDF processing engine powered by Mistral AI
Colab - colab.research...
Katana ML's Project - github.com/kat...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1lit...
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Linkedin - / amrrs

Comments: 58

@jorgerios4091 1 year ago

Wow, practical use videos are back, thanks!

@1littlecoder 1 year ago

Haha Thanks!

@L1merick 1 year ago

Yup, love the practical use-case videos! ❤

@azuremis 1 year ago

Epic video, really helped me out with my first local use of an LLM!

@1littlecoder 1 year ago

Glad it helped!

@azp7 8 months ago

Thank you! I had some issues with some Python packages, which I was able to resolve. Your hands-on tutorial is great, very practical!

@amindforevervoyaging3789 1 year ago

Thanks. Keep up the great work 🙂

@1littlecoder 1 year ago

Thank you too!

@dreamwalker1739 7 months ago

How do I get the output in JSON format? I used a JSON output parser, but it returns the schema instead of the output. How can I fix this?

@santanuc 10 months ago

This is good, but can it handle complex unstructured tables or text with images?

@__________________________6910 1 year ago

9:57 "I'm not just cheating around" 🤣

@1littlecoder 1 year ago

Haha. People make such allegations, so I wanted to make sure 😁

@__________________________6910 1 year ago

@@1littlecoder lol

@eriktedja1199 1 year ago

Is it possible to handle hundreds of documents, each around 200-300 pages long?

@gabscar1 1 year ago

It's for invoicing, not for reading large PDFs. Video by the creator here: kzbin.info/www/bejne/b4Oog6impsh5gpY

@papitou4140 1 year ago

Overall, which open source model is the best at the moment for creating chatbots? Only the ones that are truly open source and free for commercial use, without any Llama licenses. Thanks.

@rakshaballa 6 months ago

Can I know how you solved the error shown at 7:49?

@SanketRaut-x6u 1 year ago

Great tutorial. Why does this model take so much time to answer the question?

@RonanChardonneau-b7b 1 year ago

How can you make it analyze another PDF? I deleted the sample invoice PDF and uploaded another file (300 pages), and the answers given by Mistral say that it doesn't have the answers to my questions.

@MichealScott24 1 year ago

❤

@Sri_Harsha_Electronics_Guthik 1 year ago

Nice! GPU version?

@jurandfantom 1 year ago

Did I miss the information on how big a file it can process? I have an 800-page PDF book, for example. I don't think hardware would be the limiting factor in my case (24 GB VRAM, 128 GB RAM).

@hossromani 1 year ago

Thanks. When I pip install, it modifies the local packages. Would you recommend Conda or Docker?

@NageshPindi-y3v 11 months ago

Will it work for HTML files?

@lgprasad 1 year ago

But using an open source LLM, is it possible to restrict the RAG to the documents only, in this case the PDF(s)? Won't it answer generic questions too?

@1littlecoder 1 year ago

Yes

@lgprasad 1 year ago

@@1littlecoder Meaning, can we restrict the responses? If so, can you guide me on how?
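One common way to approach the restriction asked about here is to put the retrieved PDF chunks into the prompt and instruct the model to answer only from them. The following is a minimal sketch of that idea, not the method used in the video or in the Katana ML project. The names `build_grounded_prompt`, `answer`, `REFUSAL` and the `llm` callable are hypothetical, and a prompt instruction like this reduces off-document answers but does not guarantee that the model will never give them.

```python
from typing import Callable, List

# Hypothetical fixed refusal text returned when nothing was retrieved.
REFUSAL = "I don't know."


def build_grounded_prompt(context_chunks: List[str], question: str) -> str:
    """Build a prompt that asks the model to answer only from the given chunks."""
    context = "\n\n".join(c.strip() for c in context_chunks if c.strip())
    return (
        "Answer the question using only the context below. "
        f"If the answer is not in the context, reply exactly: {REFUSAL}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer(context_chunks: List[str], question: str, llm: Callable[[str], str]) -> str:
    """Refuse without calling the model when retrieval returned nothing.

    `llm` is an assumed callable that takes a prompt string and returns text,
    for example a thin wrapper around a locally loaded model.
    """
    chunks = [c for c in context_chunks if c.strip()]
    if not chunks:
        return REFUSAL
    return llm(build_grounded_prompt(chunks, question))
```

Skipping the model call when no chunks were retrieved is the only hard guarantee in this sketch. Everything else depends on how well the model follows the instruction.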

@AbishekEkhool 1 year ago

Is fine-tuning a quantized model possible? If it is, please make a video on that.

@philtoa334 1 year ago

Nice.

@1littlecoder 1 year ago

Thank you! Cheers!

@antiinfluencer2269 10 months ago

The link to the GGUF model is outdated.

@__________________________6910 1 year ago

LOCAL - I love local private things

@j.k.priyadharshini9753 8 months ago

katanaml? Does it belong to Mistral AI?

@1littlecoder 8 months ago

No, it's an open source project by someone else.

@malleswararaomaguluri6344 1 year ago

After 3 or 4 questions, it shows "token limit exceeded". How can I overcome this?

@MungeParty 1 year ago

Summarize each perspective in the chat history by relevance to the question, then break the question down into steps. Repeat that recursively until the tokens required for context plus chat history plus question add up to few enough to fit under your model's max.
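The budget check behind this suggestion can be sketched in a few lines. This is a simplified, non-recursive variant under stated assumptions: `count_tokens` is a crude whitespace word count standing in for a real tokenizer, `naive_summarize` is a hypothetical placeholder that only truncates (in practice it would be a call to the model asking for a short summary), and `fit_history` is an illustrative name. It compresses the oldest turns first and drops them only if compression is not enough.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def naive_summarize(text: str, keep_words: int = 8) -> str:
    """Hypothetical placeholder summarizer: keeps only the first few words."""
    words = text.split()
    if len(words) <= keep_words:
        return text
    return " ".join(words[:keep_words]) + " ..."


def fit_history(
    history: List[str],
    context: str,
    question: str,
    max_tokens: int,
    summarize: Callable[[str], str] = naive_summarize,
) -> List[str]:
    """Shrink chat history until context + history + question fit in max_tokens."""
    turns = list(history)  # work on a copy, leave the caller's list untouched

    def total() -> int:
        return (
            sum(count_tokens(t) for t in turns)
            + count_tokens(context)
            + count_tokens(question)
        )

    # Pass 1: summarize turns from oldest to newest until the budget is met.
    for i in range(len(turns)):
        if total() <= max_tokens:
            break
        turns[i] = summarize(turns[i])

    # Pass 2: if summaries are still too long, drop the oldest turns.
    while turns and total() > max_tokens:
        turns.pop(0)

    return turns
```

If the retrieved context and the question alone exceed the limit, this returns an empty history and the prompt is still too long. In that case the number or size of retrieved chunks has to be reduced instead.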

@__________________________6910 1 year ago

Are you using a MacBook Pro 14” or 16”?

@1littlecoder 1 year ago

16". Why? It's an old one though. I've been thinking of buying an M1 machine, but haven't!

@__________________________6910 1 year ago

@@1littlecoder noice

@echofloripa 1 year ago

Isn't it slow because it's CPU based?

@1littlecoder 1 year ago

It is slow exactly because of the CPU speed. The compute power also makes a difference: you can see the time difference between Colab and my machine. I'll see if I can port these to GPTQ models, which can leverage the GPU as well.

@gayathrik1517 1 year ago

Very helpful. Kindly do the same with a GPU machine as well.

@1littlecoder 1 year ago

Thanks, I'll try to see if this could be ported for GPTQ as well

@alx8439 1 year ago

6 minutes. Gosh. What does it do during all that time, mine Ethereum?

@1littlecoder 1 year ago

That'd be amazing if it could mine, I'd end up rich $$$ !

@alx8439 1 year ago

@@1littlecoder Haha. But I'm not sure it's doing that in your favour :)