Want to connect? 💼 Consulting: calendly.com/engineerprompt/consulting-call | 🦾 Discord: discord.com/invite/t4eYQRUcXB | ☕ Buy me a Coffee: ko-fi.com/promptengineering | 🔴 Join Patreon: Patreon.com/PromptEngineering | ▶ Subscribe: www.youtube.com/@engineerprompt?sub_confirmation=1
@Shogun-C 1 year ago
It'd be great if you could create a step-by-step series for all of this aimed at complete novices such as myself, starting from the very beginning and assuming no prior knowledge or other supporting software already installed (e.g. git or conda).
@YounessMazouz 1 year ago
I support this request. Thank you.
@nattyzaddy6555 1 year ago
There are many tutorials on how to get git, conda, and Python set up.
@paulbishop7399 1 year ago
kzbin.info/www/bejne/qHeok61vaKxsi5o
@GregSimon-c1f 1 year ago
How can I clear the cache from the last time I ran the model? I swapped all the docs with a new set of documents, but my localGPT model keeps giving me answers from the old set of docs, which are no longer relevant for this version.
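For anyone hitting this: a minimal sketch of a reset, assuming the localGPT repo layout where the ingested Chroma index is persisted in a DB folder (PERSIST_DIRECTORY in constants.py):

```shell
# Remove the persisted vector store so answers stop coming from stale docs
# (folder name assumed from the localGPT repo's PERSIST_DIRECTORY default)
rm -rf DB
# then rebuild the index from whatever is now in SOURCE_DOCUMENTS:
# python ingest.py
```

Re-running ingest.py afterwards builds a fresh index containing only the current documents.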
@sadiqkavungal9256 1 year ago
Please share: what is the device_type for an Intel Iris GPU?
@shortthrow434 1 year ago
Excellent work. Thank you for the walkthrough, especially on the Mac/Linux side. The modular approach will make it much easier to maintain moving forward.
@engineerprompt 1 year ago
Glad you found it helpful. I agree, that's the plan.
@kevinyuan2735 1 year ago
Thank you!
@engineerprompt 1 year ago
Thank you 🙏
@lalpremi 11 months ago
Thank you for sharing, have a great day 🙂
@Nihilvs 1 year ago
Thank you very much for this project, it's such a pleasure to use!
@engineerprompt 1 year ago
Glad you like it!
@hectornonayurbusiness2631 1 year ago
For installing llama.cpp on Windows, this worked for me:
setx CMAKE_ARGS "-DLLAMA_CUBLAS=on"
setx FORCE_CMAKE 1
pip install llama-cpp-python==0.1.83 --no-cache-dir
Also, if your computer defaults to using the CPU, use --device_type cuda on Windows. Even with all that it kicks me out, BLAS=0.
@kevinfutero7166 1 year ago
Same here… I haven't investigated why yet.
@Yeti9693 9 months ago
@@kevinfutero7166 Any idea why it's not using the GPU?
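One likely culprit for the BLAS=0 results above: setx only writes variables for *future* shells, so a pip build run in the same window never sees CMAKE_ARGS. A sketch of the same install using set instead (cmd syntax; the version pin is taken from the comment above, and note that recent llama.cpp releases renamed the cuBLAS flag to GGML_CUDA):

```bat
REM "set" applies to the current cmd session, unlike "setx"
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
set FORCE_CMAKE=1
REM force a rebuild so a cached CPU-only wheel is not reused
pip install llama-cpp-python==0.1.83 --no-cache-dir --force-reinstall
```

If the rebuild worked, the llama.cpp startup log should report BLAS = 1.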
@venugopalasrinivasa7418 9 months ago
This is a very good project. How do we do fine-tuning using quantization?
@MikeCreuzer 1 year ago
Great update. I appreciate the video format for the updates like this. Thank you.
@engineerprompt 1 year ago
Thank you 🙏
@shreyamahamuni1021 1 year ago
I'm still getting answers from the LLM's training corpus when I ask about anything outside the source documents.
@42svb58 1 year ago
Can this be used with dual K80 GPUs?
@vasimraja6811 1 year ago
What factors should I consider when choosing a cloud server for the above project? Say I am using the Llama-2-7b-Chat-GGUF model: which instance is best, and how much GPU memory is required?
@vservicesvservices7095 1 year ago
Do we need a ton of video RAM for localGPT? Doesn't look like a Lenovo P51 can cope…
@prestonmccauley43 1 year ago
Great new changes. I have customized a lot of the core scripts myself, but maybe you could add an example that just shows how to access the persisted database standalone? I was trying to run some data viz on my Chroma store but ran into issues understanding how to access it on its own.
@intellect5124 1 year ago
Can localGPT be installed on Windows 11? Very exciting video.
@MichealAngeloArts 1 year ago
Great! Thanks for sharing! Does LocalGPT support code-tuned LLMs such as Codellama?
@stephenthumb2912 1 year ago
No reason it shouldn't, as long as you're using the right format (GGUF, GGML, GPTQ); no clue on others like gpt4.
@engineerprompt 1 year ago
Yes
@deltawp 1 year ago
Hello, thank you for this video and the brilliant work. How is it possible to force the model to respond in a language other than English (in my case, French)? The model I use, Mistral 7B, already knows how to respond in French, and yet most answers are in English.
@ps3301 1 year ago
How is localGPT different from the Quivr project?
@stupidbollywood4066 1 year ago
Hey, I have an M1 with 8 GB RAM. Can I run a 4-bit quantized model?
@edwardchan3521 1 year ago
Hi, I like this project and want to try it on my notebook. However, it has only 6 GB of VRAM. Should I use an OpenAI API key instead? Being a layman in machine learning and Python, I would like to know the VRAM requirements for embeddings. If no VRAM is needed, I may try it on an old notebook without a dedicated GPU. Thanks a lot. :)
@tiagoneto3902 1 year ago
Congrats on the great content, it is flawless! 🎉 Are we able to connect the project with Slack and build a chatbot?
@ninahaller9913 11 months ago
Hi, great video! You said that if the "BLAS" variable is set to 1, llama.cpp uses my GPU, and if it is set to 0, it does not. I have an M1 Mac and I want to run it on the CPU, which I specify with --device_type cpu. However, BLAS is still set to 1. Can someone explain? The LLaMA 7B Chat model also takes forever to answer, and if I load other models there is no answer at all.
@engineerprompt 11 months ago
BLAS=1 means that llama.cpp is able to see your GPU. If you explicitly set device_type to cpu, then the code will use the CPU; that might explain why it's running so slowly. How much RAM do you have on your system, and what quantization level are you using?
@khalidal-reemi3361 1 year ago
What if I use Ubuntu with a CPU? Is there any change? I am struggling with llama-cpp.
@shreyamahamuni1021 1 year ago
I'm using the CPU and getting an error like "'int' object is not callable".
@hassentangier3891 1 year ago
Can it produce structured articles from prompts?
@Antaromran 1 year ago
Thanks for sharing this. When is support for Falcon expected?
@engineerprompt 1 year ago
You should be able to run GGUF and GGML versions of models, including Falcon, even now.
@wasserbesser 1 year ago
Amazing work! It would be really awesome if you could give it a GUI and a one-click installer, like other tools already have, such as GPT4All or Subtitle Edit, for Mac, Windows, and Linux. This would extend the user base dramatically and get you more coffees ;)
@engineerprompt 1 year ago
Thanks for the idea! Will see what I can put together.
@jafizzle95 1 year ago
@@engineerprompt Seconding this request. I've spent the last 18 hours installing (and uninstalling and reinstalling and then uninstalling and reinstalling again) literally hundreds of gigabytes of CUDA and Visual Studio nonsense trying to get this thing to work.
@BabylonBaller 1 year ago
@@engineerprompt I agree, it needs a GUI. No one prefers a command prompt over a GUI.
@engineerprompt 1 year ago
@@BabylonBaller There are two UI options now: one via the API, and a dedicated UI via Streamlit. I am working on a Gradio one that will make things much easier.
@BabylonBaller 1 year ago
@@engineerprompt Much appreciated, will look into Streamlit.
@Koorawithsedky 1 year ago
This is exactly what I have been looking for for quite some time. I am just wondering if I can use it for generating code in a specific structure, by ingesting documents as .py or .java files and using a code-generation model, so that it can generate code in a specific structure as well as spot a snippet of code that implements a particular functionality?
@engineerprompt 1 year ago
This is more of a search feature. Basically, it will be looking for specific information in the documents. You could use it to retrieve a certain function or code snippet, but then you will need a subsequent LLM call to make use of it.
@blackstonesoftware7074 1 year ago
How can I use an Arabic LLM with this? How would I set that up, and what steps would I need to take? This is awesome!!!
@sbacon92 1 year ago
Excellent updates. Now can you write it in Node.js?
@engineerprompt 1 year ago
I don't have experience with Node.js, but hopefully someone can implement it.
@wennis89 1 year ago
Thanks, this video was extremely helpful! Could you make a video on how to use a language other than English with the local GPT? For example, if I feed it documents in Swedish, can I ask questions in Swedish and also get answers in Swedish? Is that possible?
@AI-Tech-Stack 1 year ago
Should be the same; just use a multilingual embedding model.
@engineerprompt 1 year ago
Yes, as mentioned above, you need an embedding model that supports the language you are working with, as well as an LLM that does.
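As a sketch, that swap lives in localGPT's constants.py; the model name below is just one example of a multilingual sentence-transformers embedding model, not a specific recommendation:

```python
# constants.py (localGPT): point the embedding model at a multilingual one
# so that, e.g., Swedish documents and queries are embedded meaningfully.
# "intfloat/multilingual-e5-large" is one example; pick per your languages.
EMBEDDING_MODEL_NAME = "intfloat/multilingual-e5-large"
```

After changing the constant, re-run ingestion so the documents are re-embedded with the new model.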
@juandavidpenaranda6136 1 year ago
@@engineerprompt Do you know which LLM from TheBloke (or on Hugging Face) provides answers in Spanish? My docs are in Spanish. I tried some models called Falcon, RoBERTa, and BERT, but they are not compatible with localGPT. Thanks in advance. Amazing project!
@caiyu538 1 year ago
Great, great, great!
@photize 1 year ago
setx in Visual Studio gives a syntax error!
@kevinfutero7166 1 year ago
Just use "set VARIABLE_NAME=value" instead. Note that set applies to the current session, while setx only takes effect in future shells.
@42svb58 1 year ago
Would love the code walkthrough!
@richierosewall3035 1 year ago
If the PDF has tables in it, they are not extracted in the same format as the original. What is the best way to ingest a PDF containing tables with a lot of missing values?
@engineerprompt 1 year ago
Look at the unstructured loader for working with PDFs.
@richierosewall3035 1 year ago
@@engineerprompt Tried it. Row/column alignments are mismatched after loading. Because of that, the LLM gives incorrect responses when asked n×m table-based questions.
@malikrumi1206 1 year ago
With regard to the splitting operation, I want to split on paragraphs, not random chunks. Can localGPT accommodate that out of the box, or would I need to modify your source code? Could I achieve my result with a decorator? Thx
@engineerprompt 1 year ago
It's using the RecursiveCharacterTextSplitter, which tries paragraph boundaries first when splitting, so I think it will work for your use case.
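For reference, the paragraph-first behaviour can be sketched in a few lines of plain Python (a simplified illustration, not LangChain's actual implementation):

```python
def split_on_paragraphs(text, chunk_size=1000):
    """Greedy paragraph-first splitter: packs whole paragraphs into
    chunks of at most chunk_size characters. Oversized paragraphs are
    kept whole in this sketch; the real splitter recurses down to
    lines, words, and characters when a piece is still too big."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # +2 accounts for the "\n\n" separator re-joined between paragraphs
        if current and len(current) + len(para) + 2 > chunk_size:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Out of the box, LangChain's RecursiveCharacterTextSplitter already tries "\n\n" first, so tuning chunk_size and chunk_overlap is usually enough to get paragraph-aligned chunks.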
@jimitnaik 1 year ago
How well does it work on Excel or CSV files? Overall, great info and thanks for sharing the update.
@engineerprompt 1 year ago
This setup will work with CSV and Excel files, but you will need to experiment with embeddings and models for better performance.
@ImranKhan-wr2il 1 year ago
Is anyone else struggling to run llama.cpp on Windows using cuBLAS? No matter what, BLAS is always 0 😐
@engineerprompt 1 year ago
If you have an NVIDIA GPU, my recommendation is to use GPTQ models.
@ImranKhan-wr2il 1 year ago
@@engineerprompt GPTQ is giving nice inference speed on my NVIDIA 3070 Ti, but I am struggling to use ConversationBufferMemory with it. What options do we have for memory with GPTQ models?
@broomandmopmop 1 year ago
Can you release the code you are showing? Every example I am finding uses OpenAI. I made several changes locally to localGPT, but I would like to see this spin on it.
@engineerprompt 1 year ago
Code is in the localGPT repo.
@broomandmopmop 1 year ago
Oh OK, my bad. Busy day writing code; when I saw this video notification I thought it was new code. Sorry @@engineerprompt
@ayanbandyapadhyay 1 year ago
I'd like to have a cup of coffee with you. Great.
@craig6095 1 year ago
Dude, you gotta start learning to put relevant links in your videos.
@vivekkawathalkar9071 1 year ago
Can I use this with PDFs and docs that have complex tables and images?