If you have empty content strings in chat completion, try adding the `--chat_format llama-2` flag when running the server; change `llama-2` to the appropriate handle for your model.
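For example, with the llama-cpp-python server the flag sits alongside the model path. A minimal launch sketch (the GGUF filename is a placeholder; substitute your own download):

```shell
# Launch the OpenAI-compatible llama-cpp-python server with an explicit
# chat template. The model path below is a placeholder.
python3 -m llama_cpp.server \
  --model ./mistral-7b-instruct-v0.2.Q4_K_M.gguf \
  --chat_format llama-2 \
  --n_gpu_layers -1   # offload all layers to GPU if one is available
```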
@foreignconta · 8 months ago
You don't actually need both `llama-cpp-python[server]` and llama.cpp. llama-cpp-python is just the Python bindings on top of llama.cpp. If you are setting up a server, one of them is enough!
@NicholasRenotte · 7 months ago
Legend, cheers man!!
@zdeacon · 7 months ago
If you already have llama.cpp installed, you can just cd to your llama.cpp folder and run "./server -m PATH_TO_YOUR_MODEL.gguf". This will start the server just the same, except I believe it will auto-offload layers to the GPU. From there, just continue the tutorial building the app.py script. THANK YOU NICHOLAS FOR ANOTHER AWESOME GUIDE!!
@Er1ku · 8 months ago
Eyyyy, this is a really great topic! I obviously haven't listened to it all yet, but this will help me build some of the apps I have been meaning to create that rely on LLMs but haven't followed through with due to pricing.
@NicholasRenotte · 8 months ago
Yeah, can definitely go way more hardcore than I did. This covers all the foundations though!
@spectra5024 · 7 months ago
I love your content, thanks for teaching great stuff. It made me fall in love with AI. I'm glad I'm doing my bachelor thesis related to CV and continual learning. Thanks Nicholas!!!
@FireFly969 · 7 months ago
Months ago I was doing a project to identify specific patterns in trading images, but at the time these types of videos weren't popular. Now it looks easy to make an AI that can do the job and give me the best patterns. If I wanted to do it manually, I'd need to go through 500 or more assets and check their candlestick data daily, which would take a lot of energy and time, at least 2 hours each day, and afterwards my energy would be so low I'd need a small nap to recharge.
@lonesharp1106 · 6 months ago
I know exactly how you feel 😂
@Jhedataprofesorjr · 4 months ago
Hi, I was also stuck on the same problem. How did you manage to solve it? If possible, please send me the GitHub repo :) it would help a lot.
@ronnieleon7857 · 7 months ago
Lately I have been working with open-source LLMs such as Yi, Solar, Mistral, and Mistral merged models such as NeuralBeagle, and the biggest challenge has been how to deploy these models and use them in prod. This video has really solved the challenge I was facing, and I'm definitely going to use this approach. The biggest problem is that not many servers have GPUs, so one drawback I foresee is the time it'll take to generate a response for an app.
@tester0083 · 7 months ago
Great video, you have a new subscriber. Hope to see more vids showing us how to combine models and functions for the financial stuff.
@FireFly969 · 7 months ago
Wonderful project. The part I love most is when we make it able to use our functions in real time.
@abhishekdalakoti2419 · 4 months ago
I like your work, and you have an edge over others because you show things by doing them practically rather than theoretically.
@user-ix3kj2vn6i · 8 months ago
Man, thank you for your videos! This one is amazing!!! I have created my own AI without internet. Very cool thing!
@johnHTC23 · 7 months ago
Hey Nicholas, loved this video and content! I hope you do a more in-depth project with finance stuff and maybe incorporate a multi-agent system like Pythagora GPT-Pilot, CrewAI, or something similar that's open source.
@santicastrovilabella3248 · 8 months ago
Please do a bigger video on this ❤
@zejiaann · 7 months ago
I am facing a `ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:3:10: fatal error: 'ggml-common.h' file not found #include "ggml-common.h"` error when spinning up the local server. Any idea how to fix this?
@edinsonriveraaedo292 · 7 months ago
Hi Nicholas, I want to thank you for your work here. It is really helpful and very well explained. Thanks a lot.
@PoGGiE06 · 7 months ago
Quick question: why use quantised Mistral 7B rather than quantised Mixtral 8x7B? And presumably it can be fine-tuned with e.g. QLoRA? And one can use RAG? Great video, think this is just what I was looking for!
@MohamedKeddache-r1o · 7 months ago
Please make a video about RAG with Ollama with some advanced techniques of your choice, and use agents.
@GeobotPY · 7 months ago
Really love this! Continue making LLM vids!
@GeobotPY · 7 months ago
And definitely want a function-calling course. Perhaps retrieval also.
@santicastrovilabella3248 · 8 months ago
What is the difference (or the advantage) of using this approach vs using LangChain for this function-calling and multi-model pipeline?
@NicholasRenotte · 7 months ago
I like this because I can build pipelines myself without relying on a cloud provider. LangChain is usually pretty tightly coupled to OpenAI, and when stuff doesn't work you have to go digging through a ton of code. Admittedly haven't played with it for a while, so it might have gotten better!
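Rolling your own function calling is mostly JSON plumbing: describe your tools to the model, then parse the JSON it emits and dispatch to a local function. A minimal sketch; the tool, its stand-in price data, and the simulated model output are all hypothetical:

```python
import json

# Hypothetical local tool the model is allowed to call.
def get_stock_price(ticker: str) -> float:
    prices = {"AAPL": 189.84}  # stand-in data for the sketch
    return prices.get(ticker, 0.0)

# Registry mapping tool names (as the model emits them) to callables.
TOOLS = {"get_stock_price": get_stock_price}

def dispatch(tool_call_json: str):
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulated model output; a real run would pull this from the chat completion.
model_output = '{"name": "get_stock_price", "arguments": {"ticker": "AAPL"}}'
print(dispatch(model_output))
```

The upside of doing this by hand rather than through LangChain is that the whole control flow is a dozen lines you can step through in a debugger.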
@smtamimmahmud · 7 months ago
Do I need to install CUDA beforehand to use the NVIDIA GPU? And does the model size matter with the `--n_gpu_layers -1` flag?
@fr_ey24 · 7 months ago
Hey Nick, not related to this video but it's the most recent one. I have a question about sign language translation; was wondering if you could help me out.
@aishashaikh197 · 7 months ago
Hi, during the summarization step it's taking too long for Mixtral to execute, and hence it is giving me an execution timeout error. Is it because of an execution time issue, or am I missing a library? Could you please help me with that?
@g.s.3389 · 8 months ago
Isn't what you have done the same as running an Ollama server and then using the same API on it?
@NicholasRenotte · 7 months ago
🤦♂️ Yeah, ngl, I wish I knew about Ollama to begin with 😅 Ollama is built on top of llama.cpp, so same outcome. The latter half shows integration and maybe some extra stuff that'd be useful!
@MCAUptown · 6 months ago
Yes, we want a hardcore function-calling video.
@potofwisdomquotes · 7 months ago
What's the best machine to launch this locally? Or which machine specs did you use?
@NicholasRenotte · 6 months ago
I did this on my Mac and it runs 👌🏽 (Mac M1 Max with 32GB RAM). Works well on my Windows machine as well: Ryzen 3700X with a 2070 Super.
@snuwan · 8 months ago
Something like Ollama is essentially doing the same thing, right? I use it every day to run LLMs locally.
@NicholasRenotte · 7 months ago
Yeah, I went through Ollama last night and I was like, ahh damn, that looks way more streamlined. Ollama is built on top of llama.cpp though, so I guess this goes into the nuts and bolts. But 100%, if you're using Ollama already then maybe no need to switch it up!
@mohammadanash8122 · 10 days ago
How are you getting the response in markdown format?
@Maxisnice · 7 months ago
Heyy, I am a really big fan of your work. I am currently working on your action recognition for sign language project and I have a doubt about it. Can you pleaseee reply to my comment on that video?
@endo9000 · 8 months ago
May I ask why you haven't used Ollama?
@NicholasRenotte · 8 months ago
Good question, no particular reason. I just started dev for this with llama.cpp and kept going! Any reason why you prefer Ollama? Curious, maybe next vid?
@endo9000 · 8 months ago
@@NicholasRenotte I mean, I use Ollama daily, but I imagine it would have shortened your workflow at the beginning (also llama.cpp is harder to get set up) :) Still informative though! Watched till the end. 👍
@NicholasRenotte · 7 months ago
Hey, thanks man, will check it out!!
@ronakttawde · 8 months ago
Kindly make a pre-announcement for Gen-AI videos.
@NicholasRenotte · 7 months ago
Whatcha mean? Like premiere these vids?
@ronakttawde · 7 months ago
@@NicholasRenotte Yeah. You are an awesome man and a God-given gift serving a good purpose with DSC technology skills and talents. Keep up this work...😘🙏🤟😎
@nelohenriq · 7 months ago
How can I use this example with a Google Colab notebook?
@rishichowdhury4296 · 7 months ago
Hey Nick, nice video!! But can you suggest an alternative to Mixtral, as it is taking too much RAM and I am unable to load it into memory? Thanks!
@SelimMakni · 7 months ago
Hi! 😁 I can't install `llama-cpp-python[server]` on my Intel macOS Monterey. It says "Failed building wheel for llama-cpp-python", and if I change to version 0.1.48 then it says "--model not found", as if it didn't recognize llama-cpp-python. I have been stuck on it for ages!!! Any suggestions please? I also installed CMake, naaah, nothing worked. Thanks in advance 😁😢 (crying from the inside xD)
@thelordofnill4939 · 8 months ago
Looks super cool!!!
@NicholasRenotte · 8 months ago
Cheers!!
@premier2254 · 8 months ago
Thank you, mister. It's nice!
@NicholasRenotte · 7 months ago
Cheers!!
@cfk-oz · 8 months ago
I am guessing Ollama would have made it easier.
@NicholasRenotte · 7 months ago
💯 😅 Yeah, I'm finding that out now.
@mariamshittu2832 · 7 months ago
Hi Nicholas, thank you for all these beautiful videos. I have really learnt a lot from you since I stumbled upon your videos. I would like you to do a video on an image headshot generator. I would be so happy if you could do that. If you have a video on that already, please share the link. Thank you so much.
@DragonyEstorial · 7 months ago
First: great video, I love it. Second: how can I bring both llama servers together into one, to use LLaVA and the other models without starting two servers? ^^
@LukasHirt-qn2sf · 7 months ago
This is amazing!!! Do you think we could put this into code to make it a connected trading bot?
@farouktouil5036 · 8 months ago
Can you tell me how much disk space the project needs?
@NicholasRenotte · 7 months ago
The majority of the space is going to come from the models. I think the largest model I had was ~15GB, which was Mixtral.
@farouktouil5036 · 7 months ago
90GB SSD left, 24GB RAM, Core i5 CPU. I'll give it a try. Thanks so much, your tutorials are excellent, Nick!
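As a rough back-of-envelope check on those model sizes: a quantized GGUF is approximately parameter count times bits per weight. The bit widths below are approximations for typical Q4_K_M and Q2_K quants, not exact figures:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough weights-only size estimate for a quantized model, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

# ~7.2B-param Mistral at ~4.5 bits/weight lands near 4 GB;
# ~46.7B-param Mixtral at ~2.6 bits/weight lands near 15 GB.
print(quantized_size_gb(7.2e9, 4.5))
print(quantized_size_gb(46.7e9, 2.6))
```

Whatever RAM the weights take on disk is roughly what they take when loaded, plus some overhead for the KV cache, so a 24GB machine is comfortable for Mistral but tight for Mixtral.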
@thewatersavior · 7 months ago
DEVIKA next episode, please.
@10x_y24 · 7 months ago
Man, you're rocking, all the best insh'Allah.
@IhebAkrimi-jh2ib · 7 months ago
Thank you for this great video. I'm wondering about using this in production and fine-tuning it to a specific context by using the OpenAI prompt; is that possible without using the real API?
@faizaanshaikh3887 · 22 days ago
Sorry, new to all of this. Aren't you using the Mistral AI model in this video and not OpenAI?
@Afzal3000 · 8 months ago
Will this project work if I make this app and deploy it on some free hosting service???
@NicholasRenotte · 8 months ago
Yep, would just need to host the llama server somewhere as well!!
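Once the server is hosted, the app only needs its URL, because the server speaks the OpenAI chat-completions wire format. A sketch that builds (but does not send) such a request; the host, port, and prompt are placeholder assumptions:

```python
import json
import urllib.request

# Assumed host/port for the hosted llama server; change to wherever you deploy it.
BASE_URL = "http://localhost:8000/v1/chat/completions"

def build_request(url: str, body: dict) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-style chat completion request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

payload = {
    "model": "local-model",  # name is largely ignored by local servers
    "messages": [{"role": "user", "content": "Summarize today's AAPL news."}],
    "temperature": 0.7,
}

req = build_request(BASE_URL, payload)
# urllib.request.urlopen(req) would send it once the server is up.
print(req.full_url)
```

Swapping between a local dev server and a hosted one is then just a change to `BASE_URL`.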
@adeelhasan7536 · 8 months ago
@@NicholasRenotte Will the llama.cpp server be able to handle multiple requests at the same time, i.e. batched inference?
@kevinalexis9886 · 8 months ago
Awesome, great video!
@NicholasRenotte · 8 months ago
Cheers Kevin!
@hemanthrayavarapu1663 · 8 months ago
Can we do this with our local data?
@NicholasRenotte · 7 months ago
Sure can!
@deanchanter217 · 7 months ago
Again, super awesome content! But I have to ask: what styler are you using that greys out the code when there is an error?
@Jay-lo6kz · 8 months ago
Hey Nick, can you please look into behaviour cloning/imitation learning and apply it to games? Please, man, I am working on it and it is damn good, but I am getting stuck. I could use your support.
@NicholasRenotte · 8 months ago
Alrighty, will take a look!
@AnonymousAccount514 · 8 months ago
Do you need at least 32 GB of RAM?
@NicholasRenotte · 8 months ago
It'll vary per model, but Mistral was roughly 4GB. Mixtral took up a fair bit more!
@pakshaljain1969 · 8 months ago
Damn, this is good stuff.
@NicholasRenotte · 7 months ago
Cheers man!!!
@properjob2311 · 8 months ago
Wow, amazing. What is the minimum/recommended spec machine to run this on?
@NicholasRenotte · 7 months ago
Check this out: github.com/ggerganov/llama.cpp/issues/13
@philtoa334 · 8 months ago
Nico 🤩
@NicholasRenotte · 7 months ago
Ayyyy Phil!! How you doing?!?
@giftahmed · 8 months ago
Nice ❤
@shantanugote · 7 months ago
Wait until he sees Ollama 😂
@iaminvisible.2433 · 8 months ago
Letsssss gooooooooooo
@NicholasRenotte · 8 months ago
Ayyyyyyy!
@AzeemKhan-yt4tm · 6 months ago
Yo, AI-generated David Goggins "Be hard, mf" lmao, yoo, where have you been?
@lokeshart3340 · 7 months ago
Could you please recreate the Gemini demo live with LLaVA or any VLMs? Webcam live streaming please ❤❤
@dj49_aakashn92 · 8 months ago
👍🏻
@arnaudlelong2342 · 8 months ago
lol bro
@NicholasRenotte · 8 months ago
Yeah, went a little hardcore with this one 😂
@arnaudlelong2342 · 8 months ago
@@NicholasRenotte lol, dig you, my man
@datawaly · 7 months ago
Hello Nicolas, I hope you are doing well. Can I get the link to your Discord to join your community? I like your courses so much.
@10x_y24 · 7 months ago
Could you make a collaboration with @PythonSimplified? All the best insh'Allah.