How to Build a Fake OpenAI Server (so you can automate finance stuff)

Рет қаралды 32,747

Күн бұрын

Пікірлер: 102

@AustinRoutt 7 ай бұрын

If you have empty content strings in chat completion, try adding the “-chat_format llama-2” flag when running the server; change llama-2 to the appropriate handle for your model.

@foreignconta 8 ай бұрын

You don't actually need both the `llama-cpp-python[server]` and llama-cpp. The llama-cpp-python[server] is just the python bindings on top of llama-cpp. If you are setting up a server, just one of them is enough!

@NicholasRenotte 7 ай бұрын

Legend, cheers man!!

@zdeacon 7 ай бұрын

If you already have llama.cpp installed, you can just cd to your llama.cpp folder and run "./server -m PATH_TO_YOUR_MODEL.gguf". This will start your server just the same, except I believe it will auto-offload layers to GPU. From there just continue the tutorial building the app.py script. THANK YOU NICHOLAS FOR ANOTHER AWESOME GUIDE!!

@Er1ku 8 ай бұрын

Eyyyy, this is a really great topic! I obviously haven't listened to it all yet but this will help me build some of the apps I have been meaning to create that rely of LLMs but I haven't followed through with due to pricing.

@NicholasRenotte 8 ай бұрын

Yeah, can definitely go way more hardcore than I did. This covers all the foundations though!

@spectra5024 7 ай бұрын

I love your content, thanks for teaching great stuff. It just made me fall in love with AI. I'm glad I'm doing my bachelor thesis related with CV and Continual Learning. Thanks Nicholas!!!

@FireFly969 7 ай бұрын

Months ago, I was doing a project that can identify specific patterns in trading images, but at that time, these type of videos were not popular, now, it looks easy to make an ai that can do the job, and give me the best patterns, as If I want to do it manually, I need to go trough like 500 or more of assets and see their candlesticks data, daily, Which will take me a lot of energy and time to do it each day. At least 2hours, and after it the energy will be so low, that I need to go and have a small nap to recharge.

@lonesharp1106 6 ай бұрын

I know exactly how you feel 😂

@Jhedataprofesorjr 4 ай бұрын

Hi I was also wondering in the same problem how did you achieve to solve the problem if possible please send me the github repo :) it will help alot

@ronnieleon7857 7 ай бұрын

Lately, I have been working with open-source LLMs such as Yi, Solar, Mistral, Mistral merged models such as neuralbeagle and the biggest challenge has been how I can deploy these models and use them in prod. This video has really solved the challenge I was facing and I'm definitely going to use this approach. The biggest problem is that not many servers have GPUs so one drawback that I foresee is the time it'll take to generate a response to an app.

@tester0083 7 ай бұрын

great video, you have a new subscriber, hope to see more vids showing us how to combine models and functions for the financial stuff

@FireFly969 7 ай бұрын

Wonderfull project, the part I love the most, is when we make it able to use our functions in real time.

@abhishekdalakoti2419 4 ай бұрын

I like your work and you have an edge over others because you show by doing practically rather than theoretically.

@user-ix3kj2vn6i 8 ай бұрын

Man thank you for your videos! This one is amazing!!! I have created my own AI without internet. Very cool thing!

@johnHTC23 7 ай бұрын

Hey Nicholas loved this video and content! I hope you do a more in-depth project with finance stuff and maybe incorporate an Multi-Agent System like Pythagora GPT-Pilot or Crew AI or something similar that's open source.

@santicastrovilabella3248 8 ай бұрын

Please do a bigger video on this ❤

@zejiaann 7 ай бұрын

I am facing "ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:3:10: fatal error: 'ggml-common.h' file not found #include "ggml-common.h" error, when spinning up the local server any idea how to fix this?

@edinsonriveraaedo292 7 ай бұрын

Hi Nicholas, I want to thank you for your work here. It is really helpfull and very well explained. Thanks a lot.

@PoGGiE06 7 ай бұрын

Quick question: why use quantised mistral 7b rather than mixtral quantised 8x7b? And presumably it can be finetuned with e.g. qlora? And one can use RAG? Great video, think this is just what I was looking for!

@MohamedKeddache-r1o 7 ай бұрын

please make a video about RAG with Ollama with some advance technics of your choice and use an agents

@GeobotPY 7 ай бұрын

Really love this! Continue making LLM vids!

@GeobotPY 7 ай бұрын

And def want a function calling course. Perhaps retrieval also

@santicastrovilabella3248 8 ай бұрын

What is the difference (or the advantage) of using this approach vs using Langchain for this function calling and multi model pipeline?

@NicholasRenotte 7 ай бұрын

I like this because I can build pipelines myself without relying on a cloud provider. Langchain is usually pretty tightly coupled to OA and when stuff doesn’t work you have to go digging through a ton of code. Admittedly haven’t played with it for a while so it might have gotten better!

@smtamimmahmud 7 ай бұрын

Do I need to install CUDA beforehand to use the nvidia GPU? And does the model size matter with --n_gpu -1 flag?

@fr_ey24 7 ай бұрын

Hey Nick not related to this video but it’s the most recent one. I have a question about Sign Language Translation, was wondering if you could help me out

@aishashaikh197 7 ай бұрын

Hi, during the summarization step it's taking too long for mixtral to execute and hence it is giving me execution time out error, is it because of the execution time issue or am I missing any library, could u plz help me with that

@g.s.3389 8 ай бұрын

what you have done isn't it the same of running OLLAMA server and than using the same API on it?

@NicholasRenotte 7 ай бұрын

🤦‍♂️ yeah ngl I wish I knew about ollama to begin with 😅 ollama is built on top of llama cpp so same outcome. The latter half shows integration and maybe some extra stuff that’d be useful!

@MCAUptown 6 ай бұрын

Yes we want a hardcore function call video

@potofwisdomquotes 7 ай бұрын

What's the best machine to launch this locally. Or which machine specs dud u usw

@NicholasRenotte 6 ай бұрын

I did this on my Mac and it runs 👌🏽 (Mac M1 Max with 32GB Ram). Works well on my Windows machine as well Ryzen 3700x with 2070 Super.

@snuwan 8 ай бұрын

Something like ollama essentialy doing doing the same thing right. I use it everyday to run llms locally

@NicholasRenotte 7 ай бұрын

Yeah I went through ollama last night and I was like ahh damn that looks way more streamlined. Ollama is built on top of lcpp though so I guess this goes into the nuts and bolts. But 100% if you’re using ollama already then maybe then no need to switch it up!

@mohammadanash8122 10 күн бұрын

how you are getting response in mardown format

@Maxisnice 7 ай бұрын

Heyy I am a really big fan of your work.I am currently working on your action recognition for sign project and I have a doubt in that.Can you pleaseee reply to my comment on that video.

@endo9000 8 ай бұрын

may i ask why you havent used ollama?

@NicholasRenotte 8 ай бұрын

Good question, no particular reason. I just started dev for this with lcpp and kept going! Any reason why you prefer ollama? Curious, maybe next vid?

@endo9000 8 ай бұрын

i mean i use ollama daily but i imagine it would have shorten your workflow at the beginning (also llama.cpp is harder to get setup) :)@@NicholasRenotte but still informative tho! watched till the end. 👍

@NicholasRenotte 7 ай бұрын

Hey thanks man, will check it out!!

@ronakttawde 8 ай бұрын

Kindly make pre-announcement for Gen-AI videos.

@NicholasRenotte 7 ай бұрын

Whatcha mean? Like premiere these vids?

@ronakttawde 7 ай бұрын

@@NicholasRenotte yeah.. You are a awsm man and God Gift to serve good purpose with DSC technology skills and talents. Keep up this work...😘🙏🤟😎

@nelohenriq 7 ай бұрын

How can i use this example with google colab notebook?

@rishichowdhury4296 7 ай бұрын

Hey nick nice video!!. But can you suggest some alternative to mixtral as it is taking too much ram and I am unable to load it into memory. Thanks!

@SelimMakni 7 ай бұрын

Hi! 😁 I can’t install open ‘llama-cop-python[server] on my Intel MacOS Monterey It’s said Failed building wheel for llama-cop-python and if I change the version 0.1.48 then it says -model not found as if it didn’t recognize llama-cop-python. I am stuck on it for ages !!!hhh Any suggestions please ? I also installed CMake naaah nothing worked thanks in advance 😁😢(crying from the inside xD)

@thelordofnill4939 8 ай бұрын

looks super cool!!!

@NicholasRenotte 8 ай бұрын

Cheers!!

@premier2254 8 ай бұрын

Thank you mister. It's nice!

@NicholasRenotte 7 ай бұрын

Cheers!!

@cfk-oz 8 ай бұрын

I am guessing Ollama would have made it easier.

@NicholasRenotte 7 ай бұрын

💯 😅 yeah I’m finding that out now

@mariamshittu2832 7 ай бұрын

Hi Nicholas, Thank you for all these beautiful videos. I have really learnt alot from you since I stumbled upon your videos. I will like you to do a video on image headshot generator. I will be so happy if you can do that. If you have a video on that already please share the link. Thank you so much.

@DragonyEstorial 7 ай бұрын

First: Great video i love it Second: how i can bring both llama servers together to one ? to use llava and the other models without starting 2 servers?^^

@LukasHirt-qn2sf 7 ай бұрын

this is amazing!!! Do you think we could put this into a code to make it a connected trading bot

@farouktouil5036 8 ай бұрын

Can u tell me how much space the project needs on hardisk ?

@NicholasRenotte 7 ай бұрын

The majority of the space is going to come from the models. I think the largest model I had was ~15GB which was mixtral.

@farouktouil5036 7 ай бұрын

90GB SSD left 24RAM CORE I5 CPU i will give a try, thank much u excellent with tuto Nik

@thewatersavior 7 ай бұрын

DEVIKA next episode please

@10x_y24 7 ай бұрын

Man, you're rocking, all the best insh'Allah

@IhebAkrimi-jh2ib 7 ай бұрын

Thank you for this great video, I'm wondering about using this in production and fine-tune it to a specific context by using the openai prompt, is that possible without using the real api?

@faizaanshaikh3887 22 күн бұрын

Sorry, new to all of this. Aren’t you using the mistral AI model in this video and not openAI?

@Afzal3000 8 ай бұрын

Will this project work if i make this app and deploy on some free deploy service???

@NicholasRenotte 8 ай бұрын

Yep, would just need to host the llama server somewhere as well!!

@adeelhasan7536 8 ай бұрын

@@NicholasRenotte Will the llama cpp server be able to handle multiple requests at the same time i.e. batched inference

@kevinalexis9886 8 ай бұрын

Awesome, great video!

@NicholasRenotte 8 ай бұрын

Cheers Kevin!

@hemanthrayavarapu1663 8 ай бұрын

can we do this with our local data ?

@NicholasRenotte 7 ай бұрын

Sure can!

@deanchanter217 7 ай бұрын

Again super awesome content! But have to ask what styler are using that greys the code when there is a error

@Jay-lo6kz 8 ай бұрын

Hey nick can you please look into behaviour cloning/Imitation learning and apply it on games, please man iam working on it and it is damn good but iam getting stuck,I can use your support

@NicholasRenotte 8 ай бұрын

Alrighty, will take a look!

@AnonymousAccount514 8 ай бұрын

do you need atleast 32 gb of ram

@NicholasRenotte 8 ай бұрын

It'll vary per model but Mistral was roughly 4gb. Mixtral took up a fair bit more!

@pakshaljain1969 8 ай бұрын

damn this is good stuff

@NicholasRenotte 7 ай бұрын

Cheers man!!!

@properjob2311 8 ай бұрын

Wow amazing. What is minimum/recommended spec machine to run this on?

@NicholasRenotte 7 ай бұрын

Check this out github.com/ggerganov/llama.cpp/issues/13

@philtoa334 8 ай бұрын

Nico 🤩

@NicholasRenotte 7 ай бұрын

Ayyyy Phil!! How you doing?!?

@giftahmed 8 ай бұрын

Nice❤

@shantanugote 7 ай бұрын

Wait until he see ollama😂

@iaminvisible.2433 8 ай бұрын

Letsssss gooooooooooo

@NicholasRenotte 8 ай бұрын

Ayyyyyyy!

@AzeemKhan-yt4tm 6 ай бұрын

Yo AI generated David Goggins "Be hard mf" lmao yoo where have u been

@lokeshart3340 7 ай бұрын

U pls recreate gemini demo in live llava lr any vlms? Webcam live streaming pls ❤❤

@dj49_aakashn92 8 ай бұрын

👍🏻

@arnaudlelong2342 8 ай бұрын

lol bro

@NicholasRenotte 8 ай бұрын

Yeah went a little hardcore with this one 😂

@arnaudlelong2342 8 ай бұрын

@@NicholasRenotte lol dig you my man

@datawaly 7 ай бұрын

Hello Nicolas , i hope you are doing well, can i get the link of you discord to join your community , i like so much your cours

@10x_y24 7 ай бұрын

Could you make a collaboration with @PythonSimplified. All the best insh'Allah