Incredible content, and he doesn't waffle either!!! Just to the point: good pace, great voice, great cadence, and perfect audio levels. This channel is gonna be big.
@matthew_berman a year ago
Thank you :)
@marilynlucas5128 a year ago
@@matthew_berman With OpenLLM, you don't get an OpenAI-like API token, right?
@marilynlucas5128 a year ago
@@matthew_berman How can a project like Aider utilize OpenLLM?
@shotelco a year ago
Yet another piece in the democratization of AI! Very valuable.
@matthew_berman a year ago
Agreed!
@marilynlucas5128 a year ago
Yes indeed!
@MrGaborKukucska a year ago
The future is now 🙌🏼
@applyingpressureeveryday a year ago
Democracy means those in power rule. We live in a democracy that’s clearly 1000% centralized. I got the message tho. 👍🏿
@josephsagotti8786 a year ago
@@applyingpressureeveryday Democratization of technology means the decentralization of technology.
@maxamad13 a year ago
First time here, man. To the point and straightforward. Thank youuuuu!!!!!!!!!!!!
@matthew_berman a year ago
You got it!
@pancakeflux a year ago
This is exactly what I’ve recently been looking for! Thanks for showing it off :)
@paulbishop7399 a year ago
Stop it, I can't keep up anymore :) Every day I am pivoting around your content, gimme a break already! What an exciting time to be alive!
@matthew_berman a year ago
Haha nice :) wait until you see the next video!
@VastIllumination a year ago
You are becoming my favorite AI channel! This is literally exactly what I've needed. I've been looking for an open LLM alternative to the OpenAI API for querying PDFs with LangChain. I haven't been able to test the largest LLMs using Langflow because it always times out from Huggingface.
@matthew_berman a year ago
Glad I could help 🎉
@sjimosui8279 a year ago
Matthew, are you pushing it to GitHub? I'm also working on the same thing and looking for ideas, but I'm a beginner looking for help.
@ajith_e a year ago
Hadn't heard of OpenLLM before, but now I can't hold back my excitement to test it out. A well-paced, well-executed tutorial that touches on the important aspects of deployment. Please keep covering this space closely, because we'll be following you!! Thank you for this great tutorial.
@RyckmanApps a year ago
Awesome quick video!
@matthew_berman a year ago
Thanks :)
@williammixson2541 a year ago
My last computer was a gaming rig. My newest build this week will be specifically for ML and I cannot wait!!! Easy sub.
@thelalomorales a year ago
dude, totally just did it! YOU RULE
@Tenly2009 a year ago
It would be a lot easier for us to follow along and be successful if you did these demos starting with a brand new machine with just Python and conda pre-installed. That way our experience would be more likely to match the one in your video *exactly* and we wouldn’t struggle at the points where you say “the first time I tried this, I got an error” or “I already have this installed”. Just a suggestion.
@wendellvonemet7443 9 months ago
When you cut out all the dead space, your sentences run together without the natural pause that would allow beginners to digest each new concept before being bombarded with the next five new concepts that are rattled off at the speed of light. Tutorials work best when the newbies have time to let new concepts sink in. I'll be stuck trying to wrap my head around what you just said, and I continually have to pause and rewind to catch what you said while I was still chewing on the first bite. You also run your words together, within the sentences, so I have to continually rewind to make sure that I heard you correctly. Many of us are complete newbs to all of this. The info you provide is great. I watch a ton of your videos. I just wish you'd go a hair slower and dumb it down for those of us who are brand new and have to look up the definition of each piece of new tech jargon used (had to ask AI what the hell a bento was, and it thought I was interested in Japanese cuisine).
@8eck a year ago
Will be waiting for their fine-tuning feature. Should be interesting.
@Garfield_Minecraft a month ago
"tell me a joke or I will tell you a joke" damn this AI is crazy
@vinylrebellion a year ago
Looking forward to the Mosaic 33B. Loving the videos.
@matthew_berman a year ago
Literally testing it right now!
@tiredlocke a year ago
This is awesome. I've played with some different open-source models in RunPod (which is great, btw). And I looked into installing the Text Generation WebUI locally... but I don't have a suitable GPU yet. Ultimately, I want a self-hosted (preferably containerized) API that can run various models and be hit from a web browser, a console app, or a game. This looks like exactly what I want. Now I just need to find a GPU to toss into my server...
@Trahloc a year ago
Oobabooga's text-generation-webui is compatible with GGML models, which are CPU-only but can use a GPU for speedup, although the latest versions don't use my GPU for some reason.
@tiredlocke a year ago
@@Trahloc Good to know. I previously tried some stuff that wouldn't run without an Nvidia GPU. I'll have to give this a try and see how it works.
@ehsanrt a year ago
I was looking for something like this for 2 weeks, thank you for your video. It made my learning much easier. Please make a LangChain video too.
@MeinDeutschkurs a year ago
I'm excited! Yeah! I'm interested in custom/not-listed models, also NLLB-200… And what about Mac? There is no xformers available.
@khandakerrahin1003 a year ago
Are these models running locally? If yes, what are the hardware requirements?
@matthew_berman a year ago
Yes. It depends on the model. Smaller models have very modest requirements.
@khandakerrahin1003 a year ago
@@matthew_berman thank you so much Sir.
@matthew_berman a year ago
@@khandakerrahin1003 you got it!
@henkhbit5748 a year ago
Wow, that is really simple. Thanks for showing this API tool for LLMs 👍
@matthew_berman a year ago
@@JohnSmith-jc7yi No way. You can run local models on much smaller machines.
@antonioveloy9107 a year ago
I prefer the Oobabooga web UI, which basically runs an API locally and has a nice button to "import" any Hugging Face model.. But this is interesting too
@nikdog419 a year ago
I'm gonna need a cardboard box server again. Time to start a 24/7 AI stream. 😂
@musikdoktor a year ago
Great YouTuber. Regards from Uruguay!
@sevenkashtan a year ago
Just adding a positive comment for the algorithm! Great video
@matthew_berman a year ago
Haha thank you!
@jmanhype1 a year ago
Please explain whether this is hosted locally as a server, or if we need RunPod or Chainlit.
@matthew_berman a year ago
You can run this locally AND/OR deploy it to the cloud when you're ready for production.
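For anyone wondering what the local setup looks like in practice, here is a minimal sketch, assuming the openllm Python client from the version shown in the video (client class, method names, and the default port may differ in newer releases):

    # Terminal 1: start a model server (downloads the model on first run)
    #   openllm start opt
    # Terminal 2: query the local server from Python
    import openllm

    client = openllm.client.HTTPClient("http://localhost:3000")  # default local address
    print(client.query("Explain what OpenLLM does in one sentence."))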
@jmanhype1 a year ago
@@matthew_berman Please go over the steps to host it for production.
@scitechtalktv9742 a year ago
I also would like to know how to deploy this to the cloud, and what the alternatives are for doing that. Does HuggingFace have a (free) cloud solution?
@grimtagnbag a year ago
Ty for all these videos, getting tons of ideas
@Boboche a year ago
Love this channel
@matthew_berman a year ago
Thank you :)
@brianv2871 a year ago
Thanks for the video. This is getting close to something I'm looking for, but it still requires a permanent system set up with some decent hardware. It would be interesting to see this combined into a single Google Colab that could be run as needed, for those of us looking to use it on an occasional basis.
@doords a year ago
Colab would be very useful. I wish we could keep a Colab running forever.
@joseberlines-l4f a year ago
Really dreaming about the moment this can be used to query my own set of documents, like in your previous videos about GPT4All.
@build.aiagents a year ago
Still not sure how building on the models works in the examples. I see you using the models, but how do we build on top of them? Sorry if I missed it.
@faisalalqasim a year ago
Thanks!
@GamingDaveUK a year ago
What is the advantage of this over textgen WebUI? And does it handle custom models as well as textgen WebUI does (4-bit GPTQ models, etc.)?
@erick2will a year ago
Awesome! Thanks for sharing! 😀
@NguyenHoangHuy a year ago
Are you using WSL? Would you recommend using Windows over Linux? I've had problems trying to install all the Nvidia GPU drivers, CUDA, and PyTorch modules using Ubuntu, to the point where I had to reinstall Ubuntu.
@soubinan a year ago
Very similar to LocalAI. It seems the difference is that LocalAI is compatible with the OpenAI API.
@GlobalOffense 6 months ago
What is the beginning transition? That is epic looking.
@zepto5945 3 months ago
4:04 It started rambling like a madman 😭
@8eck a year ago
It's like personal computers in the era of Steve Jobs, when they still weren't available to everyone. I guess this will soon become even more open with projects like this.
@godned74 a year ago
You are a freakin genius.😎
@Heynmffc a year ago
"I don't know what any of that means, but it doesn't seem to be causing any problems" amen lol
@michaelberg7201 a year ago
Super interesting and exciting project. I didn't quite get, though, whether the models are running locally? I thought this required a lot of GPU power.
@BlackDragonBE a year ago
You can run LLMs locally on the CPU, GPU, or shared between the CPU and GPU. CPU-only is quite slow though.
@faff a year ago
He's running a $2000 video card.
@RodrigoRecio a year ago
Great content. Thank you!
@nannan3347 a year ago
A feature on my wish list is being able to GET and POST the context so it can be edited on the fly.
@RixtronixLAB a year ago
Nice video, well done, thanks :)
@fabianaltendorfer11 a year ago
Awesome, thank you for this vid
@bowenchen4908 a year ago
How fast is this when running locally? Is speed going to be an issue?
@clray123 a year ago
Uhh so a wrapper over a wrapper (HuggingFace/LangChain)??? What does this new API add exactly (except for new bugs)?
@hermysstory8333 a year ago
Many thanks!!!
@patrickwalchshofer4004 a year ago
Hey Matthew, this is really great! So with this I can replace the OpenAI API and run all the apps that are built to use OpenAI?
@MarkDemarest a year ago
AMAZING!!! 💪🔥🙌
@marcoamato8461 a year ago
Maybe a silly question, but what are the minimum hardware requirements?
@user-wr4yl7tx3w a year ago
Is a conda installation more stable than pip? Just wondering which one to use. Mostly, I have used pip previously.
@aliakbari8900 3 months ago
I want to create a custom chatbot that utilizes multiple Gemini and GPT APIs. Does an API remember the history of messages in a chat? This is crucial for maintaining context within the conversation.
@xavierf2229 a year ago
I think you should show what these LLMs are really capable of; the examples you are showing are pretty simple.
@8eck a year ago
Interesting, it would be cool to have a response-streaming feature.
@s0ckpupp3t a year ago
You probably can, through the gRPC interface.
@8eck a year ago
@@s0ckpupp3t Yes, but they depend on another project, which doesn't support it 😕
@8eck a year ago
@@s0ckpupp3t at least not yet
@khorLDW 7 months ago
Just wanted to point out for anyone trying: if you do this on Windows and want to install directly without conda, you'll get an error from the vLLM library saying that it can only be used on Linux.
@revenger211 a year ago
I am facing issues with "openllm start opt": I get a "KeyError: 'OPENLLM_OPT_MODEL_ID'" error. Why is that? I searched online and still can't find a solution.
@gavinray241 10 months ago
Why did you go through the process of creating a conda env when you then install with pip?
@gnosisdg8497 a year ago
Well, if and when they make the training section available, plus LangChain support, then it will be a really cool project to have!!!
@pipoviola a year ago
You are tooooo awesome!!!
@hrishabhg a year ago
This is superb knowledge. As a follow-up, can you create a video that helps users decide on a GPU & CPU configuration for serving?
@justin9494 a year ago
Please help. I have CUDA and torch all working, but when running the model, it says CUDA not found or something. Any ideas?
@SillyProphecies a year ago
Awesome! Great stuff and thank you very much! Do you have an idea how to deploy a QLoRA fine-tuned Falcon model?
@akshatkant1423 10 months ago
Will there be input/output token limits when serving custom LLMs using OpenLLM, like we see with other monetized LLM APIs?
@heliosobsidian a year ago
Wonderful content!! Will this make it easier to work with AutoGen? 🤔
@yacahumax1431 4 months ago
Very nice
@javiergimenezmoya a year ago
Is it possible to link your own fine-tuned LLM stored on your local machine?
@BlayneOliver a year ago
Sorry for my noob question, but could someone explain why we’d need more than ChatGPT 4?
@ALFTHADRADDAD a year ago
Absolutely crazy
@uuuuu4858 8 months ago
Hey, when I try to import openllm in Python, it says the module doesn't exist. Any suggestions?
@LUDOVICOPAPALIA a year ago
I want to run the model on RunPod and create an API to run a service (Python) from my personal computer. Any idea how to do that?
@victordanneygarciaplaza2374 a year ago
Hi Matthew, thanks for this video! I have a question about how to use OpenLLM with documents as a knowledge base.
@forexhunter2040 a year ago
Does using the Falcon model give better accuracy than the OPT one?
@ronaldkodras4527 a year ago
It says I have no GPU available to run the Falcon model. I have NVIDIA drivers downloaded but still no luck. What can I do? How about a GPU from RunPod?
@chrisBruner a year ago
If you've got models downloaded, can they be used?
@clear_lake a year ago
Which server configuration do you recommend if I wanna run Falcon?
@PeacefulislamicLofi 8 months ago
When I install openllm, the installation process starts but never completes. I tried 3 times and got the same result. I don't know what mistake I'm making; can you help me?
@DeepKarmakar-i7v 6 months ago
Can I use the same in a JavaScript application?
@originalsuperheroguruji a year ago
Any idea what server configuration is needed to use these AI models on custom AWS or Linode servers?
@jcfnetwork6768 a year ago
Finally!!
@matthew_berman a year ago
Wooo
@ErnestGWilsonII a year ago
How can we run an LLM at home and have the same API that OpenAI uses?
@cavlab 5 months ago
What is the minimum GPU requirement to use this?
@averaguilar a year ago
Awesome! I just want to know which model is fairly good for the Spanish language. I have tried some and they are just awful.
@mijanurrahaman3778 a year ago
Can we provide a customized knowledge base to the system?
@ThobaniMngoma a year ago
Does this API also work when running LLMs using CPU resources?
@MuhammadHadiHadi-w1r a year ago
Does it support AutoGen or CrewAI?
@ZakkFromSource a year ago
Do you know of any current services where you could host something like this in the cloud for free, to test out creating something like a chatbot that you could call and add extra functionality to via Python code running locally on your machine?
@s0ckpupp3t a year ago
Does it have a streaming API endpoint?
@Kulbaru 7 months ago
I am getting this error and can't find any solution for the dependency problem: "Failed to build ninja ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (ninja)"
@user-wr4yl7tx3w a year ago
Is this a replacement for TextGen WebUI? Do they perform the same function?
@doords a year ago
Yeah, same function, but all of the textgen WebUIs cost money. If you build an app for many people, you will have to pay a lot every time your users send a query.
@Azcraz a year ago
Has anyone been able to get this working recently? I follow the docs to a 'T' and the opt model is unable to start up. I ran openllm models --show-available and it looks like it's not properly downloading the model locally after running 'pip install "openllm[opt]"', as it says 'No local models available'. Do we need to download the models with 'openllm import ...'? I've tried that as well, with 'openllm import opt facebook/opt-1.3b', to no avail. Surely I must be doing something silly!? Any help appreciated!
@Azcraz a year ago
Got it working! It turns out you need to manually import the models; running pip install "openllm[opt]" does not download the model. You must use the import command and specify the model. In my case I also needed to pass the --serialisation legacy flag, because the model did not support safetensors!
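In command form, the sequence that worked here looks roughly like this (model name taken from the comments above; flag spellings may differ between OpenLLM versions):

    pip install "openllm[opt]"
    openllm import opt facebook/opt-1.3b --serialisation legacy
    openllm models --show-available
    openllm start opt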
@thehkmalhotra9714 a year ago
I loved your content, mate ❤️ Thanks for your video. Just a quick question: can we point localhost:3000 to a domain? This localhost URL can be used as an API while it's running on my PC, but what if I want to point it to a domain name that is easily accessible to all? Will be waiting for your answer 😥 Keep up the great work, dude ❤
@ganeshkgp a year ago
Is there any free hosting where I can host and test it? And how do I use a domain instead of localhost?
@ganeshkgp a year ago
Please don't get me wrong, I am a software developer, but I have no idea how to use LLMs.
@Star-rd9eg a year ago
How would I use this in RunPod? :)
@avi7278 a year ago
Now all you need is a $10,000 computer! No, but seriously, the last piece of the puzzle here is a service like RunPod where you can install this and it charges you for the exact inference time of each request. Does anybody know of anything like that?
@elchippe a year ago
I think the 3B and 7B parameter versions of the models can run locally on a CPU, or even a 12 GB RTX 3060.
@clray123 a year ago
No, the last piece of the puzzle is open source models that aren't crap.
@elchippe a year ago
@@clray123 Easy fine-tuning of these models for specific tasks, and algorithmic optimizations to run them more efficiently across a spectrum of hardware from low end to high end, are what is going to make the difference against proprietary models.
@clray123 a year ago
@@elchippe LoRA fine-tuning is like a 200-line Python script (see the sketch below this thread). You clone the script from Git and run it. The difficulty of fine-tuning is not that you lack some silly API, but rather the choice of parameters and, above all, of the input data. And you will not be able to fine-tune any serious models on "low end" hardware, even with (Q)LoRA and whatnot.
@avi7278 a year ago
@@clray123 Yeah, well, I meant this particular puzzle of being able to host your own personal API for an open-source model. Model quality is beside the point.
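To illustrate how small the API side of LoRA really is, here is a minimal setup sketch using the Hugging Face peft library; the base model and hyperparameters are illustrative assumptions, not a recipe, and the hard part (data and parameter choice) is exactly what the comment above describes:

    # pip install transformers peft
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
    config = LoraConfig(
        r=8,                                  # low-rank adapter dimension
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],  # attention projections in OPT
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, config)      # wraps the frozen base model
    model.print_trainable_parameters()        # only a tiny fraction is trainable
    # The real work: curate a dataset, pick hyperparameters, then run a training loop.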
@eyemazed a year ago
Does the API support embedding functionality?
@cheifei a year ago
Embeddings are just custom text that is passed to the LLM to use as a reference. To get the embeddings, you would need to run a model that can specifically convert text to vectors. Then send your docs to that embeddings model via the API, take the vector response, and store it in a vector store. Then, when you make a query, convert your query to a vector via your local model and perform a similarity search on your vector store. That will return some docs, and you pass the text of those docs to the LLM. (There's a rough sketch of this flow after this thread.)
@eyemazed a year ago
@@cheifei Are you implying it's absolutely irrelevant how you create the embeddings? Don't different models use different embedding algorithms? That's why they have different vector dimensionalities, among other things.
@cheifei a year ago
@@eyemazed No, I am not implying that. I agree with you that you have to use the same embedding model for consistency. I think the missing piece is that you pass the text of the query and the text (not vectors) of the embedded docs to the LLM.
@eyemazed a year ago
@@cheifei I see, I thought you needed to use the same embedding API for vectorizing the context that you pass along with the prompt as the LLM uses to vectorize your prompt. So if I understand correctly, you're free to choose any embedding API/vector store you want, because it's separate from the LLM and is only used to retrieve the context that you send along with your prompt to the LLM.
@cheifei a year ago
@@eyemazed That is correct.
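A rough sketch of the retrieval flow described in this thread, assuming sentence-transformers as the local embedding model and a plain numpy array standing in for the vector store (any equivalents would do):

    # pip install sentence-transformers numpy
    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # local embedding model

    # 1. Embed your docs once and keep the vectors.
    docs = [
        "OpenLLM serves open-source models over a local HTTP API.",
        "Falcon and OPT are open-source language models.",
    ]
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)

    # 2. At query time, embed the query with the SAME model, then similarity-search.
    query = "How do I serve a model locally?"
    query_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ query_vec                # cosine similarity (normalized vectors)
    best_doc = docs[int(np.argmax(scores))]

    # 3. Pass the TEXT of the retrieved doc (not its vector) to the LLM with the prompt.
    prompt = f"Context: {best_doc}\n\nQuestion: {query}\nAnswer:"
    print(prompt)  # send this string to whichever LLM endpoint you are running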
@chrisl4211 a year ago
I think LangChain can create an API, no?
@davidc3416 2 months ago
A pip install into a conda environment is not usually the best way to go.
@eddymison3527 a year ago
I think it's great.
@varunrao-q5m 3 months ago
What are the hardware requirements? Can I run it with 4 GB of video RAM?