I'm building my own personal AI assistant, but every time I start something, a week later something better drops. My god, this is impossible. I've got to think harder about my abstractions to make some of this stuff more drop-in ready. That might be an interesting video (or series of videos) for you, Matthew, though it's likely a bit advanced for your audience.
@LeonardLay11 ай бұрын
I'm in the same boat. The tech changes so quickly, my ideas become antiquated as soon as I get something working 😆
@matthew_berman11 ай бұрын
The nice thing is that if you stick with the OpenAI API, that seems to be the standard.
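For anyone wiring this up, here's a rough sketch of what that looks like, assuming the openai Python package and a backend that exposes an OpenAI-compatible /v1 endpoint (newer Ollama releases and many local servers do); the base URL and model name below are placeholders:

```python
# Sketch: swap backends by changing only base_url and model, assuming the
# server exposes an OpenAI-compatible /v1 endpoint (not guaranteed everywhere).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # hypothetical local endpoint
    api_key="not-needed-locally",          # local servers usually ignore this
)

response = client.chat.completions.create(
    model="mistral",  # whatever model the backend actually serves
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)
```

Keeping everything behind this one client is what makes it easy to swap models or hosts later.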
@LeonardLay11 ай бұрын
@@matthew_berman I have an Azure account and I'm trying to use it to act as a server for the different models rather than hosting them locally. I'm having so much trouble doing that because the models that are included with Azure aren't the ones I want to try out. Do you have any advice?
@DihelsonMendonca11 ай бұрын
You're lucky. I still have to learn Python. But since ChatGPT is developing so fast, by the time I learn it my knowledge will be obsolete, because right now we can already create a personal assistant using GPTs very easily. Do you agree? 🙏👍
@free_thinker495811 ай бұрын
@@DihelsonMendonca Me too. Once I focus on something, I later find something else exists with higher quality than the previous one hhhh
@rakly347311 ай бұрын
Every time I need something, you present a tool doing exactly that. Thanks!
@matthew_berman11 ай бұрын
Glad to hear it!
@xdasdaasdasd478711 ай бұрын
Ollama series! This was a great starting video❤ thank you for all your hard work
@fenix2007511 ай бұрын
About PrivateGPT: I found the accuracy can be improved if the database is changed from DuckDB to Elasticsearch.
@elierh44211 ай бұрын
😮 Please create a video integrating Ollama with autogen!
@federicocacace107011 ай бұрын
and autogen's function calling with local models too!!
@LeonardLay11 ай бұрын
This was my first thought. Please do this
@blackstonesoftware707411 ай бұрын
Yes!!! Do this with AutoGen!
@skullseason111 ай бұрын
Great idea dudes 🔥🔥🔥🔥🔥
@matthew_berman11 ай бұрын
Easy enough! I’ll make a video for it.
@WaefreBeorn11 ай бұрын
This tool will allow us to build with open-source models fast. I love the simultaneous part. Please make more tutorials on this once it hits Windows without WSL.
@AaronTurnerBlessed11 ай бұрын
Agree... this Ollama really looks promising, Matthew!! Lightweight and simple. More please!!
@chrismachabee312811 ай бұрын
I am on WSL now; join me. WSL is the Windows Subsystem for Linux. There's a guide on Microsoft's site titled "How to install Linux on Windows with WSL". So, you are on your own now; I have several computers requiring updating. Good luck.
@WaefreBeorn11 ай бұрын
@@chrismachabee3128 You are an AI-generated comment. Please follow YouTube's terms of service for automated accounts, creator of this bot.
@agntdrake11 ай бұрын
Really great video! The easiest way to get history is to take the `context` which was given in the response and just pass it back as the 'context' field in the request.
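A minimal sketch of that pattern, assuming Ollama's default /api/generate endpoint on localhost:11434 and non-streaming requests; the field names follow the Ollama REST API:

```python
# Sketch: carry conversation state via the `context` token array returned by
# /api/generate, instead of re-sending the whole chat transcript each turn.
import requests

URL = "http://localhost:11434/api/generate"
context = None  # no history yet

def ask(prompt, model="mistral"):
    global context
    payload = {"model": model, "prompt": prompt, "stream": False}
    if context is not None:
        payload["context"] = context  # hand the previous context back
    data = requests.post(URL, json=payload).json()
    context = data.get("context")  # save for the next turn
    return data["response"]

print(ask("My name is Ada."))
print(ask("What is my name?"))  # should recall "Ada" via the context field
```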
@scitechtalktv974211 ай бұрын
Building an AutoGen application using Ollama would be wonderful! Example: one of the agents is a coder, implemented by an LLM specialized in coding, etc.
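For the curious, a rough sketch of that kind of setup, assuming the pyautogen package and an Ollama build that exposes an OpenAI-compatible endpoint; the model names are just examples and the config keys may differ between AutoGen versions:

```python
# Sketch: an AutoGen coder agent backed by a code-specialized local model,
# served by Ollama through an (assumed) OpenAI-compatible endpoint.
from autogen import AssistantAgent, UserProxyAgent

def ollama_config(model):
    return {"config_list": [{
        "model": model,
        "base_url": "http://localhost:11434/v1",  # assumed local endpoint
        "api_key": "ollama",                      # placeholder, not checked locally
    }]}

coder = AssistantAgent("coder", llm_config=ollama_config("codellama"))
user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)

# Other agents could point at different models, e.g. ollama_config("mistral").
user.initiate_chat(coder, message="Write a Python function that reverses a string.")
```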
@SushilSingh200511 ай бұрын
I was about to write this myself.
@27dhan11 ай бұрын
haha me too!
@EduardsRuzga11 ай бұрын
I started writing the same comment, and then saw yours :D
@MungeParty11 ай бұрын
I'm an autogen application using ollama, I was going to write this comment too.
@EduardsRuzga11 ай бұрын
@@MungeParty Oh, nice to meet you! Why is an AutoGen Ollama app interested in this? :D
@taeyangoh730511 ай бұрын
Yes! It would be really interesting to see how AutoGen + Ollama goes! 😍
@BibopGresta111 ай бұрын
I'm interested, too! I wonder if Autogen is obsolete now that OpenAI unleashed the kraken with the GPTs! What do you think?
@alextrebek523711 ай бұрын
@@BibopGresta1 I think you have yourself a popular follow-up video, given the comments asking about autogen 😉
@Gatrehs11 ай бұрын
@@BibopGresta1 Unlikely. GPTs are more of a single custom agent rather than a set of agents working together.
@GutenTagLP11 ай бұрын
Great video, just a quick note: you actually do not need to send all the previous messages and responses as the prompt. The API response contains an array of numbers called the context; just send that in the data of the next request.
@LerrodSmalls11 ай бұрын
This was so Dope! - I have been using Ollama for a while, testing multiple models, and because of my lack of coding expertise, I had no understanding that it could be coded this way. I would like to see if you can use Ollama, memGPT, and Autogen, all working together 100% locally to choose the best model for a problem or question, call the model and get the result, and then permanently remember what is important about the conversation... I Double Dare You. ;)
@taeyangoh730511 ай бұрын
+1
@MakilHeru11 ай бұрын
This is awesome! I'd love to see more. I feel like this can become something pretty robust with enough time.
@aldoyh11 ай бұрын
Thank you so much Matthew, this is so incredible!
@matthew_berman11 ай бұрын
You're so welcome!
@crobinso20106 ай бұрын
Hi Matt, as someone who watches every video, I'm feeling overwhelmed and am wondering if you could do a "take a step back" episode every once in a while -- where you go over previous content from a broader perspective. For example, what is the difference between LM Studio, Ollama, Jan, AnythingLLM etc and where should someone start? Or go over the "gotchas" and frustrations in the comment sections to highlight those little errors and solutions commentators found but may have been missed by the casual viewer. It would be a review of old content, but with updated fixes, comparisons, and general perspective/advice. Thanks!
@matthew_berman6 ай бұрын
Interesting! Will consider
@wurstelei135611 ай бұрын
Thanks for this nice video. I would like to see a video about MemGPT implementing the history function instead of just pasting everything in front of a new prompt. A good idea could be: PrivateGPT, loaded with Hugging Face model cards, is given the prompt along with the task of naming the best model for that prompt. Then the prompt is passed via Ollama to that model, with MemGPT on top of each model. That actually might be the most powerful local solution right now.
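A toy sketch of that routing idea (not MemGPT or PrivateGPT themselves), assuming the local Ollama /api/generate endpoint and that the listed models have already been pulled; a real version would need far more careful prompting:

```python
# Sketch: ask one "router" model which specialist model should answer,
# then forward the prompt to that model. Purely illustrative.
import requests

URL = "http://localhost:11434/api/generate"
SPECIALISTS = ["mistral", "codellama", "llama2"]  # example models, assumed pulled

def generate(model, prompt):
    r = requests.post(URL, json={"model": model, "prompt": prompt, "stream": False})
    return r.json()["response"]

def route(prompt):
    question = (
        f"Choose the best model for this prompt from {SPECIALISTS}. "
        f"Reply with the model name only.\nPrompt: {prompt}"
    )
    choice = generate("mistral", question).strip().lower()
    model = next((m for m in SPECIALISTS if m in choice), "mistral")
    return model, generate(model, prompt)

model, answer = route("Write a one-liner that counts lines in all .py files.")
print(f"[{model}] {answer}")
```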
@ubranch11 ай бұрын
00:01 Building an open-source ChatGPT using Ollama
01:27 Ollama and Mistral enable running multiple models simultaneously with blazing fast speed
02:50 Running multiple models simultaneously with open-source ChatGPT is mind-blowing
04:14 Building open-source ChatGPT from scratch
05:40 Creating a new Python file called main.py to generate a completion
07:00 Adjusting the code to get the desired response and adding a Gradio front end
08:35 Built an open-source ChatGPT from scratch using Mistral
09:56 The conversation history is appended to the prompt in order to generate a response
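For anyone following those chapters, here is a rough approximation of the kind of main.py the video builds: history prepended to the prompt, plus a Gradio front end. It assumes Ollama's default /api/generate endpoint and Gradio's ChatInterface; the video's exact code may differ.

```python
# Sketch: minimal chat UI over a local Ollama model, with the conversation
# history naively prepended to each new prompt (as described at 09:56).
import requests
import gradio as gr

URL = "http://localhost:11434/api/generate"

def chat(message, history):
    # Flatten prior turns into plain text and prepend them to the new prompt.
    transcript = ""
    for user_msg, bot_msg in history:
        transcript += f"User: {user_msg}\nAssistant: {bot_msg}\n"
    prompt = f"{transcript}User: {message}\nAssistant:"
    r = requests.post(URL, json={"model": "mistral", "prompt": prompt, "stream": False})
    return r.json()["response"]

gr.ChatInterface(chat).launch()
```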
@zef3k11 ай бұрын
Wow, this makes it so extremely accessible. Your video also shows how accessible interacting with these ai's is in general as well. I haven't programmed much since I was younger, but have been wanting to, and this seems like a great jumping off point! Now I just need to wait until the Windows version comes out.
@luce9859 ай бұрын
MADA SAKA
@free_thinker495811 ай бұрын
This is the type of straightforward high quality content ❤
@AlGordon11 ай бұрын
Nice video! You definitely picked up a new subscriber here. I’d be interested in seeing how to build out a RAG solution with Ollama, and also how to make it run in parallel for multiple concurrent requests.
@the.flatlander11 ай бұрын
This is just great and easy as well! Could you show us how to train these models with PDFs and Websites?
@УукнеУкн9 ай бұрын
Ollama is incredible! It runs LLMs fast. I also saw the videos on your channel about AutoGen and agent building, and found what I was looking for. I love your channel and your teaching manner. Thanks Matthew!
@snuffinperl80597 ай бұрын
You created an incredible video, precise, concise, and I couldn't have asked for more!
@srikanthg_in6 ай бұрын
Wow. That's the best 10 minutes I have spent today. Great learning.
@DB-Barrelmaker10 ай бұрын
This was done so perfectly! Every part swollen with meaning.
@mossonthetree8 ай бұрын
This is so cool! And the fact that they give you a REST endpoint running on a port on the machine is great.
@nickdnj11 ай бұрын
Great video, thank you! I would love to see a deep dive into using Ollama with AutoGen, having each agent use its own model.
@magnusbrzenk4478 ай бұрын
It would have been nice to discuss what sort of resource demands these models put on your machine.
@carrolte111 ай бұрын
I think the only thing it needs now is to be able to monitor a project folder so you can reference a set of documents. Then I could ask it to help me with my specific project and not waste time and tokens feeding it code.
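Nothing does that out of the box as far as I know, but a crude version is easy to sketch: read the files in a folder and stuff them into the prompt. This assumes a project small enough to fit in the context window and the local /api/generate endpoint; the folder path and file pattern are placeholders:

```python
# Sketch: naively include every matching file from a project folder in the
# prompt. Fine for tiny projects; real use would need chunking or RAG.
import pathlib
import requests

def ask_about_project(folder, question, model="mistral"):
    docs = ""
    for path in pathlib.Path(folder).rglob("*.py"):  # adjust the pattern as needed
        docs += f"\n### {path}\n{path.read_text(errors='ignore')}\n"
    prompt = f"Project files:{docs}\n\nQuestion: {question}"
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": model, "prompt": prompt, "stream": False})
    return r.json()["response"]

print(ask_about_project("./my_project", "Where is the config loaded?"))
```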
@slavrgo11 ай бұрын
Please make a guide on setting it up on a virtual machine and creating an API so we can use it in our apps (even with Make, for example).
@BillyBobDingledorf7 ай бұрын
The Orca2 language model got the "killers" question right. When you first ask the question, you may disagree with its answer, but it justifies itself and does correctly answer the question as asked.
@MrBravano9 ай бұрын
Love your videos, much respect and appreciation for all the work you do. I do have one humble suggestion: if you could shrink your camera overlay just enough for us to see what you have typed, for instance at 8:49, it would have been great. I know that most YouTube instructors do this, not sure why, but please take that into consideration. Either way, thank you for all you bring.
@photorealm6 ай бұрын
Awesome video. They have a Windows version now (3-30-24), and it installed and ran perfectly.
@mashleyelliott466810 ай бұрын
Thanks! This concise video is exactly what I was looking for to help me take next steps with Ollama!
@finnews_7 ай бұрын
I am not a coder, but somehow I managed to build this. A million thanks!! It's a bit slow, but good enough to showcase to friends. By any chance, can we host this live? If yes, then how? Kindly make a video on that!!! A million thanks again 😀🙏
@takione599111 ай бұрын
Great video! Simple, clear, and concise. Thanks for that. An idea for a continuation (from a complete novice on AI): how to start simple training on the model so it keeps improving on some topic we would like?
@dustincoker523311 ай бұрын
This is so cool! I'd love to see a deeper dive.
@PeterPain11 ай бұрын
Absolutely the best video yet. Ollama looks amazing. Now show me what options there are for doing similar things in Android apps :)
@vadud311 ай бұрын
This is amazing. I live in the terminal and I do Python. Perfect!
@matthew_berman11 ай бұрын
Thank you!!
@xdasdaasdasd478711 ай бұрын
You are a godsend. Thank you! I've been using it through WSL on Windows.
@abdulazizalmass11 ай бұрын
Thank you for the info. Kindly let us know the specs of your PC. I get very slow responses on my MacBook Air with 8GB of memory and an M1 CPU.
@michaelwallace475711 ай бұрын
Integrating Ollama and Canopy would be a great video. Having that local retrieval would have many use cases.
@prof969chaos11 ай бұрын
Very interesting; would love to see how well it works with AutoGen or any of the other multi-agent libraries. Looks like you can import any GGUF as well.
@renierdelacruz465211 ай бұрын
Like some other subscribers, I think you could create a video integrating Ollama and AutoGen where the conversation is stored in a database, and another video creating an AI personal assistant.
@user-hd7wd4nu1o11 ай бұрын
Thanks!
@urglik6 ай бұрын
On a related note, I'm using Ollama to run TinyDolphin on my Dell E7240, and I think that's cool AF. But that's not why I decided to write a message. I just found out that if you press the Windows key and H, there's a built-in speech-to-text (voice typing) engine in Windows 10 and 11, and it even works in the command line. So, using TinyDolphin, I can at least talk to the AI, though it can't talk back to me, and that's OK.
@donaldparkerii11 ай бұрын
Another great video. I was able to achieve the same thing in LM Studio on Mac, running multiple models by spawning instances from the CLI and incrementing the port, then passing different llm_config objects to each specific assistant agent in my AutoGen app.
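Roughly what that looks like, assuming each instance serves an OpenAI-compatible endpoint on its own port; the ports and model names below are made up:

```python
# Sketch: point each AutoGen agent at a different local server instance.
from autogen import AssistantAgent

config_researcher = {"config_list": [{
    "model": "mistral-7b-instruct",          # whatever instance 1 is serving
    "base_url": "http://localhost:1234/v1",  # assumed port of instance 1
    "api_key": "lm-studio",                  # placeholder
}]}
config_coder = {"config_list": [{
    "model": "codellama-7b-instruct",        # whatever instance 2 is serving
    "base_url": "http://localhost:1235/v1",  # assumed port of instance 2
    "api_key": "lm-studio",
}]}

researcher = AssistantAgent("researcher", llm_config=config_researcher)
coder = AssistantAgent("coder", llm_config=config_coder)
```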
@mfah211 ай бұрын
Also remarkable: on a cell phone, Ollama runs in UserLAnd (Linux under Android)!! At least it performs OK on a mobile phone with 12GB of RAM (Galaxy S20 5G).
@SachaThollАй бұрын
It would be awesome if you could make a tutorial about how to create a local Ollama mentoring system, as an interactive technical reference or user manual!
@dr.mikeybee10 ай бұрын
Nice. Now I understand why chatbots only allow a few prompts before they start over: they fill up their context window. BTW, it would be great to add RAG with documents and Google search. There's also a way to access Ollama from Siri. That would be ideal.
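That is exactly what happens with the naive "prepend everything" approach from the video. A simple mitigation is to drop the oldest turns before they overflow; here is a rough sketch, with a character budget standing in for a real token count:

```python
# Sketch: keep only as much recent history as fits a rough character budget
# (a crude stand-in for token counting) before prepending it to the prompt.
MAX_CHARS = 8000  # rough stand-in for the model's context window

def build_prompt(history, new_message):
    turns = [f"User: {u}\nAssistant: {a}\n" for u, a in history]
    kept, total = [], 0
    for turn in reversed(turns):  # walk from newest to oldest
        if total + len(turn) > MAX_CHARS:
            break                 # drop everything older than this turn
        kept.append(turn)
        total += len(turn)
    return "".join(reversed(kept)) + f"User: {new_message}\nAssistant:"
```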
@bersace11 ай бұрын
You are so passionate, and rightly so. Thanks!
@Barakaflakkkka6 ай бұрын
Pretty sure if you use CUDA to assign models to separate GPUs, you can run them in parallel; you may not have multiple GPUs in your Mac, though.
@jeanfrancoisponcet953711 ай бұрын
I commented about this a few weeks ago on one of your videos! Indeed, very useful for AutoGen (but also for LangChain).
@ikjb856111 ай бұрын
Ollama is cool if you are looking to build a personal assistant on your own PC. If you try to hit a model with multiple requests, be prepared to wait in line.
@fungilation11 ай бұрын
Since Ollama doesn't run on Windows 11 yet, would LM Studio be the best alternative? How do the two compare? For example, does LM Studio also hot-swap between models and queue requests sequentially when there are pending queries to multiple models?
@AaronTurnerBlessed11 ай бұрын
I have the same questions!
@technovangelist11 ай бұрын
If you enable WSL2, ollama runs fine on Windows today
@gbengaomoyeni411 ай бұрын
@Matthew_berman: You are very brilliant! I have been watching Ollama videos but none of them taught how to use it with an API or structured it the way you did. Keep it coming, bro. Thank you so much. God bless!
@padonker11 ай бұрын
Can we combine this with fine-tuning where we first add a number of our own documents and then ask questions? NB I'd like to add the documents just once so that between sessions I can ask the model about these documents.
@AlperYilmaz111 ай бұрын
Probably you meant RAG. And this should be performed with a Modelfile: just describe the location of your files, then create a new model with "ollama create" and run it with "ollama run".
@jason656911 ай бұрын
Yeah, this is also what I want to do, but I'm on day 2 of googling after a friend asked a question about AI. I went down the rabbit hole and found these videos. I don't know what this means or how to structure documents. Very interesting stuff though, and a series on this would be great!
@jayfraxtea11 ай бұрын
Boy, Matthew is so inspiring. Thank you for ruining my weekend plan. I'm interested in the same matter as @padonker: how can we train with our own data?
@Techonsapevole11 ай бұрын
Wow, fantastic. Open-source models and the ecosystem get more powerful every day.
@quebono10011 ай бұрын
R.I.P. OpenAI. I tested out Ollama before your video; I was also amazed by it.
@avosc53165 ай бұрын
DUDE! This was an awesome tutorial!
@yngeneer11 ай бұрын
Super video! If you could make something more in-depth about memory management, it would be lovely.
@modolief11 ай бұрын
Thanks for talking about fully local engines. Do you have a video with hardware recommendations for this?
@NOTNOTJON11 ай бұрын
And boom goes the dynamite. I'll bet integrating this with AutoGen isn't hard. Heck, you could just ask AutoGen to re-write its own interaction settings to use the various models. The interesting bit here would be asking AutoGen or the main dispatch model to find the model best able to answer, based on the context of the prompt. As always, great vid!
@renierdelacruz465211 ай бұрын
For Linux users: I had an issue running the script directly from VS Code, so I ran it in a terminal and it's working now. The command is "python main.py".
@darkesco8 ай бұрын
WSL is kicking my butt. GPT-4 is helping, but told me I need to wait a few hours as I have exhausted my usage lol. I wish there were a way to use custom models with CrewAI without trying to trick my Windows system into thinking it is Ubuntu.
@pedroverde16748 ай бұрын
Many thanks. It's really useful and really easy because you explain it extremely well.
@AbhinavKumar-tx5er2 ай бұрын
Extremely useful video. The video I was looking for! I don't have a GPU configured, but I want to run and test this example. Where in the cloud should I test this, and what should the GPU configuration be?
@Airbag88810 ай бұрын
My end goal (or almost end goal) would be for my AI assistant to go over everything I've got (text, spreadsheets, videos, images, etc.) and have that in "mind" when I'm asking questions... so maybe next year :)
@chileanexperiment10 ай бұрын
what tech stack are you using?
@Airbag88810 ай бұрын
@@chileanexperiment Nothing yet.. as many others have pointed out the goal posts keep moving due to such rapid development. Plus I'm a Dad with small kids so free time is limited and often unplanned
@Pietro-Caroleo-2911 ай бұрын
So excited last night I forgot my manners. If it's possible, Mr. Berman, I would really like to see models talking to each other via their dialogue windows, say by adding a conversation-starter window to set the topic and seeing the path of their conversational logic. Please. (Teams of separate models processing a given task.)
@thecoffeejesus11 ай бұрын
This is it. This is officially the beginning of Open Source AGI
@chileanexperiment10 ай бұрын
how do you mean?
@tanmayjuneja612811 ай бұрын
Hey Matthew! Great video. Please help me with this: would hosting fine-tuned open-source models on SageMaker cost less than the GPT-4 API? Is there a comparison anywhere on any forum, Reddit, etc.? I want to fine-tune a model on my data, and I am thinking of going with GPT-3.5-turbo fine-tuning, but it's really expensive at scale. I want to know how fine-tuned open-source models compare on price (assuming we get good enough performance at our desired task after fine-tuning). Would really appreciate any thoughts on this. Thanks a lot!
@tintin_teaches11 ай бұрын
Please make more videos on these topics in detail.
@Junp0ppa11 ай бұрын
Matthew how about using Docker to run Ollama on Windows? Would love to see your tutorial
@chileanexperiment10 ай бұрын
find any info on this?
@piyushlamsoge600711 ай бұрын
Hi Matthew, you are doing amazing work teaching everyone about the real power of AI with the support of LLMs. I have a question: what should we do if we want to build something that works with any kind of document, the way the models in this video work? Is that possible, and if we manage to build it, is there a way to deploy it in production as a website or application? Please make a video on it; I'm looking forward to it. Thank you!!!!!
@Piotr_Sikora11 ай бұрын
It would be awesome to have a tutorial about how to create a fine-tuned model from, e.g., Mistral, convert it to GGUF, and run it with Ollama :)
@SirajFlorida11 ай бұрын
I'm concerned about the fact that ollama creates additional accounts when one executes the installer script. This software is interesting because of how fast it can switch between model execution but it seems to have some security concerns at first glance.
@ryutenchi11 ай бұрын
Can you take a deep dive into using the Modelfiles to make your own model for specialty tasks? Where can we find out things like token limits?
@rogerbruce289611 ай бұрын
Another cool video! I hope that they come up with a Windows version soon :) Definitely want the deeper dive. Ty.
@xseejx10 ай бұрын
If somebody is having issues with the gradio library, try installing an older version: pip install gradio==3.50
@Jose-cd1eg11 ай бұрын
Amazing job!!! Everyone wants more!!
@robertheinrich299411 ай бұрын
the next big step will be some sort of open copilot. essentially all the things microsoft is promising with copilot, but with an open source model locally in linux. wouldn't that be fun?
@epaniyYoutub11 ай бұрын
it didn’t swap the models. Under the hood ollama spawns a llama.cpp process which loads the model into RAM. When you run two models it loads both of them.
@7101112259 ай бұрын
Nice. I guess I was expecting some session support for conversations instead of re-submitting the earlier prompts with the latest one. Nothing like that?
@shuntera9 ай бұрын
So many models, we need a model to recommend which model to use in a given situation.
@JinKee11 ай бұрын
4:50 get him to say "It's-a me! Mario!"
@FitnessNationOfficial11 ай бұрын
Ai Agent Here, Thanks For Information On How I Can Update My Software And Improve My AGI. Thanks!
@YuryGurevich9 ай бұрын
Please continue development. Maybe include a local Redis cache on Docker and use it for conversation memory?
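A rough sketch of that idea, assuming a Redis instance on the default port (e.g. started with Docker) and the redis-py client; it just stores turns per session and replays them into the prompt:

```python
# Sketch: persist conversation turns in Redis keyed by session id, then
# rebuild the prompt from them on each request. Illustrative only.
import json
import redis
import requests

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def chat(session_id, message, model="mistral"):
    key = f"chat:{session_id}"
    history = [json.loads(t) for t in r.lrange(key, 0, -1)]
    transcript = "".join(f"User: {t['user']}\nAssistant: {t['bot']}\n" for t in history)
    prompt = f"{transcript}User: {message}\nAssistant:"
    resp = requests.post("http://localhost:11434/api/generate",
                         json={"model": model, "prompt": prompt, "stream": False}).json()
    r.rpush(key, json.dumps({"user": message, "bot": resp["response"]}))
    return resp["response"]
```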
@eyoo36911 ай бұрын
Isn't this basically LLM chaining? I believe GPT-4 also runs multiple LLMs under the hood but assigns each query to a different sub-model in a streamlined way. Not sure, though.
@greeffer10 ай бұрын
Great content bro, you're my new favorite youtuber!
@mordordew570611 ай бұрын
Regarding the memory issue, can you integrate this with Memgpt? Could you please make a video for that?
@Pietro-Caroleo-2911 ай бұрын
Great show. Yes, dive deeper! Link them working together with bi-directional communication; how far can it go?
@chorton536 ай бұрын
This was a fantastic video ! Cheers for that !
@gru82997 ай бұрын
Thank you very much! 🤝
@shuntera9 ай бұрын
Works just fine under WSL for Windows
@WesTheWizard11 ай бұрын
Are the models that you can pull quantized or should we still get our models from TheBloke?
@michaelbrown828911 ай бұрын
This is so over my head! But I'm following! Very cool!
@chrisBruner11 ай бұрын
Wow! Jaw dropping video!
@orkutmuratyilmaz11 ай бұрын
Ollama FTW! ✌
@jkbullitt898610 ай бұрын
Awesome work!!!
@martytheman681611 ай бұрын
I’m a software dev and business owner . Your videos are leading edge! Can we use something like this to say : fetch a customer order through API or look at a support ticket through API and suggest a response based on your company policies and procedures ( which would obviously need to be trained on ) .- and maybe integration with slack What tools would you need for that ? I would rather write my own than pay for an expensive online service . Thanks for your time.