Hey everyone! Thanks for watching and asking for the tutorial! I've just posted it on my new channel! Enjoy! kzbin.info/www/bejne/r6DdlmR_rcl1mq8
@QuizmasterLaw8 күн бұрын
nice build. i'm still searching for a good easy local rag and fml it appears to be nowhere on the internet. context model only goes so far and no way am i trusting google with my proprietary data.
@JoeyDee865 ай бұрын
Are you going to release any how to’s for this? Preferably with you explaining what each step does rather than just going down a list of steps
@TechnoTim5 ай бұрын
Yes, coming soon on my Techno Tim Tinkers channel! Subscribe there to know when it's available!
@traxeonic36005 ай бұрын
@@TechnoTim I'm surprised it will end up on Tinkers, given these videos would seem to hit your core main channel. Interesting.
@iamkiber5 ай бұрын
If not for reading this comment I would never have known about Tinkers
@espressomatic5 ай бұрын
@@TechnoTim Any chance that you might post your AI rig's hardware composition in this description before finishing up with the more detailed video on the other channel?
@TechnoTim5 ай бұрын
I can understand that but this tutorial might be close to 40 minutes long (or longer) 😅. Videos that long do not perform well and ultimately hurt the channel.
@TheMrDrMs5 ай бұрын
9:51 nah, I work for a company where we've been doing this since before "AI" was mainstream and the e2e models have not only helped accuracy but improved performance, even with our CPU workloads. It's been incredible to be working on this and seeing the sudden rapid development.
@VillSid5 ай бұрын
There are two features that I am especially looking forward to: a) Video text search: I have security cameras that use Frigate NVR, which uses AI image recognition to trigger when a person enters an area, and an audio AI model that listens for a fire alarm or breaking glass. They are working on implementing text search for video clips, so you could search for clips with a guy in a red jacket. b) Local audio transcription: I tested Whisper large models for transcribing non-English call recordings and it works, but it is sloooow. I ran out of time on Google Colaboratory. I saw that there is an optimized Whisper version that I can run locally on a Google Coral without a GPU, so I still need to test that one out. I would love to be able to search my calls.
@GuilhermeMirandaRaiolАй бұрын
You should try faster-whisper and insanely-fast-whisper. I got good performance in Portuguese even with smaller models, or you can try a fine-tuned model for your language; some people on Hugging Face have already done that for various languages.
@prathamshenoy98405 ай бұрын
One of the most useful tech videos of this year. Unlike some other channels that post so many videos but 95% of them are useless
@questionablecommands94235 ай бұрын
I'm all about self-hosting these technologies. Ever since DALL-E hit the scenes, I've been thinking that artists should train a model on their own art so if they get creatively stuck, they can, "ask themselves" for inspiration.
@TechnoTim5 ай бұрын
That's an awesome idea. I really wish I knew more about training. Maybe soon!
@Breeegz5 ай бұрын
@@TechnoTim If you can train a dog... actually that's nothing like training an A.I.
@TechnoTim5 ай бұрын
On top of that one of my dogs still bites me!
@nicoschroder86925 ай бұрын
Great video as always :) Would love to see a video about the hardware setup & requirements and some guidelines for which models to choose for different hardware configs
@2ndtlmining5 ай бұрын
Man this looks interesting, you gotta show us how you set this all up
@showmequick22455 ай бұрын
I second this
@droneforfun53845 ай бұрын
I approve this
@Nastalas5 ай бұрын
There is a HACS version of ollama support where you already can control your devices with it in Home Assistant
@tomthompson119820 күн бұрын
dude. love it. plz focus on more content that runs locally and free from so many saas providers with their monthly fees. open source and locally run solutions must be the future!
@TechnoTim20 күн бұрын
@@tomthompson1198 🫡
@ewenchan12395 ай бұрын
I've played with Ollama, the open-webui, a different open-webui, and Automatic1111. One of the models ended up needing about 40 GB of VRAM, so I had to use two 3090s to be able to have enough VRAM for the model. Pretty nifty though. Not perfect, but still fun to play with.
@Drkayb5 ай бұрын
You can also set up piper as a server, and just feed it text by curl (local or remote). Then it generates audio-files super quick. It can also be piped to stdout iirc if you don't need the files.
@TechnoTim5 ай бұрын
Thank you! I will look into how to connect this to HASS!
@Drkayb5 ай бұрын
@@TechnoTim I think the problem was that there is just a ton of overhead every time you run the executable file, so by keeping a server running, the .exe is "running" all the time.
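A minimal sketch of the "keep a server running and feed it text" idea from this thread, written in Python rather than curl. Everything about the server here is an assumption: that a Piper HTTP server is already running locally, that it listens on `http://localhost:5000/`, that it accepts the text as a `text` query parameter, and that it answers with WAV audio. The helper names `build_tts_url` and `synthesize` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Piper HTTP server is already running at this address.
PIPER_URL = "http://localhost:5000/"


def build_tts_url(text: str, base_url: str = PIPER_URL) -> str:
    """Return a GET URL that carries the text to speak as a query parameter."""
    return base_url + "?" + urlencode({"text": text})


def synthesize(text: str, out_path: str, base_url: str = PIPER_URL) -> None:
    """Ask the server for speech and save the response bytes (assumed to be WAV)."""
    with urlopen(build_tts_url(text, base_url), timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as wav_file:
        wav_file.write(audio)
```

Calling `synthesize("The front door is open.", "door.wav")` would then write one audio file per request. Because the voice model stays loaded in the server process, each call avoids the startup overhead described above.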
@llortaton28345 ай бұрын
glasses off? it's about to get serious!
@raul175335 ай бұрын
Yeah. Nice try Tim AI
@robotics_and_stuffАй бұрын
Local AI TTS is available now, called F5-TTS
@The_Mup5 ай бұрын
1:45 - Third option: Let surf shark snoop on you. VPN providers are no more trustworthy than your mobile ISP. VPNs are for getting around region blocks, NOT for privacy.
@TechnoTim5 ай бұрын
They both have data logging, selling, sharing, and trading policies... the ISP's policy is to do it... the policy of VPNs like this one is to not.
@lalalalelelele79612 ай бұрын
VPNs are for creating encrypted tunnels for sensitive data. Not all privacy revolves around torrenting and hiding IPs.
@Nextrix5 ай бұрын
I wonder how well these tools work in an offline or no-internet VLAN. Most still tend to connect to third party domains/servers, and we have no clue what data is being sent when it does. I'm not ready to trust these yet. Would make a good video to showcase the endpoints they do try and connect to.
@jwr67965 ай бұрын
What are the gpu requirements for all this? Are we talking a recent-enough gaming gpu like a 3060, or do you have to shell out for those enterprise cards with no video output?
@TechnoTim5 ай бұрын
3060 should work fine. Smaller models should fit fine!
@jwr67965 ай бұрын
@@TechnoTim good to hear!
@Unselfless5 ай бұрын
What hardware are you using to run this?
@Solus-Regnator5 ай бұрын
this teaser was nice, where is the setup video ? :D
@abhijithabhi585 ай бұрын
What are the hardware requirements ?
@jacobnollette855 ай бұрын
that dances with wolves earned my thumb
@CharafEddineCHERAA5 ай бұрын
For anyone who's using Ollama, what's the minimum hardware needed to run a 70b model?
@krurschak26535 ай бұрын
I would say an RTX 4090, but with a poor performance experience. For a GPT-like experience you will need something like 4x RTX 4090. But then you could deploy Mixtral 8x7B, which is a GPT-4 class LLM with good performance and context window.
@antaishizuku5 ай бұрын
I'd say two 4090s, or a 4090 plus another NVIDIA card like a 4060 or 3060. You will need about 40 GB of VRAM for decent quantization, but if you are willing to give up decent responses, go for about 30-ish. Just keep clear of the 2-bit (Q2_K) quantization. The 3-bit (Q3_K) is OK, with 4-bit (Q4_K) being the standard. 8-bit (Q8) is about the same as the full float16 model but needs huge amounts of VRAM. Anyway, more VRAM/CUDA = better
@antaishizuku5 ай бұрын
Phi3 14b 128k is really good and i heard good things about gemma 2 27b. Though overall im still a fan of llama3
@brandonmansfield43285 ай бұрын
It varies since you can adjust the quantization for fit. For the big models (70b) I would suggest > 40GB if you can swing it. >70 GB if you want to run 120b models. A pair of p40s off eBay isn't too bad to buy. Probably the best budget path presently.
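The VRAM figures quoted in this thread follow from simple arithmetic: the weights need roughly (number of parameters) x (bits per weight) / 8 bytes. The sketch below is a rough estimate only. The function names are hypothetical, and the 15% overhead for KV cache and runtime buffers is an assumed placeholder, not a measured value. Real quantization formats also store some extra metadata, so actual usage tends to run somewhat higher than the nominal bit width suggests.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def estimated_vram_gb(params_billion: float, bits_per_weight: float,
                      overhead_fraction: float = 0.15) -> float:
    """Weights plus an assumed allowance for KV cache and runtime buffers."""
    return weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)


# Compare a 70B model at several nominal bit widths.
for bits in (16, 8, 4, 3, 2):
    weights = weight_memory_gb(70, bits)
    total = estimated_vram_gb(70, bits)
    print(f"70B at {bits}-bit: ~{weights:.1f} GB weights, ~{total:.1f} GB with overhead")
```

At 4 bits, a 70B model needs about 35 GB for weights alone, which lines up with the "about 40 GB" and "> 40GB" suggestions above once overhead is included.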
@macthiswork30065 ай бұрын
what is the project called that you use for the whisper webui?
@spaceco15 ай бұрын
Awesome video. Would love to see a follow up video where you go over the hardware for inferencing these models. And what kind of performance changes you noticed when playing around with different components
@wyattarich5 ай бұрын
This seems to be covered in many other places, and it's almost entirely subject to the models you run. Hard to generalize such a thing. Google for Llama.cpp benchmarks and INT8 performance for GPUs.
@Mishanw5 ай бұрын
What kind of GPU are you using? I have a Dell R730 and I wanted to try putting a GPU in it to run Ollama. I really wish there was a low-power AI processor that we could plug into any device with sufficient RAM and be able to run models effectively and efficiently at a relatively affordable cost
@Jairoglyphx5 ай бұрын
We need info on the hardware setup! Like are Nvidia GPUs the only option or can we use NPUs in the newer Intel processors?
@brandonmansfield43285 ай бұрын
NPU performance is going to be bound by memory bandwidth performance and ddr5 isn't where you want to be. The soldered lpddr5x is going to have much better memory bandwidth and will be when these chips start to get some reasonable performance. Lunar lake and Zen 5 should both come in this configuration at some point.
@Techonsapevole5 ай бұрын
well done, local LLMs are the future
@tohur5 ай бұрын
I have pretty much been running local AI since the onset of all the open-source models and have run plenty of backends. I'm now on Ollama and plan to stick with it, as it's the fastest backend I have run out of all of them. On Linux it's so easy to run the models on AMD or NVIDIA. I run 7b-13b models on my little ol' RX 6600 XT with ROCm and tbh it runs great. IMO, running 7b-13b locally is about all anyone needs; just have specialty models at the ready for different tasks, which Ollama makes easy af haha. The best feature of Ollama to me is having it set up to auto-unload models when not in use.
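The auto-unload behaviour mentioned above can be controlled per request with the `keep_alive` field of Ollama's `/api/generate` endpoint. The sketch below assumes Ollama is running on its default local address and that a model named `llama3` has been pulled. The helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Ollama's default local address; change it if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str, keep_alive="5m") -> Request:
    """Build a non-streaming generate request with an explicit keep_alive."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,           # ask for a single JSON response
        "keep_alive": keep_alive,  # e.g. "5m", "1h", or 0 to unload right away
    }
    return Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, keep_alive="5m") -> str:
    """Send the request and return the generated text."""
    with urlopen(build_generate_request(model, prompt, keep_alive), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("llama3", "Summarize my notes", keep_alive=0)` would free the VRAM as soon as the answer comes back, while a longer value keeps a frequently used model loaded between requests.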
@amirzo122 ай бұрын
What is the hardware stack you are using for your AI solution
@BCKammen5 ай бұрын
Ok, Tim, where is the guide for how to set this all up ? Especially the Home Assistant stuff....
@TechnoTim5 ай бұрын
Soon on my Techno Tim Tinkers channel!
@BCKammen5 ай бұрын
@@TechnoTim Standing by then......
@xythiera72555 ай бұрын
If you don't have a really, really powerful GPU, it's not really possible in terms of usability. If you have to wait ages for something to happen, it's kind of pointless.
@TechnoTim5 ай бұрын
@xythiera7255 It really depends on the GPU, I will cover this in my tutorial!
@krurschak26535 ай бұрын
@@xythiera7255 A 4090 is enough for llama3 8B. 4x 4090 or one A100 will work for the 70b version or even for Mixtral 8x7B, which is nearly as good as GPT-4 and super fast :) But phi-3 and llama3 8B are really not that bad. They are better than GPT-3.5, so I see this as a good starting point. I would recommend waiting for new hardware like LLM-specific GPUs because they can be much cheaper, like 1/4 of the price.
@underfluked383223 күн бұрын
Bring your own model is a big thing with corporates, it’ll come.
@Squirrel4Gir5 ай бұрын
Love the vid. Please also try to include a notice to help these free models either via training or donations to accelerate their further development
@knutblaise94375 ай бұрын
Curious if there is a self-hosted AI which could serve as a replacement for Grammarly? I recently noticed my Office 2016 had a new AI process running. From a privacy perspective I'd prefer not sharing my documents with organizations like MS/Google/Grammarly.
@brandonmansfield43285 ай бұрын
You don't need a full AI for grammar. LanguageTool is self-hostable, and they have browser extensions you can configure for your local copy.
@raymondx1375 ай бұрын
Do you have a part list and or setup tutorial?
5 ай бұрын
This is 12 minutes of pure gold, thank you very much. 😊
@Rodent0075 ай бұрын
Thank you, great video. I wish you would run through what hardware you run this on.
@TechnoTim5 ай бұрын
Thanks for the feedback. I have a video on it, it's my new All in One HomeLab server. More to come!
@jason-budney76245 ай бұрын
Really cool video Tim! I've been wanting to play with some image to image "AI" stuff, but it's been hard to find much about it when self hosting is involved. I'll be poking around with the tools you mentioned to see if I can find something.
@DorZ19835 ай бұрын
What is the UI that shows the app stack flow? Is it an actual app or just after effect?
@ezequieligomez21352 ай бұрын
Do you know what's the most cost-effective GPU to get this done as I doubt it will work well generating images or processing PDF smoothly on a CPU?
@neponel10 күн бұрын
great little show case of what you can now do locally. continue to share this with us. perhaps specific tutorial on the mentioned things? tired of wasting my money on siloed saas products/services.
@TechnoTim10 күн бұрын
@@neponel tutorial is the pinned comment! Thank you!
@WMRamadan5 ай бұрын
I tried this a while back with an NVIDIA RTX 3060 12GB and of course bigger models wouldn't load. Would using two GPUs help load bigger models, giving a combined memory of 24GB? Also, do you know if mixing GPUs works, for example having a 3060 12GB with a 4060 16GB to give a combined 28GB?
@xythiera72555 ай бұрын
If you don't have at least a 3090, you are really limited. Yes, that exists, but you could also just buy a workstation card, which means insane costs. So if you really want to play with AI you need a 4090 because of the VRAM; it's the only real option other than going with an NVIDIA RTX 6000 for 6 grand and 48 GB of VRAM
@WMRamadan5 ай бұрын
@@xythiera7255 I'm going for the cheapest option, If I can buy two 4060 16GB to have a combined 32GB of GPU memory then I will do that!
@matthias32312 ай бұрын
Hi! Nice video! Can you dive a bit deeper into how to set it up, what the drawbacks are, and the hardware requirements (CPU/GPU/disk space/...)? The positive things as well, but those are covered a lot here already. Thanks!
@TechnoTim2 ай бұрын
There’s a link in the description and pinned comment for the full tutorial
@_coderizon4 ай бұрын
what is the Difference between Ollama with WebUI and LangChain for NLP tasks
@alexjohansson3285 ай бұрын
Super awesome video - unique cutting edge I can't wait to give it a go
@Arthur-o2y3 ай бұрын
which rack is that? at 0:44
@angryox31025 ай бұрын
You’ve just given me so many ideas. This is awesome.
@tchesnokovn5 ай бұрын
What’s the nocode workflow looking thing you are using?
@vaidkun5 ай бұрын
the thing with AI is that even if you are running it locally you need to get the training data from somewhere, so someone still has to give up their privacy :)
@TechnoTim5 ай бұрын
touché
@FreedomToRoam865 ай бұрын
Very cool idea, the private search AI!
@ivlis325 ай бұрын
HA Voice integration is, unfortunately, very strange. They insist on using HA "add-ons" for voice what I really don't want because I do not use HAOS, but deploy HA as any other container.
@dragonhunter24755 ай бұрын
The addons are just docker containers, you can find them in the rhasspy git repo
@abudi455 ай бұрын
It was a great video but U didn't show us how we can install it in our home lab 😢
@skelious3 ай бұрын
Great video thank you so much for the info. I am a completely new person to this space. (Boomer status front and center) But i am going to try to go all in on a self host scenario and try to have fun and learn including taking some python stuff to enhance my experience. Keep up the great work.
@TechnoTim3 ай бұрын
@@skelious thank you! It’s never too late!
@santiago695 ай бұрын
Hello what is the name of the open source web based version of whisper that is mentioned please?
@user-ic6xf5 ай бұрын
I was so ready for you to do a video on this.
@l0gic235 ай бұрын
Been waiting for this one. Let's go!
@andrewbennett57335 ай бұрын
I've watched a few videos on people setting up AI like this, but this just has the perfect blend of information AND instruction. Your 230K follows should be more like 2.30M. Thanks for sharing so much good stuff!
@TechnoTim5 ай бұрын
Thank you so much! If you can believe it, it's actually more difficult to say less. I had to constantly remind myself to not ramble or go on side quests 😅. Thanks for noticing and a full tutorial will be coming soon on my other channel, @technotimtinkers
@andrewbennett57335 ай бұрын
@@TechnoTim I get that! I used to be an educator and it's hard not to tell everyone you meet all of the facts you know, especially when it's stuff that excites you. For the record I would happily listen to all of the side quests haha. And how did I not know about your other channel??? HERE I GO
@TechnoTim5 ай бұрын
@@andrewbennett5733 Sometimes side quests are more fun than the main quest!
@andrewbennett57335 ай бұрын
I need you to go the @JeffGeerling route and start a third channel for side quests 🤣
@TechnoTim5 ай бұрын
That's what Techno Tim Tinkers is for ;)
@Squirrel4Gir5 ай бұрын
Gonna need a video of whisper. Also any chance it can be integrated into Plex drafting subtitles
@SyedZainUlHasan5 ай бұрын
What are the system specs?
@gemargordon68855 ай бұрын
I’m loving Gemini for sure! It’s a bit better than llama or ChatGPT.
@FatalSkeptic5 ай бұрын
haven't been able to get Home Assistant to give me any data back from AI agents, so frustrating
@truckerallikatuk5 ай бұрын
Why do so many services go with such odd names? Like Sear XNG, which is how I'd pronounce it, not search NG. That's how it's written after all.
@benhillard9195 ай бұрын
I think in the area it comes out of the "x" makes a "ch" sound.
@TechnoTim5 ай бұрын
@@benhillard919 I think so too, and I totally guessed so I hope that's how it's pronounced! Also, now that I see it again, it might be "searching". 🤣
@eliaskallelindholm83395 ай бұрын
This is the first time I have done something Techno Tim is showing before he did show it :D
@TechnoTim5 ай бұрын
Ha! It took a while for me to build, integrate, and actually evaluate all of these systems!
@eliaskallelindholm83395 ай бұрын
Did you try the 70B model from Llama? (I ask because I saw you only used the 8B model.) I read some stuff about running this with 2 RTX 4070s or an Ada 6000, but I sadly don't have the hardware to run that purely on graphics cards yet. The results should be even better than the paid ChatGPT stuff.
@eliaskallelindholm83395 ай бұрын
RTX 4090 with 24GB VRam I mean.
@OvernightSuccess7215 ай бұрын
This is Tim’s evil twin brother NoTechTim. Insert Travolta meme looking for the tech.
@TechnoTim5 ай бұрын
TechNOTim 😂
@lakshaynz4 ай бұрын
Thank you 😊
@coletraintechgames29325 ай бұрын
Im ready for the how to! I have messed with it and have something running,but these features look awesome!
@TechnoTim5 ай бұрын
Soon on my other channel!
@huseinnashr5 ай бұрын
You have other channel?
@dcoidua5 ай бұрын
Would this all run well on a 4090?
@brandonmansfield43285 ай бұрын
The bigger models need more vram than a single 4090 provides. You can run the smaller models just fine. You will lose out on some performance the bigger models provide but it runs!
@koevoet72885 ай бұрын
You can run homeassistant faster whisper on gpu, ive been doing it for months. I’ve got a dockerfile for this, lmk if you want it
@TechnoTim5 ай бұрын
Thank you! I found a forked version of wyoming whisper but it didn't seem to help. I figured I'd wait for the official one to get updated.
@koevoet72885 ай бұрын
@@TechnoTim I’m also using someones fork, don’t remember if i changed it in any way but its running perfectly on my quadro p2000
@dhmybiker50345 ай бұрын
How to define a graphics card on Docker in Ubuntu
@TeambitDK5 ай бұрын
This was really interesting, now I want to build it :D
@TheJoaolyraaraujo5 ай бұрын
Mac Whisper is amazing
@TechnoTim5 ай бұрын
100% agree! I bought it for better models and they work even better for scripted talks (like this). It's so accurate!
@OGH32945 ай бұрын
Can I do these things with a 4060 TI 16Gb version ?
@TechnoTim5 ай бұрын
Yes, just use smaller models.
@xythiera72555 ай бұрын
you can but it will be really slow
@OGH32945 ай бұрын
Ok. Plan dropped . I will just keep watching TechnoTim 😁.
@jensodotnet5 ай бұрын
I currently run two 1070 (8gb), while a little slow it works fine, but for image generation you would need more vram, 8b llm models works fine on single 8gb vram. A 3090 is much faster and does images very well and can run larger models. imho integrating search had bigger impact than using a larger model of the same type(not tested 70b)
@sree_nath5 ай бұрын
Love your videos, even though there are plenty of how to videos on these topics, I would love to hear it with your mesmerizing voice 😊
@TechnoTim5 ай бұрын
🥰. thank you! Audio in this old wooden / plaster room is hard, so hopefully it sounds ok!
@showmequick22455 ай бұрын
Nice, welcome to Minnesota btw 😂
@itaco80665 ай бұрын
Great video! ❤
@Rohinthas5 ай бұрын
I am generally skeptical of the AI hype but your way of going about it has piqued my interest. Hope more in-depth guides on setup and hardware are coming, subscribed ;)
@Rohinthas5 ай бұрын
Ah I just found your homelab video! That answers some questions!
@tendosingh56822 ай бұрын
So power hungry the good AIs Gpus are.
@TechnoTim2 ай бұрын
@@tendosingh5682 for sure.
@droneforfun53845 ай бұрын
Just subscribed for the upcoming guides on local AI 😃🥰😎
@TechnoTim5 ай бұрын
@@droneforfun5384 soon!!!
@djstraylight5 ай бұрын
I see a future video of you building a dedicated AI server with multiple GPUs and benchmarking the tokens per second depending on the setup. It would get many views from r/LocalLLM or r/LocalLLaMA groups for sure.
@TechnoTim5 ай бұрын
Thanks! Sounds awesome! I am always hesitant to share my content on subreddits other than my own, but if you feel this is worthy of it feel free to!
@Mrtrunks5 ай бұрын
Glasses off so we don’t see that DeskPi
@HenryBiglin5 ай бұрын
Damn, you just sent me down a rabbit hole.. lol
@famousartguymeme5 ай бұрын
this is awesome!
@voodoochild420ai5 ай бұрын
nice vid
@TheRowie755 ай бұрын
Surfshark privacy??? Open Source?
@yewbacca5 ай бұрын
What happened to Tim? Who is this imposter?
@TechnoTim5 ай бұрын
🤓
@hamdibougattaya5 ай бұрын
That's awesome, I like ur vids...
@alexey_sychev5 ай бұрын
Sure, electricity is free nowadays
@TechnoTim5 ай бұрын
It uses a lot less power than a gaming machine since you only use it in spurts, nothing new here, just shifting the workload that's using the card.
@ClayBellBrews3 ай бұрын
I was promised cookies!!!!
@romayojr5 ай бұрын
you forgot to mention the script for this video was made by AI 🤖
@TechnoTim5 ай бұрын
Ha! Nope, 100% me! Bad grammar, bad jokes, stutters were all compliments of HI (Human Intelligence)
@romayojr5 ай бұрын
@@TechnoTim i love AI but HI will always win my heart. but seriously, thanks for this video, i've been waiting for this one. now i need to integrate more stuff to my open webui!
@Act1veSp1n5 ай бұрын
YEESS!!!!
@Tr1pke5 ай бұрын
Surfshark has a no-logging policy, yeah right. A VPN seller with a no-logging policy will never exist. Don’t lie, we like you too much
@Kaleb-lf8kf5 ай бұрын
lol
@mavis-io5 ай бұрын
What hardware is used for this AI heaven?
@tsmot9115 ай бұрын
When AI can write an OS it will have arrived.
@TechnoTim5 ай бұрын
The singularity!
@llortaton28345 ай бұрын
hi!
@CharizardSnyper5 ай бұрын
You're a One Piece fan?
@ABTcorp5 ай бұрын
😀😀🥰🥰🥰🥰
@dhmybiker50345 ай бұрын
Please use audio dubbing from English to Arabic in your videos
@pkt12135 ай бұрын
Can I put this into Proxmox? "Proxmox, spin up an LXC container for Plex and pass my GPU through for hardware encoding."
@EvgenMo11114 ай бұрын
Yes, it works without problems in an LXC container with a GPU; you just need to configure the network address