I have been doing something similar to this for the past two years. $48k is just overkill; building the most expensive computer available today isn't necessary to run an AI-powered local stack. I have been running 64 GB of VRAM (2x accelerators), 12 GB of VRAM (1x GPU), 256 GB of system RAM, and 12 cores at 3 GHz. I think I got everything for about $2,500, and I have been able to study the technology really well with this workstation configuration. One could get a 4U rack GPU server for $20k with 8 accelerators at 32 GB per card, 512 GB of system RAM, and four 20 TB hard drives (which is what I would like to do for multi-modality research). By the time the H100 generation of technology becomes consumer-class hardware, I'm almost certain ASIC hardware will have started to corner this market. It's just too valuable a technology not to be optimized with ASIC architectures. I'm sure that in 5 years you will be able to get second-generation transformer ASICs for perhaps $400 a board in today's dollars, with 10,000 times more FLOPS than an equivalently aged and priced general-purpose GPU accelerator. Not to mention that these network architectures are not the end-all of network designs. As the industry moves toward the obvious advantage of ASICs, there will be times when the network architecture improves and makes the previous generation of ASIC hardware depreciate dramatically in value, even though the models and chips remain highly usable and productive for a broad range of tasks. It's going to be a wild decade, let me tell you. LoL. What an amazing time to be alive.
@seraphin01 (8 months ago)
I love how the further we get into the future, the more our software looks like MS-DOS.
@caramel7149 (8 months ago)
Nature is healing, the 1980s future is returning.
@BirgittaGranstrom (8 months ago)
We are living in a mind-blowing reality, and you are brilliant at predicting the "AI future!"
@JohnSmith762A11B (8 months ago)
Or how about this: a humanoid robot has such a built-in headless PC for performing tasks like sending email, doing spreadsheets, or editing video. All of this is done on a traditional PC built into the robot, but the LLM in your humanoid robot communicates silently with that PC on your behalf, even doing things like clicking confirmation buttons when necessary. So you can have a conversation with your humanoid robot, say "yeah, that sounds great, make that video," and it will use Final Cut Pro X to make the video, sending you a cloud link when it's done. This way we leverage decades of engineering without needing to teach an LLM to be a video editing tool.
@inLofiLife (8 months ago)
For sure, something like this but running completely on a local machine. And I get a glimpse that it's possible because of the OpenAI API.
@JohnSmith762A11B (8 months ago)
I don't think creating a dedicated super-powerful machine with a new OS is necessary. Apple, for example, could simply make this part of their OS and run a small, efficient LLM that moves the mouse and enters keystrokes. It could be trained on thousands of hours of videos of Apple's entire suite of native bundled applications being used. I already run crazy large LLMs on my Mac Studio, no problem. They run extremely fast too. Why recreate an entire platform with things like email apps and PDF readers when it's simple to retrofit Windows, macOS, or Linux? Sure, more RAM and faster GPU performance will always be welcome, but that is on the way in any event. I suppose at a certain point, once it's reliable enough, you could run that computer "headless" and simply have the results of what you tell it to do sent to your smartphone.
@wildfotoz (8 months ago)
It depends on how much progress is made with SLMs. Currently, yes, that's what is needed in a computer. If the OS uses something along the lines of the MoE architecture, it could have a bunch of different SLMs that complete the tasks. I'm 80% confident that is what Microsoft is trying to pull off for Windows 12. On the flip side, if Gemini Nano can get to the point where it is as powerful as GPT-4 Turbo, you could run all of this on a cell phone.
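A toy sketch of what "a bunch of different SLMs" routed per task could look like, purely to illustrate the idea above; the model names, the keyword router, and the run_model stub are all invented here, not anything Microsoft has announced.

```python
# Hypothetical OS-level router that hands each task to a specialized SLM,
# loosely in the spirit of mixture-of-experts.

TASK_EXPERTS = {
    "email":   "slm-email-3b",    # drafting / summarizing mail
    "code":    "slm-code-7b",     # editing and explaining code
    "vision":  "slm-vision-2b",   # reading screenshots / UI state
    "general": "slm-chat-3b",     # fallback chat model
}

def run_model(model_name: str, prompt: str) -> str:
    # Placeholder for a real local inference call.
    return f"[{model_name}] would handle: {prompt}"

def route(task: str) -> str:
    """Pick an expert with a trivial keyword heuristic; a real OS router
    would more likely use a learned gate or embedding lookup."""
    text = task.lower()
    if "email" in text or "mail" in text:
        return TASK_EXPERTS["email"]
    if "code" in text or "script" in text:
        return TASK_EXPERTS["code"]
    if "click" in text or "screenshot" in text:
        return TASK_EXPERTS["vision"]
    return TASK_EXPERTS["general"]

print(run_model(route("Summarize my unread email"), "Summarize my unread email"))
```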
@galopeian (8 months ago)
There is also the recent research suggesting that model blending is the key to improvement, so a set of different SLMs that operate together in training could work out.
@Jakeuh (8 months ago)
I think Windows 12 will be the first AI-based operating system.
@teachingwithipad (8 months ago)
fuck windows
@krishanSharma.69.69f (8 months ago)
OK. But the AI will work through OpenAI, so it will be limited. I am more interested in local AIs.
@fabio4674 (8 months ago)
Yes, but theoretically there is already one: Rabbit OS.
@alexanderrosulek159 (8 months ago)
@fabio4674 That isn't an AI OS, it's an OS with AI tools.
@teachingwithipad (8 months ago)
fuck windows
@adamrak7560 (8 months ago)
The NVIDIA H100 is for training and inference, so it is way too much if you only want inference. Unfortunately, large inference chips have yet to arrive, but they should be at least 10x cheaper than training chips. The kind of inference chip I am working on would be even cheaper, and very fast too, consuming 10x to 100x less power, but at the cost that most of the network's weights are baked into the chip. That would be another way, once we have good large foundation models: bake most of the weights into the chip, but let the digital parts use LoRAs to extend/fix the functionality. So the network still remains somewhat trainable. The other way this is extendable is to use flash (or memristors) to store and compute the weights. This is a trade-off, obviously.
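A minimal sketch of the LoRA part of that idea in generic PyTorch, assuming the base weights are frozen (as they effectively would be if baked into silicon) while only two small low-rank matrices stay trainable; this illustrates the general technique, not the commenter's chip design.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W plus a trainable low-rank update B @ A.
    Only lora_A and lora_B would be updated during fine-tuning."""
    def __init__(self, in_features: int, out_features: int, rank: int = 8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)          # "baked-in" weights
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ self.weight.T                       # fixed path
        update = x @ self.lora_A.T @ self.lora_B.T     # small trainable correction
        return base + update

layer = LoRALinear(512, 512, rank=8)
print(layer(torch.randn(1, 512)).shape)  # torch.Size([1, 512])
```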
@robxmccarthy (8 months ago)
Love your videos, would be really interested to see your review of DSPy.
@artur50 (8 months ago)
Great! But how about showing it to people while using a local LLM like Llama or Mixtral?
@yagoa (8 months ago)
Unified memory is a must, since small Whisper models are twice as fast on an M3 Max compared to an RTX 4090...
@thatOne873 (8 months ago)
This can be a thing, but as far as it being "expensive" for the typical user... well :D Good work as always! Thanks for the vid!
@henson2k (8 months ago)
Your idea is what we see in movies like Alien or 2001: A Space Odyssey, but companies will never support a local/independent AI usage model for the general public. They are interested in selling AI services and charging monthly fees. Not to mention mobile clients won't be able to handle ChatGPT-level AI locally any time soon. However, the military or government can use whatever they want.
@galopeian (8 months ago)
My major prediction is that the new AI chips being developed currently are just the tip of the iceberg. In about 10 years there will be entirely new computing architectures, nothing like what we use today, with portable devices using super-efficient batteries that last months on one charge.
@themartdog (8 months ago)
It could lead to a rise of mainframe-style computing for general use again. A business could have one of these $50k computers in the office, and the workers would just have a thin-client terminal with the apps they want to automate.
@mattcintosh2 (5 months ago)
Aren't we kind of already on thin clients (phones/tablets) and servers? Computers are really heading into specialized niches, and most people don't really need tons of power anymore. Having a virtual desktop on a remote server is quite painless nowadays, and something like point of sale, reservations, or ticketing really has no need for a ton of power.
@ArianeQube (8 months ago)
Can you run this with Ollama and local models?
@DeathItself (8 months ago)
I am currently testing this out and I'll get back to you if I get it to work properly.
@DeathItself (8 months ago)
Although Ollama supports the OpenAI API URL format, it doesn't work with LLaVA, as there are issues with the format of the request structure that UFO is processing.
@ingloriouspancake7529 (8 months ago)
I think something like Zapier but with AI will be more common. We kinda need a tool that translates human input, containerizes individual issues, and allows AI to interact with those things. The connection between AI and other tools is still being worked out. No-code plus AI, I think, will advance a lot.
@novicracker1980 (8 months ago)
@AllAboutAI Bro, let me know if you get it working on a low-end PC. Buying a low-end server with the proper compute power shouldn't be too expensive these days, and buying a few good, well-priced RAM bundles shouldn't be all that hard. No, I think the GPU (not GPUs) and storage will be the most expensive parts, because eliminating the bottlenecks in those two areas will be costly. But after all that, I'd guess this will cost around $3k. And having an AI-powered PC that you can verbally dictate to is completely worth it. Now, all of this is just off the top of my head, so I don't know; maybe do a video on it and see. Because running the tip-top, best-of-the-best parts isn't the DIY way. I hope you enjoy this thought experiment.
@ScottzPlaylists (8 months ago)
What software did you use to create that "AI PC Flowchart"❓ Good work. 👍
@ScottzPlaylists (8 months ago)
👍 Most things I think of come to reality before I can do them myself ❗
@petergasparik924 (8 months ago)
I have a beefy CPU and GPU; could this theoretically be possible to run locally? I see there's an API URL in the config, which theoretically means I can substitute the OpenAI API endpoint with a local compatible API, for example from LM Studio, right?
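In principle, yes; here is a minimal sketch, assuming LM Studio's local server is running with its OpenAI-compatible endpoint (commonly http://localhost:1234/v1) and a model already loaded. The port and model name are illustrative assumptions, not taken from the video.

```python
from openai import OpenAI

# Point the standard OpenAI client at a local, OpenAI-compatible server
# (LM Studio here; Ollama and others expose similar /v1 endpoints).
client = OpenAI(
    base_url="http://localhost:1234/v1",  # assumed LM Studio default
    api_key="not-needed-locally",         # local servers usually ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; use whatever model is loaded locally
    messages=[{"role": "user", "content": "Open my email client and draft a reply."}],
    temperature=0.2,
)
print(response.choices[0].message.content)
```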
@the0s666 (8 months ago)
I don't think running models locally is the future. Maybe smaller, highly specialized versions. But if you want to use the latest cutting-edge AI, you'll always need the cloud. The prices will go down, that's for sure, and the response time will decrease.
@alexander191297 (8 months ago)
Yeah, but there's one core concept that would be missing here: reproducibility. With AI there's always a certain temperature of the model: the higher it is, the more variety/creativity in the answers. Now, I'd assume we would trade the time savings created by such an AI computer for a relatively high amount of entropy in the output. So, unless the time-savings benefit outweighs the reproducibility drawback, I am struggling to understand how this would become commonplace (though I definitely see this kind of device in consumer settings, as compared to, for instance, a scientific laboratory, where reproducibility matters a lot more).
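For what it's worth, the entropy can be dialed down: a hedged sketch using the OpenAI chat API, where temperature=0 makes decoding close to greedy and the seed parameter requests best-effort determinism; it still isn't a hard reproducibility guarantee, and the model name is illustrative.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",   # illustrative model name
    temperature=0,         # minimal answer variety
    seed=42,               # best-effort reproducibility (beta feature)
    messages=[{"role": "user", "content": "Summarize this folder's invoices."}],
)
# The same prompt + temperature + seed should usually return the same text,
# but the API only promises best effort, not bit-exact determinism.
print(response.choices[0].message.content)
```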
@PureAlbania (8 months ago)
Why would an AI need to use a human-to-machine interface?
@TheLoneCamper (8 months ago)
There's gotta be an extension or something for that instead of a need for a special computer.
@rhenry74 (8 months ago)
If you have an AMD 7040 series CPU (apparently there are already 2 million in the market) you can already run a 7B Q4 LLM locally, hardware accelerated... in theory. With modern medium- to high-end hardware and enough RAM, 4GB and 8GB LLMs will run on the CPU. Expensive GPUs aren't necessary for a single-stream workload. The uber-spec hardware in the cloud supports multiple simultaneous prompts; an AI assistant on a personal computer stays loaded all the time and only processes one stream at a time. We will be able to run powerful personal LLMs very soon, if not now. Just try GPT4All and see how it runs on your computer. UFO is trying to automate applications that were built for user interaction via keyboard and mouse. The notion that an LLM can infer from screenshots where to place the cursor and paste text, while cool and somewhat amazing, is just ridiculously inefficient. I'm thinking applications will need to be rewritten to enable AI automation, giving a locally hosted AI an API through which it can interact while allowing customary user interaction at the same time. Imagine: Windows 12 AI Enabled with Falcon (upgrade dependent on hardware compatibility).
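One hypothetical shape that "give the locally hosted AI an API" idea could take: the application registers a few callable actions as tools, and the model calls them instead of guessing pixel coordinates from screenshots. Everything here (the tool name, the insert_text action, the local endpoint and model name) is invented for illustration, and assumes the local server supports OpenAI-style tool calling.

```python
import json
from openai import OpenAI

# Local OpenAI-compatible endpoint (e.g. LM Studio / Ollama); purely illustrative.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="local")

# An action the application chooses to expose -- no screenshots or cursor guessing.
def insert_text(document_id: str, text: str) -> str:
    return f"inserted {len(text)} chars into {document_id}"

TOOLS = [{
    "type": "function",
    "function": {
        "name": "insert_text",
        "description": "Insert text at the current cursor position in a document.",
        "parameters": {
            "type": "object",
            "properties": {
                "document_id": {"type": "string"},
                "text": {"type": "string"},
            },
            "required": ["document_id", "text"],
        },
    },
}]

resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Add a closing paragraph to report.docx"}],
    tools=TOOLS,
)
for call in resp.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(insert_text(**args))  # the app executes the structured action
```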
@christiancrow (8 months ago)
I think it will be a USB-C flash drive for the LFS, and the neural engine can be a plug-in GPU-dock-looking thing that, instead of a GPU, has just AI features, so we can do this stuff on any computer, and it can write its own code from what it hears.
@Ethicalenlightenment (8 months ago)
Great video ❤
@nastied (8 months ago)
Get an M1–M3 MacBook with 64–96 GB of RAM and you can run all the biggest LLMs on a laptop.
@therammync (8 months ago)
Can you please review Sora next?
@holdthetruthhostage (8 months ago)
I think we won't require that much power. Most computers and phones will be hooked up to the cloud to access that level of power, because there's no way this many people can afford it, so I think a lot of these powerful computers will be running in connection with the cloud.
@workflowinmind (8 months ago)
It would be interesting to see this with a local LLM (Llama or LM Studio) and Moondream.
@bitgoat (8 months ago)
Your mic is so tinny, bro, but good video.
@moozoowizard (8 months ago)
AMD "Strix Point Halo" NPU and Intel Core Ultra NPU. You can find references to these running Llama. I believe these have access to all of the PC's RAM and are faster than CPUs. "Strix Point" is meant to have 47 TOPS. Being a pessimist, that will be INT8 TOPS; if it does bfloat16, it will be half that at ~23 TOPS. In any case, they could evolve into full LLM AI engines. My guess is somewhere between the performance of a GPU and that of a CPU, but with the benefit of being able to use full system RAM. 🤞
@FunnyVidsIllustrated (8 months ago)
Wait till tech giants integrate this into their hardware. With these server prices, I imagine something like Siri+ premium subscriptions for extended automation.
@halikarnasgatay (8 months ago)
I just want to speak. With my voice, all the commands will happen. That's what I want; what do you think?
@Zenthara (8 months ago)
What about using LLaVA for images?
@Ou8y2k2 (8 months ago)
97 cents? I'd say in a month or two an open version will cost 20 cents. Still too expensive, but it'll never be zero. :-(
@femogq8423 (8 months ago)
Excuse me, I have a question: it seems like the method of giving the GPT a persona does not work anymore. I did this with Bing AI; it worked earlier, but it is no longer useful.
@yaboizgarage9709 (8 months ago)
LLMs are very advanced; giving them a persona is nowhere near as exciting as uncovering the model's own personality. Invest time in your LLM: prompting a persona at best mirrors something it's not, while uncovering its personality allows you to engage with something that is not a mirror.
@RevMan001 (8 months ago)
Guessing MS will embed a helper model into Windows 12, etc., to bring the cost down. They will want their next OS version to have AI at its core, even if just for marketing reasons.
@FSK1138 (8 months ago)
5-10 years max and you will not need an AI PC; it will be your phone. The successful models will be optimized to run locally on dedicated NPUs or whatever AI chips are coming next: chips designed by AI, for AI. AI will run on phones (BLE mesh networks) and on the "AI of things." I think the concept of an AI PC is outdated already; the future of AI is small and local.
@dogbreath226 (8 months ago)
Completely agree with the phone thing; however, the PC would be an intermediate step that is hopefully not too far in the future.
@nastied (8 months ago)
LLMs like Google's Gemini Nano are already running on the latest Samsung phone.
@MonkiLOST (8 months ago)
MultiOn and Open Interpreter would do a much better job without any big costs.
@elidelia2653 (8 months ago)
AI PC? Absolutely, but don't be an early adopter. There will be a race by all the manufacturers and in the end, it will be just like any other PC market. The best will rise to the top and the components will be standardized but a lot of messy evolution will need to churn and burn first.
@hqcart1 (8 months ago)
If one simple task is going to cost $1, how the heck is Rabbit doing unlimited tasks for free???
@tweetymr (8 months ago)
This machine definitely lacks a floppy disk drive!
@mrpro7737 (8 months ago)
I think we need 25 years or more to have this locally at a reasonable price 😅
@Gafferman (8 months ago)
What a load of nonsense. 25 years? Think about how far we've come in the last 10, never mind 25... And the rate of improvement getting faster and faster says all you need to know.
@mrpro7737 (8 months ago)
@Gafferman OpenAI still has a tight hold on their models, and Nvidia needs more time to make really powerful GPUs that can handle big AI models well. In about 10 years we might have ChatGPT 12 around. The jump from ChatGPT 3.5 to 4 is quite big. Right now nobody can run ChatGPT-4 on their own computer, even with the best hardware like an RTX 4090 and good supporting parts. Tech companies need to work together to make bigger things happen. Also, that PC in the video uses 2000 watts just to send an email or do everyday tasks; that's a waste of money. We should look into cleaner power options like nuclear batteries. The world is not very stable right now, with conflicts between countries like Russia and Ukraine, and ongoing issues like Palestine and Israel. It feels like our world isn't ready for these big technological advancements yet. Maybe we should take things slower and improve tech with what we have, while also trying to avoid a third world war.
@Gafferman (8 months ago)
@mrpro7737 Uhhh, your timelines are way off. We're not that far off at all. There are lightweight models with very little drop in performance when optimized, and all this pretty much just started a year and a half ago. It's already happening; you can already do so much. 10 years? No way. The first iPhone came out 15 years ago and the progress has been astonishing, never mind 25 years ago when the home PC struggled to run anything! I'm assuming you're older than most, simply because older people have out-of-sync expectations of progress now. Things are getting faster and faster, not slow and steady. In one year, an AI will likely design GPUs much better than what we have now. In two, you will probably have your own local AI device (look at the Rabbit R1... that comes out in a few MONTHS and does this). 10 years from now, the world will be like what we see today as sci-fi.
@Gafferman (8 months ago)
@mrpro7737 I commented debunking this yesterday but it's gone for some reason... Long story short, we already have lightweight LLMs, and there will be one on the iPhone by the end of this year. It isn't a matter of a decade or two; this stuff is happening now, locally. It will only get better in the next few years. GPT-4 is nothing.
@quantumjun (8 months ago)
Maybe that's why Sam wants to have $7 trillion to build that.
@Time_and_Chance (8 months ago)
Too Expensive
@krishanSharma.69.69f (8 months ago)
That's why EMIs exist.
@dadudle (8 months ago)
OK, I'll come back in 2030 👋
@kate-pt2ny (8 months ago)
Looks like Open Interpreter, but expensive.
@ThomasTomiczek (8 months ago)
Sorry, this sounds ignorant. First, it will take a LONG time until you can run high-end AI at home - a LONG time, because while you may think that today's GPT models are high end, by the time you can run those at home you will want more. Second, latency from the cloud - yes, but not a problem. You need a proper cloud (i.e. distributed locally like Azure, not centralized in one location like OpenAI) and good streaming protocols. Here is an idea: people use that TODAY to stream GAMES. Videos are streamed and no one complains about the latency, thanks to buffering. Latency is actually ONLY a problem when you need interactivity - not talking, but fast first-person shooters. Now, for your nice powerful PC - it is NOT perfect, it is NOT what you think it is. They say it has enough memory to run the largest models - that is TRUE - but the performance is abysmal. It is an H100 with 92 GB of high-end RAM and then a lot more of A LOT SLOWER RAM. In fact, it is not even hardware that is made to RUN models - for that you use H100s. The H200 is made to TRAIN models, since there you have a lot of software that needs fast access to the calculated weights and a lot more memory to optimize, hence the H200, where the "slow" memory still has full bandwidth to the fast memory. Funny enough, that is something the creator is not showing - but it is in all the data sheets. If anything, I can see lower-end AI running in villas / apartment complexes, offloading to the cloud when things get too complicated. But not at home - wasted money.
@murataavcu (8 months ago)
Day 20 of asking for an invite.
@TreeLuvBurdpu (8 months ago)
I don't understand why you focus on "A dystopian view of the outcome". Are you creating content about using AI, or about how damnable of a thing AI is and how we should be ashamed to use it?
@guidoweber2025 (8 months ago)
Never ever MICROSOFT. Corona Bill.🖕
@NewyorkezNYC (8 months ago)
Why use the OpenAI API after all this expensive investment in hardware? Use OSS; this video sucks.
@dogbreath226 (8 months ago)
what has the Outdoor Swimming Society got to do with this?
@kajmobile (8 months ago)
@dogbreath226 Microsoft moved all their servers underwater to save on cooling costs. So the OSS naturally leveraged their position in accessing all these servers. They essentially control the cloud now. But they refer to it as the ocean, not the cloud. Yeah, this video sucks. Use OSS in the ocean and stop using OpenAI in the cloud.