Copilot+ PCs - Do you need an NPU? Microsoft Says "Yes", I Say "No"

23,756 views

Gary Explains

1 day ago

Comments: 328
@ernestuz 6 months ago
The core of current AI algorithms is the 'multiply and accumulate' instruction, which even microcontrollers as cheap as the STM32F4 have. So many things can run AI nowadays; the major limitation is the amount of data that has to go through those multiply-accumulates. Originally desktop CPUs used double precision for floating-point calculations (let's call them FP80 and FP64, the number being their size in bits). That was too much for graphics, so graphics cards use FP32 and FP16. But neural networks are very resilient to precision loss, so people started to 'quantize' their models to 8 bits and lower; at 4 bits the models I have tried are still very strong (quantized from an original FP16 model). The state of the art today is models that use a single bit per weight, which don't need the multiply step at all. Algorithmic refinements alone are driving computing needs down. The fact is that you can run quantized models on a CPU at acceptable speeds; just don't choose an 80-billion-parameter one. A 3B-parameter LLM should run acceptably fast, producing tokens much faster than a person can type. Somebody wants the AI always turned on in your computer to learn everything about you; otherwise I can't see the reason.
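The quantization idea described in this comment can be sketched in a few lines of Python. This is a toy illustration of symmetric per-tensor int8 quantization (one common scheme), not the recipe any particular model uses:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: store int8 weights plus one FP scale."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)  # pretend FP32 layer weights
q, scale = quantize_int8(w)

# Memory drops 4x (int8 vs FP32); rounding error is bounded by half a quantization step.
print(q.nbytes, w.nbytes)                               # 65536 262144
print(np.abs(dequantize(q, scale) - w).max() <= scale)  # True
```

Going from 8 bits to 4 bits (or to the 1-bit weights the comment mentions) is the same trade: smaller steps between representable values in exchange for a fraction of the memory and bandwidth.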
@skyak4493 6 months ago
You nailed it! Microsoft's real motivation is to make people train their AI replacements!
@philippeferreiradesousa4524 6 months ago
Language models on device are memory-bandwidth limited. The AI moniker is really earned by moving to non-upgradeable on-package RAM with 136 GB/s of bandwidth instead of 50 GB/s.
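A back-of-the-envelope calculation shows why bandwidth is the ceiling this comment describes. The sketch below assumes (as a simplification) that every weight is streamed from RAM once per generated token, which gives an upper bound on tokens per second:

```python
def tokens_per_second(bandwidth_gb_s, params_billions, bytes_per_weight):
    """Rough upper bound: each generated token streams all model weights from RAM once."""
    model_bytes = params_billions * 1e9 * bytes_per_weight
    return bandwidth_gb_s * 1e9 / model_bytes

# A 3B-parameter model quantized to 1 byte per weight:
print(round(tokens_per_second(50, 3, 1), 1))   # ~16.7 tokens/s on 50 GB/s RAM
print(round(tokens_per_second(136, 3, 1), 1))  # ~45.3 tokens/s on 136 GB/s RAM
```

Note that compute (TOPS) doesn't appear at all: with weights this large relative to cache, moving to faster on-package RAM raises the ceiling in direct proportion, regardless of how fast the NPU is.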
@jordancave6987 6 months ago
Msft is gatekeeping AI features to encourage the masses to purchase new PCs, thereby helping out vendors and Win11 adoption. It's hidden in plain sight.
@POVwithRC 6 months ago
Bingo
@csteelecrs 6 months ago
Business 101
@runed0s86 6 months ago
These new PCs will also be constantly screen-shotting your screen and uploading data to microsoft... To sell to the highest bidder!
@LordHog 6 months ago
The question is: do we need Microsoft's Copilot+ PC? No, absolutely not.
@gaiustacitus4242 6 months ago
Software developers need one just to build software for the people who purchase them. Otherwise, your answer is correct.
@gaiustacitus4242 6 months ago
@@MadafakinRio The majority of people who purchase a Copilot+ PC will never realize any advantage over just using Copilot on a Windows 11 PC based on an Intel or AMD CPU. The small LLMs that Copilot+ PCs can run are of too low quality to generate good output, which means the majority of the workload is offloaded to a hosted AI (i.e., regular Copilot). Copilot+ requires an NPU to perform local AI processing, mainly for better support of the new Recall feature in Windows 11, and Recall will prove to be more hindrance than help.
@SPECTRA890 6 months ago
Exactly
@azeemuddinkhan923 6 months ago
@@gaiustacitus4242 Really? I find it amusing that you have assumed people won't find massive battery improvements useful.
@sinom 6 months ago
@azeemuddinkhan923 There haven't been any independent tests yet to find out if there are any battery improvements. Additionally, Copilot+ PC ≠ ARM PC. Most of the supposed battery life improvement is ARM vs x64, not "no Copilot+ vs Copilot+"; adding a simple NPU cannot magically improve battery life by a significant factor. Additionally, both AMD (with their CPUs in general) and Intel (with their e-cores) are making significant improvements in battery life under normal use, so how long this (still not actually confirmed) advantage of ARM chips lasts is debatable. Especially since for most apps ARM CPUs will require Prism, a translation layer, which may (nobody has independently tested it) or may not be less efficient than running those apps natively.
@owlmostdead9492 6 months ago
It's a good day for Linux
@od1sseas663 6 months ago
lol
@shanehebert396 6 months ago
Let me guess... "Year of Linux on the Desktop" yet again, like every one of the past 30? years?
@owlmostdead9492 6 months ago
​@@shanehebert396 Last time I checked "Recall" wasn't a thing 30, 20, 10 or 5 years back.
@od1sseas663 6 months ago
@@shanehebert396 2024 is SURELY the year of Linux THIS time!!!!! 😂😂😂😂😂
@toby9999 6 months ago
Why?
@johnsimon8457 6 months ago
As a desktop user: we have GPUs for just about everything, from drawing UI elements to rendering fonts. The CPU is used way less for web-browser tasks now than 20 years ago. I see NPUs as another kind of GPU: specialized hardware. But while the GPU is used constantly, I only see occasional use for NPU hardware today: facial detection, text prediction, etc. An office worker isn't going to be making constant use of an NPU. Maybe I don't have any imagination, but it feels like a lot of marginal uses and few major ones. The Apple photo app can detect flowers in a field and separate background from subject. It's a cool trick, and insane to see on a phone, but it's not like I use that functionality on a daily or even weekly basis.
@martineyles 6 months ago
I think the Minecraft demo is a situation where you might not want your GPU to do the AI work, as that might affect gameplay performance. Offloading to the NPU is perhaps helpful here, unless of course this demo does the AI work in the cloud. Recall was, after all, the only thing they said was definitely local.
@deth3021 6 months ago
TTS could be a big use case, and speech-to-text as well.
@univera1111 6 months ago
Hold up. To me an NPU is like a GPU within a GPU, or like an OS in a GPU using KVM. So I believe it will be faster and more efficient.
@deth3021 6 months ago
@univera1111 more like a mini TPU.
@gabrielgon3408 6 months ago
I make calls for work with that "follow me" front-facing camera feature on my Tab S8 Ultra. It consumes noticeably more battery. Maybe with a higher-performing NPU it could drain less.
@nathanaelsmith3553 6 months ago
I've been using the OpenAI API at work and am a bit underwhelmed. If I ask it the same question 10 times I get the correct answer about 8 times and sometimes completely daft answers for the remainder. This is currently the best I can get it after optimizing my prompts. Not very reliable. I personally don't consider LLMs to be AI because they don't actually understand anything. They just regurgitate correlated data.
@paulbarnett227 6 months ago
"They just regurgitate correlated data." - I didn't think of it like that but yes, you're right.
@andybarnard4575 6 months ago
But isn't that what my brain does?
@POVwithRC 6 months ago
Microsoft is in bed with hardware manufacturers who want to sell more hardware. What do you think the arbitrary leave-behinds in Windows 11 were meant to do? Sell more Intel and AMD product.
@AndersHass 5 months ago
Microsoft in bed with themselves to sell more Windows licenses to more hardware manufacturers, lol
@John-cn8jv 6 months ago
I'll just keep my 4-year-old laptop going. Microsoft can kiss my ***. Microsoft thinks what's good for their bottom line is what users need, even if they have to jam it down our throats. I think a Raspberry Pi is my next desktop. Linux doesn't make the demands MS does, and I prefer to choose my own hardware.
@KeyYUV 6 months ago
I wouldn't use AI baked in the OS anyways. It's good to know MS won't secretly enable AI via windows update for my desktop. If I want to use AI I'll get 3rd party apps.
@faisalrahman9236 6 months ago
Do you guys want the professor to resume the Speedtest-G benchmark? Would you donate so he can continue and arrange more devices for tests? If you agree, then like and say "yes".
@anb4351 6 months ago
I want the professor to start doing the SoC showdowns he used to do long ago on the Android Authority YT channel.
@martineyles 6 months ago
Which things is the NPU used for? Is it deciphering your request and responding, or is it analysing the screenshots for searchable content (demos show OCR functionality, but also being able to search for things like a brown leather bag in an image)? Perhaps some of these tasks require more computational power than others. If the recognition of objects in photographs were unbundled from the other features, perhaps it wouldn't need the NPU.
@alpha007org 6 months ago
Imagine grammarly or deepl translate, but running locally. This is a good use case for NPUs. But we need a huge algorithmic development to reduce hallucinations to 0.01%.
@chengong388 6 months ago
The reason is pretty simple really: Microsoft considers the desktop gaming PC market tiny and inconsequential. Microsoft doesn't care, and they know the gamers don't care either. 90% of normal people will be running a laptop, or something that doesn't have a 4090 in it, so nobody cares that your 4090 could technically run Copilot+ but isn't allowed to. They're trying to make this work for the 90%, where if this thing just ran on a laptop 4060 or something, it would kill the battery life for 90% of users.
@GaryExplains 6 months ago
The desktop gaming PC market is tiny and inconsequential? 🤦‍♂️
@andyH_England 6 months ago
I look forward to local LLMs on devices focused on specific tasks, like an encyclopedia with different volumes, such as a history or maths LLM. This way, we can load them individually into local RAM without needing the cloud. This could be monetised, where you buy/rent an LLM based on your current requirements. They can be updated. I assume focused LLMs would be inherently more accurate, and the package will be more suitable for non-cloud requirements.
@alpha007org 6 months ago
I'm running models locally. I had some success with Llama 8B q8 (I'm trying Qwen2 now) for RAG. Speed is insane, but it still hallucinates too much. If I can get ~4-10 GB always in memory, and an NPU with very low power usage... well, I hope we'll get this in the future.
@PuchoWebSolutions 5 months ago
Hello from New York City! But where can we find a list of application software that has been specifically programmed to take advantage of the NPU? Thank you for your videos.
@paulbarnett227 6 months ago
Thanks Gary. I've been wondering about this, can I run CoPilot+ on my 4080? Apparently not. It's a real shame and is obviously gatekeeping by Microsoft to push the new NPU equipped laptops. It would seem that they already have a CUDA build but will not release it to the masses.
@nyonkavincenttafeli7002 6 months ago
I'm waiting for the video that will really show me (us, as I'm sure I'm not alone) the day-to-day usefulness of that Copilot thing.
@OZtwo 2 months ago
Watch Star Trek TNG: the main system computer on the Enterprise is what Copilot will be. But I just hope we do not have to start an interaction with it by saying "Bing: ..."
@HakushinX68000 1 month ago
Microsoft is enhancing copilot to record every single thing you do on your PC and store it in the cloud. Absolutely awesome!
@soragranda 6 months ago
11:28 That is a shitty move to do to a partner... though, people will find a way to put something like copilot but using your gpu (I mean, maybe it doesn't make sense on laptops but on desktops, it does).
@electrodacus 6 months ago
In the new Lunar Lake SoC the CPU can do 5 TOPS (int8) while the NPU can do 48 TOPS, likely at lower power than the CPU and GPU. The GPU can also do 67 TOPS (int8). These are theoretical numbers, but they show the NPU has an order of magnitude better performance than the CPU, at lower power. The new AMD Strix Point APU claims an NPU with 50 TOPS that seems able to do block FP16 at the same performance as int8, which seems even more impressive. Its iGPU can do around 23 TOPS of FP16; I'm not sure if it can do int8 any faster, and it will surely use more power doing so than the NPU.
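To see why the NPU wins on efficiency even when the GPU's raw TOPS figure is higher, here is a toy perf-per-watt comparison. The TOPS figures come from the comment above; the wattages are illustrative assumptions for the sake of the arithmetic, not vendor numbers:

```python
# (TOPS, assumed watts) — power figures are hypothetical, chosen only to illustrate the ratio.
units = {
    "CPU (int8)": (5, 15.0),
    "NPU (int8)": (48, 2.0),
    "GPU (int8)": (67, 25.0),
}

for name, (tops, watts) in units.items():
    print(f"{name}: {tops / watts:.1f} TOPS/W")
```

Even if the real wattages differ, the shape of the result holds: the GPU can post a bigger absolute TOPS number while the NPU delivers several times more TOPS per watt, which is what matters on battery.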
@andybarnard4575 6 months ago
Let's use the NPU as a CPU then. 2025 will be the year of the 8-bit processor. Bring it on.
@electrodacus 6 months ago
@@andybarnard4575 It may be that just 4-bit or even 1-bit is needed for effective AI. Due to their simplicity and extreme parallelism they can just do much more computation than traditional CPUs. So maybe 2025 is not the year the CPU becomes irrelevant, but it may not be that many years from now.
@madorsey077 6 months ago
That was my overwhelming question when seeing the MS announcements: why?
@B.Ch3rry 6 months ago
All I want is for Windows (Microsoft) to be natively supported on Apple M-Series. BootCamp spoiled me with the ability to dual boot/choose based on my needs.
@dansanger5340 6 months ago
The way to look at it is that NPUs are efficiency cores for the GPU. Since Copilot+ is so far laptop-only, it makes sense to have efficient NPUs. The other thing that makes sense is to push an industry standard for consumer AI hardware, if we ever want to be able to do AI on something other than an Nvidia GPU. Right now, if you want to do AI on Windows or Linux you have to jump through hoops if you want to run on anything other than Nvidia. I'm glad Microsoft is doing something to address the de facto Nvidia monopoly on AI hardware.
@Winnetou17 6 months ago
But it's insane that they're pushing it at the OS level. Which is the same OS for desktops too. Since when does "PC" mean "laptop only" ? Oh, wait, we're speaking about Microsoft here. I rest my case.
@dallasgrful 6 months ago
10:30 explains the requirements for new PCs only. Lots of people won't need these computers.
@PragmaticTornado 5 months ago
It's funny that my desktop with an RTX 4090 would absolutely smoke all of these bespoke NPUs. But for greed and marketing reasons, that probably won't happen. Though I'm the kind of guy who uses GPedit/RegEdit to disable as much telemetry as possible, so a feature that takes regular snapshots of your actual screen probably isn't for me. It's a new AI-powered world of telemetry and privacy concerns.
@KevinInPhoenix 5 months ago
Doesn't BitNet b1.58 make all these hardware requirements go away, since it does not use matrix multiplication of FP16 numbers? It will be interesting to see what upheaval happens in the AI space in the coming year.
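The BitNet b1.58 idea this comment refers to constrains weights to {-1, 0, +1}, so the weight side of a matrix product needs only adds and subtracts. A toy sketch of the principle (not the actual BitNet kernel or quantization procedure):

```python
import numpy as np

def ternary_matvec(w_ternary, x):
    """Matrix-vector product with weights in {-1, 0, +1}: only adds and subtracts."""
    out = np.zeros(w_ternary.shape[0], dtype=x.dtype)
    for i, row in enumerate(w_ternary):
        # Sum inputs where the weight is +1, subtract where it is -1 — no multiplies.
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

rng = np.random.default_rng(1)
w = rng.integers(-1, 2, size=(4, 8))      # ternary weights, ~1.58 bits of information each
x = rng.normal(size=8)

# Matches the ordinary matmul exactly, with no multiplies in the weight path.
assert np.allclose(ternary_matvec(w, x), w @ x)
```

Whether this removes the need for NPU-class hardware is the open question the comment raises: the multiplies disappear, but the memory traffic and accumulate throughput remain.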
@nfineon 6 months ago
Copilot+ PC (stupid naming) is a hard no; I already switched to Ubuntu Linux and it works perfectly for 99% of everything I need. The next-generation CPUs are going all-in on AI gimmicks too fast; in fact only a small fraction of us will use these features. On the latest Intel Lunar Lake, for example, the die space taken by the neural engine is larger than the entire E-core complex! FFS, I would rather take more compute or cache than dedicate 8 cores' worth of space just to AI neural processing, which is beyond niche at the moment.
@msromike123 3 months ago
I think Microsoft is being pretty forward-thinking by not introducing technology without putting some sanity checks on how much energy it will consume. It would be a bad business move to spend significant amounts of money on a technology that would be unusable by many people because it required 250 W of power even at idle.
@Matrix1Gamer 3 months ago
Integrating NPUs (Neural Processing Units) directly onto processors seems like a logical step forward to enhance AI capabilities at the consumer level. Providing both CPU and NPU technologies within devices empowers users to execute AI tasks locally, offering benefits in terms of privacy, speed, and responsiveness. This approach could lead to exciting advancements in AI applications and a more personalized user experience across various devices. However, these RTX GPUs are amazing at handling AI tasks.
@thedevincrutcher 6 months ago
The usefulness of the NPU for the average Windows user is highly unknown. We don't know how people will feel about these new features, but we know automatic screenshots creep people out in ways that log files didn't. Yet considerations remain:
1. Most laptops only have 8 GB of VRAM and the models are larger. It makes more sense to have the NPU share main memory, especially for non-gaming laptops.
2. x86 PCs have mostly not kept up with Apple Silicon in terms of battery life.
3. The NPU allows for AI use cases requiring low latency, like background removal or noise isolation. These are mostly useful for video streaming, but difficult to justify for asynchronous processing requirements.
4. Latency is different from speed, and most users probably won't grasp that nuance. Potentially a difficult sell.
5. Most laptop users don't care where the AI processing happens as long as they have good battery life. The privacy aspects are likely lost on most users.
Remember, the people designing and marketing these systems are deeply technical. They often cannot grasp the end user's perspective or adapt to it quickly enough.
@Bareego 6 months ago
If you run a large language model all the time, your PC will waste 8 GB of RAM all the time. RAM manufacturers are going to love this. The only hope for low-level access might be the upcoming RISC-V chips, although they still have a lot of catching up to do speed-wise. The whole Copilot implementation just seems like they had a solution matching the buzzword "AI" and looked for problems to throw it at.
@jeczsz 6 months ago
But for desktops it is OK to run LLMs or SLMs on graphics. You should also be able to run AI on graphics in laptops when they are plugged into a wall outlet.
@GaryExplains 6 months ago
Exactly.
@dave24-73 6 months ago
Million-dollar question: will Microsoft lock down the latest versions of Windows, saying you need an NPU or you can't install it? Looking at Windows 11, this is exactly what they did with TPM, and later with minimum CPU requirements.
@gaiustacitus4242 6 months ago
You can be certain that is precisely what Microsoft will do.
@gaiustacitus4242 6 months ago
@@MadafakinRio It didn't take Microsoft 10 years to make obsolete more than 50% of PCs in use by its customer base when requiring a TPM to run Windows 11. Microsoft doesn't care one whit about its customers who aren't generating new revenue for the company. It must make older hardware obsolete in order to drive industry support for the company's product line. It's just the nature of business.
@dave24-73 6 months ago
@@MadafakinRio I’m not so sure, ARM is coming, and AI is happening much faster, 10 years the ship will have already sailed.
@vasudevmenon2496 6 months ago
Jensen announced they will add a new API that adds the Copilot runtime layer to Nvidia GPUs, visible as an NPU in Task Manager, and he says it's much faster than current SoCs, even with older RTX GPUs, via a driver update.
@TamasKiss-yk4st 6 months ago
But it still uses 200-450 W for that, while the NPU does the same with 1-3 W of power consumption. (For example, the iPhone has an NPU with 35 TOPS, but it doesn't drain the battery in 5 minutes; it remains usable for hours. A GPU can't work from a tiny battery.)
@vasudevmenon2496 6 months ago
@@TamasKiss-yk4st I never talked about performance on battery. Even Nvidia doesn't. It's great to see SoC NPUs on par with PCs, especially their perf per watt with a small compute tile.
@paulbarnett227 6 months ago
@@TamasKiss-yk4st On a desktop system battery is irrelevant. I hope NVidia do what Jensen appears to be saying.
@Quantum_Nebula 3 months ago
NPUs will likely be what upscales our games and runs local LLMs... so I'd say they've got uses.
@SuperFredAZ 6 months ago
Gary, you always explain everything so well, thanks!
@fuseteam 6 months ago
I'd like to interject what you refer to as "pc" is actually copilot+ pc or as i've recently taken to calling it "copilot" you see everything makes use of copilot, the pc is just the vehicle with which copilot is delivered to you. Copilot is delivered to you through many means the edge browser, your cloud storage, you office tools, the bing website, github and more. The pc is just one mean out of many. It's in the it's all just copilot, just c-co-copilot, just copilot
@Winnetou17 6 months ago
I don't want to imagine the seizure that Richard Stallman will have when finding out about these requirements. Hope he won't. But if I was in his place, I certainly could.
@fuseteam 6 months ago
@@Winnetou17 ikr lmao
@davidbayliss3789 6 months ago
I think Microsoft will extend to GPUs/CPUs eventually. They want to start somewhere, and get people to buy new stuff for Windows 11 lol... Overall I think they want to reduce the amount of hardware they'll need to support. But after collecting a ton of data on NPU use with Copilot+, I can't see why they wouldn't start adding switches so you can enable it to run on GPUs/CPUs too. I'm not too clued-up on NVIDIA's NIM (might have got that name wrong) container thing yet... I don't know if it'll apply, but I thought of it as a new potential abstraction layer, at least in terms of NVIDIA GPU AI, and maybe that could interface with Copilot+ and de-risk GPU use for Microsoft a bit.
@yookoala 3 months ago
Maybe Ollama will be able to run workloads on NPU (much like it can run on GPU). And perhaps it can be optional.
@simonabunker 6 months ago
Haven't most major releases of Windows enforced minimum hardware requirements? Mostly to drive sales of new hardware. If you are being very generous, you could say it is to future-proof your computer.
@jaybestemployee 6 months ago
Let's say any development of specialized computing hardware (the NPU in this case) needs a large initial investment. The case for the NPU is power efficiency for "AI" applications. The benefit you gain from an NPU varies with how dependent you are on those applications, which can be not very much for a lot of people. However, MS is pushing this new piece of hardware as a new standard so it can be sold at scale, possibly triggering a healthy upward cycle of NPU hardware investment and consumption. Whether the hardware is gatekept won't matter much for long, because people are usually smart enough to hack their way into exploiting what they've got. So the decision on whether to buy an NPU is whether you want to help the development of a more power-efficient piece of hardware than the GPU, and make each type of hardware (CPU, GPU, NPU) do its best in parallel. As NPU sales scale up, I guess NPU development may also come to include local model training and such for private AI applications, again depending on demand. Diverting such workloads from the GPU would be a main goal of these NPU initiatives, because Nvidia has been controlling the GPU market for AI for some time and the other big players are not very happy, I can tell.
@stevemilchuck9241 6 months ago
The question is: do we even need Copilot? Other than a handful of times playing with it, I have absolutely no use for it.
@CTimmerman 6 months ago
Sounds like still adding more features to CISC. RISC already should be fine for matrix multiplications using its many cores.
@GaryExplains 6 months ago
What is the connection between RISC, matrix multiplication, and cores? I am confused?
@CTimmerman 6 months ago
RISC cores are simpler/smaller than CISC cores, so use less power per cycle, in exchange for larger executables which are not a problem with advances in storage. CUDA cores are probably even simpler/smaller, but afaik limited to expensive VRAM.
@GaryExplains 6 months ago
I think your understanding of RISC is stuck in the 1980s.... Things have moved on since then.
@CTimmerman 6 months ago
Turns out RISC processors like ARM's are taking over laptops like Apple's now, after mobile. And CUDA is RISC as well but with extra features. Even CISC CPUs use microcode to feed RISC cores now.
@iscariotproject 6 months ago
No, I don't need a Clippy on steroids annoying me... Microsoft, we know you had a traumatic event with Microsoft Bob failing and you have tried to bring it back over and over... LET IT GO
@brulsmurf 6 months ago
Looking at your screen makes me worried. Are you sure you're on the right track? Let me show you a better way to do things
@RafaCoringaProducoes 6 months ago
microsoft bob, i can see you are a person of culture as well
@skyak4493 6 months ago
Clippy II - Clippy's Revenge! Humanity is forced to train its AI replacement!
@AndersHass 5 months ago
Copilot+ specifically needs a powerful enough NPU. Intel and AMD already have CPUs with NPUs in the same package, but they aren't that powerful compared to Qualcomm's latest; both are working on ones that will be powerful enough for Copilot+.
@mikldude9376 6 months ago
Good video Gary. My guess is the one common denominator, as always, is money. My bet is that when Microsoft says you need an NPU to do it Microsoft's way, they probably mean you need a Microsoft-approved NPU. They have structured their AI as a walled garden where others cannot play... unless they pay $$$$$$$. Computers and technology may change, but human greed is always the same :).
@commentarytalk1446 6 months ago
Presumably Apple's NPU will be more about OS-AI stuff (there's a list online) eg modify this photo, set my reminder and reschedule my calendar etc all baked in locally? Whereas as said the productivity app AIs still need the cloud for live service of information manipulation eg Dame Sally Markham Compositions: "How many pages Ms. ChatGPT? Wake me up when you've finished."
@ThePowerLover 6 months ago
@gr-os4gd They have had NPUs since the A11 Bionic.
@zedpassway4140 26 days ago
Depends on whether the software you use primarily uses the NPU, the CPU or the GPU. There is no one answer that fits all.
@Big_Yin 6 months ago
I'd rather use the NPU to improve graphics or frame-rate performance. I have the same opinion of AI as of crypto and NFTs.
@robertlawrence9000 6 months ago
So in other words, we don't need an NPU and it would be better to have that space used for more graphics processing. It makes sense to me.
@Big_Yin 6 months ago
@robertlawrence9000 No, we're currently using neural processing units for tacky, gimmicky applications instead of using them to enhance the CPU and GPU, in what I would hope would eventually fix the problems with CrossFire/SLI, using NPUs to bridge that gap.
@abbe9641 6 months ago
NPUs are in the CPU; there would be massive latency penalties from doing it this way with the GPU. The best place to utilize AI hardware acceleration is on the GPU die itself.
@GaryExplains 6 months ago
I think you are confusing the terms CPU/GPU etc. The NPU isn't in the CPU; it is in the same chip or processor, but not IN the CPU. They are separate. There are no latency penalties using the GPU; if there were, we wouldn't have high-FPS games in 4K!
@martineyles 6 months ago
I gather the NPU does similar calculations to Tensor Cores. Therefore it might be possible to have them do something like DLSS. Microsoft did announce an upscaling technology, so I wonder whether this is run by the GPU part of a mobile SoC, or is farmed out to the NPU.
@martineyles 6 months ago
The NPU could be like the floating-point coprocessor: something that was often separate from the CPU but eventually died out because all processors did enough of that type of processing internally. However, there are also things like SIMD, which as far as I understand is the bread and butter of GPUs and got CPU implementations (e.g. MMX), yet these did not replace GPUs.
@skyak4493 6 months ago
But NPUs are 4-bit, maybe 8-bit. That is why they take less power than GPUs. All the matrix math I would want to do is high-precision simulation of physics.
@dumnthum 6 months ago
To test it, maybe someone can run a local model in LM Studio with an eGPU. Phi mini is available there.
@shamim64 6 months ago
If it needs to run the AI model at full load 24/7, then it definitely needs the NPU. If you want to try running an LLM on your computer, try Ollama. It can use the CPU and GPU.
@TimothyChapman 6 months ago
This is definitely cause for suspicion. There is no reason to lock the user out of these NPUs. The good news is that the NPU itself probably just does all of the neural processing. The bad news is what the OS running on the CPU probably does with the NPU's output...
@abhiramshibu 6 months ago
But what if you added an external NPU? There are a lot of NPUs available on the market.
@gaiustacitus4242 6 months ago
If NPU performance were so much better than a GPU for AI, then nVidia wouldn't be selling so many GPUs for AI processing. Neither an NPU nor a GPU provides any benefit whatsoever for running a cloud-based AI. However, if you want to run a large LLM locally, then you need far more RAM and several dozen GPU cores in addition to an NPU. There is not even one Copilot+ PC that is suitable for running 70+ billion parameter LLMs locally, and the smaller LLMs do not generate output with very good accuracy. At present, if you want to run very large LLMs with an acceptable degree of responsiveness, then you need:
1. For a Windows environment - a desktop equipped with a powerful Intel or AMD CPU, 128+ GB RAM, and an nVidia 4090 (or better).
2. For a macOS environment - either a Mac Studio M2 Ultra (or wait for the M4 Ultra in 2025) or a MacBook Pro M3 Max, and I recommend maximum CPU/GPU, RAM and storage.
3. For a Linux environment - #1 or #2 above.
Having a local AI with 70+ billion parameter LLMs is the only way that a business should be using AI. Likewise, any responsible person should be running a local AI to protect their own intellectual property or personal information.
@hallkbrdz 6 months ago
Microsoft is finally going to make desktop Linux take off next year as W10 expires. While my workstation with an NVIDIA 4000 (SolidWorks) meets the specs, the CPU doesn't have TPM 2.0, so I won't be upgrading. The next workstation will run Linux; I don't like being told what I have to have for their software.
@KAZVorpal 6 months ago
You can run a pretrained large language model just about as good as ChatGPT on your local computer today, without an NPU. The reason they want to force you to run it in the cloud is so they can control you. They want control of what you are allowed to ask and what information you are allowed to have. When you run an LLM on your own computer, you can choose one that is uncensored, that will give you accurate information.
@GaryExplains 6 months ago
Yes you can run LLMs on your local computer, I have several videos doing exactly that, but I don't think you can say "just about as good as ChatGPT".
@KAZVorpal 6 months ago
@@GaryExplains Llama 3 is just about as good as ChatGPT. Not quite, but just about.
@gaiustacitus4242 6 months ago
The real reason they want to run the AI in the cloud is to gain access to your data. They've run into a wall with training AI on information available on the Internet. Further advances need access to proprietary intellectual property, and they expect end users will be stupid enough to submit it - and users will do so. I'm 100% with you when it comes to running AI. It should be performed locally.
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
@@GaryExplains There are 70+ billion parameter LLMs you can run locally which generate output on par with that generated by ChatGPT. Of course, this assumes that you have the high-end hardware to run them...which most people do not.
@GaryExplains
@GaryExplains 6 ай бұрын
Llama 3 in its full size is getting closer to ChatGPT, yes, but on a PC you run a quantized version, which isn't as good by a long way.
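The quantization loss being discussed can be illustrated with a toy sketch. This is plain uniform symmetric rounding, far cruder than the grouped K-quant schemes llama.cpp actually uses, but the trend is the same:

```python
import random

def fake_quantize(weights, bits):
    """Round each weight to one of 2**(bits-1)-1 symmetric levels and map back to floats."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

random.seed(0)
w = [random.gauss(0, 1) for _ in range(1000)]

mses = {}
for bits in (8, 4, 2):
    wq = fake_quantize(w, bits)
    mses[bits] = sum((a - b) ** 2 for a, b in zip(w, wq)) / len(w)
    print(f"{bits}-bit: reconstruction MSE {mses[bits]:.6f}")
```

The reconstruction error grows sharply as the bit width shrinks, which is roughly why heavily quantized models start to degrade noticeably.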
@aleksandardjurovic9203
@aleksandardjurovic9203 6 ай бұрын
I totally agree. Thank you for the video!
@abbe9641
@abbe9641 6 ай бұрын
As a Linux user I am so happy not to have to care about NPUs.
@youcantata
@youcantata 6 ай бұрын
Someone will come up with an "NPU emulation layer" DLL for Windows that enables Copilot+ PC capability on PCs with no NPU, running instead on an NVIDIA GPU (less efficient: more power draw, but faster) or on a multi-core CPU (more power draw and slower).
@D.u.d.e.r
@D.u.d.e.r 6 ай бұрын
Thx Gary for summarizing the whole show 👍 so I don't have to watch it in full, which would definitely be more painful, as M$'s shows are probably the least interesting and entertaining of all the BIG players'. Seeing M$ limit their Windows OS by NPU is not surprising at all; it's actually expected as the "M$ standard", besides making Windows a more bloated, data-collecting "spyware" OS that is notches slower and less responsive. All this AI buzz around an assistant helping with our daily tasks will succeed only if it's well implemented and optimized, and I have a feeling Windows will not have the best solution in town, even though it's quite ahead of the AI game thanks to close collaboration with OpenAI and early adoption into its Azure cloud infrastructure. Still, M$ can learn a hell of a lot from Copilot+; whoever implements the key features in the most logical, intuitive way and convinces users to actively use them over the traditional approach will be the real winner of the AI game.
@DrB934
@DrB934 6 ай бұрын
I would rather run my own LLM and put a big-arse GPU or two in the system.
@noticing33
@noticing33 6 ай бұрын
Can I ask Copilot to gather a bunch of data from various sources and put it into paragraphs on a certain topic?
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
You can certainly do so, but you likely will not care for the generated output.
@noticing33
@noticing33 6 ай бұрын
@@gaiustacitus4242 🤣🤣😢
@tpadilha84
@tpadilha84 5 ай бұрын
I bet soon we will see an open-source version of Copilot+ that not only runs on existing laptops with a GPU, but also across different OSes.
@andikunar7183
@andikunar7183 6 ай бұрын
While I totally agree with you in general (CPU/GPU/NPU), you forgot to mention that during LLM/SLM inference, token generation (non-batched) is mainly limited by memory bandwidth. Pumping all the billions of parameters from RAM to the SoC caches for every token-generation calculation really matters. This is why the M2/M3 Max and M2 Ultra, with their large, wide, high-bandwidth RAM, do not do too badly versus the much, much more performant 4090 during pure token generation (see the llama.cpp performance comparisons). But this would have been too complicated for Microsoft to communicate… And Snapdragon X is supposed to do around 130 GB/s, between the M3 and M3 Pro.
@GaryExplains
@GaryExplains 6 ай бұрын
True, but there is more to local AI stuff than LLMs.
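The bandwidth argument in this thread reduces to simple division: non-batched generation has to stream every weight once per token, so memory bandwidth divided by model size gives a hard ceiling on tokens per second. The figures below are illustrative, not measured (only the ~130 GB/s Snapdragon X number comes from the comment above):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound: each generated token requires one full pass over the weights."""
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 4.0  # e.g. a ~7B-parameter model quantized to roughly 4 bits per weight
for name, bw in [("dual-channel DDR5 (assumed)", 80.0),
                 ("Snapdragon X (claimed)", 130.0),
                 ("M2 Ultra", 800.0)]:
    print(f"{name} ({bw:.0f} GB/s): <= {max_tokens_per_sec(bw, MODEL_GB):.1f} tokens/s")
```

Compute (CPU vs GPU vs NPU) barely appears in this bound, which is why memory systems matter so much for local LLMs.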
@redacted629
@redacted629 6 ай бұрын
All Information... Asinine Intrusion... Annoying Interactions... for a problem nobody identified comes a solution no one needs.
@BriefNerdOriginal
@BriefNerdOriginal 6 ай бұрын
Oh no, now I'm so sad to discover that I cannot run the fantastic Recall feature on my current laptop...
@stevemilchuck9241
@stevemilchuck9241 6 ай бұрын
Speed and efficiency are relative to what you are trying to accomplish. When it comes to whoring out my data, I'm certain Microsoft is looking for more speed and efficiency in getting that information to the cloud so they can digest it in a larger model.
@RAVANAZAR
@RAVANAZAR 6 ай бұрын
Nope. Don't want Cortana or AI anywhere near my PC.
@lekejoshua4402
@lekejoshua4402 6 ай бұрын
Donkey 😂😂
@RUHappyATM
@RUHappyATM 6 ай бұрын
All I want to know is whether it can help me write my thesis on Before the Big Bang, the 3B's! Oh, and correct my grammar.
@skyak4493
@skyak4493 6 ай бұрын
Sure! Hallucinating theory that sounds good is its specialty! As long as there is no possible way to check for truth, it's golden!
@HydrasHead
@HydrasHead 6 ай бұрын
They should at least give us the option to run those features on whatever hardware we want.
@Pushing_Pixels
@Pushing_Pixels 6 ай бұрын
As though Win11 needed any more spyware. Apart from basic chatbot stuff, I'm not interested in any AI programs I can't run locally.
@xperiafan5370
@xperiafan5370 6 ай бұрын
2:39 That's what it's all about. Efficiency.
@GaryExplains
@GaryExplains 6 ай бұрын
14:34 But has that been proved?
@xperiafan5370
@xperiafan5370 6 ай бұрын
@@GaryExplains It hasn't been disproven either, has it? All of these big CPU companies are pushing for them, which means you can't rule out that NPUs have efficiency benefits over CPUs and GPUs for ML inferencing. And we will be getting answers to our questions in about two weeks' time, so there's no need to declare NPUs unnecessary before even getting to use them.
@GaryExplains
@GaryExplains 6 ай бұрын
If the efficiency is proved to be an actual thing, are the gains sufficiently high to warrant mandating their use (which means millions of PCs become obsolete and we all need to spend loads of money buying new PCs)? Also, as a side note, all of these big CPU companies are only pushing for them because Microsoft is insisting on it, and they have to comply so they don't get left behind in the race to add the word "AI" to every product.
@robertlawrence9000
@robertlawrence9000 6 ай бұрын
I don't want something spying on and logging everything on my PC. Efficiency or not, we really don't need an NPU.
@GaryExplains
@GaryExplains 6 ай бұрын
@robertlawrence9000 But the point of my video (I thought) was that all these AI features can run without an NPU. So I guess you mean you don't want a Copilot+ PC, not specifically an NPU.
@NoName-zf6nr
@NoName-zf6nr 6 ай бұрын
To test ML on NVIDIA GPUs, install the CUDA, cuDNN, and TensorRT libraries. Choosing a good combination of these libraries alone can be responsible for a speed factor of ~6, as I have experienced. --robert jasiek
@GaryExplains
@GaryExplains 6 ай бұрын
Testing ML on NVIDIA GPUs isn't the problem; as you say, technology like CUDA is well known and very mature. The problem is recreating those tests on the NPUs in a Copilot+ PC.
@AndersHass
@AndersHass 5 ай бұрын
It is a shame Microsoft won't allow CPUs and GPUs (and less powerful NPUs) to run these models locally. I can understand they want to push for powerful NPUs in order to get proper speed and efficiency, but it would still be nice to have the option of running the models locally regardless.
@MakeKasprzak
@MakeKasprzak 6 ай бұрын
"Year of the Linux Desktop" jokes aside, I would genuinely love some spare highly power efficient matrix math silicon in my laptop. In practice it may only support 4-bit or 8-bit data (at most 16-bit floating point data), but thats still useful, just not as general purpose as a CPU/GPU
@skyak4493
@skyak4493 6 ай бұрын
What else uses such low precision matrix math?
@LA-MJ
@LA-MJ 6 ай бұрын
AMD has had NPU-enabled laptop SKUs for at least a year already.
@unvergebeneid
@unvergebeneid 6 ай бұрын
Microsoft's artificial barriers aside, I'm super unclear how things are supposed to work in PCs with a powerful NPU and a powerful GPU, both being great for inference. Since they're using completely separate memory, I don't see how they can complement each other, so probably the NPU will just go to waste, right?
@ps3301
@ps3301 6 ай бұрын
They want u to waste money to buy a new laptop!! If u r an idiot, u will waste your money earlier
@Zigg-d5d
@Zigg-d5d 6 ай бұрын
OK, do you really think there's anything to this whole "machine learning" business? I know nothing about it myself, but it can do what ChatGPT themselves call "hallucinate" (I think they probably coined that word themselves and "shipped it" out to journalists), i.e. IT CAN GET IT *WRONG* SOMETIMES! What's the point if it can get things wrong sometimes, with no inkling that it has done so?
@DK-ox7ze
@DK-ox7ze 6 ай бұрын
Microsoft doesn't gain anything by restricting Copilot+ PCs to CPUs with NPUs, as they would want as much penetration as possible for the new features, so maybe NPUs really are more efficient. Though one reason they might want to restrict penetration is the availability of data centers (GPUs) to run LLMs for hundreds of millions of users. While the LLMs run in the cloud and have nothing to do with client hardware, restricting the number of eligible users lets them serve everyone efficiently.
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
Microsoft isn't introducing a new OS. The Copilot+ PCs run Windows 11. The local AI features will only run on a Copilot+ PC. You can already run Copilot on a Windows 11 PC, where all AI processing is done in the cloud. It is the ability to run part of the AI locally that earns the Copilot+ qualification, and this requires an NPU that is currently only supported on PCs based on Qualcomm's Oryon cores (i.e., Snapdragon X and Snapdragon X Elite).
@paulbarnett227
@paulbarnett227 6 ай бұрын
They gain sales of the latest line of Surface products. It's about money.
@berrywin
@berrywin 6 ай бұрын
I don't get the AI hype! A TV station asked an AI "How many legs has an elephant?" and got the answer two (2) legs. I asked ChatGPT how long it would take to reach the nearest star beyond our Sun travelling at 70,000 km/h. It got the star right (Proxima Centauri), but the answer was off by a factor of 1,000!
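For reference, the correct answer to the travel-time question is a one-liner to check (taking Proxima Centauri at roughly 4.25 light-years):

```python
LY_KM = 9.461e12            # kilometres per light-year
distance_km = 4.25 * LY_KM  # Proxima Centauri, ~4.25 ly
speed_kmh = 70_000

years = distance_km / speed_kmh / (24 * 365.25)
print(f"~{years:,.0f} years")  # on the order of 65,000 years
```

An answer around 65 years instead of ~65,000 would be exactly the factor-of-1,000 error the comment describes.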
@GaryExplains
@GaryExplains 6 ай бұрын
I just asked Gemini about the elephant and it said, "An elephant has four legs. There might be some confusion due to optical illusions depicting elephants with more legs, but real elephants definitively have four legs." 🤷‍♂️
@longboardfella5306
@longboardfella5306 6 ай бұрын
It’s not an answering engine; you’re using it wrong. Try dialoguing with it, using it to test hypotheses, or summarising a complex document. You will THEN find what an LLM can do.
@robertlawrence9000
@robertlawrence9000 6 ай бұрын
They get things wrong all the time. I never trust it. It only takes 1 minor detail to mess up the credibility of the results.
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
@@longboardfella5306 Many of the LLMs have been trained to have a specific political viewpoint. When you ask it questions to which the correct answer is contrary to the political agenda, it responds with insults and recommends that you educate yourself. Only by continuing the "dialog" and backing the AI into a corner by using logic and irrefutable facts will the AI eventually concede that you are correct and provide a proper response.
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
@@robertlawrence9000 Yes, AI has a tendency to make up "facts" which are completely false or court cases which never occurred to support its generated output. This behavior by AI is referred to as hallucination.
@nifftbatuff676
@nifftbatuff676 6 ай бұрын
Technology stopped being a means of solving problems long ago. Now it is just a way to make your life more miserable. The whole economy is based on creating new problems instead of solving them.
@isiahfriedlander5559
@isiahfriedlander5559 6 ай бұрын
Lunar Lake, state of the art X86, the architecture of real work.
@andrew007s
@andrew007s 6 ай бұрын
Excellent video. Microsoft is in the business of selling new installs of Windows. It's a win-win to sell folks whole new laptops, because... why not? Shareholders stay happy and jobs increase. Haha
@xspydazx
@xspydazx 6 ай бұрын
Can models be run on just CPU and RAM, or are we forced by the tools, which are optimized for GPU over CPU? NVIDIA Tesla cards have a thousand GPU threads inside, so with 10 cards you have a massive CUDA machine, and in such setups the GPU overpowers the CPU/RAM combo. People are being pushed to the cloud because the tools are built for GPUs, so we are forced to invest in GPU power, and yet VRAM seems to be the real limit, not the GPU itself, so we are already going in the wrong direction. With these new chips we will also be forced back to the cloud if we don't adopt them. The machine locks up when the GPU is used intensely, plus there are heat issues, hence the case for such a chip. In truth we don't want to keep being forced into upgrades, or into toolmakers pandering to these chip makers. My CPU/RAM combo is stronger than my GPU/VRAM combo, so we should be able to choose how we allocate resources. That is the future we should be seeking: the ability to divert all power to shields, so to speak. Full control over which device is used for computation and power consumption is what we actually need: resource control, not forced choices. (And Triton does not exist on Windows, a great downfall for home fine-tuners like me.)
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
AI models require far more processing power than a system without a GPU and/or an NPU can provide. Theoretically it could be done, but the performance would be so poor that no one would waste their time using it.
@xspydazx
@xspydazx 6 ай бұрын
​@@gaiustacitus4242 Hmm, I'm not sure. Right now it is possible to quantize a model heavily and load it on a GPU-less machine, on CPU only, without CUDA or anything like that. Today these models can even run on tablets and modern Android systems with minimal GPUs. I think all we need is more universal tools; after all, it's only a neural network, nothing special, just some heavy number crunching. This reflects the decline in investment in CPU architecture and the corresponding rise of the GPU, which has surpassed CPU technology; the lucrative nature of GPUs keeps the technologies separate, with NVIDIA not encroaching on Intel's or AMD's domain, which are basically monopolizing the market. The frontier labs are clearly ahead of what they release publicly; Karpathy has suggested the internal models are far more powerful than what is offered publicly, and capabilities are released in stages. We have reached the point where only computing power, not coding ability, holds people back; the models are already in the Hugging Face Transformers library, so in a sense everything is open source, and making a 500B model is just a matter of combining five 100B models as an MoE! We are only limited by the cloud we can afford or the machines we can build, hence these upgrades. At one point our personal computers surpassed mainframes while NASA was still using tape drives. By today's expenditure and the operational calculations required to plot trajectories, NASA should have been the biggest cloud provider in the world; instead it is a consumer! Upgrades are moving too fast; we are in James Bond... oops, Star Trek (nearly).
@gaiustacitus4242
@gaiustacitus4242 6 ай бұрын
@@xspydazx Small LLMs can run on mobile devices or computers without an NPU or powerful GPU, but these models have accuracy rates which I find unacceptable. As users come to realize the limits, they too will find the output unacceptable.
@xspydazx
@xspydazx 6 ай бұрын
@@gaiustacitus4242 I think you'll find that to make these models effective they need to be trained with quantization in mind, so that when the full-precision model is saved, the reduction process doesn't lose so much. I have quantized down to Q2 and the results were effectively lossless. Somebody quantized one of my models to an exotic format but I couldn't even load it to check whether it was good. It's about how you create the model: for mobile phones it needs to be trained specifically for the device, and then it will be highly effective. Some new form of LoRA/QLoRA will be required in training before it becomes a high performer. I also adjust the temperature during training: if the accuracy doesn't reach the desired rate, I lower the temperature to gain a bit more accuracy on the latent samples. It's about technique: open-source models for public display and sharing, and private models with special skills... wink wink!
@victorc777
@victorc777 6 ай бұрын
Before finishing the video, I'm going to go ahead and say that I'm not interested in Windows at all if they keep shoving ads into the OS and don't start taking our privacy more seriously. I'm a Mac user, but I have to use Windows for most of the applications I need for my work/business; when I need it, I spin up a VM on my Unraid server. I was Windows-only just two years ago, but Win11 has turned into an ad-streaming OS. A bit hyperbolic, but my opinion remains.
@retroheadstuff8554
@retroheadstuff8554 6 ай бұрын
Microsoft please stop making e-waste! 💻🖥💻🖥💻🖥💻
@churblefurbles
@churblefurbles 6 ай бұрын
They really don't like to mention how the NPUs compare to modern GPUs.
@JynxedKoma
@JynxedKoma 6 ай бұрын
NVIDIA and Microsoft are already working together to allow AI acceleration using the RTX 40 series, thus negating the "need" for an NPU-powered laptop. So no, you do not NEED an NPU, just a 40-series RTX GPU. The 4080 alone has 1,000+ TOPS.
@GaryExplains
@GaryExplains 6 ай бұрын
If you saw the official quote from NVIDIA that I included in the video, you need an NPU. Is there an official announcement about using the GPU?
@JynxedKoma
@JynxedKoma 6 ай бұрын
@@GaryExplains I believe so, as someone covering the AMD Computex livestream informed me such an announcement was made between the two.
@GaryExplains
@GaryExplains 6 ай бұрын
Yeah, I think there is some confusion about that announcement. The words "Copilot" and "Copilot+ PC" seem to get misheard, etc. Yes, there will be Copilot+ PCs with NVIDIA GPUs, but that doesn't mean without an NPU. If someone knows the exact time index of that statement in the keynote, we could take a closer look.
@JohnWilliams-gy5yc
@JohnWilliams-gy5yc 6 ай бұрын
If inference is so light that Apple chooses the Arm extension accelerator path, I guess the "real" reason for the Copilot+ NPU requirement is that AMD doesn't want to license Intel AMX.
@cmd_f5
@cmd_f5 6 ай бұрын
I remember thinking Windows 10 was too annoying to get into because of the changes. This latest chapter in the Win 11 saga just makes me wish ReactOS were actually ready for end users in any way. Guess it's time to mess with Linux for the tenth time and waste more energy.
@sidensvans67
@sidensvans67 2 ай бұрын
Microsoft: The back door is now a barn door. Also Microsoft: Trust us...
@skyak4493
@skyak4493 6 ай бұрын
Microsoft learned from "Clippy" that people will not pay for an annoying software sidekick, so this time around they are making all existing PCs obsolete; then "Copilot" will be the software everyone needs to run on their new NPUs! If you don't buy Copilot you are wasting your NPU! In the sequel, "Clippy II: The Revenge of Clippy", all old computers are hamstrung so everyone is forced to train their AI replacement!
@AlainPaulikevitch
@AlainPaulikevitch 6 ай бұрын
NPUs are a sad joke. Instead of extending the compute units' instruction set to make them as general-purpose as possible, usable by any type of processing, these idiots at NVIDIA went and made a specialized set of cores that can only perform work predetermined by the chip makers. The concept is completely ridiculous. It was partly justified for graphics processing, because observing the software over a long enough period led to identifying a reduced instruction set that could be efficiently implemented in smaller, easier-to-scale cores. But today it is neither justified to keep limiting graphics cores to a reduced instruction set, nor to make hardware that can only compute today's AI algorithms. On the graphics front, the need for an extended operation set was popularized by ray tracing, but earlier applications such as codecs could have greatly benefited from an extended instruction set allowing new codecs to be added at the driver level. On the AI front, where algorithms are even more volatile, specialized hardware is even less justified. So the question is: are NVIDIA's decision makers stupid, or simply manipulative and greedy, purposely releasing overpriced products that will quickly become obsolete? Either way, the direction things are going is plain disgusting.
@donaldduck5731
@donaldduck5731 6 ай бұрын
Seems an awful lot of money to find five things to do in Paris and make a fake photo. Is there anything else AI can do? I'm quite happy for now with my Win10 HP ZBook, my Wacom pen tablet, Matlab, Python, SolidWorks, Sketchbook, and my "real" intelligence to do the design work.
@paulwoodward8265
@paulwoodward8265 6 ай бұрын
If the new silicon is highly optimised for this workload, this is kind of reasonable. We don't want these models run inefficiently by millions of devices; that's bad for the planet. Arguably LLMs are bad for the planet, full stop. I'm sure someone will figure out a way to run representative workloads on these NPUs, on the M4, and on GPUs, and then we'll know. But looking at how efficient hardware video encoders and decoders are compared to CPUs and GPUs, it's certainly plausible these NPUs are way more efficient than general-purpose silicon.
@GaryExplains
@GaryExplains 6 ай бұрын
Isn't having to buy new PCs with the NPUs built-in, also bad for the planet?
@JoeEbitDa
@JoeEbitDa 6 ай бұрын
NVIDIA has ChatRTX. Is this their response to Microsoft's gaslighting?
@Psychlist1972
@Psychlist1972 6 ай бұрын
Hi Gary. Technically, a CPU (without an embedded GPU) is capable of doing your graphics work as well, but it will be more efficient and faster on a GPU. Faster means you can do more, like higher resolution (think larger models, or run more often). NPUs are really no different. (Also, FWIW, Apple includes a "Neural Engine". It's an NPU. Not sure why the confusion there.) ChatGPT gets all the press, but AI is also about things like noise removal, camera tracking, OCR, high-fidelity background elimination, graphics upscaling, etc. Having AI in-line and being evaluated in real-time opens up a lot of capabilities. DirectML will be the low-level interface for NPUs, GPUs, CPUs, etc. like Direct3D is to graphics. The discussion around "What does it all mean" makes me think you missed those announcements at Build last month. ONNX Runtime, PyTorch etc. can run on top of that as the higher-level access. (Disclosure: I work in Windows at Microsoft, but not specifically on AI tech)
@GaryExplains
@GaryExplains 6 ай бұрын
Yes of course Apple's processors have NPUs, that wasn't the point I was making.
@Psychlist1972
@Psychlist1972 6 ай бұрын
@@GaryExplains Maybe I misunderstood then. I thought you were comparing and contrasting and saying Apple just integrated the acceleration into the CPU so an NPU wasn't needed. But that's all part of their AI Accelerator/NPU/Neural Engine from what I understand.
@GaryExplains
@GaryExplains 6 ай бұрын
@@Psychlist1972 No, I was saying that with the M4, Apple added a second ML accelerator, a CPU-based one, that uses the Armv9 Scalable Matrix Extension (SME), showing that you can build a useful ML accelerator into the CPU. In Apple's case it complements the Neural Engine; I am suggesting it could replace it. Apple wouldn't have added it if it were useless or undesirable.
@Psychlist1972
@Psychlist1972 6 ай бұрын
​@@GaryExplains Ahh, got it. Thanks. That's another matrix/vector instruction set like Arm NEON or Intel AVX/AMX (the Xeon version, not to be confused with Apple's AMX), and part of the updated Armv9 set. Usable for much more than AI, but usable for AI as well, of course, for apps compiled to use it. It would be nice if the X elite Oryon cores implemented SME, but there's always a bit of leap-frogging going on between competitors, and Apple does some great work. There's still benefit to having a dedicated NPU, vs using CPU instruction cycles, but whether or not that is "essential" would depend on how much CPU you're willing to part with when running those models. I suspect much of the AI work Apple does will still run on their Neural Engine where it's more efficient. Unfortunately, any depth I have on CPU instruction sets pretty much bottoms out here.
@OZtwo
@OZtwo 2 ай бұрын
We may not need it now, but yes, we will need one. I remember years ago videos like this titled "Do we need GPUs?"
@GaryExplains
@GaryExplains 2 ай бұрын
I guess you didn't watch the video 🤦🏼‍♂️
@OZtwo
@OZtwo 2 ай бұрын
@@GaryExplains Yes I did. And I'll say again: many said the same about GPUs when they were first released. The NPU will be key to freeing up the system so other tasks don't take a hit. Even AI in future games will run much faster on an NPU than on a CPU, as it frees the CPU to process the game. Just like the GPU, more features will be added to NPUs and standards will be created, as we saw with the simple GPU of old. But yes, you did bring up a good point, and really the reason an NPU is needed: power savings.
@StraussBR
@StraussBR 4 ай бұрын
GPUs are more efficient; we already know this, especially NVIDIA's. That is why Google didn't build big versions of the Coral NPU.
@LabelsAreMeaningless
@LabelsAreMeaningless 4 ай бұрын
While I would love to have an LLM running locally, Copilot and its corresponding features (like Recall) will never be something I want. I don't see why anyone would be so willing to throw away their privacy like that. I don't care if someone says they never do anything wrong; it's irrelevant. It's none of Microsoft's business what you do and don't do. Not to mention that the more places have your data, the more risk there is of hackers or scammers taking that information and using it against you. Even a leaked contact list could cause headaches. Recall files aren't even encrypted properly, so anyone who gets access can see every single thing you did: login info, financial info, NDA work, anything.
@msromike123
@msromike123 3 ай бұрын
Not sure how it's any less safe or private than running Win 11 in the first place. As you said, running "locally".