LLM System and Hardware Requirements - Running Large Language Models Locally

7,000 views

AI Fusion

1 day ago

Comments: 38
@nishitkakkad4976 · 4 days ago
Hey, I'm thinking of getting a new laptop with an RTX 4060 and 8GB of VRAM. But I'm also considering using Google Colab or Jupyter notebooks to learn about and play around with Large Language Models (LLMs) and their applications. The thing is, I need a new laptop anyway because my current one is ancient and barely hanging on. So, I'm wondering if it makes more sense to just buy my own machine or if I should go the Colab/Jupyter route. What do you think?
@watcharasupphakan9326 · 1 month ago
Thank you for a very informative video. I just bought an RTX 4060 laptop with 8GB of VRAM and 16GB of RAM. Looking to run smaller LLMs, learn to build AI agents, and also use Stable Diffusion. Hopefully it's enough for a start. Saving up for an RTX 4090 PC. A bit far-fetched at the moment 😂
@AIFusion-official · 1 month ago
Thanks for the feedback! Congratulations on the new RTX 4060 laptop. With 8GB of VRAM and 16GB of RAM, you're well-equipped to start working with smaller LLMs and building AI agents. I also have an RTX 4060, and it runs models like Gemma 2 2B in FP16, LLaMA 3.1 8B in Q4 quantization, and Qwen 2.5 3B in Q8 smoothly. These models should perform well with your setup, but I would recommend paying attention to the laptop temperature. Keep going with your AI projects!
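As a rough sketch of why those models fit in 8GB of VRAM: weight memory is roughly parameter count times bytes per parameter. The parameter counts below are approximate, and real usage adds KV cache and runtime overhead on top.

```python
# Rough weight-memory estimate: params * bytes per parameter.
# Parameter counts are approximate; KV cache and runtime overhead come on top.
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def weight_gb(params_billions: float, precision: str) -> float:
    """Approximate GB needed just to hold the model weights."""
    return params_billions * 1e9 * BYTES_PER_PARAM[precision] / 1024**3

for name, params_b, prec in [("Gemma 2 2B", 2.6, "fp16"),
                             ("LLaMA 3.1 8B", 8.0, "q4"),
                             ("Qwen 2.5 3B", 3.0, "q8")]:
    print(f"{name} ({prec}): ~{weight_gb(params_b, prec):.1f} GB")
```

All three estimates land comfortably under 8GB, which matches the experience described above.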
@watcharasupphakan9326 · 1 month ago
Thank you for the model recommendations. That's what I was thinking as well, but a little assurance goes a long way. I'm thinking of moving to a PC in a couple of months. Probably not an RTX 4090 😅. I'd probably go for a 4070 Ti or 4080 Super; I don't really need a very large model for what I'll be doing. Keep up the good work, and I hope you build up a large subscriber base 💪
@AIFusion-official · 1 month ago
You're welcome! I'm glad the recommendations were helpful. Moving to a PC sounds like a great plan, and the 4070 Ti and 4080 Super are solid choices, as they both have 16GB of VRAM, which will allow you to run larger models. Thank you for the kind words! I'm excited about the journey ahead and appreciate your support. Best of luck with your AI endeavors, and feel free to reach out if you have any questions along the way. By the way, I would recommend playing around with Whisper, an open-source speech-to-text model. I'm using it to create subtitles for the videos, and "Medium" and all the sizes below it work well on the RTX 4060.
@YasirJilani-f1j · 1 month ago
Very nice, thanks! Can you do a video on TPUs?
@MonsieugarDaddy · 1 month ago
I'm glad I found this video, but I got lost at the first part: GPU memory. I know most people refer to the graphics part as a 'dedicated' component, yet I'm curious: can an iGPU contribute to our setup? Considering the latest Ryzen 880M is quite a 'capable' iGPU...
@AIFusion-official · 1 month ago
Thank you for your comment! While the Ryzen 880M is a powerful iGPU for many tasks, running large language models (LLMs) typically requires a dedicated GPU with substantial VRAM (often 8GB to 24GB) due to the heavy memory and computational demands. iGPUs like the 880M share system memory and lack the dedicated resources and bandwidth needed to efficiently handle LLMs, so they wouldn't contribute significantly in this context. For best results, a discrete GPU with sufficient VRAM is essential.
@jamegumb7298 · 16 days ago
What if you just want to run an LLM specifically for better speech recognition? It should be very small, a subset. Could that be done on integrated graphics to keep the dedicated GPU free?
@markverhoeven7518 · 24 days ago
"cutting edge research" read: "for more serious science or commercial tasks"
@azkongs · 2 months ago
Do you know if a model that is 16GB in size could run on a graphics card with 16GB of VRAM?
@AIFusion-official · 2 months ago
A model that is 16GB in size might not fit perfectly into a 16GB VRAM graphics card due to additional memory requirements for computations and overhead. While it’s theoretically possible, practical use often requires more VRAM. Techniques like quantization and reducing batch size can help manage memory usage.
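That rule of thumb can be sketched numerically. The 1.2x overhead factor below is an assumption, a rough allowance for activations, KV cache, and CUDA context, not an exact figure:

```python
def fits(model_gb: float, vram_gb: float, overhead: float = 1.2) -> bool:
    # The 1.2x overhead factor is a rough allowance for activations,
    # KV cache, and CUDA context; actual overhead varies by runtime.
    return model_gb * overhead <= vram_gb

print(fits(16, 16))  # 16 GB of weights on a 16 GB card: False
print(fits(16, 24))  # the same weights on a 24 GB card: True
```

By this estimate, a 16GB model wants roughly 19GB of VRAM, which is why quantizing down or reducing batch size is usually needed on a 16GB card.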
@matthiasandreas6549 · 1 month ago
Hello, do you have the link for the quantization calculation you're showing in the video? Please, thank you.
@AIFusion-official · 1 month ago
That's not a tool; it's just something I made using HTML, CSS, and JS for the sake of the video. I made a proper tool right after this video, where you can choose a large language model and see which GPUs could run it (and how many of them) in FP32, FP16, INT8, and INT4. (You can find the link to this tool in the description.)
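The core calculation behind that kind of FP32/FP16/INT8/INT4 comparison can be sketched as follows. The 24GB card is just an illustrative value, and the estimate counts weights only, ignoring per-GPU overhead:

```python
import math

# Bytes per parameter at each precision.
BYTES = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def gpus_needed(params_billions: float, precision: str, vram_gb: float) -> int:
    """How many cards with vram_gb of VRAM are needed just to hold the weights."""
    total_gb = params_billions * 1e9 * BYTES[precision] / 1024**3
    return math.ceil(total_gb / vram_gb)

# Example: a 70B model on hypothetical 24 GB cards.
for precision in BYTES:
    print(precision, gpus_needed(70, precision, 24))
```

This shows why quantization matters so much: the same 70B model drops from double-digit card counts at FP32 to a couple of cards at INT4.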
@matthiasandreas6549 · 1 month ago
@AIFusion-official Thanks a lot for your answer. I looked but can't find the link 🤔
@AIFusion-official · 1 month ago
You're welcome! Here it is: aifusion.company/gpu-llm. Hope it helps!
@matthiasandreas6549 · 1 month ago
@AIFusion-official Thank you so much!
@AIFusion-official · 1 month ago
You're welcome!
@irocz5150 · 1 month ago
Apologies for the lack of knowledge... but why no AMD video cards?
@AIFusion-official · 1 month ago
AMD video cards work too. I didn't mention them in the video, but there are some good AMD GPUs for LLM tasks.
@jefflane2012 · 1 month ago
Nvidia GPUs have tensor cores to perform massive simultaneous calculations for AI. AMD has cuda which is more of a general processor.
@akhathos1618 · 1 month ago
@jefflane2012 You have no idea what you're talking about.
@jamegumb7298 · 16 days ago
@jefflane2012 AMD has no CUDA at all; AMD has ROCm, which is kind of the same but different. Tensor cores help, but other capabilities factor into it too; RDNA3 can work fine.
@matthiasandreas6549 · 2 months ago
Hello, and thanks for the video. At the moment I can choose between an Nvidia RTX 3060 12GB and an Nvidia RTX 4060 8GB GPU, to use 8B LLMs in private testing. Which one would be the better one? I understand that both GPUs will work at T4 85% accuracy, is that so? Thank you for answering.
@AIFusion-official · 2 months ago
I have an RTX 4060, and it works fine. However, I would recommend getting the RTX 3060 because it has more video RAM, which allows it to load larger models. Hope this helps!
@matthiasandreas6549 · 2 months ago
@AIFusion-official Yes, thanks, it helps a lot; your stream was very informative. But will the accuracy with 12GB of VRAM be the same, or a bit better?
@matthiasandreas6549 · 2 months ago
@AIFusion-official And how much RAM do you have? About 32GB or bigger? I can choose between 32 and 64GB, but there's a big price difference.
@AIFusion-official · 2 months ago
I have 32GB of RAM.
@AIFusion-official · 2 months ago
The accuracy would be the same; the only difference is the speed, specifically how many tokens per second it can generate.
@harshdeep7015 · 1 month ago
I want to use AI tools like Stable Diffusion and ComfyUI, do 3D game development, and also train my own AI models. Searching for a beast laptop under $1800. Also want to do video editing 😢
@AIFusion-official · 1 month ago
For the tasks you're looking to do (AI tools like Stable Diffusion and ComfyUI, 3D game development, AI model training, and video editing), getting a powerful laptop under $1800 is going to be tricky. While some laptops in that range offer decent specs, they often struggle with heat management, especially during heavy workloads. Overheating causes thermal throttling, which severely impacts performance and makes long tasks like AI model training and video editing frustrating.

Honestly, you'd be better off going with a desktop. Desktops provide much better cooling, so they can handle long hours of heavy use without slowing down. Plus, they're far more price-friendly for the same level of performance: with $1800, you can get a desktop with a much more powerful GPU, more RAM, and more storage than any laptop in the same price range. And as a bonus, desktops are more upgradable, so you can easily improve the specs down the line as your needs grow.
@harshdeep7015 · 1 month ago
@AIFusion-official But I'm a student, so I need portability. I'll also build a PC after 3-4 years.
@AIFusion-official · 1 month ago
For AI tools like Stable Diffusion, 3D game development, and video editing, finding a strong laptop under $1800 is tough but doable. Look for one with at least an NVIDIA RTX 3060 or RTX 4060/4070 GPU, paired with an Intel i7 or Ryzen 7 processor. You’ll need 16GB RAM (or ideally 32GB) and 1TB SSD storage. Good cooling is important to avoid overheating during heavy tasks. Some solid options are the ASUS ROG Strix G15, MSI Katana GF76, and Lenovo Legion 5 Pro. They’ll give you the portability you need as a student while handling your workload.
@harshdeep7015 · 1 month ago
@AIFusion-official What about a 3070 Ti laptop?
@AIFusion-official · 1 month ago
A laptop with an RTX 3070 Ti is a solid choice! It'll handle AI tasks, game development, and video editing really well. Just pair it with a good i7 or Ryzen 7, 16GB or 32GB of RAM, and a 1TB SSD. If you find one under $1800, go for it! Just check the cooling to avoid overheating.