LLM System and Hardware Requirements - Running Large Language Models Locally

7,000 views

AI Fusion

1 day ago

Comments: 38
@nishitkakkad4976 · 4 days ago
Hey, I'm thinking of getting a new laptop with an RTX 4060 and 8GB of VRAM. But I'm also considering using Google Colab or Jupyter notebooks to learn about and play around with Large Language Models (LLMs) and their applications. The thing is, I need a new laptop anyway because my current one is ancient and barely hanging on. So, I'm wondering if it makes more sense to just buy my own machine or if I should go the Colab/Jupyter route. What do you think?
@watcharasupphakan9326 · 1 month ago
Thank you for a very informative video. I just bought an RTX 4060 laptop with 8GB of VRAM and 16GB of RAM. Looking to run smaller LLMs, learn to build AI agents, and also use Stable Diffusion. Hopefully it's enough for a start. Saving up for an RTX 4090 PC. A bit far-fetched at the moment 😂
@AIFusion-official · 1 month ago
Thanks for the feedback! Congratulations on the new RTX 4060 laptop. With 8GB of VRAM and 16GB of RAM, you're well-equipped to start working with smaller LLMs and building AI agents. I also have an RTX 4060, and it runs models like Gemma 2 2B in FP16, LLaMA 3.1 8B in Q4 quantization, and Qwen 2.5 3B in Q8 smoothly. These models should perform well with your setup, but I would recommend paying attention to the laptop temperature. Keep going with your AI projects!
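As a rough sketch of why those models fit in 8GB of VRAM: weight memory is roughly parameter count times bytes per parameter. The parameter counts below are approximate, and real usage adds KV cache and runtime overhead on top.

```python
# Rough weight-memory estimate: params * bytes per parameter.
# Parameter counts are approximate; KV cache and runtime overhead come on top.
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def weight_gb(params_billions: float, precision: str) -> float:
    """Approximate GB needed just to hold the model weights."""
    return params_billions * 1e9 * BYTES_PER_PARAM[precision] / 1024**3

for name, params_b, prec in [("Gemma 2 2B", 2.6, "fp16"),
                             ("LLaMA 3.1 8B", 8.0, "q4"),
                             ("Qwen 2.5 3B", 3.0, "q8")]:
    print(f"{name} ({prec}): ~{weight_gb(params_b, prec):.1f} GB")
```

All three estimates land comfortably under 8GB, which matches the experience described above.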
@watcharasupphakan9326 · 1 month ago
Thank you for the model recommendations. That's what I was thinking as well, but a little assurance goes a long way. I'm thinking of moving to a PC in a couple of months. Probably not an RTX 4090 😅. I'd probably go for a 4070 Ti or 4080 Super; I don't really need a very large model for what I'll be doing. Keep up the good work, and I hope you build up a large subscriber base 💪
@AIFusion-official · 1 month ago
You're welcome! I'm glad the recommendations were helpful. Moving to a PC sounds like a great plan, and the 4070 Ti and 4080 Super are solid choices, as they both have 16GB of VRAM, which will allow you to run larger models. Thank you for the kind words! I'm excited about the journey ahead and appreciate your support. Best of luck with your AI endeavors, and feel free to reach out if you have any questions along the way. By the way, I would recommend playing around with Whisper, an open-source speech-to-text model. I'm using it to create subtitles for the videos, and "Medium" and all the sizes below it work well on the RTX 4060.
@YasirJilani-f1j · 1 month ago
Very nice, thanks! Can you do a video on TPUs?
@MonsieugarDaddy · 1 month ago
I'm glad I found this video, but I got lost at the first part: GPU memory. I know most people refer to the graphics part as a 'dedicated' component, yet I'm curious: can an iGPU contribute to our setup? Considering the latest Ryzen 880M is quite a 'capable' iGPU...
@AIFusion-official · 1 month ago
Thank you for your comment! While the Ryzen 880M is a powerful iGPU for many tasks, running large language models (LLMs) typically requires a dedicated GPU with substantial VRAM (often 8GB to 24GB) due to the heavy memory and computational demands. iGPUs like the 880M share system memory and lack the dedicated resources and bandwidth needed to efficiently handle LLMs, so they wouldn't contribute significantly in this context. For best results, a discrete GPU with sufficient VRAM is essential.
@jamegumb7298 · 16 days ago
What if you just want to run an LLM specifically for better speech recognition? It should be very small, a subset. Could that be done on integrated graphics to keep the dedicated GPU free?
@markverhoeven7518 · 24 days ago
"cutting edge research" read: "for more serious science or commercial tasks"
@azkongs · 2 months ago
Do you know if a model that is 16GB in size could run on a graphics card with 16GB of VRAM?
@AIFusion-official · 2 months ago
A model that is 16GB in size might not fit perfectly into a 16GB VRAM graphics card due to additional memory requirements for computations and overhead. While it’s theoretically possible, practical use often requires more VRAM. Techniques like quantization and reducing batch size can help manage memory usage.
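That rule of thumb can be sketched numerically. The 1.2x overhead factor below is an assumption, a rough allowance for activations, KV cache, and CUDA context, not an exact figure:

```python
def fits(model_gb: float, vram_gb: float, overhead: float = 1.2) -> bool:
    # The 1.2x overhead factor is a rough allowance for activations,
    # KV cache, and CUDA context; actual overhead varies by runtime.
    return model_gb * overhead <= vram_gb

print(fits(16, 16))  # 16 GB of weights on a 16 GB card: False
print(fits(16, 24))  # the same weights on a 24 GB card: True
```

By this estimate, a 16GB model wants roughly 19GB of VRAM, which is why quantizing down or reducing batch size is usually needed on a 16GB card.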
@matthiasandreas6549 · 1 month ago
Hello, do you have the link for the quantization calculation you're showing in the video? Please, thank you.
@AIFusion-official · 1 month ago
That's not a tool; it's just something I made using HTML, CSS, and JS for the sake of the video. I made a proper tool right after this video, where you can choose a large language model and see which GPUs could run it (and how many of them) in FP32, FP16, INT8, and INT4. (You can find the link to this tool in the description.)
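The core calculation behind that kind of FP32/FP16/INT8/INT4 comparison can be sketched as follows. The 24GB card is just an illustrative value, and the estimate counts weights only, ignoring per-GPU overhead:

```python
import math

# Bytes per parameter at each precision.
BYTES = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def gpus_needed(params_billions: float, precision: str, vram_gb: float) -> int:
    """How many cards with vram_gb of VRAM are needed just to hold the weights."""
    total_gb = params_billions * 1e9 * BYTES[precision] / 1024**3
    return math.ceil(total_gb / vram_gb)

# Example: a 70B model on hypothetical 24 GB cards.
for precision in BYTES:
    print(precision, gpus_needed(70, precision, 24))
```

This shows why quantization matters so much: the same 70B model drops from double-digit card counts at FP32 to a couple of cards at INT4.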
@matthiasandreas6549 · 1 month ago
@AIFusion-official Thanks a lot for your answer. I looked but can't find the link 🤔
@AIFusion-official · 1 month ago
You're welcome! Here it is: aifusion.company/gpu-llm. Hope it helps!
@matthiasandreas6549 · 1 month ago
@AIFusion-official Thank you so much!
@AIFusion-official · 1 month ago
You're welcome!
@irocz5150 · 1 month ago
Apologies for the lack of knowledge... but why no AMD video cards?
@AIFusion-official · 1 month ago
AMD video cards work too. I didn't mention them in the video, but there are some good AMD GPUs for LLM tasks.
@jefflane2012 · 1 month ago
Nvidia GPUs have tensor cores to perform massive simultaneous calculations for AI. AMD has cuda which is more of a general processor.
@akhathos1618 · 1 month ago
@jefflane2012 You have no idea what you're talking about.
@jamegumb7298 · 16 days ago
@jefflane2012 AMD has no CUDA at all; AMD has ROCm, which is kind of the same but different. Tensor cores help, but other capabilities factor into it too; RDNA3 can work fine.
@matthiasandreas6549 · 2 months ago
Hello, and thanks for the video. At the moment I can choose between an Nvidia RTX 3060 12GB and an Nvidia RTX 4060 8GB GPU, to use 8B LLMs in private testing. Which one would be the better one? I understand that both GPUs will work at T4 85% accuracy, is that so? Thank you for answering.
@AIFusion-official · 2 months ago
I have an RTX 4060, and it works fine. However, I would recommend getting the RTX 3060 because it has more video RAM, which allows it to load larger models. Hope this helps!
@matthiasandreas6549 · 2 months ago
@AIFusion-official Yes, thanks, it helps a lot; your stream was very informative. But will the accuracy with 12GB of VRAM be the same, or a bit better?
@matthiasandreas6549 · 2 months ago
@AIFusion-official And how much RAM do you have? About 32GB or bigger? I can choose between 32 and 64GB, but there's a big price difference.
@AIFusion-official · 2 months ago
I have 32GB of RAM.
@AIFusion-official · 2 months ago
The accuracy would be the same; the only difference is the speed, specifically how many tokens per second it can generate.
@harshdeep7015 · 1 month ago
I want to use AI tools like Stable Diffusion and ComfyUI, do 3D game development, and also train my own AI models. Searching for a beast laptop under $1800. Also want to do video editing 😢
@AIFusion-official · 1 month ago
For the tasks you're looking to do (AI tools like Stable Diffusion and ComfyUI, 3D game development, AI model training, and video editing), getting a powerful laptop under $1800 is going to be tricky. While some laptops in that range offer decent specs, they often struggle with heat management, especially during heavy workloads. Overheating causes thermal throttling, which severely impacts performance and makes long tasks like AI model training and video editing frustrating.

Honestly, you'd be better off going with a desktop. Desktops provide much better cooling, so they can handle long hours of heavy use without slowing down. Plus, they're far more price-friendly for the same level of performance: with $1800, you can get a desktop with a much more powerful GPU, more RAM, and more storage than any laptop in the same price range. And as a bonus, desktops are more upgradable, so you can easily improve the specs down the line as your needs grow.
@harshdeep7015 · 1 month ago
@AIFusion-official But I'm a student, so I need portability. I'll also build a PC after 3-4 years.
@AIFusion-official · 1 month ago
For AI tools like Stable Diffusion, 3D game development, and video editing, finding a strong laptop under $1800 is tough but doable. Look for one with at least an NVIDIA RTX 3060 or RTX 4060/4070 GPU, paired with an Intel i7 or Ryzen 7 processor. You’ll need 16GB RAM (or ideally 32GB) and 1TB SSD storage. Good cooling is important to avoid overheating during heavy tasks. Some solid options are the ASUS ROG Strix G15, MSI Katana GF76, and Lenovo Legion 5 Pro. They’ll give you the portability you need as a student while handling your workload.
@harshdeep7015 · 1 month ago
@AIFusion-official What about a 3070 Ti laptop?
@AIFusion-official · 1 month ago
A laptop with an RTX 3070 Ti is a solid choice! It'll handle AI tasks, game development, and video editing really well. Just pair it with a good i7 or Ryzen 7, 16GB or 32GB of RAM, and a 1TB SSD. If you find one under $1800, go for it! Just check the cooling to avoid overheating.