AI/ML/DL GPU Buying Guide 2024: Get the Most AI Power for Your Budget

52,218 views

TheDataDaddi · 1 day ago

Welcome to the ultimate AI/ML/DL GPU Buying Guide for 2024!
In this comprehensive guide, I'll help you make informed choices when selecting the ideal graphics processing unit for your AI, Machine Learning, and Deep Learning projects.
🚀 Whether you're a data scientist, researcher, or just an enthusiast, I've meticulously evaluated nearly 70 GPUs and summarized their AI capabilities. My aim is to ensure that you get the most AI power for your budget in 2024.
🔍 What to Expect in this Video:
- Guidance on how to choose a GPU for AI/ML/DL
- Price-performance comparisons to empower your decision-making
- Expert recommendations tailored to different AI/ML/DL requirements
By the end of this guide, you'll have a clear understanding of which GPU is the perfect fit for your AI-related tasks in 2024, enabling you to maximize your investment's value. If you're passionate about AI and want to make the most of your budget, this video is a must-watch. Don't forget to like, subscribe, and hit the notification bell to stay updated on our upcoming AI/ML/DL content!
📚 Additional Resources:
Guide For Choosing AI/ML/DL GPU
docs.google.com/presentation/...
GPU Comparison - Excel Data
docs.google.com/spreadsheets/...
Interactive GPU Comparison - PowerBI Report
drive.google.com/file/d/1kpv3...
AI/ML/DL with the Dell PowerEdge R720 Server - Energy, Heat, and Noise Considerations
• AI/ML/DL with the Dell...
Throttle No More: My Strategy for GPU Cooling in Dell PowerEdge
• Throttle No More: My S...
drive.google.com/file/d/1Pd-R...
Installing Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
• Installing Tesla P100 ...
Installing DUAL Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
• Installing DUAL Tesla ...
Dell PowerEdge R720XD GPU Upgrade: Installing Tesla P40 with NVIDIA Drivers
• Dell PowerEdge R720XD ...
Dell PowerEdge R720 GPU Deep Learning Upgrade: Installing Dual Tesla P40s with NVIDIA Drivers
• Dell PowerEdge R720 GP...
HOW TO GET IN CONTACT WITH ME
🐦 X (Formerly Twitter): @TheDataDaddi
📧 Email: skingutube22@gmail.com
💬 Discord: / discord
Feel free to connect with me on X (Formerly Twitter) or shoot me an email for any inquiries, questions, collaborations, or just to say hello! 👋
HOW TO SUPPORT MY CHANNEL
If you found this content useful, please consider buying me a coffee at the link below. This goes a long way in helping me through grad school and allows me to continue making the best content possible.
Buy Me a Coffee
www.buymeacoffee.com/TheDataD...
As a cryptocurrency enthusiast, I warmly welcome donations in crypto. If you're inclined to support my work this way, please feel free to use the following addresses:
Bitcoin (BTC) Address: bc1q3hh904l4uttmge6p58kjhrw4v9clnc6ec0jns7
Ethereum (ETH) Address: 0x733471ED0A46a317A10bf5ea71b399151A4bd6BE
Should you prefer to donate in a cryptocurrency other than Bitcoin or Ethereum, please don't hesitate to reach out, and I'll provide you with the appropriate wallet address.
Thanks for your support!

Comments: 259
@starlordhero1607 · 1 month ago
00:11 Choose a GPU with a newer architecture for better performance
02:28 Choose NVIDIA GPUs with active support and sufficient VRAM for future scalability
06:51 Key considerations for choosing an NVIDIA GPU for deep learning
09:08 Consider driver support for deep learning framework compatibility
13:10 Factors to consider when choosing an NVIDIA GPU for deep learning
15:12 Understanding the key GPU metrics is crucial for making the right choice
19:46 Choosing a GPU based on performance, memory, and bandwidth criteria
22:00 GeForce RTX 2060 Super and GeForce RTX 4060 Ti 8GB are the best bang-for-your-buck GPUs
26:27 Comparison of NVIDIA GPU models for deep learning in 2023
28:45 GeForce RTX 4060 Ti 16GB has the best raw performance
33:18 Choosing NVIDIA GPUs for deep learning in 2023
35:36 Best bang for your buck: P100 and P40 GPUs
39:22 P100 and P40 are recommended for deep learning
Crafted by Merlin AI.
@TheDataDaddi · 1 month ago
Thanks so much for adding these! I appreciate it.
@jeffm4284 · 5 months ago
Brother, you went down a serious rabbit hole. A man after my own heart. One thing you didn't mention that is very important to data integrity, and therefore results, is ECC RAM. The pro Nvidia GPUs have ECC RAM. Most casual users don't realize how many bits are flipped through flaws in silicon and cosmic events (literally). Then some garbage gets written to disk. That's why I would never own a workstation that doesn't have ECC from top to bottom. Better if the OS / hypervisor is using ZFS (the best file system IMO, with 40 years building enterprise and global systems).

Consumer equipment is fine to test and learn on, especially on a budget. But if you want data and result integrity, at a minimum, buy pro equipment. Like you, I've had great results buying refurbs from eBay. Used electronics prices drop faster than pulling out of an auto sales lot in a new car. Well, at least that used to be true. But it still is for workstations and servers. I recently looked at some Dell PowerEdge R730s (NVMe M.2 bootable) with 128 GB ECC RAM and dual upper-end v3 Xeon processors for about $400, with iDRAC (out-of-band management).

You did emphasize the use case that the GPU is for, and that's what I'd emphasize too. If the data you are processing isn't something you can replace with numerous backups, or can't suffer glitches, go with professional equipment, either new or used. Then use ZFS RAID and not hardware RAID. ZFS controls the whole data stack, from RAM to permanent storage. You'll want to disable hardware RAID so firmware doesn't fight with ZFS. If you are learning and experimenting, but not relying on the end result, use cheaper consumer products. The learning curve is lower and so is the price tag.

If you've never managed a PowerEdge server, that is an entirely different animal, because it has to be. The difference between one of those and a consumer PC is like the difference between a flip phone and a Linux workstation. Night and day. But if you're a nerd like me, that's what you want as a platform.
@TheDataDaddi · 5 months ago
Hey there. Thanks so much for the detailed response! I really appreciate comments like this. Yeah, I definitely went down a rabbit hole with this video. I really did not mean to, but once I started I couldn't stop. lol I will definitely keep that in mind going forward. I have never really had any issues thus far, but what you are saying definitely makes sense. Also, from my research it really does not appear to be too much more expensive. Yeah, I actually have 3 Dell PowerEdge R720s right now in my basement in a rack. Unfortunately, I do not believe any of them have ECC RAM. I have always wondered about ZFS RAID; I've never used it myself, but I will have to go that route. The next server I buy I want to be something newer, and I will make sure to get ECC RAM and use ZFS RAID. Also, that is a great price on the R730. Is it still available? lol
@Christopher-lb6rf · 3 months ago
> Most casual users don't realize how many bits are flipped through flaws in silicon and cosmic events (literally). Yup, almost none.
@TheDataDaddi · 3 months ago
I can attest. I would not consider myself a casual user, but that is something I never really considered until recently. Upon further research, this is indeed quite common and definitely highlights the need for ECC RAM, especially for critical systems or production environments. @@Christopher-lb6rf
@RSV9 · 3 months ago
What most people don't think about is that time is money !
@TheDataDaddi · 3 months ago
You are definitely right. There is a reason most companies use the cloud rather than on-prem hardware. There are definitely arguments in both directions. In my case, however, I cannot afford the cloud-based solution, and I like full control of my own hardware.
@bdhaliwal24 · 4 months ago
Your insight and the spreadsheet you provided are invaluable. Thanks.
@TheDataDaddi · 4 months ago
Hi there. Thank you so much for the feedback! I am so glad this was useful to you.
@prentrodgers · 7 months ago
This is a fantastic video explaining how to choose a GPU for deep learning/AI/ML. He extended Tim Dettmers' single-GPU performance chart into a masterpiece of a spreadsheet and PowerBI dashboard. Masterful. I wonder if you factored electricity cost, the cost of removing heat from the room, and total decibel output into the decision. I see in a subsequent video that the server is installed in what looks like a basement. The rack is within a few feet of a gas can. Those Dell machines can run hot, so you might want to move the gas can elsewhere. How noisy is the final product with two P100s?
@TheDataDaddi · 7 months ago
Hi Prent. Thank you so much for the kind words. This took me quite a while to put together, so I really appreciate the positive feedback. I have not directly calculated the electricity cost for all of the GPUs. However, for the P100s in the Dell PowerEdge R720, I estimated roughly $200 per year based on the 15 ¢/kWh average in GA where I live and my anticipated usage. That is really just a guess, though. I need to buy a gauge to actually measure average power consumption over a month or so and extrapolate from that. This seems like a good idea for another video, so stay tuned. lol The heat removal has not been an issue at this point, as the basement is fairly large and stays naturally around 65 F. I'll have to see what it is in a week or so after the servers heat things up. As far as the noise is concerned, as long as it is in the basement or a room you don't use often, it is fine. They were in my bedroom for a while, but I had to move them because of the noise lol. I have not measured decibels, but if you are interested I can try to check and let you know more precisely. Thankfully the gas cans are empty, but I agree it's probably a good idea to move them. Thanks again for the interest! Glad you found this helpful.
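The ~$200/year estimate above is easy to sanity-check with a back-of-the-envelope calculation. A minimal sketch, assuming a hypothetical load and duty cycle (two 250 W P100s plus roughly 200 W for the rest of the server, running about 5 hours a day; none of these numbers come from the video):

```python
# Back-of-the-envelope annual electricity cost for a GPU server.
# Hypothetical load: two Tesla P100s (250 W TDP each) plus ~200 W
# for the rest of an R720, at the 15 cents/kWh rate mentioned above.

def annual_energy_cost(watts, hours_per_day, rate_per_kwh):
    """Yearly cost in dollars for a load drawing `watts` for `hours_per_day` each day."""
    kwh_per_year = watts / 1000 * hours_per_day * 365
    return kwh_per_year * rate_per_kwh

# ~700 W total at an assumed ~5 hours/day average utilization
cost = annual_energy_cost(watts=700, hours_per_day=5, rate_per_kwh=0.15)
print(f"${cost:.0f} per year")  # -> $192 per year, in line with the ~$200 guess
```

A watt meter on the PSU, as suggested in the reply, would replace the assumed wattage with a measured average.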
@pliniosborges7963 · 6 months ago
I wish I had watched this video 1 day ago. Great material for beginners in ML. Thank you. I ended up choosing an RTX 4070 12GB. Not the best choice for the money, but I guess still very powerful.
@TheDataDaddi · 6 months ago
Hey man! Thanks so much for the feedback. Yeah, unfortunately, most of the 4000 series do not have the greatest price-to-performance ratio. However, they perform better at general tasks like rendering and gaming, so there is definitely something to be said for them if you can afford them. That said, I think the 4070 is a fine GPU and will be an absolute workhorse when it comes to smaller machine learning problems. If you remember, let me know how you like it after a few months of experimenting with it. I'm very curious about the performance of the 4000 series in general. Thanks again for your feedback!
@iskiiwizz536 · 1 month ago
@@TheDataDaddi very good reply man
@HuzMS · 6 days ago
Thank you for your hard work. Was the P40 or P100 the better choice? Also, were you using NVLink?
@whomair · 4 months ago
Just what I was looking for! Thank you for all the hard work in putting this together - you make the world a better place :)
@TheDataDaddi · 4 months ago
Hi there. Thank you so much for your kind words and the feedback. I am just glad this was helpful to you.
@gax1985 · 3 months ago
Thank you, brother, for your hard work. You have saved me a lot of time. Your spreadsheet is amazing! We can sort GPUs by the desired category! After viewing your results, I believe the GeForce GPU that is most notable (especially considering the price) is the 3080 Ti. It is close in CUDA cores to the 3090 (I am aiming for a 3090, but it might be easier and cheaper for me to get a second-hand 3080 Ti).
@TheDataDaddi · 3 months ago
Hi there. So glad this video has helped you! Yep, I would definitely agree with that statement. It may well be easier and cheaper to get your hands on a 3080 Ti. Most people, it seems, have their eye on the 3090s for the extra VRAM. However, if your use case does not require that, I think the 3080 Ti would be an excellent way to go for sure.
@miguelpineda5654 · 2 months ago
Great video, I'm very grateful; it was worth watching in its entirety. Thank you for your effort, greetings from Panama. Your video will greatly assist me in a project that my classmates and I want to undertake at the university. Many thanks for sharing such valuable information
@TheDataDaddi · 2 months ago
Hi there! Thanks so much for your comment! I am so glad that my video was able to help!
@marktwain5232 · 2 months ago
This is just great! This is just what I was looking for. You have my great respect! Extremely helpful to people. Kudos of the highest order!
@TheDataDaddi · 2 months ago
Hi there. I am so glad that this was able to help you! Really appreciate the feedback!
@aidanm5578 · 2 months ago
I live for this deep dive stuff. Thanks for your thorough work!
@TheDataDaddi · 2 months ago
Thanks so much! Glad you enjoyed the video!
@user-T.Baldwin · 1 month ago
Wow! How many eons' worth of time have you put into this? There are not very many people who would go so far out of their way for others like this. This is a great work of art, thank you. So, for LLMs like Mixtral I can just use a P40? Yay
@TheDataDaddi · 1 month ago
Hi there. Thank you so much for the comment! This video took quite a while to put together, a couple of weeks if I remember correctly. I was doing all the research for myself, so I figured I could save others some time and frustration by sharing my results. I am glad to hear it is appreciated! Yes! Looking at the specs on Hugging Face, it looks like you can run inference with even the largest Mistral 7B available, with a max VRAM requirement of 10.20GB. Unfortunately, if you wanted to use a non-quantized version, you may run into memory issues. huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF
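For readers wondering where figures like that come from, inference VRAM is often estimated from parameter count and precision. A rough sketch (the bytes-per-parameter table and the 20% overhead are common rules of thumb, not figures from the video; real usage also depends on context length and runtime):

```python
# Rough VRAM estimate for LLM inference: parameters x bytes per
# parameter, plus ~20% headroom for activations and KV cache.
# These are generic rules of thumb, not figures from the video.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def est_vram_gb(params_billions, precision, overhead=0.2):
    gb = params_billions * BYTES_PER_PARAM[precision]
    return gb * (1 + overhead)

for p in ("fp16", "int8", "int4"):
    print(f"7B model, {p}: ~{est_vram_gb(7, p):.1f} GB")
# fp16 ~16.8 GB, int8 ~8.4 GB, int4 ~4.2 GB: a quantized 7B fits a 24GB P40 easily
```

This matches the reply's point: quantized 7B models fit comfortably on a P40, while full-precision variants start pressing against the limit.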
@srinjoy.bhuiya · 2 months ago
Very good analysis and helpful ! You got yourself a subscriber.
@TheDataDaddi · 2 months ago
Hi there. Thanks so much for the positive feedback! I am so glad the video was useful for you and thank you very much for subscribing. I hope you continue to enjoy the content!
@debtanudatta6398 · 5 months ago
Hello, Great video! Another thing, you have not benchmarked the Nvidia RTX 3000 Ada Generation Laptop GPU yet. It will be helpful if you do that also.
@TheDataDaddi · 2 months ago
Hi there! Thanks for letting me know. I will try to get that added as soon as I can!
@ShakeShifter84 · 2 months ago
Would have loved to see the top 2 of each category put in a list together and compared in the spreadsheet. Great video
@TheDataDaddi · 1 month ago
Hi there. Thanks so much for the comment! I am actually in the process right now of creating my own website: thedatadaddi.com. One of the first things I am going to put on it is a real-time GPU price-to-performance dashboard. I will certainly add the ability to make this kind of comparison. Please stay tuned for progress here.
@noth606 · 8 days ago
GFLOPS is not calculated the way it's shown in the video at 15:18. Drop the Giga, which we know: FLoating-point Operations Per Second (there is some history behind why this is used). It is somewhat archaic, since a lot of other things are being done too that aren't captured by it, but in general most other operations take fewer cycles than a floating-point one, because the decimal point needs special attention, so to speak. 15*15 and 1,5*1,5 are the same operation except for tracking the decimal point separately, with the result being 225 or 2,25. What I mean is that the circuit needs additional logic to track the decimal point, or rather fractions, which is why we separate floating-point from integer operations: additional hardware is required to track the decimal point on top of integer-type numerical operations. No idea if this makes any sense or is useful. I thought it would be simple to explain until I thought it through and realized I needed to type this rather than scribble on a whiteboard. I'm sure there is a good explanation out there; I'm just trying to point at the why, since for a person doing math it's not as obvious as it is when designing a circuit to do it.
@TheDataDaddi · 7 days ago
Hi there. Thanks so much for the comment! This is great information; thanks for sharing. I knew that floating-point ops were fundamentally different from other operations, though I was not 100% sure why. This makes a lot of sense! Also, if you know the correct formula for calculating FLOPS in general, please let me know.
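For what it's worth, the usual peak-throughput formula for NVIDIA cards is cores × clock × 2, since one fused multiply-add per CUDA core per cycle counts as two floating-point operations. A sketch using the Tesla P100's public datasheet figures (treat the clock as approximate):

```python
# Peak theoretical throughput = cores x clock x FLOPs per core per cycle.
# A CUDA core can issue one fused multiply-add (FMA) per cycle, which
# counts as 2 floating-point operations, hence the default factor of 2.

def peak_tflops(cuda_cores, boost_clock_ghz, flops_per_cycle=2):
    # cores * GHz gives giga-FLOP/s; divide by 1000 for tera-FLOP/s
    return cuda_cores * boost_clock_ghz * flops_per_cycle / 1000

# Tesla P100: 3584 CUDA cores, ~1.48 GHz boost (public datasheet figures)
print(f"P100 FP32 peak: {peak_tflops(3584, 1.48):.1f} TFLOPS")  # -> 10.6
```

This is a theoretical ceiling; real workloads land well below it, which is why measured benchmarks still matter.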
@kidrock777 · 2 months ago
I'm watching this as a newbie from a hotel room on my laptop with sub-par speakers. Just 1 request from my unique context would be to amp up the volume on future uploads so it is easier to listen when in similar situations.
@TheDataDaddi · 2 months ago
Hi there. Thanks so much for the feedback. I will certainly do that in the future!
@kidrock777 · 2 months ago
@@TheDataDaddi hi. Thank you for the quick reply and I appreciate the willingness to up the volume. Just wanted to add. The audio issue was on my laptop’s end and roughly half way through the video, the volume issue fixed itself. Earlier they sounded like they were at half volume. I forgot to edit my comment earlier. Much love and support ❤️
@TheDataDaddi · 2 months ago
Oh okay, no worries at all. I really appreciate you reaching back out to let me know. You are awesome! @@kidrock777
@Sensorium19 · 4 months ago
Thank you for putting this information together. Reading up on some of the cards mentioned here in the context of deep learning, cooling issues seemed to come up a couple of times. People were talking about the Tesla M40, for example, noting it doesn't have enough native cooling to deal with constant loads from deep learning. Have you had any issues with that in your builds? EDIT: I see you have some videos about heat and cooling just after this. I'll take a look.
@TheDataDaddi · 4 months ago
Hi there. Thanks so much for the positive feedback! Yep, I was just about to suggest that video. I have not personally had any horrible issues with any of my GPUs overheating, but it certainly can be a problem. If you have any specific questions after the video, feel free to reach out, and I will do my best to help you.
@Sensorium19 · 4 months ago
@@TheDataDaddi Thank you. I'll let you know if I have any questions.
@TheDataDaddi · 4 months ago
Sounds good! @@Sensorium19
@goldholder8131 · 4 months ago
I'm just starting this video and I already know it's going to be good. Thanks a ton ahead of time!
@TheDataDaddi · 4 months ago
Hi there. So glad that you are enjoying the content. Hope it helps you!
@goldholder8131 · 4 months ago
@@TheDataDaddi You literally changed my life. I could have gone off the deep end and bought a top-tier $1000+ gaming GPU, but something like that is just not ideal for the kind of AI stuff I'm interested in like SDXL and high-end 70b LLM's. This video is definitely going to be a re-watch to assure I retain the information you've kindly shared in the video! Thanks again!!
@TheDataDaddi · 4 months ago
So glad this was helpful to you man! I am big proponent of buying what you need for your specific application. Please let me know if I can help in any way. @@goldholder8131
@iskiiwizz536 · 1 month ago
Thanks for the work man ! Love from France
@TheDataDaddi · 1 month ago
Hi there. Thanks so much for the comment. So glad you enjoyed the video!
@vap0rtranz · 29 days ago
I made a spreadsheet too, but yours is thorough! I'd also come to similar conclusions as you: that the P40 / P100 were cheap ways to get medium size LLM models into a GPU with decent tokens/second. Your spreadsheet would have saved me time if I'd known about it! At least there's some independent confirmation of your conclusions. There's a lot of detail to add, like how fast/slow models are on certain GPUs ... perhaps another vid on that to save me the effort? :P
@TheDataDaddi · 29 days ago
Hi there. Thanks so much for the comment! I am glad to hear you can confirm! Especially as hardware prices keep increasing, I think these are actually becoming even more relevant for those who are budget conscious. Funny you mention this: I am actually working right now on a benchmarking suite to enable reliable comparison between GPUs for different models. There is not a reliable open source benchmarking solution for GPUs, so I am trying to create one (or at least make steps toward it). As soon as I have something decent, I will make a video series on it and start using it to benchmark GPUs in a real way with respect to individual models.
@samjco · 3 months ago
It would be nice to add in energy consumption, heat vs. cooling, durability, and a real-life comparison using an actual local LLM system (and which LLM size) with these cards. For instance, the best card for single-use performance vs. multi-use performance in a local LLM system like privateGPT, for example.
@TheDataDaddi · 3 months ago
Yep, this is coming eventually. Things like that just take a while to do correctly. I am working right now to create a set of comprehensive benchmarks that the community can use to evaluate GPU performance across the major AI/ML/DL areas.
@SunandSnowGardening · 3 months ago
Thanks!!!
@GusKesaranond · 4 months ago
Hi, I'm a deep learning beginner with a 4070 Super (12GB) looking to potentially add more GPU(s) to the 2 empty slots on my GIGABYTE TRX50 AERO D motherboard. I'd like to have more VRAM for no more than a $1000-$2000 budget. What are some of my options, and do I need to consider power consumption with a 1200W PSU, or other compatibility issues? I have a pretty good 32-core, 64-thread Threadripper CPU and 256GB of RAM. What are the benefits of those for ML? I originally bought this PC for 3D simulation use.
@TheDataDaddi · 3 months ago
So, I think the Threadripper CPU and mobo are wonderful for deep learning applications, just really expensive relatively speaking. They are really the only CPU and mobo combos I know of that offer more than 2 x16-lane PCIe slots. This means you can use more than 2 GPUs at full x16-lane bandwidth. Personally, I believe the best you can get right now in that price range is a pair of RTX 3090s with NVLink. I would scan eBay and wait for a good deal, then grab 2 when they are cheap or on sale. Hope this helps!
@djmccullough9233 · 1 month ago
I'd love to know your opinion of the modified Nvidia P102-100s with 10GB of VRAM being sold for about $50-$60 on eBay, since they have no display outputs. They are basically 1080 Tis with a bit of performance nerfing. They have no display outputs, but seem like they'd be ideal to drop into a system with an existing AMD GPU just to provide CUDA acceleration. Or perhaps multiple cards?
@TheDataDaddi · 1 month ago
I responded to you on the other comment, but in case you see this one first. Short answer. I think this would be a great way to go provided they work!
@johnireoluwababalola629 · 2 months ago
The effort you put into this video is just mind-blowing. I subscribed immediately 😁
@TheDataDaddi · 2 months ago
Hi there. So glad you enjoyed this video and thank you very much for subscribing! I will do my best to continue to make great content for you.
@iluvpopuplights · 5 months ago
What do you think about the Titan V? I didn't see it listed.
@TheDataDaddi · 5 months ago
Yeah, I did not cover it, unfortunately. I guess I missed it somehow. Anyway, I would say that it is a great GPU. However, the best price I can find, even used, is about $489.99 at the moment. That is still about double what you can get the P100 or P40 for at this point, and the performance gain is not that much greater for the money (11.7 TFLOPS vs 13.8 TFLOPS for single precision; 4.6 TFLOPS vs 6.9 TFLOPS for double precision). So overall I would still recommend the P100 if you need higher double-precision performance, or the P40 if you care more about single precision and more VRAM. With that said, I think this is a fine GPU and one of the best you can get in the $500 price range. It also has a newer architecture, so it will be relevant for longer. At the end of the day, it really depends on your budget, project needs, and how long you need it to last. Hope this helped! Please let me know how it works for you if you decide to get it!
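One way to make this kind of comparison concrete is dollars per TFLOPS. A quick sketch using the single-precision figures quoted above plus assumed used prices (~$490 for the Titan V, ~$250 for the P100/P40; the P100's FP32 number is an assumption from public specs, and eBay prices fluctuate, so treat this as a snapshot):

```python
# Dollars per FP32 TFLOPS, using the throughput numbers quoted above and
# assumed used prices (eBay snapshot; prices fluctuate constantly).

gpus = {
    # name: (approx. used price in USD, FP32 TFLOPS)
    "Titan V": (490, 13.8),
    "Tesla P40": (250, 11.7),
    "Tesla P100": (250, 9.3),  # FP32 figure is an assumption, not from the reply
}

for name, (price, tflops) in gpus.items():
    print(f"{name}: ${price / tflops:.0f} per FP32 TFLOPS")
# -> Titan V ~$36, P40 ~$21, P100 ~$27: the Pascal cards win on price-performance
```

The ratio, not the raw TFLOPS, is what drives the P40/P100 recommendation in the reply.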
@jaredisaacs7626 · 2 months ago
Loving the content! Looking forward to the next video!!!!
@TheDataDaddi · 2 months ago
Hi there. So glad you are loving the content! Thanks for the support!
@samjco · 3 months ago
I heard that Tensor Cores are most important for AI/ML, and then CUDA cores are second most important. I don't think I've seen Tensor Cores referenced on your sheet.
@TheDataDaddi · 3 months ago
Yes, they are definitely a consideration. Admittedly, I should have included those stats as well, and I will likely go back and add them. However, in most cases a higher number of CUDA cores also translates into a higher number of Tensor Cores for those GPUs that have them.
@user-uh8po2sx6y · 1 month ago
Is there a big difference in performance and speed in AI tasks like Stable Diffusion between the RTX 4080 Super and RTX 4090? Which one should I buy, given that I seldom play games? Or should I wait for the 5090 at the end of the year? I am not a video editor and don't hold any job related to designing or editing; I'm just a casual home user.
@TheDataDaddi · 1 month ago
Hi there! Thanks so much for the question. At this moment, for Stable Diffusion related tasks, I would go with the 4080. 16GB of VRAM should be enough to comfortably handle pretty much all Stable Diffusion tasks at this point (to my knowledge), and its performance is 60% of what you get with the 4090 for less than half the price. All that said, if you are not pressed for time, I would probably wait to see what happens to the market when the new 5000 series GPUs arrive. It could bring prices down for the 3000 and 4000 series as people dump their older GPUs in favor of the latest and greatest. This approach is always a gamble, though, so if you prefer a safe bet, I would look for good deals on a 4080 and not worry too much about what will happen down the road.
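The value claim in that reply reduces to a simple ratio check (both inputs are the reply's own figures, not measurements; the half-price figure is taken as an upper bound):

```python
# Relative value check: ~60% of the 4090's performance at less than
# half its price implies more performance per dollar from the 4080.
perf_ratio = 0.60   # 4080 vs 4090 throughput, figure quoted in the reply
price_ratio = 0.50  # "less than half the price", taken as an upper bound
print(perf_ratio / price_ratio)  # -> 1.2, i.e. ~20% more performance per dollar
```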
@benbencom · 5 months ago
Nice work. I think the only big flaw I see in your analysis is that purchase price is not the entire upfront cost. Each card should have an overhead cost based on the fraction of a chassis, mobo+cpu, and PSU it would use. I think you just had a chassis for 2 cards as a sunk cost in your mind so it didn't matter. But for anyone building a full system (or systems) it would have a big impact on their purchasing decision.
@TheDataDaddi · 5 months ago
Hi there! Thanks for the kind words and the feedback! This is a good point. I was doing all this research to figure out what to put in my Dell PowerEdge R720, so my analysis is a bit biased in that way, I suppose. My assumption was that people would be able to compare GPUs most easily by looking at purchase price to performance. However, I agree that for those building or using different servers, this might shift the total cost significantly. For example, if I wanted to work with RTX 4090s (even in my server), I would have to buy external PSU(s) to power them. That would likely mean building an entire external rig, which would definitely increase the overall cost. These are definitely things to consider when thinking about your build. I appreciate you bringing that up, and I will keep it in mind for any future analysis. Thanks again!
@noth606 · 8 days ago
@@TheDataDaddi It is a helpful comparison but as with everything else, no tool is perfect in and of itself, it may be the perfect tool for one job but not another, or at a certain scale. Mostly for me it reinforces my previous idea which is to aim for 4090 as the most versatile and best fitting for my usecase, also because it isn't a one trick pony and can do other things as well. Some of the other options do come close but do not 'cut the mustard' for one reason or another, primarily that it needs to fit in my one rig which can only take 2 GPU's max physically but then power becomes a limit since 1200w is the biggest PSU that makes sense since above it you run into breakers popping and wiring becoming a factor etc. Every possible factor can't be realistically accounted for or factored in, in a spreadsheet. The 'slot cost' I think would be valuable for many to be able to add to it, since some uses do not require much from the platform itself while others very much do.
@TheDataDaddi · 7 days ago
@@noth606 You are definitely right. One of the reasons I am trying to focus more on hardware at this stage of my journey is that it is so nuanced. For best results, it really should be thought about on a case-by-case basis. This was just meant as a way to show people generally how to start thinking about GPUs for their particular use case, and hopefully lessen the research burden a little for anyone interested.
@noth606 · 6 days ago
@@TheDataDaddi I'm sure this is helpful to get people to start thinking on more concrete terms when they get to a point at which they want to put together hardware specifically to "crunch numbers" rather than just using a typical general use PC configuration. I would think that the journey in a certain sense can or might be split into 3 "stages" conceptually where stage 1 is a normal PC, stage 2 is a PC built to "crunch numbers" but at a general level and stage 3 is a problem/usecase specific PC or set of hardware designed around the exact specific task they are meant to tackle, where for stage 3 you would include efficiency calculating "work units/sec/price/watt" type factors into it. It's important to factor in time and energy on some level because they are of a slightly different nature than calculations per second are conceptually. What I mean is they are inflexible, you don't have access to unlimited amounts of either regardless of other factors. Saying this not to be patronizing, but as someone who has at times forgotten which parts of this kind of equation are inflexible 🙂 and paid a price for that. Ask my ex wife, she'd have tips and examples of the inflexibility of time and energy for sure, lol.
@DCX1 · 4 months ago
This is a badass video. Thank you.
@TheDataDaddi · 4 months ago
Really appreciate the feedback. So glad it helped you.
@vampritt · 20 days ago
omg so detailed.. thank you for your time.
@TheDataDaddi · 17 days ago
Hi there. Thanks so much for your comment. So glad that the content was useful for you!
@MrKhapp · 3 months ago
Hi, I am new to the idea of learning about ML/AI. I appreciate your video and am contemplating piecing together a budget-friendly system to start learning with. In the past, I was able to purchase some used crypto mining rigs from a person who was getting out of crypto mining. I parted most of the systems out and made a profit, but I kept a couple of crazy 8-GPU motherboards. My first question: are there any restrictions that would prevent one from using multiple GPUs (more than 2, which seems more common)? My second question: is there a certain GPU that would make sense from a budget standpoint, where having multiple of them would be more beneficial than one or two standard GPUs? I would think a system running 8 x Tesla M40 with a total of 96GB of VRAM would be better than a system running 1 or even 2 3060s with 12GB or 24GB of VRAM. I look forward to hearing your response if you find the time; I appreciate your time in advance!
@TheDataDaddi
@TheDataDaddi 3 months ago
Hi there. Thanks so much for the great question. So glad that you have found this video helpful!
QUESTION 1: There are a couple of considerations here:
1. Unless you shard or split the model itself, you are going to be limited to the smallest memory size among your available GPUs when loading and training models. For example, four Tesla P40 GPUs give you 4x24 GB of VRAM total, but in many cases people end up using data parallelism by default, and the model must be loaded onto each GPU. This speeds up training because you can process more batches in parallel across all GPUs, but it does not let you load larger models. There are also many ways to do model parallelism, in which you break parts of your model across different GPUs and process the data in a pipeline-like fashion. I have never had to do this myself, so I cannot get into particulars, but it is definitely possible. From what I understand, though, it is much more involved because it requires you to logically partition your model in a way that makes sense and assign layers or segments of it to the various available GPUs. All of this is a long-winded way of saying that if your model is not easy to parallelize, you may be limited in the size of model you can load and use, even if you have many GPUs.
2. Every motherboard has a maximum number of PCIe lanes it can support, so even if a motherboard has slots for 8 GPUs, it may not support them all at the full x16 lanes required for each GPU to use its full bandwidth. This is okay in many cases because the GPUs will still work, but it will limit performance and may cause a bottleneck if you have a ton of data IO.
3. GPU form factor. This is a lot more important than I originally thought: each GPU manufacturer may have slightly different dimensions and specs. For example, the RTX 3090 Founders Edition is physically quite different from the RTX 3090 Zotac I bought. The Founders Edition might fit in one of my servers, but the Zotac was much larger and would not. While this is not a major consideration, you definitely need to confirm that the GPU will fit in or on whatever chassis/mobo you are thinking about working with.
QUESTION 2: This one is also a bit murky because it depends on what you will eventually be doing and the size of the models you will be working with. That said, I will recommend what I would do. If you want to fill up one of your 8-GPU mobos, I would suggest 8 Tesla P40 GPUs. They are ~$200 a piece and great in terms of price to performance. Having this many GPUs lets you use an immense amount of data parallelism to train and test faster for models that fit inside 24GB of VRAM, and it also gives you the ability to split very large models across all 8 of your GPUs. This theoretically would be enough to earnestly start playing around with some of the largest open source LLMs currently available. In addition, you can section off subsets of your 8 GPUs for different models, or for training/testing different versions of your models all at once. Personally, I find that last point invaluable in my research. Finally, these GPUs are cheap enough to add slowly over time: you can buy them one at a time as you have the funds, rather than shelling out thousands for a single GPU. If you value performance over the flexibility mentioned above and have a bit more cash available, I would probably go with 2 RTX 3090s with NVLink. I have not been able to test this setup, but I think it should be excellent in terms of performance. You could also add RTX 3090s over time as funds allow. In summary: if you want a scalable mid-to-high-end rig where performance is your main concern, I would go with the dual RTX 3090s with NVLink.
If you value flexibility and have less cash to invest upfront, I think the P40 route is a great way to go. Very excited to hear that you are starting your journey, and I am glad I can help you along the way. I hope this helps, and if you have any other questions along the way, please feel free to reach out!
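The data-vs-model parallelism trade-off described above can be sketched in plain Python. This is purely conceptual (no real framework calls); the layer names and sample counts are made up for illustration:

```python
# Conceptual sketch of data vs. model parallelism. A "model" here is just
# a list of layer names and a "batch" a list of samples; no GPUs involved.

def data_parallel_split(batch, n_gpus):
    """Data parallelism: every GPU holds the FULL model,
    and the batch is split into one chunk per GPU."""
    return [batch[i::n_gpus] for i in range(n_gpus)]

def model_parallel_split(layers, n_gpus):
    """Model parallelism: the LAYERS are partitioned across GPUs,
    and each batch flows through them in a pipeline-like fashion."""
    chunk = -(-len(layers) // n_gpus)  # ceiling division
    return [layers[i:i + chunk] for i in range(0, len(layers), chunk)]

batch = list(range(8))                         # 8 samples
layers = ["embed", "block1", "block2", "head"] # hypothetical 4-layer model

print(data_parallel_split(batch, 4))    # 4 chunks of 2 samples each; model copied to every GPU
print(model_parallel_split(layers, 2))  # 2 groups of 2 layers; one model split in half
```

This makes the VRAM point concrete: data parallelism copies the whole model onto every card (faster training, same max model size), while model parallelism splits the model itself (bigger max model, more implementation work).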
@jubayehossainarnob6069
@jubayehossainarnob6069 6 months ago
Amazing. Thanks a lot, bro. If you had to choose between the 3060 12GB and the 4060 Ti 16GB, which one would you go for?
@TheDataDaddi
@TheDataDaddi 6 months ago
Thanks so much for the kind words! So glad this video was helpful. I found it extremely helpful myself in choosing GPUs for my own projects. Hmm, it depends on cost, as you may be able to find some great Black Friday deals, but if you want to get the most performance for your money, I would go with the 3060. However, realistically it's not that much more than the 4060. Overall, though, I am unimpressed with the 4000-series GPUs compared to the 3000 series. If it were me, I would buy the 3060 (unless you really need the larger VRAM) and wait until NVIDIA releases the next series of GPUs, then consider those options. On a personal note, I use the 3060 in my daily driver PC as a test bed for models I want to scale up to run on my home lab servers. It works really well for that, and I have not had any issues loading and working with fairly decent-sized models.
@jubayehossainarnob6069
@jubayehossainarnob6069 6 months ago
Thanks a ton, bro @TheDataDaddi
@TheDataDaddi
@TheDataDaddi 6 months ago
Sure, man! Glad I could help @jubayehossainarnob6069
@zZTrungZz
@zZTrungZz 4 months ago
4060 Ti 16GB. More memory is better and is the first priority.
@TheDataDaddi
@TheDataDaddi 4 months ago
Probably the way I would go as well. @zZTrungZz
@averystech8218
@averystech8218 3 months ago
Awesome! Very helpful!
@TheDataDaddi
@TheDataDaddi 3 months ago
Thanks so much for the kind words!
@dragonmaster1500
@dragonmaster1500 4 months ago
Hello! First off, I want to thank you for the well-researched and well-presented explanation of why GPUs are important for machine learning, and for the comparison of the major GPU lines you are familiar with. I'm currently in the middle of doing research for building a creator PC for GIS work (cartography/data science/computer science/programming/machine and deep learning/etc.) with photo editing and gaming as a side benefit. I'm looking for a GPU that can handle a wide variety of tasks, with a focus on visualization and processing of high-resolution RGB imagery, high-resolution 4+ band multispectral imagery, hyperspectral imagery, and LiDAR (from unmanned aerial vehicles), as well as machine learning and deep learning tasks. I was hoping to play around with the settings in the tool you provided; however, I was not able to get Power BI working, and I'm too lazy to spend time figuring out how to set the program up properly. I was already looking at purchasing a 4070 Ti for my build, but would you say that a 3080 Ti, a 4080-series card, or a 4090, if I can find one for a good price, would be a better choice?
@TheDataDaddi
@TheDataDaddi 4 months ago
Hi there. Thanks so much for your feedback and the question! Since you also want to do photo editing and gaming as well as AI/ML/DL applications, I think the RTX family of NVIDIA GPUs is definitely the right way to go. I think the 3 main questions here are: 1) What is your budget? 2) How large are your expected datasets? 3) Do you plan on expanding the number of GPUs in the future? Based on your needs and the price points you're considering, my recommendation leans towards the RTX 3090 or 3090 Ti, especially if you can find a compelling deal. These GPUs offer exceptional value around the $1,000 mark for your specific applications. Moreover, it's possible to find them at even lower prices, approximately $800, on platforms like eBay with a bit of patience. Their price-to-performance ratio is among the best in this price range, making them a highly attractive option. A significant advantage of these last-generation NVIDIA RTX GPUs is their support for NVLink, which, in my opinion, offers a notable benefit over the more powerful 4090 for your use case. Starting with a single 3090 or 3090 Ti allows for a robust setup. As your requirements expand and your budget allows, you can further enhance your system by adding another GPU and linking them with NVLink. This approach provides a scalable and highly effective setup for a wide range of tasks.
@dragonmaster1500
@dragonmaster1500 4 months ago
@TheDataDaddi Thanks for the response. It's pretty hard to find information about creator PC building, especially if you're not doing video editing, so I really appreciate it. I'll definitely be keeping my eye out for a 3090 or 3090 Ti at a good price. The datasets I'm working with right now vary pretty wildly in size depending on the study area and which sensors we're using to collect data (a 10-band multispectral image will be larger than an RGB image and much smaller than a LiDAR point cloud, which is in turn smaller than a hyperspectral profile, for example). Anyway, thanks again for the information! I hope you have a great rest of your day!
@dragonmaster1500
@dragonmaster1500 4 months ago
@TheDataDaddi I just checked the current retail prices for the 3090 and 4090 and was reminded that the reason I was looking at the 4070 Ti was good old sticker shock. It's less than half the price of even a 3090 (the cheapest 3090 at retail is about $2100.00 Canadian, compared to the 4070 Ti's $899.00 Canadian) here in Canada, where everything is 1.4x more expensive due to the exchange rate. I guess that's the price we pay for all that extra VRAM. I'll still be keeping my eyes open for a deal, but I may end up buying a less expensive graphics card in the meanwhile. I can always upgrade in the future, and when I do, the graphics card I buy now can be recycled into part of a home server.
@TheDataDaddi
@TheDataDaddi 4 months ago
Of course! Glad you found it helpful. Yeah, one of the reasons I focus more on hardware on this channel is that not many other people out there cover that aspect of machine learning, and it is super important. @dragonmaster1500
@TheDataDaddi
@TheDataDaddi 4 months ago
Yeah, the sticker shock with GPUs is really hard to take sometimes, lol. Check out this one on eBay. I bought 2 recently. They are a pretty good deal if you are willing to use refurbished equipment (almost all of my equipment is used, bought from eBay or similar). I think it would be about $1200 CAD. www.ebay.com/itm/155867314803?epid=28044609256&hash=item244a6a7a73:g:CnMAAOSwmxNlRStE&amdata=enc%3AAQAIAAAA0GhaLrApc303M8MFhLKXaC1XCZUsnm98lj%2BZeFSruH9oJCFANdXBqU29SOoKs%2BWXGvlPyaIiK5XaubhTqwcQcesmE5FwiNLe0DFWbTLSQ%2FedCQeh%2FYGwxBressF0aNTusfEfh6%2FPh2A%2FG7Uz%2B%2FxEz5CVwvRLABldqDMSoIn%2BM32M3Spzp9f5vb9qFjFE3B7TxotPhewTVPG5AlHyBpu4J07YixG%2FvLiZ2XJDt4nOaaDYjXWNF89%2F8WSbWK8TIBBumuk1germV%2BC3pNIkixMDAGA%3D%7Ctkp%3ABFBMpqHIra5j Yep, definitely agree here. That might be the better way to go for now. The best GPU is always the one that fits in your budget. Lol. @dragonmaster1500
@sun_ada
@sun_ada 3 months ago
Thanks for the guidance, man. Can you help me with this: the 4060 Ti 16GB is the same price for me as the 4070 Super. Which is the better card for DL? I want to get into AI, so I wanted a GPU to get me going.
@R1L1.
@R1L1. 3 months ago
Again, that would depend on you. If your work does not require 16 gigs of VRAM, then go for the 4070 and it will cut the training times.
@TheDataDaddi
@TheDataDaddi 3 months ago
Hi there. So glad the video could help! Congrats on the start of your journey! If it were me, I would go with the 4060 Ti. I almost always go for the GPU with more VRAM in the context of AI/ML/DL. I think this will suit you better as you grow to larger and larger models. Good luck! Feel free to reach out if you have any questions along the way.
@sun_ada
@sun_ada 3 months ago
Oh, thanks for replying. Well, in the meantime I bought a much cheaper 12GB 3060 that's almost 1/4 the price of a 4060 Ti. In the future, I am planning to buy a 12GB 4070 Super as well to bring the total VRAM to 24GB. How good is this strategy? @TheDataDaddi
@joaovitorsantos4174
@joaovitorsantos4174 4 months ago
Excellent content! Congratulations! I had a question: why didn't you include AMD GPUs? Would that change anything?
@TheDataDaddi
@TheDataDaddi 4 months ago
Hi there. Thanks so much for the positive feedback! I have not included AMD GPUs here because, in my knowledge and experience, AMD GPUs are not well suited to machine learning. The AMD drivers are really buggy and make using them for machine learning a huge pain. NVIDIA, unfortunately, is the only manufacturer I trust at this point for AI/ML/DL GPUs. That said, as I have time in the next few months, I am going to experiment in depth with an older AMD GPU I recently bought to see what the current limitations of the AMD drivers are. I will report my findings in a future video. Also, if the results are good, I will make another video just like this one but include AMD GPUs.
@galasitygaming
@galasitygaming 3 months ago
@@TheDataDaddi Yes please!!
@huannguyentrong1249
@huannguyentrong1249 4 months ago
Thanks; this is very detailed, and you clearly put a lot of hard work into making this video. What do you think of using two 3060 12GBs for deep learning? I'm planning to add one more 3060 to my current setup (i5 12400F, 3060 12GB, 700W PSU, and my motherboard does support multiple GPUs). I saw your other videos about the P40 and want to build like you, but it has no FP16 compute, so I'm not sure it's worth it. My budget is around $200-300, so I may add a used one if you recommend it.
@TheDataDaddi
@TheDataDaddi 4 months ago
Hi there. Thanks so much for the positive feedback and for your question. I think that is a fine place to start. I started with one 3060, and it was great; two would be even better. In that price range, if going with a Tesla P40 or P100 is not an option or you need something more general purpose, I think the 3060 is the best way to go.
@huannguyentrong1249
@huannguyentrong1249 4 months ago
@TheDataDaddi Thank you so much. My concern is: is a 700W PSU enough for those GPUs at full workload?
@TheDataDaddi
@TheDataDaddi 4 months ago
The TDP of one RTX 3060 is 170W, so that would be 340W total. In theory this should work, but it really depends on all the other components in your build. I would recommend using a website like pcpartpicker.com to spec out your current build and then see if your current PSU can support all the components. That is what I normally do. @huannguyentrong1249
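The kind of back-of-the-envelope check above can be scripted. This is a minimal sketch; the 170W RTX 3060 TDP comes from the reply, while the CPU and "everything else" wattages are illustrative assumptions, not a real parts list:

```python
# Rough PSU headroom check. Component wattages are illustrative
# placeholders; look up the actual TDP/turbo figures for your parts.
def psu_ok(component_watts, psu_watts, headroom=0.2):
    """Return (total_draw, ok), where ok means the PSU covers the
    estimated draw plus a safety headroom (20% by default)."""
    total = sum(component_watts.values())
    return total, total * (1 + headroom) <= psu_watts

build = {
    "rtx_3060_x2": 2 * 170,   # 170 W TDP each (from the discussion above)
    "cpu": 117,               # assumed CPU max power draw
    "mobo_ram_ssd_fans": 75,  # rough allowance for everything else
}

total, ok = psu_ok(build, psu_watts=700)
print(total, ok)  # 532 True -> ~532 W estimated draw fits a 700 W PSU with headroom
```

The 20% headroom factor is a common rule of thumb, not a hard spec; transient GPU power spikes can exceed TDP, which is why a tool like PCPartPicker is still the safer check.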
@user-np2te6tk9z
@user-np2te6tk9z 4 months ago
Hi, I noticed that the Tesla V100 is around $1400. Where did you find the one for $670?
@TheDataDaddi
@TheDataDaddi 2 months ago
Hi there. All of my pricing is pulled directly from eBay. I normally try to find the lowest-cost reputable seller. GPU prices have been skyrocketing recently, so that deal probably no longer exists, unfortunately.
@durden0
@durden0 2 months ago
Hey, please consider adding the 4070 Ti Super 16GB model to your spreadsheet!
@TheDataDaddi
@TheDataDaddi 2 months ago
Good call! I will update it as soon as I get a chance.
@Vay0
@Vay0 5 months ago
For Stable Diffusion, which GPU is the best price/performance?
@TheDataDaddi
@TheDataDaddi 5 months ago
Hi there. Thanks for the question. What is your price range for GPUs? The best price/performance overall is going to be the P40 and/or P100, but if you care more about performance and are willing to spend a bit more, I could recommend something more performant. Hope this helps!
@Marioz773199
@Marioz773199 3 months ago
The work you did to put this together is very much appreciated. Thank you for the thorough and thoughtful analysis!
@TheDataDaddi
@TheDataDaddi 3 months ago
Hi there. So glad this was able to help you!
@cattnation6257
@cattnation6257 3 months ago
@TheDataDaddi Is the MSI RTX 4070 Ti Ventus 3X OC 12GB GDDR6X a good graphics card? Please help me; I am thinking of buying the Black Diamond 2.0 from Dell, which comes with this GPU. Please give me some suggestions.
@cattnation6257
@cattnation6257 3 months ago
I can only invest once, so can you please help me? My budget is $2500. @TheDataDaddi
@cattnation6257
@cattnation6257 3 months ago
Thank you in advance @TheDataDaddi
@TheDataDaddi
@TheDataDaddi 3 months ago
Hi there. Thanks for the comment. I think this would be a fine GPU overall. It would help to know your use case, but I would say it is a good GPU for most small to mid-size applications. Since this is a prebuilt machine, do you have any other options for GPUs? @cattnation6257
@b-ranthatway8066
@b-ranthatway8066 6 days ago
So would this mean I can't use a 7900 XT to make AI meme pictures? I've actually been interested in the whole AI thing, even though I'm not a smart dude on tech. (I find it cool just because I can use my computer for something other than gaming/streaming/video editing, but I'm going to try to resist a little against our developing AI overlords, lol.) I know the tech is still developing, but I thought it would be cool to use AI to create a VTuber model to stream with. (Even if it came out bad, I thought it would be a fun little experiment to do for some views and laughs.) However, one of the hardest parts to upgrade, in my mind, was the GPU. I know AMD is a step or two behind Nvidia (my last card and current card is a 1070), but when it comes to price, it's hard to beat. I just didn't know if something like a 7900 XT or XTX would at least make up for it vs a 4070 Ti Super in terms of AI generation. (I still have no idea what app to even use to make stuff with AI on my GPU.) Alright, enough rambling with the thoughts in my brain, I'll keep watching 👌
@YosuCadilla
@YosuCadilla 4 months ago
Where do you buy the P100s and P40s?
@TheDataDaddi
@TheDataDaddi 4 months ago
Hi there. Thanks so much for your question. I normally buy most of my tech gear from eBay. I try to find refurbished used gear at a good price from a reliable seller. It hasn't failed me yet. I highly recommend this approach because new hardware in this area is incredibly expensive.
@capturingmoments20s
@capturingmoments20s 4 months ago
Sir, which do you think I should get: the RTX 3060 12GB, RTX 3070, or RTX 4060? My budget only suits these GPUs.
@TheDataDaddi
@TheDataDaddi 4 months ago
Hi there. Thanks for your question! I would definitely go with the RTX 3060 in this case. The extra 4GB of VRAM will make a ton of difference. Best of luck!
@wood6454
@wood6454 5 months ago
Hey man, do you know of any online resources on the NVIDIA T40? I'm planning on using it to serve LLMs, but I see nothing about it online.
@TheDataDaddi
@TheDataDaddi 5 months ago
Hey there. So, I was not able to find anything official from Nvidia, but I did find this online: www.techpowerup.com/gpu-specs/tesla-t40-24-gb.c3942 It might not be as reliable as actual documentation from Nvidia, though, so I would take it with a grain of salt. From what I can tell, it's a solid card, but a bit expensive for the performance. I would compare it to the P40 before going that route to see if the price justifies the performance gains for your use case.
@wood6454
@wood6454 5 months ago
@TheDataDaddi I didn't buy the T40 due to the lack of documentation; I can't even find the driver for it online. I ended up buying P100s, as the P40 can't do GPU-only inference, which is what I need.
@TheDataDaddi
@TheDataDaddi 5 months ago
Gotcha. Yeah, I think that is a solid choice for sure. It sucks that the documentation is so sparse for the T40, though. @wood6454
@cyklondx
@cyklondx 20 days ago
I think you missed some parts: actual performance per model, and whether one can use FP32, FP16, INT8, or tensor cores. The P40 is a terrible option for any AI workload due to the amount of time one would have to wait... and its power requirements.
@TheDataDaddi
@TheDataDaddi 17 days ago
Hi there. Thanks so much for your comment! I would agree that for anything below FP32, these GPUs are quite slow. However, the GPU costs less than $200 for 24GB of VRAM. So, if you want to experiment with larger models cheaply, I think these GPUs still offer good value.
@ntn888
@ntn888 6 months ago
Amazing! Thanks.
@TheDataDaddi
@TheDataDaddi 6 months ago
Glad you liked it!
@sushmaanilsingh4489
@sushmaanilsingh4489 1 month ago
4:02 But sir, LLMs like ChatGPT are trained on multiple GPUs, right?
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there! Thanks for the comment. For LLMs like GPT-3, the resources required to train or even pre-train such models from scratch are beyond the reach of most individuals and many companies. This is due to the immense computational power and data handling capacity needed. For example, GPT-3 was basically trained on most of the internet (Common Crawl, WebText2, etc.), tons of full books, a snapshot of Wikipedia at the time, and more. Training was performed with literally thousands of high-end GPUs across various datacenters. Also, ChatGPT and other similar models are proprietary, so the lack of detailed specifics about their training processes and architectures makes it hard to know exactly what it would take to train something like GPT-3 (or a similar model) from scratch. What is accessible for most people and organizations is running inference using these models. "Inference" refers to the process of using a pre-trained model to make predictions, or to generate text in the case of LLMs. The feasibility of running inference smoothly depends largely on the amount of VRAM available, as larger models require more memory to operate efficiently. For instance, smaller versions of LLMs might run on a single GPU with 12 GB of VRAM, while larger models might require a GPU setup with significantly more memory. For those with more robust computing setups, such as advanced home labs or small to medium-sized enterprises, fine-tuning an LLM might be within reach. Fine-tuning involves adjusting a pre-trained model on a new dataset or for a specific task, which typically requires far fewer resources than full-scale training from scratch. This process allows users to tailor the model's responses to better fit particular contexts or industry-specific needs without the prohibitive cost of training a new model from the ground up.
The following Reddit thread is pretty useful for more details: www.reddit.com/r/MachineLearning/comments/15uzeld/d_estimating_hardware_for_finetuning_llm/ For fine-tuning, a setup with one or more high-end GPUs, such as the NVIDIA A100 or V100 (or the RTX 3090, as I advocate), would generally suffice. This allows for modification of large LLMs using varied sizes of data, making it a viable option for enhancing model performance on specialized tasks. In summary, while training large-scale LLMs from scratch is out of reach for most, leveraging these models through inference, or fine-tuning them for specific applications, is quite feasible with the right hardware setup. This opens up opportunities for a wide range of applications, from personalized AI assistants to sophisticated data analysis tools, even for smaller organizations or dedicated individuals with the appropriate resources.
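As a rough rule of thumb for the "will inference fit in my VRAM" question above, memory is dominated by parameter count times bytes per parameter, plus some working overhead. This is a sketch only; the 20% overhead factor is an assumption (activations and KV cache vary with context length), not a measured figure:

```python
def inference_vram_gb(n_params, bytes_per_param=2, overhead=0.2):
    """Rough VRAM estimate for running inference:
    weights (n_params * bytes) plus ~20% for activations/KV cache.
    bytes_per_param: 4 = fp32, 2 = fp16/bf16, 1 = int8."""
    return n_params * bytes_per_param * (1 + overhead) / 1024**3

# A 7B-parameter model in fp16 needs roughly 15-16 GB, so it is too big
# for a 12 GB card, while the same model quantized to int8 (~8 GB) fits.
print(round(inference_vram_gb(7e9, 2), 1))  # 15.6
print(round(inference_vram_gb(7e9, 1), 1))  # 7.8
```

This is why the reply above ties feasibility so directly to VRAM: dropping from fp16 to int8 roughly halves the footprint, which is often the difference between fitting on a consumer card and not.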
@jaystannard
@jaystannard 29 days ago
This is a cool little project you did.
@TheDataDaddi
@TheDataDaddi 28 days ago
Hi there. I am so glad you enjoyed the content! I really appreciate the donation; it really helps the channel!
@mrbabyhugh
@mrbabyhugh 1 month ago
29:11 Haha, that's the exact card I am looking at. I'm actually comparing it with the Arc A770.
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. Thanks so much for the comment! I have also been interested in non-NVIDIA solutions, and the Arc GPUs have certainly interested me. However, I would caution you: if you leave the NVIDIA ecosystem, it is like going into the Wild West, so make sure you are prepared. Here is a Reddit thread that might shed some light: www.reddit.com/r/MachineLearning/comments/z8k1lb/does_anyone_uses_intel_arc_a770_gpu_for_machine/ If you do decide to go the Arc route, please let me know how it goes for you. I would be super curious to better understand where those GPUs are in terms of AI/ML/DL applications.
@dirtdabest
@dirtdabest 4 months ago
Amazing video! I wonder if you could advise me on my next setup... I'm a deep learning engineer who typically works in medical computer vision; however, I'm looking to build something for my home office. My budget is 3-4k for the entire setup. The datasets I use tend to be quite large, and 20GB of VRAM is probably the minimum in terms of model size. I've been looking at a 4090 prebuilt because, for that budget, I could get a nice spec with the option to do some occasional gaming. In an ideal world, I'd want more VRAM; 48GB would be amazing. I wondered about going for 2x 3090s using NVLink. What do you think about something like this? Thank you in advance!
@TheDataDaddi
@TheDataDaddi 4 months ago
Hi there! Thanks so much for reaching out. For your budget, I highly recommend going the 2x 3090 route. It is the best configuration I have found so far that balances price and performance for what I would consider mid-to-upper-end setups in terms of compute resources. I actually just bought 2x 3090s and am going to be making some videos exploring their performance with and without NVLink. This will also be good in your case because you are interested in occasional gaming. Unfortunately, for 48GB of VRAM at this stage you would have to go with the Tesla-series GPUs, and those break the bank even used. For right now, unless you absolutely have to have 40+ GB of VRAM, I think 2x 3090s with NVLink is the way to go for mid-to-upper-range computing projects.
@dirtdabest
@dirtdabest 4 months ago
@TheDataDaddi Thanks so much for your response! I'm having a tough time clarifying whether 2x 3090s with NVLink allow larger models to be loaded during training/testing... I understand data parallelization and how that is beneficial, but can I actually distribute the model in a way that allows me to experiment with really large models on a 2x 3090 setup? I use both TensorFlow and PyTorch and have seen forum posts going either way with regard to this setup. I would be really interested to hear your thoughts.
@TheDataDaddi
@TheDataDaddi 4 months ago
So, utilizing a dual RTX 3090 setup with NVLink indeed expands your capacity to experiment with larger models during training and testing. Typically, in such configurations, the entire model is loaded onto each GPU, and the remaining memory is allocated to data batches during training, testing, or inference. With each RTX 3090 offering about 24GB of VRAM, you should find this setup sufficiently robust for handling fairly large models (pretty much everything up to the mid-to-large open source LLMs). That said, it really depends on the size of the models you will be loading. I would recommend loading a typical model onto the CPU and checking memory utilization before and after; this should give you a good idea of how large the model actually is. Then you can see (a) whether the model in its current state fits into memory on both GPUs and (b) what kind of batch sizes you can work with. Smaller batch sizes, while manageable, could become a limiting factor in your workflow, potentially impacting training efficiency and model performance evaluation. From there you can decide whether you really need more VRAM. An alternative strategy (especially relevant with NVLink, because you have direct memory access between GPUs) involves distributing different parts of a model across the two GPUs, allowing batch processing to occur sequentially across both units. This method, albeit more complex, can effectively mitigate VRAM limitations by leveraging the combined memory more efficiently. However, it requires careful implementation and might not be suitable for all models or scenarios. It is also important to note that NVLink does not create a single unified memory space accessible by all GPUs: 24GB + 24GB of VRAM does not equal 48GB of VRAM the way it would on a single GPU.
Frameworks like TensorFlow and PyTorch support multi-GPU setups and offer varying degrees of support for model parallelism and data parallelism. For TensorFlow, strategies like tf.distribute.MirroredStrategy can be employed for data parallelism, which synchronizes training across the GPUs at each step. PyTorch users can leverage torch.nn.DataParallel or torch.nn.parallel.DistributedDataParallel for similar purposes. For model parallelism, where the model is split across multiple GPUs, PyTorch provides more explicit support through manual implementation, allowing you to define how different parts of the model reside on separate GPUs. In full transparency, I have not had to use model parallelism thus far, so I cannot comment more specifically on how to implement it; it will also vary from use case to use case. I think this would be a great video topic, though, and certainly something worth knowing as models continue to get larger and larger. I will try to make a video on it as soon as I get a chance. Anyway, I apologize for the long-winded answer, but I hope this response is useful to you! @dirtdabest
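The "what batch sizes can I work with" check above can be done on the back of an envelope. This is a sketch under stated assumptions; the model size, per-sample activation cost, and 1 GB CUDA-context reserve below are made-up placeholders you would measure for your own model:

```python
def max_batch_size(vram_gb, model_gb, per_sample_gb, reserve_gb=1.0):
    """Largest per-GPU batch that fits: VRAM minus the full model copy
    and a small reserve (CUDA context, fragmentation), divided by the
    activation memory each sample needs."""
    free = vram_gb - model_gb - reserve_gb
    return max(int(free // per_sample_gb), 0)

# Hypothetical numbers: a 6 GB model with 0.5 GB of activations per sample
# on a 24 GB RTX 3090. With data parallelism, each GPU holds the full
# model, so the effective batch across two 3090s is simply doubled.
per_gpu = max_batch_size(vram_gb=24, model_gb=6, per_sample_gb=0.5)
print(per_gpu, 2 * per_gpu)  # 34 68
```

This also illustrates the non-pooling point: two cards double the batch, not the maximum model size, since each GPU must still hold the full 6 GB model.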
@dirtdabest
@dirtdabest 4 months ago
Ah, this is a brilliant response! I am going to go for the dual 3090 setup! On reflection, the potential freedom of running a larger model is what's most important to me. Are there any other special considerations for a build like this? Thank you so much for your response; it's really cleared things up! I will be watching out for the video when it comes!
@TheDataDaddi
@TheDataDaddi 4 months ago
Glad to hear it! I think that setup will suit you well. The 3090s also hold their value well, so if the setup does not end up being what you need, you can easily get most of your money back out. The only thing I would say about the build: if you are going the server route, make sure you choose a server that will fit whichever RTX 3090 you choose. I might consider an external rig with PCIe extenders; I am doing this myself, actually, so I will have a video on it soon. You might consider that as well. External rigs are also much easier to keep cool. So glad I could help, and keep me updated throughout your journey! @dirtdabest
@chentecorreo
@chentecorreo 22 days ago
Excellent work!
@TheDataDaddi
@TheDataDaddi 17 days ago
Hi there. Thanks so much for the comment! Really appreciate your positive feedback!
@pixelslayertv7140
@pixelslayertv7140 1 month ago
What would you recommend for LLM inferencing: one RTX 4070 Ti Super or two RTX 4060 Ti 16GBs? I know there are a lot of things to consider.
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. Thanks so much for the question! In terms of theoretical performance, the 4070 Ti Super is about twice as performant. Since you are just interested in inference, I would say that in this case the better performance of the 4070 Ti Super will benefit you more.
@pixelslayertv7140
@pixelslayertv7140 1 month ago
@TheDataDaddi Thanks a lot for your answer. I think it's worth the better performance, since I don't get any performance benefit from having two RTX 4060 Tis besides more memory. In case I do some training at some point, I can still rent some hardware. In my case, I want a local LLM for work, where I cannot use anything connected to the internet. So I would use it primarily for inference, and if I need to optimize the model, that should only happen once or twice (hopefully).
@TheDataDaddi
@TheDataDaddi 1 month ago
@pixelslayertv7140 Sure! Yeah, unfortunately, since the memory pools of the two 4060 Tis would be separate, you really don't get much benefit even with more total VRAM. You may be able to get away with fine-tuning some of the smaller open source LLMs, especially if you look into quantization. However, my gut tells me you will have a hard time doing much beyond that with 16GB of VRAM. Like you said, though, you could always rent for the few times you do need access to more VRAM.
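To see why 16GB gets tight for fine-tuning specifically: with full fine-tuning and an Adam-style optimizer, each parameter carries a gradient and two optimizer-state values on top of the weight itself. The byte counts below are common rules of thumb, not exact figures for any particular framework, and activations would add even more:

```python
def training_vram_gb(n_params, weight_bytes=2, grad_bytes=2, opt_bytes=8):
    """Rule-of-thumb memory for full fine-tuning: weights + gradients
    + Adam moment estimates (two fp32 values per parameter)."""
    return n_params * (weight_bytes + grad_bytes + opt_bytes) / 1024**3

# A hypothetical 3B-parameter model in fp16: ~5.6 GB just to hold the
# weights for inference, but full fine-tuning needs ~33.5 GB before
# activations are even counted -> far past a 16 GB card.
print(round(3e9 * 2 / 1024**3, 1))      # 5.6  (weights only)
print(round(training_vram_gb(3e9), 1))  # 33.5 (full fine-tuning estimate)
```

This gap is also why quantization and parameter-efficient methods matter so much on consumer cards: they shrink the per-parameter cost rather than requiring more VRAM.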
@SyamsQbattar
@SyamsQbattar Ай бұрын
Which one is better, the RTX 3060 12GB or the 3060 Ti?
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hi there. Thanks so much for the comment! It really depends on your use case. If you plan on expanding to larger models, like some of the diffusion-related models, or want to use larger batch sizes, then 12GB of VRAM would be helpful. However, for most conventional deep learning models 8GB should be fine, and you will get faster training and inference speeds. Personally, though, I would probably go with the 3060 with 12GB. I almost always default to the GPU with the higher VRAM even if the performance is slightly worse. I would rather be able to load the model and have training and/or inference be slower than get out-of-memory errors and not be able to load a model I want to work with.
@volshebniyfonar
@volshebniyfonar 3 ай бұрын
Great job!
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Thanks so much for your positive feedback!
@RSV9
@RSV9 3 ай бұрын
Is there a current laptop with more than 8GB of VRAM that is recommended and does not cost $3,000? Is it better to wait for new processors or graphics cards that incorporate new chips specially built for AI models?
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Off the top of my head, I know that the 4090 mobile GPU has 16GB of VRAM. I do not know which laptops have it as standard; this link may help you: medium.com/@ibrahimcreative172/top-10-laptops-for-deep-learning-machine-learning-and-data-science-in-2023-f8a6ba861c4f I think it depends. For example, if you could wait until 2025 when the RTX 5000 series comes out, that might be worth it, as they will hopefully fix some of the shortcomings of the 4000 series. However, they will be super expensive when they first come out, so I normally prefer to go for older GPUs; I feel they are a much better value. Long-winded way of saying I would probably not wait and instead try to find value in what exists currently.
@RSV9
@RSV9 3 ай бұрын
@@TheDataDaddi Yes, I think I'm going to wait for new laptops to come out with better video cards, and maybe better processors with NPUs, which seem to be coming on strong.
@RSV9
@RSV9 3 ай бұрын
Now I have a gaming laptop even though I don't play, with an RTX 3050 Ti but only 4GB of VRAM. When I bought it in 2021, I didn't know that I was going to need more VRAM for Stable Diffusion. We'll see how things develop. Thank you so much.
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
I did some more research last night in this area, and I think this might be a good option for the mobile/laptop route. It seems NPU technology stands to make AI/ML/DL much more viable on laptops. @@RSV9
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Yeah, it would definitely be difficult to do much with Stable Diffusion on 4GB of VRAM. It might be worth upgrading to a GPU with 8 or 12GB of VRAM while you wait for a better new laptop. You can find some pretty good deals on eBay if you are patient. @@RSV9
@ufukyarsan2149
@ufukyarsan2149 3 ай бұрын
damn thats the analysis
@ToddWBucy-lf8yz
@ToddWBucy-lf8yz 3 ай бұрын
Personally I love the P40 at the moment; however, BE AWARE that the Pascal cards do NOT offer NVLink. For that you have to go to the Ampere cards. That said, you can operate the P40 in x8 PCIe mode without significant loss in performance. Not ideal, but if you have a consumer motherboard and are trying to fit one more GPU in there, this one might not be a bad choice for a Gen 4 board, where the increased PCIe bus speeds more than make up for the lack of full x16 access for this Gen 3 card.
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Hi there! Thanks so much for the comment. Yeah, unfortunately they do not seem to. Although, interestingly, they have a cutout on one side that looks like it was made for something NVLink-related. However, I have never been able to find anything that would link them together. Yeah, that is true: in most cases with consumer mobos you will be working with x8 PCIe, not the full x16, unfortunately. Like you said, though, for most people it shouldn't make that much of a difference.
@ToddWBucy-lf8yz
@ToddWBucy-lf8yz 3 ай бұрын
@@TheDataDaddi And if you are looking to water-cool, there will be challenges for the P40 as well. The PCB cutout is the same as the 1080, but with the rear plug on the P40 you will likely have to do some modding on a 1080 water block to make it work. The 40x40x28 15000 RPM fans I have on there scream at 100%, so be prepared to write some custom code to control the fan speeds. If you are interested, I'll push mine to GitHub and send you the link.
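The x8 vs. x16 point in this thread can be sanity-checked with quick arithmetic (theoretical maxima; real transfers land below these, and note that a Gen 3 card such as the P40 still negotiates Gen 3 signaling even in a Gen 4 slot):

```python
# Theoretical one-direction PCIe bandwidth per link. PCIe 3.0 runs
# 8 GT/s per lane and 4.0 runs 16 GT/s, both with 128b/130b encoding.

GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}  # GT/s per lane by generation

def link_gbps(gen: int, lanes: int) -> float:
    """Bandwidth in GB/s after 128b/130b encoding overhead."""
    return GT_PER_LANE[gen] * lanes * (128 / 130) / 8

print(f"Gen3 x16: {link_gbps(3, 16):.2f} GB/s")  # ~15.75
print(f"Gen4 x8 : {link_gbps(4, 8):.2f} GB/s")   # same ceiling as Gen3 x16
print(f"Gen3 x8 : {link_gbps(3, 8):.2f} GB/s")   # ~7.88, the P40-in-x8 case
```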
@maamla_boy5208
@maamla_boy5208 4 ай бұрын
Hi there. Can you suggest a midrange or budget CPU and GPU for an AI/ML/DL PC?
@TheDataDaddi
@TheDataDaddi 4 ай бұрын
Sure! What are you looking to do specifically in the AI/ML/DL space?
@maamla_boy5208
@maamla_boy5208 4 ай бұрын
I'm studying artificial intelligence and machine learning engineering in India. Are the Intel i5 14600K and RTX 4060 Ti 16/8GB good? If not, please suggest some alternatives 😊. My brother is also studying machine learning and deep learning. Can you suggest some laptops for him? I need a PC and he wants a laptop.
@TheDataDaddi
@TheDataDaddi 4 ай бұрын
Got it. The i5 is a solid choice and so is the 4060 Ti 16GB; I would not get a GPU for AI/ML/DL these days with less than 12GB. That would be my minimum. Also, for AI/ML/DL I would really not recommend a laptop. I think having a server or workstation you can SSH into or remote-access via a laptop is fine. However, I do realize that as a student a laptop makes a lot more sense. I will try to find a good solution for your brother. My first thought is one of the Mac M3 Pros. These are really expensive, though, and I can't really speak to how good they are from a machine learning perspective. I have an M1, and it has actually been pretty good; it has just definitely been buggy from time to time for machine learning. I actually just found this article; tell your brother to check it out. It has 10 options that vary in price, so he should be able to find something that fits his budget. If it were me, I would probably start by checking out the Mac M3 Pro first. As far as laptops go, I have always been impressed with Macs, minus the price tag. medium.com/@ibrahimcreative172/top-10-laptops-for-deep-learning-machine-learning-and-data-science-in-2023-f8a6ba861c4f Hope this helps! @@maamla_boy5208
@maamla_boy5208
@maamla_boy5208 4 ай бұрын
Thank you so much. Appreciate your efforts
@TheDataDaddi
@TheDataDaddi 4 ай бұрын
Sure. Glad to help!@@maamla_boy5208
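The "workstation you SSH into from a laptop" setup mentioned above can be sketched as a minimal ~/.ssh/config fragment (the host alias, address, and user below are placeholders, and the port forward assumes a Jupyter server on its default port 8888):

```
# ~/.ssh/config on the laptop
Host gpu-box                           # placeholder alias
    HostName 192.168.1.50              # placeholder workstation address
    User youruser                      # placeholder login
    LocalForward 8888 localhost:8888   # tunnel Jupyter to the laptop
```

With this in place, `ssh gpu-box` opens a shell on the workstation and makes a remote Jupyter session reachable at localhost:8888 on the laptop.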
@Travis-jl3wx
@Travis-jl3wx 2 ай бұрын
Thank you so much for the info, but I am so conflicted: the information in this video is great, but the pacing gives me severe anxiety. Nothing against you, it's my own issue, so this comment is not a complaint; it's to help others who possess an overactive brain and zero attention span. Watch the video at 1.75x speed while reading through the additional resources. This will cut the play time down to 23.1 minutes, which I admit is still a very long time given the amount of information, but it is good information, so it's worth it! This should keep your brain from losing interest during the loooong pauses. Again, great content.
@TheDataDaddi
@TheDataDaddi 2 ай бұрын
Hey there! Thanks so much for the comment. I really appreciate the honest feedback. I have been trying to get better about being more direct and to the point in my videos. I know that some of them are unnecessarily long. My assumption is that most people would likely watch at faster speeds or hop around rather than listen to me drone on and on. I do agree, though: I need to do a better job of keeping the audience engaged and make shorter videos with fewer long pauses. I appreciate the candid feedback, and I will work to do better on this in the future. Glad you at least thought the content itself was good!
@Travis-jl3wx
@Travis-jl3wx 2 ай бұрын
@@TheDataDaddi thanks for the response! believe me there was no hate, keep up the good work!
@TheDataDaddi
@TheDataDaddi 2 ай бұрын
@@Travis-jl3wx Of course man. I did not take it that way at all. I really appreciate the honest feedback. Something I just got to work on.
@Kirmo13
@Kirmo13 3 ай бұрын
This is exactly what I was looking for, although I was more inclined towards AMD GPUs.
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Yeah, unfortunately, I just don't feel that AMD GPUs are really that viable for machine learning at this point. I am experimenting with some to get a better picture of what machine learning is really like with AMD, and I will post more videos on this topic in the future once I get a chance.
@deneguil-1618
@deneguil-1618 4 ай бұрын
I'm currently debating between the AMD 7900 XT and the 4070 Ti Super for my upgrade. My PC would be a mix of gaming and development, including AI and DL in TensorFlow, and both GPUs fit my budget. The main differences are that the 7900 XT is generally faster in gaming than the 4070 Ti Super, that on Linux it's plug-and-play and the drivers are way better, and that it has more VRAM while being less expensive. Though, even if the 7900 XT is officially supported by tensorflow-rocm, I believe it'd still be quite a bit slower than the 4070 Ti Super, although I guess any GPU is better than no GPU.
@nabeelnasar7935
@nabeelnasar7935 4 ай бұрын
Have you seen the latest reviews? It's only around 2.5-3% faster, not to mention the driver issues you are likely to face once you go the ML route. I think the 4070 Ti makes a lot of sense when you bring in the ML factor.
@deneguil-1618
@deneguil-1618 4 ай бұрын
@@nabeelnasar7935 I have seen the latest reviews; there are several games where the 7900 XT is over 10% faster at 1440p, and it will have a longer lifespan too. Drivers are also not an advantage for Nvidia on Linux; they're infamously hard to get working and overall not good.
@TheDataDaddi
@TheDataDaddi 4 ай бұрын
Hi there. Thanks so much for your question. I have not yet experimented much with AMD GPUs. I actually just bought a cheap one this week so I can start figuring out to what extent AMD GPUs are currently viable for AI/ML/DL applications. I will make a video on this once I have thoroughly tested it. However, for now I cannot recommend AMD GPUs, because I have heard and read too many bad things about their driver issues for AI/ML/DL-related tasks. So my recommendation would be to stick with NVIDIA and go with the 4070 Ti for now if you want to dabble in AI/ML/DL at all. It's unfortunate, because AMD GPUs seem to have a lot more performance for the price (most of the time). Right now, however, I still see AMD and Intel GPUs as primarily gaming GPUs. However, if anyone out there has used AMD (or Intel, for that matter) GPUs for machine learning, please share your experience in the comments. I would really appreciate that.
@autkarsh8830
@autkarsh8830 Ай бұрын
sweet video!
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hi there. Thanks for the comment. So glad you enjoyed it!
@fubleach2
@fubleach2 Ай бұрын
Really good video but had to play at 1.75X 😅
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hi there. Thanks so much for the feedback! In the future, I will work on keeping things more concise.
@user-wf5pm4xe6t
@user-wf5pm4xe6t 3 ай бұрын
Awesome❤
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Thanks so much for your positive feedback!
@agftun8088
@agftun8088 3 ай бұрын
Basically, for my budget it's either the 3060 for around 300 euros or the 4060 Ti for 480 euros. Easy decision.
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Hi there. Thanks for the comment! Yeah definitely. If you can get the 4060 ti 16GB for 480 euro, that is definitely an easy decision. Best of luck with it!
@Klarkooi
@Klarkooi 3 ай бұрын
Hey bud, how can I DM you for some Qs?
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Hey man. I really appreciate you reaching out. I am going to make an X or threads or similar account soon to become more reachable. For now though, just shoot me an email at: skingutube22@gmail.com
@KiraSlith
@KiraSlith 3 ай бұрын
It's funny how regularly I hear "oh, don't bother with the Pascal cards, just buy an Ada Lovelace, just sink $1,200 on the GPU alone, don't waste your time" every time I point out that the two Pascal Tesla cards are the best bang for your buck right now, assuming you can cool them. Minor errata, btw: the Maxwell Tesla cards don't actually supply their complete VRAM capacity and CUDA count in one addressable device, but rather as 2-4 smaller GPUs with 4-8GB of VRAM each, and most DL/SD/LLM applications don't scale linearly with CUDA cores spread across devices. Worse, many apps don't support anything older than Pascal anyway, meaning the effective usefulness gap is even larger than your scoring system lets on.
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Hey there! Thanks so much for the great information. Yeah, it is amazing to me how often people overlook the Pascals. I understand that they are old and definitely far from the state of the art, but they are still a great value for tons of compute. You can't argue with 16GB or 24GB of VRAM for ~$200. Honestly, for most people starting out, they are more than enough for most use cases anyway. I use both the P40 and P100 almost every day in my PhD research, and they are great for the money. Ah okay, I was reading something about that a while back; that definitely makes sense. Thanks so much for sharing. Great info for anyone considering Maxwell-series GPUs.
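The non-linear scaling point about cores spread across devices can be illustrated with a toy model (the 15% communication fraction below is a made-up assumption for illustration, not a measurement):

```python
# Toy model of why CUDA cores split across several small GPUs don't add
# up linearly: each extra device adds communication/synchronization cost.

def effective_speedup(devices: int, comm_fraction: float) -> float:
    """Ideal speedup of `devices`, discounted by per-device comm cost."""
    return devices / (1 + comm_fraction * (devices - 1))

for d in (1, 2, 4):
    print(f"{d} device(s): {effective_speedup(d, 0.15):.2f}x")
# 1 device(s): 1.00x
# 2 device(s): 1.74x  (not 2x)
# 4 device(s): 2.76x  (not 4x)
```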
@KiraSlith
@KiraSlith 3 ай бұрын
@@TheDataDaddi Appreciate your content by the way! Been binge watching your channel, there's a lot here for me to learn from, been a huge help in figuring out how to scale up from my little workstation rig in the future. (T7910, 2x E5-2699 v4s, 128gb of ECC DDR4, P40 w/ cooling duct in the overhead slot, a v5900 as a display adapter, 4x1tb NVMe drives in a Hyper M.2 carrier)
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
So glad you have been getting a lot out of it. I primarily made this channel because of how hard it was for me to learn this stuff. There are not many channels that focus on how to cost-effectively create home labs or workstations specifically for machine learning, and most that do focus on hardware that is out of the price range of most normal individuals. Really glad to hear that it has been helping you. Great setup, by the way! A wonderful place to start. Keep me updated on how the scaling goes; I'd be curious to see what you do. @@KiraSlith
@KiraSlith
@KiraSlith 3 ай бұрын
@@TheDataDaddi Current plan is to move the P40 down to the main chamber, add a second P40 below it, and cool the P40s by pulling air through the back via a high CFM 120mm fan, rather than pushing through the front. I'd have to move the v5900 up to the top CPU1 slots though. I know you couldn't attach GPUs to CPU1 with the old T7800, but I've never tried with the T7910. The NVMe drives will be just fine though, thankfully.
@TheDataDaddi
@TheDataDaddi 3 ай бұрын
Just looked at the layout for the T7910. Seems like a good plan and that looks like it should work from what I can tell. Definitely agree adding a fan to pull through the back is a good call. Let me know how it ends up working! @@KiraSlith
@DarrenTarmey
@DarrenTarmey Ай бұрын
Can I get access to this spreadsheet
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hi there. Thanks so much for reaching out. You should be able to download a local copy and change it in any way that you see fit. I acknowledge that the spreadsheet needs to be updated. I am actually working right now on a website that lists GPU specs and keeps track of historical price trends, so all of the information in the spreadsheet and more should soon be available and updated on a daily basis. Unfortunately, I am not comfortable giving you direct editable access to the original version in Google Drive. I apologize; it is nothing personal. I just don't know you well enough.
@migellebarlis
@migellebarlis 4 ай бұрын
Hoping for the new update due to the new release of NVIDIA GPUs :)
@TheDataDaddi
@TheDataDaddi 4 ай бұрын
Hi there. Thanks for the feedback! I will definitely make another video on the new releases once I have a chance to do some research on them!
@willdwyer6782
@willdwyer6782 Ай бұрын
Thanks for the long winded answer to a simple question. That's 40 minutes of my life that I won't get back.
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hi there. Thank you for the comment! I am really sorry that you feel I wasted your time. Honest feedback is always welcome, and I will do my best to keep my videos more concise in the future. I do appreciate you watching the video nonetheless.
@user-T.Baldwin
@user-T.Baldwin Ай бұрын
Hi, Xi, Ji !
@bindiberry6280
@bindiberry6280 Ай бұрын
Can we emulate GPU with USB?!!
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hi there! Thanks so much for the comment. I am sorry though. I am not sure I understand what is being asked here. Could you give a little more context?
@tsclly2377
@tsclly2377 Күн бұрын
Heat kills, and so does the RTX 4090 experience. I'm going slow and low-wattage. The type of SSD matters too: it needs big petabyte-scale write endurance. NVLink is slow (112 GB/s) and is just dumping onto the other card's CUDA cores/memory in a not-really-one-GPU way, especially if that GPU is stuck in a 16-lane slot that is really only 8-lane. Best to get a bigger-VRAM card like an A6000 @ 48GB instead of NVLinking two 24GB A5000s; plus, the link itself is going for premium $$$. Me, I'm going the "Potato"/Orin route that doesn't need a new air conditioner/heat pump, under 350W, and forget the need for speed: I can wait 30 minutes instead of 3. It is not the speed, but the GPU's ability to actually do the work, and most can't. 48GB of VRAM with 2000 CUDA cores is the "can do"; top TDP is just money spent. This review is best for MS BitNet 1.58, so if you are on Windows, rock on. And I've read the NVLink Quadro cables work for RTX cards, but not Lovelace; those need the Lovelace Link-2s.
@robertfontaine3650
@robertfontaine3650 5 ай бұрын
Modern AI needs heaps of video RAM. You shouldn't bother looking at anything with less than 24GB.
@TheDataDaddi
@TheDataDaddi 5 ай бұрын
Hi there. I definitely agree. I think 24gb is a good happy medium between price and performance.
@rayf3244
@rayf3244 4 ай бұрын
What are your thoughts on Apple silicon? For example, would 32GB of unified memory on an M2 Max be the equivalent of 24GB in a dGPU (not considering the CUDA advantage)?
@TheDataDaddi
@TheDataDaddi 4 ай бұрын
Hi there. Thanks so much for your question. I actually had someone ask a similar question yesterday. Let's dive in. Apple's silicon, particularly the M2 Max, represents a significant shift in computing architecture, and the concept of unified memory in Apple's design is quite innovative. Unified memory essentially allows the CPU and GPU to share the same memory pool, which can lead to more efficient use of resources. Regarding your question about the equivalence of 32GB of unified memory to 24GB in a discrete GPU setup, it's not a straightforward comparison. In traditional setups, the CPU and GPU have separate memory pools, and data needs to be transferred between them, which can create a bottleneck. With Apple's unified memory, this bottleneck is reduced, as both the CPU and GPU can access the same memory pool directly. This can make the system more efficient, potentially allowing 32GB of unified memory to perform comparably to, or even outperform, a 24GB discrete GPU setup in certain scenarios. However, this doesn't mean it's superior in all aspects. For example, tasks heavily reliant on GPU performance, especially those optimized for CUDA (a parallel computing platform and API model created by Nvidia), might still perform better on a traditional discrete GPU setup. This is because CUDA has been around longer and is extensively optimized for specific professional and scientific applications. So, to sum everything up: while 32GB of unified memory on an M2 Max might offer comparable performance to a 24GB dGPU in many use cases, the actual performance can vary depending on the specific applications and workloads. @@rayf3244
@robertfontaine3650
@robertfontaine3650 4 ай бұрын
@@rayf3244 Machine learning rests on linear algebra, and massively parallel matrix algebra is what video cards do (until we see neuromorphic chips in wide production, cheap and actually successful; could just be a pipe dream, like RNNs). The RTX 3090 with NVLink is the best bang for the buck, and the old Tesla cards are the cheapest entry. Two 3090s give you 48GB and let you play with up to Llama 2 70B. No, I can't afford two 3090s either, let alone a couple of last-generation A6000s. The Tesla P100s have 16GB and NVLink, for a total of 32GB for fewer dollars. When PCIe 5.0 boards come out, NVLink won't be necessary (we are told), but until then we are pretty limited as hobbyists. NVIDIA is really the only game in town, and they are focused on the enterprise, not us. AMD and Intel haven't invested in AI at this level, and Apple isn't even in the game.
@adisaksukul8088
@adisaksukul8088 3 ай бұрын
@@TheDataDaddi did you use ChatGPT to answer this?
@Kamology_
@Kamology_ Ай бұрын
$100 for a 2060 Super is WILD lol. If anyone finds one for that price send it my way I’ll buy them all
@TheDataDaddi
@TheDataDaddi Ай бұрын
Hey there. Thanks so much for the comment! Yeah, the prices in the video are probably a bit outdated at this point, unfortunately, and I do not have the time to keep them updated. It is insane how much the price of hardware (GPUs in particular) has skyrocketed in such a short period of time. I am actually working on building a website right now that monitors the prices of all the GPUs in the referenced spreadsheet in real time, to help people make more informed decisions with current GPU prices. Hopefully, I can have it built in the next few months. Stay tuned for updates.
@Kamology_
@Kamology_ Ай бұрын
@@TheDataDaddi oh that would be very cool and useful. Yeah, 3090s in particular have gone way way up. I was thinking about grabbing one for $500-$600 a few months ago, checked the price and was like woah. Nvm. I should try to scoop some P40s before it’s too late for those too lol
@ThePandalars
@ThePandalars 16 күн бұрын
perun much?
@TheDataDaddi
@TheDataDaddi 11 күн бұрын
Hi there. Thanks so much for the comment! I am not sure I understand the question. If you could give me a bit more context, that would be great!
@hp2073
@hp2073 6 ай бұрын
Say you have a 3060 12GB. Would you prefer to buy a used 3090 with 24GB of VRAM (around $600), or wait for Nvidia's 50 series and instead solve higher VRAM requirements in the cloud?
@TheDataDaddi
@TheDataDaddi 6 ай бұрын
Hi there! Thanks for the question. If you could find one for around $600, I would definitely buy the 3090. However, on my end I have not seen any from reliable sellers for less than $800, and normally closer to $1000. Also, for me personally, the last thing I would do is look to the cloud for higher VRAM requirements. From my research, it is crazy expensive (relatively speaking) to use the cloud for home labs, personal projects, and even some small business applications. For example, with Linode (now Akamai), which is one of the cheapest cloud providers I know of, the lowest tier of service is $1000 PER MONTH for specs that don't even come close to what you could build and self-host for the same amount. I am actually going to make a video relatively soon on when it might make sense to use cloud solutions vs. building/self-hosting. I am also very interested to see what the 50 series holds for NVIDIA. However, to my knowledge these are not slated to come out until 2025, so you would likely be waiting quite a while. They will also likely be super expensive in the short term, and there is no guarantee that they will overcome the challenges the 40 series faces (for example, PCIe 4 when 5 was promised). With all that said, I would probably go with the 3090 (depending on your use case), especially if you can get one for $600. I like the 30-series GPUs best as general-purpose workhorses at this point, especially because you can use them with NVLink, which may help with your higher VRAM requirements as well. Hope this helps!
@hp2073
@hp2073 6 ай бұрын
@@TheDataDaddi Thank you, gentleman. Yes, I can find a healthy 3090 for this price because no one here is interested in mining anymore. I will do all the tests and even run a game at the end to make sure it is fine. However, my concern is return on investment: I want to earn money or build a startup around machine vision projects, so I am not sure whether it will work out for me (making money). Should I go for the 3060 12GB for $200 and accept a lower risk, or go for the $600 3090 24GB? This is my only concern.
@TheDataDaddi
@TheDataDaddi 6 ай бұрын
Ah okay, I see your use case. If you are going to use it to make money, not just for personal projects, I would invest in the 3090 for sure. This will allow you to train much larger models as well as speed up training significantly. I believe it is probably one of the best GPUs you can buy right now for the money for computer vision-related tasks, especially if you can get a good one so cheap. I would say, if you can safely afford it, go with the 3090. @@hp2073
@hp2073
@hp2073 6 ай бұрын
@@TheDataDaddi Thank you very much. I hope you get what you want in life, indeed. I also remembered one thing to ask: somebody told me that Kaggle gives 30 hours per month of cloud GPU for free. Also, it seems companies like Salad Cloud give a 4090 GPU instance for around $0.34 per hour. As you have more experience, are these real?
@TheDataDaddi
@TheDataDaddi 6 ай бұрын
I have not used Kaggle in several years, so take what I say with a grain of salt, but it would be a good place to start. 30 hours per week might be enough for you depending on what you are doing. At the very least, you could use it as a free test bed until you start maxing it out, then buy your own hardware. To add some context, some of the computer vision and graph ML models I have been training recently take over a week solid on 2 Tesla P40 GPUs, and the computer vision models are really not even that big comparatively speaking. Also, instead of the RTX GPUs, you could go with older Tesla data center GPUs like the P40 or P100 (I would recommend these, as they offer some of the most compute and VRAM for the money). I have never used or heard of Salad Cloud, but I just checked it out, and it looks interesting. I have been thinking about a distributed peer-to-peer GPU compute sharing platform for some time now, and it seems like that is what this is. Two drawbacks would be the following: 1) it looks like it is mainly for inference on pretrained models, not training (I am not sure if you could use custom models); 2) it looks like it only has RTX-series GPUs and is geared toward gamers who want to rent out their idle gaming PCs, which would probably not be optimized for machine learning. Anyway, it looks cheap, so if it actually performs well, this might be a great option; I have just never personally tested it. Personally speaking, I like owning my own hardware because you always have absolute control over everything, and it is generally the cheapest option by far. @@hp2073
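The buy-vs-rent trade-off in this thread comes down to simple break-even arithmetic. A sketch using rough numbers (the ~$600 used 3090 from the thread, plus a hypothetical $1.00/hr cloud rate, 350W draw, and $0.15/kWh electricity, all assumptions to replace with your own quotes):

```python
# Hours of GPU use after which buying hardware beats renting in the
# cloud. All inputs are assumptions; plug in your own quotes.

def breakeven_hours(hardware_cost: float, cloud_per_hour: float,
                    power_kw: float = 0.35, kwh_price: float = 0.15) -> float:
    """GPU-hours at which cumulative rental cost exceeds purchase + power."""
    local_per_hour = power_kw * kwh_price  # electricity only
    return hardware_cost / (cloud_per_hour - local_per_hour)

print(f"Break-even after ~{breakeven_hours(600, 1.00):.0f} GPU-hours")
# Break-even after ~633 GPU-hours (a few months of regular training)
```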