How to Choose an NVIDIA GPU for Deep Learning in 2023: Ada, Ampere, GeForce, NVIDIA RTX Compared

173,953 views

Jeff Heaton


A day ago

Comments: 320
@FiveFishAudio · a year ago
I'm a beginner in ML and got a used RTX 3060 with 12GB for $200 off eBay. No regrets, and for now it meets my needs. A good upgrade from my old GTX 970!
@ILoveSlyFoxHound · a year ago
How did you find one for $200? That's amazing! I'll give it some more time to see if I can find one around that price to upgrade from my RX 470.
@lp67O · a year ago
Can you share some numbers on the speedup? In terms of training speed, I mean.
@gnanashekar · a year ago
@FiveFishAudio Does it support cuDNN, and what do you think of it for a laptop (the same card)?
@FiveFishAudio · a year ago
@@gnanashekar Yes, the RTX 3060 supports cuDNN. I compiled OpenCV with CUDA 11.7 and cuDNN 8.7.0, running on Ubuntu 22.04.2.
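For anyone verifying a similar build, a minimal sketch of checking that OpenCV actually sees CUDA (assumes OpenCV was compiled with the CUDA module; the image path is a placeholder):

    import cv2

    # 0 means OpenCV was built without the CUDA module or no GPU is visible
    print(cv2.cuda.getCudaEnabledDeviceCount())

    # Round-trip test: upload an image to the GPU, resize there, download the result
    gpu = cv2.cuda_GpuMat()
    gpu.upload(cv2.imread("test.jpg"))   # "test.jpg" is a placeholder image
    small = cv2.cuda.resize(gpu, (640, 480))
    result = small.download()
    print(result.shape)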
@FiveFishAudio · a year ago
@@lp67O Depends on model complexity and the size of your dataset, but yes, it speeds things up a lot compared to CPU-only or my old GTX 970.
@jeffm4284 · a year ago
I like your explainers, @jeff Heaton. Don't forget, the professional workstation RTX models use ECC RAM, use less power, and have a profile that fits in most cases. The GeForce models are enormous! In my line of work, ECC RAM and TDP are a huge deal for the targeted AI/ML processing. I don't play computer games, nor do I imagine I ever will. I may have an Xbox to de-stress occasionally with NBA 2K15 (yep, that's how invested I am in gaming). If I could share images, I'd share my matrix comparison of specs and current price differences. There's no question that you get a big bump going from the RTX 4000 Ada to the previous-gen RTX A5000. But the 4000 Ada has a 70W TDP (no power cable necessary!) vs. 230W, has a fair number of Tensor Cores and a fair amount of RAM, and it's MUCH smaller. Each one of these video cards needs to be matched to the expected workload; they're just too expensive not to optimize for the mission. For instance, looking at techpowerup's benchmarks for supposedly comparable GPUs, the previous-gen RTX A5000 crushes the RTX 4070 in double precision (4070 FP64 = 455.4 GFLOPS vs. A5000 FP64 = 867.8 GFLOPS).
Here's the price ladder:
- PNY NVIDIA RTX 4000 SFF Ada Gen: $1,500
- PNY NVIDIA RTX A5000: $1,670 ($170 more)
- PNY NVIDIA RTX A5500: $2,289 ($619 more)
- PNY NVIDIA RTX A6000 48GB GDDR6: $3,979 ($1,690 more)
- PNY NVIDIA RTX 6000 Ada 48GB GDDR6: $7,985 ($4,006 more)
Those are REALLY big steps up in cost! Links in order:
www.techpowerup.com/gpu-specs/geforce-rtx-4070.c3924
www.techpowerup.com/gpu-specs/rtx-4000-sff-ada-generation.c4139
www.techpowerup.com/gpu-specs/rtx-a5000.c3748
www.techpowerup.com/gpu-specs/rtx-a5500.c3901
www.techpowerup.com/gpu-specs/rtx-a6000.c3686
www.techpowerup.com/gpu-specs/rtx-5000-ada-generation.c4152
www.techpowerup.com/gpu-specs/rtx-6000-ada-generation.c3933
@ProjectPhysX · a year ago
You're missing the most important spec for performance in these kinds of applications: VRAM bandwidth. RTX 40/Ada is abysmal value in this regard, and RTX 30 is much better: equally fast or even faster (the 3080 10GB is 7% faster than the 4080) for half the price.
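A rough way to see this yourself: a minimal PyTorch sketch that estimates effective VRAM bandwidth from a large device-to-device copy (the tensor size and the read-plus-write accounting are the usual rough conventions):

    import torch, time

    n = 256 * 1024**2                 # 256M floats = 1 GiB per tensor
    x = torch.empty(n, device="cuda")
    y = torch.empty(n, device="cuda")

    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(20):
        y.copy_(x)                    # device-to-device copy: reads x, writes y
    torch.cuda.synchronize()
    dt = time.time() - t0

    gb_moved = 20 * 2 * n * 4 / 1e9   # read + write, 4 bytes per float
    print(f"~{gb_moved / dt:.0f} GB/s effective bandwidth")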
@DavePetrillo · 2 years ago
I found a 3060 12GB pulled from an OEM system on Amazon about a year ago for about $480, and it has been pretty good.
@F0XxX98 · a year ago
Just got a used 3060 for $300 :)
@datapro007 · a year ago
Great video, thanks Jeff. I bought an RTX 3060 for $340 ($280 as of 6/23) shipped from B&H on Black Friday. The 12 GB of RAM at that price was the deciding factor. BEWARE: NVIDIA is now selling a "3060" with 8GB of RAM in addition to the 12 GB version, so read the specs carefully before you buy.
@Jake___W · a year ago
Yes, it has 12 GB of RAM, but they cheaped out on it: the memory sits on a 192-bit bus. That's why, if you look at the 3070, it has 8GB on a 256-bit bus; the memory is faster and more efficient, making up for the smaller capacity and arguably beating the 3060's 12GB. People complained about the 3060's slower memory, so they released a 3060 Ti variant with 8GB of newer, faster memory. Honestly, both cards are good; some say the 8GB one will perform better, others claim the 12GB one works better on their machine.
@datapro007 · a year ago
@@Jake___W I'm not talking about the 3060 Ti having only 8GB - I mean the plain old 3060 in some versions only has 8GB. Say what you will, the 3060 12GB for ~ $280 is good value for money for machine learning. You can have the speediest card in the world, but if your model won't fit in available RAM you are out of luck as well as memory.
@marhensa · a year ago
@@datapro007 Slower speed is okay for me personally. I eventually chose the RTX 3060 12GB GDDR6 over the RTX 3060 Ti 8GB GDDR6X. So many reviewers said not to buy the RTX 3060 in 2021 and recommended the Ti instead, but their recommendation seems to have shifted in favor of the 12GB card now. Lots of games now need more than 8GB, and for AI work, more VRAM is a matter of go or no-go, not just slow or fast.
@enduga0 · a year ago
@@marhensa Is an i5 12th-gen H-series with 16GB of RAM and Intel Iris graphics good enough?
@shreyashgaur4397 · 4 months ago
Can I use two RTX 3060s, using both for ML and DL tasks?
@TheTakenKing999 · 2 years ago
In Montreal right now. Got a 3090 for like 750 Canadian, brand new, on Marketplace. Your videos are always helpful. I am doing a Masters in ML here, and even though I have cluster access, a personal GPU does help a lot, especially with smaller assignments and projects. If anyone is curious: RoBERTa on my dataset for an NLP task took 2 hours for training and 2 for fine-tuning with 8 epochs on PyTorch. Mixed precision would most likely have sped it up even more.
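For reference, a minimal sketch of what mixed-precision training looks like in PyTorch with automatic mixed precision; model, criterion, optimizer, and train_loader are placeholders:

    import torch

    scaler = torch.cuda.amp.GradScaler()

    for inputs, labels in train_loader:          # placeholder DataLoader
        inputs, labels = inputs.cuda(), labels.cuda()
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():          # ops run in FP16 where it is safe
            loss = criterion(model(inputs), labels)
        scaler.scale(loss).backward()            # scale loss to avoid FP16 underflow
        scaler.step(optimizer)
        scaler.update()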
@congluong463 · a year ago
I did some testing on RTX 30-series cards by training a classifier with and without mixed precision (in TensorFlow), and I found that mixed precision actually slowed training down on these cards (2.5x slower). I think it's because the RTX 30 series already uses TensorFloat-32 for float32 math, so there is less need for float16 or float8, and the operations that cast float32 to float16 slowed down the computation. Or maybe I did something wrong.
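For anyone trying to reproduce this, the canonical TensorFlow setup is a global dtype policy. A minimal sketch with arbitrary layer sizes; keeping the final softmax in float32 is the documented practice:

    import tensorflow as tf
    from tensorflow.keras import layers, mixed_precision

    mixed_precision.set_global_policy("mixed_float16")  # compute in FP16, variables stay FP32

    model = tf.keras.Sequential([
        layers.Dense(512, activation="relu", input_shape=(784,)),
        layers.Dense(10),
        layers.Activation("softmax", dtype="float32"),  # keep the output numerically stable
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])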
@Gogies777 · a year ago
Is this even real life? I'm on the west coast of Canada, and there's nothing anywhere near that price range for a 3090. Any chance you could share a link to some of these 3090s?
@fluxqc · a year ago
@@Gogies777 Keep looking!! Can confirm: I just bought five 3090s at $700 CAD each.
@zurgmuckerberg · a year ago
Where did you catch the Canadians to trade?
@Cuplex1 · 9 months ago
Were they stolen? 🤔 Which brand? I mean, the average price for an RTX 3090 is about $1600 USD today. The cheapest I could find was $1050 USD. That's about 1430 Canadian dollars for the cheapest! Brand new for half the price of the cheapest brand available sounds unlikely, since it would be a loss for the retailer.
@FzZiO1 · a year ago
Hello, your video is incredible. An update is needed now that NVIDIA has released the new 4060 Ti 16GB. Considering the memory type, bandwidth, and bus width... is it worth sacrificing all that for more VRAM? Are we looking at the best ML card for the money? What do you think?
@FzZiO1 · a year ago
It is fully compatible with CUDA.
@pranjal86able · a year ago
What is your opinion of the 4060 Ti 16GB? I want to use LangChain models, etc.
@yonnileung · 10 months ago
The 4060 Ti costs much more than the 3060 Ti, but the CUDA core count isn't worth it... I think a second-hand 12GB 3060 is worth considering.
@aayushgarg5437 · a year ago
@Jeff, one more thing that goes in favour of the RTX 40 series is its FP8 support. Later this year, NVIDIA is going to release CUDA 12 with FP8 support for training and inference. You can only run FP8 on the RTX 40 series (not on RTX 30). I think that is something one should also consider while buying a GPU. It is better to spend a bit more now so that your GPU remains relevant for the next 3-4 years.
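As a rough illustration, FP8 training is currently exposed through NVIDIA's Transformer Engine rather than plain PyTorch. A minimal sketch, assuming the transformer-engine package and an Ada or Hopper GPU; the layer sizes are arbitrary:

    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe

    # Delayed-scaling recipe: tracks running amax history to pick FP8 scale factors
    fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

    layer = te.Linear(1024, 1024).cuda()   # TE drop-in replacement for nn.Linear
    x = torch.randn(32, 1024, device="cuda")

    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)                        # matmul runs in FP8 on supported GPUs
    y.sum().backward()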
@xphix9900 · a year ago
I agree, the memory usage/dependency will change... Do you think the 4070 Ti is a good purchase, or should I shell out more cash for more CUDA cores on the 4080/4090, or wait? The 4070 Ti is my budget, but I'm not in a rush.
@aayushgarg5437 · a year ago
@@xphix9900 It depends on your use case. If you are buying a GPU for DL, then go for the RTX 4090 (if your budget allows), solely for its 24 GB of VRAM. You don't want to be in a situation where you hit a VRAM bottleneck whenever you run a bigger network training. My suggestion: wait and go for a 24GB card whenever your budget allows.
@xphix9900 · a year ago
@@aayushgarg5437 Thank you sir, appreciate the advice, and I agree!
@xphix9900 · a year ago
@@aayushgarg5437 Also, just to get something running for now to use NeRF, would you have a suggestion?
@aayushgarg5437 · a year ago
@@xphix9900 I don't know much about neural 3D rendering. Apologies.
@ErhanBicer-v1f · a year ago
Great video! Thanks for your efforts. What do you think about the A5000 in terms of deep-learning performance? Given its specs, I assume it can be considered an alternative to the 4070 Ti. Would you suggest the 4090 or the A5000 for deep-learning research, prices aside?
@CARTUNE. · a month ago
Does the CPU play a large role in any of this? Is there much multitasking between apps while using the GPU, or is a 6-core (7600X - 9600X, or non-X) sufficient for lighter stuff?
@JoseSilva-gt6zj · 2 years ago
Amazing video, Jeff! Please, could you show some models running and how much memory they consume? Thank you!!
@denysivanov3364 · a year ago
Buy a 4090. 24 GB is the minimum amateur size at which you can actually do stuff, with limitations. 40 and 80 GB are pro memory sizes.
@JoseSilva-gt6zj · a year ago
@@denysivanov3364 Thank you!
@nathanbanks2354 · a year ago
I'm looking forward to FP8, which should be coming to the 4090s as a software update. That will allow running 20B-parameter models at full speed with 24GB of RAM, with little loss in quality and some performance gain, since it needs less memory bandwidth. But I'll probably be renting 4090s on RunPod; for my own machine, I just bought a used Dell Precision 7720 with a Quadro P5000 with 16GB of VRAM. It runs the 7B Alpaca/LLaMA models at 7 tokens/second. It also runs Whisper at slightly faster than real-time speech-to-text, which was around 30% of the speed of a 3090. (I even experimented with 3-bit quantization for the LLaMA 30B model, but it wasn't worth it: it was super slow, and it still ran out of RAM on large queries.)
@tridevyatzemel · a month ago
I have a 3060 with 12 GB and a 3080 with 10 GB. I'm just curious: can I use them both for AI, so all the memory can be utilized during deep learning? I know that for games it doesn't matter, only the first GPU gets used, but I haven't seen any video where someone uses two different video cards for deep learning, and what pros and cons to expect.
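You generally can't pool two cards into one big VRAM pool, but you can shard a single model across mismatched GPUs. A minimal sketch with Hugging Face Transformers and Accelerate; the model name is just an example, and device_map="auto" places layers according to each card's free memory:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "mistralai/Mistral-7B-v0.1"   # example model
    tok = AutoTokenizer.from_pretrained(name)

    # Layers land on cuda:0 (3060 12GB) and cuda:1 (3080 10GB) based on free
    # memory; activations move between the cards during the forward pass.
    model = AutoModelForCausalLM.from_pretrained(
        name, device_map="auto", torch_dtype=torch.float16
    )

    inputs = tok("Hello", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20)
    print(tok.decode(out[0]))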
@yolo4eva · 2 years ago
Thank you for the wonderful video! Currently using a 3060, seeking to move to a 40-series. Will be waiting for new videos once the new setup is ready!
@HeatonResearch · 2 years ago
Oh thanks, they do exist!! I was curious about that one. I tried to obtain one, but failed in my attempt.
@JustSomePersonUK · 2 years ago
Thanks!
@HeatonResearch · 2 years ago
Oh wow, you are totally awesome! Thank you so much!
@xntumrfo9ivrnwf · a year ago
I've noticed something interesting here in Europe. Used 3090s seem significantly cheaper than I remember them being a few months ago (no surprise, I guess). They're priced somewhere between a 4070 Ti and a 4080 (but closer to the 4070). I'm seriously considering getting a used 3090 for that sweet, sweet VRAM. I'm building my first PC right now. Deep learning is more of a hobby for me, nothing at all related to my day job, etc.
@danieldaniel625 · a year ago
Same here. I was thinking about getting a 4090, but now I think it'd be nice to save some money.
@xntumrfo9ivrnwf · a year ago
@@danieldaniel625 For what it's worth, I ended up getting a used 3090. I paid around 800 EUR plus 40 for buyer protection on the platform I used. Overall, I'm happy with everything!
@ectoplasm · a year ago
@@xntumrfo9ivrnwf Lucky. I went this route and mine was DOA.
@shafiiganjeh8082 · a year ago
There is also the budget option of getting used cards from the NVIDIA Tesla series, mainly the P100 and V100. They are a pain to set up and generally hard to find at a reasonable price, but absolutely worth it if you only care about deep learning. I was using a P100 for 250€ until I got my hands on a 32GB V100 for 700€ (which is an absolute steal, btw).
@AlexanderSelyutin · a year ago
What do you think about the 4060 Ti 16GB?
@podunkman2709 · 9 months ago
Yep, what about this one? Looks like a good choice.
@ericp6890 · a year ago
You also really need to consider TensorFlow and PyTorch compatibility with the 40 series as of now. If you need to dig right in, the 30 series is the right choice.
@Zekian · a year ago
What compatibility issues have you seen with the 40 series? I have not encountered any yet.
@nitishu2076 · a year ago
What are your thoughts on the Intel Arc A770 for deep learning? Would you recommend buying it over the RTX 3060?
@HeatonResearch · a year ago
That is an interesting card. It looks like you could get PyTorch support through Intel's extension for PyTorch (oneAPI), though you would be dealing with Linux at that point. If you're willing to deal with somewhat more complex installs, and maybe some features not working, it could work out.
@VamosViverFora · 2 months ago
Hi Jeff, what's your opinion of the RTX 4060 Ti (16GB) for beginners? Your main complaint about the 3060 was memory. Do you think it's worth the investment (maybe in two, down the road)? Thanks.
@dragonmaster1500 · 10 months ago
I'm a beginner in machine learning and currently work as a research assistant for a college in the GIS department. I'm building a personal PC to handle GIS-related tasks, with photo editing and gaming as a side benefit. After researching for a bit, I've noticed that everyone seems to place far more emphasis on having a lot of system RAM than on having a good GPU. Our 'heavy processing' computers, for example, have 128GB of RAM but only a GPU with 8GB of VRAM. For my own build, I'm thinking of starting with a 4070 Ti Super with 16 GB. I wanted to buy a 3090 Ti, but it's almost double the price ($2,400.00 Canadian vs. the 4070 Ti Super's $1,199.99 Canadian).
@TheUltimateBaccaratApp · a year ago
Is getting the NVIDIA brand specifically necessary? For example, I see the RTX 3060 from ASUS, MSI, GIGABYTE, PNY, etc.... is there a difference for machine learning? Thank you!
@HeatonResearch · a year ago
It is the path of least resistance; CUDA very much dominates and is also very common in the cloud. I believe there are more options for AMD these days, it's just not something that I've had the time/desire to venture into.
@pokepress · a year ago
I think the poster was more confused about the difference between the various manufacturers of RTX cards. The way it works is that the core components of each card are designed and produced by NVIDIA, who makes their own cards but also licenses the design out so that board partners (MSI, ASUS, etc.) can make their own. As long as the card has the same overall model number (3060, 3090, 3090 Ti, etc.) and specs, performance should generally be within a few percentage points.
@mattelles · a year ago
I'm torn between an A4000 and a 4070 Ti. They are the same price where I live. As NLP models keep getting bigger and bigger, would the additional 4GB of VRAM on the A4000 make a huge difference? BERT-like base models seem to run fine with 12GB. However, I'm not so confident about larger multimodal models like LayoutLM.
@martin777xyz · 6 months ago
It would be interesting to have a video on techniques for dealing with VRAM limitations for local AI models. I guess some of the NVIDIA tools address this issue? What does it take to run some of the open-source LLMs on a 12GB GPU? What are the options with 8GB of VRAM? Is it a flat no-go, or does it run, just slower? Forgive me if there's already a video detailing this... 🙏
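Until there's a video on it, the usual answer is quantization: an 8GB card often runs a 7B model in 4-bit, just slower and with some quality loss. A minimal sketch with Hugging Face Transformers and bitsandbytes; the model name is an example:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,                      # weights stored 4-bit (NF4)
        bnb_4bit_compute_dtype=torch.float16,   # matmuls still run in FP16
    )

    name = "mistralai/Mistral-7B-v0.1"          # example model; ~5-6GB in 4-bit
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, quantization_config=bnb, device_map="auto"
    )

    prompt = tok("The best budget GPU for ML is", return_tensors="pt").to(model.device)
    print(tok.decode(model.generate(**prompt, max_new_tokens=30)[0]))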
@anonimes4005 · a year ago
How important is speed compared to RAM? I am deciding between a 12GB 3060 and a 24GB Tesla P40, both around 250 EUR, and couldn't really find much information about the Tesla.
@blanchr2 · a year ago
I am looking at buying a GPU for machine learning, and this was helpful advice. Thanks!
@issamu2k · a year ago
Is the 4070 better for ML than the 4080 Ti?
@MikeG-js1jt · 7 months ago
Is it just me, your camera, or are your hands remarkably "tiny"?
@ElinLiu0823 · 7 months ago
Hi sir, I have watched all the content. I have a question: is a quad TITAN V setup still usable as a budget-limited solution?
@mrmboya3491 · 4 months ago
I bought an AMD 6700HS 8GB. I know AMD has less community support; can it still do the work?
@HeatonResearch · 4 months ago
Maybe. But I have much less experience with AMD, would love to hear how that goes.
@pedrojosemoragallegos · a year ago
Hi Jeff, I want to go deep into NLP and speech recognition. Do you recommend an RTX 4090, or should I get more RAM? I don't know how big these models are and will be... hope you can help me!!
@walter274 · a year ago
Where do tensor cores fit into all of this? I think the operations they accelerate are ubiquitous in statistics. Also, I know CUDA core count is correlated with tensor core count. I'm starting to explore using my GPU in my statistical work. It's not that what I do is so complex that I need the extra computational power, but I do want to acquire the skill. Thanks.
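One concrete connection: on Ampere and Ada cards, PyTorch can route ordinary float32 matrix math through the tensor cores via TF32. A minimal sketch; the flags are real PyTorch switches, and the speedup varies by GPU and matrix size:

    import torch

    # Allow float32 matmuls/convolutions to use TF32 on the tensor cores
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    # Equivalent newer switch: torch.set_float32_matmul_precision("high")

    a = torch.randn(8192, 8192, device="cuda")
    b = torch.randn(8192, 8192, device="cuda")
    c = a @ b   # runs on tensor cores with FP32 dynamic range, reduced mantissa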
@chazscholton3137 · a month ago
I went with an RTX 3060 12GB a few years ago because of the amount of RAM it has, for getting into ML and for rendering video/3D work. I am rather curious how an NVIDIA Tesla M10 with 48GB of RAM might be used for ML. I am not used to those kinds of cards.
@EssDubz · a year ago
I'm confused about the second assumption, about not using server GPUs. Is there any reason why you couldn't put an A100 (for example) in a computer and have it sit on your desk?
@diazjubairy1729 · 11 months ago
How important is the memory bus width for ML and DL?
@Sohamdebnath88 · 5 months ago
Is 12GB of VRAM enough for machine learning, or will I run out of memory?
@perceptoshmegington3371 · a year ago
Managed to get TensorFlow working on my Arc A750 for a fraction of the price an NVIDIA card would've cost :)
@mawkuri5496 · 4 months ago
Is the GeForce RTX 4060 Ti GAMING X 16G good for deep learning?
@rossrochford1236 · a year ago
What do you think of the following setup in terms of price/performance: two 3060s and a 3090?
@gtbronks · a year ago
Thank you from Brasil!
@HeatonResearch · a year ago
You are welcome, from the states!
@durocuri1758 · a year ago
Now it is October, and I am training an SDXL model. If I want to train on high-resolution images, it needs 48GB or more. Is it still worth buying two A6000s and bridging them to extend VRAM to 96GB, or should I just rent an A100 in the cloud? 😂
@peterxyz3541 · a year ago
Hi, any advice on setting up multi-GPU, mixing various NVIDIA cards? I'm trying to find a motherboard for 2 or 3 GPUs. What should I look for?
@jakestarr4718 · 10 months ago
I'd be looking at my PCIe x16 slots and whether they share lanes. Usually slot 1 has its own dedicated lanes to the processor, while slots 2 and 3 are often connected or share lanes; for instance, slot 3 may connect directly to the processor while slot 2 runs through slot 3. How would I use it? Slot 1 would get my rendering card, like an RTX 4080. Slot 3 would get an A5000, because it's tremendously fast. In slot 2 I'd probably install a USB 3.2 card, then build an external rack of, say, four NVIDIA K80s, all doubled up and plugged into slot 2's USB-to-PCIe x16 adapter, because it would be like giving my A5000 an extra 96GB of RAM for around $700. So if you needed a crazy amount of RAM to load terabytes of datasets, this is what I'd look at, along with my electric bill. At least making the RAM boost external keeps heat away from the more expensive hardware, and I can simply plug in however many stacks I need and power up what I need for the task. It's a modular approach to cutting costs on electricity and heat. Hopefully that gives you a better idea of board architecture and how to look at it.
There is software, like Octane Render, that can link all that RAM up and make it work together. So running out and buying the BEST isn't always bright: for the price of one 4090 I can buy two 4080s and gain 8GB of RAM and 4K more CUDA cores, and double my processing throughput. It's all perspective, knowing what you'll need and why; compatibility might be a major issue too! You might not want a rendering-focused card at all if it's a mining-rig setup or solely an AI compute setup, but keep in mind that a rendering card is beyond useful for video editing and a multitude of other tasks that compute cards won't handle.
If you're doing something insane with lots of heat, shell out the money for a more expensive, heavier board: anything up to the $500 range on a mobo is going to be much more heavily built, meaning better solder and thicker plating, and it matters when you're pushing over 5,000 watts through it. If it's entirely separate from your main PC, like a server build, then definitely look at boards with multiple processors. Those can be set up between a stack and the main controller PC (your rendering or personal-use PC); the extra processor lanes let you direct the traffic of information, and four processors on one board moving data in and out give higher transfer efficiency across the network you've now built. If you've got the money and understand the architecture, you'll realize you can do anything, and the market is a blessing. Other important things to look at: your system RAM and its latency, and the ability to install NVMe drives directly on those PCIe lanes for even faster computing (it makes a massive difference).
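On the software side, once the cards are installed, a minimal sketch of data-parallel training in PyTorch with DDP; torchrun starts one process per GPU, and MyModel and loader are placeholders. Note that when mixing different cards, DDP runs at the pace of the slowest one:

    import os, torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    # Launch with: torchrun --nproc_per_node=2 train.py
    dist.init_process_group("nccl")
    rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(rank)

    model = DDP(MyModel().cuda(rank), device_ids=[rank])  # MyModel is a placeholder
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for x, y in loader:   # placeholder DataLoader; a DistributedSampler would shard the data
        loss = torch.nn.functional.cross_entropy(model(x.cuda(rank)), y.cuda(rank))
        opt.zero_grad()
        loss.backward()   # gradients are all-reduced across GPUs by DDP
        opt.step()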
@sophiawright3421 · a year ago
Very useful video. Would love your thoughts on what kinds of models can be trained on the new RTX 4090 laptops, and what you think of those laptops for deep learning in general. Thank you!
@krishnateja4291 · a year ago
I have the configuration below. Is this enough to start? i5 12400F (6 cores / 12 threads), RTX 4060 Ti 8GB, 32GB DDR4 3200MHz, PCIe 4 NVMe SSD.
@tl2uz · a year ago
To build a two-GPU system, is it OK to use different brands of RTX 4090? For example, one ASUS 4090 and one MSI 4090.
@yashahamdani6724 · a year ago
Hey, thanks for the video. I have a question: do you think the GTX 1660 Ti is still worth it?
@sarfarazsakid · 5 days ago
Is the RTX 4060 good for machine learning, or should I wait for the RTX 5060?
@gonzalodijoux5953 · a year ago
Currently I have a Ryzen 5 2400G, a B450M Bazooka2 motherboard, and 16GB of RAM. I would like to run Vicuna/Alpaca/llama.cpp relatively smoothly. Would you advise a card (MI25, P40, K80...) to add to my current computer, or a second-hand configuration? And what free open-source AI do you advise? Thanks.
@daspallab772 · a year ago
RTX 4080 vs. A5000: which one would be better for deep learning and TensorFlow processing?
@testales · a year ago
If you want to play around with Stable Diffusion, I'd recommend going for at least a 3090 because of its 24GB of RAM.
@danilo_88 · a year ago
A 3060 12GB is more than enough to play with SD, unless you want to train large models.
@jamesf2697 · 7 months ago
I am wondering: is it better to buy one 40-series card, or two 20- or 30-series cards, for AI?
@neoblackcyptron · a year ago
I am consulting for an SMB. We are building a strategy map for on-premises vs. cloud setups for our machine-learning models. I am primarily an ML techie, not a businessman, so for an SMB, what is the better approach? I am assuming that once we put money into the fixed cost of setting up a training server for our models, we might not have to scale over time.
@chakrameditation6677 · a year ago
How do you feel about the 4060 Ti 16 GB? I plan on mainly using it for LLMs / text generation. Thank you, Jeff.
@ravirajchilka · a year ago
Would a 4070 Ti with 12 GB be enough for any deep learning?
@masterwayne9790 · a year ago
Hey big man, I'm getting a 3070 or 3080 second-hand for deep learning. Which one should I choose? My CPU is a 13700K. I could also scale my build back to AM4 and buy a 4070 if it's better than the 3070 and 3080 for DL.
@saribmalik985 · 7 months ago
Can I use a Ryzen 5 3600 for beginner deep learning and AI? It's weak, but it lets me fit a 3060 in my budget.
@abh830 · a year ago
Or, are you aware of the effect of a PCIe riser cable on performance?
@HeatonResearch · a year ago
I have yet to try risers myself.
@kevindunlap2832 · 8 months ago
I have an NVIDIA T1000 with 8GB. Is that enough for learning ML?
@samuel0705 · a year ago
Hello Jeff, some AIB GPUs offer water cooling (like the MSI Suprim Liquid X or Gigabyte Waterforce). Supposedly they offer ever-so-slightly better temperatures. Does that matter much for training models (considering the system could be running continuously for hours or days)?
@MrLostGravity · a year ago
Water cooling is seldom worth the time/cash investment unless you want maximum performance while staying on ambient cooling. I'd say it's potentially a 5% performance improvement for 15-25% more cost (relative to the GPU's base price). The only application I'm aware of where this is useful, other than satisfying one's need for tinkering, is squeezing maximum performance out of a single GPU, for gaming purposes for instance. I don't see the value for ML unless you're going for absolute performance, or if noise is a main concern, since water cooling tends to be less noisy.
@amiga2091 · a year ago
How do I set all this up? A multi-GPU rig: what operating system, Ubuntu? One 3090 and three 3060s.
@Idtelos · a year ago
Driver support for the pro-series cards is why I went with the RTX A6000. Working on ML applications in CFD, GPU acceleration is a great help in getting solutions faster.
@hybridtron6241 · a year ago
Do you think the motherboards that support the new 192GB DDR5 RAM limit are worth it?
@triplethegreatdali4238 · a year ago
Hello, sir. I am learning deep learning and have worked on many projects on the lab's GPU system. I want to build a personal setup within a budget of $2,000. Could you please suggest the best components for it (excluding monitor, keyboard, and mouse)? Thank you.
@saminmahmud6049 · 2 years ago
I have an MSI Ventus GeForce RTX 3060 12GB. In my opinion it's a little slow for DL, but it gets the job done; it just takes a while!
@Jorvs · a year ago
Is NVIDIA better than AMD for AI? Do you have recommendations for an AMD card for AI?
@DIVYESH-vn7ss · 6 months ago
Does the GeForce RTX 2050 support CUDA?
@faizalelric · a year ago
What do you think of the RTX 3080 Ti vs. the 4070 Ti? Which one is better for DL?
@autoboto · 11 months ago
Does anyone have experience with an eGPU system over Thunderbolt 4 and an Ada-series 4090 or a professional-grade GPU? I know eGPUs are too slow for gaming, but I'm curious whether they are applicable for ML.
@Pure_Science_and_Technology · a year ago
It would be nice to know which GPUs work well with which popular LLMs. Sorry if you've already covered this.
@rachitbhatt40000 · a year ago
Does memory bandwidth matter for deep learning just like memory size does?
@flightsimdev9021 · a year ago
I have an old Quadro M6000, the 12GB variant. Is this good enough for AI?
@henry3435 · 2 years ago
Hey Jeff, thanks for the great content. Have you messed around with AMD & ROCm at all? Would be interesting to learn more about that.
@yuliu4177 · a year ago
Excellent video. Is less RAM needed if your research isn't in computer vision but in machine learning that deals with PDEs or ODEs? Is 8GB enough for most things?
@sluggy6074 · a year ago
What's your opinion on the tensor cores? They run in the same price range as the A6000. They are very task-specific.
@Argoo41 · 2 years ago
I can really agree on VRAM. It's unfortunate that NVIDIA left a big gap between the 3060 and the 3080 Ti. And still, 12GB is not much; it would be great to see more VRAM in the future. I saw a rumor of a 3070 Ti with 16GB, but never found it in the wild. In addition to the information in this video, it would be nice to know how memory bandwidth affects speed. If you have any benchmarks or info, please share. Thanks!
@ProjectPhysX · a year ago
For a lot of these professional applications, performance is directly proportional to VRAM bandwidth, and the TFLOPs are more than sufficient, so they don't matter at all; it's bandwidth-limited. More VRAM is indeed very desirable. I hope we will soon see cards with 96GB of GDDR6W, and the leaked H100 PCIe with 120GB of HBM.
@andrewjazdzyk1215 · 10 months ago
Yeah, I need at least 48 GB of VRAM; any less and I'm like ehhhhhhhhh.
@faisalq4092 · 2 years ago
Thanks for the video. I've been a fan of your work since 2020, and it has had a big impact on me. Currently I'm getting into NLP (in both research and applications), and I've been looking into buying a new GPU for my work. I've always wanted to know whether different GPUs provide marginally different training times for NLP models. Also, if you have trained a large NLP model before, do you know how long it would take a 4090 to train a BERT-base model? Thanks in advance.
@vishwajeetshukla6927 · a year ago
NVIDIA has released a new 4060 Ti 16GB with GDDR6 VRAM. Any comments on the performance? Are TensorFlow and PyTorch versions supported on the new 40-series graphics cards? Can someone post the CUDA and TensorFlow version compatibility for the 40 series?
@HeatonResearch · a year ago
I use a 40-series with Tensorflow, no issues that I've hit. That sounds like a decent GPU.
@tassoskat8623 · a year ago
@@HeatonResearch Hello, and thanks for your content! Considering its price and 16GB of VRAM, the 4060 Ti is at the top of my list. However, my main concern is that its bus is 128-bit, compared to the 192-bit of most GPUs. Is this a factor that can limit the GPU's performance in deep-learning tasks? If yes, should I go, for example, for the 4070, which has 12GB of VRAM but a 192-bit bus? Thanks in advance, and I wish you a happy new year!
@xtrading888 · a year ago
I have several 3080 Tis and one 3070, but their memory is low compared to the 3090. Are they not suitable for AI?
@jpsolares · a year ago
RT cores: are they good for machine learning?
@adamstewarton · a year ago
What about a Tesla P40 as a second card for ML?
@StaMariaRock · 2 years ago
Here's one with a 3060 with 12GB: pretty OK. The memory is big enough to work with, but I think it's getting "slower" compared to more powerful GPUs. I wish I could get a 3080, but not any time soon. If you are tight on budget and want something that works well, this 3060 is pretty cool to have.
@HeatonResearch · a year ago
Nice to hear, I always thought that would be a decent entry point GPU.
@bud1239 · 8 months ago
I am in a Physics PhD program and I am interested in CUDA coding. I got my 3060 12GB as a starting point for CUDA coding, $250 new for my PC build, so I am pretty happy. I am still working on figuring out how to program in CUDA, but I figured out how to program in parallel with Python using my CPU (I have a 10-core i5 12600KF).
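If you're coming from Python, Numba is a gentle on-ramp to CUDA kernels before moving to C++. A minimal sketch of a GPU vector add, assuming the numba package; the grid sizes are illustrative:

    import numpy as np
    from numba import cuda

    @cuda.jit
    def vec_add(a, b, out):
        i = cuda.grid(1)           # global thread index
        if i < out.size:           # guard: the grid may overshoot the array
            out[i] = a[i] + b[i]

    n = 1_000_000
    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(a)

    threads = 256
    blocks = (n + threads - 1) // threads
    vec_add[blocks, threads](a, b, out)   # Numba copies the arrays to/from the GPU
    print(out[:4], (a + b)[:4])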
@kestasjk · a year ago
I don't get it: why did you eliminate the server GPUs right out of the gate? I was hoping you would compare the K80, which has 24GB of VRAM; the only way to get that much VRAM on an RTX/GeForce card is with a 4090. I bought one for an LLM I needed to run before I realized these cheap K80s were a possibility.
@jreamer0 · a year ago
K80s have a third of the CUDA cores.
@gabesusman4592 · a year ago
Exactly what I needed, thanks man.
@MalamIbnMalam · a year ago
This is a great video. That said, I notice that you use Windows 11 as your main OS; please correct me if I'm wrong. Have you ever tried a Linux distro (e.g., Ubuntu or Fedora) with TensorFlow or PyTorch on CUDA? Does it perform better on Linux or on Windows?
@w4r10rd · a year ago
Is the 4070 good for deep-learning tasks? I got one recently.
@lowqchannel · 2 years ago
What about memory bandwidth? The 4080 actually has lower bandwidth than the 3080 Ti and the 3080 12GB.
@HeatonResearch · 2 years ago
This is a really good point. I have not been bitten by memory bandwidth, at least that I've seen. It could be a small sample size on my part that I've not noticed it. A 4080 head-to-head against a 3080 Ti would be interesting. Has anyone seen such a benchmark, either in the gaming or the ML space?
@dongyangli3985 · a year ago
@@HeatonResearch The 4080 with a 256-bit bus? That's not a problem for DL, unless you play computer games on a 4K monitor.
@ProjectPhysX · a year ago
@@HeatonResearch I have benchmarks of the 4080 vs. the 3080 (the slower 10GB model) in computational fluid dynamics, where only VRAM bandwidth matters, as the algorithm is non-cacheable. The 4080 is 7% slower than the 3080 10GB, despite costing double. I also have the comparison between the 4090 and the 3090 Ti, which have the same 1008GB/s VRAM bandwidth on the data sheet: they are equally fast, despite the 4090 costing double. So the RTX 40 series is about 5 years of backwards development for these kinds of applications.
@nidungr3496 · a year ago
Thanks for the warning 😮 You saved me €1.5K.
@faridbang9851 · a year ago
Would you recommend a 12GB A2000 for a beginner in ML?
@HeatonResearch · a year ago
Yes, that would be a fine GPU.
@faridbang9851 · a year ago
@@HeatonResearch Thank you!
@mrjones8257 · a year ago
Jeff, what do you think about using 6 x 3090 GPUs? For instance, one GPU on the motherboard (with 16 PCIe lanes) and 5 GPUs connected via risers (1 PCIe lane each), with a modest amount of RAM (16GB) and a middling CPU (an affordable i7). Is this a decent configuration? Any ways to improve upon it? Any insight greatly appreciated.
@h-xr1oi · a year ago
A 1x PCIe link did not work in my case; I need at least 8x or the training time triples. You need to go with EPYC, Threadripper, or Xeon to supply that many lanes. X299, X99, or X399 motherboards will be the cheapest route; E5 v4 CPUs are really cheap on a dual-CPU Chinese X99 board.
@dhrumil5977 · 2 months ago
How about the 4060 Ti 16GB? Even though it's slow, it fits the budget, and the VRAM is good too.
@manthanpatki146 · a year ago
I can get an RTX 4080 for $1,300 and an RTX 4090 for $1,800. What should I get? (I'm a beginner.)
@JingsongLin · a year ago
Is the upgrade from a 4080 to a 4090 worth the money?
@GwriPennar · 4 months ago
I am using the 3060 12GB rev 2.0 from Gigabyte. The price/performance point at the time was superb, and you can still get them in the UK. I would really like to jump to a 4090 to run open-source LLMs. So far I have been able to do deep-learning projects on the 3060 using PyTorch and TensorFlow.
@atharvahude · 2 years ago
If possible, you can also check for an older RTX card that offers 24GB of memory.
@davidwhite2011 · 2 years ago
What is the speed penalty, in rough terms, if your model doesn't fit in 24 GB?
@HeatonResearch · 2 years ago
If you don't scale or re-architect it, then I suspect considerable. Host-to-GPU transfers remain expensive, even with 5th-gen PCIe. Generally I've just thrown as much GPU RAM at it as I can, so I've not specifically benchmarked this.
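Short of re-architecting, the common on-GPU workarounds are smaller micro-batches with gradient accumulation, and recomputing activations instead of storing them. A minimal PyTorch sketch; model and loader are placeholders, and the accumulation factor is arbitrary:

    import torch
    from torch.utils.checkpoint import checkpoint

    accum = 8                                              # effective batch = loader batch x 8
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)   # model is a placeholder

    for step, (x, y) in enumerate(loader):                 # placeholder DataLoader
        out = checkpoint(model, x.cuda())                  # recompute activations in backward
        loss = torch.nn.functional.cross_entropy(out, y.cuda()) / accum
        loss.backward()                                    # gradients accumulate across micro-batches
        if (step + 1) % accum == 0:
            opt.step()
            opt.zero_grad()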
@techtran · 2 years ago
Great video, Professor Heaton. I appreciate your opinion and all the effort in putting out these videos. Your guidance has assisted me in making choices with my PC builds for future machine learning. I recently updated my 2019/2020 ML desktop with additional memory and an RTX 3090 Ti 24GB FE that I purchased at $1,599.99 in mid-2022 (slightly below the $1,999.99 MSRP). Two months later, I was able to purchase another RTX 3090 Ti 24GB FE for $1,099.99 (still available at this price directly from NVIDIA), which I currently use in an eGPU for accelerating GPU performance on a Thunderbolt 4 Intel laptop (Windows 11 and WSL2 work great, and I'm now trying to get it to work with native Ubuntu). I'm also currently building a new small-form-factor (SFF) desktop for my home office, but I'm waiting for the RTX 4090 24GB FE to be available at MSRP (I'm very reluctant to buy anything above MSRP). I feel the RTX 3090 Ti 24GB FE at $1,099.99 and the RTX 4090 at MSRP are better choices than the RTX 4080!
@MB-pt8hi · 8 months ago
Are your 3090s second-hand? I cannot find any 3090 for that price, even now.
@silicon_sage17 · a year ago
Currently a student and looking into this sort of thing. Are we expected to have this by the time of employment, or later on? I was going to use high-end PC parts, but this is clearly industrial hardware that hasn't fully spread to the mainstream. Thank you.
@GabrielTobing · 2 months ago
Bro, idk, it seems like they expect us to be fully loaded, to spend money on cloud computing. Pain, man.
@jacobhatef3156 · 2 years ago
Any thoughts on using two 3090s with NVLink, or two 4080s (roughly the price of a 4090)? My budget is about $2,000-2,400, and I'd like to buy the best, most versatile thing I can afford. Two 3090s or two 4080s double your VRAM, which is something to consider.
@bigdreams5554 · 2 years ago
The 3090, all day every day, in my opinion. As far as NVLink: not sure how well that works for PyTorch, etc. (I think the professional cards are the ones with drivers that take advantage of it). Definitely do more research about linking the GPUs. Make sure you have a good PSU (1600W) if you have multiple 3090s in a box.
@HeatonResearch · a year ago
Good points bigdreams!
@jacobhatef3156 · a year ago
@@bigdreams5554 If NVLink can't work, should I switch to something else? A 3090 Ti, a 4080, etc.? Are the AMD GPUs even worth considering?
@harrythehandyman · a year ago
There's no NVLink or P2P-over-PCIe support on the 4080/4090. The last consumer card that supports NVLink and P2P is the 3090. If your model needs VRAM, 2x 3090 is better. If your model fits within 24GB, a single 4090 is probably close to dual-3090 performance.