I picked one up on eBay for $45 shipped. I also had an FTW 980 Ti cooler lying around. As long as the cooler fits the stock PCB of any 970-to-Titan X card, you can just swap it. You may need to cut out or re-solder the 12V power connector in the other orientation though; in my case I moved it from the back to the top. I also thermal-glued heat sinks onto the backplate, because not being in a server case means the VRAM gets warm.
@yungdaggerdikkk · 9 months ago
Holy moly bro, $45? Any link or tips to get one that cheap? Ty, and hope u enjoy it x)
@joshuachiriboga5305 · 8 months ago
@@yungdaggerdikkk Newegg has them at about that price
@joshuachiriboga5305 · 8 months ago
Running Stable Diffusion, does it run out of VRAM at 12GB or at 24GB? The tech docs claim the card is really two systems, each with its own CUDA cores and VRAM, etc...
@KiraSlith · 1 year ago
I'm using a trio of P40s in my headless Z840, kinda risking running into the PSU's power limit, but there's nothing like having a nearly real-time conversation with a 13b or 30b parameter model like Meta's LLaMA.
@jaffmoney1219 · 1 year ago
I am looking into buying a Z840 also, how are you able to keep the P40s cool enough?
@KiraSlith · 1 year ago
@@jaffmoney1219 Air ducting, and cranking the PCIe-zone intake fans to 100%. If you buy the HP-branded P40s, supposedly their BIOS will tell the motherboard to ramp the fans automatically. I'm using a pair supposedly from PNY, so I don't know.
@strikerstrikerson8570 · 1 year ago
@@KiraSlith Hello! Can you make a short video on how it works for you, on both the hardware side and the language-model side (such as LLaMA)? If you can't or don't want to make a video, you could briefly describe your hardware configuration here, and what is best to buy for this. I'm looking at the old LGA 2011-v3 platform: an 18-22 core CPU, a gaming motherboard from ASUS or ASRock, and 128/256GB of DDR4 ECC RAM. At first I wanted to buy a modern RTX 30xx/40xx-line video card, but then I came across Tesla server accelerators, which have large amounts of VRAM (16/24/32 GB) and go for about 150/250/400 euros here. Unfortunately there is somehow little information out there, and the videos you do come across on YouTube run Stable Diffusion, which gives very deplorable results even on a Tesla V100, which the RTX 3060 outperforms. Thanks in advance!
@KiraSlith · 1 year ago
@@strikerstrikerson8570 Sure, when it comes down for maintenance next; it's currently training a model. If you want new cards only and don't have a fat wallet to spend from, you're stuck with consumer cards either way. Otherwise, what you want depends entirely on what your primary goal is. Apologies in advance for the sizable wall of text you're about to read, but it's necessary to understand how to actually pick a card. I'll start by breaking it down by task demand:
- Image recognition and voice synthesis models want fast CUDA cores but still benefit from higher core counts, and the larger the input or output, the more VRAM they need.
- Image generation and voice recognition models also want fast CUDA cores, but their VRAM demands grow far faster.
- LLMs want enough VRAM to fit the whole model uncompressed, plus lots of CUDA cores. They aren't as affected by core speed but still benefit from it.
- Model training always requires lots of VRAM and CUDA cores to complete in a reasonable amount of time; it doesn't really matter what the model you're training does.
Some models bottleneck harder than others (though the harshest bottleneck is always VRAM capacity), but ALL CUDA-compute-capable GPUs (basically anything made after 2016) can run all models to some degree. So I'll break it down by degree of capability, within the same generation and product tier:
- Tesla cards have the most CUDA cores and VRAM, but have the slowest cores and require your own high-CFM cooling solution to keep them from roasting themselves to death. They're reliably the 2nd-cheapest option for their performance used, and the only really "good" option for training models.
- Tesla 100 variants trade VRAM capacity for faster HBM2 memory, but don't benefit much from that faster memory outside enterprise environments with remote storage. In spite of that, they're usually the 2nd-most-expensive card.
- Quadro cards strike a solid balance between Tesla and consumer: fewer CUDA cores than Tesla but more than consumer, faster CUDA cores than Tesla but slower than consumer, and more VRAM than consumer but usually less than Tesla. Thanks to "RTX Experience" providing solid gaming on these cards too, they're the true jack-of-all-trades option, and appropriately end up with a used price right in the middle.
- Quadro "G" variants (e.g. the GP100) trade their VRAM-capacity advantage over consumer cards for HBM2 VRAM at absurd clock speeds, giving them a unique advantage in image generation (and video editing). They're also reliably the most expensive card in their tier.
- Consumer cards are the best used option for the price if you want bulk image generation, voice synthesis, or voice recognition. They're slow with LLMs, and if you try to feed them a particularly big model (30B or more), they'll bottleneck even more harshly on their lacking VRAM (be it capacity or speed), with the potential to bottleneck even further by paging out to significantly slower system RAM.
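A quick way to sanity-check what a given card actually offers against the tier breakdown above is a minimal PyTorch sketch (assuming a CUDA build of torch is installed; names and counts will vary by rig):

import torch

# List every CUDA device the driver exposes, with the specs that matter
# here: VRAM capacity, core count (via SM count), and architecture
# generation (compute capability 5.x = Maxwell M40, 6.x = Pascal P40).
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"cuda:{i} {props.name}: {props.total_memory / 1024**3:.1f} GB VRAM, "
          f"{props.multi_processor_count} SMs, compute {props.major}.{props.minor}")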
@og_tokyo · 1 year ago
Stuffed a Z440 mobo into a 3U case; will be putting 2x P40s in here shortly.
@joo9125 · 1 year ago
Turing, not TURNing lol
@nneeerrrd · 1 year ago
He's a Pro, don't tell him he's wrong 😂
@subsubl · 8 months ago
😂
@gregorengelter1165 · 1 year ago
I also got myself an M40 a few months ago, but cooling it with air is not really a good solution in my opinion. I was lucky enough to get a Titan X (Maxwell) water block from EK for 40€/~44USD. With it, the card runs perfectly and reaches a maximum of 60°C/140°F under full load. If you are not so lucky, I would still recommend one of those AiO CPU-to-GPU adapters (e.g. from NZXT). Air cooling is comparatively huge and, most of the time, extremely loud.
@SpottedHares · 1 year ago
So according to Nvidia's own specs, the M40 uses the same board as the Titan X and the 900 series. So theoretically, any cooling system that works for either of those two should also work on the M40.
@KomradeMikhail · 1 year ago
SD, GPT, and other AI apps _still_ not taking advantage of AI Tensor cores... Literally what they were invented for.
@gardenerofthesun · 10 months ago
As far as I know, llama.cpp can use tensor cores
@StitchTheOtter · 1 year ago
I did get myself a P40 for €170. RTX 2080 gaming performance, and 24GB GDDR5 at 694.3 GB/s. Stable Diffusion on my 2080 runs around 5-10x faster than on the P40, but it would make a good price/performance cloud-gaming GPU.
@madman1397 · 1 year ago
Tesla P40 24GB cards are on eBay for sub-$200 now. Considering one for my server.
@bioboi4438 · 1 month ago
What about 3x Nvidia Tesla K80? Each with 4,992 CUDA cores and 24GB of VRAM.
@Bjarkus3 · 4 months ago
If you put a P40 with a 3090, will it be bottlenecked at P40 speeds, or will it average out?
@zilog1 · 1 year ago
They are going for $50 currently. Get a server rack and fill them up!
@KratomSyndicate · 1 year ago
I just bought an RTX 4090 last night, plus all the parts for a new desktop (i9-13900K, MSI MEG Z790, 128GB DDR5, 4x Samsung 990 Pros), just to do SD and AI. Maybe overkill.
@Mark300win · 1 year ago
Dude you’re loaded 😁$
@sa_med · 10 months ago
Definitely not overkill if it's for professional use
@schifferu · 1 year ago
Got my Tesla M40 a while back, and now have fan cooling on it (EVGA SC GTX 980 Ti cooler) to mess around with, but just look at the power consumption 😅😅
@FlexibleToast · 1 year ago
You say you need a newer motherboard to use the P40. Does any motherboard with PCIe x16 3.0 work?
@k-osmonaut8807 · 1 year ago
Yes, as long as it supports Above 4G Decoding
@brianwesley28 · 3 months ago
Also don't forget the fan.
@thespecialkid1384 · 22 days ago
You can get dual M10s off eBay second-hand for around £200; they each have 32GB of VRAM and 2,560 CUDA cores, so that's 64GB of VRAM and 5,120 CUDA cores for just £200. I use mine for creating ML models in Python.
@DanRegalia · 1 year ago
So, I picked up a P40 after watching this video... Thanks! Do you have any videos that talk about loading these LLMs, or whether I should go with Linux/Windows/etc., maybe install JetPack from the Nvidia downloads? I've screwed around a little with Hugging Face, and that made me want to get the card to run better models, but rabbit hole after rabbit hole, I'm questioning my original strategy.
@NovaspiritTech · 1 year ago
I'm glad you were able to pick up a P40 and not the M40, since the Pascal arch can run 4-bit modes, which covers most LLM models. But LLMs change so rapidly I can't even keep up myself; I have been running the Docker container from github.com/Atinoda/text-generation-webui-docker. But yes, this is a deep rabbit hole, I feel your pain.
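For anyone who wants something lighter than the webui container, a minimal sketch using llama-cpp-python instead (an assumption, not what the container above necessarily uses; the model path is a placeholder for any 4-bit GGUF file):

# Hedged sketch: running a 4-bit-quantized model with llama-cpp-python
# (pip install llama-cpp-python, built with CUDA support).
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-13b.Q4_K_M.gguf",  # placeholder: any Q4 GGUF
    n_gpu_layers=-1,  # offload every layer to the GPU (fits in a P40's 24 GB)
    n_ctx=4096,       # context window
)
out = llm("Q: What is a Tesla P40? A:", max_tokens=64)
print(out["choices"][0]["text"])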
@vap0rtranz · 7 months ago
The easiest out-of-the-box apps for running local LLMs are GPT4All and AnythingLLM. Hugging Face requires lots of hugging to not sink into rabbit holes :) The apps I mention keep things simple, and both have active Discord channels that are helpful too.
@l0gic23 · 4 months ago
Remember how much it was at the time?
@DanRegalia · 3 months ago
@@l0gic23 180 bucks locally here, off of FB Marketplace.
@alignedfibers · 1 year ago
I went with the K80, but Stable Diffusion only runs with torch 1.12 and CUDA 11.3, and right now it only runs on 12GB, half the memory and half the GPU, because the K80 is a dual-GPU card. The M40 should allow a modern CUDA and Nvidia driver, and also needs no workaround to access the full 24GB, unlike the K80.
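To make the dual-GPU point concrete, a small sketch (assuming PyTorch) of how a K80 presents itself: two separate 12 GB devices rather than one 24 GB pool:

import torch

# A K80 enumerates as two devices; each allocation is confined to one
# GPU's own 12 GB, so nothing automatically spans the full 24 GB.
print(torch.cuda.device_count())              # 2 on a single K80
a = torch.zeros(1024, 1024, device="cuda:0")  # lives in GPU 0's 12 GB pool
b = torch.zeros(1024, 1024, device="cuda:1")  # lives in GPU 1's 12 GB pool
# To use both halves, a model has to be split across devices manually.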
@joshuachiriboga5305 · 8 months ago
Thank you, I have been looking for this info
@BoominGame · 8 months ago
Does it use the whole 24GB of VRAM? Because it's basically several cores put together, is the VRAM working as one pool?
@AChrivia · 1 year ago
2:21 Actually, that Tesla card has 1,150 more CUDA cores than that 2070... 3,072 - 1,922 = 1,150. The only thing I'm curious about is how well it can mine. 🤔 If anything, why the hell wouldn't you just get a 3090 Ti? It has 10,496 CUDA cores, which is far and beyond the Tesla in capability for both work and gaming. If it's due to sheer price, I get it, but the specs are still beyond what you currently have.
@Antassium · 1 year ago
Cost:Performance...
@jamestaylor1849 · 2 months ago
I've got to ask: why do you say it needs PCIe Gen 4 and a newer motherboard? Documentation says it's PCIe 3.
@timomustamaki5407 · 1 year ago
I have been planning this move as well, as the M40 is dirt cheap on eBay. But I worry about one thing you did not touch on in this video (or at least I did not notice if you did): how did you solve the power cabling issue? I believe the M40 does not take a regular PCIe GPU power cable but needs something different, an 8-pin cable?
@KiraSlith · 1 year ago
That's right, the Tesla M40 and P40 use an EPS (aka "8-pin CPU") cable, which can thankfully be resolved using an adapter cable. Just a note: the 6-pin-PCIe-to-8-pin-EPS cables some Chinese sellers offer should ONLY be used with a dedicated cable run from the PSU, to avoid cable meltdowns! Thankfully this isn't an issue if you're using an HP Z840 (which also conveniently solves the airflow issue), or a custom modular PSU with plenty of PCIe power connections, but it can quickly become an issue for something like a Dell T7920.
@mythuan2000 · 2 months ago
Guys, have you ever heard of mining GPUs? If you have, then you have a solution for maybe 10 or 20 cards at once, so why complain about 2 or 3 cards?
@fuba44 · 1 year ago
But wait, I was under the impression that both the M40 and the P40 are dual-GPU cards, so the 24GB of VRAM is split between the two GPUs. Or am I mistaken? When I look up the specs it looks like only 12GB per GPU.
@unicronbot · 1 year ago
The M40 and P40 are single-GPU cards
@yb801 · 1 year ago
I think you are talking about the K80 GPU.
@joshuachiriboga5305 · 8 months ago
The Tesla K80 with 24GB of VRAM claims a setup of two systems, each with its own CUDA cores and VRAM. When running Stable Diffusion, does it behave as one GPU with 24GB, or does it behave as two? Does it run out of VRAM at 12GB or at 24GB in image production?
@BoominGame · 8 months ago
That's exactly my question.
@carlosmiguelpimientatovar8458 · 11 months ago
Excellent video. In my case I have a workstation with an MSI X99A TOMAHAWK motherboard and an Intel Xeon E5-2699 v3 processor (and I currently use 3 monitors). For that I installed an AMD FirePro W7100 GPU, which works very well for me in SolidWorks. The RAM is 32GB non-ECC. The problem is that I am learning to use ANSYS, and this software is married to Nvidia; looking at the ANSYS GPU-compatibility lists for GPU-accelerated computation, I see that the K80 is supported, and given the second-hand price, I am interested in purchasing one. How can I configure my system to install an Nvidia Tesla K80 and have the AMD GPU keep driving my monitors as it currently does? The Nvidia K80 has 24GB of RAM; can this be affected when using it in conjunction with the AMD GPU that only has 8GB? Would the K80 be restricted to the RAM of the FirePro W7100? My PSU is 700 watts. Thank you.
@TheRainbowdashy · 10 months ago
How does the P40 perform for video editing and 3D design programs like Blender?
@beholder4465 · 1 year ago
I have an ASUS H410 HDV M.2 with an Intel chipset. Is compatibility good with the Tesla M40? Ty
@vap0rtranz · 7 months ago
Great explanation. Basically it's gamers vs AI hackers. The AI models want to fit into VRAM but are huge, so the 8GB or 12GB VRAM cards can't run them, and getting a new huge-VRAM GPU is hella expensive right now. So an older card with lots of VRAM works. Also, gamers tend to overclock/overheat their cards, but the Tesla and Quadro cards are usually datacenter liquidations, so there's less risk of getting a fried GPU. BTW: the P40 is the newer version of the M40.
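The fit-into-VRAM point is easy to put numbers on. A back-of-envelope sketch (weights only; real use adds KV cache and other overhead):

def vram_gb(n_params: float, bytes_per_param: float) -> float:
    # Rough weight-only footprint of a model, in GB.
    return n_params * bytes_per_param / 1024**3

print(vram_gb(13e9, 2.0))  # 13B at fp16  -> ~24.2 GB: fills a 24 GB M40/P40
print(vram_gb(13e9, 0.5))  # 13B at 4-bit -> ~6.1 GB: within an 8 GB gaming card
print(vram_gb(30e9, 0.5))  # 30B at 4-bit -> ~14.0 GB: needs the big-VRAM cards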
@Robstercraw · 1 year ago
You can't just plug that card in and go. There are driver issues. Did you get it working?
@gardenerofthesun · 10 months ago
Owner of a P40 and a 3090 in the same PC. No problems whatsoever; just install the Studio driver.
@edgecrush3r · 1 year ago
I just purchased a Tesla P4 some weeks ago and am having a blast with it. The low-profile card even fits in the QNAP 472XT chassis. Passthrough works fine (minor tweaks). Currently compiling a kernel to get support for vGPU (if I ever succeed).
@titopancho · 5 months ago
After watching your video I tried to do the same, but I had a problem. I have an HP DL380 server and purchased the Nvidia Tesla P100 16GB, but I can't find the power cable. Watching other people, I am afraid to buy the wrong one and fry my server... Can you please tell me the right cable to buy?
@bopal93 · 1 year ago
What's the idle power consumption of the M40? I'm thinking of using it in my server but can't find details on the internet. Thanks
@jerry5566 · 1 year ago
The P40 is good; my only concern is that it has probably been used for mining.
@Antassium · 1 year ago
Mining has been proven to not cause any more significant wear than regular duty cycles. In fact, in some situations a mining rig is a cleaner and safer environment than a PC case on the floor in some person's home, with toddlers sloshing their chocky milk around, for example 😂
@chjpiu · 9 months ago
Can you suggest a desktop workstation that can take a Tesla M40? Thank you so much
@BoominGame · 8 months ago
Look for an HP Z840, but buy the GPU separately, because you are probably going to pay way more if it's included.
@jetfluxdeluxe · 1 year ago
What is the "idle" power draw of that, if it's on 24/7 in a server? Can it power down? Can't find info on that online.
@execration_texts · 1 year ago
My M40 idled at ~30 watts; the P40 is closer to 20.
@davidburgess2673 · 1 year ago
What about HBCC on a Vega 64 for an "unlimited" boost in RAM? Albeit a little slower, but with video out etc.
@simpernchong · 1 year ago
Great video. Thanks!!
@charleswofford5515 · 1 year ago
For anyone wanting to do this: I found the best cooling solution is a Zotac GTX 980 AMP Edition 4GB model. It has the exact same footprint, and the circuit board is nearly identical; it bolts right on with very few modifications. You will need to use parts from both the Tesla and Zotac GPUs to make it work. Been running mine for a while now without issue.
@idcrafter-cgi · 1 year ago
My 4090 takes 2 seconds to make a 512x512 at 25 steps. It only has 24GB of VRAM, which means I can only make roughly 2000x2000 images with no upscaling.
@seanoneill9130 · 1 year ago
Home Depot has free delivery.
@NovaspiritTech · 1 year ago
😂
@garthkey · 1 year ago
With them having the choice of the worst wood, no thanks
@bulcub · 1 year ago
I have a server with a multiple-drive storage bay (24 bays) that I'm going to repurpose as a video renderer. I wanted to know if this is possible? Would I need Proxmox etc., and would the P40 model be sufficient?
@NovaspiritTech · 1 year ago
I have a video on this topic, using Tdarr
@blackthirt33n · 4 months ago
I have one of these cards; how do I use it on an Ubuntu 22.04 computer?
@cultureshock5000 · 1 year ago
Is the 8GB lo-pro good for my SFF Dell? I like my RX 550, but I could play a lot more stuff. I bet I could play Starfield at 1080p on low on the 8GB M4... is it worth the 90 bucks?
@hardbrocklife · 1 year ago
So P40 > M40?
@b_28_vaidande_ayush93 · 1 year ago
Yes
@ghardware_3034 · 1 year ago
@@b_28_vaidande_ayush93 For training or FP16 inference get the P100; it has decent FP16 performance. The P40 is horrible at that, since it was specialized for INT8 inference.
@joshuascholar3220 · 10 months ago
I'm about to try it with a 32GB Radeon Instinct MI50.
@MWcrazyhorse · 1 year ago
How does this compare to an RTX A2000?
@sergiodeplata · 1 year ago
You can use both cards simultaneously. There will be two CUDA devices.
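To make that concrete, a small sketch of pinning a job to one of the two CUDA devices (the index-to-card mapping is an assumption; check nvidia-smi to see which index is which card):

import os
# Hide all but the chosen card BEFORE CUDA initializes.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch
x = torch.randn(8, 8, device="cuda")  # "cuda" now means the one visible card
print(torch.cuda.get_device_name(0))  # confirm which GPU that actually is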
@alignedfibers · 1 year ago
m40?
@brachisaurous · 3 months ago
The P100 would be better for Stable Diffusion
@win7best · 4 months ago
The P40 is already way better for the price; also, if you wanted more CUDA cores you could have gotten 2 K80s for the same price.
@nodewizard · 4 months ago
Just buy a used RTX 3090 for $500. Works great with generative art, LLMs, etc.
@MrHasie · 1 year ago
Now, I have Fit, what’s its comparison? 🤭
@zygge · 1 year ago
PCs don't need HDMI output to boot. Any display interface is OK: VGA, DVI, or DP.
@112Famine · 1 year ago
Has anyone been able to get this server graphics card to play video games? Or can you only get it to work the way you have it, running tasks? It's a "smart" card, like how cars are able to drive themselves.
@llortaton2834 · 1 year ago
All Tesla cards can play games. The problem with them is the cooling, because there is no heatsink fan; you have to either buy your own 3D-printed shroud or have a server that shoots air across the chassis.
@FreakyDudeEx · 1 year ago
Kind of sad that the price of these cards in my region is ridiculous... It's actually cheaper to get a second-hand RTX 3090 than the P40, and the M40 is double the price compared to the one in this video...
@gileneusz · 11 months ago
isn't 4090 faster?
@robertfontaine3650 · 1 year ago
That is a heck of a lot cheaper than the 3090s
@garrettnilan5609 · 1 year ago
Can you run a Stable Diffusion test and show us how to set it up, please?
@TheRealBossman7309 · 1 year ago
Great video👍
@markconger8049 · 1 year ago
The Ford F150 of graphics cards. Slick!
@tomaszmaciaszczyk2116 · 9 months ago
CUDA cores, my friend. I have this card on my table right now. Greetings from Poland.
@mateuslima788 · 6 months ago
You could've made an actual comparison.
@jameswubbolt7787 · 1 year ago
I never knew. THANKS.
@akissot1402 · 1 year ago
Finally, I will be able to fine-tune and upgrade my gynoid. Btw the 3090 has 10,496 CUDA cores, and it's about $850 at the cheapest on the market brand new.
@shlomitgueta · 1 year ago
I have an NVIDIA GeForce GTX 1080 Ti with 3,584 CUDA cores, and I was thinking it is so old lol
@НеОбычныйПользователь · 7 months ago
He bought a Maxwell and is bragging about it. If only it were at least a Pascal...
@skullpoly1967 · 1 year ago
Yay rmiddle
@MaikeLDave · 1 year ago
First!
@itoxic-len7289 · 1 year ago
Second!
@unclejeezy674 · 1 year ago
Try --medvram or --lowvram. 24GB should be able to get 2048x2048 with --lowvram.