Comments
@oscarmejia2174
@oscarmejia2174 4 hours ago
This is great. What other GPUs are supported on this platform? Can you run any modern Nvidia card like a 3090?
@Artikel.1
@Artikel.1 8 hours ago
Really great video! I am considering working with ML and developing myself further in the field of AI. But the price of the P40 seems very strange to me. I feel like I haven't found a P40 anywhere on the internet that costs less than $200. Maybe it's because I live somewhere else. But more than $200 for a single P40 is a bit much for me. I'm still in school, so the price is a deciding factor.
@DoBaMan77
@DoBaMan77 17 hours ago
This is really awesome and worked pretty well. One thing to mention: the command (~/miniconda3/bin/conda init bash) is missing from your cookbook and is needed to use conda. And can you make a video on how to set up MS Visual Studio Code and Docker the way you use them? Kind regards, Dom
@Anthony-c7o
@Anthony-c7o 1 day ago
I can't believe you blocked the airflow with the cables :(
@drewroyster3046
@drewroyster3046 2 days ago
Sorry to be this guy, but has anyone got a TL;DR?
@TheDataDaddi
@TheDataDaddi 1 day ago
Hi there. Thanks for the comment! Yeah, I apologize for the video length. There was just a lot to discuss. I know a lot of people don't have time for full videos of this length (myself included), so maybe I'll start doing a video summary at the beginning so those who are pressed for time can watch that and move on. For those reading this, please let me know if you think this might be beneficial. TL;DR: I'll do my best to summarize in a paragraph. The P40 and P100 are still some of the best (if not the best) GPUs you can buy right now for the money. The P40 is overall my recommendation for most people for general AI/ML/DL tasks due to comparable performance to the P100 and more VRAM (P40 ~5.5x CPU throughput vs. P100 ~6.5x). The P100 is still the most well-rounded GPU for the money, so if you need various levels of precision for your workloads, it would be a good choice. The RTX 3090 offers the best performance of the three overall (~17x CPU), especially if you have use cases where you use or can use mixed precision. For those interested in working with LLMs specifically, it would be my recommendation. Hope this helps! Please let me know if you have any specific questions, and I will do my best to answer them succinctly.
@drewroyster3046
@drewroyster3046 1 day ago
@@TheDataDaddi Thank you! It seemed like thorough, well-thought-out content, but I didn't have time for the full deep dive. Thanks!
@repixelatedmc
@repixelatedmc 2 days ago
Wow! Easy to follow, no gibberish, and pure information with clear and readable statistics!
@TheDataDaddi
@TheDataDaddi 1 day ago
Hi there. Thanks so much for the positive feedback. So glad you found it clear and useful!
@fikrifirman
@fikrifirman 2 days ago
Great video and information, really helpful. Thanks for the hard work!
@TheDataDaddi
@TheDataDaddi 1 day ago
Hi there. Of course! So glad to hear that it helped you!
@VastCNC
@VastCNC 2 days ago
I'm coming from Power BI as well, and the new company I'm working with is a Google shop, so I'm definitely interested in a Looker Studio comparison if you're game.
@TheDataDaddi
@TheDataDaddi 1 day ago
Hi there! Thanks so much for the feedback. Awesome. I am glad to hear there is some interest here. This is definitely one of the better tools I have used, and it's FREE, lol. Stay tuned for a video here. I'll try to make one as soon as I get a free evening!
@VastCNC
@VastCNC 1 day ago
@@TheDataDaddi The most critical question, though: do they have a decent dark mode? Microsoft is weak in the dark mode game for the Office suite.
@ICanDoThatToo2
@ICanDoThatToo2 2 days ago
Thanks for this! We've been wondering since Craft Computing mentioned it recently. But at 30:00 I don't follow your math. First, the 2xP40 bar _says_ 10.74 but lies on the graph at over 15. I believe this bar should stop at 10.74, which not only shows its true value, but the height of the blue bar would then visually show the performance added by the second GPU. Second, I can't see where the throughput-per-dollar numbers come from. The 3090 has T=17 and $=820, so it should appear here at T/$ = 0.02 (or $/T = 48). Where did the 141 come from? Third, if you're going to look at running costs, then electricity is very important. In some locations electricity costs can exceed server costs in well under a year. It's the reason this hardware is so cheap -- companies can't afford to keep it running.
@TheDataDaddi
@TheDataDaddi 1 day ago
Hi there. Thanks so much for the comment! 1) Yes, you are absolutely right. This graph does read badly. I stacked the bars to save space because my laptop screen is small, and I thought it would be a bit more readable for the video. However, I do agree it is misleading as it is. I will update the report to fix this. 2) This has to do with the way the average is calculated. The CPU-scaled throughput for each scenario is divided by the price of the GPU(s) for that scenario (GPU, number of GPUs, model, precision, task, etc.). For more specific details, please take a look at the raw data in the Google Sheet linked in the video description; it should make things clearer if there is confusion here. After some review, in a fair number of cases the LSTM did not see much performance benefit over the CPU. This drove the GPU-scaled throughput per dollar way up nominally, and those values artificially dragged up the overall average. I tested this by switching the aggregation method to median rather than average, and the values are much more in line with what you would expect based on your observation. You can also see this if you just look at the BERT or ResNet50 scenarios, which are much more in line with expectations. In summary, the numbers do appear to be correct even if they are higher than expected globally. The data when training the LSTM was significantly different from the other models, which leads me to believe there may be some issue with the model setup, dataset, or hyperparameters (or some other reason I am missing). My gut tells me I just did not use large enough batch sizes or a large enough dataset. In any case, more digging will be required to understand why this occurred. 3) Yes, this is an excellent point. I actually wanted to include it in the video; however, I am not able to do so at the moment. I am away from my homelab for the summer, so I cannot access the devices I have set up to measure the power consumption of these GPUs. Once I get back, I plan on making a video just to address this. I apologize for not being able to include it here, as I definitely agree it is a major factor to consider when comparing GPUs, especially if electricity costs are high. If you are interested, stay tuned, and I will put out a video on this topic as soon as I am able. I really appreciate your feedback; there is truly no substitute for a second set of eyes. Hope I answered your questions. Cheers!
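For anyone trying to reproduce point 2: averaging per-scenario throughput-per-dollar ratios is very sensitive to a few extreme scenarios, while the median is not. A minimal sketch of that effect, with made-up scenario names and numbers (the real data is in the Google Sheet linked in the video description):

```python
# Illustrative sketch: why the average "throughput per dollar" can land far above
# the median when a few scenarios are outliers. All numbers below are made up.
from statistics import mean, median

gpu_price_usd = 820  # assumed RTX 3090 price used in the comparison

# throughput for each (model, precision, batch size, ...) scenario
scenario_throughput = {
    "resnet50_fp32": 17.0,
    "bert_fp16": 21.0,
    "lstm_fp32": 230000.0,  # one outlier-like scenario drags the mean way up
}

per_dollar = [t / gpu_price_usd for t in scenario_throughput.values()]

print(f"mean   throughput/$ = {mean(per_dollar):.2f}")    # dominated by the outlier
print(f"median throughput/$ = {median(per_dollar):.3f}")  # close to the 'expected' ~0.02
```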
@BaldyMacbeard
@BaldyMacbeard 3 days ago
Wow... randomly stumbled upon the video. Thanks, super useful! I wish someone would do a nice dataset like this, but with multi-GPU configs, NVLink vs. PCIe, and so on.
@TheDataDaddi
@TheDataDaddi 1 day ago
Hi there. Thanks so much for the comment! Gotcha. So in the video the RTX 3090s are tested in 1- and 2-GPU configurations with NVLink. I will definitely try to make a video in the future highlighting the performance difference between NVLink and PCIe. I would have done it in this video, but I am unfortunately not able to physically access my GPUs for the next few months. Thanks so much for the suggestion, and please stay tuned for a video on this topic!
@makerspersona5456
@makerspersona5456 3 days ago
We can't hear you... :(
@makerspersona5456
@makerspersona5456 3 days ago
Might be good to invest in a good mic and make these videos more concise too :)
@makerspersona5456
@makerspersona5456 3 days ago
It's informative content, but it feels like a corporate team meeting. I'm sure this would appeal to more people if you made it more YouTube-friendly.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thank you so much for letting me know. Oh no! I am so sorry. I am traveling right now, so I do not have my normal mic for recording. I really apologize. I did spot-check the video, and the parts I watched seemed to have at least okay audio. Can you let me know specifically where in the video the sound is bad? I will do my best to fix it.
@TheDataDaddi
@TheDataDaddi 3 days ago
Agreed. I will try to find a better solution for while I am traveling. Also, I do apologize for the length of the video; sometimes it is hard to guess what people will find interesting and what is too much. Hopefully the automatic timestamps and fast-forward feature will allow viewers to skip the parts they find boring. I really do appreciate the feedback here, though. I have gotten this before, and believe it or not, I am working on being more concise in my videos. I will try to continue improving here.
@makerspersona5456
@makerspersona5456 3 days ago
@@TheDataDaddi The video is extremely useful and well made in terms of content. I just have to max out my volume from the start and play at 1.5x. It's really just my opinion, and no one else has said this.
@gorangagrawal
@gorangagrawal 3 days ago
If possible, please upload an NVLink and PCIe extender video. It would be really helpful to understand them.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks for your comment! Do you mean a general video on how they work, or specifically with respect to the build I have done on the channel? Either way, here is a video that might help: kzbin.info/www/bejne/sKPGfHp8ZpppmKM Let me know if this helps explain things for you!
@gorangagrawal
@gorangagrawal 3 days ago
@@TheDataDaddi Thanks for sharing your build video. It covers the information I was looking for.
@Horesmi
@Horesmi 3 days ago
For some reason, 3090s are going for $500 over here, and there are a lot of them on the market. Crypto crash or something? Anyway, that seems to change the calculations a lot in my case.
@publicsectordirect982
@publicsectordirect982 3 days ago
Over where?
@Horesmi
@Horesmi 3 days ago
@@publicsectordirect982 Ukraine
@ericspecullaas2841
@ericspecullaas2841 3 days ago
@publicsectordirect982 My guess: China...
@TheDataDaddi
@TheDataDaddi 3 days ago
Hey there. Thanks so much for sharing! Yeah, that does change things quite a bit! If you can get the RTX 3090 for only $500, first of all, I am jealous, lol. Second of all, I think you are going to be hard-pressed to find a better GPU for that price. You can also use my spreadsheet and plug in the numbers to see how it compares to the P40 and P100 in your area to make a more data-driven decision. Personally, though, I'd go with the RTX 3090 at that price! Cheers!
@JimCareyMulligan
@JimCareyMulligan 3 days ago
Thank you for your work. Do you have any plans to test the Tesla V100 16GB? They go for half the price of the 3090 and support NVLink.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for your comment! So, at the moment, I do not have any plans to test the V100 16GB. Where I am located, they are almost the exact same price as the RTX 3090 with less VRAM, so I hadn't really considered them because I think there are better options for the price. However, if you can find them at half the cost, please let me know where, and I would be happy to make a video on them. I would also be willing to test people's GPUs. For example, if you had a V100 16GB, I would be willing to pay for shipping both ways so I can test it. I don't know if you or anyone would go for this, but it would allow me to test more GPUs without having to incur the full cost of buying them. I have been looking at the V100 32GB SXM2 versions, though. These have tons of VRAM and great performance for less than $1000 on eBay where I am. The only problem is finding a server to put them in; there are not many options that I can find. So if I do make a video, it will likely be with those GPUs, not the standard PCIe 3.0 V100 16GB.
@werthersoriginal
@werthersoriginal 3 days ago
Oh wow, I'm in the market for a 3090 for LLMs, but I've been eyeing the P40s because of their price. I saw your videos on the R720s, and now I'm wondering if I can put a P40 in my R710.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for the question! In theory, this should absolutely be possible. I can't vouch for it because I have never tried it personally, but I would be surprised if it didn't work. The only thing to note here is that the R710 has PCIe 2.0, so data transfer might be a bottleneck at some point. I'm really curious about this, so if you end up trying it, please do let me know how it turns out! Best of luck!
@H0mework
@H0mework 3 days ago
I'm happy whenever you upload
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thank you so much for the kind words! I really, really appreciate it! Makes all the work to make the videos worth it.
@jaroman
@jaroman 3 days ago
Does PCIe 3.0 vs. PCIe 4.0 make a difference in this kind of setup?
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for the question! I have not used PCIe 4.0 because none of my servers support it (the ones that do are much more expensive), so I cannot say for sure how much of a difference it makes. What I can say is that while using PCIe 3.0, I have not experienced any major bottlenecks due to data transfer. Even the evidence in this video supports that. Now, if you have better GPUs and are loading extremely large batch sizes, then PCIe 4.0 might make a huge difference. For me personally, so far I would say that I have not really "needed" PCIe 4.0.
@scentilatingone2148
@scentilatingone2148 3 days ago
Brilliant bud.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Really appreciate the kind words! So glad you enjoyed the content!
@TheNicCraft25-go8he
@TheNicCraft25-go8he 4 days ago
What a legend!
@TheDataDaddi
@TheDataDaddi 3 days ago
Thank you very much!
@GrossGeneralization
@GrossGeneralization 4 days ago
Did you look into whether the 1U bump lid (Supermicro part number MCP-230-41806-0N) clears an RTX 3090? It looks like that Zotac card is a lot lower than some other 3090 cards; you might get by with the MCP-230-41803-0N, which was designed to clear the GTX cables. (Note that these were part numbers for the 4029GP-TRT2, but the chassis looks pretty much the same.)
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thank you so much for the comment! So, I have not looked into this, but it is a great idea. I do not know 100% if it would work, but my gut tells me it would. I think this would be a great way to keep the GPUs in the server. One problem for me is that I do not have 1U worth of space in my rack because it is completely full. However, for others it would likely be a great option. I might order one just to try it and let viewers know for sure whether it works or not. Also, if you go this route, please let me know the results. And I heard from a viewer recently that the RTX 3090 Founders Edition fits in the chassis with the standard lid. Might be worth checking that out as well.
@tsclly2377
@tsclly2377 5 days ago
Yup, just got a couple of P40s for an ML350P after investigating the NVIDIA site. Slow, yes, but for cheap and something that can run on MS Win2012 Enterprise, it's the ticket (old ETH mining machine). It will run in the garage without air conditioning... slow but steady.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for the comment! Yep. These are still a great option in my mind. Slow and steady wins the race as they say. Lol
@dustintunget4177
@dustintunget4177 7 days ago
This is still a solid video 5 months later. I ended up with a 4U ATX case and one of the Epyc/Supermicro combos from eBay, then 2x P40 + a 3060 12GB. The upside is that a 4U ATX rack case can use 120mm fans that spin slower and quieter... and you can fit more GPUs and use standard PSUs. I'd like to see this kind of breakdown in the context of an EATX case (rack or desktop); those results could be interesting. I think mine was in the realm of $1700, but that's not with much storage... I bought it over time, so I don't have solid numbers. Great video, earned a sub!
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thank you so much for the comment and the sub! Really, really appreciate it! Also, I really appreciate you sharing your build; it is always great to see how other people are doing things. It definitely sounds like a great setup with the 4U ATX case, the Epyc/Supermicro combo, and those GPUs. The 4U ATX rack case with 120mm fans definitely helps keep things quieter and cooler, which is a big plus, and the ability to fit more GPUs and use standard PSUs adds a lot of flexibility. It would indeed be interesting to see a breakdown like this for an EATX case, whether in a rack or desktop configuration. The potential for better airflow and additional expansion options could make a significant difference, especially for those looking to maximize their build's capabilities. Maybe one of these days I can do a build like this! I think it would be super interesting. Your total cost of around $1700 seems quite reasonable for such a robust setup, even without much storage, and it's great to hear you managed to spread the cost over time. Glad you enjoyed the video, and thanks for subscribing! If you have any more insights or updates on your build, feel free to share.
@xxriverbeetxx1.065
@xxriverbeetxx1.065 8 days ago
Hi, I am really into servers and homelabs. What use cases does such an AI server have? I run Stable Diffusion on my PC, but I don't use it regularly. What are some AIs you ran? I can't imagine one for which it is worth buying a server.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for the comment! This server setup specifically is really meant to be a good general purpose setup for a wide variety of AI/ML/DL tasks. It should be suitable for most common DL applications. Some of the most common deep learning models and their use cases include GPT (Generative Pre-trained Transformer) for text generation, language translation, and conversational AI, with examples like GPT-3 and GPT-4. BERT (Bidirectional Encoder Representations from Transformers) is widely used for natural language understanding, question answering, and text classification, including variants such as BERT, RoBERTa, and DistilBERT. In the realm of image processing, models like VGG (Visual Geometry Group) and ResNet (Residual Networks) are popular for image classification and object detection, with examples like VGG16, VGG19, ResNet50, and ResNet101. YOLO (You Only Look Once) is known for real-time object detection, with versions like YOLOv3, YOLOv4, and YOLOv5. GANs (Generative Adversarial Networks) are used for image generation, style transfer, and data augmentation, including models like DCGAN, StyleGAN, and CycleGAN. For image segmentation and medical imaging, UNet and V-Net are commonly used. Transformer models, such as the original Transformer and T5 (Text-to-Text Transfer Transformer), excel in sequence-to-sequence tasks and language translation. RNNs (Recurrent Neural Networks), including LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), are applied in time series prediction, text generation, and speech recognition. MobileNet models, like MobileNetV1, V2, and V3, are designed for efficient image classification on mobile and edge devices, while EfficientNet models (B0 to B7) offer improved efficiency in image classification and object detection. DALL-E models are used for generating images from textual descriptions, with notable examples being DALL-E and DALL-E 2. CLIP (Contrastive Language-Image Pre-Training) connects images and text, enabling zero-shot learning. Lastly, AlexNet is a foundational model for image classification. This should also allow you to get into stable diffusion and work with some of the smaller open source LLMs locally like LLaMa-7B, Falcon-7B, Mistral-7B, etc. Hope this helps!
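For the "smaller open-source LLMs locally" part, here is a minimal sketch of what getting started usually looks like, assuming the Hugging Face transformers library (plus accelerate) and enough VRAM for the chosen model; the model ID below is just one example of a small open model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example model id -- any small open-source causal LM from the Hub works the same way.
model_id = "mistralai/Mistral-7B-v0.1"

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves VRAM vs. FP32
    device_map="auto",          # spreads layers across available GPUs (requires accelerate)
)

inputs = tok("The P40 and P100 are", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=True))
```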
@GrossGeneralization
@GrossGeneralization 8 days ago
You probably found it already, but it looks like one of your RAM modules isn't fully seated (approximately halfway between the two CPUs).
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Yep, I did catch that and have since fixed it. Thank you so much for letting me know though. Really appreciate it!
@tsclly2377
@tsclly2377 8 days ago
Heat kills... such is the RTX 4090 experience. I'm going slow and low wattage. Also, the SSD needs to be one with big petabyte-write endurance. NVLink is slow, 112 GB/s, and it just dumps onto the CUDA cores/memory of a GPU that isn't really doing the work, especially if that GPU is stuck in a 16-lane slot that is really only 8 lanes. It's best to get a bigger-VRAM card like an A6000 at 48GB instead of NVLinking two 24GB A5000s; plus the link itself goes for premium $$$. Me, I'm going the 'Potato'/Orin route that doesn't need a new air conditioner/heat pump, under 350W, and forget the need for speed; I can wait 30 minutes instead of 3. It is not the speed, but the GPU's ability to actually do the work, and most can't. 48GB VRAM with 2000 CUDA cores can do it; top TDP is money spent. This review is best for MS BitNet 1.58, so if you are on Windows, rock on. And I've read the NVLink Quadro bridges work for RTX cards, but not the Lovelace ones; those need the Lovelace Link-2s.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for the comment! Yeah, this is definitely a good point. I agree that more VRAM is typically better even if you have lower throughput; it is better to be able to do the task slowly than not at all because you don't have enough VRAM for your use case.
@wasifmasood969
@wasifmasood969 9 days ago
Hi, I am facing a different challenge now. I have this machine at home, and it creates a lot of noise. I have checked the fan settings; even on "Optimal" mode it is loud, but when I start some training, it becomes really unbearable. I have checked various fan configurations; unfortunately, none of them is a "Quiet" mode. What else can I do to make it reasonable to sit next to? Are there any high-quality fans that would make a difference? Many thanks in advance!
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. So I actually have a video related to this: kzbin.info/www/bejne/iIa6ZHSvatemebs Check this out. In the video, I show how to adjust the fans manually. You could manually put yours on the lowest setting, and you may be able to turn them off completely. However, please be careful when doing this; it is never good to run your GPUs or the rest of your system too hot for too long. Another alternative might be to install some snail blower fans that make less noise and then lower the native fans to the lowest setting. Something like the following may work for you: www.ebay.com/itm/186459922797?mkcid=16&mkevt=1&mkrid=711-127632-2357-0&ssspo=2ucjrgqosnm&sssrc=2047675&ssuid=xv3fo9_stiq&widget_ver=artemis&media=COPY
@romeo-sierra
@romeo-sierra 10 days ago
Overall a good video. Not misleading or lazy. Don't take YouTube comments to heart; people love to say negative things. It's often all they've got in life. My only suggestion would be to keep making videos. You'll get better with experience.
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thank you so much for the encouraging feedback. Really appreciate it!
@mateuslima788
@mateuslima788 10 days ago
Thank you for the video!
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there! Of course. So glad that you enjoyed the content!
@TheUserIWantedIsTakenAlready
@TheUserIWantedIsTakenAlready 10 days ago
This is awesome, that’s exactly what I was looking for!
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. So glad this is what you needed! Thanks for the comment!
@HuzMS
@HuzMS 13 days ago
Thank you for your hard work. Was the P40 or the P100 the better choice? Also, were you using NVLink?
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there. Thanks so much for the kind words and the comment! It really depends on the use case, but overall, for most people, I would say the P40 is actually the better choice. The only GPUs I currently have that support NVLink are my RTX 3090s, and yes, I have them connected via NVLink.
@b-ranthatway8066
@b-ranthatway8066 13 days ago
So would this mean I can't use a 7900 XT to make AI meme pictures? I've actually been interested in the whole AI thing, even though I'm not a smart dude on tech. (I find it cool just because I can use my computer for something other than gaming/streaming/video editing, but I'm going to try to resist a little against our developing AI overlords, lol.) I know the tech is still developing, but I thought it would be cool to use AI to create a VTuber model to stream with (even if it came out bad, I thought it would be a fun little experiment for some views and laughs). However, one of the hardest parts to upgrade, in my mind, is the GPU. I know AMD is a step or two behind Nvidia (my last and current card is a 1070), but when it comes to price, it's hard to beat. I just don't know if something like a 7900 XT or XTX would at least make up for it vs. a 4070 Ti Super in terms of AI generation. (I still have no idea what app to use to even make stuff with AI on my GPU.) Alright, enough rambling with the thoughts in my brain, I'll keep watching 👌
@TheDataDaddi
@TheDataDaddi 3 days ago
Hi there! Thanks so much for the comment. So, my take here is that NVIDIA GPUs are going to be much easier to work with at this stage. I have heard from some of my viewers that AMD GPUs can and do work; it is just a lot more of a pain to work around bugs, and the learning curve is steeper. NVIDIA is more or less plug and play when it comes to AI/ML/DL, but that is also why you pay a premium. I guess what I would say is that if you want the easier route or don't have time for much troubleshooting, NVIDIA might be the better way to go. However, from the sound of it, you are more partial to AMD GPUs and have many other workloads besides just AI. In your case, it may be a better idea to go with an AMD GPU, because you will get better price-for-performance for all of your other workloads and then deal with the pain of setting it up for your specific AI use case. The 7900 XT is definitely a powerful card and can handle AI tasks, though you might need to use specific software or frameworks that support AMD GPUs, like ROCm. Creating a VTuber model sounds like an interesting project! As a starting point, I would recommend programs like DeepFaceLab for deepfake-style video or some Stable Diffusion flavors to generate images. For 3D modeling, tools like Blender could be helpful, and for real-time animation you might use VMagicMirror or VSeeFace, which can utilize your GPU to bring your VTuber model to life. Hope this helps!
@agriculture7188
@agriculture7188 15 days ago
Do you have a recommendation for what GPU(s) I should purchase for my R720? I mostly plan on running LLMs, Stable Diffusion, and image detection models. I don't have a super high budget and was considering a dual P100 setup, but I wanted the opinion of someone a little more educated in the ML field.
@TheDataDaddi
@TheDataDaddi 14 days ago
Hi there! Thanks so much for the question. So, for all of those things on a low budget, I would probably recommend the P100. You could also go with the P40 for more VRAM. The P100 will handle quantization more efficiently and have significantly higher throughput, which will be important when working with LLMs. The P40 has more VRAM to start with, so you can load larger models, but its FP16 performance is really bad, so its throughput will be a lot worse (theoretically). As budget options, though, I think these could still put some of the smaller open-source LLMs within reach for you to start experimenting with. Hope this helps!
@noth606
@noth606 16 days ago
GFLOPS is not calculated the way it is shown in the video at 15:18. Remove the Giga, which we know, and it is simply FLOating-Point operations per Second (there is some history for why this is used). It is somewhat archaic, since a lot of other things are being done too that aren't incorporated in it, but in general most other operations take fewer cycles than a floating-point one does, because the decimal point needs special attention, so to speak: 15*15 and 1.5*1.5 are the same thing except for tracking the decimal point separately, with the result being 225 or 2.25. What I mean is that the circuit needs additional logic to track decimal points, or rather fractions, which is why we separate floating-point from integer operations - additional hardware is required to track the decimal point "on top of" the integer-type numerical operations. No idea if this makes any sense or is useful; I thought it would be simple to explain until I thought it through and realized I needed to type this as opposed to scribbling on a whiteboard. I'm sure there is a good explanation for it out there; I'm just trying to point at the why, since to a person doing math it's not as obvious as it is to someone designing a circuit to do it.
@TheDataDaddi
@TheDataDaddi 14 days ago
Hi there. Thanks so much for the comment! This is great information; thanks so much for sharing. I did know that floating-point ops were fundamentally different from other operations, though I was not 100% sure why. This makes a lot of sense! Also, if you know the correct formula for calculating FLOPS in general, please let me know.
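For reference, the back-of-the-envelope formula usually quoted for a GPU's theoretical peak is cores × clock × FLOPs per core per cycle, where a fused multiply-add counts as 2 FLOPs. A quick sketch, using the Tesla P100 PCIe figures as commonly published (double-check them against NVIDIA's spec sheet):

```python
# Theoretical peak FLOPS ~= shader (CUDA) cores x clock (Hz) x FLOPs per core per cycle.
# A fused multiply-add (FMA) counts as 2 FLOPs, hence the factor of 2 for FP32.
cuda_cores = 3584          # Tesla P100 (PCIe)
boost_clock_hz = 1.303e9   # ~1303 MHz boost
fp32_flops_per_cycle = 2   # one FMA per core per cycle

peak_tflops = cuda_cores * boost_clock_hz * fp32_flops_per_cycle / 1e12
print(f"~{peak_tflops:.1f} TFLOPS FP32")  # ~9.3 TFLOPS, in line with the published spec
```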
@iasplay224
@iasplay224 20 days ago
How did you install Ubuntu on the Dell server? I created a bootable Ubuntu USB and then installed it onto an SSD, but I cannot boot into that drive. Did you have that issue, or does someone know how to fix it?
@TheDataDaddi
@TheDataDaddi 18 days ago
Hi there. Thanks so much for your comment. The process is normally as easy as: 1) Create a bootable drive with whatever OS you want to install. 2) Plug in the USB. 3) Power on the machine and boot into the BIOS. 4) Adjust the BIOS settings so that the USB device is the first boot choice (although many times you don't even need to do this). 5) Let the machine boot normally, and you should be prompted to install the OS. 6) Install the OS. 7) Restart the machine and boot into the BIOS. 8) Adjust the BIOS settings so that the new drive where the OS is installed is the first boot option. 9) Let the machine boot normally, and you should have a working fresh install of whatever OS you chose.
@TheDataDaddi
@TheDataDaddi 18 days ago
Let me know if this does not work for you. I can try to help you troubleshoot from there.
@iasplay224
@iasplay224 11 days ago
@@TheDataDaddi I was able to solve the issue, but the thing is that I originally tried to plug a 2.5" SSD into the back of the server, and that didn't work in terms of booting. I was able to install the OS on it, but when restarting, the BIOS said no bootable drive was found. My solution was to use a USB-to-SATA adapter and plug that into the internal USB port; that worked, and I could then boot. But the drives are detected in Ubuntu and I can save files to them, so it's weird.
@wasifmasood969
@wasifmasood969 20 days ago
Hi, I have recently bought this system. I see there are two EPS 8-pin connectors (JPW3 and 5) on the motherboard, in addition to the other 8-pin PCIe power connectors. My question is that I have bought a Tesla M10 32GB GPU, which requires an 8-pin EPS connector. Can I connect that GPU to one of these EPS connectors? The card needs an 8-pin 12V EPS connector. What would you suggest? Many thanks for your amazing support.
@TheDataDaddi
@TheDataDaddi 18 days ago
Hi there. Thanks so much for the question! You should be able to connect it if they are the traditional PCIe 8-pin (6+2) GPU connectors. The 4028GR-TRT has eight 12V EPS power cables that support most Tesla GPUs natively. From what I can tell, the M10 uses the more traditional 8-pin connector, so you could buy an adapter from 12V EPS to PCIe 8-pin (6+2) power. I think the following should work for you: a.co/d/hMUEDvs Hope this helps you! Please let me know how it goes.
@wasifmasood969
@wasifmasood969 16 days ago
@@TheDataDaddi Thanks for your prompt reply. I am wondering if, instead of the 6+2 pin, I can use a 4+4 pin, since I have one at home.
@TheDataDaddi
@TheDataDaddi 15 days ago
@@wasifmasood969 Unfortunately, I do not think the 4+4 pin will work in your case. You can certainly try, but it will likely not fit.
@sphinctoralcontrol
@sphinctoralcontrol 21 days ago
Tossing up between 3090s, A4000, and P40/P100 cards for my use case, which would not exactly be ML/DL but rather local LLM usage hosted with something like Ollama and various higher-parameter models (I assume at least q4). I'm also dabbling with Stable Diffusion - at the moment I am shocked I'm able to run q4-quantized LLMs via LM Studio as well as Stable Diffusion models on my aging little M1 2020 MacBook Air with 16GB RAM. I'm getting into the homelab idea, especially the idea of using a Proxmox server to spin up different VMs (including the Mac ecosystem) with way higher resources than what I'm working with currently. I'm also looking to integrate a NAS and other homelab services for media - but the GPU component is where I'm a little hung up - just what tier of card, exactly, is needed for this sort of use case? Am I nuts to think I could run some of the less-quantized (as in, higher q number) LLMs on the low-profile cards, as well as SD? It's been 10+ years since I've built a PC, and I'm totally out of my element in terms of knowing just how good I've got it using the M series of chips - I've even been hearing of people running this sort of setup on a 192GB RAM M2 Ultra Mac Studio, but I would really love to get out of the Apple hardware if possible. I realize this was several questions by now... but, to distill this down, GPU thoughts? lol
@TheDataDaddi
@TheDataDaddi 18 days ago
Hi there. Thanks so much for your question! Yeah, so this is a really good question. It really depends on the size of the model you are trying to run. For example, to host Llama 2 70B at FP16 you need approximately 140GB of VRAM. However, you could run quantized versions with much less, or you could always work with the smaller model sizes. In terms of GPUs, I would recommend ones with at least 24GB of VRAM. I have been looking at this a lot for my next build, and I think I actually like the RTX Titan best. The RTX 3090 would also be a good choice; its FP16 performance just isn't as good. I think the P40/P100 are also great GPUs for the price, but for LLMs specifically they may not be the greatest options, because the P100 has only 16GB of VRAM and the P40 has very poor FP16 performance. Another off-the-wall option is to look at the V100 SXM2 32GB. Since these are SXM2, they are cheaper, but there are a lot fewer servers they will fit in; the only ones I know of off the top of my head are the Dell C4140/C4130. From my research, the SXM2 GPUs are also fairly tricky to install. Anyway, these are the routes I would go to build a rig to host these models locally. I will eventually build a cluster to host and train these models locally, so stay tuned for videos on that subject if you are interested.
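As a rough way to sanity-check figures like the 140GB above: the weights alone take roughly (parameter count) × (bytes per parameter), and the KV cache and activations add more on top. A minimal sketch; the 20% overhead factor is a loose assumption for illustration, not a measured number:

```python
# Rough VRAM estimate for hosting an LLM: weights = params x bytes per parameter.
# The 20% overhead for KV cache/activations is a loose assumption for illustration.
def vram_gb(params_billions: float, bytes_per_param: float, overhead: float = 1.2) -> float:
    return params_billions * 1e9 * bytes_per_param * overhead / 1e9

print(f"Llama 2 70B @ FP16:  ~{vram_gb(70, 2):.0f} GB")   # ~168 GB with overhead (~140 GB weights alone)
print(f"Llama 2 70B @ 4-bit: ~{vram_gb(70, 0.5):.0f} GB") # ~42 GB -> within reach of 2x 24 GB cards
print(f"Llama 2 7B  @ FP16:  ~{vram_gb(7, 2):.0f} GB")    # ~17 GB -> fits a single 24 GB card
```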
@Meoraclee
@Meoraclee 23 days ago
Um, hi daddy. I'm having trouble building a PC to train AI (and play games sometimes). With a $2500 budget, should I aim for 2x 3090 to run 48GB of VRAM or 2x 4060 Ti with 24GB of VRAM? Is there any better option in my case?
@TheDataDaddi
@TheDataDaddi 18 days ago
Hi there. Thanks so much for the question! I think it depends a lot on your use case, but I would say that if you plan on using it primarily for AI workloads, the 3090s would be the better choice because of the higher VRAM and the ability to support NVLink. However, if you want to focus more on gaming with some AI workloads, I would choose the 4060s and put more money toward other components like a better CPU. You could also just go with a single RTX 4090. This would give you great performance for AI workloads and gaming, with enough budget left for other high-quality components.
@ThePandalars
@ThePandalars 23 days ago
Perun much?
@TheDataDaddi
@TheDataDaddi 18 days ago
Hi there. Thanks so much for the comment! I am not sure I understand the question. If you could give me a bit more context, that would be great!
@rohithkyla7595
@rohithkyla7595 23 days ago
Keen for the comparison video!
@TheDataDaddi
@TheDataDaddi 18 days ago
This will be coming very soon! Stay tuned.
@vampritt
@vampritt 27 days ago
OMG, so detailed. Thank you for your time.
@TheDataDaddi
@TheDataDaddi 25 days ago
Hi there. Thanks so much for your comment. So glad that the content was useful for you!
@cyklondx
@cyklondx 27 days ago
I think you missed out on some parts - actual performance per model, and whether one can use FP32, FP16, INT8, or tensor cores. The P40 is a terrible option for any AI workload due to the amount of time one would have to wait... and its power requirements.
@TheDataDaddi
@TheDataDaddi 25 days ago
Hi there. Thanks so much for your comment! I would agree that for anything below FP32 these GPUs would be quite slow. However, the GPU is less than $200 for 24GB of VRAM. So, if you are wanting to experiment with larger models cheaply, I think these GPUs still have good value.
@chimpera1
@chimpera1 28 days ago
I'm thinking of building one of these with 8x P100. My knowledge is fairly limited, but I want to explore LLMs. Would you recommend it?
@TheDataDaddi
@TheDataDaddi 25 days ago
Hi there! I think this would be a good, cheaper way to start experimenting! Do be aware, though, that training or fine-tuning most of the larger open-source LLMs will be out of reach even with a setup of this magnitude. However, you could likely host some of the smaller ones or quantized versions locally. Hope this helps. Cheers!
@chentecorreo
@chentecorreo 29 days ago
Excellent work!
@TheDataDaddi
@TheDataDaddi 25 days ago
Hi there. Thanks so much for the comment! Really appreciate your positive feedback!
@jjolleta
@jjolleta 1 month ago
I see that cable management and I kind of wonder, what about thermals? You should find a way to make those cables even more custom so they fit better, especially for the second card below the air ducting. This is just a suggestion; I hope you can manage to give those cards a better lifespan. Greetings!
@TheDataDaddi
@TheDataDaddi 25 days ago
Hi there! So I have actually found a better cabling strategy. The updated cabling should be included in the video description. Please have a look there if you are interested in going that route. Cheers!
@jaystannard
@jaystannard 1 month ago
This is a cool little project you did.
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. I am so glad you enjoyed the content! Really, really appreciate the donation. It really helps the channel!
@vap0rtranz
@vap0rtranz 1 month ago
I made a spreadsheet too, but yours is thorough! I'd also come to similar conclusions as you: that the P40/P100 are cheap ways to get medium-size LLMs onto a GPU with decent tokens/second. Your spreadsheet would have saved me time if I'd known about it! At least there's some independent confirmation of your conclusions. There's a lot of detail to add, like how fast/slow models are on certain GPUs... perhaps another vid on that to save me the effort? :P
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. Thanks so much for the comment! I am glad to hear you can confirm! Especially as hardware prices keep increasing, I think these are actually becoming even more relevant for those who are budget-conscious. Funny you mention this: I am actually working right now on a benchmarking suite to enable reliable comparison between GPUs for different models. There is not a reliable open-source benchmarking solution for GPUs, so I am trying to create one (or at least make steps toward it). As soon as I get something decent, I will make a video series on it and start using it to benchmark GPUs in a real way with respect to individual models.
@kellyheflin5931
@kellyheflin5931 1 month ago
What dimensions are necessary for the Supermicro SuperServer 4028GR-TRT to fit in a mid-sized server rack? I'm grateful. Thank you.
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. Thank you so much for the comment! So, I forget what depth I set my rack at, but the Supermicro SuperServer 4028GR-TRT is 29" long, so I would set it about 3" to 5" deeper than that to comfortably fit the server. To my knowledge, servers are pretty much all the same width. One thing to keep in mind, though, is that not all servers are the same length, so if you ever buy others in the future they may be longer; it is a good idea to set your rack a few inches deeper than the longest server you plan on housing.
@jack-nguyen
@jack-nguyen 1 month ago
I am building one myself with 2x RTX 4090; any suggestions for the CPU?
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there! Thanks for the question. A couple of suggestions/comments here. 1) Pretty much all consumer-grade motherboards and CPUs that I know of, besides the Threadripper-series CPUs and compatible motherboards, will not be able to support both GPUs at the full x16 lane bandwidth (and that route is incredibly expensive). Not having the full x16 lanes for both GPUs might be okay with you; I just wanted to make you aware. With some CPUs you can get configurations like x16/x8 or x8/x8. 2) For the CPU, it also depends on whether you want to go the DDR4 or DDR5 route with respect to RAM. I would personally recommend DDR4: it is cheaper, you have more options, and it is supposedly more stable at this stage (do some research of your own here, as things change very fast these days). I think both AMD and Intel CPUs are fine. I would recommend at least 12 cores, but the exact CPU really depends on your budget. Some cost-effective suggestions might be the AMD Ryzen 9 5950X or the Intel Core i9-12900K.
@jack-nguyen
@jack-nguyen 1 month ago
@@TheDataDaddi I also struggled to find a motherboard, as you mentioned. I found a solution using the X11DPI-N mainboard, and I have to use the Intel Xeon Gold 6138 in this case, which only supports PCIe 3.0. Thank you for your response; I really appreciate it.
@TheDataDaddi
@TheDataDaddi 1 month ago
@@jack-nguyen Yeah, I did as well when I was researching for this video. Eventually, I just switched over to using servers, because this normally isn't an issue with server mobos and CPUs. I like the idea, though! That CPU and mobo should work great for you. What case/chassis are you planning on using? Of course, man! I am always happy to help.
@mrbabyhugh
@mrbabyhugh 1 month ago
29:11 Haha, that's the exact card I am looking at. Comparing it with the Arc A770, actually.
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. Thanks so much for the comment! I have also been interested in non-NVIDIA solutions, and the Arc GPUs have certainly interested me. However, I would caution you: if you leave the NVIDIA ecosystem, it is like going into the Wild West, so just make sure you are prepared. Here is a Reddit thread that might shed some light: www.reddit.com/r/MachineLearning/comments/z8k1lb/does_anyone_uses_intel_arc_a770_gpu_for_machine/ If you do decide to go the Arc route, please let me know how it goes for you. I would be super curious to better understand where those GPUs are in terms of AI/ML/DL applications.
@PreparelikeJoseph
@PreparelikeJoseph 1 month ago
Is using a server build better than a PC build, or were the parts just cheaper?
@TheDataDaddi
@TheDataDaddi 1 month ago
Hi there. Thanks so much for the question! The reasons I prefer servers over PC builds: 1) Price for compute is almost always better (the main reason). 2) Ability to support more cores and more CPU RAM. 3) Remote management tools like IPMI or iDRAC. 4) Generally more stable and built to run forever without being turned off. 5) Built-in redundant power supplies and failover. That said, a custom build is always going to be more flexible and will likely give you the ability to have the latest and greatest. Overall, though, I like refurbished servers because I find they provide the best price-to-performance.