Exactly why I dislike the clickbait titles, they don't tell us why the gpu does!
@aumshumanmohapatra75672 жыл бұрын
But why?
@alphapuggle2 жыл бұрын
I'm glad there's finally an answer to why that GPU does
@32bites2 жыл бұрын
Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like?
@noamtsur2 жыл бұрын
but no one ever asks how is GPU
@brownie26482 жыл бұрын
@@32bites r/ihadastroke
@brownie26482 жыл бұрын
@@noamtsur the gpu iz brocken :((((
@tobiwonkanogy29752 жыл бұрын
this gpu does well. Gpu fricks.
@mirandahw2 жыл бұрын
Ah, yes, "Why does this GPU" Gotta love the original titles...
@cartermassey90302 жыл бұрын
This gpu do because it does
@toyotanice99442 жыл бұрын
What the GPU doin'
@erlendtryti2 жыл бұрын
@UCsHt0kYQUaXK7sT8UIyN5RA How’s the gpu doin’?
@akhilallamraju90792 жыл бұрын
Not yet lol
@thespicywolf88182 жыл бұрын
It does because it will do, quality title
@joshuatyler46572 жыл бұрын
Imagine being the engineering team that made all of that work, only for the industry to say "ok. cool."
@mnomadvfx2 жыл бұрын
The people who actually buy them for work appreciate it. The people who just commentate on it are not "the industry" at all, just glorified journalists.
@JonatasAdoM2 жыл бұрын
@@mnomadvfx If the industry needed it, it would exist. Just look at how much better enterprise tools and systems are. Also, a victory against ever more complex hardware.
@paulpoco222 жыл бұрын
Just like VESA Local Bus video cards
@lazertroll7022 жыл бұрын
@@JonatasAdoM aye! #KludgeDeath
@davidg58982 жыл бұрын
If the API was actually widely rolled out, something like this would be incredibly useful for science departments at universities (which is a niche market, but not an insubstantial one).
@jinxash35802 жыл бұрын
Would have been Ideal to train models for Deep Learning using this GPU
@davidg58982 жыл бұрын
@Sappho I was doing astronomical simulations (my work was more with globular cluster formation and evolution, tens of thousands to millions of stars, sometimes coupled with a molecular cloud during early formation) and there definitely would have been a performance boost if the read-out/save time of each time slice could have been sped up by having the GPU dump it straight to storage. Just as you also described, most of my work was done on a Beowulf cluster with a lot of high-powered off-the-shelf GPUs.
@TheCodyLaxton2 жыл бұрын
Niche but actually a large market, haha. I worked in university research at the National Weather Center. Basically, any excuse to build something cool for research is the path most travelled by pre-doctoral students and undergrads.
@lazertroll7022 жыл бұрын
meh ... there's only so many times storage malware can give the feelz... although .. iffins the gpu executed from that storage on init ... 🤔
@pezz12322 жыл бұрын
I wonder how long until the title changes lol
@mys31f702 жыл бұрын
probably within the first hour or 2
@christiannkulcsar79512 жыл бұрын
Same
@slug80392 жыл бұрын
Why does that GPU?
@shadihaddad72 жыл бұрын
I bet they leave it out of spite lol
@ritualduke88252 жыл бұрын
Why do they do that? Linus does seem to do that a lot and it’s confusing.
@mini-_2 жыл бұрын
Everyone always ask "Why does this GPU?", but never asks "How is this GPU?" 😔
@alexdavis93242 жыл бұрын
Original
@_yuri2 жыл бұрын
how does this gpu? *
@Roriloty2 жыл бұрын
What da GPU doing?
@grimmel98942 жыл бұрын
What's tha gpu doing?
@writingpanda2 жыл бұрын
Lol I laughed out loud to this.
@Steamrick2 жыл бұрын
'Why does this' indeed. I think the question the manufacturer asked is 'why not?'.
@Worthy_Edge2 жыл бұрын
@Czarodziej stronger than my will to click this link and be greeted by cringe
@nikoheino39272 жыл бұрын
@Czarodziej stronger than my will to live after reading your comment.
@nobody78172 жыл бұрын
@@Worthy_Edge I think you meant to say "greeted" but still that was a HILARIOUS retort!
@loicgregoire30582 жыл бұрын
Those would be crazy useful in AI applications. Imagine loading your whole dataset into your GPU once without having to reload it for each training iteration.
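The load-once, iterate-many pattern this comment describes can be sketched in plain Python; the dict here is just a stand-in for GPU-resident memory, and all names are invented for illustration:

```python
import os
import tempfile

class CachedDataset:
    """Loads each shard from disk at most once, then serves every
    training epoch from the in-memory (stand-in for on-GPU) cache."""
    def __init__(self, paths):
        self.paths = paths
        self.cache = {}      # stands in for the GPU-resident copy
        self.disk_reads = 0  # counts actual storage hits

    def _load(self, path):
        if path not in self.cache:
            with open(path, "rb") as f:
                self.cache[path] = f.read()
            self.disk_reads += 1
        return self.cache[path]

    def epoch(self):
        # every epoch touches every shard, but only the first hits disk
        return [self._load(p) for p in self.paths]

# Build two tiny shards and run three "epochs".
tmp = tempfile.mkdtemp()
paths = []
for i in range(2):
    p = os.path.join(tmp, f"shard{i}.bin")
    with open(p, "wb") as f:
        f.write(bytes([i] * 4))
    paths.append(p)

ds = CachedDataset(paths)
for _ in range(3):
    ds.epoch()
print(ds.disk_reads)  # prints 2: each shard read from storage only once
```

The same idea is what made the SSG interesting for training loops: the expensive storage hit is paid once, not once per iteration.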
@steevem49902 жыл бұрын
I actually thought that's where they were going when Anthony switched the SSDs.
@fluidthought42 Жыл бұрын
Especially now that ML has moved away from NVIDIA proprietary tech
@tranquilitybase81002 жыл бұрын
This technology eventually found a home... AMD used it in the PS5 and Xbox Series. Both systems can load into RAM directly from the SSD, bypassing the CPU.
@Pacbandit132 жыл бұрын
Is that true?
@louism7712 жыл бұрын
Well, this led to DirectStorage, which we have today on these consoles and will soon have on Windows 11. Pretty much the same idea; technologically a bit different, I guess.
@derpythecate68422 жыл бұрын
I think bypassing the CPU is difficult/insecure, and I did some research and was right. A complete CPU bypass would mean being able to skip the kernel layer and all its security checks, which gives you arbitrary read/write to memory from disk; that should be minimized, as it provides a loophole. What DirectStorage does is simply use the GPU to decompress compressed data sent over the bus from the SSD, which is then handed to the CPU to decode and execute. This basically just speeds up the data retrieval pipeline but doesn't expose any loopholes, as it is fundamentally the same underlying mechanism all computers use today to fetch data. The AMD SSG card in the video can do such caching because the GPU doesn't execute kernel code, which means that while you can still write malicious code targeting the GPU, it's far more self-contained than executing it directly on the CPU, which controls all your processes including your OS.
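The pipeline described above, compressed blocks coming off storage and being decompressed before the application consumes them, can be mimicked with a toy sketch (zlib on the CPU stands in for the GPU decompression block; this is not the actual DirectStorage API):

```python
import zlib

def read_compressed_blocks(asset, block_size=64):
    """Stand-in for the SSD: yields the asset as compressed blocks."""
    for i in range(0, len(asset), block_size):
        yield zlib.compress(asset[i:i + block_size])

def gpu_decompress(block):
    """Stand-in for the decompression step DirectStorage offloads to the GPU."""
    return zlib.decompress(block)

# Simulate loading one game asset through the pipeline.
asset = b"texture-data-" * 100
loaded = b"".join(gpu_decompress(b) for b in read_compressed_blocks(asset))
assert loaded == asset  # the data round-trips intact
```

The point of the real design is simply who does the `gpu_decompress` step: moving it off the CPU frees cycles without changing the security model of the fetch path.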
@watercannonscollaboration22812 жыл бұрын
I learned that this was a thing a month ago when doing research on the WX 2100 and I’m surprised no major tech channel did something funny with it
@cool-soap2 жыл бұрын
@@joz534 no, run.
@amiltonfcjunior2 жыл бұрын
Maybe because it is too expensive?
@Hobo_X2 жыл бұрын
I honestly wondered if this GPU was hiding a secret: that Microsoft had these to base DirectStorage work on for all these years while they worked on it. Maybe now that it's finally public and AMD has tangible research into this, as the product actually exists... well, I don't know... imagine if RDNA3 has this as a surprise feature to work amazingly with DirectStorage?!
@nielsbishere2 жыл бұрын
This is exactly what's needed to blow graphics into a new era; think of the huge scenes you could render.
@epobirs2 жыл бұрын
The beta setup for DirectStorage used Nvidia RTX cards, as Nvidia was already doing work in the same direction for RTX I/O, aimed at the workstation market. Remember, they needed something that was going to work in the PCs people will own in the foreseeable future rather than create something requiring a costly niche hardware design. If Microsoft used them in R&D at all, it was more likely for the Series X/S Velocity Architecture, as a proposed console design was less sensitive to non-standard hardware if the cost was good. Even then, this wasn't very close, as the major component there (and in the PS5) is the controller functionality with the dedicated decompression block. Offloading those operations from the CPU and GPU is a big factor in letting the console perform optimally. I strongly suspect that Microsoft and AMD will try to push an open standard for a PC hardware spec that will bring a version of Velocity Architecture to PC to give DirectStorage the full functionality it has on Xbox. This needs to be a vendor-independent spec to get Intel and Nvidia on board, otherwise it will remain a niche that game developers will be reluctant to use. A recent previous example would be DirectML, which is hardware-agnostic and relies on the drivers to bridge the gap between PCs and vendors of ML-focused hardware. Thus the ML hardware can live in the CPU, GPU, or a separate device on the PCIe bus; the user doesn't need to know, so long as the driver tells the system what to look for and how to talk to it.
@kkon5ti2 жыл бұрын
This would be amazing
@erdem--2 жыл бұрын
At this point, I think we don't need CPUs. It is cheaper and better (for gaming) to produce all-in-one APU designs, like how the PS5 and other game consoles are designed.
@zaidlacksalastname49052 жыл бұрын
According to some market analysts, top RDNA4 could come with 512 gigs of pcie gen 4 memory
@scorch8552 жыл бұрын
If ML libraries target this platform, it seems like it could be a compelling option. Nowadays models are getting so large that even 24 GB of VRAM is not enough. Yes, the performance would undoubtedly be worse using SSDs, but the alternative is not being able to use the model at all.
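One way around the VRAM ceiling the comment mentions is out-of-core execution: stream one layer's weights from storage at a time instead of holding the whole model resident. A toy sketch, with a made-up "layer" file format standing in for real model shards:

```python
import json
import os
import tempfile

def run_model_out_of_core(layer_paths, x):
    """Runs a 'model' whose layers live on disk, holding only one
    layer's weights in (stand-in) VRAM at a time."""
    resident = 0  # layers currently 'in VRAM'
    peak = 0
    for path in layer_paths:
        with open(path) as f:  # stream the layer in from storage
            w = json.load(f)
        resident += 1
        peak = max(peak, resident)
        x = [w["scale"] * v + w["bias"] for v in x]  # toy affine layer
        resident -= 1  # evict before loading the next layer
    return x, peak

# Write two toy layers to disk, then run them.
tmp = tempfile.mkdtemp()
paths = []
for i, (s, b) in enumerate([(2, 0), (1, 3)]):
    p = os.path.join(tmp, f"layer{i}.json")
    with open(p, "w") as f:
        json.dump({"scale": s, "bias": b}, f)
    paths.append(p)

out, peak = run_model_out_of_core(paths, [1.0, 2.0])
print(out, peak)  # [5.0, 7.0] 1
```

The trade-off is exactly the one the comment names: every layer load now runs at storage speed rather than VRAM speed, but a model that would otherwise not fit at all becomes usable.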
@nocountryforoldmen63602 жыл бұрын
THIS, so much this.
@artlessbene2 жыл бұрын
would that be much faster than just using swap?
@nocountryforoldmen63602 жыл бұрын
@@artlessbene maybe not faster, perhaps cheaper
@virtualtools_30212 жыл бұрын
Code it for m1 ultra 128gb of vram
@terriplays17262 жыл бұрын
Nvidia wants you to buy their NVLink systems instead. Or A100 80GB.
@Tsudico2 жыл бұрын
I wonder if a separate PCIe daughter card with something similar to the SLI or Crossfire interfaces would have worked better. It wouldn't have shared bandwidth across the PCIe bus but would still have allowed direct access to the installed SSDs.
@ryanchappell59622 жыл бұрын
Hey this seems like a really cool idea.
@jeromenancyfr2 жыл бұрын
I am not sure I understand. Aren't the SLI and Crossfire interfaces very slow by themselves? The data would move through PCIe... like any NVMe drive.
@Rakkakaze2 жыл бұрын
@@jeromenancyfr I imagine the idea is, GPU calls for data, pass over link, then out through pci... CPU calls for data, pass through pci.
@mnomadvfx2 жыл бұрын
That eats into compute density though if you want to have several of them per node.
@legendp20112 жыл бұрын
Well, my understanding is that's basically how NVMe DirectStorage drives are going to work (and they already work like that in the PS5).
@Leon3cs2 жыл бұрын
7:06 "It is a CRUCIAL component to anything a gpu does" I see what you did here
@EvanMorgoch2 жыл бұрын
With respect to the random read speeds (1:41): why not test the drives independently from the SSG, or use MP600 drives in the SSG to get a proper apples-to-apples comparison? The drives' firmware may just be crap and account for why the random speeds don't scale nearly as well.
@ThranMaru2 жыл бұрын
Ain't nobody got time for that.
@flandrble2 жыл бұрын
Because driver overhead for RAID increases latency. On AM4 you're losing approx 30% of your IOPS even if all your SSDs are connected to the CPU and not the chipset. Intel is nowhere near this bad (same with Windows), but it's still a loss.
@ayoubboulehfa39322 жыл бұрын
@@ThranMaru Well, they tested a GPU from 2017 that no one has, so yes, they have time.
@bigweeweehaver2 жыл бұрын
@@ayoubboulehfa3932 It has nothing to do with time and more with uniqueness, to interest the viewer into clicking on the video.
@I2obiNtube2 жыл бұрын
Because then you'd just be testing drive performance which wouldn't make sense. It's end to end testing
@DasFuechschen2 жыл бұрын
I remember the launch event of this at SIGGRAPH. AMD "gifted" some of those cards to RED, which then gave them to some Indian filmmakers who had beta-tested the card on the animation and editing of one of their movies, if I remember correctly. But TBH, I have more memories of the after-party than the event itself.
@CaptainScorpio242 жыл бұрын
Which movies were they?
@guadalupe85892 жыл бұрын
I bet, them furry after parties are insane
@CaptainScorpio242 жыл бұрын
@@GeneralKenobi69420 ok thanks 🙂 haven't watched it
@CoolJosh3k2 жыл бұрын
Actually handy for when your motherboard does not have enough M.2 slots. You can buy these as just RAID 0 cards that plug in and use a PCIe x4 slot.
@WayStedYou2 жыл бұрын
They could literally give you more m.2 if they gave you a pcie card with m.2 slots
@upperjohn117aka2 жыл бұрын
@@WayStedYou but those dont look cool
@Bobis322 жыл бұрын
@@WayStedYou As someone who uses an ITX system, I really hope something like this comes out when PCIe 5.0 arrives. Even GPUs barely use the extra bandwidth of PCIe 4.0, so why not put some M.2 slots on GPUs, especially with MDA coming in the near future?
@somefish91472 жыл бұрын
@@Bobis32 power and bandwith
@virtualtools_30212 жыл бұрын
@@somefish9147 oh yeah bc SSD use sooooooooooooooooooooo much power
@cleverclever23172 жыл бұрын
7:58 hello back mr editor
@hdrenginedevelopment75072 жыл бұрын
That kind of reminds me of the old Real3D Starfighter PCI Intel i740 gfx card from wayyyy back in the day. Intel had just released the AGP bus architecture and the i740 was their first foray into the discrete graphics space…probably to help support the launch of AGP 1X, because it wasn’t all that fast otherwise. For the majority of the non-AGP systems, Real3D built a card with an AGP-PCI bridge chip that basically had an AGP bus and dedicated SDRAM AGP texture memory on board, in addition to the i740’s local SGRAM framebuffer RAM like any other graphics card. It was pretty cool at the time. They were sold with 4-8 MB framebuffer plus 8-16 MB AGP texture memory for a max of whopping 24 MB total onboard. They weren’t very fast, but they supported 1024x1024 px texture tile resolution whereas the vast majority of the competition including 3DFX only supported 256x256 pixels max resolution texture tiles. It was slow, but it looked so much better than anything else on the market and helped milk some extra capability from old non-AGP slot systems…perfect tradeoff people like Nintendo 64 players were used to dealing with, lol. 3DFX Voodoo 2 cards had a similar structure with separate RAM for framebuffer and texturing. Ok, now I’m done dating myself 😂
@jseen95682 жыл бұрын
When PCIe Gen 4 first came out, everyone was saying it wasn't practical because it would never be used fully. I said then that it would be more interesting to see instances where multiple uses shared a single PCIe 16x slot without any hindrance in performance. This would be one of those scenarios. Not useful, but pretty cool.
@BrentLobegeier2 жыл бұрын
Couldn't agree more. When someone made a car, everyone said horses were better. Without manufacturers trying things outside of the box we would never progress, and I have no idea why everyone is so against innovation. No one is forcing anyone to become early adopters of anything, and most things people were skeptical about soon became integral to everyday life. With progression come niche products like this, but at least we can say they are trying.
@bojinglebells2 жыл бұрын
and now we're up to PCIe 5.0 with Alder Lake...there's even consideration to adjust NVMe storage standards from 4 lanes down to 2 because of how much bandwidth 4.0 and now 5.0 offer. I would love a product like this if only to gain more NVMe storage without taking up extra slots
@CheapSushi2 жыл бұрын
@@bojinglebells same, I love the dual functionality. I get a pretty decent GPU and 4 NVMe slots in two PCIe slots instead of three if I had to get a separate addon card. I personally love using up all my 7 slots with lots of cards.
@jseen95682 жыл бұрын
@@bojinglebells And I think about some more niche area like small form factor PCs and even the NUC extreme. With the speed and bandwidth increases, these types of compute cards could make for near instantaneous connections and make those types of products more viable
@fuzzynine2 жыл бұрын
Boy, this is awesome. I wish you would show more obscure tech. I feel like watching retro computer channels right now. Only with new stuff. :D Thanks. This is really awesome!
@GuusKlaas2 жыл бұрын
Man, from what I recall, this thing was baller for Revit/CAD work. Those needed the entire model in VRAM, and it'd be a massive hurdle to do that over SSD > CPU > MEM > GPU. This was pre-host bus controller, which is the 'not as fancy' name for directstorage. Allowing devices 'other' than the main controller in a PCIe network to take control of another device. Like a GPU just... assuming direct control of an SSD (after some mediation obv) to just load stuff off without the big overhead. Obviously since then we also got (first on AMD, later on Intel) SSD's direct on CPU, rather than a PCH in-between (like Intel had until recently when they figured out that just 16 lanes from CPU was not enough).
@MrTrilbe2 жыл бұрын
I was kinda thinking the same, or using it for parallelised ML or big-data applications; it is a WS card after all. Running an OpenCL-coded ML algorithm directly from 2TB of fast storage on the GPU, that's a lot of test data.
@ProjectPhysX2 жыл бұрын
It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-3300 GB/s. So the SSG never really took off.
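The wear concern can be made concrete with back-of-envelope arithmetic; the 3,000 TBW endurance figure below is an assumption (a generous rating for 2TB of 2017-era NAND), not a spec from the video:

```python
# Back-of-envelope: how long continuous CFD checkpointing at the SSG's
# peak write rate would take to exhaust an assumed NAND endurance budget.
write_rate_gb_s = 14    # SSG peak throughput, from the video
endurance_tbw = 3000    # assumed total-bytes-written rating, in TB

seconds = endurance_tbw * 1000 / write_rate_gb_s  # TB -> GB
days = seconds / 86400
print(round(days, 1))  # ~2.5 days of sustained writes
```

Even with a generous endurance assumption, sustained full-rate writes chew through the budget in days, which is why rewriting the storage 24/7 was a nonstarter for this workload.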
@Double_Vision2 жыл бұрын
I occasionally deal with massive scenes and mesh or particle caches in Redshift for Maya, and Redshift could use this for sure! The same goes for trying to use Redshift to render massive print images where Redshift's out-of-core technology could benefit from having all this storage connected directly to the GPU core. No more Out-Of-Memory failures!
@user-rd3jw7pv7i2 жыл бұрын
I can see this being used for ONE specific use case: instead of having a separate SSD-in-one-enclosure and GPU taking more than 1 or 2 PCIe slots (i.e. 1 LIQID Honey Badger and 1 GPU), just use this! This card actually makes sense, and I'm sad to see this tech not taking off, because if you know how and why to use it, this is revolutionary!
@Craft97pl2 жыл бұрын
With DirectStorage, sharing bandwidth with the SSD is no problem. The problem is the GPU itself; in a few years it will suck.
@adriancoanda92272 жыл бұрын
@GoSite Solder a better one, flash a matching firmware, done. Or you could adapt socket-like mountings and replace the GPU as fast as you want. Before the GPU is assembled, how do you think they test it?
@vectrobe2 жыл бұрын
There's also one thing that wasn't really mentioned: they launched EPYC and Threadripper around the same time, which effectively provided the same functionality. This card came from a timeframe when NVMe RAIDs were an amazing concept but the PCIe lanes needed for them were often hard to come by, even on the Xeon and Opteron series.
@zoey.steelimus2 жыл бұрын
LTT: "Why does this GPU?" Me: "Yes, but have you considered HOW the GPU does?"
@blunderingfool2 жыл бұрын
How about WHO does this GPU?
@fl4shb4ckGaming2 жыл бұрын
I'll do you one better. Why is GPU?
@Noxelius2 жыл бұрын
but where does this GPU?!
@SecureInMyHead2 жыл бұрын
When does GPU
@Healcraft2 жыл бұрын
@@SecureInMyHead 2017
@seireiart2 жыл бұрын
"Why does this GPU?!!" Great question.
@Worthy_Edge2 жыл бұрын
Only 11 minutes and there’s already 2 bot replies
@seireiart2 жыл бұрын
@@Worthy_Edge These bots can't just chill. Can they?!!
@vgaggia2 жыл бұрын
I wonder how it'd work with deep learning stuff, if the memory capacity would outweigh the speed.
@ilyearer2 жыл бұрын
I was surprised there was no mention of that potential application as well.
@ZandarKoad2 жыл бұрын
@@ilyearer Same. Seriously looking hard at this card now, since memory size is an upper limit on the types of existing neural nets you can fine tune. RTX 3090 has only 24 Gigs compared to this, 2048 Gigs. Yikes.
@Matando2 жыл бұрын
7:54 Hello editor! Thank you for your hard work and I hope you have an absolutely lovely day as well
@TheWetworm2 жыл бұрын
7:51 finally an American that can pronounce 'niche'
@00kidney2 жыл бұрын
Everyone is asking "Why does this GPU?" but I'm just glad to see an upload featuring Anthony.
@MerpSquirrel2 жыл бұрын
I could see this being used for machine learning or data analysis for Microsoft R. Good usecase for direct storage.
@willgilliam90532 жыл бұрын
train a model with very limited host CPU usage... ya that would be cool
@abhivaryakumar31072 жыл бұрын
Ngl Anthony is my favourite LTT member and it makes me so happy whenever I see his face in a thumbnail:))
@LordYamcha2 жыл бұрын
Same but these bots goddamnit
@abhivaryakumar31072 жыл бұрын
@@LordYamcha I stg what the absolute fuck is this, commented 2 minutes ago and there are already 2 bots
@RevolverOcelotMGS22 жыл бұрын
This is the first actual review of the SSG in the five years since it was announced. Every other article about it just parrots the AMD PR sheet from when it was announced. It uses a custom API to treat the NAND flash as an extension of the VRAM pool. The only adopter of that API was Adobe Premiere. In theory, it's a huge benefit for some workloads where larger VRAM pools are required, as long as the VRAM extension doesn't need to be as high-performance. In practice, it was only really practical back in 2017, when VRAM was both expensive and small in capacity. The NAND flash was just a cheaper way to expand the memory pool, something AMD was not able to do traditionally with HBM. There were many solutions, even back then, that could render this GPU unnecessary for most of the workloads it was targeted at.
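An API that treats NAND as an extension of the VRAM pool behaves, at heart, like a tiered allocator: satisfy allocations from the fast tier until it runs out, then spill to the slow one. A toy sketch (the class and sizes are invented for illustration, not AMD's actual API):

```python
class TieredPool:
    """Toy allocator: fill fast VRAM first, spill to the NAND tier,
    mimicking how the SSG's API extends the memory pool."""
    def __init__(self, vram_gb, nand_gb):
        self.free = {"vram": vram_gb, "nand": nand_gb}

    def alloc(self, size_gb):
        for tier in ("vram", "nand"):  # always prefer the fast tier
            if self.free[tier] >= size_gb:
                self.free[tier] -= size_gb
                return tier
        raise MemoryError("pool exhausted")

# The SSG's actual proportions: 16 GB of HBM2 backed by 2 TB of NAND.
pool = TieredPool(vram_gb=16, nand_gb=2048)
print(pool.alloc(12))  # vram
print(pool.alloc(12))  # nand: only 4 GB of VRAM is left
```

The application sees one big pool; the performance cliff between the tiers is exactly why the extension only suits data that doesn't need full VRAM bandwidth.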
@xaytana2 жыл бұрын
I'd be curious to see this concept again once M.2 Key F finally sees some use. If we never get a future where high-bandwidth buses also have tight memory timings, essentially combining what GPUs and CPUs like, this concept should be moved off to Key H, J, K, or L, so as not to confuse high-bandwidth GPU memory with tight-timing CPU memory on Key F, assuming a future memory standard ever actually makes the switch. Though with how fast devices are becoming, it'd be cool to see a unified memory-storage platform where the only difference is whether the chip itself is considered volatile or not, essentially the original concept of Optane on steroids. This would also be cool if there were semi-volatile chips where a sudden shutdown could retain otherwise volatile data.
@lakituwick70022 жыл бұрын
After years of searching, I finally understood why this gpu does.
@mrkezada58102 жыл бұрын
I think one of the most productive uses for this GPU is to enable fast unified-memory accesses when programming with OpenCL or something like that. Although that is a really niche and low-level use case, mostly research-focused.
@0tool5052 жыл бұрын
I think brands should be more transparent and start answering the consumers why does the GPU do
@theredscourge2 жыл бұрын
Actually it wouldn't be too bad if they put say 50GB of flash storage on the card, but it would need to be a type that can withstand a LOT of writes. They'd need some software to allow the user to choose which 1-2 AAA games at a time that you'd like to have their relevant texture files cached directly on the GPU. Or they could try to develop some sort of Windows Prefetch cache type thing, where it aggressively uses the 80/20 rule to try to identify the slowest and most loaded texture files that each game uses as you play it, then start saving a history of which files it will want to slowly pre-load onto the card the next time you go to play it. Perhaps they could pre-calculate what those are on a per-game basis and distribute some sort of map file, sorta like how the GPU drivers these days load different profiles for each game.
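The 80/20 prefetch idea sketched above boils down to frequency tracking: log which texture files a play session touches, then pre-stage the hottest ones on the card for next launch. A minimal sketch with made-up file names:

```python
from collections import Counter

def hottest_textures(access_log, cache_slots):
    """Pick the most frequently loaded texture files from a play
    session to pre-stage on the card next launch (80/20 heuristic)."""
    return [name for name, _ in Counter(access_log).most_common(cache_slots)]

# Simulated access log from one session: a few files dominate.
log = ["rock.dds"] * 40 + ["grass.dds"] * 30 + ["sky.dds"] * 5 + ["ui.dds"] * 2
print(hottest_textures(log, 2))  # ['rock.dds', 'grass.dds']
```

A shipped per-game "map file" like the comment proposes would just be this ranking precomputed by the vendor instead of learned on the user's machine.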
@benjaminlynch99582 жыл бұрын
Huge use case for AI training. Anything over 80GB of memory means training has to move from GPUs to CPUs today, and that means a slowdown by multiple orders of magnitude. Unfortunately, AMD has never had any real market share in the AI/ML world because their software support, even in 2020, sucks.
@ManuSaraswat2 жыл бұрын
how about in 2022?
@TheWisePongo2 жыл бұрын
@@RyTrapp0 ye but intel bad
@fernbear39502 жыл бұрын
Wear-out makes it a nonstarter; for inference, though, it could maybe be a monster in the right circumstances.
@ProjectPhysX2 жыл бұрын
AMD has introduced their new MI250X GPU with 128 GB memory. But still you can never have enough memory. I'm working with CFD (see my YT channel), and there it's the same problem: You always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice the SSG is unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.
@ZandarKoad2 жыл бұрын
@@ProjectPhysX Thanks, I figured as much. It's a shame. Memory in the TB range truly opens up new possibilities for deep learning.
@Michplay2 жыл бұрын
it just amazes me that Direct Storage / RTX IO is taking this long for a demo to test with
@writingpanda2 жыл бұрын
Anthony is fantastic. Just wanted to say he's doing an excellent job with these videos. Kudos, Anthony!
@board2death2 жыл бұрын
Agreed Anthony is the man!
@TopGearScooter2 жыл бұрын
I subscribe because of Anthony
@z31Joshyman2 жыл бұрын
Anthony is the #techgod a fuckin goat.
@keldwikchaldain95452 жыл бұрын
When I saw that board I thought they were gonna have a complex memory controller that'd drive the nvme drives with the normal ddr memory as a cache, not as literal storage devices sitting on the gpu for fast load times.
@chadlumpkin23752 жыл бұрын
This reminds me of the Intel math coprocessors for the 286/386 CPUs, before floating-point unit (FPU) processing became the default for all x86 processors. With the 486, Intel introduced the 486DX with the FPU and the 486SX with the FPU disabled.
@CharcharoExplorer2 жыл бұрын
5:35 - That is not true. HBM2 is still connected by a 1024-bit memory bus per stack. It's just that 2 stacks of HBM2 = 2048-bit, while 2 stacks of HBM1... also means a 2048-bit bus. They are exactly the same here. HBM2 brought much higher capacities, higher speeds, and lower latencies; it didn't change the connection it had. The Radeon VII and the R9 Fury, for example, are both 4096-bit machines; one just has 16GB of HBM2 while the other has 4GB of HBM1.
@bsadewitz2 жыл бұрын
Reading your post, for some reason I recalled this: en.m.wikipedia.org/wiki/Personal_Animation_Recorder I had the PC version. It used its own dedicated IDE bus, had its own framebuffer, etc. Upon its release, there was only one HDD that was capable of the sustained throughput required. The images also don't quite convey how huge these cards were. It is probably the heaviest PC expansion card I have ever handled. It did not compress the video whatsoever, and could not use the system's bus/IDE controller--too demanding. Furthermore, IIRC the video was stored as still images, one frame per file. I don't recall whether it used FAT or a proprietary filesystem. It was primarily intended for playing back 3d animation, but you could use it for whatever you wanted. I think it cost at least $1000US.
@NdxtremePro2 жыл бұрын
I wonder how hard implementing a Direct Storage layer over the API would be.
@mnomadvfx2 жыл бұрын
Probably easier because you are cutting out a middle man - though there might be some latency introduced as they communicate with each other.
@markk81042 жыл бұрын
Did you try seeing if one of the GPU-powered SQL engines works well on this? The current issue with those is the data-transfer speed with the CPU step involved, so it might be worthwhile trying that.
@cheeseisgud73112 жыл бұрын
It really wouldn't help unless the SQL server used the API this GPU needs to access files directly.
@MrPruske2 жыл бұрын
I feel like the only person that could have made use of this was the slow-mo guys in 2017. I'd like to see them try to use it now
@Sencess2 жыл бұрын
0:01 LOL whose idea was that, EPIC intro
@kevinheimann76642 жыл бұрын
It would be interesting if such an idea were combined with Optane memory, with a driver using it as second-level RAM.
@suhendiabdulah60612 жыл бұрын
What did you mean by Optane? Can Optane store data? Sorry if I am wrong.
@JorgeMendoza-qx5bp2 жыл бұрын
Video Idea Could we get an updated video for 2022 of your "3D Modeling & Design - Do you REALLY need a Xeon and Quadro??" video. A cheap computer for 3D CAD modeling.
@commanderoof45782 жыл бұрын
Blender + EEVEE = you need a potato, and it will still render multiple minutes of frames before something such as 3ds Max even does a dozen.
@Respectable_Username2 жыл бұрын
Interesting that there's no discussion of what benefit this could bring to ML on large datasets. Is it that the SSDs being that close doesn't provide enough of a benefit to data-transfer speeds, or is the price too high for those doing ML research at places such as universities?
@jet_aviation2 жыл бұрын
*Apple after soldering the RAM onto the CPU with integrated graphics:* _The time has come to integrate storage as well._
@BromTeque2 жыл бұрын
It might be useful for select machine-learning applications.
@LycanWitch2 жыл бұрын
i'd imagine this is where pci gen 4 or especially 5 could have shined if this concept kept going to present. No worries about sharing bandwidth with the gpu as there is plenty to go around, far more than the graphics card and m.2 drives combined could saturate.
@gertjanvandermeij42652 жыл бұрын
Would still love to see a GPU with some sort of GDDR slots, so everybody could choose their own amount of VRAM!
@coni73922 жыл бұрын
It would be amazing to be able to add more VRAM to my card
@psycronizer2 жыл бұрын
@@coni7392 Why? Your GPU can only access and throw around so much data, and oddly enough GPUs are tailored exactly to how much RAM they have. Might be useful for static images at high res, but high frame rates at higher res? Not so much.
@oiytd5wugho2 жыл бұрын
The expense in no way justifies the benefit. The only thing you'd get is limited upgradability. GPUs have a highly specialized memory controller, basically supporting a few variations in capacity: a chip might support 4, 8, and 16 gigs built from discrete memory ICs each holding 512MB, and nothing else.
@lazertroll7022 жыл бұрын
@@psycronizer assuming dynamic physical ram size and that firmware binds the addresses on init, would there really be no advantage in gaming, like having more loaded pre-render objs or prefetch code? it seems that the disadvantage is letting the os treat them as global instead of driver-exclusive/defined fs ..? 🤨
@psycronizer2 жыл бұрын
@@lazertroll702 not really, transfer speeds from non display storage to frame buffer are really a non issue now, so at some point adding more ram just makes for higher cost with no benefit
@kandmkeane2 жыл бұрын
This GPU has always made me wonder if Intel's Optane M.2 could be used. Would it even work? Would there be any use cases for that? Any benefits? Probably not, but it's just such an interesting opportunity to experiment with mixing different computer technologies...
@christopherschmeltz33332 жыл бұрын
I haven't used Optane much, but the technology is fundamentally more like non-volatile RAM, with higher performance and endurance but a fraction of the capacity of a comparably priced NAND flash SSD. It's most commonly effective when utilized as a hybrid cache layer, like how it's built into the Intel H10 and H20. I don't think the data-center-grade M.2s have reached 1.5TB yet; last time I noticed, that capacity was reserved for special-use server DIMMs! Therefore, I expect Optane should mostly function in this SSG, but the benefits of upgrading would probably just be how long the M.2 would last with medium-sized but frequently changing data sets before wearing out, and whether you're using its API so as not to get performance-bottlenecked elsewhere. Perhaps use the API to write your own storage subsystem using two Optane 64GB 80mm cache drives and two high-capacity 2TB 110mm storage drives... but I'm not aware of a case where an ordinary M.2 RAID card feeding multiple compute GPUs wouldn't be more practical.
@mnomadvfx • 2 years ago
@@christopherschmeltz3333 Exactly. Optane's phase-change memory tech hasn't even breached 4-layer device limitations, while the state of the art in 3D/V-NAND is already up to 170 layers and counting. In short, Intel bet on the wrong tech foundations to build Optane upon; it simply isn't well suited to 3D scaling, which is a necessity for modern memory as area scaling is already reaching problematic limits.
@christopherschmeltz3333 • 2 years ago
@@mnomadvfx Intel fabrication definitely bet on the wrong 10nm tech, but Optane will probably hold onto a smaller niche than planned until hardware RAM disks make a comeback. You know, like the Gigabyte i-RAM back in the DDR1 era... there are newer and older examples, but Gigabyte's seems to have been noticed by the most PC enthusiasts and should be simple to research.
@JedismyPet • 2 years ago
Whoever did the ad animation, I love you for adding Saitama.
@SuperLarryJo • 2 years ago
So good to see Anthony back on Anthony Tech Tips.
@frknaydn • 2 years ago
The main use case could be AI research. When we run our applications, it sometimes takes too much time to load files for training; this way it could be a lot faster. I wish you guys tested more than just games. Computers aren't just game platforms. Please add some software development tests as well: compile a Node.js or Go program, run some simple AI training.
@IvanpilotNX1 • 2 years ago
AMD when creating this GPU: AMD: "Hmmm... we need a different GPU, something different." That one worker: "Boss, what if we combine storage with a GPU?" AMD: "Hmmm... that idea is... PERFECT. Another raise, James, good job 😀"
@martinlagrange8821 • 2 years ago
Well... I would have a use for it. When running TensorFlow through a GPU as a coprocessor for neural networks, the SSG would deliver supercomputer performance for complex multi-level networks. It's not for apps & games; it's for AI!
@Albtraum_TDDC • 2 years ago
Oh, so that's what that little loop on the back of my shoes is for...
@ritecomment2098 • 2 years ago
More relevant and up-to-date videos: a graphics card from the past that you can't buy now, wouldn't want to buy, and isn't for sale. Great job, LTT.
@undeadlolomat8335 • 2 years ago
Ah yes, ’Why does this GPU?’ 😂
@TheOnlyTwitchR6 • 2 years ago
If I had to guess, this didn't take off because we didn't have OS-level direct access to GPU storage. I really want this to become normal in the future: throwing M.2s onto the GPU.
@chrcoluk • 2 years ago
From a flexibility standpoint this is amazing: GPUs integrating their own M.2 slots and sharing the 16 lanes is awesome. It also makes it easier to change SSDs, as you can just remove the GPU from the case to work on them easily.
@mnomadvfx • 2 years ago
In the case of DirectStorage use it makes even more sense; not to mention you can cool the M.2 drives better with a full GPU cooler (albeit the SSG cards are passively cooled, server form factor).
@thesix______ • 2 years ago
7:55 thx editor
@Bigspoonzz • 2 years ago
This card was originally designed to deal with RAW files like RED's. It wasn't just a response to another card; it was a response to give prosumers the ability to work with high-end RAW files on non-customized PCs or pro workstations. With certain RAW formats, the CPU unpacks the file and hands off image data to the GPU for processing, and the CPU keeps all other parts of the stream in sync with what the GPU is doing. In the API, there was plenty of room for Blackmagic, Adobe, Baselight, and many other manufacturers to support the card... And in fact, offloading storage of the cache and certain parts of images would have sped up 4K or higher-res SDR quite a bit, and did, as you proved.
@steelerfaninperu • 2 years ago
I would totally buy a good graphics card if it let me slap even one M.2 in there just as bonus storage. Most GPUs won't use all the x16 bandwidth anyway, so why not turn some of it into a storage interface? It'd be super useful for SFF builds or even mATX cases.
@sparkyenergia • 2 years ago
Now stuff it with optane drives to truly summon Cthulhu with that Intel/AMD hybrid nightmare.
@andrewbrooks2001 • 2 years ago
Great video and information presentation! Thank you!
@ianemery2925 • 2 years ago
Pro tip for small, non-ferrous screws: use a tiny bit of Blu-Tack to stick the screwdriver to the screw head, then a larger blob to retrieve the screw if it stays in the threads and you want it back.
@beythastar • 2 years ago
I've been waiting for this video for such a long time! Finally, I can see how it performs!
@ravencorvus7903 • 2 years ago
Really needed some Anthony today. Not disappointed.
@jmssun • 2 years ago
It was used to accelerate large-scale industrial ray tracing and simulation. The industrial scene files (of factories with complete parts) are so large that they usually would not fit in regular RAM/VRAM, and having them on an SSD within the GPU makes random lookups into such a humongous scene possible.
@fat_pigeon • 2 years ago
6:10 Probably the screws are ferrous, but they're stainless steel, which responds only weakly to a magnet. Try sticking a magnet right onto the screwdriver bit; the stronger magnetic field should pick them up.
@ensuredchaos8098 • 2 years ago
I can now die happy now that I know what the GPU does.
@keithv708 • 2 years ago
I love it when he does the presenting; he's way more techie.
@thenormiegamer94 • 2 years ago
For someone who used to build SFF, this would have been a godsend in 2017; in fact, even now it's still good. I had a Dan Case A4-SFX, and most of its volume is dedicated to the GPU and CPU. Yes, you can cram in three 2.5" drives, but boy, you need custom cables for everything, including the motherboard, CPU, and GPU, to make space for the drives. Even the Lian Li TU105 only has 1-2 drive mounts, and ITX motherboards come with maybe one M.2 on the low end and two on the high end. Having this would solve so many space issues for me; my Steam library is already 6TB.
@thepolarblair1 • 2 years ago
It's fantastic to see Anthony so comfortable. MOAR Anthony!
@floogulinc • 2 years ago
Actually, this design is kinda awesome for mini-ITX machines, where storage expansion is very limited and you're already using your only PCIe slot for the GPU.
@ShiroKage009 • 2 years ago
This would have been awesome for things like genomic alignment and similar applications that lost performance due to latency when attempting to utilize GPUs.
@Armacham2 • 2 years ago
Everybody asking WHY does this GPU. But nobody asking HOW does this GPU.
@jimmymifsud1 • 2 years ago
Anthony is a gift that keeps giving; I would never have found him without LTT.
@defeqel6537 • 2 years ago
To be clear, all GPUs use system RAM if they run out of VRAM. HBCC is just there to do this in a more granular and smart fashion, allowing the card to push the less-used data to system RAM and keep the busier data in VRAM. Also, IIRC the current implementation of DirectStorage still uses CPU mirroring (which might not be a bad thing) and also a RAM-cached version of the data. It would still be much faster, as the slow part (decompression) is relegated to the GPU.
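The HBCC behavior described in this comment (keep hot data in VRAM, spill the least-used data to system RAM) is essentially two-tier LRU caching. A minimal Python sketch of the idea; the class and all names here are hypothetical illustrations, not AMD's implementation:

```python
from collections import OrderedDict

class TwoTierCache:
    """Toy model of HBCC-style paging: a small fast tier (VRAM)
    backed by a large slow tier (system RAM)."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()   # page_id -> data, kept in LRU order
        self.slow = {}              # overflow tier for evicted pages
        self.fast_capacity = fast_capacity

    def access(self, page_id, data=None):
        if page_id in self.fast:
            self.fast.move_to_end(page_id)      # mark as recently used
        else:
            # promote from the slow tier, or load fresh data
            value = self.slow.pop(page_id, data)
            self.fast[page_id] = value
            if len(self.fast) > self.fast_capacity:
                # evict the least recently used page to the slow tier
                lru_id, lru_val = self.fast.popitem(last=False)
                self.slow[lru_id] = lru_val
        return self.fast[page_id]

cache = TwoTierCache(fast_capacity=2)
cache.access("a", "texture A")
cache.access("b", "texture B")
cache.access("a")                # "a" is now the most recently used
cache.access("c", "texture C")   # evicts "b" to the slow tier
```

The point of the granularity remark above is visible here: eviction happens per page, so one busy page can stay resident while its cold neighbors get pushed out.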
@8700k-u9e • 2 years ago
"WHY DOES THIS GPU." thanks for the amazing title linus. I hope the video enlightens me to why the gpu does.
@Czeckie • 2 years ago
4:07 such a pro 'check this out'
@preoco8241 • 2 years ago
It reminds me of an old trick called the GT 610 "mad cow disease" mod, which had 1TB of VRAM.
@ab-hv • 2 years ago
Hey, you're rocking this video, man! Nice hosting.
@andrewr7820 • 2 years ago
Anthony, test this! There's a use case similar to this which is far more useful: a four-SSD card (such as the ASUS Hyper M.2 card) using a PCI Express x16 slot as a RAID 0 drive array. The performance of this device (especially with PCI Express 4.0 SSDs) provides truly blistering throughput (at least on Linux) on both reads and writes. The problem becomes one of available PCI Express lanes. On most Ryzen motherboards, there are only one or two x16 slots, and generally only one of them is truly x16, the other usually being x4. In this configuration the video card would then need to go into the x4 slot. Only when you get to Threadripper or Epyc machines do you have enough PCI Express lanes for multiple full x16 slots.
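The RAID 0 setup described in this comment scales read throughput roughly linearly with drive count until the PCIe link itself saturates. A back-of-envelope sketch; the function name and all bandwidth figures are illustrative assumptions, not measurements:

```python
# Toy RAID 0 throughput model: reads are striped across member drives,
# so aggregate bandwidth scales with drive count until the PCIe link
# becomes the bottleneck. Figures are illustrative only.

def raid0_read_gbps(drives, per_drive_gbps, link_gbps):
    # Aggregate read bandwidth, capped by the host link
    return min(drives * per_drive_gbps, link_gbps)

# Four Gen4 SSDs at ~7 GB/s each behind a single x16 Gen4 slot
# (~31.5 GB/s usable): still drive-bound.
print(raid0_read_gbps(4, 7.0, 31.5))  # -> 28.0
# Eight such drives would already be link-bound.
print(raid0_read_gbps(8, 7.0, 31.5))  # -> 31.5
```

This is also why the lane-count complaint above matters: drop the array into an x4 slot and `link_gbps` falls to roughly a quarter, capping the whole array regardless of how many drives it holds.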
@sirfer6969 • 2 years ago
Love your work Anthony, keep it up =)
@kd7jhd • 2 years ago
Hey Editor! Hello back at ya. - A Viewer
@solo-ion3633 • 2 years ago
Before I watch this video, I'm going to say that I heard of these several years ago, and they were aimed at video editors working on 4K video. When editing, the source files would be loaded onto the GPU, which made editing way speedier; all of the video information would already be there in the GPU. Oh yeah... I heard that the earliest version had the SSD soldered onto the GPU, which was dumb because you couldn't swap it out when it wore out.
@Roensmusic • 2 years ago
Very informative vid, interesting stuff.
@tzachi26 • 2 years ago
I architected this solution while I was working at AMD in 2015. The idea was to ignite a "DirectStorage" ecosystem that is only now materializing. Mind you, GPU compute cores still can't access the SSD NAND cells in random-access fashion; these are two completely different electrical technologies. "DirectStorage" only saves the "hop" of the data through host RAM.
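The "hop" through host RAM mentioned in this comment can be illustrated with a toy timing model: a classic read is staged SSD → host RAM → VRAM, while a DirectStorage-style transfer goes SSD → VRAM directly. Function names and bandwidth numbers below are hypothetical, not measured:

```python
# Toy comparison of the two data paths. Bandwidths in GB/s are
# illustrative placeholders, not benchmarks of any real hardware.

def staged_copy_ms(size_gb, ssd_gbps, pcie_gbps):
    # Two hops: SSD into host RAM, then host RAM over PCIe into VRAM
    return (size_gb / ssd_gbps + size_gb / pcie_gbps) * 1000

def direct_copy_ms(size_gb, ssd_gbps):
    # One hop: the GPU pulls straight from the SSD, skipping host RAM
    return (size_gb / ssd_gbps) * 1000

# Loading a 2GB asset from a ~7 GB/s SSD, with ~14 GB/s effective PCIe:
print(staged_copy_ms(2, 7.0, 14.0))  # ~428.6 ms
print(direct_copy_ms(2, 7.0))        # ~285.7 ms
```

Note the model only captures the extra copy; as the comment says, the real win is removing that hop, not giving shader cores random access to NAND.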
@GHOST22x02 • 2 years ago
I find watching vids with this guy more informative than watching Linus.
@phillee2814 • 2 years ago
If they'd rolled out proper driver support for that wee beastie on all OSes, it could have been awesome. With a Hyper M.2 card in another 16-lane slot and some nice big NVMes in both, you could have a mirrored pair of 32TB VDEVs, with maybe parity added by having a couple on the motherboard as well. Downsize each by a wee bit with partitions to allow for L2ARC, SLOG, and boot partitions to be split between lanes/devices, mirrored and striped for the bits you want (so not swap; that could be stripe-only). Stuff it with fast RAM and a decent processor, and you have a heck of a graphics workstation or gaming rig. All for the lack of decent driver support, which, if it came with source code for the Linux drivers, would be easy for game or video software developers to hook into.
@ffwast • 2 years ago
It probably didn't sell well because the features aren't properly supported for most people, and it *still cost seven thousand dollars at release and currently costs more than a 3090 on Newegg.* It could have worked out if they'd produced an updated one every generation at a more reasonable price (since the groundwork is already done), building up a series of cards to develop tools for, instead of one weird, expensive card from half a decade ago.