Deep-dive into the technology of AMD's MI300

64,035 views

Comments

@HighYield 1 year ago

Do you think all five technologies will "trickle down" into the consumer market? Will we buy a single "gaming SoC" at some point in the future?

@ChristianHowell 1 year ago

We're almost there... I have a mini PC with a 6900HX and I can play EVERY game at 720p... When Strix Point comes out next year I think we'll be at 1080p60 using FSR3... I haven't seen any benches with 7x40 APUs but they should be 50% faster than the 6900HX... Especially if they can get 30% higher clocks... And then Strix should be 30-40% faster than that...
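A rough way to sanity-check the estimate above: generational gains multiply rather than add, and moving from 720p to 1080p raises the pixel count by a fixed ratio. The sketch below uses only the percentages quoted in the comment and assumes, as a simplification, that frame rate scales linearly with GPU throughput and inversely with pixel count. The function name is made up for illustration.

```python
def compound_speedup(*gains: float) -> float:
    """Multiply relative gains, where 0.5 means '50% faster'."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total

# Figures quoted in the comment: 7x40 is ~50% faster than the 6900HX,
# and Strix Point is a further 30-40% faster than that.
low = compound_speedup(0.5, 0.3)
high = compound_speedup(0.5, 0.4)

# Pixel-count ratio between 1080p and 720p.
pixel_ratio = (1920 * 1080) / (1280 * 720)

print(f"Estimated Strix Point vs 6900HX: {low:.2f}x to {high:.2f}x")
print(f"1080p has {pixel_ratio:.2f}x the pixels of 720p")
```

Under these assumptions the compounded speedup lands slightly below the pixel ratio, which fits the comment's point that upscaling such as FSR3 would close the remaining gap.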

@closerlookcrime 1 year ago

@@ChristianHowell Those system on a chip machines are really cool. I hope they make a bunch of improvements over the next 5 years.

@boredgunner 1 year ago

@@ChristianHowell Those performance numbers are good for something like a portable system (Deck) or standalone VR headset

@FfbeEXVIUS 1 year ago

No: heat dissipation, leakage, memory bandwidth, and latency issues. Just look at modern GPUs from Nvidia and AMD, way too costly. Physics isn't magic.

@hristobotev9726 1 year ago

Probably Zen 6, 2026: 8+8 cores plus a 24-32 WGP GPU chiplet, with I/O + 128 MB cache on the bottom and 2-channel DDR6 memory. All for $500-600.

@SirMo 1 year ago

One correction I would mention: the latest version of PyTorch, PyTorch 2.0, is moving away from CUDA. For ever-increasing AI models, CUDA on its own can't optimize for the scale. The frameworks themselves have to be optimization-aware. This is why ML frameworks are shifting from eager mode to graph mode, which sidesteps CUDA (cuDNN) and provides better performance. Instead of using CUDA, they will use tools like Triton (this is what OpenAI's ChatGPT uses), which interfaces directly with Nvidia's NVPTX and AMD's LLVM AMDGPU backends. So CUDA is on its way out, and with it Nvidia's software moat. MI300 will be a monster.

@HighYield 1 year ago

I'm not up to date with all the AI/ML software. Maybe this is AMD's chance to catch up to Nvidia?

@SirMo 1 year ago

@@HighYield I believe so. There is a big push at AMD to improve their AI offerings. MI250 was kind of a test, with limited matrix multiplication units, and built more for pure scientific HPC applications (like the ones used by Oak Ridge supercomputers). MI300 will focus on AI. And I do believe it's going to be very competitive with Nvidia.

@benjaminoechsli1941 1 year ago

@@SirMo Excellent. AMD has been lacking in that market, and some competition against Nvidia's monopoly is welcome. I know they've had their CUDA compatibility layer for ages, but a proper alternative is always best.

@Shubhamrawat364 1 year ago

Another blind fanboy drinking the AMD Kool-Aid. AMD has zero chance in AI because Nvidia is light years ahead with a full-stack ecosystem and partnerships. All AMD can do is make chips, but that won't take them anywhere. It's all about the ecosystem and the advantage of scale, where Nvidia absolutely dominates. Poor AMD was only able to catch up to Intel because Intel fell behind TSMC in process node technology and AMD used TSMC. But with Nvidia there is no low-hanging fruit to pick, because unlike Intel, Nvidia is not resting on its laurels. They are moving ahead at light speed in AI, and AMD is just warming up.

@closerlookcrime 1 year ago

You think so? I am learning CUDA and working on 12.1. I am just getting started.... Just want to do some small-scale work to learn this new tech. There are a ton of processors on the market for cheap that I would have been thrilled to have just 5 years ago. Did you see that CUDA has a new communication system that is not only under development but already being used, and that it reduces the I/O bottleneck? I hope AMD does something great, though, to keep the competition going. Anyway, gotta run.

@lefthornet 1 year ago

This may be the first step of the coming war between AMD and Nvidia. I'm waiting for Intel to react, but the advances that AMD is making are huge.

@beachslap7359 1 year ago

They don't have enough supply to compete with Nvidia, no matter how competitive their architecture is.

@coladict 1 year ago

@@beachslap7359 If Nvidia doesn't have chiplets with Blackwell, they're screwed. It's known they're working on it, but what they themselves don't know is if they'll get it done on time.

@SirMo 1 year ago

@@beachslap7359 AMD produces more chips than Nvidia. They have all the supply they need.

@chrisvicera6696 1 year ago

@@beachslap7359 AMD has the supply, they just don't have the demand or market share. If there was proper demand, they would shift wafer capacity away from their Ryzen CPUs and gaming GPUs.

@beachslap7359 1 year ago

@@coladict They have a far superior architecture right now compared to AMD. Why would that change for the next generation just because they don't have chiplets? Especially considering the node jump is gonna be slim for both companies.

@hanspeter24 1 year ago

Your videos are amazing and I learn so much new, interesting information. Even though I don't understand everything, it's still rewarding to watch you explain things and to develop my own knowledge just as you did. Thanks for that, and greetings from Hamburg, Germany :)

@cocosloan3748 1 year ago

100% agree 👍

@HighYield 1 year ago

Something I wanted to add: since we don't know the packaging method yet, when I talk about the "interposer" it doesn't have to be a large silicon interposer. It might be a small "organic interposer" like on Navi31, using TSMC's "Fan-Out" technology. Once we know more, another video will follow! www.tsmc.com/english/dedicatedFoundry/technology/InFO

@ChristianHowell 1 year ago

IIRC, TSMC is using CoWoS and TSVs with the caches like X3D... They renamed it to 3D Fabric for the whole infrastructure... AMD is at the cutting edge of silicon right now... They had to sell their own fabs and came out even better in the long run... All their 5nm products haven't even come out yet and the 4/3nm Zen5 ones will have even more instructions and accelerators...

@HighYield 1 year ago

@@ChristianHowell AFAIK, SoIC is also a Chip-on-Wafer technology.

@ChristianHowell 1 year ago

@@HighYield Yeah, they folded it all into 3D Fabric terminology... But the basis is CoWoS because it lets them package huge chips like MI300...

@Cooe. 1 year ago

It's 100% confirmed to be a standard, absolutely freaking MASSIVE (as in it pushes TSMC's max reticle size right up to its absolute limit lol 🤣) silicon interposer. 🤷 AMD were originally planning on doing what MI250X did to connect its HBM, which is to use a hybrid of a standard interposer à la MI300 & fan-out wiring à la Navi 31, w/ TSMC's "CoWoS-R" packaging, which uses a small silicon bridge + fan-out technology. (Think Intel's EMIB, but instead of the tiny silicon mini-interposer bridges being directly connected on both ends w/ physically die-to-die attached TSVs [which TSMC can now do too, called "CoWoS-L"], they're connected w/ less dense fan-out wiring [although still plenty dense enough for HBM3].) But then, pretty much as late as they possibly could, AMD had to switch to a traditional, absolutely gargantuan silicon interposer (aka TSMC's "CoWoS-S" packaging) due to reliability concerns, thanks to the massive package flexing/warping when kept at its full-fat 750W load for seriously extended periods. Basically, with MI300 having a MUCH larger overall package than even MI250X, the tiny silicon bridges connecting the HBM stacks were simply much more prone to failure than a single massive contiguous silicon interposer across the entire package under the kinds of "100% load, 24/7/365" conditions that are endemic in HPC/data center land. 🤷

@Cooe. 1 year ago

@@ChristianHowell It's CoWoS, sure, but TSMC has like 10x COMPLETELY DIFFERENT technologies under that single banner, making it a mostly meaningless/useless label just by itself, outside of letting you know the product is multi-chip. 🤷 MI300 uses a classic single massive (literally reticle-limit-pushing) contiguous silicon interposer base layer (just like in, say, Vega), which in TSMC's marketing speak is called their "CoWoS-S" ("CoWoS on silicon") packaging technology. For example, CoWoS-S/a traditional large silicon interposer is an ENTIRELY DIFFERENT packaging tech than what's in MI300's predecessor, MI250X, which is also CoWoS but uses "CoWoS-R" instead! ("CoWoS-R" is a hybrid between Intel's EMIB tiny silicon interposer bridges [which TSMC also now has a proper version of, called "CoWoS-L"] & the ultra dense fan-out wiring tech used by Navi 31/32 & Apple's M# Ultra SKUs. Basically think EMIB-style silicon bridges but connected on both sides w/ less dense but still plenty fast for HBM3 fan-out wiring vs EMIB's/CoWoS-L's direct chip-to-chip TSVs. They didn't end up using CoWoS-R again on MI300 like they'd originally planned to because of reliability issues w/ the tiny silicon bridges at its full 750W load for truly extended periods, as is the standard HPC/data center usage environment.)

@jelipebands1700 1 year ago

Awesome job covering the MI300. It's so impressive and so far beyond anything ever made that no one knows how to cover it or even talk about it. The technology is going to make it into gaming consoles. I predict next gen is going to be so integrated that the thought of adding RAM and separate cards to a PC will feel ancient.

@RobBCactive 1 year ago

Incidentally when AMD was buying ATI for Radeon the "Fusion" idea of not just seamless GPU fp compute but a unified address space was used as justification. It's over a decade but at last this becomes more feasible. Not just MI300 but SAM and recent DX12 extensions are aimed at shared address space.

@HighYield 1 year ago

From "the future is fusion" to the actual "fusion future". AMD's slogan back then was more than just marketing, it was a promise.

@peterconnell2496 1 year ago

Exactly, & the acquisition was completed in 2006, almost a generation ago.

@RobBCactive 1 year ago

@@peterconnell2496 The bottom line was that the APU wasn't that feasible, but now we see Intel & AMD strengthening gfx, while Nvidia are launching ARM CPUs for the data centre. So this memory unification is the direction for high performance. The old APUs were too compromised by cost limits, a restricted memory bus, and cache; despite AMD having a transparent NUMA in their server CPUs, they didn't have the investment funds to realize the possibilities. A sceptical observer suspects the justifications were rationalisation for a panic reaction to CUDA. IIRC ATI were in trouble and AMD wanted a slice of GPGPU, so the Fusion concept of another VLSI step was born. The problem was Intel had successfully responded to AMD64 and the x2 chips with Core Duo, had bribed the key OEMs, and had a voucher system of rebates with Intel Inside that small dealers relied on. So AMD were squeezed from both sides, not able to realise the profit from their real innovations and NOT having the financial muscle to buy an OpenCL counter to CUDA of sufficient quality & application support. AMD were following and trying to catch up; Intel had gone awry with P4/Itanic, but commercial power kept them strong. Nvidia meanwhile reaped the rewards of the collapse of competition, with their new main competitor having to divert funds away from future GPU designs.

@bartomiejpopielarz8283 1 year ago

I think you are spot on. Though I would say that another tech to look out for is in-chip fluid cooling. Heat is a huge problem, especially with 3D stacking. Efficient extraction of heat allows for higher frequencies and lower energy use (as heat increases resistance).

@HighYield 1 year ago

Next-gen cooling tech is definitely worth a future video :)

@adoredtv 1 year ago

Great video, cheers!

@earllemongrab7960 1 year ago

1. Excellent video. I don't follow the data center innovations that closely, I'm more of a desktop gaming guy, so this video was absolutely fascinating to me. Well explained, well segmented. And it's exciting to think about what this will mean for the desktop in the upcoming decade.
2. Before you introduced the 5 new technologies, I paused the video and gave it a quick think myself. I basically came up with the same categories. Except I came up with "heterogeneous design". In my head that was something that takes the SoC and disaggregates it into chiplets, but also includes mixing process nodes and possibly chiplets made by different vendors / foundries. We're not quite there yet. But in my head mixing 5 with 6 nanometers is a part of it. So I basically mushed your "SoC" and "Packaging" categories together and added a bit of my own flavor.
3. The classic 'von Neumann' architecture on PC can't keep up anymore. We see this with the consoles, how a smaller, much cheaper design can yield incredible performance. Mid to high-end PCs that cost 3 times more struggle to play the latest console game ports. This is ridiculous, something's got to change. I'm curious what a next-gen PC architecture will look like. Will we still have a modular design, what will cooling look like, and will manufacturers be able to agree on a standard in time, before consoles make the PC look even more boomer than it already looks to some people?
4. Exciting times ahead.

@HighYield 1 year ago

It doesn't really matter what you call it. Remember when AMD's company slogan was "the future is fusion"? It's kinda ironic that now that they have achieved this complete "fusion", it's not their slogan anymore. But the technology has been a long time coming. Long before their Ryzen comeback, AMD was forced to innovate to stay alive. The design lead AMD currently has is a result of this. Fully agree, exciting times ahead!

@BrandonMeyer1641 1 year ago

Fugaku with the Fujitsu A64FX walked so El Capitan and MI300 could run. Seriously.

@WXSTANG 8 months ago

AMD is way ahead of the curve vs the competition. They just need someone to market the tech better. They have a truly heterogeneous system and get better and better every year. Now AMD is sharing GDDR between the CPU / GPU and other AI accelerators.

@ChristianHowell 1 year ago

MI300 is something I've been waiting for since I saw the initial HPC chiplet APU patent... The interesting thing is that older CPUs used to remove functions from the CPU die because of limited transistors at 180nm etc... But the biggest thing about it is that I heard some autonomous driving folks say that 2000 TOPS are needed for an FSD experience, and since MI250X has 383 TOPS, 8X that is just over 3000 TOPS... AMD can now theoretically provide all the chips for automobiles out of nowhere it seems (NOT!!!)... They can use an edge-like appliance with a Pensando front end for network relays for traffic and weather, etc. for a LARGE MAP area, while an upcoming Ryzen APU can do the entire system, including 4K video and gaming... Companies are selling mini PCs with Ryzen and Radeon 6000 that can do 4K30+... Zen4 telco servers can do edge processing while EPYC can stream games and all types of data including AI for predictive routing...
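The TOPS arithmetic in the comment above is easy to check. This sketch only multiplies the figures the commenter quotes (383 TOPS for MI250X, an 8x uplift, and a 2000 TOPS target attributed to autonomous driving developers); none of them are verified here, and the variable names are illustrative.

```python
MI250X_TOPS = 383        # figure quoted in the comment
CLAIMED_UPLIFT = 8       # the "8X" quoted in the comment
FSD_TARGET_TOPS = 2000   # requirement the commenter heard, unverified

# Projected throughput if the 8x uplift applied directly to 383 TOPS.
mi300_tops = MI250X_TOPS * CLAIMED_UPLIFT

# How far above the quoted target that would be.
headroom = mi300_tops / FSD_TARGET_TOPS

print(f"Projected: {mi300_tops} TOPS")               # Projected: 3064 TOPS
print(f"Headroom over target: {headroom:.2f}x")
```

So, taking the quoted inputs at face value, the projection comes out at roughly 3000 TOPS, about one and a half times the stated target.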

@NootNoot. 1 year ago

The packaging/chiplet design is quite brilliant (speaking of, congrats on the sponsorship)! One day, hopefully we'll see all of these techniques trickle down to desktop/consumers! The Zettascale strategy is interesting because it pulls you into real-world limitations, that is physics, which will inevitably halt performance if we don't invest in new techniques. Take 3D V-Cache: although it is a great solution for more L3$, there are still thermal limits. AMD investing in R&D is a long-term goal. And investing in and brute-forcing today's technologies, like monolithic designs, will turn out to be unreasonable in the near future.

@HighYield 1 year ago

With X3D CPUs and Navi31, AMD is really aggressive in using their top technology for consumer products, though as you said, it's always "trickle down". Zen 2 chiplets were designed for servers, not desktop, same with 3D V-Cache. But they work great for desktop too.

@D.u.d.e.r 1 year ago

Excellent vid, thank u for making it!

@HighYield 1 year ago

Thanks for watching :)

@D.u.d.e.r 11 months ago

@@HighYield 110% agree with u that MI300 is a prototype of the chip we will see in the future also in the consumer market with lot of 3D stacked L3/4 V-Cache and possibly even with some of the HBM memory. I would say next gen consoles, maybe even PS5 Pro will utilize V-Cache because of its huge benefit for gaming. In the enterprise segment similar chips to MI300 will also expand to next level not just on the points u have mentioned (especially 3D stacking & packaging) but mostly with - chip/chiplet customization. Companies will be able to fully customize (for an extra price of course) what kind of chiplets/accelerators they'll get in their chips . It won't be surprising to see future MIxxx AMD chips with their Xilinx FPGA and for example Tenstorrent chiplet.

@peterdoyle8571 1 year ago

1) LightMatter wafer-scale optical interconnect
2) Ultraram replacing most chiplet cache, HBM, DRAM, and NVM
3) Accelerators on chip/package
4) Combining CPU, GPU, FPGA on package
5) Backside power delivery
6) VTFET
7) Deep trench capacitor on wafer with direct 3D bonding integration
8) Glass-based motherboards with integrated photonics, power delivery, and microfluidic cooling

@ericpickering2406 1 year ago

Thanks!

@HighYield 1 year ago

No, thank you!

@craftspro 7 months ago

Such a great channel & amazing video explanation. Even big YouTubers like Linus Tech Tips don't explain chip design like this. Very underrated channel.

@lasbrujazz 1 year ago

And with a chiplet design, AMD can scale their products way more easily than the competition. As shown last week, MI300 has 2 variants: "A" with 6 GCDs and 3 CCDs, and "X" with basically all CCDs replaced with GCDs, making it GPU-only. This modularity is going to please all kinds of customers.

@OneAngrehCat 1 year ago

To me, the most memorable part of the keynote in the entire Zettascale race was logic on memory. I can't really imagine just how much you could realistically put on RAM, probably only basic math operations, as anything too complex would probably be too costly. But if you can even just do basic math, even just add/subtract/jump, it'd be a true revolution. So many basic operations would be offloaded from the CPU and live in the RAM. The CPU would just have to send the request, and that would seriously cut down on transfers. You could go from 20 transfers and operations down to something like one CPU -> RAM transfer, operations by the RAM, and then a RAM -> CPU transfer when done to send back the updated data the CPU wanted. It's truly revolutionary in speed and efficiency. How costly/plausible... I don't know. But I find it to be the most impressive thing.

@peterconnell2496 1 year ago

Maybe even on NVMe storage?

@closerlookcrime 1 year ago

I hope all manufacturers start stacking chips and putting them in our desktops. Gonna get some cool stuff soon.

@u-def 1 year ago

I felt this is best ad transition ever. It kinda convinced me to learn some on brilliant 😂

@HighYield 1 year ago

I get lots of sponsorship offers, but I only take the ones I actually think are really useful and Brilliant is definitely useful.

@TrueThanny 1 year ago

The eight extra "dies" are probably what I would call fillers, not spacers. I don't think they are there to serve any structural purpose, but simply to take up space that would otherwise have to be taken up by the resin used to make the whole package flat. As for the stacking method, I can only make a wild guess. The bottom dies are apparently mostly for I/O, so they could be quite large, given the lack of scaling with I/O. They would then also have a lot of dead space, making TSVs an easy option on that side of the equation. If I had to guess, I'd say they connect to the upper chiplets using the normal connection points those chiplets would have if they sat directly on a package. Meaning the bottom dies replace the substrate as far as the stacked dies are concerned, and use TSVs as needed to connect to the actual substrate for reaching the socket pins. That would make the overall design chiplets on top of active interposers, on top of a passive interposer with HBM at the sides.

@Cooe. 1 year ago

Wrong. They are officially & explicitly according to both AMD & TSMC there to maintain structural integrity across the massive chip package, just like OP claimed. If it was just a massive valley of pure in-fill material placed over those areas in-between the HBM3 stacks instead of hard silicon, it would have allowed SIGNIFICANTLY more flex and warping under the package's full 750W power/thermal load, which could EASILY break the RIDICULOUSLY FUCKING FRAGILE like 1100mm² active silicon interposer everything sits on top of. 🤷 (Not break as in "crack it" or anything, but rather cause some of the countless MICROSCOPIC TSV's ["Through-Silicon Vias"] connecting the massive interposer to all of the various chips on top of it to become disconnected.)

@Cooe. 1 year ago

And the bottom dies sit on top of a massive package size active silicon interposer, NOT on the package substrate itself! The gargantuan silicon interposer underneath all the active chips above is what's connected to the actual package substrate.

@zetaDirective 1 year ago

The most fun part there is the 3d stacked L3 CPU cache. A-freaking-some!

@6SoulHunter9 1 year ago

I noticed that I started to anticipate your videos! When I have free time it's the first thing that I look for. There are lots of technology channels, but lots have fillers and content oriented to entertain, which isn't bad, but I am a huge nerd and I enjoy more this channel. Keep going!

@HighYield 1 year ago

Not stopping anytime soon, but I've had this on-and-off cough for almost a month now, which makes recording videos harder. I will try to get back to my original "once a week" schedule, or at least not keep the current "once every 2-3 weeks" :X

@6SoulHunter9 1 year ago

@@HighYield Don't worry! I view your videos for free, so I cannot complain, and I'd rather have a few good-quality videos than stream-of-consciousness ones.

@smactardian 9 months ago

The transition to GAAFET/MBCFET in near-future process nodes like 18A strongly indicates that process will still prove to be a driving factor in performance. Although TSMC still uses FinFET for 3nm, GAA may be feasible at sub-nanometer nodes.

@igavinwood 1 year ago

New sub. Thank you for highlighting the tech along with the exciting and somewhat scary aspects of modern computing. This vid is a good indicator of how fast the computing field has moved in recent years. I expect that the AI at work currently is being utilised to create the next generation of SoC and AI. It's something of a self-fulfilling prophecy, so the tech questions around achieving zettascale are likely to be answered in the next 15 years. The bigger question, however, is the impact it has. That's a discussion that has yet to really hit the larger population.

@forrestnorrod1547 1 year ago

Very good video. Great analysis and insights on what AMD is doing and why.

@HighYield 1 year ago

Thank you, good to hear you liked it!

@wayofflow955 1 year ago

Interconnects are the key to new age of 3D stack chips I think. We will get to a point where the processor is not 2d but more like a solid cube. Inside this solid cube is all semiconductor.

@HighYield 1 year ago

And if you come up with a good way of cooling this cube, you could be the next Bill Gates!

@franchocou 1 year ago

I want a solid cube of GaN

@APHRODIZZYAC 1 year ago

Excellent video, really enjoyed it. I'm not an expert by any means but I do think once we are able to produce graphene transistors at scale that both speed and efficiency is going to make a gigantic leap forward, combine that with optical data transfer off chip the leap forward will be incredible.

@CharlieboyK 1 year ago

Great video about the technology. AMD has certainly embraced chiplets well. AMD's earlier acquisition of Xilinx ensures that they have the best substrate and packaging technology, as Xilinx is considered the best in this area. AMD can make custom SoCs to target high-end AI companies. Exciting times ahead for this technology.

@m_sedziwoj 1 year ago

5:50 I would not agree, spacers would not be split in half, double trouble to place and keep height in check

@marce.fa28 1 year ago

You are amazing, thank you 😊

@miyagiryota9238 1 year ago

Yea absolutely great presentation!

@closerlookcrime 1 year ago

You can get an IBM x3650 M4 with Intel Xeon E5-2670 CPUs at a 2.6 GHz base frequency and set it to run at 3.3 GHz in high power mode. There are two silicon chips with 8 cores in each chip and 2 threads per core. The unit also comes with 64 GB of memory at 1333 MHz for 150.00 bucks. I then added a GTX 1660 Super with 6 GB of GDDR6 and 128 GB of additional RAM. I love it. I put Ubuntu 22.04 LTS on it for the operating system. I picked up 10 900 GB SAS drives, put 8 in the machine to start, and set them up with RAID 5. There are a ton of these servers on eBay, and they can be obtained in a server or desktop server package. Thought I would share. Good luck. Have fun.
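For anyone sizing a similar box, the thread count and usable RAID 5 capacity follow directly from the numbers in the comment above. This is a minimal sketch with hypothetical helper names; it ignores filesystem overhead and the difference between decimal and binary gigabytes.

```python
def hardware_threads(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total hardware threads visible to the operating system."""
    return sockets * cores_per_socket * threads_per_core

def raid5_usable_gb(drive_count: int, drive_size_gb: int) -> int:
    """RAID 5 spends one drive's worth of capacity on parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_size_gb

# Figures from the comment: 2 sockets, 8 cores each, 2 threads per core,
# and 8 installed 900 GB SAS drives.
threads = hardware_threads(2, 8, 2)
usable = raid5_usable_gb(8, 900)

print(f"{threads} hardware threads")      # 32 hardware threads
print(f"{usable} GB usable in RAID 5")    # 6300 GB usable in RAID 5
```

The array survives a single drive failure; the two spare drives mentioned in the comment could serve as replacements.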

@adela5561 1 year ago

Great video and well explained. Thks

@jet_mouse9507 1 year ago

With memory integrated onto the chip, the memory timing will probably be REALLY good! Plus, with a GPU with performance levels of dedicated GPUs on the same die, that's gotta be a huge performance increase. If nobody posts a YouTube video of gaming performance with fps graphs with this thing, I will be VERY disappointed. Maybe Linus will do that.

@HighYield 1 year ago

While CDNA3 isn't really a gaming architecture like RDNA3, I'm sure you can run games on it. If you get the drivers working MI300 would be a power house!

@honkhonk8009 1 year ago

GPUs are one of the few things I would rather not have directly integrated onto the CPU, for the express reason that it destroys the modularity modern systems have. I like the idea of having memory integrated onto the GPU die. I think it's much wiser to just offload more and more things onto the GPU than to integrate the CPU and the GPU more closely. For gaming especially, the CPU shouldn't be doing shit when it comes to rendering. It should mostly be hardware doing the job.

@raisofahri5797 1 year ago

@@honkhonk8009 Sadly, the industry is moving toward unified systems.

@kkgt6591 1 year ago

What is the 128GB HBM for? Is it the RAM? Can the customer upgrade it for additional RAM?

@HighYield 1 year ago

Yes, that's the RAM. And it's not upgradeable, as it's HBM that sits on the package itself.

@TheRandomguy06 1 year ago

You said in the video that you think semiconductor packaging will be more important than process node in the future. Can you expand on that? Or maybe make that its own video?

@HighYield 1 year ago

The basic idea is that with process node progress slowing down, how you package/combine chips becomes much more important. I'm sure it will be a topic in future videos :)

@squeezedoz 1 year ago

Fantastic summary!

@legobuildingsrewiew7538 11 months ago

This is so cool. I want this for my home lab.

@VnikXum 1 year ago

Thank you for the video! I think changing the chip material from silicon to carbon (or something else) is an additional way to improve efficiency.

@kingkrrrraaaaaaaaaaaaaaaaa4527 Жыл бұрын

Do you mean carbon nanotubes or graphene when talking about carbon?

@VnikXum 1 year ago

@@kingkrrrraaaaaaaaaaaaaaaaa4527 Yes, maybe, or probably a diamond structure instead of silicon.

@JxcksonSF 1 year ago

Is HBM still more expensive than GDDR? Will it return to gaming GPUs someday?

@HighYield 1 year ago

I think at some point HBM might have its comeback, but not anytime soon. GDDR7 will be next.

@julianfiacconi709 1 year ago

MCR DIMM is the future. Latest pursuit in memory tech, from what I’ve recently read.

@ttb1513 1 year ago

@HighYield Great channel. If I may, please pay attention to how long you keep a slide on the screen, especially when text is flashed onto the slide at the last moment. Just a little thing.

@HighYield 1 year ago

Which slide are you referring to? A timestamp would help :)

@ttb1513 1 year ago

@@HighYield Sure:
2:19
9:39 2.5D/3D stacking difference: the diff is "both (MEM/compute) are actually active". At 9:47 irrelevant fine print comes into view for 2s. Even though that text is unimportant, it creates the impression that I'm missing something.
9:50 "The 5nm MCD and CCD chiplets …". I couldn't find MCD in the chip diagram or the "AMD MI300" text column at right.
11:54 The slide phase-in effect made the text unreadable for 4s. The slide was on for 10s, but disappeared once the motion of the text froze. Combined, only once the text froze was I ready to read, so reading felt very rushed.
13:29 Uses the same phase-in effect, but this slide stays longer.
14:19 Same.
17:15 4s.
The slide phase-in effect, where the start of all sentences is not visible at first and the text keeps moving but disappears once it stops moving, makes a slow reader like me feel rushed. (Secret: especially for someone like me who likes to listen at faster speeds, but reads slowly. I understand it's not possible to cater to making readable slides at faster speeds.)
The section on 2.5D vs 3D at 9:39 created confusion because the explanation of "in 3D, both chips are active" made me investigate the diagram more, and I couldn't find MCD, so I was trying to find that or determine if you actually said GCD. The main point is that the explanation of both chips being active in 3D seemed off: the difference strikes me as more about how intensive the transistor switching density (and heat) is for the stacked chips, how tolerant the circuitry is of heat (DRAM and SRAM are finicky), and heat dissipation for the stack. After all, in HBM, all dies in the stack can have DRAM banks active at the same time.
Please take my original comment as a bit nitpicky. I really like your channel and appreciate the time you put into such thorough content research and video production. I'd like to see your channel develop a larger viewership.

@PunmasterSTP 10 months ago

MI300? More like "Magnificent technology, and they're going way ahead!" 👍

@cocosloan3748 1 year ago

Love this YT channel . Excellent info -so educational 👍

@HighYield 1 year ago

Thank you! But it's always important to realize I'm not 100% correct on everything, especially when some details are still unknown.

@cocosloan3748 1 year ago

@@HighYield Still - we learn so much about the technologies you talk about. TY 👍

@lordvass3377 1 year ago

And honestly, I don't think anyone can beat AMD in raw power at this point, and big tech corps are seeing that now.

@Kratochvil1989 1 year ago

Weeks ago I saw a stupid comment: "AMD is just all about brute force, that's useless." I saw that comment and couldn't stop laughing. Hell yes, efficient brute force, the best combination ever, and that random guy was just upset about it :D. It seems like AMD has some insanely talented engineers, and it's great to see. If I am not mistaken, Apple thought about chiplets many years ago, for a pretty long time, and they failed. Making this architecture a real thing was really an exceptional and kinda scary challenge.

@justinpatterson5291 1 year ago

Imagine a (professional) consumer-grade APU with 24+ Compute Units of RDNA 3-4, with 128MB of 3D V-Cache or on-die HBM. Filling a PCIe slot with other expensive silicon could become optional in some cases.

@jihadrouani5525 1 year ago

Very informative videos, please keep up the awesome content =]

@HighYield 1 year ago

As long as ppl are watching I won't stop!

@yanhao5703 1 year ago

Is the packaging option you mentioned, where the base chiplets are smaller than the silicon on top so that there can be a direct connection between the interposer and the silicon, similar to what Intel presented some time ago as Foveros Omni?

@judehariot8076 1 year ago

Great coverage and analysis, but could you perhaps remove the 'fizzy' distracting background music please? Or would it be possible to upload different version of the video that has no background music?

@HighYield 1 year ago

Is it bg music in general, or do you just not like the one I used for this video specifically?

@judehariot8076 Жыл бұрын

@@HighYield It would be the one used in this video in particular. It's the quick drum percussion synonymous with college/high school football bands, but also commonly used in modern hip hop. It tends to be effective in drawing one's attention to the beat, away from what you are trying to say. I hope that's clear enough. Cheers.

@npip99 Жыл бұрын

2:30 I think Unified Memory will really not be a thing. 32GB of RAM for the best-in-class DDR5 Hynix A-die is $80 at its cheapest. The top-of-the-line CPU and GPUs combine to thousands of dollars. Unified Memory involves a lot of management for divvying up RAM between the two, which is likely to be slower than just giving the CPU and GPU their own separate RAM, especially given just how cheap RAM is. For data centers, Unified Memory for training AI models is important since we're talking terabytes of RAM. But atm even 16GB vs 32GB is showing virtually no difference to consumers.

@dralord1307 Жыл бұрын

Thanks for the interesting video

@kirkseywysinger2112 Жыл бұрын

Thank you for doing what you do. New fan here, really appreciate the content and your presentation.

@RealLifeTech187 Жыл бұрын

Nice video 👍 Very interesting.

@vincentlzl921 Жыл бұрын

Within these 5 years AMD has never ceased to amaze the market with better products.

@IainMcClatchie 12 күн бұрын

I don't know if you are still reading comments from a video a year ago, but here are mine. First, nice job showing how the chiplets are stacked up. I haven't seen another presentation of the stacking that is clear. At all. The claim is that interposers and chiplets reduce power. Okay, prove it. * For a signal travelling, say, 5 cm, how many picojoules per bit are consumed pushing that signal across metal on the interposer, versus how many picojoules per bit are consumed pushing that signal through a short-range serdes, across a lower-capacitance organic composite substrate, and then back up into another serdes? * From that energy cost, and the stated bandwidths that these chips have, you can figure out the power dissipation in the communications system. Is it 10% of the total power budget, or 50%?
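The arithmetic asked for in the second bullet can be sketched in a few lines of Python. This is only an illustration of the formula (power = bits per second × energy per bit). The bandwidth, the pJ/bit costs, and the total power budget below are placeholder assumptions, not measured or published MI300 values, and the function names are hypothetical.

```python
def link_power_watts(bandwidth_tb_per_s: float, energy_pj_per_bit: float) -> float:
    """Power in watts spent moving data at a given bandwidth and energy cost per bit."""
    bits_per_second = bandwidth_tb_per_s * 1e12 * 8  # terabytes/s -> bits/s
    joules_per_bit = energy_pj_per_bit * 1e-12       # picojoules -> joules
    return bits_per_second * joules_per_bit


def share_of_budget(link_watts: float, total_watts: float) -> float:
    """Fraction of the total power budget consumed by the link."""
    return link_watts / total_watts


# Placeholder inputs, chosen only to show the calculation (all assumed):
ASSUMED_BANDWIDTH_TB_S = 5.0     # aggregate die-to-die traffic
ASSUMED_INTERPOSER_PJ_BIT = 0.5  # signalling across interposer metal
ASSUMED_SERDES_PJ_BIT = 2.0      # serdes across an organic substrate
ASSUMED_TOTAL_WATTS = 750.0      # total package power budget

for label, pj in [("interposer", ASSUMED_INTERPOSER_PJ_BIT),
                  ("serdes", ASSUMED_SERDES_PJ_BIT)]:
    watts = link_power_watts(ASSUMED_BANDWIDTH_TB_S, pj)
    share = share_of_budget(watts, ASSUMED_TOTAL_WATTS)
    print(f"{label}: {watts:.1f} W, {share:.1%} of budget")
```

Swapping in real pJ/bit figures for each link type would answer the "10% or 50%" question directly; with the placeholder numbers the result means nothing beyond showing the method.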

@ZAGAN-OZ Жыл бұрын

This gets me excited for RDNA4.

@anttimaki8188 Жыл бұрын

The genius part is making a sort of universal chiplet that can actually be used for more than one purpose. Whoever figured it out needs someone to go grab him a coffee.

@HighYield Жыл бұрын

Intel's Meteor Lake will also be chiplet based (Intel calls them "tiles") and I'm sure sooner than later Apple & Nvidia will join the club.

@Cooe. Жыл бұрын

There will even be a PCIe accelerator version that's basically 1/2 of a MI300X on a half-size interposer & thus package! (2x base tiles + 4x GCDs & HBM3 stacks vs MI300X's 4x/8x, so 152 CDNA 3 CUs w/ 96GB HBM3 vs MI300X's 304 CUs/192GB). Think the new MI210 equivalent to MI300X's direct MI250X replacement.
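The halving described in this comment is easy to check with a short Python sketch. The MI300X figures are taken from the comment at face value, and the half-size part is the commenter's claim, not a confirmed product. The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorConfig:
    """Hypothetical container for the counts quoted in the comment."""
    base_tiles: int
    compute_dies: int
    hbm_stacks: int
    compute_units: int
    hbm_gb: int


def halve(cfg: AcceleratorConfig) -> AcceleratorConfig:
    """Return a configuration with every resource cut in half."""
    return AcceleratorConfig(
        base_tiles=cfg.base_tiles // 2,
        compute_dies=cfg.compute_dies // 2,
        hbm_stacks=cfg.hbm_stacks // 2,
        compute_units=cfg.compute_units // 2,
        hbm_gb=cfg.hbm_gb // 2,
    )


# Figures as quoted in the comment (4 base tiles, 8 compute dies, 8 HBM3 stacks).
MI300X_AS_QUOTED = AcceleratorConfig(
    base_tiles=4, compute_dies=8, hbm_stacks=8, compute_units=304, hbm_gb=192
)
HALF_SIZE = halve(MI300X_AS_QUOTED)

print(HALF_SIZE)
# Per-die and per-stack ratios stay the same when everything is halved.
print(HALF_SIZE.compute_units // HALF_SIZE.compute_dies, "CUs per compute die")
print(HALF_SIZE.hbm_gb // HALF_SIZE.hbm_stacks, "GB per HBM3 stack")
```

The numbers in the comment are internally consistent: halving 304 CUs and 192GB gives 152 CUs and 96GB, with the same per-die and per-stack ratios as the full part.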

@moncyn1 Жыл бұрын

3:06 so many artifacts, must be algorithm

@theevilmuppet Жыл бұрын

Wonderful video! May I offer one correction? The units for die size are square millimetres, not millimetres squared: en.wikipedia.org/wiki/Square_metre

@HighYield Жыл бұрын

Thanks for pointing that out. Honestly never thought there would be a difference. In German (my native language) it's also "square millimeters", idk why I'm saying it the other way around when I talk in English.

@theevilmuppet Жыл бұрын

@@HighYield because English is a terrible language! Many native English speakers who speak no other languages make the same mistake, saying "well, that's how it's written". By the way - your English is perfect!

@samghost13 Жыл бұрын

I really enjoy your videos. Please more : )

@joehopfield 3 ай бұрын

UMA makes data access *faster* (lower latency) as well as more efficient. Managing caches and external memory access was a huge burden for traditional multi-processors.

@alb.1911 Жыл бұрын

Thank you

@HighYield Жыл бұрын

Thank you for watching!

@GIANNHSPEIRAIAS Жыл бұрын

Can you imagine if MI300 variants replace the current Threadrippers?

@MrHighway2000 Жыл бұрын

Will MI300 be able to access off-chip regular memory? Say 1TB DDR5 type.

@HighYield Жыл бұрын

That's a good question. 128GB HBM3 is a lot, but modern servers have TBs of RAM per CPU. Could be possible.

@jackskalski3699 Жыл бұрын

Isn't an SoC kind of counter to chiplets?

@raisofahri5797 Жыл бұрын

SoC is pretty much just a general name for a chip that has more than one component.

@NaumRusomarov Жыл бұрын

what's the purpose of the cpu cores here?

@HighYield Жыл бұрын

So it can function as an APU and you don't spend energy transferring data between CPU and GPU over a motherboard. It's for maximum efficiency.

@samlebon9884 Жыл бұрын

CPUs do the serial computing, CDNA cores do the parallel. Together it's called heterogeneous computing. AMD has a nice paper on heterogeneous computing, just Google AMD HSA paper, or something like that.

@SirMo Жыл бұрын

GPUs are accelerators only used to perform a specific set of operations. CPUs are still needed to feed the GPUs and run the actual programs. For instance, if you're doing AI training, you still need the CPU to parse and provide the data to the GPU, and then compile those results into a model. CPUs are really good at executing serial code, while GPUs are good at executing highly parallel code. You need both. And AMD is the only company that can provide best-of-breed CPU and GPU.

@RickBeacham 4 ай бұрын

When will AMD sell these for the gaming PC market, or is this the next PlayStation?

@GustavoNoronha Жыл бұрын

Great video as always, I could do without the repetitive background music, though ;P

@HighYield Жыл бұрын

You mean no bg music at all or just change it up more? I feel like without bg music it's too "empty".

@GustavoNoronha Жыл бұрын

@@HighYield I am quite sensitive to repetitive noise, so I am probably not a good reference. I think if you change it up it should be good, as I did not feel it grow in my mind in the smaller videos. Maybe use the chapters to change it up or something like that? Still, the video was very worth it anyway =)

@josiahmoorhouse8036 Жыл бұрын

But can it run Crysis?

@anttimaki8188 Жыл бұрын

Nah, it's How Many?

@jaynorwood2 5 ай бұрын

Intel's PVC GPUs have 16 compute tiles sitting on top of a base layer that has sram cache.

@seanjorgenson7251 11 ай бұрын

They are using Atomera's MST

@Artoooooor Жыл бұрын

I want to have this monster in my desktop computer...

@radicalrodriguez5912 Жыл бұрын

great engineering is usually simple

@jarenpocopio6033 Жыл бұрын

Can't wait to equip the Sandevistan prototype Mk 1.

@HighYield Жыл бұрын

Just wait another 54 years ;)

@NKG416 Жыл бұрын

I do think the music is too loud

@RoyaltyInTraining. 6 күн бұрын

Imagine all the performance we're gonna get once AI calms down and all this tech comes to consumer hardware

@denvera1g1 Жыл бұрын

When I proposed to Moore's Law is Dead, back when their first chiplet CPUs were announced (2019), that AMD would make a mega APU in a few years, he laughed and called me stupid

@denvera1g1 Жыл бұрын

To be fair to MLID, it was a superchat, so I couldn't clarify the timeline or the market, so he may have been thinking of a Threadripper APU in 2019 based on Zen 2 and CDNA/RDNA1

@DB-nl9xw Жыл бұрын

Will Nvidia produce a CPU?

@andersolsen1478 Жыл бұрын

It is not only that AMD has a better product than Nvidia, but they are also going to use open source software, which will be better and cheaper than Nvidia software. It is a win-win for AMD. 🎉

@Dianaranda123 Жыл бұрын

I am a little bit afraid of Unified Memory and SoCs. I prefer upgradeable memory and GPUs. It would be horrible if I were locked out of content creation (3D artist) just because I can't afford to lay down 3000€+ just to get the right amount of memory and a powerful enough GPU in one go.

@HighYield Жыл бұрын

True, upgradeable RAM is a nice thing to have. My current system started with 16GB and now I have 32GB.

@egalanos Жыл бұрын

They're not mutually exclusive. A base amount of on package memory could be provided for performance/efficiency reasons and additional memory added-on via either DDR sockets or CXL. I think the main thing preventing it is profit margins; AMD can't sell integrated memory unless they add the large margin that their shareholders expect. System integrators/OEMs don't want to have their costs inflated.

@Dianaranda123 Жыл бұрын

@@egalanos True-ish I guess, but knowing present-day motherboard manufacturers, they would do the following: cheap motherboards all the way up to 800€ wouldn't have DDR/CXL slots, and then 1000€+ would have the slots, and then they would claim that the slots are not necessary or something along those lines.

@hstrinzel Жыл бұрын

Wow, I don't even know yet how to use my Raspberry Pi 4 to its fullest...

@theminer49erz Жыл бұрын

I say APUs will be the next wave as they are able to perform equal to/better than a console for less money. Small form factor mid range Gaming PCs will be their replacement. I would like to see mobos using dual APUs and shared VRAM style RAM for system and graphics. I really like where they are going. It's what I was hoping for, so I think it will close about the console prediction. Maybe one more "next gen" system from MS and Sony then done....unless they start making pre-built under the name, but they prob won't want to deal with that if there is no licensing revenue. Nintendo will probably keep making consoles though, I hope. Anyway my same old ramblings. Thanks for the update!

@lordvass3377 Жыл бұрын

So AMD does not have any reason not to run through the bottom die, as it's likely cache. They could run the power using interposers on the outside and get the best of both worlds without the power interfering.

@A-BYTE94 Жыл бұрын

140B transistors 😮😮😮😮😮

@markvietti Жыл бұрын

AMD should no longer acknowledge Intel...as a kick in the face.. Intel deserves it..

@HighYield Жыл бұрын

Meteor Lake will be Intel's make-or-break moment. If they can execute their version of a chiplet architecture well, they are back in the game.

@burakozc3079 Жыл бұрын

nice laptop chip. 👍🏿

@HighYield Жыл бұрын

With a 5 min battery life :D

@burakozc3079 Жыл бұрын

@@HighYield Gaming laptops don't last much longer.

@pavankumarreddy7888 Ай бұрын

Thanos comparison was 😂😂😂

@ZhoRZh37 Жыл бұрын

yeah, but can it run Crysis?

@ivesennightfall6779 10 ай бұрын

can't wait to find out if this thing runs doom

@GraveUypo Жыл бұрын

Eh... I don't want SoCs. That will just make everything more expensive to upgrade and less modular, and will remove competition and/or options from some markets (like RAM)

@hughjassstudios9688 Жыл бұрын

True, but at some point, the interconnect becomes the bottleneck unless we can make it optical instead of electrical

@OfSheikah Жыл бұрын

sooo is MI300 on SP5?

@HighYield Жыл бұрын

There are rumors of a new "SH5" socket.

@OfSheikah Жыл бұрын

@@HighYield For the MI300 to use that seems sensible. To my casual knowledge, all of those pins SP5 provides might not have been designed for a compute monstrosity of this kind. I had been under the impression that it is SP5, mostly because the mockup from the official MI300 reveal includes the look of the ILM. I have only looked closely at the SP5 board shot on the TYAN website; it looks fairly close but not identical.

@RafaGmod Жыл бұрын

The SoC design for HPC makes a lot of sense! I can see us normal people buying a PC or notebook and needing to upgrade its RAM because of greedy software. But HPC normally goes right where it's needed and doesn't get a lot of upgrades. If the whole system needs an upgrade, they normally change the platform. But OK, when will these chips show up on AliExpress in a sketchy motherboard at a low price? 2028? I can wait with my Zen 3 HAHAHHAHA

@BurningDrake39 Жыл бұрын

Hope Linus manages to make a video on it, I want to see it run games.

@picblick 11 ай бұрын

2:00 "Apple has been a pioneer in this area..." No, no they have not. They licensed ARM just like thousands of companies before. If they had integrated RISC-V, maybe you would have a point. What about Chromebooks? Many of those run on SoCs and they have been around for many years. Is somebody who shows up at some point after all technologies exist a pioneer? Damn, I should do some pioneering.

@MaxKrumholz Жыл бұрын

AMD BEST

@RRsalin Жыл бұрын

Maybe we shouldn't enter the zettascale era. Maybe we should use what we already have and fix climate change (thank you people of the internetz for not responding to this comment if you disagree) Ps: great video.