Each Nvidia core consumes 500 W; how much does your whole wafer consume?
@MARKXHWANG · 5 days ago
The problem is that you cannot leverage 3D-packaged HBM memory.
@incaseidie · 6 days ago
This is so interesting, and it really excites me that I can still find stuff like this on YouTube.
@JMCV2 · 7 days ago
I'm trying your online inference, the speed is insane
@basarisizlikbasartir · 10 days ago
Very good
@genemoore905 · 20 days ago
When do you project the IPO happening?
@trustme9486 · 3 days ago
Not for years, because there is too much negative news around G42, and Cerebras needs to increase its customer base before being taken seriously.
@sitechca70 · 28 days ago
Can you please show some real hardware? We've seen nothing but that dummy mechanical wafer for months, and at most a glimpse of a server rack with a mechanical cover blocking most of the view. Everything else is PowerPoint.
@TheIntelligentVehicle · 29 days ago
Super cool tech and explanation, thank you! But isn't the bottom line TCO? If, for all those challenges, Nvidia outperforms Cerebras on TCO, I'm not sure the customers care. I would definitely love a video focusing more on how Cerebras's technical advantages translate into customer value and competitive advantages. To be fair, this video definitely hints at one, which is that Nvidia's challenges may only compound from here with future generations relative to Cerebras's approach, though again, it would be nice to get a little more insight into that. Wishing you guys all the best, though. If the bottom line is TCO, I certainly have no vested interest in Nvidia holding any particular market share, and certainly not a dominant one.
@rupertsmith6097 · 29 days ago
Good talk, thanks for all the details.
@Paolo1964 · 1 month ago
Great video.
@alexcastas8405 · 1 month ago
12:57 You still need to get information (bits) in and out of that memory, though, right? The software would need to 'efficiently' place it in the correct memory module for the correct core to process, given the large die size, so inputs and outputs from across the chip would need to be managed to do this? Sorry, I'm not a chip designer or electrical engineer or software engineer, haha, but this is all very interesting!
@iandanforth · 1 month ago
This sounds like a very promising strategy. It will be interesting to know if a fine tuned model ever completely replaces the large system prompt!
@stgeorgetalk9849 · 2 months ago
Is memory bandwidth such a big deal when external bandwidth somewhere else will obviously be the next bottleneck? For example, you would need to build a complete SoC with memory in a single-die package, but you would still be limited by the fastest Ethernet protocol and connection methodology, limiting usefulness.
@stgeorgetalk9849 · 2 months ago
Oof... 10 to 15 µm external connections on a separate silicon wafer! That is a challenge. Not sure how AMD is doing it, but as with any new tech, innovative new solutions will need to be found. Great video, guys.
@yotubecreators47 · 2 months ago
Q: What are weights and biases? A: "Anybody who's an ML engineer is very familiar with weights and biases." I had to open the video and watch ads just to hear these words; those ten words could have been used to say something more interesting about weights and biases.
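If the video meant the ML concept (rather than the MLOps company of the same name), weights and biases are simply a layer's learned parameters. A minimal sketch with made-up, purely illustrative numbers:

```python
import numpy as np

# A linear layer computes y = W @ x + b.
# W (the weights) scales and mixes the inputs; b (the biases) shifts the result.
# Training adjusts W and b to reduce the model's error.
W = np.array([[0.5, -1.0],
              [2.0,  0.0]])  # weights (2x2, invented for illustration)
b = np.array([1.0, -0.5])    # biases
x = np.array([2.0, 3.0])     # input vector

y = W @ x + b
print(y)  # y is [-1.0, 3.5]
```

Stacking many such layers, with nonlinearities between them, is what the model's billions of parameters amount to.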
@yangYang-df2zy · 2 months ago
Is the 44 GB of SRAM distributed across the wafer enough for data fetching in training and inference? It seems that Cerebras removes the HBM or DDR and instead loads data from NVMe directly into its on-wafer SRAM?
@harveywang8020 · 2 months ago
Will we have similar demos at the NYC event?
@jhockey11liu91 · 2 months ago
Niche company. Can go nowhere
@jhockey11liu91 · 2 months ago
Just like I imagined. But do you make the memory yourself?
@opelfrost · 2 months ago
WeWork on cloud 😂
@Ligeia-La · 2 months ago
10 years behind Nvidia - so sorry
@bretthowser6777 · 2 months ago
I guess the CEO Andrew Feldman is no longer doing videos?
@degenplanet · 2 months ago
Nvidia faced an antitrust investigation back in 2004, and now the same thing is happening here in France in 2024.
@timothyweakly2496 · 2 months ago
Hoping to collaborate with your company; I hoped your hardware could go into my product and, Lord willing, our data center.
@hg6996 · 2 months ago
If this WSE is really that good, why is still nobody talking about Cerebras AI while Nvidia is still printing money?
@jhockey11liu91 · 2 months ago
Because they are f-u-c-k up
@Marqui17 · 2 months ago
Because today's biggest models don't fit on one Cerebras chip
@hg6996 · 2 months ago
@@Marqui17 Hmm. So it's not possible to put together more of them in order to make the models fit on such a system?
@Marqui17 · 2 months ago
@@hg6996 I guess you should be able to interconnect them and split the model across them, but then you are introducing the same complexities Nvidia has, taking away Cerebras' main advantage
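The capacity arithmetic behind this exchange can be sketched quickly. It assumes the 44 GB of on-wafer SRAM quoted elsewhere in the thread and 2 bytes per parameter (FP16), and it ignores activations, KV cache, and Cerebras' actual weight-streaming architecture, which works differently, so treat it as a rough upper-bound illustration only:

```python
import math

SRAM_PER_WAFER_GB = 44   # figure quoted in this thread; treat as an assumption
BYTES_PER_PARAM = 2      # FP16

def wafers_needed(params_billions: float) -> int:
    """Wafers required if all weights had to sit in on-wafer SRAM."""
    weight_gb = params_billions * BYTES_PER_PARAM  # 1e9 params * 2 B = 2 GB per billion
    return math.ceil(weight_gb / SRAM_PER_WAFER_GB)

for size in (8, 70, 405):
    print(f"Llama {size}B: ~{wafers_needed(size)} wafer(s) for weights alone")
```

By this rough count, a 405B-class model would span well over a dozen wafers, which is exactly the multi-chip coordination problem the reply above describes.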
@Sam-vn9jm · 2 months ago
😍
@ramakarl · 2 months ago
Nice video. Curious about one part: you mention having logic and memory (L/M, at 12:00) right next to each other as a great innovation. Yet how is this any different from NV shared memory, that is, fast memory local to a group of cores? Could you elaborate on why your logic-memory design is better?
@taota3681 · 3 months ago
Simple question: do you believe in microservice or monolithic design? Even if I believe everything said here is right for now, I would still choose a modular design. Memory and CPU can develop differently, and in the future will probably come in different sizes made from different materials. Now you're trying to combine them in one piece, probably from the same material. I am not a chip expert, but I feel modular design will always have its better uses.
@IvanMeouch · 3 months ago
great video, I'm waiting for the IPO
@andersonsystem2 · 3 months ago
This is very good. I could use this for my YouTube video promotion and marketing. This is a game changer, and its output is fast ❤ 0:07
@MrTravolta24 · 3 months ago
Great talk. Very informative.
@TheShepherdsChapelChannel · 3 months ago
When can we get access to Cerebras AI tools? That would be great!
@TheShepherdsChapelChannel · 3 months ago
Great presentation.
@TheShepherdsChapelChannel · 3 months ago
When are we getting a Cerebras GPU for consumer gaming and workstations, to compete with Nvidia in gaming, VR, AI, etc.?
@jose-lael · 3 months ago
Never; each of these chips costs more than most people's net worth.
@MuhammedAjmalAp · 3 months ago
Bravo... why is nobody talking about this miracle?
@gamedevunity3dliviopanizzi170 · 3 months ago
They could create a self-aligning socket instead of one on a flat surface.
@N3omega · 3 months ago
Dude, it's like a layered pyramid
@jp5862 · 3 months ago
What about Gemma 2? Could a model like Gemma 2B be used in a home robot? Instead of memorizing everything, it could fine-tune overnight, like a person sleeping, sorting the day's events by importance and by negative/positive sentiment before morning. Light, fast, and learning.
@jimmyjoesmith7452 · 3 months ago
Very informative, thank you
@jimmyjoesmith7452 · 3 months ago
I am very interested in your company as an investor. Please send me your notifications for all your updates.
@jimmyjoesmith7452 · 3 months ago
Please send me updates. Thanks
@genemoore905 · 3 months ago
What is this?
@kohchoy5695 · 3 months ago
Hi Sean, what is the yield in production of this chip? Would appreciate a reply. Thank you, KC
@felicialynch35663 · 3 months ago
This demo really showcases the potential of AI in marketing. It's interesting to see how different tools are evolving in this space. I recently came across Myko Assistant, which streamlines info searches without needing complicated setups. It's refreshing to see tools that enhance productivity so seamlessly.
@CaseyKoh · 3 months ago
What is the yield of that wafer, sir? Thank you
@christopherkeates4147 · 3 months ago
Incredible work. How do you scale a trained model down so that you can put it in something smaller and run inference real-time for control of a system?
@dennissinitsky9779 · 4 months ago
Very interesting presentation. But I am sure Nvidia experts might have a different story. And I think software support for AI processors will also be critical in the future. Nvidia has CUDA behind it, which is a huge bonus for them. In fact, the reason RISC stayed in the shadow of Intel's x86 architecture is precisely software-stack issues. Nevertheless, one cannot help but admire the Cerebras team for their vision and innovation. If anyone knows a good book or review article about modern GPU/NPU/TPU architecture (not CPU), please post it in reply to my comment; I would really appreciate it!
@TheBestNameEverMade · 4 months ago
What happened to the Qualcomm deal they had for inference? Perhaps it's used for smaller models?
@MediaCreators · 4 months ago
The Cerebras inference speed difference between Llama 3.1 8B and 70B is roughly 4x. As 405B is about 5.8x the size of 70B, will the tokens/s also be divided by 5.8 compared to 70B (reaching roughly 75 to 85 T/s)? In short: is the inference speed linear in the size of the model?
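The back-of-envelope in this question can be written out directly. The ~450 tokens/s figure for Llama 3.1 70B is an assumption consistent with the 75-85 T/s range implied above, not a published number from the video; real throughput rarely scales perfectly linearly because of batching, interconnect, and kernel effects:

```python
# If tokens/sec were inversely proportional to parameter count,
# a 405B model would run at tps_70b / (405 / 70).
tps_70b = 450.0            # assumed 70B throughput (tokens/sec); illustrative
size_ratio = 405 / 70      # ~5.79x more parameters

tps_405b_linear = tps_70b / size_ratio
print(f"Linear-scaling estimate for 405B: ~{tps_405b_linear:.0f} tokens/sec")
# ~78 tokens/sec, inside the 75-85 range the comment suggests
```

Whether the real system lands above or below this line depends on how far inference is memory-bandwidth-bound versus limited by other factors.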
@iandanforth · 4 months ago
Very, very cool. Have you seen anyone use your engine for deeper/faster MCTS or Tree-of-Thoughts-style inference?