Is it the Fastest AI Chip in the World? Groq Explained

172,303 views

Anastasi In Tech

1 day ago

Comments: 527
@AnastasiInTech 8 months ago
Friends! NVIDIA's AI conference #GTC24 is in 2 weeks. I'll be attending it to bring you all the news and exciting developments direct from San Jose! You can register right now for free and get a chance to win one of 5 RTX 4090 GPUs from NVIDIA. To participate in the giveaway, register now using this link: nvda.ws/3WzoH6y
@yuugenr7549 8 months ago
Is this chip faster than the new accelerator chip made by China?
@syamkumarkaturi9761 8 months ago
Captain planet use platinum to make pollution zero...All Indian politicians stores more gold and bad people stores more gold so gold price should fall even more..... world should turn to platinum standard no gold standard.. DECREASE GOLD PRICE AND BAN OR STOP IMPORTS AND EXPORTS GOLD..
@DigitalDesignET 8 months ago
Hi Anastasi, love your work. Can you cover AMD VP1902? What is it used for?
@widowmaker7831 8 months ago
It would be a phenomenal victory to see NVIDIA dethroned as the AI king. From a decades-old gamer's point of view: oh, the humble beginnings of GTX GPUs, funded by gamers for decades, only to be betrayed by such a greedy current state of affairs. I sincerely wish this up-and-comer the best.
@simplemanideas4719 8 months ago
Hmm, I'd hoped you would dive into the technology Groq uses to make such fast inference possible. Kind of disappointed.
@lukehaswell3075 4 months ago
Intel and AMD will definitely have their share of the market. TSMC is at max capacity, and investing in other semiconductor companies will be an absolute power move; I keep increasing my shares manageably. Different chips are good at different things, and Nvidia has been very specialised, which leaves other aspects of AI open.
@mondimlotshwa3958 4 months ago
This is the type of in-depth detail on the semiconductor market that investors need, also the right moment to focus on the rewarding AI manifesto.
@lukehaswell3075 4 months ago
Certainly. I bought NVDA shares at $300 and $475, cheap before the 10-for-1 split, and with huge interest I keep adding. I'm currently doing the same for PLTR and AMD. The best possible way to get ahead is participating behind top experienced performers.
@donnahensley2459 4 months ago
How much of their stock do you own? Seems like a lot of your investment is riding on this
@davegustavo4726 4 months ago
You are buying a company to own it, not a piece of paper. The market is a zero-sum game (two sides). Know what you are buying, not just out of trend interest.
@DoughRollers-tb5li 4 months ago
Amazingly, people are starting to get the uniqueness of Palantir.
@DigitalDesignET 8 months ago
Finally Groq is getting the attention it deserves. A software-defined processor, specifically designed for inference.
@lindab.5624 8 months ago
Do you know how to buy or invest in this?
@TreCayUltimateLife 8 months ago
@@lindab.5624 Groq is a privately held company and therefore does not have a public stock price. However, you may access Groq's private market stock price with Forge Data.
@dchdch8290 8 months ago
Not yet possible
@eugeniustheodidactus8890 8 months ago
After hearing Pat Gelsinger say that _Nvidia got lucky and fell into its GPU-based AI dominance_, I recently asked the dumbest question on Earth: _isn't there a dedicated inference chip available in the world?_ The only question now: when does Groq go public?
@SirusStarTV 8 months ago
Wouldn't they be sued for using a Grok-sounding name?
@SureNuf 8 months ago
Just love how well you explain and unpack details, considerations and potentials, like a mini course in each video - Thank you for your excellent analysis and discussions Anastasi.
@JohnMoor1970 8 months ago
Another great video, thank you! I asked Groq how it compared with Graphcore. Here's what it said:

Groq and Graphcore are two companies that specialize in graph neural networks (GNNs) and graph-based machine learning. While both companies share some similarities, there are also some key differences in their approaches and focus areas. Here are some key differences:

1. Approach:
* Groq's approach is based on the concept of "graph attention," which allows the model to selectively focus on certain parts of the graph that are relevant for the task at hand. This allows Groq's models to be more efficient and scalable than traditional GNNs.
* Graphcore's approach is based on the concept of "message passing," which involves iteratively passing messages between nodes in the graph to update their representations. This allows Graphcore's models to capture complex relationships between nodes in the graph.
2. Focus:
* Groq's focus is on developing practical, scalable, and easy-to-use GNN models that can be applied to a wide range of applications, including computer vision, natural language processing, and recommendation systems.
* Graphcore's focus is on developing more fundamental advances in GNNs, such as the development of new algorithms and architectures that can handle complex graph-related tasks.
3. Products:
* Groq's main product is the Groq Engine, a scalable GNN engine that can be used to build and deploy GNN models.
* Graphcore's main product is the Graphcore IPU (Intelligent Processing Unit), a specialized hardware accelerator designed specifically for graph processing.
4. Use cases:
* Groq's models have been applied to a wide range of applications, including image classification, natural language processing, recommendation systems, and fraud detection.
* Graphcore's technology has been applied to a range of applications, including social network analysis, recommendation systems, and natural language processing.
5. Partnerships and collaborations:
* Groq has partnerships with a number of leading organizations, including Google, AWS, and NVIDIA.
* Graphcore has partnerships with a number of leading organizations, including Google, Microsoft, and NVIDIA.

In summary, while both Groq and Graphcore are focused on developing advanced GNN technologies, they differ in their approaches, focus areas, and product offerings. Groq's approach is based on graph attention and is focused on developing practical, scalable, and easy-to-use GNN models for a wide range of applications. Graphcore's approach is based on message passing and is focused on developing more fundamental advances in GNNs, such as new algorithms and architectures for handling complex graph-related tasks.

Any refutes or further insights?
@JamesMBC 8 months ago
I love how Anya always includes diagrams of the chips. Many creators are content with just talking about the topic, and that's fine, but good editing and diagrams with clear explanations make this easier to understand for us mortals outside of the industry. What a cool job, designing what is effectively nanotech for state-of-the-art computing. This channel got me excited for photonics, neuromorphic computing, analog computing and more. I hope to see a video on GAA and even more advanced transistors (if there isn't one already), or maybe even magnonics or spintronics, if there are any significant new updates. Just recently learned about magnonic waves transmitting information faster and with no electrical resistance issues, which sounds really promising (despite the current challenges).
@milliamp 8 months ago
Thanks for researching the context to all the specs. It takes a little more time but I'm grateful for the data.
@JulianFoley 8 months ago
A very insightful analysis and look into the future. Thank you.
@chesterhackenbush 8 months ago
Excellent, informative and beautifully delivered. You really have your finger on the pulse.
@Viewpoint314 8 months ago
This is definitely one of the best channels on modern technology. I will definitely be subscribing.
@bobharris7401 8 months ago
And there’s so much great info too.
@TimiOfDaVoid 8 months ago
Does anyone know how to buy Groq stock? Some websites say that it's not available yet.
@jessedavis5065 8 months ago
Love to see good quality videos thank you chip goddess!
@jscarswell1 8 months ago
I love that. "Chip Goddess" LOL
@adamkadmon6339 8 months ago
Actually, I don't love all your videos, but when you are as detailed and to the point as this, it is REALLY GREAT.
@Intemkt1 8 months ago
I don't know why you don't have billions of followers... Oh wait... Maybe soon. You are amazing at explaining things that I don't understand in a way that makes me understand them. Big thanks! Seriously.
@jaysonp9426 8 months ago
Because we're the only ones who want to know about microchips lol
@allreadynotinuse4169 8 months ago
Ya, sadly I do not think there are enough people interested in this.
@dchdch8290 8 months ago
I think there are, it just takes a while for YT to reach them.
@timmygilbert4102 8 months ago
Not enough SHOCK sword 🗡️ and STUN shield 🛡️ in her title for click bait 😢
@stevenbrenner2862 8 months ago
Will there be artificial intelligence followers? For artificial intelligence to succeed, it will need to interact and cross-reference with many different platforms, including artificial intelligence, apparently graph-based. Currently, it appears AI is based on matching graphs, or selecting the best match. Likely new languages, probably math-based, will need to be developed, with translation of math-based languages into human, analog-based languages.
@stevefreeman6646 8 months ago
Oddly enough, I had no idea this engineering feat was getting attention in the mainstream media. Kudos for that! Born from the U.S.'s dominant days of Sun Microsystems, ASICs have decades of research and proven track records. Using a larger process node makes it cheap for a competitive entry into the industry and establishes a profound growth path for its production processes. I live near the fab plant.
@pacobrezel 8 months ago
Another excellent tech update. Quality over speed and frequency. Love it👌🏼
@JMeyer-qj1pv 8 months ago
There's so much low hanging fruit on speeding up AI, we should see huge progress in the next few years. Fast, cheap, and powerful AI will be everywhere.
@larryc1616 8 months ago
Will make humans and college obsolete
@paul1979uk2000 8 months ago
We're going to see massive progress. We should remember that GPUs were the best we had when it came to accelerating AI, but it was only a matter of time before someone invented a chip designed for that task. The irony is, because of the nature of AI processing, an AI-specific chip is likely far less complicated and a lot cheaper to manufacture than a GPU, which has a lot of fluff around it. So it wouldn't surprise me if many more companies enter the specialised AI chip market, which should drive prices down a lot, fuelled by high demand growth and the current high prices, which I think will get slashed massively in the next 5 or so years.
@NineSeptims 8 months ago
@larryc1616 The good thing is that it will force us to incorporate the technology into ourselves or be left behind.
@paddiman772 8 months ago
Anastasia, though I'm not working in your field of endeavor, I find it easy to understand what you're presenting to us. Continue to grow and flourish in what you do. Our future depends upon it! Ciao!
@EdwardPike 8 months ago
The Groq chip is NOT Grok AI on X, in case anyone was unaware. As of 3/1/24, Groq supposedly has 13x faster inference than Nvidia. However, TCO may be an issue: an ASIC is less flexible over time, and everything is changing quickly. When are we getting Groq/LLM inference ASICs in our smartphones? How does this compare to Google TPUs?
@finestructureconstant3921 8 months ago
Every silicon implementation has pros and cons.
@Mike80528 8 months ago
ASICs just seem to be the better overall design choice for this specific need. But I understand CPUs much better than tensor processors.
@scott701230 8 months ago
Well done! Your best presentation ever!
@xenuburger7924 8 months ago
I recently had a dream about an AI chip with a huge amount of memory on chip. I like how you are tuned in. As you said before, all the cool kids are designing their own silicon.
@MahamadouDiarra-n4f 8 months ago
Think no better than GSIT Gemini APU
@philmisc3513 8 months ago
Another highly informative video. Groq looks like a really promising solution targeting the non-hyper scaler market segment you described. Thank you for the amazing and easily understandable content.
@morenofranco9235 8 months ago
Thanks, Anastasi, for a really remarkable presentation. We live in exciting times.
@Noqtis 6 months ago
Her Name is Anastamariastasia
@jerrymont2595 7 months ago
Thank you, you make some of the best tech videos, indeed.
@bobasawrus 8 months ago
Thank you so much for your comprehensive explanations of the chip and the space. Your effortless traversal of the layers in the explanations is amazing! The magic is the on-chip memory. This is difficult to achieve.
@fangeming1 6 months ago
Very good video, the only one I saw explaining the cons of the technology. Running Llama or Mixtral requires several hundred Groq chips, costing several million USD (one Groq chip costs $18k), compared to less than $10k of Nvidia hardware to run the same model, so the speed gain has a huge cost. To be profitable, Groq will have to put many customers in a queue, and the generation speed will be offset by the time spent queuing. This is already what we experience on the Groq site. Thanks a lot Anastasi for the thorough analysis.
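A back-of-envelope sketch of the cost gap this comment describes. The per-chip price and the GPU-setup figure are the commenter's claims, and the 576-chip count is an assumed round number standing in for "several hundred", not a confirmed spec:

```python
# Rough deployment-cost comparison using the figures claimed in the comment above.
groq_chip_price_usd = 18_000   # claimed price of one Groq chip
groq_chips_needed = 576        # assumed chip count for a large LLM ("several hundred")
nvidia_setup_usd = 10_000      # claimed cost of Nvidia hardware running the same model

groq_total_usd = groq_chip_price_usd * groq_chips_needed
print(f"Groq deployment: ${groq_total_usd:,}")                         # $10,368,000
print(f"Roughly {groq_total_usd // nvidia_setup_usd}x the GPU setup")  # roughly 1036x
```

Even if the chip count is off by 2x in either direction, the conclusion survives: the capital cost sits in the millions, which is why queuing customers matters for Groq's economics.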
@karlrmaier 8 months ago
It is the "Feedback of Competition" that provides both the Information and Motivation which forces continuous improvements in Quality, Service, and Price in free markets. Let the market fight it out, and the human race will be the beneficiary.
@petrari2157 8 months ago
As always, you provided excellent and interesting information :) Thank you very much :D
@pazitor 8 months ago
Thanks, Anastasi. Another good one.
@bladestarX 8 months ago
The real question, given how and why they are able to achieve this speed, is how long until others with the manufacturing capacity can replicate the architecture. It seems that, while impressive, the barrier to entry is very low for companies like Intel and NVIDIA. Groq is essentially optimizing for inference by adding components that naturally reduce latency; others like NVIDIA have yet to start doing this. Nevertheless, it's exciting to see how they were able to improve inference dramatically at 14nm with such a basic architecture. I can't wait to see what the inference speed will be at 4nm or with more advanced components.
@isajoha9962 8 months ago
I hope Groq (LPU) based personal computers will be available within the next 10 years. And that they don't cost as much as a car.
@ПИЦКВлад 8 months ago
❤❤❤ Anastasi, I love your show. Very informative. Thank you. 😘
@LionTurtle3 5 months ago
"Training AI models is a one-time problem. And computing power is getting cheaper and cheaper. But inference is a constant problem, and it's inherently a larger market than training." Light bulb moment with this statement. Thank you for such clear explanations! 🙏
@digitman2010 8 months ago
Groq can increase memory per processing unit for very large models. It's not a big con. Also keep in mind it will be manufactured on 4nm, which will decrease area and increase performance per watt.
@kazedcat 8 months ago
Mixing logic and memory is always less dense than separate chips for logic and memory. Without logic causing interference you can pack memory closer together, not even accounting for the fact that DRAM uses a completely different process, so integrated memory must be either eDRAM or SRAM, both of which are less dense than standard DRAM.
@alertbri 8 months ago
A 1.58-bit LLM at around 3,000 tokens per second. It's going to be wild.
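For scale, a quick sketch of what "1.58-bit" (ternary) weights mean for model size. The 70B parameter count here is an illustrative assumption, not a figure from the video:

```python
import math

# Weight storage for a ternary ("1.58-bit") LLM vs FP16.
params = 70e9                    # assumed example model size (70B parameters)
bits_per_weight = math.log2(3)   # ternary weights {-1, 0, +1} carry ~1.585 bits each
ternary_gb = params * bits_per_weight / 8 / 1e9
fp16_gb = params * 16 / 8 / 1e9
print(f"FP16: {fp16_gb:.0f} GB, ternary: {ternary_gb:.1f} GB")  # FP16: 140 GB, ternary: 13.9 GB
```

A roughly 10x shrink in weight memory is what makes the throughput jump plausible: inference speed on large models is mostly bounded by how fast weights can be streamed to the compute units.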
@BrianFedirko 8 months ago
Asynchronous, or "no synchronous", is an important aspect to consider. It can compress, say, a century of high-speed sample data to produce an "answer" in a bit or a few bytes. It allows many assumed rules to be discarded or relaxed, toward progress in quality of information within efficient time domains. It includes "feedback loops" to both sides of the underlying way a program can re-program itself, while including ways to incorporate "vectors" in time and in space. Any mix-and-match use of these concepts at any level will drive efficiency, and produce answers that create better questions. It is better to know an answer before one needs to consider the questions. Gr8! Peace ☮💜
@gyulanagy5910 2 months ago
14 nm is not only mature, but SRAM doesn't scale with smaller nodes. So they wouldn't get significantly better area, just a higher price.
@cnotation 8 months ago
Been waiting for this.. thank you.
@bfair6284 8 months ago
Now the question is... is Anastasi an AI model? So beautiful and knowledgeable at the same time! 😁😁😁
@Noqtis 6 months ago
DING DING DING you won the internet, gz
@yoyo-jc5qg 8 months ago
What do they do with the thousands of old GPUs when upgrading?
@phrankus2009 8 months ago
SCALE LOCALLY... Decouple the memory but keep it *very* close. Another way to look at it is to distribute processor nodes throughout an air-cooled, 3D "block" of ultra-dense memory. Pack it tight and allow for cooling at every layer of scale. In this way, you can vary the ratio of processor nodes to memory cells, at any layer of scale, as dedicated applications may require.
@HoriaCristescu 8 months ago
I was hoping you were going to mention the most interesting aspect: the synchronous and deterministic architecture, with software-defined memory and network access, all orchestrated by the compiler over many chips acting as one large chip.
@JamesMBC 8 months ago
Awesome tech! I'm particularly interested in how ASICs can help bring on-device inference, instead of cloud platforms. Cloud services are always less attractive to me; offline AI is way more practical for mobile devices and non-work tasks. It's like video games and software in general: I hate live-service payments (looking at you, Office 365) and prefer to pay once upfront. I'm just sick of the list of subscriptions we are already subjected to.
@dennisdonnelly4440 8 months ago
This is an amazing advancement! The efficiency improvement is stunning. Their competitors must be shaking in their boots. Looks absolutely disruptive. 👍👍
@HamguyBacon 8 months ago
Not really, since they are not aiming to replace anyone; rather, they sell processing power to companies as a service.
@HappyPandaBear73 8 months ago
Absolutely Fascinating! Thank you Anastasia for your insights news segment about technology!👍🙏🐲🐉🐼🌏🙂
@MikkoRantalainen 8 months ago
Great point on lowering packaging costs thanks to built-in memory!
@wiredgorilla 8 months ago
Very interesting indeed!
@alb3eer007 8 months ago
Love your analysis, love more your Angelic Soul 🙏🏼
@Jacobk-g7r 8 months ago
The problem is that there are so many differences. It’s hard to integrate them together. As a human are neurons are living in a superposition they are reflecting the difference while maintaining themselves. So it’s actually just everyone’s perspective not aligning just like languages words, and letters not shape the same but still made of the same pencil lead and made it the same stuff so differences can embody differences like a mirror while maintaining themselves. The problem is all the differences are hard to align just like when a bunch of clocks aren’t set right and then we’re trying to use each other‘s clocks to trace back. What’s the right time? So imagine the humans are actually finding things and then leveraging those things so time isn’t the fourth dimension, but it may just be the fourth that we have leveraged or found and as we integrate those differences, we need to re-align ourselves, so that we better reflect what’s really going on. So that’s why no one really knows what’s going on some people have more alignment. Some people have less alignment and that would be like smart and not smart but it’s not a simple as that because every perspective is differences that could possibly matter because others aren’t aligned properly. You perspectives matter, thing on that for a sec. We integrate the perspectives and that’s why we put ourselves in the shoes or align the differences so we can see better or predict better. Hopefully it’s understandable and makes sense. I mean the errors are allowed, anything you think of is just like distance and we fill the details between. I mean like this, we aligned ai by reflecting the differences through the systems similar to the human body and how the difference is integrated. It’s not absorbed but absorption is one way. Reflection is like a key figure here and we gotta align them. It’s like we are backwards lol that’s what aligning ai means
@imusiccollection 8 months ago
I appreciate your objectivity, thank you. It's good to think on a bigger scale and make a bigger picture.
@Jacobk-g7r 8 months ago
@imusiccollection Thank you, most comments tell me I'm wrong, but that's allowed. There are no errors.
@Bassotronics 8 months ago
No chip more lovely than an Anastasi chip. 👍🏻
@karlweaver2597 8 months ago
Great comments, you have given us valuable information! Keep it up. Karl
@Minneolaos 8 months ago
The 4-nanometer chip will be near-instantaneous and will allow models to change the output. You could have "filters" for terse, elaborate, sad, happy, cute etc., so emotiveness and all kinds of alterations can be added on top of the raw output. At that point there will be both the raw output and the filtered/altered output available. Most people will want to change the mood and hear the altered output that has gone through some kind of model. Every person has a personal taste, so the mood can be fine-tuned.
@erniea5843 8 months ago
Awesome, I was hoping you would cover Groq
@christopherleubner6633 8 months ago
When your chip is an entire 12-inch wafer... I can only imagine the heat it would make running at full load, and also the reject rate. 😮😮😮
@Steamrick 8 months ago
Apparently it's 23kW sustained per 'chip'. But the yield should be almost 100% because of built-in redundancies to deal with manufacturing defects.
@miinyoo 8 months ago
I do training and inference at home on open source platforms with just one 3090. Some things do take a very long time but most can be done especially on stable diffusion in seconds to minutes for reasonably complicated things. Difference is, I'm only doing it for one person's imagination, mine and I don't mind walking away to quite literally go for a walk while something heavy is churning. I cannot imagine even with 80GB A100s or newer just how many GPUs nor how many megawatt hours nor how many megaBTUs of cooling you'd need to train and serve something as huge as GPT4, Gemini, Grok while serving hundreds of thousands of people simultaneously 24/7.
@springwoodcottage4248 8 months ago
Super well presented, interesting, and topical. The primary issue for all Nvidia competitors is the risk of trouble for these small companies leading to supply issues, and potentially no supply if a smaller player like this one is taken over, whereas Nvidia has scale, making its systems more attractive and able to command better margins. For one of these new entrants to succeed, it must be clearly better, cost less, and be robustly funded and stable. Still, it can happen, and an ASIC is a great way to go, giving impressively low latency due to the combination of processor and memory. It will be interesting to see who wins in this great race to enhance human minds with artificial intelligence. Thank you for sharing!
@melchiorhof6557 8 months ago
Great to hear about the Groq hardware. But I find the Grok AI model's name a very confusing choice. Maybe explain that they are not the same.
@ted_van_loon 8 months ago
The working of that chip sounds largely like something I suggested many years ago, also for compute scenarios similar to AI, even though back then it wasn't specifically aimed at AI, since that wasn't really done other than in early tests or at very small scale.
@SimonGimm 8 months ago
Great video, thank you. Just for your future reference, SK Hynix is read S-K "Hai-nix", not "skiinks"... lol
@Lowlin85 8 months ago
Great video. Informative as always 😊
@SaymonFernandes 8 months ago
Great Content ! Thank you.
@ETRdotTV 8 months ago
Very clear presentation!
@alaad1009 8 months ago
This is an excellent video!
@drewbizdev 8 months ago
Another excellent video Anastasi. Keep up the good work! 🌺👍🌺
@digitalclips 8 months ago
Interesting! I have not heard of GF for a long time. Remember 2014 when GF's stock temporarily shot up on rumors of Apple using them for the iPhone?
@take5th 8 months ago
It's amazing how much money and time is spent on improving response times from slightly too slow to quicker than you can think. This is true in all fields, I guess to keep workflow intact and to waste as little time as possible, time being the only thing you cannot acquire.
@guest578 8 months ago
11:45 every time I hear that "more than you know" by axwell Λ ingrosso starts playing.
@ps3301 8 months ago
We need it on our phone !!
@Gusman007 8 months ago
Groq will clean up with LLM hardware requirements for robots using its LPU. No competition yet! Of course, prototypes are easy and production is hard, but from what I understand they are already in production.
@equerol1396 8 months ago
Thank you.
@petereriksson7166 8 months ago
Very good video. Good job.
@pan_salceson 8 months ago
Damn, you're so good at talking and teaching about this!
@uTubeJRod 8 months ago
Anastasi!! How come all of the chip manufacturers weren't already doing on-chip memory? It seems like such an obvious engineering decision! I would love to know! You mentioned that large-memory computations would be at a disadvantage, but most, or at least much, of the computing we've been doing until recently hasn't needed such large amounts of memory. I hope you answer this! Thank you for your informative videos; I was really looking forward to your insights on Groq!!
@kornydad14 8 months ago
Cost: on-chip memory (HBM) is very expensive, and from what I have heard, the per-chip yield for the accompanying processor is lower as well.
@scottaseigel5715 8 months ago
We're at the point that we need some inference functionality co-located with the user. Phone companies, app devs, and servers will then be able to offload just enough AI processing to vastly enhance user experience and workflow efficiency. I estimate a little of this technology will make its way into our smartphones in only 1-3 years (one, or at most two, phone releases away).
@msalmanpour6288 8 months ago
Nice, I'm glad that outsourced technologies are finally coming back to the US.
@KashmirParadise786 8 months ago
Thanks for the information
@paul1979uk2000 8 months ago
I'm sure we all know that rivals are more or less going to do much the same thing to rob them of that advantage, probably even integrating it into CPUs or GPUs. On the plus side, this could have a big impact on reducing the price of AI chips, which are distorting the entire GPU market at the moment. From what I understand of it, on performance per watt it's already much better than what Nvidia offers. If so, and considering that this is 14nm, I do wonder how much faster and lower-power the 4nm version will be. I also think that because the task this chip does is far more streamlined than a CPU or GPU, a lot of rivals will be able to enter the market much more easily than the traditional CPU and GPU markets, especially because for AI the drivers are much more streamlined and easier to do. That's great news, because in the coming years it will very likely put a lot of downward pressure on prices while boosting performance, and with that we might end up with some kind of AI chip we can slot into a consumer-grade PC. I think locally run AI models are going to be far more important than the online centralised models for privacy and security reasons: as AI becomes more capable and useful, it will be more involved in our daily lives, and I suspect for most of us, as well as for businesses, it wouldn't be that appealing to share so much data with an online central database.
@jigglypuff4227 8 months ago
Anastasi, you are such an amazing human being
@MrKelaher 8 months ago
Hopefully they will do an edge chip for running 7B models.
@aGj2fiebP3ekso7wQpnd1Lhd 8 months ago
Great video!
@AmandaFessler 8 months ago
Subbed, because this is the most depth I've seen on Groq in a lot of my searching. Also because you're a chip specialist. I guess my question is: can we expect consumer-grade inference chips any time soon? The reality of how many Groq chips it takes to run Mixtral right now is... disheartening.
@jason.arthur.taylor 8 months ago
Get real. She has no idea how the chip works.
@AmandaFessler 8 months ago
@jason.arthur.taylor Oh, I see. That's disappointing. Do you know any actual chip experts you could recommend?
@jason.arthur.taylor 8 months ago
@AmandaFessler This guy explains the chip: kzbin.info/www/bejne/l6DJeJuPach-qdk
@jason.arthur.taylor 8 months ago
@AmandaFessler Were you able to see my reply? Sometimes I'm censored, especially if I give a web site.
@AmandaFessler 8 months ago
@jason.arthur.taylor Unfortunately no... not even a notification. I guess it was censored. Maybe the name of the site will do?
@guilhermepimenta_prodabel 8 months ago
The challenge is to reduce the software-to-hardware time when the hardware defines the software design. In my opinion, we need a more general hardware platform that is scalable and cheaper. The solution is to create an open hardware architecture like RISC-V, with a lot of collaborators, to create a new open neural network unit. Data processing is a very complex systems engineering problem because of the old problem of transferring data from high-latency to low-latency memory to be processed and then stored back. In the recent past, huge processing problems were not so data-oriented, so more processing power meant more information, but now the problem of the memory hierarchy is back. We need more memory near the processing units, and less processing power, to reduce latency.
@PaR2020
@PaR2020 8 months ago
You don't need 2x H100 to run the Mixtral 8x7B model. 2x RTX 3090 at 4-bit precision runs it at 30 tokens/s (yes, lower precision, but it works very well anyway). A 10 TB model will probably not be needed. Specialized models, specialized in the same way this hardware is, will be more efficient and more popular.
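The commenter's claim checks out on paper. A back-of-the-envelope VRAM estimate (the parameter count is Mixtral's published ~46.7B total; the overhead factor for KV cache and activations is an assumption) shows why fp16 needs H100-class memory while 4-bit fits two 24 GB consumer cards:

```python
# Approximate VRAM needed to host a model's weights, with a fudge
# factor for KV cache and activation buffers (assumed 1.2x).
def model_vram_gb(params_billion, bits_per_weight, overhead=1.2):
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

mixtral = 46.7  # Mixtral 8x7B total parameters, in billions
print(f"fp16:  {model_vram_gb(mixtral, 16):.0f} GB")  # needs 2x 80 GB H100
print(f"4-bit: {model_vram_gb(mixtral, 4):.0f} GB")   # fits 2x 24 GB RTX 3090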
@jonteno
@jonteno 4 months ago
Sounds perfect for simpler AI agents. I see nothing keeping these models from working alongside other technologies that could more easily handle bigger, harder operations.
@vi6ddarkking
@vi6ddarkking 8 months ago
It won't be long now. We'll have a CPU, GPU, and LPU on our motherboards. And considering recent developments in AI training, I fully expect to be able to run trillion-parameter models locally before the end of the decade.
@yagoa
@yagoa 8 months ago
For casual use, slower text is nicer IMO, but for multi-AI teams it will be great!
@leematthews6812
@leematthews6812 8 months ago
I think Anastasi should bring back the dreamy background music she used to close her videos with 🙂
@DeckEdge
@DeckEdge 8 months ago
Great challenges at the end. I wonder how many will be solved going from the 14 nm chip to the 4 nm chip you mentioned at the beginning.
@Noqtis
@Noqtis 6 months ago
about 9.73 pipi will be solved by 4nm
@moozoowizard
@moozoowizard 8 months ago
It takes 8 chips to run TinyLlama, and the PCIe board they sell has one chip, so that's 8 PCIe cards. And I believe each is priced at something like the cost of an H100. So we're a very long way from being able to run anything useful on a standard PC. 🤞 One day they'll be able to fit 578 Groq chips into one chip and sell them at 4090 prices. We just need to wait for a 0.44 nm process node.
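The chip counts above follow from Groq's architecture: all weights must live in on-chip SRAM, and the GroqChip has about 230 MB of it. A minimal sketch of that constraint (weights only, 8-bit assumed; real deployments need additional chips for activations and pipelining, which is why actual counts come out higher than this lower bound):

```python
import math

SRAM_PER_CHIP_MB = 230  # published GroqChip on-chip SRAM capacity

def chips_needed(params_billion, bits_per_weight=8):
    """Lower bound on chips required just to hold the weights in SRAM."""
    weight_mb = params_billion * 1e9 * bits_per_weight / 8 / 1e6
    return math.ceil(weight_mb / SRAM_PER_CHIP_MB)

print(chips_needed(1.1))   # TinyLlama 1.1B at 8-bit: 5 chips (weights alone)
print(chips_needed(46.7))  # Mixtral 8x7B at 8-bit: 204 chips (weights alone)
```

The gap between this weights-only bound and the deployed counts (8 chips for TinyLlama, hundreds for Mixtral) is the overhead the commenter is pointing at.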
@dm3035
@dm3035 8 months ago
LOVE YOUR VIDEOS - VERY INFORMATIVE - AT THE CHIP nano LEVEL 🤗🤗🤗 DO NOT STAY AWAY FOR LONG PERIODS OF TIME - MISS YOU 🤗🤗🤗
@MozartificeR
@MozartificeR 8 months ago
14 nanometers to 4 nanometers is huge. That's something to keep your ear to the ground about :) I wonder if they will ever do a hybrid of Groq and Cerebras?
@MahamadouDiarra-n4f
@MahamadouDiarra-n4f 8 months ago
@Anastasia, does the Groq chip violate patents held by GSIT regarding their Gemini APU?
@zbaktube
@zbaktube 8 months ago
About the Groq chip: it outperforms Nvidia mainly because of the on-chip RAM, doesn't it? How does it handle the heat? It is easier to cool a 14 nm chip, isn't it? Could harder cooling at 4 nm make the next version of this chip less efficient?
@volome
@volome 8 months ago
First, thanks for your wonderful video. Groq can always add more memory in future designs, and it seems the software stack is flexible enough to stay compatible with future hardware without much development. At the same time, LLMs are getting smaller and moving toward more specialized models selected by a routing model. If Groq can exploit this, it will be perfect. And as you said, small and mid-size enterprises that want to keep their data local are a huge market. A small business doesn't need 10 trillion parameters for its internal data. Yes, it requires more chips than GPUs, but they are easier to manufacture and don't need to stand in line at the 1-nanometer factories. And the flexible router is built into the Groq chip, so no expensive switches are needed. Finally, it's a huge saving in electrical power. NVIDIA should buy Groq now before they get too large. Jerry
@MikkoRantalainen
@MikkoRantalainen 8 months ago
Running low-latency AI chips in the cloud seems a bit weird, because the round trip from your own system to the cloud always adds extra latency. I guess you should locate your servers with the same cloud provider as the Groq service provider.
@oldtools
@oldtools 8 months ago
The interconnect for the Groq architecture would need a ridiculous amount of bandwidth to retain any compute advantage for LLMs anywhere near the size of even yesterday's flagship models.
@crhu319
@crhu319 5 months ago
The focus is inference delivery.
@oldtools
@oldtools 5 months ago
@@crhu319 I see this. thanks.
@MikkoRantalainen
@MikkoRantalainen 8 months ago
Scaling to 100T models is going to be hard as long as accelerator cards have 10-100 GB of RAM; you'd really want cards with 1-50 TB.
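The gap the commenter describes is easy to quantify. Weight storage alone for very large models, compared against a typical 80 GB accelerator card (the card size and precisions below are illustrative assumptions):

```python
# How many 80 GB cards does it take just to hold the weights?
CARD_GB = 80

for params_t in (1, 10, 100):        # model size in trillions of parameters
    for bits in (16, 4):             # fp16 vs aggressive 4-bit quantization
        gb = params_t * 1e12 * bits / 8 / 1e9
        print(f"{params_t:>3}T @ {bits:>2}-bit: {gb:>9,.0f} GB "
              f"(~{gb / CARD_GB:,.0f} cards)")
```

Even at 4-bit, a 100T model needs on the order of 50 TB for weights alone, which is why the commenter's "1-50 TB per card" wish is the right order of magnitude.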
@ItsTristan1st
@ItsTristan1st 8 months ago
The memory limit is a killer. Tightly coupled small RAM is not a new approach, and the problems are well understood. The big question is whether they can develop their algorithms to cope with the limitation; however, this is typically not a simple problem to solve and leads into combinatorial-optimization scenarios. Generally, the long-term solution to that sort of problem is always just to add more RAM. Ironically, I could see this Groq chip working far better for canned problems than for the general applications it is being targeted at. Let's see what happens.
@yoshikimatsui6384
@yoshikimatsui6384 8 months ago
I would like to think of a semiconductor that operates only by the vibrations of the stone. Is it possible to accelerate the flow of electrons in a semiconductor using just the vibrations of the stone? If we do that, I think we can create a flow of electrons within the semiconductor with minimal current and voltage.
@yewhanlim8916
@yewhanlim8916 8 months ago
Fact: memory, which needs to hold charge, does not scale down as well as logic circuits. Terabytes of memory are better off-chip. We can still have some on-chip cache.
@marcvayn
@marcvayn 8 months ago
Do you see Nvidia following a similar path? It is odd that a company with about $5M in revenue and antiquated 14 nm tech can run circles around the latest H100s for inference.