Friends! NVIDIA’s AI conference #GTC24 is in 2 weeks. I’ll be attending to bring you all the news and exciting developments direct from San Jose! You can register right now for free and get a chance to win one of 5 RTX 4090 GPUs from NVIDIA. To participate in the giveaway, register now using this link: nvda.ws/3WzoH6y
@yuugenr7549 8 months ago
Is this chip faster than the new accelerator chips made in China?
@syamkumarkaturi9761 8 months ago
Captain planet use platinum to make pollution zero...All Indian politicians stores more gold and bad people stores more gold so gold price should fall even more..... world should turn to platinum standard no gold standard.. DECREASE GOLD PRICE AND BAN OR STOP IMPORTS AND EXPORTS GOLD..
@DigitalDesignET 8 months ago
Hi Anastasi, love your work. Can you cover AMD VP1902? What is it used for?
@widowmaker7831 8 months ago
It would be phenomenal to see NVIDIA dethroned as the AI king. From a decades-old gamer's point of view, oh, the humble beginnings of GTX GPUs, funded by gamers for decades, only to be betrayed by such a greedy current state of affairs. I sincerely wish this up-and-comer the best.
@simplemanideas4719 8 months ago
Hmm, I was hoping you would dive into the technology Groq uses to make such fast inference possible. Kind of disappointed.
@lukehaswell3075 4 months ago
Intel and AMD will definitely have their share of the market. TSMC is at max capacity, and investing in other semiconductor companies will be an absolute power move; I keep increasing my shares manageably. Different chips are good at different things, and Nvidia has been very specialised, which leaves other aspects of AI open.
@mondimlotshwa3958 4 months ago
This is the type of in-depth detail on the semiconductor market that investors need, also the right moment to focus on the rewarding AI manifesto.
@lukehaswell3075 4 months ago
Certainly. I bought NVDA shares at $300 and $475, cheap before the 10-for-1 split, and with huge interest I keep adding. I'm currently doing the same for PLTR and AMD. The best possible way to get ahead is participating behind top experienced performers.
@donnahensley2459 4 months ago
How much of their stock do you own? Seems like a lot of your investment is riding on this
@davegustavo4726 4 months ago
You are buying a company to own it, not a piece of paper. The market is a zero-sum game (2 sides). Know what you are buying, not just out of trend interest.
@DoughRollers-tb5li 4 months ago
Amazingly, people are starting to get the uniqueness of Palantir.
@DigitalDesignET 8 months ago
Finally Groq is getting the attention it deserves. A software-defined processor, specifically designed for inference.
@lindab.5624 8 months ago
Do you know how to buy or invest in this?
@TreCayUltimateLife 8 months ago
@@lindab.5624 Groq is a privately held company and therefore does not have a public stock price. However, you may access Groq's private market stock price with Forge Data.
@dchdch8290 8 months ago
Not yet possible
@eugeniustheodidactus8890 8 months ago
After hearing Pat Gelsinger say that _Nvidia got lucky and fell into its GPU-based AI dominance..._ I recently asked the dumbest question on Earth: _isn't there a dedicated inference chip available in the world?_ The only question now: when does Groq go public?
@SirusStarTV 8 months ago
Wouldn't they be sued for using a Grok-sounding name?
@SureNuf 8 months ago
Just love how well you explain and unpack details, considerations and potentials, like a mini course in each video - Thank you for your excellent analysis and discussions Anastasi.
@JohnMoor1970 8 months ago
Another great video - thank you! I asked Groq how it compared with Graphcore - here's what it said:

Groq and Graphcore are two companies that specialize in graph neural networks (GNNs) and graph-based machine learning. While both companies share some similarities, there are also some key differences in their approaches and focus areas. Here are some key differences:

1. Approach:
* Groq's approach is based on the concept of "graph attention," which allows the model to selectively focus on certain parts of the graph that are relevant for the task at hand. This allows Groq's models to be more efficient and scalable than traditional GNNs.
* Graphcore's approach is based on the concept of "message passing," which involves iteratively passing messages between nodes in the graph to update their representations. This allows Graphcore's models to capture complex relationships between nodes in the graph.

2. Focus:
* Groq's focus is on developing practical, scalable, and easy-to-use GNN models that can be applied to a wide range of applications, including computer vision, natural language processing, and recommendation systems.
* Graphcore's focus is on developing more fundamental advances in GNNs, such as the development of new algorithms and architectures that can handle complex graph-related tasks.

3. Products:
* Groq's main product is the Groq Engine, a scalable GNN engine that can be used to build and deploy GNN models.
* Graphcore's main product is the Graphcore IPU (Intelligent Processing Unit), a specialized hardware accelerator designed specifically for graph processing.

4. Use cases:
* Groq's models have been applied to a wide range of applications, including image classification, natural language processing, recommendation systems, and fraud detection.
* Graphcore's technology has been applied to a range of applications, including social network analysis, recommendation systems, and natural language processing.

5. Partnerships and collaborations:
* Groq has partnerships with a number of leading organizations, including Google, AWS, and NVIDIA.
* Graphcore has partnerships with a number of leading organizations, including Google, Microsoft, and NVIDIA.

In summary, while both Groq and Graphcore are focused on developing advanced GNN technologies, they differ in their approaches, focus areas, and product offerings. Groq's approach is based on graph attention and is focused on developing practical, scalable, and easy-to-use GNN models for a wide range of applications. Graphcore's approach is based on message passing and is focused on developing more fundamental advances in GNNs, such as new algorithms and architectures for handling complex graph-related tasks.

Any refutes or further insights?
@JamesMBC 8 months ago
I love how Anya always includes diagrams of the chips. Many creators are content with just talking about the topic, and that's fine, but good editing and diagrams with clear explanations make this easier to understand for us mortals outside the industry. What a cool job, designing what is effectively nanotech for state-of-the-art computing. This channel got me excited about photonics, neuromorphic computing, analog computing and more. I hope to see a video on GAA and even more advanced transistors (if there isn't one already), or maybe even magnonics or spintronics, if there are any significant new updates. I just recently learned about magnonic waves transmitting information faster and with no electrical resistance issues, which sounds really promising (despite the current challenges).
@milliamp 8 months ago
Thanks for researching the context to all the specs. It takes a little more time but I'm grateful for the data.
@JulianFoley 8 months ago
A very insightful analysis and future search. Thank you.
@chesterhackenbush 8 months ago
Excellent, informative and beautifully delivered. You really have your finger on the pulse.
@Viewpoint314 8 months ago
This is definitely one of the best channels on modern technology. I will definitely be subscribing.
@bobharris7401 8 months ago
And there’s so much great info too.
@TimiOfDaVoid 8 months ago
Does anyone know how to buy Groq's stocks? Some websites say that it's not available yet.
@jessedavis5065 8 months ago
Love to see good quality videos thank you chip goddess!
@jscarswell1 8 months ago
I love that. "Chip Goddess" LOL
@adamkadmon6339 8 months ago
Actually I don't love all your videos, but when you are so detailed and to the point like this, it is REALLY GREAT.
@Intemkt1 8 months ago
I don't know why you don't have billions of followers... Oh wait... Maybe soon. You are amazing at explaining things that I don't understand in a way that makes me understand them. Big thanks! Seriously.
@jaysonp9426 8 months ago
Because we're the only ones who want to know about microchips lol
@allreadynotinuse4169 8 months ago
Yeah, sadly I do not think there are enough people interested in this.
@dchdch8290 8 months ago
I think there are, it just takes a while for YT to reach them.
@timmygilbert4102 8 months ago
Not enough SHOCK sword 🗡️ and STUN shield 🛡️ in her title for click bait 😢
@stevenbrenner2862 8 months ago
Will there be artificial intelligence followers? For artificial intelligence to succeed, it will need to interact and cross-reference with many different platforms, including artificial intelligence, apparently graph-based. Currently, it appears AI is based on matching graphs, or selecting the best match. Likely new languages, likely math-based languages, will need to be developed, with translation of math-based languages into human, analog-based languages.
@stevefreeman6646 8 months ago
Oddly enough, I had no idea this engineering feat was getting attention in the mainstream media. Kudos for that! Born from the U.S.'s dominant days of Sun Microsystems, ASICs have decades of research and proven track records. Using a larger process node makes it cheap for a competitive entry into the industry and establishes a profound growth path for its production processes. I live near the fab plant.
@pacobrezel 8 months ago
Another excellent tech update. Quality over speed and frequency. Love it👌🏼
@JMeyer-qj1pv 8 months ago
There's so much low hanging fruit on speeding up AI, we should see huge progress in the next few years. Fast, cheap, and powerful AI will be everywhere.
@larryc1616 8 months ago
Will make humans and college obsolete
@paul1979uk2000 8 months ago
We're going to see massive progress. We should remember that GPUs were the best we had when it came to accelerating A.I., but it was only a matter of time before someone invented a chip designed for that task. The irony is, because of the nature of A.I. processing, an A.I.-specific chip is likely far less complicated and a lot cheaper to manufacture than a GPU, which has a lot of fluff around it. So it wouldn't surprise me if many more companies enter the A.I.-specialised chip market, which should drive the price down a lot, and what will fuel that is the high growth in demand and the current high prices, which I think will get slashed massively in the next 5 or so years.
@NineSeptims 8 months ago
Good thing is that it will force us to incorporate the technology into ourselves or be left behind.@@larryc1616
@paddiman772 8 months ago
Anastasia, though I'm not working in your field of endeavor, I find it easy to understand what you're presenting to us. Continue to grow and flourish in what you do. Our future depends upon it! Ciao!
@EdwardPike 8 months ago
Groq chip is NOT Grok AI on X, in case anyone was unaware. As of 3/1/24, Groq is supposedly 13x faster inference than Nvidia. However TCO may be an issue. An ASIC is less flexible over time and everything is changing quickly. When are we getting Groq/llm inference asics on our smart phones? How does this compare to Google TPUs?
@finestructureconstant3921 8 months ago
Every silicon implementation has pros and cons.
@Mike80528 8 months ago
ASICs just seem to be the better overall design choice for this specific need. But I understand CPUs much better than tensor processors.
@scott701230 8 months ago
Well done! Your best presentation ever!
@xenuburger7924 8 months ago
I recently had a dream about an AI chip with a huge amount of memory on chip. I like how you are tuned in. As you said before, all the cool kids are designing their own silicon.
@MahamadouDiarra-n4f 8 months ago
Think no better than GSIT Gemini APU
@philmisc3513 8 months ago
Another highly informative video. Groq looks like a really promising solution targeting the non-hyper scaler market segment you described. Thank you for the amazing and easily understandable content.
@morenofranco9235 8 months ago
Thanks, Anastasi, for a really remarkable presentation. We live in exciting times.
@Noqtis 6 months ago
Her Name is Anastamariastasia
@jerrymont2595 7 months ago
Thank you, you have one of the best tech videos, indeed .....
@bobasawrus 8 months ago
Thank you so much for your comprehensive explanations of the chip and the space. Your effortless traversal of the layers in the explanations is amazing! The magic is the on-chip memory. This is difficult to achieve.
@fangeming1 6 months ago
Very good video, the only one I saw explaining the cons of the technology. Requiring several hundred Groq chips to run Llama or Mixtral, costing several million USD (one Groq chip costs 18k USD), compared to Nvidia hardware under 10k USD running the same model, the speed gain comes at a huge cost. To be profitable, Groq will have to put many customers in a queue, and the generation speed will be offset by the time spent queuing. This is already what we experience on the Groq site. Thanks a lot Anastasi for the thorough analysis.
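A quick sanity check of this comment's math. The per-chip price and the sub-$10k Nvidia figure are the commenter's own estimates, and the 576-chip cluster size is an assumed figure for a Llama/Mixtral-class deployment, used purely for illustration:

```python
# All figures are the commenter's estimates or illustrative assumptions,
# not vendor pricing.
GROQ_CHIP_PRICE_USD = 18_000      # commenter's per-chip figure
ASSUMED_CLUSTER_CHIPS = 576       # assumed chips for a Llama/Mixtral-class model
NVIDIA_SETUP_USD = 10_000         # commenter's "less than 10k" figure

groq_cluster_usd = GROQ_CHIP_PRICE_USD * ASSUMED_CLUSTER_CHIPS
cost_ratio = groq_cluster_usd / NVIDIA_SETUP_USD
print(f"Groq cluster: ${groq_cluster_usd:,} (~{cost_ratio:.0f}x the Nvidia setup)")
```

Under those assumptions the cluster lands in the ~$10M range, which is consistent with the "several millions" claim above.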
@karlrmaier 8 months ago
It is the "Feedback of Competition" that provides both the Information and Motivation which forces continuous improvements in Quality, Service, and Price in free markets. Let the market fight it out, and the human race will be the beneficiary.
@petrari2157 8 months ago
As always, you provided excellent and interesting information :) Thank you very much :D
@pazitor 8 months ago
Thanks, Anastasi. Another good one.
@bladestarX 8 months ago
The real question is, given how and why they are able to achieve this speed, how long until others who have the manufacturing capacity can replicate the architecture? It seems that, while impressive, the barrier to entry is very low for companies like Intel and NVIDIA. Groq is essentially optimizing for inference by adding components that naturally reduce latency. Others like NVIDIA are yet to start doing this. Nevertheless, it's exciting to see how they were able to improve inference dramatically at 14nm with such a basic architecture. I can't wait to see what the inference speed will be at 4nm or with more advanced components.
@isajoha9962 8 months ago
I hope Groq (LPU) based personal computers will be available within the next 10 years. And that they don't cost as much as a car.
@ПИЦКВлад 8 months ago
❤❤❤ Anastasi, I love your show. Very informative. Thank you. 😘
@LionTurtle3 5 months ago
"Training AI models is a one-time problem. And computing power is getting cheaper and cheaper. But inference is a constant problem, and it's inherently a larger market than training." Light bulb moment with this statement. Thank you for such clear explanations! 🙏
@digitman2010 8 months ago
Groq can increase memory per processing unit for very large models. It's not a big con. Also keep in mind it will be manufactured on 4nm, which will decrease area and increase performance per watt.
@kazedcat 8 months ago
Mixing logic and memory is always less dense than separate chips for logic and memory. Without logic causing interference you can pack memory closer together, not even accounting for the fact that DRAM uses a completely different process, so integrated memory must be either eDRAM or SRAM, both of which are less dense than standard DRAM.
@alertbri 8 months ago
A 1.58-bit LLM at around 3,000 tokens per second. It's going to be wild.
@BrianFedirko 8 months ago
Asynchronous, or "no synchronous", is an important aspect to consider. It can, say, compress a century of high-speed sample data to produce an "answer" in a bit or a byte or bytes. It allows so many assumed rules to be discarded or alleviated, toward progress in quality of information in efficient time domains. To include "feedback loops" to both sides of the underlying way a program can re-program itself, while including ways to incorporate "vectors" in time and in space. Any mix/match use of these concepts at any level will drive efficiency, and produce answers to create better questions. It is better to know an answer before one need consider the questions. Gr8! Peace ☮💜
@gyulanagy5910 2 months ago
14 nm is not only mature but SRAM doesn't scale with smaller nodes. So they wouldn't get significantly better area but higher price.
@cnotation 8 months ago
Been waiting for this.. thank you.
@bfair6284 8 months ago
Now the question is.... is Anatasia an AI model? So beautiful and knowledgeable at the same time! 😁😁😁
@Noqtis 6 months ago
DING DING DING you won the internet, gz
@yoyo-jc5qg 8 months ago
What do they do with thousands of old GPUs when upgrading?
@phrankus2009 8 months ago
SCALE LOCALLY ... Decouple the memory but keep it *very close. Another way to look at it is to distribute processor nodes throughout an air-cooled, 3D "block" of ultra dense memory. Pack it tight and allow for cooling, at every layer of scale. .. In this way, you can vary the ratio of processor nodes, to memory cells, at any layer of scale, as dedicated applications may require.
@HoriaCristescu 8 months ago
I was hoping you were going to mention the most interesting aspect - synchronous & deterministic architecture, software defined memory and network access, all orchestrated by the compiler over many chips acting as one large chip
@JamesMBC 8 months ago
Awesome tech! I'm particularly interested in how ASICs can help bring on-device inference, instead of cloud platforms. Cloud services are always less attractive to me, offline AI is way more practical for mobile devices and non-work tasks. It's like video games and software in general: I hate live service payments (looking at you, Office 365), prefer to pay once upfront. I'm just sick of the list of subscriptions we are already subjected to.
@dennisdonnelly4440 8 months ago
This is an amazing advancement! The efficiency improvement is stunning. Their competitors must be shaking in their boots. Looks absolutely disruptive. 👍👍
@HamguyBacon 8 months ago
Not really, since they are not aiming to replace anyone; rather, they sell processing power to companies as a service.
@HappyPandaBear73 8 months ago
Absolutely Fascinating! Thank you Anastasia for your insights news segment about technology!👍🙏🐲🐉🐼🌏🙂
@MikkoRantalainen 8 months ago
Great point on lowering packaging costs thanks to built-in memory!
@wiredgorilla 8 months ago
Very interesting indeed!
@alb3eer007 8 months ago
Love your analysis, love more your Angelic Soul 🙏🏼
@Jacobk-g7r 8 months ago
The problem is that there are so many differences; it's hard to integrate them together. As humans, our neurons are living in a superposition: they are reflecting the difference while maintaining themselves. So it's actually just everyone's perspective not aligning, just like languages, words, and letters not shaped the same but still made of the same pencil lead and the same stuff, so differences can embody differences like a mirror while maintaining themselves. The problem is all the differences are hard to align, just like when a bunch of clocks aren't set right and then we're trying to use each other's clocks to trace back: what's the right time? So imagine the humans are actually finding things and then leveraging those things, so time isn't the fourth dimension, but it may just be the fourth that we have leveraged or found, and as we integrate those differences, we need to re-align ourselves so that we better reflect what's really going on. So that's why no one really knows what's going on; some people have more alignment, some people have less alignment, and that would be like smart and not smart, but it's not as simple as that, because every perspective is differences that could possibly matter because others aren't aligned properly. Your perspectives matter; think on that for a sec. We integrate the perspectives, and that's why we put ourselves in the shoes, or align the differences, so we can see better or predict better. Hopefully it's understandable and makes sense. I mean the errors are allowed; anything you think of is just like distance and we fill in the details between. I mean like this: we align AI by reflecting the differences through the systems, similar to the human body and how difference is integrated. It's not absorbed, but absorption is one way. Reflection is like a key figure here and we gotta align them. It's like we are backwards lol, that's what aligning AI means.
@imusiccollection 8 months ago
I appreciate your objectivity, thank you. Good to think on a bigger scale and make a bigger picture.
@Jacobk-g7r 8 months ago
@@imusiccollection thank you, most comments tell me I’m wrong but that’s allowed. There are no errors.
@Bassotronics 8 months ago
No chip more lovely than an Anastasi chip. 👍🏻
@karlweaver2597 8 months ago
Great comments, you have given us valuable information! keep it up, Karl
@Minneolaos 8 months ago
The 4 nanometer chip will be instantaneous and allows models to change the output. You could have "filters" for terse, elaborate, sad, happy, cute etc. So emotiveness and all kinds of alterations can be added on the raw output. At that point there will be the raw output and the filtered/altered output available. Most people will want to change the mood and hear the altered output that has gone through some kind of model. Every person has a personal taste so the mood can be fine tuned.
@erniea5843 8 months ago
Awesome, I was hoping you would cover Groq
@christopherleubner6633 8 months ago
When your chip is an entire 12-inch wafer... I can only imagine the heat it would make running at full load, and also the reject rate. 😮😮😮
@Steamrick 8 months ago
Apparently it's 23kW sustained per 'chip'. But the yield should be almost 100% because of built-in redundancies to deal with manufacturing defects.
@miinyoo 8 months ago
I do training and inference at home on open source platforms with just one 3090. Some things do take a very long time but most can be done especially on stable diffusion in seconds to minutes for reasonably complicated things. Difference is, I'm only doing it for one person's imagination, mine and I don't mind walking away to quite literally go for a walk while something heavy is churning. I cannot imagine even with 80GB A100s or newer just how many GPUs nor how many megawatt hours nor how many megaBTUs of cooling you'd need to train and serve something as huge as GPT4, Gemini, Grok while serving hundreds of thousands of people simultaneously 24/7.
@springwoodcottage4248 8 months ago
Super well presented, interesting and topical. The primary issue for all Nvidia competitors is the risk of trouble for these small companies, leading to supply issues and potentially no supply if a smaller player like this one is taken over, whereas Nvidia has scale, making its systems more attractive and able to command better margins. For one of these new entrants to succeed, it must be clearly better, cost less, and be robustly funded and stable. Still, it can happen, and an ASIC is a great way to go, giving the impressive low latency due to the combination of processor and memory. It will be interesting to see who wins in this great race to enhance human minds with artificial intelligence. Thank you for sharing!
@melchiorhof6557 8 months ago
Great to hear about the Groq hardware. But I find the Grok AI model's name a very confusing choice. Maybe worth explaining that they are not the same.
@ted_van_loon 8 months ago
The working of that chip sounds largely like something I suggested many years ago, also for compute scenarios similar to AI, even though back then it wasn't specifically aimed at AI, since that wasn't really a thing that was done other than in early tests or at very small scale.
@SimonGimm 8 months ago
Great video, thank you. Just for future reference, SK Hynix is pronounced S-K Hy-nix, not "skiinks"... lol
@Lowlin85 8 months ago
Great video. Informative as always 😊
@SaymonFernandes 8 months ago
Great Content ! Thank you.
@ETRdotTV 8 months ago
Very clear presentation!
@alaad1009 8 months ago
This is an excellent video!
@drewbizdev 8 months ago
Another excellent video Anastasi. Keep up the good work! 🌺👍🌺
@digitalclips 8 months ago
Interesting! I have not heard of GF for a long time. Remember 2014 when GF's stock temporarily shot up on rumors of Apple using them for the iPhone?
@take5th 8 months ago
It’s amazing how much money and time is spent on improving response times from slightly too slow to quicker than you can think. This is true in all fields, I guess to keep workflow intact and to waste as little time as possible, since time is the only thing you cannot acquire.
@guest578 8 months ago
11:45 Every time I hear that, "More Than You Know" by Axwell Λ Ingrosso starts playing.
@ps3301 8 months ago
We need it on our phone !!
@Gusman007 8 months ago
Groq will clean up with LLM hardware requirements for robots using its LPU. No competition yet! Of course, prototypes are easy and production is hard, but from what I understand they are already in production.
@equerol1396 8 months ago
Gracias.
@petereriksson7166 8 months ago
Very good video. Good job.
@pan_salceson 8 months ago
Damn, you're so good at talking and teaching about it!
@uTubeJRod 8 months ago
Anastasi!! How come all of the chip manufacturers were not already doing on-chip memory? It seems like such an obvious engineering decision! I would love to know! You mentioned that large-memory computations would be at a disadvantage, but most, or at least much, of the computing we've been doing until recently hasn't needed such large amounts of memory. I hope you answer this! Thank you for your informative videos; I was really looking forward to your insights on Groq!!
@kornydad14 8 months ago
Cost: on-chip memory (HBM) is very expensive, and from what I have heard, the per-chip yield for the accompanying processor is lower as well.
@scottaseigel5715 8 months ago
We’re at the point where we need some inference functionality co-located with the user. Phone companies, app devs and servers will then be able to offload just enough AI processing to vastly enhance user experience and workflow efficiency. I estimate a little of this technology will make its way into our smartphones in only 1-3 years (one, or at most two, phone releases away).
@msalmanpour6288 8 months ago
Nice, I'm glad that outsourced technologies are finally coming back to the US.
@KashmirParadise786 8 months ago
Thanks for the information
@paul1979uk2000 8 months ago
I'm sure we all know that rivals are more or less going to do much the same thing to rob them of that advantage, probably even integrated into CPUs or GPUs. On the plus side, this could have a big impact on reducing the price of A.I. chips, which are distorting the entire GPU market at the moment.

From what I understand, on performance per watt it's already much better than what Nvidia offers. If so, and considering this is 14nm, I do wonder how much faster and lower-power the 4nm version will be.

I also think that because the task this chip does is far more streamlined compared to a CPU or GPU, a lot of rivals will likely be able to enter this market much more easily than the traditional CPU and GPU markets, especially because for A.I. the drivers are much more streamlined and easier to do. That's great news, because in the coming years it will very likely put a lot of downward pressure on prices while boosting performance, and with that we might end up with some kind of A.I. chip that we can slot into a consumer-grade PC. I think locally run A.I. models are going to be far more important than the online centralised models for privacy and security reasons. As A.I. becomes more capable and useful, it's going to be more involved in our daily lives, and I suspect for most of us, as well as businesses, it wouldn't be that appealing to share so much data with an online centralised database.
@jigglypuff4227 8 months ago
Anastasi you are such an amazing human being
@MrKelaher 8 months ago
Hopefully they will do an edge chip for running 7B models
@aGj2fiebP3ekso7wQpnd1Lhd 8 months ago
Great video!
@AmandaFessler 8 months ago
Subbed because this is the most depth I've seen on Groq in a lot of my searching. Also because you're a chip specialist. I guess my question is, can we expect consumer grade inference chips any time soon? The reality of how many Groq chips it takes to run Mixtral right now is... disheartening.
@jason.arthur.taylor 8 months ago
get real. she has no idea how the chip works.
@AmandaFessler 8 months ago
@@jason.arthur.taylor Oh, I see. That's disappointing. Do you know any actual chip experts you could recommend?
@jason.arthur.taylor 8 months ago
@@AmandaFessler This guy explains the chip. kzbin.info/www/bejne/l6DJeJuPach-qdk
@jason.arthur.taylor 8 months ago
@@AmandaFessler Were you able to see my reply? Sometimes I'm censored especially if I give a web site.
@AmandaFessler 8 months ago
@@jason.arthur.taylor Unfortunately no, not even a notification. I guess it was censored. Maybe the name of the site will do?
@guilhermepimenta_prodabel 8 months ago
The challenge is to reduce the software-to-hardware time when the hardware defines the software design. In my opinion, we need a more general hardware platform that is scalable and cheaper. The solution is to create an open hardware architecture like RISC-V, with a lot of collaborators, to create a new open Neural Network Unit. Data processing is a very complex systems engineering problem because of the old problem of transferring data from high-latency to low-latency memory to be processed and then stored back. In the recent past, huge processing problems were not so data-oriented, so more processing power meant more information, but now the problem of the memory hierarchy is back. We need more memory near the processing units and less processing power to reduce the latency.
@PaR2020 8 months ago
You don't need 2xH100 to run the Mixtral 8x7B model. 2xRTX3090 at 4-bit precision runs at 30 T/s (yes, lower precision, but it works very well anyway). 10TB models will probably not be needed. Specialized models, in the same way this hardware is specialized, will be more efficient and popular.
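A rough VRAM check of the claim above. Mixtral 8x7B's total parameter count (~46.7B) is a published figure; the rest is simple arithmetic, ignoring activations and KV-cache overhead:

```python
# Does Mixtral 8x7B at 4-bit quantization fit in two 24 GB RTX 3090s?
params = 46.7e9           # Mixtral 8x7B total parameters (~46.7B, published)
bytes_per_param = 0.5     # 4-bit weights = half a byte each
weights_gb = params * bytes_per_param / 1e9
vram_gb = 2 * 24          # two RTX 3090s
print(f"Weights: ~{weights_gb:.1f} GB of {vram_gb} GB VRAM")
```

The weights alone come to roughly 23 GB, leaving the remaining VRAM for activations and the KV cache, which is consistent with the commenter's experience.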
@jonteno 4 months ago
Sounds perfect for simpler AI agents; I see nothing keeping these models from working alongside other technologies that could more easily handle bigger, harder operations.
@vi6ddarkking 8 months ago
It won't be long now. We'll have a CPU, GPU and LPU on our motherboards. And considering recent developments in AI training, I am fully expecting to be able to run trillion-parameter models locally before the end of the decade.
@yagoa 8 months ago
for casual use it is nicer with slower text imo, but for multi-AI teams it will be great!
@leematthews6812 8 months ago
I think Anastasi should bring back the dreamy background music she used to close her videos with 🙂
@DeckEdge 8 months ago
Great challenges at the end, I wonder how much will be solved going from the 14nm chip to the 4nm chip you mentioned in the beginning.
@Noqtis 6 months ago
about 9.73 pipi will be solved by 4nm
@moozoowizard 8 months ago
It takes 8 chips to run TinyLlama, and the PCIe board they sell has one chip, so 8 PCIe cards. And I believe each is priced at something like the cost of an H100. So a very long way from being able to run anything useful on a standard PC. 🤞 One day they will be able to fit 578 Groq chips into one chip and sell them at 4090 prices. We just need to wait for a 0.44nm process node.
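The chip counts in this comment follow from on-chip memory capacity. A sketch under stated assumptions: 230 MB of SRAM per chip is the commonly cited GroqChip figure, and 1 byte per parameter (8-bit weights) is purely illustrative; real deployments also need room for activations, so actual counts run higher:

```python
import math

# How many GroqChips it takes just to hold a model's weights in on-chip SRAM.
SRAM_MB_PER_CHIP = 230  # commonly cited GroqChip on-chip SRAM figure

def chips_needed(params_billion: float, bytes_per_param: float = 1.0) -> int:
    """Chips required to store the weights alone, ignoring activations."""
    model_mb = params_billion * 1e9 * bytes_per_param / 1e6
    return math.ceil(model_mb / SRAM_MB_PER_CHIP)

print(chips_needed(1.1))   # TinyLlama-class (~1.1B params)
print(chips_needed(70))    # Llama-70B-class
```

A ~1.1B-parameter model already needs a handful of chips on this floor estimate, and a 70B-class model needs hundreds, which matches the "8 chips for TinyLlama" and "hundreds of chips" figures discussed in this thread.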
@dm3035 8 months ago
LOVE YOUR VIDEOS - VERY INFORMATIVE - AT THE CHIP nano LEVEL 🤗🤗🤗 DO NOT STAY AWAY FOR LONG PERIODS OF TIME - MISS YOU 🤗🤗🤗
@MozartificeR 8 months ago
14 nanometers to 4 nanometers is huge. That's something you should keep your ear to the ground about :) I wonder if they will ever do a hybrid of Groq and Cerebras?
@MahamadouDiarra-n4f 8 months ago
@Anastasia, does the Groq chip violate GSIT's patents regarding their Gemini APU?
@zbaktube 8 months ago
About the Groq chip: it outperforms Nvidia mainly because of the on-chip RAM, doesn't it? How is it doing with heat? It is easier to cool a 14nm chip, isn't it? Could less efficient cooling at 4nm make the next version of this chip less efficient?
@volome 8 months ago
First, thanks for your wonderful video. Groq can always add more memory in a future design, and it seems like the software stack is flexible enough to be compatible with future hardware designs without much development. At the same time, LLM development is moving toward smaller, more specialized models chosen by a predictor model, so if Groq can utilize this it will be perfect. And as you said, the small and middle enterprises that want to keep their data local are a huge business; a small business doesn't need 10 trillion parameters for its internal data. Yes, it requires more chips than GPUs, but they are easier to manufacture and they don't need to stand in line at the 1-nanometer factories. And the Groq flexible router is built into the chip, so no expensive switches are needed. And lastly, it's a huge saving in electrical power. NVIDIA should buy Groq now before they get too large. Jerry
@MikkoRantalainen 8 months ago
Running low-latency AI chips in the cloud seems a bit weird, because latency from your own system to the cloud is always going to add extra latency. I guess you should locate your servers with the same cloud provider as the Groq service provider.
@oldtools 8 months ago
The interconnect for the Groq architecture would need a ridiculous amount of bandwidth in order to retain any compute advantage for LLMs anywhere near the size of even yesterday's flagship models.
@crhu319 5 months ago
The focus is inference delivery.
@oldtools 5 months ago
@@crhu319 I see this. thanks.
@MikkoRantalainen 8 months ago
Scaling to 100T models is going to be hard as long as accelerator cards have 10-100 GB of RAM and you really want cards with 1-50 TB of RAM.
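The TB-scale figure in this comment follows from weight storage alone. A minimal sketch; the parameter count is the comment's hypothetical 100T, and only weights are counted (no optimizer state, activations, or KV cache):

```python
# Weight storage alone for a hypothetical 100-trillion-parameter model
# at common inference precisions.
params = 100e12
for bits in (16, 8, 4):
    tb = params * bits / 8 / 1e12   # bytes -> terabytes
    print(f"{bits:>2}-bit weights: {tb:.0f} TB")
```

Even at 4-bit precision the weights alone run to tens of terabytes, which is why cards in the 1-50 TB range would be needed.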
@ItsTristan1st 8 months ago
The memory limit is a killer limitation. Tightly coupled small RAM is not a new approach, and the problems are well understood. The big question is whether or not they can develop their algorithms to cope with the limitation; however, this is typically not a simple problem to solve and leads into combinatorial optimization scenarios. Generally the long-term solution to that sort of problem is always to just add more RAM. Ironically, I could see this Groq chip working far better for canned problems rather than the general application it is being targeted for. Let us see what happens.
@yoshikimatsui6384 8 months ago
I would like to think of a semiconductor that operates only by the vibrations of the stone. Is it possible to accelerate the flow of electrons in a semiconductor using just the vibrations of the stone? If we do that, I think we can create a flow of electrons within the semiconductor with minimal current and voltage.
@yewhanlim8916 8 months ago
Fact: memory, which needs to hold charge, does not scale down as well as logic circuits. Terabytes of memory are better off-chip. We can still have some on-chip cache.
@marcvayn 8 months ago
Do you see Nvidia following a similar path? It is odd that a company with about $5M in revenue and antiquated 14nm tech can run circles around the latest H100s for inference.