Optimize GPU performance for AI - Prof. Gennady Pekhimenko

Views: 8,321

Machine Learning Street Talk

1 day ago

Comments: 19
@MachineLearningStreetTalk (1 month ago)
REFS:
[0:00:15] Bill Gates and Mark Zuckerberg - Examples of technical founders who dropped out of Harvard to pursue business opportunities, both with strong programming backgrounds. (Harvard Gazette) news.harvard.edu/gazette/story/2017/05/make-a-difference-zuckerberg-tells-harvard-graduates/
[0:06:15] Jensen Huang's NVIDIA Management - Discussion of a flat organizational structure with ~50 direct reports, demonstrating an unconventional tech company management approach. (Harvard Business Review) hbr.org/podcast/2023/11/nvidias-ceo-what-it-takes-to-run-an-a-i-led-company-now
[0:11:05] LLaMA 2 - Meta's open-source large language model collection ranging from 7B to 70B parameters. (Touvron et al.) arxiv.org/abs/2307.09288
[0:25:35] Mistral 7B - 7B-parameter language model outperforming Llama 2 13B using grouped-query attention and sliding window attention. (Jiang et al.) arxiv.org/abs/2310.06825
[0:34:45] Blocksworld Problems - Research on self-verification and chain-of-thought limitations in LLMs. (Kambhampati et al.) www.arxiv.org/pdf/2402.08115
[0:35:55] LLM Arithmetic Limitations - Research demonstrating gaps in mathematical reasoning capabilities, particularly in three-digit multiplication. (Dziri et al.) arxiv.org/abs/2305.18654
[0:41:35] AI Energy Consumption - Quantitative analysis of computational costs in modern AI training compared to biological systems. (Strubell et al.) arxiv.org/abs/1906.02243
[0:46:20] Cursor.sh - AI-first code editor featuring multi-file diff review and AI-assisted code generation. cursor.sh
[1:07:25] NVIDIA CUDA - Parallel computing platform and programming model for NVIDIA GPUs, discussed in the context of kernel optimization evolution. docs.nvidia.com/cuda/cuda-c-programming-guide/
[1:09:15] TVM Compiler - Automated end-to-end optimizing compiler for deep learning workloads across diverse hardware backends. (Chen et al.) arxiv.org/abs/1802.04799
[1:17:20] No Free Lunch Theorem - Proves that all optimization algorithms have identical average performance across all possible problems, with implications for ML optimization. (Wolpert & Macready) arxiv.org/abs/2007.10928
[1:52:40] BERT - Introduction of deep bidirectional representations from unlabeled text, discussed in the context of Google scaling bidirectional attention models. (Devlin et al.) arxiv.org/abs/1810.04805
[1:56:00] The Hardware Lottery - Analysis of how hardware availability historically influenced AI research success. (Hooker) arxiv.org/abs/2009.06489
[1:56:35] Geoffrey Hinton Nobel Prize - Awarded the 2024 Nobel Prize in Physics for pioneering work in deep learning and neural networks. www.technologyreview.com/2024/10/08/1105221/geoffrey-hinton-just-won-the-nobel-prize-in-physics-for-his-work-on-machine-learning/
[2:03:57] Chomsky's LLM Critique - Argues that language models do not constitute genuine linguistic theory as they fail to properly delineate language possibilities. arxiv.org/pdf/2401.03910
[2:06:15] NVIDIA Market Share - Analysis showing 98% revenue share in the data-center GPU market with $36.2B revenue in 2023. www.hpcwire.com/2024/06/10/nvidia-shipped-3-76-million-data-center-gpus-in-2023-according-to-study/
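The Mistral 7B reference above credits sliding window attention. Purely as an illustration (not from the episode, and simplified from the paper), the mask that scheme implies lets position i attend only to itself and the window - 1 tokens immediately before it:

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window attention mask: entry [i][j] is True
    when position i may attend to position j, i.e. i - window < j <= i."""
    return [
        [(j <= i) and (j > i - window) for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(6, 3)
# Each row has at most 3 True entries: the token itself plus the 2 before it.
```

In the actual model this mask is applied to the attention scores before the softmax; stacking layers lets information propagate beyond the window.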
@uw10isplaya (1 month ago)
MLST, where even the sponsored content is banger certified
@enkidughom2508 (1 month ago)
I interviewed for them, and they have the most fun and interesting interviewing experience
@tedlasso2887 (1 month ago)
Can you share some?
@pedrogorilla483 (1 month ago)
You can’t just say that and leave bro, please tell us more
@Outplayedqt (1 month ago)
Mind sharing 1 or 2 tips or insights? Or anything that surprised you, in particular?
@ErmekD (1 month ago)
Always great to see technical guests break down complicated concepts in simple language!
@tommybtravels (1 month ago)
Hi Dr. Scarfe, thanks for another great episode. Some additional questions you might have considered asking Dr. Pekhimenko:
- End-to-end neural nets all the way to AGI, or is something neurosymbolic required?
- His take on the ARC challenge
- Are any chip startups such as Cerebras, Groq, and/or SambaNova doing anything interesting in terms of architecture/chip design, and could any of them (or others) threaten Nvidia in training and/or inference?
- How much of a threat to Nvidia's market dominance are the custom ASICs made by Google, Amazon, and soon OpenAI via Broadcom?
- How much of a moat is CUDA now, and how much staying power is that moat likely to have in the future?
Big fan of your work. Thanks again.
@AhmedMOHAMMEDAHMED-hm2rx (1 month ago)
I wonder if it is possible to offload the verification process in the future.
@yurona5155 (1 month ago)
Once a new MLST episode drops (even a sponsored one), I'm very 'suspectible' to staying up way past my bedtime...*scnr*
@luisluiscunha (1 month ago)
Use a good YouTube summarizer
@luisluiscunha (1 month ago)
If subscribed to ChatGPT 😊
@MachineLearningStreetTalk (1 month ago)
@@luisluiscunha we go to great lengths to produce PDF shownotes, look at them! www.dropbox.com/scl/fi/w9kbpso7fawtm286kkp6j/Gennady.pdf?rlkey=aqjqmncx3kjnatk2il1gbgknk&st=2a9mccj8&dl=0 - you can feed it into Claude and ask specific questions
@Reversed82 (1 month ago)
49:02 I guess you are talking about tracing? I hope you already know about tracing and correlating it with logs + metrics (from a software dev turned operations/SRE)
@Pingu_astrocat21 (1 month ago)
Cool stuff🔥
@DelandaBaudLacanian (1 month ago)
I wonder why Cassie Kozyrkov left Google? Her projects are something to watch out for
@shubhamarle96 (1 month ago)
Isn't Mamba better than Transformers?
@DailyTuna (1 month ago)
The goal is to be less technical. As these tools do the technical work, it makes sense to simplify so people can focus on creativity and problem solving.
@BuFu1O1 (1 month ago)
lol is he Gwern??