Optimize GPU performance for AI - Prof. Gennady Pekhimenko

Views: 8,321

Machine Learning Street Talk

1 day ago

Comments: 19
@MachineLearningStreetTalk (1 month ago)
REFS:
[0:00:15] Bill Gates and Mark Zuckerberg - Examples of technical founders who dropped out of Harvard to pursue business opportunities, both with strong programming backgrounds. (Harvard Gazette) news.harvard.edu/gazette/story/2017/05/make-a-difference-zuckerberg-tells-harvard-graduates/
[0:06:15] Jensen Huang's NVIDIA Management - Discussion of a flat organizational structure with ~50 direct reports, demonstrating an unconventional tech company management approach. (Harvard Business Review) hbr.org/podcast/2023/11/nvidias-ceo-what-it-takes-to-run-an-a-i-led-company-now
[0:11:05] LLaMA 2 - Meta's open-source large language model collection ranging from 7B to 70B parameters. (Touvron et al.) arxiv.org/abs/2307.09288
[0:25:35] Mistral 7B - 7B-parameter language model outperforming Llama 2 13B using grouped-query attention and sliding window attention. (Jiang et al.) arxiv.org/abs/2310.06825
[0:34:45] Blocksworld Problems - Research on self-verification and chain-of-thought limitations in LLMs. (Kambhampati et al.) www.arxiv.org/pdf/2402.08115
[0:35:55] LLM Arithmetic Limitations - Research demonstrating gaps in mathematical reasoning capabilities, particularly in three-digit multiplication. (Dziri et al.) arxiv.org/abs/2305.18654
[0:41:35] AI Energy Consumption - Quantitative analysis of computational costs in modern AI training compared to biological systems. (Strubell et al.) arxiv.org/abs/1906.02243
[0:46:20] Cursor.sh - AI-first code editor featuring multi-file diff review and AI-assisted code generation. cursor.sh
[1:07:25] NVIDIA CUDA - Parallel computing platform and programming model for NVIDIA GPUs, discussed in the context of kernel optimization evolution. docs.nvidia.com/cuda/cuda-c-programming-guide/
[1:09:15] TVM Compiler - Automated end-to-end optimizing compiler for deep learning workloads across diverse hardware backends. (Chen et al.) arxiv.org/abs/1802.04799
[1:17:20] No Free Lunch Theorem - Proves that all optimization algorithms have identical average performance across all possible problems, with implications for ML optimization. (Wolpert & Macready) arxiv.org/abs/2007.10928
[1:52:40] BERT - Introduction of deep bidirectional representations from unlabeled text, discussed in the context of Google scaling bidirectional attention models. (Devlin et al.) arxiv.org/abs/1810.04805
[1:56:00] The Hardware Lottery - Analysis of how hardware availability historically influenced AI research success. (Hooker) arxiv.org/abs/2009.06489
[1:56:35] Geoffrey Hinton Nobel Prize - Awarded the 2024 Nobel Prize in Physics for pioneering work in deep learning and neural networks. www.technologyreview.com/2024/10/08/1105221/geoffrey-hinton-just-won-the-nobel-prize-in-physics-for-his-work-on-machine-learning/
[2:03:57] Chomsky's LLM Critique - Argues that language models do not constitute genuine linguistic theory as they fail to properly delineate language possibilities. arxiv.org/pdf/2401.03910
[2:06:15] NVIDIA Market Share - Analysis showing 98% revenue share in the data-center GPU market with $36.2B revenue in 2023. www.hpcwire.com/2024/06/10/nvidia-shipped-3-76-million-data-center-gpus-in-2023-according-to-study/
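The Mistral 7B reference above credits sliding window attention. Purely as an illustration (not from the episode, and simplified from the paper), the mask that scheme implies lets position i attend only to itself and the window - 1 tokens immediately before it:

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window attention mask: entry [i][j] is True
    when position i may attend to position j, i.e. i - window < j <= i."""
    return [
        [(j <= i) and (j > i - window) for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(6, 3)
# Each row has at most 3 True entries: the token itself plus the 2 before it.
```

In the actual model this mask is applied to the attention scores before the softmax; stacking layers lets information propagate beyond the window.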
@uw10isplaya (1 month ago)
MLST, where even the sponsored content is banger certified
@enkidughom2508 (1 month ago)
I interviewed for them, and they have the most fun and interesting interviewing experience
@tedlasso2887 (1 month ago)
Can you share some?
@pedrogorilla483 (1 month ago)
You can’t just say that and leave bro, please tell us more
@Outplayedqt (1 month ago)
Mind sharing 1 or 2 tips or insights? Or anything that surprised you, in particular?
@ErmekD (1 month ago)
Always great to see technical guests break down complicated concepts in simple language!
@tommybtravels (1 month ago)
Hi Dr. Scarfe, thanks for another great episode. Some additional questions you might have considered asking Dr. Pekhimenko:
- End-to-end neural nets all the way to AGI, or is something neurosymbolic required?
- His take on the ARC challenge
- Are any chip startups such as Cerebras, Groq, and/or SambaNova doing anything interesting in terms of architecture/chip design, and could any of them (or others) threaten Nvidia in training and/or inference?
- How much of a threat to Nvidia's market dominance are the custom ASICs made by Google, Amazon, and soon OpenAI via Broadcom?
- How much of a moat is CUDA now, and how much staying power is that moat likely to have in the future?
Big fan of your work. Thanks again.
@AhmedMOHAMMEDAHMED-hm2rx (1 month ago)
I wonder if it is possible to offload the verification process in the future.
@yurona5155 (1 month ago)
Once a new MLST episode drops (even a sponsored one), I'm very 'suspectible' to staying up way past my bedtime...*scnr*
@luisluiscunha (1 month ago)
Use a good YouTube summarizer
@luisluiscunha (1 month ago)
If subscribed to ChatGPT 😊
@MachineLearningStreetTalk (1 month ago)
@@luisluiscunha we go to great lengths to produce PDF shownotes, look at them! www.dropbox.com/scl/fi/w9kbpso7fawtm286kkp6j/Gennady.pdf?rlkey=aqjqmncx3kjnatk2il1gbgknk&st=2a9mccj8&dl=0 - you can feed it into Claude and ask specific questions
@Reversed82 (1 month ago)
49:02 I guess you are talking about tracing? I hope you already know about tracing and correlating it with logs + metrics (from a software dev turned operations/SRE)
@Pingu_astrocat21 (1 month ago)
Cool stuff🔥
@DelandaBaudLacanian (1 month ago)
I wonder why Cassie Kozyrkov left Google? Her projects are something to watch out for
@shubhamarle96 (1 month ago)
Isn't Mamba better than Transformers?
@DailyTuna (1 month ago)
The goal is to be less technical. As these tools do the technical work, it makes sense to simplify so people can focus on creativity and problem solving.
@BuFu1O1 (1 month ago)
lol is he Gwern??