New course taught by Jay Alammar and Maarten Grootendorst: How Transformer LLMs Work

DeepLearningAI

Learn more: bit.ly/3WKa3fK
Introducing "How Transformer LLMs Work," created with Jay Alammar and Maarten Grootendorst, authors of the “Hands-On Large Language Models” book. This course offers a deep dive into the main components of the transformer architecture that powers large language models (LLMs).
The transformer architecture revolutionized generative AI. In fact, the "GPT" in ChatGPT stands for "Generative Pre-Trained Transformer."
Originally introduced in the groundbreaking 2017 paper "Attention Is All You Need" by Ashish Vaswani and others, the transformer was a highly scalable model for machine translation tasks. Variants of this architecture now power today's LLMs, such as those from OpenAI, Google, Meta, Cohere, and Anthropic.
In their book, Jay and Maarten beautifully illustrated the underlying architecture of LLMs through insightful and easy-to-understand explanations.
In this course, you'll learn how the transformer architecture that powers LLMs works. You'll build intuition for how LLMs process text and work with code examples that illustrate the key components of the transformer architecture.
Key topics covered in this course include:
The evolution of how language has been represented numerically, from the Bag-of-Words model through Word2Vec embeddings to the transformer architecture that captures word meanings in full context (a small Bag-of-Words example is sketched after this list).
How LLM inputs are broken down into tokens, which represent words or word pieces, before they are sent to the language model (see the tokenization sketch below).
The details of the transformer architecture and its three main stages: tokenization and embedding, the stack of transformer blocks, and the language model head.
The details of the transformer block, including attention, which calculates relevance scores, followed by the feed-forward layer, which incorporates information stored in the weights during training (a toy attention-and-feed-forward sketch appears below).
How cached calculations make transformers faster (see the key/value-cache sketch below), how the transformer block has evolved since the original paper was released, and why transformers continue to be so widely used.
An exploration of the implementation of recent models in the Hugging Face Transformers library (a short loading-and-inspection example appears below).
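
As a rough, purely illustrative sketch of the Bag-of-Words idea from the list above (not code from the course), each sentence below is reduced to a vector of word counts, which discards word order and context:

# Minimal Bag-of-Words sketch: each sentence becomes a vector of word counts.
# Word order and context are lost, which is the limitation later models address.
from collections import Counter

sentences = ["the cat sat on the mat", "the dog sat on the log"]
vocab = sorted({word for s in sentences for word in s.split()})

for s in sentences:
    counts = Counter(s.split())
    print([counts[word] for word in vocab])  # e.g. [1, 0, 0, 1, 1, 1, 2]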
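
For the tokenization step, a minimal sketch using the Hugging Face Transformers library is below; the choice of the "gpt2" tokenizer is only an illustrative assumption, not necessarily what the course uses:

# Tokenization sketch: text is split into tokens (words or word pieces)
# and mapped to integer ids before being sent to the language model.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative tokenizer choice
ids = tokenizer("Transformers process language")["input_ids"]
print(ids)                                   # integer token ids
print(tokenizer.convert_ids_to_tokens(ids))  # the token pieces themselves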
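
The two components of the transformer block can be sketched in a toy NumPy example (illustrative only, with random weights standing in for trained ones): self-attention computes relevance scores between positions, and the feed-forward layer then transforms each position using information stored in its weights:

# Toy transformer-block sketch: self-attention followed by a feed-forward layer.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

d = 8                         # toy embedding size
x = np.random.randn(5, d)     # embeddings for 5 tokens

# Attention: relevance scores (softmax of Q.K^T) decide how value vectors are mixed.
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
attended = softmax(Q @ K.T / np.sqrt(d)) @ V

# Feed-forward layer: applied position by position, it incorporates information
# stored in the learned weight matrices.
W1, W2 = np.random.randn(d, 4 * d), np.random.randn(4 * d, d)
out = np.maximum(0, attended @ W1) @ W2
print(out.shape)              # (5, 8): one output vector per token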
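
The caching idea can also be shown conceptually (again a toy sketch, not the course's code): during token-by-token generation, keys and values for earlier positions are stored and reused, so each step only projects the single newest token:

# Conceptual key/value cache: keys and values for past tokens are kept and reused,
# so each generation step computes projections only for the newest token.
import numpy as np

d = 8
Wk, Wv = np.random.randn(d, d), np.random.randn(d, d)
cache_k, cache_v = [], []     # grows by one entry per generated token

def step(new_token_embedding):
    cache_k.append(new_token_embedding @ Wk)     # only the new token is projected
    cache_v.append(new_token_embedding @ Wv)
    return np.stack(cache_k), np.stack(cache_v)  # all keys/values, no recomputation

for _ in range(3):            # three generation steps
    K, V = step(np.random.randn(d))
print(K.shape, V.shape)       # (3, 8) each after three steps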
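
Finally, a hedged example of exploring a model implementation in the Hugging Face Transformers library; "gpt2" is again just an illustrative small checkpoint. Printing the loaded model shows the embedding layer, the stack of transformer blocks, and the language model head described above:

# Load a small causal language model and inspect its structure: embeddings,
# a stack of transformer blocks, and the language-model head.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")   # illustrative checkpoint
print(model)                                           # embeddings, 12 blocks, lm_head
print(model.config.num_hidden_layers, model.config.hidden_size)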
By the end of this course, you'll have a deep understanding of how LLMs process language, and you'll be able to read papers that introduce new models and follow the architectural details they describe. This intuition will help improve your approach to building LLM applications.
Enroll now: bit.ly/3WKa3fK
