Transforming Language with Generative Pre-trained Transformers (GPT)

Views: 30,953

IBM Technology

Comments: 18

@atharvakpatil 2 months ago

IBM on fire with banger releases🔥🔥🔥🔥🔥

@raymobula 2 months ago

These videos get better and better.

@trekkergal460 2 months ago

Loved this video! Such an amazing explanation

@uddinshehab 1 month ago

Such a nice presentation. Loved it.

@lei-m6r 2 months ago

Good explanation, easy to understand.

@datadoctor10 2 months ago

So simple and useful. Thank you

@simonkalu 2 months ago

Outstanding. Nice job

@enelanonymatus5634 2 months ago

Good video. Greetings from Iquitos, Perú.

@learnbydoingwithsteven 2 months ago

I really like how well the content is organized and displayed.

@davidcampos1463 2 months ago

KZbin closed captions are still full of errors. But not here and now.

@jonnysolaris 2 months ago

There's an error at 3:16 - the encoder module doesn't map tokens to a 3D space, but to a much higher-dimensional one, often pictured as a hypersphere. More info below 👇🏻

@jonnysolaris 2 months ago

Token embeddings in large language models live in a far higher-dimensional space than a 3D picture suggests. Each token is mapped to a vector with d_model components, and d_model reaches 12,288 in the largest GPT-3 model. Rather than thinking of these embeddings as fixed points, it's more useful to think of them as vectors with a direction and a magnitude. Direction carries most of the semantic meaning: tokens with related meanings tend to point in similar directions, which is why cosine similarity is the usual way to compare them. If you normalise every vector to unit length, they all sit on the surface of a hypersphere and only direction remains. The raw, unnormalised vectors also have a magnitude, which is sometimes read as emphasis or importance, although that interpretation is looser. The large dimensionality allows for very fine distinctions between concepts, far beyond what a low-dimensional space could hold. This is why simplistic visualisations of embedding spaces can be misleading - they necessarily lose much of the geometric structure that makes these representations so useful for language processing.
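To make the direction-versus-magnitude point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the width `D_MODEL = 768` is a toy value, the vectors are random Gaussian stand-ins rather than learned embeddings, and the helper names (`random_embedding`, `normalize`, `cosine_similarity`) are hypothetical, not taken from any model's code.

```python
import math
import random

D_MODEL = 768  # assumed toy width; real models use different sizes


def random_embedding(dim, rng):
    """Stand-in for a learned embedding: a random Gaussian vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    """Euclidean length (magnitude) of a vector."""
    return math.sqrt(dot(v, v))


def normalize(v):
    """Rescale to unit length, i.e. project onto the unit hypersphere."""
    n = norm(v)
    return [x / n for x in v]


def cosine_similarity(a, b):
    """Compare directions only; magnitudes cancel out."""
    return dot(a, b) / (norm(a) * norm(b))


rng = random.Random(0)  # fixed seed so runs are repeatable
token_a = random_embedding(D_MODEL, rng)
token_a_louder = [3.0 * x for x in token_a]  # same direction, 3x magnitude
token_b = random_embedding(D_MODEL, rng)     # unrelated random direction

# Scaling changes the magnitude but not the direction,
# so this similarity is 1.0 up to floating-point rounding.
print("same direction:", cosine_similarity(token_a, token_a_louder))

# Two independent random vectors in a high-dimensional space are
# nearly orthogonal, so this similarity is close to 0.
print("unrelated:", cosine_similarity(token_a, token_b))

# After normalisation the vector lies on the unit hypersphere.
print("unit length:", norm(normalize(token_a)))
```

The second print is the part a 3D plot cannot show: with hundreds of dimensions there is room for a huge number of nearly independent directions, which is where the fine-grained distinctions between concepts come from.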