How CUDA Programming Works | GTC 2022

26,622 views

Dan the Man

1 day ago

Comments: 23
@sami9323 5 months ago
this is one of the clearest and most lucid presentations i have seen, on any topic
@dennisrkb 2 years ago
Great presentation on GPU architecture, performance tradeoffs and considerations.
@ypwangreg 1 year ago
I was always puzzled and fascinated by how those grid/block/threads work in parallel on the GPU, and this video explains it all in one go. Very impressive and helpful!
@SrikarDurgi 4 months ago
Dan is definitely the MAN. Great talk!
@hadiorabi692 8 months ago
Man this is amazing
@KingDestrukto 1 month ago
Fantastic presentation, wow!
@holeo196 2 years ago
Another great presentation by Stephen Jones, fascinating
@chamidou2023 6 months ago
Great presentation!
@purandharb 1 year ago
Thanks for the detailed explanation. Really enjoyed it.
@mugglepower 9 months ago
oh man I hope my mum fixed me with a better brain processing unit so I could understand this
@kimoohuang 3 months ago
Great presentation! It is mentioned that 4 warps x 256 bytes per warp = 1024 bytes, which equals the memory page size of 1024 bytes. That only happens when the 4 warps are running adjacent threads. Are the 4 warps always running adjacent threads?
@perli216 3 months ago
@@kimoohuang Not necessarily. Depends on the warp scheduler
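For reference, a minimal sketch (my own, not code from the talk) of the access pattern being discussed: with a plain linear index, the 32 threads of a warp each load one 8-byte double, i.e. 256 bytes per warp, so four warps with adjacent thread indices together cover one 1024-byte page. The kernel name and data sizes below are illustrative assumptions.

```cuda
// Illustrative only: linear indexing so adjacent threads touch adjacent doubles.
__global__ void scale(const double* __restrict__ in, double* __restrict__ out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // consecutive threads -> consecutive elements
    if (i < n)
        out[i] = 2.0 * in[i];  // each warp loads 32 x 8 bytes = 256 contiguous bytes;
                               // 4 adjacent warps span 1024 bytes = one page in the talk's example
}
```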
@openroomxyz 2 years ago
Interesting!
@KalkiCharcha-hd5un 3 months ago
@21:17 "It's exactly the right amount of data to hit the peak bandwidth of my memory system. Even if my program reads data from all over the place, each read is exactly ONE page of my memory." I didn't understand the part "Even if my program reads data from all over the place". Does it mean even if the data is read from non-consecutive memory?
@perli216 3 months ago
yes
@perli216 3 months ago
You got the benefits of reading contiguous memory for free basically, even when doing random reads
@KalkiCharcha-hd5un 3 months ago
@@perli216 Ok cool, so basically we only get the advantage when memory access is contiguous, like i = tid + bid*bsize, and not like i = 2*(tid + bid*bsize)?
@perli216 3 months ago
@@KalkiCharcha-hd5un I don't understand your question
@KalkiCharcha-hd5un 2 months ago
@@perli216 "Even if my program reads data from all over the place" - I think I got it. Initially I read "all over the place" as any random / non-consecutive memory; it actually means different threads reading from the same page, since a single thread will bring in data from the same page anyway.
@LetoTheSecond0 2 months ago
Looks like the link in the description is broken/truncated?
@perli216 2 months ago
@@LetoTheSecond0 Yes, YouTube did this. It's just the original source for the video.