How CUDA Programming Works | GTC 2022

28,434 views

Dan the Man

1 day ago

Comments: 26
@sami9323
@sami9323 7 months ago
This is one of the clearest and most lucid presentations I have seen, on any topic.
@zoriiginalx7544
@zoriiginalx7544 1 month ago
The illustration of how RAM works at the physical level was really illuminating. It drove home why linear accesses are important and why cache lines are the way they are. Fantastic talk.
@dennisrkb
@dennisrkb 2 years ago
Great presentation on GPU architecture, performance tradeoffs and considerations.
@ypwangreg
@ypwangreg 1 year ago
I was always puzzled and fascinated by how those grids/blocks/threads work in parallel on the GPU, and this video explains it all in one go. Very impressive and helpful!
@SrikarDurgi
@SrikarDurgi 5 months ago
Dan is definitely the MAN. Great talk!
@holeo196
@holeo196 2 years ago
Another great presentation by Stephen Jones, fascinating
@KingDestrukto
@KingDestrukto 3 months ago
Fantastic presentation, wow!
@hadiorabi692
@hadiorabi692 10 months ago
Man this is amazing
@KalkiCharcha-hd5un
@KalkiCharcha-hd5un 4 months ago
@21:17 "It's exactly the right amount of data to hit the peak bandwidth of my memory system. Even if my program reads data from all over the place, each read is exactly ONE page of my memory." I didn't understand this statement at 21:17. "Even if my program reads data from all over the place" — does it mean even if the data is read from non-consecutive memory?
@perli216
@perli216 4 months ago
yes
@perli216
@perli216 4 months ago
You basically get the benefits of reading contiguous memory for free, even when doing random reads.
@KalkiCharcha-hd5un
@KalkiCharcha-hd5un 4 months ago
@@perli216 Ok cool, so basically we only get the advantage when memory access is contiguous, i.e. with i = tid + bid*bsize, and not with i = 2*(tid + bid*bsize)?
@perli216
@perli216 4 months ago
@@KalkiCharcha-hd5un I don't understand your question
@KalkiCharcha-hd5un
@KalkiCharcha-hd5un 4 months ago
@@perli216 "Even if my program reads data from all over the place" — I think I got it. Initially I read "all over the place" as any random / non-consecutive memory, but it means different threads reading from the same page, since a single thread will bring in data from the same page anyway.
@miramar-103
@miramar-103 2 days ago
superb!
@chamidou2023
@chamidou2023 7 months ago
Great presentation!
@purandharb
@purandharb 1 year ago
Thanks for the detailed explanation. Really enjoyed it.
@kimoohuang
@kimoohuang 4 months ago
Great presentation! It is mentioned that 4 warps x 256 bytes per warp = 1024 bytes, which equals the memory page size of 1024 bytes. But that only happens when the 4 warps are running adjacent threads. Are the 4 warps always running adjacent threads?
@perli216
@perli216 4 months ago
@@kimoohuang Not necessarily. It depends on the warp scheduler.
@mugglepower
@mugglepower 11 months ago
oh man, I wish my mum had fitted me with a better brain processing unit so I could understand this
@LetoTheSecond0
@LetoTheSecond0 4 months ago
Looks like the link in the description is broken/truncated?
@perli216
@perli216 4 months ago
@@LetoTheSecond0 Yes, YouTube did this. It's just the original source for the video.
@openroomxyz
@openroomxyz 2 years ago
Interesting!