CUDA Programming on Python

Views: 1,179,132

Ahmad Bazzi

A day ago

Comments
@aishahoura2619 2 years ago
Thank you so much for responding to my request for a CUDA programming video. I have donated 0.1 BTC to your account as a way to thank you. My professor has spent so many hours trying to explain CUDA, and none of my classmates really understood. I just cannot believe that you do all this for free, and that is why my classmates and I have decided to collect some funds to donate to you. Thanks for all that you do and please keep going.
@AhmadBazzi 2 years ago
Thank you for the donation, it really means a lot!
@aishahoura2619 2 years ago
@@AhmadBazzi No, thank you!
@mdrubelahmed6434 2 years ago
Thank you so much for responding to my request for a CUDA programming video.
@btspower3844 2 years ago
Wow, amazing
@denizart2255 2 years ago
You just opened my eyes to parallel programming. Thanks for the quick overview.
@mdrubelahmed6434 2 years ago
Too hard to find high-quality content like this these days. Thank you so much.
@nilsu1941 2 years ago
12:36 This guy is a god!
@sportscorridor4064 2 years ago
Very nice
@mdrubelahmed6434 2 years ago
So beautiful
@aoungamingyt3160 2 years ago
Thank you so much. Probably the best introduction to CUDA with Python. The example you use, while very basic, touches on the usage of blocks, which is usually omitted in other introduction-level tutorials. Great stuff! Hope you return with some more videos. I have subscribed!
@danielsalazar6637 2 years ago
Excellent
@sksk-lo8kc 2 years ago
That was very well explained. I have only taken one course, and you made it clearer than my professor or fellow students ever did.
@Drex.Yt1 2 years ago
Too hard to find high-quality content like this these days. Thank you so much.
@leonelaguilera9059 2 years ago
This was such an excellent video.
@tatldunyas2471 2 years ago
Just did my research and this guy is at one of the most prestigious universities in the world! No wonder his lectures come out so neat!
@apogeetheboss9999 2 years ago
As a data scientist with 2+ years of experience, I ALWAYS learn something new from your content! Please, Nich, never stop doing these things, and never lose the smile on your face, even when you are hitting bugs!!
@pantherofficial5059 2 years ago
I have been looking into GPU programming using Numba and Python for a while, and this seems to be the best tutorial I have been able to find so far... Thank you.
@excolabirbuyuyecek9438 2 years ago
LOL. Loved the graphic at 6:23! Brought tears to my eyes.
@basslvers4501 2 years ago
I wanted to comment that the information in this presentation is very well structured and the flow is excellent.
@erenbasak7694 2 years ago
Hey, this is super useful! I elected the High Performance Computing and the Microprocessors and Embedded Systems modules for my degree, and this channel has become my go-to guide.
@muradhesenov5245 2 years ago
The essence of deep learning in a few lines of code... awesome
@RAVIShankar-bm4ou 2 years ago
Thank you so much for this series! It's so clear and easy to follow.
@adeeshaamabidu9616 2 years ago
Love the channel, Nicholas. I have recently graduated from an NLP Master's degree, and seeing you explain stuff in a simpler way, along with your coding challenges, is really helping me connect with the material I've learned! Keep it up and I'll keep watching!
@-oof1016 2 years ago
Ahmad, thanks for taking the time to create these videos. It is unfortunate that people view your videos and then feel inspired to complain about a free gift. Folks could just keep it moving or add helpful insights.
@todotasks7645 2 years ago
Too hard to find high-quality content like this these days. ⚡
@tajaochrisciamae4121 2 years ago
What a passionate tutorial! I wish you were my professor for my parallel programming course. Well done!
@LouieVon 2 years ago
This is the best introduction to CUDA I've seen, thanks a lot!
@youtubemullim319 2 years ago
I feel like CUDA has been demystified. Very glad I found your series.
@astaadxofficials7813 2 years ago
I have no idea what kind of videos I am watching... but I sure will learn.
@rithusvlogtime3342 2 years ago
Fantastic tutorials on CUDA. You deserve more followers.
@teamchanel4384 2 years ago
I'm doing an internship in a research lab and I'll have to program some kernels to implement BLAS primitives. This video really helps :)
@cavansirmahmudov217 2 years ago
You saved me. I had to read the PointNet2 implementation for my BCS thesis, and this made the job much easier!
@kelechijames5577 2 years ago
This is very helpful. Most people don't realize the overheads and code refactoring necessary to take advantage of GPUs. I am going to refactor a simple MNIST training program I have, which currently uses only NumPy, and see if I can get meaningful improvements in training time.
@fatihcalidkan2254 2 years ago
Holy shit, I was looking into this to speed up my Mandelbrot zooms, and they are what you use as an example! This is a dream come true!
@SiTacuissem 2 years ago
Interesting, but two remarks. Example 1: on my setup (3080 Ti, i7-8700K, running in WSL2 with Ubuntu 22.04), vector multiplication actually runs *faster* on the CPU if you either use the vectorized formulation in MultiplyMyVectors with target "cpu" or simply write a*b instead of the unnecessary for loop in the CPU code. IMO that is mostly due to the overhead of copying the data to GPU memory. Example 2: to get a fair comparison, you should also use the JIT for FillArrayWithouGPU, decorating it with @jit(target_backend="cpu"). Then GPU array filling is still faster, but only by a factor of 2.
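The point above about fair comparisons can be sketched as a small timing helper. This is a minimal sketch using only the standard library, not code from the video: `best_time`, `multiply_loop`, and `multiply_zip` are hypothetical names, and the two multiply functions are plain-Python stand-ins for whatever CPU and GPU versions are being compared. The warm-up call matters for JIT-compiled functions in particular, because their first call includes compilation time.

```python
import time


def best_time(fn, *args, repeats=5):
    """Return the fastest of `repeats` timed calls, after one untimed warm-up."""
    fn(*args)  # warm-up: keeps one-time costs (e.g. JIT compilation) out of the timing
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def multiply_loop(a, b):
    """Element-wise product with an explicit index loop (hypothetical baseline)."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * b[i]
    return out


def multiply_zip(a, b):
    """Same result written without manual indexing (hypothetical alternative)."""
    return [x * y for x, y in zip(a, b)]


# Both candidates receive exactly the same inputs.
a = [float(i) for i in range(100_000)]
b = [2.0] * len(a)

t_loop = best_time(multiply_loop, a, b)
t_zip = best_time(multiply_zip, a, b)
print(f"loop: {t_loop:.4f} s, zip: {t_zip:.4f} s")
```

To compare real CPU and GPU implementations, the two stand-in functions would be swapped for the actual ones; any host-to-device copy should stay inside the timed call if it is part of the real workload.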
@Kvmizo 2 years ago
This was by far one of the most enlightening videos you have put up on your channel. Thanks and keep up the good work!!
@mrfcbs1251 2 years ago
Oh Ahmad, your tutorials are incredible and inspiring...
@Fiekriekd 2 years ago
And that's what I call a great tutorial. Thank you, sir. I hope you make more tutorials.
@mustafasamet2783 2 years ago
Ayyyy, so glad you like it @Patrick. For the last two weeks I've just been making videos on stuff I find hard or want to get my head around. I figure it's not just me staring at some of these concepts like huh?!? Thanks for checking it out!!
@lixaxel6815 2 years ago
Excellent example of vector addition using a for loop and using CUDA.
@Animals-vi5wt 2 years ago
Woah congrats @Ally 🎊 🎉 glad you’re enjoying the challenges, plenty more to come!!
@notachannel2601 2 years ago
It is effectively a very easy approach to harness the power of CUDA in simple Python scripts.
@turkceraplyrics6701 2 years ago
Hey Ahmad, I love watching your videos because of the way you tell the story. Great graphics, mate. Love the reference to Rocket Man too... lol keep up the good work.
@merthanozer2964 2 years ago
Ohh, yes, thank you, and the documentation on the NVIDIA site about CUDA is very professionally written. Thank you.
@TheTurksxayers 2 years ago
Thank you for this great introduction to Numba and, more specifically, Numba + CUDA.
@mipxello7749 2 years ago
Very well explained. The best CUDA explanation I have come across until now 😊😊. Keep up the spirits, sir.👍👍
@yasincaferzade8069 2 years ago
Great video. I like this kind of video where you code some AI task against the clock; you teach us the concepts and show us the reality of implementing it 👏
@emircoltu875 2 years ago
Once you initialized lr to 0.0, I knew you were going to forget to change it lol. Love the challenges tho, keep doing them. I think it would be cool to see how you implement a neural network from scratch.
@peloizol8947 2 years ago
I like how you did the website for documenting the video notes for reference later.
@arifefe4264 2 years ago
KZbin questions are typically the best way to handle an issue or an issue request on GitHub on a project demo. I will do my best to answer when I have time.
@crystalannringor1025 2 years ago
Thank you so much for this video. It has helped me massively to prepare for my computer science exam.
@نٌے.نٌے.نٌےۦصہٰ̐كْٰٓاكہٰ̐ہٰ̐يي 2 years ago
OHHHH MANNN, I thought about doing that but I was debating whether I'd hit the 15 minute deadline already. Good suggestion @Julian!
@mjmlangenihd8706 2 years ago
Yes, you could do this by hand, which would be a great challenge in distributed computing. Another option is to use a framework/platform like AWS SageMaker to do distributed k-means. Most organizations will do this.
@uniquevlogsbyadil 2 years ago
This was oddly intense. Great job Nicholas! Even though you ran out of time, this video is still a win to me. 😉
@totallycz6819 2 years ago
You are a lifesaver @Spencer, will do it next time I'm on the streaming rig!
@prietjepruck 1 year ago
Thank you very much for this tutorial. I would love to have the code available, because typing it in myself from the video is a bit hard, especially with the autocomplete on all the time. Keep up the good work.
@beratcansamur1517 2 years ago
An insanely underrated series!!!
@pandamusic_tz 2 years ago
The knowledge of Ahmad knows no bounds.
@a.s.m.rashedchowdhury784 2 years ago
Opened my eyes to parallel programming.
@bombosbikanal3858 2 years ago
Perfect video! It was revealing for me to understand how it works. Thank you! I am a new subscriber of your channel. Regards from Buenos Aires, Argentina.
@hilalkoskli6266 2 years ago
Wow, it is really awesome! It is much better than a tutorial from university! Thanks!
@mendes.02 2 years ago
This is an academic example that shows the process of copying data to the GPU, doing a vectorized operation, then showing the results. What actually makes sense on the GPU vs. the CPU is something I didn't cover, and I am hoping others can figure out some cool ideas.
@tugrasolak7852 2 years ago
This is extremely helpful. You did an amazing job explaining the foundations.
@enescakmak6699 2 years ago
It works on both AMD and NVIDIA. If you have CUDA code, you can convert it to HIP with their automated tool; there is very little CUDA-specific code that can't simply be translated over.
@maher9422 6 months ago
Bless you, brother Ahmad. I have a simple request: please make the same course in Arabic. I know it won't get many views, but your brothers need you more than foreigners do. I can follow you, but there are others who love the field and would like to learn it in their own language. If you don't have the time, allow me to translate the video and explain it on my channel; a like from you would mean that you agree.
@sachinram3783 2 years ago
Sir, please make more detailed sessions on CUDA. Your explanation is great.
@likky2229 2 years ago
The video was very helpful for me. Many thanks to the author for developing his audience with interesting and useful content.
@bodyprodaction9718 2 years ago
Well, I just built a new rig with a 980 Ti and a 4790K, so I'm gonna put that to the test. Thank you for your wonderful explanation :D
@theatlantisreport1595 2 years ago
This reminds me a lot of the computer tutorial tapes from the 90s.
@yigitboran5565 2 years ago
That's mostly how it works. It's more like sorting the stones by their color and pattern and counting each variety. Using the CPU way, you would need to count each variety separately. If you have 100 different colors and patterns, that would take a long time to count (even if you could count extremely accurately and fast, similar to how the CPU makes up for its lack of parallelism). The GPU way lets many people count them. Given 100 people (like the GPU), each person would count one variety, all at the same time.
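The stone-counting analogy can be written down as a toy sketch. This is only an illustration of the structure, under the assumption that one worker handles one variety; `stones`, `varieties`, and `count_variety` are hypothetical names. Python threads are used here just to show how the work is split and should not be expected to give a GPU-like speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pile of stones, identified only by color.
stones = ["red", "blue", "red", "green", "blue", "red"]
varieties = sorted(set(stones))


def count_variety(variety):
    """Count how many stones belong to a single variety."""
    return sum(1 for stone in stones if stone == variety)


# "CPU way": one counter goes through the varieties one after another.
sequential = {v: count_variety(v) for v in varieties}

# "GPU way": one worker per variety, all counting at the same time.
with ThreadPoolExecutor(max_workers=len(varieties)) as pool:
    parallel = dict(zip(varieties, pool.map(count_variety, varieties)))

print(sequential)
print(parallel)
```

Both approaches produce the same counts; the difference is only in how many counters work at once.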
@vanshd6884 2 years ago
PS. I was really moved by your stock price episode. Thank you so, so, so much.
@imsoumyajitbag 2 years ago
Awesome video!! It's pretty cool to see such theoretical concepts coded and explained like this. Keep going, Nich!!
@yusifhsnov1802 2 years ago
So stoked you liked it 🙏
@cemiltuna 2 years ago
Thanks for the video. I found the first half and the wrap-up really excellent.
@hgmalani21 2 years ago
It's very informative and a good intro to CUDA programming. Thanks very much!
@gularif1 2 years ago
On the PC side, Matrox was the first company to introduce GPUs. This was followed by ATI. NVIDIA came onto the scene after the success of these two Canadian companies. Matrox's original 3D board was a 3-board set with custom ASICs. ATI was later acquired by AMD, not NVIDIA. So yes, NVIDIA was not the first, but they are the biggest in the space now. Matrox is still around but more involved in the industrial and niche markets.
@putraairdrop1260 2 years ago
It's a great programming video, sir. Hope the best for you.
@mehmetak4349 2 years ago
What makes the CPU better than the GPU is that each core is clocked at a faster speed and has many built-in instructions like SSE, allowing data to be processed faster. This provides a tremendous benefit to programs that only run on 1 core. In rendering, where multiple cores can be used, you would need the CPU to process pixels more than 5x faster to match the GPU's performance.
@pusher7051 2 years ago
Glad to see you take it as feedback and not as a hate comment.
@agusexclusife2577 2 years ago
Can't wait to see Juan's better tutorial that he's definitely going to release :') lmao. Great video, Ahmad.
@blackiselia8054 2 years ago
I need to say this: you are the gamechanger here!!
@weds8296 2 years ago
Thanks for making all these topics very approachable!
@ricardomilos5889 2 years ago
This was a great video to me. I have very limited C++ experience and was looking for an explanation of CUDA. Another video like this could easily have been 70-80% over my head. This one was only about 15% whoosh. And now I actually find C++ interesting again!
@denizugurbiltekin622 2 years ago
Thank you so very much. This is the exact kind of material I was looking for on this very specific subject. Kudos.
@keremkipri9436 2 years ago
Thanks for the video, subscribed! A suggestion: this small change to your code would demonstrate a real-world gradient descent solution for linear regression with noisy data. E.g.:
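The code that followed this suggestion is not preserved on the page. As a rough illustration of the idea only, and not the commenter's change or the video's code, here is a minimal sketch of gradient descent fitting a line to noisy data. The names `TRUE_W`, `TRUE_B`, `lr`, the noise level, and the step count are all assumptions chosen for the example.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical ground truth: y = 2x + 1, observed with Gaussian noise.
TRUE_W, TRUE_B = 2.0, 1.0
n = 100
xs = [i / (n - 1) for i in range(n)]
ys = [TRUE_W * x + TRUE_B + random.gauss(0.0, 0.1) for x in xs]

w, b = 0.0, 0.0  # initial guess for slope and intercept
lr = 0.1         # learning rate; it must be non-zero or nothing is learned

for _ in range(2000):
    # Prediction error for every sample under the current w and b.
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)
    # Step against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(f"w = {w:.3f}, b = {b:.3f}")
```

Because of the noise, the fitted values land near the true slope and intercept rather than exactly on them.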
@aysezuhre8606 2 years ago
Ahmad, sorry for bothering you; the problem was that I had not installed the CUDA Toolkit. Seriously, I hate people who don't watch the full video closely and then ask stupid questions... and now I'm one of them :D. Thanks a lot for this tutorial. In 2 months I will try to write my own GPU operator for my program; it would be interesting to see if it is faster than the CPU. (Btw, I'm using plain VS Code with a Python 3.10 env on Windows 11, so far so good, although I have some code output delay problem when using OpenCV for some strange reason.)
@aliyensagaltc3121 2 years ago
You are bloody watching a master at work xD
@TheGameboyTheDream123 2 years ago
@nvidia I personally think the way you did the demonstration was perfectly sufficient. IMO, fancy graphics are unnecessary. Good job.
@coolboy_0459 2 years ago
I needed this!!! Thanks a lot, sir!!!!
@besttwitcher4569 2 years ago
YESSSS, right?! Glad you liked it Miguel!
@Voicemelod 2 years ago
Amazing! I'm learning so much watching you code. Thank you for sharing.
@Muhannad_ALAZZO 2 years ago
Would love to see a video on a few CUDA programming challenges.
@kolaybreaworlds3178 2 years ago
Also, the CT5 simulator from 1981 may not count as being from the '70s or '60s, but from what I understand, the CT5 was capable of realtime, rasterized, 3D polygonal rendering and cost $20 million at the time. It used Gouraud shading, if memory serves. There were several other CT (continuous tone) simulators developed by E&S in the '70s that did something similar or were of much lower capability than the CT5 of '81. There were also the Digistar planetariums that date back to the early '80s, and the Picture System goes back to at least the early '80s. Might be vector or raster, not entirely sure myself, though.
@MTHHC 2 years ago
Hey, thanks for the explanation! Very well done 👍 I am downloading CUDA 💪
@mehmetplgx8025 2 years ago
Love your videos. Please don't stop!
@vipyt8550 2 years ago
It can be found in O(1). As far as I remember, the formula is derived using LDU decomposition or by diagonalising a matrix, for matrix exponentiation.
@Cardexs 2 years ago
This is really helpful for my computing. Thank you.
@yapmabekanka7151 2 years ago
This is amazing! Thank you for taking the effort to make it!
@floricicalaurentiumarian167 2 years ago
Best CUDA tutorial ever... can you please make a tutorial on OpenCL and MPI?
@bekirok4503 2 years ago
I tried to follow this on my Windows 10 machine. The function you call at 7:16 is still create_fractal() and not mandel_kernel(), so I don't see why it is faster. When I changed it to mandel_kernel(), it complained that I had to provide a launch configuration, telling the GPU how many blocks and threads per block to create. I added it like so (first properly setting a grid and block variable): mandel_kernel[grid, block](-2.0, 1.0, -1.0, 1.0, image, 20). It then worked and really was nearly 100x faster than the jit version.
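The comment does not say how `grid` and `block` were set. One common way to choose them is to fix the threads per block and round the number of blocks up so that every pixel is covered. The sketch below is an assumption, not the commenter's code: `launch_config` is a hypothetical helper, and the 16x16 block size and 1024x1536 image shape are example values.

```python
import math


def launch_config(image_shape, threads_per_block=(16, 16)):
    """Return (blocks_per_grid, threads_per_block) covering every pixel.

    image_shape is (height, width); x is mapped to width and y to height,
    which is an assumption about how the kernel indexes the image.
    """
    height, width = image_shape
    tpb_x, tpb_y = threads_per_block
    # Round up so a partially filled block still covers the image edge.
    blocks_x = math.ceil(width / tpb_x)
    blocks_y = math.ceil(height / tpb_y)
    return (blocks_x, blocks_y), threads_per_block


grid, block = launch_config((1024, 1536))
print(grid, block)

# With Numba, the launch from the comment above would then read:
#   mandel_kernel[grid, block](-2.0, 1.0, -1.0, 1.0, image, 20)
```

Rounding up means some threads may fall outside the image, so a kernel launched this way needs a bounds check. A kernel that strides over the image with its own loop can work with a smaller grid.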
@aliarda9719 2 years ago
Thanks a million @Lakshman!! I try to keep it pretty tight so it’s a good challenge, otherwise I know I’ll just talk for 22 minutes anyway 😅
@alidoruk5588 2 years ago
This guy is so underrated.
@wintergacha5370 2 years ago
Very nice tutorial. I really liked it. It's brief, to the point and very clear. Thanks. Could you please make a video for the same example but on Linux?
@herseydenbirazck2051 2 years ago
HEYYYYY PHIL!! Long time no see, thanks a mil!!
@halilkuser8455 2 years ago
Sorry for bothering you; the problem was that I had not installed the CUDA Toolkit. Seriously, I hate people who don't watch the full video closely and then ask stupid questions... and now I'm one of them :D. Thanks a lot for this tutorial. In 2 months I will try to write my own GPU operator for my program; it would be interesting to see if it is faster than the CPU. (Btw, I'm using plain VS Code with a Python 3.10 env on Windows 11, so far so good, although I have some code output delay problem when using OpenCV for some strange reason.)
@dragnn1430 2 years ago
Dear Ahmad, you are only 30 years old and doing a postdoc? I'm sorry, but this sounds very underrated to me. Postdocs are not always well compensated for their work but spend a lot of time working and doing research. If I were you, I'd invest more time in my YouTube channel rather than doing something that does not compensate well.
@saplamacay1579 2 years ago
Ahmad, great video. You have a great way of explaining things and you help a lot of people. IMO a lot of the criticism you get, such as here, is unfounded. By the way, I do not see any video postings by Juan??? I am trying to get my KZbin channel started and hope that in 10 years' time I will be 1/10th as good as Ahmad 👍.