FPGAs are (not) Good at Deep Learning [Invited]

Views: 22,520

Crossroads 3D-FPGA Academic Research Center

Comments: 21
@prat1024 2 years ago
The presentation was extraordinary! I am a student at the University of Stuttgart as well, and this post randomly came across my feed.
@مقاطعمترجمة-ش8ث 2 months ago
same lol
@eafindme 10 months ago
Imagine that you have 3 binary files, each representing an FPGA binary for a different DNN model. Then you have an FPGA. Instead of making the hardware architecture universal so that it supports all 3 DNN models, as a GPU or ASIC would, you could optimize each DNN model for the FPGA via software co-design and reprogram the FPGA on the fly, so that each of the 3 DNN models gets its own distinct hardware optimization. Now the FPGA has the same flex as an ASIC yet costs way less space and money. This is where the fun begins.
@wangshuoleon4400 2 months ago
But I feel that developing such algorithms on an FPGA costs much more time than on a GPU.
@مقاطعمترجمة-ش8ث 2 months ago
@wangshuoleon4400 That's the dilemma: FPGAs save you the ASIC cost and are more flexible, but GPUs are faster to develop for.
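The "reprogram on the fly" idea in this thread has a cost the comments do not quantify: each swap to a model-specific design takes time during which no inference runs. The sketch below is only an illustration of how that overhead is amortized over a batch. The function names and every number in it (throughputs, reconfiguration time) are made-up assumptions, not figures from the talk.

```python
def effective_throughput(images, imgs_per_s, reconfig_s):
    """Images per second once the one-off reconfiguration time is included."""
    return images / (reconfig_s + images / imgs_per_s)


def break_even_images(specialized_ips, generic_ips, reconfig_s):
    """Batch size above which reprogramming to a specialized design
    beats staying on a slower generic design (no reconfiguration)."""
    if specialized_ips <= generic_ips:
        raise ValueError("specialized design must be faster than the generic one")
    return generic_ips * reconfig_s * specialized_ips / (specialized_ips - generic_ips)


if __name__ == "__main__":
    # Hypothetical numbers chosen purely for illustration.
    specialized, generic, reconfig = 100.0, 80.0, 0.5
    n = break_even_images(specialized, generic, reconfig)
    print(f"break-even batch: {n:.0f} images")
    for batch in (50, 200, 5000):
        rate = effective_throughput(batch, specialized, reconfig)
        print(f"{batch:>5} images -> {rate:.1f} imgs/s with reconfiguration")
```

Under these assumed numbers, swapping bitstreams only pays off when enough work is queued for each model. Development time, the other half of the dilemma raised above, is not modeled here.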
@MrTweetyhack 2 years ago
"If you can build it in ASIC, it won't be competitive on an FPGA." So what can't be built in an ASIC? Actually, this has been known for a long, long time.
@gm7361 1 year ago
It means if you have the resources and the budget.
@vicktorioalhakim3666 1 year ago
The problem is that ML engineering is a dynamic discipline: models change and are updated all the time. So, if you want to map your model to hardware efficiently with respect to power usage, resource usage, throughput, latency, etc., then the hardware must also be flexible and dynamic. If you design an ASIC-based accelerator, you kinda have to make it as general as possible to support various changes to the topology and parameters of the model. Because the architecture of this accelerator is fixed, you will often have underutilization (resource waste, higher power usage, etc.) or overutilization (lower throughput, higher latency, etc.). Now, if you have to tape out many ASICs for different types of models, this becomes costly quite quickly, and is quite frankly a waste, since newer models will come up and quickly make the design obsolete. This is where FPGAs can come in handy: you can customize your HW architecture on the fly so that it suits the given model best. The biggest difficulty is coming up with a good HW "compiler" that minimizes the manual labor involved in mapping a model to the HW, including the pre- and post-processing stages.
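The underutilization point above can be made concrete with a toy model. This is a minimal sketch, assuming a hypothetical accelerator built as a rows x cols grid of multiply-accumulate units onto which a layer's input and output channels are tiled. The layer shapes and array sizes are invented for illustration and do not describe any real device.

```python
import math


def pe_utilization(in_ch, out_ch, rows, cols):
    """Fraction of processing-element slots doing useful work when an
    in_ch x out_ch layer is tiled onto a rows x cols array."""
    tiles = math.ceil(in_ch / rows) * math.ceil(out_ch / cols)
    return (in_ch * out_ch) / (tiles * rows * cols)


if __name__ == "__main__":
    # Hypothetical layer shapes: (input channels, output channels).
    layers = {"first_conv": (3, 64), "mid_conv": (96, 96), "wide_conv": (512, 512)}
    fixed_array = (64, 64)  # one fixed, "general" array for every layer
    for name, (cin, cout) in layers.items():
        util = pe_utilization(cin, cout, *fixed_array)
        print(f"{name}: {util:.1%} of the fixed array is busy")
    # A reconfigurable fabric could instead size the array per model,
    # e.g. 32 x 32 fits the 96-channel layer with no padding.
    print(f"mid_conv on 32x32: {pe_utilization(96, 96, 32, 32):.1%}")
```

In this toy model, the fixed array is fully used only when the layer dimensions happen to be multiples of the array dimensions. That padding waste is the kind a per-model design could avoid.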
@shashwatkhandelwal367 2 years ago
Loved the talk!👏 Some very cool ideas!
@shaikon5617 2 years ago
Great presentation. Thanks a lot for sharing. Is the Intel project publicly available?
@enkidughom2508 10 months ago
Excellent! Is there a technical report following this? I would love to dive into the details and try to reproduce some results.
@aqf0786 11 months ago
If you knew the fundamental difference in area, speed and power of an FPGA vs ASIC, why not just focus on the key architectural improvements and make an ASIC? Surely, Intel would be able to do so?
@harishabibullah1286 2 years ago
Thanks for the talk, Mr. Abdelfattah. Is there any course or training to learn these stages of custom hardware kernel development for deep learning? I am in a similar field, and my approach is simply to import the hardware from a synthesis tool such as Vitis HLS. I am interested in defining or tweaking some parameters to make more customized hardware.
@mabdelfattah88 1 year ago
My course on ML HW & SYS (www.youtube.com/@mabdelfattah88) could help give you an overview but we don't really go deep into the hardware design part of it. I am preparing a new FPGA-focused course now which should cover the detailed design of HW accelerators - I hope to also post parts of it online. Stay tuned!
@vatsan2483 1 year ago
@mabdelfattah88 Looking forward to this course. A quick question based on the presentation, sir: on the topic of co-design for DNNs, you suggested that FPGA-X can achieve 100 imgs/s for ImageNet classification, whereas DLA can achieve 80 imgs/s for ResNet-50, i.e. a more generic model for a larger class rather than one specialised/tuned for a specific test case. But isn't the underlying purpose of a DNN itself specific rather than broad? Isn't the tuning of parameters by nature dependent on the input data?
@jacoblin0820 1 year ago
@mabdelfattah88 Looking forward to the new course!
@rulekop 1 year ago
Very interesting and clearly presented!
@chriswysocki8816 7 months ago
Did I hear that right, Mr. Presenter? You did this project while working at Intel, and you were using Xilinx FPGAs rather than Intel/Altera ones? Why? As a former Altera/Intel manager in the FPGA group, I feel disappointed :)
@MrMaguuuuuuuuu 2 months ago
Oops. Altera can’t give away their parts 😂
@BharatIndiaHindustan628 1 year ago
Hi Mohamed, I'm a beginner in AI and deep learning and have just started learning these things, in order to build some deep learning hardware applications/IPs for hands-on practice. I'm really fascinated by what AI can do in the field of health monitoring and medical diagnostics. I'd be really grateful if you could share your email address. I would like to keep in touch with you for guidance and mentorship. Thanks