FPGAs are (not) Good at Deep Learning [Invited]

21,115 views

Crossroads 3D-FPGA Academic Research Center

1 day ago

Comments: 17
@eafindme · 7 months ago
Imagine you have 3 binary files, each representing an FPGA bitstream for a different DNN model, and one FPGA. Instead of making the hardware architecture universal enough to support all 3 DNN models, like a GPU or ASIC, you can optimize each DNN model for the FPGA via hardware/software co-design and reprogram the FPGA on the fly, so that each of the 3 models gets its own distinctive hardware optimization. Now the FPGA has ASIC-like specialization at a fraction of the silicon area and cost. This is where the fun begins.
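The idea above can be sketched in a few lines of Python. Note this is purely illustrative: `program_fpga`, the model names, and the `.bit` filenames are all hypothetical placeholders, standing in for a vendor-specific reconfiguration flow (e.g. a PYNQ Overlay or JTAG programmer):

```python
# Sketch: one specialized bitstream per DNN model, swapped at runtime.
# All names/paths here are made up for illustration.

BITSTREAMS = {
    "resnet50":  "resnet50_opt.bit",
    "mobilenet": "mobilenet_opt.bit",
    "bert":      "bert_opt.bit",
}

def program_fpga(bitstream_path: str) -> str:
    # Placeholder: a real flow would hand this file to the FPGA's
    # configuration interface instead of returning a string.
    return f"FPGA configured with {bitstream_path}"

def run_model(model_name: str) -> str:
    # Pick the bitstream co-designed for this specific model and
    # reconfigure before inference, instead of one generic design.
    return program_fpga(BITSTREAMS[model_name])
```

The point is that "switching models" becomes "switching bitstreams", so each model runs on hardware shaped exactly for it.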
@prat1024 · 1 year ago
The presentation was extraordinary!! I am a student at the University of Stuttgart and this post randomly came across my feed.
@enkidughom2508 · 7 months ago
Excellent!! Is there a technical report following this? Would love to dive into the details and try to reproduce some results.
@MrTweetyhack · 2 years ago
"If you can build it in an ASIC, it won't be competitive on an FPGA." So what can't be built in an ASIC? Actually, this has been known for a long, long time.
@gm7361 · 1 year ago
It means: if you have the resources and the budget.
@vicktorioalhakim3666 · 9 months ago
The problem is that ML engineering is a dynamic discipline: models change all the time and are constantly updated. So if one wants to map a model to hardware efficiently with respect to power usage, resource usage, throughput, latency, etc., then the hardware must also be flexible and dynamic.

If you design an ASIC-based accelerator, you kinda have to make it as general as possible to support various changes to the topology and parameters of the model. Because the architecture of this accelerator is fixed, you will often see underutilization (resource waste, higher power usage, etc.) or overutilization (lower throughput, higher latency, etc.). And if you have to tape out many ASICs for different types of models, this becomes costly quite quickly, and quite frankly a waste, since newer models will come along and quickly deprecate the design.

This is where the power of FPGAs comes in handy: you can customize your HW architecture on the fly so that it suits the given model best. The biggest difficulty is coming up with a good HW "compiler", so that you minimize the amount of manual labor involved in mapping a model to the HW, including the pre- and post-processing stages.
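The under-utilization point can be made concrete with a toy calculation (the dimensions below are illustrative, not from the talk): a fixed-size systolic array wastes processing elements whenever a layer's dimensions don't divide its tile size, whereas an FPGA design could be resized per model:

```python
import math

def pe_utilization(rows: int, cols: int, array_dim: int) -> float:
    """Fraction of processing elements doing useful work when a
    (rows x cols) matmul is tiled onto a fixed (array_dim x array_dim)
    systolic array: the operands get padded up to full tiles."""
    padded_r = math.ceil(rows / array_dim) * array_dim
    padded_c = math.ceil(cols / array_dim) * array_dim
    return (rows * cols) / (padded_r * padded_c)

# A fixed 32x32 ASIC-style array on a 40x40 layer: padded to 64x64,
# so utilization is 1600/4096, roughly 39%.
fixed = pe_utilization(40, 40, 32)

# An FPGA design re-sized to an 8x8 array for the same layer:
# 8 divides 40 evenly, so utilization is 100%.
flexible = pe_utilization(40, 40, 8)
```

The same arithmetic scaled across a whole network is one way to quantify the "resource waste" a fixed accelerator pays for its generality.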
@shaikon5617 · 2 years ago
Great presentation, thanks a lot for sharing. Is the Intel project publicly available?
@harishabibullah1286 · 2 years ago
Thanks for the talk, Mr. Abdelfattah. Is there any course/training to learn these stages of custom h/w kernel development for deep learning? I am also in a similar field, and my approach is simply to import the hardware from the synthesis tool, like Vitis HLS. I am interested in defining or tweaking some parameters to make a more customized hardware design.
@mabdelfattah88 · 1 year ago
My course on ML HW & SYS (www.youtube.com/@mabdelfattah88) could help give you an overview, but we don't really go deep into the hardware design part of it. I am preparing a new FPGA-focused course now which should cover the detailed design of HW accelerators - I hope to also post parts of it online. Stay tuned!
@vatsan2483 · 1 year ago
@@mabdelfattah88 Looking forward to this course! But based on the above presentation, a quick question, sir: on the topic of co-design for DNNs, you suggested that FPGA-X can achieve 100 imgs/s on ImageNet classification, whereas the DLA achieves 80 imgs/s on ResNet-50 - essentially a more generic design for a larger class of models rather than one specialized/tuned for a specific case. But isn't the underlying purpose of a DNN itself rather specific than broad? Tuning of parameters is, by nature, a function of the input data, isn't it?
@jacoblin0820 · 1 year ago
@@mabdelfattah88 Looking forward to the new course!
@aqf0786 · 8 months ago
If you knew the fundamental differences in area, speed, and power between an FPGA and an ASIC, why not just focus on the key architectural improvements and make an ASIC? Surely Intel would be able to do so?
@shashwatkhandelwal367 · 2 years ago
Loved the talk!👏 Some very cool ideas!
@chriswysocki8816 · 4 months ago
Did I hear that right, Mr. Presenter? You did this project while working at Intel, and you were not using Intel/Altera FPGAs but Xilinx. Why???? As a former Altera/Intel manager in the FPGA group, I feel disappointed :)
@rulekop · 1 year ago
Very interesting and clearly presented!
@BharatIndiaHindustan628 · 10 months ago
Hi Mohamed, I'm a beginner at AI and deep learning and have just started learning these things, in order to build some deep learning hardware applications/IPs for practice and hands-on experience. I'm really fascinated by what AI can do in the field of health monitoring and medical diagnostics. I'd be really grateful if you could provide your email ID; I would like to keep in touch with you for guidance and mentorship. Thanks