Onur Mutlu Lectures

This channel contains videos and slides from courses taught and lectures delivered by Professor Onur Mutlu (people.inf.ethz.ch/omutlu/).

Onur Mutlu is a Professor of Computer Science at ETH Zurich. He is also a faculty member at Carnegie Mellon University, where he previously held the William D. and Nancy W. Strecker Early Career Professorship. His current broader research interests are in computer architecture, systems, and bioinformatics. A variety of techniques he and his group have invented over the years have influenced industry and have been employed in commercial microprocessors and memory/storage systems. His industrial experience spans starting the Computer Architecture Group at Microsoft Research (2006-2009), and various product and research positions at Intel Corporation, Advanced Micro Devices, VMware, and Google. He obtained his PhD and MS in ECE from the Univ. of Texas at Austin and BS degrees in Computer Engineering and Psychology from the Univ. of Michigan.

P&S Genomics - Lecture 12b: RawAlign (Spring 2024)

28:20

28 days ago

P&S Genomics - Lecture 12a: Introduction to Real-Time Raw Nanopore Signal Analysis: RawHash (S 2024)

38:22

28 days ago

The Story of RowHammer - Keynote Talk at NTUA 6th Computing Systems Research Day - 10.01.2023

1:35:36

28 days ago

Memory-Centric Computing - Talk by Prof. Mutlu at Stanford University SystemX Seminar - 04.04.2024

1:01:35

1 month ago

Memory-Centric Computing: Distinguished Lecture at Arizona State University ACME Center - 28.03.2024

1:09:15

1 month ago

RowHammer, RowPress & Beyond: Can We Be Free of Bitflips (Soon)? - Talk at UC Berkeley - 28.02.2024

56:20

1 month ago

RowHammer, RowPress & Beyond: Can We Be Free of Bitflips (Soon)? - Talk at NCSU (Extended Version)

1:27:32

1 month ago

Memory-Centric Computing: Duke University & ATHENA AI Institute Distinguished Lecture, 15.02.2024

1:04:19

1 month ago

Future Computing Platforms - Talk at Stanford University SystemX Seminar - 08.02.2024

1:03:06

1 month ago

Memory-Centric Computing - Talk at UT-Austin ECE Colloquium - 31.01.2024

1:04:05

1 month ago

ApHMM: Accelerating Profile Hidden Markov Models for Fast & Energy-Efficient Genome Analysis - HiPEAC

34:32

1 month ago

RowHammer, RowPress & Beyond - Invited Talk at NJIT - Prof. Onur Mutlu - 18.12.2023

1:03:32

1 month ago

Modern SSDs - Lecture 5: GenStore: In-Storage Processing System for Genome Analysis (Spring 2024)

43:50

1 month ago

P&S Genomics - Lecture 11b: AirLift (Spring 2024)

25:59

1 month ago

Memory-Centric Computing - Talk at AI Summit, School of AI Algeria - 14.12.2023

2:34:57

1 month ago

P&S Genomics - Lecture 11a: BLEND (Spring 2024)

56:07

1 month ago

Security of PIM Systems: Invited Talk at Dagstuhl MAD Seminar - 30.11.2023

14:10

1 month ago

RowHammer, RowPress & Beyond: Invited Talk at Google Zurich Hardware Security Summit - 15.11.2023

48:19

1 month ago

RowHammer, RowPress & Beyond: Can We Be Free of Bitflips (Soon)? - Stanford University - 26.10.2023

1:09:37

1 month ago

Memory-Centric Computing - Keynote Talk at DOT-PIM Workshop @ ESWEEK - Prof. Onur Mutlu - 21.09.2023

45:42

1 month ago

Storage-Centric Computing for Modern Data-Intensive Workloads: NCIS Keynote Speech - 16.09.2023

31:46

1 month ago

Memory-Centric Computing - Talk at Hunan University on 17.08.2023 - Prof. Onur Mutlu

57:15

1 month ago

Memory-Centric Computing - Lightning Talk @ DAC - 13.07.2023

11:46

1 month ago

P&S Genomics - Lecture 10b: TargetCall (Spring 2024)

21:38

1 month ago

Accelerating Genome Analysis - Talk at Univ. of Montenegro Faculty of Medicine - 8.6.2023

2:01:28

1 month ago

P&S Genomics - Lecture 10a: GenPIP (Spring 2024)

27:27

1 month ago

TCuARCH meets Prof. Onur Mutlu - 2023

36:28

1 month ago

Storage-Centric Computing - Keynote Talk at Huawei Data Storage Architecture Innovation Forum

30:22

1 month ago

Modern Solid-State Drives (SSDs) - Lecture 4: Introduction to MQSim (Spring 2024)

52:21

1 month ago

Comments

@linerider195 2 days ago

The slide mentions "How to use 193nm light to pattern 20nm structures", but isn't EUV around 13nm wavelength?

@yurigansmith 3 days ago

I've just found these amazing lectures, and first want to thank Prof. Mutlu for providing them in the first place. But I have one question: There are other playlists of these lectures on the channel, but from former semesters, and the one from (for example) 2023 has 41 videos, whereas this one has only 26. Why is that so? Is this playlist actually complete?

@bharadwaj767 7 days ago

The ISA is the interface between the hardware microarchitecture and the software system (assembly programming)... Complex hardware == specific hardware to perform special functions (say, matrix multiplication). [Could we call these accelerators? I mean hardware designed specifically to perform a special task or group of special tasks.]

@bharadwaj767 7 days ago

Von-Neumann.. Excited!

@bharadwaj767 8 days ago

Think critically! If you don't, who would?

@pranavdani3810 9 days ago

Thank you so much for your lectures. This is a very good resource for understanding the basics of GPU architecture and programming models.

@pulkitjain25 9 days ago

Where is Onur?

@zjoy5723 10 days ago

👍

@wolpumba4099 12 days ago

Talk Summary: This talk presents a novel paradigm shift in genome analysis, moving away from traditional basecalling towards direct interpretation of raw nanopore signals. This approach, powered by the RawHash and RawHash2 algorithms, promises unprecedented speed, efficiency, and scalability, especially for complex and large genomes.

Motivation:
* [0:00:00] The Need for Real-Time Genomics: Basecalling, while accurate, introduces a significant computational bottleneck, hindering real-time analysis. Direct raw signal analysis offers a faster and more efficient alternative.
* [0:09:00] Harnessing Nanopore's Power: Nanopore sequencing provides unique advantages like long reads, portability, and real-time signal generation, which RawHash and RawHash2 leverage for their breakthroughs.

Tackling Real-Time Analysis Challenges:
* Speed: Algorithms must match the rapid data generation rate of nanopore sequencing.
* Decision Making: Rapid and accurate analysis is crucial for real-time decision-making (e.g., selective sequencing).
* Accuracy: Maintaining high accuracy despite signal variations is essential for reliable results.
* Power Efficiency: Crucial for portable sequencing applications.

The RawHash and RawHash2 Breakthrough:
* [0:14:50] Understanding Raw Signals:
  - k-mers: Short DNA sequences (6-9 nucleotides) that create distinct electrical signals as they pass through the nanopore.
  - Events: The distinct electrical signals corresponding to individual k-mers.
  - Signal Variation: Identical k-mers can produce slightly different signals due to noise and other factors.
* [0:16:00] Hash-Based Search for Efficiency: RawHash introduces a novel approach using hash functions to rapidly map raw signals to a reference genome, significantly boosting speed and scalability for larger genomes.
* [0:20:00] The Art of Signal Quantization and Hashing:
  - Quantization: Grouping similar signal values to minimize the impact of noise and variations, improving mapping accuracy.
  - Hash Value Construction: Combining multiple quantized event values to create longer, more unique hash values that represent short k-mer sequences, balancing sensitivity and specificity.
* [0:27:42] Sequence-Scale Analysis for Efficiency: RawHash dynamically determines when sufficient data has been acquired for accurate analysis, allowing for early termination of sequencing runs and significant cost reduction (up to 15x).
* [0:30:41] RawHash2: Refining for Speed and Accuracy:
  - Improved quantization techniques for greater accuracy.
  - A more sensitive chaining algorithm to better connect matching segments.
  - Machine learning-based decision-making for robustly determining when to stop mapping.
  - Integration of filters and minimizer sketching for enhanced efficiency.

Impactful Outcomes:
* [0:32:00] Unmatched Speed: RawHash2 surpasses existing raw signal analysis tools with up to 27x faster throughput.
* [0:34:00] High Accuracy Maintained: RawHash2 consistently achieves the highest accuracy in raw signal mapping while offering substantial speed improvements.
* [0:37:00] Cost-Effective Genomics: Sequence-scale analysis, combined with RawHash2, leads to significant reductions in sequencing time and costs without compromising accuracy.
* [0:39:00] Real-Time Assembly Potential: Early results show promise for performing genome assembly in real-time, directly from raw signal overlaps.

The Road Ahead: Exciting Future Directions
* [1:01:00] Conquering Repetitive Regions: Addressing challenges in accurately analyzing repetitive genomic sequences for improved accuracy in complex genomes.
* [0:39:00] Expanding Real-Time Applications: Developing new algorithms for downstream genomic analyses (variant calling, structural variation detection) directly on raw signals, unlocking a new era of real-time insights.
* [0:39:00] Synergy with Basecalling: Exploring the integration of raw signal analysis with basecalling to leverage their respective strengths and create even more powerful analysis pipelines.
* [0:42:00] Hardware Acceleration: Harnessing the power of specialized hardware like FPGAs, GPUs, and in-memory computing to further accelerate RawHash and unlock its full potential for real-time applications.

Conclusion: This talk heralds a paradigm shift in genomics by introducing real-time analysis of raw nanopore signals. RawHash and RawHash2, with their speed, accuracy, and efficiency, hold the potential to transform fields like personalized medicine, pathogen detection, and evolutionary biology. This is just the beginning of a new era of real-time genomic exploration!

I used Gemini 1.5 Pro to summarize the transcript.
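
The quantization and hash-construction steps described in the summary above can be illustrated with a small sketch. This is not the tool's actual implementation: the value range, the 4-bit bucket count, the window of three events, and the function names `quantize` and `hash_windows` are all assumptions chosen only to show the idea that nearby signal values collapse into the same bucket and that several buckets are packed into one lookup key.

```python
from typing import List


def quantize(events: List[float], lo: float = -3.0, hi: float = 3.0,
             bits: int = 4) -> List[int]:
    """Map each (assumed normalized) event value to one of 2**bits buckets."""
    levels = 1 << bits
    width = (hi - lo) / levels
    buckets = []
    for value in events:
        clamped = min(max(value, lo), hi)      # keep outliers inside the range
        bucket = int((clamped - lo) / width)
        buckets.append(min(bucket, levels - 1))  # value == hi lands in the top bucket
    return buckets


def hash_windows(quantized: List[int], n: int = 3, bits: int = 4) -> List[int]:
    """Pack every run of n consecutive quantized events into one integer key."""
    keys = []
    for i in range(len(quantized) - n + 1):
        key = 0
        for bucket in quantized[i:i + n]:
            key = (key << bits) | bucket
        keys.append(key)
    return keys


# Two hypothetical noisy measurements of the same underlying sequence.
read_a = [0.10, -1.20, 2.05, 0.50]
read_b = [0.15, -1.25, 2.10, 0.55]

# After quantization the small differences disappear, so both reads
# produce identical keys that could be looked up in a hash table.
same_keys = hash_windows(quantize(read_a)) == hash_windows(quantize(read_b))
```

Coarser buckets tolerate more noise but make unrelated signals collide more often, which is the sensitivity versus specificity trade-off the summary mentions.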

@brunosobreirafranca6720 12 days ago

Amazing amazing

@caalher 12 days ago

So, just wondering, how do CPU caches come into play here? Here we are seeing that the memory's internal row buffer "caches" a single row to serve requests over and over as long as they are to the same row. But won't the first request that misses in the CPU caches bring this line into the CPU caches anyway? Anyways, great lecture.

@uis246 14 days ago

His brain is so big that you can use his head as a gravitational lens.

@Tapajara 15 days ago

A RISC processor requires more memory accesses to accomplish the same things as a well-designed CISC processor. So our infatuation with RISC is partly to blame for the waste you talk about. Having a more concentrated instruction set is the equivalent of keeping your most frequently used subroutines close to the CPU. That is what CISC is all about.

@saiiiiiii1 15 days ago

Oftentimes I hear people say garbage collection is so good nowadays that you don't even need TRIM. For example in a RAID scenario, it's most of the time not available anyways. What I'm asking myself is how would GC ever know which pages are invalid without the OS telling it. It never knows until the OS tells it to overwrite it? So where does TRIM even come into play here? What's the difference between invalid and TRIM marking the page as free?

@dghtucs 16 days ago

Thank you for sharing your knowledge selflessly

@matthewvenn 17 days ago

Surprising, shocking, inspirational - loved the keynote, happy I saw it live!

@Jegan_Eng 17 days ago

Thank you 🥺

@Tapajara 22 days ago

"Economics is Not Important!" Okay. I have to tell you that you lost your house and you have to live in the gutter without food. So I now know that is okay with you.

@Tapajara 22 days ago

Top Down Design with Bottom Up Implementation. But yes, learning should be Bottom Up too.

@Tapajara 22 days ago

What you are calling "Research" I call "Development". They are the two different parts of R&D. Scholarship is yet another thing that has more to do with fame than the other two.

@robertocarballo8006 23 days ago

Thank you Professor for sharing Yale's talk!

@N0RT0X 23 days ago

Professor Onur, you are an incredibly based man

@user-yw6vb5zx7e 1 month ago

The glitch came back near the end of the lecture

@user-yw6vb5zx7e 1 month ago

The error with the slides was minimal and only listed items that the speaker was talking about. They realized it a few minutes into the second speaker's lecture and fixed it

@user-yw6vb5zx7e 1 month ago

Second speaker's slides are glitching. Can't follow very easily

@tarkcetin3389 1 month ago

This lesson was very informative for me, as someone interested in long-read sequencing and ONT. Thank you very much, dear professor.

@YA_LUNNAYA_PONI 1 month ago

A slightly unrelated question but still, I'm not sure that the "Large Semantic Gap" is necessarily bad. Like the IA-64 showed that writing a proper optimizing compiler for an ISA like that can be very tricky. And maybe that's because the IR is usually RISC-ish, or maybe because major compilers target multiple architectures, but in the end it's almost like all that ISA's complexity is deemed an "overkill" and either ignored or underutilized?

@YA_LUNNAYA_PONI 1 month ago

And like the x86 ISA gives vibes of an ISA that was designed to be written by actual programmers (hence the multiple complex instructions that can like read the memory, or the easy aka one inst immediate loads), but compilers don't really care about all that stuff. Like yeah the binary size is a thing, but it's not what modern compilers optimize for, and having more granularity (smaller, simpler instructions) even sounds like a good thing.

@DAYSHIVANSH 1 month ago

What is this video about?

@Gr8ness8999 1 month ago

In the Logic Multiplexers (III) slide, shouldn't it be Y = AB*C + B*C* + A*BC? (* means complement, as I can't type it on the keyboard.)

@YA_LUNNAYA_PONI 1 month ago

29:15 but the data propagation time here is worse than in the second design. So it's not like, unequivocally better.

@YA_LUNNAYA_PONI 1 month ago

Thinking about it a bit more I guess it is better because we care about the worst case scenario (which is mentioned further in the lecture)

@YA_LUNNAYA_PONI 1 month ago

28:10 I'm not exactly clear on why the second implementation omitted the 4-input OR gate. Like if you also omit that OR gate in first design the control signal propagation will be faster (and it's not like two AND gates can be enabled at the same time can they?)

@DanielCardenas1 1 month ago

Looks like the wrong video was uploaded for the title. Please upload the correct video.

@nadigrakesh 1 month ago

Thank you for your comment. We have changed the title to better describe the lecture.

@salvatorelionetti1274 1 month ago

"As people dream, much application will come". Very nice POV.

@user-pk5xu1ir5w 1 month ago

Hi, a question: at timestamp 1:43:11, how can 8 SIMD units execute a 32-thread warp?

@jimgolab536 1 month ago

Do the victim bits always fail with false zeroes? I.e., is the victim cap always falsely discharged, and never charged?

@jimgolab536 1 month ago

Is the victim bit cell charge loss dominated by the sub-threshold leakage of word line pass transistor? A small rise in the gate voltage of the victim pass transistor (due to capacitive coupling to the rising aggressor word line) would have an exponential effect on the current through it and thus the rate of discharge of the victim bit.

@user-wu6lt3wm7t 1 month ago

How would this type of distributed processing model handle communication between different tasks that need to communicate their results? My point is that the problem limiting this type of architecture would be how to handle communication between the different systems (processing units). This is essentially the same problem as with multicore architectures, and the bottleneck would be how to design programs that make use of the increased number of compute units. This means that the main problem to be solved is how to design programs that can run in parallel.

@user-wu6lt3wm7t 1 month ago

I realize now that you don't necessarily mean to remove the CPU, but rather to delegate some of the operations that deal with memory. However, I am still interested in what could be used to make a distributed compute system work.

@user-wu6lt3wm7t 1 month ago

Also, a follow-up: you mentioned that we could move a lot of the compute to the memory, as the actual operation is happening in the memory. How would you deal with the latency of the memory cells of the system?

@Ic3q4 1 month ago

Sir, can I have a phone call with you? I can also come to ETH, but I'd like to discuss some things.

@BimanDebbarma 1 month ago

Can SRAM-based in-memory computation handle the huge number of MAC operations required by modern AI algorithms?

@BimanDebbarma 1 month ago

Sir, what is the scope of research in SRAM-based in-memory computation?

@user-gb7cc3fv8c 1 month ago

1:45:49 Lecture 17b: Memory Ordering (Memory Consistency)
1:49:55 Memory ordering in multiprocessors
1:50:20 Memory consistency vs. cache coherence
(Note: These bookmarks are not exhaustive.)

@user-gb7cc3fv8c 1 month ago

How are write conflicts resolved? Scenario - Multiple cores have cached the same memory location locally (say an integer value). All the cores update the integer at the same time.

@balajikannan78 1 month ago

Such an excellent lecture. Thanks for this!

@user-vp6fh8gx7z 1 month ago

China needs your help.

@Michael-kj9mh 1 month ago

I don't quite understand why the P2L mapping table is needed to find that PPA 12 is mapped to LBA 15. Since given LBA 15, we can find PPA 12 through the L2P table, it is obvious to me that given PPA 12 we can also find LBA 15 through the table by just going the opposite lookup direction. What am I missing here?
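
One way to look at this question is through lookup cost. The sketch below is only an illustration under assumptions, not the lecture's own explanation: it models the L2P table as a Python dictionary keyed by LBA, and every entry other than LBA 15 -> PPA 12 is made up. A table indexed by LBA answers "where is LBA x?" directly, while the opposite direction has no index to follow and must search the entries, which is why a separate reverse (P2L) record can be useful.

```python
from typing import Dict, Optional

# Logical-to-physical table, keyed by LBA.
# Only 15 -> 12 comes from the question above; the other entries are hypothetical.
l2p: Dict[int, int] = {15: 12, 3: 7, 8: 40}


def ppa_to_lba_by_scan(table: Dict[int, int], ppa: int) -> Optional[int]:
    """Reverse lookup using only the L2P table: checks entries one by one."""
    for lba, mapped_ppa in table.items():
        if mapped_ppa == ppa:
            return lba
    return None  # no LBA currently maps to this physical page


# A physical-to-logical table, keyed by PPA, answers the same question directly.
p2l: Dict[int, int] = {ppa: lba for lba, ppa in l2p.items()}

lba_via_scan = ppa_to_lba_by_scan(l2p, 12)   # walks the table
lba_via_p2l = p2l[12]                        # single keyed lookup
```

Both calls return the same LBA in this tiny example; the difference is that the scan grows with the number of mapped LBAs, whereas the keyed lookup does not.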