Comments
@linerider195 2 days ago
The slide mentions "How to use 193nm light to pattern 20nm structures", but isn't EUV around 13nm wavelength?
@yurigansmith 3 days ago
I've just found these amazing lectures, and first want to thank Prof. Mutlu for providing them in the first place. But I have one question: there are other playlists of these lectures on the channel from former semesters, and the one from 2023, for example, has 41 videos, whereas this one has only 26. Why is that? Is this playlist actually complete?
@bharadwaj767 7 days ago
ISA is an interface between the hardware microarchitecture and the software system (assembly programming)... complex hardware == specific hardware to perform special functions (say, matrix multiplication) [could we call these accelerators? I mean hardware designed specifically to perform a special task or group of special tasks]
@bharadwaj767 7 days ago
Von-Neumann.. Excited!
@bharadwaj767 8 days ago
Think critically! If you don't, who will?
@pranavdani3810 9 days ago
Thank you so much for your lectures. This is a very good resource for understanding the basics of GPU architecture and programming models.
@pulkitjain25 9 days ago
Where is Onur?
@zjoy5723 10 days ago
👍
@wolpumba4099 12 days ago
*Talk Summary:* This talk presents a shift in genome analysis, moving away from traditional basecalling towards direct interpretation of raw nanopore signals. This approach, powered by the RawHash and RawHash2 algorithms, promises higher speed, efficiency, and scalability, especially for complex and large genomes.

*Motivation:*
* *[**0:00:00**] The Need for Real-Time Genomics:* Basecalling, while accurate, introduces a significant computational bottleneck, hindering real-time analysis. Direct raw signal analysis offers a faster and more efficient alternative.
* *[**0:09:00**] Harnessing Nanopore's Power:* Nanopore sequencing provides unique advantages like long reads, portability, and real-time signal generation, which RawHash and RawHash2 leverage.

*Tackling Real-time Analysis Challenges:*
* *Speed:* Algorithms must match the rapid data generation rate of nanopore sequencing.
* *Decision Making:* Rapid and accurate analysis is crucial for real-time decision-making (e.g., selective sequencing).
* *Accuracy:* Maintaining high accuracy despite signal variations is essential for reliable results.
* *Power Efficiency:* Crucial for portable sequencing applications.

*The RawHash and RawHash2 Breakthrough:*
* *[**0:14:50**] Understanding Raw Signals:*
  * *k-mers:* Short DNA sequences (6-9 nucleotides) that create distinct electrical signals as they pass through the nanopore.
  * *Events:* The distinct electrical signals corresponding to individual k-mers.
  * *Signal Variation:* Identical k-mers can produce slightly different signals due to noise and other factors.
* *[**0:16:00**] Hash-Based Search for Efficiency:* RawHash uses hash functions to rapidly map raw signals to a reference genome, significantly boosting speed and scalability for larger genomes.
* *[**0:20:00**] The Art of Signal Quantization and Hashing:*
  - *Quantization:* Grouping similar signal values to minimize the impact of noise and variations, improving mapping accuracy.
  - *Hash Value Construction:* Combining multiple quantized event values to create longer, more unique hash values that represent short k-mer sequences, balancing sensitivity and specificity.
* *[**0:27:42**] Sequence-Scale Analysis for Efficiency:* RawHash dynamically determines when sufficient data has been acquired for accurate analysis, allowing for early termination of sequencing runs and significant cost reduction (up to 15x).
* *[**0:30:41**] RawHash2: Refining for Speed and Accuracy:*
  - Improved quantization techniques for greater accuracy.
  - A more sensitive chaining algorithm to better connect matching segments.
  - Machine learning-based decision-making for robustly determining when to stop mapping.
  - Integration of filters and minimizer sketching for enhanced efficiency.

*Impactful Outcomes:*
* *[**0:32:00**] Unmatched Speed:* RawHash2 surpasses existing raw signal analysis tools with up to 27x faster throughput.
* *[**0:34:00**] High Accuracy Maintained:* RawHash2 consistently achieves the highest accuracy in raw signal mapping while offering substantial speed improvements.
* *[**0:37:00**] Cost-Effective Genomics:* Sequence-scale analysis, combined with RawHash2, leads to significant reductions in sequencing time and costs without compromising accuracy.
* *[**0:39:00**] Real-Time Assembly Potential:* Early results show promise for performing genome assembly in real-time, directly from raw signal overlaps.

*The Road Ahead: Exciting Future Directions*
* *[**1:01:00**] Conquering Repetitive Regions:* Addressing challenges in accurately analyzing repetitive genomic sequences for improved accuracy in complex genomes.
* *[**0:39:00**] Expanding Real-Time Applications:* Developing new algorithms for downstream genomic analyses (variant calling, structural variation detection) directly on raw signals.
* *[**0:39:00**] Synergy with Basecalling:* Exploring the integration of raw signal analysis with basecalling to leverage their respective strengths and create more powerful analysis pipelines.
* *[**0:42:00**] Hardware Acceleration:* Using specialized hardware like FPGAs, GPUs, and in-memory computing to further accelerate RawHash for real-time applications.

*Conclusion:* This talk introduces real-time analysis of raw nanopore signals. RawHash and RawHash2, with their speed, accuracy, and efficiency, hold the potential to transform fields like personalized medicine, pathogen detection, and evolutionary biology.

I used Gemini 1.5 Pro to summarize the transcript.
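The quantization and hash-construction steps described in the summary above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the function names `quantize` and `hash_windows`, the value range, the 3-bit bucket width, and the window of 4 events are all hypothetical choices made for the example.

```python
from typing import List


def quantize(events: List[float], lo: float = -3.0, hi: float = 3.0,
             bits: int = 3) -> List[int]:
    """Map each (assumed normalized) event value to one of 2**bits buckets.

    Nearby values land in the same bucket, which hides small noise.
    """
    levels = 1 << bits
    width = (hi - lo) / levels
    buckets = []
    for value in events:
        clipped = min(max(value, lo), hi)
        # The top edge of the range belongs to the last bucket.
        buckets.append(min(int((clipped - lo) / width), levels - 1))
    return buckets


def hash_windows(quantized: List[int], n: int = 4, bits: int = 3) -> List[int]:
    """Pack every run of n consecutive quantized events into one integer key."""
    keys = []
    for i in range(len(quantized) - n + 1):
        key = 0
        for bucket in quantized[i:i + n]:
            key = (key << bits) | bucket
        keys.append(key)
    return keys


# Two hypothetical noisy readings of the same underlying sequence.
noisy_a = [0.10, -1.20, 2.05, 0.55, -0.40]
noisy_b = [0.20, -1.10, 1.90, 0.60, -0.30]

keys_a = hash_windows(quantize(noisy_a))
keys_b = hash_windows(quantize(noisy_b))
print(keys_a == keys_b)  # both readings produce the same lookup keys
```

Packing several quantized events into one key is what trades sensitivity against specificity: a longer window gives more unique keys, but a single noisy event that crosses a bucket boundary then breaks more of them.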
@brunosobreirafranca6720 12 days ago
Amazing amazing
@caalher 12 days ago
So, just wondering, how do CPU caches come into play here? Here we see that the memory's internal row buffer "caches" a single row to serve requests as long as they fall in the same row. But won't the first request that misses in the CPU caches bring that line into the CPU caches anyway? Anyway, great lecture.
@uis246 14 days ago
His brain is so big that you could use his head as a gravitational lens.
@Tapajara 15 days ago
A RISC processor requires more memory accesses to accomplish the same things as a well-designed CISC processor. So our infatuation with RISC is partly to blame for the waste you talk about. Having a more concentrated instruction set is the equivalent of keeping your most frequently used subroutines close to the CPU. That is what CISC is all about.
@saiiiiiii1 15 days ago
Oftentimes I hear people say garbage collection is so good nowadays that you don't even need TRIM. For example, in a RAID scenario, it's usually not available anyway. What I'm asking myself is how GC would ever know which pages are invalid without the OS telling it. Doesn't it only find out when the OS tells it to overwrite them? So where does TRIM come into play here? What's the difference between a page being invalid and TRIM marking the page as free?
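One way to picture the difference asked about here is a toy flash translation layer. The sketch below is a simplification under stated assumptions, not how any particular SSD works: the class `TinyFTL` and its methods are hypothetical, and one logical page maps to exactly one physical page.

```python
class TinyFTL:
    """Toy flash translation layer (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.l2p = {}        # logical page -> physical page
        self.valid = set()   # physical pages the SSD still has to preserve
        self.next_free = 0   # next unwritten physical page

    def write(self, lpa: int) -> None:
        old = self.l2p.get(lpa)
        if old is not None:
            self.valid.discard(old)   # overwrite: the old copy becomes invalid
        self.l2p[lpa] = self.next_free
        self.valid.add(self.next_free)
        self.next_free += 1

    def trim(self, lpa: int) -> None:
        old = self.l2p.pop(lpa, None)
        if old is not None:
            self.valid.discard(old)   # the OS declared this data dead

    def pages_gc_must_copy(self) -> int:
        return len(self.valid)


ftl = TinyFTL()
for lpa in range(3):
    ftl.write(lpa)

# The file system deletes the file in logical page 1 but sends nothing to the
# SSD, so the drive still sees three valid pages and GC would copy all three.
print(ftl.pages_gc_must_copy())

ftl.trim(1)  # with TRIM, the drive learns about the deletion immediately
print(ftl.pages_gc_must_copy())
```

In this model an overwrite and a TRIM both invalidate the old physical page. The difference is timing: without TRIM, the drive only learns that deleted data is dead when the same logical page happens to be written again, and until then garbage collection keeps copying it.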
@dghtucs 16 days ago
Thank you for sharing your knowledge selflessly
@matthewvenn 17 days ago
Surprising, shocking, inspirational - loved the keynote, happy I saw it live!
@Jegan_Eng 17 days ago
Thank you 🥺
@Tapajara 22 days ago
"Economics is Not Important!" Okay. I have to tell you that you lost your house and you have to live in the gutter without food. So I now know that is okay with you.
@Tapajara 22 days ago
Top Down Design with Bottom Up Implementation. But yes, learning should be Bottom Up too.
@Tapajara 22 days ago
What you are calling "Research" I call "Development". They are the two different parts of R&D. Scholarship is yet another thing that has more to do with fame than the other two.
@robertocarballo8006 23 days ago
Thank you Professor for sharing Yale's talk!
@N0RT0X 23 days ago
Professor Onur, you are an incredibly based man
@user-yw6vb5zx7e 1 month ago
The glitch came back near the end of the lecture
@user-yw6vb5zx7e 1 month ago
The error with the slides was minimal and only listed items that the speaker was talking about. They realized it a few minutes into the second speaker's lecture and fixed it
@user-yw6vb5zx7e 1 month ago
Second speaker's slides are glitching. Can't follow very easily
@tarkcetin3389 1 month ago
This lesson was very informative for me, as someone interested in long-read sequencing and ONT. Thank you very much, my esteemed professor.
@YA_LUNNAYA_PONI 1 month ago
A slightly unrelated question but still, I'm not sure that the "Large Semantic Gap" is necessarily bad. Like the IA-64 showed that writing a proper optimizing compiler for an ISA like that can be very tricky. And maybe that's because the IR is usually RISC-ish, or maybe because major compilers target multiple architectures, but in the end it's almost like all that ISA's complexity is deemed an "overkill" and either ignored or underutilized?
@YA_LUNNAYA_PONI 1 month ago
And like the x86 ISA gives vibes of an ISA that was designed to be written by actual programmers (hence the multiple complex instructions that can like read the memory, or the easy aka one inst immediate loads), but compilers don't really care about all that stuff. Like yeah the binary size is a thing, but it's not what modern compilers optimize for, and having more granularity (smaller, simpler instructions) even sounds like a good thing.
@DAYSHIVANSH 1 month ago
What is this video about?
@Gr8ness8999 1 month ago
In the logic multiplexers (III) slide, shouldn't it be Y = AB*C + B*C* + A*BC? (* means complement, as I can't type it on the keyboard.)
@YA_LUNNAYA_PONI 1 month ago
29:15 but the data propagation time here is worse than in the second design. So it's not like, unequivocally better.
@YA_LUNNAYA_PONI 1 month ago
Thinking about it a bit more I guess it is better because we care about the worst case scenario (which is mentioned further in the lecture)
@YA_LUNNAYA_PONI 1 month ago
28:10 I'm not exactly clear on why the second implementation omitted the 4-input OR gate. Like if you also omit that OR gate in first design the control signal propagation will be faster (and it's not like two AND gates can be enabled at the same time can they?)
@DanielCardenas1 1 month ago
Looks like the wrong video was uploaded for the title. Please upload the correct video.
@nadigrakesh 1 month ago
Thank you for your comment. We have changed the title to better describe the lecture.
@salvatorelionetti1274 1 month ago
"As people dream, much application will come". Very nice POV.
@user-pk5xu1ir5w 1 month ago
Hi, a question: at timestamp 1:43:11, how can 8 SIMD units execute a 32-thread warp?
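A common answer is that the warp is time-multiplexed over the lanes: with fewer lanes than threads, one warp instruction is issued over several consecutive cycles. The arithmetic can be sketched as below, assuming a simple model with no stalls; the helper names `issue_cycles` and `lane_schedule` are hypothetical, and whether the design on the slide works exactly this way is an assumption.

```python
import math
from typing import List


def issue_cycles(warp_size: int, simd_lanes: int) -> int:
    """Cycles needed to push one warp instruction through the lanes."""
    return math.ceil(warp_size / simd_lanes)


def lane_schedule(warp_size: int, simd_lanes: int) -> List[List[int]]:
    """Thread IDs handled in each cycle, one group of lanes per cycle."""
    return [list(range(start, min(start + simd_lanes, warp_size)))
            for start in range(0, warp_size, simd_lanes)]


print(issue_cycles(32, 8))          # 4
for cycle, threads in enumerate(lane_schedule(32, 8)):
    print(cycle, threads[0], "to", threads[-1])
```

Under this model, 8 lanes need 4 cycles per warp instruction: threads 0 to 7 in the first cycle, 8 to 15 in the second, and so on.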
@jimgolab536 1 month ago
Do the victim bits always fail with false zeroes? I.e., is the victim cap always falsely discharged, and never charged?
@jimgolab536 1 month ago
Is the victim bit cell charge loss dominated by the sub-threshold leakage of the word-line pass transistor? A small rise in the gate voltage of the victim pass transistor (due to capacitive coupling to the rising aggressor word line) would have an exponential effect on the current through it and thus the rate of discharge of the victim bit.
@user-wu6lt3wm7t 1 month ago
How would this type of distributed processing model handle communication between different tasks that need to communicate their results? My point is that the problem limiting this type of architecture would be how to handle the communication between different systems (processing units). This is essentially the same problem as with multicore architectures, and the bottleneck would be how to design programs that make use of the increased number of compute units. This means that the main problem to solve is how to design programs that can run in parallel.
@user-wu6lt3wm7t 1 month ago
I realize now that you don't necessarily mean to remove the CPU but to delegate some of the operations that deal with memory. However, I am still interested in what could be used to make a distributed compute system work.
@user-wu6lt3wm7t 1 month ago
Also, a follow-up: you mentioned that we could move a lot of the compute to the memory, as the actual operation is happening in the memory. How would you deal with the latency of the memory cells of the system?
@Ic3q4 1 month ago
Sir, can I have a phone call with you? I can also come to ETH, but I'd like to discuss some things.
@BimanDebbarma 1 month ago
Can SRAM-based in-memory computation handle the huge number of MAC operations required by modern AI algorithms?
@BimanDebbarma 1 month ago
Sir, what is the scope of research in SRAM-based in-memory computation?
@user-gb7cc3fv8c 1 month ago
1:45:49 Lecture 17b: Memory Ordering (Memory Consistency)
1:49:55 Memory ordering in multiprocessors
1:50:20 Memory consistency vs Cache coherence
(Note: These bookmarks are not exhaustive)
@user-gb7cc3fv8c 1 month ago
How are write conflicts resolved? Scenario - Multiple cores have cached the same memory location locally (say an integer value). All the cores update the integer at the same time.
@balajikannan78 1 month ago
Such an excellent lecture. Thanks for this!
@user-vp6fh8gx7z 1 month ago
China needs your help.
@Michael-kj9mh 1 month ago
I don't quite understand why the P2L mapping table is needed to find that PPA 12 is mapped to LBA 15. Since, given LBA 15, we can find PPA 12 through the L2P table, it is obvious to me that given PPA 12 we can also find LBA 15 through the same table by just doing the lookup in the opposite direction. What am I missing here?
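One likely piece of the answer is lookup cost: an L2P table is indexed by LBA, so going from a PPA back to its LBA means scanning entries until one matches, while a P2L record answers the same question directly. The sketch below illustrates this with a three-entry table; the example mappings other than LBA 15 to PPA 12 and the helper `lba_by_scanning_l2p` are hypothetical, and how the lecture's design actually stores P2L information is not assumed here.

```python
from typing import Dict, Optional

# L2P table, indexed by logical block address (LBA).
l2p: Dict[int, int] = {15: 12, 7: 3, 42: 30}


def lba_by_scanning_l2p(ppa: int) -> Optional[int]:
    """Reverse lookup using only L2P: work grows with the number of LBAs."""
    for lba, mapped_ppa in l2p.items():
        if mapped_ppa == ppa:
            return lba
    return None


# P2L table, indexed by physical page address (PPA): a single direct lookup.
p2l: Dict[int, int] = {ppa: lba for lba, ppa in l2p.items()}

print(lba_by_scanning_l2p(12))  # 15, found by scanning
print(p2l[12])                  # 15, found directly
```

With three entries the scan is trivial, but a real drive has millions of logical blocks, and garbage collection has to ask "which LBA owns this physical page?" for every valid page in the block it is reclaiming.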