The main case of a load depending on a store is the return address: you push it to the stack when you call something, and pop it when the call returns. If the thing you called was really quick, you could be popping that return address before the store that pushed it has even retired. The return stack buffer (RSB) covers the prediction, but the hardware still has to check. The same goes for callee-saved registers: if the callee needed to save and use some, it could be restoring them before the pushes that saved them retired.
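The sequence being described can be sketched as an x86-style instruction stream (illustrative pseudocode, not from the lecture; register and instruction names are just for concreteness):

```asm
    call tiny_fn        ; store: return address pushed to [rsp-8]
tiny_fn:
    push rbx            ; store: callee-saved rbx spilled to the stack
    ; ...very short body that clobbers rbx...
    pop  rbx            ; load: may issue before the push's store retires
    ret                 ; load: may issue before the call's store retires
```

In a short enough callee, the `pop` and `ret` loads execute while the corresponding stores are still in flight, so their values come from store-to-load forwarding; the RSB predicts the `ret` target, but the load/store unit still has to verify the forwarded data against the real store.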
@samzx81 8 years ago
The data flow stuff was really cool. Especially the idea of macro data flow. :)
@adeshshetty2830 7 years ago
Lecture starts at: kzbin.info/www/bejne/sGGonKGcj8mff9Um20s Data flow starts at: kzbin.info/www/bejne/sGGonKGcj8mff9Uh7m11s
@utkarshagarwal5391 7 years ago
who asked for it
@agsystems8220 1 year ago
Are there any architectures that offload some of this memory-disambiguation difficulty onto the programmer/compiler? It seems like something where context can usually answer the question. For example, could you provide several different load/store instruction pairs, each with its own policy? You could pretend the memory is virtually partitioned so that some of these pairs can never alias the others, culling most of the dependencies at the software level. Some of the pairs could be extremely aggressive, taking an "on your head be it" approach and not even attempting to catch ordering violations. That is absolutely fine for a load of immutable data, or a store that will next be read only after some guaranteed pipeline clear such as a thread switch. Something like a stack access needs very different behaviour, though, not least because cache misses there are extremely unlikely. Why not simply tell the machine, or at least give it hints? It seems like a lot of chip area is being spent working out what the programmer could just tell you.