I didn't know Tom Hanks made videos about instruction pipelining in his free time!
@zackjohnson93874 жыл бұрын
tony stark*
@glykeriatheodorou75864 жыл бұрын
hahahaha actually their voices are very similar and I just noticed it 😂😂😂
@jilha11223 жыл бұрын
Wow now that you mention it...
@Spinogrl20002 жыл бұрын
Thank you so much! KZbin has taught me more about pipelines and data paths in an hour than my prof. has this whole month. They make it so much harder than it has to be!! Again, thank you!
@ad.i10 ай бұрын
I know it's been a whole year, but do you recall any specific videos that helped explain the topic that you think are definitely worth looking at? If not, that's all good but thanks a lot!
@darianxd55087 ай бұрын
@@ad.i David and Sarah Harris
@MrICH9999910 ай бұрын
Still helping me in 2024 - big thanks!
@apex16086 ай бұрын
true
@Vwcz6 жыл бұрын
Great video, and active in comments section. Excellent content creator! This is what we need. Thanks
@kevinle47666 жыл бұрын
I am a bit confused as to how the iteration is from 5 to 18?
@MrXinchuan7 жыл бұрын
I don't understand why you stall on the first beq (second instruction), but you don't stall the lw (fourth instruction) and instead let forwarding take care of it. The preceding instructions, blt and add, both have their results ready after the execute stage.
@matthewwatkins887 жыл бұрын
In the processor this is dealing with, branches are resolved in the decode stage. In this case that means that the value of $t0 is needed in the decode stage. Since the instruction before the branch (the slt) writes to $t0, the value has to come from that slt, and its result isn't available until the end of the EX stage, so the first beq has to stall a cycle so that it can get the correct value forwarded from EX into the ID stage. The lw doesn't need to stall because it doesn't need the value of $t0 until the EX stage (whereas the branch needed it in the ID stage). In this case, the add instruction has completed the EX stage before the lw enters the EX stage, so no stalling is needed (the value is just forwarded directly).
@shubhrajit21175 күн бұрын
Yup, operand forwarding happens at the ALU, so there should be no need for a stall.
@ahmadahm35133 жыл бұрын
With forwarding, beq should not have to wait for the data, because it can get the data passed through in the same cycle, unless you are assuming that the EX stage for slt and the ID stage for beq do not happen in the same cycle :)
@codewithven93912 ай бұрын
This isn't true. The SLT performs its arithmetic during cycle 3, so the result can't be forwarded until the end of cycle 3. BEQ therefore won't have the correct value in cycle 3, but it can in cycle 4, after the result has been forwarded from SLT's E/M pipeline register.
@Simppi967 жыл бұрын
This is a really great video, thanks! But I am still not sure on how data dependencies work. How do you know when a command has the data ready for another to use? For example, the BEQ command needs $t0 and it can get it after the SLT command has executed, but the next BEQ command has to wait for the LW command to get to the memory clock cycle. I would be very grateful for an answer, thanks in advance!
@matthewwatkins887 жыл бұрын
This partially depends on the implementation of the pipelined processor. For this example it is assumed that for all instructions that produce data, except for load instructions, the data is available as the instruction moves from the execute (E) to the memory (M) stage. This means that for the slt/beq combination, the slt produces the data in execute, so that data can be forwarded from the beginning of the memory stage to the decode (D) stage (where it is needed for branches). For load instructions, the data is not available until after the instruction accesses the data memory, which means it is only available as the instruction moves from memory (M) to writeback (W). This is why the lw/beq combination has to wait another cycle: it is only as the lw moves into writeback that the data is available to forward to decode.
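The availability rule described above can be turned into a small stall calculator. This is only a sketch under the assumptions stated in the reply (ALU results forwardable once the producer enters M, load results once it enters W, branches needing operands in D); the names `STAGE`, `READY_IN`, and `stall_cycles` are made up for illustration and are not from the video.

```python
# Stage positions in a classic five-stage pipeline: F, D, E, M, W.
STAGE = {"F": 0, "D": 1, "E": 2, "M": 3, "W": 4}

# Stage the producer has entered when its result can first be forwarded
# (assumption taken from the reply: ALU results at E->M, loads at M->W).
READY_IN = {"alu": "M", "load": "W"}


def stall_cycles(producer_kind, consumer_needs_in, distance=1):
    """Return how many cycles the consumer has to stall.

    producer_kind: "alu" or "load".
    consumer_needs_in: stage in which the consumer needs the operand
        ("D" for a branch resolved in decode, "E" for ALU/address use).
    distance: how many instructions after the producer the consumer is.
    """
    ready_cycle = STAGE[READY_IN[producer_kind]]
    need_cycle = distance + STAGE[consumer_needs_in]
    return max(0, ready_cycle - need_cycle)


print(stall_cycles("alu", "D"))   # slt -> beq: 1
print(stall_cycles("load", "D"))  # lw -> beq: 2
print(stall_cycles("alu", "E"))   # add -> lw (address operand): 0
```

Cycles are counted relative to the producer's fetch, so only the gap between "result ready" and "operand needed" matters.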
@rekr63816 жыл бұрын
@@matthewwatkins88 Thanks for this response, very helpful!!
@owenzhang75038 ай бұрын
If we don't have the last line, what will the pipeline look like? Can we begin the IF of the first loop line directly in cycle 14?
@matthewwatkins888 ай бұрын
The last line, as I interpret it anyway, is never executed, so removing it really wouldn't change anything.
@owenzhang75038 ай бұрын
@@matthewwatkins88 I see. Thank you very much!
@albaraam1873 Жыл бұрын
I think in the third case you meant that only the first branch (beq $t0, $0, end) is taken.
@matthewwatkins88 Жыл бұрын
Off the top of my head, I would agree with you.
@selvalooks5 жыл бұрын
This is wonderful, a fantastic pipeline explanation!!!!
@_nognom6 жыл бұрын
The value for $t0 from the SLT instruction should be ready to forward in the latter half of stage E, which is right before the early half of stage E for the BEQ instruction, which suggests that the value for $t0 will be forwarded to the ALU instead of requiring a stall. Is this not correct?
@matthewwatkins886 жыл бұрын
No, this is not correct. The result of the SLT (or any instruction computed in the execute stage) is only ready at the end of the cycle and so can really only be forwarded at the beginning of the next stage (the memory stage). Additionally, the BEQ needs the value for $t0 in the decode stage since it resolves the branch in this stage. This means the branch cannot properly complete the decode stage until the previous SLT has completed the execute stage. *If* the branch were resolved in the execute stage (which is not the case here), then a stall would not be necessary as forwarding would take care of the dependency.
@trumpetperson114 жыл бұрын
@@matthewwatkins88 I had a similar thought. It seems that I have been told that you can forward the data directly to the ALU (or more precisely, the register in between the D and E stages) for the calculation (overwriting the data received from the register in the D stage). This would give you the time to not require a stall there. Is this just not correct?
@ahmadahm35133 жыл бұрын
@@matthewwatkins88 That's not really true, because the result of a stage could be ready in the first half of the cycle, as in the WB stage.
@wendyli62386 жыл бұрын
Are we using forwarding in this problem? I'm confused on when the next instruction should start if we are using forwarding
@matthewwatkins886 жыл бұрын
The example definitely assumes forwarding. I'm not 100% sure what you mean by "start." The processor fetches the next instruction the next cycle. If there is a dependency that forwarding can't handle, then the processor will stall the necessary stages (stalling is shown in the example by stages shown in '()', such as (F)).
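The parenthesized-stall notation mentioned above can be sketched as a small diagram generator. This assumes stalls only originate in decode, that the instruction behind a stalled one is held in fetch, and that the waiting cycles (not the final one) are the ones written in parentheses; the video's exact convention may differ, and `schedule` and `render` are hypothetical helper names.

```python
def schedule(instrs):
    """Map each instruction to {cycle: token} given its decode stalls.

    instrs: list of (label, decode_stalls) in program order.
    Extra cycles spent waiting in a stage are written in parentheses.
    """
    rows = []
    fetch = 1            # first cycle this instruction occupies F
    prev_decode_end = 1  # last cycle the previous instruction occupied D
    for label, decode_stalls in instrs:
        decode_start = prev_decode_end + 1
        decode_end = decode_start + decode_stalls
        cells = {}
        for cycle in range(fetch, decode_start):
            cells[cycle] = "F" if cycle == decode_start - 1 else "(F)"
        for cycle in range(decode_start, decode_end + 1):
            cells[cycle] = "D" if cycle == decode_end else "(D)"
        for offset, stage in enumerate("EMW", start=1):
            cells[decode_end + offset] = stage
        rows.append((label, cells))
        fetch = decode_start          # F frees up when this one enters D
        prev_decode_end = decode_end
    return rows


def render(rows):
    """Format the schedule as one text line per instruction."""
    last = max(max(cells) for _, cells in rows)
    lines = []
    for label, cells in rows:
        body = " ".join(f"{cells.get(c, ''):>3}" for c in range(1, last + 1))
        lines.append(f"{label:<5}{body}")
    return "\n".join(lines)


# slt followed by a dependent beq (one decode stall), then an add.
print(render(schedule([("slt", 0), ("beq", 1), ("add", 0)])))
```

In this sketch the beq waits in decode during cycle 3 and completes decode in cycle 4, while the add behind it is held in fetch for the same cycle.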
@wendyli62386 жыл бұрын
What I meant by start was where the next instruction would begin F, D, E... if we didn't use forwarding but needed information from the previous instruction. If we were not using forwarding and needed information from a current register in the next instruction, we wouldn't decode the next instruction until after the current instruction finished its memory stage?
@matthewwatkins886 жыл бұрын
If there was no forwarding at all, the dependent instruction wouldn't truly start decode until the previous instruction was in writeback (assuming writes to the register file appear to happen before reads, which is what is assumed in the video). Data is only written to the register file in writeback, so, without forwarding, it wouldn't be available until then.
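The no-forwarding case reduces to a short calculation. The sketch below assumes the five-stage F/D/E/M/W pipeline from the discussion and that operands are read from the register file in decode; `stalls_without_forwarding` and its `write_before_read` flag are illustrative names, not anything from the video.

```python
def stalls_without_forwarding(distance=1, write_before_read=True):
    """Stall cycles for a dependent instruction when nothing is forwarded.

    distance: how many instructions after the producer the consumer is.
    write_before_read: True if a register write in W is visible to a
        read in D during the same cycle (the assumption in the video).
    """
    # Cycle offsets from the producer's fetch: F=0, D=1, E=2, M=3, W=4.
    writeback = 4
    # Earliest cycle in which the consumer's decode sees the new value.
    earliest_decode = writeback if write_before_read else writeback + 1
    # Cycle in which the consumer would decode if it never stalled.
    normal_decode = distance + 1
    return max(0, earliest_decode - normal_decode)


print(stalls_without_forwarding(1))  # back-to-back dependency: 2
print(stalls_without_forwarding(3))  # two instructions in between: 0
```

With `write_before_read=False` every result is one cycle worse, which is why the write-before-read assumption matters.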
@wendyli62386 жыл бұрын
Thank you that is very helpful! :D
@mahanteshmise69305 жыл бұрын
Between instructions 3 and 4, there must be a stall at decode for instruction 4. Correct me if I am wrong.
@boathecat9196 жыл бұрын
For the case where neither branch is taken, why does the last instruction "add $v0, $s0, S0" have no cycles?
@codewithven93912 ай бұрын
Because it's outside the loop. Only the ones inside the loop are considered for this problem. We are determining the overall CPI for the loop
@Manas09rai3 жыл бұрын
Hey, I just wanted to ask: if an add instruction were dependent on an ld or lw instruction prior to it, would there be the same 2-cycle stall as there was for the beq instruction that was dependent on the lw instruction?
@yogeshbalbehra89307 жыл бұрын
What are the stages in a typical four-stage CPU pipeline, and what's the purpose of each stage? This question was in my exam. Can you help me with the answer?
@jayz66986 жыл бұрын
Why does the 14-cycle iteration include the first W but not the last W (between 5 and 17)?
@matthewwatkins886 жыл бұрын
As is noted in the comment for the video, there is a slightly updated version of this video (kzbin.info/www/bejne/eJvCc42VmZWCobc). The CPI calculation shown is correct, but, as you note, the line at ~7:00 should extend to cycle 18, for a total of 14 cycles. (Also, the W in cycle 18 for the slt should really be an M.)
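The 14-cycle figure follows from counting both endpoints of the span from cycle 5 to cycle 18. A minimal sketch of that arithmetic is below; the number of instructions per loop iteration is not stated in this thread, so `cpi` takes it as a parameter and no CPI value is claimed here. Both function names are made up for illustration.

```python
def cycles_inclusive(first_cycle, last_cycle):
    """Count cycles in a span where both endpoints are included."""
    return last_cycle - first_cycle + 1


def cpi(cycles, instructions):
    """Cycles per instruction for one steady-state loop iteration."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


loop_cycles = cycles_inclusive(5, 18)
print(loop_cycles)  # 14
```

Passing `loop_cycles` and the loop's instruction count to `cpi` gives the per-iteration CPI.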
@martint53406 жыл бұрын
This is awesome. Thank you very much!
@eduardomiguelsalaspalacios33255 жыл бұрын
[Translated from Spanish] My buddy says you made a mistake. Is that true? What do you think?
@matthewwatkins885 жыл бұрын
I don't speak Spanish.
@CrashOverride3322 жыл бұрын
This went way too fast for me. I kept having to rewind.
@zachnanabooboo5176 жыл бұрын
Didn't know Mike Greenberg knew MIPS
@motorheadbanger906 жыл бұрын
There are a lot of things you are saying that contradict my teachings and readings on this matter. Can you please explain how you define the following: 1) What is "branch taken/not taken"? 2) What is forwarding? Additionally, are you saying that the resource in $t0 cannot be accessed by the subsequent instruction until the memory stage of the previous instruction? And we have forwarding in this problem? Assuming yes, then your understanding of forwarding and mine contradict each other. Can you help explain?
@mohammadrezabaqery74925 жыл бұрын
Did you forget to resolve a dependency between add and lw?
add $t0, $s3, $s4
lw $t0, 0($t0)
@dmm27084 жыл бұрын
There is a dependency, but it doesn't change the outcome.
@pavuluriviratchowdary44804 жыл бұрын
Because $t0 is already computed in the first instruction, there is no need for the processor to run it a second time.
@perionan72817 жыл бұрын
OH MY GOD!! THANK YOU
@a96185e5 жыл бұрын
this is fantastic :)
@x3axDev6 ай бұрын
thank you tony stark
@pacifiky9 ай бұрын
This is so cool
@mehmetb87036 жыл бұрын
nice tutorial
@profitjourneywithsk21365 жыл бұрын
Good video
@sukrusekeroglu4 жыл бұрын
ohh no offence but i am happy to hear non-indian accent, I said oh god thanks in the beginning of the video
@FelixTheForgotten2 жыл бұрын
Sometimes I am so desperate I try to understand Vietnamese videos to study. Always feels good to find an English video even though it isn't my first language
@PEGuyMadison4 ай бұрын
Oh come on throw those branch delays in and show how inefficient RISC code is.
@matthewwatkins884 ай бұрын
When you say "RISC" do you mean actual RISC? If so, actual RISC code is equivalent to what is shown. If you are referring to the MIPS code, then yes, real MIPS code would change the performance, but it doesn't necessarily destroy it.