Reason: to bring CPI closer to 1 (by reducing stalls)
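A quick sanity check of the goal above, assuming the usual model where effective CPI is the ideal CPI plus the average stall cycles per instruction. The function name and the sample numbers are made up for illustration only.

```python
def effective_cpi(ideal_cpi: float, stall_cycles: int, instructions: int) -> float:
    """Effective CPI = ideal CPI + average stall cycles per instruction."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return ideal_cpi + stall_cycles / instructions

# Hypothetical run of 1000 instructions on a pipeline with ideal CPI 1.
print(effective_cpi(1.0, 500, 1000))  # 1.5
print(effective_cpi(1.0, 250, 1000))  # 1.25 (fewer stalls, closer to 1)
```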
In-order scheduling
every instruction issues in program order (and goes through the later pipeline stages in that order too)
Out-of-Order scheduling
- static scheduling: done by the compiler. Usually considers the whole code.
- dynamic scheduling
done by hardware. Usually considers only local code, which makes the compiler simpler. Also, by maintaining data flow in the pipeline, we get three attributes:
- compatibility with the compiler: code compiled for one pipeline can still run efficiently on a different pipeline
- dependence information may not be known at compile time (e.g. dynamic linking)
- tolerance for unpredictable delays (e.g. a long-latency cache miss)
To be exact: in-order issue, out-of-order execution
- RAW: the second instruction cannot read until the first has written
- WAR: the second instruction cannot write before the first has read
- WAW: the second instruction cannot write until the first has written
- the read → execute → write stage order is fixed, so RAW is the natural (true) order and WAR, the reverse, is called anti (the opposite of true)
- all of these data dependences can occur (RAR is not a hazard)
If the required order of a dependence is preserved, there is no hazard.
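The three cases can be checked mechanically from the registers each instruction reads and writes. A minimal sketch, assuming register operands only; the `Instr` type and the `dependences` function are made-up names for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

@dataclass(frozen=True)
class Instr:
    dest: Optional[str]      # register written, or None if nothing is written
    srcs: Tuple[str, ...]    # registers read

def dependences(first: Instr, second: Instr) -> Set[str]:
    """Register dependences of `second` on the earlier instruction `first`."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")     # second reads what first writes (true dependence)
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")     # second writes what first reads (anti-dependence)
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")     # both write the same register (output dependence)
    return found

div = Instr("F0", ("F2", "F4"))    # F0 <- F2 / F4
add = Instr("F6", ("F0", "F8"))    # F6 <- F0 + F8
sub = Instr("F8", ("F10", "F14"))  # F8 <- F10 - F14
mul = Instr("F6", ("F10", "F8"))   # F6 <- F10 * F8

print(dependences(div, add))  # {'RAW'}
print(dependences(add, sub))  # {'WAR'}
print(dependences(add, mul))  # {'WAW'}
```

Two instructions that only read the same register (RAR) produce an empty set, which matches the note that RAR is not a hazard.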
Key idea #1: allow instructions behind a stalled instruction to proceed
RAW is simply solved by stalling: data dependence = true dependence
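A toy timing model of key idea #1. It is only a sketch under loud assumptions: one instruction is issued per cycle, functional units are unlimited, and WAR/WAW hazards are ignored. The function name, the tuple layout, and the latencies are invented for illustration.

```python
def start_cycles(program, out_of_order):
    """program: list of (dest, srcs, latency).
    Returns the cycle in which each instruction starts executing."""
    ready = {}        # register -> cycle at which its new value is available
    starts = []
    prev_start = -1
    for i, (dest, srcs, latency) in enumerate(program):
        operands_ready = max((ready.get(r, 0) for r in srcs), default=0)
        start = max(i, operands_ready)          # RAW: wait for the operands
        if not out_of_order:
            start = max(start, prev_start + 1)  # cannot pass an older stalled instruction
        starts.append(start)
        ready[dest] = start + latency
        prev_start = start
    return starts

program = [
    ("F0", ("F2", "F4"), 10),   # long-latency divide
    ("F10", ("F0", "F8"), 2),   # RAW on F0: must wait for the divide
    ("F12", ("F8", "F14"), 2),  # independent of both
]
print(start_cycles(program, out_of_order=False))  # [0, 10, 11]
print(start_cycles(program, out_of_order=True))   # [0, 10, 2]
```

The dependent add waits until cycle 10 in both cases (the RAW stall cannot be avoided), but the independent third instruction no longer waits behind it when out-of-order execution is allowed.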
Key idea #2: register renaming (Tomasulo algorithm)
WAR (anti-dependence, the opposite of true) and WAW (output dependence) occur because of OoO
WAR and WAW: these two types are name dependences
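A minimal sketch of why renaming removes name dependences. Note the assumption: Tomasulo's algorithm renames through reservation-station tags, while this sketch uses a simpler map from architectural registers to fresh physical names (`P0`, `P1`, ...). The function and the register names are hypothetical.

```python
from itertools import count

def rename(program, arch_regs):
    """program: list of (dest, srcs). Returns it rewritten over physical registers."""
    fresh = count()
    # Initial architectural -> physical mapping.
    mapping = {r: f"P{next(fresh)}" for r in arch_regs}
    renamed = []
    for dest, srcs in program:
        new_srcs = tuple(mapping[s] for s in srcs)  # read current names first
        mapping[dest] = f"P{next(fresh)}"           # every write gets a new register
        renamed.append((mapping[dest], new_srcs))
    return renamed

program = [
    ("F0", ("F2", "F4")),
    ("F6", ("F0", "F8")),    # reads F8
    ("F8", ("F10", "F14")),  # WAR with the line above (writes F8)
    ("F6", ("F10", "F8")),   # WAW with the second line (writes F6)
]
regs = ["F0", "F2", "F4", "F6", "F8", "F10", "F14"]
for line in rename(program, regs):
    print(line)
# ('P7', ('P1', 'P2'))
# ('P8', ('P7', 'P4'))
# ('P9', ('P5', 'P6'))
# ('P10', ('P5', 'P9'))
```

After renaming, the two writes to F6 go to P8 and P10 (no WAW), and the write to F8 goes to P9 while the earlier read still uses P4 (no WAR). The RAW dependences through P7 and P9 remain, as they must.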
- issue needs an available functional unit
- execution begins as soon as the data operands are available after issue → WAR and WAW data hazards become possible
- variable latency → forwarding for RAW hazards becomes hard
Split ID pipe stage
- Issue: decode instructions, check for structural hazards
- Read operands: wait until there are no data hazards (sufficient resources), then read operands
Methods (in basic block & before cache)
if WAW, do not change the order; handle WAR
remove WAR and WAW by handling speculation
RAW cannot be removed; it is resolved by waiting at the read stage
- WAW: stall at issue in scoreboarding, not in Tomasulo
- WAR: stall at write in scoreboarding, not in Tomasulo
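The two scoreboarding stalls listed above can be sketched as simple checks. This is not a full scoreboard, only an illustration; the dictionary fields (`unit`, `dest`, `pending_srcs`) and function names are assumptions made up for the example.

```python
def can_issue(instr, active, free_units):
    """Issue check: stall on a structural hazard or on a WAW hazard."""
    unit_free = instr["unit"] in free_units
    no_waw = instr["dest"] is None or all(
        a["dest"] != instr["dest"] for a in active
    )
    return unit_free and no_waw

def can_write(instr, older_active):
    """Write check: stall while an older instruction has not yet read
    the register we are about to overwrite (WAR)."""
    return all(instr["dest"] not in a["pending_srcs"] for a in older_active)

# Older add is still waiting to read F0 and F8, and will write F6.
add = {"unit": "add1", "dest": "F6", "pending_srcs": {"F0", "F8"}}
sub = {"unit": "add2", "dest": "F8", "pending_srcs": set()}
mul = {"unit": "mult", "dest": "F6", "pending_srcs": {"F10", "F8"}}

print(can_issue(mul, [add], {"mult"}))  # False: WAW on F6
print(can_issue(sub, [add], {"add2"}))  # True: unit free, no WAW
print(can_write(sub, [add]))            # False: WAR on F8
```

With renaming as in Tomasulo, neither check would be needed for WAR/WAW, since each write would target a fresh name.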
A store counts as a read (it reads its source register)
RAW means the read must come after the write, so a stall is needed. WAW and WAR cause errors if the order is violated.