Term
Historically, what has been the relationship between latency and bandwidth? Explain |
|
Definition
Bandwidth has improved faster than latency, for both architectural and commercial reasons |
|
|
Term
Give one reason CPU implementation is becoming as important as (if not more important than) instruction set design in influencing CPU performance. |
|
Definition
Instruction set architectures (ISAs) have converged, while implementation options (e.g., pipelining, superscalar, multicore) for the same ISA have a major impact on CPU performance |
|
|
Term
Give one reason computer architects use benchmarks instead of real programs to measure computer performance. |
|
Definition
Benchmarks provide reproducible results that are easy to compare across computing platforms, and are easier to use than real programs. |
|
|
Term
Why do computer architects prefer using geometric mean instead of arithmetic or harmonic mean when presenting benchmark results? |
|
Definition
Relative results (ratios of geometric means) are the same regardless of which platform is chosen as the reference |
|
|
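The reference-independence claim on the card above can be checked numerically. This is a minimal sketch: the machine names and execution times are made-up illustrative values, and `ratio_of_means` is a hypothetical helper, not part of any benchmark suite.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def arithmetic_mean(values):
    return sum(values) / len(values)

# Hypothetical execution times (seconds) for two benchmarks on three machines.
times = {
    "A": [1.0, 1000.0],
    "B": [10.0, 100.0],
    "C": [20.0, 20.0],
}

def normalized(machine, reference):
    """Per-benchmark time ratios of `machine` relative to `reference`."""
    return [t / r for t, r in zip(times[machine], times[reference])]

def ratio_of_means(mean, x, y, reference):
    """How machine x compares to machine y when both are normalized to `reference`."""
    return mean(normalized(x, reference)) / mean(normalized(y, reference))

for ref in ("A", "B", "C"):
    gm = ratio_of_means(geometric_mean, "B", "C", ref)
    am = ratio_of_means(arithmetic_mean, "B", "C", ref)
    print(f"reference {ref}: geometric {gm:.3f}, arithmetic {am:.3f}")
```

With these numbers the geometric-mean comparison of B against C is identical for every reference, while the arithmetic-mean comparison changes with the reference and can even reverse which machine looks faster.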
Term
What is a structural hazard? |
|
Definition
A hardware resource limitation that prevents a pipelined instruction from executing in its intended cycle (e.g., a single-ported memory needed by two instructions at once) |
|
|
Term
Why are RAW hazards fundamentally different than WAR and WAW hazards? |
|
Definition
RAW hazards represent actual flow of values, whereas WAR and WAW hazards are name dependences caused by reuse of storage (and can be eliminated using additional storage) |
|
|
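The distinction between the three hazard types can be sketched as a small classifier. This is an illustration under simplifying assumptions: each instruction is modeled as a hypothetical `(dest, sources)` pair of register names, and memory dependences are ignored.

```python
def classify_hazards(first, second):
    """Return the hazards that arise when `second` follows `first` in program order.

    Each instruction is a (dest, sources) pair: a destination register name
    (or None) and a tuple of source register names.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    hazards = set()
    if dest1 is not None and dest1 in srcs2:
        hazards.add("RAW")  # second reads a value that first produces
    if dest2 is not None and dest2 in srcs1:
        hazards.add("WAR")  # second overwrites a register that first still reads
    if dest1 is not None and dest1 == dest2:
        hazards.add("WAW")  # both write the same register
    return hazards

add = ("r1", ("r2", "r3"))          # add r1, r2, r3
sub = ("r4", ("r1", "r5"))          # sub r4, r1, r5  (reads r1 written by add)
mul = ("r2", ("r6", "r7"))          # mul r2, r6, r7  (overwrites r2 read by add)
mul_renamed = ("p9", ("r6", "r7"))  # same mul, destination renamed to a fresh register

print(classify_hazards(add, sub))          # true data flow
print(classify_hazards(add, mul))          # name dependence only
print(classify_hazards(add, mul_renamed))  # renaming removes the name dependence
```

Renaming the destination removes the WAR hazard because no value actually flows between the two instructions; no amount of renaming removes the RAW hazard between `add` and `sub`.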
Term
Explain why exceptions are problematic for pipelined processors. |
|
Definition
When an exception occurs, both earlier and later instructions may still be executing in the pipeline, and they must be dealt with (earlier ones completed, later ones squashed) before the exception can be processed. |
|
|
Term
Describe the difference between synchronous and asynchronous exceptions. |
|
Definition
Synchronous exceptions occur when a particular instruction is executed. Asynchronous exceptions may occur at any time, independent of the instruction being executed. |
|
|
Term
Explain why pipelined processors face additional problems when certain instructions (e.g., mult) require more time (i.e., clock cycles) to complete. |
|
Definition
Instructions may complete out of order, requiring additional checks to avoid data hazards. |
|
|
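Out-of-order completion can be seen with a toy model. This sketch assumes one instruction issues per cycle, in program order, and that each finishes a fixed number of cycles after it issues; the latencies are made-up values and real pipeline stages are not modeled.

```python
def completion_order(instructions):
    """Return instruction names in the order they complete.

    `instructions` is a list of (name, latency) pairs in program order.
    Instruction i issues in cycle i and completes in cycle i + latency.
    """
    finish = [
        (issue + latency, issue, name)
        for issue, (name, latency) in enumerate(instructions)
    ]
    # Sort by completion cycle, breaking ties by program order.
    return [name for _, _, name in sorted(finish)]

program = [("mult", 7), ("add", 1), ("sub", 1)]  # hypothetical latencies in cycles
print(completion_order(program))
```

Here `add` and `sub` finish before the earlier `mult`, so if either of them wrote the same register as `mult`, or read its result, the hardware would need an extra check to preserve program semantics.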
Term
Explain why filling the branch delay slot with an instruction from before the branch is preferred to the alternatives.
|
|
Definition
If permitted by language semantics, putting an instruction from before the branch in the delay slot introduces no extra execution cost, regardless of whether the branch is actually taken |
|
|
Term
|
Definition
At the execute stage of a cycle, it passes information to the next cycle |
|
|
Term
When do we find out that the PC needs to be modified? |
|
Definition
Answer: In pipeline stage ID of a branch instruction.
So, even if a branch is not taken (i.e., if the PC is not modified), a one-cycle delay is needed |
|
|
Term
Question: When is a taken branch’s address known?
|
|
Definition
The ALU is used to compute the target address, so it is known in the EX stage.
This requires a two-cycle (or three-cycle) delay |
|
|
Term
What are the 4 branch hazard alternatives? |
|
Definition
1. Stall until branch direction is clear
2. Predict branch not taken (about 47% of branches are not taken)
3. Predict branch taken (about 53% of branches are taken)
4. Delayed branch |
|
|
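The cost of the first three alternatives can be compared with a simple CPI estimate. This is a rough sketch: the 53% taken fraction comes from the card above, but the branch frequency, the one-cycle penalty, and the assumption that a correctly predicted branch costs nothing are illustrative, not measurements. Delayed branches are left out because their cost depends on how often the slot can be filled usefully.

```python
# Fraction of branches that are taken, as quoted on the card above.
TAKEN_FRACTION = 0.53

def average_branch_penalty(scheme, penalty_cycles, taken_fraction=TAKEN_FRACTION):
    """Average stall cycles per branch under a simplified model."""
    if scheme == "stall":
        return float(penalty_cycles)                    # every branch pays
    if scheme == "predict_not_taken":
        return taken_fraction * penalty_cycles          # only taken branches pay
    if scheme == "predict_taken":
        return (1.0 - taken_fraction) * penalty_cycles  # only not-taken branches pay
    raise ValueError(f"unknown scheme: {scheme}")

def effective_cpi(scheme, branch_fraction, penalty_cycles, base_cpi=1.0):
    """Base CPI plus the average branch stall cycles per instruction."""
    return base_cpi + branch_fraction * average_branch_penalty(scheme, penalty_cycles)

# Assumed workload: 15% of instructions are branches, 1-cycle penalty.
for scheme in ("stall", "predict_not_taken", "predict_taken"):
    print(scheme, round(effective_cpi(scheme, branch_fraction=0.15, penalty_cycles=1), 4))
```

The model treats predict-taken as free when correct, which only holds if the target address is available as early as the branch outcome; in a pipeline where the target is computed later, predict-taken gains less than this suggests.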
Term
Question: What makes pipelining hard to implement? |
|
Definition
Exceptions, faults, interrupts |
|
|
Term
Give Some Examples Of Exceptions
|
|
Definition
• Request for I/O
• Arithmetic troubles: overflow or underflow
• Cache miss: data not in (on-chip) cache memory
• Page fault: data not in (physical) memory
• Illegal address, giving a memory protection violation
• Hardware failure |
|
|
Term
When Do MIPS Exceptions Occur? |
|
Definition
IF: - page fault on instruction fetch
    - misaligned memory access
    - memory protection violation
ID: - undefined or illegal opcode
EX: - arithmetic exception
MEM: - page fault on data fetch/store
     - misaligned memory access
     - memory protection violation
WB: none |
|
|
Term
How many cycles do MIPS instructions take? |
|
Definition
|
|
Term
What are the 4 major causes of pipeline stalls? |
|
Definition
load stalls – from using load result 1 or 2 cycles after load
branch stalls – 2 cycles on every taken branch, or empty branch delay slot
FP result stalls – RAW hazards for an FP operand
FP structural stalls – from conflicts for functional units in FP pipeline |
|
|
Term
CS252 S05
CMSC 411 - 13 (some from Patterson, Sussman, others)
How to reduce the miss rate? |
|
Definition
• Use larger blocks
• Use more associativity, to reduce conflict misses
• Victim cache
• Pseudo-associative caches (won’t talk about this)
• Prefetch (hardware controlled)
• Prefetch (compiler controlled)
• Compiler optimizations |
|
|
Term
Why are cache misses sometimes inevitable? |
|
Definition
Compulsory miss: the first time a block is used, it must be brought into the cache
Capacity miss: if the program needs more blocks at once than can fit into the cache, some will bounce in and out
Conflict miss: in direct-mapped or set-associative caches, certain combinations of addresses cannot be in the cache at the same time |
|
|
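Conflict misses, and how associativity removes them, can be illustrated with a small simulator. This is a minimal sketch assuming LRU replacement and a made-up trace of block addresses; `count_misses` is a hypothetical helper, not a model of any real cache.

```python
from collections import OrderedDict

def count_misses(block_addresses, num_sets, ways):
    """Count misses in a set-associative cache with LRU replacement.

    A direct-mapped cache is the special case ways == 1.
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_addresses:
        lines = sets[block % num_sets]     # set index from the block address
        if block in lines:
            lines.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used block
            lines[block] = True
    return misses

# Blocks 0 and 8 alternate; both caches hold 8 blocks in total.
trace = [0, 8, 0, 8, 0, 8]
direct_mapped = count_misses(trace, num_sets=8, ways=1)
two_way = count_misses(trace, num_sets=4, ways=2)
print(direct_mapped, two_way)
```

In the direct-mapped cache, blocks 0 and 8 map to the same line and keep evicting each other, so every access misses. With two ways per set, only the two compulsory misses remain, even though the total capacity is unchanged.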
Term
Explain how caches improve processor performance. |
|
Definition
Because access to data in cache is faster than access to data in memory, and temporal & spatial locality allow data in cache to be reused once it is brought in from memory. |
|
|
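The benefit can be quantified with the standard average memory access time relation (hit time plus miss rate times miss penalty). The numbers below are illustrative assumptions, not measurements of any particular machine.

```python
def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * miss_penalty

# Assumed values: 1-cycle cache hit, 5% miss rate, 100-cycle miss penalty.
with_cache = average_memory_access_time(hit_time=1, miss_rate=0.05, miss_penalty=100)
print(with_cache)  # average cycles per access with the cache
```

Under these assumptions the average access costs about 6 cycles, compared with roughly 100 cycles if every access went to memory. The gain depends entirely on locality keeping the miss rate low.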
Term
Explain why pipelining improves processor performance |
|
Definition
Performance is improved because pipelining allows instruction execution to be overlapped with other instructions, allowing multiple instructions to be executed at the same time |
|
|
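The effect of overlapping can be estimated with an idealized cycle count. This sketch assumes every stage takes one cycle and there are no stalls or hazards, so it gives an upper bound rather than a realistic figure.

```python
def unpipelined_cycles(num_instructions, stages):
    """Each instruction occupies the whole datapath for `stages` cycles."""
    return num_instructions * stages

def pipelined_cycles(num_instructions, stages):
    """The first instruction takes `stages` cycles; each later one finishes a cycle after."""
    if num_instructions == 0:
        return 0
    return stages + num_instructions - 1

n, k = 1000, 5  # assumed instruction count and pipeline depth
speedup = unpipelined_cycles(n, k) / pipelined_cycles(n, k)
print(f"{speedup:.2f}")
```

For long instruction streams the ideal speedup approaches the number of stages; load, branch, and structural stalls pull real pipelines below that bound.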