Term
|
Definition
MARIE is a simple architecture. |
|
|
Term
|
Definition
• Binary 2’s complement integers • Stored programs with fixed word lengths • Word addressable; not byte addressable • 4 kilowords of main memory (12-bit addresses) • 16-bit data words and instruction words • 4-bit opcodes, 12-bit address operands |
|
|
Term
|
Definition
• AC: 16-bit accumulator • PC: 12-bit program counter • IR: 16-bit instruction register • MAR: 12-bit memory address register • MBR: 16-bit memory buffer register • InREG, OutREG: 16-bit input/output registers |
|
|
Term
Register Transfer Notation |
|
Definition
A typical instruction in MARIE is the ADD instruction: it loads the address specified by its operand into the memory address register, loads the value of memory at that location into the memory buffer register, then orders the ALU to add the accumulator and the memory buffer register together and store the result in the accumulator. To express what instructions do more concisely, we use register transfer notation. This is a conceptual language that uses the left-arrow to express data transfer or assignment, and M[x] to represent the value of memory at location x. |
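The register transfers for ADD X can be sketched directly in code. This is an illustrative fragment, not an official MARIE simulator; the address and memory contents are made up.

```python
# Sketch of MARIE's ADD X instruction as a sequence of register
# transfers. Names mirror the registers described above; the memory
# address (0x105) and its contents are illustrative only.

M = {0x105: 7}           # M[x]: main memory, word-addressable
AC, MAR, MBR = 5, 0, 0   # accumulator, memory address reg, memory buffer reg

X = 0x105                # the instruction's 12-bit address operand

# RTN for ADD X:
MAR = X                  # MAR <- X
MBR = M[MAR]             # MBR <- M[MAR]
AC = AC + MBR            # AC  <- AC + MBR

print(AC)  # 12
```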
|
|
Term
|
Definition
MARIE provides the following instructions. |
|
|
Term
|
Definition
has the effect of letting you conditionally skip the next instruction. Together with Jump it allows for conditional branching – i.e., if statements. |
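The Skipcond-plus-Jump pattern can be sketched with a tiny interpreter. The opcode names and the "AC == 0" condition (one of Skipcond's modes) are simplified stand-ins, not MARIE's real encoding.

```python
# Hypothetical sketch of how Skipcond + Jump implement an if-statement:
# Skipcond skips the next instruction when its condition holds, and that
# next instruction is a Jump over the "then" body.

def run(program, AC):
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "SKIPCOND_EQ0":      # skip the next instruction if AC == 0
            if AC == 0:
                pc += 1
        elif op == "JUMP":            # unconditional branch
            pc = arg
        elif op == "LOAD":            # AC <- constant (stand-in for Load X)
            AC = arg
    return AC

# Equivalent of: if AC == 0: AC = 1
prog = [("SKIPCOND_EQ0", None),  # skip the Jump when AC == 0
        ("JUMP", 3),             # otherwise jump past the Load
        ("LOAD", 1)]             # "then" branch

print(run(prog, 0))  # 1  (branch taken)
print(run(prog, 5))  # 5  (branch skipped)
```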
|
|
Term
Labels and Initialization |
|
Definition
MARIE allows instructions and memory locations to be labeled, so that the above X values need not be actual numbers. These labels can point to instructions, or to locations in memory initialized to specific values. |
|
|
Term
|
Definition
The above human-readable assembly language is translated by a program called an assembler into the actual code that can be executed on a (presumably simulated) MARIE architecture. In the case of MARIE this process is relatively simple: each instruction is translated to its four-bit opcode, and if there is a 12-bit address operand it is appended to the opcode. |
|
|
Term
The Assembly Process Pt. 2 |
|
Definition
However, MARIE cannot, on the first run through, resolve something like AddN or X from our labels example – a reference to a location that is not defined until later. |
|
|
Term
The Assembly Process Pt. 3 |
|
Definition
This problem is solved by two-pass assembly: the assembler goes through the code translating opcodes and resolving names into locations, keeping a symbol table that holds the location of each name. Then it goes through the code again, and appends numeric addresses to the opcodes based on the contents of the symbol table. |
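The two-pass scheme can be sketched as a toy assembler. The opcode values and the "Label, Instruction" syntax below are illustrative assumptions, not MARIE's actual encoding table.

```python
# Toy two-pass assembler for a MARIE-like language. Pass 1 builds the
# symbol table; pass 2 resolves names (including forward references)
# into 12-bit addresses appended to 4-bit opcodes. Opcode values are
# assumed for illustration.

OPCODES = {"Load": 0x1, "Add": 0x3, "Halt": 0x7}  # illustrative values

def assemble(lines):
    # Pass 1: record the address of each label in a symbol table.
    symtab, stripped = {}, []
    for addr, line in enumerate(lines):
        if "," in line:                        # "Label, Instr Operand"
            label, line = line.split(",", 1)
            symtab[label.strip()] = addr
        stripped.append(line.split())
    # Pass 2: emit 16-bit words using the symbol table.
    words = []
    for parts in stripped:
        if parts[0] == "Dec":                  # data directive: a constant
            words.append(int(parts[1]) & 0xFFFF)
        else:
            word = OPCODES[parts[0]] << 12
            if len(parts) > 1:
                word |= symtab[parts[1]] & 0xFFF  # forward refs now resolve
            words.append(word)
    return words

prog = ["Load X", "Add X", "Halt", "X, Dec 5"]
print([hex(w) for w in assemble(prog)])  # ['0x1003', '0x3003', '0x7000', '0x5']
```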
|
|
Term
|
Definition
MARIE’s instruction set has the fundamental characteristics that it’s composed of opcodes and operands and manipulates registers and memory. But it’s oversimplified to say the least; real instruction sets are much more complicated. |
|
|
Term
|
Definition
• Shorter or longer instructions? Shorter ones take up less space and can be fetched more quickly, but limit the number of instructions and operands. • Fixed or variable length? Fixed-length instructions decode more easily, but waste space. o This decision isn’t binary – for instance, you could have a fixed opcode length but variable numbers of operands. • What’s the memory organization? If it’s not byte-addressable, accessing single bytes is going to be annoying. • What addressing modes to support? • What byte order? • How many registers? |
|
|
Term
|
Definition
• Order of bytes in integers • Little endian: Least significant byte first o Certain advantages in arithmetic • Big endian: Most significant byte first o More intuitive o Store integers in the same order as strings • Different formats use different byte orders • Network byte order is big endian |
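Byte order is easy to see by serializing the same integer both ways with Python's standard struct module:

```python
# The same 32-bit integer serialized in both byte orders.
import struct

n = 0x01020304
little = struct.pack("<I", n)   # least significant byte first
big    = struct.pack(">I", n)   # most significant byte first (network byte order)

print(little.hex())  # 04030201
print(big.hex())     # 01020304
```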
|
|
Term
|
Definition
• Little endian: Least significant byte first o Certain advantages in arithmetic • Different formats use different byte orders |
|
|
Term
|
Definition
• Big endian: Most significant byte first o More intuitive o Store integers in the same order as strings • Different formats use different byte orders • Network byte order is big endian |
|
|
Term
|
Definition
• Stack architectures are now mostly a historical curiosity. • Accumulator architectures are like MARIE. • General-purpose register architectures are what gets used now. Three levels of flexibility: o Load-Store architectures require everything to be loaded into registers first. (PowerPC) o Register-Memory architectures require at least one operand in a register. (Intel) o Memory-Memory architectures let you act directly and entirely on memory. (VAX) |
|
|
Term
|
Definition
• Stack architectures are now mostly a historical curiosity. |
|
|
Term
Accumulator architectures |
|
Definition
• Accumulator architectures are like MARIE. |
|
|
Term
General-purpose register architectures |
|
Definition
• General-purpose register architectures are what gets used now. Three levels of flexibility: o Load-Store architectures require everything to be loaded into registers first. (PowerPC) o Register-Memory architectures require at least one operand in a register. (Intel) o Memory-Memory architectures let you act directly and entirely on memory. (VAX) |
|
|
Term
|
Definition
Principle of orthogonality: each instruction should do something unique and not duplicate another instruction. • Data movement • Arithmetic • Boolean Logic • Bit Manipulation • I/O • Control • Special Purpose |
|
|
Term
|
Definition
• Analogous to an assembly line • Break the Von Neumann cycle into its components • Assign each component to a different part of the processor • Could even do this with different steps in the instruction… • But what about branches? Dependent values? • More detail when we get to parallelism. |
|
|
Term
|
Definition
There are many types of memory, but they all break down into two types in the end. |
|
|
Term
|
Definition
• Read-Only Memory o Generally only read by the computer, not written to o Retains its state after power is lost, making it useful for permanent or quasi-permanent data storage o Not all ROM-like technologies are actually read-only: flash USB drives are based on Flash EEPROM (Flash Electronically Erasable Programmable Read-Only Memory), and non-flash EEPROM, EPROM, and PROM also all exist |
|
|
Term
|
Definition
• Random-Access Memory o When we say memory, this is what we mean o Can be read from or written to by the computer arbitrarily at any time o Loses state when power is lost – not useful for permanent data storage o RAM technologies include SDRAM, DDR SDRAM, DDR2, DDR3, RDRAM, et cetera |
|
|
Term
|
Definition
• We want memory to be fast, cheap and large. We can have any two of those. o Of course, we always want cheap. o So we have a large amount of slow memory, and a small amount of fast memory. o This dichotomy easily becomes a sliding scale, and implies the memory hierarchy. |
|
|
Term
The Memory Hierarchy Pt. 2 |
|
Definition
The memory hierarchy is a conceptual and actual arrangement of increasingly large layers of primary memory. Memory “closest” to the CPU is the fastest and smallest; memory “farthest” from the CPU is the largest and slowest. |
|
|
Term
The Memory Hierarchy Pt. 3 |
|
Definition
Very fast cache memory is typically measured in single-digit megabytes, as opposed to single-digit gigabytes – it is much more expensive. |
|
|
Term
The Memory Hierarchy Pt. 4 |
|
Definition
We don’t want to have to manage this explicitly – just keeping track of RAM is enough trouble for the programmer. We need a way to automatically manage the movement of information up and down the hierarchy. |
|
|
Term
|
Definition
The principle of locality observes simply that we will tend to use information in identifiable groups. Three aspects: • Spatial locality: We will tend to use information close to the information we just used. • Temporal locality: We will tend to use information we have recently used. • Sequential locality: We will tend to progressively use information from a “start” to an “end”. This principle lets us develop methods to automatically cache information from the lower levels of the hierarchy into the higher ones. |
|
|
Term
|
Definition
• Spatial locality: We will tend to use information close to the information we just used. |
|
|
Term
|
Definition
• Temporal locality: We will tend to use information we have recently used. |
|
|
Term
|
Definition
• Sequential locality: We will tend to progressively use information from a “start” to an “end”. |
|
|
Term
|
Definition
All caches work the same way up to a point: • Divide memory into logical blocks or pages. • Keep a set of pages – the cache – in very fast memory. • When a page already within the cache is needed, just use it there. • When a page not yet within the cache is needed, pull it from the slower RAM into the cache. All caches also share some terminology: • Hit: When information is successfully accessed from the cache. • Miss: When information has to be accessed from the slower RAM. • Hit rate: How often information is successfully accessed from the cache. • Miss rate: How often information has to be accessed from the slower RAM. (1 – Hit rate.) • Hit time: How long it takes to access information from the cache. • Miss time: How long it takes to access information from the slower RAM. |
|
|
Term
|
Definition
• Hit: When information is successfully accessed from the cache. |
|
|
Term
|
Definition
• Miss: When information has to be accessed from the slower RAM. |
|
|
Term
|
Definition
• Hit rate: How often information is successfully accessed from the cache. |
|
|
Term
|
Definition
• Miss rate: How often information has to be accessed from the slower RAM. (1 – Hit rate.) |
|
|
Term
|
Definition
• Hit time: How long it takes to access information from the cache. |
|
|
Term
|
Definition
• Miss time: How long it takes to access information from the slower RAM. |
|
|
Term
|
Definition
• Miss penalty: The miss time minus the hit time. |
|
|
Term
|
Definition
The most straightforward type of cache is a direct-mapped cache. • Divide main memory pages into sets. • Each set gets exactly one corresponding page in the cache. • Misses occur whenever a page in a set is accessed that is not that set’s current page in the cache. The direct-mapped cache is effective and improves performance, but has several problems. |
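A minimal simulation makes both the mechanism and one of its problems visible. The slot count and access pattern below are made up for illustration.

```python
# Minimal direct-mapped cache: each page maps to exactly one cache slot
# (page number mod cache size), so two pages that share a set evict each
# other even when the rest of the cache sits empty.

class DirectMappedCache:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.hits = self.misses = 0

    def access(self, page):
        slot = page % len(self.slots)   # the set's single cache slot
        if self.slots[slot] == page:
            self.hits += 1
        else:
            self.misses += 1
            self.slots[slot] = page     # replace the set's current page

cache = DirectMappedCache(4)
for page in [0, 4, 0, 4]:   # pages 0 and 4 share slot 0: every access misses
    cache.access(page)
print(cache.hits, cache.misses)  # 0 4
```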
|
|
Term
|
Definition
A much more flexible type is the fully associative cache. • Any page in main memory may be loaded into any page in the cache. • Misses occur whenever a page in main memory is accessed that is not currently in the cache. The problem with fully associative caches is that there is too much overhead – the entire cache must be indexed. |
|
|
Term
|
Definition
Set-associative caches are an attempt to retain some of the simplicity of direct-mapped caches while gaining some of the performance of fully associative caches. • Divide main memory pages into sets. • Each set gets n pages in the cache. This n is used as a secondary indicator of the type of cache – an n-way cache gives each set n pages. • Misses occur whenever a page in a set is accessed that is not any of that set’s current pages in the cache. |
|
|
Term
|
Definition
For any type of associative cache, when we need to load a page into the cache, we have to decide which page to replace. The ideal approach is to remove the page that it will be the longest until we need. Since we can’t do that, we have a few ways to try to approximate it: • LRU – Replace the least recently used page. • LFU – Replace the least frequently used page. • FIFO – Replace the page that was loaded longest ago – first-in, first-out. All of these strategies have overhead problems; we usually use an optimized approximation of one or more of them. |
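LRU, the first of these strategies, can be sketched in a few lines using an ordered dictionary to track recency. The capacity and access pattern are illustrative.

```python
# Sketch of LRU replacement for a small fully associative cache:
# OrderedDict insertion order doubles as recency order.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)      # hit: mark most recently used
            return "hit"
        if len(self.pages) >= self.capacity:  # full: evict least recently used
            self.pages.popitem(last=False)
        self.pages[page] = True
        return "miss"

cache = LRUCache(2)
print([cache.access(p) for p in [1, 2, 1, 3, 2]])
# ['miss', 'miss', 'hit', 'miss', 'miss']
```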
|
|
Term
|
Definition
• LRU – Replace the least recently used page. |
|
|
Term
|
Definition
• LFU – Replace the least frequently used page. |
|
|
Term
|
Definition
• FIFO – Replace the page that was loaded longest ago – first-in, first-out. |
|
|
Term
|
Definition
Modern systems often use more than one level of cache – a level one cache that in turn caches a level two cache that in turn caches main memory. |
|
|
Term
|
Definition
The only measure of cache performance that matters is the effective access time – the time it takes to access memory. Averaging out all accesses, this time can be written as: (Hit rate)(Hit time) + (Miss rate)(Miss time) (The book uses different notation, but the equations are equivalent.) |
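The formula is a one-liner; the timings below are made up but plausible (nanosecond-scale cache hits, tens of nanoseconds for RAM).

```python
# Effective access time: (hit rate)(hit time) + (miss rate)(miss time).
def effective_access_time(hit_rate, hit_time, miss_time):
    return hit_rate * hit_time + (1 - hit_rate) * miss_time

# Example: 95% hit rate, 2 ns cache hits, 50 ns misses to RAM.
print(effective_access_time(0.95, 2, 50))  # ≈ 4.4 ns on average
```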
|
|
Term
|
Definition
Virtual memory, along with the related concept of swapping, allows the use of disk space as memory space, providing an extremely slow but extremely large bottom layer of the memory hierarchy. |
|
|
Term
|
Definition
We discuss virtual memory only in the context of the memory hierarchy here. It actually solves a lot more problems than just allowing for swapping; it’s important even if you have enough memory to never need to swap. You’ll learn why in Operating Systems. |
|
|
Term
|
Definition
• Virtual address – the logical memory addresses that programmers use. |
|
|
Term
|
Definition
• Physical address – the actual address of something in RAM. |
|
|
Term
|
Definition
• Mapping – translation of virtual addresses to physical addresses. |
|
|
Term
|
Definition
• Page frame – a place in physical RAM to store a page. |
|
|
Term
|
Definition
• Paging – copying of a page from disk to memory. |
|
|
Term
|
Definition
• Page fault – a miss in the context of virtual memory. |
|
|
Term
|
Definition
• Page file – an area of disk reserved for RAM. |
|
|
Term
|
Definition
In a system using virtual memory, each process has a page table that describes which of its virtual pages are currently in physical memory and where. Every virtual address has a page field and an offset field; it is the pages we are concerned with here. • Whenever memory is accessed, a virtual page field must be translated into a physical page number. • The page table is checked to see whether the page is in memory. • If the page is in memory, the virtual address is translated into a physical address. • If the page is not in memory, the page is loaded into memory, the page table is updated, and then the virtual address is translated into a physical address. If we were out of page frames, we have to pick a page to replace. We’ve already got algorithms for that above. |
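The lookup sequence above can be sketched as code. The page size, the dictionary-based page table, and the fault handler are all illustrative simplifications of the real hardware structures.

```python
# Sketch of virtual-to-physical translation through a page table.
# A virtual address splits into a page field and an offset field;
# only the page field is translated.

PAGE_SIZE = 4096

def translate(vaddr, page_table, load_page):
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:                  # page fault: bring the page in
        frame = load_page(page)        # (may first evict a page if full)
        page_table[page] = frame       # update the page table
    return frame * PAGE_SIZE + offset

table = {0: 7}                         # virtual page 0 -> page frame 7
print(translate(100, table, load_page=lambda p: 0))  # 7 * 4096 + 100 = 28772
```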
|
|
Term
|
Definition
An unfortunate implication of virtual memory is the following: • We need to check the page table for every memory access. • The page table is in memory. • That means that each memory access is doubled. The way we get around this is, of course, to cache the page table. The translation look-aside buffer is simply a fully associative cache of virtual-to-physical page field translations. |
|
|
Term
Translation Look-Aside Buffer |
|
Definition
The translation look-aside buffer is simply a fully associative cache of virtual-to-physical page field translations. |
|
|
Term
|
Definition
So, in a virtual memory system, every time we access memory, we do the following: • Obtain the virtual address. • Get the page field from the virtual address. • Check if the page field is in the TLB. o If the page field is in the TLB, we’ve got the page frame. o If the page field isn’t in the TLB, check if it’s in the page table. o If it is in the page table, we’ve got the page frame; update the TLB. o If it isn’t in the page table, load the page from disk: if memory is full, first pick a page to write back out to disk; either way, load the page from disk into memory; now we’ve got the page frame, so update the page table and the TLB. • Now that we have the page frame, check the cache. o If the page is in the cache, access it. o If it isn’t in the cache, load it into the cache, then access it. Modern computers do all of this for each and every memory access. |
|
|
Term
|
Definition
The overall speedup obtained by an upgrade to a computer system depends on both the speedup of the upgraded component and how much that component is used. Amdahl’s Law computes this speedup, and helps us make decisions on what components to upgrade. Let f be the fraction of work performed by a component and k be the proposed speedup of that new component. Then the system speedup S is: S = 1 / ((1 – f) + f / k) |
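Amdahl’s Law, S = 1 / ((1 − f) + f / k), is easy to sketch numerically; the fraction and speedup below are made-up example values.

```python
# Amdahl's Law: f is the fraction of work done by the component,
# k is that component's speedup; S bounds the whole-system speedup.
def amdahl_speedup(f, k):
    return 1 / ((1 - f) + f / k)

# Example: a component doing 40% of the work is made 10x faster.
print(amdahl_speedup(0.4, 10))  # ~1.5625: the system gets at most ~1.56x faster
```

Note the limit: even as k grows without bound, S can never exceed 1 / (1 − f), which is why speeding up a rarely used component buys very little.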
|
|
Term
|
Definition
I/O subsystems include: • Areas of main memory dedicated to I/O • Buses dedicated to I/O • Control modules in the computer and in peripherals • Interfaces to those peripherals • Cables, wires or traces from the computer to peripherals |
|
|
Term
|
Definition
There are four fundamental control methods of I/O, and they are not always mutually exclusive. |
|
|
Term
|
Definition
Programmed • The CPU periodically polls a control register associated with I/O ports. • When data is ready, the CPU retrieves it. • After retrieval, the CPU resumes monitoring. • Simple and flexible, but has performance consequences. |
|
|
Term
|
Definition
Interrupt Driven • Each device is given a direct line to an interrupt controller. • When a device is ready for the CPU’s attention, the interrupt controller signals the CPU. • Execution is interrupted and the interrupt is handled. • Execution is then returned to where it was interrupted. • More complex and requires more code to handle, but has significantly better performance. |
|
|
Term
|
Definition
Direct Memory Access • In traditional I/O, the CPU uses an output register or logical “port” to transfer data one byte, or word, at a time. • DMA gives the DMA controller access to a specific area of main memory. • When I/O begins, the controller is allowed to access memory independent of the CPU to transfer data. • There is still only one memory bus; conflicts must be managed. |
|
|
Term
|
Definition
Channel I/O • Dedicates memory and resources to certain I/O paths – for example, between a disk and tape backup – to allow them to proceed without slowing down the computer at all, even for memory access. • Highly complex and expensive; used for specific server-class applications. |
|
|
Term
|
Definition
• I/O buses cannot generally operate synchronously – you don’t know when I/O is going to happen. o That doesn’t mean a clock isn’t involved! o We will not describe bus protocols and timing extensively; refer to 7.4.3 and the references if you are curious. |
|
|
Term
|
Definition
Parallel • Transmits data across multiple signal lines at once • Fast and cheap at short distances • Highly prone to interference Serial • Uses only one signal line to transmit data • Generally, can be reliably sent faster over longer distances • Most modern high-performance interfaces are serial |
|
|
Term
|
Definition
Parallel • Transmits data across multiple signal lines at once • Fast and cheap at short distances • Highly prone to interference |
|
|
Term
|
Definition
• Uses only one signal line to transmit data • Generally, can be reliably sent faster over longer distances • Most modern high-performance interfaces are serial |
|
|
Term
|
Definition
Fixed Disk When we say “hard disk” – or increasingly, even just “disk” – this is what we mean. • Composed of metal or glass platters stacked on a spindle, the platters holding films of magnetized material • The platters spin. 7,200 RPM is typical these days; some disks are as fast as 15,000 RPM • Read-write heads on an actuator arm traverse the radius of the platters • A track is data laid in a circle around a platter. Inner tracks are physically shorter than outer ones. • A cylinder is all the tracks directly above and below each other on the platters. Performance characteristics: • Seek time is how long it takes for an arm to reach a cylinder. • Rotational delay is the time for a sector to get under a read-write head. • Access time is the sum of the seek time and the rotational delay. • Transfer time is how long it takes to actually read data. Transfer speed is obviously related. |
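The performance terms combine arithmetically; the seek time below is a made-up but typical figure, and average rotational delay is half a revolution.

```python
# Access time = seek time + average rotational delay.
# Average rotational delay is half a revolution at the given RPM.

def access_time_ms(avg_seek_ms, rpm):
    ms_per_rev = 60_000 / rpm            # milliseconds per revolution
    avg_rotational_delay = ms_per_rev / 2
    return avg_seek_ms + avg_rotational_delay

# Example: 9 ms average seek on a 7,200 RPM drive
# (~4.17 ms average rotational delay).
print(access_time_ms(9.0, 7200))
```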
|
|
Term
|
Definition
• Seek time is how long it takes for an arm to reach a cylinder. |
|
|
Term
|
Definition
• Rotational delay is the time for a sector to get under a read-write head. |
|
|
Term
|
Definition
• Access time is the sum of the seek time and the rotational delay. |
|
|
Term
|
Definition
• Transfer time is how long it takes to actually read data. Transfer speed is obviously related. |
|
|
Term
|
Definition
Flexible Disks • Work basically the same way as fixed disks, but are much, much slower. • Basically dead. |
|
|
Term
|
Definition
CD-ROM • ~700MB optical disk • Can hold large amounts of data, and good at sequential streaming • Not at all good at random access |
|
|
Term
|
Definition
DVD-ROM • ~4.7GB (per layer) optical disk • Significantly better seek time than CD • (That doesn’t mean it has good seek time) |
|
|
Term
|
Definition
Blu-Ray • ~25GB (per layer) optical disk |
|
|
Term
|
Definition
Tape • Slow and sequential • Not suitable for “online backup” or fast restoration • Still very cost-effective |
|
|
Term
|
Definition
• 1988: Patterson, Gibson, and Katz, UC Berkeley: A Case for Redundant Arrays of Inexpensive Disks • Acronym has been changed to “independent” rather than “inexpensive” o …since nobody uses expensive disks any more |
|
|
Term
|
Definition
• RAID 0: Striping without redundancy o Stripes data across disk surfaces so that one record sits across multiple surfaces o Does not attempt to ensure redundancy o Highest performance o Risk raised by number of drives |
|
|
Term
|
Definition
• RAID 1: Mirroring o Mirrors all data across two drives o Does not attempt to improve performance o Performance equal to one drive o Risk lowered drastically |
|
|
Term
|
Definition
• RAID 2: Bitwise striping with Hamming codes o Breaks bytes into bits and writes one bit per drive o Writes data correction codes to extra drives o Far too much overhead to be useful |
|
|
Term
|
Definition
• RAID 3: Bitwise striping with parity bits o Like RAID 2 except uses simple XOR codes o Can recover if any one drive fails o Still a lot of overhead |
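The XOR trick behind RAID 3 (and RAID 5) can be shown in a few lines; the block values and drive count are made up for illustration.

```python
# XOR parity: the parity of the data blocks lets us reconstruct the
# block from any single failed drive by XORing the survivors with it.
from functools import reduce

data = [0b1010, 0b0110, 0b1100]            # blocks on three data drives
parity = reduce(lambda a, b: a ^ b, data)  # stored on the parity drive

# Drive 1 fails; XOR the surviving blocks with parity to recover it.
recovered = data[0] ^ data[2] ^ parity
print(recovered == data[1])  # True
```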
|
|
Term
|
Definition
• RAID 4: Record-length striping with a parity drive o Parity drive becomes untenable performance bottleneck |
|
|
Term
|
Definition
• RAID 5: Striping with parity scattered across the drives o Increases performance significantly by using multiple drives at once o Can recover if any one drive fails o Very commonly used |
|
|
Term
|
Definition
• RAID 6: Like RAID 5, but uses Reed-Solomon coding instead of parity o Also increases performance o Can recover if any two drives fail o More expensive with greater overhead than RAID 5; used for high-reliability server applications |
|
|