Wed. Jan. 20 | Overview of class topics | pptx (some pics: ppt) | |
| TOPIC 1: Modern Superscalar Processors | | |
Mon. Jan. 25 | Physical Register File management: phys. RF, RMT, freelist | pptx, pptx | |
Wed. Jan. 27 | Physical Register File management: committing and freeing registers, exception recovery, branch misprediction recovery | pptx | D. Wall. Limits of Instruction Level Parallelism. ASPLOS IV, April 1991.
ACM Digital Library and IEEE Xplore paper links will work as-is if you are working from a computer on the NCSU network.
If you are off the NCSU network, log in to lib.ncsu.edu and then use this proxy link. I obtained the link by logging in to lib.ncsu.edu, searching journals for ACM Digital Library (for IEEE papers, use IEEE Xplore), and then searching the ACM DL for the paper title.
|
Mon. Feb. 1 | overflow lecture | | |
Wed. Feb. 3 | overflow lecture | | |
Mon. Feb. 8 | Dynamic Scheduling Algorithm: Phys. RF ready bits, Issue Queue Sizing structures | pptx pptx | |
Wed. Feb. 10 | Handling loads and stores: terminology (memory disambiguation and store-load forwarding), LQ/SQ operation, load speculation and memory dependence predictors | pptx | K. C. Yeager. The MIPS R10000 superscalar microprocessor. IEEE Micro, 16(2):28-41, April 1996. |
Mon. Feb. 15 | overflow lecture | | |
Wed. Feb. 17 | overflow lecture | | G. Chrysos and J. S. Emer. Memory Dependence Prediction Using Store Sets. ISCA-25, 1998. |
Mon. Feb. 22 | Canonical Superscalar Pipeline. Pipeline stages: fetch, decode, rename, dispatch | pptx pptx | G. Chrysos and J. S. Emer. Memory Dependence Prediction Using Store Sets. ISCA-25, 1998. |
Wed. Feb. 24 | Pipeline stages: schedule, register read, execute, writeback, retire | pptx | |
Mon. Mar. 1 | overflow lecture | | |
| TOPIC 2: High-ILP Processors | | |
Wed. Mar. 3 | Trace cache | pptx | E. Rotenberg, S. Bennett, and J. E. Smith. Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching. MICRO-29, Dec. 1996. |
Mon. Mar. 8 | overflow lecture | | |
Wed. Mar. 10 | Value prediction | pptx | Y. Sazeides and J. E. Smith. The Predictability of Data Values. MICRO-30, Dec. 1997. |
Mon. Mar. 15 | Midterm Exam | | |
Wed. Mar. 17 | overflow lecture | | |
Mon. Mar. 22 | Predication | pptx (aux. pptx) | A. Klauser, T. Austin, D. Grunwald, and B. Calder. Dynamic Hammock Predication for Non-predicated Instruction Set Architectures. PACT, Oct. 1998. |
Wed. Mar. 24 | no class (Wellness Day) | | |
Mon. Mar. 29 | overflow lecture (advice on research projects moved to Apr. 5) | | |
Wed. Mar. 31 | Simultaneous multithreading (SMT) | pre-recorded lecture: Part 1, Part 2; pptx | D. M. Tullsen et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor. ISCA-23, May 1996. |
Mon. Apr. 5 | Discuss last question of Quiz 8. Performance of SMT with multiple threads (pros, cons). Q&A on SMT paper and pre-recorded modules. Advice on research projects: Project guidelines, Report format, Presentation guidelines | | |
Wed. Apr. 7 | Trace processors | pre-recorded lecture; pptx | E. Rotenberg, Q. Jacobson, Y. Sazeides, and J. E. Smith. Trace processors. MICRO-30, Dec. 1997. |
| TOPIC 3: Large-Window Processors | | |
Mon. Apr. 12 | Checkpoint Processing and Recovery (CPR): fine-grain (ROB) vs. coarse-grain (Checkpoint) recovery, aggressive register reclamation; operation, example simulation | pre-recorded lecture: Part 1, Part 2; pptx, pptx | H. Akkary, R. Rajwar, and S. Srinivasan. Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors. MICRO-36, 2003. |
Wed. Apr. 14 | Continual Flow Pipelines (CFP) | pptx, pptx (enhanced ROB+CFP animation); Project presentation guidelines | S. Srinivasan, R. Rajwar, H. Akkary, A. Gandhi, and M. Upton. Continual Flow Pipelines. ASPLOS’04, 2004. |
Mon. Apr. 19 | Runahead Execution | (see prev. pptx) | O. Mutlu, J. Stark, C. Wilkerson, and Y. Patt. Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors. HPCA-9, 2003. |
Wed. Apr. 21 | project presentations | | |
Mon. Apr. 26 | project presentations | | |
Wed. Apr. 28 | project presentations | | |