
ECE 721 Spring 2021 Advanced Microarchitecture

Schedule

date | topic | notes | quiz
Wed. Jan. 20 | Overview of class topics | pptx (some pics: ppt)
  TOPIC 1: Modern Superscalar Processors  
Mon. Jan. 25 | Physical Register File management: phys. RF, RMT, freelist | pptx, pptx
Wed. Jan. 27 | Physical Register File management: committing and freeing registers, exception recovery, branch misprediction recovery | pptx | D. Wall. Limits of Instruction Level Parallelism. ASPLOS IV, April 1991.

ACM Digital Library and IEEE Xplore paper links will work as-is if you are working from a computer on the NCSU network.

If not accessing from an NCSU network, log in to lib.ncsu.edu and then use this proxy link. I obtained this link by logging in to lib.ncsu.edu, searching journals for ACM Digital Library (for IEEE papers, use IEEE Xplore), and then searching the ACM DL for the paper title.
Mon. Feb. 1 | overflow lecture
Wed. Feb. 3 | overflow lecture
Mon. Feb. 8 | Dynamic Scheduling Algorithm: Phys. RF ready bits, Issue Queue; Sizing structures | pptx, pptx
Wed. Feb. 10 | Handling loads and stores: terminology (memory disambiguation and store-load forwarding), LQ/SQ operation, load speculation and memory dependence predictors | pptx | K. C. Yeager. The MIPS R10000 Superscalar Microprocessor. IEEE Micro, 16(2):28-41, April 1996.
Mon. Feb. 15 | overflow lecture
Wed. Feb. 17 | overflow lecture | | G. Chrysos and J. S. Emer. Memory Dependence Prediction Using Store Sets. ISCA-25, 1998.
Mon. Feb. 22 | Canonical Superscalar Pipeline; Pipeline stages: fetch, decode, rename, dispatch | pptx, pptx | G. Chrysos and J. S. Emer. Memory Dependence Prediction Using Store Sets. ISCA-25, 1998.
Wed. Feb. 24 | Pipeline stages: schedule, register read, execute, writeback, retire | pptx
Mon. Mar. 1 | overflow lecture
  TOPIC 2: High-ILP Processors  
Wed. Mar. 3 | Trace cache | pptx | E. Rotenberg, S. Bennett, and J. E. Smith. Trace Cache: a Low Latency Approach to High Bandwidth Instruction Fetching. MICRO-29, Dec. 1996.
Mon. Mar. 8 | overflow lecture
Wed. Mar. 10 | Value prediction | pptx | Y. Sazeides and J. E. Smith. The Predictability of Data Values. MICRO-30, Dec. 1997.
Mon. Mar. 15 | Midterm Exam
Wed. Mar. 17 | overflow lecture
Mon. Mar. 22 | Predication | pptx (aux. pptx) | A. Klauser, T. Austin, D. Grunwald, and B. Calder. Dynamic Hammock Predication for Non-predicated Instruction Set Architectures. PACT, Oct. 1998.
Wed. Mar. 24 | no class (Wellness Day)
Mon. Mar. 29 | overflow lecture (Advice on research projects moved to April 5)
Wed. Mar. 31 | Simultaneous multithreading (SMT) | pre-recorded lecture: Part 1, Part 2; pptx | D. M. Tullsen et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor. ISCA-23, May 1996.
Mon. Apr. 5 | Discuss last question of Quiz 8. Performance of SMT with multiple threads (pros, cons). Q&A on SMT paper and pre-recorded modules. Advice on research projects. | Project guidelines, Report format, Presentation guidelines
Wed. Apr. 7 | Trace processors | pre-recorded lecture; pptx | E. Rotenberg, Q. Jacobson, Y. Sazeides, and J. E. Smith. Trace processors. MICRO-30, Dec. 1997.
 TOPIC 3: Large-Window Processors  
Mon. Apr. 12 | Checkpoint Processing and Recovery (CPR): fine-grain (ROB) vs. coarse-grain (Checkpoint) recovery, aggressive register reclamation; operation, example simulation | pre-recorded lecture: Part 1, Part 2; pptx, pptx | H. Akkary, R. Rajwar, and S. Srinivasan. Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors. MICRO-36, 2003.
Wed. Apr. 14 | Continual Flow Pipelines (CFP) | pptx; pptx (enhanced ROB+CFP animation); Project presentation guidelines | S. Srinivasan, R. Rajwar, H. Akkary, A. Gandhi, and M. Upton. Continual Flow Pipelines. ASPLOS’04, 2004.
Mon. Apr. 19 | Runahead Execution | (see prev. pptx) | O. Mutlu, J. Stark, C. Wilkerson, and Y. Patt. Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors. HPCA-9, 2003.
Wed. Apr. 21 | project presentations
Mon. Apr. 26 | project presentations
Wed. Apr. 28 | project presentations