1
|
01/08/2018
|
Dark Silicon
|
Power Challenges May End the Multicore Era
Dennard Scaling paper:
Design of ion-implanted MOSFET's with very small physical dimensions
|
Hadi Esmaeilzadeh
|
2
|
01/10/2018
|
Unifying Machine Learning Accelerators
|
TABLA: A Unified Template-Based Framework for Accelerating Statistical Machine Learning
Towards a Unified Architecture for in-RDBMS Analytics
|
Hadi Esmaeilzadeh
|
3
|
01/15/2018
|
Martin Luther King, Jr. Holiday
|
|
|
4
|
01/17/2018
|
Machine Learning Accelerators
|
Mandatory Readings:
Scale-Out Acceleration for Machine Learning
Parallelized Stochastic Gradient Descent
Optional: An overview of gradient descent optimization algorithms
|
Hadi Esmaeilzadeh
|
5
|
01/22/2018
|
Machine Learning Accelerators
|
DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning
DaDianNao: A Machine-Learning Supercomputer
Optional: FPGAs vs GPUs in Datacenters
Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks
|
Hadi Esmaeilzadeh
|
6
|
01/24/2018
|
Scalable Deep Networks
|
Project Adam: building an efficient and scalable deep learning training system
Large Scale Distributed Deep Networks
|
Hadi Esmaeilzadeh
|
7
|
01/29/2018
|
Distributed Training
|
Device Placement Optimization with Reinforcement Learning [ICML17]
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training [Submitted to ICLR18]
|
Jongse Park
|
8
|
01/31/2018
|
Variable-Bitwidth DNN Acceleration
|
Stripes: Bit-Serial Deep Neural Network Computing
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks
|
Jongse Park
|
9
|
02/05/2018
|
Approximate Computing
|
Neural Acceleration for General-Purpose Approximate Programs
General-Purpose Code Acceleration with Limited-Precision Analog Computation
|
Hadi Esmaeilzadeh
|
10
|
02/07/2018
|
Tiled/Dataflow Architectures
|
Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture
Understanding Sources of Inefficiency in General-Purpose Chips
|
Hadi Esmaeilzadeh
|
10
|
02/12/2018
|
Tiled/Dataflow Architectures
|
The WaveScalar architecture
The raw microprocessor: A computational fabric for software circuits and general purpose programs
|
Hadi Esmaeilzadeh
|
11
|
02/14/2018
|
No Class
|
|
|
12
|
02/19/2018
|
Presidents' Day Holiday
|
|
|
13
|
02/21/2018
|
Project proposals
|
None
|
Hadi Esmaeilzadeh
|
14
|
02/26/2018
|
Access-Execute Architectures
|
Decoupled access/execute computer architectures
Stream-Dataflow Acceleration
Optional: Dynamically Specialized Datapaths for Energy Efficient Computing
Optional: Meet the Walkers: Accelerating Index Traversals for In-Memory Databases
|
Hadi Esmaeilzadeh
|
15
|
02/28/2018
|
Multithreading/Value Prediction
|
Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor
Highly Accurate Data Value Prediction using Hybrid Predictors
|
Hadi Esmaeilzadeh
|
17
|
03/05/2018
|
No Class
|
Project Development
|
|
18
|
03/07/2018
|
No Class
|
Project Development
|
|
19
|
03/12/2018
|
Branch Prediction
|
1. The YAGS branch prediction scheme
2. Design Tradeoffs for the Alpha EV8 Conditional Branch Predictor
3. Analysis of the O-GEometric History Length branch predictor
4. Neural methods for dynamic branch prediction
Optional: The L-TAGE Branch Predictor
|
1. Chao Li
2. Uday Mallappa
3. Vasudev Patel
4. Ahmed Youssef
|
20
|
03/14/2018
|
Scale-Out Acceleration
|
1. A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services
2. A Cloud-Scale Acceleration Architecture
3. Optional: ASIC Clouds: Specializing the Datacenter
4. Optional: Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective
|
1. Behnam Khaleghi
2. Tushar Shah
3. Anuj Rao
4. Nivedha Krishna
|
21
|
03/19/2018
|
No Class - Final exam week
|
|
|
22
|
03/21/2018
|
Project Presentation
|
|
|