CSE 240C: Advanced Microarchitecture (Spring 2020)


Instructor: Hadi Esmaeilzadeh


Email: hadi [AT] eng [DOT] ucsd [DOT] edu
Office: CSE 3228

TA:
Soroush Ghodrati: soghodra [AT] eng [DOT] ucsd [DOT] edu
Office hours: TBD

Schedule

Week Date Topic Reading Assignment Alternative Reading Assignment Presenter
0 03/30/2020 Introduction Power Challenges May End the Multicore Era

Dennard Scaling paper: Design of ion-implanted MOSFETs with very small physical dimensions
Hadi Esmaeilzadeh
0 04/01/2020 Introduction Power Challenges May End the Multicore Era

Dennard Scaling paper: Design of ion-implanted MOSFETs with very small physical dimensions
Hadi Esmaeilzadeh
1 04/06/2020 Unifying Machine Learning Accelerators TABLA: A Unified Template-Based Framework for Accelerating Statistical Machine Learning

Towards a Unified Architecture for in-RDBMS Analytics

Optional: In-RDBMS Hardware Acceleration of Advanced Analytics
Divya Mahajan
2 04/08/2020 Machine Learning Accelerators Scale-Out Acceleration for Machine Learning

Parallelized Stochastic Gradient Descent

Optional: An overview of gradient descent optimization algorithms
Hadi Esmaeilzadeh
3 04/13/2020 Neural Network History Hadi Esmaeilzadeh
3 04/15/2020 No Lectures Hadi Esmaeilzadeh
3 04/20/2020 Neural Network History Hadi Esmaeilzadeh
3 04/22/2020 Deep Learning Accelerators DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning

DaDianNao: A Machine-Learning Supercomputer

Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks

Optional: FPGAs vs. GPUs in Datacenters
Alternative to DianNao: TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory

Alternative to Eyeriss: Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices
Hadi Esmaeilzadeh
4 04/27/2020 Scalable Deep Networks Project Adam: building an efficient and scalable deep learning training system

Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture
Hadi Esmaeilzadeh
5 04/29/2020 Distributed Training A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks Hadi Esmaeilzadeh
6 05/04/2020 Variable Bitwidth DNN Acceleration Stripes: Bit-Serial Deep Neural Network Computing

Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks
HAQ: Hardware-Aware Automated Quantization with Mixed Precision

DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
Hadi Esmaeilzadeh
7 05/06/2020 Mixed-Signal DNN Acceleration Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic

Bit-Parallel Vector Composability for Neural Acceleration
Hadi Esmaeilzadeh
7 05/11/2020 Approximate Computing Neural Acceleration for General-Purpose Approximate Programs

General-Purpose Code Acceleration with Limited-Precision Analog Computation
Hadi Esmaeilzadeh
8 05/13/2020 Tiled/Dataflow Architectures Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture

Understanding Sources of Inefficiency in General-Purpose Chips
Hadi Esmaeilzadeh
8 05/18/2020 Tiled/Dataflow Architectures Exploiting ILP, TLP, and DLP with the Polymorphous TRIPS Architecture

Understanding Sources of Inefficiency in General-Purpose Chips
Hadi Esmaeilzadeh
9 05/20/2020 Tiled/Dataflow Architectures The WaveScalar architecture

The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs
Hadi Esmaeilzadeh