Email: eyeriss at mit dot edu
Deep neural networks (DNNs) are widely used in many AI applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, this accuracy comes at the cost of high computational complexity. Accordingly, designing efficient hardware architectures for DNNs is an important step towards enabling their wide deployment in AI systems.
In this tutorial, we will provide an overview of DNNs, discuss the tradeoffs of the various architectures that support DNNs, including CPUs, GPUs, FPGAs, and ASICs, and highlight important benchmarking/comparison metrics and design considerations. We will then describe recent techniques that reduce the computation cost of DNNs from both the hardware architecture and network algorithm perspectives. Finally, we will discuss the different hardware requirements for inference and training.
An overview paper based on the tutorial, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," is available here.
Entire Tutorial [ slides ]
@article{2017_dnn_piee,
title={Efficient processing of deep neural networks: A tutorial and survey},
author={Sze, Vivienne and Chen, Yu-Hsin and Yang, Tien-Ju and Emer, Joel},
journal={Proceedings of the IEEE},
year={2017}
}