Benchmarking DNN Processors




To enable comparison across designs, we recommend that designs report benchmarking metrics for widely used state-of-the-art DNNs (e.g., AlexNet, VGG, GoogLeNet, ResNet) with inputs from well-known datasets such as ImageNet. We aim to summarize the results on this website.

DNN models can be downloaded here.

Please submit benchmarking metrics using this form.


Explanation of Metrics

  • Measure energy and off-chip (e.g., DRAM) access relative to number of non-zero MACs and bit-width of MACs
    • Account for impact of sparsity in weights and activations
    • To compute the off-chip access, assume the DNN processor is a stand-alone chip. The off-chip access should account for all accesses needed to complete all the layers listed, including reading the initial inputs from, and writing the final outputs to, an off-chip device (e.g., DRAM). The goal is to compare the off-chip access at steady state, so accesses during ramp-up/ramp-down do not need to be included (e.g., loading configuration parameters, or loading weights *if* all weights can be stored on chip).
  • Energy Efficiency of Design
    • pJ/non-zero MAC
  • External Memory Bandwidth
    • Off-chip access (in Bytes)/non-zero MAC
  • Area Efficiency
    • Total chip mm²/multiplier and storage capacity/multiplier
    • Accounts for on-chip memory
More details are available in the Tutorial on Hardware Architectures for DNN.
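The metrics above can be sketched as a few small helper functions. This is a minimal illustration, not the reference procedure behind the summary tables: the function names are hypothetical, and it assumes that chip energy is the average chip power multiplied by the run time. The example numbers are made up purely to show the unit conversions.

```python
def energy_per_nonzero_mac_pj(chip_power_mw, run_time_ms, nonzero_macs):
    """Chip energy per non-zero MAC in pJ.

    Assumption: energy = average chip power x run time.
    mW x ms = microjoules, and 1 microjoule = 1e6 pJ.
    """
    energy_pj = chip_power_mw * run_time_ms * 1e6
    return energy_pj / nonzero_macs


def offchip_bytes_per_nonzero_mac(offchip_bytes, nonzero_macs):
    """Steady-state off-chip access (in Bytes) per non-zero MAC."""
    return offchip_bytes / nonzero_macs


def area_per_multiplier_mm2(area_mm2, num_multipliers):
    """Area efficiency: mm^2 per multiplier."""
    return area_mm2 / num_multipliers


def storage_per_multiplier_kb(on_chip_memory_kb, num_multipliers):
    """On-chip storage capacity (kB) per multiplier."""
    return on_chip_memory_kb / num_multipliers


# Hypothetical design, not taken from any table on this page.
example = {
    "pJ/non-zero MAC": energy_per_nonzero_mac_pj(100, 10, 2e8),
    "Bytes/non-zero MAC": offchip_bytes_per_nonzero_mac(4e6, 2e8),
    "mm2/multiplier": area_per_multiplier_mm2(12.0, 200),
    "kB/multiplier": storage_per_multiplier_kb(100, 200),
}
for metric, value in example.items():
    print(f"{metric}: {value}")
```

Normalizing by non-zero MACs rather than total MACs is what lets sparse and dense designs be compared on the same scale.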

Summary

Note: All energy and off-chip access values are normalized relative to the number of non-zero multiply-and-accumulates (MAC).

Processor Specifications

                                       
| Name [Publication] | Process Technology | Power Supply Voltage (V) | Clock Frequency (MHz) | Number of multipliers | Peak Performance (GMACs/sec) | Total Core area / Total number of multipliers (mm²) | Total On-Chip memory / Total number of multipliers (kB) | Measured or Simulated |
|---|---|---|---|---|---|---|---|---|
| Eyeriss [ISSCC 2016] | 65nm | 1.0 | 200 | 168 (16-bit) | 33.6 | 0.073 | 1.14 | Measured |
| KU Leuven [VLSI 2016] | 40nm | 0.85 - 0.9 | 204 | 256 (16-bit) | 52.2 | 0.0094 | 0.58 | Measured |
| Envision [ISSCC 2017] | 28nm | 0.65 - 1.0 | 200 | 256* (16-bit) | 52.2 | 0.0074 | 0.58 | Measured |
| EIE [ISCA 2016] | 45nm | 1.0 | 800 | 64 (16-bit) | 51.2 | 0.638 | 162 | Simulated (PnR) |

* Changes with bitwidth.
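As a rough sanity check on the specifications above, the listed peak performance for Eyeriss, KU Leuven, and EIE appears consistent with the number of multipliers times the clock frequency. The sketch below assumes one MAC per multiplier per cycle, which is an assumption for illustration and not a statement about how each design reports its peak. The function name is hypothetical. Envision is left out because its multiplier count changes with bitwidth.

```python
def peak_gmacs_per_sec(num_multipliers, clock_mhz):
    """Peak throughput in GMACs/sec.

    Assumption: every multiplier completes one MAC per clock cycle.
    multipliers x MHz = MMACs/sec, so divide by 1000 for GMACs/sec.
    """
    return num_multipliers * clock_mhz / 1000.0


# (number of multipliers, clock in MHz, listed peak in GMACs/sec)
specs = {
    "Eyeriss": (168, 200, 33.6),
    "KU Leuven": (256, 204, 52.2),
    "EIE": (64, 800, 51.2),
}

for name, (multipliers, clock_mhz, listed) in specs.items():
    computed = peak_gmacs_per_sec(multipliers, clock_mhz)
    print(f"{name}: computed {computed:.1f}, listed {listed}")
```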

AlexNet

                                            
| Name [Publication] | Dense/Sparse | Supported Layers | Batch Size | Bits per Weight | Bits per Input Activation | Chip Power (mW) | Chip Energy per non-zero MAC (pJ) | Run Time (ms) | Multiplier Utilization vs Peak (%) | Off-chip accesses per non-zero MAC (Bytes) |
|---|---|---|---|---|---|---|---|---|---|---|
| Eyeriss [ISSCC 2016] | Dense | CONV [all] | 4 | 16 | 16 | 278 | 21.7 | 115.3 | 41 | 0.010 |
| KU Leuven [VLSI 2016] | Dense [WACV 2016] | CONV [all] | 1 | 7,7,8,9,9 | 4,7,9,8,8 | 78 | 10.7 | 21 | 14 | 0.066 |
| Envision [ISSCC 2017] | Dense [WACV 2016] | CONV [all] | 1 | 7,7,8,9,9 | 4,7,9,8,8 | 44 | 6.0 | 21 | 14 | 0.055 |
| EIE [ISCA 2016] | Sparse [ICLR 2016] | FC [all] | 1 | 16 | 16 | 579 | 14.5 | 0.05 | 76 | 0.009 |
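The columns in the AlexNet table can be related to one another with simple arithmetic. The sketch below uses the Eyeriss row (278 mW, 115.3 ms, batch size 4, 21.7 pJ per non-zero MAC) to estimate the energy per batch, the energy per image, and the non-zero MAC count that those figures would imply. It assumes the chip power is an average over the run and that the run time covers the whole batch. These are assumptions for illustration, and the derived values are not reported results. The function names are hypothetical.

```python
def batch_energy_mj(chip_power_mw, run_time_ms):
    """Energy for one batch in mJ (mW x ms = microjoules; /1000 gives mJ)."""
    return chip_power_mw * run_time_ms / 1000.0


def implied_nonzero_macs(chip_power_mw, run_time_ms, energy_per_mac_pj):
    """Non-zero MAC count implied by power, run time, and pJ/non-zero MAC.

    Assumption: chip energy = average chip power x run time.
    """
    energy_pj = chip_power_mw * run_time_ms * 1e6  # microjoules to pJ
    return energy_pj / energy_per_mac_pj


# Eyeriss row of the AlexNet table
power_mw, run_time_ms, batch_size, pj_per_mac = 278, 115.3, 4, 21.7

energy = batch_energy_mj(power_mw, run_time_ms)
print(f"Energy per batch: {energy:.2f} mJ")
print(f"Energy per image: {energy / batch_size:.2f} mJ")
print(f"Implied non-zero MACs per batch: "
      f"{implied_nonzero_macs(power_mw, run_time_ms, pj_per_mac):.3e}")
```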

VGG-16

                                              
| Name [Publication] | Dense/Sparse | Supported Layers | Batch Size | Bits per Weight | Bits per Input Activation | Chip Power (mW) | Chip Energy per non-zero MAC (pJ) | Run Time (ms) | Multiplier Utilization vs Peak (%) | Off-chip accesses per non-zero MAC (Bytes) |
|---|---|---|---|---|---|---|---|---|---|---|
| Eyeriss [ISSCC 2016] | Dense | CONV [all] | 3 | 16 | 16 | 236 | 52.0 | 4309.4 | 13 | 0.016 |
| Envision [ISSCC 2017] | Dense [WACV 2016] | CONV [all] | 1 | 5 | 4 (first), 6 (other layers) | 26 | 4.4 | 596.5 | 12 | 0.028 |
| EIE [ISCA 2016] | Sparse [ICLR 2016] | FC [all] | 1 | 16 | 16 | 610 | 22.6 | 0.05 | 49 | 0.036 |


Detailed summary of results here.


Feedback and questions are welcome at eyeriss at mit dot edu