Dynamic Quantization for Energy Efficient Deep Learning

Randy Ardywibowo, Venkata Ravi Kiran Dayana, Hau Hwang

March 2022

Abstract

A method performed by a deep neural network (DNN) includes receiving, at a layer of the DNN during an inference stage, a layer input comprising content associated with a DNN input received at the DNN. The method also includes quantizing one or more parameters of a plurality of parameters associated with the layer based on the content of the layer input. The method further includes performing a task corresponding to the DNN input, the task being performed with the one or more quantized parameters.
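The abstract does not say how the content of the layer input determines the quantization, so the following is only a minimal sketch under stated assumptions, not the patented method. It assumes a single fully connected layer whose parameters are its weights, a hypothetical rule (`choose_bit_width`) that picks 4 or 8 bits from the mean absolute value of the layer input, and symmetric uniform quantization. All function names, the threshold, and the bit widths are illustrative.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def choose_bit_width(layer_input: List[float], threshold: float = 0.5) -> int:
    """Hypothetical content-based rule (an assumption, not from the source):
    inputs with a small mean magnitude get fewer bits."""
    mean_abs = sum(abs(x) for x in layer_input) / len(layer_input)
    return 8 if mean_abs >= threshold else 4


def quantize(weights: Matrix, bits: int) -> Matrix:
    """Symmetric uniform quantization to signed `bits`-bit levels.
    Values are returned dequantized (as floats) for readability."""
    max_abs = max(abs(w) for row in weights for w in row)
    if max_abs == 0.0:
        return [row[:] for row in weights]
    levels = 2 ** (bits - 1) - 1      # e.g. 7 for 4 bits, 127 for 8 bits
    scale = max_abs / levels          # step size between adjacent levels
    return [[round(w / scale) * scale for w in row] for row in weights]


def dynamic_quantized_layer(weights: Matrix,
                            layer_input: List[float]) -> Tuple[List[float], int]:
    """At inference time: inspect the layer input, quantize the layer's
    weights accordingly, then compute the layer output with them."""
    bits = choose_bit_width(layer_input)
    q_weights = quantize(weights, bits)
    output = [sum(w * x for w, x in zip(row, layer_input)) for row in q_weights]
    return output, bits


weights = [[0.5, -1.0], [0.25, 0.75]]
_, bits_small = dynamic_quantized_layer(weights, [0.1, 0.2])  # low-magnitude input
_, bits_large = dynamic_quantized_layer(weights, [1.0, 2.0])  # high-magnitude input
```

In this sketch the same layer runs at a lower precision for the low-magnitude input and at a higher precision for the high-magnitude one, which is the kind of per-input trade-off between accuracy and energy that the title refers to. A real system would likely use a learned selection rule and integer arithmetic rather than dequantized floats.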

Type: Patent

Publication: U.S. Patent App.

Randy Ardywibowo

Ph.D.