Neural net compression using the "learning-compression" (LC) algorithm

This page collects papers describing the "learning-compression" (LC) algorithm, a nonconvex optimisation algorithm for finding an optimally compressed deep neural net, that is, a net that minimises a loss (such as classification error) while being compressed as much as possible. The LC algorithm alternates two steps until convergence: a learning (L) step, which trains the neural net, and a compression (C) step, which compresses its parameters. It can handle many types of compression (quantization, binarization, pruning, low-rank decomposition, etc.) by simply calling the relevant compression routine in the C step.
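The alternation of L and C steps can be illustrated with a minimal, self-contained sketch. It is not the code from the software package listed on this page, and it makes several assumptions that are not stated here: the two steps are coupled through a quadratic penalty mu/2 * ||w - compressed||^2 whose weight mu grows over the iterations, the "net" is a toy separable quadratic loss, and the compression is pruning to at most k nonzero weights. All function names (`loss_grad`, `l_step`, `c_step_prune`, `lc_prune`) are hypothetical.

```python
def loss_grad(w, targets, scales):
    """Gradient of the toy loss L(w) = 0.5 * sum_i scales[i] * (w[i] - targets[i])**2."""
    return [s * (wi - t) for wi, t, s in zip(w, targets, scales)]


def c_step_prune(w, k):
    """C step (assumed form): keep the k largest-magnitude weights, zero the rest."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:k])
    return [wi if i in keep else 0.0 for i, wi in enumerate(w)]


def l_step(w, compressed, mu, targets, scales, iters=100):
    """L step (assumed form): gradient descent on L(w) + mu/2 * ||w - compressed||^2."""
    w = list(w)
    step = 1.0 / (max(scales) + mu)  # safe step size for this quadratic objective
    for _ in range(iters):
        grad = loss_grad(w, targets, scales)
        w = [wi - step * (gi + mu * (wi - ci))
             for wi, gi, ci in zip(w, grad, compressed)]
    return w


def lc_prune(targets, scales, k, mu0=0.1, growth=1.5, outer_iters=20):
    """Alternate L and C steps while increasing the penalty weight mu."""
    w = list(targets)                # start from the uncompressed minimiser of the toy loss
    compressed = c_step_prune(w, k)  # initial compression of the starting weights
    mu = mu0
    for _ in range(outer_iters):
        w = l_step(w, compressed, mu, targets, scales)  # learn
        compressed = c_step_prune(w, k)                 # compress
        mu *= growth                                    # tighten the coupling
    return compressed


# Toy usage: keep only 2 of 5 weights.
targets = [3.0, -0.1, 0.5, -2.0, 0.05]
scales = [1.0] * 5
pruned = lc_prune(targets, scales, k=2)
print(pruned)
```

Swapping `c_step_prune` for another routine (for example, one that rounds weights to a small codebook) would change the type of compression without touching the L step, which is the modularity described above.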

This work has been done in collaboration with my students Magzhan Gabidolla, Kuat Gazizov, Yerlan Idelbayev and Arman Zharmagambetov.

 

It has been funded in part by:

  • NSF award IIS #1423515 (2014-2017): "Algorithms for accelerating optimization in deep learning".
    Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
  • GPU donations by the NVIDIA Corporation.

Selected presentations

Software

Available under the BSD 3-clause license at https://github.com/UCMerced-ML/LC-model-compression. The code implements nearly all the algorithms described in the papers below.

References


Miguel A. Carreira-Perpinan
Last modified: Sun Dec 18 14:40:59 PST 2022
