This page collects papers describing the "learning-compression" (LC) algorithm, a nonconvex optimisation algorithm for finding an optimally compressed deep neural net, i.e., one which minimises a loss (such as classification error) while being compressed as much as possible. The LC algorithm alternates two steps till convergence: a learning (L) step, which learns the neural net, and a compression (C) step, which compresses its parameters. It can handle many types of compression (quantization, binarization, pruning, low-rank decomposition, etc.) by simply calling the relevant compression routine in the C step.
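As a minimal illustrative sketch (not the authors' implementation, and using a toy least-squares "model" rather than a neural net), the L/C alternation for scalar quantization might look as follows. A quadratic penalty with increasing weight mu couples the model weights w to their quantized counterpart theta; all names (`c_step`, `l_step`, the data) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "model": linear least squares, loss L(w) = ||Xw - y||^2.
X = rng.normal(size=(100, 20))
y = X @ rng.normal(size=20)

def c_step(w, K=4, iters=20):
    # C step: quantize w onto a K-entry codebook (1D k-means / Lloyd's algorithm).
    codebook = np.quantile(w, np.linspace(0.0, 1.0, K))
    for _ in range(iters):
        assign = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        for k in range(K):
            if np.any(assign == k):
                codebook[k] = w[assign == k].mean()
    return codebook[assign]

def l_step(w, theta, mu, lr=1e-3, steps=200):
    # L step: minimize L(w) + (mu/2)||w - theta||^2 by gradient descent.
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) + mu * (w - theta)
        w = w - lr * grad
    return w

# LC loop: alternate L and C steps while driving the penalty weight mu
# upward, pulling w toward the feasible (quantized) set.
w = np.linalg.lstsq(X, y, rcond=None)[0]   # reference (uncompressed) model
theta = c_step(w)
for j in range(25):
    mu = 1e-2 * 1.5**j
    w = l_step(w, theta, mu)
    theta = c_step(w)
```

At convergence, `theta` is a compressed model whose 20 weights take at most 4 distinct values; swapping the quantization routine in `c_step` for pruning or a low-rank projection changes the compression type without touching the L step.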
This work has been done in collaboration with my students Magzhan Gabidolla, Kuat Gazizov, Yerlan Idelbayev and Arman Zharmagambetov.
It has been funded in part by:

Second Summer School on Optimization, Big Data and Applications (OBA) (Jun. 30 – Jul. 6, 2019): [PDF]
Amazon (Jun. 19, 2018): [PDF]
Allen Institute for Artificial Intelligence (AI2) (Jun. 12, 2018): [PDF]
Dept. of Statistics, University of Washington (May 11, 2018): [PDF]
Microsoft Research, Redmond (Apr. 5, 2018): [PDF]
Available under the BSD 3-clause license at https://github.com/UCMerced-ML/LC-model-compression. This implements nearly all the algorithms described in the papers below.
Carreira-Perpiñán, M. Á. (2017): "Model compression as constrained optimization, with application to neural nets. Part I: general framework". Unpublished manuscript, Jul. 5, 2017, arXiv:1707.01209.
[external link] [paper preprint]
Carreira-Perpiñán, M. Á. and Idelbayev, Y. (2017): "Model compression as constrained optimization, with application to neural nets. Part II: quantization". Unpublished manuscript, Jul. 13, 2017, arXiv:1707.04319.
[external link] [paper preprint]
Carreira-Perpiñán, M. Á. and Zharmagambetov, A. (2018): "Fast model compression". Bay Area Machine Learning Symposium (BayLearn 2018).
[external link] [paper preprint] [poster]
Carreira-Perpiñán, M. Á. and Idelbayev, Y. (2018): ""Learning-Compression" algorithms for neural net pruning". IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2018), pp. 8532–8541.
[external link] [paper preprint] [poster] [supplementary material] [Python implementation (old version)] [© IEEE]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2020): "Low-rank compression of neural nets: learning the rank of each layer". IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2020), pp. 8046–8056.
[external link] [paper preprint] [poster] [supplementary material] [© IEEE]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2020): "A flexible, extensible software framework for model compression based on the LC algorithm". Unpublished manuscript, May 15, 2020, arXiv:2005.07786.
[external link] [paper preprint] [Python implementation]
Extended abstract at the Bay Area Machine Learning Symposium (BayLearn 2020): [paper preprint] [video]
Short version at the 2nd On-Device Intelligence Workshop (MLSys 2021): [external link] [paper preprint] [slides] [video]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2021): "Neural network compression via additive combination of reshaped, low-rank matrices". Data Compression Conference (DCC 2021), pp. 243–252.
[external link] [paper preprint] [slides] [Python implementation] [© IEEE]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2021): "Optimal selection of matrix shape and decomposition scheme for neural network compression". IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2021), pp. 3250–3254.
[external link] [paper preprint] [slides] [Python implementation] [© IEEE]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2021): "An empirical comparison of quantization, pruning and low-rank neural network compression using the LC toolkit". Int. Joint Conf. on Neural Networks (IJCNN 2021).
[external link] [paper preprint] [slides] [Python implementation] [© IEEE]
Idelbayev, Y., Molchanov, P., Shen, M., Yin, H., Carreira-Perpiñán, M. Á. and Alvarez, J. M. (2021): "Optimal quantization using scaled codebook". IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2021), pp. 12090–12099.
[external link] [paper preprint] [slides] [poster] [© IEEE]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2021): "More general and effective model compression via an additive combination of compressions". 32nd European Conf. Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2021), pp. 233–248.
[external link] [paper preprint] [slides] [Python implementation]
Longer version: Carreira-Perpiñán, M. Á. and Idelbayev, Y. (2021): "Model compression as constrained optimization, with application to neural nets. Part V: combining compressions". Jul. 9, 2021, arXiv:2107.04380: [external link] [paper preprint]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2021): "Beyond FLOPs in low-rank compression of neural networks: optimizing device-specific inference runtime". IEEE Int. Conf. Image Processing (ICIP 2021), pp. 2843–2847.
[external link] [paper preprint] [slides] [poster] [Python implementation] [© IEEE]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2021): "LC: A flexible, extensible open-source toolkit for model compression". Conference on Information and Knowledge Management (CIKM 2021), resource paper, pp. 4504–4514.
[external link] [paper preprint] [slides] [Python implementation]
Idelbayev, Y. and Carreira-Perpiñán, M. Á. (2022): "Exploring the effect of ℓ0/ℓ2 regularization in neural network pruning using the LC toolkit". IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2022), pp. 3373–3377.
[external link] [paper preprint] [slides] [poster] [Python implementation] [© IEEE]