Neural Network Compression

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, such as visual and acoustic classification, extraction of multimedia descriptors, or image and video coding. Trained neural networks for these applications contain a large number of parameters (i.e., weights), resulting in a considerable size. Distributing them to the many clients that use them in applications (e.g., mobile phones, smart cameras) therefore requires a compressed representation of the neural networks.

In April 2021, MPEG completed the first international standard on Neural Network Compression for Multimedia Applications (ISO/IEC 15938-17), designed as a toolbox of compression technologies. The specification contains (i) parameter reduction methods (e.g., pruning, sparsification, matrix decomposition), (ii) parameter transformation methods (e.g., quantization), and (iii) entropy coding methods, which can be assembled into encoding pipelines combining one method from each group (or, in the case of reduction, more than one). The results show that trained neural networks for many multimedia problems, such as image or audio classification or image compression, can be compressed by a factor of 10-20 with no performance loss, and even by more than a factor of 30 with some performance trade-off. The new standard is not limited to a particular neural network architecture and is independent of the choice of neural network exchange format. Interoperability with common neural network exchange formats is described in the annexes of the standard.
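To illustrate the kind of pipeline the toolbox enables, the following minimal sketch chains the three stages on a single weight tensor. It is not an implementation of the standard: magnitude pruning and uniform scalar quantization are generic stand-ins for the reduction and transformation tools, and zlib stands in for the standard's context-adaptive arithmetic coder; the function name and parameters are illustrative only.

```python
# Illustrative sketch, not the ISO/IEC 15938-17 codec: generic pruning,
# quantization, and entropy coding stand in for the standard's tools.
import zlib
import numpy as np

def compress_weights(weights: np.ndarray, prune_ratio: float = 0.5,
                     num_bits: int = 8) -> bytes:
    """Prune, quantize, and entropy-code one weight tensor."""
    # 1. Parameter reduction: zero out the smallest-magnitude weights.
    threshold = np.quantile(np.abs(weights), prune_ratio)
    pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

    # 2. Parameter transformation: uniform scalar quantization to num_bits.
    max_abs = float(np.abs(pruned).max()) or 1.0
    step = max_abs / (2 ** (num_bits - 1) - 1)
    quantized = np.round(pruned / step).astype(np.int8)

    # 3. Entropy coding: zlib used here as a generic entropy coder.
    return zlib.compress(quantized.tobytes())

weights = np.random.randn(256, 256).astype(np.float32)
bitstream = compress_weights(weights)
print(f"raw: {weights.nbytes} bytes, compressed: {len(bitstream)} bytes")
```

A real encoder conforming to the standard would additionally signal the chosen tools and quantization parameters in the bitstream so that a decoder can reconstruct the weights.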
