AI-Based Image Compression: The State of the Art


An overview of some of the leading libraries and frameworks out there

Image Source: Pixabay

What Is Image Compression?

Image compression reduces the pixels, dimensions, or color components of an image in order to shrink its file size, lowering the cost of storing and processing it (for example, to improve web performance). Advanced image optimization techniques identify the more important components of an image and discard the less crucial ones.

Image compression is a form of data compression: it reduces the number of bits required to encode an image while preserving the image's details.

Applications of image compression include:

  • Storage — the compressed data takes up less disk space, which is particularly useful for archiving detailed images, such as medical images.
  • Principal component analysis (PCA) — image compression methods that extract the most significant components of an image are used for extracting or summarizing features and analyzing data.
  • Standardization — in some cases, sets of images must conform to a standard size and format, requiring compression of all images into the same size, shape and resolution. For example, records maintained by security and government agencies require standardized images.

Image Compression with Deep Learning

Deep learning (DL) approaches to image compression date back to neural-network methods of the 1980s, and have evolved to include techniques such as multi-layer perceptrons, random neural networks, convolutional neural networks and generative adversarial networks.

Multi-Layer Perceptrons

Multi-layer perceptrons (MLPs) have one or more hidden layers of neurons sandwiched between a layer of input neurons and a layer of output neurons. Theoretically, MLPs with multiple hidden layers are useful for dimensionality reduction and data compression. Image compression with MLPs involves a unitary transformation of the entire spatial data.

The first MLP algorithm for image compression was published in 1988. It folded conventional image compression mechanisms such as spatial-domain transformation, binary coding and quantization into an integrated optimization task, and relied on a decomposition neural network to identify the optimal binary code combination in the compressed bitstream output. However, the network's parameters could not be fixed for a variable compression ratio.

The algorithm was later extended with predictive techniques that estimate each pixel's value from the surrounding pixels. The MLP then uses backpropagation to minimize the mean squared error between the predicted and original pixels.
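As a toy illustration of this predictive setup (a minimal sketch, not the 1988 algorithm), the following NumPy code trains a small MLP to predict each pixel from four causal neighbors by backpropagating the mean squared error; in a real codec, the residuals between predictions and actual pixels are what get encoded. All array sizes and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64)).astype(np.float32)  # stand-in for a grayscale image

# Build (neighbors -> pixel) training pairs: left, up, up-left, up-right.
X, y = [], []
for i in range(1, 63):
    for j in range(1, 63):
        X.append([img[i, j - 1], img[i - 1, j], img[i - 1, j - 1], img[i - 1, j + 1]])
        y.append(img[i, j])
X, y = np.array(X), np.array(y)[:, None]

# One hidden layer of 8 tanh units, trained by plain gradient descent on MSE.
W1 = rng.normal(0, 0.5, (4, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.1
for _ in range(200):
    h = np.tanh(X @ W1 + b1)             # forward pass
    pred = h @ W2 + b2
    err = pred - y                       # gradient of 0.5 * MSE w.r.t. pred
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)     # backpropagate through tanh
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

residual = y - (np.tanh(X @ W1 + b1) @ W2 + b2)  # what a codec would encode
print("prediction MSE:", float((residual ** 2).mean()))
```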

Convolutional Neural Networks

Convolutional neural networks (CNNs) offer better compression-artifact reduction and super-resolution performance than traditional computer vision models. Their convolution operations capture how neighboring pixels correlate, and cascaded convolutions capture structure at progressively larger scales, mirroring the hierarchical properties of complex images.

However, it is difficult to use a CNN throughout the image compression process, because training relies on gradient descent and backpropagation, which are hard to reconcile with the non-differentiable quantization step of an end-to-end compression pipeline.

CNNs were first applied to image compression in 2016, with an algorithm consisting of an analysis module and a synthesis module. The analysis module comprises stages of convolution, subsampling and divisive normalization: each stage begins with an affine convolution, produces a downsampled output, and then normalizes the downsampled signals using Generalized Divisive Normalization (GDN).
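The sketch below outlines that analysis/synthesis structure in Keras. It is a simplified, assumption-laden outline rather than the 2016 model: GDN itself ships in the tensorflow_compression package (tensorflow_compression.GDN), but ReLU is substituted here to keep the example self-contained, and all layer sizes are invented for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Analysis transform: strided (downsampling) convolutions map the image
# to a compact latent representation. The real model normalizes each
# stage with GDN rather than ReLU.
analysis = tf.keras.Sequential([
    layers.Conv2D(64, 5, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 5, strides=2, padding="same", activation="relu"),
    layers.Conv2D(32, 5, strides=2, padding="same"),  # latent channels
])

# Synthesis transform: transposed convolutions approximately invert
# the analysis transform to reconstruct the image.
synthesis = tf.keras.Sequential([
    layers.Conv2DTranspose(64, 5, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(64, 5, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(3, 5, strides=2, padding="same"),
])

x = tf.random.uniform((1, 64, 64, 3))  # stand-in for an image batch
latent = analysis(x)                   # (1, 8, 8, 32): the compact code
x_hat = synthesis(latent)              # reconstruction with x's shape
```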

CNN-based image compression surpasses JPEG2000 on metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM). The algorithm was later developed further with entropy estimation based on scale hyperpriors, bringing its compression performance close to standards such as High-Efficiency Video Coding (HEVC).

Generative Adversarial Networks

A generative adversarial network (GAN) is a deep neural network consisting of two competing networks, a generator and a discriminator. The first GAN-based image compression algorithm was made available in 2017. It produces compressed files half the size of WebP, 2.5 times smaller than JPEG or JPEG2000, and 1.7 times smaller than BPG, and it leverages parallel GPU computation to run in real time.

GAN image compression reconstructs a compressed image from a tiny feature space derived from the input image. The main advantage of GANs over CNNs for image compression is the adversarial loss, which improves the quality of the output image: the opposing networks are trained together, against each other, each sharpening the performance of the image generation model.
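The sketch below shows this loss structure, assuming a hypothetical generator that produces reconstructions x_hat and a discriminator that outputs logits; the binary cross-entropy formulation and the adv_weight value are illustrative choices, not those of the 2017 algorithm.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def generator_loss(x, x_hat, d_fake_logits, adv_weight=0.1):
    # Distortion term: how far the reconstruction is from the original.
    distortion = tf.reduce_mean(tf.square(x - x_hat))
    # Adversarial term: reward reconstructions the discriminator calls real.
    adversarial = bce(tf.ones_like(d_fake_logits), d_fake_logits)
    return distortion + adv_weight * adversarial

def discriminator_loss(d_real_logits, d_fake_logits):
    # Learn to score real images as 1 and reconstructions as 0.
    real = bce(tf.ones_like(d_real_logits), d_real_logits)
    fake = bce(tf.zeros_like(d_fake_logits), d_fake_logits)
    return real + fake
```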

Frameworks and Libraries for AI-Based Image Compression

It is theoretically possible to code an entire image processing application yourself, but it is more realistic to build on what others have developed and to adjust or extend existing software to your needs. Many frameworks and libraries provide models for image processing, often pre-trained on large datasets.

OpenCV

The Open Source Computer Vision (OpenCV) Library offers hundreds of machine learning and computer vision algorithms, with thousands of functions to support these algorithms. It is a popular choice, given its support for all the leading mobile and desktop operating systems, with Java, Python and C++ interfaces.

OpenCV contains numerous modules for image compression functions, including image processing, object detection and machine learning modules. You can use this library to obtain image data and extract, enhance and compress it.
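For instance, a few lines of OpenCV re-encode an image at a chosen quality level. This uses a classical codec rather than a learned model, but the same cv2 I/O calls are typically how images enter and leave an AI pipeline; the file names below are illustrative.

```python
import cv2

img = cv2.imread("input.png")  # assumes input.png exists in the working dir

# Re-encode as JPEG at quality 80 (0-100; lower means a smaller file).
cv2.imwrite("output.jpg", img, [cv2.IMWRITE_JPEG_QUALITY, 80])

# Or compress in memory and inspect the size of the encoded buffer.
ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, 80])
print(buf.nbytes, "bytes")
```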

TensorFlow

TensorFlow is Google’s open-source framework that supports machine learning and deep learning. TensorFlow allows you to custom-build and train deep learning models. It offers several libraries, some of which are useful for computer vision applications and image processing projects. The TensorFlow Compression (TFC) library offers data compression tools.

You can use the TFC library to create machine learning models with built-in, optimized data compression, and to find storage-efficient representations of your data (such as images and features) that affect model performance only minimally. For example, it can compress floating-point tensors into much smaller sequences of bits, as sketched below.
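Here is a minimal sketch of that round trip, assuming the tensorflow-compression package is installed (pip install tensorflow-compression). Exact class names and signatures can vary between TFC versions, so treat the calls below as an outline rather than a definitive recipe.

```python
import tensorflow as tf
import tensorflow_compression as tfc

# A learned factorized prior over 16 latent channels; NoisyDeepFactorized
# is one of the priors shipped with TFC.
prior = tfc.NoisyDeepFactorized(batch_shape=(16,))

# An entropy model that can quantize and range-code tensors assumed to
# follow this prior. coding_rank=1 codes the last tensor dimension.
entropy_model = tfc.ContinuousBatchedEntropyModel(
    prior, coding_rank=1, compression=True)

latents = tf.random.normal((2, 16))           # stand-in for learned features
compressed = entropy_model.compress(latents)  # packed bit strings, one per row
decoded = entropy_model.decompress(compressed, ())  # back to (2, 16) tensors
print(compressed.shape, decoded.shape)
```

In a full model, the latents would come from a trained analysis transform, and the prior's parameters would be learned jointly with it.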

MATLAB Image Processing Toolbox

Matrix Laboratory, or MATLAB, refers to both a programming language and a popular mathematical and scientific problem-solving platform. The platform offers an Image Processing Toolbox (IPT) containing workflow applications and algorithms for processing, analyzing and visualizing images, as well as for developing new algorithms.

MATLAB IPT enables the automation of image processing workflows, with applications ranging from noise reduction and image enhancement to image segmentation and 3D image processing. IPT functions often support generating C/C++ code and are useful for deploying an embedded vision system or for desktop prototyping.

Although MATLAB IPT is not open source, it does offer a free trial.

High-Fidelity Generative Image Compression

High-Fidelity Generative Image Compression (HiFiC) is a GitHub project that combines learned compression with GAN models to build a lossy compression system. It is attractive for coding enthusiasts, who can experiment with the HiFiC code on GitHub, and the model is highly effective at reconstructing detailed textures in compressed images.

Conclusion

In this article I discussed the state of the art in image compression algorithms based on deep learning, including Multi-Layer Perceptrons, Convolutional Neural Networks, and Generative Adversarial Networks. I also presented readily available tools you can use to build AI-based image compression applications:

  • OpenCV — includes hundreds of computer vision and machine learning algorithms, including several modules that support image compression.
  • TensorFlow — allows you to build and finely customize image processing and compression models.
  • MATLAB Image Processing Toolbox — lets you build image processing workflows and algorithms, including image segmentation and 3D image processing.
  • High-Fidelity Generative Image Compression — an open source project that uses GAN models to perform lossy compression.

I hope this will be of help as you evaluate the use of deep learning in image compression and optimization projects.

