
Machine Learning Video Compression

Artificial Intelligence (AI) and Machine Learning (ML) are the mantra of the current era of cognitive technologies. ML processes data (including video), makes predictions, and helps make decisions based on artificial neural networks (ANNs).


ML & ANN are going to face recognition, analytics in video surveillance, video compression etc. All these applications deal with processing of the images consisting of large amount of pixels. Availability of such huge data-sets/training examples allows ML-based technologies to be particularly effective and gain very good results.


Video accounts for about 75% of the data transmitted over IP networks, and that share has been steadily growing and is projected to continue to grow. The introduction of ultra-high definition (UHD), high dynamic range (HDR), wide color gamut (WCG), high frame rate (HFR), and future immersive video services has dramatically increased the challenge. The need for highly efficient video compression technologies is therefore pressing and urgent. In response, video coding performance has improved by roughly 50% every ten years, at the cost of increased computational complexity and memory.


Video coding according to ITU-T H.26x relies at its core on reducing spatial and temporal redundancy, following C. Shannon's rate-distortion theory. Despite their ubiquitous use, such traditional video coding techniques inevitably approach the fundamental limit of video compression, which places a bound on the resulting video quality.
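As a reminder of what that bound looks like, the textbook rate-distortion function for a memoryless Gaussian source with mean-squared-error distortion (a standard illustration, not a formula specific to any codec mentioned here) is

\[
R(D) =
\begin{cases}
\dfrac{1}{2}\log_2 \dfrac{\sigma^2}{D}, & 0 < D \le \sigma^2,\\[4pt]
0, & D > \sigma^2,
\end{cases}
\]

where \(\sigma^2\) is the source variance and \(D\) is the allowed distortion. No coder that treats all information in the signal as equally valuable can do better than this trade-off between rate and distortion.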


Improved video compression is important for delivering digital video streams more quickly and with higher quality, while using less bandwidth and storage. Everything from 4K movie streaming to smartphone video chat to laptop screen sharing can be enhanced by making the video stream smaller through better compression. It is possible to deliver the required video quality and fit the growing video traffic into existing channel bandwidth by shifting the video compression paradigm: introducing the concept of "application-defined redundancy based on information value" (e.g. excess scene detail, the relation between foreground and background, grayscale, color palette, etc.). The less valuable information that is discarded is compensated for by a neural network used at decoding.
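To make the general idea concrete, here is a minimal sketch in Python with PyTorch. It is only one possible interpretation of "application-defined redundancy", not SPIRIT's patented algorithm, and every name in it (encode_frame, RestoreNet, decode_frame) is hypothetical: the encoder keeps the high-value foreground at full resolution, aggressively downsamples the low-value background, and a small decoder-side network compensates for the discarded detail.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F


def encode_frame(frame: np.ndarray, fg_mask: np.ndarray, scale: int = 4):
    """Split a frame into a full-resolution foreground and a low-resolution
    background; only these two parts would be transmitted."""
    frame_t = torch.from_numpy(frame).float().permute(2, 0, 1).unsqueeze(0)  # 1xCxHxW
    mask_t = torch.from_numpy(fg_mask).float()[None, None]                   # 1x1xHxW
    foreground = frame_t * mask_t                                            # kept as-is
    background_lr = F.avg_pool2d(frame_t * (1 - mask_t), kernel_size=scale)  # detail dropped
    return foreground, background_lr, mask_t


class RestoreNet(nn.Module):
    """Toy CNN standing in for the decoder-side neural network that
    compensates for the information discarded by the encoder."""
    def __init__(self, channels: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # predict a residual correction


def decode_frame(foreground, background_lr, mask, net: RestoreNet, scale: int = 4):
    """Upsample the low-resolution background, merge it with the foreground,
    and let the (trained) network restore part of the missing detail."""
    background = F.interpolate(background_lr, scale_factor=scale,
                               mode="bilinear", align_corners=False)
    merged = foreground + background * (1 - mask)
    return net(merged)


if __name__ == "__main__":
    frame = np.random.rand(256, 256, 3).astype(np.float32)   # stand-in for a video frame
    fg_mask = np.zeros((256, 256), dtype=np.float32)
    fg_mask[64:192, 64:192] = 1.0                             # "valuable" region of the scene
    fg, bg_lr, mask = encode_frame(frame, fg_mask)
    restored = decode_frame(fg, bg_lr, mask, RestoreNet())
    print(restored.shape)  # torch.Size([1, 3, 256, 256])

In this sketch only the foreground pixels and a background thumbnail need to be coded and sent, so the bit budget tracks the information value of the scene rather than its raw pixel count; in a real system the restoration network would of course be trained on representative video content.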


Video classification based on information value allows a decision maker to extract the required data and exclude informational noise. SPIRIT uses this new paradigm and its proprietary, patented ML-based technology to achieve 500-fold video compression, unattainable with traditional coding.


SPIRIT ML-based video compression lets viewers enjoy better quality at a low bit rate, or experience 30% to 50% less buffering at the same quality, compared with VP9 or H.265 content. Where a 100 kbps video stream would otherwise appear patchy and distorted, with SPIRIT ML-based video compression its quality improves dramatically.


SPIRIT's ML-based video compression algorithm can save storage and free up capacity on carriers' networks.