Serious video compression is a mix of math, science, and black arts. Discussion of the many esoteric features of modern compression technology is well beyond the scope of this blog, and is better dealt with by folks like Jason Garrett-Glaser. However, there are a couple of bits of terminology we want to review so that we can better discuss workflow and post production technologies. In this post, we’re going to discuss intraframe compression. In the next post, we’ll cover its controversial brother, interframe compression.
Imagine the simplest video format possible. Take a series of still images, and play them one after another, like a filmstrip. Now, make those still images JPEGs. Congratulations, you’ve just invented Motion JPEG, a close cousin of DV compression, one of the most widely known intraframe formats. Also, you’ve violated a variety of patents, so pay up.
Intraframe compression simply means that each and every frame exists as a discrete item. This means that decoding and playing a file is simple – just load each image, display it on the screen for a set amount of time, then display the next image. You can easily seek to any image within the stream, or split the stream wherever you want. Like a metaphorical worm (though not, apparently, an actual worm) you’ll end up with two playable streams.
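To make the seek-and-split property concrete, here is a minimal sketch. It assumes a toy model in which a stream is just a Python list of independently decodable frames; the `seek` and `split` helpers and the frame contents are hypothetical, not part of any real container format.

```python
# Toy model (an assumption for illustration): every frame is a standalone blob.
frames = [f"frame-{i}".encode() for i in range(10)]

def seek(stream, index):
    """Fetch any frame directly; no earlier frame is needed to decode it."""
    return stream[index]

def split(stream, index):
    """Cut between two frames; both halves remain complete, playable streams."""
    return stream[:index], stream[index:]

first, second = split(frames, 4)
```

Nothing in either half refers back to the other, which is exactly why the cut is safe.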
While DV is essentially just a series of JPEG-style still images with sound, timecode, and a few other bits stuck on, modern intraframe formats use increasingly sophisticated forms of per-frame compression.
Let’s take a brief detour to discuss JPEG compression. There are two main parts of JPEG encoding that we care about at the moment. The first is the discrete cosine transform, or DCT. This is a mathematical method for mapping data from the spatial domain (pixels in a grid) to the frequency domain – decomposing small squares of the image into a set of overlapping patterns. This is a simple reorganization of the data. Contrary to popular belief, no compression takes place during the DCT process; it merely prepares the data to be compressed.
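The claim that the DCT only reorganizes data can be checked directly. The sketch below is a plain-Python orthonormal DCT-II on one 8x8 block, written for clarity rather than speed; the function names and the sample gradient block are assumptions for illustration, and real encoders use fast integer approximations instead.

```python
import math

def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        out.append(scale * total)
    return out

def idct_1d(coeffs):
    """Inverse of dct_1d (the orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out

def transpose(block):
    return [list(row) for row in zip(*block)]

def apply_2d(transform, block):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [transform(row) for row in block]
    cols = [transform(col) for col in transpose(rows)]
    return transpose(cols)

# Hypothetical sample data: a smooth 8x8 gradient.
block = [[16 * x + 2 * y for x in range(8)] for y in range(8)]

coeffs = apply_2d(dct_1d, block)     # spatial domain -> frequency domain
restored = apply_2d(idct_1d, coeffs) # and back again

# Largest round-trip error; only floating-point noise remains.
max_error = max(abs(a - b)
                for row_a, row_b in zip(block, restored)
                for a, b in zip(row_a, row_b))
```

The block comes back intact apart from floating-point noise. The same 64 numbers go in and 64 numbers come out, so nothing has been saved yet.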
The next step in JPEG compression is quantization. This is where quality loss occurs – high frequency data (think small details) is discarded. Quantization, at its most basic, involves dividing all the values in your DCT-transformed block by a fixed value and rounding the result, so that small values collapse to zero. Modern quantization is substantially more complicated, and far beyond the scope of this article.
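The most basic form of quantization described above fits in a few lines. This is a sketch under stated assumptions: the 4x4 coefficient block and the step size of 16 are made-up values chosen for readability, and real JPEG uses an 8x8 table with a different divisor per frequency.

```python
def quantize(coeffs, step):
    """Divide every coefficient by a fixed step and round to an integer."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    """Multiply back; whatever the rounding threw away is gone for good."""
    return [[level * step for level in row] for row in levels]

# Hypothetical DCT output: large low-frequency values in the top-left corner,
# small high-frequency values everywhere else.
coeffs = [
    [500.0, -145.8, 0.0, -15.2],
    [-18.2, 6.1, -2.3, 0.4],
    [0.0, -1.9, 0.7, 0.0],
    [-1.9, 0.3, 0.0, -0.1],
]
STEP = 16

levels = quantize(coeffs, STEP)
restored = dequantize(levels, STEP)
zero_count = sum(level == 0 for row in levels for level in row)
```

Most of the small high-frequency values become zero, and `restored` no longer matches `coeffs`. That mismatch is the quality loss.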
Finally, the quantized data is entropy coded. Quantization leaves long runs of zeros, so JPEG first run-length encodes them, storing “eleven zeros” as a count rather than as eleven separate values. The result is then Huffman coded. Huffman coding is lossless, and is used across the computing world. Zip’s Deflate compression, for example, pairs Huffman coding with LZ77 pattern matching. Huffman coding looks for commonly occurring values, and replaces them with shorter codes. For example, if your video stream contains big chunks of “0b01010101” it might make more sense to say “Ok, I’m going to just store 0b01 and that means 0b01010101.” By doing that, you gain substantial savings each time that pattern is repeated.
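The two ideas at work in this step, collapsing repeated values into counts and giving frequent values shorter codes, can be sketched as follows. This is a simplification under stated assumptions: the input list is hypothetical, and real JPEG applies Huffman coding to combined run/size symbols with standardized tables rather than to raw values as done here.

```python
import heapq
from collections import Counter
from itertools import groupby

def run_length_encode(values):
    """Collapse each run of identical values into a (value, count) pair."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tiebreaker, {symbol: code-so-far}).
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Hypothetical quantized coefficients: a few values, then a long run of zeros.
quantized = [31, -9, 0, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

runs = run_length_encode(quantized)
code = huffman_code(quantized)
bits = "".join(code[value] for value in quantized)
```

The zero, by far the most common value, ends up with a one-bit code, so the sixteen values fit in far fewer bits than sixteen fixed-width integers would need. Both steps are fully reversible.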
Many formats go beyond the simple discrete cosine transform plus Huffman encoding of DV. The last few years have seen a variety of new intraframe formats hit the market, both for acquisition and post production.
In terms of acquisition, Panasonic’s AVC-Intra and Apple’s iFrame standards both use the popular H.264 format in an intraframe-only setup. H.264 draws on a variety of technologies to improve on the JPEG-style compression of DV. For example, it uses in-frame prediction to estimate each block from neighboring pixels that have already been decoded, storing only the differences (deltas) rather than full pixels or blocks. A smarter quantization process adapts to the content in a far more flexible manner than older formats. It also offers a much more efficient entropy coder called CABAC – context-adaptive binary arithmetic coding.
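The idea of storing deltas against a prediction can be shown with a deliberately tiny sketch. It assumes the crudest possible predictor, where each pixel is guessed to equal its left neighbor. H.264 itself predicts whole blocks from the edges of neighboring blocks using several directional modes, so treat the function names, the starting guess of 128, and the sample row as illustrative assumptions only.

```python
def predict_from_left(row, first_guess=128):
    """Store each pixel as the difference from the pixel to its left."""
    residuals = []
    previous = first_guess
    for pixel in row:
        residuals.append(pixel - previous)
        previous = pixel
    return residuals

def reconstruct(residuals, first_guess=128):
    """Rebuild the row by adding each delta to the running prediction."""
    row = []
    previous = first_guess
    for delta in residuals:
        previous += delta
        row.append(previous)
    return row

# Hypothetical row of luma samples from a smooth part of an image.
row = [120, 121, 121, 123, 124, 124, 125, 127]

residuals = predict_from_left(row)
rebuilt = reconstruct(residuals)
```

The residuals cluster near zero, which is what the quantizer and entropy coder handle best. Prediction on its own loses nothing; the original row comes back exactly.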
On the post production side, common intraframe formats include DVCPRO HD, ProRes, Apple Intermediate Codec, and Avid DNxHD. Uncompressed formats generally get called “intraframe” as well, though of course they don’t use any compression at all.
So, if intraframe is simple to implement and easy to edit, why do we need anything else? In a word, efficiency. For that, we need interframe compression, and for that, you’ll have to wait for the next post.