
December 7, 2006

Apple Patent for Dynamic MPEG Encoding

Please note, this blog has been archived and now lives at www.discretecosine.com

Macnn has a post about a recent patent granted to Apple for dynamic field/frame encoding of MPEG. It's not terribly exciting, but I figured I'd comment since I've seen some other sites mischaracterizing it.

Basically, when you have interlaced video, you get better codec efficiency if you compress each field separately, instead of compressing a frame made up of two fields combined. The reason is pretty straightforward. A field is more or less a picture of your scene, taken once every 60th of a second (for NTSC). So, two sequential fields can be very different if there's fast motion or the camera is moving. MPEG encoding works in terms of macroblocks - 16x16 blocks of pixels. If there's lots of continuity among the pixels in that block, you'll get good compression. However, if every other line in that block is totally different, you'll either get terrible compression or terrible quality. So, with interlaced video, field-based compression is great. With progressive video, or animation, or other non-interlaced video, you'll get better efficiency with frame-based encoding.
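To make that concrete, here's a toy sketch (mine, not from the patent) of what "every other line is totally different" looks like. A frame macroblock interleaves the two fields line by line; splitting it back into fields recovers two blocks that are each internally smooth:

```python
# Toy sketch: split a 16x16 interlaced frame macroblock into its two
# 16x8 field blocks. Even lines belong to one field, odd lines to the
# other; they were captured 1/60 s apart.

def split_fields(frame_mb):
    """frame_mb: 16 rows of 16 pixel values from an interlaced frame."""
    top_field = frame_mb[0::2]     # even lines
    bottom_field = frame_mb[1::2]  # odd lines
    return top_field, bottom_field

# Fake a moving vertical edge: it sits at x=8 in one field and has
# moved to x=12 by the time the other field is captured.
frame_mb = []
for i in range(16):
    edge = 8 if i % 2 == 0 else 12
    frame_mb.append([255 if x < edge else 0 for x in range(16)])

top, bottom = split_fields(frame_mb)
# Each field is perfectly self-similar (all lines identical), while the
# combined frame block alternates between two different lines.
print(all(row == top[0] for row in top))        # True
print(all(row == bottom[0] for row in bottom))  # True
```

Each field compresses nicely on its own; the interleaved frame block is full of artificial high-frequency vertical detail that the DCT has to spend coefficients on.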

That's great if you're able to tell the compressor in advance about the source video. However, with many video sources, it's not easy to programmatically determine whether the video is interlaced or progressive. It gets even worse when both formats are mixed within a single video!
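One common heuristic (my own illustration here, nothing to do with Apple's approach) is to look for "combing": in interlaced content with motion, adjacent lines differ much more than lines two apart, since lines two apart come from the same field:

```python
# Heuristic sketch (assumed, not from the patent): detect combing by
# comparing adjacent-line differences against same-field (two-apart)
# line differences.

def looks_interlaced(block, ratio=2.0):
    """block: rows of pixel values. True if combing dominates."""
    adjacent = sum(abs(a - b)
                   for r1, r2 in zip(block, block[1:])
                   for a, b in zip(r1, r2))
    same_field = sum(abs(a - b)
                     for r1, r2 in zip(block, block[2:])
                     for a, b in zip(r1, r2))
    return adjacent > ratio * max(same_field, 1)

# A combed block (edge alternating between x=8 and x=12 per line) and a
# progressive block (same edge on every line):
combed = [[255 if x < (8 if i % 2 == 0 else 12) else 0
           for x in range(16)] for i in range(16)]
progressive = [[255 if x < 8 else 0 for x in range(16)]
               for _ in range(16)]
print(looks_interlaced(combed))       # True
print(looks_interlaced(progressive))  # False
```

The trouble, as the paragraph says, is that thresholds like `ratio` are fragile: static interlaced scenes show no combing at all, and sharp progressive detail can trip the detector, which is why a per-macroblock decision inside the encoder is attractive.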

What Apple's patent proposes is a compression process that dynamically selects between field-based and frame-based encoding for each macroblock. The only bit that really matters is the discrete cosine transform on the luminance macroblock. So, Apple does two DCTs on the macroblock (actually double that, but then it gets confusing), one treating the block as if it were field-based, and one treating it as if it were frame-based. Because a DCT can easily be vectorized, you can do multiple DCTs in parallel. Then you just check which result has the most zero coefficients and use it.
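Here's a rough sketch of that decision rule as I read it (a simplified illustration, not Apple's actual implementation): transform the block both ways, quantize, and keep whichever arrangement produces more zero coefficients, since runs of zeros are what the entropy coder compresses best.

```python
import math

# Sketch of the per-macroblock field/frame decision: DCT both
# arrangements, quantize, pick the one with more zeros.

def dct_1d(v):
    """Orthonormal 1-D DCT-II."""
    n = len(v)
    return [(math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)) *
            sum(v[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                for x in range(n))
            for u in range(n)]

def dct_2d(block):
    """Separable 2-D DCT: rows, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def zeros_after_quant(block, q=16):
    """Count coefficients that quantize to zero."""
    return sum(1 for row in dct_2d(block) for c in row
               if round(c / q) == 0)

def choose_mode(frame_mb):
    """Return 'field' or 'frame' for a 16x16 luminance macroblock."""
    top, bottom = frame_mb[0::2], frame_mb[1::2]  # de-interleave fields
    frame_zeros = zeros_after_quant(frame_mb)
    field_zeros = zeros_after_quant(top) + zeros_after_quant(bottom)
    return "field" if field_zeros > frame_zeros else "frame"

# A combed block (moving edge, fields differ) should pick field mode;
# a progressive block (all lines alike) should pick frame mode.
combed = [[255 if x < (8 if i % 2 == 0 else 12) else 0
           for x in range(16)] for i in range(16)]
progressive = [[255 if x < 8 else 0 for x in range(16)]
               for _ in range(16)]
print(choose_mode(combed))
print(choose_mode(progressive))
```

In a real encoder the two (or four) DCTs would run in parallel via SIMD, and the "count the zeros" test would feed directly into the mode flag written to the bitstream; this sketch just shows why zero-counting is a cheap, effective proxy for coding cost.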

With H.264, the entropy encoding (CABAC or CAVLC) takes far more CPU than the DCT, so this is a pretty clever way to get better efficiency with a really simple addition.

Rock on.

Posted at December 7, 2006 7:34 PM | Misc
