Many years ago, I created this guide to 'extreme' compression using x264 - attaining compact, high-quality files through the use of the most heavily optimised configuration, without regard to processing time or to the amount of human attention required.

In the years since, technology and expectations have advanced and rendered much of this guide obsolete. HTML5 video, then a new technology, has become widely established. Where once almost all video was downloaded, it is now as likely to be streamed. Most importantly, device profiles matter a lot more, as viewing video on comparatively low-powered devices such as tablets and phones has become just as important as viewing on PCs and laptops.

As I write this new introduction, h265 - a substantially more advanced codec - is poised to take over from h264. I have decided against updating this guide to modern technologies and conventions, as that would be a near-total rewrite. Certain portions of it still hold value, however, so nor am I going to take it down entirely. Instead I am archiving it on IPFS, to serve both as a reference and as a historical view into the video technology of the 2000s.

Please bear this in mind when using anything you read below: much of it was written with old technology in mind, and no longer conforms to current best practices. These instructions are aimed at creating the most compact possible file for download, and push playback hardware to limits that many portable devices cannot handle. Nothing below should be followed blindly; take it only as a suggestion.

- Codebird, 2017, with regard to Codebird, 2007.


Contents:
Introduction
Preparing the source
Filtering and restoration
Basic encoding settings
Advanced encoding settings
Audio, muxing and metadata
Preparing the source.
The evils of interlacing, and conversion to simple progressive.
Example of blended interlacing artifacts.

In this image, two frames have been blended together. Note the lack of the 'comb' artifacts. This type of damage may be the result of an unusual method of frame rate conversion, or - more commonly - an ineptly-performed attempt at deinterlacing. If the former, you can just delete the dud frames and change the FPS, but if the latter it's a write-off: the damage is not repairable. If your own attempt at deinterlacing comes out looking like this, then you are doing something wrong.

Example of interlacing artifacts.

An example of interlacing with field separation intact. Though difficult, a video with frames such as this can be returned to proper progressive form. Note the one-pixel-high 'comb' appearance.

In an ideal world, you'll have a source that is and always has been progressive, at the same frame rate as when it was made. That's good. Viewers like it, filters like it, and the compressor likes it. If your video is of recent origin - from a blu-ray, modern video camera or most internet video - that's usually exactly what you'll get. Take a look. If it is, celebrate, because the rest of this section doesn't apply to you.

Video is not always like that. Anything broadcast before the takeover of digital technology, or stored in PAL or NTSC, is going to be interlaced. It may have been shot that way, or converted for broadcast. A key rule to remember here is that interlacing is evil. It is substantially harder to compress, even if the encoder has an interlaced mode (x264 does), and it tends to confuse most video filters into outputting a mangled mess. It makes the simplest operations into an overcomplicated nightmare - even deleting a single frame becomes an easy way to ruin a video.

When it comes to actually doing the deinterlace, things get a little trickier. Ideally you should just get a non-interlaced source, but that isn't always possible - often the video was produced interlaced, or all pre-interlacing sources are lost. In that case, your only option is deinterlacing. There are a lot of filters for this, but deinterlacing is more than just a quick filter.

Firstly, you need clear separation of the fields - something recognizable by the presence of distinct 'comb' artifacts on moving objects. If the fields have previously been blended together by incorrectly-performed processing or compression, the video is effectively ruined - you'll just have to find another source. Even if you appear to have separation, the blend may simply be too slight to notice at a glance - you can check this by using virtualdub's deinterlace filter in 'split' mode. If you see both images clear with no ghosting, then you can be sure of separation. If you don't have separation, then it is near-impossible to fix: you'll probably have to find another source.
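
You can make the same check in AviSynth. A minimal sketch, with a placeholder filename: SeparateFields() splits each frame into its two constituent fields so you can step through them looking for ghosting.

#Example: Inspecting the fields of an interlaced source.
AVISource("interlaced_source.avi")  #Hypothetical interlaced source file.
SeparateFields()                    #Each frame becomes two half-height fields, in field order.
#Step through the result frame by frame: if every field is a clean image with no
#ghost of its neighbour, the fields are properly separated and can be recovered.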

If you do have clear separation, then it is time to use a deinterlace filter. Here there is an important point to bear in mind: There are two forms of interlaced video, which must be handled in very different ways.
- 'True' interlaced video: each field is captured at a moment exactly half the inter-frame interval apart, and contains only alternating scan lines. True interlaced video is produced only by electronic cameras, which can operate natively in an interlaced manner. If the video came from a pre-digital camcorder or most television programs before the switch to HD, then it'll likely be this.
- Telecined video: video which was recorded in a progressive format - either old-fashioned photographic film or a progressive-mode digital camera - and then converted to an interlaced format for broadcast. As pre-HD televisions could only show interlaced video, this was a very common practice. If you're dealing with a film or a higher-quality television program recorded from broadcast, you'll likely have this. Additionally, almost without exception, animation that appears interlaced is actually telecined.

If you have true interlaced video, then there is no simple way to convert to progressive. As the fields are not taken from simultaneous moments they cannot be recombined into complete progressive frames, nor can a field be turned into a frame in itself, because fields lack the vertical resolution and are taken from offsets that would introduce an undesirable 'field bob' artifact. The only viable approach is an adaptive filter, able to use different algorithms for low- and high-motion areas of the image. This is complicated, and often requires some fine-tuning. Depending on the approach taken, these filters may either duplicate a lot of information in order to double the video frame rate and keep all the temporal information, or discard some sections of some fields in order to maintain the frame rate without introducing comb or bob artifacts. It's a matter of preference, though I strongly dislike the doubling of frame rate: it just wastes far too much storage recording events too fast for all but the fussiest of viewers to perceive.
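
As an illustration, here is a minimal sketch using the third-party Yadif plugin for AviSynth, one adaptive deinterlacer among many. The filename is a placeholder, and the field order must be checked against your own source.

#Example: Adaptive deinterlacing of true interlaced video.
AVISource("true_interlaced.avi")  #Hypothetical interlaced camcorder footage at 29.97fps.
ConvertToYUY2()
Yadif(mode=0, order=1)  #mode=0 keeps the original frame rate; mode=1 would 'bob' it to
                        #double the rate instead. order=1 assumes top field first.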

Telecined video, though, is a different matter entirely. In this case, the video is really a progressive video that has been made to 'look' interlaced - and if you can invert this transformation, you can recover the original progressive video. This process is called 'inverse telecine' or 'reverse pulldown'. There are a lot of filters available for this, and if you can do it then it is essential that you do. Some fiddling is often needed here, as the telecine sequence is often interrupted at edit points. The one built into virtualdub (look under 'Frame rate') is surprisingly effective.
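
In AviSynth, one common route is the Telecide and Decimate pair from the Decomb plugin. A minimal sketch, assuming NTSC 3:2-telecined material, top field first, and a placeholder filename:

#Example: Inverse telecine of NTSC film material.
AVISource("telecined_source.avi")  #Hypothetical 29.97fps telecined film.
ConvertToYUY2()
Telecide(order=1, guide=1)  #Match fields back into whole progressive frames.
Decimate(cycle=5)           #Drop the one duplicate frame in every five, giving 23.976fps.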

#Example: Typical AviSynth script for correction of duplicated frames.
AVISource("movie_source.avi")
ConvertToYUY2()
#Dup turns near-duplicate frames into exact repeats, making the pattern easier to decimate.
dedup=Dup(threshold=1, show=false)
#Keep four frames out of every five; the offset shifts where the dropped frame falls.
seg1=SelectRangeEvery(dedup, 5, 4, 0, audio=false)
seg2=SelectRangeEvery(dedup, 5, 4, 1, audio=false)
seg3=SelectRangeEvery(dedup, 5, 4, 3, audio=false)
#Frame numbers reflect locations of removed commercial breaks,
#and thus places where the sequence is broken.
Trim(seg1, 0, 32179)+Trim(seg2, 32180, 65660)+Trim(seg1, 65661, 101780)+Trim(seg3, 101781, 0)
AssumeFPS(24000, 1001) #23.976fps - exactly 4/5ths of the original 29.97fps.
AudioDub(last, dedup)  #Restore the full-length audio from the undecimated clip.

Nor is interlacing the only thing to look out for. Over the decades many different frame rates have been in use around the world for different video standards: 24fps, 25fps, 23.976fps, 29.97fps, 30fps - and those are just the common ones. If you're working with historic video from the early days of cinema there were few standards, and the correct frame rate might not even be known - often producers simply used the lowest they could get away with, to reduce the use of expensive film or allow more to fit on a reel. The video you have may not always be at the frame rate in which it was made, and optimal encoding calls for fixing this. If the frame rate has been increased by duplicating frames, those duplicates simply waste space and encoder resources.

Common frame rates you will encounter:
<18fps: Many silent-era films. There were no standards back then. If you see an object of known dimensions falling, you can try to calculate based on acceleration. Otherwise, estimate by eye.

18fps: Super 8 film. Those old family films.

23.976fps: Film shot at 24fps is commonly slowed to this (with the audio stretched very slightly to match) in NTSC regions, because it can then easily be increased to the 29.97fps NTSC rate by doubling every fourth frame. Commonly confused with 24fps.

24fps: The usual for film throughout the modern era.

25fps: PAL TV, as used in most of Europe. Actually interlaced, 50 fields per second.

29.97fps: NTSC, the US television standard. More precisely, 30/1.001fps. That very slight difference from 30fps is to shift a few key frequencies around in the analog encoding to prevent interference.

30fps: Common for digital video in NTSC regions. Often confused with 29.97fps, leading to problems with audio sync.

>30fps: Various not-yet-widely-deployed cinema standards run at 48, 72 or even higher FPS. The benefits of these higher frame rates are the subject of a holy war within video circles, with some arguing that anything above 30 or even 24fps is of no benefit to limited human vision, while others argue just as insistently that they can see terrible jerkiness at anything below 48. Objective data is hard to come by in this debate. Unless you actually work professionally with cinema equipment, you are unlikely to encounter these. The only other place extreme frame rates are found is the output from high-speed cameras.

By far the most common conversion you will find is film-rate material (24fps or 23.976fps) raised to the ~30fps NTSC rate by doubling every fourth frame - a cheap trick routinely used to prepare film, animation and imported material for NTSC broadcast. Fortunately this is a very easy thing to spot, because a brief look through the video in any frame-viewing editor like virtualdub will show every fourth frame doubled: ABCDDEFGHHIJKLL. If you remove the duplicates, you immediately lose 20% of the frames without any loss of meaningful information. Not only is that an instant 20% reduction in raw bitrate, but a 20% reduction in encode time, a 20% reduction in the time and memory needed to decode, and a significant increase in the effectiveness of I- and B-frame compression. Something of such benefit that it simply must be done. As an added bonus, the video may even look smoother, with each frame now displayed for exactly the same length of time. You can do this using a simple AviSynth script utilising the Dup (optional, but strongly advised), SelectRangeEvery and AssumeFPS filters.

There is but one thing to look out for: if the video has been edited after conversion in any way (eg, advertising breaks added or removed), the every-fourth-frame sequence will be broken at the point of editing. You'll just have to go through the whole thing finding the breaks and editing your script to take them into account, as in the example script above. Don't forget to set the frame rate to 80% of the converted rate, or else the video will play too fast and the audio will fall out of sync.
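
For a video with no edits after conversion, the whole job reduces to a few lines. A minimal sketch, with a placeholder filename; the Dup plugin must be installed separately, and the SelectRangeEvery offset may need adjusting to match where the duplicate falls.

#Example: Removing the duplicated frames from an unedited conversion.
AVISource("ntsc_doubled.avi")           #Hypothetical 29.97fps source with every fourth frame doubled.
ConvertToYUY2()
Dup(threshold=1, show=false)            #Turn near-duplicates into exact repeats.
SelectRangeEvery(5, 4, 0, audio=false)  #Keep four frames out of every five.
AssumeFPS(24000, 1001)                  #23.976fps - 4/5ths of the original rate, so the untouched audio stays in sync.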

It gets worse. Interlacing is awkward, frame rate conversions are awkward, and often they will be found together. This can introduce all manner of strange and complicated messes. If you have to deal with the headache of interlacing often, you'll soon learn that there is no magic 'click these options' solution: You need to study the video frame-by-frame and adapt your technique accordingly.

This can require an eye for detail. For example, film has historically used 24fps (for reasons lost in the mists of time), while broadcast TV runs 25fps in the UK - thus films in PAL may be sped up to 25/24 speed (with audio compensated to match), or may have every 24th frame shown twice (and audio unaltered) according only to the preferences of the editor. If you're just encoding a movie to watch on your DVD-drive-less netbook on holiday, you don't need to worry about this level of fiddly perfection, but if you are working with restoring video of some historical importance you'll need to know how to best handle situations like this.
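
If you do need to correct a PAL speedup yourself, here is a minimal sketch covering both possibilities (the filenames are placeholders).

#Example: Undoing a PAL transfer of 24fps film.
#Case 1: the whole film was sped up from 24fps to 25fps.
AVISource("pal_speedup.avi")       #Hypothetical 25fps transfer of a 24fps film.
AssumeFPS(24, 1, sync_audio=true)  #Slow the video back down and resample the audio rate to
                                   #match, which also lowers its pitch by about 4%.

#Case 2: every 24th frame was shown twice instead, audio untouched.
#AVISource("pal_doubled.avi")
#Dup(threshold=1, show=false)
#SelectRangeEvery(25, 24, 0, audio=false)  #Drop the one duplicate in every 25 frames.
#AssumeFPS(24, 1)                          #Audio was never altered, so leave it alone.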

Whenever you work on video, you'll be studying artifacts left over from the history of its creation. To take an example, just imagine a program recorded from (interlaced) television. That program will probably have been recorded progressive - certainly so if animated or computer-rendered - and, if it's old, on film. That footage will have been edited to make the program, which is no problem for you at all. After that, it will have been converted to an interlaced format for broadcast, and additional editing will have been performed by the broadcaster. They may want to remove a scene or alter a shot to fit their BS&P specifications, and they'll certainly want to edit in breaks for advertising. If the program is broadcast in a country other than that of its origin, the video format will also be altered - changing line count and frame rate, through all manner of different patterns of padding or dropping - and it'll be further edited for local standards. If sourced from a previous broadcast (common practice for made-for-TV movies) then previous advertising breaks may have to be removed before new ones can be added, and during digitization occasional frames may be dropped. The result is video archeology: achieving a perfect deinterlace, free of artifacts and without losing valuable information, may require some very precise examination of the video to determine places where field orders change or frames are missing from a sequence.

Always keep your goal in mind here. Everything that follows - the filtering, cleaning, the encoding itself - depends upon having a properly progressive, tidy video to work from, and if you don't have this then you will have to recover it as best you can. For video shot on film, your objective should be to recover the sequence of photographs exactly as they were first recorded. Likewise, for animation you should be retrieving the sequence of drawings. If it was shot on an interlacing video camera... do the best you can. Interlacing is slowly dying out, and it will not be missed, but the legacy of half a century of interlaced video production will be with us for a long time.

The best option of all, if possible, is to avoid having to deal with interlacing entirely and use a progressive source. But if this isn't possible, make the most of what you have.