Many years ago, I created this guide to 'extreme' compression using x264 - attaining compact, high-quality files through the use of the most heavily optimised configuration, without regard to processing time or to the amount of human attention required.

In the years since, technology and expectations have advanced and rendered much of this guide obsolete. HTML5 video, then a new technology, has become widely established. Where once almost all video was downloaded, it is now as likely to be streamed. Most importantly, device profiles matter a lot more, as viewing video on comparatively low-powered devices such as tablets and phones has become just as important as viewing on PCs and laptops.

As I write this new introduction, h265 - a substantially more advanced codec - is poised to take over from h264. I have decided against updating this guide to modern technologies and conventions, as that would mean a near-total rewrite. Certain portions of it still hold value, however, so nor am I going to take it down entirely. Instead I am archiving it to IPFS, to serve both as a reference and as a historical view into the video technology of the 2000s.

Please bear this in mind when using anything you read below: much of it is written with old technology in mind, and no longer conforms to current best practices. These instructions are aimed at creating the most compact possible file for download, and they push playback hardware to limits that many portable devices cannot meet. Nothing below should be followed blindly; take it only as a suggestion.

- Codebird, 2017, with regard to Codebird, 2007.


Contents:
Introduction
Preparing the source
Filtering and restoration
Basic encoding settings
Advanced encoding settings
Audio, muxing and metadata

Advanced encoding settings

With the above tweaks, you can't easily go wrong. They work, they work well, and they work almost all the time. With few exceptions, they will not lower the encode quality - the worst they might do is waste processor time. The following options are more troublesome: they might work, they might not, and even when they do work the ideal value varies greatly by video. They could actually make the encode worse.

I have closely studied all of these options in a very systematic manner - and I have the graphs to back up my findings. All my assessments are based on pure SSIM/PSNR metrics: I do not trust anything as subjective as a visual comparison, and rely only upon hard data.
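If you want to check these metrics for yourself, the standalone x264 command-line encoder can report SSIM directly. A minimal sketch - the file names and CRF value here are placeholders, not recommendations:

    x264 --crf 20 --tune ssim --ssim --output /dev/null input.y4m

The SSIM figure is printed on the final lines of the log. If you drive x264 through a front-end such as mencoder or ffmpeg, the same settings are passed as key=value pairs instead of double-dash options.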


[Graph: SSIM vs merange. Example of the effect of merange on a test video - the shape of the curve depends upon the nature of the video being encoded.]
merange=?

Sets the search radius for motion estimation. This only has an effect if you also use me=umh or me=tesa. The default is 16. The speed of motion estimation is roughly inverse to the square of this value, so raising it is going to hurt encoding time. A lot. This is one of the very slowest options you can turn up, but if you're aiming for extreme encoding you're going to need it.

In mathematical principle, it should be impossible to lose quality by setting merange too high - the worst you can do is waste processor time. In my own experience it can happen, but I think this is just an artifact of the way SSIM and PSNR are calculated producing slight local minima, as can be seen in the graph here.

merange has a serious impact on encoding time - the slowdown is worse than linear. As the graph shows, there does come a point of diminishing returns. Where that point falls depends upon the content and resolution of the video - high-resolution HD video benefits from a higher merange than SD material. Since setting merange too high has no negative effect apart from a slower encode, though, there is no reason not to err on the side of excess. I personally suggest a value of 64 in most cases - 4k video may benefit from going higher, but I have yet to test this.
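For illustration, this setting ends up on the command line like so - a sketch only, with placeholder file names and an arbitrary CRF, shown in the standalone x264 encoder's double-dash form:

    x264 --crf 18 --me umh --merange 64 --output out.264 input.y4m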

[Graph: SSIM vs bframes. Example of the effect of bframes on a test video - the shape of the curve depends upon the nature of the video being encoded.]
[Graph: CGI test, bframes and b-adapt.]
bframes=4

Sets the maximum number of consecutive bidirectional predictive frames. More is better, but not always much better - in most cases anything above four is completely pointless and only produces videos that take longer to both encode and decode. Unless your video has unusual properties I wouldn't advise going above four, and a lot of the time three is plenty. I've run a few tests on CGI and live-action material, and as these graphs show there is just no point going above four. It might be worthwhile on animation; I've not tested that.

I also tested b-adapt=2. The x264 wiki may call it 'optimal', but my own tests show it to be decidedly inferior to the default b-adapt=1.
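On the command line these two settings look like the following - again just a sketch with placeholder file names, in the standalone x264 encoder's option form:

    x264 --crf 18 --bframes 4 --b-adapt 1 --output out.264 input.y4m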

filter=?:?
This is a hard one. These two numbers control the parameters for the inline deblocking filter. The default, 0:0, is defined as what the developers have found experimentally to be optimal. In almost all cases, it is - those developers knew what they were doing with this one. The only time you are likely to see any benefit from changing this is when encoding very low-bitrate animation.

Unfortunately the parameters are based around arcane mathematics that few can understand, so the only way to find out what helps and what makes things worse is trial and error - many attempted encodes with different parameters. I've written a little script to test values systematically, ponymath.pl. It'll take an age, but if you're determined to squeeze out every last bit you may wish to use it; the heat map it produces will serve as a guide. Both parameters can take positive or negative integer values (no decimals!), and you'll find the range -3 to 1 to be of most interest. Just be aware that this takes a truly ridiculous amount of processing power (to get accurate results, you need to test with all other optimizations already enabled) for a very small improvement. Even to the most determined encoder, it isn't likely to be worth the effort.
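To try a single candidate pair by hand rather than running the full sweep, something like this works - a sketch only, with placeholder file names; depending on the age of your x264 build the option is spelled --filter or --deblock:

    x264 --crf 18 --tune ssim --ssim --deblock=-2:-1 --output /dev/null input.y4m

Compare the SSIM printed at the end of each run to decide which pair wins.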

You should now have worked out that there is no 'right' way to encode video in general. The optimal settings differ from video to video - not just with resolution and frame rate, but with aspects that are hard to quantify. I can't simply give you numbers to plug in. Fortunately, there is a solution: a very simple script I created called Ponymath. It's just a script to draw graphs by varying a parameter and plotting the SSIM results (it's made to run under Linux or other *nixes, though you could probably make it run on Windows with enough effort). With Ponymath you can optimize parameters through sheer processor time. It's slow - really slow - but if you have the time it may help.
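The idea behind it is nothing more sophisticated than the loop below - a rough sketch of the concept, not the actual ponymath.pl, with placeholder file names and merange chosen as the example parameter:

    #!/bin/sh
    # Encode the same clip at several merange values and collect the
    # SSIM figure x264 prints at the end of each run (x264 logs to stderr).
    for r in 16 24 32 48 64 96; do
        ssim=$(x264 --crf 18 --tune ssim --ssim --me umh --merange $r \
                    --output /dev/null input.y4m 2>&1 | grep -o 'SSIM Mean Y:[0-9.]*')
        echo "merange=$r  $ssim"
    done

Plot the resulting numbers however you like; the shape of the curve is what matters.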

A note on other options

There are many rumors on forums, and contradictory guides, on this subject. I strive to test thoroughly before I advise anything. Notably, I have tested the no-fast-pskip option, and found it to be not only ineffective but detrimental.

A note on threads

[Graph: SSIM vs threads.]

x264 can use an arbitrary number of threads, but there is a small cost: as the number of threads increases, encode quality goes down. This is because some internal decision-making has to be done with incomplete information. Usually the effect is very slight (I went from 0.9411 at threads=1 to 0.9409 at threads=16 on Rush Hour), but it appears to be much more pronounced on animation, as this graph from the Friendship is Magic opening shows. At threads=2 or threads=4, the most common scenario for users without high-end workstations, the effect remains negligible (not even I'll object to a loss of 0.0002), but if you are using eight threads (e.g. four cores with hyperthreading) or sixteen (a dual-socket workstation) it becomes a concern. Fortunately there are two solutions. The simplest is to explicitly specify threads=4 and accept the slowdown. Or, if you are encoding multiple files or episodes and have enough RAM, you can encode them in parallel: four parallel instances at threads=4 should take little longer than running four encodes at threads=16 one after another, but avoids the quality-sapping effects of over-parallelization.
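As a sketch of the parallel approach (placeholder file names; assumes you have enough RAM for four encoders at once):

    #!/bin/sh
    # Four encodes at four threads each, run side by side instead of
    # one after another at sixteen threads.
    for f in ep1 ep2 ep3 ep4; do
        x264 --crf 18 --threads 4 --output "$f.264" "$f.y4m" &
    done
    wait    # block until all four background encodes finish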