
Some years ago, when H.264 was new and uncommon, I thought about archiving tapes in XviD so I could share them more easily with friends and family. I tested it and found the quality far too poor compared to x264. I begrudgingly went ahead with x264, even though my friends and family wouldn’t be able to play it back, as I didn’t want to save smudged videos.

Now extrapolate to today: it sounds ridiculous to use XviD, or even H.264. For ease of use I’d use 10-bit x265; for future-proofing I would need to read up on AV1. Think about what it will look like in 10 years (2032), as you will certainly still have those files then.



There's a slightly counter-intuitive thing with H.264 and H.265 encoding: for any given bit-rate, you'll get better quality if you encode at 10-bit instead of 8-bit, even if the source clip was 8-bit.

The reason for this is that the DCT results in 16-bit numbers for every pixel. That doesn't make things worse (partly) because only the non-zero values are sent.

There are many more reasons why, but that's the quick summary.


How is this possibly true? The argument that 16-bit DCT somehow gives better precision AND doesn't change the size of the encode makes no sense. If you get better precision you need to keep those extra-precise bits that are no longer zero due to truncation.

I haven't seen this argued as 16-bit DCT, but in color space conversion. The gist is that all 8-bit RGB values cannot be represented properly in 8-bit YUV420, so you're supposed to use 10-bit to get "proper" YUV values. But if you start with an 8-bit encode you've already thrown away the extra precision, so why waste the (considerable) extra compute on 10-bit just to make sure you don't truncate the already-truncated YUV?
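To put a number on the colour-conversion argument, here is a minimal sketch that round-trips a grid of 8-bit RGB values through integer YCbCr at 8 and 10 bits and counts how many fail to come back unchanged. It assumes BT.709 coefficients and limited ("TV") range, and it ignores 4:2:0 chroma subsampling entirely, so it only illustrates the quantisation part of the claim. All function names are made up for illustration.

```python
KR, KB = 0.2126, 0.0722            # BT.709 luma weights (assumption)
KG = 1.0 - KR - KB

def rgb_to_ycbcr(r, g, b, depth):
    """8-bit RGB -> limited-range YCbCr integer codes at `depth` bits."""
    scale = 1 << (depth - 8)       # 1 for 8-bit, 4 for 10-bit
    rn, gn, bn = r / 255, g / 255, b / 255
    y = KR * rn + KG * gn + KB * bn
    pb = (bn - y) / (2 * (1 - KB))
    pr = (rn - y) / (2 * (1 - KR))
    return (round((16 + 219 * y) * scale),
            round((128 + 224 * pb) * scale),
            round((128 + 224 * pr) * scale))

def ycbcr_to_rgb(yc, cb, cr, depth):
    """Inverse of rgb_to_ycbcr, rounded and clamped back to 8-bit RGB."""
    scale = 1 << (depth - 8)
    y = (yc / scale - 16) / 219
    pb = (cb / scale - 128) / 224
    pr = (cr / scale - 128) / 224
    rn = y + 2 * (1 - KR) * pr
    bn = y + 2 * (1 - KB) * pb
    gn = (y - KR * rn - KB * bn) / KG

    def to_code(v):
        return min(255, max(0, round(v * 255)))

    return to_code(rn), to_code(gn), to_code(bn)

def count_roundtrip_mismatches(depth, step=15):
    """Count sampled RGB triples that do not survive RGB -> YCbCr -> RGB."""
    levels = range(0, 256, step)   # coarse grid to keep the run short
    bad = 0
    for r in levels:
        for g in levels:
            for b in levels:
                if ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b, depth), depth) != (r, g, b):
                    bad += 1
    return bad

mismatches_8bit = count_roundtrip_mismatches(8)
mismatches_10bit = count_roundtrip_mismatches(10)
print(mismatches_8bit, mismatches_10bit)
```

Under these assumptions the 10-bit round trip should reproduce every sampled value, while the 8-bit one should not. It says nothing about whether that difference is visible or worth the extra compute, which is the question raised above.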

I have a project in progress to measure all of the variations, but from quick testing with CRF encoding, the same CRF value results in much longer compute AND a larger file in 10-bit versus 8-bit. The larger file has a slightly higher VMAF score, as would be expected from spending more bits. The work is in finding a set of encoding parameters to measure the quality difference at the same output size, and to measure the relative improvement across CRF vs size vs bit depth.
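For anyone wanting to try the same comparison, a rough sketch of how the encode commands could be generated follows. It only builds ffmpeg argument lists and does not run them. It assumes an ffmpeg build whose libx264 supports both 8-bit and 10-bit output; the helper name, the file names, and the `slow` preset are placeholders, not the settings used in the test described above.

```python
def x264_encode_cmd(src, dst, bit_depth, crf=None, bitrate=None):
    """Build an ffmpeg/libx264 command line for an 8-bit or 10-bit encode.

    Pass exactly one of `crf` (constant quality) or `bitrate`
    (e.g. "2M", to compare bit depths at roughly the same output size).
    """
    if bit_depth not in (8, 10):
        raise ValueError("bit_depth must be 8 or 10")
    if (crf is None) == (bitrate is None):
        raise ValueError("give exactly one of crf or bitrate")

    pix_fmt = "yuv420p" if bit_depth == 8 else "yuv420p10le"
    cmd = ["ffmpeg", "-y", "-i", src, "-an",
           "-c:v", "libx264", "-preset", "slow", "-pix_fmt", pix_fmt]
    if crf is not None:
        cmd += ["-crf", str(crf)]
    else:
        cmd += ["-b:v", bitrate]
    return cmd + [dst]

# Hypothetical file names, same CRF at both bit depths.
cmd_8bit = x264_encode_cmd("source.mkv", "out_8bit.mkv", 8, crf=20)
cmd_10bit = x264_encode_cmd("source.mkv", "out_10bit.mkv", 10, crf=20)
print(" ".join(cmd_8bit))
print(" ".join(cmd_10bit))
```

Each list can be passed to `subprocess.run`. Using `bitrate` instead of `crf` is one simple way to hold output size roughly constant, though a careful comparison would probably want two-pass encoding and a separate quality measurement step.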


Replying to just your 1st paragraph:

The process is: Raw input pixels (8 or 10 bit) minus predicted pixels (8 or 10 bit) -> residual pixels (8 or 10 bit + 1 sign bit).

You take these residual pixels and pass them through a 2D DCT, then scale and quantise them. At the end of this, the quantised DCT residual values are signed 16-bit numbers - you don't get to choose the bit-depth here; it's part of the standard (section 8.6). For every 16x16 pixel input, you get a 16x16 array of signed 16-bit numbers.

The last step is to pass all non-zero quantised DCT residual values through an entropy coder (usually an arithmetic coder), then you get the final bitstream.

The key point is that it didn't matter if the original raw pixel input was 8-bit or 10-bit; the quantised DCT residual values became 16 bits before being compressed and transmitted. This is also true for 12-bit raw pixel inputs.

This seems impossible; for 8-bit inputs, you've doubled the size of the data (slightly less than double for 10-bits), so you must be making things worse! The key is that after scaling and quantisation, most of those 16-bit words are zero. Those that are non-zero are statistically closer to zero so that the entropy encoder won't have to spend a lot of bits signalling them.
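A toy illustration of the residual -> DCT -> quantise steps above, showing how few non-zero values are left for the entropy coder. This is a sketch under simplifying assumptions, not what a real encoder does: it uses a 1-D, 8-point floating-point DCT instead of the standard's 2-D integer transforms, a single uniform quantiser step of 4, and an invented smooth residual.

```python
import math

N = 8  # toy block length (assumption; real codecs use 2-D blocks)

def _scale(k):
    # Orthonormal DCT-II scaling factor.
    return math.sqrt((1 if k == 0 else 2) / N)

def dct(x):
    """Orthonormal 1-D DCT-II of an N-sample block."""
    return [_scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                            for n in range(N))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N)
                for k in range(N))
            for n in range(N)]

def quantise(coeffs, step):
    """Uniform quantiser: divide by the step and round to an integer level."""
    return [round(c / step) for c in coeffs]

def dequantise(levels, step):
    return [level * step for level in levels]

STEP = 4                               # hypothetical quantiser step
residual = [0, 1, 2, 3, 4, 5, 6, 7]    # invented smooth residual

levels = quantise(dct(residual), STEP)
nonzero = sum(1 for level in levels if level != 0)
recon = idct(dequantise(levels, STEP))

print(levels)                          # [2, -2, 0, 0, 0, 0, 0, 0]
print(nonzero, "of", N, "levels are non-zero")
print([round(v, 2) for v in recon])
```

Only the two non-zero levels would need to be signalled here, and both are small. The reconstruction is close to the input but not identical, which is the loss that the quantiser step controls.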

The last part comes when you reverse this process. The mathematical losses from scaling and quantising 10-bit inputs into the transmitted 16-bit values are less than the losses for 8-bit inputs. When you run the inverse quant, scale and iDCT, you end up with values that are closer to the original residual values at 10-bit than you do at 8-bit.


Sorry, I don't understand. I can see why it wouldn't be worse, but why would it be better?


If I remember correctly that is only with x264, and no longer the case with x265.


You'd be better off using the highest H.264 profile today. It gets you _MOST_ of the way to H.265, is way better on patents, and is slightly better for still-frame quality.

For long term archive work the above or AV1 (if you have infinite time / energy budget) are probably better, depending on settings.


Patents on XviD's underlying standard are also going to expire within a year:

https://meta.wikimedia.org/wiki/Have_the_patents_for_MPEG-4_...


Why would it be ridiculous to use H.264 today? It's still used everywhere. It takes more space than newer codecs, but there is nothing ridiculous about it if you want a widely supported format with low decoding requirements and a very reasonable amount of space taken. It's not even that bad compared to H.265 for the same quality.


It's not obvious. HEVC is dead in web browsers and doesn't look like it will become available. Jump from AVC straight to AV1.


Ten years from now your first AV1 video might be finished encoding. /s but not really.



