
H.266 Video Codec promises 50% size reduction over the current H.265

On 7/6/2020 at 7:44 PM, gabrielcarvfer said:

How long will it take until we start using auto-encoder neural networks to compress video?

I have been thinking a lot about this over the last year or so. 

 

I haven't put much legwork into the research, but I am very interested in this idea, specifically seeing if we can increase the distance between keyframes by using neural networks to infer the frames between two keyframes.


26 minutes ago, gabrielcarvfer said:

Traditionally, automating tweening has caused serious snap problems in animation. There are, or were, a bunch of systems for it.



12 minutes ago, gabrielcarvfer said:

Probably due to unnatural motion (stuff popping in/out of the frame, rapidly rotating, etc). There's nothing preventing you from building/training a specialized encoder for animations (treating frames more like pictures and less like movies).

 

This technique looks pretty good:

https://youtu.be/IK-Q3EcTnTA

My memory is that things looked kind of robotic. You can see it in turn-of-the-century cheap animation. Disney abandoned it when they did Aladdin and things worked much better. What they did was outsource tweening to low-wage countries. Darkwing Duck did this and looks good as a result.



43 minutes ago, Bombastinator said:

My memory is that things looked kind of robotic. You can see it in turn-of-the-century cheap animation. Disney abandoned it when they did Aladdin and things worked much better. What they did was outsource tweening to low-wage countries. Darkwing Duck did this and looks good as a result.

I always find that "interpolation" efforts to make animation look 60fps make it worse. It's one of the key reasons not to have interpolation turned on on your smart TV. In regular films, frame interpolation takes the picture from a film aesthetic to a webcam aesthetic, and I don't like it.

 

60fps in a game "looks nicer" because your reaction time is in sync with the game. Same when you have a 144Hz or 120Hz monitor. That lack of interpolation makes the game feel better. It doesn't actually make it look better, and you can always see this when a game has 720p24/720p30 videos in one scene and then switches to the system's native resolution the next. It breaks the immersion.

 

But some people like everything to be high frame rate, and that's why the feature exists.

 

In animation, there's really no easy way to automatically fill in the blanks between the keyframes.

 

[animated GIF: "yeti" example from the article linked below]

src: https://venturebeat.com/2017/06/02/pixar-veteran-creates-a-i-tool-for-automating-2d-animations/

This looks more unsettling than anything.

 

Likewise Live2D does it too.

(img too big for the forum, click the link below) 

src: https://docs.live2d.com/cubism-editor-tutorials/motion-hint/?locale=en_us

The way Live2D works is that it has the model cut up into a texture atlas, but it can't really move the parts of the model beyond about 10-15 degrees because there really is no depth. However, it can "tween" by stretching, which makes it look at least less bad than the image above. There are separate textures for "hands up" and "hands closed", much like a 3D model would have, but unless a model has those images, the software rendering it just doesn't show them. I've seen at least one product try to "fill in the blank" so you can get the same kind of articulation a 3D model would have, but it is just extremely unsettling.
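For what it's worth, the "tween by stretching" described above boils down to interpolating each cut-out part's transform between two authored poses. Here's a minimal Python sketch of that idea; the part, pose values, and 12-frame count are made up for illustration and are not anything from Live2D's actual API:

```python
from dataclasses import dataclass

@dataclass
class PartPose:
    """Pose of one cut-out part in a 2D puppet (values are hypothetical)."""
    angle_deg: float  # cut-out parts only tolerate small rotations (no real depth)
    scale_x: float    # horizontal stretch used to fake a turn
    scale_y: float

def tween(a: PartPose, b: PartPose, t: float) -> PartPose:
    """Linearly blend two authored key poses, with 0.0 <= t <= 1.0."""
    def blend(p: float, q: float) -> float:
        return p + (q - p) * t
    return PartPose(blend(a.angle_deg, b.angle_deg),
                    blend(a.scale_x, b.scale_x),
                    blend(a.scale_y, b.scale_y))

# Example: a head part turning about 12 degrees over 12 in-between frames.
start = PartPose(angle_deg=0.0, scale_x=1.00, scale_y=1.00)
end = PartPose(angle_deg=12.0, scale_x=0.92, scale_y=1.00)
for i in range(13):
    pose = tween(start, end, i / 12)
    print(f"frame {i:2d}: angle={pose.angle_deg:5.2f}  scale_x={pose.scale_x:.3f}")
```

That's also why it breaks down past 10-15 degrees: there is no new texture information to reveal, only stretching of what's already there.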

 

An AI would really need to be trained on a specific art style (without the background) to learn how to "draw" the character, and each character would need to be trained separately. Then you have the problem of characters needing to make eye contact when they speak, which the AI would only understand if the keyframes for both characters were present.

 

As with the Live2D stuff, you can replicate this exactly with any physics engine in a game, without a 3D model, which is its entire selling point. 3D models look unnatural for 2D, and the games with the best-looking cel-shaded 3D actually cheat a lot to get that look. There are a few videos about Street Fighter V explaining how they pulled off the cel-shaded look, and they're very informative. But it's a very manual effort.

 

The cheapest and worst-looking games are the ones where the developer doesn't know what they're doing and uses a series of H.264 videos for the actual animations (I kid you not, I've seen this more than a dozen times). They tend to be a devolution of the "one SWF file per animation loop" style, which was very space-efficient when it was Flash but is horrific as video. The end result is really large video files that would have taken up far less space as 3D models. But hey, sometimes a VN-style game only calls for popping in looping characters, not moving them in response to input. And since you can't rely on there being software or hardware support for any other decoder, they're stuck with that.

 


On 7/6/2020 at 12:05 PM, Tedny said:

16k videos incoming, boi 

SCARY porn res. 🤨



11 minutes ago, gabrielcarvfer said:

@Kisai
the technique you've posted is pretty old at this point. AI ones are evolving pretty fast (look for the walking cat animation in the video I linked). It's amazing.

 

The interpolation TVs do is for natural movement. It is more than fine for most people, especially with live-action content.

No, that cat animation looks terrible. Half of those animations don't improve enough to matter, and the other half turn into a mess.

 

If you're going to let an AI animate 2D for you, you don't want to have to go back and clean up every interpolated frame. Classic tweening in Flash only works because the input is vector. Pretty much every animation package, like Toon Boom and CSP, internally stores the animation as vectors, even for cut-out limited animation. The AI would only work here if it had that source material to bounce off of, and even then the animation software already has the means to do this, so it's undesirable.
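To illustrate why vector input makes classic tweening tractable: the software just interpolates the shape's control points between two keyframes, usually with an easing curve. A rough sketch, with made-up coordinates and a generic smoothstep easing (not how Flash or Toon Boom actually implement their tween curves):

```python
from typing import List, Tuple

Point = Tuple[float, float]

def ease_in_out(t: float) -> float:
    """Smoothstep easing, roughly the shape of a classic tween curve."""
    return t * t * (3.0 - 2.0 * t)

def tween_shape(key_a: List[Point], key_b: List[Point], t: float) -> List[Point]:
    """Interpolate every control point of a vector shape between two keyframes."""
    s = ease_in_out(t)
    return [(ax + (bx - ax) * s, ay + (by - ay) * s)
            for (ax, ay), (bx, by) in zip(key_a, key_b)]

# Two keyframes of a simple triangle (made-up coordinates).
key1 = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
key2 = [(2.0, 1.0), (12.0, 1.0), (7.0, 12.0)]
inbetweens = [tween_shape(key1, key2, i / 6) for i in range(7)]
print(inbetweens[3])  # the middle in-between frame
```

With raster frames there are no control points to interpolate, which is exactly why the AI approaches have to invent pixels instead.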

 


5 hours ago, Kisai said:

Doesn't matter. If someone says "we will sue you" there's nobody out there going "yep, do it asshole so we can own you"

 

Mitsubishi already bought into their licence pool, so if anything, they added legitimacy to their claim.

https://rethinkresearch.biz/articles/aomedia-will-relent-to-fair-av1-patent-licensing-program/

 

http://www.digitaltvnews.net/?p=34317

So far that basically covers the who's who of home theater equipment and the major telcos in Korea and Japan.

 

You can pretend all you want that patents don't exist. You're still wrong.

We will see what happens with that licensing pool. I don't think it will go anywhere, though, and AV1 will take over.

 

 

Until it is actually proven in court, it's just a threat, and we don't know whether it holds up. Since AOMedia had the specific goal of verifying that AV1 does not infringe other patents, I find it very unlikely that a lawsuit would go anywhere. Like I said earlier, the lawyers for all the tech titans have looked it over.

 

 

All that has happened so far is basically a company going "we wanna get paid if someone uses AV1". They haven't even filed a lawsuit or anything; they have just announced that they want money. But it's worth noting that as soon as a lawsuit is filed, they will be banned from using AV1 in their own products according to the AV1 licensing. So as soon as they try to extract money from their patent pool (which we don't even know is legit; patents get invalidated all the time, or they pad the lawsuit with irrelevant patents), their products will no longer be allowed to support AV1. Since AV1 will be massively adopted, banning yourself from supporting it is suicide.

 

AOMedia has already thought of this and made a defense mechanism. 

But your goal seems to be to make people scared of AV1 so that we stick with paid formats, so I'm not surprised you present this the way you do and leave out the information you do.

 

 

A flimsy threat used to be all it took. That technique has worked in the past, and I am not surprised it is being tried again. However, this time it's not just one company behind the free and open codec; it's basically the entire industry. Good luck suing everyone at the same time.


1 hour ago, gabrielcarvfer said:

Matter of taste. I think it is amazing for hand-drawn animations (thinking of testing it on The Reluctant Dragon). Completely agree that it doesn't make any sense to use this kind of technique for computer-generated animations.

There's probably a possibility of using it for FBF (frame-by-frame) animation, but that still runs into a GIGO problem (garbage in, garbage out) if the source animation is not good enough. Let's say the source animation had keyframes at roughly 2 frames per second, and the software was explicitly animating on 2s in a 24fps setup; that is 10 out of 12 drawings per second it has to figure out. That is probably enough to do squash-and-stretch, but not enough to have a character flip directions.
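A quick back-of-the-envelope check of that arithmetic, using only the numbers from the example above:

```python
def drawings_to_infer(key_fps: float, target_fps: float, on_ns: int) -> int:
    """Drawings per second the software must invent, given source keys at
    key_fps and a target of animating 'on Ns' within a target_fps timeline."""
    drawn_per_second = target_fps / on_ns   # e.g. 24 fps animated on 2s -> 12 drawings/s
    return int(drawn_per_second - key_fps)

# Keys at roughly 2 fps, animating on 2s at 24 fps: 10 of 12 drawings are inferred.
print(drawings_to_infer(key_fps=2, target_fps=24, on_ns=2))  # -> 10
```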

 

But using it on existing content that wasn't supposed to be animated that way results in the background's speed being increased by 2.5x while the foreground is sped up 5x, and it just doesn't look correct. With 3D animation, the background and foreground are in sync, so doubling the frames doesn't look quite as terrible until there is rapid movement.

 

Anyway, moving back on topic.

 

As for AI for increasing the space between keyframes in the actual compression, rather than in animation: I don't think there is really any need for it. With lossy compression you only need to wait for the next keyframe, and most of the time that is about half a second away from navigation points. If you are talking about lossless compression, however, it's another story.

 

As per my earlier reference about uploading uncompressed video to YouTube and having YouTube unable to navigate it (which suggested it was compressing on the fly, and the lack of keyframes in the video made seeking impossible), the way to "retrofit" this compression would in fact require keeping both the lossy stream and a lossy-to-lossless delta in the stream, or pulling at least two streams, one for each "layer" of quality you want to step up to. So you might want a 4Kp60 stream, but the layers break down as such:

 

240p -> Lossless

480p -> Lossless

720p -> Lossless

1080p -> Lossless

1440p -> Lossless

1600p -> Lossless

2160p -> Lossless

So if you are on a mobile phone and can't appreciate the lossless data or a higher bitrate, you might just pull the 480p lossy video and be done with it. But if you want to watch it on a 4K TV at home, you might switch to the lossless stream, while the actual navigation uses the lossy stream to restart playback until the next lossless keyframe. So this might result in several seconds of lossy video while it waits for the next keyframe in the lossless upgrade. Now... notice I said "4K TV" and not "4K master." If you use Netflix you'll actually notice this exact effect when you navigate: it starts on the 240p stream and then takes a few seconds to switch back to 1080p or 2160p. It's also how it deals with people's crappy internet.

 

The lossless stream might have its keyframes 15 seconds apart, where the lossy stream might have them only 30 frames apart.
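A minimal sketch of that seek behaviour: play from the lossy stream's next keyframe immediately, then switch up to the lossless layer at its next keyframe. The keyframe spacings below (0.5 s for lossy, roughly 30 frames at 60 fps, and 15 s for lossless) are just the figures from this post; everything else is hypothetical:

```python
from bisect import bisect_left

def next_keyframe(keyframe_times: list[float], seek_t: float) -> float:
    """Return the first keyframe at or after the seek position."""
    i = bisect_left(keyframe_times, seek_t)
    return keyframe_times[i] if i < len(keyframe_times) else keyframe_times[-1]

def plan_seek(seek_t: float, lossy_keys: list[float], lossless_keys: list[float]):
    """Start playing the lossy stream at its next keyframe immediately,
    then switch to the lossless layer once its next keyframe arrives
    (the 'several seconds of lossy video' described above)."""
    start_lossy = next_keyframe(lossy_keys, seek_t)
    switch_to_lossless = next_keyframe(lossless_keys, seek_t)
    return start_lossy, switch_to_lossless

# Hypothetical spacing for a 2-minute clip: lossy keys every 0.5 s, lossless every 15 s.
lossy_keys = [i * 0.5 for i in range(241)]
lossless_keys = [i * 15.0 for i in range(9)]
print(plan_seek(37.2, lossy_keys, lossless_keys))  # -> (37.5, 45.0)
```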

 

The same effort required for lossless seeking is also required for mastering. If all your input layers are lossless (or computationally expensive), the CPU has to recompute every layer individually, and that means long scrub times. However, the computer also keeps a lossy, low-res video for seeking, which can be considered the "240p" mode in that list; once you find the place you want to seek to, it rebuilds the timeline from that point forward, just in case you decide to play at that moment. And if you toggle layers off, it has to rebuild the preview again.

 

So multiply that by the number of layers and you can quickly see how working in lossless video becomes ridiculous without a high-end system. If you have either a hardware encoder or a GPU that can assist, things get a lot faster, as the CPU (e.g. Quick Sync) or the GPU (NVENC) can be invoked to do some of these encoding processes.
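As an illustration of handing that preview encode off to a hardware encoder, here's a hedged sketch that shells out to ffmpeg with NVENC and can fall back to libx264. The filenames and bitrates are hypothetical, and whether NVENC is actually usable depends on the GPU and driver:

```python
import shutil
import subprocess

def encode_preview(src: str, dst: str, use_nvenc: bool = True) -> None:
    """Build a low-res lossy seeking/preview proxy (the "240p" layer above),
    offloading to NVENC when requested, otherwise using software libx264."""
    codec = "h264_nvenc" if use_nvenc else "libx264"
    cmd = [
        "ffmpeg", "-y", "-i", src,
        "-vf", "scale=-2:240",        # downscale to 240 lines, keep aspect ratio
        "-c:v", codec, "-b:v", "500k",
        "-c:a", "aac", "-b:a", "96k",
        dst,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    if shutil.which("ffmpeg") is not None:
        # Hypothetical filenames for a lossless master and its preview proxy.
        encode_preview("master_lossless.mov", "preview_240p.mp4", use_nvenc=True)
```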

 

So what could YouTube do? Well, YouTube actually has something like 20 streams, but only sends two to you, the audio and video as separate streams, when it uses DASH mode. It uses DASH mode for all 1080p and larger streams, no matter what the codec is, and both 30fps and 60fps versions of videos exist. It can, and does, try to use adaptive streams (the A in DASH), but in practice users just override it, and the resolution picked by the browser is the one used for the video unless you've set AV1 as a preference for youtube (which it's been using only for 480p and below videos.) If YouTube were really hellbent on saving bandwidth, it would "remember" the account you are logged into and your preferences on a per-browser basis, and thus never force you to change resolution. In practice it doesn't even remember AV1 settings when you're logged in. The menu where the 4Kp60 Auto button lives really should also offer a framerate and a decoder to use (e.g. software or hardware), so that the browser uses the most efficient codec, not the one YouTube would prefer.
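YouTube's actual stream selection logic isn't public, but the general DASH-style idea (pick the highest resolution the client supports, preferring the more efficient codec when several cover the same resolution) can be sketched like this. The itag/codec list is loosely modelled on the ones mentioned in this thread, not a real manifest:

```python
from dataclasses import dataclass

@dataclass
class Stream:
    itag: int
    codec: str    # "av01", "vp9", or "avc1"
    height: int

def pick_stream(available: list[Stream], supported: set[str],
                max_height: int) -> Stream:
    """Pick the highest resolution the client can handle, preferring the
    more efficient codec when several streams share that resolution."""
    preference = {"av01": 0, "vp9": 1, "avc1": 2}   # lower number = preferred
    candidates = [s for s in available
                  if s.codec in supported and s.height <= max_height]
    return max(candidates, key=lambda s: (s.height, -preference.get(s.codec, 3)))

# Hypothetical format list based on the itags discussed in this thread.
formats = [Stream(398, "av01", 720), Stream(248, "vp9", 1080),
           Stream(137, "avc1", 1080), Stream(315, "vp9", 2160)]
print(pick_stream(formats, {"vp9", "avc1"}, max_height=1080))
# -> Stream(itag=248, codec='vp9', height=1080)
```

A client that doesn't report AV1 support, or caps its resolution, simply never lands on those streams, which is the behaviour described in the following posts.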


On 7/11/2020 at 9:24 AM, Kisai said:

and the resolution picked by the browser is the one used for the video unless you've set AV1 as a preference for youtube (which it's been using only for 480p and below videos.)

Worth pointing out that this is outdated information.

AV1 is used for high resolution content on Youtube these days as well.

 

Here is a list of all the available formats for this video for example (I have highlighted all the AV1 videos):

 

 

So as we can see, Youtube serves AV1 videos at all resolutions up to 1080p, and anything above that is just VP9.

[screenshot: list of available formats for the video, AV1 entries highlighted]


4 hours ago, LAwLz said:

Worth pointing out that this is outdated information.

AV1 is used for high resolution content on Youtube these days as well.

 

Here is a list of all the available formats for this video for example (I have highlighted all the AV1 videos):

 

 

So as we can see, Youtube serves AV1 videos at all resolutions up to 1080p, and anything above that is just VP9.

[screenshot: list of available formats for the video, AV1 entries highlighted]

It does not pick a stream it cannot play. Stream-ripping tools can force whatever stream YouTube is willing to give them. Notice how AV1 isn't available for the 4K stream. If you were to view the video on an Android TV device, none of the AV1 streams would be picked unless AV1 decoding is in the SoC.

 

If I play that video in Safari on the iPhone, it picks avc1.42001E, mp4a.40.2 at 720p. If I use the YouTube app itself, 1080p is available, but the "stats for nerds" feature isn't present to check what it's pulling, though it's almost certainly picking AVC1.

 

On systems where YouTube doesn't believe the client can play AV1, it won't serve it. This of course relies on the browser actually telling the truth.

 

On my PC, it picks av01.0.05M (398) / opus (251) in Chrome and Firefox. In Edge (the non-Chromium version) it picks VP9 (248) / opus (251). If I open it in MSIE 11, it picks avc1.640028 (137) / mp4a.40.2 (140).

 

The 4K option isn't listed in MSIE to begin with.

 

So what is your point?

 

[screenshot: YouTube playback settings showing the "Prefer AV1 for SD" option]

Hover over that (?) next to "Prefer AV1 for SD"


2 hours ago, Kisai said:

So what is your point?

Probably that AV1 is used by default above 480p, which it is. Selecting 'Prefer AV1 for SD' is not the default; you do that if you are having playback performance issues because AV1 is being selected for high-resolution content.

 

Videos that do not have AV1 will be ones uploaded to and processed by YouTube before AV1 was supported on the platform.

 

So in regard to your quoted comment: no, AV1 is used by default above 480p as long as it is supported by the playing client.


On 7/6/2020 at 9:27 AM, Franck said:

Hey, procedural techniques have been producing 15-minute videos in 64 KB or less for decades. I wonder when someone will convert an actual video into a procedural function so we get super-high-quality videos in less than 1 megabyte. A 3-hour 16K 120fps HDR video in 600 KB... yes please.

By that time, there will be 64K video at 100 MB per 10 minutes.


1 hour ago, leadeater said:

Probably that AV1 is used by default above 480p, which it is. Selecting 'Prefer AV1 for SD' is not the default; you do that if you are having playback performance issues because AV1 is being selected for high-resolution content.

 

Videos that do not have AV1 will be ones uploaded to and processed by YouTube before AV1 was supported on the platform.

 

So in regard to your quoted comment: no, AV1 is used by default above 480p as long as it is supported by the playing client.

My point was that there is no "Never use AV1" option. YouTube is always pushing it unless your browser or app doesn't support it.

 

Because it pushes it, AV1 gets decoded in software on desktop and laptop devices, as there are no hardware decoders in most cases. So battery life is reduced, yadda yadda; we've already had this discussion in this thread.

 

https://developer.nvidia.com/video-encode-decode-gpu-support-matrix

[image: NVIDIA video encode/decode GPU support matrix]


1 hour ago, Kisai said:

Because it pushes it, AV1 gets decoded in software on desktop and laptop devices, as there are no hardware decoders in most cases. So battery life is reduced, yadda yadda; we've already had this discussion in this thread.

I doubt the practical battery life difference is enough to worry about, and I'd bet more people get a better user experience from a codec that requires less bandwidth. Most people, ISPs and service providers are more bandwidth-constrained than anything else, so using something that requires less bandwidth is really the best option for everyone.

 

My S10e plays Pokémon GO for a fairly decent length of time on its battery, and playing an AV1 video uses much less power than that does. Also, at least for me, video viewing on a mobile phone is a last resort when literally nothing else is available. For a laptop, if you logged real user usage across, say, 100 people where 50 had hardware decode and 50 did not, I seriously doubt you'd be able to attribute any difference in run time specifically to the hardware decode capability. Sometimes things just don't matter that much; it's the whole "it depends" issue.

