
YouTube Using Custom-Designed Video-Transcoding Chips

YouTube has designed and deployed a custom video transcoding chip to handle transcoding of uploaded content. This is being reported by the YouTube Blog and Ars Technica:

 

https://blog.youtube/inside-youtube/new-era-video-infrastructure

 

https://arstechnica.com/gadgets/2021/04/youtube-is-now-building-its-own-video-transcoding-chips/

 

Quote

Google has decided that YouTube demands such a huge transcoding workload that it needs to build its own server chips. The company detailed its new "Argos" chips in a YouTube blog post, a CNET interview, and in a paper for ASPLOS, the Architectural Support for Programming Languages and Operating Systems Conference. Just as there are GPUs for graphics workloads and Google's TPU (tensor processing unit) for AI workloads, the YouTube infrastructure team says it has created the "VCU" or "Video (trans)Coding Unit," which helps YouTube transcode a single video into over a dozen versions that it needs to provide a smooth, bandwidth-efficient, profitable video site.

The new video transcoding chip is called "Argos", and is mounted on a full-length PCIe card, with two chips per card. A giant heat sink covers the entire card, and there appears to be an external power connector as well. The cards very much resemble video cards, and this is apparently no accident; Google said it chose the video-card-like format because it fits its existing accelerator trays.

 

Google claims the Argos chip provides "up to 20-33x improvements in compute efficiency compared to our previous optimized system, which was running software on traditional servers."

 

The Google-supplied chip diagram reveals more details about the design. It shows 10 "encoder cores" on each chip, with Google's white paper adding that "all other elements are off-the-shelf IP blocks." The white paper further notes that each encoder core can encode 2160p in real time at up to 60 FPS (frames per second) using three reference frames.
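Taking those figures at face value, a quick back-of-the-envelope script gives a feel for the aggregate pixel throughput of one chip (my arithmetic, not Google's):

```python
# Back-of-the-envelope throughput for one Argos chip, based on the
# white paper's figures: 10 encoder cores per chip, each able to
# encode 2160p (3840x2160) in real time at up to 60 FPS.
CORES_PER_CHIP = 10
WIDTH, HEIGHT, FPS = 3840, 2160, 60

# Pixels each core must process per second to sustain 2160p60.
pixels_per_second_per_core = WIDTH * HEIGHT * FPS

# Aggregate for the whole chip, all cores busy.
chip_total = pixels_per_second_per_core * CORES_PER_CHIP

print(f"Per core: {pixels_per_second_per_core / 1e6:.0f} Mpixel/s")
print(f"Per chip: {chip_total / 1e9:.2f} Gpixel/s")
```

That works out to roughly half a gigapixel per second per core, assuming the cores scale linearly and ignoring reference-frame overhead.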

 

Google reportedly already has thousands of these chips in operation. Thanks to them, 4K content becomes watchable on YouTube within hours of upload, instead of the days it previously took on systems based on Intel Skylake CPUs and Nvidia T4 Tensor Core GPUs. Even after factoring in development and manufacturing costs, the chips apparently save Google a great deal of money.

 

Reading over what Google says about the chip, I can definitely see why it elected to develop its own video transcoding silicon specifically for this workload; it appears to be a far more efficient and cost-effective way to handle it than off-the-shelf options.


I can hear Linus's heavy breathing already.

PC - NZXT H510 Elite, Ryzen 5600, 16GB DDR3200 2x8GB, EVGA 3070 FTW3 Ultra, Asus VG278HQ 165hz,

 

Mac - 1.4ghz i5, 4GB DDR3 1600mhz, Intel HD 5000.  x2

 

Endlessly wishing for a BBQ in space.


Makes sense. ASICs will almost always blow general-purpose silicon out of the water when it comes to very specialized tasks. I am surprised they didn't do it sooner, considering that Google has the resources and personnel in house to do so.

CPU: Intel i7 - 5820k @ 4.5GHz, Cooler: Corsair H80i, Motherboard: MSI X99S Gaming 7, RAM: Corsair Vengeance LPX 32GB DDR4 2666MHz CL16,

GPU: ASUS GTX 980 Strix, Case: Corsair 900D, PSU: Corsair AX860i 860W, Keyboard: Logitech G19, Mouse: Corsair M95, Storage: Intel 730 Series 480GB SSD, WD 1.5TB Black

Display: BenQ XL2730Z 2560x1440 144Hz


What I'm interested in knowing is whether or not Google will sell these cards to other companies or cloud services. I could definitely see a use case for these for Instagram, Snapchat, Netflix, and others.

Arch is better than Ubuntu. Fight me peko.

11 minutes ago, JLO64 said:

What I'm interested in knowing is whether or not Google will sell these cards to other companies or cloud services. I could definitely see a use case for these for Instagram, Snapchat, Netflix, Floatplane, and others.

Fixed...

My eyes see the past…

My camera lens sees the present…

14 hours ago, JLO64 said:

What I'm interested in knowing is whether or not Google will sell these cards to other companies or cloud services. I could definitely see a use case for these for Instagram, Snapchat, Netflix, and others.

They likely don't need to. Transcoding is very off-the-shelf IP logic.

 

Like I imagine it works like this:

 

An H.264/H.265 hardware decoder, plus H.264 (for mobile) and VP9 encoders at 8 different resolutions. So you just put enough memory on it to handle 15 decoded frames and 5 encoded frames per resolution. Back-of-the-envelope:

 

8K at 48 bpp: 7680 × 4320 × 48 bits ≈ 200 MB per frame, so you need about 3 GB of memory on the input side.

Then on the output side, you need 1 GB (200 MB × 5) for 8K UHD HDR, 500 MB for 8K non-HDR, 500 MB for 4K UHD, 250 MB for 4K non-UHD, 64 MB for 1080p UHD, 32 MB for 1080p, 5 MB for 720p, and 1.2 MB for 360p. Add all that up: about 5 GB of memory. Plus you can recycle the same frame buffer to do H.264 and VP9 at the same time.
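That arithmetic is easy to script if you want to play with the assumptions (the 48 bpp figure and the 15-frame input buffer are my guesses from above, not anything Google has published):

```python
# Sanity-check of the buffer math above. Assumes 48 bpp decoded
# frames and a 15-frame input-side buffer; both are assumptions.
def frame_bytes(width, height, bits_per_pixel):
    """Uncompressed frame size in bytes."""
    return width * height * bits_per_pixel // 8

per_frame = frame_bytes(7680, 4320, 48)          # 8K at 48 bpp
input_side = 15 * per_frame                      # 15 decoded frames in flight

print(f"{per_frame / 1e6:.0f} MB per frame")     # ~199 MB
print(f"{input_side / 1e9:.1f} GB input buffer") # ~3.0 GB
```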

 

Note YouTube's suggested encoding settings specify:

Quote
  • 2 consecutive B frames
  • Closed GOP. GOP of half the frame rate.

 

So if you're encoding 60 fps or 120 fps HDR, the GOP is going to be 30 or 60 frames. You don't need enough memory for the full 60, 90, 120, 144, etc. frames, because each GOP begins with an I frame, has two consecutive B frames, and the rest are P frames, and you can only seek to I-frames in a video. So typically you need at most just enough frames between the I frame and the B frame to do any kind of compression without seeking. A B frame, however, requires both forward and backward data, so you can't discard it until you've put all the data into it. Under the assumption of one B frame per 15 frames, this is probably the reasonable minimum. Again, back-of-the-envelope. There will also be other kinds of videos encoded that are not H.264/H.265.
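Those GOP rules can be sketched in a few lines; the I/B/P layout below is just an illustration of the settings quoted above, not YouTube's actual encoder behavior:

```python
# Sketch of the GOP layout implied by YouTube's suggested settings:
# closed GOP of half the frame rate, starting with an I frame, with
# 2 consecutive B frames between references and P frames otherwise.
def gop_pattern(fps):
    gop_len = fps // 2                  # GOP length = half the frame rate
    frames = ["I"]
    while len(frames) < gop_len:
        frames += ["B", "B", "P"]       # 2 consecutive B frames, then a P
    return frames[:gop_len]

for fps in (60, 120):
    pattern = gop_pattern(fps)
    print(f"{fps} fps -> GOP of {len(pattern)}: {''.join(pattern[:10])}...")
```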

https://support.google.com/youtube/troubleshooter/2888402?hl=en

Quote
  • .MOV
  • .MPEG4
  • .MP4
  • .AVI
  • .WMV
  • .MPEGPS
  • .FLV
  • 3GPP
  • WebM
  • DNxHR
  • ProRes
  • CineForm
  • HEVC (h265)

So the VAST majority of these are software codecs found in FFmpeg, and if you rip a stream from YouTube, you'll usually see exactly which version of FFmpeg they forked from. I know from experience that I can upload ZMBV video from DOSBox (or other emulators) to YouTube and it will work, but it will not seek very well, because ZMBV is more of an archival format than a streamable one. It's actually kind of interesting: the videos will PLAY on YouTube but will not seek, which tells me they do a lot of playback on the fly, right off the source video. In the future they might build video files with variable bitrate and resolution so they can dynamically "build up" from the lowest resolution to whatever resolution the stream is requested at. That would also explain why some videos don't expose both 30 and 60 fps streams when the source is 60 fps, since that leads to timing issues if you try to synchronize the audio to streams running at different rates (audio and video are sent as separate streams).

 

That said, YouTube botches RGB source videos in colorspace conversion. This is a known issue for people who upload animation (particularly animation converted from Adobe Flash/Animate): they have to add noise to the video, otherwise the encoder will turn all the gradients into thick color bands and the color will be horrific.


I find it amusing that the VCU is described as having a GPU-like form factor, when I see GPUs as just one example of an add-in expansion card. It might be showing my age, but at some point in my life, before they got integrated into the motherboard, systems had an I/O card (IDE, serial, parallel), a video card (before 3D was a thing), a sound card, and maybe even a network card if you were sufficiently advanced. A VCU card continues that history. There isn't really any other alternative, is there?

 

37 minutes ago, Kisai said:

That said, YouTube botches RGB source videos in colorspace conversion. This is a known issue for people who upload animation (particularly animation converted from Adobe Flash/Animate): they have to add noise to the video, otherwise the encoder will turn all the gradients into thick color bands and the color will be horrific.

To say they botch the conversion implies that if you supply the same video in a format other than RGB, so it doesn't have to do the conversion, it is fine? Basically I'm asking to confirm it is the colourspace conversion that is at fault and not more general codec properties. It has long been an annoyance of mine that most codecs for stills or video are tuned around photorealistic content. Other content does OK at higher bitrates, but when squeezed, cracks start to show. I don't think this is going to go away any time soon.

TV Gaming system: Asus B560M-A, i7-11700k, Scythe Fuma 2, Corsair Vengeance Pro RGB 3200@21334 4x16GB, MSI 3070 Gaming Trio X, EVGA Supernova G2L 850W, Anidees Ai Crystal, Samsung 980 Pro 2TB, LG OLED55B9PLA 4k120 G-Sync Compatible
Streaming system: Asus X299 TUF mark 2, i9-7920X, Noctua D15, Corsair Vengeance LPX RGB 3000 8x8GB, Gigabyte 2070, Corsair HX1000i, GameMax Abyss, Samsung 970 Evo 500GB, Crucial BX500 1TB, BenQ XL2411 1080p144 + HP LP2475w 1200p60
Desktop Gaming system (to be retired): Asrock Z370 Pro4, i7-8086k, Noctua D15, G.Skill Ripjaws V 3200 2x8GB, Asus Strix 1080 Ti, NZXT E850 PSU, Cooler Master MasterBox 5, Optane 900p 280GB, Crucial MX200 1TB, Sandisk 960GB, Acer Predator XB241YU 1440p144 G-sync

Former Main system (to be retired): Asus Maximus VIII Hero, i7-6700k, Noctua D14, G.Skill Ripjaws 4 3333@2133 4x4GB, GTX 1650, Corsair HX750i, In Win 303 NVIDIA, Samsung SM951 512GB, WD Blue 1TB, Acer RT280k 4k60 FreeSync [link]
Gaming laptop: Lenovo Legion, 5800H, DDR4 3200C22 2x8GB, RTX 3070, 512 GB SSD, 165 Hz IPS panel


 

43 minutes ago, porina said:

To say they botch the conversion implies that if you supply the same video in a format other than RGB, so it doesn't have to do the conversion, it is fine? Basically I'm asking to confirm it is the colourspace conversion that is at fault and not more general codec properties. It has long been an annoyance of mine that most codecs for stills or video are tuned around photorealistic content. Other content does OK at higher bitrates, but when squeezed, cracks start to show. I don't think this is going to go away any time soon.

 

The thing is, you end up having to "dirty" the animation and pre-encode it to YUV 4:2:0 to stop YouTube compressing it into a mess. I've done a lot of tests throwing things at YouTube to see what it does, and repeatedly, even on 8-bit palette visuals, it turns what should be gradients into a mess, particularly along edges. So my strategy is to always upscale the content to 2x or 3x the original resolution.

 

So if you solve the colorspace problem in advance, the output won't look like Vaseline has been smeared all over the screen. This problem is also seen when people stream games from consoles: consoles HAVE an RGB output IF they are not connected to a TV, but when connected to a TV they are likely outputting limited range (16-235) rather than full range (0-255).

 

At any rate, with animation the input is basically too clean, and the compression works too hard converting it from RGB24 to YUV 4:2:0.
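A toy example of where that loss comes from: 4:2:0 keeps one chroma sample per 2x2 pixel block, so a hard color edge gets its chroma averaged away. The BT.601 full-range constants below are just for illustration; YouTube's actual pipeline may well differ.

```python
# Illustration of chroma loss in YUV 4:2:0: one U and one V sample
# are stored per 2x2 pixel block, so hard color edges get averaged.
# BT.601 full-range conversion constants, for illustration only.
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128
    return y, u, v

# A 2x2 block straddling a hard red/blue vertical edge.
block = [(255, 0, 0), (0, 0, 255),
         (255, 0, 0), (0, 0, 255)]

us = [rgb_to_yuv(*px)[1] for px in block]
vs = [rgb_to_yuv(*px)[2] for px in block]

# 4:2:0 subsampling stores a single averaged U and V for the block,
# so the red and blue pixels decode with identical, blended chroma.
u_stored = sum(us) / 4
v_stored = sum(vs) / 4
print(f"stored U/V for the block: {u_stored:.1f} / {v_stored:.1f}")
```

Real encoders filter the chroma planes rather than naively averaging, but the resolution loss at clean edges is the same idea, and it's part of why noisy or upscaled input survives better.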

 


iS tHiS tHe eNd oF iNtel?

 

Seriously: it's just logical. Why would you buy hardware that isn't specialized for your use case? This is something we will see more and more. With (almost) free architectures like ARM and RISC-V, you have a platform with huge existing software support ready to be implemented. (I know designing custom SoCs is more difficult than that, but it has definitely become easier.) Independent fab companies like Samsung, TSMC and GlobalFoundries make it easy to produce chips on high-performance nodes.

Hi

 


But can it run Crysis?

Is it plugged in? Is it turned on? Are you sure? No, really. 

List of my systems:

Main PC:


HP z620 workstation, 2x Xeon e5-2690 v0, GTX 980 FE, 8x4gb DDR3-1333, 2x 2tb HDD, 1tb boot SATA SSD, Windows 10 Pro

Linux system:


ProDesk 600 g1 SFF, i5-4590, Quadro k620, 2x4gb DDR3-1333, 64gb boot SSD, 250gb storage HDD from the ewaste pile with many bad sectors, Ubuntu 20.04 LTS

XP system:


Compaq DC7600 SFF, Pentium 4 HT 631, MSI GT 210 passive, 2x512m DDR2-667, 80gb Spinmaster HDD

Testbench:


Optiplex 760 board, GameMax VP-600 PSU, 4x 64gb AliExpress SSD (Windows XP, Vista, 7, 10), 1tb HDD (storage)

(currently has a Core 2 Quad Q9550 and 4x2gb DDR2-800 but I change its config frequently)

 


I read the blog post, and I'm rather underwhelmed. Firstly, there is no way they were doing transcoding purely in "software" previously. They'd have been using accelerators inside standard servers, so this likely just means they've moved to custom server packages. This is much more a result of increases in interconnect bandwidth and faster storage.

 

The other thing is: why wasn't this done by 2015 rather than started in 2015? It's not like ASICs are new.

1 hour ago, wat3rmelon_man2 said:

But can it run Crysis?

Well, it can decode Crysis intro videos with zero CPU load 😄

AMD Ryzen 7 5800X | ASUS Strix X570-E | G.Skill 32GB 3733MHz CL16 | PALIT RTX 3080 10GB GamingPro | Samsung 850 Pro 2TB | Seagate Barracuda 8TB | Sound Blaster AE-9 MUSES Edition | Altec Lansing MX5021 Nichicon/MUSES Edition

1 hour ago, Taf the Ghost said:

I read the blog post, and I'm rather underwhelmed. Firstly, there is no way they were doing transcoding purely in "software" previously. They'd have been using accelerators inside standard servers, so this likely just means they've moved to custom server packages. This is much more a result of increases in interconnect bandwidth and faster storage.

 

The other thing is: why wasn't this done by 2015 rather than started in 2015? It's not like ASICs are new.

AV1 is why without a doubt

1 minute ago, Kisai said:

AV1 is why without a doubt

VP9 isn't exactly cheap computationally, but my real assumption is that AV1 accelerators were going to be expensive, and they finally finished a massive migration to in-house designs. I'd also take a good guess that some group got pulled off the project in the middle, because it shouldn't have taken 6 years. Unless there was some long-term IP tie-up issue they've just skipped mentioning.


Do these chips' motherboards have RGB headers?

On 4/23/2021 at 9:44 AM, Tieox said:

I can hear Linus's heavy breathing already.

He's going to push it to its limits now. Brace yourselves.

Main PC: the literature club machine

Intel I5 9600k @ 4.2 Ghz | MSI z390-a pro | G.Skill Trident Z RGB 32 GB 3000Mhz | Samsung 970 Evo 500 GB | Thermaltake Floe Riing RGB 240 | Asus GeForce GTX 1660 Ti 6 GB DUAL OC | Thermaltake Core P3 TG Snow Edition

 

Daily drivers

OPPO A52 | APPLE watch series 3 | Macbook air 2017 | Royal kludge RK61 | Razer Viper | Beryodynamic DT 990 PRO | Focusrite Scarlett solo gen 2

On 4/23/2021 at 9:44 PM, Kisai said:

AV1 is why without a doubt

I agree. They're one of the AV1 contributors, and if they plan to use it seriously, they need to offer everything in AV1, which would normally require multiplying the infrastructure they use for other codecs.

Seeing AV1 decoders in new hardware, and YouTube making this step, it seems AV1 is very close to the mainstream now.

On 4/23/2021 at 7:50 PM, Taf the Ghost said:

I read the blog post, and I'm rather underwhelmed. Firstly, there is no way they were doing transcoding purely in "software" previously. They'd have been using accelerators inside standard servers, so this likely just means they've moved to custom server packages. This is much more a result of increases in interconnect bandwidth and faster storage.

 

The other thing is: why wasn't this done by 2015 rather than started in 2015? It's not like ASICs are new.

They were in fact transcoding in software before, at least for some codecs from what I can tell.

I have some older YouTube videos downloaded, and they were encoded with x264 in software for the AVC version. Newer videos seem to be encoded with something else, including the AVC versions.

11 hours ago, LAwLz said:

They were in fact transcoding in software before, at least for some codecs from what I can tell.

I have some older YouTube videos downloaded, and they were encoded with x264 in software for the AVC version. Newer videos seem to be encoded with something else, including the AVC versions.

This might actually point to a different issue, which was really their problem: they had different work paths for every codec. YT also goes back and re-encodes older videos to save space, which might be where most of the software encoder stack was being hit.


But can they fit in my pocket without taking all my money?
