I think Linus would be better off using FFmpeg for his 6 cam Badminton streaming / recording

ninbura · December 17, 2020

Typically there is a topic posted for every video in the “LTT Official” portion of the forum, but for this video specifically there was never a post / discussion thread created… Not sure if this was by design because they didn’t want everyone and their mom telling them how they think they could do it better, or if they just forgot to make a thread for it. Assuming it’s the latter, I’m really not sure where else to put this.

Love the video, never seen anyone try to capture more than 3 4k sources on one PC. I understand that you already have a working solution and probably don’t want to change it, but here’s a few things I thought of while watching the video:

You can bypass Nvidia’s artificial encode limit via this patch, not sure if that’s against your guys’ partnership with them or something but it’s super easy to apply and works perfectly.
You can avoid high 3D GPU usage by using FFmpeg to stream and record instead of OBS. Additionally, you would only need that first Quadro with this method as the GP100 is perfectly capable of encoding 6 streams of 1080p (along with basically any other Nvidia card released in the last 5 years assuming you're using the patch), which seemed to be your final output resolution despite the 4K cameras.

Using my patched GTX 1080 I am capable of encoding up to 4 streams of 4K60 using FFmpeg, and the GP100 has 3 NVENC encoding chips while the GTX 1080 has 2 (same architecture). Thus, in theory the GP100 could do 6 encodes of 4K60 by itself, which would be cool to see. Though at that point your limitation is more likely to be your CPU, or even software which starts to hiccup as multiple instances of recordings inadvertently hit the same threads and such.
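
To spell out that estimate, here is a rough sketch in Python. It assumes, as the paragraph above does, that every NVENC chip of the same architecture carries the same load, and it uses the chip counts and the 4-stream figure quoted in this post. These are anecdotal numbers, not measured specs, and the function name is just for illustration.

```python
# Rough NVENC capacity estimate from the figures quoted in the post.
# Assumption: every chip of the same architecture handles the same load.

STREAMS_4K60_ON_GTX_1080 = 4  # observed by the poster on a patched GTX 1080
CHIPS = {"GTX 1080": 2, "GP100": 3}  # chip counts as stated in the post


def estimated_4k60_streams(card: str) -> int:
    """Scale the GTX 1080 observation by the card's chip count."""
    per_chip = STREAMS_4K60_ON_GTX_1080 / CHIPS["GTX 1080"]
    return int(per_chip * CHIPS[card])


for card in CHIPS:
    print(card, estimated_4k60_streams(card))
```

This only covers the encoder itself; it says nothing about the CPU or capture-side limits mentioned above.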

I would suggest giving FFmpeg a try with a command / parameters like this:

-thread_queue_size 9999
-indexmem 9999 
-f dshow 
-rtbufsize 2147.48M 
-video_size 1920x1080 
-framerate 60
-i video="[video device]":audio="[audio device]"
-map 0 
-c:v h264_nvenc 
-preset hp
-r 60 
-rc-lookahead 120 
-pix_fmt nv12 
-b:v 6M
-minrate 6M
-maxrate 6M 
-bufsize 6M 
-c:a aac 
-ar 44100 
-b:a 320k 
-vsync 1 
-max_muxing_queue_size 9999 
-f mpegts 
C:\Users\[user]\Videos\FFmpeg\Output.ts
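
If you would rather script this than type it out, here is a minimal sketch that assembles the parameters above into an argument list in Python. The function name and the device names ("Camera 1", "Microphone 1") are placeholders I made up, not real devices; swap in whatever your capture cards report. Passing the arguments as a list also means the device names need no extra quoting.

```python
def capture_args(video_device, audio_device, output_path, bitrate="6M"):
    """Build the ffmpeg argument list for one 1080p60 DirectShow capture."""
    return [
        "ffmpeg",
        "-thread_queue_size", "9999",
        "-indexmem", "9999",
        "-f", "dshow",
        "-rtbufsize", "2147.48M",
        "-video_size", "1920x1080",
        "-framerate", "60",
        "-i", f"video={video_device}:audio={audio_device}",
        "-map", "0",
        "-c:v", "h264_nvenc",
        "-preset", "hp",
        "-r", "60",
        "-rc-lookahead", "120",
        "-pix_fmt", "nv12",
        # Constant bitrate: target, floor, ceiling and buffer all match.
        "-b:v", bitrate,
        "-minrate", bitrate,
        "-maxrate", bitrate,
        "-bufsize", bitrate,
        "-c:a", "aac",
        "-ar", "44100",
        "-b:a", "320k",
        "-vsync", "1",
        "-max_muxing_queue_size", "9999",
        "-f", "mpegts",
        output_path,
    ]


# Placeholder device names and output file, for illustration only.
args = capture_args("Camera 1", "Microphone 1", "Output.ts")
print(" ".join(args))
```

The list can be handed straight to `subprocess.run(args)` once the placeholders are replaced.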

You can also use the tee pseudo-muxer in FFmpeg to send the same encode to two places, such as a stream to Twitch and a local file. What makes this particularly convenient is that you don’t have to do the same encode work twice like you have to in OBS:

-f tee 
"[f=mpegts]C:\Users\[user]\Videos\FFmpeg\Output.ts|[f=mpegts]udp://10.0.1.255:1234/"

Lastly, if desired, FFmpeg also has a segment muxer, which would allow you to record 24/7 without overfilling your hard drives. Once the recording reaches the maximum number of segments you specified, it wraps around and automatically overwrites the first one:

-f segment 
-segment_time 1800 
-segment_wrap 48 
-reset_timestamps 1
-segment_format_options max_delay=0 
C:\Users\[user]\Videos\FFmpeg\Output%02d.ts
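
To put numbers on those segment settings, here is a quick back-of-the-envelope calculation in Python. It assumes the 6M video and 320k audio bitrates from the command earlier in this post and ignores MPEG-TS container overhead, so treat the result as a lower bound rather than an exact figure.

```python
# Retention window and approximate disk footprint for the segment settings.
# Assumptions: 6 Mbit/s video + 320 kbit/s audio, container overhead ignored.

VIDEO_BPS = 6_000_000
AUDIO_BPS = 320_000
SEGMENT_SECONDS = 1800  # -segment_time
SEGMENT_WRAP = 48       # -segment_wrap


def retained_seconds() -> int:
    """Total footage kept on disk before the oldest segment is overwritten."""
    return SEGMENT_SECONDS * SEGMENT_WRAP


def approx_bytes(streams: int = 1) -> int:
    """Approximate bytes on disk once every segment slot has been filled."""
    return (VIDEO_BPS + AUDIO_BPS) * retained_seconds() // 8 * streams


print(retained_seconds() / 3600, "hours retained")
print(approx_bytes(1) / 1e9, "GB for one camera")
print(approx_bytes(6) / 1e9, "GB for six cameras")
```

So 48 half-hour segments is a rolling 24 hours, at very roughly 68 GB per camera with these bitrates.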

You can even combine the tee and segment logic to stream / record 24/7 without ever overfilling the drives:

-f tee 
"[f=segment:segment_time=1800:segment_wrap=48:reset_timestamps=1:segment_format_options=max_delay=0]C:\Users\[user]\Videos\Output%02d.ts|[f=mpegts]udp://10.0.1.255:1234/"

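Both tee examples above follow the same pattern: each output is `[option=value:option=value]target`, and the outputs are joined with `|`. A small sketch of building those strings in Python, so they are easier to tweak per camera. The helper names and the shortened file names are my own, and this does not handle escaping of special characters inside option values, so it is only good for simple cases like the ones shown here.

```python
def tee_target(target: str, **options) -> str:
    """Format one tee output as [k=v:k=v]target (no escaping handled)."""
    if not options:
        return target
    opts = ":".join(f"{key}={value}" for key, value in options.items())
    return f"[{opts}]{target}"


def tee_output(*targets: str) -> str:
    """Join several tee outputs with the | separator."""
    return "|".join(targets)


# Local file plus UDP, as in the first tee example.
simple = tee_output(
    tee_target("Output.ts", f="mpegts"),
    tee_target("udp://10.0.1.255:1234/", f="mpegts"),
)

# Wrapping segments plus UDP, as in the combined example.
rolling = tee_output(
    tee_target(
        "Output%02d.ts",
        f="segment",
        segment_time=1800,
        segment_wrap=48,
        reset_timestamps=1,
        segment_format_options="max_delay=0",
    ),
    tee_target("udp://10.0.1.255:1234/", f="mpegts"),
)

print(simple)
print(rolling)
```

The resulting string goes after `-f tee` as the final argument.
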
Since FFmpeg is run in the console, you can call these 6 encodes programmatically in something like PowerShell using Start-Process, allowing you to easily launch everything from one place. I’ve attached a zip containing the file structure for doing something like this.

Linus Example.zip
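
For anyone who can't open the attachment, the same idea can be sketched in Python instead of PowerShell. This is not the contents of the zip, just a minimal illustration: the device names "Camera 1" to "Camera 6" and the output file names are made up, and the ffmpeg arguments are cut down to the bare minimum to keep it short.

```python
import subprocess

# Hypothetical DirectShow device names; replace with your real ones.
CAMERAS = [f"Camera {n}" for n in range(1, 7)]


def build_command(index: int, device: str) -> list:
    """Minimal per-camera ffmpeg command (trimmed for illustration)."""
    return [
        "ffmpeg",
        "-f", "dshow",
        "-i", f"video={device}",
        "-c:v", "h264_nvenc",
        "-f", "mpegts",
        f"Output{index}.ts",
    ]


def launch_all(commands):
    """Start every encode as its own process, like Start-Process would."""
    return [subprocess.Popen(cmd) for cmd in commands]


commands = [build_command(i, dev) for i, dev in enumerate(CAMERAS, start=1)]

# Dry run: print the commands. Call launch_all(commands) to start them.
for cmd in commands:
    print(" ".join(cmd))
```

Each camera gets its own ffmpeg process, so one capture falling over does not take the other five down with it.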

igormp · December 17, 2020

18 minutes ago, Ninbura said:

Love the video, never seen anyone try to capture more than 3 4k sources on one PC

I've done over 20

(to be fair, it wasn't a single computer, but a k8s cluster with some V100s)

A nice thing to add is that ffmpeg can push rtmp streams directly, so they can be both pushed to twitch while also being recorded to disk.

Another thing that I commented on the video (since there's no dedicated thread here) was the following:

Quote

The nvenc encoder found on nvidia GPUs is the same for every card of the same generation, meaning that a GP100 has the same encoder as a GTX 1050, with the only difference being the driver limit for consumer cards. So a couple of 1650 Supers (not the regular ones) would deliver better performance/quality while also being able to handle up to 6 streams (the GTX limit is 3 streams per NVENC chip)

Of course there's also the difference in the amount of actual NVENC chips inside each GPU, but a GPU like the 1070 already has 2 of those (and coupled with patched drivers it's a better solution than using a Quadro IMO).

From my experience, a single 4k30 h264 stream (with the command shown below) uses like 40~50% of the encoder, so 2 4k30 streams per nvenc module on turing should be doable.

ffmpeg -loglevel debug -threads:v 2 -threads:a 8 -filter_threads 2 -thread_queue_size 5M \
-f x11grab -s 3840x2160 -framerate 30 -i :0.0 -thread_queue_size 5M -f alsa -ac 2 \
-i hw:0,0 -bsf:a aac_adtstoasc -c:a aac -ac 2 -b:a 128k \
-b:v 20M -minrate:v 20M -maxrate:v 20M -bufsize:v 20M -c:v h264_nvenc \
-qp:v 19  -profile:v high -rc:v cbr_ld_hq -r:v 60 -g:v 120 -bf:v 3 -refs:v 16 -f flv /dev/null

ninbura · December 17, 2020

50 minutes ago, igormp said:

Of course there's also the difference in the amount of actual NVENC chips inside each GPU, but a GPU like the 1070 already has 2 of those (and coupled with patched drivers it's a better solution than using a Quadro IMO).

The number of chips makes a big difference though, not saying you weren't accounting for that but just pointing it out. I remember switching from my GTX 1080 to a GTX 1050 because I kept hearing everyone say "there's no difference", only to find that my typical FFmpeg command was overloading the encoder in the GTX 1050 when my GTX 1080 had 50% headroom. Essentially each chip of the same architecture can pull the same weight, so a GTX 1080 has 100% more bandwidth than the GTX 1050 and the GP100 has 200% more bandwidth than the GTX 1050. Additionally, while the Nvidia encode matrix states that the 1070 has 2 NVENC chips, only one of them is activated, and the patch wasn't able to activate the second one (at least in my experience on both counts).

50 minutes ago, igormp said:

From my experience, a single 4k30 h264 stream (with the command shown below) uses like 40~50% of the encoder, so 2 4k30 streams per nvenc module on turing should be doable.

Not sure why this is the case, but the encoder doesn't care what framerate you're doing the encode at, just the resolution. When I say "it doesn't care" I mean within reason, or within the supported spec. What I mean to say is that a 4K30 encode uses the same bandwidth as a 4K60 encode, at least as far as the NVENC encoder is concerned; obviously a higher framerate encode will further stress the system elsewhere. The Turing GPUs do have the added benefit of a higher quality output, but they also suffer from a major loss in bandwidth when compared to a GTX 1080 (half as much) and especially when compared to the GP100 (1/3 the bandwidth). Though, this probably wouldn't matter in Linus's case as I believe the end result was 6 1080p encodes, which the Turing architecture could handle. Unless he kept using OBS, in which case he would be doing 12 1080p encodes, as OBS has to do a separate encode for the recording and the stream.

Bottom line, if he switched to FFmpeg he could get away with using a single 1650 Super (which uses 7th gen NVENC) to handle all 6 1080p encodes, with the added benefit of the increased quality which the Turing NVENC encoder offers.

Edit: Just want to clarify that the Turing NVENC encoding chips don't suffer a loss in bandwidth when compared 1:1 to a Pascal NVENC encoding chip, but rather there are just no GPUs with more than 1 Turing NVENC chip like there are GPUs with multiple Pascal encoding chips (i.e. GTX 1080, GP100).
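
A quick sanity check on the single-card claim above, as a sketch. It assumes encoder load scales with pixels per frame, which is only what the testing described in this post suggests and not a documented NVENC property. The function name is just for illustration.

```python
# Express a number of 1080p streams as "4K-stream equivalents".
# Assumption: encoder load is roughly proportional to pixels per frame.

PIXELS_1080P = 1920 * 1080
PIXELS_4K = 3840 * 2160


def equivalent_4k_streams(streams_1080p: int) -> float:
    """How many 4K streams carry the same pixel count per frame."""
    return streams_1080p * PIXELS_1080P / PIXELS_4K


print(equivalent_4k_streams(6))   # FFmpeg case: one encode per camera
print(equivalent_4k_streams(12))  # OBS case: separate stream + record encodes
```

Six 1080p encodes work out to 1.5 4K-equivalents, while the 12 encodes of the OBS case come to 3.0. Going by the roughly 2 4K streams per chip reported earlier in this thread, the first fits on a single chip and the second would not.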

igormp · December 17, 2020

10 hours ago, Ninbura said:

and the patch wasn't able to activate the second one, once again at least in my experience.

Ouch, didn't know about that. To be fair, my experience has been with either a single consumer GPU (mine) or a handful of teslas in the cloud.

10 hours ago, Ninbura said:

Not sure why this is the case, but the encoder doesn't care what framerate you're doing the encode at, just the resolution.

I just mentioned it because it's what I tested. I believe what makes an actual difference is the bitrate used (since that implies more bandwidth required).

WillOfTheLand · December 17, 2020

While we're on the topic of the badminton video, for the streams he should just use youtube, it allows you to do multiple switchable cams on the output so the user (viewer) can just have 1 player/page up and switch between all the cams. Even works in viewing the vods after.

ninbura · December 17, 2020

3 hours ago, igormp said:

I just mentioned it because it's what I tested. I believe what makes an actual difference is the bitrate used (since that implies more bandwidth required).

I promise my goal isn't to contradict haha, but at least in my testing the bitrate also has no effect on the encoder usage; it's almost entirely just the resolution. I've also found that how "busy" the feed being encoded is affects encoder usage. For example if I'm just recording a PS5 Pro on the home screen, my computer sitting on my desktop, and my camera while it's off, all at 4K, the encoder sits around 55%. But when I open Demon's Souls on the PS5 (spinning the camera), Rocket League on the PC (driving in circles), and turn my camera on, my encoder sits around 65%.

I'm not an expert on what's happening at the lowest level, but it seems to me the encoder is rendering the highest quality feed based on the resolution and then you pull a much lower quality stream from it depending on your other parameters. I mean I guess I do see some variance in terms of encoder usage as I use different presets, like hp (high performance) & hq (high quality), but for the most part it's a wash.

Definitely what seems to pull 90% of the weight when it comes to NVENC encoder usage is the target resolution, would love for some huge brain expert to explain exactly what's happening there.

igormp · December 17, 2020

51 minutes ago, Ninbura said:

I promise my goal isn't to contradict haha, but at least in my testing the bitrate also has no effect on the encoder usage; it's almost entirely just the resolution. I've also found that how "busy" the feed being encoded is affects encoder usage. For example if I'm just recording a PS5 Pro on the home screen, my computer sitting on my desktop, and my camera while it's off, all at 4K, the encoder sits around 55%. But when I open Demon's Souls on the PS5 (spinning the camera), Rocket League on the PC (driving in circles), and turn my camera on, my encoder sits around 65%.

I'm not an expert on what's happening at the lowest level, but it seems to me the encoder is rendering the highest quality feed based on the resolution and then you pull a much lower quality stream from it depending on your other parameters. I mean I guess I do see some variance in terms of encoder usage as I use different presets, like hp (high performance) & hq (high quality), but for the most part it's a wash.

Definitely what seems to pull 90% of the weight when it comes to NVENC encoder usage is the target resolution, would love for some huge brain expert to explain exactly what's happening there.

If you keep an eye on the terminal running ffmpeg, it'll output the current bitrate being used. A static or low motion screen wouldn't make much use of it, but try to open something with heavy movements (like a really fast-paced game) and you'll see both bitrate and encoder usage go up (the latter you already mentioned with rocket league/demon souls).
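
If you want to log that bitrate instead of eyeballing it, here is a small sketch that pulls the number out of ffmpeg's status line in Python. The sample line is made up for illustration (it follows the usual `frame= ... bitrate= ... speed=` layout), and the function name is my own.

```python
import re
from typing import Optional

# Matches the "bitrate=5767.2kbits/s" field of ffmpeg's progress line.
_BITRATE = re.compile(r"bitrate=\s*([0-9.]+)kbits/s")


def parse_bitrate_kbits(status_line: str) -> Optional[float]:
    """Return the reported bitrate in kbit/s, or None if not present."""
    match = _BITRATE.search(status_line)
    return float(match.group(1)) if match else None


# Illustrative sample line, not captured from a real run.
sample = (
    "frame=  240 fps= 60 q=21.0 size=    2816kB "
    "time=00:00:04.00 bitrate=5767.2kbits/s speed=   1x"
)
print(parse_bitrate_kbits(sample))
```

ffmpeg writes this line to stderr, so you would read it from there (for example via `subprocess.Popen(..., stderr=subprocess.PIPE)`) and compare it against the encoder usage while switching between static and fast-moving content.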

LogicalDrm · December 17, 2020

-> Moved to Programs, Apps and Websites

abc_123 · December 19, 2020

On 12/17/2020 at 1:16 AM, Ninbura said:

You can avoid high 3D GPU usage by using FFmpeg to stream and record instead of OBS. Additionally, you would only need that first Quadro with this method as the GP100 is perfectly capable of encoding 6 streams of 1080p (along with basically any other Nvidia card released in the last 5 years assuming you're using the patch), which seemed to be your final output resolution despite the 4K cameras.

What makes FFmpeg use less GPU for encoding compared to OBS?

ninbura · December 19, 2020

12 minutes ago, Emily123 said:

What makes FFmpeg use less GPU for encoding compared to OBS?

It doesn't use much less encoding, it just doesn't have any 3D usage, which is really the crux of OBS.

abc_123 · December 19, 2020

13 minutes ago, Ninbura said:

It doesn't use much less encoding, it just doesn't have any 3D usage, which is really the crux of OBS.

What causes OBS to use so much 3D? The only thing I can think of is the preview but you can turn that off.

ninbura · December 21, 2020

On 12/18/2020 at 11:45 PM, Emily123 said:

What causes OBS to use so much 3D? The only thing I can think of is the preview but you can turn that off.

I'm not precisely aware of the specifics, but even when you disable the preview it uses a ton of 3D bandwidth.

It has something to do with OBS rendering the frame on the GPU and then encoding it with either the CPU or GPU encoder. No matter how you shake it, a 4K OBS canvas eats a lot of 3D GPU bandwidth.
