DLSS Accelerator Card Thoughts

Bhameck

I was thinking about how DLSS is a very powerful feature for Nvidia GPUs, and it got me thinking about ways to speed it up. I came up with an idea: instead of putting the tensor cores for DLSS on the GPU die directly, could you (theoretically) move them onto a separate card? The latency would be higher, but since the GPU would be sending the finished lower-res frame to the tensor card for upscaling, you could use the tensor card as the display output. Would it make a noticeable difference? You could make a dedicated slot for tensor accelerator cards that connects to the GPU. Having a dedicated card for DLSS could free up space on the GPU die for more CUDA and RT cores, making for a faster GPU, so it could render at a higher resolution and DLSS could upscale to a proportionally higher output. Since there would be a dedicated card just for DLSS, would that make DLSS work better? I'm not very experienced with GPU architecture and GPU rendering, so I'm most likely completely off base. Please enlighten me, I would love to learn more and discuss. Thanks
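A rough back-of-the-envelope sketch of the bandwidth this would need (my own illustrative numbers, assuming an 8-bit RGBA frame; the only spec figure used is PCIe 4.0 x16 throughput):

```python
# How much PCIe traffic would shipping the finished low-res frame to a
# hypothetical "tensor card" generate? Illustrative numbers only.
width, height = 1920, 1080       # DLSS internal render resolution
bytes_per_px = 4                 # 8-bit RGBA; HDR formats roughly double this
fps = 120
gb_per_s = width * height * bytes_per_px * fps / 1e9
print(f"{gb_per_s:.2f} GB/s")    # ~1.00 GB/s, vs ~32 GB/s for PCIe 4.0 x16
# Bandwidth is clearly not the bottleneck; the added hop's latency is the
# real question.
```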

Oh cool, what if we also put the PhysX calculations on a discrete card too? 😮

Desktop: Ryzen 9 3950X, Asus TUF Gaming X570-Plus, 64GB DDR4, MSI RTX 3080 Gaming X Trio, Creative Sound Blaster AE-7

Gaming PC #2: Ryzen 7 5800X3D, Asus TUF Gaming B550M-Plus, 32GB DDR4, Gigabyte Windforce GTX 1080

Gaming PC #3: Intel i7 4790, Asus B85M-G, 16GB DDR3, XFX Radeon R9 390X 8GB

WFH PC: Intel i7 4790, Asus B85M-F, 16GB DDR3, Gigabyte Radeon RX 6400 4GB

UnRAID #1: AMD Ryzen 9 3900X, Asus TUF Gaming B450M-Plus, 64GB DDR4, Radeon HD 5450

UnRAID #2: Intel E5-2603v2, Asus P9X79 LE, 24GB DDR3, Radeon HD 5450

MiniPC: BeeLink SER6 6600H w/ Ryzen 5 6600H, 16GB DDR5

Windows XP Retro PC: Intel i3 3250, Asus P8B75-M LX, 8GB DDR3, Sapphire Radeon HD 6850, Creative Sound Blaster Audigy

Windows 9X Retro PC: Intel E5800, ASRock 775i65G r2.0, 1GB DDR1, AGP Sapphire Radeon X800 Pro, Creative Sound Blaster Live!

Steam Deck w/ 2TB SSD Upgrade

So a GPU in parts that we can build like a Lego set?

What could go wrong?

Just add a PhysX card, a ray tracing card, some AI, and a little bit of DDR69 with 128 GB.

All the cards onto one big card, becoming the Transformers of cards, or a modular Intel NUC.

It wouldn't work for DLSS. DLSS runs directly on top of the rasterized or ray-traced scene and needs data that only exists mid-pipeline (the low-res frame plus its motion vectors and depth buffer), so it has to sit at exactly that rendering stage, before post-processing and the UI. Having it outside the GPU that's doing the rasterization/ray tracing would create huge latency and most likely huge problems with performance or visuals.
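To make the ordering concrete, here's a toy sketch (every name in it is made up for illustration; none of this is NVIDIA's actual API) of where a DLSS-style temporal upscaler sits: between scene rendering and UI compositing, consuming buffers that never leave the GPU:

```python
import numpy as np

def render_scene(w, h):
    """Stand-in for rasterization/ray tracing at the low internal res."""
    color = np.random.rand(h, w, 3)     # shaded low-res frame
    motion = np.random.rand(h, w, 2)    # per-pixel motion vectors
    depth = np.random.rand(h, w)        # depth buffer
    return color, motion, depth

def temporal_upscale(color, motion, history, scale=2):
    """Crude stand-in for the tensor-core pass: reproject last frame's
    output with the motion vectors, then blend with the upscaled frame.
    A real pass also uses depth for disocclusion; omitted in this toy."""
    up = color.repeat(scale, axis=0).repeat(scale, axis=1)
    if history is None:
        return up
    dy, dx = np.round(motion.mean(axis=(0, 1)) * scale).astype(int)
    history = np.roll(history, (int(dy), int(dx)), axis=(0, 1))  # very crude reprojection
    return 0.9 * up + 0.1 * history

def composite_ui(frame):
    frame[:32] = 1.0                    # HUD drawn at full res, *after* upscaling
    return frame

history = None
for _ in range(3):
    color, motion, depth = render_scene(960, 540)
    history = temporal_upscale(color, motion, history)
    frame = composite_ui(history.copy())   # 1920x1080 frame sent to scan-out
```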

 

AMD's FSR, however, is another matter, and that could be done. Since FSR does its relatively simple upscaling at the output stage (with the exception that in-game integrations tap into the rendering pipeline before the game's interface is drawn, so GUI elements don't get resized), it would be doable without much trouble, assuming the add-on card could perform the resampling and sharpening as fast as the GPU itself. The graphics card would just send the final output frames meant for the monitor to this "FSR accelerator" directly over PCIe, much like CrossFireX handled inter-card communication in the past, and the monitor would actually be connected to the FSR accelerator, which would process the received image and output it.

 

This would, however, also affect GUI elements, but it could work with any game. And if the accelerator were a purpose-built ASIC that computes only Lanczos resampling and AMD's CAS sharpening, it could do so extremely fast, at very low power and very low latency, precisely because that's all it would ever do. I'm sure drivers and software could be written so you could easily switch between native resolution and the FSR accelerator; you'd have to toggle it manually and set lower resolutions in games, but it would be entirely doable. The question is just how much interest there would be for such a thing, and why not instead lobby AMD (and NVIDIA) hard enough that they make it a native feature that works with any game, not just those coded for it. The NVIDIA part might be an issue unless they can make DLSS game-agnostic too; then it wouldn't matter.
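For a sense of how simple the hot loop of such an ASIC would be, here's a minimal, unoptimized sketch of both operations (my own toy code; the CAS part is a deliberately loose approximation of AMD's algorithm, not its actual implementation):

```python
import numpy as np

def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x/a) for |x| < a, else 0."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(np.abs(x) < a, np.sinc(x) * np.sinc(x / a), 0.0)

def lanczos_upscale_1d(row, scale, a=3):
    """Resample one scanline by `scale` (apply per row, then per column,
    for a full image)."""
    n = len(row)
    out = np.empty(int(n * scale))
    for i in range(len(out)):
        src = (i + 0.5) / scale - 0.5           # output pixel centre in source coords
        taps = np.arange(int(np.floor(src)) - a + 1, int(np.floor(src)) + a + 1)
        w = lanczos_kernel(src - taps, a)        # 2a contributing taps
        out[i] = np.dot(w, row[np.clip(taps, 0, n - 1)]) / w.sum()
    return out

def cas_sharpen(img, sharpness=0.8):
    """Loose approximation of contrast-adaptive sharpening on a grayscale
    image in [0, 1]: per-pixel strength scales with local contrast headroom,
    so already-contrasty edges are sharpened less."""
    p = np.pad(img, 1, mode="edge")
    n, s = p[:-2, 1:-1], p[2:, 1:-1]
    w_, e_ = p[1:-1, :-2], p[1:-1, 2:]
    mn = np.minimum.reduce([n, s, w_, e_, img])
    mx = np.maximum.reduce([n, s, w_, e_, img])
    amp = np.sqrt(np.clip(np.minimum(mn, 1 - mx) / np.maximum(mx, 1e-6), 0, 1))
    w = -amp * (0.125 + 0.075 * sharpness)       # negative cross-neighbour weights
    return np.clip((w * (n + s + w_ + e_) + img) / (4 * w + 1), 0, 1)
```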

2 hours ago, Bhameck said:

I was thinking about how DLSS is a very powerful feature for Nvidia GPUs, and it got me thinking about ways to speed it up. I came up with an idea: instead of putting the tensor cores for DLSS on the GPU die directly, could you (theoretically) move them onto a separate card? The latency would be higher, but since the GPU would be sending the finished lower-res frame to the tensor card for upscaling, you could use the tensor card as the display output. Would it make a noticeable difference? You could make a dedicated slot for tensor accelerator cards that connects to the GPU. Having a dedicated card for DLSS could free up space on the GPU die for more CUDA and RT cores, making for a faster GPU, so it could render at a higher resolution and DLSS could upscale to a proportionally higher output. Since there would be a dedicated card just for DLSS, would that make DLSS work better? I'm not very experienced with GPU architecture and GPU rendering, so I'm most likely completely off base. Please enlighten me, I would love to learn more and discuss. Thanks

 

1 hour ago, CerealExperimentsLain said:

Oh cool, what if we also put the PhysX calculations on a discrete card too? 😮

You could do this by slapping a secondary GPU in your rig back around the Arkham days, but nearly everyone didn't bother and just ran it on their main card anyway. So I guess convenience is your biggest enemy here.

With all the Trolls, Try Hards, Noobs and Weirdos around here you'd think I'd find SOMEWHERE to fit in!

6 hours ago, CerealExperimentsLain said:

Oh cool, what if we also put the PhysX calculations on a discrete card too? 😮

Hair FX™, too?

Awesome!

… I actually think DLSS specifically is a great idea; a lot of people with lower-end cards like the 1060 could use this (of course it's not gonna happen, I really like the idea tho!)

The direction tells you... the direction

-Scott Manley, 2021

 

Software used:

Corsair Link (Anime Edition)
MSI Afterburner
OpenRGB
Lively Wallpaper
OBS Studio
Shutter Encoder
Avidemux
FSResizer
Audacity
VLC
WMP
GIMP
HWiNFO64
Paint
3D Paint
GitHub Desktop
Superposition
Prime95
Aida64
GPUZ
CPUZ
Generic Logviewer

8 hours ago, Bhameck said:

so it could render at a higher resolution and DLSS could upscale to a proportionally higher output. Since there would be a dedicated card just for DLSS, would that make DLSS work better?

The entire point of DLSS is to avoid rendering at a higher resolution. Making the GPU render at a higher resolution and then sending that to a separate component to scale to an even higher resolution kind of defeats the point. Yes, it could be used to render at 4K and upscale to 16K, but at least for the moment the intention is to render at 720p or 1080p and upscale that to, say, 2160p.
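To put numbers on that (standard resolutions, my arithmetic rather than anything from NVIDIA):

```python
# Pixels shaded per frame at the resolutions mentioned above
render_1080p = 1920 * 1080   # 2,073,600 px internal render
native_2160p = 3840 * 2160   # 8,294,400 px display target
print(native_2160p / render_1080p)   # 4.0 -- DLSS avoids shading ~75% of output pixels
```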

7 hours ago, RejZoR said:

The part with NVIDIA might be issue unless they can figure out DLSS to be game agnostic too. Then it wouldn't matter.

Isn't DLSS 2.0 and up game-agnostic already? I thought pre-2.0 versions were the only implementations that needed training for each game.

Crystal: CPU: i7 7700K | Motherboard: Asus ROG Strix Z270F | RAM: GSkill 16 GB@3200MHz | GPU: Nvidia GTX 1080 Ti FE | Case: Corsair Crystal 570X (black) | PSU: EVGA Supernova G2 1000W | Monitor: Asus VG248QE 24"

Laptop: Dell XPS 13 9370 | CPU: i5 10510U | RAM: 16 GB

Server: CPU: i5 4690k | RAM: 16 GB | Case: Corsair Graphite 760T White | Storage: 19 TB

51 minutes ago, tikker said:

The entire point of DLSS is to avoid rendering at a higher resolution. Making the GPU render at a higher resolution and then sending that to a separate component to scale to an even higher resolution kind of defeats the point. Yes, it could be used to render at 4K and upscale to 16K, but at least for the moment the intention is to render at 720p or 1080p and upscale that to, say, 2160p.

Isn't DLSS 2.0 and up game-agnostic already? I thought pre-2.0 versions were the only implementations that needed training for each game.

It still needs to be specifically coded into each game; it can't be applied through the graphics driver. FSR can be, but it would then also affect interface elements.

7 hours ago, YouSirAreADudeSir said:

 

You could do this by slapping a secondary GPU in your rig back around the Arkham days, but nearly everyone didn't bother and just ran it on their main card anyway. So I guess convenience is your biggest enemy here.

Even before that there were dedicated cards that just handled PhysX, from the GeForce 8000-series era. The 200 series (which had PhysX support built in) made them obsolete, though, after only a year or so on the market.

7 hours ago, YouSirAreADudeSir said:

You could do this by slapping a secondary GPU in your rig back around the Arkham days, but nearly everyone didn't bother and just ran it on their main card anyway. So I guess convenience is your biggest enemy here.

Found the user too young to remember that PhysX started as a dedicated PCI card.  Using a second GPU for PhysX came later. 😛

3 hours ago, CerealExperimentsLain said:

Found the user too young to remember that PhysX started as a dedicated PCI card.  Using a second GPU for PhysX came later. 😛

Me?? I'm 39 lol. I remember the PhysX card from before Nvidia bought them out. Just didn't think they were worth mentioning.

