
NVIDIA To Devs: Compute/Graphics Toggle Is A Heavyweight Switch

Mr_Troll

A couple of weeks ago, NVIDIA published a vademecum of DirectX 12 Do’s and Don’ts that went largely unnoticed. It actually contains some interesting tips that NVIDIA gave developers on how to best use Microsoft’s new lower-level API with the company’s existing architectures.

A couple of them, for instance, seem to confirm two stories we reported last month about Maxwell’s problems with asynchronous compute. In case you don’t recall, the reference is to AMD’s Robert Hallock saying that Maxwell can’t perform async compute without heavy reliance on slow context switching; a few days later, Tech Report’s David Kanter mentioned that, according to Oculus employees, preemption for context switching was potentially catastrophic for Maxwell GPUs.

Now, under the Pipeline State Objects (PSOs) section, they were very clear:

 

 

  • Don’t toggle between compute and graphics on the same command queue more than absolutely necessary
  • This is still a heavyweight switch to make
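To put the quoted advice in concrete terms, here is a minimal C++/D3D12 sketch of my own (not from NVIDIA's document; the function names and dispatch sizes are illustrative, and root signatures, resource bindings and barriers are omitted). It contrasts an interleaved recording pattern, which flips between graphics and compute PSOs repeatedly, with a batched one that crosses the compute/graphics boundary only once:

```cpp
#include <d3d12.h>

// Anti-pattern per the guide: every SetPipelineState() that flips the queue
// between a graphics PSO and a compute PSO is a heavyweight switch.
void RecordInterleaved(ID3D12GraphicsCommandList* cl,
                       ID3D12PipelineState* gfxPso,
                       ID3D12PipelineState* computePso)
{
    cl->SetPipelineState(gfxPso);     cl->DrawInstanced(3, 1, 0, 0);
    cl->SetPipelineState(computePso); cl->Dispatch(64, 1, 1);
    cl->SetPipelineState(gfxPso);     cl->DrawInstanced(3, 1, 0, 0);
    cl->SetPipelineState(computePso); cl->Dispatch(64, 1, 1);
}

// Preferred on the same queue: group the compute work, then the graphics work,
// so the compute/graphics boundary is only crossed once.
void RecordBatched(ID3D12GraphicsCommandList* cl,
                   ID3D12PipelineState* gfxPso,
                   ID3D12PipelineState* computePso)
{
    cl->SetPipelineState(computePso);
    cl->Dispatch(64, 1, 1);
    cl->Dispatch(64, 1, 1);

    cl->SetPipelineState(gfxPso);
    cl->DrawInstanced(3, 1, 0, 0);
    cl->DrawInstanced(3, 1, 0, 0);
}
```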

 

 

 

That’s not all they had to say about compute and graphics tasks – under the Work Submission – Command Lists & Bundles section, NVIDIA warned developers as follows:

 

 

 

  • Check carefully if the use of a separate compute command queue really is advantageous
  • Even for compute tasks that can in theory run in parallel with graphics tasks, the actual scheduling details of the parallel work on the GPU may not generate the results you hope for
  • Be conscious of which asynchronous compute and graphics workloads can be scheduled together
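For reference, this is roughly what a "separate compute command queue" means in D3D12. A hedged sketch, assuming a valid ID3D12Device* and leaving out the command lists and error handling; whether it actually wins anything on a given GPU is exactly what NVIDIA says to measure:

```cpp
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

struct AsyncComputeQueue
{
    ComPtr<ID3D12CommandQueue> queue;
    ComPtr<ID3D12Fence>        fence;
    UINT64                     fenceValue = 0;
};

AsyncComputeQueue MakeAsyncComputeQueue(ID3D12Device* device)
{
    AsyncComputeQueue out;

    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;   // compute-only queue
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&out.queue));

    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&out.fence));
    return out;
}

// The graphics queue must not consume the compute results until the compute
// queue signals its fence; this is a GPU-side wait, not a CPU stall.
void SyncGraphicsToCompute(AsyncComputeQueue& compute,
                           ID3D12CommandQueue* graphicsQueue)
{
    ++compute.fenceValue;
    compute.queue->Signal(compute.fence.Get(), compute.fenceValue);
    graphicsQueue->Wait(compute.fence.Get(), compute.fenceValue);
}
```

Even with this in place, the scheduling caveat in the quote still applies: the two queues only overlap if the hardware can actually run the workloads concurrently.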

 

 

 

Finally, NVIDIA also gave some advice on how to best use Maxwell’s DirectX 12 hardware features. They recommend using Conservative Rasterization, which right now is only available on Maxwell cards, while they are a bit more cautious about Rasterizer Ordered Views, the other feature level 12_1 feature; a sketch for checking these features at runtime follows the quoted list.

 

  • Use hardware conservative raster for full-speed conservative rasterization
  • No need to use a GS to implement a ‘slow’ software-based conservative rasterization – See https://developer.nvidia.com/content/dont-be-conservative-conservative-rasterization
  • Make use of NvAPI (when available) to access other Maxwell features
    Advanced Rasterization features:
      Bounding box rasterization mode for quad-based geometry
      New MSAA features like post-depth coverage mask and overriding the coverage mask for routing of data to sub-samples
      Programmable MSAA sample locations
    Fast Geometry Shader features:
      Render to cube maps in one geometry pass without geometry amplification
      Render to multiple viewports without geometry amplification
      Use the fast pass-through geometry shader for techniques that need per-triangle data in the pixel shader
    New interlocked operations
    Enhanced blending ops
    New texture filtering ops
  • Don’t use Raster Order View (ROV) techniques pervasively
    Guaranteeing order doesn’t come for free
    Always compare with alternative approaches like advanced blending ops and atomics
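As an illustration of the standard (non-NvAPI) side of this advice, here is a small sketch of my own, not NVIDIA's, for checking conservative rasterization and ROV support at runtime and switching conservative raster on in a PSO. It assumes a valid ID3D12Device* and an already-filled D3D12_GRAPHICS_PIPELINE_STATE_DESC:

```cpp
#include <d3d12.h>

// Query the DirectX 12 optional-feature caps before relying on 12_1 features.
bool QueryDx12_1Features(ID3D12Device* device,
                         D3D12_FEATURE_DATA_D3D12_OPTIONS* out)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                           &options, sizeof(options))))
        return false;

    *out = options;
    // Relevant fields:
    //   options.ConservativeRasterizationTier  (tiers 1-3, or NOT_SUPPORTED)
    //   options.ROVsSupported                  (TRUE/FALSE)
    return true;
}

// Hardware conservative raster is then a single rasterizer-state field:
void EnableConservativeRaster(D3D12_GRAPHICS_PIPELINE_STATE_DESC& psoDesc)
{
    psoDesc.RasterizerState.ConservativeRaster =
        D3D12_CONSERVATIVE_RASTERIZATION_MODE_ON;
}
```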

 

 

 

Source : http://wccftech.com/nvidia-devs-computegraphics-toggle-heavyweight-switch/

https://developer.nvidia.com/dx12-dos-and-donts

 

So this just confirms what we all know ... that Maxwell and the architectures before it are not as good at DX12 as AMD's. Not surprising.


 


 


nvidia knows how GPUs work better than the devs

which is why the devs should take and appreciate their help/advice


 


-snip-

Notice how Nvidia is being as quiet as a church mouse about Pascal?



  • Don’t use Raster Order View (ROV) techniques pervasively .... this is a DX 12.1 feature... so are they lying about DX 12.1 compliance as well?


 


I'm sure people will still say that Nvidia will fix DX12 performance on Maxwell and previous generations through drivers, but news flash: you can't fix a hardware problem with software.


 


nvidia knows how GPUs work better than the devs

which is why the devs should take and appreciate their help/advice

 

Nvidia knows how their own GPUs work better, and they surely want to encourage devs to code so their cards are not at such a disadvantage. They seem to be taking advantage of devs who might make that same assumption to basically throw AMD under the bus and steer devs away from coding for async compute, which would give their competitor an advantage.



Don’t use Raster Order View (ROV) techniques pervasively .... this is a DX 12.1 feature... so are they lying about DX 12.1 compliance as well?

To note, these are optional features: resource binding (three tiers), tiled resources (three tiers), conservative rasterization (three tiers), stencil reference value from the pixel shader, rasterizer ordered views, typed UAV loads for additional formats, UMA/hUMA support, logical blend operations, double-precision (64-bit) floating-point operations, and minimum floating-point precision (10 or 16 bit).

And these are the required features for feature level 12_0: Resource Binding Tier 2, Tiled Resources Tier 2 (Texture2D), Typed UAV Loads (additional formats).

And for 12_1: Conservative Rasterization Tier 1, Rasterizer Ordered Views.
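If anyone wants to check this themselves, a quick sketch (assuming an already-created ID3D12Device*) that asks the runtime which feature level the card actually reports:

```cpp
#include <d3d12.h>

D3D_FEATURE_LEVEL QueryMaxFeatureLevel(ID3D12Device* device)
{
    static const D3D_FEATURE_LEVEL levels[] = {
        D3D_FEATURE_LEVEL_12_1, D3D_FEATURE_LEVEL_12_0,
        D3D_FEATURE_LEVEL_11_1, D3D_FEATURE_LEVEL_11_0,
    };

    D3D12_FEATURE_DATA_FEATURE_LEVELS query = {};
    query.NumFeatureLevels        = sizeof(levels) / sizeof(levels[0]);
    query.pFeatureLevelsRequested = levels;

    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_FEATURE_LEVELS,
                                           &query, sizeof(query))))
        return D3D_FEATURE_LEVEL_11_0;   // conservative fallback on failure

    return query.MaxSupportedFeatureLevel;   // e.g. 12_1 on second-gen Maxwell
}
```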


nvidia knows how GPUs work better than the devs

which is why the devs should take and appreciate their help/advice

 

The question is are they giving the instructions so their GPUs don't fall short of AMD's?



oh noes!

not "full DX12" hardware that can't do the full DX12 spec  <_<

As per usual, the first generation of cards that supports the latest DirectX API sucks at actually supporting it.


I'm sure people will still say that Nvidia will fix DX12 performance on Maxwell and previous generations through drivers, but news flash: you can't fix a hardware problem with software.

If they use their market share to force all developers not to use features the Nvidia cards can't handle well, it is somehow "solved".

And I'm sure they try to do so using heavy PR.



 

  • Don’t use Raster Order View (ROV) techniques pervasively .... this is a DX 12.1 feature... so are they lying about DX 12.1 compliance as well?

 

I'm pretty sure it's supported, but probably really taxing to use excessively.


 


 

typo sorry...

 


 


This is literally my first time seeing this term.

 

Me too


 


Nvidia just sucks at async compute, and Kepler and Maxwell always will, no matter the driver updates they get. People still don't understand that the efficiency of these two architectures (especially Maxwell) came with huge compromises.

AMD's GCN cards are going to wreck most Nvidia stuff in DX12 that is out right now. Based on certain PowerPoint presentations, we might see Pascal suck at async compute too, but of course that is speculation.

We know most devs are going to focus a lot on async compute, solely for the sake of the consoles, which are GCN based, so we will probably see this efficiency translate to AMD, whereas we might see Nvidia-specific code paths where async compute is limited or completely off. Maybe even using ROV or other 12_1 stuff that could help Nvidia a little bit.

 

Also this is hilarious:

 

  • Don’t rely on being able to allocate all GPU memory in one go
  • Depending on the underlying GPU architecture the memory may or may not be segmented

 

That's the 970 right there. What a shit card, to require specific coding from devs just because Nvidia wanted to overcharge for defective chips (sure, it's not a bug, it's a feature, but what a shitty feature it is).
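For what the quoted "don't allocate it all in one go" advice means in code, here's a rough sketch of my own (assuming a valid IDXGIAdapter3* and ID3D12Device*, with illustrative sizes and no real error strategy): query the memory budget DXGI reports and carve it into several heaps instead of one giant allocation, so a segmented-memory card doesn't fail the whole request.

```cpp
#include <windows.h>
#include <d3d12.h>
#include <dxgi1_4.h>

// How much local (on-board) memory the OS currently budgets for this app.
UINT64 QueryLocalBudget(IDXGIAdapter3* adapter)
{
    DXGI_QUERY_VIDEO_MEMORY_INFO info = {};
    adapter->QueryVideoMemoryInfo(0, DXGI_MEMORY_SEGMENT_GROUP_LOCAL, &info);
    return info.Budget;
}

// Allocate several modest heaps instead of one big block; if a chunk fails,
// back off rather than assuming a single contiguous allocation must succeed.
HRESULT CreateChunkedHeaps(ID3D12Device* device, UINT64 totalBytes,
                           ID3D12Heap** heaps, UINT heapCount)
{
    D3D12_HEAP_DESC desc = {};
    desc.SizeInBytes     = totalBytes / heapCount;
    desc.Properties.Type = D3D12_HEAP_TYPE_DEFAULT;
    desc.Flags           = D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS;

    for (UINT i = 0; i < heapCount; ++i)
    {
        HRESULT hr = device->CreateHeap(&desc, IID_PPV_ARGS(&heaps[i]));
        if (FAILED(hr))
            return hr;
    }
    return S_OK;
}
```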



I don't understand.

 

heeeeeeelp


 


So basically:

Do / don't do this so that the game runs as well as possible on our NVIDIA cards.

I want to see Do's and Don'ts for devs from AMD as well then.


 

  • Don’t use Raster Order View (ROV) techniques pervasively .... this is a DX 12.1 feature... so are they lying about DX 12.1 compliance as well?

 

No, but they know they'll be the target of antitrust litigation if they don't give this warning, since AMD has no support for 12.1 features.



The question is are they giving the instructions so their GPUs don't fall short of AMD's?

It would probably increase performance on any GPU to follow those steps.

It's not like Nvidia GPUs and AMD GPUs work completely differently... after all, they both do the same work: render frames for a game.


 


Nvidia knows how their own GPUs work better, and they surely want to encourage devs to code so their cards are not at such a disadvantage. They seem to be taking advantage of devs who might make that same assumption to basically throw AMD under the bus and steer devs away from coding for async compute, which would give their competitor an advantage.

all the stuff listed is not "nvidia specific" or proprietary

you're just making up conspiracy theories lol


 




TL;DR - our compute sucks - please don't use it



all the stuff listed is not "nvidia specific" or proprietary

you're just making up conspiracy theories lol

 

Not being Nvidia-specific means nothing if it performs a lot worse. The reverse would be AMD not recommending OpenGL: it's not Nvidia-specific, but their cards perform much worse with it, so of course they wouldn't want devs to use that.



The question is are they giving the instructions so their GPUs don't fall short of AMD's?

pretty much yes.


