
NVIDIA Pascal Async compute support: CONFIRMED

i_build_nanosuits
4 minutes ago, i_build_nanosuits said:

I will say though, I will download this Time Spy benchmark tonight at home and test with my 980 Ti to see if I get improvements from async compute as well (I hope so :P)

Maxwell doesn't run async at all because the driver blocks all requests to use it, according to a 3DMark developer. So any gains in Time Spy are simply from the lower overhead/better API.

 

https://steamcommunity.com/app/223850/discussions/0/366298942110944664/

[Out-of-date] Want to learn how to make your own custom Windows 10 image?

 

Desktop: AMD R9 3900X | ASUS ROG Strix X570-F | Radeon RX 5700 XT | EVGA GTX 1080 SC | 32GB Trident Z Neo 3600MHz | 1TB 970 EVO | 256GB 840 EVO | 960GB Corsair Force LE | EVGA G2 850W | Phanteks P400S

Laptop: Intel M-5Y10c | Intel HD Graphics | 8GB RAM | 250GB Micron SSD | Asus UX305FA

Server 01: Intel Xeon D 1541 | ASRock Rack D1541D4I-2L2T | 32GB Hynix ECC DDR4 | 4x8TB Western Digital HDDs | 32TB Raw 16TB Usable

Server 02: Intel i7 7700K | Gigabyte Z170N Gaming 5 | 16GB Trident Z 3200MHz


10 minutes ago, i_build_nanosuits said:

I should have known the AMD fanbabies would come at me with pitchforks.

That was not the goal of this post. I don't feel like arguing; the proof is there.

Pascal DOES get improved performance from async compute support under DX12, now deal with it.
I'm out. (This forum really sucks. It's a good thing for Nvidia users... that DOES NOT mean it's a BAD thing for AMD fanboys... GROW UP!!)

If anything, I'm just pointing out that this thread is essentially a repost. There are a full seven pages of discussion on the other thread, which was started nearly a week ago, and nobody here is doubting that Pascal supports async compute. Calm down.

'Fanboyism is stupid' - someone on this forum.

Be nice to each other boys and girls. And don't cheap out on a power supply.


CPU: Intel Core i7 4790K - 4.5 GHz | Motherboard: ASUS MAXIMUS VII HERO | RAM: 32GB Corsair Vengeance Pro DDR3 | SSD: Samsung 850 EVO - 500GB | GPU: MSI GTX 980 Ti Gaming 6GB | PSU: EVGA SuperNOVA 650 G2 | Case: NZXT Phantom 530 | Cooling: CRYORIG R1 Ultimate | Monitor: ASUS ROG Swift PG279Q | Peripherals: Corsair Vengeance K70 and Razer DeathAdder

 


Depends on how you define async compute. The entire point of the performance increase on AMD is to use async compute like hyperthreading: push compute tasks to any and all idle shaders, filling in the blanks.

 

Pascal cannot do this. Just like Kepler, Pascal can only do context switching, which means the pipeline either does graphics work or compute work, but never both at the same time. The difference between Kepler and Pascal is that Kepler has to flush the entire pipeline between each switch, making async compute completely useless. That is why we haven't gotten that infamous async compute driver for Kepler, and never will.

Pascal, on the other hand, does not have to flush the pipeline between each context switch. However, the performance increase from using async compute on Pascal is quite small.

 

Async compute is not simply a switch you turn on or off. It has to actually be implemented in a way that increases performance.
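For reference, the API mechanics behind all of this: in D3D12, "async compute" just means submitting work on a second command queue of type COMPUTE alongside the normal DIRECT (graphics) queue and synchronizing the two with a fence. Whether the two workloads actually overlap on the shader array (as described above for GCN) or get serialized is decided by the hardware and driver, not by the application. A minimal sketch, assuming a valid ID3D12Device and pre-recorded command lists (error handling omitted, names illustrative):

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Two queues, one fence: the application only expresses the *opportunity* for
// overlap; whether graphics and compute actually run concurrently is up to the
// GPU and its driver.
void SubmitWithAsyncCompute(ID3D12Device* device,
                            ID3D12CommandList* graphicsCmdList, // pre-recorded (assumed)
                            ID3D12CommandList* computeCmdList)  // pre-recorded (assumed)
{
    // Graphics ("direct") queue: accepts draw, compute and copy work.
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    ComPtr<ID3D12CommandQueue> directQueue;
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

    // Dedicated compute queue: this is the "async compute" submission path.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // Kick the compute work off on its own queue and signal when it is done.
    computeQueue->ExecuteCommandLists(1, &computeCmdList);
    computeQueue->Signal(fence.Get(), 1);

    // Graphics work that does NOT depend on the compute result could be
    // submitted here and overlap with it. The consuming work waits GPU-side;
    // the CPU is never blocked.
    directQueue->Wait(fence.Get(), 1);
    directQueue->ExecuteCommandLists(1, &graphicsCmdList);
}
```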

Watching Intel have competition is like watching a headless chicken trying to get out of a mine field

CPU: Intel I7 4790K@4.6 with NZXT X31 AIO; MOTHERBOARD: ASUS Z97 Maximus VII Ranger; RAM: 8 GB Kingston HyperX 1600 DDR3; GFX: ASUS R9 290 4GB; CASE: Lian Li v700wx; STORAGE: Corsair Force 3 120GB SSD; Samsung 850 500GB SSD; Various old Seagates; PSU: Corsair RM650; MONITOR: 2x 20" Dell IPS; KEYBOARD/MOUSE: Logitech K810/ MX Master; OS: Windows 10 Pro


8 minutes ago, Morgan MLGman said:

Hmmm, but considering what transitioning from DX11 to DX12 in Ashes of the Singularity, which is a DX12 showcase benchmark, does for a 390X (it surpassed the 980 Ti's DX12 score), is it really that bad that they used that many queues? If they improve performance THAT much, what's wrong with that? It's not like it's the 64x tessellation that cuts FPS in half when used ;-;

1) No one has done an objective measurement to see where each architecture's gains peak. For all we know, AMD is just throwing out the maximum number it can sustain purely to make Nvidia look bad.

2) AC/S makes up for the fact that AMD's driver (which still has higher CPU overhead than Nvidia's) doesn't keep its own shaders well fed. As it's being used now, it's a cheap band-aid or trick.

3) Tessellation improved performance vs. the alternative, which was doing the hair physics purely in compute. It's exactly the same thing as using too much AC/S against Nvidia: Nvidia knew AMD sucked at tessellation and pushed it to obscene limits, and this is just karma at play. That doesn't make it right, but that is what's going on here.

I could mathematically prove, based on current triangle counts, model counts, lighting models used, etc., that there isn't enough dead pipeline time to justify using more than 4 queues even if your driver is only 75% efficient. But that's not nearly as important as the fact that Nvidia needs to work more closely with devs on DX12 support, and that AMD should stop trolling and fix its own damn drivers, since its hardware is suffering on scalability because of the over-engineering it put into AC/S support.
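The shape of that back-of-envelope argument, with every quantity hypothetical except the 75% driver-efficiency figure above, would be something like

\[
N_{\text{queues}} \;\approx\; \left\lceil \frac{T_{\text{idle}}}{\eta \, T_{\text{fill}}} \right\rceil,
\qquad \eta \approx 0.75,
\]

where \(T_{\text{idle}}\) is the dead pipeline time available per frame (set by triangle count, model count, lighting model, etc.) and \(T_{\text{fill}}\) is the shader time a single extra compute queue can realistically fill per frame; the claim is that plausible values keep \(N_{\text{queues}} \le 4\).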

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


21 minutes ago, i_build_nanosuits said:

I should have known the AMD fanbabies would come at me with pitchforks.

That was not the goal of this post. I don't feel like arguing; the proof is there.

Pascal DOES get improved performance from async compute support under DX12, now deal with it.
I'm out. (This forum really sucks. It's a good thing for Nvidia users... that DOES NOT mean it's a BAD thing for AMD fanboys... GROW UP!!)

 

"DEAL WITH IT" 

 

smh. This is fanboy talk coming out of you, y'know?

i5 2400 | ASUS RTX 4090 TUF OC | Seasonic 1200W Prime Gold | WD Green 120gb | WD Blue 1tb | some ram | a random case

 


Just now, patrickjp93 said:

1) No one has done an objective measurement to see where each architecture's gains peak. For all we know, AMD is just throwing out the maximum number it can sustain purely to make Nvidia look bad.

2) AC/S makes up for the fact that AMD's driver (which still has higher CPU overhead than Nvidia's) doesn't keep its own shaders well fed. As it's being used now, it's a cheap band-aid or trick.

3) Tessellation improved performance vs. the alternative, which was doing the hair physics purely in compute. It's exactly the same thing as using too much AC/S against Nvidia: Nvidia knew AMD sucked at tessellation and pushed it to obscene limits, and this is just karma at play. That doesn't make it right, but that is what's going on here.

I could mathematically prove, based on current triangle counts, model counts, lighting models used, etc., that there isn't enough dead pipeline time to justify using more than 4 queues even if your driver is only 75% efficient. But that's not nearly as important as the fact that Nvidia needs to work more closely with devs on DX12 support, and that AMD should stop trolling and fix its own damn drivers, since its hardware is suffering on scalability because of the over-engineering it put into AC/S support.

So, following what you wrote, can DX12 be tweaked enough that Maxwell and Pascal see as big a gain going from DX11 to DX12 as Hawaii does? Because a 290X/390X (which is pretty damn beefy hardware) sees very significant gains.

CPU: AMD Ryzen 7 5800X3D GPU: AMD Radeon RX 6900 XT 16GB GDDR6 Motherboard: MSI PRESTIGE X570 CREATION
AIO: Corsair H150i Pro RAM: Corsair Dominator Platinum RGB 32GB 3600MHz DDR4 Case: Lian Li PC-O11 Dynamic PSU: Corsair RM850x White


Just now, Morgan MLGman said:

So, following what you wrote, can DX12 be tweaked enough that Maxwell and Pascal see as big a gain going from DX11 to DX12 as Hawaii does? Because a 290X/390X (which is pretty damn beefy hardware) sees very significant gains.

Probably not, because Nvidia already keeps its shaders much better fed; there's less room to grow. That said, in VR Nvidia has the advantage because of Simultaneous Multi-Projection. AMD cannot do single-pass rendering, and that is going to be a very painful, inescapable fact for the next two years or more.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


Just now, Morgan MLGman said:

So, following what you wrote, can DX12 be tweaked enough that Maxwell and Pascal see as big a gain going from DX11 to DX12 as Hawaii does? Because a 290X/390X (which is pretty damn beefy hardware) sees very significant gains.

It's more like he's pointing out that async compute is masking a separate issue that AMD has.  Nvidia doesn't/didn't have that problem, so async compute will not demonstrate the same leap for them.

i7-5820k  |  MSI X99S SLI-Plus  |  4x4GB HyperX 2400 DDR4  |  Sapphire Radeon R9 295X2  |  Samsung 840 EVO 1TB x2  |  Corsair AX1200i  |  Corsair H100i  |  NZXT H440 Razer


Just now, Orblivion said:

It's more like he's pointing out that async compute is masking a separate issue that AMD has.  Nvidia doesn't/didn't have that problem, so async compute will not demonstrate the same leap for them.

I know, but I was asking about something else: I was curious whether it's possible for Nvidia cards to get as big a performance boost as Hawaii-based GPUs do.

 

1 minute ago, patrickjp93 said:

Probably not, because Nvidia already keeps its shaders much better fed; there's less room to grow. That said, in VR Nvidia has the advantage because of Simultaneous Multi-Projection. AMD cannot do single-pass rendering, and that is going to be a very painful, inescapable fact for the next two years or more.

Interesting, considering how heavily AMD has invested in the VR market and the promotion of VR with their "cheap, $199 premium VR".

CPU: AMD Ryzen 7 5800X3D GPU: AMD Radeon RX 6900 XT 16GB GDDR6 Motherboard: MSI PRESTIGE X570 CREATION
AIO: Corsair H150i Pro RAM: Corsair Dominator Platinum RGB 32GB 3600MHz DDR4 Case: Lian Li PC-O11 Dynamic PSU: Corsair RM850x White


Well, Pascal has software async compute.

Many people are reporting that the Time Spy benchmark doesn't use the hardware async compute capabilities.

CPU i5 6600k @ 4.6GHz GPU MSI R9 390 GAMING 8G RAM 8 x 2gb DDR4-2800MHz Avexir RAM Mother Board ASUS Z170 Pro Gaming Case NZXT H440 PSU Cooler Master v750 750W Storage WD 1TB Blue + Samsung 950 pro 128gb m.2 pci-e SSD Cooler Corsair H110i GTX

Monitor BenQ BL2420PT 24" 1440p 60Hz


7 minutes ago, Notional said:

Depends on how you define async compute.

I have seen this phenomenon before and I bet it is what we are seeing here.

 

It is all about them continually misusing the terminology, until people just start picking it up.

Look at the title of this thread, as misleading as it can be (from a pure terminology-usage point of view).

 

In the future, we will all be confused as to whether or not Pascal actually supported async.

Please avoid feeding the argumentative narcissistic academic monkey.

"the last 20 percent – going from demo to production-worthy algorithm – is both hard and is time-consuming. The last 20 percent is what separates the men from the boys" - Mobileye CEO


Just now, Morgan MLGman said:

I know, but I was asking about something else: I was curious whether it's possible for Nvidia cards to get as big a performance boost as Hawaii-based GPUs do.

 

Interesting, considering how heavily AMD has invested in the VR market and the promotion of VR with their "cheap, $199 premium VR".

The universe is a funny thing. AMD may have found an electrically and thermally expensive way to keep its shaders fed, but now it has to find more thermal headroom to add SMP if it wants to keep up in VR.

 

This is why I harp on software developers. You cannot blame the hardware makers for lousy performance gains if you don't utilize their hardware to the fullest. I'm going to avoid my usual Intel performance gains rant, but suffice it to say it's not Intel's fault the other side won't pull its weight.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


33 minutes ago, i_build_nanosuits said:

I should have known the AMD fanbabies would come at me with pitchforks.

That was not the goal of this post. I don't feel like arguing; the proof is there.

Pascal DOES get improved performance from async compute support under DX12, now deal with it.
I'm out. (This forum really sucks. It's a good thing for Nvidia users... that DOES NOT mean it's a BAD thing for AMD fanboys... GROW UP!!)

You are being too simplistic here.

 

What we are seeing here, with Time Spy and other DX12/Vulkan titles using async, is similar to what we see with FX CPUs in software that is highly tuned for multi-core.

Improvements are seen, clear improvements. However, that does not change the fact that once you test something that does it even better, whatever gains the FX made are suddenly invalidated.

 

Nvidia has an overcomplicated and risky way of running async. I've spent two days listening to conferences and tech demos about how async is supposed to work with Pascal. Hell, some of these utterly boring-as-fuck conferences last two hours or more.

What I have learned, however, is that running async on Pascal is possible, but you are walking on a knife's edge performance-wise. It is much harder to implement properly compared to GCN, and it is much more dangerous if you fuck up (near-catastrophic amounts of latency and/or an outright driver crash).

 

It is also dependent on both the game engine AND the Nvidia driver being coded for it. Not as badly as Maxwell requires, but it is still a lot of work.

 

(Async on Maxwell requires a dedicated, customized rendering path JUST for Maxwell GPUs. This is why Maxwell never has and never will have async compute. Nvidia has officially stated that async compute has been disabled in the driver for Maxwell, and given how long it has been since they promised to make async work on Maxwell in Ashes, I bet we won't ever see any Maxwell GPU run async compute.)
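To make the "engine has to be coded for it" point concrete, here is a hedged sketch of the kind of per-GPU gating an engine might do before enabling its async-compute path. The vendor IDs are the standard PCI IDs; the policy itself is purely illustrative, not how any particular engine or driver actually decides this.

```cpp
// Illustrative per-GPU gating of an engine's async-compute path.
// The decision policy here is hypothetical; only the DXGI calls are real.
#include <dxgi.h>

bool ShouldUseAsyncCompute(IDXGIAdapter1* adapter)
{
    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);

    constexpr UINT kVendorAMD    = 0x1002; // standard PCI vendor ID
    constexpr UINT kVendorNvidia = 0x10DE; // standard PCI vendor ID

    if (desc.VendorId == kVendorAMD)
        return true;   // GCN: dedicated compute queues are generally a win
    if (desc.VendorId == kVendorNvidia)
        return false;  // conservative default until the architecture is known
                       // to handle concurrent compute well (e.g. a whitelist)
    return false;      // unknown vendor: stay on the single-queue path
}
```

When the check returns false, the same compute dispatches would simply be recorded inline on the graphics command list and run serialized with the rest of the frame, i.e. a separate render path per vendor, which is exactly the extra work being described above.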

 

 

Claiming Pascal can run async is no different from claiming FX CPUs can run games. Both statements are true. In both cases, however, the real answer is "but it is not very good at it".

 

 

Thing is, to quote myself from a different post:
 

Quote

Sorry to break the news to you.

Pascal can run async, just barely. I have spent the last two days on YouTube listening to Nvidia press conferences about async compute and how to make it work on Pascal. It is clear that even Pascal is VERY intolerant of shoddy coding. One fuck-up from developers and even Pascal will shit itself with async workloads. However, there are mechanisms in the Nvidia drivers to help prevent such an event from happening.

 

Async compute and async shading are the bane of Nvidia atm. They struggle with those features, and they struggle REALLY BADLY.

 

Async compute and shading have been used in console games since 2013. It is a feature many devs are comfortable using, and with DX12 and Vulkan they can do so on PC too. It is stupid to think that a feature available to developers for three years on the console side, which they have used extensively to push console performance, isn't going to be leveraged on PC. Of course it is. Every dev wants to make a beautiful game.

 

However, as it stands, Nvidia isn't prepared for this. Not one fucking bit. I think they seriously underestimated how popular this feature would become.

AMD is really jabbing them in the side with it too. Their marketing team has found a weakness in Nvidia's hardware and is exploiting the shit out of it.

 

Just like Nvidia has been "abusing" tessellation to play to Maxwell's strengths and GCN's weaknesses, expect every AMD title going forward to abuse the shit out of async compute and shading, because GCN rocks that shit and Nvidia's current-gen hardware struggles really badly.

 

 

Just you wait. More and more games will use async, and in more and more cases we will see large performance gaps between AMD and Nvidia either closed or widened because of it.


10 minutes ago, patrickjp93 said:

The universe is a funny thing. AMD may have found the electrically, thermally expensive way to keep its shaders fed, but now it has to find more thermal headroom to add SMP if it wants to keep up in VR.

 

This is why I harp on software developers. You cannot blame the hardware makers for lousy performance gains if you don't utilize their hardware to the fullest. I'm going to avoid my usual Intel performance gains rant, but suffice it to say it's not Intel's fault the other side won't pull its weight.

Dunno how versed you are in LiquidVR, but AMD has already found a way to do this sort of warp for VR specifically, although NOT for standard monitors, which is what SMP currently does better.

I am curious to see how AMD is going to add its version for monitors.

What AMD really needs to do is optimize the fuck out of its rasterizers. They leave a lot to be desired.


It's not true async compute on a hardware level.

\\ QUIET AUDIO WORKSTATION //

5960X 3.7GHz @ 0.983V / ASUS X99-A USB3.1      

32 GB G.Skill Ripjaws 4 & 2667MHz @ 1.2V

AMD R9 Fury X

256GB SM961 + 1TB Samsung 850 Evo  

Cooler Master Silencio 652S (soon Calyos NSG S0 ^^)              

Noctua NH-D15 / 3x NF-S12A                 

Seasonic PRIME Titanium 750W        

Logitech G810 Orion Spectrum / Logitech G900

2x Samsung S24E650BW 16:10  / Adam A7X / Fractal Axe Fx 2 Mark I

Windows 7 Ultimate

 

4K GAMING/EMULATION RIG

Xeon X5670 4.2Ghz (200BCLK) @ ~1.38V / Asus P6X58D Premium

12GB Corsair Vengeance 1600Mhz

Gainward GTX 1080 Golden Sample

Intel 535 Series 240 GB + San Disk SSD Plus 512GB

Corsair Crystal 570X

Noctua NH-S12 

Be Quiet Dark Rock 11 650W

Logitech K830

Xbox One Wireless Controller

Logitech Z623 Speakers/Subwoofer

Windows 10 Pro


7 minutes ago, Prysin said:

Dunno how versed you are in LiquidVR, but AMD has already found a way to do this sort of warp for VR specifically, although NOT for standard monitors, which is what SMP currently does better.

I am curious to see how AMD is going to add its version for monitors.

What AMD really needs to do is optimize the fuck out of its rasterizers. They leave a lot to be desired.

The frame latency under LiquidVR still leaves GameworksVR in the lead by a significant margin.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


52 minutes ago, patrickjp93 said:

Probably not, because Nvidia already keeps its shaders much better fed; there's less room to grow. That said, in VR Nvidia has the advantage because of Simultaneous Multi-Projection. AMD cannot do single-pass rendering, and that is going to be a very painful, inescapable fact for the next two years or more.

We still haven't seen Vega; they can still change a few things.

He who asks is stupid for 5 minutes. He who does not ask, remains stupid. -Chinese proverb. 

Those who know much are aware that they know little. - Slick roasting me


AXIOM

CPU- Intel i5-6500 GPU- EVGA 1060 6GB Motherboard- Gigabyte GA-H170-D3H RAM- 8GB HyperX DDR4-2133 PSU- EVGA GQ 650w HDD- OEM 750GB Seagate Case- NZXT S340 Mouse- Logitech Gaming g402 Keyboard-  Azio MGK1 Headset- HyperX Cloud Core

Official first poster LTT V2.0

 


This topic is now closed to further replies.
