
Vega 10 to have greater TFLOPS than Titan X (P) and general info

RoboGuy
On 10/9/2016 at 10:09 AM, DXMember said:

that's theoretical peak performance

 

R9 290X and GTX 980ti both have the same theoretical peak TFLOPS performance

But they do perform similarly... 9_9xD #biastesting #1samplepool

[attached benchmark screenshot: 2016-10-10_12-03-08.png]
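For anyone wondering where these "theoretical peak" figures actually come from, it's just shader count × clock × 2 (each shader can do one fused multiply-add, i.e. two FP32 operations, per clock). A rough sketch using the reference base clocks — real boost clocks vary, so treat the numbers as approximate:

def peak_tflops(shaders, clock_mhz, flops_per_clock=2):
    # shaders * clock (in Hz) * FLOPs per shader per clock, expressed in TFLOPS
    return shaders * clock_mhz * 1e6 * flops_per_clock / 1e12

# Both cards happen to carry 2816 shaders; clocks below are the reference base clocks.
print(peak_tflops(2816, 1000))  # R9 290X           -> ~5.6 TFLOPS
print(peak_tflops(2816, 1000))  # GTX 980 Ti (base) -> ~5.6 TFLOPS; boost clocks push it past 6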


12 hours ago, Prysin said:

Well Patrick, one in seven titles that have been benchmarked with a DX11 and DX12 comparison. That is surely a very good statistic, is it not? So that is 14.3% of games seeing a minor boost, which leaves 85.7% of games seeing no gains or a loss. Those are not uplifting statistics, are they? Or are you going to argue that because the losses are so small they are totally insignificant and hold no true meaning in any shape or form? And that if devs weren't so shit at coding, Nvidia would see more gains?

 

AC/S isn't all there is for perf gains, true. But it is part of it, and it can give you large gains if implemented right.

Sure, if you compare optimal pipeline saturation with optimal drivers and optimal code, AC/S will only add some 7-10% more performance over not having it (a statement made by several developers). However, 7-10% is nearly a full GPU tier up... that is nothing to scoff at.

It's very early in the DX12 life cycle and the implementations we've seen so far are wrappers around DX11, so really it's a statistic with very little weight no matter whose stance you take.

 

Those devs were also speaking solely from the perspective of AMD hardware. If the drivers were better, that uplift may not exist at all.


2 hours ago, patrickjp93 said:

It's very early in the DX12 life cycle and the implementations we've seen so far are wrappers around DX11, so really it's a statistic with very little weight no matter whose stance you take.

Those devs were also speaking solely from the perspective of AMD hardware. If the drivers were better, that uplift may not exist at all.

The uplift wouldn't be as large, but it would most likely still be there, although closer to the 7-10% boost range that game devs keep talking about rather than the 20-30% we see now. I think the truth is that there is a bit too much overhead in Nvidia's approach to see much gain. 7-10% overhead is NOT a lot, but it may just be enough to cancel out whatever gains you would normally see.

The few times we DO see Nvidia benefit from AC/S it is generally in the 3-5% range, supporting the notion that driver-based context switching, while working as intended, does have some overhead due to the nature of the solution. I do not think that overhead can ever be eliminated entirely; they can maybe push it down to 2-3% at best, but it will always be there. Hardware-based solutions DO provide better performance, but at the added cost of efficiency, since the extra hardware constantly has to be powered up.

In the end it is a discussion of which is worse, loss of efficiency or loss of performance, and which merits the most priority during the design process...
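To put rough numbers on that trade-off, here is a purely illustrative back-of-the-envelope model; the 10% async gain and the overhead percentages are just the figures being thrown around in this thread, not measurements:

def net_gain(async_gain=0.10, scheduler_overhead=0.0):
    # Frame time shrinks by the async gain, but a software scheduler
    # adds its own fixed cost on top of every frame.
    baseline = 1.0
    with_async = baseline * (1.0 - async_gain) * (1.0 + scheduler_overhead)
    return round((baseline / with_async - 1.0) * 100, 1)  # net FPS change in percent

print(net_gain(0.10, 0.00))  # hardware scheduling, no extra cost: ~ +11%
print(net_gain(0.10, 0.07))  # 7% software overhead:               ~ +4%
print(net_gain(0.10, 0.10))  # 10% overhead:                       ~ +1%, basically a wash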


39 minutes ago, Prysin said:

The uplift wouldn't be as large, but it would most likely still be there, although closer to the 7-10% boost range that game devs keep talking about rather than the 20-30% we see now. I think the truth is that there is a bit too much overhead in Nvidia's approach to see much gain. 7-10% overhead is NOT a lot, but it may just be enough to cancel out whatever gains you would normally see.

The few times we DO see Nvidia benefit from AC/S it is generally in the 3-5% range, supporting the notion that driver-based context switching, while working as intended, does have some overhead due to the nature of the solution. I do not think that overhead can ever be eliminated entirely; they can maybe push it down to 2-3% at best, but it will always be there. Hardware-based solutions DO provide better performance, but at the added cost of efficiency, since the extra hardware constantly has to be powered up.

In the end it is a discussion of which is worse, loss of efficiency or loss of performance, and which merits the most priority during the design process...

What overhead? Nvidia has less CPU overhead and better GPU utilization than AMD, and has consistently better performance despite having fewer TFLOPS on paper.

 

That 3-5% doesn't suggest that by default. There's also the possibility Nvidia just hasn't optimized the driver for that game enough yet. 

 

If hardware solutions were always better, Nvidia wouldn't still have the performance crown, and Intel's AVX-based video encoders and decoders wouldn't be beating dGPUs at the task.

 

Efficiency, always. HPC represents far more money than PC to Nvidia. It makes perfect sense, and yet Nvidia still has the better performance.


1 hour ago, patrickjp93 said:

What overhead? Nvidia has less CPU overhead and better GPU utilization than AMD, and has consistently better performance despite having fewer TFLOPS on paper.

 

That 3-5% doesn't suggest that by default. There's also the possibility Nvidia just hasn't optimized the driver for that game enough yet. 

 

If hardware solutions were always better, Nvidia wouldn't still have the performance crown, and Intel's AVX-based video encoders and decoders wouldn't be beating dGPUs at the task.

 

Efficiency, always. HPC represents far more money than PC to Nvidia. It makes perfect sense, and yet Nvidia still has the better performance.

Patrick, stop being unreasonably pedantic. We both know that REGARDLESS of optimization, any driver that has to be inserted between the hardware and the API/game engine WILL INTRODUCE OVERHEAD. Why? Because it's an added step. How much overhead? I don't know.

 

Jen-Hsun Huang and others from the Nvidia camp have explained time and time again that their AC/S works by using a software layer, the driver, to schedule tasks through preemption (prone to misses) or through flags (prone to game-dev errors). They have also EXPLICITLY STATED that to get the most performance out of THEIR AC/S solution, you MUST HAVE A FULLY COMPATIBLE AND OPTIMIZED GAME ENGINE IN ADDITION TO THEIR DRIVERS. Nvidia cannot optimize much further on their own; it would have to be up to the game devs.

 

Nvidia wouldn't release a half-arsed driver for ALL the "DX11.3" implementations we have seen so far. Stop deluding yourself with the idea that Nvidia, given all the bad press they are getting over AC/S performance, wouldn't do their damnedest to make Pascal look good. They are hitting their limits; only game patches will help them improve further. Mind you, the only two "real" DX12 titles around are Ashes and Time Spy, and they use different approaches to getting AC/S to work.

 

Time Spy is how you optimize for Pascal (and Maxwell, technically). By using CONCURRENT rather than ASYNCHRONOUS submission, you get the best out of Nvidia's system. Why? Because through a combination of flags and easily recognizable workloads they can reduce the overhead to near zero. There will always be overhead in any software layer that has to be inserted between hardware and code. The code MAY increase performance through optimizations, but it will ALWAYS incur some overhead elsewhere while doing so.

One can argue that you could work around this overhead, but to do so you would have to hold back the software's potential through deliberate input/output latencies. Something that would NEVER fly.

 

Efficiency is the priority for Nvidia; this is why they threw out their hardware schedulers after Fermi. They wanted EFFICIENCY.

It comes at the cost of performance and flexibility. Apparently, Nvidia decided that the loss of performance in an extremely specific use case, and the loss of flexibility, would not be a problem. WHY?

Because for HPC, the workloads are highly customized for the hardware, so by guiding developers with some simple do's and don'ts they can avoid the whole issue.

 

As for AVX, it comes down to efficiency and instructions. dGPU instructions are not optimized for semi-parallel workloads; some parts of decode/encode are less suited to parallelism than others, depending on the codec. Then there is efficiency again: AMD and Nvidia both use ASICs or dedicated blocks for encode/decode. They no longer fire up the entire GPU for these workloads. Only a very few applications are allowed to do that; for the most part they use much slower, much less power-hungry dedicated blocks or ASICs.
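If you want to see the dedicated-block point in practice, the usual way is to pick the encoder explicitly in ffmpeg: libx264 runs on the CPU (where AVX/AVX2 does the heavy lifting), while h264_nvenc targets Nvidia's fixed-function NVENC block and leaves the shader array mostly idle. A minimal sketch, assuming an ffmpeg build with NVENC support and a hypothetical input.mp4:

import subprocess

# Software encode on the CPU; x264 leans heavily on AVX/AVX2 on modern chips.
subprocess.run(["ffmpeg", "-i", "input.mp4", "-c:v", "libx264",
                "-preset", "medium", "cpu_out.mp4"], check=True)

# Dedicated-block encode; NVENC is a separate ASIC on the GPU die, so the
# shader array (and most of the GPU's power budget) stays out of the picture.
subprocess.run(["ffmpeg", "-i", "input.mp4", "-c:v", "h264_nvenc",
                "nvenc_out.mp4"], check=True)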

 

 


1 minute ago, Prysin said:

Patrick, stop being unreasonably pedantic. We both know that REGARDLESS of optimization, any driver that has to be inserted between the hardware and the API/game engine WILL INTRODUCE OVERHEAD. Why? Because it's an added step. How much overhead? I don't know.

Jen-Hsun Huang and others from the Nvidia camp have explained time and time again that their AC/S works by using a software layer, the driver, to schedule tasks through preemption (prone to misses) or through flags (prone to game-dev errors). They have also EXPLICITLY STATED that to get the most performance out of THEIR AC/S solution, you MUST HAVE A FULLY COMPATIBLE AND OPTIMIZED GAME ENGINE IN ADDITION TO THEIR DRIVERS. Nvidia cannot optimize much further on their own; it would have to be up to the game devs.

Nvidia wouldn't release a half-arsed driver for ALL the "DX11.3" implementations we have seen so far. Stop deluding yourself with the idea that Nvidia, given all the bad press they are getting over AC/S performance, wouldn't do their damnedest to make Pascal look good. They are hitting their limits; only game patches will help them improve further. Mind you, the only two "real" DX12 titles around are Ashes and Time Spy, and they use different approaches to getting AC/S to work.

Time Spy is how you optimize for Pascal (and Maxwell, technically). By using CONCURRENT rather than ASYNCHRONOUS submission, you get the best out of Nvidia's system. Why? Because through a combination of flags and easily recognizable workloads they can reduce the overhead to near zero. There will always be overhead in any software layer that has to be inserted between hardware and code. The code MAY increase performance through optimizations, but it will ALWAYS incur some overhead elsewhere while doing so.

One can argue that you could work around this overhead, but to do so you would have to hold back the software's potential through deliberate input/output latencies. Something that would NEVER fly.

Efficiency is the priority for Nvidia; this is why they threw out their hardware schedulers after Fermi. They wanted EFFICIENCY.

It comes at the cost of performance and flexibility. Apparently, Nvidia decided that the loss of performance in an extremely specific use case, and the loss of flexibility, would not be a problem. WHY?

Because for HPC, the workloads are highly customized for the hardware, so by guiding developers with some simple do's and don'ts they can avoid the whole issue.

As for AVX, it comes down to efficiency and instructions. dGPU instructions are not optimized for semi-parallel workloads; some parts of decode/encode are less suited to parallelism than others, depending on the codec. Then there is efficiency again: AMD and Nvidia both use ASICs or dedicated blocks for encode/decode. They no longer fire up the entire GPU for these workloads. Only a very few applications are allowed to do that; for the most part they use much slower, much less power-hungry dedicated blocks or ASICs.

 

 

It's not prone to misses, and you seriously need to look up the definitions of asynchrony and concurrency, because you can't have one without the other.
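For what it's worth, here is a minimal CPU-side sketch of how the two terms are usually separated: asynchrony is about submitting work and collecting the result later, concurrency is about more than one piece of work actually being in flight at the same time. Whether the hardware ends up overlapping the work is a separate question from how it was submitted:

from concurrent.futures import ThreadPoolExecutor
import time

def fake_workload(name, seconds):
    # Stand-in for a graphics or compute job.
    time.sleep(seconds)
    return name

# Asynchronous submission: submit() returns a Future immediately and the caller keeps going.
# With two workers the jobs also run concurrently; with max_workers=1 the submission is
# still asynchronous, but the jobs execute back to back instead of overlapping.
with ThreadPoolExecutor(max_workers=2) as pool:
    gfx = pool.submit(fake_workload, "graphics", 0.5)
    cmp_job = pool.submit(fake_workload, "compute", 0.5)
    # ...the caller could do other work here...
    print(gfx.result(), cmp_job.result())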

 

Ashes is not the real DX 12 title. Hitman is.


7 minutes ago, Prysin said:

Efficiency is the priority for Nvidia; this is why they threw out their hardware schedulers after Fermi. They wanted EFFICIENCY.

It comes at the cost of performance and flexibility.

Despite your belief that I am pro-Nvidia, I promise you I am not. In the segments where they compete, AMD is currently the superior price/performance option, and I truly hope they put themselves in a position to genuinely compete with Nvidia.

 

With that being said.. where the hell are you getting this? It hasn't come at the cost of anything... Look just slightly above at the performance chart... Even in AotS..

The 980 Ti and the 290X have almost the same raw compute power, the 980 Ti has a lower TDP, and the performance of the 980 Ti is MASSIVELY higher in DX11 and still higher in DX12.

 

You're looking at it entirely the wrong way... Nvidia has the clearly superior solution, and AMD has sacrificed efficiency in order to compete with Nvidia on performance. This is why the GTX 1060 offers more gaming performance per TFLOP than the RX 480 AND uses less power. Nvidia's technology is clearly superior...

However, all that matters to the consumer is the price at which that performance is offered... and in that department AMD is currently winning.
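To put a number on the performance-per-TFLOP point: the TFLOPS figures below are the usual reference-boost values (1280 cores at roughly 1.7 GHz for the 1060, 2304 shaders at roughly 1.27 GHz for the 480), and the relative performance is just a placeholder you would swap for whatever benchmark average you trust:

def perf_per_tflop(relative_perf, tflops):
    # relative_perf: average gaming performance from whatever benchmark suite
    # you trust, normalized so that one of the cards = 100.
    return relative_perf / tflops

gtx_1060 = perf_per_tflop(100, 4.4)  # ~4.4 TFLOPS at reference boost
rx_480   = perf_per_tflop(100, 5.8)  # ~5.8 TFLOPS at reference boost

# Even if both cards landed on exactly the same average FPS (100 vs 100 here, purely
# as a placeholder), the 1060 would still be extracting ~30% more gaming performance
# out of every theoretical TFLOP.
print(round(gtx_1060 / rx_480, 2))  # ~1.32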


5 hours ago, -BirdiE- said:

Despite your belief that I am pro-Nvidia, I promise you I am not. In the segments where they compete, AMD is currently the superior price/performance option, and I truly hope they put themselves in a position to genuinely compete with Nvidia.

With that being said.. where the hell are you getting this? It hasn't come at the cost of anything... Look just slightly above at the performance chart... Even in AotS..

The 980 Ti and the 290X have almost the same raw compute power, the 980 Ti has a lower TDP, and the performance of the 980 Ti is MASSIVELY higher in DX11 and still higher in DX12.

You're looking at it entirely the wrong way... Nvidia has the clearly superior solution, and AMD has sacrificed efficiency in order to compete with Nvidia on performance. This is why the GTX 1060 offers more gaming performance per TFLOP than the RX 480 AND uses less power. Nvidia's technology is clearly superior...

However, all that matters to the consumer is the price at which that performance is offered... and in that department AMD is currently winning.

*sigh* I have typed up a lengthy reply to you and Patrick four times today, only to have my internet drop, Firefox crash, or Windows Update troll me right when I tried to post.

 

So instead of retyping the wall of text, I will kindly ask you to read up on Pascal, Maxwell, Hawaii and Fiji in depth, down to how they function at the pipeline level. PCPer, The Tech Report, RealWorldTech, Tom's Hardware, KitGuru and a few others have a great deal of info that will show you are partially right, but mostly wrong, for reasons I cannot be bothered to try typing out AGAIN.

 

@patrickjp93

Ashes has both DX11 and DX12 pipelines; its DX12 pipeline was rebuilt from the original Mantle pipeline. Look up "Stardock Entertainment talks about Mantle" on YouTube.


3 hours ago, Prysin said:

*sigh* I have typed up a lengthy reply to you and Patrick four times today, only to have my internet drop, Firefox crash, or Windows Update troll me right when I tried to post.

Haha. That's rough.

It's alright though, because I'm way too exhausted to have read it anyway.

 

3 hours ago, Prysin said:

So instead of retyping the wall of text, I will kindly ask you to read up on Pascal, Maxwell, Hawaii and Fiji in depth, down to how they function at the pipeline level. PCPer, The Tech Report, RealWorldTech, Tom's Hardware, KitGuru and a few others have a great deal of info that will show you are partially right, but mostly wrong, for reasons I cannot be bothered to try typing out AGAIN.

I will do some reading after much napping.


On 10/9/2016 at 5:19 AM, ivan134 said:

TFLOPS is a useless measure when comparing Nvidia cards vs AMD

I completely agree. It's just like comparing the RPM of different car engines...

 

On 10/9/2016 at 5:28 AM, ivan134 said:

Apparently Navi has been postponed, and upgraded Polaris/Vega is next

K then. I'm still waiting for something in between the 480 and the Fury X v2... I don't want to pay $200 for a GPU, but I don't want to pay $600 either.


2 hours ago, AluminiumTech said:

K then. I'm still waiting for something in between the 480 and the Fury X v2... I don't want to pay $200 for a GPU, but I don't want to pay $600 either.

Hopefully the RX 490 is the graphics card we are looking for


I hope Vega is really good!


Okay, but doesn't the RX 480 have 5.8 TFLOPS and the GTX 970 4.8 TFLOPS, yet they're the exact same performance? Even when both are overclocked?


7 minutes ago, Zeeee said:

Okay, but doesn't the RX 480 have 5.8 TFLOPS and the GTX 970 4.8 TFLOPS, yet they're the exact same performance? Even when both are overclocked?

Not under DX12. Under DX11, yes, due to AMD's shitty DX11 drivers.

 

RotTR is an Nvidia title btw, so there is no excuse for AMD doing better there. The async compute used in it is tailored to Nvidia's strengths.

[attached benchmark chart: RX-480-ABC-85.jpg]


AMD is so late getting high-end GPUs to market; let's at least hope they prove to be as good as Nvidia this time round and pull an R9 290 again.


On 10.10.2016 at 1:06 AM, Belgarathian said:

But they do perform similarly... 9_9xD #biastesting #1samplepool

[attached benchmark screenshot: 2016-10-10_12-03-08.png]

That is exactly what @Prysin said on the previous page. If it's implemented correctly, those GPUs perform the same (AotS DX12), so their TFLOPS figures seem to be accurate. Of course it doesn't translate like that in the real world, but the 290X still has plenty of potential under the hood, especially for a 2013 card.

