
Vega 10 to have greater TFLOPS than Titan X (P) and general info

RoboGuy
On 10/9/2016 at 10:09 AM, DXMember said:

that's theoretical peak performance

 

R9 290X and GTX 980ti both have the same theoretical peak TFLOPS performance

But they do perform similarly... 9_9xD #biastesting #1samplepool

[attached benchmark screenshot: 2016-10-10_12-03-08.png]
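For anyone wondering where these "theoretical peak" figures actually come from, it's just shader count × clock × 2 (each shader can do one fused multiply-add, i.e. two FP32 operations, per clock). A rough sketch using the reference base clocks — real boost clocks vary, so treat the numbers as approximate:

def peak_tflops(shaders, clock_mhz, flops_per_clock=2):
    # shaders * clock (in Hz) * FLOPs per shader per clock, expressed in TFLOPS
    return shaders * clock_mhz * 1e6 * flops_per_clock / 1e12

# Both cards happen to carry 2816 shaders; clocks below are the reference base clocks.
print(peak_tflops(2816, 1000))  # R9 290X           -> ~5.6 TFLOPS
print(peak_tflops(2816, 1000))  # GTX 980 Ti (base) -> ~5.6 TFLOPS; boost clocks push it past 6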


12 hours ago, Prysin said:

Well Patrick, one in seven titles that have been benchmarked with a DX11 and DX12 comparison. That is surely a very good statistic, is it not? So that is 14.3% of games seeing a minor boost, which leaves 85.7% of games seeing no gains or a loss. Those are not uplifting statistics, are they? Or are you going to argue that because the losses are so small they are totally insignificant and hold no true meaning in any shape or form? And that if devs weren't so shit at coding, Nvidia would see more gains?

 

AC/S isn't all there is for perf gains, true. But it is part of it, and it can give you large gains if implemented right.

Sure, if you compare optimal pipeline saturation with optimal drivers and optimal code, AC/S will only add some 7-10% more performance over not having it (a statement made by several developers). However, 7-10% is nearly a full GPU tier up... that is nothing to scoff at.

It's very early in the DX12 life cycle and the implementations we've seen so far are wrappers around DX11, so really it's a statistic with very little weight no matter whose stance you take.

 

Those devs were also speaking solely from the perspective of AMD hardware. If the drivers were better, that uplift may not exist at all.


2 hours ago, patrickjp93 said:

It's very early in the DX12 life cycle and the implementations we've seen so far are wrappers around DX11, so really it's a statistic with very little weight no matter whose stance you take.

Those devs were also speaking solely from the perspective of AMD hardware. If the drivers were better, that uplift may not exist at all.

The uplift wouldn't be as large, but it would most likely still be there, although closer to the 7-10% boost range that game devs keep talking about rather than the 20-30% we see now. I think the truth is that there is a bit too much overhead in Nvidia's approach to see much gain. 7-10% overhead is NOT a lot, but it may just be enough to cancel out whatever gains you would normally see.

The few times we DO see Nvidia benefit from AC/S it is generally in the 3-5% range, supporting the notion that driver-based context switching, while working as intended, does have some overhead due to the nature of the solution. I do not think that overhead can ever be eliminated entirely; they can maybe push it down to 2-3% at best, but it will always be there. Hardware-based solutions DO provide better performance, but at the added cost of efficiency, since the extra hardware constantly has to be powered up.

In the end it is a discussion of which is worse, loss of efficiency or loss of performance, and which merits the most priority during the design process...
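To put rough numbers on that trade-off, here is a purely illustrative back-of-the-envelope model; the 10% async gain and the overhead percentages are just the figures being thrown around in this thread, not measurements:

def net_gain(async_gain=0.10, scheduler_overhead=0.0):
    # Frame time shrinks by the async gain, but a software scheduler
    # adds its own fixed cost on top of every frame.
    baseline = 1.0
    with_async = baseline * (1.0 - async_gain) * (1.0 + scheduler_overhead)
    return round((baseline / with_async - 1.0) * 100, 1)  # net FPS change in percent

print(net_gain(0.10, 0.00))  # hardware scheduling, no extra cost: ~ +11%
print(net_gain(0.10, 0.07))  # 7% software overhead:               ~ +4%
print(net_gain(0.10, 0.10))  # 10% overhead:                       ~ +1%, basically a wash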


39 minutes ago, Prysin said:

The uplift wouldn't be as large, but it would most likely still be there, although closer to the 7-10% boost range that game devs keep talking about rather than the 20-30% we see now. I think the truth is that there is a bit too much overhead in Nvidia's approach to see much gain. 7-10% overhead is NOT a lot, but it may just be enough to cancel out whatever gains you would normally see.

The few times we DO see Nvidia benefit from AC/S it is generally in the 3-5% range, supporting the notion that driver-based context switching, while working as intended, does have some overhead due to the nature of the solution. I do not think that overhead can ever be eliminated entirely; they can maybe push it down to 2-3% at best, but it will always be there. Hardware-based solutions DO provide better performance, but at the added cost of efficiency, since the extra hardware constantly has to be powered up.

In the end it is a discussion of which is worse, loss of efficiency or loss of performance, and which merits the most priority during the design process...

What overhead? Nvidia has less CPU overhead and better GPU utilization than AMD, and has consistently better performance despite having fewer TFLOPS on paper.

 

That 3-5% doesn't suggest that by default. There's also the possibility Nvidia just hasn't optimized the driver for that game enough yet. 

 

If hardware solutions were always better, Nvidia wouldn't still have the performance crown, and Intel's AVX-based video encoders and decoders wouldn't be beating dGPUs at the task.

 

Efficiency, always. HPC represents far more money than PC to Nvidia. It makes perfect sense, and yet Nvidia still has the better performance.


1 hour ago, patrickjp93 said:

What overhead? Nvidia has less CPU overhead and better GPU utilization than AMD, and has consistently better performance despite having fewer TFLOPS on paper.

 

That 3-5% doesn't suggest that by default. There's also the possibility Nvidia just hasn't optimized the driver for that game enough yet. 

 

If hardware solutions were always better, Nvidia wouldn't still have the performance crown, and Intel's AVX-based video encoders and decoders wouldn't be beating dGPUs at the task.

 

Efficiency, always. HPC represents far more money than PC to Nvidia. It makes perfect sense, and yet Nvidia still has the better performance.

Patrick, stop being unreasonably pedantic. We both know that REGARDLESS of optimization, any driver that has to be inserted between the hardware and the API/game engine WILL INTRODUCE OVERHEAD. Why? Because it's an added step. How much overhead? I don't know.

 

Jen-Hsun Huang and others from the Nvidia camp have explained time and time again that their AC/S works by using a software layer, the driver, to schedule tasks through preemption (prone to misses) or through flags (prone to game-dev errors). They have also EXPLICITLY STATED that to get the most performance out of THEIR AC/S solution, you MUST HAVE A FULLY COMPATIBLE AND OPTIMIZED GAME ENGINE IN ADDITION TO THEIR DRIVERS. Nvidia cannot optimize much further on their own; it would have to be up to the game devs.

 

Nvidia wouldn't release a half-arsed driver for ALL the "DX11.3" implementations we have seen so far. Stop deluding yourself with the idea that Nvidia, given all the bad press they are getting over AC/S performance, wouldn't do their damnedest to make Pascal look good. They are hitting their limits; only game patches will help them improve further. Mind you, the only two "real" DX12 titles around are Ashes and Time Spy, and they use different approaches to getting AC/S to work.

 

Time Spy is how you optimize for Pascal (and Maxwell, technically). By using CONCURRENT rather than ASYNCHRONOUS submission, you get the best out of Nvidia's system. Why? Because through a combination of flags and easily recognizable workloads they can reduce the overhead to near zero. There will always be overhead in any software layer that has to be inserted between hardware and code. The code MAY increase performance through optimizations, but it will ALWAYS incur some overhead elsewhere while doing so.

One can argue that you could work around this overhead, but to do so you would have to hold back the software's potential through deliberate input/output latencies. Something that would NEVER fly.

 

Efficiency is the priority for Nvidia; this is why they threw out their hardware schedulers after Fermi. They wanted EFFICIENCY.

It comes at the cost of performance and flexibility. Apparently, Nvidia decided that the loss of performance in an extremely specific use case, and the loss of flexibility, would not be a problem. WHY?

Because for HPC, the workloads are highly customized for the hardware, so by guiding developers with some simple do's and don'ts they can avoid the whole issue.

 

As for AVX, it comes down to efficiency and instructions. dGPU instructions are not optimized for semi-parallel workloads; some parts of decode/encode are less suited to parallelism than others, depending on the codec. Then there is efficiency again: AMD and Nvidia both use ASICs or dedicated blocks for encode/decode. They no longer fire up the entire GPU for these workloads. Only a very few applications are allowed to do that; for the most part they use much slower, much less power-hungry dedicated blocks or ASICs.
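If you want to see the dedicated-block point in practice, the usual way is to pick the encoder explicitly in ffmpeg: libx264 runs on the CPU (where AVX/AVX2 does the heavy lifting), while h264_nvenc targets Nvidia's fixed-function NVENC block and leaves the shader array mostly idle. A minimal sketch, assuming an ffmpeg build with NVENC support and a hypothetical input.mp4:

import subprocess

# Software encode on the CPU; x264 leans heavily on AVX/AVX2 on modern chips.
subprocess.run(["ffmpeg", "-i", "input.mp4", "-c:v", "libx264",
                "-preset", "medium", "cpu_out.mp4"], check=True)

# Dedicated-block encode; NVENC is a separate ASIC on the GPU die, so the
# shader array (and most of the GPU's power budget) stays out of the picture.
subprocess.run(["ffmpeg", "-i", "input.mp4", "-c:v", "h264_nvenc",
                "nvenc_out.mp4"], check=True)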

 

 


1 minute ago, Prysin said:

Patrick, stop being unreasonably pedantic. We both know that REGARDLESS of optimization, any driver that has to be inserted between the hardware and the API/game engine WILL INTRODUCE OVERHEAD. Why? Because it's an added step. How much overhead? I don't know.

Jen-Hsun Huang and others from the Nvidia camp have explained time and time again that their AC/S works by using a software layer, the driver, to schedule tasks through preemption (prone to misses) or through flags (prone to game-dev errors). They have also EXPLICITLY STATED that to get the most performance out of THEIR AC/S solution, you MUST HAVE A FULLY COMPATIBLE AND OPTIMIZED GAME ENGINE IN ADDITION TO THEIR DRIVERS. Nvidia cannot optimize much further on their own; it would have to be up to the game devs.

Nvidia wouldn't release a half-arsed driver for ALL the "DX11.3" implementations we have seen so far. Stop deluding yourself with the idea that Nvidia, given all the bad press they are getting over AC/S performance, wouldn't do their damnedest to make Pascal look good. They are hitting their limits; only game patches will help them improve further. Mind you, the only two "real" DX12 titles around are Ashes and Time Spy, and they use different approaches to getting AC/S to work.

Time Spy is how you optimize for Pascal (and Maxwell, technically). By using CONCURRENT rather than ASYNCHRONOUS submission, you get the best out of Nvidia's system. Why? Because through a combination of flags and easily recognizable workloads they can reduce the overhead to near zero. There will always be overhead in any software layer that has to be inserted between hardware and code. The code MAY increase performance through optimizations, but it will ALWAYS incur some overhead elsewhere while doing so.

One can argue that you could work around this overhead, but to do so you would have to hold back the software's potential through deliberate input/output latencies. Something that would NEVER fly.

Efficiency is the priority for Nvidia; this is why they threw out their hardware schedulers after Fermi. They wanted EFFICIENCY.

It comes at the cost of performance and flexibility. Apparently, Nvidia decided that the loss of performance in an extremely specific use case, and the loss of flexibility, would not be a problem. WHY?

Because for HPC, the workloads are highly customized for the hardware, so by guiding developers with some simple do's and don'ts they can avoid the whole issue.

As for AVX, it comes down to efficiency and instructions. dGPU instructions are not optimized for semi-parallel workloads; some parts of decode/encode are less suited to parallelism than others, depending on the codec. Then there is efficiency again: AMD and Nvidia both use ASICs or dedicated blocks for encode/decode. They no longer fire up the entire GPU for these workloads. Only a very few applications are allowed to do that; for the most part they use much slower, much less power-hungry dedicated blocks or ASICs.

 

 

It's not prone to misses, and you seriously need to look up the definitions of asynchrony and concurrency, because you can't have one without the other.
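For what it's worth, here is a minimal CPU-side sketch of how the two terms are usually separated: asynchrony is about submitting work and collecting the result later, concurrency is about more than one piece of work actually being in flight at the same time. Whether the hardware ends up overlapping the work is a separate question from how it was submitted:

from concurrent.futures import ThreadPoolExecutor
import time

def fake_workload(name, seconds):
    # Stand-in for a graphics or compute job.
    time.sleep(seconds)
    return name

# Asynchronous submission: submit() returns a Future immediately and the caller keeps going.
# With two workers the jobs also run concurrently; with max_workers=1 the submission is
# still asynchronous, but the jobs execute back to back instead of overlapping.
with ThreadPoolExecutor(max_workers=2) as pool:
    gfx = pool.submit(fake_workload, "graphics", 0.5)
    cmp_job = pool.submit(fake_workload, "compute", 0.5)
    # ...the caller could do other work here...
    print(gfx.result(), cmp_job.result())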

 

Ashes is not the real DX 12 title. Hitman is.


7 minutes ago, Prysin said:

Efficiency is the priority for Nvidia; this is why they threw out their hardware schedulers after Fermi. They wanted EFFICIENCY.

It comes at the cost of performance and flexibility.

Despite your belief that I am pro-Nvidia, I promise you I am not. In the segments where they compete, AMD is currently the superior price/performance option, and I truly hope they put themselves in a position to genuinely compete with Nvidia.

 

With that being said.. where the hell are you getting this? It hasn't come at the cost of anything... Look just slightly above at the performance chart... Even in AotS..

The 980 Ti and the 290X have almost the same raw compute power, the 980 Ti has a lower TDP, and the performance of the 980 Ti is MASSIVELY higher in DX11 and still higher in DX12.

 

You're looking at it entirely the wrong way... Nvidia has the clearly superior solution, and AMD has sacrificed efficiency in order to compete with Nvidia on performance. This is why the GTX 1060 offers more gaming performance per TFLOP than the RX 480 AND uses less power. Nvidia's technology is clearly superior...

However, all that matters to the consumer is the price at which that performance is offered... and in that department AMD is currently winning.
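To put a number on the performance-per-TFLOP point: the TFLOPS figures below are the usual reference-boost values (1280 cores at roughly 1.7 GHz for the 1060, 2304 shaders at roughly 1.27 GHz for the 480), and the relative performance is just a placeholder you would swap for whatever benchmark average you trust:

def perf_per_tflop(relative_perf, tflops):
    # relative_perf: average gaming performance from whatever benchmark suite
    # you trust, normalized so that one of the cards = 100.
    return relative_perf / tflops

gtx_1060 = perf_per_tflop(100, 4.4)  # ~4.4 TFLOPS at reference boost
rx_480   = perf_per_tflop(100, 5.8)  # ~5.8 TFLOPS at reference boost

# Even if both cards landed on exactly the same average FPS (100 vs 100 here, purely
# as a placeholder), the 1060 would still be extracting ~30% more gaming performance
# out of every theoretical TFLOP.
print(round(gtx_1060 / rx_480, 2))  # ~1.32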


5 hours ago, -BirdiE- said:

Despite your belief that I am pro-Nvidia, I promise you I am not. In the segments where they compete, AMD is currently the superior price/performance option, and I truly hope they put themselves in a position to genuinely compete with Nvidia.

With that being said.. where the hell are you getting this? It hasn't come at the cost of anything... Look just slightly above at the performance chart... Even in AotS..

The 980 Ti and the 290X have almost the same raw compute power, the 980 Ti has a lower TDP, and the performance of the 980 Ti is MASSIVELY higher in DX11 and still higher in DX12.

You're looking at it entirely the wrong way... Nvidia has the clearly superior solution, and AMD has sacrificed efficiency in order to compete with Nvidia on performance. This is why the GTX 1060 offers more gaming performance per TFLOP than the RX 480 AND uses less power. Nvidia's technology is clearly superior...

However, all that matters to the consumer is the price at which that performance is offered... and in that department AMD is currently winning.

*sigh* I have typed up a lengthy reply to you and Patrick four times today, only to have my internet drop, Firefox crash, or Windows Update troll me right when I tried to post.

 

So instead of retyping the wall of text, I will kindly ask you to read up on Pascal, Maxwell, Hawaii and Fiji in depth, down to how they function at the pipeline level. PCPer, The Tech Report, RealWorldTech, Tom's Hardware, KitGuru and a few others have a great deal of info that will show you are partially right, but mostly wrong, for reasons I cannot be bothered to try typing out AGAIN.

 

@patrickjp93

Ashes has both DX11 and DX12 pipelines; its DX12 pipeline was rebuilt from the original Mantle pipeline. Look up "Stardock Entertainment talks about Mantle" on YouTube.


3 hours ago, Prysin said:

*sigh* I have typed up a lengthy reply to you and Patrick four times today, only to have my internet drop, Firefox crash, or Windows Update troll me right when I tried to post.

Haha. That's rough.

It's alright though, because I'm way too exhausted to have read it anyway.

 

3 hours ago, Prysin said:

So instead of retyping the wall of text, I will kindly ask you to read up on Pascal, Maxwell, Hawaii and Fiji in depth, down to how they function at the pipeline level. PCPer, The Tech Report, RealWorldTech, Tom's Hardware, KitGuru and a few others have a great deal of info that will show you are partially right, but mostly wrong, for reasons I cannot be bothered to try typing out AGAIN.

I will do some reading after much napping.


On 10/9/2016 at 5:19 AM, ivan134 said:

TFLOPS is a useless measure when comparing Nvidia cards vs AMD

I completely agree. It's just like comparing the RPM of different car engines...

 

On 10/9/2016 at 5:28 AM, ivan134 said:

Apparently Navi has been postponed, and upgraded Polaris/Vega is next

K then. I'm still waiting for something in between the 480 and the Fury X v2... I don't want to pay $200 for a GPU, but I don't want to pay $600 either.


2 hours ago, AluminiumTech said:

K then. I'm still waiting for something in between the 480 and the Fury X v2... I don't want to pay $200 for a GPU, but I don't want to pay $600 either.

Hopefully the RX 490 is the graphics card we are looking for


I hope Vega is really good!


Okay, but doesn't the RX 480 have 5.8 TFLOPS and the GTX 970 4.8 TFLOPS, yet they're the exact same performance? Even when both are overclocked?


7 minutes ago, Zeeee said:

Okay, but doesn't the RX 480 have 5.8 TFLOPS and the GTX 970 4.8 TFLOPS, yet they're the exact same performance? Even when both are overclocked?

Not under DX12. Under DX11, yes, due to AMD's shitty DX11 drivers.

 

RotTR is an Nvidia title btw, so there is no excuse for AMD doing better there. The async compute used in it is tailored to Nvidia's strengths.

[attached benchmark chart: RX-480-ABC-85.jpg]


AMD is so late getting high-end GPUs to market; let's at least hope they prove to be as good as Nvidia this time round and pull an R9 290 again.


On 10.10.2016 at 1:06 AM, Belgarathian said:

But they do perform similarly... 9_9xD #biastesting #1samplepool

[attached benchmark screenshot: 2016-10-10_12-03-08.png]

That is exactly what @Prysin said on the previous page. If it's implemented correctly, those GPUs perform the same (AotS DX12), so their TFLOPS figures seem to be accurate. Of course it doesn't translate like that in the real world, but the 290X still has plenty of potential under the hood, especially for a 2013 card.

