
[AMD] Details on Fiji VR/395x2 surface and launch month

CoolaxGaming

1. I don't have all day, so this is just a quick comparison, but you will absolutely find the same results if you gather benchmark data from other websites.

 

2. Yes, of course 256-bit with compression isn't as good. That's the problem. Maybe a 256-bit bus without the compression overhead would have fewer issues, but we're talking about real products, not theoretical ones.

 

3. You see the same effect with the 980. It's a Maxwell issue that is present on the 970 and 980 but not the Titan X (which retains roughly 40% better performance than a 290X regardless of resolution). This is obviously because the 384-bit bus on the Titan X removes the bottleneck we see with the other two cards.

 

The fact that it's not present on the Titan X does support your argument that there is a point where it doesn't matter, but I think it's premature to say HBM won't make a difference going forward when we can clearly see Maxwell 256-bit cards being hamstrung by their memory bus.

 

I'm sure someone said the same thing about GDDR5 back in the day, or about SATA 6Gb/s. Available bandwidth always gets saturated eventually. It might not matter that AMD is using HBM for the 390X, but you'd better believe it will be the standard eventually.

1) Make comparisons based on valid data and premises. If you choose to waste your time on bad data, that's your issue.

2) According to these performance indices it is, and the numbers Nvidia quotes are best-case scenarios in which the entire frame buffer consists of highly compressible textures.

3) No, there's no issue with the 980. Has someone stopped following benchmark updates after driver releases? Also, your analysis is still highly flawed. It's much more about the raw core, ROP, and TMU counts than the bandwidth, because of reduced backpressure.

 

Again, Maxwell is far from hamstrung by the 256-bit bus at the core counts it has. With 50% more cores and resources, you can rebalance the load to eliminate backpressure, and then bandwidth can be a benefit, but there's a limit to the effect of that, which is why core overclocking still makes a much bigger difference.

 

Oh, it'll be the standard for Intel, AMD, and Nvidia before the close of 2016, but that has far more to do with competition in the scientific computing and HPC space.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


 


No, go look at some benchmarks.  The gap between the 290 or 290X and the 980 shrinks with resolution just like it does with the 970.

 

[benchmark chart image]

 

Don't believe me?  Go look at some recent benchmarks containing the two.

http://www.techspot.com/review/991-gta-5-pc-benchmarks/page4.html

http://www.gamersnexus.net/game-bench/1905-gta-v-pc-fps-benchmark-graphics-cards

 

The Titan X works as a control: it doesn't show the same performance degradation, maintaining the same relative lead of roughly 35-40% over the 290(X) regardless of resolution. Both the 980 and 970 drop significantly as resolution increases, the 980 going from around 20% faster than the 290X to less than 1% faster at 4K ultra settings. It's consistent across multiple games and across different benchmark websites.

 

It's also present with the 960.  

[GTX 960 reference review benchmark chart]

http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/68697-nvidia-gtx-960-reference-review-16.html

 

I'm just eyeballing, but it seems Maxwell loses roughly 7% relative performance with each bump up in resolution from 1080p to 1440p to 4K. Usually you see the opposite effect, where more powerful cards extend their lead over less powerful ones as the latter can't keep up (like you see between the 290X and the 7970).

4K // R5 3600 // RTX2080Ti
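For anyone who wants to run this kind of relative-scaling check themselves, here is a minimal sketch; the FPS values are illustrative placeholders that mirror the pattern described above, not the actual TechSpot or GamersNexus results:

```python
# Compute one card's relative lead over another at each resolution.
# These FPS numbers are made-up placeholders, not real review data.
fps = {
    "1080p": {"GTX 980": 72.0, "R9 290X": 60.0},  # ~20% lead (assumed)
    "1440p": {"GTX 980": 52.5, "R9 290X": 46.5},  # ~13% lead (assumed)
    "4K":    {"GTX 980": 30.2, "R9 290X": 30.0},  # <1% lead (assumed)
}

for res, cards in fps.items():
    lead = (cards["GTX 980"] / cards["R9 290X"] - 1) * 100
    print(f"{res}: GTX 980 leads R9 290X by {lead:.1f}%")
```

Plugging real review numbers into a table like this is all the "eyeballing" above amounts to, just made explicit.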



You think the much lower core count has anything to do with it? ;)  Seriously, who taught you people how to do technical analysis? There is backpressure in the pipeline at 1440p and 4K that wasn't present at lower resolutions. Without more raw resources, bandwidth won't matter. 2048 cores vs. 2816 and 2880, and you really think this isn't the more pressing issue?

 

The 7970 and 290X have very different pipeline designs, so that's not a valid comparison to make, and the intricacies go way beyond what I can write in a reasonably long post. And actually, if you look at current benchmarks instead of old ones, there's practically no relative performance loss moving between resolutions for the 980, though on the 970 you start getting cache misses and have to hit that last 0.5GB of memory.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


GPUs are intricate pipelines when it comes to graphics, even though in scientific computing only the cores themselves are used. There is a problem you don't often see discussed for GPUs, even though it's discussed extensively in CPU design: pipeline backpressure. Currently there are so many shaders per TMU and per ROP that even if the shaders are all fed, you can't make progress until the TMU or ROP finishes its task, which uses data from 128 shaders; all of those ROPs and TMUs then produce a final combined result before it gets sent through the pixel shaders to the screen. If the shader count were 96 per ROP instead of 128, and the TMU count were increased in proportion, you'd get a boost because the cores would stay fed and pass on more results. When that happens, increasing bandwidth can be helpful. Even now, every scene ever drawn has some subsets that don't produce backpressure; for those, increasing bandwidth helps and can raise fps, but not to the degree that increasing core clocks (which also drive the ROPs and TMUs) will.

 

For the 960 you're well below that 250GB/s threshold (a good estimate, even if not exact), and there are fewer cores and resources handling the same workloads, so depending on the driver's load balancing per frame, some extra bandwidth may be beneficial.

 

The 9GHz effective clock is a best-case scenario in which all the data is highly compressible texture. It's partly marketing, but there is truth to it: you can gain some extra effective capacity, or push more textures through, by using their color compression tech.

 

If you look at the HBM specs, each stack has a 1024-bit wide interface, and there are 4 stacks this time; simply multiply to get the 4096-bit total.

 

Amazing, I can't believe backpressure is still present in CPUs/GPUs... there's backpressure in forced induction systems, water hammer in pipes, etc. It's damaging to the system if not controlled; is the same true for processors?

 

I've only a faint understanding of what you explained :D. It seems it's still better to increase the core clock (which also increases pixels per clock, since you mentioned ROPs?) than the bandwidth if the bandwidth can't be saturated (no benefit?).

 

thanks for the info!
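The pipeline-bottleneck argument above can be put into a toy model (a sketch with made-up throughput numbers; a real GPU pipeline is far more complicated than a single min()):

```python
# Toy model: a pipeline's throughput is gated by its slowest stage.
# All rates are arbitrary illustrative units, not real GPU specs.
def frame_throughput(shader_rate, rop_tmu_rate, memory_rate):
    return min(shader_rate, rop_tmu_rate, memory_rate)

base    = frame_throughput(100, 80, 120)  # ROP/TMU-bound: 80
more_bw = frame_throughput(100, 80, 180)  # extra bandwidth alone: still 80
core_oc = frame_throughput(110, 88, 120)  # +10% core clock also lifts ROPs/TMUs: 88

print(base, more_bw, core_oc)  # 80 80 88
```

While the ROPs/TMUs are the limiting stage, raising memory bandwidth alone changes nothing, which is the claimed reason a core overclock (which also drives the ROPs and TMUs) pays off more.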



For CPUs, this is why we have buffers and redundant systems to catch these stalls and stop the TLBs from accidentally tossing instructions or data from the cache before they're used.

 

Backpressure will always exist for some workloads. No software will ever be so perfect as to avoid it (software of real worth anyway).

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd



Per usual, a bunch of made-up gibberish that has nothing to do with the post he's trying to counter. GPUs don't have "backpressure"; they aren't combustion engines. Nvidia bumped Maxwell's L2 cache from 1024KB to 2048KB, which cuts out a large chunk of the reliance on memory bandwidth and allowed them to cut back the interface width to save on cost.
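Whatever one makes of the tone, the cache claim itself is easy to put numbers on. Here is a toy sketch with assumed hit rates and request volumes, not measured Maxwell figures:

```python
# Toy model: a larger L2 cache filters more requests before they reach DRAM.
# The hit rates and request volume are assumptions for illustration only.
def dram_traffic_gb_s(request_traffic_gb_s, l2_hit_rate):
    return request_traffic_gb_s * (1.0 - l2_hit_rate)

requests = 300.0  # GB/s of raw request traffic (made up)
print(dram_traffic_gb_s(requests, 0.30))  # smaller L2: 210.0 GB/s reaches DRAM
print(dram_traffic_gb_s(requests, 0.55))  # larger L2:  135.0 GB/s reaches DRAM
```

A higher hit rate lets a narrower, cheaper bus carry the same workload, which is consistent with the interface-width cutback described above.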


390X vs 980Ti (if it's real).

 

Man... I can't wait to see this... :D

My Systems:

Main - Work + Gaming:

Woodland Raven: Ryzen 2700X // AMD Wraith RGB // Asus Prime X570-P // G.Skill 2x 8GB 3600MHz DDR4 // Radeon RX Vega 56 // Crucial P1 NVMe 1TB M.2 SSD // Deepcool DQ650-M // chassis build in progress // Windows 10 // Thrustmaster TMX + G27 pedals & shifter

F@H Rig:

FX-8350 // Deepcool Neptwin // MSI 970 Gaming // AData 2x 4GB 1600 DDR3 // 2x Gigabyte RX-570 4G's // Samsung 840 120GB SSD // Cooler Master V650 // Windows 10

HTPC:

SNES PC (HTPC): i3-4150 @3.5 // Gigabyte GA-H87N-Wifi // G.Skill 2x 4GB DDR3 1600 // Asus Dual GTX 1050Ti 4GB OC // AData SP600 128GB SSD // Pico 160XT PSU // Custom SNES Enclosure // 55" LG LED 1080p TV // Logitech wireless touchpad-keyboard // Windows 10 // Build Log

Laptops:

MY DAILY: Lenovo ThinkPad T410 // 14" 1440x900 // i5-540M 2.5GHz Dual-Core HT // Intel HD iGPU + Quadro NVS 3100M 512MB dGPU // 2x4GB DDR3L 1066 // Mushkin Triactor 480GB SSD // Windows 10

WIFE'S: Dell Latitude E5450 // 14" 1366x768 // i5-5300U 2.3GHz Dual-Core HT // Intel HD5500 // 2x4GB RAM DDR3L 1600 // 500GB 7200 HDD // Linux Mint 19.3 Cinnamon

EXPERIMENTAL: Pinebook // 11.6" 1080p // Manjaro KDE (ARM)

NAS:

Home NAS: Pentium G4400 @3.3 // Gigabyte GA-Z170-HD3 // 2x 4GB DDR4 2400 // Intel HD Graphics // Kingston A400 120GB SSD // 3x Seagate Barracuda 2TB 7200 HDDs in RAID-Z // Cooler Master Silent Pro M 1000w PSU // Antec Performance Plus 1080AMG // FreeNAS OS

 


You think the much lower core count has anything to do with it? ;)

 

I wasn't aware that cores could be compared in such a fashion across architectures.  

 

Anyway, I suppose it may not strictly be a memory thing. I think it's interesting from a real-world "which GPU to buy" perspective, though.

4K // R5 3600 // RTX2080Ti



Okay, to be fair I should say SPs instead of cores, because those are the most fundamental units of data processing. There is backpressure due to the lower processing resources.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


You reminded me, gotta watch out for people dumping their first gen Titans too, see if I can get two of them for a reasonable price vs the 295x2.

Who would do such a thing ???

 

btw

 

wts 2 titans, pst ;P

 

 

*I'm only kidding, my cards aren't for sale. Please don't report me for CoC.

Main Rig: http://linustechtips.com/main/topic/58641-the-i7-950s-gots-to-go-updated-104/ | CPU: Intel i7-4930K | GPU: 2x EVGA Geforce GTX Titan SC SLI| MB: EVGA X79 Dark | RAM: 16GB HyperX Beast 2400mhz | SSD: Samsung 840 Pro 256gb | HDD: 2x Western Digital Raptors 74gb | EX-H34B Hot Swap Rack | Case: Lian Li PC-D600 | Cooling: H100i | Power Supply: Corsair HX1050 |

 

Pfsense Build (Repurposed for plex) https://linustechtips.com/main/topic/715459-pfsense-build/

 

 

 

 


1) Learn some statistics. Three test rounds is not reliable; five is the bare minimum for a 95% confidence interval.

 

Drop the pompousness. You really haven't contributed any statistical evidence to back up your opinion.

 

 

How many times do I have to explain to people games have no bandwidth problem currently.
HBM really won't help gaming much for a very long time

 

 

He provided something (real-world observations). However small his sample size, it stands up to your empty opinions any day of the week. Put up some stats or shut up if you want to call him out on his.

AMD FX-8350 @ 4.7Ghz when gaming | MSI 990FXA-GD80 v2 | Swiftech H220 | Sapphire Radeon HD 7950  +  XFX Radeon 7950 | 8 Gigs of Crucial Ballistix Tracers | 140 GB Raptor X | 1 TB WD Blue | 250 GB Samsung Pro SSD | 120 GB Samsung SSD | 750 Watt Antec HCG PSU | Corsair C70 Mil Green
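On the confidence-interval point, here is a minimal sketch of how a small benchmark sample could be summarized; the five FPS values are made up for illustration:

```python
import statistics

runs = [61.2, 59.8, 60.5, 62.1, 60.9]  # made-up FPS results from five runs

mean = statistics.mean(runs)
sem = statistics.stdev(runs) / len(runs) ** 0.5  # standard error of the mean
t_crit = 2.776  # two-sided 95% Student's t critical value, 4 degrees of freedom

print(f"mean = {mean:.1f} fps, 95% CI = +/- {t_crit * sem:.1f} fps")
```

More runs shrink the interval; with only three runs the t critical value jumps to 4.303, which is roughly the "3 rounds is not reliable" complaint quoted above.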


395x2 is gonna get rekt by the GTX 2080 Ti

"The of and to a in is I that it for you was with on as have but be they"


And in conclusion, the AMD R9 395X2 will have a slower clock speed and a higher heat output than the Nvidia GTX 2080 Ti.

Processor-Intel - Core i9-7900X 3.3GHz , Motherboard- Gigabyte - X299 AORUS Gaming 7, RAM- G.Skill - TridentZ RGB 32GB (4 x 8GB) DDR4-4000 Memory, GPU-Zotac - GeForce GTX 1080 Ti 11GB AMP Extreme,  Case- Phanteks - Enthoo Evolv ATX Glass, Storage- Samsung - 960 EVO 250GB M.2-2280, Samsung 850 Evo 250GB SSD, Samsung 850 EVO 1TB SSD, Toshiba 4TB (PH3400U) , PSU- SeaSonic 1200W Platinum, Cooling- NZXT - Kraken X62 Rev 2, Sound-Sennheiser - HD 598SE, SURE - SM7B, OS- Windows 10 Pro.



4) (256/8) * 7*10^9 = 224*10^9 B/s = 224 GB/s for the GTX 2080 Ti

    (512/8) * 5*10^9 = 320*10^9 B/s = 320 GB/s for the R9 395X2

 

 

this means stuff, a smart guy told me.

Motherboard - Gigabyte P67A-UD5 Processor - Intel Core i7-2600K RAM - G.Skill Ripjaws @1600 8GB Graphics Cards  - MSI and EVGA GeForce GTX 580 SLI PSU - Cooler Master Silent Pro 1,000w SSD - OCZ Vertex 3 120GB x2 HDD - WD Caviar Black 1TB Case - Corsair Obsidian 600D Audio - Asus Xonar DG


   Hail Sithis!



But 2080 is a bigger number so it's way better

"The of and to a in is I that it for you was with on as have but be they"


Ohhh damn I guess I better learn my statistics xP


 

Processor-Intel - Core i9-7900X 3.3GHz , Motherboard- Gigabyte - X299 AORUS Gaming 7, RAM- G.Skill - TridentZ RGB 32GB (4 x 8GB) DDR4-4000 Memory, GPU-Zotac - GeForce GTX 1080 Ti 11GB AMP Extreme,  Case- Phanteks - Enthoo Evolv ATX Glass, Storage- Samsung - 960 EVO 250GB M.2-2280, Samsung 850 Evo 250GB SSD, Samsung 850 EVO 1TB SSD, Toshiba 4TB (PH3400U) , PSU- SeaSonic 1200W Platinum, Cooling- NZXT - Kraken X62 Rev 2, Sound-Sennheiser - HD 598SE, SURE - SM7B, OS- Windows 10 Pro.


I have made a simple chart here to illustrate how the GTX 2080 Ti and the 395X2 will stack up against each other. I think this will solve the issues that everyone has brought up in the thread.

 

[chart image]

"The of and to a in is I that it for you was with on as have but be they"



The math a few posts up is a bandwidth calculation: 256 bits of width, divided by the number of bits per byte, multiplied by the effective frequency of the memory, which is quad-pumped GDDR5 running at 1.75GHz for a 7GHz effective clock (giga being the prefix for 10^9).

 

Bus width in bits ÷ (8 bits per byte) × effective transfers per second = bytes per second, i.e. GB/s when the transfer rate is in GHz.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd
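The same arithmetic in a few lines of Python; the HBM line assumes first-generation HBM's 4 stacks × 1024 bits at 1Gbps per pin, which at this point in the thread was still rumored for the 390X/395X2:

```python
def bandwidth_gb_s(bus_width_bits, effective_ghz):
    # (bits / 8 bits per byte) * billions of transfers per second = GB/s
    return bus_width_bits / 8 * effective_ghz

print(bandwidth_gb_s(256, 7.0))       # 224.0 GB/s: 256-bit GDDR5, 7GHz effective
print(bandwidth_gb_s(512, 5.0))       # 320.0 GB/s: 512-bit GDDR5, 5GHz effective
print(bandwidth_gb_s(4 * 1024, 1.0))  # 512.0 GB/s: 4 HBM stacks, 1Gbps per pin
```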


I'm confused, the R9 395X2 has less RAM per GPU than the 390X.

Maybe they mean 8GB per GPU?


You reminded me, gotta watch out for people dumping their first gen Titans too, see if I can get two of them for a reasonable price vs the 295x2.

You'll never get Titans for a reasonable price; first gen is still selling for $550, give or take.

Might as well buy a 980 at that point.

Specs: 4790k | Asus Z-97 Pro Wifi | MX100 512GB SSD | NZXT H440 Plastidipped Black | Dark Rock 3 CPU Cooler | MSI 290x Lightning | EVGA 850 G2 | 3x Noctua Industrial NF-F12's

Bought a powermac G5, expect a mod log sometime in 2015

Corsair is overrated, and Anime is ruined by the people who watch it


All over the place.

God damnit, for some reason, I need to press the P key REALLY hard compared to other keys.

Press On The P Real Hard


If it's less than 10,000 NOK (about $1,200) I will buy the 395X2.

In case the moderators do not ban me as requested, this is a notice that I have left and am not coming back.

