
Firestrike combined specifics regarding Ryzen.

So I was looking through benchmarks for telltale signs of what could be wrong with the gaming performance, and I saw these results on the Dutch website Tweakers.

 

1800x.jpg

 

But I've never really been sure what the combined test does. I've always thought it was some kind of maximum draw-call throughput test that puts a heavy load on both the CPU and GPU.

But if I'm to extract anything from it other than a number, it would be nice to know what it does. So, can anyone explain to me exactly what this test does? These results also roughly translate to the later game benchmarks in games with high CPU dependency.

 

https://tweakers.net/reviews/5239/4/ryzen-7-processors-amd-is-terug-in-de-race-grafische-prestaties-en-geheugen.html


  • 4 weeks later...
On 3/4/2017 at 2:03 AM, Majestic said:

 

 

The Firestrike combined test is the most relevant score in Firestrike if you are discussing Ryzen gaming performance, and it is the score that has been ignored by every reviewer I have seen. That graph is the only mention of combined scores I have seen in an article about Ryzen. It would appear that the majority of tech reviewers don't understand the Firestrike benchmark at all. The combined test matters because it is the only part of the benchmark that does something similar to what goes on when you play a game. The graphics and physics tests are also important, as they show the performance of the GPU or the CPU running almost in isolation.

 

If reviewers had paid any attention to the combined score, they might have noticed that the Ryzen graphics and physics scores were both fairly comparable to what you score with an Intel 6900K, and beat the 7700K scores. The combined score, however, seriously underperformed both Intel platforms, which hints at where they should be looking for an explanation of the slow gaming problems: it is not the CPU or GPU itself, and it is not the PCIe bus, because that runs to a standard, so it could only be the connectivity to the memory or the PCIe controllers in the Data Fabric on the chip.

 

This is from the 3DMark Technical Guide that you can download from here http://s3.amazonaws.com/download-aws.futuremark.com/3DMark_Technical_Guide.pdf

 

"3DMark Fire Strike Combined test stresses both the GPU and CPU simultaneously. The GPU load combines elements from Graphics test 1 and 2 using tessellation, volumetric illumination, fluid simulation, particle simulation, FFT based bloom and depth of field. The CPU load comes from the rigid body physics of the breaking statues in the background. There are 32 simulation worlds running in separate threads each containing one statue decomposing into 113 parts. Additionally there are 16 invisible rigid bodies in each world except the one closest to camera to push the decomposed elements apart. The simulations run on one thread per available CPU core. The 3DMark Fire Strike Combined test uses the Bullet Open Source Physics Library."
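As a rough sanity check on the scale of that CPU load, the figures in the quoted passage can be multiplied out. This is only a sketch: the constants come from the guide text above, while the function names and the even split of worlds across threads are my own assumptions, not something the guide states.

```python
import math

# Figures taken from the quoted 3DMark Technical Guide passage.
SIMULATION_WORLDS = 32
PARTS_PER_STATUE = 113
INVISIBLE_BODIES_PER_WORLD = 16


def combined_test_body_counts(worlds=SIMULATION_WORLDS,
                              parts=PARTS_PER_STATUE,
                              pushers=INVISIBLE_BODIES_PER_WORLD):
    """Return (statue parts, invisible bodies, total rigid bodies)."""
    statue_parts = worlds * parts
    # Every world except the one closest to the camera has the invisible bodies.
    invisible = (worlds - 1) * pushers
    return statue_parts, invisible, statue_parts + invisible


def worlds_per_thread(threads, worlds=SIMULATION_WORLDS):
    """Assumption: worlds are shared evenly, rounding up, across threads."""
    return math.ceil(worlds / threads)


if __name__ == "__main__":
    parts, invisible, total = combined_test_body_counts()
    print(f"statue parts: {parts}, invisible bodies: {invisible}, total: {total}")
    for threads in (4, 8, 16):
        print(f"{threads} threads -> up to {worlds_per_thread(threads)} worlds each")
```

So the physics side is a few thousand rigid bodies spread over however many threads the CPU offers, running while the GPU load is going at the same time.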

 

The guide also details the graphics and physics tests, how the benchmark is scored, and how the other benchmarks are run, but you can download the file and read up on it yourself.


3 hours ago, GTBTK said:

however the combined score seriously underperformed both of the Intel platforms, giving a hint where they should be looking for an explanation of the slow gaming problems, i.e. it is not the CPU or GPU itself, it is not the PCIe bus because that runs to a standard, it could only be the connectivity to the memory or the PCIe controllers in the Data Fabric on the chip.

Or the fact that it's apparently running 32 threads, and Ryzen's Achilles' heel is heavy context switching.


2 hours ago, Majestic said:

Or the fact that it's apparently running 32 threads, and Ryzen's Achilles' heel is heavy context switching.

Shame that the context switching theory is wrong. As with everything else about Ryzen, everyone is looking at one thing once and assuming all the rest. The maths doesn't actually add up for that theory.

 

These are back-of-a-fag-packet calculations just to illustrate a point. I know that numbers like the % of time a thread spends switching to another core are an overestimate; the reality is a smaller %.

 

Every time a thread switches, it has a 1 in 8 chance of jumping to the other CCX; that is 12.5% of all thread switches. Threads are not switching 100% of the time: they have to do work on the core they are residing on for a significantly longer period before they switch again.

 

Let's assume that a thread spends 10% of its time switching (I am sure it is significantly less than 10%), for the sake of the discussion and to make the maths a little easier. Then only 1.25% of the compute time (12.5% of 10%) is affected by cross-CCX switched threads. That does not take into account that those threads are running in parallel with other threads that are not being slowed down, that a significant number of those other threads are not dependent on the results from the slowed threads (a tessellation thread is not always dependent on a physics calculation), and that some of the slowed threads will still finish before non-slowed threads simply because they had less work to do. That reduces the 1.25% a bit more, let's say to 1%. I could keep going and point out that the original 12.5% assumes a total loss, whereas we are only introducing delays and don't have to start the threads' calculations over again, so that 1% actually gets smaller again. But I will stop here.

 

Either way, delays on 1% of the processing threads are not enough to cause a performance decrease of 20-40% in actual work output, particularly when the two component workloads (the graphics part and the physics part) have already shown, each in isolation, that the nature of the instructions they process does not have a significant impact on the system as a whole, even when the CPU runs at 100% load.
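The back-of-the-envelope sum above is easy to replay with other inputs. A minimal sketch, using the same simple model (share of compute time affected = chance a switch crosses the CCX boundary × share of time spent switching); the function name and the extra input values are illustrative assumptions, not measurements.

```python
def affected_share(p_cross_ccx, switching_share):
    """Share of compute time touched by cross-CCX switches in the simple model."""
    return p_cross_ccx * switching_share


if __name__ == "__main__":
    switching_share = 0.10  # the deliberately generous 10% used in the post
    # 1/8 is the post's figure; 1.0 is the worst case where every switch crosses.
    for p in (1 / 8, 1 / 2, 1.0):
        share = affected_share(p, switching_share)
        print(f"p_cross_ccx={p:.3f} -> {share:.2%} of compute time affected")
```

In this model the affected share can never exceed the switching share itself, so even the worst case stays at 10%, still short of the 20-40% drop being discussed.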

 

There is definitely some performance degradation added by delays in switching between CCX modules, but that only contributes to the performance drop because the Data Fabric that enables the switching also carries the connection to the PCIe controller for the graphics output and to the memory controller for every thread executing instructions, whether it switches or not, plus the inter-CCX thread switch management, plus the neural net traffic, plus, plus, plus...

 

The cause is contention for resources on the Data Fabric: it is overloaded by the high memory read/write requirements of the compute work combined with the high memory read/write and PCIe controller access that the graphics processing requires. As supported memory speeds increase with the BIOS improvements, the bandwidth and associated latency improve and the extent of the original problem is reduced, as can be seen in the combined scores over the last couple of weeks, which have improved by almost 20%.

 

Given the reports of occasional black screens, I would be willing to bet that the CPU side of the PCIe controller in the SoC is the primary cause of the bottleneck.

 

 


@GTBTK Actually, there are 2 core complexes with 4 cores each. Compare it to a bowl of marbles, 4 red (CCX1) and 4 blue (CCX2). The chance of drawing either a red or a blue is 4 over 8, meaning a 50% chance. Not a student of statistics, I see.

 

Bad statistics based on random assumptions (10% of time spent core hopping?). You might not want to be so overeager to call everyone wrong, mate.


Think about what you are saying first.

 

We are talking about how often a thread will jump the gap, not how often a thread will run on a specific core. A single thread has the option of switching to 7 other locations on the same CCX it is already on; the only other option, number 8, is to jump across to ANY of the 8 thread locations on the other module.

 

If it is currently on CCX 1, core 1, thread 1 and it is being directed to a core on the other CCX, it is still only one jump, whether it goes to CCX 2 core 2 thread 1, or core 5 thread 2, or whatever location it ends up at on the other CCX. That is a 1 in 8 chance of having to jump across. It is not telling you the probability of switching to a specific core on CCX2.

 

Given that a thread runs for multiple milliseconds doing its processing work before spending 14-70 nanoseconds switching, that 10% should have been closer to 0.01% or less, but I thought 10% made it easier to illustrate the point without making the core switching assumption look too ridiculous.
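To put numbers on that, here is a small sketch of the share of time spent switching, assuming a thread runs for a fixed slice and then pays for one switch. The 14 and 70 ns figures are the ones quoted above; the 1 ms run time is an assumed lower bound for "multiple milliseconds", and the function name is made up for illustration.

```python
def switching_time_share(run_ns, switch_ns):
    """Fraction of wall time spent switching when each run slice ends in one switch."""
    return switch_ns / (run_ns + switch_ns)


if __name__ == "__main__":
    run_ns = 1_000_000  # assumed: 1 ms of work between switches
    for switch_ns in (14, 70):
        share = switching_time_share(run_ns, switch_ns)
        print(f"{switch_ns} ns switch per {run_ns} ns run -> {share:.4%}")
```

Both cases come out below 0.01%, and longer run slices push the share lower still.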

 


This is very, very elementary statistics, man. If the thread is running on core 0 and is interdependent with another thread on the same CCX, then during a hop it has a 4/8 chance (given equally likely outcomes) of landing on the other CCX (cores 4 to 7), which adds latency.
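The two figures being argued over come from two different ways of counting destinations, and it may help to write both down. This is a sketch under loud assumptions: destinations are picked uniformly at random, which a real OS scheduler does not do, and both function names are invented here. The default topology (2 CCXs of 4 cores) is the Ryzen 7 layout discussed in this thread.

```python
from fractions import Fraction


def cross_ccx_uniform(ccx_count=2, slots_per_ccx=4, allow_stay=True):
    """Chance of landing on another CCX when every slot is equally likely.

    allow_stay=True counts the current slot as a possible destination
    (the marble-bowl picture); False excludes it.
    """
    total = ccx_count * slots_per_ccx
    remote = (ccx_count - 1) * slots_per_ccx
    candidates = total if allow_stay else total - 1
    return Fraction(remote, candidates)


def cross_ccx_grouped(local_options=7):
    """Chance when the whole remote CCX is counted as one extra option."""
    return Fraction(1, local_options + 1)


if __name__ == "__main__":
    print("uniform, may stay put:", cross_ccx_uniform())
    print("uniform, must move:   ", cross_ccx_uniform(allow_stay=False))
    print("remote CCX as one option:", cross_ccx_grouped())
```

The first call reproduces the 4/8 figure and the last reproduces the 1 in 8 figure; neither is a measurement of how a scheduler actually places threads.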


Yes it is, but even you got it confused. It seems it needs to be, because everyone is following the herd and not thinking about this stuff themselves, especially while the media seem to be selective about what they report on. I don't know if they are using the "poor at gaming" pronouncement as a form of clickbait, or whether they are just clueless and following a formula they learned by rote.

 

That graph you posted with the combined scores, highlighting the poor Ryzen performance, is from the ONLY review I have seen that even mentions the combined score. Everyone else is totally ignoring it, which I find really strange: they are all concluding that Ryzen gaming is poor and then ignoring the only tool that compares isolated GPU performance, isolated CPU performance, and combined CPU and GPU performance, both against itself and against the Intel units.

 

It also seems a bit silly to claim that gaming performance is poor when, with the same GPU they did the review on, it performs better than 98% of all computers in Firestrike benchmarks, which, while not a game, are closer to one than many benchmarks. The minimum frame rates are also better than Intel's, which matters much more for user experience than maximum frame rates if you are exceeding the monitor's refresh rate.


  • 1 year later...
On 3/26/2017 at 6:55 AM, GTBTK said:

Think about what you are saying first.

We are talking about how often a thread will jump the gap, not how often a thread will run on a specific core. A single thread has the option of switching to 7 other locations on the same CCX it is already on; the only other option, number 8, is to jump across to ANY of the 8 thread locations on the other module.

If it is currently on CCX 1, core 1, thread 1 and it is being directed to a core on the other CCX, it is still only one jump, whether it goes to CCX 2 core 2 thread 1, or core 5 thread 2, or whatever location it ends up at on the other CCX. That is a 1 in 8 chance of having to jump across. It is not telling you the probability of switching to a specific core on CCX2.

Given that a thread runs for multiple milliseconds doing its processing work before spending 14-70 nanoseconds switching, that 10% should have been closer to 0.01% or less, but I thought 10% made it easier to illustrate the point without making the core switching assumption look too ridiculous.

 

 

I know this might be a bit, well, REALLY old by now. Sorry for my thread necromancy habits, but at least they are not thread contention issues! (lol)

Seriously though, you have almost certainly hit the nail on the head here. EVER SINCE Ryzen was released I have been looking at its Firestrike results (as I thought I WOULD own a Ryzen for a bit, due to the hype), and not only was the game performance lackluster, the Firestrike combined score points to a SIGNIFICANT design flaw when it comes to gaming performance.

 

I have always GUESSED similarly to you. I figured that the CCX latency was nowhere NEAR the entire cause. I always said, "Well, wait, doesn't Infinity Fabric then require heavy RAM transactions, since the bus is required for core-to-core communication outside of a CCX? And even if RAM was not being written to, isn't the bus otherwise being occupied heavily by said communications?"

 

And it would seem you were right, as were my thoughts on the matter (and sometimes comments on YouTube vids, often pointing out that DEAD TERRIBLE combined score...). Because now that Threadripper exists, we can see that it is a BAD GAMING CHIP! Because it relies on 4 CCXs it will always perform badly at gaming, worse than any R5 or R7 in fact!

 

I have actually proven (to myself), and mentioned, that with my OLD i7-2600K at 4.5 GHz and my G1 980 Ti at 1503 MHz core and 8 GHz VRAM, I scored a BETTER combined result in Firestrike than TechShowdown managed with an R7 at or near 4 GHz and a top-of-the-line GTX 1080 Ti! That was only about 35 FPS for my 2600K, against something like 33 for the Ryzen chip I watched!

 

And now the "Next Gen" of Ryzen is about to release... Well I have seen some benchmarks, and AMD has done VERY little to mitigate any of these problems. 

 

And I have AMD fanboys calling me a shill and a moron for LOVING my delidded 5 GHz core/cache, 0 offset i7-8700K, because Ryzen+ is really going to change things!

 

You know the drill - wait for Navi!...... ((TM))

 

*Sidenote: I am not ANY sort of Fanboy. I love products that serve me well at a price I can afford and I find reasonable. Whoever offers me the best for what I want to do will get my money and recommendation if it impresses me - as both my current CPU and GPU have. I do NOT care who makes the silicon at all, I care who makes the better silicon for my needs.

