
Futuremark readying new Vulkan and DX12 benchmark

Prysin
Quote

Futuremark is working on new game tests for its 3DMark benchmark suite. One of these is a game test that takes advantage of DirectX 12 but isn't as taxing on the hardware as "Time Spy." Its target hardware is notebook graphics and entry-level to mainstream graphics cards. It will be to "Time Spy" what "Sky Diver" is to "Fire Strike."

The next, more interesting move by Futuremark is a benchmark that takes advantage of the Vulkan 3D graphics API. The company will release this Vulkan-based benchmark for both Windows and Android platforms. Lastly, we've learned that development of the company's VR benchmarks is coming along nicely, and the company hopes to release new VR benchmarks for PC and mobile platforms soon. Futuremark is expected to reveal these new game tests and benchmarks at its booth at the 2017 International CES in early January.

 

This is rather neat, and I cannot wait to see how the Vulkan bench turns out. Hopefully it will allow an apples-to-apples test vs DX12. Time will tell.

 

 

Source: https://www.techpowerup.com/228884/futuremark-readies-new-vulkan-and-directx-12-benchmarks


Oh cool stuff

 

A lot of 3DMark benchmarks have been a bit slanted toward Nvidia cards, if I'm not mistaken (like Fire Strike on the 1060 vs the 480). Their bench is interesting, but I'll still give it less credit than actual game performance.

We have a NEW and GLORIOUSER-ER-ER PSU Tier List Now. (dammit @LukeSavenije stop coming up with new ones)

You can check out the old one that gave joy to so many across the land here

 

Computer having a hard time powering on? Troubleshoot it with this guide. (Currently looking for suggestions to update it into the context of <current year> and make it its own thread)

Computer Specs:

Spoiler

Mathresolvermajig: Intel Xeon E3 1240 (Sandy Bridge i7 equivalent)

Chillinmachine: Noctua NH-C14S
Framepainting-inator: EVGA GTX 1080 Ti SC2 Hybrid

Attachcorethingy: Gigabyte H61M-S2V-B3

Infoholdstick: Corsair 2x4GB DDR3 1333

Computerarmor: Silverstone RL06 "Lookalike"

Rememberdoogle: 1TB HDD + 120GB TR150 + 240 SSD Plus + 1TB MX500

AdditionalPylons: Phanteks AMP! 550W (based on Seasonic GX-550)

Letterpad: Rosewill Apollo 9100 (Cherry MX Red)

Buttonrodent: Razer Viper Mini + Huion H430P drawing Tablet

Auralnterface: Sennheiser HD 6xx

Liquidrectangles: LG 27UK850-W 4K HDR

 


3 hours ago, Energycore said:

Oh cool stuff

 

A lot of 3DMark benchmarks have been a bit slanted toward Nvidia cards, if I'm not mistaken (like Fire Strike on the 1060 vs the 480). Their bench is interesting, but I'll still give it less credit than actual game performance.

I've started to care less and less about benchmarks recently.

Judge a product on its own merits AND the company that made it.

How to setup MSI Afterburner OSD | How to make your AMD Radeon GPU more efficient with Radeon Chill | (Probably) Why LMG Merch shipping to the EU is expensive

Oneplus 6 (Early 2023 to present) | HP Envy 15" x360 R7 5700U (Mid 2021 to present) | Steam Deck (Late 2022 to present)

 

Mid 2023 AlTech Desktop Refresh - AMD R7 5800X (Mid 2023), XFX Radeon RX 6700XT MBA (Mid 2021), MSI X370 Gaming Pro Carbon (Early 2018), 32GB DDR4-3200 (16GB x2) (Mid 2022)

Noctua NH-D15 (Early 2021), Corsair MP510 1.92TB NVMe SSD (Mid 2020), beQuiet Pure Wings 2 140mm x2 & 120mm x1 (Mid 2023),


The article suggests that this new benchmark will be less demanding and aimed at lower-end hardware. I suppose that kinda makes it useless to those who already own Time Spy, then.

CPU - Ryzen Threadripper 2950X | Motherboard - X399 GAMING PRO CARBON AC | RAM - G.Skill Trident Z RGB 4x8GB DDR4-3200 14-13-13-21 | GPU - Aorus GTX 1080 Ti Waterforce WB Xtreme Edition | Case - Inwin 909 (Silver) | Storage - Samsung 950 Pro 500GB, Samsung 970 Evo 500GB, Samsung 840 Evo 500GB, HGST DeskStar 6TB, WD Black 2TB | PSU - Corsair AX1600i | Display - DELL ULTRASHARP U3415W |


Useless. The last Futuremark DX12 bench was pathetic. The entire point of low-level APIs is direct access to the hardware for optimization specific to that hardware, as in architectures/cards. Futuremark did NOT do architecture-specific optimization, so in the end it was basically just a DX11 bench in a DX12 wrapper with CPU multithreading and higher draw-call support. Hardly useful for anything, which is probably why NVIDIA did so well compared to how much they suck in next-gen API based games.

 

If this is the same nonsense, why even bother? Most games have built-in benchmarks these days anyway.

Watching Intel have competition is like watching a headless chicken trying to get out of a mine field

CPU: Intel I7 4790K@4.6 with NZXT X31 AIO; MOTHERBOARD: ASUS Z97 Maximus VII Ranger; RAM: 8 GB Kingston HyperX 1600 DDR3; GFX: ASUS R9 290 4GB; CASE: Lian Li v700wx; STORAGE: Corsair Force 3 120GB SSD; Samsung 850 500GB SSD; Various old Seagates; PSU: Corsair RM650; MONITOR: 2x 20" Dell IPS; KEYBOARD/MOUSE: Logitech K810/ MX Master; OS: Windows 10 Pro


Inb4 they give it away for free and it turns out to be only a demo version. That would be another $10 for the full version.

The ability to google properly is a skill of its own. 


28 minutes ago, Bouzoo said:

Inb4 they give it away for free and it turns out to be only a demo version. That would be another $10 for the full version.

At least they aren't owned by EA... cuz then it'd be a full-fat $60.


5 minutes ago, Prysin said:

At least they aren't owned by EA... cuz then it'd be a full-fat $60.

I am sorry, but the results module is DLC, so it will be another $10, or $20 for the season pass.

if you want to annoy me, then join my teamspeak server ts.benja.cc


26 minutes ago, The Benjamins said:

I am sorry, but the results module is DLC, so it will be another $10, or $20 for the season pass.

They could always pull an "Activision" and only let you purchase the new benchmark if you buy the 3DMark bundle that includes all the old benchmarks.



Well, let's see if it will be real DX12 this time. I don't know why people are taking Time Spy seriously when the creator already said it's not actually DX12.

CPU i7 6700 Cooling Cryorig H7 Motherboard MSI H110i Pro AC RAM Kingston HyperX Fury 16GB DDR4 2133 GPU Pulse RX 5700 XT Case Fractal Design Define Mini C Storage Trascend SSD370S 256GB + WD Black 320GB + Sandisk Ultra II 480GB + WD Blue 1TB PSU EVGA GS 550 Display Nixeus Vue24B FreeSync 144 Hz Monitor (VESA mounted) Keyboard Aorus K3 Mechanical Keyboard Mouse Logitech G402 OS Windows 10 Home 64 bit


Even PassMark has a DX12 test; it's time for Futuremark to show us a proper benchmark for Vulkan. All we have so far is DOOM.

RyzenAir : AMD R5 3600 | AsRock AB350M Pro4 | 32gb Aegis DDR4 3000 | GTX 1070 FE | Fractal Design Node 804
RyzenITX : Ryzen 7 1700 | GA-AB350N-Gaming WIFI | 16gb DDR4 2666 | GTX 1060 | Cougar QBX 

 

PSU Tier list

 


6 hours ago, Energycore said:

Oh cool stuff

 

A lot of 3DMark benchmarks have been a bit slanted toward Nvidia cards, if I'm not mistaken (like Fire Strike on the 1060 vs the 480). Their bench is interesting, but I'll still give it less credit than actual game performance.

Fire Strike is a DirectX 11 test. NVIDIA usually pulls ahead of AMD on those.

2 hours ago, Notional said:

Useless. The last Futuremark DX12 bench was pathetic. The entire point of low-level APIs is direct access to the hardware for optimization specific to that hardware, as in architectures/cards. Futuremark did NOT do architecture-specific optimization, so in the end it was basically just a DX11 bench in a DX12 wrapper with CPU multithreading and higher draw-call support. Hardly useful for anything, which is probably why NVIDIA did so well compared to how much they suck in next-gen API based games.

They explained that they asked all three GPU vendors (Intel, NVIDIA, and AMD) whether they should include any architecture-specific optimizations. They all said no.

Quote

In the past, we have discussed the option of vendor-specific code paths with our development partners, but they are invariably against it. In many cases, an aggressive optimization path would also require altering the work being done, which means the test would no longer provide a common reference point. And with separate paths for each architecture, not only would the outputs not be comparable, but the paths would be obsolete with every new architecture launch. 

 

3DMark benchmarks use a path that is heavily optimized for all hardware. This path is developed by working with all vendors to ensure that our engine runs as efficiently as possible on all available hardware. Without vendor support and participation this would not be possible, but we are lucky in having active and dedicated development partners. 

This has burned them before: http://arstechnica.com/gadgets/2008/07/atom-nano-review/6/

 

[Chart from the Ars Technica review: PCMark05 results for the VIA Nano under different CPUID vendor strings (PCM2K5-2.jpg)]

 

Note that when a VIA Nano CPU changes its CPUID to look like an Intel processor, its benchmark results improve dramatically compared to when it looks like an AMD processor. This was a result of Futuremark using Intel-specific optimizations that were not available on AMD processors at the time of development. The Nano supports the same features those Intel processors had, but Futuremark only checked the CPUID vendor string, not whether the CPU actually supports each feature.
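
To make that failure mode concrete, here is a minimal sketch (not Futuremark's actual code; C++ using GCC/Clang's <cpuid.h>) of the difference between gating an optimization on the CPUID vendor string and gating it on the actual feature bit:

```cpp
#include <cpuid.h>   // GCC/Clang CPUID helpers (__get_cpuid, bit_SSE3)
#include <cstdio>
#include <cstring>

// Read the 12-byte vendor string from CPUID leaf 0:
// "GenuineIntel", "AuthenticAMD", or "CentaurHauls" for VIA.
static void vendor_string(char out[13]) {
    unsigned int eax, ebx, ecx, edx;
    __get_cpuid(0, &eax, &ebx, &ecx, &edx);
    std::memcpy(out + 0, &ebx, 4);
    std::memcpy(out + 4, &edx, 4);
    std::memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
}

int main() {
    char vendor[13];
    vendor_string(vendor);

    // Fragile: enable the fast SSE3 path only when the vendor string says
    // Intel. A VIA Nano with a spoofed CPUID "becomes" faster or slower.
    bool fast_by_vendor = (std::strcmp(vendor, "GenuineIntel") == 0);

    // Robust: ask the CPU itself. CPUID leaf 1, ECX bit 0 is the
    // architectural SSE3 flag, regardless of who made the chip.
    unsigned int eax, ebx, ecx, edx;
    __get_cpuid(1, &eax, &ebx, &ecx, &edx);
    bool fast_by_feature = (ecx & bit_SSE3) != 0;

    std::printf("vendor=%s  by-vendor=%d  by-feature=%d\n",
                vendor, fast_by_vendor, fast_by_feature);
}
```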


2 hours ago, M.Yurizaki said:

Fire Strike is a DirectX 11 test. NVIDIA usually pulls ahead of AMD on those.

They explained that they asked all three GPU vendors (Intel, NVIDIA, and AMD) whether they should include any architecture-specific optimizations. They all said no.

This has burned them before:

Note that when a VIA Nano CPU changes its CPUID to look like an Intel processor, its benchmark results improve dramatically compared to when it looks like an AMD processor. This was a result of Futuremark using Intel-specific optimizations that were not available on AMD processors at the time of development. The Nano supports the same features those Intel processors had, but Futuremark only checked the CPUID vendor string, not whether the CPU actually supports each feature.

 

The entire purpose of DX12 is that the developer must make the effort to create hardware-specific code to take as much advantage of it as possible, especially since drivers have minimal impact now. This has little to nothing to do with CPUs in the case you mentioned.

At GDC, both AMD and NVIDIA kept emphasising best practices for DX12, going so far as to state that if you cannot create IHV-specific paths, you should use DX11 instead.
 

Creating a DX12 or Vulkan benchmark that does not use IHV-specific paths is not a proper benchmark, as it does not fully utilise either the API or the hardware as intended.

http://www.gdcvault.com/play/1023128/Advanced-Graphics-Techniques-Tutorial-Day

[Slides from the GDC session linked above on DX12/Vulkan best practices (EX18iLV.png, Gd6ztrt.png)]
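
For illustration only, here is a rough sketch of what picking an IHV-specific path at startup can look like in Vulkan. The vendor IDs and API calls are real; the path names and per-vendor choices are hypothetical stand-ins for the kind of tuning the GDC talks recommend:

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

// Registered PCI vendor IDs reported in VkPhysicalDeviceProperties.
constexpr uint32_t VENDOR_AMD    = 0x1002;
constexpr uint32_t VENDOR_NVIDIA = 0x10DE;
constexpr uint32_t VENDOR_INTEL  = 0x8086;

enum class RenderPath { Generic, GcnAsyncCompute, NvSmallBatches, IntelIgp };

// Pick a (hypothetical) tuned code path per vendor instead of running one
// common path on every architecture.
RenderPath pick_path(VkPhysicalDevice gpu) {
    VkPhysicalDeviceProperties props{};
    vkGetPhysicalDeviceProperties(gpu, &props);
    switch (props.vendorID) {
        case VENDOR_AMD:    return RenderPath::GcnAsyncCompute; // e.g. lean on async compute
        case VENDOR_NVIDIA: return RenderPath::NvSmallBatches;  // e.g. smaller submissions
        case VENDOR_INTEL:  return RenderPath::IntelIgp;
        default:            return RenderPath::Generic;
    }
}

int main() {
    VkApplicationInfo app{VK_STRUCTURE_TYPE_APPLICATION_INFO};
    app.apiVersion = VK_API_VERSION_1_0;
    VkInstanceCreateInfo ci{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    ci.pApplicationInfo = &app;
    VkInstance instance;
    if (vkCreateInstance(&ci, nullptr, &instance) != VK_SUCCESS) return 1;

    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, nullptr);
    std::vector<VkPhysicalDevice> gpus(count);
    vkEnumeratePhysicalDevices(instance, &count, gpus.data());

    for (VkPhysicalDevice gpu : gpus) {
        VkPhysicalDeviceProperties props{};
        vkGetPhysicalDeviceProperties(gpu, &props);
        std::printf("%s -> path %d\n", props.deviceName, (int)pick_path(gpu));
    }
    vkDestroyInstance(instance, nullptr);
}
```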

5950X | NH D15S | 64GB 3200Mhz | RTX 3090 | ASUS PG348Q+MG278Q

 


11 minutes ago, Valentyn said:

 

The entire purpose of DX12 is that the developer must make the effort to create hardware-specific code to take as much advantage of it as possible, especially since drivers have minimal impact now. This has little to nothing to do with CPUs in the case you mentioned.

At GDC, both AMD and NVIDIA kept emphasising best practices for DX12, going so far as to state that if you cannot create IHV-specific paths, you should use DX11 instead.
 

Creating a DX12 or Vulkan benchmark that does not use IHV-specific paths is not a proper benchmark, as it does not fully utilise either the API or the hardware as intended.

 

The point of benchmarking, however, is to run the exact same code path to see how each architecture handles it. The moment you add in architecture-specific optimizations, you lose that apples-to-apples comparison. Which, again, is exactly what Futuremark was called out on when it used optimizations that, at the time, were Intel-specific (though its mechanism for gating them was flawed).

 

Imagine if I gave a test to some college students, but gave the computer science majors more math problems and fewer writing problems, and the English literature students fewer math problems and more writing problems, because I'm "optimizing" the test for what suits each group. How is that test fair?


36 minutes ago, M.Yurizaki said:

 

Imagine if I gave a test to some college students, but gave the computer science majors more math problems and fewer writing problems, and the English literature students fewer math problems and more writing problems, because I'm "optimizing" the test for what suits each group. How is that test fair?

Terrible analogy, since if you're majoring in Computer Science you will get more maths problems.

I don't recall ever getting a literature assignment when I studied IT and Computing Systems Management. You know why? It was irrelevant to my field, bar writing reports. 
It's perfectly fair.

How would it be fair for computing and science students to do an in-depth analysis of an author and their works?

Giving both fields a standardised test is entirely unfair, since it has no true merit in either field, nor the ability to test either group of students' performance in their field.

The benchmark for DX12 needs IHV-specific code for each major player: AMD, Intel, and NVIDIA. It's stated in the best practices.
Not doing so means it's not a true DX12 benchmark that takes advantage of the API's low-level nature.

The entire purpose of the API is to enable such specific code paths to extract as much performance as possible, shifting the majority of the optimisation burden onto the developer and away from drivers.



 


 


18 minutes ago, Valentyn said:

Terrible analogy, since if you're majoring in Computer Science you will get more maths problems.

I don't recall ever getting a literature assignment when I studied IT and Computing Systems Management. You know why? It was irrelevant to my field, bar writing reports.
It's perfectly fair.

How would it be fair for computing and science students to do an in-depth analysis of an author and their works?

Giving both fields a standardised test is entirely unfair, since it has no true merit in either field, nor the ability to test either group of students' performance in their field.

The benchmark for DX12 needs IHV-specific code for each major player: AMD, Intel, and NVIDIA. It's stated in the best practices.
Not doing so means it's not a true DX12 benchmark that takes advantage of the API's low-level nature.

The entire purpose of the API is to enable such specific code paths to extract as much performance as possible, shifting the majority of the optimisation burden onto the developer and away from drivers.

If you have to tailor the tests to make them fair, then the comparison becomes pointless because you're not running the exact same test. So there's no point in Futuremark making a benchmarking tool if they have to optimize their code paths for each architecture. It would be an apples-to-oranges-to-melons comparison.

 

All a score would really say is how good Futuremark is at optimizing for that architecture. Nothing more. You also can't guarantee every developer will use the same optimizations, if they can use them at all.


24 minutes ago, M.Yurizaki said:

If you have to tailor the tests to make them fair, then the comparison becomes pointless because you're not running the exact same test

 

But that is the entire point of DX12. Not doing it would make it pointless to make a DX12 benchmark.



7 minutes ago, Notional said:

But that is the entire point of DX12. Not doing it would make it pointless to make a DX12 benchmark.

Then there's no point in creating a benchmark for DX12 or Vulkan. If the scores exist because the hardware ran code specific to it, then it's meaningless to compare them against another architecture, because that architecture can't run the same code.

 

At best, all Futuremark can do is identify which parameters affect each architecture without being vendor-specific. For example, to keep GCN happy you must feed it large job batches; Kepler and later don't like this. Batch size, however, is not an architecture-specific parameter, so a better test of the overall picture would be to run the benchmark at various job batch sizes rather than a large one for GCN and a small one for Kepler and later.
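
A toy sketch of that idea: sweep the batch size instead of hard-coding one that flatters a particular architecture. The workload here is a CPU-side stand-in, purely illustrative; in a real benchmark each batch would be a command-list submission of N draws or dispatches:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Stand-in for "submit one batch of N jobs to the GPU". It is plain CPU
// work so the harness is runnable on its own.
static void run_batch(std::size_t jobs) {
    volatile double acc = 0.0;  // volatile keeps the loop from being elided
    for (std::size_t i = 0; i < jobs; ++i) acc = acc + i * 1e-9;
}

int main() {
    const std::size_t total_jobs = 1 << 20;
    // Sweep batch sizes rather than picking one that favours GCN (large)
    // or Kepler and later (small).
    for (std::size_t batch : {64u, 256u, 1024u, 4096u, 16384u}) {
        auto t0 = std::chrono::steady_clock::now();
        for (std::size_t done = 0; done < total_jobs; done += batch)
            run_batch(batch);
        auto ms = std::chrono::duration<double, std::milli>(
                      std::chrono::steady_clock::now() - t0).count();
        std::printf("batch=%6zu  %8.2f ms\n", batch, ms);
    }
}
```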

 

But at that point, 3DMark only becomes a tool for analyzing what works with certain architectures and what doesn't, which means it's no longer a tool useful for the everyman. NVIDIA fanboys will go "lol, AMD sucks at this test" and AMD fanboys will go "lol, NVIDIA sucks at this test," even if it's the same test with a different parameter.


There seems to be an argument over whether the test should be optimized, and here is why I think it should:

 

We look at benchmarks to give ourselves a ballpark measure of a video card's game performance. That's what the bench was made for, and that's what we still use it for.

If instead of the bench we were talking about a game, of course we would want both GPU companies to contribute optimized code for their hardware, because in the end that yields higher framerates for us gamers.

 

By the same token, when both companies are allowed to contribute optimized code for the benchmark, the analogy of them being given two different tests doesn't hold: they're given the same workload and then asked to contribute both the hardware, in the form of the GPUs, and the optimized code for their architectures.

 

This doesn't come without problems to overcome, like where you draw the line on cheating, but I do think it is the better way to go about a benchmark. If we just cared about how an architecture handled a specific workload without optimization, we could just rank GPUs by how fast they complete GPU Pi.



Hmmm, the argument in this thread is largely the reason I don't care much for synthetic benchmarks. If I am going to buy a Vega GPU in March, why would I look at its 3DMark score when I can instead look at its performance in real-world applications that I can actually use?

 

Also, hypothetically, let's say Vulkan outperforms DX12 by 5% in the benchmark. Even then, wouldn't it be too simplistic for us to assume that said API is faster? That performance difference could be due to their Vulkan implementation being slightly better in some way, or due to drivers, or some extension they used without using the equivalent in the competing API... How the hell can you draw any conclusions without

a) being an expert

b) looking at their code and talking to IHVs

 

For example, we all know about the stellar performance of DOOM (2016) on the Vulkan API, with silky-smooth frame delivery and so on. After that, many people came out singing the praises of Vulkan. Of course it did demonstrate the potential, but that achievement is not just a merit of Vulkan; it's also a merit of the id Software engineers, who knew exactly what they were doing and produced a well-optimized code base. A lesser developer may not have done as good a job, and then we would be saying different things... So I guess my issue is that regardless of what results a particular benchmark or game produces, it will be extremely difficult for us to draw definitive conclusions.


1 hour ago, Humbug said:

Hmmm, the argument in this thread is largely the reason I don't care much for synthetic benchmarks. If I am going to buy a Vega GPU in March, why would I look at its 3DMark score when I can instead look at its performance in real-world applications that I can actually use?

 

 

Exactly. Futuremark failed with Time Spy, in my opinion, doing little more than another DX11 benchmark with some DX12 features added on.

 

I feel they'll do the exact same with their next set of tests, failing to deliver on the basic premise of the APIs, never mind the best practices laid out by AMD and NVIDIA: creating IHV-specific code paths to optimise as much as possible.

 

Drivers have minimal impact now, and they should, since the majority of the work shifts to the developer.

 

DOOM is a perfect example, and it wasn't even developed from the ground up for Vulkan alone.

 

I stopped caring much for 3DMark back with Vantage, and I hope people start looking at real-world work and gaming tests instead.

 

We'll see what Futuremark does, but I suspect it'll cause more outcry again rather than allowing people to judge which card is better for a given API.

 

 


 

