
Navi 21/23 Cards Rumored (aka "Nvidia Killers" xD)

3 minutes ago, ryao said:

All of the software using transform feedback / stream output will kill the performance of a tile based architecture. DXVK found a large amount of it using transform feedback. It is a mandatory feature of Direct3D 11 and Direct3D 12:

 

https://docs.microsoft.com/en-us/windows/win32/direct3d12/hardware-feature-levels

That's why Ray Tracing is the potential solution to the problems of multi GPU, but you have to change the entire render workflow. The interdependence on each GPU is pretty much the reason why everyone gave up on it.


11 minutes ago, leadeater said:

That's why Ray Tracing is the potential solution to the problems of multi GPU, but you have to change the entire render workflow. The interdependence on each GPU is pretty much the reason why everyone gave up on it.

There are other issues too that are orthogonal to ray tracing:

 

“It's worth noting that tiling architectures do run into similar issues without transform feedback if you enable a geometry or tessellation shader but transform feedback certainly isn't helping.”

 

http://jason-blog.jlekstrand.net/2018/10/transform-feedback-is-terrible-so-why.html

 

This is why Vulkan and Direct3D 12 have the game developers do the work. They can design their games to not do things that hurt tiled rendering across multiple GPUs when exposed to the issues firsthand. Tiled rendering is more general than multiGPU though. It is used on the iPhone for example to reduce GPU memory bandwidth requirements. It is known for being good at that.
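To make the bandwidth point concrete, here is a minimal toy model in Python. It is not any real driver or API code: triangle coverage is faked with rectangles and the sizes are invented, so it only shows the shape of the argument, namely that a tiler writes each tile to DRAM once instead of paying for overdraw, and that streaming per-primitive results out in submission order (as transform feedback requires) is exactly the kind of deferral-breaking behaviour that hurts it.

```python
# Toy comparison of framebuffer traffic: immediate-mode vs. tile-based rendering.
# Purely illustrative; "triangles" are axis-aligned rectangles and all numbers are made up.
import random

W, H, TILE = 256, 256, 32
random.seed(1)

# Fake triangles as (x0, y0, x1, y1) screen rectangles.
tris = [(x, y, x + 64, y + 64)
        for x, y in ((random.randrange(0, W - 64), random.randrange(0, H - 64))
                     for _ in range(200))]

# Immediate mode: every covered pixel of every primitive is written to DRAM (overdraw costs bandwidth).
imm_writes = sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in tris)

# Tile-based: bin primitives per tile, resolve each tile in on-chip memory,
# then write each touched tile's pixels to DRAM exactly once.
bins = {}
for t in tris:
    x0, y0, x1, y1 = t
    for ty in range(y0 // TILE, (y1 - 1) // TILE + 1):
        for tx in range(x0 // TILE, (x1 - 1) // TILE + 1):
            bins.setdefault((tx, ty), []).append(t)
tiled_writes = len(bins) * TILE * TILE  # one flush per touched tile

print(f"immediate-mode DRAM pixel writes: {imm_writes}")
print(f"tile-based DRAM pixel writes:     {tiled_writes}")
# Transform feedback breaks this model: transformed geometry must be streamed out
# in submission order *before* binning and tile resolve, so the deferral is lost.
```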


1 hour ago, leadeater said:

I would also amend the original statement to development studios or project managers because the actual developers work ridiculous hours and releasing the product as soon as possible comes before all else, even at the cost of quality.

 

Games like Ashes of the Singularity perfectly illustrate how priorities affect the final product: when technical standards come first, with an aim to cover all available features on the market, the final product will deliver on that. Stardock/Oxide Games doesn't have greater technical capability than any other development studio; they just have different priorities. Galactic Civilizations III, another Stardock-developed game, also shows this: after the game was released and greater-than-4-core CPUs came on the market, even if only in the HEDT sector, they did a major redesign of the entire game engine to allow true multi-core scaling across as many cores as the system has, and it even works across multiple sockets. This was 2 years after the game's initial release; that sort of redevelopment is very rare.

 

The willingness to walk away from multi GPU is very short-sighted, and that goes for everyone involved. There is nothing fundamental that prevents it from working, and everyone would benefit from it: GPUs would be cheaper to buy, cheaper to make, architecture generations would evolve quicker, and there would likely be flow-on benefits to motherboard designs and CPU focus areas. Tech reviewers really need to change their tune: instead of giving the standard rundown of how multi GPU is effectively dead and advising us to buy the single best card we can, they should actively put pressure on publishers (these are the ones that actually matter) to have games that actually support this technology. Public shaming really does go a long way, the type I agree with.

Certainly. A lot of the digital industry is overworked (software developers, video editors, digital art, animation, VFX etc.), and the result is a lack of attention to the technical implementations. The MO seems to be "as long as it's good enough, anything more is fiddling around". That's not necessarily on the developers but on project managers and suits, as you say.

 

MultiGPU could be good if implemented properly but no one seems willing to find a good way of doing it where you avoid the pitfalls we've seen to date. Many new GPUs don't support the implicit implementations (SLI and Crossfire) anymore. Many of Nvidia's cards don't and it looks like Navi cards don't either. Developers can do it themselves but not many will and again: the quality needs to hit a certain level or it'll just create more problems. The video games industry needs an overhaul but who would be willing to spend the money on extra development time and will we see studios (in response) do even more microtransactions and lootboxes? Will they have the balls to try to increase video game prices past $60? There is a cancer growing but like many other systemic problems all the solutions are made in the name of short term gain.


58 minutes ago, leadeater said:

There isn't a lot of point to doing so, not with DX12 and Vulkan, and also PCIe 4.0. It's currently a software issue, so the only real solution is *cough* GameWorks *cough*, you know what I mean though?

 

Tile-based rendering needs a good kick-start and to be brought up a layer into the game engine and graphics APIs. Nvidia already does this, but it's down at the driver and hardware level, which isn't where you want to work from for multi GPU.

 

Interestingly, Ray Tracing might actually help this become a thing again. One of the problems with multi GPU in the past was post-processing effects and lighting/shadows: if you split those tasks across GPUs and don't evaluate the entire frame, you can get differences in shadow depth, coverage/alignment and lighting levels. That is why (from my understanding) the rendered frame is reconstructed first and post effects applied afterwards, and that is where the majority of multi GPU setups have the most issues (every OMG WTF moment I have seen has been either shadows or lighting). Ray Tracing can allow the distribution of work across GPUs without that problem: http://khrylx.github.io/DSGPURayTracing/

That's an interesting take on it. Nice.

 

I often wonder how you deal with edge cases on multi-GPU split-screen/section rendering. If a blur/filter/effect spans the two pixels on the edge of each GPU's render, how do you tell it "blur this green with that red" between them? The math and coding must make heads spin! XD
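To put the split-screen edge case in concrete terms, here is a tiny pure-Python sketch (random image, no real GPU work) of a 3x3 box blur straddling a horizontal split-frame boundary. Without exchanging a "halo" of border rows between the two halves, the seam pixels get blurred with incomplete data; with the halo copied across, the result matches a single-GPU render. This is also why per-pixel ray tracing sidesteps the problem: any pixel can be shaded against the full scene on whichever GPU owns it.

```python
# Toy illustration of the seam problem in split-frame rendering:
# each "GPU" owns half the rows of the image, and a 3x3 box blur at the
# boundary needs a halo row owned by the other half.
import random

W, H, R = 8, 8, 1  # image size and blur radius
random.seed(0)
img = [[random.random() for _ in range(W)] for _ in range(H)]

def blur_rows(image, rows):
    """Box-blur the given rows of `image`, clamping at the edges of `image`."""
    out = {}
    for y in rows:
        for x in range(W):
            acc, n = 0.0, 0
            for dy in range(-R, R + 1):
                for dx in range(-R, R + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < len(image) and 0 <= xx < W:
                        acc += image[yy][xx]
                        n += 1
            out[(y, x)] = acc / n
    return out

half = H // 2
reference  = blur_rows(img, range(H))                 # single GPU: sees the whole image
gpu0_naive = blur_rows(img[:half], range(half))       # GPU0 without the halo row from GPU1
gpu0_halo  = blur_rows(img[:half + R], range(half))   # GPU0 after exchanging R halo rows

seam = [(half - 1, x) for x in range(W)]
print("max seam error without halo exchange:", max(abs(reference[p] - gpu0_naive[p]) for p in seam))
print("max seam error with halo exchange:   ", max(abs(reference[p] - gpu0_halo[p]) for p in seam))
```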


14 minutes ago, Trixanity said:

Certainly. A lot of the digital industry is overworked (software developers, video editors, digital art, animation, VFX etc.), and the result is a lack of attention to the technical implementations. The MO seems to be "as long as it's good enough, anything more is fiddling around". That's not necessarily on the developers but on project managers and suits, as you say.

 

MultiGPU could be good if implemented properly but no one seems willing to find a good way of doing it where you avoid the pitfalls we've seen to date. Many new GPUs don't support the implicit implementations (SLI and Crossfire) anymore. Many of Nvidia's cards don't and it looks like Navi cards don't either. Developers can do it themselves but not many will and again: the quality needs to hit a certain level or it'll just create more problems. The video games industry needs an overhaul but who would be willing to spend the money on extra development time and will we see studios (in response) do even more microtransactions and lootboxes? Will they have the balls to try to increase video game prices past $60? There is a cancer growing but like many other systemic problems all the solutions are made in the name of short term gain.

That is because doing it in the driver, for APIs unfriendly to the concept, was insane. From what I am told, you cannot know whether it is safe to divide the work without a priori knowledge, and it appears that you end up with stuttering when the division of work is wrong.

 

The way forward is to give the game developers the tools needed to do it themselves. It forces them to design their software in ways friendly to the concept. That is what Vulkan and Direct3D 12 do.


11 hours ago, Results45 said:

 

Why are you being a stickler for monolithic chip designs? Yes, SoC designers now have to deal with latency between chiplets (down to nanoseconds), but as long as the interconnect fabric is fast/strong/versatile enough, is being able to more easily scale performance (from 2 chiplets on the low-end to 16 chiplets on the insane end) not all the more worth it?

I am not; chiplets will be an advantage in the future. I'm just saying it will be much harder to do for games. Something like that should be out for compute very soon.

10 hours ago, Damascus said:

Doesn't the arch shit itself after 56 CUs? I thought it was basically impossible to run more than 64. Maybe this could be achieved with Infinity Fabric, but as OP said, get a dump truck full of salt ready.

On GCN, scaling drops off quite a bit after 12 CUs per shader engine, but Vega 56 for example has 4 shader engines, and the 5700 XT achieves higher performance with just 2 of them. AMD could now easily double the number of shader engines and get near-100% scaling on that as long as their scheduler is good, and they still have room to grow within each shader engine, so they should be fine scaling-wise for a while.
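For reference, here is the CU arithmetic behind that argument, written out. The Vega 56 and RX 5700 XT splits are public specs; the doubled configuration is pure speculation based on the post above, not a leaked part.

```python
# Back-of-the-envelope CU math; the "doubled Navi" row is hypothetical.
configs = {
    "Vega 56 (GCN)":             {"shader_engines": 4, "cus_per_se": 14},
    "RX 5700 XT (Navi 10)":      {"shader_engines": 2, "cus_per_se": 20},
    "Hypothetical doubled Navi": {"shader_engines": 4, "cus_per_se": 20},
}
for name, c in configs.items():
    total = c["shader_engines"] * c["cus_per_se"]
    print(f"{name:27s} {c['shader_engines']} SE x {c['cus_per_se']} CU = {total} CUs")
```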

33 minutes ago, leadeater said:

That's why Ray Tracing is the potential solution to the problems of multi GPU, but you have to change the entire render workflow. The interdependence on each GPU is pretty much the reason why everyone gave up on it.

Problem is, is it viable to drop support for older games so soon?


For years now AMD didn't even have a legitimate offer in the higher-end stuff. I got myself a 1080 Ti while AMD could barely beat the 1070, and they had to crank it all to 11 to do so. I sat there with my newly bought 1440p ultrawide and really didn't have another choice than to go with Nvidia if I wanted playable fps, even having to sacrifice FreeSync for it.

What I'm trying to say is they really shouldn't be too cocky beforehand. How about you show us the hardware first and let someone confirm the performance who isn't feeding their family with AMD money? Then maybe you can be proud of yourself. Calling it an "Nvidia killer" already just sounds like a provocation; Nvidia is gonna get mad and bite back, and AMD isn't strong enough yet.

I remember how hyped I was for Vega and that didn't happen, so with this one I'm gonna wait and see, and them being cocky beforehand really doesn't paint them in the best light. I mean, have they seen any of the Karate Kid movies? The loudest mouth always gets kicked in the face in the end and the quiet underdog wins. Words are silver, actions are gold, or however that whole thing goes.

Alright, I'm getting too worked up about this, but I used to be a strong AMD advocate in my younger years and they really have let me down over the years, so it's gonna take a while for them to convince me.

"You know it'll clock down as soon as it hits 40°C, right?" - "Yeah ... but it doesnt hit 40°C ... ever  😄"

 

GPU: MSI GTX1080 Ti Aero @ 2 GHz (watercooled) CPU: Ryzen 5600X (watercooled) RAM: 32GB 3600Mhz Corsair LPX MB: Gigabyte B550i PSU: Corsair SF750 Case: Hyte Revolt 3

 


23 hours ago, Humbug said:

The CPU desktop enthusiast / gamer / workstation market is more rational and less brand loyal. Whenever AMD has the best product technically it sells well, therefore AMD has incentive to price well and sell large volumes and increase their market share.

 

The GPU market is different: more brand-loyal and irrational. Most people would rather buy even an inferior, slower GeForce GPU than a superior, faster Radeon GPU. Most people in this market just buy the best GeForce GPU that they can afford, so when AMD reduces their prices they are taking a hit to their margins with minimal market share impact. AMD has realized this now, so they are going to focus on good margins going forward. They will only undercut Nvidia slightly, by about $50, because they know from history that either way only 20% or so of consumers are going to buy Radeon, so they may as well settle for that and make as much money as possible.

There's another difference that you're overlooking though.

 

AMD's CPU division also launched a halo line of products that drives marketing. RTG hasn't had an equivalent flagship at launch in many... many years. Sure, the 200 series eventually caught up to and surpassed Kepler, but it took a long time for that to happen, by which point most people had moved on from caring about Kepler benchmarks.



4 minutes ago, cj09beira said:

Problem is, is it viable to drop support for older games so soon?

Shouldn't have to; anything old will be using an old API feature set, and as long as you maintain software and hardware support for that, you'll be able to diverge off to a new path without dropping legacy capabilities. We're still a long way off from not having to stick with hybrid ray tracing, so it'll just be a matter of looking at which components can be taken out and achieved in a different way.

 

As far as I can see, there's no reason why Ray Tracing cores couldn't be chiplets either.


8 minutes ago, ryao said:

That is because doing it in the driver, for APIs unfriendly to the concept, was insane. From what I am told, you cannot know whether it is safe to divide the work without a priori knowledge, and it appears that you end up with stuttering when the division of work is wrong.

 

The way forward is to give the game developers the tools needed to do it themselves. It forces them to design their software in ways friendly to the concept. That is what Vulkan and Direct3D 12 do.

Exactly, but contrary to the intent, it seems like Vulkan and DX12 killed multiGPU instead of revitalizing it. It seems developers don't have time to play with the built-in tools to do multiGPU, so it's become a niche that pretty much no one implements because it's just extra work for little gain (not many people are sitting on more than one card). So both the demand and the supply are pretty much gone. Someone would need to kickstart it if we're ever to see it take off. Maybe start off with an unlinked explicit implementation that takes advantage of the integrated GPU many have sitting in their Intel systems doing nothing. That could get the ball rolling, but ironically that would actually be bad for AMD, so perhaps when Intel launches their GPUs they'll push it hard to get a USP.


33 minutes ago, Trixanity said:

Exactly, but contrary to the intent, it seems like Vulkan and DX12 killed multiGPU instead of revitalizing it. It seems developers don't have time to play with the built-in tools to do multiGPU, so it's become a niche that pretty much no one implements because it's just extra work for little gain (not many people are sitting on more than one card). So both the demand and the supply are pretty much gone. Someone would need to kickstart it if we're ever to see it take off. Maybe start off with an unlinked explicit implementation that takes advantage of the integrated GPU many have sitting in their Intel systems doing nothing. That could get the ball rolling, but ironically that would actually be bad for AMD, so perhaps when Intel launches their GPUs they'll push it hard to get a USP.

I do not know what you mean by “unlinked explicit implementation that takes advantage of the integrated GPU”. Doing this outside of the game code is incredibly hard. I am not a graphics expert, but from what I have heard about what needs to be done to support this, I am not sure whether you can restrict the graphics API enough to be friendly to the driver doing this in place of the developer without completely breaking it.

 

You would probably get better performance from the driver developers spending the same amount of time improving their shader compilers than from them expending enormous resources to go down this route again to try to use the iGPU to render parts of frames. It is an open secret that shader compilers (especially on AMD) have room for improvement that can send FPS higher.

 

The only one who really benefited from SLI was Nvidia in that they forced AMD to divert their far more limited resources into crossfire, causing them to have a worse driver than they could have had. They were too busy keeping up with nvidia’s SLI to address fundamental issues in their driver like the fact that they need to spend several times the amount of time to get the same result due to how they maintain multiple independent drivers, their poor quality shader compiler output and all of the various bugs. Also, nvidia got to sell more graphics cards before people caught on to the microstutter problem.

 

Game engine developers might be able to help add support for multi-GPU through Vulkan and Direct3D 12, but fixing all of the things that are not friendly to it is a big task. Then there is also the paralyzing combination of vendor lock-in on direct3d keeping them from doing anything with Vulkan and Windows 7 keeping them from doing anything with Direct3D 12.

 

If they move to Vulkan, this would be better, but Microsoft seems to be doing a good job at holding back progress. They not only do a good job of keeping developers off open standards that work on older versions of their operating system and on other operating systems, but they also only recently put Direct3D 12 on Windows 7 and it isn’t even a complete backport, so the game developers need to do special things to use it.


A little while ago I saw a Coreteks video discussing the die sizes of the current-gen AMD cards (5700 & XT), in which he commented on the fact that AMD could "easily" up the die size and stuff in more CUs and whatnot.

I believe it was this video where he discusses this: 

I won't ignore that he does seem biased towards AMD, putting them in the light of David vs Goliath (Nvidia being Goliath of course, hehe).

 

 


14 minutes ago, ryao said:

I do not know what you mean by “unlinked explicit implementation that takes advantage of the integrated GPU”. Doing this outside of the game code is incredibly hard. I am not a graphics expert, but from what I have heard about what needs to be done to support this, I am not sure whether you can restrict the graphics API enough to be friendly to the driver doing this in place of the developer without completely breaking it.

 

You would probably get better performance from the driver developers spending the same amount of time improving their shader compilers than from them expending enormous resources to go down this route again to try to use the iGPU to render parts of frames. It is an open secret that shader compilers (especially on AMD) have room for improvement that can send FPS higher.

 

The only one who really benefited from SLI was Nvidia in that they forced AMD to divert their far more limited resources into crossfire, causing them to have a worse driver than they could have had. They were too busy keeping up with nvidia’s SLI to address fundamental issues in their driver like the fact that they need to spend several times the amount of time to get the same result due to how they maintain multiple independent drivers, their poor quality shader compiler output and all of the various bugs. Also, nvidia got to sell more graphics cards before people caught on to the microstutter problem.

 

That said, game engine developers might be able to help add support for multi-GPU through Vulkan and Direct3D 12, but fixing all of the things that are not friendly to it is a big task. Then there is also the paralyzing combination of vendor lock-in on direct3d keeping them from doing anything with Vulkan and Windows 7 keeping them from doing anything with Direct3D 12. If they move to Vulkan, this would be better, but Microsoft seems to be doing a good job at holding back progress. They not only do a good job of keeping developers off open standards, they also only recently put Direct3D 12 on Windows 7 and it isn't even a complete backport, so the game developers need to do special things to use it.

Explicit and implicit multiGPU is terminology used in DX12 documentation.

What I'm talking about is functionality available in DX12. It just requires game developers to code for it. I think Oxide Games had a tech demo showing off unlinked explicit multiGPU.

 

I'll attach some images explaining each possible multiGPU implementation in DX12:

[Attached diagrams (in spoilers): Implicit_575px.PNG, Explicit1_575px.PNG, ExplicitUnlinked_575px.PNG, ExplicitLinked_575px.PNG, illustrating the implicit, explicit, explicit-unlinked and explicit-linked multi-adapter modes]

 

One of the main points of the low-level APIs was that they give developers more control. That includes programming their own multiGPU support. It's theirs for the taking, but many developers either don't have the time or the skill to do stuff like this properly, so it hasn't really been used much. Initially most just relied on implicit multiGPU, which is pretty much the same as SLI/Crossfire in its reliance on driver support.

 

I don't think Windows 7 is much of a hindrance anymore. It should be noted that DX12 has been backported to W7 in some limited capacity but the future is W10 either way (whether we like it or not). I'm not sure what APIs Vulkan has regarding multiGPUs, what the capabilities are and how it's different from DX12 but I'd assume they're similar but probably not identical in feature set.

 

Quite frankly developers have had APIs to implement multiGPU on their own for years but they're not doing anything with it so it's not that there's a technical limitation. Basically multiGPU should not rely on drivers. It's a stupid crutch. It's not the first time that we've seen driver implementations or attempts at it that yielded inconsistent or even bad results.
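To make the "explicit unlinked" case from those diagrams concrete, here is a conceptual sketch in plain Python. The names (Device, copy_to, render_scene and so on) are invented for illustration and are not DXGI/D3D12 API calls; the point is simply that in the unlinked explicit model the application itself owns both adapters, decides what runs where, and performs the cross-adapter copies, which is exactly the work most developers are choosing not to take on.

```python
# Conceptual sketch of DX12-style "unlinked explicit" multi-adapter work sharing:
# the app owns two independent devices and moves data between them itself.
# All names here are invented; nothing below is a real D3D12/DXGI call.
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    local_resources: dict = field(default_factory=dict)

    def render_scene(self, frame):
        # Stand-in for the heavy 3D pass on the discrete GPU.
        self.local_resources["color"] = f"frame{frame}-color"
        return self.local_resources["color"]

    def post_process(self, color):
        # Stand-in for a cheap post pass offloaded to the integrated GPU.
        return color + "+tonemapped"

def copy_to(dst: Device, name: str, payload):
    # In real explicit multi-adapter code the app records cross-adapter copies itself;
    # here it is just a dictionary write.
    dst.local_resources[name] = payload
    return payload

dgpu = Device("discrete GPU")
igpu = Device("integrated GPU")

for frame in range(3):
    color = dgpu.render_scene(frame)        # adapter 0: main render
    shared = copy_to(igpu, "color", color)  # app-managed cross-adapter copy
    final = igpu.post_process(shared)       # adapter 1: post-processing
    print(f"presented {final} (the app scheduled both adapters explicitly)")
```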


2 hours ago, leadeater said:

There isn't a lot of point to doing so, not with DX12 and Vulkan, and also PCIe 4.0. It's currently a software issue, so the only real solution is *cough* GameWorks *cough*, you know what I mean though?

 

Tile-based rendering needs a good kick-start and to be brought up a layer into the game engine and graphics APIs. Nvidia already does this, but it's down at the driver and hardware level, which isn't where you want to work from for multi GPU.

 

Interestingly, Ray Tracing might actually help this become a thing again. One of the problems with multi GPU in the past was post-processing effects and lighting/shadows: if you split those tasks across GPUs and don't evaluate the entire frame, you can get differences in shadow depth, coverage/alignment and lighting levels. That is why (from my understanding) the rendered frame is reconstructed first and post effects applied afterwards, and that is where the majority of multi GPU setups have the most issues (every OMG WTF moment I have seen has been either shadows or lighting). Ray Tracing can allow the distribution of work across GPUs without that problem: http://khrylx.github.io/DSGPURayTracing/

Why not the hardware level?

 

Why couldn't multi-GPU work on parts/quadrants of said resolution? Like 4x TU102 or whatever chips (overkill), but each being solely responsible for its quadrant, so each would be driving 1080p for 4K, and then all synced together. Then this would be up to Intel, AMD and Nvidia, and no game would need anything else?

 


20 minutes ago, Trixanity said:

Explicit and implicit multiGPU is terminology used in DX12 documentation.

What I'm talking about is functionality available in DX12. It just requires game developers to code for it. I think Oxide Games had a tech demo showing off unlinked explicit multiGPU.

 

I'll attach some images explaining each possible multiGPU implementation in DX12:

[Attached diagrams (in spoilers): Implicit_575px.PNG, Explicit1_575px.PNG, ExplicitUnlinked_575px.PNG, ExplicitLinked_575px.PNG, illustrating the implicit, explicit, explicit-unlinked and explicit-linked multi-adapter modes]

 

One of the main points of the low-level APIs was that they give developers more control. That includes programming their own multiGPU support. It's theirs for the taking, but many developers either don't have the time or the skill to do stuff like this properly, so it hasn't really been used much. Initially most just relied on implicit multiGPU, which is pretty much the same as SLI/Crossfire in its reliance on driver support.

 

I don't think Windows 7 is much of a hindrance anymore. It should be noted that DX12 has been backported to W7 in some limited capacity but the future is W10 either way (whether we like it or not). I'm not sure what APIs Vulkan has regarding multiGPUs, what the capabilities are and how it's different from DX12 but I'd assume they're similar but probably not identical in feature set.

 

Quite frankly developers have had APIs to implement multiGPU on their own for years but they're not doing anything with it so it's not that there's a technical limitation. Basically multiGPU should not rely on drivers. It's a stupid crutch. It's not the first time that we've seen driver implementations or attempts at it that yielded inconsistent or even bad results.

Implicit multiadapter with unlinked GPUs is an inferior version of what SLI and crossfire do. It is a bad idea. There are no tools for developers to support it aside from knowing what not to do when writing software. It is the wrong place to do this.

 

By the way, half of China is on Windows 7 and it is a growth market:

 

https://www.extremetech.com/gaming/297092-microsoft-makes-it-easier-to-bring-directx-12-games-to-windows-7

 

Either Microsoft stops making Vulkan a second class citizen in their developer tools to encourage adoption or they do a proper backport that lets users install Direct3D 12 on older versions of Windows. Otherwise, explicit multiGPU is unlikely to gain much appeal.

 

That said, I don’t really think that it is a very important feature. I used to say it when I was younger and people suggested multicore GPUs after seeing the benefits of multicore CPUs, but there is nothing that you can get from two GPUs that you cannot get in a better way by doubling execution resources on the GPU. Doubling resources in the GPU scales far better.

 

What we need are better driver shader compilers. It wouldn’t surprise me if there are situations where a significant amount of the GPU’s potential is wasted by poor driver shader compilers. Valve has already demonstrated that on AMD GPUs with Mesa-ACO.
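As a toy illustration of the sort of win a smarter shader compiler can find, here is a tiny constant folder over a made-up expression tree. Real driver compilers work on GPU ISA and do vastly more than this, so treat it purely as a sketch of the principle: the same shader source can cost one multiply per fragment or two depending on how clever the compiler is.

```python
# Toy constant folding over a minimal expression tree. Illustrative only;
# real shader compilers operate on GPU ISA, not Python objects.
from dataclasses import dataclass

@dataclass
class Const:
    value: float

@dataclass
class Var:
    name: str

@dataclass
class Mul:
    left: object
    right: object

def fold(node):
    """Recursively fold constant subexpressions into single constants."""
    if isinstance(node, Mul):
        l, r = fold(node.left), fold(node.right)
        if isinstance(l, Const) and isinstance(r, Const):
            return Const(l.value * r.value)
        return Mul(l, r)
    return node

# A shader author writes color * (2.0 * 3.0); a naive compiler emits two
# multiplies per fragment, a folding compiler emits one.
expr = Mul(Var("color"), Mul(Const(2.0), Const(3.0)))
print(fold(expr))  # Mul(left=Var(name='color'), right=Const(value=6.0))
```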


1 minute ago, ryao said:

Implicit multiadapter with unlinked GPUs is what SLI and crossfire do. It is a bad idea. There are no tools for developers to support it aside from knowing what not to do when writing software. It is the wrong place to do this.

 

By the way, half of China is on Windows 7 and it is a growth market:

 

https://www.extremetech.com/gaming/297092-microsoft-makes-it-easier-to-bring-directx-12-games-to-windows-7

 

Either Microsoft stops making Vulkan a second class citizen in their developer tools to encourage adoption or they do a proper backport that lets users install Direct3D 12 on older versions of Windows. Otherwise, explicit multiGPU is unlikely to gain much appeal.

 

That said, I don’t really think that it is a very important feature. I used to say it when I was younger and people suggested multicore GPUs, but there is nothing that you can get in theory from two GPUs that you cannot get in a better way by doubling execution resources on the GPU. What we need are better driver shader compilers.

I know and that's why I said it's up to developers to embrace the tools provided to them in low level APIs but it's a hard sell when it's a lot more difficult and there is no audience for it.

 

If China is a growth market, one would assume that they'll eventually move on to W10; unless MS provides regional support, there'll be no more patches starting next year. MS has no incentive to push Vulkan. Vulkan is supposed to be platform-agnostic, so it's up to the organization behind Vulkan to make it the superior choice. Vulkan already has the issue of being blocked by Apple, so they have a lot of work to do.

 

Doubling execution units on GPUs is contingent on die sizes and process nodes. It's not a good thing to rely on considering the state of Moore's law. Whether a better shader compiler will do all that much remains to be seen. I've never heard anyone else bring that up as immediate performance gains being left on the table. And how much performance do you expect to gain? Double? Triple? How do you expect it to scale? If we're not talking a lot of performance and it's just a one time boost in performance then it might not be worth anyone's time.


48 minutes ago, Trixanity said:

I know and that's why I said it's up to developers to embrace the tools provided to them in low level APIs but it's a hard sell when it's a lot more difficult and there is no audience for it.

 

If China is a growth market, one would assume that they'll eventually move on to W10; unless MS provides regional support, there'll be no more patches starting next year. MS has no incentive to push Vulkan. Vulkan is supposed to be platform-agnostic, so it's up to the organization behind Vulkan to make it the superior choice. Vulkan already has the issue of being blocked by Apple, so they have a lot of work to do.

 

Doubling execution units on GPUs is contingent on die sizes and process nodes. It's not a good thing to rely on considering the state of Moore's law. Whether a better shader compiler will do all that much remains to be seen. I've never heard anyone else bring that up as immediate performance gains being left on the table. And how much performance do you expect to gain? Double? Triple? How do you expect it to scale? If we're not talking a lot of performance and it's just a one time boost in performance then it might not be worth anyone's time.

 

It depends on the software, but simply using D9VK on A Hat in Time on AMD graphics hardware (Vega, if I recall) is said to triple performance. More than the shader compiler changed there, though, and the benefits of improvements only appear where those improvements remove existing inefficiencies. Improving performance through better software is not a one-time thing, because the efficiency of various workloads varies: what works for one does not work for another, and some are already very efficient. Improved driver compilers and other tweaks are basically what Nvidia's "wonder drivers" that magically improve performance in titles do, though.

 

That said, you can resolve yield issues with large silicon dies by disabling areas that are defective. That is what these guys are doing on the scale of an entire wafer:

 

https://www.eetimes.com/document.asp?doc_id=1335043&page_number=1

 

You are going to be using the same amount of silicon anyway, so why add redundant PCBs and introduce the complications of NUMA into the GPU? Also, even if you do not see the bad silicon, you still pay for it, so being able to use it makes good sense. You get a lower cost per mm^2 through less waste that way as long as you design things well enough that this trick gives you near 100% yields. At the scale of an entire wafer, they definitely designed their chips to obtain near 100% yields. Those cost something like >$100,000 each at high volume and many times more at low volume. Losing an entire wafer at low volume is painful.

 

Anyway, people have been hoping for a magic bullet here for a long time and it just is not going to happen. The closest here would be having the game / game engine developers do it, but there is not much demand for that. The user base is not there. The API support is not there either because they are going to need to use Direct3D 11 for the foreseeable future. Also, Direct3D 12 and Vulkan are not for everyone as developers get to reimplement things done for them in exchange for being able to custom tailor things to suit their software. OpenGL 4.6 and Direct3D 11 will continue to be useful for a very long time.

 

Developers are far more likely to use multiGPU on Google Stadia where they are guaranteed to have it available and Vulkan is in use.

 

By the way, assuming that we have some breakthrough where games become friendly to tiled GPU architectures to make multiGPU scaling work, how do you expect having to buy double the number of GPUs every couple of years to scale if we are not relying on die shrinks?

 

1 GPU 2019 (total 1 GPU)

1 GPU 2021 (total 2 GPUs)

2 GPUs 2023 (total 4 GPUs)

4 GPUs 2025 (total 8 GPUs)

8 GPUs 2027 (total 16 GPUs)

 

Ignoring the problem of connecting all of these to the same machine, your wallet is not going to be able to take it. Less than 0.1% of the population can be expected to keep doing this past the 20 year mark (1024 GPUs!). Even larger dies designed to maximize yields won’t save us from this problem, but they are more cost efficient and lack the problem of interconnection issues.
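That schedule written out as a quick loop, assuming performance has to double every two years with no per-GPU gains so the GPU count doubles instead. The flat $500 price is a placeholder, not a prediction.

```python
# The doubling schedule above as a loop; the price is a made-up placeholder.
gpus_needed, year = 1, 2019
price_per_gpu = 500
for _ in range(11):  # out to the 20-year mark
    print(f"{year}: {gpus_needed:5d} GPU(s) in the box, ~${gpus_needed * price_per_gpu:,}")
    year += 2
    gpus_needed *= 2
```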


41 minutes ago, ryao said:

 

It depends on the software, but simply using D9VK on A Hat in Time on AMD graphics hardware (Vega, if I recall) is said to triple performance. More than the shader compiler changed there, though, and the benefits of improvements only appear where those improvements remove existing inefficiencies. Improving performance through better software is not a one-time thing, because the efficiency of various workloads varies: what works for one does not work for another, and some are already very efficient. Improved driver compilers and other tweaks are basically what Nvidia's "wonder drivers" that magically improve performance in titles do, though.

 

That said, you can resolve yield issues with large silicon dies by disabling areas that are defective. That is what these guys are doing on the scale of an entire wafer:

 

https://www.eetimes.com/document.asp?doc_id=1335043&page_number=1

 

You are going to be using the same amount of silicon anyway, so why add redundant PCBs and introduce the complications of NUMA into the GPU? Also, even if you do not see the bad silicon, you still pay for it, so being able to use it makes good sense. You get a lower cost per mm^2 through less waste that way as long as you design things well enough that this trick gives you near 100% yields. At the scale of an entire wafer, they definitely designed their chips to obtain near 100% yields. Those cost something like >$100,000 each at high volume and many times more at low volume. Losing an entire wafer at low volume is painful.

 

Anyway, people have been hoping for a magic bullet here for a long time and it just is not going to happen. The closest here would be having the game / game engine developers do it, but there is not much demand for that. The user base is not there. The API support is not there either because they are going to need to use Direct3D 11 for the foreseeable future. Also, Direct3D 12 and Vulkan are not for everyone as developers get to reimplement things done for them in exchange for being able to custom tailor things to suit their software. OpenGL 4.6 and Direct3D 11 will continue to be useful for a very long time.

 

Developers are far more likely to use multiGPU on Google Stadia where they are guaranteed to have it available and Vulkan is in use.

 

By the way, assuming that we have some breakthrough where games become friendly to tiled GPU architectures, how do you expect having to buy double the number of GPUs every couple of years to scale if we are not relying on die shrinks?

 

1 GPU 2019 (total 1 GPU)

1 GPU 2021 (total 2 GPUs)

2 GPUs 2023 (total 4 GPUs)

4 GPUs 2025 (total 8 GPUs)

8 GPUs 2027 (total 16 GPUs)

 

Ignoring the problem of connecting all of these to the same machine, your wallet is not going to be able to take it. Less than 0.1% of the population can afford to keep doing this past the 20 year mark. Even larger dies designed to maximize yields won’t save us from this problem, but they are more cost efficient and lack the problem of interconnect issues.

And what makes you think that this super GPU with double the silicon on it will cost significantly less than two cards with half as much each?

 

It would certainly be cheaper (using chiplets; a massive monolithic die would almost certainly end up more expensive), but not massively so.



12 minutes ago, 79wjd said:

And what makes you think that this super GPU with double the silicon on it will cost significantly less than two cards with half as much die area each?

Do a bill of material cost calculation assuming that the GPUs are designed to tolerate defects:

 

GPU die of size N - X dollars

GPU die of size 2N - 2X dollars

Everything else - Y dollars

 

Two graphics cards are 2X + 2Y. One card with a larger GPU made to get yields is 2X + Y. The larger-GPU graphics card is therefore cheaper.

 

By the way, if you include memory chips, the larger GPU graphics card can use less silicon. Hence why Y is smaller than 2Y. There is other stuff like the PCB, HSF, assembly, packaging, shipping, etcetera, plus supply and demand curves. This is a simplification, but it shows the concept. The trick to getting this to work is to design the GPUs to be tolerant enough of defects that you can use nearly all of the dies, which drives down costs.
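Plugging made-up numbers into that bill-of-materials comparison; the dollar figures are invented purely to show the shape of the argument, not real costs.

```python
# Hypothetical BOM numbers for the comparison above; every figure is invented.
X = 120   # cost of a GPU die of size N
Y = 150   # cost of everything else on one card (PCB, VRM, cooler, assembly...)

two_cards   = 2 * X + 2 * Y   # two cards, each with a die of size N
one_big_die = 2 * X + Y       # one card with a defect-tolerant die of size 2N

print(f"two cards:           ${two_cards}")
print(f"one double-size die: ${one_big_die}")
print(f"saving:              ${two_cards - one_big_die} (the duplicated 'everything else')")
```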


4 minutes ago, ryao said:

Do a bill of material cost calculation:

 

GPU die of size N - X dollars

GPU die of size 2N - ~2X dollars

Everything else - Y dollars

 

Two graphics cards are 2X + 2Y. One card with a larger GPU made to get yields is ~2X + Y. The larger-GPU graphics card is therefore cheaper.

 

By the way, if you include memory chips, the larger GPU graphics card can use less silicon. Hence why Y is smaller than 2Y. This is a simplification, but it shows the concept. The trick to getting this to work is to design the GPUs to be tolerant enough to defects that you can use nearly all of the dies, which drives down costs.

cheaper for whom?


5 minutes ago, pas008 said:

cheaper for whom?

Well, let me put it this way. Computers used to cost millions of dollars before cheaper models that used smaller parts were made. The IBM PC was such a revolution in doing this that one of the people at IBM picked it up and said that it was too light to be a computer. The Raspberry Pi took this concept to an extreme and produced a $35 computer.

 

Who do you think benefits when improvements in technology lower the cost of it by allowing less hardware to be necessary?


4 hours ago, leadeater said:

Public shaming really does go a long way, the type I agree with.

Spoken like a true fan, of The Last Jedi.

 

*Raises Fists, spins OG MK theme; flexes gluteus*



16 minutes ago, 79wjd said:

And what makes you think that this super GPU with double the silicon on it will cost significantly less than two cards with half as much each?

 

It would certainly be cheaper (using chiplets; a massive monolithic die would almost certainly end up more expensive), but not massively so.

One less cooler, one less PCB, etc.; it can easily be close to half the cost just because of this.


Chiplets are a way to work around the problem of defects. They also have the benefit of allowing you to produce lower-end models with less silicon without producing different dies from the higher-end models.

 

What Cerebras did was make basically everything in the die fully redundant such that there is little chance of defects making a die unusable, which is another way to handle defects. They have demonstrated that you can get large dies that way by making an entire wafer into a single die. They do not do lower end models (as far as I know).

 

I am not an expert at hardware engineering, but from what I can tell, AMD's chiplet approach has the benefit of producing lower-end models more cost-effectively, while Cerebras' approach has the benefit of ensuring all components have data paths in silicon, which gives the highest bandwidth. Cerebras probably has a higher percentage of etched silicon being actively used in shipped products too. I have no idea how much of that would be lost to the redundancy needed to make the large-die approach feasible though.
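A rough sketch of the defect arithmetic behind both approaches, using a simple Poisson yield model (the probability of a die having zero defects is e^(-D*A)). The defect density and die areas below are made up, and real processes and redundancy schemes are more complicated, so this only shows why smaller dies (or defect-tolerant big ones) waste less silicon.

```python
# Simple Poisson yield model: P(zero defects on a die) = exp(-D * A).
# Numbers are illustrative only; real defect densities and processes differ.
from math import exp

D = 0.1   # defects per cm^2 (made-up figure)
areas = {
    "one 500 mm^2 monolithic die": 5.00,   # area in cm^2
    "a single 125 mm^2 chiplet":   1.25,
}

for name, area_cm2 in areas.items():
    good = exp(-D * area_cm2)  # chance that this die has no defect at all
    print(f"{name}: {good:.1%} yield, so {1 - good:.1%} of that silicon is scrapped")

# With no defect tolerance, one flaw scraps 500 mm^2 in the monolithic case but only
# 125 mm^2 in the chiplet case (and a flawed chiplet can often still be sold as a lower bin).
# Cerebras instead builds redundancy into the huge die so a defect disables a small
# region rather than the whole part: same goal, different mechanism.
```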

 

In any case, multiGPU is basically a unicorn. You need the game developers to do it to get it done properly and they don’t do it very much for reasons of it just giving them an enormous amount of work for a very small set of users who will buy their games anyway. They are using Direct3D 11 and must implement either Vulkan or Direct3D 12 for a single GPU before even touching multiGPU. The assumption that the drivers can do it for them has been more than shown to be wrong. It is an even bigger amount of work with inferior results. Tiling GPUs in theory are basically multiGPU, but the way 3D graphics is done with Direct3D is not friendly to them, so they are not a solution unless we go back to the game developers making changes to their games to be multiGPU friendly. This is why multiGPU never really worked in the first place.


46 minutes ago, ryao said:

Well, let me put it this way. Computers used to cost millions of dollars before cheaper models that used smaller parts were made. The IBM PC was such a revolution in doing this that one of the people at IBM picked it up and said that it was too light to be a computer. The Raspberry Pi took this concept to an extreme and produced a $35 computer.

 

Who do you think benefits when improvements in technology lower the cost of it by allowing less hardware to be necessary?

Don't need to talk down to me.

 

Still doesn't answer my question: cheaper for the designer, the manufacturer, or the consumer?

 

2 smaller, easily binned chips are most likely cheaper than a huge monolithic one.

 

Why do you think Ryzen is doing so well?

