SLI Scaling solved?

This can be done through driver tweaks. Anyway, I think Maxwell v2 (the 980 Ti) should feature enhanced SLI compatibility.

It might be possible to do with driver tweaks, but neither AMD nor Nvidia would do it that way, since it would essentially make upgrading from last-gen 290/290X or 770/780 cards pointless, and the same goes for the 970/980 cards going into the 1000 series. I'd be willing to bet my left nut that it will be a major selling point of the next big GPU release.


 

TL;DR me and I will possibly say something in this post. Can't be bothered to read the whole thread.

 

 

From what I understand, the idea is to use the CPU to keep track of the workloads of the SLI'd GPUs, instead of having each GPU keep track of its own work as well as the work of the other GPU.

 

At least I think that's what he's saying. He drew a diagram:

[attached diagram]

This makes sense to me: rather than having each GPU hold the same workload in its frame buffer, have each one work on its own portion, all coordinated by the CPU.



It has been quite a while since Nvidia first brought out SLI, but they still haven't sorted out the issues inherent in the tech itself. I think I may have a solution, though it would still be a trial-and-error process.

 

I think Nvidia should offload all of the existing SLI scheduling work onto the CPU. That way the GPUs wouldn't have to bother with organising or separating their workloads. In most games the CPU mostly sits idle, and this would be a better way to utilise that expensive piece of silicon.

 

To do this, the driver could separate the screen into equal sections (one per card) so that the GPUs together would produce the whole frame. This would also mean the total available frame buffer could be pooled. The driver would then stitch the sections back together and display the result on the screen.

 

Here is an image to demonstrate what I mean.

 

[attached diagram: Capture.JPG]
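Just to make the idea concrete, here's a rough toy sketch in Python (purely illustrative; the slice split, GPU count, and the "render" stand-in are all made up) of the kind of split-and-stitch the driver would be doing:

```python
# Toy sketch of CPU-coordinated split-frame rendering (illustrative only).
# The "driver" on the CPU divides the screen into one horizontal slice per
# GPU, each "GPU" renders only its slice, and the driver stitches the
# slices back into a single frame before presenting it.

WIDTH, HEIGHT, NUM_GPUS = 8, 8, 2   # tiny frame so the output stays readable

def render_slice(gpu_id, y_start, y_end):
    """Stand-in for a GPU rendering rows y_start..y_end of the frame."""
    return [[gpu_id] * WIDTH for _ in range(y_start, y_end)]

def cpu_driver_render_frame():
    rows_per_gpu = HEIGHT // NUM_GPUS
    slices = []
    for gpu_id in range(NUM_GPUS):
        y_start = gpu_id * rows_per_gpu
        y_end = HEIGHT if gpu_id == NUM_GPUS - 1 else y_start + rows_per_gpu
        slices.append(render_slice(gpu_id, y_start, y_end))
    # Stitch the per-GPU slices back together into one frame.
    return [row for s in slices for row in s]

if __name__ == "__main__":
    for row in cpu_driver_render_frame():
        print("".join(str(px) for px in row))
```

Obviously a real driver would be dividing actual draw work, not filling arrays, but the divide-then-stitch flow is the point.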

 

Share what you think about this. Nvidia should really think about this issue.

 

This is basically what Mantle (and by extension DX12) is already capable of doing. It was hot news two weeks ago.


 


From what I understand, the idea is to use the CPU to keep track of the workloads of the SLI'd GPUs, instead of having each GPU keep track of its own work as well as the work of the other GPU.

 

At least I think that's what he's saying. He drew a diagram.

 

There are two ways for SLI/CFX to work: alternate frame rendering (AFR) and parallel frame rendering (PFR).

 

AFR is what we use now: each GPU renders one frame and leaves the next one to the other GPU. This is why we can get latency and uneven pacing, as not all frames are equally demanding.
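As a side note, here's a tiny toy simulation (the frame costs are made up, nothing more) of why AFR delivery gets uneven when frames aren't equally demanding:

```python
# Toy illustration of AFR frame pacing (all numbers are made up).
# Even frames go to GPU0, odd frames to GPU1. Frames must be presented in
# order, so an expensive frame holds up everything behind it and the gaps
# between displayed frames become uneven (micro-stutter).

frame_costs_ms = [10, 10, 30, 10, 10, 10]    # hypothetical per-frame render cost
gpu_free_at = [0.0, 0.0]                     # time each GPU finishes its last frame
display_times = []
last_display = 0.0

for i, cost in enumerate(frame_costs_ms):
    gpu = i % 2                              # AFR: alternate frames between GPUs
    finish = gpu_free_at[gpu] + cost
    gpu_free_at[gpu] = finish
    last_display = max(last_display, finish) # in-order presentation
    display_times.append(last_display)

gaps = [display_times[0]] + [b - a for a, b in zip(display_times, display_times[1:])]
print("frame-to-frame gaps (ms):", gaps)     # uneven gaps = perceived stutter
```

The uneven gaps in the output are exactly the micro-stutter people complain about with AFR.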

 

PFR is what the diagram describes, and it's something the driver and the API must both support, as well as the engine (meaning it's not used much). This way, you can use all the VRAM available in the system, but there is a chance of it introducing vertical tearing. It's also much harder to code correctly and to make good profiles for.

 

About the CPU controlling it all... it's all fine (assuming you don't have a weak Bulldozer chip) until you realise this won't be even remotely viable as a solution until full DX12 support matures (meaning at least Pascal). Pascal, on the other hand, is supposed to have a dedicated ARMv8 core on the GPU itself for things like this, and incorporates NVLink, so GPUs will have no problem communicating any more, and routing it all through the comparatively slow PCIe bus would be pointless.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


There are two ways for SLI/CFX to work: alternate frame rendering (AFR) and parallel frame rendering (PFR).

AFR is what we use now: each GPU renders one frame and leaves the next one to the other GPU. This is why we can get latency and uneven pacing, as not all frames are equally demanding.

PFR is what the diagram describes, and it's something the driver and the API must both support, as well as the engine (meaning it's not used much). This way, you can use all the VRAM available in the system, but there is a chance of it introducing vertical tearing. It's also much harder to code correctly and to make good profiles for.

About the CPU controlling it all... it's all fine (assuming you don't have a weak Bulldozer chip) until you realise this won't be even remotely viable as a solution until full DX12 support matures (meaning at least Pascal). Pascal, on the other hand, is supposed to have a dedicated ARMv8 core on the GPU itself for things like this, and incorporates NVLink, so GPUs will have no problem communicating any more, and routing it all through the comparatively slow PCIe bus would be pointless.

So basically... "no", right? I'll have to look up some of the things you mentioned.

 

Edit: When can I have NVLink on my personal rig? Lol



So basically... "no", right? I'll have to look up some of the things you mentioned.

Using the CPU for it is dumb. PFR isn't dumb, just not used yet, because the API (DX11) doesn't support it, I think, and no one has really used it before, so there's no existing code.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


Using the CPU for it is dumb. PFR isn't dumb, just not used yet, because the API (DX11) doesn't support it, I think, and no one has really used it before, so there's no existing code.

Lol, when can I have NVLink on my home desktop? It sounds amazeballs from Nvidia's description of it... that is, if they don't kneecap it before selling it.



Lol, when can I have NVLink on my home desktop? It sounds amazeballs from Nvidia's description of it... that is, if they don't kneecap it before selling it.

It comes with Pascal.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


It comes with Pascal.

And what is that? Lol. I tried googling it; the closest thing I found was something about an open-source compiler. Or is that what you're talking about? The Nvidia article made it seem like NVLink was a replacement for PCIe.



And what is that? Lol. I tried googling it; the closest thing I found was something about an open-source compiler. Or is that what you're talking about? The Nvidia article made it seem like NVLink was a replacement for PCIe.

Pascal is the architecture after Maxwell, so presumably the GTX 11xx series. NVLink is sort of a PCIe replacement, but more for inter-GPU comms than for talking to the CPU. The CPU has one NVLink to the first GPU, and the other GPUs just connect to GPU_0, which simplifies the data paths and dramatically increases bandwidth.
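For reference, here's a rough sketch of the topology as I read it (the layout is just my interpretation, and the bandwidth figures are approximate public numbers for NVLink 1.0 and PCIe 3.0 x16):

```python
# Rough sketch of the described topology (figures are approximate public
# numbers for NVLink 1.0 and PCIe 3.0 x16; the layout itself is an assumption).

links = {
    ("CPU",  "GPU0"): {"type": "NVLink",       "gb_per_s": 20},  # one CPU<->GPU0 link
    ("GPU0", "GPU1"): {"type": "NVLink",       "gb_per_s": 20},  # peers hang off GPU0
    ("GPU0", "GPU2"): {"type": "NVLink",       "gb_per_s": 20},
    ("CPU",  "PCIeGPU"): {"type": "PCIe3 x16", "gb_per_s": 16},  # for comparison
}

for (a, b), info in links.items():
    print(f"{a:8s} <-> {b:8s}  {info['type']:10s} ~{info['gb_per_s']} GB/s per direction")
```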

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


And what is that? Lol. I tried googling it; the closest thing I found was something about an open-source compiler. Or is that what you're talking about? The Nvidia article made it seem like NVLink was a replacement for PCIe.

 

An alternative to the PCIe bus. Your motherboard and processor would need to support it.


Pascal is the architecture after Maxwell, so presumably the GTX 11xx series. NVLink is sort of a PCIe replacement, but more for inter-GPU comms than for talking to the CPU. The CPU has one NVLink to the first GPU, and the other GPUs just connect to GPU_0, which simplifies the data paths and dramatically increases bandwidth.

Ahhh. Won't that take a new CPU and motherboard architecture if it's replacing PCIe 3.0?



There are two ways for SLI/CFX to work: alternate frame rendering (AFR) and parallel frame rendering (PFR).

AFR is what we use now: each GPU renders one frame and leaves the next one to the other GPU. This is why we can get latency and uneven pacing, as not all frames are equally demanding.

PFR is what the diagram describes, and it's something the driver and the API must both support, as well as the engine (meaning it's not used much). This way, you can use all the VRAM available in the system, but there is a chance of it introducing vertical tearing. It's also much harder to code correctly and to make good profiles for.

About the CPU controlling it all... it's all fine (assuming you don't have a weak Bulldozer chip) until you realise this won't be even remotely viable as a solution until full DX12 support matures (meaning at least Pascal). Pascal, on the other hand, is supposed to have a dedicated ARMv8 core on the GPU itself for things like this, and incorporates NVLink, so GPUs will have no problem communicating any more, and routing it all through the comparatively slow PCIe bus would be pointless.

 

Why have a second processor at all when you can use the heavily under-utilised CPU to do it for you? It might use some CPU time, but nowhere near all of it.


Ahhh. Won't that take a new CPU and motherboard architecture if it's replacing PCIe 3.0?

Yes, I'd imagine it will be like G-Sync, with a few CPUs and motherboards supporting it until a similar open standard is developed, and that there will be GP104- and GP110-based GPUs over PCIe as well.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


Why have a second processor at all when you can use the heavily under-utilised CPU to do it for you? It might use some CPU time, but nowhere near all of it.

Because NV has a lot of ARM experience and can build a custom ARMv8 core that uses almost no power or die space and can be locked down to doing one thing, and doing it well, instead of relying on the CPU being powerful enough to also handle that, especially with low-overhead APIs like DX12 putting more work back on the CPU.
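A very loose analogy (not how the hardware actually works, just the general pattern of pushing a fixed scheduling job off the main loop onto a dedicated worker):

```python
# Loose analogy only: a dedicated worker owns multi-GPU work distribution,
# so the main loop never spends time on it (a stand-in for a small
# fixed-function scheduling core).
import queue
import threading

work_queue = queue.Queue()

def dedicated_scheduler():
    """Distributes frames to GPUs round-robin; the only job this worker does."""
    gpu = 0
    while True:
        frame = work_queue.get()
        if frame is None:          # shutdown signal
            break
        print(f"frame {frame} -> GPU{gpu}")
        gpu = (gpu + 1) % 2

worker = threading.Thread(target=dedicated_scheduler)
worker.start()

for frame in range(4):             # "main loop": hands work off and moves on
    work_queue.put(frame)
work_queue.put(None)
worker.join()
```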

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


Why have a second processor at all when you can use the heavily under-utilised CPU to do it for you? It might use some CPU time, but nowhere near all of it.

Because that would probably be more complicated and difficult than just building the GPUs to communicate better with each other, at much higher speed and with much higher bandwidth.

 

Yes, I'd imagine it will be like G-Sync, with a few CPUs and motherboards supporting it until a similar open standard is developed, and that there will be GP104- and GP110-based GPUs over PCIe as well.

Lol, how do y'all keep track of all these random numbers and names?

 

Because NV has a lot of ARM experience and can build a custom ARMv8 core that uses almost no power or die space and can be locked down to doing one thing, and doing it well, instead of relying on the CPU being powerful enough to also handle that, especially with low-overhead APIs like DX12 putting more work back on the CPU.

 

I forgot about that factor: not all CPUs would be capable of coordinating the GPUs. Very good point.



Because NV has a lot of ARM experience and can build a custom ARMv8 core that uses almost no power or die space and can be locked down to doing one thing, and doing it well, instead of relying on the CPU being powerful enough to also handle that, especially with low-overhead APIs like DX12 putting more work back on the CPU.

 

I get what you mean, but it sounds like stuff from the future. Something that works on present-day hardware would be much appreciated, don't you think? Then the shiny new 4790K and the pair of 980s that you bought could also be put to good use. I'm just trying to make 4-way SLI feasible while still using that 5960X.


I get what you mean, but it sounds like stuff from the future. Something that works on present-day hardware would be much appreciated, don't you think? Then the shiny new 4790K and the pair of 980s that you bought could also be put to good use. I'm just trying to make 4-way SLI feasible while still using that 5960X.

You can always try custom drivers.



You can always try custom drivers.

 

Know anyone skilled and dedicated enough?


I get what you mean, but it sounds like stuff from the future. Something that works on present-day hardware would be much appreciated, don't you think? Then the shiny new 4790K and the pair of 980s that you bought could also be put to good use. I'm just trying to make 4-way SLI feasible while still using that 5960X.

It is stuff from the future; NV is only just testing the first Pascal chips now. Backward compatibility is great and all, but it hinders progress a lot... look at x86: it has so much legacy support holding it back. POWER8/9 don't have that, are much more powerful at the same TDP, and run at 5 GHz.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


It is stuff from the future; NV is only just testing the first Pascal chips now. Backward compatibility is great and all, but it hinders progress a lot... look at x86: it has so much legacy support holding it back. POWER8/9 don't have that, are much more powerful at the same TDP, and run at 5 GHz.

 

Then there would be nothing left to justify the thousands of dollars you just spent on four cards, thinking the performance might get a little better in the future.


Know anyone skilled and dedicated enough?

There are a few groups that make them, although I've never tried them myself. One or two people in the guild I'm a part of say they always use custom drivers and always get better performance because of it. I googled and got this:

http://www.tweakforce.com/

 

Not sure if they have GTX 980 drivers yet.



I just hope Nvidia thinks about this problem and addresses it with a backward-compatible driver update rather than a hardware revision.


Then there would be nothing left to justify the thousands of dollars you just spent on four cards, thinking the performance might get a little better in the future.

You misunderstood me. I'm saying CPU performance is held back by legacy support. With Pascal, Nvidia will start relying less and less on legacy tech like PCIe and SLI bridges to improve performance. ;) So each gen will get a bigger boost or a lower TDP :)

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]


I just hope Nvidia thinks about this problem and addresses it with a backward-compatible driver update rather than a hardware revision.

Your thought process is entirely wrong.

  1. It's impossible in hardware for current cards to do it, since they don't fully support DX12 feature level 12_0.
  2. Even if a driver could fix this, NV wouldn't do it, in order to drive sales up.

"Unofficially Official" Leading Scientific Research and Development Officer of the Official Star Citizen LTT Conglomerate | Reaper Squad, Idris Captain | 1x Aurora LN


Game developer, AI researcher, Developing the UOLTT mobile apps


G SIX [My Mac Pro G5 CaseMod Thread]

