
SLI scaling solved?

It has been quite a while since Nvidia first brought out SLI technology, but they still haven't sorted out the issues that lurk within the tech itself. I think I may have a solution, but it is still a trial-and-error process.

 

I think Nvidia should offload all of the existing SLI scheduling work onto the CPU. That way the GPUs wouldn't have to bother with organising or splitting up their loads. In most games the CPU is a sitting duck, and this would be a better way to utilise that expensive piece of silicon.

 

To do this, the driver could split the screen into equal, symmetrical sections (according to the number of cards) so that the GPUs together produce the whole frame. It would also mean the total available frame buffer could be pooled. The driver would then stitch the sections back together and display the result on the screen.
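
To make that a bit more concrete, here is a rough sketch in Python of what the splitting and stitching could look like. This is just pseudocode for the idea, not real driver code; the gpu objects and their render_region() call are hypothetical stand-ins.

```python
# Rough sketch of the idea above (not real driver code): split each frame into
# equal horizontal strips, hand one strip to each GPU, then stitch the results
# back together on the CPU. The gpu objects and render_region() are made up.

def split_frame(width, height, num_gpus):
    """Divide the screen into num_gpus roughly equal horizontal strips."""
    strip_height = height // num_gpus
    regions = []
    for i in range(num_gpus):
        top = i * strip_height
        # The last strip absorbs any leftover rows when height % num_gpus != 0.
        bottom = height if i == num_gpus - 1 else top + strip_height
        regions.append((0, top, width, bottom))  # (x0, y0, x1, y1)
    return regions

def render_frame(gpus, width, height):
    regions = split_frame(width, height, len(gpus))
    # Each GPU renders only its own strip of the scene...
    strips = [gpu.render_region(region) for gpu, region in zip(gpus, regions)]
    # ...and the driver, running on the CPU, stitches them into one frame.
    return b"".join(strips)

print(split_frame(1920, 1080, 2))  # [(0, 0, 1920, 540), (0, 540, 1920, 1080)]
```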

 

Here is an image to demonstrate what I mean.

 

[Attached image: post-47235-0-09830500-1423746311.jpg — diagram of the proposed screen split across the GPUs]

 

Share what you think about this. Nvidia should really think about this issue.


I might see some issues - 

PCI-E bottlenecks (?)

 

Although, since I don't know how SLI/CFX or CPUs work, I would imagine the 5960X would end up being the best CPU for gaming (sadly) because an i5 couldn't keep up.

 

Disclaimer: 

This could all be wrong; if so, correct me. It's just a thought, and I don't know how SLI/CFX or CPUs work.


I was under the impression it has to do with game devs :S

Yup, I think the devs are also to blame for the bad scaling on 3+ card configs. Some games just scale better than others.

|CPU: Intel i7-5960X @ 4.4ghz|MoBo: Asus Rampage V|RAM: 64GB Corsair Dominator Platinum|GPU:2-way SLI Gigabyte G1 Gaming GTX 980's|SSD:512GB Samsung 850 pro|HDD: 2TB WD Black|PSU: Corsair AX1200i|COOLING: NZXT Kraken x61|SOUNDCARD: Creative SBX ZxR|  ^_^  Planned Bedroom Build: Red Phantom [quadro is stuck in customs, still trying to find a cheaper way to buy a highend xeon]


It is open for discussion, so I was just wondering what everybody thinks of my idea. One thing though: people running SLI/CFX already have big-dog CPUs, so I practically don't see a bottleneck, if you look at it that way.


The CPU isn't a sitting duck in every game; Battlefield 3 and 4 and Crysis 2 and 3 are examples of that. If the driver ends up stitching it all back together, doesn't that mean the information goes back to the CPU and then gets transferred back to the master card? If that's the case, I don't like it. The way it works now, where the cards take turns rendering frames, is fine. Anyway, how would this solve SLI scaling? It's a software problem, not a hardware problem.

"It pays to keep an open mind, but not so open your brain falls out." - Carl Sagan.

"I can explain it to you, but I can't understand it for you" - Edward I. Koch


I don't see this being practical unless you have a quad core with Hyper-Threading, since a lot of games already use a good portion of a standard quad core or dual core with Hyper-Threading. So essentially it would hurt the G3258 and the i5s more than it helps. The only CPUs I see something like this helping are the ones with Hyper-Threading, since in most games the virtual cores are dead weight that games can't really take advantage of. So the i3s, i7s, Xeons, and the AMD CPUs with comparable threading would benefit, but the G3258, the i5s, and the non-Hyper-Threaded AMD CPUs would just be gimped rather than improved.


The devs make the games, but SLI scheduling is for the most part on Nvidia's driver side of things.


I might see some issues - 

PCI-E bottlenecks (?)

 

Although, since I don't know how SLI/CFX or CPUs work, I would imagine the 5960X would end up being the best CPU for gaming (sadly) because an i5 couldn't keep up.

 

Disclaimer: 

This could all be wrong; if so, correct me. It's just a thought, and I don't know how SLI/CFX or CPUs work.

Bandwidth doesn't play that big a role. Nvidia cards are PCIe snobs, but really, even x4 is totally fine; you only start really bottlenecking on PCIe 2.0 these days.
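
Just to put rough numbers on the bandwidth point, here is a back-of-the-envelope calc using theoretical per-lane figures, assuming two cards each copying half a 4K frame back over the bus at 60 fps (roughly what the stitch-on-the-CPU idea above would need):

```python
# Back-of-the-envelope only: how much copy-back traffic the "stitch on the
# CPU" idea would generate per card, versus theoretical PCIe throughput.

BYTES_PER_PIXEL = 4                              # 32-bit colour
frame_bytes = 3840 * 2160 * BYTES_PER_PIXEL      # one 4K frame, ~33 MB
traffic_per_gpu = (frame_bytes / 2) * 60         # half a frame per GPU, 60 fps

PCIE2_X4 = 4 * 500e6      # PCIe 2.0: ~500 MB/s per lane, per direction
PCIE3_X16 = 16 * 985e6    # PCIe 3.0: ~985 MB/s per lane, per direction

print(f"copy-back per GPU: {traffic_per_gpu / 1e9:.2f} GB/s")
print(f"PCIe 2.0 x4:       {PCIE2_X4 / 1e9:.2f} GB/s")
print(f"PCIe 3.0 x16:      {PCIE3_X16 / 1e9:.2f} GB/s")
```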

 

Also, there's something called the law of diminishing returns. Affects us all. There's no solution to it. There are latencies involved with multi-GPU configs that just aren't solvable.


Interesting, but there are potential issues with splitting the screen that way. Say that under heavy gaming load one of the cards begins to throttle (usually the top one, as is the case in SLI setups); that portion of the screen couldn't keep up with the rest and you would get tearing. Perhaps if G-Sync were improved to compensate for this, it would be a killer performance boost for Nvidia cards.


DX12 games might benefit in the future, but I think a smart SLI solution such as this would help. My option does introduce a bit of lag, but it puts a whole ton of graphical horsepower at your feet.


DX12 games might benefit in the future, but I think a smart SLI solution such as this would help. My option does introduce a bit of lag, but it puts a whole ton of graphical horsepower at your feet.

I'm fairly certain the CPU already handles the scheduling anyway.

"It pays to keep an open mind, but not so open your brain falls out." - Carl Sagan.

"I can explain it to you, but I can't understand it for you" - Edward I. Koch


This solution would ensure that all the cards carry the same amount of load, equalising usage and temperature, so throttling and uneven usage wouldn't be a problem.


Anybody have Nvidia's email address? Let's see what the pros have to say.


Anybody have Nvidia's email address? Let's see what the pros have to say.

Nvidia.com

Because he had a hard drive.


Nvidia.com

That's not an email address.

"It pays to keep an open mind, but not so open your brain falls out." - Carl Sagan.

"I can explain it to you, but I can't understand it for you" - Edward I. Koch


This solution would ensure that all the cards carry the same amount of load, equalising usage and temperature, so throttling and uneven usage wouldn't be a problem.

But if you think about it, while scaling would be better in this instance, you're essentially putting all the cards at a lower tier in order to equalise them, since the ones that run hotter perform worse than the ones that aren't being throttled. In reality a throttled 980 would perform about the same as a non-throttled 970, so a triple-980 config would be scaled down to triple-970 performance. With that in mind you're losing a card in the chain, since three 970s equal about 2.5 980s on paper. I know scaling makes it a tad different, but you're still at way less performance than three full 980s.
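
Putting the ratios from that post side by side (these are the post's own figures, not benchmark data):

```python
# Using the ratios quoted above (not benchmarks): a throttled 980 is assumed
# to land around 970 level, and "three 970s = 2.5 980s" on paper.

perf_980 = 1.0
perf_970 = 2.5 / 3             # one 970 ≈ 0.83 of a 980, per the quote above
perf_980_throttled = perf_970  # assumption above: throttled 980 ≈ 970

equalised_3x980 = 3 * perf_980_throttled  # all cards held to the hottest one's pace
full_3x980 = 3 * perf_980

print(f"equalised: {equalised_3x980:.2f} '980s of performance' vs {full_3x980:.2f} on paper")
```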


This solution would ensure that all the cards carry the same amount of load, equalising usage and temperature, so throttling and uneven usage wouldn't be a problem.

Even under equal usage the top card usually gets choked because of the limited clearance in front of its fan. So the scheduling done by the CPU would also need to dynamically take the cards' temperatures into account.
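
A hypothetical sketch of what that could look like on the CPU side; the Gpu class and the numbers are made up, but the idea is to shrink the strip given to whichever card runs hottest:

```python
# Hypothetical only: weight each card's share of the screen by how much
# thermal headroom it has left, so the hottest card gets the smallest strip.

from dataclasses import dataclass

@dataclass
class Gpu:
    name: str
    temperature_c: float  # current core temperature

def weighted_split(gpus, height, throttle_temp=80.0):
    """Return the number of rows each GPU should render this frame."""
    # Headroom to the throttle point acts as each card's weight.
    weights = [max(throttle_temp - g.temperature_c, 1.0) for g in gpus]
    total = sum(weights)
    rows = [int(height * w / total) for w in weights]
    rows[-1] += height - sum(rows)  # hand any leftover rows to the last card
    return rows

gpus = [Gpu("top card", 78.0), Gpu("bottom card", 65.0)]
print(weighted_split(gpus, 2160))  # the cooler card gets the bigger strip
```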

 

Nonetheless, brilliant idea, OP!


But if you think about it, while scaling would be better in this instance, you're essentially putting all the cards at a lower tier in order to equalise them, since the ones that run hotter perform worse than the ones that aren't being throttled. In reality a throttled 980 would perform about the same as a non-throttled 970, so a triple-980 config would be scaled down to triple-970 performance.

 

But since all the cards would work at the same pace with the load spread out, we could be getting full scaling even in 4-way configs.


This opens up the possibility of 144 Hz 4K gaming today, if only we had the monitors and cables for it. :)


Bandwidth doesn't play that big a role. Nvidia cards are PCIe snobs, but really, even x4 is totally fine; you only start really bottlenecking on PCIe 2.0 these days.

Also, there's something called the law of diminishing returns. Affects us all. There's no solution to it. There are latencies involved with multi-GPU configs that just aren't solvable.

There are multiple benchmarks showing 2.0 x16 vs 3.0 x16 in SLI configs at higher resolutions sometimes giving upwards of 10 fps gains. There is definitely an improvement IMO, especially with 2-3 cards at 3x1080p, 4K, or 1440p in some games.

Also, @OP, there are loads of SLI scaling tests I've seen that confuse me. The ones that say scaling is very poor above two cards always use conservatively clocked CPUs, but when I see people on overclock.net or other sites benchmark with 5 GHz socket 2011 chips, scaling is almost perfect even at 4-way in loads of AAA games.

http://www.overclock.net/t/1415441/7680x1440-benchmarks-plus-2-3-4-way-sli-gk110-scaling

Stuff:  i7 7700k @ (dat nibba succ) | ASRock Z170M OC Formula | G.Skill TridentZ 3600 c16 | EKWB 1080 @ 2100 mhz  |  Acer X34 Predator | R4 | EVGA 1000 P2 | 1080mm Radiator Custom Loop | HD800 + Audio-GD NFB-11 | 850 Evo 1TB | 840 Pro 256GB | 3TB WD Blue | 2TB Barracuda

Hwbot: http://hwbot.org/user/lays/ 

FireStrike 980 ti @ 1800 Mhz http://hwbot.org/submission/3183338 http://www.3dmark.com/3dm/11574089


But since all the cards would work at the same pace with the load spread out, we could be getting full scaling even in 4-way configs.

Bear in mind that all the cards would need the same information, so VRAM still cannot "pool together" as you say. When video cards render a frame, they use the same textures as the other cards.
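
A toy calculation of what that duplication means for capacity (the sizes are made up; the structure is the point):

```python
# Made-up sizes, just to illustrate the duplication: every card carries its
# own copy of the same texture set, so the combined VRAM is mostly not "extra".

cards = 2
per_card_vram_gb = 4.0
shared_textures_gb = 3.0   # the same assets loaded on every card

total_physical_gb = cards * per_card_vram_gb
duplicated_gb = (cards - 1) * shared_textures_gb
unique_gb = total_physical_gb - duplicated_gb

print(f"{total_physical_gb:.0f} GB of VRAM across the cards, "
      f"{duplicated_gb:.0f} GB of it is duplicate copies, "
      f"only {unique_gb:.0f} GB holds unique data")
```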

"It pays to keep an open mind, but not so open your brain falls out." - Carl Sagan.

"I can explain it to you, but I can't understand it for you" - Edward I. Koch


But since all the cards would work at the same pace with the load spread out, we could be getting full scaling even in 4-way configs.

That may be true. I would be interested to see if it could work, and I would be really happy if it did. I just have this sinking feeling that it would force everyone to buy a card one tier higher to get the full performance of their target tier: if you're looking for full performance out of a triple-970 config, you would have to buy triple 980s to reach that config's theoretical performance. I'm sorry if it sounds like I'm being pessimistic and shooting down your idea; I'm just making the point of what equalising the whole config's load to its worst-performing card could bring.


But each card would hold a different texture set for its different part of the frame. This is where the CPU kicks in to do the trick.

