
GTX 1060 leak (TAKE WITH A TRUCKLOAD OF SALT)

1 hour ago, Briggsy said:

These are average consumption numbers taken from the reviewers that actually do extensive power testing across a lot of games.

 

guru3D - 164W 480 vs. 154W 970

Techpowerup - 163W 480 vs. 156W 970

Kitguru (system load) - 225W 480 vs. 235W (209W adjusted) Palit 970 (which pulls 26 watts more than reference 970 in gaming)

 

I looked through a dozen other reviews, but most either do not include the 970, only show peak power consumption (a somewhat useless metric here, because the reference 970/980 have 2x 6-pin connectors while AIB cards have 8+6 pin), or did their power testing with a single synthetic benchmark or game instead of an average across a list of games - so they are not useful for anything but cherry-picking.
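For what it's worth, the "average across a list of games" methodology is easy to reproduce yourself if you have per-game power logs. A minimal Python sketch (the game names and wattages are made-up placeholders, not measurements from any review):

```python
# Toy reproduction of the "average gaming power" methodology described above.
# All values are illustrative placeholders, not real measurements.
per_game_watts = {
    "Game A": 168.0,
    "Game B": 159.0,
    "Game C": 171.0,
    "Game D": 162.0,
}

average_power = sum(per_game_watts.values()) / len(per_game_watts)
peak_power = max(per_game_watts.values())

print(f"average gaming power: {average_power:.1f} W")  # what the averaging reviews report
print(f"peak power: {peak_power:.1f} W")                # what the less useful reviews report
```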

 

People compare the 480 to the 970 because that's where the performance matches, but comparing to a 980 is actually even worse for the 480: the reference 970 and 980 draw the same average power while gaming, which makes the 480's efficiency look even worse.

 

You cherry-picked some testing that showed higher power consumption. Curiously, those sites are among the ones that measured the 480 over-drawing power from the PCIe slot.

You must also be very bad at searching the internet. Sites like anandtech, pcper, techspot, sweclockers, purepc, techreport, computerbase or pclab show the RX 480 having very similar power consumption to the 970.

 

And if you want to compare efficiency by number of transistors, you should count the ones disabled in the 970. Since we don't know how many transistors are in an SM, only full chips should be compared. Moreover, comparing efficiency at matched performance should be done with a broader set of workloads, not just the DX11 games Maxwell is designed for. Compute, DX12 games, mining and so on should also be counted, and AMD tends to be better there.
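To put the per-transistor argument in concrete terms, here is a rough sketch comparing full chips only. The transistor counts are the commonly cited die figures (roughly 5.7 billion for Polaris 10 and 5.2 billion for a full GM204); the performance indices are placeholders you would replace with whatever benchmark average you trust:

```python
# Rough perf-per-transistor comparison between full chips only, as argued above
# (RX 480 = full Polaris 10, GTX 980 = full GM204). Transistor counts are the
# commonly cited figures; the performance indices are placeholders, not results.
chips = {
    "RX 480 (Polaris 10)": {"transistors_billion": 5.7, "perf_index": 100.0},
    "GTX 980 (GM204)":     {"transistors_billion": 5.2, "perf_index": 100.0},
}

for name, chip in chips.items():
    perf_per_billion = chip["perf_index"] / chip["transistors_billion"]
    print(f"{name}: {perf_per_billion:.1f} perf points per billion transistors")
```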

i7 5775c @4.1GHz // 2x4GB 2400MHz CL10 // R9 285 @1120/1575MHz // SSD MX100 512GB // Z97M Gaming // RM550 // Prolimatech Megahalems+ NF-P14s Redux // Cooletk U3


5 minutes ago, Ansau said:

 

You cherry-picked some testing that showed higher power consumption. Curiously, those sites are among the ones that measured the 480 over-drawing power from the PCIe slot.

 

You must also be very bad at searching the internet. Sites like anandtech, pcper, techspot, sweclockers, purepc, techreport, computerbase or pclab show the RX 480 having very similar power consumption to the 970.

 

And if you want to compare efficiency by number of transistors, you should count the ones disabled in the 970. Since we don't know how many transistors are in an SM, only full chips should be compared. Moreover, performance/efficiency comparisons should be done with a broader set of workloads, not just the DX11 games where Nvidia excels. Compute, DX12 games, mining, encoding and decoding and so on should also be counted, and AMD tends to be better there.

Yes, but at the end of the day people buy graphics cards to play games; that is their purpose. You look at performance/$ and equally at performance/watt. Transistor count isn't a good metric to measure against, and I don't even know why people are bothering with it. Sure, it's interesting to look at, but it has little to do with a purchasing decision.


Well, in any case, even including performance/price, the RX 480 is still better than the 970. But performance/price doesn't change the fact that people in this thread took the numbers they wanted in order to bias the Polaris vs. Maxwell efficiency comparison toward Nvidia.

i7 5775c @4.1GHz // 2x4GB 2400MHz CL10 // R9 285 @1120/1575MHz // SSD MX100 512GB // Z97M Gaming // RM550 // Prolimatech Megahalems+ NF-P14s Redux // Cooletk U3


1 hour ago, Briggsy said:

Yeah, there is definitely hardware involvement in Pascal's scheduling. I think some people are so hung up on the idea of Pascal being a shrunk Maxwell that they fail to see the improvements in the architecture.

Strictly speaking, what Nvidia is doing does not meet the definition of asynchronous, which is why they use the term preemption.

 

Quote

Right now, the best available evidence suggests that when AMD and Nvidia talk about asynchronous compute, they are talking about two very different capabilities. “Asynchronous compute,” in fact, isn’t necessarily the best name for what’s happening here. The question is whether or not Nvidia GPUs can run graphics and compute workloads concurrently. AMD can, courtesy of its ACE units.

Quote

GCN-style asynchronous computing is unlikely to boost Maxwell performance, in other words, because Maxwell isn’t really designed for these kinds of workloads. Whether Nvidia can work around that limitation (or implement something even faster) remains to be seen.

http://www.extremetech.com/extreme/213519-asynchronous-shading-amd-nvidia-and-dx12-what-we-know-so-far

 

I would like to see an update to this article testing the 1080/1070, so we can see the new graphs and how close they get to the ideal line. If it is much closer to the ideal, then practically speaking we can say it is good enough for now, because that is what really matters: is it adequately doing the required job?

 

If async compute becomes more important and Nvidia starts having issues, they will fix it, either with hardware or again with more driver optimizations (or both).


16 minutes ago, Ansau said:

Well, in any case, even including performance/price, the RX 480 is still better than the 970. But performance/price doesn't change the fact that people in this thread took the numbers they wanted in order to bias the Polaris vs. Maxwell efficiency comparison toward Nvidia.

Well, this is where statistics jokes come in handy.

 

"You can continuously model the data until you can get the result you are looking for".

"A statistician can prove anything is true given enough time".

https://en.wikipedia.org/wiki/Misuse_of_statistics


If Nvidia is going to make the GTX 1060 as expensive as the 1070 and 1080, I'm not gonna buy one.

The cheapest GTX 1070 here sits at around 479€, which is about 533 USD.
The cheapest 8GB RX 480 sits at 269€, which is around 299 USD.

For me, the GTX 1060 MUST be under 350 USD before I can consider buying it.

Even if the GTX 1060 turns out to be more efficient, I will buy an RX 480.

But let's wait until it comes out.

Main PC: R7 3700X / Gigabyte X570 I Aorus Pro Wifi / Radeon RX 5700 XT / 32GB DDR4-3200 / 250GB & 2TB Crucial MX500 (in HP Prodesk 400 Case)

Laptop: R5 2500U / Radeon Vega 8 / 8GB DDR4-2400 / 500GB SK Hynix BC501 (HP Envy x360 13)

My little Server: i7-7700 / Asrock H110M-ITX / 24GB DDR4-2400 / Samsung 860 Pro 250GB & Seagate Firecuda 2TB / VMware ESXi 6.7

(Don't tell me i should Name them, i don't want to ^^)

 


2 hours ago, leadeater said:

The general point was that the RX 480 could not have been priced higher than the 970; AMD knew that. Would you buy the RX 480 if it cost more than another product but performed worse? If the RX 480 were significantly faster than the 970, it would have been priced higher, because people would have paid that higher amount.

But that's not how the market works... or at least it isn't supposed to. In previous generations, AMD and Nvidia did in fact price products (that were one generation newer) that performed significantly better at the same price point. Want to know why? Because you need to offer BETTER value for money in order to make customers choose your product.

If you give exactly the same price:performance as your competitors (or in this case, even your own, older products) then there isn't any reason for consumers to pick you over other brands.


Urgh. This is why you need someone like Ryan Smith from Anandtech or whoever from Ars Technica (if they still have people) to explain something here about "asynchronous compute" and whatever.

 

AMD uses a hardware scheduler, like most CPUs do. NVIDIA does not use a hardware scheduler; all of its scheduling is done in the drivers. Both can achieve the same thing, but NVIDIA may require a higher-performing CPU because it shifted the burden of scheduling to software rather than hardware. On the flip side, this means NVIDIA dropped a lot of transistors from the GPU to spend more on execution units, allowing better efficiency. NVIDIA may shift the burden of scheduling back to the video card in the form of a Project Denver-like CPU, but that's probably just a rumor.
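As a purely conceptual illustration of where that arbitration lives (a toy sketch, not real driver code or any actual API):

```python
# Conceptual toy only: with software scheduling the CPU-side driver decides the
# interleaving before submission; with hardware scheduling (e.g. AMD's ACEs) the
# GPU receives separate queues and arbitrates itself. Not real driver code.
from collections import deque

def driver_side_schedule(graphics_q, compute_q):
    """Software scheduling: the CPU interleaves work into one submission stream."""
    stream = []
    while graphics_q or compute_q:
        if graphics_q:
            stream.append(graphics_q.popleft())
        if compute_q:
            stream.append(compute_q.popleft())
    return stream  # this interleaving work costs CPU cycles in the driver

gfx = deque(["draw_0", "draw_1", "draw_2"])
cmp_work = deque(["physics_0", "postfx_0"])
print(driver_side_schedule(gfx, cmp_work))
# A hardware scheduler would instead be handed both queues untouched and pick
# work itself, so the CPU does not pay for the interleaving.
```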

 

Please note that neither solution is necessarily better than the other.


3 minutes ago, LAwLz said:

But that's not how the market works... or at least it isn't supposed to. In previous generations, AMD and Nvidia did in fact price products (that were one generation newer) that performed significantly better at the same price point. Want to know why? Because you need to offer BETTER value for money in order to make customers choose your product.

If you give exactly the same price:performance as your competitors (or in this case, even your own, older products) then there isn't any reason for consumers to pick you over other brands.

Oh, I agree that's not how it should work, but it does happen. As for previous generations, yes, that is correct, but the improvements came top down and existing products got price adjusted, so price:performance improved even on the old products while they were still available.

 

This type of price matching happens not just in computer parts; look at cars, phones, TVs, heck, even accommodation.


On 7/1/2016 at 6:33 AM, gilang01 said:

The RX480 is already faster in Forza than the 980 :/  

 

What. Since when is Forza on PC.


2 hours ago, Briggsy said:

"The graphics units will keep track of their intermediate progress on the current rendering workload so that they can stop, save their state and move off the hardware to allow for the preempted workload to be addressed quickly. NVIDIA tells us the entire process of context switching can occur in less than 100 microseconds after the last pixel shading work is finished."

I am wondering: is 100 microseconds fast? That is 150,000 GPU cycles at 1.5GHz and 400,000 CPU cycles at 4GHz. Is this really the most efficient processing path? Is the CPU unable to do this work in the nearly half a million cycles that pass in the time it takes to finish a context switch?
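The cycle arithmetic is easy to sanity-check: cycles are just latency multiplied by clock frequency. A quick sketch:

```python
# Sanity check of the cycle math: cycles = latency (s) * clock (Hz).
context_switch_s = 100e-6   # 100 microseconds, Nvidia's stated figure
gpu_clock_hz = 1.5e9        # ~1.5 GHz GPU clock, as assumed above
cpu_clock_hz = 4.0e9        # ~4 GHz CPU clock, as assumed above

print(f"GPU cycles during the switch: {context_switch_s * gpu_clock_hz:,.0f}")  # 150,000
print(f"CPU cycles during the switch: {context_switch_s * cpu_clock_hz:,.0f}")  # 400,000
```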

 


2 hours ago, KeltonDSMer said:

I am wondering: is 100 microseconds fast? That is 150,000 GPU cycles at 1.5GHz and 400,000 CPU cycles at 4GHz. Is this really the most efficient processing path? Is the CPU unable to do this work in the nearly half a million cycles that pass in the time it takes to finish a context switch?

 

I guess it depends on human perception. The latest research suggests that humans can interpret visual cues seen for as little as 13 milliseconds. Oculus says motion-to-photon latency in VR should be at most 20 milliseconds. A context switch occurring within an SM (triggered internally when a task is flagged as <whatever VRWorks flags it as>) at 100 microseconds is 1/10 of a millisecond; you would also add latency from the headset drivers and whatnot on top, but the context switching would not be the culprit.
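Putting the latency figures quoted above side by side makes the point — a small sketch:

```python
# Latency figures quoted above, expressed in milliseconds for comparison.
VR_BUDGET_MS = 20.0  # Oculus motion-to-photon target
figures_ms = {
    "visual perception threshold (cited research)": 13.0,
    "Oculus motion-to-photon budget": VR_BUDGET_MS,
    "Pascal pixel-level context switch": 0.1,  # 100 microseconds
}

for label, ms in figures_ms.items():
    print(f"{label}: {ms:g} ms ({ms / VR_BUDGET_MS:.1%} of the 20 ms VR budget)")
```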

 

One thing I can say with absolute certainty: I have used my DK2 with a Fury X and an R9 390, and there were always some lag issues when moving my head around, in the form of image ghosting and framerate drops. With a 1070, those problems are gone, so the preemption in Pascal is working as it should.

 

Another fun fact on the topic of async compute: the 480 only has 4 ACEs instead of the 8 ACEs on the 390/X and Fury/X, which drops the number of compute tasks in queue down to 32 for the 480.

 

2 hours ago, leadeater said:

Strictly speaking, what Nvidia is doing does not meet the definition of asynchronous, which is why they use the term preemption.

 

http://www.extremetech.com/extreme/213519-asynchronous-shading-amd-nvidia-and-dx12-what-we-know-so-far

 

I would like to see an update to this article testing the 1080/1070, so we can see the new graphs and how close they get to the ideal line. If it is much closer to the ideal, then practically speaking we can say it is good enough for now, because that is what really matters: is it adequately doing the required job?

 

If async compute becomes more important and Nvidia starts having issues, they will fix it, either with hardware or again with more driver optimizations (or both).

 

I think Nvidia likes to put async compute into two different categories. I would argue that neither AMD's nor Nvidia's definition is canon, but obviously concurrency could be argued in literal terms, or in terms of what the user/hardware perceives. Even AMD still shunts compute and graphics tasks through the same pipeline, so even though the two are being processed concurrently, they are funneled through the pipeline in a serial, non-concurrent fashion.

 

From PCPer, regarding Nvidia's view on async compute: http://www.pcper.com/reviews/Graphics-Cards/GeForce-GTX-1080-8GB-Founders-Edition-Review-GP104-Brings-Pascal-Gamers/Async

Quote

Pascal improves the story dramatically for NVIDIA, though there will still be debate as to how its integration to support asynchronous compute compares to AMD’s GCN designs. NVIDIA sees asynchronous computing as creating two distinct scenarios: overlapping workloads and time critical workloads.

 

Overlapping workloads are used when a GPU does not fill its processing capability with a single workload alone, leaving gaps or bubbles in the compute pipeline that degrade efficiency and slow down the combined performance of the system. This could be PhysX processing for GeForce GPUs or it might be a post-processing step that a game engine uses to filter the image as a final step. In Maxwell, this load balancing had to work with a fixed partitioning model. Essentially, the software had to say upfront how much time of the GPU it wanted divided between the workloads in contention. If the workloads stay in balance, this can be an efficient model, but any shift in the workloads would mean either unwanted idle time or jobs not completing in the desired time frame. Pascal addresses this by enabling dynamic load balancing that monitors the GPU for when work is being added, allowing the secondary workload to use the bubbles in the system for compute.

 

Time critical workloads create a different problem – they need prioritization and need to be inserted ASAP. An example of this is the late time warp used by the Oculus Rift to morph the image at the last possible instant with the most recent motion input data. With Maxwell, there was no way to have granular preemption; the system had to set a fixed time to ask for the asynchronous time warp (ATW) to start, meaning that the system would often leave GPU compute performance on the table, under-utilizing the hardware.

 

Pascal is the first GPU architecture to implement a pixel level preemption capability for graphics. The graphics units will keep track of their intermediate progress on the current rendering workload so that they can stop, save their state and move off the hardware to allow for the preempted workload to be addressed quickly. NVIDIA tells us the entire process of context switching can occur in less than 100 microseconds after the last pixel shading work is finished. Similarly for compute tasks, Pascal integrates thread level preemption.

 

The combination of dynamic scheduling and pixel/thread level preemption in hardware improves NVIDIA’s performance on asynchronous compute workloads pretty dramatically. In the asynchronous time warp example above, Pascal will be able to give more time to the GPU for rendering tasks than Maxwell, waiting until the last moment to request the time warp via preemption. This capability is already built into and supported by Oculus.
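A toy way to picture the fixed-partition vs. dynamic-load-balancing difference the quote describes (the numbers below are invented purely for illustration):

```python
# Toy model of the scheduling difference described in the quote above.
# Invented numbers; only meant to show why a fixed split leaves idle "bubbles".
TOTAL_SLOTS = 100          # pretend the GPU has 100 work slots per time slice
graphics_demand = 70       # slots the graphics workload actually needs this slice
compute_demand = 40        # slots the compute workload actually needs this slice

# Maxwell-style fixed partition chosen up front (say a 50/50 split):
gfx_partition, cmp_partition = 50, 50
fixed_used = min(graphics_demand, gfx_partition) + min(compute_demand, cmp_partition)

# Pascal-style dynamic balancing: compute fills whatever graphics leaves free.
dynamic_used = min(TOTAL_SLOTS, graphics_demand + compute_demand)

print(f"fixed partition:   {fixed_used}/{TOTAL_SLOTS} slots busy, graphics work left waiting")
print(f"dynamic balancing: {dynamic_used}/{TOTAL_SLOTS} slots busy")
```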

 

R9 3900XT | Tomahawk B550 | Ventus OC RTX 3090 | Photon 1050W | 32GB DDR4 | TUF GT501 Case | Vizio 4K 50'' HDR

 


5 minutes ago, Briggsy said:

I guess it depends on human perception. The latest research suggests that humans can interpret visual cues seen for as little as 13 milliseconds. Oculus says motion-to-photon latency in VR should be at most 20 milliseconds. A context switch occurring within an SM (triggered internally when a task is flagged as <whatever VRWorks flags it as>) at 100 microseconds is 1/10 of a millisecond; you would also add latency from the headset drivers and whatnot on top, but the context switching would not be the culprit.

 

Of course 100 microseconds is short in the context of how long a frame is displayed on a screen, but I was curious how this time compares with how long it would take to complete the given task on the CPU. I just don't like the idea of NV spending the development resources needed to implement this if it isn't significantly faster than conventional processing. Who cares if they can tick the "async compute" checkbox if the gains aren't there? If an AMD chip can run faster with AC, great. If an NV chip can run the same scene at similar performance with or without AC, why would they make the effort?

 


I heard the 1060 is 40% faster in nvidia game werks simulator 2016 too. 

Rig Specs:

AMD Threadripper 5990WX@4.8Ghz

Asus Zenith III Extreme

Asrock OC Formula 7970XTX Quadfire

G.Skill Ripheartout X OC 7000Mhz C28 DDR5 4X16GB  

Super Flower Power Leadex 2000W Psu's X2

Harrynowl's 775/771 OC and mod guide: http://linustechtips.com/main/topic/232325-lga775-core2duo-core2quad-overclocking-guide/ http://linustechtips.com/main/topic/365998-mod-lga771-to-lga775-cpu-modification-tutorial/

ProKoN haswell/DC OC guide: http://linustechtips.com/main/topic/41234-intel-haswell-4670k-4770k-overclocking-guide/

 

"desperate for just a bit more money to watercool, the titan x would be thankful" Carter -2016


3 minutes ago, Jumper118 said:

I heard the 1060 is 40% faster in nvidia game werks simulator 2016 too. 

nope, 69%.

R9 3900XT | Tomahawk B550 | Ventus OC RTX 3090 | Photon 1050W | 32GB DDR4 | TUF GT501 Case | Vizio 4K 50'' HDR

 


5 minutes ago, Briggsy said:

nope, 69%.

Maybe 420% if you turn on vr hype train werks. 

Rig Specs:

AMD Threadripper 5990WX@4.8Ghz

Asus Zenith III Extreme

Asrock OC Formula 7970XTX Quadfire

G.Skill Ripheartout X OC 7000Mhz C28 DDR5 4X16GB  

Super Flower Power Leadex 2000W Psu's X2

Harrynowl's 775/771 OC and mod guide: http://linustechtips.com/main/topic/232325-lga775-core2duo-core2quad-overclocking-guide/ http://linustechtips.com/main/topic/365998-mod-lga771-to-lga775-cpu-modification-tutorial/

ProKoN haswell/DC OC guide: http://linustechtips.com/main/topic/41234-intel-haswell-4670k-4770k-overclocking-guide/

 

"desperate for just a bit more money to watercool, the titan x would be thankful" Carter -2016


4 hours ago, KeltonDSMer said:

 

Of course 100 microseconds is short in the context of how long a frame is displayed on a screen, but I was curious how this time compares with how long it would take to complete the given task on the CPU. I just don't like the idea of NV spending the development resources needed to implement this if it isn't significantly faster than conventional processing. Who cares if they can tick the "async compute" checkbox if the gains aren't there? If an AMD chip can run faster with AC, great. If an NV chip can run the same scene at similar performance with or without AC, why would they make the effort?

 

You hit the nail on the head. Why bother working on async compute when Nvidia doesn't really need it.
 

I guess the answer to that question is that Nvidia doesn't need "concurrent" async compute. Maxwell did have horrible context-switching speed, so with Pascal Nvidia removed that bottleneck instead of bloating their hardware for no reason. As a result, async compute performance is actually a thing for Pascal, but I think Nvidia's real objective was to improve VR performance, which they did dramatically over Maxwell. Another added bonus of this improvement is Simultaneous Multi-Projection, which is a pretty damn cool technology for multi-display scaling.

 

For AMD, concurrent async compute is a solution to a problem AMD created in their own hardware, while Nvidia can pick and choose which aspects of it benefit them and not even bother attempting concurrent async compute, because then power draw would skyrocket and clock speeds would drop. They would need to add more cores and have bigger dies, and they would be left with the same performance and latency in DX12 as they have now, but much worse performance in DX11. In other words, they would be sitting in AMD's spot.

R9 3900XT | Tomahawk B550 | Ventus OC RTX 3090 | Photon 1050W | 32GB DDR4 | TUF GT501 Case | Vizio 4K 50'' HDR

 


On 01/07/2016 at 9:14 PM, Kobathor said:

AMD sells cards and GPUs to those brands. REMEMBER? AMD distributes to the companies I listed (and more). They set the MSRP of $199 USD and $229 USD for the 4GB and 8GB cards respectively. Companies like Gigabyte, PowerColor etc. distribute cards to Europe, not AMD. AMD does not control the price. AMD doesn't sell the vendors cards for more money just because they're being shipped to the EU. The prices are high in the EU, as they always seem to be. It's not a cartel, it's not AMD's fault, and it may not even be the card vendors' fault.

 

It's not the partners' fault, that's for sure.

First, do you guys really think the brands have different pricing for the US compared to the EU?

The price you get is based on the number of cards you buy; buy a lot of volume and you get perhaps a 5-dollar discount on a 200+ USD card. That doesn't seem like much, but when you buy a lot, that's money.

 

Let's say Newegg buys 2,000 cards; it gets a $200 landed price (price of the partner sale plus shipping).

Overclockers.co.uk buys 500 cards; it gets a $205-207 price (price of the partner sale plus shipping).

 

Of course the EU is full of bullshit taxes, and even the more local e-tailers that can move a lot of cards don't compare to the US, where Newegg basically supplies 50% of the entire market.

In the EU you have a lot of smaller players that don't have the efficiency or the structure to survive on margins like Newegg's, which means they end up with higher prices, as the e-tailer needs more margin to be profitable.
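A rough back-of-the-envelope of how those landed costs turn into different shelf prices (the margin and VAT percentages below are assumptions for illustration, not any retailer's real numbers):

```python
# Back-of-the-envelope retail price build-up. The landed costs follow the
# example above; margin and VAT percentages are assumptions, not real figures.
def retail_price(landed_cost_usd, margin, vat=0.0):
    return landed_cost_usd * (1 + margin) * (1 + vat)

us_big_etailer = retail_price(200.0, margin=0.05)              # high volume, thin margin
eu_small_etailer = retail_price(206.0, margin=0.10, vat=0.21)  # lower volume, ~21% VAT assumed

print(f"US big e-tailer:   ${us_big_etailer:.2f}")
print(f"EU small e-tailer: ${eu_small_etailer:.2f}")
```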

 

Also, you guys probably don't know this, but Newegg enjoys terms and conditions that not many other e-tailers get. They don't assume any cost whatsoever on RMAs, and they get a lot of support from AMD and Nvidia directly, which makes it easy for them to run the products at low margins.


5 minutes ago, Briggsy said:

You hit the nail on the head. Why bother working on async compute when Nvidia doesn't really need it.
 

I guess the answer to that question is that Nvidia doesn't need "concurrent" async compute. Maxwell did have horrible context-switching speed, so with Pascal Nvidia removed that bottleneck instead of bloating their hardware for no reason. As a result, async compute performance is actually a thing for Pascal, but I think Nvidia's real objective was to improve VR performance, which they did dramatically over Maxwell. Another added bonus is Simultaneous Multi-Projection, which is a pretty damn cool technology for multi-display scaling.

 

For AMD, concurrent async compute is a solution to a problem AMD created in their own hardware, while Nvidia can pick and choose which aspects of it benefit them and not even bother attempting concurrent async compute, because then power draw would skyrocket and clock speeds would drop. They would need to add more cores and have bigger dies, and they would be left with the same performance and latency in DX12 as they have now, but much worse performance in DX11. In other words, they would be sitting in AMD's spot.

 

I am unclear on how the work of implementing AsC is split between the game devs and the graphics chip makers. At this point it looks to me like it is relatively less work on AMD's side from game to game to implement AsC, compared to the effort NV has to put in. Will NV have to optimize this scheduling at the driver level on a per-game basis every time a console port that makes use of AsC comes along? Will this start introducing more CPU overhead in NV's DX12 drivers?

 

I guess we will see how the balance works out between console game devs' use of AsC and the gains NV is able to get from their new and improved context switching. If there are even 5% gains to be had for NV, I hope they continue the pursuit and spend the resources. NV doesn't really "need" to do anything, but if the gains are there, I'd like them to bring that to us, even if that means optimizing on a per-game basis.


9 hours ago, TheMidnightNarwhal said:

Oh yeah. I tried to download it from the Windows Store (the only possible way, right?) and it told me I didn't have Windows 10, but I was running Windows 10.

Most likely you haven't upgraded to build 1511... some PCs have issues doing that automatically... you can run a forced update to fix it. Just google updating to 1511...

http://www.makeuseof.com/tag/upgrade-windows-10-version-1511-now/

 

My profile... HERE

Join the Disqussions... https://disqus.com/home/channel/techinquisition/


18 hours ago, Tech Inquisition said:

Most likely you haven't upgraded to build 1511... some PCs have issues doing that automatically... you can run a forced update to fix it. Just google updating to 1511...

http://www.makeuseof.com/tag/upgrade-windows-10-version-1511-now/

 

 

Thanks, that's exactly my problem. I check for updates and it isn't there; I'll have to check this out.

 

 


[Leaked GTX 1060 photos via VideoCardz: NVIDIA-GeForce-GTX-1060-VideoCardz-7.jpg, NVIDIA-GeForce-GTX-1060-VideoCardz-1.jpg, NVIDIA-GeForce-GTX-1060-12.jpg]

Intel Xeon E5 1650 v3 @ 3.5GHz 6C:12T / CM212 Evo / Asus X99 Deluxe / 16GB (4x4GB) DDR4 3000 Trident-Z / Samsung 850 Pro 256GB / Intel 335 240GB / WD Red 2 & 3TB / Antec 850w / RTX 2070 / Win10 Pro x64

HP Envy X360 15: Intel Core i5 8250U @ 1.6GHz 4C:8T / 8GB DDR4 / Intel UHD620 + Nvidia GeForce MX150 4GB / Intel 120GB SSD / Win10 Pro x64

 

HP Envy x360 BP series Intel 8th gen

AMD ThreadRipper 2!

5820K & 6800K 3-way SLI mobo support list

 

