
AMD speaks on W10 scheduler and Ryzen

Just now, N3v3r3nding_N3wb said:

If you mean read the rest of your comment, I did, and how you feel about it is entirely up to you. Just know that your feelings on the subject are very illogical.

Illogical because I view an i7 as the norm for gaming?

CPU: Intel i7 7700K | GPU: ROG Strix GTX 1080Ti | PSU: Seasonic X-1250 (faulty) | Memory: Corsair Vengeance RGB 3200Mhz 16GB | OS Drive: Western Digital Black NVMe 250GB | Game Drive(s): Samsung 970 Evo 500GB, Hitachi 7K3000 3TB 3.5" | Motherboard: Gigabyte Z270x Gaming 7 | Case: Fractal Design Define S (No Window and modded front Panel) | Monitor(s): Dell S2716DG G-Sync 144Hz, Acer R240HY 60Hz (Dead) | Keyboard: G.SKILL RIPJAWS KM780R MX | Mouse: Steelseries Sensei 310 (Striked out parts are sold or dead, awaiting zen2 parts)


Just now, XenosTech said:

Illogical because I view an i7 as the norm for gaming?

Rather, it's that you call it average when in fact there isn't anything currently available that you would consider above average. It just goes against what average means.

Please avoid feeding the argumentative narcissistic academic monkey.

"the last 20 percent – going from demo to production-worthy algorithm – is both hard and is time-consuming. The last 20 percent is what separates the men from the boys" - Mobileye CEO


Just now, Tomsen said:

Rather, it's that you call it average when in fact there isn't anything currently available that you would consider above average. It just goes against what average means.

Based on context, a word can mean several things. There isn't anything new about the i7s that we haven't known for the last idek how many years... We know they have excellent single-core performance, they have HT, and now with Kaby Lake the only new thing is how easily you can push them to 5 GHz without having to shell out a ton of money for a custom water loop, since we can hit that on air with a decent cooler. So to me they're just an average chip now, and I don't mean average in terms of performance, which is how you are interpreting what I said.



6 hours ago, cj09beira said:

There's one thing the scheduler could do to improve perf though, which is to keep threads with a lot of crosstalk in the same CCX. But this is adding features to the scheduler, not exactly a bug.

so, let me get this straight .. you want an 8 core CPU to be treated as 2 NUMA nodes, yes ?!?!? yes ....

ok, what will happen with R5s and R3s when those CCX nodes will only have 3 and 2 cores /  node, eh .....

 

Edited by wkdpaul
cleaned up

3 minutes ago, zMeul said:

ok, what will happen with R5s and R3s when those CCX nodes will only have 3 and 2 cores /  node, eh .....

My understanding is that the 4-core Ryzen R5 and R3s will consist of a single CCX node...  


1 minute ago, WMGroomAK said:

My understanding is that the 4-core Ryzen R5 and R3s will consist of a single CCX node...  

no, they won't

from what I got, each node will have cores disabled

 

if they will have a single CCX, that will be like manna from heaven


28 minutes ago, zMeul said:

no, they won't

from what I got, each node will have cores disabled

 

if they will have a single CCX, that will be like manna from heaven

This article from Ars Tech (https://arstechnica.com/gadgets/2017/03/amds-moment-of-zen-finally-an-architecture-that-can-compete/2/) would seem to indicate that the 4-core chips will be a single CCX (which does make more sense than building a dual CCX chip and then going in and disabling half the chip and the associated L3 cache, either physically or via microcode).

 

Quote

In the second quarter, these will be joined by Ryzen 5. The R5 1600X will be a six-core, 12-thread chip running at 3.6-4.0GHz (two CCXes, with one core from each disabled), and the 1500X will be a four-core, eight-thread chip at 3.5-3.7GHz (just a single CCX).

If you've got any news that would indicate that the 4-core SKUs are all going to be dual CCXes, I would enjoy reading that as well...


31 minutes ago, XenosTech said:

Based on context, a word can mean several things. There isn't anything new about the i7s that we haven't known for the last idek how many years... We know they have excellent single-core performance, they have HT, and now with Kaby Lake the only new thing is how easily you can push them to 5 GHz without having to shell out a ton of money for a custom water loop, since we can hit that on air with a decent cooler. So to me they're just an average chip now, and I don't mean average in terms of performance, which is how you are interpreting what I said.

Yes, based on context, words find their meaning.  But, to have the desired meaning, the context must be clear.  Now that you've explained what you meant, it's obvious.  Before, going just on unclear context, it was not obvious.

Royal Rumble: https://pcpartpicker.com/user/N3v3r3nding_N3wb/saved/#view=NR9ycf

 

"How fortunate for governments that the people they administer don't think." -- Adolf Hitler
 

"I am always ready to learn although I do not always like being taught." -- Winston Churchill

 

"We must learn to live together as brothers or perish together as fools." -- Martin Luther King Jr.


26 minutes ago, WMGroomAK said:

This article from Ars Tech (https://arstechnica.com/gadgets/2017/03/amds-moment-of-zen-finally-an-architecture-that-can-compete/2/) would seem to indicate that the 4-core chips will be a single CCX (which does make more sense than building a dual CCX chip and then going in and disabling half the chip and the associated L3 cache, either physically or via microcode).

 

If you've got any news that would indicate that the 4-core SKUs are all going to be dual CCXes, I would enjoy reading that as well...

It will be a single CCX; that has been quite clearly indicated, and it is cheaper to manufacture, which is a high priority for AMD.


3 minutes ago, leadeater said:

It will be a single CCX; that has been quite clearly indicated, and it is cheaper to manufacture, which is a high priority for AMD.

That's what I thought as well...  It also makes sense in that it provides at least a bit of framework for the APUs, which should be a single CCX and a GPU SoC.  


The communication between the two CCXs definitely sounds like part of the easy IPC gains that they were talking about in the AMA. Whatever you might be gleaning from this, there's a lot that AMD could improve upon, just like Intel with Nehalem. And honestly, given where Zen is now, having obvious and fixable bottlenecks makes me very optimistic for the future of Zen-based chips.


2 hours ago, zMeul said:

so, let me get this straight .. you want an 8 core CPU to be treated as 2 NUMA nodes, yes ?!?!? yes ....

ok, what will happen with R5s and R3s when those CCX nodes will only have 3 and 2 cores /  node, eh .....

 

Well no, I don't, that's why I said new feature.


2 hours ago, WMGroomAK said:

This article from Ars Tech (https://arstechnica.com/gadgets/2017/03/amds-moment-of-zen-finally-an-architecture-that-can-compete/2/) would seem to indicate that the 4-core chips will be a single CCX (which does make more sense than building a dual CCX chip and then going in and disabling half the chip and the associated L3 cache, either physically or via microcode).

 

If you've got any news that would indicate that the 4-core SKUs are all going to be dual CCXes, I would enjoy reading that as well...

I believe that's bull, why? simple .. let's look at the Zen die shot:

ryzen-die.jpg

 

 

one of the R5s will be a 6 core / 12 thread CPU, yes? that's basically impossible to do with a single CCX

 

now the R3s with 4 cores / 8 threads - theoretically it's quite possible to do on a single CCX, but practically not possible because there is a shit ton more stuff on the CPU die than just cutting one CCX away

 

what's more plausible?

this:


T2GBbkN.png

or this:


8SXGVG1.png

 

I bet on the no2


8 minutes ago, zMeul said:

I believe that's bull, why? simple .. let's look at the Zen die shot:

ryzen-die.jpg

 

 

one of the R5s will be a 6 core / 12 thread CPU, yes? that's basically impossible to do with a single CCX

 

now the R3s with 4 cores / 8 threads - theoretically it's quite possible to do on a single CCX, but practically not possible because there is a shit ton more stuff on the CPU die than just cutting one CCX away

 

what's more plausible?

this:


T2GBbkN.png

or this:


8SXGVG1.png

 

I bet on the no2

Don't forget that AMD in the past has created entirely new dies for their lower-end CPUs, as it makes them actually cheaper to manufacture than the higher-end parts (if the lower-end part had less cache, for example, the die was smaller and the cost reflected that). AMD would get higher margins off a separate smaller die unless the 14nm yields aren't that good (which doesn't seem to be the case).

"We also blind small animals with cosmetics.
We do not sell cosmetics. We just blind animals."

 

"Please don't mistake us for Equifax. Those fuckers are evil"

 

This PSA brought to you by Equifacks.
PMSL


7 minutes ago, Dabombinable said:

Don't forget that AMD in the past has created entirely new dies for their lower-end CPUs, as it makes them actually cheaper to manufacture than the higher-end parts (if the lower-end part had less cache, for example, the die was smaller and the cost reflected that). AMD would get higher margins off a separate smaller die unless the 14nm yields aren't that good (which doesn't seem to be the case).

that's only possible for R3s with 4 cores, but not for R5s with 6 cores

their lithography success rate should be godlike, otherwise they'll throw away a lot of dies - testing the dies is not cheap either

 

here's one other dead giveaway that they would not have new dies - their R5 and R3 TDP is, for the most part, identical

if you cut a CCX away, the R3 should've been ~30W TDP parts, not 65W ;)


Just now, M.Yurizaki said:

Because the CCX's talk to each other through a much lower speed bus than the L3 caches within the same CCX talk to each other

I meant compared to Broadwell-E

PSU Tier List | CoC

Gaming Build | FreeNAS Server

Spoiler

i5-4690k || Seidon 240m || GTX780 ACX || MSI Z97s SLI Plus || 8GB 2400mhz || 250GB 840 Evo || 1TB WD Blue || H440 (Black/Blue) || Windows 10 Pro || Dell P2414H & BenQ XL2411Z || Ducky Shine Mini || Logitech G502 Proteus Core

Spoiler

FreeNAS 9.3 - Stable || Xeon E3 1230v2 || Supermicro X9SCM-F || 32GB Crucial ECC DDR3 || 3x4TB WD Red (JBOD) || SYBA SI-PEX40064 sata controller || Corsair CX500m || NZXT Source 210.


14 minutes ago, djdwosk97 said:

I meant compared to Broadwell-E

Broadwell-E uses unified L3 cache for all cores, whereas L3 in the CCX is split up into 1MB chunks totalling 8MB 4MB chunks totaling 16MB. However all of the L3 caches talk at the same speed as the L3 to L2 cache communication. So latency isn't that bad, but I guess it's still latency.

 

You can find an article that talks about it at https://www.techpowerup.com/231268/amds-ryzen-cache-analyzed-improvements-improveable-ccx-compromises

Edited by M.Yurizaki

Whether or not the quad core will be single or dual CCX remains to be seen. I don't think anyone can say one way or another right now. It entirely depends on what AMD's yields are and what the demand for the 8 and 6 core chips looks like.


8 hours ago, LAwLz said:

Whether or not the quad core will be single or dual CCX remains to be seen. I don't think anyone can say one way or another right now. It entirely depends on what AMD's yields are and what the demand for the 8 and 6 core chips looks like.

True, but the CCXs are contained entities designed to be scalable, for the purpose of Naples which uses the same CCXs, so it is very likely a 4 core SKU will be a single CCX. Yields are neither here nor there, since to get the 8 core SKUs you need two functioning CCXs; asking for one isn't any tougher, so I don't see how that really plays much part in 2 CCX vs 1 CCX for a 4 core SKU.

 

Edit:

I think your point was more around oversupply of 2 CCX dies? The demand aspect? Lowering production and holding those back would make much more sense than just turning them into 4 core SKUs, if the intent from the start was to make a single CCX die. If that is the case, the design of it was done long ago and engineering samples have already been made or are being made.

End edit;

 

The design cost of a different die is the biggest factor against it versus just disabling cores, but with disabled cores the cost per unit is higher. Which plays out better we can't know, as we don't have those costing details and never will.

 

As for the 6 core SKU, that must be 2 CCX, simple math :).
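
To spell out that math: assuming every CCX has 4 physical cores (as on the Ryzen 7 die discussed in this thread), the minimum number of CCXs for a given core count is just a ceiling division. A rough Python sketch; `min_ccx_count` is a name made up here for illustration:

```python
import math

CORES_PER_CCX = 4  # assumption: 4 physical cores per CCX, as on the Ryzen 7 die


def min_ccx_count(active_cores, cores_per_ccx=CORES_PER_CCX):
    """Smallest number of CCXs that can supply the requested active cores."""
    if active_cores < 1:
        raise ValueError("need at least one active core")
    return math.ceil(active_cores / cores_per_ccx)


for cores in (4, 6, 8):
    print(cores, "cores -> at least", min_ccx_count(cores), "CCX")
```

Note this is only a lower bound. It says a 6 core part cannot fit in one CCX, but it says nothing about whether a 4 core part is one full CCX or two CCXs with cores disabled.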

 

For the above point about TDP, well, someone needs to go look up what TDP actually means, because it is not the power draw of the CPU. The 4 core SKU and 6 core SKU having the same TDP in no way indicates the CCX makeup.


31 minutes ago, leadeater said:

-snip-

I was thinking more along the lines of: we don't know what the yields or the demand are. After looking at the die shot on the previous page I am not even sure how the CCXs are split up. If you look at the die shot posted a bit earlier, it seems like there is no clean path where they can just split one Ryzen 7 chip into two Ryzen 3 chips, since the two CCXs aren't identical.

But now that I think more about it, that doesn't make any sense. It would be a terrible design decision.

 

Designing a new die just for the quad core version wouldn't really make sense either if their goal was to save money by being able to reuse the CCX design in all SKUs.

 

But then we also have the problem of supply/demand. Let's not forget that AMD have sold quad cores as triple and even dual cores before, like you were alluding to a bit.

 

But what if they have a lot of CCXs where two cores are faulty or the manufacturing process isn't mature yet and they get a lot of CCXs which don't pass their binning process in terms of power/heat?

 

If the die really looks the way it does in the die shot above (where there seem to be two different types of CCXs), and if their yields are bad, and/or the supply/demand is off, then I think it would make sense to use CCXs with one defective core for the 6 core version, and CCXs with two faulty cores for the quad core version.

 

But who knows... I just think it isn't as set in stone as it might appear.

Just cutting one 8 core into two quad cores seems like the most obvious way of doing things, but it doesn't seem like that's possible (or even economical).


2 minutes ago, LAwLz said:

But who knows... I just think it isn't as set in stone as it might appear.

Just cutting one 8 core into two quad cores seems like the most obvious way of doing things, but it doesn't seem like that's possible (or even economical).

Yeah, that's basically where I'm stuck: which of those two options pays off better. It seems wasteful and costly to disable that many cores and use up wafer area to deliver 4 core products. Naples, while it does show how AMD can scale CCXs, is very different in die design regarding PCIe lanes and memory controller, and looks to be only offering 8 (2 CCX), 16 (4/6 CCX), 24 (6/8 CCX) and 32 (8 CCX) products, so it is a poor example to use for gauging whether a single CCX die design is going to be used.

 

We also know AMD is favoring a market push to high core products all round so how much they actually want to invest in 4 core products is unknown.


1 hour ago, leadeater said:

Seems wasteful and costly to disable that many cores and use up wafer area to deliver 4 core products.

Not if their yields are bad, and/or supply and demand doesn't end up being like they expected, or if they have two different types of CCXs and you have to pair them together to get a fully functional chip (which it seems like judging by the die shot posted earlier).

 

That die shot is really screwing with my head because I just can't see how they could use a single CCX by itself if that's what the die looks like. But if you need a pair then why would they go with this design to begin with?


I think the issue with poor gaming performance is down to the 140ns delay when the two CCXs communicate data between each other, vs two cores in the same CCX communicating data with a 20-40ns delay. PC Perspective did a video on this, and the graph is below:

 

ping-amd.png

 

Logical cores 1-8 (physical cores 1-4) are on one CCX, whereas logical cores 9-16 (physical cores 5-8) are on a different CCX.
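
To put a rough number on that, here is a small Python sketch that averages the core-to-core latency over every pair of physical cores. The 140ns cross-CCX figure is the one quoted above; 40ns for the same CCX is just the upper end of the 20-40ns range, picked as an assumption. Cores are numbered from 0 here (0-3 on one CCX, 4-7 on the other) rather than from 1 as in the graph, and the function names are made up for illustration:

```python
from itertools import combinations

CORES_PER_CCX = 4     # assumption: 4 physical cores per CCX
CROSS_CCX_NS = 140.0  # cross-CCX figure quoted above
SAME_CCX_NS = 40.0    # assumption: upper end of the 20-40ns range quoted above


def ccx_of(core):
    """CCX index of a 0-based physical core number."""
    return core // CORES_PER_CCX


def mean_pair_latency(cores):
    """Average latency over all unordered pairs of the given physical cores."""
    pairs = list(combinations(cores, 2))
    if not pairs:
        raise ValueError("need at least two cores")
    total = sum(
        CROSS_CCX_NS if ccx_of(a) != ccx_of(b) else SAME_CCX_NS
        for a, b in pairs
    )
    return total / len(pairs)


print("all 8 cores:", mean_pair_latency(range(8)))
print("one CCX only:", mean_pair_latency(range(4)))
```

With 8 cores, 16 of the 28 possible pairs straddle the two CCXs, so under these assumptions a randomly placed pair of threads pays the slow path more often than not. It's a toy model that ignores SMT siblings and real traffic patterns, so treat it as a back-of-the-envelope estimate only.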

 

In an independently threaded program such as Cinebench, where threads don't need to communicate data between each other (since they just receive instructions from the main thread), each core will perform at its maximum potential. In games, however, where cores need to communicate data between each other, the delays between the CCXs could be mounting up, leading to performance degradation.

 

If the CCX latency is the issue (which I think it is, seeing as how well Ryzen performs vs Intel in thread-independent production software) then game developers should consider putting similar workloads on cores in the same CCX to alleviate the performance issue.

 

Someone should verify this by limiting the core affinity of a game to all the logical cores within the same CCX.
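
For anyone who wants to try it, here is a hedged Python sketch of working out which logical CPUs to pin to. It assumes logical CPUs are numbered contiguously per CCX (0-7 on the first CCX, 8-15 on the second, i.e. the 1-8 / 9-16 split above counted from 0), which should be checked on the actual system first. The helper names are made up; `os.sched_setaffinity` is a real call but only exists on Linux, so the pinning helper simply reports failure elsewhere:

```python
import os

LOGICAL_CPUS_PER_CCX = 8  # assumption: 4 cores x 2 SMT threads, numbered contiguously per CCX


def ccx_logical_cpus(ccx_index, per_ccx=LOGICAL_CPUS_PER_CCX):
    """0-based logical CPU numbers assumed to belong to the given CCX."""
    start = ccx_index * per_ccx
    return list(range(start, start + per_ccx))


def affinity_mask(cpus):
    """Bitmask with one bit set per logical CPU (bit 0 = CPU 0)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pin_current_process(cpus):
    """Pin this process to the given CPUs; returns False where unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. Windows or macOS
    os.sched_setaffinity(0, set(cpus))
    return True


first_ccx = ccx_logical_cpus(0)
print(first_ccx, hex(affinity_mask(first_ccx)))
```

On Windows the same hex mask could be used from cmd with something like `start /affinity FF game.exe` (where `game.exe` is a placeholder), or you can tick the same logical CPUs under "Set affinity" in Task Manager, then compare frame rates against the unrestricted run.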

 



29 minutes ago, LAwLz said:

Not if their yields are bad, and/or supply and demand doesn't end up being like they expected, or if they have two different types of CCXs and you have to pair them together to get a fully functional chip (which it seems like judging by the die shot posted earlier).

 

That die shot is really screwing with my head because I just can't see how they could use a single CCX by itself if that's what the die looks like. But if you need a pair then why would they go with this design to begin with?

I probably wasn't that clear about what I meant by scalable CCX design. Any SKU that uses a different number of CCXs is a different die design; cutting the current die isn't possible.

 

Basically the CCXs are paired with a memory controller and I/O silicon logic. That means that while they can put any number of CCXs they wish (maybe in pairs?) on a die, they still need the CCX interconnects plus the memory controller/PCIe that go with it. Cut the die and you cut the memory controller, which means a broken, non-functional die.

 

That's where it starts to get really complex and further in to more unknowns:

  • Can a single CCX be properly connected to the memory controller or is a minimum of two required?
  • Can a single CCX die actually be made smaller in physical area? Can the memory controller etc be rearranged?
  • Is the potential sales volume of the 4 core SKUs worth investing in a dedicated die design?

My personal feeling on the matter is that the CCX was always designed primarily for Naples, to allow AMD to easily make a few different die designs covering a large range of core count SKU offerings cheaply and to maximize wafer area usage. Ryzen is the product of making do with what you have and not necessarily a specifically dedicated design. Surely a 100% Ryzen-focused design wouldn't have a CCX interconnect in it at all and would be a unified 8 core??

 

I'd love to have an open and honest discussion with the Zen architecture engineers to find out where the real focus was, Ryzen or Naples.
