
Zen Engineering Samples, Specs Spotted

patrickjp93

http://www.guru3d.com/news-story/ams-zen-engineering-sample-specs-leaked.html

 

To start, keep in mind these are A0 engineering samples. Cache sizes, clock speed, and TDP can all change between now and retail launch.

 

Sample CPUs with 4, 8, 24, and 32 cores have been spotted making the rounds. Of these, only the 4 and 8C variants were made for the AM4 socket.

 

The 4-core variant is limited to 8MB of L3 cache and holds a 65W TDP at clock speeds of 2.8GHz base to 3.2GHz boost.

 

The 8-core has 16MB of L3 cache and holds a 95W TDP with the same clock speeds.

 

Further, AMD has chosen to double the size of its L2 cache over Intel's offerings, at 512KB per core. Knowing that cache latency generally goes up as cache size goes up, this is intriguing. We'll have to come up with some cache-thrashing benchmarks to see whose solution is better.

 

The most interesting part of the article imho states that the 4 and 8-core chips idle at 550MHz and consume just 2.5 and 5W respectively. This is in stark contrast to the FX series where, at idle, an 8350 still consumes almost 30W of power.

 

The 24 and 32-core chips have 160 and 180W TDPs at 2.75 and 2.9GHz boost, with a 2x32MB L3 cache configuration. Idle speed for these chips is even lower at 400MHz, not that it matters since big iron server chips should NEVER be idling. Remember this is a 2-die solution, so there will be a price to pay in terms of cache coherency, but it's still quite impressive. It's not stated if the boost clock is for all cores or just one, though based on the TDP scaling of the smaller counterparts, I'm going to guess it's single-core only.
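
To put rough numbers on the TDP-scaling guess, here is a quick sketch using only the leaked figures quoted in this post. It assumes the TDP is split evenly across cores, which is not really true (uncore, cache, and I/O take a share), and `watts_per_core` is just a helper name made up for illustration.

```python
# Leaked engineering-sample figures from this post: core count -> TDP in watts.
SAMPLES = {4: 65, 8: 95, 24: 160, 32: 180}


def watts_per_core(cores: int, tdp_watts: float) -> float:
    """Naive per-core budget: TDP divided evenly by core count.

    This ignores uncore, cache, and I/O power, so treat it as an upper bound
    on what each core could draw with every core loaded.
    """
    return tdp_watts / cores


for cores, tdp in SAMPLES.items():
    budget = watts_per_core(cores, tdp)
    print(f"{cores:>2} cores, {tdp:>3} W TDP: {budget:.2f} W per core")
```

The per-core budget shrinks a lot as the core count goes up, which is the reasoning behind guessing that the quoted boost clock on the big chips is not an all-core figure.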

 

Opinion:

I suspect AMD will be fudging its TDP just a bit or have lower cache clock speeds to keep its TDPs under Intel's, but so far this is right around what I expected for clock speeds. Had AMD been able to pull off the speeds Bulldozer and Vishera enjoyed AND stay inside a healthy 95W while being performance-competitive core for core with Intel, I must admit I'd have to give AMD very high praise.

 

Remember, these are A0 samples, so there's room for clocks to tick upward just a bit, which is why I say clock speeds will likely come right in line with Haswell-E.

 

Edit: You guys are getting slow or lazy. In 24 hours I'm the one posting all the Intel and AMD news even though the articles were up hours before I found them.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


I'm confused. @patrickjp93 Why does the quad core have 2MB L3 cache when the Eight core has 8MB?

 

Shouldn't the quad core have 6MB L3 cache or 8MB like Intel's quad cores do?

Judge a product on its own merits AND the company that made it.

How to setup MSI Afterburner OSD | How to make your AMD Radeon GPU more efficient with Radeon Chill | (Probably) Why LMG Merch shipping to the EU is expensive

Oneplus 6 (Early 2023 to present) | HP Envy 15" x360 R7 5700U (Mid 2021 to present) | Steam Deck (Late 2022 to present)

 

Mid 2023 AlTech Desktop Refresh - AMD R7 5800X (Mid 2023), XFX Radeon RX 6700XT MBA (Mid 2021), MSI X370 Gaming Pro Carbon (Early 2018), 32GB DDR4-3200 (16GB x2) (Mid 2022

Noctua NH-D15 (Early 2021), Corsair MP510 1.92TB NVMe SSD (Mid 2020), beQuiet Pure Wings 2 140mm x2 & 120mm x1 (Mid 2023),


Not too concerned with TDP myself, but it seems AMD are this time around. Wonder if we'll see any 9590 madness again. Let's hope Zen truly lives up to the hype.

CPU: i7 5820k @4.4GHz | Mobo: MSI MPower X99A | RAM: 16GB DDR4 Quad Channel Corsair LP | GPU: EVGA 1080 FTW | Case: Define R5 Black Window | OS: Win 10 Pro

Storage: SanDisk Ultra II 960GB 2x WD Red 4TB | PSU: EVGA 750W G2 | Display:Acer XF270HU + Dell U2515H | Cooling: Phanteks PH-TC14PE

Keyboard: Ducky One  TKL Browns | Mouse: Steel Series Rival 300 | Sound: DT990s

 



 

AMD, please don't fuck this up... 

Main Rig:-

Ryzen 7 3800X | Asus ROG Strix X570-F Gaming | 16GB Team Group Dark Pro 3600Mhz | Corsair MP600 1TB PCIe Gen 4 | Sapphire 5700 XT Pulse | Corsair H115i Platinum | WD Black 1TB | WD Green 4TB | EVGA SuperNOVA G3 650W | Asus TUF GT501 | Samsung C27HG70 1440p 144hz HDR FreeSync 2 | Ubuntu 20.04.2 LTS |

 

Server:-

Intel NUC running Server 2019 + Synology DSM218+ with 2 x 4TB Toshiba NAS Ready HDDs (RAID0)


2 minutes ago, CUDA_Cores said:

32 cores! I hope they aren't 32 sh*t cores like AMD pulled with Bulldozer.

Doesn't really matter since it's a server chip, not AM4

 

1 minute ago, AluminiumTech said:

I'm confused. Why does the quad core have 2MB L3 cache when the Eight core has 8MB?

 

Shouldn't the quad core have 4MB L3 cache or 6MB like Intel's quad cores do?

If they want optimal work flow then yes (unless the chip is weaker than we think)

 

Quote

I suspect AMD will be fudging its TDP just a bit or have lower cache clock speeds to keep its TDPs under Intel's,

Sure seems that way at least based on this info

https://linustechtips.com/main/topic/631048-psu-tier-list-updated/ Tier Breakdown (My understanding)--1 Godly, 2 Great, 3 Good, 4 Average, 5 Meh, 6 Bad, 7 Awful

 


1 minute ago, AluminiumTech said:

I'm confused. @patrickjp93 Why does the quad core have 2MB L3 cache when the Eight core has 8MB?

 

Shouldn't the quad core have 4MB L3 cache or 6MB like Intel's quad cores do?

Partly defective sample is my guess.

 

Intel's quads have 8MB. On i5s, 2MB of that is disabled though. It's the same die as a mainstream i7.


Just now, patrickjp93 said:

Partly defective sample is my guess.

 

Intel's quads have 8MB. On i5s, 2MB of that is disabled though. It's the same die as a mainstream i7.

Yeah I realized my mistake. But then does that mean they might ship the quad core with 4MB L3 cache?

 

Also it's a shame AMD didn't put any high performance 100GB/s eDRAM L4 cache on there :(.


Could it be that they're doing 2 tiers of consumer chips, one with low low cache (like a HTPC variant) and one with big cache? Perhaps the chip mentioned in this article is merely the low cache model? 

 

Just spitballing. 


Just now, AluminiumTech said:

Yeah I realized my mistake. But then does that mean they might ship the quad core with 4MB L3 cache?

 

Also it's a shame AMD didn't put any high performance 100GB/s eDRAM L4 cache on there :(.

4MB is too little for a quad core imho. I think AMD is just shipping out what samples it can.

 

eDRAM is expensive to make and is more useful for an iGPU. Yes, it can help some CPU tasks, but most of what consumers do is either so big you'd need a Gig of cache before the miss ratio would drop (browsers) or is so small it fits in cache anyway (word processing).


Just now, Master Disaster said:

Could it be that they're doing 2 tiers of consumer chips, one with low low cache (like a HTPC variant) and one with big cache? Perhaps the chip mentioned in this article is merely the low cache model? 

 

Just spitballing. 

I hadn't considered the HTPC angle, but APUs are still a better fit for that, or AMD I suppose could provide a custom board of their own with a tiny iGPU embedded on the board. But I think that would blow up the price of the one-off chip.


Just now, patrickjp93 said:

4MB is too little for a quad core imho. I think AMD is just shipping out what samples it can.

 

eDRAM is expensive to make and is more useful for an iGPU. Yes, it can help some CPU tasks, but most of what consumers do is either so big you'd need a Gig of cache before the miss ratio would drop (browsers) or is so small it fits in cache anyway (word processing).

Hopefully we see 8MB on the quad core and 15/16MB for the eight core.

 

And I'd like a 3GHz base clock.

 

Yeah. I know eDRAM is expensive. But Intel is giving it out like candy in their mobile i5s and i7s.


Just now, AluminiumTech said:

Hopefully we see 8MB on the quad core and 15/16MB for the eight core.

 

Yeah. I know eDRAM is expensive. But Intel is giving it out like candy in their mobile i5s and i7s.

Those CPUs sell for $600 or more. The quad-core i7s start at $800. I wouldn't call that giving out eDRAM like candy.


Do you think they may throw HBM on a high end APU?

if you want to annoy me, then join my teamspeak server ts.benja.cc


Just now, The Benjamins said:

Do you think they may throw HBM on a high end APU?

I only know about the HPC APU using it for now, but that's also a mid to late 2017 product at the earliest.


2 minutes ago, Tedny said:

The i7-6700K has 256KB of L2 cache. Interesting, will Windows use Zen's 512KB L2 cache?!

 

7 minutes ago, Master Disaster said:

Could it be that they're doing 2 tiers of consumer chips, one with low low cache (like a HTPC variant) and one with big cache? Perhaps the chip mentioned in this article is merely the low cache model? 

 

Just spitballing. 

 

9 minutes ago, AluminiumTech said:

Yeah I realized my mistake. But then does that mean they might ship the quad core with 4MB L3 cache?

 

Also it's a shame AMD didn't put any high performance 100GB/s eDRAM L4 cache on there :(.

I was reading back over it to double check, and it seems I may have made a mistake in saying 2MB L3. 1) It's early in the morning. 2) The author of the article could use a revision or two himself to clarify.

 

It seems the 8MB cache is for the quad-core. When I see L2 cache, I never see it listed as a collective unit in MB. I always see it in kilobytes since cores don't share L2 and it doesn't make sense. The article says the 8-core will get double this, or 16MB of L3 cache.


Just now, patrickjp93 said:

 

 

I was reading back over it to double check, and it seems I may have made a mistake in saying 2MB L3. 1) It's early in the morning. 2) The author of the article could use a revision or two himself to clarify.

 

It seems the 8MB cache is for the quad-core. When I see L2 cache, I never see it listed as a collective unit in MB. I always see it in kilobytes since cores don't share L2 and it doesn't make sense. The article says the 8-core will get double this, or 16MB of L3 cache.

Oh cool. So they will have 8MB for the quad core and 16MB for the eight core.

 

 

And the 2MB and 8MB was for L2 cache? That seems like way too much.


1 minute ago, patrickjp93 said:

I only know about the HPC APU using it for now, but that's also a mid to late 2017 product at the earliest.

I think it would be awesome if they could get an RX 480/470 with 1-4GB HBM with a 4/8 core Zen CPU. Would make for an awesome HTPC/gaming APU.


Just now, AluminiumTech said:

Oh cool. So they will have 8MB for the quad core and 16MB for the eight core.

 

 

And the 2MB and 8MB was for L2 cache? That seems like way too much.

Well, it's 2MB across 4 cores, or 512KB for each core, which is double what Intel uses. And since cache timings have to loosen as cache size increases (universal truth btw), I'm wondering if AMD figured having more data closer at the 2nd level was more worthwhile.
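
The arithmetic, as a tiny sketch. The 2MB total and 4 cores come from the article as discussed here, and the 256KB Intel figure is the i7-6700K number mentioned earlier in the thread. `l2_per_core_kb` is a made-up helper name.

```python
def l2_per_core_kb(total_l2_mb: float, cores: int) -> float:
    """Per-core L2 in KB, assuming the total is split evenly across private L2s."""
    return total_l2_mb * 1024 / cores


zen_quad_l2_kb = l2_per_core_kb(2, 4)  # 2MB listed across 4 cores
intel_l2_kb = 256                      # i7-6700K per-core L2, as quoted in this thread
ratio = zen_quad_l2_kb / intel_l2_kb

print(f"Zen quad-core sample: {zen_quad_l2_kb:.0f} KB L2 per core")
print(f"That is {ratio:.1f}x the Intel figure of {intel_l2_kb} KB")
```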


2 minutes ago, patrickjp93 said:

Well, it's 2MB across 4 cores, or 512KB for each core, which is double what Intel uses. And since cache timings have to loosen as cache size increases (universal truth btw), I'm wondering if AMD figured having more data closer at the 2nd level was more worthwhile.

Well, AMD did use 512KB of L2 cache with K10 back then, however that was it: 512KB L2 cache per core with no L3 cache. Having 2MB L3 cache per core should help as well since it's not shared (with my 4790K for example, it has 2MB of L3 cache per core).

"We also blind small animals with cosmetics.
We do not sell cosmetics. We just blind animals."

 

"Please don't mistake us for Equifax. Those fuckers are evil"

 

This PSA brought to you by Equifacks.
PMSL


Just now, Dabombinable said:

Well, AMD did use 512KB of L2 cache with K10 back then, however that was it: 512KB L2 cache per core with no L3 cache. Having 2MB L3 cache per core should help as well since it's not shared (with my 4790K for example, it has 2MB of L3 cache per core).

L3 cache is shared. It only becomes a problem if you have a noisy neighbor.


Just now, patrickjp93 said:

L3 cache is shared. It only becomes a problem if you have a noisy neighbor.

Oh... just like it was on their CMT architectures? Well... I could see some issues with all 8 and 16 threads under load. It's one of those trade-offs, it seems, between multi-threaded performance and single-threaded performance. They should know by now with the way CMT is that if it starts getting shared, the performance of both cores and all 4 threads will be affected. I was hoping that it wouldn't be like the cache in Wolfdale (L2 split between cores).

Just now, Dabombinable said:

Oh... just like it was on their CMT architectures? Well... I could see some issues with all 8 and 16 threads under load. It's one of those trade-offs, it seems, between multi-threaded performance and single-threaded performance. They should know by now with the way CMT is that if it starts getting shared, the performance of both cores and all 4 threads will be affected. I was hoping that it wouldn't be like the cache in Wolfdale (L2 split between cores).

It's shared on Intel's architectures too you know... There are benefits and drawbacks to having solitary and unified caches. Everyone from ARM to IBM uses a mix of them.

 

Also, you're getting tangled between two issues: exclusivity and false sharing. It's not yet clear if AMD implemented an exclusive cache hierarchy, where data is not duplicated across all 3 levels of cache (if the data is in the L1 of one core, it's not in the L2 of that same core and not in the shared L3), or an inclusive one, where it is. Exclusive cache hierarchies require snooping algorithms to go back through the other cores' caches, and that's one enormous performance killer for Bulldozer and the entire Construction Core family. An inclusive cache means that, as long as the cache line doesn't have any modified data, sharing it is actually more efficient, because everyone can read it and pull it up from L3 to higher cache levels and not have to worry.
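
One concrete consequence of that choice is how much distinct data the hierarchy can hold. A rough sketch, using the quad-core sample's leaked L2 and L3 sizes and leaving L1 out because its size isn't given here. The function name and the "strictly inclusive vs. strictly exclusive" simplification are assumptions for illustration; real designs are often somewhere in between.

```python
def unique_capacity_kb(l2_per_core_kb: int, cores: int, l3_kb: int, inclusive: bool) -> int:
    """Upper bound on distinct data held across L2 + L3 (L1 ignored).

    Strictly inclusive: every L2 line is also present in L3, so L3 alone
    bounds the distinct data.
    Strictly exclusive: a line lives in only one level, so the sizes add up.
    """
    l2_total_kb = l2_per_core_kb * cores
    if inclusive:
        return l3_kb
    return l2_total_kb + l3_kb


# Quad-core sample: 512KB L2 per core, 8MB shared L3.
print("inclusive:", unique_capacity_kb(512, 4, 8 * 1024, inclusive=True), "KB")
print("exclusive:", unique_capacity_kb(512, 4, 8 * 1024, inclusive=False), "KB")
```

So an exclusive design buys extra effective capacity, at the cost of the extra lookups described above.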

 

False sharing is when a cache line is held by at least 2 different cores, and 2 cores modify different data on that cache line (a line is usually 64 bytes). In order to maintain cache coherency, once the first change is made, the other copies of that line are invalidated and have to be re-fetched before the second change can take place. This stalls the pipeline of the second core. That is also a performance killer.
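
A small layout check makes the idea concrete. This is only a sketch: it doesn't measure anything, it just shows whether two per-thread counters would land on the same 64-byte line and how padding moves them apart. The struct names, the `same_line` helper, and the assumption that the struct starts on a line boundary are all made up for illustration; Python's standard `ctypes` module is used only to get C-style field offsets.

```python
import ctypes

CACHE_LINE = 64  # bytes; the usual line size mentioned above, assumed here


class Packed(ctypes.Structure):
    # Two per-thread counters sitting right next to each other.
    _fields_ = [("a", ctypes.c_uint64), ("b", ctypes.c_uint64)]


class Padded(ctypes.Structure):
    # Same counters, with padding so that b starts on the next 64-byte line.
    _fields_ = [
        ("a", ctypes.c_uint64),
        ("_pad", ctypes.c_char * (CACHE_LINE - ctypes.sizeof(ctypes.c_uint64))),
        ("b", ctypes.c_uint64),
    ]


def same_line(struct, first: str, second: str) -> bool:
    """True if both fields start in the same cache line.

    Assumes the struct itself begins on a cache-line boundary.
    """
    off1 = getattr(struct, first).offset
    off2 = getattr(struct, second).offset
    return off1 // CACHE_LINE == off2 // CACHE_LINE


print("Packed a/b share a line:", same_line(Packed, "a", "b"))
print("Padded a/b share a line:", same_line(Padded, "a", "b"))
```

With the packed layout, two threads updating `a` and `b` separately would keep bouncing the same line between cores even though they never touch each other's data. The padded layout trades 56 bytes of space to avoid that.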

 

Having a shared L3 cache is not a bad thing at all. The programmer would have to diligently avoid the false sharing problem even without a shared L3 cache, so there's no real downside.


10 minutes ago, patrickjp93 said:

It's shared on Intel's architectures too you know... There are benefits and drawbacks to having solitary and unified caches. Everyone from ARM to IBM uses a mix of them.

 

Also, you're getting tangled between two issues: exclusivity and false sharing. It's not yet clear if AMD implemented an exclusive cache hierarchy, where data is not duplicated across all 3 levels of cache (if the data is in the L1 of one core, it's not in the L2 of that same core and not in the shared L3), or an inclusive one, where it is. Exclusive cache hierarchies require snooping algorithms to go back through the other cores' caches, and that's one enormous performance killer for Bulldozer and the entire Construction Core family. An inclusive cache means that, as long as the cache line doesn't have any modified data, sharing it is actually more efficient, because everyone can read it and pull it up from L3 to higher cache levels and not have to worry.

 

False sharing is when a cache line is held by at least 2 different cores, and 2 cores modify different data on that cache line (a line is usually 64 bytes). In order to maintain cache coherency, once the first change is made, the other copies of that line are invalidated and have to be re-fetched before the second change can take place. This stalls the pipeline of the second core. That is also a performance killer.

 

Having a shared L3 cache is not a bad thing at all. The programmer would have to diligently avoid the false sharing problem even without a shared L3 cache, so there's no real downside.

One of the problems is that some programmers, specifically in the AAA games industry, are far from diligent.

Just now, Dabombinable said:

One of the problems is that some programmers, specifically in the AAA games industry, are far from diligent.

Oh I know. I had a discussion on performance tuning at Epic, and two of the more senior developers said to me that tuning performance for Hyper-Threading was so hard they literally instituted a ban on it because it could cause performance loss in other areas. I showed them a few tricks with OpenMP to demonstrate setting core affinity and putting mixed workloads on each pair of logical cores, and saw a 20% boost using good old OpenGL and CPU-based cloth physics. To put it shortly, what I could do in 600 lines of code left them stunned.

 

At least the engine developers like Mike Acton have finally started paying attention to cache lines, though just from looking at Unreal, good Lord, is there a long way to go...


Just now, patrickjp93 said:

Oh I know. I had a discussion on performance tuning at Epic, and two of the more senior developers said to me that tuning performance for Hyper-Threading was so hard they literally instituted a ban on it because it could cause performance loss in other areas. I showed them a few tricks with OpenMP to demonstrate setting core affinity and putting mixed workloads on each pair of logical cores, and saw a 20% boost using good old OpenGL and CPU-based cloth physics. To put it shortly, what I could do in 600 lines of code left them stunned.

 

At least the engine developers like Mike Acton have finally started paying attention to cache lines, though just from looking at Unreal, good Lord, is there a long way to go...

So it's more than likely that the programmers in all reality need retraining? Because I take it that you came out of University/College a lot more recently.