[Rumour] AMD's Zen To Have Ten Pipelines Per Core

patrickjp93 · October 3, 2015

The TLB is used to speed up virtual memory addressing... it holds a buffer of translations so the full virtual address translation can be bypassed. Do multiprocessor systems need the same full address space per core? Then I can see a use for exceeding an exabyte in a multiprocessor scenario... but for a desktop-class processor? I have not read much about tree traversal. Is it spanning tree processing as in information coding? Isn't its parallelism achieved using a modified algorithm? I am confused.

TLB covers all cache and memory predictions. Every CPU architecture has a TLB. Not every one has virtualization instructions and memory address extensions.

The thing everyone seems to forget is there is no fundamental difference between an ultramobile Y-class core and an E7 Xeon core. All that changes is the interconnect fabric, the security processors, some direct hardware virtualization extensions, and cache sizes. Everything Intel implements to fight IBM in HPC gets implemented all the way down on our level too, to give time for finding extension bugs, to refine the manufacturing process, maybe tweak the designs, etc. Otherwise the cores are exactly the same, just targeting different power and speed requirements along with some extra enterprise bells and whistles baked into the chip.

jos · October 3, 2015

TLB covers all cache and memory predictions. Every CPU architecture has a TLB. Not every one has virtualization instructions and memory address extensions.

The thing everyone seems to forget is there is no fundamental difference between an ultramobile Y-class core and an E7 Xeon core. All that changes is the interconnect fabric, the security processors, some direct hardware virtualization extensions, and cache sizes. Everything Intel implements to fight IBM in HPC gets implemented all the way down on our level too, to give time for finding extension bugs, to refine the manufacturing process, maybe tweak the designs, etc. Otherwise the cores are exactly the same, just targeting different power and speed requirements along with some extra enterprise bells and whistles baked into the chip.

I remember the TLB being used for virtual-to-physical address translation. My knowledge might be outdated... I never knew they implement everything from the Xeon E7 all the way down.
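
For anyone following the TLB back-and-forth, here is a minimal Python sketch of the translation-cache idea mentioned above: a small buffer that maps virtual page numbers to physical frame numbers, so that a hit skips the page-table lookup. Everything in it (the `ToyTLB` class, the 4 KiB page size, the dict standing in for a page table, the LRU eviction) is an assumption for illustration, not how any particular CPU implements it.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages, for illustration only


class ToyTLB:
    """Tiny LRU cache of virtual page number -> physical frame number."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # dict standing in for the real page table
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        # Split the virtual address into page number and offset within the page.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # the slow "page walk"
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return self.entries[vpn] * PAGE_SIZE + offset


# Hypothetical page table: virtual page -> physical frame.
page_table = {0: 7, 1: 3, 2: 9}
tlb = ToyTLB(page_table)
paddr_first = tlb.translate(1 * PAGE_SIZE + 20)   # miss: consults the page table
paddr_again = tlb.translate(1 * PAGE_SIZE + 100)  # hit: same page, already cached
print(paddr_first, paddr_again, tlb.hits, tlb.misses)
```

The second access lands on the same virtual page, so it is served from the buffer without touching the page table again, which is the speed-up being discussed.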

LukaP · October 3, 2015

AMD. 2 ALUs per "core". Now it's 4. They still won't be "cores" in the same sense as Intel's current design or AMD's K10. Also, shared cache cripples CPU cores.

forgot to answer before.

Intel's Haswell has 4 integer ALUs per core and 2 FPU ALUs. It also shares L2 cache and LLC. SMT design isn't about what is in a core, but how it's utilised.

In CMT, you don't think in cores, you think in modules. For example, one Excavator module has 2 decoders that send translated instructions to the schedulers (similar to two Intel cores). The difference is that in CMT, those two decoders share one FPU scheduler, meaning only one floating-point instruction can be executed per module per pipeline cycle. (With Intel it could be one per core, so 2, or even more; I can't recall how Haswell's multiple FPUs can be called on.)

And that is how AMD can claim integer multithreading, but it can't do floats without waiting. What cripples it is not shared cache or ALUs, or even how many they have. It's that they can't schedule all of them in one pipeline cycle.

@patrickjp93 correct anything you see wrong. It may be a lot, it's late.
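
A toy way to picture the shared-scheduler argument above, as a rough Python sketch. It assumes exactly one floating-point issue slot per cycle for a CMT module versus one per core for two separate cores, which is the simplification used in the post; real chips have more FP pipes and many other bottlenecks. The function name and the workload size are made up for illustration.

```python
import math


def cycles_to_drain(fp_ops_per_thread, threads, fp_issue_slots_per_cycle):
    """Cycles needed to issue all FP ops with a fixed number of FP issue slots per cycle."""
    total_ops = fp_ops_per_thread * threads
    return math.ceil(total_ops / fp_issue_slots_per_cycle)


# Hypothetical workload: two threads, 1000 FP instructions each.
# Assumption: one shared FP scheduler per module -> 1 FP issue slot per cycle.
shared = cycles_to_drain(1000, threads=2, fp_issue_slots_per_cycle=1)
# Assumption: two independent cores -> 2 FP issue slots per cycle in total.
separate = cycles_to_drain(1000, threads=2, fp_issue_slots_per_cycle=2)
print(shared, separate)
```

Under those assumptions the two threads take twice as long on the shared scheduler, which is the "can't do floats without waiting" point; integer work is not modelled here at all.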

patrickjp93 · October 3, 2015

forgot to answer before.

Intel's Haswell has 4 integer ALUs per core and 2 FPU ALUs. It also shares L2 cache and LLC. SMT design isn't about what is in a core, but how it's utilised.

In CMT, you don't think in cores, you think in modules. For example, one Excavator module has 2 decoders that send translated instructions to the schedulers (similar to two Intel cores). The difference is that in CMT, those two decoders share one FPU scheduler, meaning only one floating-point instruction can be executed per module per pipeline cycle. (With Intel it could be one per core, so 2, or even more; I can't recall how Haswell's multiple FPUs can be called on.)

And that is how AMD can claim integer multithreading, but it can't do floats without waiting. What cripples it is not shared cache or ALUs, or even how many they have. It's that they can't schedule all of them in one pipeline cycle.

Pardon but I believe it's 2 integer, 1 mixed vector, and 1 dedicated FPU. I don't think Haswell has 6 compute units in total.

jos · October 3, 2015

forgot to answer before.

Intel's Haswell has 4 integer ALUs per core and 2 FPU ALUs. It also shares L2 cache and LLC. SMT design isn't about what is in a core, but how it's utilised.

In CMT, you don't think in cores, you think in modules. For example, one Excavator module has 2 decoders that send translated instructions to the schedulers (similar to two Intel cores). The difference is that in CMT, those two decoders share one FPU scheduler, meaning only one floating-point instruction can be executed per module per pipeline cycle. (With Intel it could be one per core, so 2, or even more; I can't recall how Haswell's multiple FPUs can be called on.)

And that is how AMD can claim integer multithreading, but it can't do floats without waiting. What cripples it is not shared cache or ALUs, or even how many they have. It's that they can't schedule all of them in one pipeline cycle.

@patrickjp93 correct anything you see wrong. It may be a lot, it's late.

But it reduces number of transistors and hence can reduce price...

jos · October 3, 2015

Pardon but I believe it's 2 integer, 1 mixed vector, and 1 dedicated FPU. I don't think Haswell has 6 compute units in total.

Then how did they achieve similar instructions per cycle? I am not sure, but someone said both have the same IPC. I know there are other variables, but still...

LukaP · October 3, 2015

Pardon but I believe it's 2 integer, 1 mixed vector, and 1 dedicated FPU. I don't think Haswell has 6 compute units in total.

Maybe. I can't really find any block diagram for Haswell for some reason...

But it reduces number of transistors and hence can reduce price...

Actually no. It increases the transistors needed for similar performance, by 2x (ish).

Okay, so all systems have two CPUs. So let’s look at the CPUs themselves:

Opteron 6276: 8-module/16-thread, which has two Bulldozer dies of 1.2B transistors each, total 2.4B transistors

Opteron 6220: 4-module/8-thread, one Bulldozer die of 1.2B transistors

Opteron 6174: 12-core/12-thread, which has two dies of 0.9B transistors each, total 1.8B transistors

Xeon X5650: 6-core/12-thread, 1.17B transistors

But if we look at the actual benchmarks, we see that the reality is different: AMD actually NEEDS those two dies to keep up with Intel’s single die. And even then, Intel’s chip excels in keeping response times short. The new CMT-based Opterons are not all that convincing compared to the smaller, older Opteron 6174 either, which can handle only 12 threads instead of 16, and just uses vanilla SMP for multithreading.

https://scalibq.wordpress.com/2012/02/14/the-myth-of-cmt-cluster-based-multithreading/
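
To put the "2x (ish)" figure next to the numbers quoted above, here is a small Python sketch that only does the arithmetic on the per-package transistor counts as listed (in billions). It says nothing about performance by itself; the performance comparison comes from the benchmarks in the linked article.

```python
# Transistor counts per CPU package, in billions, as quoted in the post above.
transistors = {
    "Opteron 6276": 2 * 1.2,  # two Bulldozer dies
    "Opteron 6220": 1.2,      # one Bulldozer die
    "Opteron 6174": 2 * 0.9,  # two dies
    "Xeon X5650": 1.17,
}

# Express every package relative to the Xeon X5650.
baseline = transistors["Xeon X5650"]
ratios = {name: count / baseline for name, count in transistors.items()}

for name, ratio in ratios.items():
    print(f"{name}: {transistors[name]:.2f}B transistors, {ratio:.2f}x the X5650")
```

The Opteron 6276 comes out at roughly twice the X5650's transistor count, which is where the "2x (ish)" comes from.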

LukaP · October 3, 2015

Then how did they achieve similar instructions per cycle? I am not sure, but someone said both have the same IPC. I know there are other variables, but still...

2x intALU = 2FLOP/cycle

vs

1 fpALU + 1vec/fpALU = 1FLOP/cycle + 1FLOP/cycle = 2FLOP/cycle

jos · October 3, 2015

Actually no. It increases the transistors needed for similar performance, by 2x (ish).

https://scalibq.wordpress.com/2012/02/14/the-myth-of-cmt-cluster-based-multithreading/

For the same performance... but they never aimed for that. Their pitch was: we have a quad core clocked at some GHz, it is cheaper than Intel, and hey, we throw in a good GPU along with it. In that case they saved money. That was their thinking: the old race for core count and clock speed, but competing on price. That was the market position they tried for, but they failed.

LukaP · October 3, 2015

For the same performance... but they never aimed for that. Their pitch was: we have a quad core clocked at some GHz, it is cheaper than Intel, and hey, we throw in a good GPU along with it. In that case they saved money. That was their thinking: the old race for core count and clock speed, but competing on price. That was the market position they tried for, but they failed.

I'm not sure what you're trying to say here, but the fact is, for the same performance (roughly) you need 2x (roughly) more transistors with CMT versus SMT. So it is in no way cheaper to produce.

TOMPPIX · October 3, 2015

Hasn't Intel won already? Zen would have to be A LOT faster than an i7 6700K and be in the same price range for me to even consider buying it, or it could perform the same as an i7 6700K and be cheaper. I'm talking gaming here, not video rendering.

LukaP · October 3, 2015

Hasn't Intel won already? Zen would have to be A LOT faster than an i7 6700K and be in the same price range for me to even consider buying it, or it could perform the same as an i7 6700K and be cheaper. I'm talking gaming here, not video rendering.

Then they have already lost you. Unless AMD and NV make quantum leaps in GPU performance, you won't see a difference in gaming between any recent CPUs (talking 2600K through Zen here) anyway.

Humbug · October 3, 2015

Hasn't Intel won already? Zen would have to be A LOT faster than an i7 6700K and be in the same price range for me to even consider buying it, or it could perform the same as an i7 6700K and be cheaper.

It's very likely that they will match Intel in discrete GPU gaming workloads, as these are mostly GPU limited.

WereCat · October 3, 2015

Hasn't Intel won already? Zen would have to be A LOT faster than an i7 6700K and be in the same price range for me to even consider buying it, or it could perform the same as an i7 6700K and be cheaper. I'm talking gaming here, not video rendering.

I don't think AMD needs to have better IPC than Intel. They need to get close enough at least, and then add features that matter.

Prysin · October 3, 2015

Hasn't Intel won already? Zen would have to be A LOT faster than an i7 6700K and be in the same price range for me to even consider buying it, or it could perform the same as an i7 6700K and be cheaper. I'm talking gaming here, not video rendering.

No. As long as AMD can offer ANY product equal to or just slightly behind Intel, then they have "won" a chance to shine.

What AMD needs is to get close to Haswell.

What AMD needs is a cheaper platform than Skylake, featuring the same or more features, at a lower price point.

Because today, going from a Z97 4690K to a Z170 6600K/6700K costs around 500 USD or MORE!!!...

If AMD can give you the same features as Skylake (or a few more) with Haswell performance, but at lower than Skylake prices, then all the people who still run pre-Haswell systems or current-gen AMD systems will have a GOOD upgrade.

Because if you offer a more feature-rich "Haswell" option, then that is naturally better. We know that ZENplus will arrive in mid/late 2018. This would without a doubt be faster than Haswell (even if just by 5-10%), and knowing there WILL be an upgrade path makes the ZEN platform more than viable.

ZEN will also come into the game "late". But by then DDR4 prices should have hit current-gen DDR3 levels. Thus it is even cheaper to upgrade to AMD.

Intel's motherboards will remain a bit pricey because of how they moved a lot of the voltage regulation over to the mobo (again). So expecting Z150/Z170 boards to drop in price by any massive margin is a long shot.

DDR4 boards and boards with voltage regulation on the board itself have existed for a long time, so manufacturing costs tied to these features shouldn't be that high. The systems themselves (more traces on the mobo to cover 288 pins for DDR4 rather than 240 for DDR3) and more chips are what is driving prices up.

TL;DR

If AMD can give us Haswell performance with Skylake features at reasonable prices (lower than Skylake) they have won.

Because upgrade-wise, they will make more sense than Intel will (Intel will be a lot pricier, but not offer any real FPS advantage to make up for the price disparity).

Ren · October 3, 2015

This should be in CPUs subforum or something, not News..

Shakaza · October 3, 2015

This should be in CPUs subforum or something, not News..

This is news...about CPUs. CPUs are tech, so this is the appropriate subsection. If you disagree, too bad. That's up for the mods to decide, not you.

MageTank · October 3, 2015

In coding, modularity is beautiful both in theory and application. Debugging is so much easier when you can separate everything out. That, and compilers tend to optimize better when you have fewer lines per method/function. For Clang the optimal # of lines per function is about 9 unless it's basically doing a list of function calls.

You are probably the most optimistic coder i have ever met. My father and brother are both programmers, and they complain about everything. From documentation (they think people should just automatically understand their work) to porting (they think people should just use linux for everything, and hate anything windows server). If i can go through one night without hearing them argue over MySQL vs MSSQL, i'd be a happy man.

Prysin · October 3, 2015

You are probably the most optimistic coder i have ever met. My father and brother are both programmers, and they complain about everything. From documentation (they think people should just automatically understand their work) to porting (they think people should just use linux for everything, and hate anything windows server). If i can go through one night without hearing them argue over MySQL vs MSSQL, i'd be a happy man.

he is a student....

so the amount of bullshit half assed coding work he has to fix before lunch on a daily basis is limited.

Dabombinable · October 3, 2015

It has more L2 cache than Skylake, double actually...

http://www.fudzilla.com/news/processors/37494-amd-x86-16-core-zen-apu-detailed

http://techreport.com/review/28751/intel-core-i7-6700k-skylake-processor-reviewed/4

My God how does @LukaP stand you people? At least when Opcode and I argued it was over business practices and projections, and the only reason he left was because of a blood feud with one of the mods.

Sorry, my mistake. I was misinterpreting the diagram. It's double the L2 cache per core of an i7 4790K: 256KiB -> 512KiB. And yes, unlike you I'll admit when I'm wrong.

I understand why you thought it was CMT, i am just saying, this is SMT. If this block is an accurate representation of what we will see in the final product, you shouldn't have to worry. The #1 downfall of CMT was the way the resources were managed. Modularity. Great in theory, not so great in application.

I know it is SMT, however it's the way AMD is actually designing the CPU around it that reminds me of their CMT line with "more threads=better".

Edit: It's the same amount of cache per core as a Phenom II N970.

patrickjp93 · October 3, 2015

You are probably the most optimistic coder i have ever met. My father and brother are both programmers, and they complain about everything. From documentation (they think people should just automatically understand their work) to porting (they think people should just use linux for everything, and hate anything windows server). If i can go through one night without hearing them argue over MySQL vs MSSQL, i'd be a happy man.

In C++ you should be able to write your code in such a way a college freshman can read it and understand it, but documentation isn't too bad if you write as you go. As for porting, yeah, it sucks... Answer: Oracle, done.

patrickjp93 · October 3, 2015

he is a student....

so the amount of bullshit half assed coding work he has to fix before lunch on a daily basis is limited.

I have to test and grade 56 copies of BS coding work before lunch every day. Also, I do have work experience coding for IBM. Working in systems built by amateurs of the past who had no respect for design patterns is painful, yes. That said, I can fix it.

patrickjp93 · October 3, 2015

No. As long as AMD can offer ANY product equal to or just slightly behind Intel, then they have "won" a chance to shine.

What AMD needs is to get close to Haswell.

What AMD needs is a cheaper platform than Skylake, featuring the same or more features, at a lower price point.

Because today, going from a Z97 4690K to a Z170 6600K/6700K costs around 500 USD or MORE!!!...

If AMD can give you the same features as Skylake (or a few more) with Haswell performance, but at lower than Skylake prices, then all the people who still run pre-Haswell systems or current-gen AMD systems will have a GOOD upgrade.

Because if you offer a more feature-rich "Haswell" option, then that is naturally better. We know that ZENplus will arrive in mid/late 2018. This would without a doubt be faster than Haswell (even if just by 5-10%), and knowing there WILL be an upgrade path makes the ZEN platform more than viable.

ZEN will also come into the game "late". But by then DDR4 prices should have hit current-gen DDR3 levels. Thus it is even cheaper to upgrade to AMD.

Intel's motherboards will remain a bit pricey because of how they moved a lot of the voltage regulation over to the mobo (again). So expecting Z150/Z170 boards to drop in price by any massive margin is a long shot.

DDR4 boards and boards with voltage regulation on the board itself have existed for a long time, so manufacturing costs tied to these features shouldn't be that high. The systems themselves (more traces on the mobo to cover 288 pins for DDR4 rather than 240 for DDR3) and more chips are what is driving prices up.

TL;DR

If AMD can give us Haswell performance with Skylake features at reasonable prices (lower than Skylake) they have won.

Because upgrade-wise, they will make more sense than Intel will (Intel will be a lot pricier, but not offer any real FPS advantage to make up for the price disparity).

For the 6600K you can do it for $320.

TetraSky · October 3, 2015

It can have whatever bell and whistle it wants, if it doesn't perform better than what Intel is offering today, it might as well be dead on arrival.

To me this just looks like the same kind of hype there was behind their Bulldozer CPUs... And we all know how that ended.

marldorthegreat · October 3, 2015

Then they have already lost you. Unless AMD and NV make quantum leaps in GPU performance, you won't see a difference in gaming between any recent CPUs (talking 2600K through Zen here) anyway.

It's looking like that will happen next year, though. Smaller process nodes are making GPUs more powerful than ever. 28nm is really holding back Nvidia and, to an extent, AMD.
