Intel & AMD, Architectural Discussion, How Far Ahead Is Intel ?

TechFan@ic · August 21, 2013

Well guys, I've been doing my own research and analysis on the performance of AMD & Intel CPUs for some time. I did not run tests personally; instead I tried to gather as much knowledge as possible to answer a question that has been on my mind: how far ahead is Intel, really? That question led me to some very interesting realizations.

The discussion will be highly technical and might seem boring to some. I'll start off with the basics & end with the answer to the original question, plus several others.

OK, let's begin. Our discussion will center on AMD's modular Bulldozer/Piledriver architecture & Intel's Core series, from Westmere through Sandy Bridge/Ivy Bridge to Haswell.

The Architectures : AMD Bulldozer/Piledriver

The building block of AMD's modular architecture is not an x86 core but an x86 module. A module contains everything a core contains, but with a number of components doubled: the integer scheduler, its datapath, 16KB of L1 data cache & its own load/store unit.
The doubling of these components is what gives each AMD module essentially two cores/threads.
The two integer cores share the early pipeline stages (e.g. L1 instruction cache, fetch, decode), the floating point unit, and the L2 cache.

How this affects performance
Depending on the workload, the quad-module, eight-core AMD processors can perform either as eight processing units or as four. Which one you get depends entirely on whether the workload leans on integer or floating point calculations.
If integer calculations are needed there are eight integer pipelines that can be used; if floating point calculations are needed there are only four.
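
To make the counting above concrete, here is a minimal sketch of it in Python. The `ModularCpu` class and its method are made-up names for illustration only, and the model takes the "eight integer pipelines, four floating point pipelines" counting at face value; real hardware behaves in subtler ways than a simple unit count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModularCpu:
    """Hypothetical toy model of a module-based CPU (illustration only)."""
    modules: int
    int_cores_per_module: int = 2  # two integer cores per module
    fpus_per_module: int = 1       # one shared floating point unit per module

    def usable_pipelines(self, workload: str) -> int:
        """Pipelines a fully threaded workload of the given kind can keep busy."""
        if workload == "integer":
            return self.modules * self.int_cores_per_module
        if workload == "float":
            return self.modules * self.fpus_per_module
        raise ValueError(f"unknown workload: {workload!r}")


# A quad-module chip such as the FX 8350 discussed in this post.
fx_8350 = ModularCpu(modules=4)
print(fx_8350.usable_pipelines("integer"))  # 8
print(fx_8350.usable_pipelines("float"))    # 4
```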

The difference between an integer unit and a floating point unit is that an integer unit only processes integers, as in whole numbers (fractions can be handled only as fixed-point values, where the position of the decimal point is fixed in advance), while a floating point unit processes floating-point values, which can approximate fractions that don't end, so to speak.

So an integer core can handle a value like 5/2 as fixed point (2.5 stored as the whole number 25 with an implied scale of 10), while a floating point unit can handle a value like 22/3, which equals 7.33333333333... (a number that doesn't end) and is in turn approximated as 733333 × 10^-5, or 7.33333.
Think of the floating point unit as scientific notation for processors.
Why is a floating point unit needed? There is a limit to the range an integer core can represent: a 64-bit integer register holds one of 2⁶⁴ discrete values. If the values you need fall outside that range, or need fractional precision, you approximate them & code the work for the floating point unit. This can also help save on-die memory (L1, L2 & L3 cache) space.
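
If you want to see the difference for yourself, here is a small Python sketch using only the standard library. It is just an illustration of the idea above, not a description of how any particular CPU implements it; note that real hardware uses a base-2 exponent rather than the base-10 notation in my example.

```python
import math
from fractions import Fraction

# A 64-bit integer register can hold one of 2**64 distinct bit patterns.
distinct_values = 2**64
print(distinct_values)  # 18446744073709551616

# 5/2 is exactly representable as a binary floating-point value.
print(Fraction(5 / 2) == Fraction(5, 2))  # True

# 22/3 never terminates, so the stored double is only the nearest approximation.
x = 22 / 3
print(Fraction(x) == Fraction(22, 3))  # False

# Floating point is "scientific notation in base 2": x == m * 2**e.
m, e = math.frexp(x)
print(e)                # 3
print(m * 2**e == x)    # True
```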
You can read more about the difference between integer processing and floating point processing here & here.

Here are examples of the effect this has on performance.
In 7-Zip the compression workload is very integer intensive, so a quad-module AMD CPU performs well because it can provide eight integer threads.

However in floating-point heavy workloads like a synthetic render scene in Cinebench the eight integer pipelines are held back by the four floating point pipelines in the 8350 which prevents the CPU from reaching its full performance potential.

If the 8350 had eight floating point units, and performance scaled perfectly, the single-threaded score (about 1.1) would be multiplied by 8, resulting in a score of 8.8.

The sharing of the early pipeline stages like the decode stage can also present similar symptoms to the sharing of the floating point unit.
Elimination of this resource sharing can improve performance by up to 25%.
Here is the in-depth analysis.
With Steamroller (AMD's 2014 update to the modular CPU architecture) decode resources have been doubled, which will minimize resource sharing and make it exclusive to the floating point unit.

What's also worth noting is that the majority of games involve a great many floating-point operations. In games where the floating-point workload becomes too intensive for the CPU, the developers code it for the GPU; examples of this are Nvidia's PhysX, AMD's TressFX & the Havok physics engine.

Why is AMD not paying attention to floating-point performance ?
To answer this question we must understand AMD's goals. AMD combined GPU cores capable of general-purpose computing (GPGPU) with CPU cores in the APU, in what AMD calls a heterogeneous architecture. The aim is not necessarily to make a product that replaces entry-level discrete graphics cards; the end goal is to let the GPU cores handle floating point operations (and any parallel workload). Floating point workloads can often be massively parallel, so a GPU architecture would be able to crunch floating-point data orders of magnitude faster than a single large floating-point unit.

AMD's building block vs. Intel's.
AMD Bulldozer/Piledriver module (2 integer cores & 1 floating point unit) on top.
Intel Westmere core (1 integer core and 1 floating point unit) on the bottom.

Both are built on the 32nm process, Sandy Bridge is also built on the 32nm process.
The sizes of Intel's & AMD's building blocks (excluding the L2/L3 2MB caches) are :
Westmere: 17.2mm^2
Sandy Bridge: 18.4mm^2
Bulldozer Module: 19.42mm^2
So manufacturing process aside, a modern Intel core is roughly the same size as an AMD module.
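
For anyone who wants to check the "roughly the same size" claim, here is a quick Python sketch using only the three area figures quoted above. The even split of a module into two integer cores at the end is my own rough simplification, since the shared front end and floating point unit can't really be divided that way.

```python
# Building-block areas in mm^2 as quoted above (32nm, large caches excluded).
AREA_MM2 = {
    "Westmere core": 17.2,
    "Sandy Bridge core": 18.4,
    "Bulldozer module": 19.42,
}

module_area = AREA_MM2["Bulldozer module"]

# How much larger is one AMD module than each Intel core?
ratios = {
    name: module_area / area
    for name, area in AREA_MM2.items()
    if name != "Bulldozer module"
}
for name, ratio in ratios.items():
    print(f"Bulldozer module vs {name}: {ratio:.2f}x the area")

# Rough area per integer core, assuming an even split of the module
# (a simplification: the shared parts are not actually divisible).
area_per_int_core = module_area / 2
print(f"Area per integer core: {area_per_int_core:.2f} mm^2")
```

By these numbers the module is about 13% larger than a Westmere core and about 6% larger than a Sandy Bridge core.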

Size comparison :

OK, so now that we've established that two AMD "cores" are roughly the same size as one Intel Sandy/Ivy/Haswell core, let's dive into the benchmarks.
We'll be comparing the FX 8350 (die size = 315mm^2) & the i7 3820 (die size = 294mm^2). The FX 8350 is approximately 21mm^2 larger because of its somewhat larger cache pool.

What is a CPU Die (Die Size) ? click here .

Hash-rate, same algorithm used to mine bitcoins : uses all integer threads available.

FPU VP8 / SinJulia : uses all floating point threads available.

7-Zip file compression : uses all threads available.

Video encoding : uses all threads available.

Photoshop CS6 : uses one thread.

Cinebench : single threaded

Ray-tracing renderer : uses all threads available.

So we're noticing a pattern here, one we're all familiar with: single-threaded workloads run faster on the larger Intel cores, while anything that utilizes all available threads runs faster on the smaller, more parallel architecture of the AMD module.
This is nothing new. The 4 extra logical threads (hyper-threading) of Intel's i7 processors help it keep up with AMD's eight physical integer cores.

Compression : uses all integer threads available.

The physical integer cores still maintain a performance lead, but the size of that lead depends on how well optimized the workload is for hyper-threading.
But what about Intel CPUs that don't support hyper-threading?

In that case 3 AMD modules become as fast as 4 Intel cores as long as they are clocked 500 MHz higher. A 500 MHz delta isn't really large, considering that Haswell i5s usually top out at 4.3 GHz while the 6-core AMD processors can reach 4.8 GHz on the same cooling.
FX 6300 @ 3.5 GHz vs i5 4430 constant turbo on all 4 cores @ 3.0 GHz.
FX 6350 @ 3.9 GHz vs i5 4570 constant turbo on all 4 cores @ 3.4 GHz.
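
The pairings above can be turned into a rough per-clock comparison. The Python sketch below assumes, as the comparison does, that each pair delivers equal multi-threaded throughput, and additionally assumes that throughput scales linearly with clock speed and unit count, which is an idealization. The function name is made up for illustration.

```python
def intel_cores_per_module(amd_modules, amd_ghz, intel_cores, intel_ghz):
    """Intel cores that one AMD module is worth at equal clocks.

    Assumes both chips deliver the same total throughput and that
    throughput scales linearly with clock speed and unit count.
    """
    return (intel_cores * intel_ghz) / (amd_modules * amd_ghz)


# (label, AMD modules, AMD GHz, Intel cores, Intel GHz) from the pairings above.
pairs = [
    ("FX 6300 vs i5 4430", 3, 3.5, 4, 3.0),
    ("FX 6350 vs i5 4570", 3, 3.9, 4, 3.4),
]

results = {}
for label, modules, amd_ghz, cores, intel_ghz in pairs:
    results[label] = intel_cores_per_module(modules, amd_ghz, cores, intel_ghz)
    print(f"{label}: one module ~ {results[label]:.2f} Intel cores per clock")
```

Under those assumptions one module comes out at roughly 1.14 to 1.16 Intel cores' worth of multi-threaded throughput at the same clock.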

What about games ?

As things stand right now, the majority of games are coded for either two or four threads, which gives the larger Intel cores the advantage. Recently, however, game engines that support up to 6 threads have started to appear.
There is a shift towards higher parallelism in games & this is driven by the next-gen consoles, which have 8 cores, 6 of which are fully dedicated to games.
According to various developers this will greatly benefit AMD's modular architecture. Source : #1 #2 #3
Newer games like Battlefield 4, Crysis 3 & Far Cry 3 run very well on AMD's modular CPU architecture.

Back to my original question which is: how far ahead is Intel ?

A dual-core AMD module, which takes roughly the same die area as a single Intel core, has significantly more throughput but also higher single-thread latency.
So architecturally speaking, neither Intel nor AMD is really ahead of the other. The parallel nature of AMD's Bulldozer modules gives them the total throughput advantage over Intel cores of the same size, while what Intel's architecture lacks in total CPU throughput it makes up for in single-threaded performance, augmented with hyper-threading.

In an absolute sense an AMD module can process more data per second than an Intel core of the same size. The sacrifice for this higher throughput is single-threaded performance, as we've discussed above. However, the majority of CPU-intensive programs already rely on multiple cores/threads & the ones that don't are moving that way more rapidly as time passes, which will in turn make this trade of single-threaded performance for more total performance worth it for casual and power users alike.

The reason Intel stuck to architectures focused on single-threaded performance is that it's easier to write code for one thread than for two, for two than for three, and so on. It's also much more complicated to design a multi-threaded module that shares resources, as AMD did. But it's becoming extremely difficult to push more performance out of a single thread in the highly power-constrained computing environments that have become the norm today, which makes the move to a higher level of parallelism inevitable.

Where Intel is really ahead of AMD is not the architecture, it's the manufacturing process. Intel has had 22nm at its disposal for nearly two years, while AMD has only now moved to 28nm process technology provided by GlobalFoundries. This is because AMD sold off its fabrication plants, which leaves the company no option but to use whatever manufacturing process is available from third parties like GlobalFoundries & TSMC.

The fabrication process gives Intel two advantages over AMD. The first is power efficiency: even though 3 AMD Piledriver modules match 4 Intel Haswell cores in performance when hyper-threading is unused, the 4 Intel cores will consume less power.

And even though smart technology implementations and fabrication tricks can enhance the power efficiency of a chip, generally speaking the die size of a CPU or GPU reflects power consumption at load extremely well. So with the more advanced manufacturing process, Intel can keep die sizes small and benefit from the power & cost savings.

This leads us to Intel's second advantage, which is higher profit margins, and that's where Intel is really ahead of AMD.

It was a long ride, I hope you've enjoyed it.

Samdb · August 21, 2013

An APU is much better than Intel's IGPU. And I would really enjoy to see this.

WunderWuffle · August 21, 2013

Well guys I've been doing my own research & analysis of the performance of AMD & Intel CPUs for some time, I did not do tests personally however I tried to salvage as much knowledge as possible to try and answer a question that has been on my mind, which is how far ahead is Intel? and that question led me to some very interesting realizations.

The discussion will be highly technical and might seem boring to some, but I wanted to gauge the interest of the community before I go on about writing what I had found out, because it's going to be quite long & do not wish to spend my time on something the community is not interested in.

So is anyone interested ?

Sure, I would love to read about this,

Jamdude · August 21, 2013

It'd be great to see what you've pulled together - I'm definitely interested.

Emperor_Piehead · August 21, 2013

It's not that intel is ahead it's that amd has gone a different route to where amd cpu's can't be compared easily to intel cpu. Like the reason why amd cpu's haven't done well in games is usually they are games that are single core and thread games that have a heavy focus on fpu while new games that support the amd architecture like crysis 3 are based on more cores, more threads and alu. @Kuzma has a thread about the comparison of amd and intel cpus.

TechFan@ic · August 21, 2013

@Kuzma has a thread about the comparison of amd and intel cpus.

Indeed he does & it's very good, but he tried to simplify his thread as much as possible so that most people can understand it, however this thread will be quite technical.

Emperor_Piehead · August 21, 2013

Indeed he does & it's very good, but he tried to simplify his thread as much as possible so that most people can understand it, however this thread will be quite technical.

Well, he didn't want to confuse people, but as a start to this thread it's a good place to start.

NinjaStyle013 · August 21, 2013

Intel focuses on single core performance and uses hyper threading to get more multithreaded performance.

AMD focuses on getting more multi core performance but misses out on single threaded performance.

I wanna see your write up.

SpenceSouth · August 21, 2013

Please post it. I'd love to read it.

Kuzma · August 22, 2013

It's not that intel is ahead it's that amd has gone a different route to where amd cpu's can't be compared easily to intel cpu. Like the reason why amd cpu's haven't done well in games is usually they are games that are single core and thread games that have a heavy focus on fpu while new games that support the amd architecture like crysis 3 are based on more cores, more threads and alu. @Kuzma has a thread about the comparison of amd and intel cpus.

Indeed he does & it's very good, but he tried to simplify his thread as much as possible so that most people can understand it, however this thread will be quite technical.

Well :p I could always edit the main post into containing a more technical section but to be honest :/ I could probably write a blog article 3 pages long on this and a forum really isn't the right place to put something like this.

Emperor_Piehead · August 22, 2013

Well :P I could always edit the main post into containing a more technical section but to be honest :/ I could probably write a blog article 3 pages long on this and a forum really isn't the right place to put something like this.

tbh I don't see why not it will turn some people off, but some people will read it and like it.

Kuzma · August 22, 2013

tbh I don't see why not it will turn some people off, but some people will read it and like it.

xD it will probably be double the length of the rest of the thread combined but I could always do it.

Vitalius · August 22, 2013

Well guys I've been doing my own research & analysis of the performance of AMD & Intel CPUs for some time, I did not do tests personally however I tried to salvage as much knowledge as possible to try and answer a question that has been on my mind, which is how far ahead is Intel? and that question led me to some very interesting realizations.

The discussion will be highly technical and might seem boring to some, but I wanted to gauge the interest of the community before I go on about writing what I had found out, because it's going to be quite long & do not wish to spend my time on something the community is not interested in.

So is anyone interested ?

I am very interested.

Please use spoilers and format the entire post however. I am weak to Walls of Text. They crit me all the time too. 8x damage sucks when you are a Lazy Type Human.

Cobalt · August 22, 2013

Woo, yes bring them info plz!!

More seriously, I would really like to see how far behind AMD really is in total IPC. Also, I think you may have insight on the whole FX module and not core thing; if you do, that would be even more awesome.

Kuzma · August 22, 2013

I am very interested.

Please use spoilers and format the entire post however. I am weak to Walls of Text. They crit me all the time too. 8x damage sucks when you are a Lazy Type Human.

Well, if I was to do mine... it'd be a wall of text so I might just leave it :p

Woo, yes bring them info plz!!

More seriously, I would really like to see how far behind AMD really is in total IPC. Also, I think you may have insight on the whole FX module and not core thing; if you do, that would be even more awesome.

Read my threads, there's loads of information there for you ^_^

iAsuno · August 22, 2013

post it. i have so much to learn.

Glenwing · August 22, 2013

AMD has not really put their plan in action yet... Their CPUs are strong in integer but weaker in floating point because they only have 1 FPU per pair of cores... but their intention is to offload that sort of calculation to a GPU which is far, far better than even the most powerful CPU at floating point operations (this is Heterogeneous System Architecture, or HSA). And this is where APUs come in, with their advanced onboard GPU. You can see they are pushing the FM2 platform now, since AM3+ doesn't have integrated graphics, and there are rumors of AM3+ being abandoned completely. A future FX-level APU with an onboard GPU will blaze through FLOps, and in theory their performance should improve in huge strides, especially if you have a dedicated GPU for the graphics work, and the integrated GPU in the APU can be dedicated to accelerating the CPU's floating point calculations.

codytappen · August 22, 2013

I'd like to see what your research says, this is how science works after all.

Kuzma · August 22, 2013

AMD has not really put their plan in action yet... Their CPUs are strong in integer but weaker in floating point because they only have 1 FPU per pair of cores... but their intention is to offload that sort of calculation to a GPU which is far, far better than even the most powerful CPU at floating point operations (this is Heterogeneous System Architecture, or HSA). And this is where APUs come in, with their advanced onboard GPU. You can see they are pushing the FM2 platform now, since AM3+ doesn't have integrated graphics, and there are rumors of AM3+ being abandoned completely. A future FX-level APU with an onboard GPU will blaze through FLOps, and in theory their performance should improve in huge strides, especially if you have a dedicated GPU for the graphics work, and the integrated GPU in the APU can be dedicated to accelerating the CPU's floating point calculations.

Great point Glenwing ^_^ , I can foresee some kind of FX-APU hybrid with most of the actual power for gaming being not so good to allow for more of the die space for compute.

Cobalt · August 22, 2013

Well, if I was to do mine... it'd be a wall of text so I might just leave it :P

Read my threads, there's loads of information there for you ^_^

Will do command!

Glenwing · August 22, 2013

Great point Glenwing ^_^ , I can foresee some kind of FX-APU hybrid with most of the actual power for gaming being not so good to allow for more of the die space for compute.

Games are pretty heavy on floating point calculations, so if AMD can hold off Intel long enough to pull this off... it should be a good show.

Kuzma · August 22, 2013

Games are pretty heavy on floating point calculations, so if AMD can hold off Intel long enough to pull this off... it should be a good show.

I will quote myself here

The second coming of Athlon

TechFan@ic · August 22, 2013

I'm still in the process of writing this up, I just frequently press post so I don't lose everything if firefox crashes.

TechFan@ic · August 22, 2013

I'm very tired T_T, will continue the write-up tomorrow, I hope you've enjoyed the discussion so far.

Dravic · August 22, 2013

You're missing the whole point. Intel ain't ahead, it's just AMD lagging behind.

They thought of modules? wow, great. They are slower than Intel counterparts... Why not just make Phenom III? I'll tell you why. Save die space for APUs.

Except they are too confident about APU performance - APUs will never catch up to regular dedicated GPUs. And as of today APUs are useless and overpriced (price to performance is really really bad, 50 bucks more you got a dedicated GPU + 4 core CPU).

And to make it clear, you're kinda talking about 8 core 4 module processors like they are something new... Intel can throw more cores at the problem at any time, they've had processors with lots of cores for years now... They just don't feel like putting them in the regular consumer price range.

Look up Intel Xeons, look up Intel Phi.
