
Intel & AMD, Architectural Discussion, How Far Ahead Is Intel ?

So they do have a plan!

LOL, it looks like the few years ahead are going to be EXTREMELY exciting.


Good job TechFan@ic. Nice write-up. Though I'm not sure I 100% agree with your point that Intel's better energy efficiency is solely a function of their better manufacturing process. But that's hard to figure out, I guess.

But your other points are pretty much spot-on.

 

To anyone who has doubts if APUs really are the future, just ask yourself this one question: What if PhysX was an AMD-technology?

Suddenly every gamer would get an APU just to have a dedicated PhysX processor built right into his CPU.

AMD's idea with the APUs was clearly right, they just could have executed it a bit better. But hopefully in the future more gaming/physics-stuff will move to OpenCL, and the power of the integrated graphics card will be better used for gaming.

As long as it is open source. "World is open source". I'm very happy seeing AMD do all the open-source things (Mantle, for example); it makes for fairer competition. I just hope that if/when they become the leading company (in the $ sense, not only in technology) they will stay that way.


Loved reading this! Thanks a lot, it's making me second-guess my choice to just go Intel for my next chip.

Really hope AMD aren't out of the strictly-CPU market.


Gaming/Engineering PC: -i7 6700K, 4-4.2GHz "Eleanor" -ASUS ROG HERO VIII MOBO -16GB DDR4 3000MHz Corsair (2x8GB) -Gigabyte Windforce 980Ti OC edition (1405MHz GPU clock) -H110i GT Corsair CPU Water cooler -980GB Sandisk Ultra II SSD -Corsair 450D ATX Case -RM850i Corsair PSU (Modular) -28” 4K Samsung -27” 1080p Samsung 


  • 1 month later...

Great job! I really really enjoyed reading this, and I do like the 8 cores while editing videos on power director 11. And gaming I feel a bit different about it, my 780 gives higher average fps on my 8350 than on my brother's 2500k, but the minimum fps is lower and it dips very frequently.


I couldn't understand AMD's modular approach until I read this article. This was a very good read. Thank you for creating this. :D

"Cough, Cough, Cough"


I can't believe I'm still getting comments & likes on my post, thanks a bunch guys.

 

Great job! I really really enjoyed reading this, and I do like the 8 cores while editing videos on power director 11. And gaming I feel a bit different about it, my 780 gives higher average fps on my 8350 than on my brother's 2500k, but the minimum fps is lower and it dips very frequently.

I agree with you when it comes to gaming performance but not entirely.
The long-standing issue with CPUs like the 8350 and AMD's older 1100T is that they have/had more cores than games can actually use, so more often than not the CPU is actually very powerful but the game just can't take advantage of the performance.

The major turnaround point for this old trend is right now: next-generation consoles both have 8 cores, 6 of which are entirely dedicated to games.
So essentially every graphically intensive next-gen game will natively support and fully take advantage of 6 CPU cores. Not only that, but developers will be heavily optimizing their code for the AMD architecture and AMD-specific instruction sets, which will give AMD CPUs another edge in performance per clock. Mantle will also extend core utilization to 8 cores and improve CPU performance even further.

This is why you see the FX 8350 performing exceptionally well in games like Battlefield 4 & Crysis 3 even though these games aren't next-gen exclusive.

Games like Thief & Titanfall which are developed exclusively for next-gen will make even better use of the cores.



I know, the future is so exciting! The games I tested the 780's performance in are Crysis 3, Battlefield 3, Far Cry 3, Blacklight: Retribution, Mirror's Edge and the Interstellar Marines beta. In most of them the minimum fps was lower on the 8350 and it dipped more often. However, that only occurred when the card was running at 1186MHz (fully stable); it had less stuttering and fewer framerate dips at stock speeds. That stuttering difference was less noticeable on my brother's PC.

About the comment: I just saw it today for the first time! I didn't even notice the date of the original post...


  • 1 month later...

I hope you will update this thread once steamroller CPUs/APUs are out.



I would've been more excited had AMD announced a quad module, eight core Steamroller CPU so I could compare it to the i7 4820K or the i7 4770K.

But that didn't happen unfortunately.

Once I can get my hands on an A10 7850K die-shot I will probably start dissecting the architecture to compare its performance/area to Intel's Haswell i3 processors.


  • 6 months later...

-ship-

 

Sorry but your work is flawed.

 

You must base your conclusions and numbers on R15 scores. R11.5 must be thrown out as at one point it was purposely using code that hurt AMD performance. 

 

The numbers also go up for AMD on R15 from a "fixed" R11.5.

 

Compare the numbers and in some cases you will see as much as 11% increase at the same clock speed from AMD when going from R11.5 to R15.



I completely agree, although I'm still doubtful that R15 has been cleaned up. It was also more difficult to find R15 benchmarks that included the i7 3820, which was the 8350's main point of comparison when I originally wrote the piece.

I honestly believe that synthetics, Cinebench included, shouldn't be used to benchmark CPUs. Only real-world applications and workloads should be tested.

Anand did use Cinebench 11.5 to illustrate his point about the FP performance of Bulldozer, and it does hold water, especially because he compared the six-core Phenom 1100T to the eight-core Bulldozer 8150. Since he used it to compare two AMD products rather than to draw a direct comparison between AMD & Intel, I think his point is valid.

Thankfully when Piledriver came it had cleaned up a lot of the issues with Bulldozer.


Some details I'd like to see are: what is AMD doing to alleviate its shortcoming in single-thread performance, and what is Intel likely to do in implementing a heterogeneous architecture given it is now working towards unified memory with Skylake. Given Intel already has the superior float performance in its CPU cores, what would it likely do when implementing its newest GPU core architecture? Would it cut down on the CPU float scheduler, or would it develop them both in tandem towards being better?

 

Obviously, if Intel cut down on its CPU core resources, it could potentially fit more cores and increase their clock rates to better equal AMD's and widen the performance gap, though this may also slow down the GPU cores.

Software Engineer for Suncorp (Australia), Computer Tech Enthusiast, Miami University Graduate, Nerd


Indeed, latency is becoming a serious problem: the higher the RAM speeds, the more it requires to run at those speeds. I personally think DDR4 will probably be the last DDR memory we're going to see; things like HMC will eventually take over.

I wouldn't be surprised to see the Volcanic Islands architecture implemented in future APUs; that's very likely to happen.

 

As for HMC, the cost is going to have to come way down. I have a few friends looking at the specs of the new MIT supercomputer built with these things. A 128 GB version costs a whopping $12,300. While that cost/GB is not a bad ratio, it gets much worse at smaller sizes, with the 64 GB version still costing $9,900.
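To put numbers on the cost/GB point, here is a minimal Python sketch. The prices and capacities are simply the figures quoted in the post; the function name `cost_per_gb` is only an illustration.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the price per gigabyte in US dollars."""
    return price_usd / capacity_gb


# Figures quoted in the post above (not independently verified).
hmc_128 = cost_per_gb(12_300, 128)
hmc_64 = cost_per_gb(9_900, 64)

print(f"128 GB version: ${hmc_128:.2f}/GB")
print(f"64 GB version:  ${hmc_64:.2f}/GB")
# How much more each gigabyte costs in the smaller module.
print(f"64 GB premium per GB: {hmc_64 / hmc_128 - 1:.0%}")
```

On these figures the smaller module comes out well over half again as expensive per gigabyte, which is the "gets much worse" effect described above.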



I only cherry picked a few benchmarks to illustrate either a weakness or a strong point of the architecture.

The benchmarks listed under "Benchmarks" were taken straight off HardwareCanuck's 8350 review, the rest are from Anandtech.

I should also mention that Cinebench and Passmark both favor Intel (they are compiled using the Intel libraries), but I posted them nonetheless.

This stopped being a scandal long ago, when the Intel compiler was fixed to stop hamstringing AMD processors at execution time. It no longer discriminates between brands, only on which instruction sets are available.



http://linustechtips.com/main/topic/176283-what-do-amd-cpus-have-over-intel/page-10#entry2385755

So let's talk about this, I welcome criticism & in fact I demanded it in my thread.

"- Piledriver can process two floating point operation at the same time. So it can function as 8 threads for the SIMD cluster"

That doesn't mean that the floating point unit has enough throughput to match the two integer cores/threads. Even if the floating point unit can process two FMACs at the same time, it doesn't mean that it can run two threads at the same time; there is only one FP backend, so the FP is fundamentally single-threaded in Bulldozer.

Similarly, each integer core has 2 AGen pipelines, but that doesn't mean a single integer core can run two threads simultaneously.

If we look at traditional AMD cores like K10 & K8 which are conceptually similar you will find that exclusive Floating Point real-estate in each core matches the exclusive integer real-estate in terms of die area. In AMD's Bulldozer module we see that AMD allocated significantly more die area (about double) to integer execution over floating point execution.

CRwb41M.jpg

K10.5 Core on the left (Stars core in Llano), Bulldozer/Piledriver core on the right.

Blue = Integer, Red = Floating Point.

The thing is that Bulldozer's FlexFPU CAN process two 128bit instructions. It is designed to be a smaller SIMD cluster. The FlexFPU can process one heavy 256bit SIMD instruction, however that only occurs in special workloads (extreme AVX). The FlexFPU is meant to process 2x64bit (or 1x128bit) instructions per "core".

SIMD instructions aren't exactly rare today, however they aren't as common as integer instructions. AMD figured that a single 128bit FMAC unit would be enough for general everyday use. They also ensured that 256bit FMAC would be supported (they support AVX, so it is a must); this is done by combining both FMAC units into one 256bit unit. I'm sure that when a 256bit SIMD instruction is fetched it is decoded into 2 mops, however this is only known to AMD.

Once the instructions have been queued in the SIMD cluster, the SIMD cluster doesn't care which instruction is from which thread, it just executes it.

So it can run two 128bit FMAC instructions from two different threads. Or use the 2 MMX units and 2 FMAC units separately.

There is one big difference between Bulldozer and AMD's previous architectures: CMT. In the case of Bulldozer, CMT essentially means duplicating the ALU cluster.

In theory the FX 8350 is a 4 CMT core processor. It features 8 ALU clusters, however.

 

"- "So architecturally speaking, Intel isn't really ahead of AMD nor is AMD ahead of Intel so to speak" - Extremely big no. Intel is far ahead in certain technologies. Especially the frontend and SIMD cluster."

Again AMD is significantly ahead in Integer, in fact they lead in integer even outperforming the extreme editions from Intel, AMD is also significantly ahead in high frequency, low-gate count design.

Each architecture excels in different areas. In terms of the architecture itself there is no clear winner; like I said, AMD wins in multi-threaded, Intel wins in single-threaded, and the benchmarks reflect that.

Intel maintains an efficiency advantage through their smaller process which I mention in my thread as well.

That is not correct anymore. AMD was leading back in the days of Sandy Bridge, and had a smaller advantage with Ivy Bridge.

Haswell has the same number of ALUs as Piledriver. Haswell features 4 ALUs per core (meanwhile Piledriver features 2 ALUs per ALU cluster), so in total they have 16 ALUs each. Let's look at the extreme processor from Intel, the 3930K. It features 3 ALUs per core: 3x6 = 18 ALUs in total. So the FX 8350 doesn't beat the extreme processor from Intel in heavy integer calculations, and Intel is quite effective at utilizing all the ALUs in its ALU cluster.
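For anyone who wants to check the ALU tally, here is a minimal Python sketch. The unit and ALU counts are just the figures stated in the post, not values checked against vendor documentation, and the dictionary layout is only an illustration. A raw ALU count also says nothing about how well those ALUs are kept busy.

```python
# (execution units, ALUs per unit) as claimed in the post above.
# "Units" means integer clusters for the FX 8350 and cores for the Intel chips.
chips = {
    "FX 8350 (Piledriver)": (8, 2),
    "i7 4770K (Haswell)": (4, 4),
    "i7 3930K": (6, 3),
}

# Total ALUs is simply units multiplied by ALUs per unit.
totals = {name: units * alus for name, (units, alus) in chips.items()}

for name, total in totals.items():
    print(f"{name}: {total} ALUs")
```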

The Core i7 4770K will have a significant advantage over the FX 8350. The FX 8350 relies much more on the software to be fully utilized. Haswell, with its fewer ALU clusters (but the same number of ALUs), will be easier to utilize.

A higher-frequency architecture at the price of a longer pipeline? That is not necessarily a good thing. There are more advantages to a shorter pipeline than to a longer one.

Bulldozer's CPI is a complete mess. A branch miss is discovered too late in Bulldozer's architecture, forcing almost the entire pipeline to be flushed.

You cannot claim "multi-threaded" when you aren't even being specific about the workload. Take a multi-threaded SIMD workload: the 6x256bit SIMD from Haswell would completely run over the FX 8350 with its 2xMMX units and 2x128bit FMAC units.

Do not equate multi-threaded with parallel. It is NOT the same. Also, multi-threaded is anything using more than a single thread. You will literally need to utilize ALL 8 ALU clusters on the FX 8350 to even match the Core i7 4770K in certain workloads. That is one of the big issues with Bulldozer: it relies too much on the software to be fully utilized.

Intel's greatest advantages remain the frontend and cache system. (AMD is not behind in terms of frontend technology, as they are on par with other manufacturers. However, Intel is just ahead.)


Sorry but your work is flawed.

 

You must base your conclusions and numbers on R15 scores. R11.5 must be thrown out as at one point it was purposely using code that hurt AMD performance. 

Many people didn't notice it. Google the 9590 scores quickly: some came out with 7.8 and others came out with 9.0. Not sure what brought this performance improvement about, probably that hotfix?

This dude scored 9.0: youtube.com/watch?v=8SFJVH9AZmw



 

Hell it could be many reasons.

 

Windows did 2 "hotfixes": one you had to download and install as an option, and the other was implemented in a patch.

Windows 7 vs Windows 8.

VRM throttling or board issues.

Cinebench changing the way the threads were executed by which code/instruction set it used.

There was a Bulldozer fix that you could download that would force some programs to use a certain instruction set. Not sure if that works on Cinebench but it works (worked) on Super PI, for example.



This has been done with the ud5: http://www.kitguru.net/components/cpu/zardon/amd-fx9590-5ghz-review-w-gigabyte-990fxa-ud5/13/

This with the 990FX sabertooth: http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/62166-amd-fx-9590-review-piledriver-5ghz-3.html

If I'm not mistaken, the VRM on the UD5 isn't better than or even close to the one on the 990FX Sabertooth. C11.5 came out in 2010 and guru3d currently only has that one, so perhaps it's either those hotfixes or just a newer release that improved their performance.



I agree with the majority of what you've said and I don't have any problem with it, but you seem to misinterpret what the specifications actually mean for performance.

We already know that Bulldozer has two FMAC units which can process 256bit AVX instructions together as a single unit. It's very similar to Intel's Sandy Bridge approach, which borrows 128bit SIMD from integer rather than combining two 128bit floats, so they sacrificed integer execution to gain more throughput out of the FPU.

This approach saves die area & power, but this mutually exclusive sharing also significantly diminishes the performance of mixed integer/float workloads.
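As a rough picture of what "two FMAC units working together as a single 256bit unit" means, here is a purely conceptual Python sketch. It only models the lane arithmetic (eight 32-bit floats split into two groups of four); it says nothing about how the real hardware schedules or decodes the operation, and the function names `fma_128` and `fma_256_ganged` are made up for illustration.

```python
from typing import List

LANES_128 = 4  # four 32-bit floats fit in one 128-bit register


def fma_128(a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Model one 128-bit FMAC unit: a*b + c on four float lanes."""
    assert len(a) == len(b) == len(c) == LANES_128
    return [x * y + z for x, y, z in zip(a, b, c)]


def fma_256_ganged(a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Model a 256-bit FMA as two 128-bit halves, one per FMAC unit."""
    lo = fma_128(a[:LANES_128], b[:LANES_128], c[:LANES_128])
    hi = fma_128(a[LANES_128:], b[LANES_128:], c[LANES_128:])
    return lo + hi


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [2.0] * 8
c = [1.0] * 8
result = fma_256_ganged(a, b, c)
print(result)
```

The point of the model is that while both units are ganged on one 256-bit operation, neither is free to serve a separate 128-bit operation, which is the sharing trade-off discussed above.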

From anandtech

Compared to a quad-core Phenom II, AMD's eight-core (quad-module) FX sees no drop in floating point execution resources. AMD's architecture has always had independent scheduling for integer and floating point instructions, and we see the same number of execution ports between Phenom II cores and FX modules. Just as is the case with the integer cores, the shared FP core in a Bulldozer module has larger scheduling hardware in front of it than the FPU in Phenom II.

The problem is AMD had to increase the functionality of its FPU with the move to Bulldozer. The Phenom II architecture lacks SSE4 and AVX support, both of which were added in Bulldozer. Furthermore, AMD chose Bulldozer as the architecture to include support for fused multiply-add instructions (FMA). Enabling FMA support also increases the relative die area of the FPU. So while the throughput of Bulldozer's FPU hasn't increased over K8, its capabilities have.

http://www.anandtech.com/show/4955/the-bulldozer-review-amd-fx8150-tested/2

You seem to think that because a Bulldozer FPU can process FMA & AVX instructions it's faster than Phenom, which isn't true; it's more capable, yes, but not faster.

It can process different types of instructions but the processing itself isn't significantly faster than Phenom.

AMD still leads in integer even compared to Haswell no matter how you slice it. Each AMD module has significantly more integer throughput than a Haswell core, and both have the same die size, so from an architectural standpoint that's how a comparison should be drawn. You can't compare a Bulldozer integer core/cluster to a Haswell core because it's less than half the size. Similarly, you can't compare one ALU to another; not all ALUs are created equal.

You can clearly see in both of these instances that, between the similarly clocked 7850K (dual-module Steamroller) and i3 4330 (dual-core Haswell), the 7850K is ahead in integer.

KAVERI-APU-41.jpg

KAVERI-APU-40.jpg

http://www.hardwarecanucks.com/forum/hardware-canucks-reviews/65031-amd-kaveri-a10-7850k-a8-7600-review-9.html

Against Sandy Bridge, Bulldozer was faster in both integer and floats, but since the introduction of Haswell's more advanced FPU AMD has lost their float lead. However, AMD's upcoming Excavator will introduce the same float functionality as Haswell. It should also be mentioned that Sandy & Ivy are fundamentally the same: Ivy is a die-shrunk Sandy with very fine improvements and tweaks. Likewise, Broadwell will be a die-shrunk Haswell.

Compared to Sandy Bridge, Bulldozer only has two advantages in FP performance: FMA support and higher 128-bit AVX throughput. There's very little code available today that uses AMD's FMA instruction, while the 128-bit AVX advantage is tangible.

http://www.anandtech.com/show/4955/the-bulldozer-review-amd-fx8150-tested/2

 

Also, what is a multi-threaded workload? It's any workload that can utilize the multiple threads that the CPU has to offer. Any CPU-intensive application is smart enough to take advantage of all of the available performance. Exclusively single-threaded workloads exist but they're far less common, because as soon as you decide or are forced to create a single-threaded workload you instantly realize that you're throwing away the majority of the performance inside any processor today, since all processors sold on the market today are multi-core CPUs. In power-constrained environments this also means that you're going to throw away a significant amount of efficiency, meaning you're eating away at battery life, which is a huge concern.


Against Sandy Bridge Bulldozer was faster in both Integer and Floats but since the introduction of Haswell's more advanced FPU AMD lost their float lead, however AMD's upcoming Excavator will introduce the same float functionality as Haswell. It also should be 

Sandy bridge is still faster than BD in floats.

Your benchmarks are flawed again, 4770K doing 3GB/s when it's in reality at 4GB/s. Just quickly opened Aida64 to point this out.

ePVsMyB.png

Clock for clock it would pass it quite easily.



No it isn't. Sandy only has one dedicated 128bit SIMD per floating point unit. Bulldozer has two.

Also your benchmark shows the 8350 ahead of the 4770.

Also the 4770 turbos to 3.7GHz on all cores according to Intel's official specifications: http://www.intel.com/support/processors/corei7/sb/CS-032279.htm and never operates at the base 3.4GHz clock. So the delta is 300MHz, which is tiny considering AMD's short-ticked Bulldozer can achieve 800MHz higher clocks on average than Haswell.

In your benchmark you're also showing the 8350 beating the 3960X, which is 250mm² larger and has 4 more threads and 2 more clusters.



For godsake.

6lbGZck.png

YtPs8j6.png

tlZmbdg.png

That CPU hash benchmark shows the 7680k, that APU crap, outperforming a 2600K. We are pointing out precisely that you are both cherry-picking benchmarks. It's hilarious to come up with the 8350 being faster than a 2600K to begin with. A 4770K at 5GHz will outperform the 8350 at 5GHz in that hash benchmark. Haswell's performance per clock is quite a bit higher than AMD's.

Edit: Back to CPU hash. When I included the recent database, the difference between a 3930K & 4770 was 2%.

FX-9590-40.jpg

KzY0IDR.png



You seem to have missed the fact that the 8350 had a serious RAM advantage too. 1866 CL 9 vs 1600 CL 9 is a big difference in both bandwidth and absolute latency.

9/933 = 9.646 ns

9/800 = 11.25 ns 

 

1/(3.7*10^9) =0.2703 nanoseconds per clock.

 

That's an 8 CPU cycle difference in fetch time after you account for the time to travel from the bus back to the registers.

 

Benches should never have different RAM speeds across different chips unless it's to look at the effect of RAM on performance on 1 or 2 chips.
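The latency arithmetic above can be reproduced with a minimal Python sketch. It assumes the usual DDR convention that the memory clock is half the transfer rate (so 1866 runs at 933 MHz) and uses the 3.7 GHz core clock from the post; the helper name `cas_latency_ns` is just for illustration. Note that the raw CAS gap alone works out to roughly 6 core cycles; the 8-cycle figure in the post also includes bus-to-register travel time, which this sketch does not model.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """CAS latency in nanoseconds for DDR memory.

    The memory clock (MHz) is half the transfer rate (MT/s),
    so latency = CL / clock_MHz * 1000.
    """
    mem_clock_mhz = data_rate_mts / 2
    return cl_cycles / mem_clock_mhz * 1000


fast = cas_latency_ns(9, 1866)  # DDR3-1866 CL9, the 8350's RAM in the post
slow = cas_latency_ns(9, 1600)  # DDR3-1600 CL9

cpu_cycle_ns = 1 / 3.7  # one core cycle at 3.7 GHz, in nanoseconds
gap_cycles = (slow - fast) / cpu_cycle_ns

print(f"1866 CL9: {fast:.3f} ns")
print(f"1600 CL9: {slow:.3f} ns")
print(f"Raw CAS gap: {slow - fast:.3f} ns = {gap_cycles:.1f} core cycles")
```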



This topic is now closed to further replies.

