
Understanding the Single vs Multi thread performance: I9-9900k @ 3.6 vs R7-3700X @ 3.6

This benchmark perplexed me: https://www.anandtech.com/bench/product/2263?vs=2520

 

If I'm reading it correctly, the eight-core, sixteen-thread Ryzen R7 3700X performs significantly better than the eight-core, sixteen-thread I9-9900K in multi-threaded tasks, while at the same time it generally seems to fall behind in single-threaded tasks. Both CPUs have the same core count, the same thread count, and (though this is not likely relevant) the same base clock. So the question becomes: why? Shouldn't a system with faster single threads have faster multi-thread performance for the same number of threads? Or is there something causing the Intel chips to scale worse than the AMD units?

 

Right now I've got about three hypotheses that I'm looking at, but I'm interested in what other people have found.

 

The first hypothesis is that multi-threaded tasks may use different essential CPU features than single-threaded tasks do, and that the AMD CPUs may be aligned toward those functions. I think this is the weakest, as the synthetic benchmarks don't show any specific single-thread strength for AMD aside from crypto.

 

The second hypothesis is thermal headroom. While the Intel and AMD TDPs aren't measured under the same load conditions, the LTT thermal tests posted on YouTube do show that the Zen 2 cores generate less heat at load. Further, the PCMark tests I've seen that fix the clock speeds seem to show matching performance per Hz between the Zen 2 and current Intel cores.

 

I suspect that what we're seeing is this: in single-thread tasks, on normal coolers, the I9 has more than enough headroom to turbo its running core up to the 5 GHz limit, while the Ryzen ends up with a lower top clock speed. In multi-thread tasks, though, the lower thermal output lets the Ryzen keep more of its cores at a higher peak boost than the Intel can under normal cooling.
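
To make the second hypothesis concrete, here is a rough toy model in Python. It assumes throughput scales as IPC times clock times active cores, and every clock value in it is an illustrative assumption rather than a measurement; the chips are labelled A and B because the numbers are not real specs.

```python
# Toy model: throughput ~ IPC * clock (GHz) * active cores.
# All numbers below are illustrative assumptions, not measurements.

def throughput(ipc: float, clock_ghz: float, cores: int) -> float:
    """Relative throughput for `cores` cores all running at `clock_ghz`."""
    return ipc * clock_ghz * cores

# Hypothetical chips with equal IPC (as the clock-locked results suggest).
chip_a = {"ipc": 1.0, "single_boost": 5.0, "all_core": 4.0}  # boosts high on one core
chip_b = {"ipc": 1.0, "single_boost": 4.3, "all_core": 4.2}  # holds clocks under load

for name, chip in (("A", chip_a), ("B", chip_b)):
    single = throughput(chip["ipc"], chip["single_boost"], 1)
    multi = throughput(chip["ipc"], chip["all_core"], 8)
    print(f"chip {name}: single={single:.1f} multi={multi:.1f}")
```

Under these made-up inputs, chip A leads with one thread and chip B leads with all eight cores loaded, which is the kind of flip the benchmark shows. Whether real chips behave this way depends on their actual all-core clocks.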

 

The third hypothesis is that there is an intrinsic difference in the architecture that limits the way the I9 chips scale with more cores. I.e., could the 16 MB of cache or the lower memory speed be holding it back when all cores are demanding data? For this one, I just don't know enough about multi-threaded workloads to have a good idea, but I'm open to thoughts.

 

What are your thoughts? I'm rather curious about what's going on here, simply because it is, on the face, counter-intuitive, and I've found those are often the best way to get a deeper understanding of the subtleties of how systems work. 

 

Thank you, 

 

Harry Voyager

Edited by Harry Voyager
Correcting typos

Just now, 5x5 said:

The answer is quite simple. SMT is more efficient than Hyper Threading

That, and Intel can still slap the clocks up to 5 jigglehertz, while Ryzen is only at about 4.3. Ryzen does seem to have the IPC advantage.


 

 


1 minute ago, 5x5 said:

The answer is quite simple. SMT is more efficient than Hyper Threading

This 

 

and also 

 

Ryzen has massive L3 cache so it doesn't need to access the RAM as often. 

 

There's definitely more to it than just this though. 


Do you mean AMD's implementation of SMT is more efficient than Intel's? Both CPUs are eight-core CPUs.


1 minute ago, Harry Voyager said:

Do you mean AMD's implementation of SMT is more efficient than Intel's? Both CPUs are eight-core CPUs.

Yes. And clock speeds and core count are not everything. Zen 2 can execute more instructions per clock than 9th-gen Intel.


Just now, 5x5 said:

Yes. And clock speeds and core count are not everything. Zen 2 can execute more instructions per clock than 9th-gen Intel.

But on a single thread, the Intel chip is still putting out ~10-20% more integer and floating point operations. That's why I'm wondering what causes it to flip when they go to 16 available threads. 


1 minute ago, Harry Voyager said:

But on a single thread, the Intel chip is still putting out ~10-20% more integer and floating point operations. That's why I'm wondering what causes it to flip when they go to 16 available threads. 

Because that's far from everything.  Use real world applications to compare, not FP.  You can't gauge a CPU by one metric alone


15 minutes ago, Harry Voyager said:

Do you mean AMD's implementation of SMT is more efficient than Intel's? Both CPUs are eight-core CPUs.

AMD's method for SMT and Intel's Hyper-Threading are different technologies, and SMT is just... better. SMT can gain up to an additional 80%, while good Hyper-Threading is capped at 50-60%.

 

Let's use the examples of a 3700x and 9900k

 

9900k

8c + 8 x 0.6 = 12.8 "cores"

3700x

8c + 8 x 0.8 = 14.4 "cores"

 

The actual end result is significant, and it gets exacerbated by factors like Zen's large cache.
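
The arithmetic above can be written as a small Python function. This is only a sketch of the rule of thumb in this post: the 0.6 and 0.8 gains are the figures claimed here, not measured values, and `effective_cores` is a made-up helper name.

```python
# Rule-of-thumb "effective cores": each physical core counts as 1, and its
# second hardware thread adds `smt_gain` of a core on top.
# The gains used below are the claimed figures from this post, not measurements.

def effective_cores(physical_cores: int, smt_gain: float) -> float:
    """Return physical cores plus the fractional contribution of SMT threads."""
    return physical_cores + physical_cores * smt_gain

i9_9900k = effective_cores(8, 0.6)
r7_3700x = effective_cores(8, 0.8)

print(f"9900k: {i9_9900k:.1f} effective cores")
print(f"3700x: {r7_3700x:.1f} effective cores")
print(f"ratio: {r7_3700x / i9_9900k:.3f}")
```

The ratio says how much of a multi-thread lead the rule of thumb would predict at equal clocks and equal per-core speed. It says nothing about clock differences between the two chips.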


 


10 minutes ago, Harry Voyager said:

But on a single thread, the Intel chip is still putting out ~10-20% more integer and floating point operations. That's why I'm wondering what causes it to flip when they go to 16 available threads. 

because the link you used shows the CPU running stock settings, not locked to 3.6GHz as you thought.

 

https://www.techspot.com/article/1876-4ghz-ryzen-3rd-gen-vs-core-i9/



4 minutes ago, Jurrunio said:

because the link you used shows the CPU running stock settings, not locked to 3.6GHz as you thought.

 

https://www.techspot.com/article/1876-4ghz-ryzen-3rd-gen-vs-core-i9/

Sorry, I was not clear. I was aware the Anandtech results were not clock-locked, and I did not link the results I'd seen in PassMark that reported the clock-locked results. That does line up with my second hypothesis: the Intel can boost higher on a single core, but may not be able to keep that speed on all eight cores as well as the AMD chip.

 

When combined with AMD's multi-thread handling apparently being straight-up better than the Intel implementation, it sounds like it's a combination of 2 and 3.

 

This would lead to a theory that an Intel with sufficient cooling would perform better than a comparable AMD up to the number of threads equal to the core count, then drop off as the thread count climbed beyond that.
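
As a rough way to explore that theory, here is a toy Python model of throughput versus thread count on an 8-core chip. The clocks are made-up illustrative values, the SMT gains are the 0.6 and 0.8 figures quoted earlier in this thread, and the model ignores that real clocks drop as more cores load up.

```python
# Toy model of throughput versus thread count on an 8-core SMT chip.
# Clocks are made-up illustrative values; the SMT gains are the 0.6 and 0.8
# figures quoted earlier in this thread, not measurements.

def relative_throughput(threads: int, cores: int, clock_ghz: float, smt_gain: float) -> float:
    """First `cores` threads each get a full core; further threads (up to
    2 * cores) each add only `smt_gain` of a core."""
    full = min(threads, cores)
    extra = max(0, min(threads, 2 * cores) - cores)
    return clock_ghz * (full + smt_gain * extra)

CORES = 8
intel_like = {"clock_ghz": 4.7, "smt_gain": 0.6}  # hypothetical well-cooled chip
amd_like = {"clock_ghz": 4.2, "smt_gain": 0.8}    # hypothetical lower-clocked chip

for threads in (1, 4, 8, 12, 16):
    a = relative_throughput(threads, CORES, **intel_like)
    b = relative_throughput(threads, CORES, **amd_like)
    leader = "intel_like" if a > b else "amd_like"
    print(f"{threads:2d} threads: {a:6.2f} vs {b:6.2f} -> {leader}")
```

With these particular inputs the higher-clocked chip leads up to the core count and the lead only flips close to the full 16 threads. Different clocks or gains would move the crossover or remove it entirely, so this is a way to play with the idea, not evidence for it.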

 

Note: I'm using synthetic and as/like mostly because, as near as I can tell, no-one has even created a truly multi-threaded VR flight simulator. It is a use case that does not yet exist. Thus, predictions rather than full tests. 

 

Plus the result was unexpected enough that I was rather curious why it was.


1 minute ago, Harry Voyager said:

Sorry, I was not clear. I was aware the Anandtech results were not clock-locked, and I did not link the results I'd seen in PassMark that reported the clock-locked results. That does line up with my second hypothesis: the Intel can boost higher on a single core, but may not be able to keep that speed on all eight cores as well as the AMD chip.

 

When combined with AMD's multi-thread handling apparently being straight-up better than the Intel implementation, it sounds like it's a combination of 2 and 3.

 

This would lead to a theory that an Intel with sufficient cooling would perform better than a comparable AMD up to the number of threads equal to the core count, then drop off as the thread count climbed beyond that.

 

Note: I'm using synthetic and as/like mostly because, as near as I can tell, no-one has even created a truly multi-threaded VR flight simulator. It is a use case that does not yet exist. Thus, predictions rather than full tests. 

 

Plus the result was unexpected enough that I was rather curious why it was.

Your hypothesis is not exactly correct. Intel clocks are held, but HT is just not as good as SMT. Couple that with Intel's lower instructions per clock and you have your explanation.


5 minutes ago, Harry Voyager said:

That does line up with my second hypothesis: the Intel can boost higher on a single core, but may not be able to keep that speed on all eight cores as well as the AMD chip. 

Yes, 14nm runs way too hot and, on many boards, hits the power limit. Some boards, though, have power limits raised so high that this isn't true, and the chip holds most of its performance.

 

7 minutes ago, Harry Voyager said:

When combined with AMD's multi-thread handling apparently being straight-up better than the Intel implementation, it sounds like it's a combination of 2 and 3.

Yes, SMT is better than HT (technically HT is interleaved multithreading, different to SMT).

 

9 minutes ago, Harry Voyager said:

This would lead to a theory that an Intel with sufficient cooling would perform better than a comparable AMD up to the number of threads equal to the core count, then drop off as the thread count climbed beyond that.

Up to the point that the heat overrides the cooling. Not so much the 8 core for now, maybe the 10 core in 10th gen lineup will show that? At least Skylake-X and Cascade Lake X flagships are showing this very clearly.



True. Hopefully Intel will be able to start fielding their 10nm parts in the desktop market soon. Though I suspect at least some of that is the higher clock speeds too, at least based on what we seem to be seeing from high end Ryzen chips running at top clocks. The 3800X is also rated in the 100W TDP range and the high end Ryzen motherboards are building for some eye-popping power delivery. 

 

Unfortunately I don't think that my current machine will last that long so I'm having to make the call between the I9-9900K/F and the available Ryzen chips. Honestly, what I really want is the expected 3950X that boosts to 4.8Ghz, but I don't really know how real that is going to be, or whether it will be available at anything like the predicted list price. 


44 minutes ago, Harry Voyager said:

True. Hopefully Intel will be able to start fielding their 10nm parts in the desktop market soon. Though I suspect at least some of that is the higher clock speeds too, at least based on what we seem to be seeing from high end Ryzen chips running at top clocks. The 3800X is also rated in the 100W TDP range and the high end Ryzen motherboards are building for some eye-popping power delivery. 

 

Unfortunately I don't think that my current machine will last that long so I'm having to make the call between the I9-9900K/F and the available Ryzen chips. Honestly, what I really want is the expected 3950X that boosts to 4.8Ghz, but I don't really know how real that is going to be, or whether it will be available at anything like the predicted list price. 

 

It doesn't look like 10nm will be ready anytime soon; betting on 2H of 2020 or early 2021.

There was mention of releasing 14nm++++ (? how many +'s we are on now?) CPUs for the upcoming product line-up.

 

I think a few have implied this already...

From the looks of it, Ryzen2 has better IPC vs Intel, but there are a lot of applications / software out there that simply favour brute-force clock frequency -- and use 1 ~ 4 cores. Since Intel's CPU line-up boosts 1 ~ 2 cores up into the 4.8 GHz ~ 5.0 GHz region, Ryzen2 trails behind because of that.



45 minutes ago, Harry Voyager said:

True. Hopefully Intel will be able to start fielding their 10nm parts in the desktop market soon. Though I suspect at least some of that is the higher clock speeds too, at least based on what we seem to be seeing from high end Ryzen chips running at top clocks. The 3800X is also rated in the 100W TDP range and the high end Ryzen motherboards are building for some eye-popping power delivery. 

 

Unfortunately I don't think that my current machine will last that long so I'm having to make the call between the I9-9900K/F and the available Ryzen chips. Honestly, what I really want is the expected 3950X that boosts to 4.8Ghz, but I don't really know how real that is going to be, or whether it will be available at anything like the predicted list price. 

I think 10nm will be short-lived for Intel on the CPU side; it will probably be used for their chipsets when everything else is 7nm.

Just now, -rascal- said:

 

It doesn't look like 10nm will be ready anytime soon; betting on 2H of 2020 or early 2021.

There was mention of releasing 14nm++++ (? how many +'s we are on now?) CPUs for the upcoming product line-up.

 

I think a few have implied this already...

From the looks of it, Ryzen2 has better IPC vs Intel, but there are a lot of applications / software out there that simply favour brute-force clock frequency -- and use 1 ~ 4 cores. Since Intel's CPU line-up boosts 1 ~ 2 cores up into the 4.8 GHz ~ 5.0 GHz region, Ryzen2 trails behind because of that.

right

people always forget performance depends on frequency/clocks as well as ipc, this is why intel is still competitive on the majority of software for everyday users in sterile benchmarks

but then again who uses their pc to do 1 task, majority of us multitask

think some reviewers should focus on sterile real world multitasking

 

I game, cad, music, video, browse/shop(so few tabs open), discord/chat programs, monitor temps or even security cameras few other things but every little bit can alter performance

there is much more like streaming, rendering, editing, file conversion, media sharing, etc that people do also

 

would ryzen gain the edge here?

 

I know better multithreading doesn't always mean better multitasking, because it still depends on how all the software uses the resources

but i'd love to see a few different multitasking benchmarks/comparisons myself in these reviews besides some streaming ones

 

 
