
How does Windows utilize Hyperthreaded cores?

I was hoping you'd specifically run 4 threads on the first two cores (0,1,2,3) vs. one thread from each core (0,3,5,7), which should show a difference like the one between an ultra-mobile dual-core i7 with HT and a quad-core i5...
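
For anyone wanting to try that comparison, here is a minimal sketch of how the two thread selections translate into affinity masks. It assumes logical processors 2k and 2k+1 share physical core k (the numbering implied by the post above, not something guaranteed on every system), and the helper names are made up for illustration.

```python
# Assumption: 2 hardware threads per core, siblings numbered (0,1), (2,3), ...
THREADS_PER_CORE = 2

def affinity_mask(logical_cpus):
    """Build a bitmask with one bit set per selected logical CPU."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask

def cores_used(logical_cpus):
    """Return the sorted physical core indexes the selection touches."""
    return sorted({cpu // THREADS_PER_CORE for cpu in logical_cpus})

packed = [0, 1, 2, 3]   # both threads of the first two cores
spread = [0, 3, 5, 7]   # one thread from each of four cores

for name, cpus in (("packed", packed), ("spread", spread)):
    print(name, hex(affinity_mask(cpus)), "cores:", cores_used(cpus))
```

The hexadecimal values are the kind of mask an affinity setting expects, for example the `/affinity` option of the Windows `start` command.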


It would be really interesting if CPU utilization were presented as well, not only the benchmark results. There was no clear relationship between the "alone" and "simultaneous" benchmark results, aside from "alone" being better. Why isn't the score halved in the simultaneous run, when in effect you are running two benchmarks on the same number of physical cores?


2:12 - so they used an 8 core processor as they say they assign each VM four logical cores? 


To answer a question not yet asked: how does it compare to AMD's module?

Implementation difference :
b5xKUWW.png
^This is a basic sketch about difference (pay no attention to actual size of the blocks).

Something not mentioned in the video :
Hyper-Threading works best when no single thread is utilising 100% of the core's hardware resources (be it the front end [decoders] or the back end [ALU/FPU]).

@Piyok
Because they split the threads 0,2,4,6 and 1,3,5,7 between the VMs (the numbers are the threads the actual i7 CPU has).
When one VM sits idle (like in the "alone" part of the test), all available resources are used by the other one.
That's why U get almost no performance drop (i.e. all threads are created equal).

U get halved performance when U use both VMs at 100% (like in the simultaneous part of the test).

CPU : Core i7 6950X @ 4.26 GHz + Hydronaut + TRVX + 2x Delta 38mm PWM
MB : Gigabyte X99 SOC (BIOS F23c)
RAM : 4x Patriot Viper Steel 4000MHz CL16 @ 3042MHz CL12.12.12.24 CR2T @1.48V.
GPU : Titan Xp Collector's Edition (Empire)
M.2/HDD : Samsung SM961 256GB (NVMe/OS) + + 3x HGST Ultrastar 7K6000 6TB
DAC : Motu M4 + Audio Technica ATH-A900Z
PSU: Seasonic X-760 || CASE : Fractal Meshify 2 XL || OS : Win 10 Pro x64

So then can I assume that if I put a high workload on the CPU, 8 physical cores (e.g. an AMD FX-8350) would perform better than a hyperthreaded Intel with 4 cores? Am I right, or am I still missing something?


U R correct.
The problem is finding a game that can use the added ALUs without also needing an FPU buff.

Programs for video rendering are quite good at doing this (if optimised for AMD's module architecture).


And if this is correct, I'd be interested in knowing how much of a difference there is between 8 full cores and a hyperthreaded quad core.


A LOT, or nothing at all.
It depends highly on the program used and on the CPU (Turbo may clock the same CPU higher if only half of the cores are active).


1 hour ago, agent_x007 said:

To answer a question not yet asked: how does it compare to AMD's module?

Implementation difference :
b5xKUWW.png
^This is a basic sketch about difference (pay no attention to actual size of the blocks).

Something not mentioned in the video :
Hyper-Threading works best when no single thread is utilising 100% of the core's hardware resources (be it the front end [decoders] or the back end [ALU/FPU]).

@Piyok
Because they split the threads 0,2,4,6 and 1,3,5,7 between the VMs (the numbers are the threads the actual i7 CPU has).
When one VM sits idle (like in the "alone" part of the test), all available resources are used by the other one.
That's why U get almost no performance drop (i.e. all threads are created equal).

U get halved performance when U use both VMs at 100% (like in the simultaneous part of the test).

I understood the split. I'm curious, though: why is the simultaneous result more than half of the alone result? I thought it should be half, or even less than half because of, maybe, context switching?


Can someone explain the difference in multicore performance between a 4790 and an FX-8350 at the same GHz, if physical cores are supposed to be better? I'm taking the cpuboss results as an estimate.


2 hours ago, Piyok said:

I understood the split. I'm curious, though: why is the simultaneous result more than half of the alone result? I thought it should be half, or even less than half because of, maybe, context switching?

Well, looking at it, I think it would be like this:

You have an instruction that needs to be processed.
You have a core that's working.
Hyper-threading notices a gap in that core's workload and slots the job in to be processed.
This avoids downtime/gaps in processing, so effectively more work gets done.

At the moment, that's how I am reading it.
I could be seriously wrong though! I'm no expert on the whole threading thing; for a long time I thought it might be like one core taking on 2 jobs instead of one, but I never got my head around it.

With the AMD FX lot, when being hammered by Adobe, all cores can be flat out. But with other programs there are a lot of peaks and dips for every second of work, and if I do a like-for-like export of, say, a photo or video, there is a big difference in time taken (hint: Adobe is faster, but you're screwed for doing anything else on the side).

This would fit in with my guess at understanding.
But I await someone coming along to tell me how wrong I am.

 

Its not my fault I am grumpy, you try having a porcelain todger that's always hard! 


6 hours ago, Piyok said:

I understood the split. I'm curious, though: why is the simultaneous result more than half of the alone result? I thought it should be half, or even less than half because of, maybe, context switching?

Well, Cinebench LOVES Hyper-Threading (LINK) and cores.
If a program is optimised to run on a quad core (or octa core), it can beat the c**p out of a CPU with half the cores at simply double the clock.
Also:
100% utilisation in Task Manager doesn't mean that ALL of the CPU hardware is at 100%.
A second thread can squeeze extra clock cycles out of that 100% utilisation.
That's why the result is more than 50%.

Context switching hurts performance more than Hyper-Threading.
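
The "more than 50%" point can be put as rough arithmetic. This is only a sketch under a simple assumed model: a core running two threads delivers some combined throughput `smt_speedup` (between 1.0 and 2.0) relative to the same core running one thread, and that throughput is shared evenly between the two VMs. The function name and the sample values are hypothetical, not measurements from the video.

```python
def simultaneous_fraction(smt_speedup):
    """Fraction of its 'alone' score a VM keeps when both VMs are busy.

    Alone: each of the VM's threads has a physical core to itself (1.0 each).
    Simultaneous: each core runs one thread per VM and delivers
    smt_speedup in total, so each VM gets smt_speedup / 2 per core.
    """
    if not 1.0 <= smt_speedup <= 2.0:
        raise ValueError("smt_speedup is expected to be between 1.0 and 2.0")
    return smt_speedup / 2.0

# Hypothetical speedups: no gain, modest gain, large gain from the second thread
for speedup in (1.0, 1.25, 1.5):
    print(f"speedup {speedup}: each VM keeps {simultaneous_fraction(speedup):.1%}")
```

Under this model the score is exactly halved only if the second thread adds nothing; any gain from Hyper-Threading pushes the simultaneous result above half.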


I'd like to see a comparison, with benchmarks, using the i7-5960X in various configurations:

 

1 - 8 cores, 4.0 GHz, no hyperthreading.

2 - 4 cores, 4.0 GHz, hyperthreading enabled (8 logical cores)

3 - 4.0 GHz, 8 hyperthreads (using only hyperthreaded cores, no physical ones) in a VM, while simultaneously running Prime95 or Cinebench or some other benchmark / stress program on the main cores on the host.

 

I suppose I could do something similar on my 4790K, but I'd have to cut the cores/threads in half.  In my case it'd be 4 cores no hyper, 2 cores with hyper, 4 hyperthreads in a VM plus 4 cores benching/stressing in host.  (And my i3-6100 in my laptop has hyperthreading too, just half the threads of my i7.)

 

Basically, what this would be trying to answer, is ...

 

Assuming you have the same architecture and same clock speed, are things faster with a certain number of cores and no hyperthreading, versus half the cores, with hyperthreading?  And, by how much?  Also I'd sure include multitasking - for example if I was to try it, I'd be running Unigine Valley & Heaven, Firestrike, Prime95, Cinebench, and a few other things simultaneously and seeing how well the computer multitasks. :)

 

Also, speaking of workshop, previous episodes, etc ... I'd like to see @Slick revisit the (CPU) cooling topics, like thermal paste application methods, number of case fans & their placement (and size of case fans too), aftermarket vs stock CPU coolers ... but this time, running Prime95 28.7 small FFT to heat things up. :)  On my i7-4790K with the Hyper 212 Evo in a Rosewill Thor V2, within like 5-10 seconds it hits 100°C at 4.2 GHz in winter and same temp at 3.7 GHz in summer.  Summer ambient temps can reach 30-32°C indoors here at least a few times every year.  (I think during one very hot summer we had several years ago it may have hit 38-40°C plus in the house!  I wonder what it'd be like to try to keep a PC from torching itself in that kind of environment!)


While I do LOVE the workshop series, mostly because it busts myths that run rampant in the PC community (including this one), it always seems like they miss one tiny detail in their testing that makes the results suddenly meaningless. In this case, the proper way to do it would be to spin up one VM per logical core and see if the performance in any VM was less than in any other. I don't know how they missed this, because the way they did it, they leave it open to someone saying

 

"hey, you put 2 'real' cores and 2 'hyperthreaded' cores in each VM Lulz your results are worthless" - Unless they tried every single possibility of 4 and 4 cores, which I think would take a lot longer, and they didn't mention that in the video. 

 

Just to clarify, I don't expect you to get different results though, and I probably wouldn't go back and redo this video anyway.

 

What I would suggest is an AMD video, because as mentioned earlier by @agent_x007, their 8-core consumer CPU has 8 ALUs and only 4 FPUs. See what happens if you try to split the modules by assigning half to each VM. IIRC, when these CPUs came out, Windows was terrible at scheduling for them and it caused quite a scene.

 

Anyways keep up the good work, and just remember to eliminate as many variables as possible next time!

 

 


It would be interesting to see what would happen if you disabled all the physical cores and just ran on the hyperthreaded ones. Does anyone know what would happen?

 


1 hour ago, JohnnyCorporalTech said:

It would be interesting to see what would happen if you disabled all the physical cores and just ran on the hyperthreaded ones. Does anyone know what would happen?

 

There is no such thing as a "Hyperthreaded core".

There are only "cores WITH hyperthreading". If you disable hyperthreading, you turn an i7 into an i5.


On 3/9/2016 at 3:08 PM, Mandrewoid said:

There is no such thing as a "Hyperthreaded core".

There are only "cores WITH hyperthreading". If you disable hyperthreading, you turn an i7 into an i5.

I mean in something such as unraid where both the hyperthreaded cores and the physical ones show up.


Hi.

 

I think your testing method was not correct. Involving virtual machines and relying on hvm to keep the same core under load is not what you need. Also, you could add a second CPU to the mix and test the performance of two single cores on different CPUs.

In my opinion it should look like this:

1. Install Windows on the hardware

2. Enable HT in the BIOS

3. Use the command line or Task Manager to bind a heavy CPU benchmarking program to the following combinations of cores/threads/CPUs:

a. two physical cores

b. two logical cores

c. one core from one CPU and a second core from a second CPU

d. disable HT and test a single core, two cores, two CPUs.

 

That way we can see exactly how HT is affecting CPU performance.
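
As a rough sketch of how the pairings in step 3 could be enumerated before binding a benchmark to them: the code below lists every pair of logical CPUs on a single 4-core/8-thread chip and sorts them into "same physical core" (case b) and "different physical cores" (case a). The sibling numbering (2k and 2k+1 on core k) and all names are assumptions for illustration; the dual-CPU case is left out.

```python
from itertools import combinations

THREADS_PER_CORE = 2  # assumption: 2-way Hyper-Threading
LOGICAL_CPUS = 8      # assumption: one 4-core / 8-thread CPU

def shares_core(a, b):
    """True if logical CPUs a and b sit on the same physical core."""
    return a // THREADS_PER_CORE == b // THREADS_PER_CORE

def pair_mask(a, b):
    """Affinity bitmask selecting exactly logical CPUs a and b."""
    return (1 << a) | (1 << b)

plan = {"same_core": [], "different_cores": []}
for a, b in combinations(range(LOGICAL_CPUS), 2):
    key = "same_core" if shares_core(a, b) else "different_cores"
    plan[key].append((a, b, hex(pair_mask(a, b))))

for key, entries in plan.items():
    print(f"{key}: {len(entries)} pairs, first is {entries[0]}")
```

Each mask could then be applied with whatever affinity tool is at hand (Task Manager or the command line, as suggested above), running the same benchmark once per group and comparing scores.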

Your video showed us that assigning more virtual machines to the hypervisor will cause them to be slower. Virtual cores are not exactly the same as real ones - there is always a thin layer of supervision from the host system.

 

Also, I have an idea for a new test: since I don't have the necessary hardware, maybe you can test how the number of PCIe lanes and the PCIe generation affect GPU performance? I am in the process of upgrading my workstation to a used dual Xeon, but unfortunately the older generation supports only PCIe Gen 2 in its PCIe x8 slots. I wonder if I made a good decision about that MB :/


On 3/7/2016 at 6:00 AM, vonguch said:

So then can I assume that if I put a high workload on the CPU, 8 physical cores (e.g. an AMD FX-8350) would perform better than a hyperthreaded Intel with 4 cores? Am I right, or am I still missing something?

Not exactly. AMD's per-core performance is well behind Intel's.


I would like to see HT vs not HT in each of those tests added as well.  Or in a different video.

 

So so so many people always say get an i7 for editing or whatever, when it costs like 35% more for the extra x performance. What's x!? Is it even worth it?

Intel 4670K /w TT water 2.0 performer, GTX 1070FE, Gigabyte Z87X-DH3, Corsair HX750, 16GB Mushkin 1333mhz, Fractal R4 Windowed, Varmilo mint TKL, Logitech m310, HP Pavilion 23bw, Logitech 2.1 Speakers


On 3/10/2016 at 8:52 PM, JohnnyCorporalTech said:

I mean in something such as unraid where both the hyperthreaded cores and the physical ones show up.

There is no such thing as a hyperthreaded core~! Merely a core with hyperthreading. For a very rough analogy, let's think of a CPU core as a pipe that a person can stuff raw material into, with finished products coming out the other side. Effectively, what Intel is doing with hyperthreading is adding another person to do the shoving. So while person A is turning around to pick up the next block, for example, person B is shoving another block in. That's why they show up as twice as many "CPUs" in Windows: because there are 2 ... let's call them management units or schedulers? Maybe. Intel itself deserves a significant amount of blame for this confusion, because their page dedicated to hyperthreading clearly shows a hyperthreading-enabled CPU core as being able to do twice as much stuff, which has historically been shown to be untrue. A more practical estimate would be along the lines of 1.6 times as much stuff as a non-hyperthreading core. This is easy to test: simply turn off hyperthreading in the BIOS.

 

TL;DR: 

In a hyperthreaded CPU, each of the "CPUs" that show up to the operating system is physically identical to all of the others.

On 3/10/2016 at 8:01 AM, site_owner said:

Hi.

 

I think your testing method was not correct. Involving virtual machines and relying on hvm to keep the same core under load is not what you need. Also, you could add a second CPU to the mix and test the performance of two single cores on different CPUs.

In my opinion it should look like this:

1. Install Windows on the hardware

2. Enable HT in the BIOS

3. Use the command line or Task Manager to bind a heavy CPU benchmarking program to the following combinations of cores/threads/CPUs:

a. two physical cores

b. two logical cores

c. one core from one CPU and a second core from a second CPU

d. disable HT and test a single core, two cores, two CPUs.

 

That way we can see exactly how HT is affecting CPU performance.

Your video showed us that assigning more virtual machines to the hypervisor will cause them to be slower. Virtual cores are not exactly the same as real ones - there is always a thin layer of supervision from the host system.

 

Also, I have an idea for a new test: since I don't have the necessary hardware, maybe you can test how the number of PCIe lanes and the PCIe generation affect GPU performance? I am in the process of upgrading my workstation to a used dual Xeon, but unfortunately the older generation supports only PCIe Gen 2 in its PCIe x8 slots. I wonder if I made a good decision about that MB :/

There is data available on the internets currently that seems to suggest that modern graphics cards aren't close to bottlenecking in PCIe 2.0 slots at the 16-lane level. Though there may be some merit in looking into how AMD CrossFire performs when limited to only a PCIe 2.0 x4 or x8 slot, because the inter-card communication goes over PCIe. Also, I'd be interested in this as well because supposedly in the next couple of generations of GPUs we might see them having direct (and fast) access to system RAM, which may start to eat up PCIe bandwidth.


1 hour ago, Mandrewoid said:

There is no such thing as a hyperthreaded core~! Merely a core with hyperthreading. For a very rough analogy, let's think of a CPU core as a pipe that a person can stuff raw material into, with finished products coming out the other side. Effectively, what Intel is doing with hyperthreading is adding another person to do the shoving. So while person A is turning around to pick up the next block, for example, person B is shoving another block in. That's why they show up as twice as many "CPUs" in Windows: because there are 2 ... let's call them management units or schedulers? Maybe. Intel itself deserves a significant amount of blame for this confusion, because their page dedicated to hyperthreading clearly shows a hyperthreading-enabled CPU core as being able to do twice as much stuff, which has historically been shown to be untrue. A more practical estimate would be along the lines of 1.6 times as much stuff as a non-hyperthreading core. This is easy to test: simply turn off hyperthreading in the BIOS.

 

TL;DR: 

In a hyperthreaded CPU, each of the "CPUs" that show up to the operating system is physically identical to all of the others.

There is data available on the internets currently that seems to suggest that modern graphics cards aren't close to bottlenecking in PCIe 2.0 slots at the 16-lane level. Though there may be some merit in looking into how AMD CrossFire performs when limited to only a PCIe 2.0 x4 or x8 slot, because the inter-card communication goes over PCIe. Also, I'd be interested in this as well because supposedly in the next couple of generations of GPUs we might see them having direct (and fast) access to system RAM, which may start to eat up PCIe bandwidth.

I see what you mean, but if you watch Linus' "2 gamers 1 CPU", he used an 8-core Extreme Edition processor (5960X); you can see at 11:45 that Linus selects the first 8 "cores" (cores 0 - 7). At least 4 of those "cores" would have to be hyperthreads, so if Linus selected the four "cores" that are hyperthreads, what would happen? That is what I am asking.
