
Rumour has it that Intel is going to ditch hardware multithreading (hyperthreading) on Arrow Lake cores for technical reasons. As Gamer Meld mentioned in his latest video, the reason is the performance loss you get with hyperthreading: if one thread saturates the core, there is not enough room for a second thread. But my amazing brain has a revolutionary idea: what if Intel could bring a dynamic hyperthreading feature, where the number of threads a core runs can be switched live at runtime depending on the load of the software? If the load is more multithreaded, increase the threads per core (probably even to more than 2, because that is possible), and if the load is more single threaded (like gaming), decrease the threads to 1. Likewise, if a thread is very saturated, do not add another thread, and when a thread is not very saturated, you might be able to run another thread on that same core.

PLEASE MARK COMMENTS AS SOLUTION IF SATISFIED!!

bigger number better, makes me look cooler.

https://linustechtips.com/topic/1554299-what-about-dynamic-core-multithreading/

If they are actually planning on removing HT, my assumption would be the goal is to:

  • Reduce complexity of a CPU core
    • potentially allowing for higher clocks
    • potentially increased security (less complex = easier to verify)
  • Reduce size of a CPU core
    • which could allow for more physical cores

If a size reduction allows them to replace a core+SMT with two physical cores, it would be a net benefit in terms of performance. Even if the core count doesn't double, but is e.g. x1.5, you might still be better off.

 

Keeping HT and adding some type of dynamic hardware solution on top that enables/disables it on the fly would have the exact opposite effect. It would increase core complexity. Plus, the operating system's scheduler can already achieve that in software, by never assigning more than one thread to a core.
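To illustrate the software side of that: a rough sketch of how you could pick one logical CPU per physical core from user space. This assumes Linux, where each logical CPU exposes its SMT siblings in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list. The 4-core/8-thread layout below is made up for the example, and the helper names are my own. It only approximates what a scheduler does internally.

```python
def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,8" or "0-1" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(siblings_by_cpu):
    """Pick the lowest-numbered logical CPU from each group of SMT siblings.

    siblings_by_cpu maps a logical CPU id to the text of its
    thread_siblings_list file.
    """
    chosen = set()
    for text in siblings_by_cpu.values():
        chosen.add(parse_cpu_list(text)[0])
    return sorted(chosen)


# Hypothetical layout (an assumption): logical CPUs n and n+4 share a core.
example = {cpu: f"{cpu % 4},{cpu % 4 + 4}" for cpu in range(8)}
print(one_cpu_per_core(example))  # [0, 1, 2, 3]
```

On a real Linux box you would read the sysfs files instead of the made-up dict, and could then pass the result to os.sched_setaffinity(0, cpus) so the current process never lands on two hardware threads of the same core.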

Remember to either quote or @mention others, so they are notified of your reply


1 hour ago, Eigenvektor said:

If a size reduction allows them to replace a core+SMT with two physical cores, it would be a net benefit in terms of performance. Even if the core count doesn't double, but is e.g. x1.5, you might still be better off.

Thread count through SMT is over-valued by many who don't understand the actual performance improvements it can, or can't, give. Cinebench in recent generations, at least R15 to R23, gains about 30% throughput from it. I haven't tested 2024 that way yet. But Cinebench is one of the best cases. If you take an average over a wide variety of CPU workloads, it would be much less.
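As a back-of-the-envelope check of the "x1.5 cores vs SMT" point, here is a small sketch. It assumes throughput scales perfectly with core count (it won't in practice), uses the ~30% Cinebench figure as a best case, and the 8-core baseline is just a made-up number.

```python
def relative_throughput(cores, smt_gain=0.0):
    """Throughput relative to one core without SMT, assuming perfect scaling across cores."""
    return cores * (1.0 + smt_gain)


baseline_cores = 8  # hypothetical core count, not any specific CPU

# Same core count, SMT on, best-case ~30% uplift.
with_smt = relative_throughput(baseline_cores, smt_gain=0.30)

# 1.5x the cores, SMT removed.
without_smt_more_cores = relative_throughput(baseline_cores * 1.5)

print(f"SMT on, {baseline_cores} cores: {with_smt:.1f}")
print(f"SMT off, {baseline_cores * 1.5:.0f} cores: {without_smt_more_cores:.1f}")
```

Under those assumptions the break-even is simply a core count increase equal to the SMT uplift, so anything above 1.3x cores comes out ahead even in the Cinebench best case.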

 

It was a long time ago, but from memory on HT's release, I think Intel said adding it took about 5% extra die area. It is a good tradeoff if, on average, you get more than that 5%. Now, as we get to higher core counts, and also different types of core, reducing the complexity on the scheduler is no bad thing. I think removing HT/SMT can help give more consistent behaviour, and remove cross-thread attacks running on the same core.
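The "more than that 5%" rule of thumb can be written as throughput per unit of die area. A minimal sketch, taking the ~5% area figure as an upper bound and treating the sample gains as illustrative inputs rather than measurements:

```python
def perf_per_area(smt_gain, area_cost):
    """Throughput per unit die area with SMT, relative to the same core without SMT."""
    return (1.0 + smt_gain) / (1.0 + area_cost)


AREA_COST = 0.05  # "about 5%" extra die area, taken as an upper bound

for gain in (0.02, 0.05, 0.30):  # illustrative average uplifts
    ratio = perf_per_area(gain, AREA_COST)
    verdict = "worth it" if ratio > 1.0 else "not worth it"
    print(f"gain {gain:.0%}: perf/area x{ratio:.3f} -> {verdict}")
```

A ratio above 1 means SMT pays for its area. This ignores power, clocks, and the scheduling and security costs mentioned above, so treat it as a first-order estimate only.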

 

Here's some really old testing I did looking at HT/SMT. Unfortunately it leans towards compute, and many of them are synthetics, so this is not representative of everything. Still, it gives a taste.

[Image: htsmt.png]

3DPM is code written by a non-professional programmer and I believe in that case it is able to extract more performance because of the lack of optimisation. Unfortunately the AVX-512 version which was later tuned by an unnamed former Intel employee* is not freely available, but I'd bet that wouldn't see such significant gains any more. *Jim Keller described that person as one of a few people in the world who can really optimise well.

 

Cinebench shows the ~30% increase. Early Zen wasn't great at single thread performance so it looks like they do better, but it is in part because they're starting from a lower baseline and it covers that weakness.

 

Aida obviously is synthetic.

 

More details in the thread I made at the time.

 

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, MSI Ventus 3x OC RTX 5070 Ti, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Alienware AW3225QF (32" 240 Hz OLED)
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 4070 FE, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, iiyama ProLite XU2793QSU-B6 (27" 1440p 100 Hz)
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


16 minutes ago, porina said:

Now, as we get to higher core counts, and also different types of core, reducing the complexity on the scheduler is no bad thing. I think removing HT/SMT can help give more consistent behaviour, and remove cross-thread attacks running on the same core.

Yeah, I didn't even consider the reduction in complexity on the scheduler side of things. Should this go to its own core, or can it share a core with something else? If that question is removed, it should make scheduling easier. Especially if you consider that P/E cores likely added a whole new level of complexity on top.

 

Otherwise this matches what I had in mind. Core counts are a lot higher than they used to be when HT was first introduced. Is it actually still beneficial to keep around, considering a lot of software doesn't scale to this many cores (let alone benefit from HT on top of that)? And, possibly more importantly, considering the security issues associated with it?

 

Of course if it's only 5% of die space, it's unlikely it could allow for more physical cores, but any reduction in size might still be beneficial (to Intel at least in terms of production costs/yield)



1 hour ago, Eigenvektor said:

Of course if it's only 5% of die space, it's unlikely it could allow for more physical cores, but any reduction in size might still be beneficial (to Intel at least in terms of production costs/yield)

I found the source for it: "less than 5%" in 2002.

Quote

Since the logical processors share the vast majority of microarchitecture resources and only a few small structures were replicated, the die area cost of the first implementation was less than 5% of the total die area.

https://web.archive.org/web/20121019025809/http://www.intel.com/technology/itj/2002/volume06issue01/vol6iss1_hyper_threading_technology.pdf

 

CPUs have changed a lot since then. Is it still 5%? More? Less? 

