
How well does Intel Thread Director work with P-cores and E-cores on hybrid CPUs under Windows 10 and Windows 11, for gaming and apps?

Is Intel Thread Director supposed to work seamlessly, so that E-cores handle a game's secondary threads just as well as, if not better than, a smaller number of additional P-cores would? For example, if a game thread beyond the eighth (HT off) would intermittently use 10-30% of an additional P-core, does Thread Director know how to get the same performance from an E-core? E-cores are weaker, so the thread might instead use, say, 60% of an E-core, but with no slowdown and nothing left waiting on it, even though the architecture is different and IPC is lower overall. Does it work well and seamlessly even when developers have not designed for a hybrid architecture, because Thread Director takes care of everything, or at least 99% of cases apart from some rare edge cases?

I assume the same would apply to Arrow Lake as to 13th and 14th Gen. I have zero interest in 13th or 14th Gen given their random stability and degradation issues, but the upcoming Arrow Lake does interest me.

I would prefer a CPU with more than 8 P-cores on a single CCX, CCD, tile, or ring bus, but no such option exists, and it appears none will exist with the upcoming Zen 5 and Arrow Lake. Zen 5 still has just 8 cores per CCX and CCD, hence the big hit when going across to more cores.

At least Arrow Lake will have all P-cores and E-cores on the same tile for great core-to-core latency, as Meteor Lake does if you exclude the LP E-cores on a different tile. I assume mid-range and high-end Arrow Lake will not have those, and even if they did, you could just turn them off, since all the better cores are on one tile with good latency between them.

So does the hybrid architecture work well with no issues because Thread Director ensures it, or are there still problems?


26 minutes ago, Wolverine2349 said:

Does it work well and seamlessly even when developers have not designed for a hybrid architecture, because Thread Director takes care of everything, or at least 99% of cases apart from some rare edge cases?

Developers rarely program with specific core layouts or architectures in mind.

 

They will create however many threads their program needs and that's it. It's up to the operating system's task scheduler to figure out how to distribute them.

 

The scheduler should be clever enough to figure out which thread needs a P core or an E core.

 

If no P cores are available, I would assume it puts the thread on an E core, then reschedules it onto another core as resources become available.

 

Is it perfect? Probably not. There are reports of games that work better with E cores disabled, after all. If you have the hardware, simply try it out for yourself.
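If you want to test this without disabling the E cores in the BIOS, one option is to restrict the game to the P cores with an affinity mask. Here is a minimal Python sketch that builds such a mask. It assumes the P-core logical processors are enumerated first (for example 0 through 15 on an 8 P-core chip with HT on), which is common on these CPUs but not guaranteed, so check your own layout first. The function name `p_core_affinity_mask` is made up for illustration.

```python
def p_core_affinity_mask(p_cores: int, threads_per_p_core: int = 2) -> int:
    """Return a bitmask with one bit set per P-core logical processor.

    ASSUMPTION: the P-core logical processors come first in the
    enumeration (indices 0 .. p_cores * threads_per_p_core - 1) and
    the E cores come after them. Verify this on your own system.
    """
    if p_cores < 1 or threads_per_p_core < 1:
        raise ValueError("core and thread counts must be positive")
    logical_p_cpus = p_cores * threads_per_p_core
    # Setting the lowest N bits selects logical processors 0..N-1.
    return (1 << logical_p_cpus) - 1


# 8 P cores with HT on: 16 logical processors
print(f"{p_core_affinity_mask(8, 2):X}")
# 8 P cores with HT off: 8 logical processors
print(f"{p_core_affinity_mask(8, 1):X}")
```

The resulting hexadecimal value is the kind of mask that `start /affinity` in a Windows command prompt accepts. Comparing frame times with and without the mask should show whether a particular game suffers from having threads land on E cores.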


 

@Wolverine2349 This article from AnandTech might explain it better:

 

Quote

[Image: Intel Thread Director]


In order to ensure that the cores are used to their maximum, Intel had to work with Microsoft to implement a new hybrid-aware scheduler, and this one interacts with an on-board microcontroller on the CPU for more information about what is actually going on. The microcontroller on the CPU is what we call Intel Thread Director. It has a full scope view of the whole processor – what is running where, what instructions are running, and what appears to be the most important. It monitors the instructions at the nanosecond level, and communicates with the OS on the microsecond level. It takes into account thermals, power settings, and identifies which threads can be promoted to higher performance modes, or those that can be bumped if something higher priority comes along. It can also adjust recommendations based on frequency, power, thermals, and additional sensory data not immediately available to the scheduler at that resolution. All of that gets fed to the operating system. The scheduler takes all of the information from Thread Director, constantly, as a guide. So if a user comes in with a more important workload, Thread Director tells the scheduler which cores are free, or which threads to demote. The scheduler can override the Thread Director, especially if the user has a specific request, such as making background tasks a higher priority.

 

What makes Windows 11 better than Windows 10 in this regard is that Windows 10 focuses more on the power of certain cores, whereas Windows 11 expands that to efficiency as well. While Windows 10 considers the E-cores as lower performance than P-cores, it doesn’t know how well each core does at a given frequency with a workload, whereas Windows 11 does. Combine that with an instruction prioritization model, and Intel states that under Windows 11, users should expect a lot better consistency in performance when it comes to hybrid CPU designs. Thread Director is running a pre-trained algorithm based on millions of hours of data gathered during the development of the feature. It identifies the effective IPC of a given workflow, and applies that to the performance/efficiency metrics of each core variation. If there’s an obvious potential for better IPC or better efficiency, then it suggests the thread is moved.

 

Workloads are broadly split into four classes:

  • Class 3: Bottleneck is not in the compute, e.g. IO or busy loops that don’t scale
  • Class 0: Most Applications
  • Class 1: Workloads using AVX/AVX2 instructions
  • Class 2: Workloads using AVX-VNNI instructions

Anything in Class 3 is recommended for E-cores. Anything in Class 1 or 2 is recommended for P cores, with Class 2 having higher priority. Everything else fits in Class 0, with frequency adjustments to optimize for IPC and efficiency if placed on the P-cores. The OS may force any class of workload onto any core, depending on the user.

 

[Image: Thread Director workload classes]

 

Thread Director: Windows 11 Does It Best - Intel 12th Gen (anandtech.com)

 

If you read the rest of the article, you will find that Thread Director runs a pre-trained algorithm that adapts to different scheduling scenarios. It can analyze the situation and recommend moving threads depending on what each one needs. It's incredibly brilliant stuff.
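To make the class rules quoted above more concrete, here is a small Python sketch of the recommendation table as the article describes it. It only illustrates the quoted rules and is not how Thread Director or the Windows scheduler is actually implemented. The names `WorkloadClass`, `recommended_core`, and `p_core_order` are hypothetical, and returning "P or E" for Class 0 is my reading of the article, which gives no fixed recommendation for that class.

```python
from enum import IntEnum


class WorkloadClass(IntEnum):
    """The four workload classes listed in the quoted article."""
    MOST_APPLICATIONS = 0   # Class 0: most applications
    AVX_AVX2 = 1            # Class 1: AVX/AVX2 workloads
    AVX_VNNI = 2            # Class 2: AVX-VNNI workloads
    NOT_COMPUTE_BOUND = 3   # Class 3: IO or busy loops that don't scale


def recommended_core(cls: WorkloadClass) -> str:
    """Map a class to the core type the article says is recommended."""
    if cls is WorkloadClass.NOT_COMPUTE_BOUND:
        return "E"
    if cls in (WorkloadClass.AVX_AVX2, WorkloadClass.AVX_VNNI):
        return "P"
    # ASSUMPTION: Class 0 has no fixed recommendation in the quote.
    return "P or E"


def p_core_order(classes):
    """Order P-recommended classes so Class 2 comes ahead of Class 1."""
    wanted = [c for c in classes if recommended_core(c) == "P"]
    return sorted(wanted, reverse=True)


for cls in WorkloadClass:
    print(f"Class {cls.value} ({cls.name}): {recommended_core(cls)}")
```

Keep in mind these are only hints. As the quote says, the OS may force any class of workload onto any core.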
