Why do I get a performance boost when I set affinity to threads 1 and 3?

I'm using a laptop with a Core i5-6200U. When I run applications (especially emulators like Dolphin), I get a considerable performance boost by going to Task Manager and setting the affinity to only threads 1 and 3, and I get worse performance with only threads 0 and 2. I get that this i5 only has 2 cores and the other 2 threads are "virtual", but why 1 and 3 instead of 0 and 2? Shouldn't the even-numbered ones be the physical cores? And why is it slower with all threads enabled? :| Thanks.

I expected to come in here and explain the difference between running one thread on each of two physically separate cores and running both threads on the same physical core (because of HT), but what you're doing is logical and I really can't explain what you've found.  Strange :/

 

My guess is that manually assigning affinity like this is faster than using all logical cores because the scheduler may be placing threads poorly, running two threads on the same physical core when it could have put them on separate cores. I can't say for sure that this is happening, though, and honestly it really shouldn't be if the program and Windows are both made properly.

 

I should also mention, when it comes to the checkboxes in task manager, it's not that some are "real" and some are "virtual" - all of them are logical cores (virtual if you want to think of it that way), and correspond with the real physical cores in pairs, so logical cores 0 + 1 map to physical core 1, logical cores 2 + 3 map to physical core 2, and so on if applicable (or at least this is what I found in my testing a few years back).
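
To make that pairing concrete, here is a minimal Python sketch. It assumes the adjacent-pair layout described above (0 + 1 on one physical core, 2 + 3 on the next), which is what I saw in my testing but is not guaranteed on every system. The function names are made up for illustration. It also builds the bitmask that corresponds to a set of ticked checkboxes.

```python
def physical_core(logical_cpu: int, threads_per_core: int = 2) -> int:
    """Return the 1-based physical core for a 0-based logical CPU.

    Assumption: sibling threads are numbered adjacently (0+1, 2+3, ...).
    """
    return logical_cpu // threads_per_core + 1


def affinity_mask(logical_cpus) -> int:
    """Build an affinity bitmask: bit n set means logical CPU n is allowed."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


print([physical_core(n) for n in range(4)])  # [1, 1, 2, 2]
print(hex(affinity_mask([1, 3])))            # 0xa
print(hex(affinity_mask([0, 2])))            # 0x5
```

The hexadecimal digits are the form that `start /affinity` takes on the Windows command line, so `A` should be the same selection as ticking only CPU 1 and CPU 3 in Task Manager.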

 

But yeah, I would expect 1 + 3 to perform the same as 0 + 2, as well as 0 + 3 and 1 + 2, since each of those pairs spans both physical cores.  I would also expect 0 + 1 and 2 + 3 to perform the same as each other, since each of those pairs shares one physical core, and I would expect any pair from the first group to outperform either pair from the second.  The difference you've found is not something I can think of a reason for.  (I would be curious to know how the oddball 0 + 3 and 1 + 2 stack up as well, though.)
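
Spelled out as a rough sketch, again assuming the adjacent-pair mapping (0 + 1 and 2 + 3 as siblings) and using hypothetical helper names, the six possible two-thread selections fall into two groups:

```python
from itertools import combinations


def shares_core(a: int, b: int, threads_per_core: int = 2) -> bool:
    """True if two logical CPUs sit on the same physical core.

    Assumption: sibling threads are numbered adjacently (0+1, 2+3, ...).
    """
    return a // threads_per_core == b // threads_per_core


def classify_pairs(logical_cpus: int = 4):
    """Split every pair of logical CPUs into (separate cores, same core)."""
    separate, same = [], []
    for pair in combinations(range(logical_cpus), 2):
        (same if shares_core(*pair) else separate).append(pair)
    return separate, same


separate, same = classify_pairs()
print("separate physical cores:", separate)  # [(0, 2), (0, 3), (1, 2), (1, 3)]
print("same physical core:     ", same)      # [(0, 1), (2, 3)]
```

If Hyper-Threading were the only factor, every pair in the first list should score about the same, and higher than both pairs in the second list.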

 

Have you tried setting the emulator to high priority to see if the behaviour changes?

Solve your own audio issues  |  First Steps with RPi 3  |  Humidity & Condensation  |  Sleep & Hibernation  |  Overclocking RAM  |  Making Backups  |  Displays  |  4K / 8K / 16K / etc.  |  Do I need 80+ Platinum?

If you can read this you're using the wrong theme.  You can change it at the bottom.

  • 2 weeks later...
On 5/12/2019 at 12:58 AM, Ryan_Vickers said:

Have you tried setting the emulator to high priority to see if the behaviour changes?

Changing the priority doesn't seem to do anything.

 

Here's the performance on Zelda Skyward Sword using the same savestate:

 

All threads: 38 - 40 fps
CPU 0 and 1: 31 - 32 fps
CPU 0 and 2: 35 - 37 fps
CPU 0 and 3: 35 - 37 fps
CPU 1 and 2: 41 - 42 fps
CPU 1 and 3: 41 - 42 fps
CPU 2 and 3: 35 - 37 fps
CPU 1, 2 and 3: 41 - 42 fps
CPU 0, 2 and 3: 37 - 38 fps
CPU 0, 1 and 3: 34 - 35 fps
CPU 0, 1 and 2: 38 - 39 fps

 

--------------------------------------------------------------

 

New test: When I disable dual-core usage in Dolphin and set only a single thread in Task Manager I get the following:

 

Thread 0: 20 fps when I don't move the cursor, 13 - 15 when I'm moving the cursor.
Thread 1: 28 fps with no cursor movement, 24 when moving.
Thread 2: Same as thread 1, but no performance loss when moving the cursor.
Thread 3: Same as thread 2.

 

The cursor thing is probably because the System Interrupts process is using a lot of CPU (about 15% now), especially when I use the touchpad. After some research on Google, it seems to be a common problem on my laptop model.

 

I find this really interesting. It would be cool if other people could test on other systems.

1 hour ago, Raploz said:

The picture is starting to come together now.  It would seem that any time you involve logical core 0, performance takes a hit.  I would guess that this is due to a lot of system tasks using it by default but I don't know for sure.

 

With that in mind, I would expect that 1 and 2, or 1 and 3, would be best, along with larger combos that include these and exclude 0, and your data suggests that this is indeed correct.  It also sheds more light on the situation, enough that your findings from the OP aren't quite so strange anymore.
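
As a rough check against your numbers, the sketch below groups your two-core-mode results by whether logical core 0 was allowed and averages the midpoint of each reported fps range. The figures are copied from your post; taking midpoints and plain averages is my own simplification, not a rigorous benchmark analysis.

```python
from statistics import mean

# Allowed logical CPUs -> (lowest fps, highest fps), as reported in the thread.
results = {
    (0, 1, 2, 3): (38, 40),
    (0, 1): (31, 32),
    (0, 2): (35, 37),
    (0, 3): (35, 37),
    (1, 2): (41, 42),
    (1, 3): (41, 42),
    (2, 3): (35, 37),
    (1, 2, 3): (41, 42),
    (0, 2, 3): (37, 38),
    (0, 1, 3): (34, 35),
    (0, 1, 2): (38, 39),
}


def midpoint(fps_range) -> float:
    """Middle of a (low, high) fps range."""
    low, high = fps_range
    return (low + high) / 2


def mean_fps(include_cpu0: bool) -> float:
    """Average midpoint fps over the runs that did (or did not) allow CPU 0."""
    return mean(
        midpoint(fps)
        for cpus, fps in results.items()
        if (0 in cpus) == include_cpu0
    )


print(f"with logical core 0:    {mean_fps(True):.1f} fps")
print(f"without logical core 0: {mean_fps(False):.1f} fps")
```

On these figures the runs that exclude core 0 average roughly 40 fps against roughly 36 fps for the runs that include it. The one exception among the core-0-free runs is 2 + 3 at 35 - 37 fps, which fits the earlier expectation that two threads on the same physical core should be slower.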

 

I'm a bit surprised that priority didn't do anything.

 

My next move would be to look into confirming or refuting if this whole situation is indeed caused by high load on logical core 0 due to other processes.  If that is the case, then the mystery is solved more or less as far as I am concerned, and my next step after that would be to look at how that load can be reduced or eliminated, if necessary and applicable.  Regardless, at least you will be able to get the highest possible performance from the game.
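
One way to check that, sketched in Python: sample per-core utilisation with the emulator closed and see whether logical core 0 is consistently the busiest. This assumes the third-party `psutil` package is installed; the helper names are hypothetical, and Task Manager's per-core graphs or Resource Monitor would show the same thing without any code.

```python
from statistics import mean


def average_load(samples):
    """Average utilisation per logical CPU.

    samples: list of readings, each a list with one percentage per logical CPU.
    """
    return [mean(cpu_values) for cpu_values in zip(*samples)]


def busiest_cpu(samples) -> int:
    """Index of the logical CPU with the highest average utilisation."""
    averages = average_load(samples)
    return max(range(len(averages)), key=averages.__getitem__)


def collect_samples(count: int = 10, interval: float = 1.0):
    """Take `count` per-CPU readings, `interval` seconds apart.

    Assumption: psutil is installed (pip install psutil).
    """
    import psutil  # imported here so the helpers above work without it

    return [psutil.cpu_percent(interval=interval, percpu=True) for _ in range(count)]
```

Calling `busiest_cpu(collect_samples())` a few times while the machine is otherwise idle should return 0 most of the time if background work really is concentrated on logical core 0.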

13 hours ago, Ryan_Vickers said:

My next move would be to look into confirming or refuting if this whole situation is indeed caused by high load on logical core 0 due to other processes.  If that is the case, then the mystery is solved more or less as far as I am concerned, and my next step after that would be to look at how that load can be reduced or eliminated, if necessary and applicable.  Regardless, at least you will be able to get the highest possible performance from the game.

I'm pretty sure there's something wrong with this laptop at the hardware/firmware level, because after some research it seems that, for other people, the System Interrupts process keeps spiking even after a Windows reformat. BTW, it's an Acer Aspire F5-573G-521B; it seems that all F5-573G models are affected, and the problem isn't solved even with the latest BIOS. According to this topic, it's a problem with an audio driver. I did what was advised and system interrupts are now down to 0.3% CPU usage, but the emulator is still faster when I set affinity to certain threads.

 

-----

 

Another really interesting thing: when I run an emulator like Citra (3DS) on Windows, I get about 60% speed in Mario Kart 7. When I switch to a live image of Manjaro Linux, I get 100% speed, and I can even increase the resolution to 2x and still have it run at 80-90%. But when I use Ubuntu or any distro based on it, the performance is extremely bad, fluctuating between 100% and 20% all the time.

 

I don't know if both things are related but it's really weird to me that Windows is so much slower than Manjaro.

16 hours ago, Raploz said:

Another really interesting thing: when I run an emulator like Citra (3DS) on Windows, I get about 60% speed in Mario Kart 7. When I switch to a live image of Manjaro Linux, I get 100% speed, and I can even increase the resolution to 2x and still have it run at 80-90%. But when I use Ubuntu or any distro based on it, the performance is extremely bad, fluctuating between 100% and 20% all the time.

 

I don't know if both things are related but it's really weird to me that Windows is so much slower than Manjaro.

Very interesting.  I think that OS-based performance difference is worth looking into.  I'd start by checking which drivers are in use and, if they differ, try switching to the same ones to see if the performance equalises.  I'd also check the version of the program.  If you're getting it from the package manager and not directly from the website, it's possible the different distros ship different versions, with Ubuntu almost certainly being an older one.

I would also look into which, if any, desktop compositing or effects are enabled.  It's harder to tell these days because compositing is now a seamless part of the DE, much like in Windows, rather than a specific standout feature you have to enable.  That's mainly (I believe) because it's finally polished to the point where it doesn't have a significant impact on performance and/or reliably switches itself off when demanding full-screen apps are used.  Or at least, that's the theory.  In reality, I'm pretty sure it still has an effect and could definitely be enough to cause this difference.  Turn them off completely on both DEs if you can for the best comparison.
