AMD's SMT Implementation is Vulnerable to New Attack Called SQUIP - Affects Ryzen Processors

LAwLz · August 12, 2022

Summary

A new vulnerability has recently been discovered in AMD processors. The attack is called the Scheduler Queue Contention Side Channel. It attacks the way multiple scheduler queues interact with each other.

According to the researchers, the reason this type of exploit has not been tested before is that most security research has been done on Intel CPUs, and Intel CPUs only have a single scheduler queue. As a result, this type of attack is not possible on Intel processors. The researchers discovered that both AMD's Ryzen series and Apple's M series of CPUs have multiple queues.

However, the exploit relies on SMT, which Apple does not use in its M series. As a result, AMD is currently the only vendor known to be vulnerable.

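This exploit allows a program to leak information it should not have access to. In the researchers' covert-channel tests the rate was about 0.89 Mbit/s between virtual machines and 2.70 Mbit/s between processes, with an error rate below 0.8%.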

This would allow for example a virtual machine instance in Azure or AWS to access information from other customers running on the same machine, or malware on a PC to access memory and information that would otherwise not be accessible (because of lacking privilege for example).

The researchers created a proof of concept which allowed them to obtain the RSA encryption keys of a neighbouring VM. This was possible even with AMD's Secure Encrypted Virtualization (SEV) feature enabled.

AMD has acknowledged the vulnerability and assigned it a severity rating of "Medium".

There are currently three suggested ways of mitigation.

1) Security-related software can be patched to ensure secret-independent execution flow with constant-time crypto implementations. This, however, requires each piece of software to harden itself. Some programs are already hardened, and in those cases this is not an issue. But it is not something that is easy to verify, and it puts the responsibility and control of securing the system on the developers and not the users. It also has the issue that not all software can be written as constant-time algorithms, for example code that handles user input.

2) Disable SMT.

3) Employ co-scheduling in the OS. This is a feature where the thread scheduler does not allow two threads from different security domains to run on the same CPU core at the same time. In previous testing, this resulted in a performance drop of between 12% and 20%. Not ideal, but far better than the 8% to 53% performance loss that occurred when SMT was disabled in the same test.
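
To illustrate what "secret-independent execution flow" in mitigation 1 means, here is a minimal Python sketch. It is not taken from the paper or from mbedTLS; the function names and the "S"/"M" operation trace are made up for illustration. Python's big integers are not actually constant-time, so this only shows the control-flow idea, not a hardened implementation.

```python
def leaky_modexp(base, exponent, modulus):
    """Left-to-right square-and-multiply; the multiply only runs for 1 bits."""
    result, trace = 1, []
    for bit in bin(exponent)[2:]:
        result = result * result % modulus
        trace.append("S")
        if bit == "1":  # secret-dependent branch: this is what leaks
            result = result * base % modulus
            trace.append("M")
    return result, "".join(trace)


def ladder_modexp(base, exponent, modulus, bits):
    """Montgomery ladder: one multiply and one square per bit, whatever the bit is."""
    r0, r1, trace = 1, base % modulus, []
    for i in reversed(range(bits)):
        bit = (exponent >> i) & 1
        mask = -bit  # 0 for a 0 bit, all ones for a 1 bit
        # Branch-free conditional swap of r0 and r1.
        t = (r0 ^ r1) & mask
        r0, r1 = r0 ^ t, r1 ^ t
        r1 = r0 * r1 % modulus
        trace.append("M")
        r0 = r0 * r0 % modulus
        trace.append("S")
        # Swap back under the same condition.
        t = (r0 ^ r1) & mask
        r0, r1 = r0 ^ t, r1 ^ t
    return r0, "".join(trace)


if __name__ == "__main__":
    for secret in (0b1011, 0b1000):
        print(bin(secret), leaky_modexp(7, secret, 1000)[1],
              ladder_modexp(7, secret, 1000, 4)[1])
```

In the first function the sequence of operations spells out the exponent bits, which is the kind of pattern a contention side channel can pick up. In the second the sequence is the same for every exponent of a given length.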

Quotes

Quote

Abstract—Modern superscalar CPUs have multiple execution units that independently execute operations from the instruction stream. Previous work has shown that numerous side channels exist around these out-of-order execution pipelines, particularly for an attacker running on an SMT core.

In this paper, we present the SQUIP attack, the first side-channel attack on scheduler queues, which are critical for deciding the schedule of instructions to be executed in superscalar CPUs. Scheduler queues have not been explored as a side channel so far, as Intel CPUs only have a single scheduler queue, and contention thereof would be virtually the same as contention of the reorder buffer. However, the Apple M1, AMD Zen 2, and Zen 3 microarchitectures have separate scheduler queues per execution unit. We first reverse-engineer the behavior of the scheduler queues on these CPUs and show that they can be primed and probed. The SQUIP attack observes the occupancy level from within the same hardware core and across SMT threads. We evaluate the performance of the SQUIP attack in a covert channel, exfiltrating 0.89 Mbit/s from a co-located virtual machine at an error rate below 0.8 %, and 2.70 Mbit/s from a co-located process at an error rate below 0.8 %. We then demonstrate the side channel on an mbedTLS RSA signature process in a co-located process and in a co-located virtual machine. Our attack recovers full RSA4096 keys with only 50 500 traces and less than 5 to 18 bit errors on average. Finally, we discuss mitigations necessary, especially for Zen 2 and Zen 3 systems, to prevent our attacks.

My thoughts

Not great news.

It's not the end of the world for home users, but it is definitely a big issue for server/service providers like Google, Microsoft and Amazon where multiple customers have access to the same hardware, and can run their own code on it.

It poses a slight risk to home users as well, but since it requires specific software to run on a machine it is not nearly as big of a deal for home users. That could change if it is discovered that this can be done through, for example, JavaScript, which I don't think has been confirmed one way or the other. I do think that AMD's rating seems rather low though.

Sources

https://stefangast.eu/papers/squip.pdf

https://www.amd.com/en/corporate/product-security/bulletin/amd-sb-1039

Rauten · August 12, 2022

I guess you could call this an "EPYC Ooof"

I regret nothing

RONOTHAN## · August 12, 2022

Guess I'm disabling SMT for the next month or so.
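
For anyone on Linux who wants to check whether SMT is currently on before deciding, a minimal Python sketch follows. It assumes a kernel that exposes the SMT control file in sysfs; the function name is made up for illustration, and on other systems the file simply will not exist.

```python
from pathlib import Path

# Kernel-provided SMT control file (typical values: on, off, forceoff, notsupported).
SMT_CONTROL = Path("/sys/devices/system/cpu/smt/control")


def smt_state(control_file=SMT_CONTROL):
    """Return the kernel's SMT control state, or None if it is not exposed."""
    try:
        return Path(control_file).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print("SMT control state:", smt_state())
```

Writing "off" to the same file as root turns SMT off until the next reboot; the BIOS setting is the permanent route.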

leadeater · August 12, 2022

5 hours ago, LAwLz said:

AMD has acknowledged the vulnerability and assigned it a severity rating of "Medium".

Probably because with these side-channel attacks you have to prime the system a certain way, and then it takes a significant amount of time to run enough iterations of the attack to pull it off, all while not getting detected or having the VM moved to another host.

Reading some of the paper for the EPYC system in a hosting environment, like Azure/AWS, they had to pin each VM statically to a single core and each VM only had a single vCPU. Since this would never happen in reality and the vCPU of the VM would be moving between physical CPU cores regularly, it would likely be impossible to pull off such an attack under these standard operating conditions.

Quote

5.1. Threat Model The unprivileged attacker’s goal is to steal the RSA secret key from the victim process. Following the threat models of state-of-the-art SMT attacks [5], [53], [18], [63], [57], we assume that the attacker and victim are co-located on the same physical core but run on different SMT threads.

Quote

To demonstrate that SQUIP also works across virtual machine boundaries, we evaluated the performance of the covert channel in a cross-VM setup, with sender and receiver running in separate VMs. For our proof of concept, each virtual machine has one virtual CPU (vCPU), statically assigned to one SMT thread of a shared physical core.

Quote

We assume that in the victim VM, the victim process is pinned to one vCPU, which helps avoiding unnecessary movements of tasks between hardware cores and, thus, cold caches, i.e., it is plausible to find this configuration in practice. We make no assumptions about the other vCPU of the victim. We assume and focus on the setup where the attacker has achieved co-location [26] with the victim such that the RSA computation runs on the same physical core (on a sibling SMT thread) as the attacker’s SQUIP attack. To reduce interference from other tasks and timer interrupts, we enable the full task-isolation mode [55] in the guest and the host. This avoids interference from operating system or hypervisor tasks. Note that this is not a requirement for the attack, as filtering techniques (as described by Yarom et al. [73]) can also be used for timing measurements degraded by timer interrupts. Previous work has shown that achieving co-location in the cloud is possible [26]. While larger cloud providers will avoid colocation of different tenants on the same core, with the associated performance cost (see Section 6.2), smaller cloud providers may not have the margins to pay this performance cost and, therefore, may not use the co-scheduling approach. Furthermore, also on private servers or personal computers, where co-scheduling is not enabled by default, virtual machines are used for security (isolation) in practice.

Severity ratings aren't just based on what is technically possible; how likely it is and what is required also factor into that. Wait for the CVSS score, which has standardized evaluation criteria. It's under review now.

https://nvd.nist.gov/vuln/detail/CVE-2021-46778

LAwLz · August 12, 2022

15 minutes ago, leadeater said:

Probably because with these side-channel attacks you have to prime the system a certain way, and then it takes a significant amount of time to run enough iterations of the attack to pull it off, all while not getting detected or having the VM moved to another host.

Reading some of the paper for the EPYC system in a hosting environment, like Azure/AWS, they had to pin each VM statically to a single core and each VM only had a single vCPU. Since this would never happen in reality and the vCPU of the VM would be moving between physical CPU cores regularly, it would likely be impossible to pull off such an attack under these standard operating conditions.

Severity ratings aren't just based on what is technically possible; how likely it is and what is required also factor into that. Wait for the CVSS score, which has standardized evaluation criteria. It's under review now.

https://nvd.nist.gov/vuln/detail/CVE-2021-46778

Fair enough. I missed that detail about the testing conditions.

Seems like it will land on a 5.3 in score so putting it at "medium" is fair I guess.

hishnash · August 12, 2022

5 hours ago, LAwLz said:

It poses a slight risk to home users as well, but since it requires specific software to run on a machine it is not nearly as big of a deal for home users.

I think just because the user needs to run some malicious software does not mean this is not an issue. For example, I have a password manager on my machine, and that password manager includes keys I have for my work, keys that a hacker would turn into $$$. My assumption is that since I'm running in full secure boot with all system security settings turned on, just running some bit of software in user space should not be able to in any way attach to, modify, read or write the memory of another user-space application unless it makes use of some bug in the kernel.

But with a CPU bug like this it could be quite interesting for a malicious hacker (say someone who distributes keygens, cracked games, duplicated game mods that replace DLLs, etc.) to target common password managers, aiming to snoop up some secrets and dump them out to a server. In the past, attacks against these managers have been attempted by modifying packages in NPM to get devs to accidentally download them while working on projects and end up having all of their company secrets stolen, but those attacks could only stream secrets you had in plain text on your fs, such as env vars or unsecured ssh keys.

Arika · August 12, 2022

5 hours ago, RONOTHAN## said:

Guess I'm disabling SMT for the next month or so.

lol i'm not. these kinds of vulnerabilities pop up every few months or so. i've given up caring about them. they almost always require the attacker to already have access to your system for them to become actually dangerous. Which at that point, if they already have access, i have bigger problems than them just being able to read memory from other programs.

hishnash · August 12, 2022

7 minutes ago, Arika S said:

i have bigger problems than them just being able to read memory from other programs.

modern operating systems are supposed to isolate applications that are running so they are not able to do this. This does not require the hostile application to have any rights on your system; it just needs the OS to schedule it on the same core with SMT as some other app.

Arika · August 12, 2022

42 minutes ago, hishnash said:

modern operating systems are supposed to isolate applications that are running so they are not able to do this. This does not require the hostile application to have any rights on your system; it just needs the OS to schedule it on the same core with SMT as some other app.

But it still requires a hostile application on my computer. So it needs to have been figured out by a bad actor (as currently it's only researchers that have discovered it, with no evidence it's been used in the wild), then i need to have downloaded a malicious program, have it not caught or quarantined by anything, and then have a 1/12 chance of it running on the same thread as a program that it can steal something from, one that doesn't use constant-time cryptography. seems like a lot of things that need to happen for me to be impacted by this.

my personal risk assessment of this: I don't care about it.

You can be the first person to tell me "i told you so" if i get hit by this attack because i didn't disable SMT

LAwLz · August 12, 2022

49 minutes ago, Arika S said:

But it still requires a hostile application on my computer. So it needs to have been figured out by a bad actor (as currently it's only researchers that have discovered it, with no evidence it's been used in the wild), then i need to have downloaded a malicious program, have it not caught or quarantined by anything, and then have a 1/12 chance of it running on the same thread as a program that it can steal something from, one that doesn't use constant-time cryptography. seems like a lot of things that need to happen for me to be impacted by this.

my personal risk assessment of this: I don't care about it.

You can be the first person to tell me "i told you so" if i get hit by this attack because i didn't disable SMT

I am not going to disable SMT either, but depending on what attack vectors get discovered, the risk of running malicious code might be higher than we think.

One possibility that made Spectre and Meltdown so scary was that they could potentially be used through JavaScript. That meant that simply visiting a website that had malicious software on it could end up resulting in things like passwords being stolen. It didn't even have to be a sketchy website. Just a website that happened to get a bad ad or something on it. Maybe some XSS exploit.

1 in 11 to "hit" the right thread might sound low, but imagine if you are exposed to it through malicious code on websites. It might be a low risk whenever you browse a website, but if you are exposed to that risk tens of times a day chances are the malware will hit the jackpot sooner or later.

Again, this all hinges on the idea that it would be possible to execute through JavaScript, which might not be the case.

My point though is that even a small risk can be significant if you are exposed to it frequently enough.
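
To put rough numbers on that, here is a small Python sketch. It assumes, purely for illustration, that every exposure is an independent attempt with the same 1 in 11 chance of landing on the right thread; the function name is made up. As leadeater pointed out, the attack also needs to stay co-located long enough, so the real per-exposure odds would be far lower than this.

```python
def chance_of_at_least_one_hit(p_single, exposures):
    """Probability that at least one of `exposures` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** exposures


if __name__ == "__main__":
    for n in (1, 10, 30):
        print(n, "exposures:", round(chance_of_at_least_one_hit(1 / 11, n), 3))
```

Under those assumptions, ten exposures already work out to roughly a 61% chance of at least one hit.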

yian88 · August 13, 2022

I'm so tired of hearing of these vulnerabilities that never affected personal computers. It's only a server or critical infrastructure security issue; they should stop patching these on home OS and ruin performance. There is no vulnerability on a home PC. You, the user, are the biggest vulnerability to your PC; there is no way a random attack or attacker can hijack your CPU/memory resources from outside without running a program inside your OS.

porina · August 13, 2022

50 minutes ago, yian88 said:

they should stop patching these on home OS and ruin performance

By itself it may not be a significant risk, but every potential hole you have increases the chance many could be used together to do worse. So I think the patching route is the correct one for the masses. Perf impact in common workloads is usually insignificant, and advanced users who want to can still disable it if needed.

Dabombinable · August 13, 2022

Considering the number of holes in the software running on any of the CPUs that have had vulnerabilities discovered over the last several years... for the most part it's not that big a deal. Especially when a fair few exploits require direct access to the hardware, or specific knowledge about its configuration and present state to enact. In which case your system's security is already screwed.

leadeater · August 13, 2022

20 hours ago, Arika S said:

and then have a 1/12 chance of it running on the same thread as a program that it can steal something from, one that doesn't use constant-time cryptography

* for long enough without moving OS/hardware threads.

It's not just a case of running on the same physical core but on different threads; the time required to prime the system and iterate through enough runs to pull this off is a key requirement. This is not an immediate-execution data breach vulnerability.
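
For reference, which logical CPUs share a physical core can be read from sysfs on Linux. A minimal Python sketch, assuming a Linux system that exposes the CPU topology directory; the helper names are made up for illustration.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,6' or '0-1,8-9' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the logical CPUs sharing a core with `cpu`, or None if unknown."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("CPU 0 shares a core with:", smt_siblings(0))
```

An attacker thread would have to sit on one of those sibling CPUs, and stay there, while the victim runs on the other.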

mr moose · August 13, 2022

Not possible, I have it on more gooderer authoritahhh that only intel are vulnerable to such things.

But in all reality, after having read the comments I am assuming this is going to be nothing I need to worry about regarding updates etc?

LAwLz · August 13, 2022

5 hours ago, mr moose said:

Not possible, I have it on more gooderer authoritahhh that only intel are vulnerable to such things.

But in all reality, after having read the comments I am assuming this is going to be nothing I need to worry about regarding updates etc?

For things like cloud providers it is a big deal. If I rent a server in Azure then I could potentially get access to let's say private encryption keys used by Microsoft. The likelihood of that happening is small, but then again, they have a lot of customers with lots of secrets that should not be exposed. If this is left unchecked then sooner or later it will probably lead to someone getting access to something important.

For home users, it probably doesn't matter that much. Just like other similar vulnerabilities didn't end up mattering that much.

A lot of the old Spectre and Meltdown patches will also help protect from this vulnerability.

Jito463 · August 13, 2022

9 hours ago, mr moose said:

Not possible, I have it on more gooderer authoritahhh that only intel are vulnerable to such things.

With how complex hardware has become over the decades, I just assume all hardware has some sort of vulnerability in it, and I assume the same about software. The only questions I have are: how bad is it, how difficult is it to implement and how much do I actually need to worry about it (though usually the last one is answered by the first two).

TetraSky · August 13, 2022

As a consumer, my fear for this is next to zero. I would need to be installing shady software from lord knows where, among other things, for this to affect me.
For VM host providers though... that would be bad indeed.

StDragon · August 15, 2022

When speculative execution turns to speculative exploitation.
