
Torvalds angry at Intel for opt-in fix on Spectre v2

Jito463

https://www.theregister.co.uk/2018/01/22/intel_spectre_fix_linux/

 

Rather than implement the fix for Spectre directly, Intel has apparently decided to expose a flag indicating that a mitigation is available, but to leave the mitigation off by default.

Quote

Intel's fix for Spectre variant 2 – the branch target injection design flaw affecting most of its processor chips – is not to fix it.

Rather than preventing abuse of processor branch prediction by disabling the capability and incurring a performance hit, Chipzilla's future chips – at least for a few years until microarchitecture changes can be implemented – will ship vulnerable by default but will include a protection flag that can be set by software.

 

Intel explained its approach in its technical note about Spectre mitigation, titled Speculative Execution Side Channel Mitigations. Instead of treating Spectre as a bug, the chip maker is offering Spectre protection as a feature.

This has apparently raised the ire of Torvalds:

Quote

The decision to address the flaw with an opt-in flag rather than activating defenses by default has left Linux kernel steward Linus Torvalds apoplectic.

Known for incendiary tirades, Torvalds does not disappoint. In a message posted to the Linux kernel mailing list on Sunday, he wrote, "As it is, the patches are COMPLETE AND UTTER GARBAGE."

"All of this is pure garbage. Is Intel really planning on making this s*** architectural?" he asked. "Has anybody talked to them and told them they are f****** insane? Please, any Intel engineers here – talk to your managers."

The article also goes into some detail about the fix:

Quote

Torvalds' ire arises from Intel's plan to have future processors advertise that they include a Spectre v2 fix while also requiring that the fix is enabled at boot time by setting a flag called the IBRS_ALL bit.

 

IBRS refers to Indirect Branch Restricted Speculation, one of three new hardware patches Intel is offering as CPU microcode updates, in addition to the mitigation created by Google called retpoline. You'll need this microcode from Chipzilla to fully mitigate Spectre on Intel CPUs, although, as detailed below, said microcode is unstable at the moment.

 

IBRS, along with Single Thread Indirect Branch Predictors (STIBP) and Indirect Branch Predictor Barrier (IBPB), prevent a potential attacker or malware from abusing branch prediction to read memory it shouldn't – such as passwords or other sensitive information out of protected kernel memory.

The bolded part caught my eye. I wonder if Intel's hesitance to make it opt-out is due to the current instability, rather than liability? Nevertheless, that's not what Torvalds claims.

Quote

The expectation here, at least on Torvalds' part, is that a future chip addressing past flaws should include a flag or version number that tells the kernel it's not vulnerable, so no unneeded and potentially performance-killing mitigations need to be applied. In other words, the chip should indicate to the kernel that its hardware design has been revised to remove the Spectre vulnerability, and thus does not need any software mitigations or workarounds.

 

Intel's approach is backwards, making the fix opt-in. Processors can, when asked, reveal to the kernel that Spectre countermeasures are present but disabled by default, and these therefore need to be enabled by the operating system. Presumably, this is because the performance hit is potentially too annoying, or because Intel doesn't want to appear to admit there is a catastrophic security blunder in its blueprints.

 

Annoyed by this convoluted approach, Torvalds himself suggested Intel's motivation is avoiding legal liability – recalling two decades of flawed chips would be ruinously expensive – and bad benchmarks. After all, Intel is already being sued all over the place right now.
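To make the opt-in versus opt-out distinction in the quoted article concrete, here is a minimal sketch in Python. The `RDCL_NO` and `IBRS_ALL` bit positions follow Intel's published layout of the `IA32_ARCH_CAPABILITIES` register as far as I know; `HYPOTHETICAL_SPECTRE_V2_NO` and the function `spectre_v2_plan` are invented purely for illustration and stand in for the kind of "I'm fixed" flag the article says Torvalds expected. This is not how any real kernel is written.

```python
# Bit positions in the IA32_ARCH_CAPABILITIES MSR, per Intel's documentation.
RDCL_NO = 1 << 0    # CPU is not vulnerable to Meltdown (rogue data cache load)
IBRS_ALL = 1 << 1   # protection is available, but software must turn it on

# HYPOTHETICAL bit, for illustration only: a "Spectre v2 is fixed in
# hardware" flag of the kind Torvalds expected. Not a real Intel bit.
HYPOTHETICAL_SPECTRE_V2_NO = 1 << 63


def spectre_v2_plan(arch_caps: int) -> str:
    """Return what an OS would have to do about Spectre v2 for these flags."""
    if arch_caps & HYPOTHETICAL_SPECTRE_V2_NO:
        # Opt-out style: the hardware says it is fixed, so skip mitigations.
        return "no mitigation needed"
    if arch_caps & IBRS_ALL:
        # Opt-in style: the protection exists but stays off until the OS
        # enables it at boot.
        return "enable IBRS once at boot"
    # Neither flag: fall back to a software mitigation.
    return "use software mitigation (e.g. retpoline)"


for caps in (0, IBRS_ALL, RDCL_NO | IBRS_ALL, HYPOTHETICAL_SPECTRE_V2_NO):
    print(f"{caps:#x}: {spectre_v2_plan(caps)}")
```

The point of the sketch is the middle branch: with an opt-in flag, an OS that does nothing stays vulnerable, whereas with the hypothetical first branch an OS that does nothing is already safe.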

Another potential issue is the performance hits from the patch.

Quote

"At Lyft, we saw an approximately 20 per cent slowdown on certain system call heavy workloads on AWS C4 instances when the mitigations were rolled out," said software engineer Matt Klein in a recent post.

As an aside, Intel also addressed issues with stability on Haswell & Broadwell based systems after the patch.

Quote

In a separate but related note, Intel on Monday identified the problem with its Broadwell and Haswell CPU updates to mitigate Spectre v2 attacks. Its initial patch had been causing affected machines to crash, so it's preparing a patch without the problematic bits – the Spectre v2 mitigation – that it can offer until it gets the full patch right.

 

"We recommend that OEMs, cloud service providers, system manufacturers, software vendors and end users stop deployment of current [microcode] versions, as they may introduce higher than expected reboots and other unpredictable system behavior," warned Intel, effectively freezing the rollout of fixes it earlier this month promised were golden.

 

"We ask that our industry partners focus efforts on testing early versions of the updated solution so we can accelerate its release. We expect to share more details on timing later this week."

 

HPE is the latest biz, among Lenovo, VMware, and others, to pull Intel's firmware update from its download pages.

 

"For those concerned about system stability while we finalize the updated solutions, we are also working with our OEM partners on the option to utilize a previous version of microcode that does not display these issues, but removes the Variant 2 (Spectre) mitigations," Intel continued.

 

For those not concerned about system stability, it's all good.
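Since Intel's advice is about which microcode version is deployed, it can help to know which revision a machine is actually running. A minimal sketch, assuming a Linux x86 system where `/proc/cpuinfo` reports a `microcode` field per logical CPU; the helper name `microcode_revisions` and the sample text are made up for illustration, and nothing here says which revisions are the unstable ones (check your vendor's advisory for that).

```python
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set:
    """Collect the distinct 'microcode' values found in /proc/cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


# Made-up sample in the /proc/cpuinfo layout, so the sketch runs anywhere.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "microcode\t: 0x22\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
    "microcode\t: 0x22\n"
)

print(microcode_revisions(SAMPLE))

cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # only present on Linux
    print(microcode_revisions(cpuinfo.read_text()))
```

Seeing more than one revision in the result would suggest that not every core has the same microcode loaded.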

 


Torvalds is just an angry asshole. It's time for him to be removed from the Linux development community so that there is no reason to pretend he's important.

Come Bloody Angel

Break off your chains

And look what I've found in the dirt.

 

Pale battered body

Seems she was struggling

Something is wrong with this world.

 

Fierce Bloody Angel

The blood is on your hands

Why did you come to this world?

 

Everybody turns to dust.

 

Everybody turns to dust.

 

The blood is on your hands.

 

The blood is on your hands!

 

Pyo.


Not buying Intel anymore then for the foreseeable future

I nearly forgot I own a Mac. Looks like I will be buying Intel.


3 minutes ago, RorzNZ said:

Not buying Intel anymore then for the foreseeable future

I nearly forgot I own a Mac. Looks like I will be buying Intel.

we can only hope that apple chooses zen and threadripper for the 2018 line. 

Good luck, Have fun, Build PC, and have a last gen console for use once a year. I should answer most of the time between 9 to 3 PST

NightHawk 3.0: R7 5700x @, B550A vision D, H105, 2x32gb Oloy 3600, Sapphire RX 6700XT  Nitro+, Corsair RM750X, 500 gb 850 evo, 2tb rocket and 5tb Toshiba x300, 2x 6TB WD Black W10 all in a 750D airflow.
GF PC: (nighthawk 2.0): R7 2700x, B450m vision D, 4x8gb Geli 2933, Strix GTX970, CX650M RGB, Obsidian 350D

Skunkworks: R5 3500U, 16gb, 500gb Adata XPG 6000 lite, Vega 8. HP probook G455R G6 Ubuntu 20. LTS

Condor (MC server): 6600K, z170m plus, 16gb corsair vengeance LPX, samsung 750 evo, EVGA BR 450.

Spirt  (NAS) ASUS Z9PR-D12, 2x E5 2620V2, 8x4gb, 24 3tb HDD. F80 800gb cache, trueNAS, 2x12disk raid Z3 stripped

PSU Tier List      Motherboard Tier List     SSD Tier List     How to get PC parts cheap    HP probook 445R G6 review

 

"Stupidity is like trying to find a limit of a constant. You are never truly smart in something, just less stupid."

Camera Gear: X-S10, 16-80 F4, 60D, 24-105 F4, 50mm F1.4, Helios44-m, 2 Cos-11D lavs


Just now, GDRRiley said:

we can only hope that apple chooses zen and threadripper for the 2018 line. 

I'm predicting the 6-core i7s and the VEGA gpus :)
That would make me think about switching, but I'll probably keep my laptop until 2022 at least.

... holy s**t 2020 is close wtf.


15 minutes ago, Drak3 said:

It's time for him to be removed from the Linux development community so that there is no reason to pretend he's important.

Although the article (and my post) used Torvalds as the launching point, it was more about the fix from Intel being opt-in rather than opt-out.


11 minutes ago, GDRRiley said:

we can only hope that apple chooses zen and threadripper for the 2018 line. 

like that's going to happen. I don't want to say never but even given the current situation, Apple seems pretty dedicated to Intel and that's not gonna change unless something big happens.


11 minutes ago, Jito463 said:

AMD would have to be able to implement Thunderbolt, as I doubt Apple wants to drop that from their systems.

Thunderbolt is royalty-free now.


Torvalds does seem to have an issue with everyone else at the moment. Maybe he feels people aren't paying him enough attention.

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


I'm probably wrong, but is Torvalds prioritizing security over whether or not the system is stable? Granted, a crashed system is good from a security standpoint (:P), but how many server admins like troubleshooting random crashes that cost the company lots of money?

My eyes see the past…

My camera lens sees the present…


Just now, divito said:

Doesn't seem like a smart decision, and it's further irritating that people are writing this off just because Torvalds is the one bringing attention to this.

It seems to me the decision was based on allowing the techies who are responsible for large systems to update and enable patches in a more controlled way.  Rather than just installing the update and waiting for everything to crash, they can run the updates and slowly introduce patch components without risking everything rebooting at the same time.

 

Unless I am wrong, this isn't aimed at the domestic Joe who doesn't have a clue. Besides, most of his tirades are unprofessional and offer very little justification.


1 hour ago, Jito463 said:

AMD would have to be able to implement Thunderbolt, as I doubt Apple wants to drop that from their systems.

What does that have to do with AMD? Doesn't Thunderbolt just use PCIe lanes? AMD has more of them up for grabs than Intel. It's up to the MOBO manufacturer to implement TB3, in this case Apple's design and whoever they source to produce them.


24 minutes ago, MoonSpot said:

What does that have to do with AMD? Doesn't Thunderbolt just use PCIe lanes? AMD has more of them up for grabs than Intel. It's up to the MOBO manufacturer to implement TB3, in this case Apple's design and whoever they source to produce them.

Thunderbolt requires a special controller and has specific licensing stuff.

Judge a product on its own merits AND the company that made it.

How to setup MSI Afterburner OSD | How to make your AMD Radeon GPU more efficient with Radeon Chill | (Probably) Why LMG Merch shipping to the EU is expensive

Oneplus 6 (Early 2023 to present) | HP Envy 15" x360 R7 5700U (Mid 2021 to present) | Steam Deck (Late 2022 to present)

 

Mid 2023 AlTech Desktop Refresh - AMD R7 5800X (Mid 2023), XFX Radeon RX 6700XT MBA (Mid 2021), MSI X370 Gaming Pro Carbon (Early 2018), 32GB DDR4-3200 (16GB x2) (Mid 2022

Noctua NH-D15 (Early 2021), Corsair MP510 1.92TB NVMe SSD (Mid 2020), beQuiet Pure Wings 2 140mm x2 & 120mm x1 (Mid 2023),


2 hours ago, Jito463 said:

Although the article (and my post) used Torvalds as the launching point, it was more about the fix from Intel being opt-in rather than opt-out.

That's so stupid though. Omg. I swear Intel is practically begging people to buy AMD at this point.


2 hours ago, Drak3 said:

Torvalds is just an angry asshole. It's time for him to be removed from the Linux development community so that there is no reason to pretend he's important.

In this case I do agree it should be opt-out and versioned correctly, but the fix SHOULD be stable first though.


19 minutes ago, leadeater said:

In this case I do agree it should be opt-out and versioned correctly, but the fix SHOULD be stable first though.

So people should really only get their panties in a bunch if the patch is made stable but Intel don't make it opt out?


David Woodhouse (the kernel developer who posted the IBRS patch series) replied with an explanation of what is going on, with some background info.

I highly recommend people read (and understand) it before throwing shit at someone.

 

Quote

On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote:
> On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> >>
> >> The patches do things like add the garbage MSR writes to the kernel
> >> entry/exit points. That's insane. That says "we're trying to protect
> >> the kernel".  We already have retpoline there, with less overhead.
> >
> > You're looking at IBRS usage, not IBPB. They are different things.
> 
> Ehh. Odd intel naming detail.
> 
> If you look at this series, it very much does that kernel entry/exit
> stuff. It was patch 10/10, iirc. In fact, the patch I was replying to
> was explicitly setting that garbage up.
> 
> And I really don't want to see these garbage patches just mindlessly
> sent around.

I think we've covered the technical part of this now, not that you like
it — not that any of us *like* it. But since the peanut gallery is
paying lots of attention it's probably worth explaining it a little
more for their benefit.

This is all about Spectre variant 2, where the CPU can be tricked into
mispredicting the target of an indirect branch. And I'm specifically
looking at what we can do on *current* hardware, where we're limited to
the hacks they can manage to add in the microcode.

The new microcode from Intel and AMD adds three new features.

One new feature (IBPB) is a complete barrier for branch prediction.
After frobbing this, no branch targets learned earlier are going to be
used. It's kind of expensive (order of magnitude ~4000 cycles).

The second (STIBP) protects a hyperthread sibling from following branch
predictions which were learned on another sibling. You *might* want
this when running unrelated processes in userspace, for example. Or
different VM guests running on HT siblings.

The third feature (IBRS) is more complicated. It's designed to be
set when you enter a more privileged execution mode (i.e. the kernel).
It prevents branch targets learned in a less-privileged execution mode,
BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not just
a 'set-and-forget' feature, it also has barrier-like semantics and
needs to be set on *each* entry into the kernel (from userspace or a VM
guest). It's *also* expensive. And a vile hack, but for a while it was
the only option we had.

Even with IBRS, the CPU cannot tell the difference between different
userspace processes, and between different VM guests. So in addition to
IBRS to protect the kernel, we need the full IBPB barrier on context
switch and vmexit. And maybe STIBP while they're running.

Then along came Paul with the cunning plan of "oh, indirect branches
can be exploited? Screw it, let's not have any of *those* then", which
is retpoline. And it's a *lot* faster than frobbing IBRS on every entry
into the kernel. It's a massive performance win.

So now we *mostly* don't need IBRS. We build with retpoline, use IBPB
on context switches/vmexit (which is in the first part of this patch
series before IBRS is added), and we're safe. We even refactored the
patch series to put retpoline first.

But wait, why did I say "mostly"? Well, not everyone has a retpoline
compiler yet... but OK, screw them; they need to update.

Then there's Skylake, and that generation of CPU cores. For complicated
reasons they actually end up being vulnerable not just on indirect
branches, but also on a 'ret' in some circumstances (such as 16+ CALLs
in a deep chain).

The IBRS solution, ugly though it is, did address that. Retpoline
doesn't. There are patches being floated to detect and prevent deep
stacks, and deal with some of the other special cases that bite on SKL,
but those are icky too. And in fact IBRS performance isn't anywhere
near as bad on this generation of CPUs as it is on earlier CPUs
*anyway*, which makes it not quite so insane to *contemplate* using it
as Intel proposed.

That's why my initial idea, as implemented in this RFC patchset, was to
stick with IBRS on Skylake, and use retpoline everywhere else. I'll
give you "garbage patches", but they weren't being "just mindlessly
sent around". If we're going to drop IBRS support and accept the
caveats, then let's do it as a conscious decision having seen what it
would look like, not just drop it quietly because poor Davey is too
scared that Linus might shout at him again. :)

I have seen *hand-wavy* analyses of the Skylake thing that mean I'm not
actually lying awake at night fretting about it, but nothing concrete
that really says it's OK.

If you view retpoline as a performance optimisation, which is how it
first arrived, then it's rather unconventional to say "well, it only
opens a *little* bit of a security hole but it does go nice and fast so
let's do it".

But fine, I'm content with ditching the use of IBRS to protect the
kernel, and I'm not even surprised. There's a *reason* we put it last
in the series, as both the most contentious and most dispensable part.
I'd be *happier* with a coherent analysis showing Skylake is still OK,
but hey-ho, screw Skylake.

The early part of the series adds the new feature bits and detects when
it can turn KPTI off on non-Meltdown-vulnerable Intel CPUs, and also
supports the IBPB barrier that we need to make retpoline complete. That
much I think we definitely *do* want. There have been a bunch of us
working on this behind the scenes; one of us will probably post that
bit in the next day or so.

I think we also want to expose IBRS to VM guests, even if we don't use
it ourselves. Because Windows guests (and RHEL guests; yay!) do use it.

If we can be done with the shouty part, I'd actually quite like to have
a sensible discussion about when, if ever, we do IBPB on context switch
(ptraceability and dumpable have both been suggested) and when, if
ever, we set STIBP in userspace.
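The email above packs a lot of decision logic into prose, so here is a toy model of it in Python. It encodes only the initial RFC idea described in the email (IBRS on Skylake, retpoline elsewhere, IBPB on context switch and vmexit in both cases), which the email itself says he is content to drop. The function names, parameters and example numbers are mine and purely illustrative; the ~4000 cycle figure is the rough order of magnitude quoted in the email, not a measurement.

```python
def choose_mitigations(is_skylake: bool, has_new_microcode: bool) -> list:
    """Toy model of the RFC policy: IBRS on Skylake, retpoline elsewhere."""
    plan = []
    if is_skylake and has_new_microcode:
        # Skylake-era cores can also mispredict on 'ret', which retpoline
        # does not cover, so the RFC kept IBRS on kernel entry there.
        plan.append("IBRS on every kernel entry")
    else:
        # Everywhere else retpoline protects the kernel and is much faster.
        plan.append("retpoline-built kernel")
    if has_new_microcode:
        # Neither option separates one process or VM guest from another,
        # so the full barrier is still wanted when switching between them.
        plan.append("IBPB on context switch and vmexit")
    return plan


def ibpb_overhead(switches_per_sec: float, clock_hz: float,
                  cycles_per_ibpb: float = 4000.0) -> float:
    """Fraction of one core's cycles spent on IBPB barriers."""
    return switches_per_sec * cycles_per_ibpb / clock_hz


print(choose_mitigations(is_skylake=True, has_new_microcode=True))
print(choose_mitigations(is_skylake=False, has_new_microcode=True))

# Hypothetical workload: 10,000 barriers per second on a 3 GHz core.
print(f"{ibpb_overhead(10_000, 3e9):.2%}")
```

With those made-up numbers the barrier alone costs a bit over 1% of a core, which gives a feel for why the email's open question of when, if ever, to do IBPB on context switch matters.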

 


2 hours ago, mr moose said:

It seems to me the decision was based on allowing the techies who are responsible for large systems to update and enable patches in a more controlled way.  Rather than just installing the update and waiting for everything to crash, they can run the updates and slowly introduce patch components without risking everything rebooting at the same time.

If OP is accurate, though, this particular rant was about how upcoming CPUs will be shipped in the near future (the rant about how to patch CPUs already out in the wild was in a different post :P).

For unreleased CPUs there is a business-sensitive third alternative to unstable patches on by default or off by default: withholding products until they can ship with safe and stable solutions. While far from cheap, it provides a different perspective on how to judge imperfect solutions...


7 minutes ago, SpaceGhostC2C said:

If OP is accurate, though, this particular rant was about how upcoming CPUs will be shipped in the near future (the rant about how to patch CPUs already out in the wild was in a different post :P).

For unreleased CPUs there is a business-sensitive third alternative to unstable patches on by default or off by default: withholding products until they can ship with safe and stable solutions. While far from cheap, it provides a different perspective on how to judge imperfect solutions...

I've read so much on it that it's all starting to blur into one article with one issue. And to be honest I don't mind treating it like that.

 

At any rate, it seems my previous musings on the why and whatnot were mostly wrong: it has less to do with unstable reboots (although they are a thing and serious in certain situations) and more to do with some CPUs just not needing said patches while others do. The info linked by @LAwLz sheds some interesting light.


4 hours ago, Jito463 said:

That doesn't necessarily mean AMD are ready to implement it. Stuff like that does take time.

Of course but they would be crazy not to.
