Spectre and Meltdown

Forget about the old blog posts for now.

Today the hot item is Spectre and Meltdown. They are a class of vulnerabilities caused by CPU bugs that allow an adversary to steal sensitive data, even when the software itself has no bugs. Nice.

Everyone and his dog is talking about it, offering their opinions and such. Thus, I feel compelled to offer my own.

Mind you, I'm not a CPU engineer, so don't take this as infallible. In fact, I may be totally wrong about it. So treat it the way you treat any other opinion - verify and cross-check with other sources. That being said, I've done some research about it myself, so I expect that I'm not fooling myself too much :)



Overview

There are 3 kinds of vulnerabilities: Spectre 1, Spectre 2, and Meltdown.

In very simplified terms, this is how they work:
1. Spectre 1 - using speculative execution, leak sensitive data via cache timing.
2. Spectre 2 - by poisoning the branch prediction cache, make #1 more likely to happen.
3. Meltdown - an application of Spectre 1: read kernel-mode memory from non-privileged programs.



How they work

So how exactly do they work? https://googleprojectzero.blogspot.com.au/2018/01/reading-privileged-memory-with-side.html gives you the full details of how they work, but in a nutshell, here it is:

Spectre 1 - Speculative execution is a phantom CPU operation that supposedly does not leave any trace. And if you view it from the CPU's point of view, it really doesn't leave any trace.

Unfortunately, that's not the case when you view it from outside the CPU. From outside, a speculative execution looks just like normal execution - peripherals can't differentiate between them, and any side effects will stay. This is well known, and CPU designers are very careful not to perform speculative execution when dealing with the external world.

However, there is one peripheral that sits between the CPU and the external world - the RAM cache. There are multiple levels of RAM cache (L1, L2, L3); some of these belong to the CPU (as in, located on the same physical chip), some are external to it. In most designs, however, the physical location doesn't matter: wherever they are, these caches usually aren't aware of the difference between speculative and normal execution. And this is where the trouble is: because the RAM cache is unable to differentiate between the two, any execution (normal or speculative) will leave an imprint in the RAM cache - certain data may be loaded into or evicted from the cache.

Although one cannot read the contents of the RAM cache directly (that would be too easy!), one can still infer information by checking whether a certain piece of data is inside the RAM cache or not - by timing the access (if it's in the cache, the data is returned fast; otherwise it's slow).

And that's how Spectre 1 works - by doing tricks to control speculative execution, one can make the CPU perform an operation which normally isn't allowed, and which leaves an imprint in the RAM cache; that imprint can then be checked to gain some information.
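To make the idea concrete, here is a toy simulation in Python. It is only a sketch under heavy assumptions: ToyCPU, victim, leak_byte and the FAST/SLOW numbers are all names and values I made up, the "cache" is just a set, and the "branch predictor" is a flag the attacker passes in. Nothing here touches real hardware; it only mimics the sequence flush, mispredict, probe.

```python
FAST, SLOW = 1, 100  # made-up access times: cache hit vs cache miss


class ToyCPU:
    """Toy model, not real hardware. Only memory[:array_len] is meant to be
    readable through victim(); everything after it is 'secret'."""

    def __init__(self, memory, array_len):
        self.memory = memory          # flat list of byte values
        self.array_len = array_len
        self.cached_lines = set()     # which probe lines are "in the cache"

    def victim(self, index, predicted_in_bounds):
        in_bounds = index < self.array_len
        if in_bounds or predicted_in_bounds:
            # Runs for real, or speculatively when the guess was wrong.
            value = self.memory[index]
            self.cached_lines.add(value)  # stands in for touching probe[value]
        # The speculative result is thrown away; the cache imprint is not.
        return self.memory[index] if in_bounds else None

    def probe_time(self, line):
        return FAST if line in self.cached_lines else SLOW


def leak_byte(cpu, index):
    cpu.cached_lines.clear()                     # "flush" the probe lines
    cpu.victim(index, predicted_in_bounds=True)  # mispredicted bounds check
    timings = [cpu.probe_time(line) for line in range(256)]
    return timings.index(min(timings))           # the one fast line is the byte


memory = list(b"public") + list(b"SECRET")
cpu = ToyCPU(memory, array_len=6)
print(cpu.victim(6, predicted_in_bounds=False))        # None
print(bytes(leak_byte(cpu, i) for i in range(6, 12)))  # b'SECRET'
```

Note that victim() never returns the secret; the attacker only learns it from which probe line came back fast.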

Spectre 2 - Just like memory cache and speculative execution, branch prediction is a performance-improvement technique used by CPU designers. Most branches will trigger speculative execution; branch prediction (when the prediction is correct) makes that speculation run as short as possible.

In addition, certain memory-based branches ("indirect branches") use a small, in-CPU cache to hold the targets of the previous few jumps; these are the locations from which speculative execution will be started.

Now, if you can fill this branch prediction cache with bad values (="poisoning" it), you can make the CPU perform speculative execution at the wrong location. Also, by making the branch prediction err most of the time, you make that speculative execution live longer than it should. Together, they make it easier to launch a Spectre 1 attack.
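The poisoning idea can be sketched with another toy model in Python. Everything here is an assumption made for illustration: ToyBTB, run_indirect_branch, the addresses, and especially the rule that the cache is indexed by only the low 4 bits of the branch address. Real CPUs index their predictors in their own undocumented ways; the point is only that two different branches can share one slot.

```python
BTB_INDEX_BITS = 4  # assumed: the toy cache looks only at the low 4 address bits


class ToyBTB:
    """Toy branch target cache: remembers the last target seen per slot."""

    def __init__(self):
        self.entries = {}

    def _slot(self, branch_addr):
        return branch_addr & ((1 << BTB_INDEX_BITS) - 1)

    def predict(self, branch_addr):
        return self.entries.get(self._slot(branch_addr))

    def update(self, branch_addr, target):
        self.entries[self._slot(branch_addr)] = target


def run_indirect_branch(btb, branch_addr, real_target):
    """Return (where speculation starts, where execution really goes)."""
    speculated = btb.predict(branch_addr)
    btb.update(branch_addr, real_target)  # the cache learns the real target
    return speculated, real_target


GADGET = 0x6666         # hypothetical address of code the attacker wants run
VICTIM_BRANCH = 0x1234  # hypothetical address of the victim's indirect branch
ATTACKER_BRANCH = 0x9994  # different branch, same low 4 bits as the victim's

btb = ToyBTB()
run_indirect_branch(btb, ATTACKER_BRANCH, GADGET)  # attacker poisons the slot
speculated, actual = run_indirect_branch(btb, VICTIM_BRANCH, 0x5000)
print(hex(speculated), hex(actual))
```

The victim still ends up at its real target, but speculation started at the attacker's gadget first. After that one run the slot holds the real target again, so the attacker has to keep re-poisoning it.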

Meltdown - is an application of Spectre 1 that attempts to read data from privileged, protected kernel memory from a non-privileged program. Normally this kind of operation will not even be attempted by the CPU, but during speculative execution, some CPUs "forget" to check for privilege separation and just blindly do what they are asked to do.
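Here is the same kind of toy model for Meltdown, again in Python and again only a sketch. ToyCore, user_load, meltdown_read and the checks_before_load flag are invented names; the flag is my stand-in for whether a CPU checks privilege before or after the speculative load touches the cache.

```python
class ToyCore:
    """Toy model: addresses at or above kernel_start are kernel memory."""

    def __init__(self, memory, kernel_start, checks_before_load):
        self.memory = memory
        self.kernel_start = kernel_start
        self.checks_before_load = checks_before_load
        self.cached_lines = set()

    def user_load(self, addr):
        is_kernel = addr >= self.kernel_start
        if is_kernel and self.checks_before_load:
            # Well-behaved core: refuses before anything is executed.
            raise PermissionError("fault before the load")
        value = self.memory[addr]
        self.cached_lines.add(value)  # dependent access leaves a cache imprint
        if is_kernel:
            # Forgetful core: faults only afterwards, imprint already left.
            raise PermissionError("fault after the load")
        return value


def meltdown_read(core, addr):
    core.cached_lines.clear()
    try:
        core.user_load(addr)
    except PermissionError:
        pass  # the attacker survives the fault and goes on to probe
    # Checking the set directly stands in for timing all 256 probe lines.
    hits = [line for line in range(256) if line in core.cached_lines]
    return hits[0] if hits else None


memory = list(b"user") + list(b"KERN")
forgetful = ToyCore(memory, kernel_start=4, checks_before_load=False)
careful = ToyCore(memory, kernel_start=4, checks_before_load=True)
print(meltdown_read(forgetful, 4), meltdown_read(careful, 4))
```

Both cores raise a fault on the kernel address, so from the program's point of view the read "failed" in both cases. Only the forgetful one leaves something behind in the cache.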



Impact

Anything that allows non-privileged programs to read and leak information from protected memory is bad.



Mitigation Ideas

Addressing these vulnerabilities - especially Spectre - is hard because the cause of the problem is not a single architecture or a CPU bug or anything like that - it is tied to the concept itself.

Speculative execution, memory cache, and branch prediction are all related. They are time-proven performance-enhancing techniques that have been employed for decades (in the consumer microprocessor world, Intel was first with their Pentium CPU back in 1993 - that's 25 years ago at the time of writing).

Spectre 1 can be stopped entirely, if speculative execution does not impact the cache (or if the changes to the cache can be undone once speculative execution is completed). But that is a very expensive operation in terms of performance. By doing that, you more or less lose the speed gain you get from speculative execution - which means you may as well not bother with speculative execution in the first place.

Spectre 2 can be stopped entirely if you can enlarge the branch prediction cache so poisoning won't work. But there is a physical limit on how large the branch cache can be, before it slows down and loses its purpose as a cache.

Alternatively, it can be stopped, again in its entirety, if you disable speculative execution during branching. But that's what branch prediction is for, so if you do that, you may as well drop branch prediction too.

Meltdown, however, is easier to work out. We just need to ensure that speculative execution honours the memory protection too, just like normal execution. Alternatively, we make the kernel memory totally inaccessible from non-privileged programs (not by access control, but by mapping it out altogether).



Mitigation In Practice

Spectre 1 - There is no fix available, yet (no wonder, this is the most difficult one).

There are clues that some special memory barrier instructions (e.g. LFENCE) can be modified (perhaps by microcode update?) to stop speculative execution, or at least to remove the RAM cache imprint by undoing cache loads made during speculative execution, on demand (that is, when that LFENCE instruction is executed).

However, even when it is implemented (it isn't yet at the moment), this is a piecemeal fix at best. It requires patches to be applied to compilers, or more importantly to any programs capable of generating code or running interpreted code from untrusted sources. It does not stop the attack fully; it only makes it more difficult to carry out.

Spectre 2 - Things are a bit rosier in this department. The fix is basically to disable speculative execution during branching. This can be done in two ways. In software, it can be done using a technique called "retpoline" (you can google that) - which basically lets speculative execution chase its own tail (=thus effectively disabling it). In hardware, it can be done by the CPU exposing controls (via microcode update) to temporarily disable speculative execution during branching, with the software then making use of those controls.

Retpoline is available today. The microcode update is presumably available today for certain CPUs, and the Linux kernel patches that make use of those branch controls are also available today. However, none of them have been merged into mainline yet. (Certain vendor-specific kernel builds already have these fixes, though.)

Remember, the point of Spectre 2 is to make it easier to carry out Spectre 1, so fixing Spectre 2 makes Spectre 1 less likely to happen, to the point of making it irrelevant (hopefully).

Meltdown - This is where the good news finally is. The fix can be done, again, via CPU microcode update, or by software. Because it may take a while for that microcode update to happen (or it may not happen at all), the kernel developers have come up with a software fix called KPTI - Kernel Page Table Isolation. With this fix, kernel memory is completely hidden from non-privileged programs (that's what "isolation" stands for). This works, but at a very high cost in performance: the slowdown is reported to be at least 5%, and may reach 30% or more.
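A rough way to picture KPTI, and why it costs performance, is the following Python sketch. It is a simplification with invented names (ToyAddressSpace, syscall, visible_from_user, and the page names), and it assumes the cost comes from switching page tables on every entry to and exit from the kernel. The real implementation also has to keep a small piece of kernel entry code mapped, which this toy ignores.

```python
class ToyAddressSpace:
    """Toy model of one process's page tables, with or without KPTI."""

    def __init__(self, user_pages, kernel_pages, kpti):
        # The kernel-mode table always maps everything.
        self.kernel_table = {**user_pages, **kernel_pages}
        # Without KPTI the same table is used in user mode too, and kernel
        # pages are guarded only by a permission check. With KPTI the
        # user-mode table simply has no kernel pages in it.
        self.user_table = dict(user_pages) if kpti else self.kernel_table
        self.active = self.user_table
        self.table_switches = 0

    def _switch(self, table):
        if table is not self.active:
            self.active = table
            self.table_switches += 1  # each switch costs time on real hardware

    def syscall(self):
        self._switch(self.kernel_table)  # entering the kernel
        # ... the kernel does its work here ...
        self._switch(self.user_table)    # returning to the program

    def visible_from_user(self, page):
        return page in self.user_table


user_pages = {"app_code": 1, "app_heap": 2}
kernel_pages = {"kernel_secret": 3}
plain = ToyAddressSpace(user_pages, kernel_pages, kpti=False)
isolated = ToyAddressSpace(user_pages, kernel_pages, kpti=True)
for _ in range(3):
    plain.syscall()
    isolated.syscall()
print(plain.visible_from_user("kernel_secret"), plain.table_switches)
print(isolated.visible_from_user("kernel_secret"), isolated.table_switches)
```

With isolation on, a speculative read of a kernel address from user mode has nothing to find, because the mapping is not there at all. The price is two table switches per system call, which is why programs that make many system calls are hit hardest.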




Affected CPUs

Everyone has a different view on this, but here is my take about it.

Spectre 1 - All out-of-order superscalar CPUs (no matter what architecture or vendor or make) from the Pentium Pro era (ca 1995) onwards are susceptible.

Spectre 2 - All CPUs with branch prediction that uses a cache (aka "dynamic branch prediction") are affected. The exact technique to carry out a Spectre 2 attack may differ from one architecture to another, but the attack concept is applicable to all CPUs of this class.

Meltdown - certain CPUs get it right and honour memory protection even during speculative execution. These CPUs don't need the above KPTI patches and they are not affected by Meltdown. Some say that CPUs from AMD are not affected by this; but with so many models involved it's difficult to be sure.
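If you would rather ask your own machine, Linux kernels that carry the fixes report their view through files under /sys/devices/system/cpu/vulnerabilities. The Python sketch below just reads them. It assumes your kernel is new enough to have that directory (older ones don't, and then you get an empty result), and read_vulnerability_status is a helper name I made up.

```python
from pathlib import Path

# Present only on kernels that report vulnerability status via sysfs.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerability_status(directory=VULN_DIR):
    """Return {file name: kernel's one-line status}; empty if unavailable."""
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    report = read_vulnerability_status()
    if not report:
        print("This kernel does not expose vulnerability status.")
    for name, state in report.items():
        print(f"{name}: {state}")
```

Keep in mind this shows what the kernel believes about your CPU, so it is only as good as the kernel you are running.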




So that's it. It does not sound very uplifting, but at least you get a picture of what you're going to have for the rest of 2018. And the year has just started ...

EDIT: If you don't understand some of the terms used in this article, you may want to check this excellent article by Eben Upton.

Posted on 16 Jan 2018, 15:24 - Categories: Linux General