What’s up, Meltdown?
Everyone seems to have been going crazy about Meltdown/Spectre for the past week. I just finished reading the Meltdown paper, and doing so made the existential threat clearer, though I’m still hesitant to believe that the sky is falling. Very credible folks from the security community (tptacek et al.) seem to think that this is v bad, so I’m inclined to believe that I’m not seeing the full picture here.
I first read the Spectre PoC code (published in Appendix A of the Spectre paper and available in GitHub gist form here). I understand that Spectre is separate, but the two are related, so I was hoping that the example code would shed some light on what everyone seemed so afraid of.
It turned out that the code mainly showcased poisoning the branch predictor to cause a speculative execution of an unintended memory read, but the PoC itself didn’t illustrate a huge hole in memory isolation as it effectively just uses the cache timing side-channel to read data out of a char array in user memory. Any process already has full access to all of its own user memory, so that wasn’t particularly interesting.
What the authors have to say
The BLUF on Meltdown is that a malicious process can read any memory mapped into its address space using a cache timing side channel. Any process could already read its own user memory, so what’s interesting here is that the attacker can also read kernel memory, by timing reads with Flush+Reload (described in section 5.1 of the paper).
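To make the Flush+Reload idea a bit more concrete, here is a toy model of just the decoding step. It is only a sketch: no real cache or kernel memory is involved, the latency numbers are made up, and every name in it (`transient_access`, `recover_byte`, and so on) is hypothetical rather than taken from the paper's code.

```python
# Toy model of the Flush+Reload decoding step. Nothing here touches a real
# cache: the latencies are simulated values chosen only for illustration.
PAGE_SIZE = 4096          # one probe slot per possible byte value
CACHE_HIT_CYCLES = 40     # assumed latency of a cached line
CACHE_MISS_CYCLES = 300   # assumed latency of an uncached line


def transient_access(secret_byte):
    """Model the transient instruction: it touches probe[secret * PAGE_SIZE],
    so exactly that slot ends up in the simulated cache."""
    return {secret_byte * PAGE_SIZE}


def reload_latency(offset, cached):
    """Simulated time to reload one probe slot."""
    return CACHE_HIT_CYCLES if offset in cached else CACHE_MISS_CYCLES


def recover_byte(cached):
    """Reload all 256 slots and return the index of the fastest one."""
    latencies = [reload_latency(v * PAGE_SIZE, cached) for v in range(256)]
    return min(range(256), key=latencies.__getitem__)


cached_lines = transient_access(ord("S"))
print(chr(recover_byte(cached_lines)))  # prints "S"
```

The point of the model is that the secret never has to be read architecturally: the attacker only learns which of the 256 probe slots came back fast.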
It seemed clear that with this mechanism, if the reads were performed quickly enough (and the authors claim speeds of ≥ 500KB/s), you could read data from kernel buffers passed into syscalls by other processes, things like logins, etc. It was threatening, but you’d then need to scan kernel memory for the buffer holding that data, and at the advertised 500KB/s it would take a day and a half to scan 1GB of kernel memory. EDIT: yea I messed up, it would only take about half an hour. This is much easier to mount. Thus, feasible but limited. However, my lack of understanding of something very basic about the Linux kernel is what prevented me from seeing what the big issue was. This is the interesting part of the attack, and it is why the IT industry has collectively curled up in the fetal position since public disclosure.
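For anyone who wants to redo the scan-time arithmetic, here is a minimal sketch. It assumes 1GB means 2**30 bytes and 500KB/s means 500,000 bytes per second; the helper name `scan_seconds` is mine, not something from the paper.

```python
GIB = 2**30                  # assumption: "1GB" read as one gibibyte
READ_RATE = 500 * 1000       # assumption: 500KB/s as 500,000 bytes/second


def scan_seconds(num_bytes, bytes_per_second=READ_RATE):
    """Time to read num_bytes sequentially at a fixed leak rate."""
    return num_bytes / bytes_per_second


secs = scan_seconds(GIB)
print(f"1 GiB: {secs:.0f} s, about {secs / 60:.0f} minutes")
```

With those assumptions the result is roughly 36 minutes, which matches the "about half an hour" in the correction.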
Physmem, physmem everywhere
On Linux (without KASLR) the kernel is mapped at some fixed address KERNEL_START_ADDR, specifically 0xffff880000000000 on 64-bit systems.
The Linux kernel, to make allocating pages easier, direct-maps all of physical memory into the kernel address space starting at the kernel offset. So for example, with KASLR enabled and a KASLR offset of 0xc000, reading physical memory address 0x80000000 (the 2 GiB mark in physical memory) would require the execution of a transient instruction reading from 0xffff88008000c000.
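The address arithmetic in that example can be sketched as follows. The constant and function names (`DIRECT_MAP_BASE`, `direct_map_vaddr`) are made up for illustration; the base address and offsets are the ones quoted above.

```python
DIRECT_MAP_BASE = 0xFFFF880000000000  # fixed base quoted above (no KASLR)


def direct_map_vaddr(phys_addr, kaslr_offset=0):
    """Kernel virtual address through which phys_addr is visible, assuming
    a linear direct map shifted by kaslr_offset."""
    return DIRECT_MAP_BASE + kaslr_offset + phys_addr


print(hex(direct_map_vaddr(0x80000000, kaslr_offset=0xC000)))
# prints 0xffff88008000c000
```

So once the base and the KASLR offset are known, every physical address translates to exactly one kernel virtual address to aim the transient read at.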
This is huge. Up until now, address space isolation was one of the core tenets of Linux’s security model. Now by exploiting Meltdown, a single process could infer with high probability the values of bytes on arbitrary raw memory pages, which can be written by any process running on the system. Keep in mind though, at 500KB/s, scanning all of an 8 gig machine still takes a bit under five hours. It sounds to me like you need some information about which physical addresses are in use by a process to realistically execute this attack anywhere outside of the lab, but I’m sure there are ways of getting that information via side channels or the like.
Some published attacks
You’ll note that in this video proof-of-concept published by Michael Schwarz, one of the authors of the Meltdown paper, the following command line is run to read the text entered into a login screen:
./reader 0x3c80e8040
And the first line of output for the reader program is 0xffff8803c80e8040.
Now we can probably guess that the attack is being mounted on a machine with KASLR enabled, and Michael is supplying the offset from KERNEL_START_ADDR where the kernel’s direct-physical mapping begins.
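A quick sanity check on the two addresses from the video: subtracting the fixed base 0xffff880000000000 from the first line of output should give back the command-line argument. This is just arithmetic on the values quoted above; `offset_in_direct_map` is a hypothetical helper name.

```python
DIRECT_MAP_BASE = 0xFFFF880000000000  # fixed base quoted earlier in the post


def offset_in_direct_map(vaddr, base=DIRECT_MAP_BASE):
    """Offset of a kernel virtual address from the start of the direct map."""
    return vaddr - base


printed_by_reader = 0xFFFF8803C80E8040   # first line of the reader output
argument = 0x3C80E8040                   # value passed on the command line

print(hex(offset_in_direct_map(printed_by_reader)))  # prints 0x3c80e8040
print(offset_in_direct_map(printed_by_reader) == argument)  # prints True
```

The difference is exactly the argument, so the address the reader prints is the base plus whatever was supplied on the command line.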
The code for running the PoC is available on GitHub, and the README is a great place to start reading about how they performed the exploits. In particular, it does in fact seem like you need to provide the physical address where your secrets from the other process are being stored to perform anything resembling the login screen attack shown in the video.
So, I get why things are bad, but I’m still not totally sure why the sky is falling. Any remotely interesting attack seems to need some significant extra information to be executed in a reasonable amount of time. Just food for thought!