So an OS kernel constitutes a reasonably sized TCB, but a browser is "massive" and "very expensive" to check? Should I look for that on #StuffHNSays, or is there one just for DJB?
Stick a JIT-compiled JavaScript engine in the Linux kernel, give it bindings to everything from the tty drivers to the scheduler, and add an "eval" system call, then see if you still think an OS is harder to secure than a browser.
Comparatively speaking, the functions of an OS kernel are fairly small and contained. Sure, you can find bloated kernels out there and small secure browsers, but I can read the kernel portion of the POSIX specs without crying. I can't say the same for all the browser specifications out there.
> I can read the kernel portion of the POSIX specs without crying.
POSIX is an incredibly poor guide to what a modern OS does. Even in the parts related to storage, they're hopelessly outdated - either forcing onerous compatibility with systems that haven't been sold in twenty years, or completely failing to address issues that have gained in importance over that same time. The network parts are even worse. There isn't even a POSIX spec for virtualization, and internal issues such as NUMA or PCI/USB enumeration were never under their purview to begin with.
If the requirements for a modern kernel were specified to the same degree as those for a modern browser, they'd be much longer. Note that external protocols and formats are not part of the browser's own specification unless you apply the same rule for external protocols and formats used by the kernel - pulling in a ton of stuff from IETF, PCI SIG, IBTA, SATA-IO, ANSI T10, etc. It's only fair, but again you'd end up with something far longer than the browser equivalent.
The responsibilities of an OS kernel are far deeper and broader than almost anyone out in browser-land thinks. I'm quite sure that works the other way too. Just because it's harder to see detail from further away doesn't mean it's not there.
The bulk of the code in a kernel is device drivers for hardware you don't have. The software attack surface is reasonably small and well defined in most cases. As opposed to a browser, where close to 100% of the code is exposed to hostile network traffic.
I don't believe the kernel attack surface is as small as you seem to think, nor is the browser attack surface as large.
Let's look at the kernel first. Yes, the LOC numbers are bloated by bazillions of drivers, but even the core components that most people use are pretty large. I just counted on Linux 3.16 and there are 1.2M lines in components I definitely use on one machine. That's excluding anything in drivers/; include what I use from there and we're probably over 2M. I don't think any reasonable person can claim that 2M lines is a small attack surface, so DJB is already wrong.
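For anyone who wants to reproduce a count like this, here is a rough Python sketch. It is an assumption about method, not a record of how the numbers above were obtained: the function name `count_lines`, the default extensions, and the choice to skip only top-level directories are all hypothetical, and it counts raw physical lines, including comments and blanks.

```python
import os


def count_lines(root, extensions=(".c", ".h"), exclude_dirs=("drivers",)):
    """Count physical lines in source files under root.

    Hypothetical helper: skips the named top-level directories
    (e.g. drivers/) and counts every line, comments and blanks included.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Prune excluded directories at the top level only, in place,
            # so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
        for name in filenames:
            if name.endswith(tuple(extensions)):
                with open(os.path.join(dirpath, name), "rb") as f:
                    total += sum(1 for _ in f)
    return total
```

Running it once with `drivers` excluded and once with `exclude_dirs=()` gives the two ends of the range being argued about. Restricting it further to "components I definitely use" would need a per-machine list of directories, which this sketch does not attempt.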
Now let's look at Firefox. Downloading 30.0 I see ~13M lines of C, C++ and JS. Subtract at least 2M for the build system itself, build-time-selectable components, dev tools, and NSS which is part of the OS attack surface as well. There are a bunch of other components that I'm sure most people never use, or even have explicitly disabled/blocked, but let's leave them in so we have 11M. Hey, 11M is still larger than 2M, so you and DJB must be right. Not so fast. Attack surface is not just about LOC, and certainly not single-component LOC. Let's look at some other confounding factors.
* The exact same algorithm, with the exact same attack surface, can be expressed in more or less verbose form. The Mozilla code is written in a more verbose style, but shouldn't necessarily get credit for that. In many ways, that's likely to make auditing harder.
* The kernel code is harder to audit. There are fewer people even remotely able to do it, it contains more low-level trickery requiring expertise in a particular platform to analyze, it has more asynchronous/reentrant/etc. control flows that defeat static analysis, etc. Line for line, analyzing kernel code is many times harder than analyzing Mozilla code.
* Across an entire enterprise, the number of platforms and drivers that need to be considered for the kernel - from phones and embedded devices to servers and desktops - increases significantly. So does the attack surface, and real security isn't about single machines in isolation. The corresponding increase for Firefox is very small.
* An operating system is more than a kernel. Even if we only include the utilities that are essential to boot a system and do minimal work on it, we might blow right through that 11M mark.
So yes, if you pick silly definitions and squint hard enough, DJB's statements about the two attack surfaces might be pedantically correct. They're still not practically correct. He frames it as "easy" vs. "hard" - a qualitative policy-driving distinction - and that's misleading at best. Even if you can't accept that he got the relative difficulty exactly wrong, it's clear they are well within the same ballpark. The supposed continental divide that DJB uses to justify the rest of his argument is in fact vapor-thin, and deserves derision.
As someone who does this work for a living, I don't think any of the factors that you employ to bridge the gap from 2MM to 11MM are valid:
* Mozilla's code isn't larger because it's "more verbose"; in fact, kernel constraints often make kernel code more verbose --- for instance, the lack of a full-featured standard library for the kernel means basic algorithms are often repeated, and everything is done up in fiddly structures to try to minimize memory footprint and maximize locality.
* The kernel code is not harder to audit. Kernel concepts can be harder to work with; there aren't that many developer/auditors that understand the implications of inconsistent TLBs, or for that matter what a TLB is. But that describes a small fraction of the kernel code overall. The code itself is straightforward.
* I don't even know what your third point means.
* An operating system is more than a kernel, but the TCB of an operating system is the kernel, and that's what we're discussing.
But despite the yawning gap between the (huge) size of Firefox and the (relatively small) size of the Linux kernel, the difficulty in securing a browser isn't about code size. It's reducible to just a few issues:
* A collection of rich content languages (particularly HTML, CSS, fonts)
* A content-controlled programming language
* The programming language runtime has object lifecycles constrained to individual pages
* The browser has an event system with state shared by every page
* The language has hooks into every feature supported by the browser
Anyone who has ever built a large-scale distributed system with, for instance, timers knows what a nightmare they are to debug. Object created, object manipulated, timer set on object, object destroyed, timer fired, segfault. Take the cross product of that bug with 150 different features. That's a small part of what browser security people have to deal with.
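To make that sequence concrete, here is a minimal Python sketch of the created / timer set / destroyed / timer fired pattern. Everything in it is hypothetical (`EventLoop`, `Widget`, `run_scenario`, `StaleObjectError`), and since Python won't segfault here, an exception stands in for the use-after-free that native code would hit.

```python
import heapq
import itertools


class StaleObjectError(RuntimeError):
    """Stand-in for the crash a native use-after-free would cause."""


class EventLoop:
    """Toy timer queue whose state outlives any one object (hypothetical)."""

    def __init__(self):
        self._now = 0
        self._queue = []
        self._ids = itertools.count()
        self._cancelled = set()

    def set_timer(self, delay, callback):
        timer_id = next(self._ids)
        # timer_id is unique, so callbacks are never compared by the heap.
        heapq.heappush(self._queue, (self._now + delay, timer_id, callback))
        return timer_id

    def cancel(self, timer_id):
        self._cancelled.add(timer_id)

    def run(self):
        while self._queue:
            when, timer_id, callback = heapq.heappop(self._queue)
            self._now = when
            if timer_id not in self._cancelled:
                callback()


class Widget:
    """Object whose lifetime is shorter than the event loop's."""

    def __init__(self, loop, cancel_on_destroy):
        self.loop = loop
        self.cancel_on_destroy = cancel_on_destroy
        self.alive = True
        self.timers = []

    def start_timer(self, delay):
        self.timers.append(self.loop.set_timer(delay, self.on_timer))

    def on_timer(self):
        if not self.alive:
            raise StaleObjectError("timer fired on destroyed object")

    def destroy(self):
        self.alive = False
        if self.cancel_on_destroy:
            for timer_id in self.timers:
                self.loop.cancel(timer_id)


def run_scenario(cancel_on_destroy):
    loop = EventLoop()
    widget = Widget(loop, cancel_on_destroy)
    widget.start_timer(10)   # timer set on object
    widget.destroy()         # object destroyed before the timer fires
    try:
        loop.run()           # timer fired
    except StaleObjectError:
        return "stale access"
    return "ok"
```

`run_scenario(False)` reports the stale access and `run_scenario(True)` does not, because the destroy path cancels its own timers. The catch is that every feature holding a reference to the object needs an equivalent teardown step, which is where the cross product comes from.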
I do this work for a living too, Tom. In fact I've been working on distributed systems longer than you. I was writing about the dangers of relying on timeouts in distributed systems a decade ago. Your appeal to authority will get you nowhere.
The problems you describe in your last paragraph aren't browser problems. They're distributed-system problems - synchronization, coordination, cache consistency. They're problems in systems that just happen to include a browser as a component, and they also happen in systems that just happen to include a kernel as a component. You don't get to count those problems as part of the browser domain and exclude them from the kernel domain. That's totally disingenuous. As a distributed file system developer I deal with exactly these kinds of problems every day, in a context where browsers are irrelevant.
The fact remains that validating either a browser or a kernel as a TCB is extremely hard. It's not one easy and one hard, so we must choose the easy one. Sure, people who know nothing about the constraints that guide kernel programming might dismiss "fiddly bits" that they don't understand as needless complexity, or turn away from them entirely to comment on the remaining "straightforward" bits as though they were the whole. Sophists might use "I don't even know" as an excuse to dismiss a relevant point instead of engaging on it. Every developer likes to think that their own domain is the most challenging and important one ever, so they can feel all elite. So be it. Still, none of that changes the fact that DJB's argument was ridiculous. The reason we can't trust the browser as a TCB is that browsers were designed to requirements that are antithetical to that purpose, not that they're actually or inherently too complex.
I do get to exclude them from the kernel domain, because the kernel doesn't export a programming language that interfaces with the scheduler and with the various object lifecycles in the kernel (not to mention the programming language's own object lifecycles) --- and there are more different kinds of objects in the browser than in the kernel.
I'm a kernel person, not a browser person. I am not good enough at software security to be a good browser security person. It's not a coincidence that the best researchers in the industry spend much of their time discovering and weaponizing browser vulnerabilities.
Bernstein's argument is, predictably, not ridiculous.
I do not see daylight between your explanation of why browsers are insecure and Bernstein's. You say browsers are insecure because they're required to be complex. Bernstein says browsers are insecure because their complexity makes it hard to isolate a TCB. Those are the same arguments, in different terms.
(My resume is pretty easy to track down if you want to see what I was up to a decade ago. When I said "do this for a living", I meant "software security research", which is my full-time job and has been for ~10 years, not systems development, my former job. Also, FWIW: prefer Thomas.)
> It's not a coincidence that the best researchers in the industry spend much of their time discovering and weaponizing browser vulnerabilities
It's not a coincidence, but complexity isn't the only possible explanation. Attackers like to go after easy targets, and nobody's saying that browsers aren't easy targets. I'd say it's because browsers are built around a fundamentally flawed security model that provides inadequate isolation between various users, resources, and activities. That could be true even if browsers were ten times simpler. Others might point to prioritizing other goals (e.g. rendering performance) over security, or even say that browser implementers are just less skilled than kernel implementers.
> You say browsers are insecure because they're required to be complex.
No. I said they had conflicting requirements, but I wasn't talking about complexity. I was talking about things like plugin extensibility and interoperability with protocols and formats that are themselves insecure. Those requirements don't have to drive the insane LOC figures for browsers, but it's hard to be secure when you're exchanging bodily fluids with every plugin writer in the world.
Instead of beating this dead horse any more, perhaps the more interesting discussion is whether browsers can be secure. Or is the fundamental concept of a "browser" - a single program to mediate all kinds of protocols, intermingled with presentation and even execution of arbitrary code all in the same address space - just a security anti-pattern at its most basic level? If you have a component that tries to pretend it's an OS, but is essentially not capable of being a TCB, maybe it's not the component but the whole architecture that's broken.
To most programmers, libc is indistinguishable from the kernel. Do you actively distinguish between sections 2 and 3 of the manual, or know which calls in section 2 are actually just libc wrappers for kernel functions that do something slightly different? Libc would more reasonably be considered part of the OS than part of the browser.
You might consider it part of the OS, but it isn't part of the kernel. For purposes of understanding whether the kernel is a "bigger" attack surface than a browser, you can't ignore the runtime.
Well, then this discussion could have been really short, because of course the browser relies on the kernel so the kernel is part of the browser's attack surface. :rolleyes:
That's quite different. The kernel has security responsibilities that are independent of userspace, and vice-versa. It's not like the browser gets loaded into kernel space.
Show me a browser with a proof of correctness? I'll show you a kernel with one: seL4.
An OS kernel (particularly a microkernel) doesn't actually have to do very much, you know.
This is first-principles brainstorming. As I've already seen, though, it's hard, almost impossible, to follow without the talk - the slides alone are missing a lot of it. I'm not sure I follow the arguments in this one either; there's just not enough there about them.