As someone who works on the Linux kernel's cryptography code, the regularly occurring AF_ALG exploits are really frustrating. AF_ALG, which was added to the kernel many years ago without sufficient review, should not exist. It's very complex, and it exposes a massive attack surface to unprivileged userspace programs. And it's almost completely unnecessary, as userspace already has its own cryptography code to use. The kernel's cryptography code is just for in-kernel users (for example, dm-crypt).
The algorithm being used in this exploit, "authencesn", is even an IPsec implementation detail, which never should have been exposed to userspace as a general-purpose en/decryption API.
If you're in charge of the configuration for a Linux kernel, I strongly recommend disabling all CONFIG_CRYPTO_USER_API_* kconfig options. This would have made this bug, and also every past and future AF_ALG bug, unexploitable. In the unlikely event that you find that it breaks any userspace programs on your system, please help migrate them to userspace crypto code! For some it's already been done. But in general, AF_ALG has actually never been used much in the first place, other than in exploits.
I don't think there's much other option. This sort of userspace API might have been sort of okay many years ago. But it just doesn't stand up in a world with syzbot, LLM-assisted bug discovery, etc.
> * The first and most important item is the access to hardware accelerators and hardware devices whose technical interface can only be accessed from the kernel mode / supervisor state of the processor. Such support cannot be used from user space except through AF_ALG.
> * When using user space libraries, all key material and other cryptographic sensitive parameters remains in the calling application's memory even when the application supplied the information to the library. When using AF_ALG, the key material and other sensitive parameters are handed to the kernel. The calling application now can reliably erase that information from its memory and just use the cipher handle to perform the cryptographic operations. If the application is cracked an attacker cannot obtain the key material.
> * On memory constrained systems like embedded systems, the additional memory footprint of a user space cryptographic library may be too much. As the kernel requires the kernel crypto API to be present, reusing existing code should reduce the memory footprint.
I can't judge whether this is a good justification, but there is one.
AF_ALG if I remember correctly predates userspace-accessible crypto acceleration and was way more important back when it meant you had actual need for "SSL accelerator" cards in servers, among other things
Yes, I remember that time, it was back when I wasn't allowed to know anything about what servers were doing other than to look it up in the internal leak, which was never maintained
Hi, embedded firmware engineer here. I give it a B-
There's a weird area between the workloads that fit on a microcontroller, and the stuff that demands a full-blown CPU. Think softcore processors on FPGAs, super tiny MIPS and RISC-V cores on an ASIC, etc. Typically you run something like Yocto on a core like that. Maybe MontaVista or QNX if you've got the right nerd running the show.
So you have serious compute needs, and security concerns that justify virtual memory. But you don't have infinite space to work with, so hardware acceleration is important. Having a standard API built into the kernel seems like a decent idea I guess.
And yet, I've never heard of AF_ALG. I've never seen it used. The thing is, if you have some bizzaro softcore, there's a good chance you also have a bizzaro crypto engine with no upstream kernel driver. If you're going to the trouble of rolling your own kernel with drivers for special crypto engines, why would you bother hooking it into this thing? Roll your own API that fits your needs and doesn't have a gigantic attack surface.
I feel like it should be possible to fulfill these advantages with a minimal, not very complex API. I.e. the grandparent's comment about IPsec implementation details doesn't make the cut, but a hardware accelerated cipher implementation does.
A hardware accelerated DMA-capable cipher implementation is an odd thing, and it’s generally not useful on its own. You might want to set up a whole chain of operations (encrypt, checksum, send to network, for example), but I’ve never encountered a case where you actually want to ask an accelerator to asynchronously encrypt application data and return the encrypted data to the application.
Unless you're pushing a ton of extra work into a network-capable accelerator, that sounds exactly like what you'd want for, e.g., an encrypted S3 implementation. You have encryption, RS encoding, striped checksumming, sending fragments to multiple hosts, some sort of potentially interesting partial failure handling, etc.
You could push that all down to the accelerator, but if there are even a few such use cases you might want a dedicated DMA-capable implementation instead.
When you can’t know the objective truth or when there isn’t one (as is the case in making decisions about security tradeoffs in software design), knowing the source of the argument is vital to interpreting its validity.
If iwd, or cryptsetup with certain non-default algorithms, isn't being used on the system, you should be fine. Not many programs use AF_ALG. It's possible there are others I'm not aware of, but it's quite rare.
To be clear, general-purpose Linux distros generally can't disable these kconfig options yet, due to these cases. But there are many Linux systems that simply don't need this functionality.
A good project for someone to work on would be to fix iwd and cryptsetup to always use userspace crypto, as they should.
I can’t comment on the ramifications, except to note that elsewhere in the thread this appears to not break anything (whether it makes userspace crypto a little less safe is academic, but that doesn’t matter if we have an easy local root shell), but I can verify the above fix does protect Ubuntu 24.04 from the exploit.
The point of noting whether it is loaded on their machine or not, is presumably to indicate that it is not normally loaded (for them), so disabling it to block the exploit should have no impact (for them).
As I understands any program code can use that socket to write to page cache memory and modify any main program. Even php code can be written for that. So it is serious problem if there is other security hole on web server.
For anyone wondering: AF_ALG is a Linux socket interface that exposes the kernel’s crypto API via file descriptors, using normal read(2)/write(2) calls for hashing and encryption.
It's already a configurable option in the kernel which can be fully disabled by distros if they wanted to provide their own compatibility layer, or just not ship any software that has a hard dependency on it.
That can be done in userspace too -- different userspace processes have different address spaces too.
The fact that the first link recommends using keyctl() for RSA private keys is also "interesting", given that the kernel's implementation of RSA isn't hardened against timing attacks (but userspace implementations of RSA typically are).
The CloudFlare blog discusses that idea when they talk about having an "agent process" to hold cryptographic material, but they list drawbacks like having to develop two processes, implement a well-defined interface, and enforce ACLs. I'm not convinced that "developing two processes" is a reason not to do it, since the kernel is effectively just the second process now, but everything else makes sense.
It's unfortunate though since this is one thing I think Windows does decently well. The Windows crypto and TLS APIs do use a key isolation process by default (LSASS) and have a stable interface for other processes to use it [0]. I imagine systemd could implement something similar, but I also know that there are very strong opinions about adding more surface area to systemd.
can you please give me a real-life example of an application, on a typical linux laptop or typical linux server, which userspace application would use this CRYPTO_USER_API ? None that I looked at seem to use it: openssl, pgp, sha256sum
As Eric has correctly stated above, we believe iwd (Intel Wireless Daemon), or rather the ell library it relies on (Embedded Linux Library) is the only relatively widespread user space application relying on it.
Isn't the better argument to ask whether there'd be benefit if all those things did?
A lack of adoption isn't apriori a good argument against an interface, and serious bugs can happen anywhere.
My personal opinion for a while has been that crypto operations should be in the kernel so we can end the madness that is every application shipping it's own crypto and trust system which has only gotten worse since containers were invented.
> My personal opinion for a while has been that crypto operations should be in the kernel so we can end the madness that is every application shipping it's own crypto and trust system which has only gotten worse since containers were invented.
There’s a valid argument here but I think that’d devolve into the DNSSec trap without both a very well-designed API and a stable way to ship updates for older kernels. If people can’t get good user experience or have to force kernel upgrades to improve security, most applications will avoid it. Things like Chrome shipping their own crypto mean that they can very quickly ship things like PQC without waiting years or having to deal with issues like kernel n+1 having unrelated driver or performance issues which force things into a security vs. functionality fight.
Which does sort of loop around to the issue of Linux not having a stable ABI as a feature I suppose which would be one way to implement it with long term compatibility on kernel modules.
But the Chrome example also highlights the problem: Chrome might ship it, but vanishingly little software is ever going to upgrade and we've got an explosion of statically linked languages now.
If Linux does that, I really hope it can be done in a standardized way that doesn't make porting to *BSD more difficult than it already might be. Standards are a good thing.
> A lack of adoption isn't apriori a good argument against an interface
I mean it kind of is (perhaps not a priori, but why is that relavent?). If something is not being used, its not meeting needs, so its just increasing attack surfaces without benefit.
> syzbot system continuously fuzzes main Linux kernel branches and automatically reports found bugs to kernel mailing lists. syzbot dashboard shows current statuses of bugs. All syzbot-reported bugs are also CCed to syzkaller-bugs mailing list. Direct all questions to syzkaller@googlegroups.com.
The primary benefit of AF_ALG is IMHO when it's combined with kernel keyrings, i.e. ALG_SET_KEY_BY_KEY_SERIAL.
To steal from the sibling post:
> * When using user space libraries, all key material and other cryptographic sensitive parameters remains in the calling application's memory even when the application supplied the information to the library. When using AF_ALG, the key material and other sensitive parameters are handed to the kernel. The calling application now can reliably erase that information [...]
It's even more than this: you can do crypto ops in user space without ever even having the key to begin with.
[Ed.: that said, maybe AF_ALG should be locked behind some CAP_*]
[Ed.#2: that said^2, I'm putting this one on authencesn, not AF_ALG. It's the extended sequence number juggling that went poorly, not AF_ALG at large. I bet this might even blow up in some strange hardware scenarios, "network packet on PCIe memory" or something like that - I'm speculating, though.]
It doesn't seem to actually get used that way in practice. ALG_SET_KEY_BY_KEY_SERIAL didn't even appear until just a few years ago. And either way, if the interface allows you to overwrite the su binary, whether it theoretically could provide some other security benefit becomes kind of irrelevant.
And, sure, if it breaks system security it's pointless. But so did "dirty pipe".
I do agree the number of issues in AF_ALG is annoying, which is why I suggested a CAP_* restriction. Maybe CAP_SYS_ADMIN in init_ns, that's kinda the big hammer.
Most of the Linux kernel crypto is not touching the TPM. If there is a TPM task, only that code should be in kernel, and it should be accessed from user space by a process with the appropriate token.
Yes, AF_ALG is exposing too many things, like authencesn, which has zero reason for being userspace accessible. It's a crypto mode specific to IPsec.
However,
> it should be accessed from user space by a process with the appropriate token.
That is AF_ALG. The operations it offers are what you need for full coverage. The issues with it are two:
- usage specific crypto in the kernel implements the same interfaces, and it doesn't have a filter for that, as mentioned above. It's not offering too many operations, it's offering too many algorithms.
- it's trying to be fast. I guess people also want to use crypto accelerators through it. (Which is kinda related to TPMs, there is accelerator hardware with built-in protected key storage...)
The CVE we're looking at here is in the intersection of both of these.
All the uses of vmsplice etc are a bit tricky, and that points to the need for a better interface. But given you're using splice, why not do the crypto in user space? A belief that it is better to be fast and buggy than safe and slower?
Still a risk that some admin-enabled method (like enabling an IPsec VPN) provides a path to it, but would reduce the potential for crafting weird inputs.
That's really orthogonal (and you can already do io_uring with AF_ALG, at the end of the day AF_ALG is just recvmsg() and sendmsg(), which work just fine in io_uring...)
YAGNI stocks are rising, Gentoo devs that compile their own kernel probably yeeted this module. Alpine, and MUSL deviants are probably immune to this downswing.
DRY looking very bearish, do repeat yourself, do build your own, do use userspace tools even if the kernel has its own version. Not as big a hit on the DRY philosophy as those pip and npm supply chain attacks last couple of weeks though.
I think the issue here is not "Don't Repeat Yourself", but "Don't Reinvent the Wheel". If your wheel is just a circle of wood, you're better off building it yourself than hiring a skilled (or sometimes not so skilled) laborer. Too much overhead and risk.
I love this. I think everyone in software should be feeling a tinge of “we should trim the fat” right now - get rid of as much of the old and infrequently used/tested code as we can. Push users towards the better tested alternatives.
The design philosophy of mainstream Linux distros is not like OpenBSD.
Linux distros go to market as maximally capable, maximally interoperable, and maximally available for whatever the users want to do. So there is a lot of "shovelware" that is unnecessarily installed with your base system. A lot of services are enabled that you don't need. A lot of kernel modules are loaded or ready to spring into action as soon as you connect hardware that the kernel recognizes.
All this maximizing also increases the system's attack surface, whether local or over the network. Your resources, time and effort increase, to update the system and maintain all those packages. The TCO is high.
With OpenBSD, the base system is hardened and the code is audited with security in mind. They only install or enable essential functions. So it's up to the user to dig in, customize it, and add in features that are needed.
The good news is that you can do some after-market hardening. Uninstall software that you're not using, and disable non-essential services. Tune your kernel for special-purpose, or general-purpose, but not every-purpose.
There are now special distros for containers and VMs with minimal system builds. They are designed to be as small and lightweight as possible. That is a good start in the right direction.
Thanks for the explanation. I am wondering if it is possible or does it make sense to have a modular linux that does not have these attack surfaces enabled by default. Alpine is my default solution for most Linux use cases (except when I need GPU support).
Not "by default", but still Gentoo. My USE= is several lines worth of -this -that -all-the-things. I got rid of wayland, pipewire, pulseaudio, avahi and a shitload of other stuff I don't need.
PulseAudio applications can still produce (but not record) audio through apulse and my handcrafted asoundrc
I think it would be reasonable to deprecate af_alg in favor of a character device. It's more accessible that way. The downside is that the maintainers hate adding new ioctls. I think that's fair. But I don't think a "regular" device node would cover the functionality userland expects.
That said, elsewhere ITT it's pointed out there are only a few use cases so far.
Yeah, so, as we just learned (dirty.frag) the issue isn't algif_aead, it's authencesn and you can exploit it through plain network sockets rather than AF_ALG.
Many things, such as ksmbd seems ill-advised when looked at from security. New AI driven exploits
era will likely make projects more wary to adding functions.
Indeed, iwd is the main reason why general-purpose Linux distros can't disable AF_ALG yet. But many Linux systems are more specialized and don't have wireless connectivity, or they use another wireless daemon such as wpa_supplicant which doesn't have this issue.
I'm hoping we can get iwd fixed to use a userspace crypto library, as well. This is something that people could help with.
iwd also runs as root, so it would be okay with a CAP_SYS_ADMIN permission check if one were introduced, I think.
can you please give me a real-life example of an application, on a typical linux laptop or typical linux server, which userspace application would use this CRYPTO_USER_API ? None that I looked at seem to use it: openssl, pgp, sha256sum
It'd make a lot of sense to sandbox AF_ALG, then, wouldn't it? At least for userspace-driven invocations. Let the kernel keep its current code-path for kernel-driven invocations, but have the same code unit files also build some other sandboxed form, to be invoked by the crypto-accelerating syscalls.
If these syscalls are used by userspace as rarely as you say, the performance impact of this kind of sandboxing wouldn't matter much. And maybe there could be a KCONFIG/boot flag to switch back to using the un-sandboxed code path for userspace invocations too, for enterprises stuck with old software who really care.
---
My own thought process on how this could work below (but I'm not a kernel contributor, so you can probably immediately picture a design better than I can):
The naive way to do this, would be for the kernel build process to emit a separate AF_ALG userland IPC server as an additional build artifact; to get distros to package this IPC server as a component package of kernel packages; and to set up the sandboxed AF_ALG "kernel bridge" so that it proxies calls through to this IPC server if it exists, and errors out otherwise. (Basically like kfuse, except in this case the only "FUSE servers" are first-party.)
But that's a bit painful, organizationally. Puts a lot of work on the distro maintainers' shoulders, that they might just not bother doing. Prone to error. I think there are better alternatives.
1. Maybe the userland syscalls that rely on AF_ALG could instead ground out inside the kernel in a copy of AF_ALG that's been compiled to eBPF? Then that eBPF bytecode could just be embedded into the kernel.
2. Maybe the Linux kernel could consider a facility that would enable it to act as a hybrid microkernel (similar to macOS's XNU) — with arbitrary static sections of the kernel image/kernel modules [or perhaps standalone static ELF binaries embedded within kernel/kmod .data sections] being spawned not as supervisor-mode kthreads doing their own autonomous thing, but rather as unprivileged user-mode kernel threads, running as IPC-servers for the rest of the kernel to talk to?
- The rest of the kernel could talk to these "userspace kthreads" via some nonblocking IPC mechanism; but this mechanism wouldn't need to be exposed to userland the way macOS's XPC is; it could be kernel-to-kernel only (where these "userspace kthreads", despite being in userspace, are still fundamentally kernel threads, and so get to participate in it.)
- Also, these "userspace kthreads", when they're the active scheduled task, would have the kernel image's read-only sections [or their binary's sections, from within the kernel's .data section] mapped into their address space, since that's the binary they're executing against. But they wouldn't inherit [or the spawning mechanism would actively prune from their task struct] the rest of the kernel's mappings. So they'd have to either use the IPC mechanism, or use regular syscalls, to do anything with the kernel, just like any userspace task.)
I don't see those eBPF or microkernel ideas as being particularly realistic! But there are some simple ways AF_ALG's attack surface could be reduced (as an intermediate step to disabling it entirely), like requiring CAP_SYS_ADMIN and/or limiting the algorithms to a specific list.
The algorithm being used in this exploit, "authencesn", is even an IPsec implementation detail, which never should have been exposed to userspace as a general-purpose en/decryption API.
If you're in charge of the configuration for a Linux kernel, I strongly recommend disabling all CONFIG_CRYPTO_USER_API_* kconfig options. This would have made this bug, and also every past and future AF_ALG bug, unexploitable. In the unlikely event that you find that it breaks any userspace programs on your system, please help migrate them to userspace crypto code! For some it's already been done. But in general, AF_ALG has actually never been used much in the first place, other than in exploits.
I don't think there's much other option. This sort of userspace API might have been sort of okay many years ago. But it just doesn't stand up in a world with syzbot, LLM-assisted bug discovery, etc.