Since Linux 6.9, LUKS suspend stopped wiping disk-encryption keys from memory (mathstodon.xyz)
521 points by IngoBlechschmid a day ago
kokada a day ago
While it is certainly an interesting bug, I kinda feel that the title is click bait? Because this `cryptsetup luksSuspend` from what I understood is not really officially supported but an extension done in Debian, so if anything this regression only affected Debian? I am not sure if you can blame the kernel for something that is not supported or even widely tested.
I still find this impressive, and it is nice that we now have a test (NixOSTests BTW are awesome, I agree with OP) to avoid this regression from coming back. But from the title it seems to be a widespread issue, not something that affects only one Distro.
IngoBlechschmid a day ago
Sorry, aimed for a technically precise title and didn't want to bait clicks.
Yes, this does not affect people on stock configurations for the plain reason that they wouldn't expect the volume key to be safe during suspend anyway.
Debian's solution was ported to several (most?) other distributions and I guess quite a few people maintained private ports.
The thread-keyring(7) manpage promises: "A thread keyring is destroyed when the thread that refers to it terminates." For their key upload (from userspace to kernelspace) mechanism, the cryptsetup project relied on this property; but kernel 6.9 introduced a regression invalidating this property.
kokada 9 hours ago
Thanks for the explanation, I am really not that familiar with `cryptsetup luksSuspend` and it is the first time I ever heard it exists.
Like other people in this thread, I was confused at first ("wait, how would this work? If you clean up the keys from the disk during suspend, you couldn't access the disk anymore after resuming"), but after reading your thread on Mastodon plus other comments here it eventually became clear that this is a special case: you need both the correct patches and the correct setup to use `cryptsetup luksSuspend` in place of the normal suspend.
Can I ask one question? Why not use hibernation at that point? The reason I generally suspend to RAM is exactly that my password is long and annoying enough to type that, if I know I am going to use the device soon, I prefer suspend over hibernation. Yes, technically resuming from suspend is faster, but it is also less secure (there are other interesting things in memory besides the LUKS keys) and it uses more power.
cyphar 8 hours ago
IngoBlechschmid 6 hours ago
cyphar 8 hours ago
I'm confused why you're saying this is a Debian-specific thing -- luksSuspend is upstream and was added back in 2009[1] in release v1.1.0[2]. I've used it (though somewhat sparingly) on Arch and openSUSE in the past and it definitely exists on non-Debian distributions. Maybe you're thinking of the automatic integration with system suspend? If so, that's kind of beside the point -- luksSuspend documents itself as clearing the keys from system memory, which stopped happening in Linux 6.9 due to the referenced refactor patch.
Though it should be noted that it seems that this is actually a bug in cryptsetup in that it was depending on very specific lifetime behaviour of kernel keyring keys, when it arguably should've been more explicitly cleared by userspace[3].
[1]: https://gitlab.com/cryptsetup/cryptsetup/-/commit/3cea5dcc7b... [2]: https://gitlab.com/cryptsetup/cryptsetup/-/blob/main/docs/v1... [3]: https://gitlab.com/cryptsetup/cryptsetup/-/merge_requests/93...
nicce 21 hours ago
Hmm, the subcommand is in the official cryptsetup repository and the description matches? https://gitlab.com/cryptsetup/cryptsetup/-/blob/main/man/cry...
michaelmrose 14 hours ago
I've used this feature on Arch; it's available in bog-standard LUKS, but as far as I know it's not used by default when you suspend.
You are thinking of the machinery that actually triggers suspend-to-RAM after a luksSuspend in a way that is useful. That was first a Debian-targeted thing, then ported to Arch, and is used by default by neither.
NooneAtAll3 a day ago
what debian version first shipped 6.9?
tremon 19 hours ago
Only the current stable (13/trixie); bookworm shipped with 6.1 as the main kernel (with 6.12 available in backports).
bitbasher a day ago
I don't see any other way? When you sleep (suspend to RAM), everything is stored in RAM and is encrypted but the master key is present in kernel memory (if I recall correctly).
However, if you hibernate (suspend to disk) the entire contents of RAM (including the master key) is written/encrypted to disk and the RAM is cleared.
When you wake the machine up you have to re-enter the passphrase to decrypt the master key to re-load disk contents back to memory.
IngoBlechschmid a day ago
Yes, if you simply suspend your laptop on most stock Linux distributions, then everything including the master key is still kept in memory. But Debian pioneered the (optional) cryptsetup-suspend addon. This issues a luksSuspend command which is supposed to wipe the key from memory, and on resume asks you to resupply your passphrase.
Up to kernel 6.8, this worked as described; starting with kernel 6.9, it silently didn't.
herywort a day ago
So you would still be asked for a passphrase, even though it's already available?
IngoBlechschmid a day ago
Groxx a day ago
I've been wondering why hibernate didn't work with encryption, because this seems like the extremely obvious way to handle it, but I have struggled to find anything about it for years - glad to hear it does exist!
But yeah, also rather obviously it's inherently a bit leak-prone. Though it seems probably pretty simple to test, just hibernate and scan all stored data. They could probably even do it on shutdown, as a hash of the key data would be sufficient to detect the key.
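A minimal sketch of the hash-scan idea above, assuming the test harness knows only a SHA-256 digest of the key and its length, and that the captured data (a hibernation image or memory dump) is available as a byte string. The function name `find_key_by_hash` and the sample data are hypothetical; a real scan over gigabytes would want something faster than a byte-by-byte window.

```python
import hashlib


def find_key_by_hash(dump: bytes, key_digest: bytes, key_len: int) -> int:
    """Return the offset of the first key_len-byte window whose SHA-256
    equals key_digest, or -1 if no window matches."""
    for offset in range(len(dump) - key_len + 1):
        window = dump[offset:offset + key_len]
        if hashlib.sha256(window).digest() == key_digest:
            return offset
    return -1


# Hypothetical test data: a 64-byte "volume key" and two fake dumps.
key = bytes(range(1, 65))
digest = hashlib.sha256(key).digest()  # the scanner only needs this digest

clean_dump = b"\x00" * 256
leaky_dump = b"\x00" * 100 + key + b"\x00" * 92

print(find_key_by_hash(clean_dump, digest, len(key)))
print(find_key_by_hash(leaky_dump, digest, len(key)))
```

The scanner never has to hold the key itself, only its digest, which is the property the comment points at.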
reirob 8 hours ago
dathinab a day ago
makes me wonder if there is potential for a more "mainstream"/default-friendly version of this, where the key is encrypted using the TPM during suspend even if the TPM isn't a possible unlock from cold boot (i.e. no TPM-encrypted volume key in the LUKS headers/metadata, only temporarily in memory during suspend)
or the alternative (for more convenient usage) for single user systems auto login on boot + use disc password for doas/sudo?
naturalmovement a day ago
FYI: VeraCrypt is not the de facto encryption software for Windows.
IngoBlechschmid a day ago
michaelmrose 14 hours ago
The luks feature is not Debian specific
dist-epoch a day ago
Both Intel/AMD CPUs produced in the last 5 years or so support full transparent (to the OS) memory encryption. So cold boot attacks are a thing of the past if you enable this feature (it's typically disabled because it reduces RAM speed by about 0.5%).
tredre3 a day ago
The impact on performance is more along the lines of 1-2% on AMD, though it likely varies by generation (I did extensive benchmarking on Renoir wrt throughput/latency/GPU). But yes, small enough to be insignificant unless you run LLMs or game on the iGPU. I imagine that it also uses marginally more power.
AMD also has a second encryption mode where the OS decides what gets transparently encrypted, it doesn't have to be everything. But that mode is poorly documented (or at least the documentation isn't accessible to peasants like me)
m3047 a day ago
Recent news is that this isn't shipping on some consumer-grade CPUs from AMD. There, made it explicit enough there's no room for conversation. Here's the link:
https://arstechnica.com/security/2026/06/users-cry-foul-afte...
dlcarrier a day ago
CodesInChaos a day ago
I don't have to re-enter my boot password after Sleep, so obviously the encryption key is still in memory.
wrs a day ago
Obviously your distro isn’t using cryptsetup-luksSuspend.
unethical_ban a day ago
Correct.
The point being made is: If one isn't re-entering their passphrase after suspend, how are they surprised that the encryption keys are somewhere in memory during suspend?
edit: I see now that the prompt was being given and the keys still resided in memory.
ksbd-pls-finish a day ago
akerl_ a day ago
weaksauce a day ago
killerstorm a day ago
tombert a day ago
I don't think this bothers me.
The only reason that I do the disk encryption is so that I don't have to worry about people going through my laptop to steal tax documents and/or credit card stuff when I sell the laptop. I of course also wipe the laptop too, but I figure that if the data is encrypted at the drive level then there's very little risk of anyone being able to use some kind of forensics tool and recover data.
chlorion 21 hours ago
You can just wipe the luks header as a nice middle ground.
Luks uses an anti-forensics algorithm that requires the entire volume key being available to unlock the disk at all (it combines the blocks of the key with some diffuse algorithm and xors stuff together to form the actual master key), so in theory you can just clear one sector of the volume key and the whole thing should be unrecoverable.
What I mean is that if even one block of the key is missing you can't guess the rest easily.
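A simplified sketch of the anti-forensic split described above, to show why losing a single stripe is enough. It is loosely modeled on the general AF-splitter idea (random stripes chained through a hash-based diffusion step, with the key XORed into the last stripe); it is not cryptsetup's implementation, and the choice of SHA-256, the stripe count, and all function names are assumptions made for illustration.

```python
import hashlib
import os

DIGEST_LEN = hashlib.sha256().digest_size


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def diffuse(block: bytes) -> bytes:
    """Hash each digest-sized chunk together with its index, so that a
    change anywhere in the input scrambles the corresponding output."""
    out = b""
    for i in range(0, len(block), DIGEST_LEN):
        chunk = block[i:i + DIGEST_LEN]
        counter = (i // DIGEST_LEN).to_bytes(4, "big")
        out += hashlib.sha256(counter + chunk).digest()[:len(chunk)]
    return out


def af_split(key: bytes, stripes: int) -> list:
    """Spread key over `stripes` blocks; all of them are needed to merge."""
    blocks = [os.urandom(len(key)) for _ in range(stripes - 1)]
    acc = bytes(len(key))
    for block in blocks:
        acc = diffuse(xor(acc, block))
    blocks.append(xor(acc, key))  # last stripe hides the key behind acc
    return blocks


def af_merge(blocks: list) -> bytes:
    """Recompute the chained value and unmask the key from the last stripe."""
    acc = bytes(len(blocks[0]))
    for block in blocks[:-1]:
        acc = diffuse(xor(acc, block))
    return xor(acc, blocks[-1])


key = os.urandom(32)
stripes = af_split(key, 8)
damaged = list(stripes)
damaged[3] = bytes(32)  # simulate wiping a single stripe
```

Merging the intact stripes gives back the key, while merging `damaged` yields unrelated bytes, because every later step of the chain depends on the wiped stripe.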
bluebarbet a day ago
Assuming the encryption key is strong, the wiping is theoretically redundant.
adrianmonk a day ago
And assuming the crypto algorithm has no fundamental flaws, it's applied correctly, and the software implementation has no bugs.
All of which are things people have on occasion believed to be true and found out later they were wrong about.
tombert 21 hours ago
tombert a day ago
Agreed. It's also very low effort and as such I'm ok with the redundancy.
bluebarbet a day ago
bbminner a day ago
I am far from a security expert, but from the number of "we missed a single line C check across files during refactoring" critical security bugs discovered on a regular basis these days, the whole premise of a "giant secure open source C codebase" seems questionable. It is not specific to C of course, but invariants are arguably even harder to enforce and track consistently (esp under changes to code) in C. Unsure if FP with invariants encoded in types is a practically feasible scalable solution either. Model checking? [LLM] fuzzing? Fewer primitives with clear boundaries? Is that how seLinux was "checked"?
fsddfsdfssdf a day ago
While I can see the shortcomings of C and generally don't recommend it for new projects I don't see this particular bug as a good example of something Rust's borrow checker or some other language's type system will catch. I don't think even static analyzers can catch this.
It's basically something like this:
original: DoTheThing()
new: DoTheThingSlightlyDifferentButKeepMyCredentialsAlive()
fix: DoTheThingSlightlyDifferentButDoInFactNOTKeepMyCredentialsAlive()
In my experience a substantial portion of gnarly bugs come down to a violation of a high-level system invariant and those do not strike me as something that can be automated. Even with something like Lean you can prove your program satisfies certain properties but you need to have thought about those properties in the first place. The proof doesn't discover the invariant for you.
If you had thought about the relevant security property, you could have written a regression test for it, which is not hard. IMO the really hard part isn't expressing the implementation safely; it's the realization that this was a property the implementation needed to preserve.
bbminner a day ago
I agree re Rust vs C - this is not (only) a language issue. What would (roughly) the invariant be here?
In another thread comment below i argue that maybe the system (OS) itself is so complex that it lacks clear contract / the contract evolves too quickly over time (as other parts of the code need to change the given piece of code to extend it to their use case) and that defies clear encoding?
Or we lack easy enough means to describe specs? I tried reading jepsen spec earlier today and despite it being an "integration test" of sorts, it is far from "simple".
Can an entire OS or a system of comparable complexity be decomposed into objects simple enough that their entire intended behavior (with all edge cases) can be explained in a paragraph of human text + half a screen of dense behavioral "spec" - if i do X and do Y, Z should come out / hold _no matter what happens in-between_. Or that's what asserts + fuzzing is effectively supposed to do? Is there a clear distinction between invalid input and failed invariant in typical C code? I guess error code vs seg fault?
estebank a day ago
This is in effect a state machine, and when you have a type system more complex than C's you can encode state transitions in the type system (either by having state transitions explicitly return a new return type or by using sum types). You still need to architect the system to encode the invariants in types. No language will fix all logic bugs for free. But you can leverage language features to reduce their number.
fsddfsdfssdf a day ago
J_Shelby_J a day ago
WhitneyLand a day ago
The premise of a secure open codebase is fine.
The problem is being more auditable does not automatically make it more audited.
There have to be enough people with skill taking enough time to work on it.
pixl97 a day ago
If you think open source is bad, wait till you see enterprise code. I'm talking full auth bypass due to the stupidest crap. You can do that in any language if you have fools working on the code base.
620gelato a day ago
danudey a day ago
AlienRobot a day ago
pjdesno a day ago
To translate to Rust, it would have been "we missed a single line Rust check"...
This is a bug involving intersecting concerns and a deficit of cross-domain knowledge. It probably would have been the same in Lisp or assembly language.
dwattttt 21 hours ago
I think https://news.ycombinator.com/item?id=48766436 is a practical answer. A marker that can't be forged (without explicitly malicious code being written), which is needed by the "yield to suspend/shutdown" function.
Missing the "wipe key from memory" means you don't have the marker, and trying to continue the suspend/shutdown will fail to compile, because you don't have the marker.
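A rough sketch of the marker idea, with the caveat that Python cannot make the missing marker a compile error: here the check happens at runtime (a static type checker could flag it earlier via the annotation). The names `KeyWiped`, `Keyring`, `wipe_key`, and `suspend` are hypothetical and only illustrate the shape of the technique, assuming the key lives in a mutable buffer.

```python
class KeyWiped:
    """Proof token: holding one means wipe_key() has run."""

    _secret = object()  # private sentinel; only wipe_key() should pass it

    def __init__(self, secret: object) -> None:
        if secret is not KeyWiped._secret:
            raise TypeError("KeyWiped can only be obtained from wipe_key()")


class Keyring:
    def __init__(self, key: bytes) -> None:
        self.key = bytearray(key)


def wipe_key(keyring: Keyring) -> KeyWiped:
    """Overwrite the key in place and hand back the proof token."""
    for i in range(len(keyring.key)):
        keyring.key[i] = 0
    return KeyWiped(KeyWiped._secret)


def suspend(proof: KeyWiped) -> str:
    """Refuse to continue unless the caller proves the key was wiped."""
    if not isinstance(proof, KeyWiped):
        raise TypeError("suspend() requires proof that the key was wiped")
    return "suspended"


ring = Keyring(b"volume-key")
state = suspend(wipe_key(ring))
```

A code path that skips `wipe_key()` has nothing valid to pass to `suspend()`, so the omission surfaces at the call site instead of going unnoticed.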
russdill a day ago
The lesson here is that if a feature does not (at a minimum) have an associated test case, it is not actually a feature.
fsddfsdfssdf a day ago
Yes, I agree. I find the addition of the regression test the true long-term fix. The code is just an opaque incantation that may or may not preserve some property we find worth preserving and we have no way of knowing it keeps preserving it over time as other parts of the system change.
The test actually proves it and while it too can change it has more staying power because it's expressed at a higher level of abstraction ("random arcane weird C shit" in the case of code versus "does this property hold" in the case of a regression test).
bbminner a day ago
moritzwarhier a day ago
> The whole premise of a "giant secure open source C codebase" seems questionable
Because code review is sometimes not much different from an idealized version of the halting problem, where you would have access to a formalized version of a specification.
In other words, there is no strict definition of what is a security issue.
bbminner a day ago
On the other hand, both (halting and spec adherence) are checkable under compute and space constraints though? :) I'd say the biggest hurdle is the means to describe the spec in a way that is easy enough for a human to produce to make it feasible.
Not a DB person either, but things like TLA+ seem very hard to write even with LLMs. Behavioral tests with an enumerable number of random paths to take (aka model checking - eg jepsen) seem more feasible. Although you can't check internal properties of the system (string `pass` or any of its copies or parts are not held anywhere in memory at any point between lines A and B) unless we can check that two memory dumps are indistinguishable with different pass strings (assuming we abstracted away storage devices in a test environment). Also not sure if it's "easy enough" to write such tests either.
Maybe the reason is that OS domain objects / primitives are too complex and not "isolatable" enough / lack a clear contract at all? (Hence multi file refactorings that break invariants.)
lazide a day ago
In open source, someone (many, many someones) can at least check.
Closed source…..
Twirrim a day ago
Not sure why you're getting downvoted, this is the entire point of open source.
Does such a bug exist in Windows? OSX? Who checks? If someone finds the key in memory, can they tell what conditions might be causing it and where?
Their only recourse under those situations is to hand it off to the OS Vendor and trust that what they implement does solve the problem, and trust that it wasn't a deliberate back-door that is now being replaced by another back-door.
charcircuit a day ago
deepsun a day ago
"Million eyeballs" argument was always kinda meh.
hugo1789 a day ago
Maybe, but still a little better than closed source like Windows. Every time someone asked me if I could hack my way into their Windows PC, I told them, "After all it's Windows, how bad can it be?" I have been doing that for 25 years and am still waiting for a Windows machine that doesn't open... On the other hand, I failed to open about 50% of the Apple devices I was asked to open and about 10% of the Linux machines. (Not because Linux is insecure by itself, but because most Linux distros install with insecure defaults and users don't care.)
deepsun a day ago
moktonar a day ago
Did the Feds desperately need a way of getting the key? Is this a bugdoor? Have the commits been traced? Recently I’ve been seeing this pattern a lot and I’m starting to be a little bit suspicious. Maybe it’s because people are more sensitive to this and post more about it?
aniceperson a day ago
it is a regression. the user space application also would silently fail, it is a chain of oversights. also having the encryption keys in memory does not mean you can extract them, it is more of unnecessarily letting it there indefinitely, not having it where it shouldn't be.
procaryote a day ago
The whole point of luksSuspend is to not have the encryption keys in memory, as that actually does mean they could be extracted by an attacker who has taken the hardware.
johnathan101 a day ago
This is one of those regressions that's easy to miss because everything still "works." Security bugs often don't announce themselves.
IngoBlechschmid a day ago
Right! Which is why integration tests for these kinds of features are all the more important.
It was also fun to write, and enabled git-bisecting to isolate the specific kernel refactoring which introduced this bug: https://github.com/NixOS/nixpkgs/pull/532499
whimsicalism 19 hours ago
AI reply, triggered my spidey senses
foltik 19 hours ago
Yup look at the rest of their comments
mrob 9 hours ago
fpoling a day ago
On my laptop with Fedora I just configured Linux to hibernate to disk after 15 minutes of suspend. Powering memory off ensures that bugs like this Debian-specific one would not matter.
Plus, what the Debian extension to Linux tooling does is nice in theory, but in practice, if one really worries about cold-boot attacks, then all keys and important documents have to be wiped from memory, not only LUKS keys.
So hibernating is really the only proper way to protect against cold boot.
IngoBlechschmid a day ago
> So hibernating is really the only proper way to protect against cold boot.
I agree; or resurrecting FridgeLock: https://www.sec.in.tum.de/i20/publications/fridgelock-preven...
fpoling a day ago
Interesting idea. On the other hand on the latest SSD with hardware encryption the raw disk speed under Linux can be over 5 GB/s so on my laptop with 64 GB of RAM the full restoration from disk takes like 45 seconds. With LUKS it is like 2 times slower. Which is not a problem at all. So I do not see much value in memory encryption in suspend.
killerstorm a day ago
Hmm, where does it get a key to decrypt memory on resume?
AFAIK it's practical only if you make use of TPM. And if you do, you're basically at mercy of TPM.
teravor a day ago
> where does it get a key to decrypt memory on resume?
you enter it...
storus 17 hours ago
LUKS still keeps unencrypted header on the harddrive; real men use plain dm-crypt instead! Plausible deniability compatible.
bawolff 14 hours ago
Ah yes, it is very plausible you just keep a partition of hundreds of gb of perfectly random data for no reason at all.
miki123211 12 hours ago
Imagine what this HN thread would have looked like if this vulnerability had existed in a proprietary OS.
The top-level comment would surely have been about how Applosoft doesn't care about software quality any more and "that's what you get if you allow vibe-coded slop into your OS". The one below it would have been a crazy (everywhere else, not crazy for HN) conspiracy theory about the surveillance industrial complex and the NSA.
snmx999 12 hours ago
Why is something this important not tested with every build?
deng a day ago
> Except that, for more than two years, the encryption key remained resident in memory across suspend, leaving it there for the taking by anyone who seized the still-powered laptop.
I don't get it. Obviously, the laptop is locked when it resumes, how is that key "for the taking by anyone"? I'm not saying it is impossible to read out RAM from a locked laptop, but surely not by "anyone".
jakewins a day ago
There are attacks that allow dumping RAM if the device is powered on though and you have physical access. Depending on config it may be very easy (just plug in a dumper over Thunderbolt on USB C and do direct memory access) or hard (freeze and swap physical RAM to an unlocked machine).. but the idea was defense-in-depth here; a well configured device should both be hard to dump RAM on and it should not give encryption keys if an attacker succeeds.
nicce a day ago
Anyone with physical access. I think it is understandable from the phrase.
There is a common misconception about how lock screens in general work - they usually just prevent using the current hardware and software as-is to access the running OS. But the disk encryption is the main thing that prevents modification of and other kinds of access to the actual data. And if the disk encryption key is lying in memory, then the disk encryption is effectively bypassed if someone can access the machine physically, assuming that there are no sufficient tampering protections in place for that machine.
acdha a day ago
Anyone with physical access, significant tools, and experience. The FBI has people who can pull data out of memory after freezing the RAM but the average laptop thief doesn’t so how serious this is depends significantly on your threat model. If you’re not a major criminal, bitcoin whale, or intelligence target this is almost certainly academic.
deng a day ago
bluebarbet a day ago
deng a day ago
> Anyone with physical access. I think it is understandable from the phrase.
Sorry, I'm probably dense, I still don't get it. You steal a laptop, you open it, the screen is locked with a password/fingerprint whatever. How do you read out the RAM from that laptop?
IngoBlechschmid a day ago
john_strinlai a day ago
saidnooneever a day ago
you dump the physical memory, then decrypt the disk offline
teravor a day ago
on the subject of encryption keys and memory there is something you can do:
- if your CPU supports it, enable memory encryption.
- if your TPM module supports this look for MemoryOverwriteRequestControl & MemoryOverwriteRequestControlLock (/sys/firmware/efi/efivars/) and toggle them. make sure that your computer always reboots and never powers off. memory will always be wiped on boot.
someothherguyy a day ago
bluebarbet a day ago
Proper capitalization makes English easier to read.
chazeon a day ago
But if you do this, don't you have to enter two passwords each time you wake? One for LUKS, one for the system login?
WhyNotHugo 17 hours ago
The other big problem is that all your processes continue running, but your disk is unmounted. I can't imagine how you'd avoid everything crashing horribly.
I mean, I can imagine an implementation where the system pauses all processes related to the user session _except_ the screenlocker, and have a custom screen-locker which can supply the credentials to luks…
But the screen locker is a desktop application, so the compositor itself needs to stay alive too, but then the compositor might try to talk to other applications, and those are frozen. So wouldn't it consider them crashed and disconnect them? Now your compositor needs to understand that the system is in a "disk unmounted and processes frozen" state too.
Not even sure how you'd deal with logs from its stdout, since the file descriptor to the log files is invalidated too.
If anyone is actually using such a setup, I have so many questions. I know that theoretically all this is feasible, but all the existing components don't seem to be ready for just unmounting the encrypted disk at runtime like that.
polotics a day ago
Well yes and I don't see how this can be avoided.
Dylan16807 a day ago
Do you mean with current software? How to avoid it in general is straightforward.
If you're the only one with the disk password then the simple answer is make both passwords the same and make the different parts of the system communicate better.
If you want multiple users, give them each a different boot password and encrypt a separate copy of the disk key with each one. That password can be their login password too, or it can encrypt their login.
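A toy sketch of the "separate copy of the disk key per password" scheme, assuming each slot stores a salt plus the disk key masked with a PBKDF2-derived key, and that a digest of the disk key is kept to recognize a correct unlock. The XOR masking stands in for a real cipher and is not how LUKS key slots are actually encrypted; `add_slot`, `unlock`, and the iteration count are hypothetical choices for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation


def _derive(password: str, salt: bytes, length: int) -> bytes:
    """Stretch a password into a key-encryption key of the given length."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt, ITERATIONS, dklen=length
    )


def add_slot(disk_key: bytes, password: str) -> dict:
    """Store one password-protected copy of the same disk key."""
    salt = os.urandom(16)
    kek = _derive(password, salt, len(disk_key))
    wrapped = bytes(a ^ b for a, b in zip(disk_key, kek))  # toy "encryption"
    return {"salt": salt, "wrapped": wrapped}


def unlock(slots: list, password: str, key_digest: bytes):
    """Try the password against every slot; return the disk key or None."""
    for slot in slots:
        kek = _derive(password, slot["salt"], len(slot["wrapped"]))
        candidate = bytes(a ^ b for a, b in zip(slot["wrapped"], kek))
        if hmac.compare_digest(hashlib.sha256(candidate).digest(), key_digest):
            return candidate
    return None


disk_key = os.urandom(32)
key_digest = hashlib.sha256(disk_key).digest()
slots = [add_slot(disk_key, "alice-passphrase"), add_slot(disk_key, "bob-passphrase")]
```

Either user's passphrase recovers the same disk key, and removing one user's slot does not require re-encrypting the disk.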
chazeon a day ago
I have always been thinking LUKS was supposed to be enrolled in TPM, so you should not have to enter this key manually; this is just to prevent someone from unplugging the hard drive and reading on another machine. Of course, this depends on one's threat model.
boutell a day ago
(No, no, I take this stuff seriously too, but it had to be said)
shevy-java a day ago
To me the bigger problem is that the linux kernel does not seem to have a thorough test suite. Such things should be easily testable and verifiable. Apparently since 2024 nobody had that; humans are only so good for some tasks. Automatism should be done programmatically by machines serving humans.
Edit: Wait, so this was a debian patch? Now, this does not nullify my prior statements, but they should have said so clearly that debian screwed up here rather than the linux kernel devs.
IngoBlechschmid 13 hours ago
No, it is indeed a kernel bug in the code path responsible for luksOpen.
Debian (and the distributions which ported cryptsetup-suspend) relied on cryptsetup luksSuspend doing its thing correctly, and cryptsetup luksSuspend relied on cryptsetup luksOpen doing its thing correctly, and cryptsetup luksOpen relied on the thread keyring being purged from memory on process exit, which is promised in the thread-keyring(7) manpage.
quotemstr a day ago
It's because of vulnerabilities like this that I enable Intel's "total memory encryption" feature. No plaintext leaves the CPU package. DIMM swap attacks become useless. Moreover, it's basically free: the cryptography happens directly in the memory controller, in hardware, inline with the bus transactions the CPU is doing anyway.
fsckboy a day ago
I don't see how that solves this problem. there is a string in memory that gets saved on suspend. that string when read by the CPU has the same properties it had before. if the CPU is using rot-13, the string is still rot-13 and the attacker doesn't need to spend the compute needed to crack rot-13, the CPU will simply do that as normal.
ltbarcly3 a day ago
This is correct, the memory encryption stuff is to prevent side channel attacks, not secure data.
quotemstr 21 hours ago
naturalmovement a day ago
Definitely not a symptom of Linux being a hodgepodge of code thrown together from a thousand different sources and no one person could tell you how it all fits.
cevn a day ago
Bugs happen in all code. The difference is, anyone can fix stuff in open source. Closed source bugs are out of control and must be worked around. Usually by switching to OSS
megous 4 hours ago
It's still a better general OS than any alternative. :)
stackghost a day ago
Of course it's (indirectly) a symptom of that.
What's the alternative? Proprietary closed-source operating systems owned by corps who can be compelled to insert covert backdoors?
If BSD was as popular as Linux it would have the exact same problems.
steve918 a day ago
I wonder if you think other OSes are any different?
TempleOS is the only thing that comes to mind that doesn't fit your description and it's not practically useful.
Any sufficiently large codebase is a mix of ideas and concepts implemented by different people with different priorities over a large timespan and if you can fit the entire thing in your head it's not very interesting or complex.
IngoBlechschmid a day ago
Qubes OS, the Linux distribution aspiring to offer a reasonably secure operating system, pioneering a "every app runs in a virtual machine" approach in the Linux laptop/desktop space, tracks this at the following issue:
saidnooneever a day ago
naturalmovement a day ago
The *BSDs, Mac, and Windows all keep critical code in the same tree as the OS.
Something like disk encryption would be immediately visible.
So you don't have this mess of 80 different distros with 60 different versions of systemd, 20 that don't use it, a million kernel versions and it's all thrown together in a Costco-sized trash bag and we call the output "Linux".
yaris a day ago
brainwad a day ago
dist-epoch a day ago
"Mythos, find me a bug in LUKS. I know there is one in there".