wrs | 2 months ago
Interesting choice on unsupported architectures to make this just silently fail to do what it says. (But fully in keeping with the Go philosophy of “we assume you’ve read every word of the docs before calling the function”.)
weberc2 | 2 months ago
I'm reading this as a description of the current implementation for the proposal, not an intent for the final form. It's also worth pointing out that the target for this is cryptography developers, and it's probably not wild to expect them to read the docs.
wrs | 2 months ago
Fair enough, I guess if you're just sort of waving this at your code because it has the word "secret" in it then I'm sure there are worse problems! On the other hand, I've seen a lot of cases of people doing just that. So overall I'd prefer "this is panicking because it won't work, and if you know what you're doing you can find a workaround" to "this won't work, but you wouldn't be using this if you didn't know that already, so I'll proceed".
mcpherrinm | 2 months ago
I think it’s probably the right decision: it would make the feature unusable in any portable cryptography code outside those platforms, which means it probably wouldn’t get used.
And not getting used is equivalent to the feature doing nothing, as there’s no other way to achieve this today.
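(For reference, the closest approximation available today is manual, best-effort zeroing hung off a defer. A minimal sketch; the helper name zero is illustrative, not from the proposal, and this does nothing about copies the compiler or GC may have made elsewhere:)

```go
package main

import "fmt"

// zero overwrites a secret byte slice in place. This is the manual,
// best-effort pattern the proposal aims to replace.
func zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

func main() {
	key := []byte("super-secret-key")
	func() {
		defer zero(key) // forget this defer and the key lingers in memory
		// ... use key here ...
	}()
	fmt.Println(key) // the shared backing array has been wiped to zeros
}
```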
gasche | 2 months ago
> Heap allocations are only erased if ➊ the program drops all references to them, and ➋ then the garbage collector notices that those references are gone.

Ewww, finalizers!

For the record, for OCaml (which is also a GC-ed language), the recommendation is for users to allocate their secret data outside the GC heap (the Bigarray type can be used to allocate there) and to zero it out explicitly themselves right after usage. (If we had RAII it would go in the destructor.) Asking users to trust that the GC will do so in a timely fashion seems strange to me for a security-intended feature, especially as the amount of code that has to do this cleanup is rather small, so it makes sense to prioritize control over convenience. I guess that I don't understand the proposed Go design in that case.

peter-leonov | 2 months ago
Agreed.
Although, I read the proposal as a secondary, extra layer of protection to harden the cracks that inevitably leak bits of data in a larger-scale project. For example slice sizes, said pointers, etc.
It's also pleasant to know that a forgotten defer call will be less likely to cause a security issue in the future.

nemith | 2 months ago
There is another, not yet accepted, proposal to make secret.Do follow goroutines (allowing for what Go is starting to call goroutine bubbles): https://github.com/golang/go/issues/76477
icy | 2 months ago
I really like this. It's solid stdlib primitives like this that make Go an excellent internet programming language.
Riolku | 2 months ago
Very cool proposal. I suppose the number of footguns is probably inevitable?
osa1 | 2 months ago
Interesting problem. How do other langs approach this?
In Rust I guess you could make non-clone non-send types that clear the contents on drop. That leaves the registers, but I don't fully understand why the values in registers are important. I'm assuming the secrets that you want to clear are all larger than registers so in the worst case you leak a pointer to a cleared memory, which shouldn't be an issue? Another issue is if you panic without unwinding then you don't run the drops, which I guess can be a problem?

masklinn | 2 months ago
> In Rust I guess you could make non-clone non-send types that clear the contents on drop.

That's not enough: unless an object is behind a Pin it can move around in memory (e.g. be copied between stack frames). The zeroize crate notes a number of issues. secrecy provides a heap pointer which attempts to avoid those issues.

> Another issue is if you panic without unwinding then you don't run the drops, which I guess can be a problem?

If you panic=abort, the process should immediately terminate and the memory should be reclaimed and eventually zeroed by the OS. So the risk of the process leaking data is not really a concern anymore.
mdaniel | 2 months ago
I've never heard of that behavior, is that opt-in on some kernel/OS combinations?
masklinn | 2 months ago
Afaik every OS zeroes newly allocated pages, as handing out another process's memory would be unconscionable. Until the OS hands it out, the physical page should be inaccessible (unless you have direct access to the VM anyway). I know that FreeBSD will eagerly zero released pages in the background when idle rather than wait for allocations.
mdaniel | 2 months ago
That may be on a per-implementation basis, but I have found multiple references that make the distinction between malloc and calloc, and only the latter declares that it zeroes the result. POSIX calls it out explicitly: https://pubs.opengroup.org/onlinepubs/9699919799/functions/malloc.html#:~:text=and%20whose%20value%20is%20unspecified
The C11 draft about malloc in 7.22.3.4 also makes no such claim, whereas for calloc in 7.22.3.2 it claims the space is "initialized to all bits zero".

I am not trying to start trouble, so if you have a more up-to-date reference I would enjoy updating my mental model. I am for sure not a C ninja, and I don't mean to imply that I am. But I have also always understood that C is packed to the gills with footguns like this, and I have grown up never trusting that it will do anything at all to safeguard the application's behavior.
fanf | 2 months ago
malloc(3) can return nonzero garbage when it allocates memory that was previously free(3)d, in which situation there’s no kernel involved. (Note the manual section numbers.)
The relevant kernel API is sbrk(2) which from POSIX’s point of view is now obsolete, though it’s still common for some malloc(3) implementations to use it. The more modern replacement is mmap(2) with MAP_ANON. Both of them specify that freshly allocated memory is zeroed.
masklinn | 2 months ago
Wrong direction of inquiry. Malloc is a C thing; it can return memory you had previously allocated, having never released that memory to the OS at all.

But when it needs memory it's calling mmap to create anonymous maps, and in that case the pages are zeroed:
https://man7.org/linux/man-pages/man2/mmap.2.html
https://marc.info/?l=openbsd-bugs&m=157403903112173
https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc
mdaniel | 2 months ago
I can speak to the JDK, which uses char[] for password (and I'd guess PrivateKey material, too, I just happen to have the char[] example handy) so that one can for (i=0;...) password[i]=0 to blank the memory once done with it.

spc476 | 2 months ago
I can't speak for Java, but it's possible your loop for (i = 0 ; ... ) password[i] = 0 could be optimized out, if password is a local variable and the variable isn't referenced past the loop. This happens to C code using more modern C compilers.

ceph | 2 months ago
Register state can end up in memory if a context switch occurs, I guess. Though that can also happen in the middle of your routine…
hoistbypetard | 2 months ago
I don't have anything specific in mind, but this feels like it could create a situation where timing leaks information. At least, those are the hairs on the back of my neck that stand up when I read:

> Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable.

I may be completely wrong, but that'd make me want to be cautious with this one.
kokada | 2 months ago
There is already subtle for timing sensitive operations. I imagine you can combine both too.
hoistbypetard | 2 months ago
That's cool. It looks like it does a good job with several primitives that you'd use for implementing crypto. But I was thinking a little higher level than that. Suppose, for example, that you wrap an operation that verifies a signature, checks a signer identifier against an ACL, decrypts sensitive command parameters, and runs a command with those sensitive parameters in this Secret construct.

It seems likely that someone in a position to observe GC timing could determine whether the signature failed, the ACL check failed, the decryption failed, or the command failed.
Thanks for sharing subtle. I hadn't seen it.
kokada | 2 months ago
What about subtle.WithDataIndependentTiming? It receives a func, and I imagine you can do something like subtle.WithDataIndependentTiming(secrets.Do()) to get both.

I imagine as long as you ensure that all operations are always run even if the previous operation failed, this should avoid those timing issues you described, but I am no expert in crypto or security so take my observations with a grain of salt.
cpurdy | 2 months ago
It is almost certainly a mistake to hard-wire into a high level language (!!!) various bespoke solutions to flaws of the hardware that it runs on. And here, what we have is a mitigating change, not a "fix", so it becomes a percentages game, i.e. a game of statistical chances.
I'm being critical, but I don't know what the right answer is, other than: "Hey Intel / AMD / ARM, please don't let virtual machines read the memory of other virtual machines!" It's far easier to be critical than to suggest a real solution, so I am guilty of something that I hate: criticizing without constructively improving the situation.
Having reviewed most of the cryptographic code in the Java standard libs, the approach there is to manually zero out the data whenever possible. In terms of readability, I'd give those libraries a rating of about 3 on a scale of 1..100, so they're probably achieving quite effective security-via-obfuscation 🤮 despite that not being an intended outcome. I'd add a laughing emoji but it's really not funny.
fanf | 2 months ago
It isn’t about hardware flaws. This kind of zeroing is to protect against leaks of secrets or plaintext due to memory being reused within the same program, like Cloudflare’s Heartbleed bug. Safe languages ought to ensure that the program can’t observe uninitialized stack or heap, so in that context zeroing ought to be redundant – but even in safe languages zeroing is still wanted to protect against other bugs (eg in the runtime or unsafe code) that might lead to secrets leaking.
cpurdy | 2 months ago
But Go already zeroes memory sometime between it being GC’d and allocated…
Also, I haven't been able to find any information on "Cloudflare's Heartbleed bug". I know what Heartbleed is -- at the time, two of the OpenSSL committers worked for me, and my team was responsible for the blessed OpenSSL versions within the EvilBigCo™️ where I worked -- but I'm unaware of what "Cloudflare's Heartbleed" is.
miloignis | 2 months ago
I believe they meant "Cloudbleed" https://en.wikipedia.org/wiki/Cloudbleed
fanf | 2 months ago
Oops yes (d’oh)