wrs | 2 months ago
Interesting choice on unsupported architectures to make this just silently fail to do what it says. (But fully in keeping with the Go philosophy of “we assume you’ve read every word of the docs before calling the function”.)
weberc2 | 2 months ago
I'm reading this as a description of the current implementation for the proposal, not an intent for the final form. It's also worth pointing out that the target for this is cryptography developers, and it's probably not wild to expect them to read the docs.
wrs | 2 months ago
Fair enough, I guess if you're just sort of waving this at your code because it has the word "secret" in it then I'm sure there are worse problems! On the other hand, I've seen a lot of cases of people doing just that. So overall I'd prefer "this is panicking because it won't work, and if you know what you're doing you can find a workaround" to "this won't work, but you wouldn't be using this if you didn't know that already, so I'll proceed".
mcpherrinm | 2 months ago
I think it’s probably the right decision: it would make the feature unusable in any portable cryptography code outside those platforms, which means it probably wouldn’t get used.
And not getting used is equivalent to the feature doing nothing, as there’s no other way to achieve this today.
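(For reference, the closest approximation available today is manual, best-effort zeroing hung off a defer. A minimal sketch; the helper name zero is illustrative, not from the proposal, and this does nothing about copies the compiler or GC may have made elsewhere:)

```go
package main

import "fmt"

// zero overwrites a secret byte slice in place. This is the manual,
// best-effort pattern the proposal aims to replace.
func zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

func main() {
	key := []byte("super-secret-key")
	func() {
		defer zero(key) // forget this defer and the key lingers in memory
		// ... use key here ...
	}()
	fmt.Println(key) // the shared backing array has been wiped to zeros
}
```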
gasche | 2 months ago
> Heap allocations are only erased if ➊ the program drops all references to them, and ➋ then the garbage collector notices that those references are gone.

Ewww, finalizers!

For the record, for OCaml (which is also a GC-ed language), the recommendation is for users to allocate their secret data outside the GC heap (the Bigarray type can be used to allocate there) and to zero it out explicitly themselves right after usage. (If we had RAII it would go in the destructor.) Asking users to trust that the GC will do so in a timely fashion seems strange to me for a security-intended feature, especially as the amount of code that has to do this cleanup is rather small, so it makes sense to prioritize control over convenience. I guess that I don't understand the proposed Go design in that case.

peter-leonov | 2 months ago
Agreed.
Although, I read the proposal as a secondary, extra layer of protection to harden the cracks that inevitably leak bits of data in a larger-scale project. For example slice sizes, said pointers, etc.
It's also pleasant to know that a forgotten defer call will be less likely to cause a security issue in the future.

nemith | 2 months ago
There is another, not yet accepted, proposal to make secret.Do follow goroutines (allowing for what Go is starting to call goroutine bubbles): https://github.com/golang/go/issues/76477
icy | 2 months ago
I really like this. It's solid stdlib primitives like this that make Go an excellent internet programming language.
Riolku | 2 months ago
Very cool proposal. I suppose the number of footguns is probably inevitable?
osa1 | 2 months ago
Interesting problem. How do other langs approach this?
In Rust I guess you could make non-clone non-send types that clear the contents on drop. That leaves the registers, but I don't fully understand why the values in registers are important. I'm assuming the secrets that you want to clear are all larger than registers so in the worst case you leak a pointer to a cleared memory, which shouldn't be an issue? Another issue is if you panic without unwinding then you don't run the drops, which I guess can be a problem?

masklinn | 2 months ago
> In Rust I guess you could make non-clone non-send types that clear the contents on drop.

That's not enough: unless an object is behind a Pin it can move around in memory (e.g. be copied between stack frames). The zeroize crate notes a number of issues. secrecy provides a heap pointer which attempts to avoid those issues.

> Another issue is if you panic without unwinding then you don't run the drops, which I guess can be a problem?

If you panic=abort, the process should immediately terminate and the memory should be reclaimed and eventually zeroed by the OS. So the risk of the process leaking data is not really a concern anymore.
mdaniel | 2 months ago
I've never heard of that behavior, is that opt-in on some kernel/OS combinations?
masklinn | 2 months ago
Afaik every OS zeroes newly allocated pages, as handing out another process's memory would be unconscionable. Until the OS hands it out, the physical page should be inaccessible (unless you have direct access to the VM anyway). I know that FreeBSD will eagerly zero released pages in the background when idle rather than wait for allocations.
mdaniel | 2 months ago
That may be on a per-implementation basis, but I have found multiple references that make the distinction between malloc and calloc, and only the latter declares that it zeroes the result. POSIX calls it out explicitly: https://pubs.opengroup.org/onlinepubs/9699919799/functions/malloc.html#:~:text=and%20whose%20value%20is%20unspecified
The C11 draft about malloc in 7.22.3.4 also makes no such claim, whereas for calloc in 7.22.3.2 it claims the space is "initialized to all bits zero".

I am not trying to start trouble, so if you have a more up-to-date reference I would enjoy updating my mental model. I am for sure not a C ninja, and I don't mean to imply that I am. But I have also always understood that C is packed to the gills with footguns like this, and I have grown up never trusting that it will do anything at all to safeguard the application's behavior.
fanf | 2 months ago
malloc(3) can return nonzero garbage when it allocates memory that was previously free(3)d, in which situation there’s no kernel involved. (Note the manual section numbers.)
The relevant kernel API is sbrk(2) which from POSIX’s point of view is now obsolete, though it’s still common for some malloc(3) implementations to use it. The more modern replacement is mmap(2) with MAP_ANON. Both of them specify that freshly allocated memory is zeroed.
masklinn | 2 months ago
Wrong direction of inquiry. Malloc is a C thing; it can return memory you had previously allocated, having never released that memory to the OS at all.

But when it needs memory it's calling mmap to create anonymous maps, and in that case the pages are zeroed:
https://man7.org/linux/man-pages/man2/mmap.2.html
https://marc.info/?l=openbsd-bugs&m=157403903112173
https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc
mdaniel | 2 months ago
I can speak to the JDK, which uses char[] for password (and I'd guess PrivateKey material, too, I just happen to have the char[] example handy) so that one can for (i=0;...) password[i]=0 to blank the memory once done with it.

spc476 | 2 months ago
I can't speak for Java, but it's possible your loop for (i = 0 ; ... ) password[i] = 0 could be optimized out, if password is a local variable and the variable isn't referenced past the loop. This happens to C code using more modern C compilers.

ceph | 2 months ago
Register state can end up in memory if a context switch occurs, I guess. Though that can also happen in the middle of your routine…
hoistbypetard | 2 months ago
I don't have anything specific in mind, but this feels like it could create a situation where timing leaks information. At least, those are the hairs on the back of my neck that stand up when I read:

> Heap allocations made by the function are erased as soon as the garbage collector decides they are no longer reachable.

I may be completely wrong, but that'd make me want to be cautious with this one.
kokada | 2 months ago
There is already subtle for timing sensitive operations. I imagine you can combine both too.
hoistbypetard | 2 months ago
That's cool. It looks like it does a good job with several primitives that you'd use for implementing crypto. But I was thinking a little higher level than that. Suppose, for example, that you wrap an operation that verifies a signature, checks a signer identifier against an ACL, decrypts sensitive command parameters, and runs a command with those sensitive parameters in this Secret construct.

It seems likely that someone in a position to observe GC timing could determine whether the signature failed, the ACL check failed, the decryption failed, or the command failed.
Thanks for sharing subtle. I hadn't seen it.
kokada | 2 months ago
What about subtle.WithDataIndependentTiming? It receives a func, and I imagine you can do something like subtle.WithDataIndependentTiming(secrets.Do()) to get both.

I imagine as long as you ensure that all operations are always run even if the previous operation failed, this should avoid those timing issues you described, but I am no expert in crypto or security so take my observations with a grain of salt.
cpurdy | 2 months ago
It is almost certainly a mistake to hard-wire into a high level language (!!!) various bespoke solutions to flaws of the hardware that it runs on. And here, what we have is a mitigating change, not a "fix", so it becomes a percentages game, i.e. a game of statistical chances.
I'm being critical, but I don't know what the right answer is, other than: "Hey Intel / AMD / ARM, please don't let virtual machines read the memory of other virtual machines!" It's far easier to be critical than to suggest a real solution, so I am guilty of something that I hate: criticizing without constructively improving the situation.
Having reviewed most of the cryptographic code in the Java standard libs, the approach there is to manually zero out the data whenever possible. In terms of readability, I'd give those libraries a rating of about 3 on a scale of 1..100, so they're probably achieving quite effective security-via-obfuscation 🤮 despite that not being an intended outcome. I'd add a laughing emoji but it's really not funny.
fanf | 2 months ago
It isn’t about hardware flaws. This kind of zeroing is to protect against leaks of secrets or plaintext due to memory being reused within the same program, like Cloudflare’s Heartbleed bug. Safe languages ought to ensure that the program can’t observe uninitialized stack or heap, so in that context zeroing ought to be redundant – but even in safe languages zeroing is still wanted to protect against other bugs (eg in the runtime or unsafe code) that might lead to secrets leaking.
cpurdy | 2 months ago
But Go already zeroes memory sometime between it being GC’d and allocated…
Also, I haven't been able to find any information on "Cloudflare's Heartbleed bug". I know what Heartbleed is -- at the time, two of the OpenSSL committers worked for me, and my team was responsible for the blessed OpenSSL versions within the EvilBigCo™️ where I worked -- but I'm unaware of what "Cloudflare's Heartbleed" is.
miloignis | 2 months ago
I believe they meant "Cloudbleed" https://en.wikipedia.org/wiki/Cloudbleed
fanf | 2 months ago
Oops yes (d’oh)