SSH certificates: the better SSH experience (jpmens.net)

151 points by jandeboevrie 10 hours ago

thomashabets2 6 hours ago

Every couple of months someone re-discovers SSH certificates, and blogs about them.

I'm guilty of it too. My blog post from 15 years ago is nowhere near as good as OP's post, though if the me of 15 years ago had lived up to my standards of today, I'd be really disappointed: https://blog.habets.se/2011/07/OpenSSH-certificates.html

Stefan-H 3 hours ago

I think the scary reality is that most people conflate "keys" and "certificates". I have worked with security engineers whom I've had to remind that we don't use SSH certs but rather key auth, and they have to think it through before it clicks.

tracker1 2 hours ago

I'm consistently amazed by how many developers and security professionals lack a clear conceptual understanding of how public/private key auth even works.

Things like deploying dev keys to various production environments, instead of generating/registering them within said environment.

One of the worst recent security examples: you can't get this data over HTTPS from $OtherAgency, it's "not secure"... their suggested alternative is a "secure" read-only account on the other agency's SQL server (which uses the same TLS 1.3 as HTTPS). This from the person in charge of digital security for a government org.

papyDoctor 5 hours ago

Another useful feature of SSH certificates is that you can sign a user’s public key to grant them access to a remote machine for a limited time and as a specific remote user.
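As a sketch of that signing step (the CA paths, identity "alice", principal "deploy", and one-hour lifetime are all illustrative):

```shell
# A CA signs a user's public key, valid for 1 hour, and only for the
# remote user (principal) "deploy". Throwaway keys in a temp dir.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -f "$tmp/ca" -N '' -C 'user CA'
ssh-keygen -q -t ed25519 -f "$tmp/id_alice" -N ''
ssh-keygen -s "$tmp/ca" -I 'alice' -n deploy -V +1h "$tmp/id_alice.pub"
# Inspect the resulting certificate: principals and validity window
ssh-keygen -L -f "$tmp/id_alice-cert.pub"
```

The server then accepts the cert only for logins as `deploy`, and only until the validity window closes.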

TZubiri 2 hours ago

The capacity to grant access as a specific remote user is present without certs as well, right? The typical authorized_keys file lives under a user directory and grants access only to that user.

kaoD 6 hours ago

I've known about SSH certs for a while but never went through the effort of migrating away from keys. I'm very frustrated about manually managing my SSH keys across my different servers and devices, though.

I assume you gathered a lot of thoughts over these 15 years.

Should I invest in making the switch?

anyfoo 4 hours ago

A big problem I have with SSH certs is that they are not universally supported. For me, there is always some device or daemon (for example, tinyssh in the initramfs of my gaming PC so that I can unlock it remotely) that only works with "plain old SSH keys". And if I have to distribute and sync my keys onto a few hosts anyway, that takes away the benefits.

thomashabets2 5 hours ago

If your use case is such that you are frustrated about managing keys, host or user keys, then yes, it does sound like SSH certs would help you. E.g. when you have many users, many servers, or a high enough Cartesian product of the two.

In environments where they don't cause frustration, they're not worth it.

Not really more to it than that, from my point of view.

dizhn 3 hours ago

I am keeping an eye on the new (and alpha) Authentik agent, which will allow IdP-based SSH logins. There's also SSSD, already supported, but it requires glibc (due to needing NSS), meaning it's not available on Alpine.

ibotty 6 hours ago

Yes. Caveat: it might not really be worth it if all your infrastructure is managed by these newfangled infrastructure-as-code things that are quick to roll out (OpenShift/OKD, Talos, etc.) and you have only one repo in which to change SSH keys (single cluster, or a single repo for all clusters).

There are some serious security benefits for larger organizations but it does not sound as if you are part of one.

cyberax 2 hours ago

It depends on what you want to do. CA certs are easy to manage: you just put the CA key (marked cert-authority) in authorized_keys instead of the individual SSH public keys.

They also provide a way to get hardware-backed security without messing with SSH agent forwarding and crappy USB security devices. You can use an HSM to issue a temporary certificate for your (possibly temporary) public key and use it as normal. The certificate can be valid for just 1 hour, enough to not worry about it leaking.
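For reference, the trust can live in either of two places; a sketch with the key blob elided (not real key material):

```
# Per-account: one line in ~/.ssh/authorized_keys on the server.
# Any certificate signed by this CA key is accepted for this account:
cert-authority ssh-ed25519 AAAA... user-ca

# Server-wide alternative, in /etc/ssh/sshd_config:
TrustedUserCAKeys /etc/ssh/user_ca.pub
```

With either form, rotating or revoking a user no longer requires touching every server's authorized_keys.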

otabdeveloper4 5 hours ago

You will have to manage your SSH CA certificates instead of your keys.

The workflows around SSH CAs are extremely janky and insecure.

With some creative use of `AuthorizedKeysCommand` you can make SSH key rotation painless and secure.

With SSH certificates you have to go back to the "keys to the kingdom" antipattern and just hope for the best.
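A minimal sketch of the AuthorizedKeysCommand approach the parent mentions; the script path and key-server URL are hypothetical, not a real service:

```shell
# /etc/ssh/sshd_config (fragment):
#   AuthorizedKeysCommand /usr/local/bin/fetch-keys %u
#   AuthorizedKeysCommandUser nobody

# /usr/local/bin/fetch-keys (hypothetical endpoint): print the user's
# current public keys on stdout. sshd treats the output like a
# per-connection authorized_keys file, so key rotation happens
# centrally without ever rewriting files on the host.
exec curl -fsS "https://keys.internal.example/v1/users/$1/keys"
```

The tradeoff is that sshd now depends on that endpoint being reachable at login time, which is the same availability argument made against CAs elsewhere in this thread.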

V-eHGsd_ 3 hours ago

oh man, I referred back to your blog post when I wrote the ssh certificate authority for $job ... ~10 years ago.

Thanks for writing it!

tacostakohashi 35 minutes ago

One constant source of amazement for me is people not using SSH keys / using passwords with SSH.

Especially at a BigCo, where there are different environments, with different passwords, and password expiry/rotation/complexity rules.

Like, when asking for help or working together... you say to them "ok, let's ssh to devfoo1234", and they do it, and then type in their password, and maybe get it wrong, then need to reset it, or whatever... and it takes half a minute or more for them just to ssh to some host. Maybe there are several hosts involved, and it all multiplies out.

I mention to them "you know... I never use SSH passwords, I don't actually know my devfoo1234 password... maybe you should google ssh-keygen, set it up, and let me know if you have any problems?" and they're like "oh yeah, that's cool, I should do that sometime!"... and then they never do, and they are forever messing with passwords.

I just don't get it.

yason an hour ago

Despite the drawbacks of its grassroots nature, TOFU goes a looooong way.

With my own machines I can just physically check that the server host key matches what the ssh client sees. Once TOFU looks good I'm all set with that host because I don't change any of the keys ever.

In a no-frills corporate unix environment it's enough to have a list of the internal servers' public keys on an internal website, served over SSL so it's effectively signed by a known corporate identity. You only need to check this list once to validate the TOFU step, after which you can trust future connections.
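The check itself is just comparing fingerprints; a sketch using a locally generated stand-in for the server's host key (the paths are illustrative):

```shell
# On the server (or on the published internal list): print the
# fingerprint of the host key. Here we generate a throwaway key
# to stand in for /etc/ssh/ssh_host_ed25519_key.pub.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -f "$tmp/ssh_host_ed25519_key" -N ''
ssh-keygen -lf "$tmp/ssh_host_ed25519_key.pub"
# The client shows the same SHA256:... fingerprint on first connect;
# they must match before you type "yes".
```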

In settings with huge fleets of machines, or in very dynamic environments where new machines are rolled out all the time, certificates probably make things easier. Of course, certificates come with some extra work and some extra features, so the amount of benefit depends on the case. But at that scale TOFU breaks down badly on multiple levels, so you can't really afford a strong opinion against certificates.

I wish web browsers could easily remember server TLS host keys too, and at least notify me whenever they change, even if they'd still accept the new keys via ~trusted CAs.

aquafox 2 hours ago

I work in a corporate setting, and the money and time we have wasted because of Zscaler and its SSL inspection [1] is beyond your wildest imagination. Whenever I see an "SSL certificate problem: self-signed certificate in certificate chain" error, I know I'm in trouble.

[1] https://www.zscaler.com/resources/security-terms-glossary/wh...

nayuki an hour ago

Zscaler was deployed on a Windows 11 machine at a place where a friend worked. When I assessed the software, its behavior was downright evil, like malware. As we all know, it injects its own root certificate into the operating system in order to conduct man-in-the-middle attacks on TLS/HTTPS connections and monitor the user's web activity.

Furthermore, it locks down the web browser's settings so that you cannot use a proxy server to bypass Zscaler's MITM. I saw this behavior in Mozilla Firefox, where the proxy option is set to "No proxy" and all other options are disabled and grayed out; I imagine that it does the same to Google Chrome. If you try to modify the browser's .ini(?) file for proxy settings, Zscaler immediately overwrites it against your will. Zscaler worked very hard to enforce its settings in order to spy on the computer user.

And as you'd expect, if you open up the Zscaler GUI in the system tray, you are presented with the option to disable the software if you have the IT password. Which, of course, you don't have. Then again, that might be an epsilon better than the Cybereason antivirus software, which just has a system tray icon with no exit option, cannot be killed in Task Manager even if you are a local administrator, and imposes a heavy slowdown if you open hundreds of small text files per second.

pxc 43 minutes ago

I feel the same way about Cisco Umbrella where I work.

The worst breakage by far is protocol breakage; basically anything that uses HTTP as a basis for building some other protocol gets broken all the time. None of the people implementing it seem aware. They buy the vendor's claim that it's "transparent", when in fact even "inspect/trace-only" modes often break all kinds of shit.

I've seen Umbrella break:

  - Git
  - RubyGems
  - `go mod`
  - OrbStack
  - Matrix
  - Cargo
  - all JDKs
  - Nix
  - Pkgsrc
  - all VMs
and probably some other things I'm forgetting. When this breakage is reported, the first round of replies is typically "I visited that domain in my enterprise-managed browser and it's not blocked". That is, of course, a useless and irrelevant test.

Often it takes hours to even fully diagnose the breakage with enough confidence to point the finger at that tool and not some other endpoint security tool.

I'm not sure if the people buying and deploying tools in this category don't know how much stuff it breaks or just don't care. But the breakage is everywhere and nobody seems prepared for it.

tptacek 28 minutes ago

SSH certificates aren't X.509 certificates.

Tepix 5 hours ago

The author lists all the advantages of CA certificates, yet doesn't list the disadvantages. OTOH, all the many steps required to set it up make the disadvantages rather obvious.

Also, I've never had a security issue due to TOFU, have you?

akerl_ 5 hours ago

> Also, I've never had a security issue due to TOFU, have you?

This is a bit like suggesting you've never been in a car crash, so seat belts must not be worth considering.

Do you feel that beyond the obvious and documented work in setting them up, there are disadvantages to using SSH certificates?

adrian_b 4 hours ago

Certificates provide extra features, like revocation.

However, if you do not need the extra features provided by certificates, using SSH-generated keys is strictly equivalent to using certificates, and it requires less work.

TOFU is neither necessary nor recommended, it is just a convenience feature, to be used when security may be lax.

The secure way to use SSH is to never use TOFU, but to pair the user and the server by copying the public keys between the two computers through a secure channel, e.g. by using a USB memory stick, or by sending the public keys over already-existing authenticated encrypted links that pass through other computers. (Such a link may be an HTTPS download link.)

When using certificates, a completely identical procedure must be used. After certificates are generated, like also after SSH keys are generated, the certificates must be copied to the client computer and the server computer through secure channels.

otabdeveloper4 4 hours ago

Your ISP or telecom has to be compromised for TOFU to be relevant to anything. In practice that never happens.

adrian_b 5 hours ago

TOFU is convenient, but not necessary.

Choosing to use TOFU is a distinct choice from the choice of using the keys generated by SSH, instead of using certificates.

If you do not want to use TOFU, for extra security, you just have to pair the computers by copying between them the corresponding public keys through a secure channel, e.g. by using a USB memory.

Using certificates does not add any simplification or any extra security.

For real security, you still must pair the communicating computers by copying between them the corresponding certificates, through a secure channel, e.g. a USB memory.

When you use for HTTPS the certificates that came with your Internet browser, you trust that the browser's installer package reached your computer through a secure channel from the authority that created the certificates. This is usually a far more far-fetched assumption than the assumption that you can trust TOFU between computers under your control.

Certificates may be useful in big organizations, if other functionality is needed beyond just establishing secure communication channels, e.g. if you want to use certificate revocation.

In the list of "advantages" enumerated in the parent article, more than half of them are false, because if certificates are implemented correctly, completely equivalent actions must be executed when SSH keys without TOFU are used and when certificates are used.

Perhaps the author meant by writing some of the "advantages" that the actions that supposedly are no longer needed with certificates are done by an administrator, not by the user. However that is also applicable with SSH. An administrator could install the certificates, so that no action is required from the user, but an administrator can also install the SSH public keys, so that no TOFU is ever needed from the user.

Using certificates requires exactly the same steps as using keys generated by SSH (i.e. generating certificates and copying them between computers through secure channels, to pair the servers and the authorized users), but it may need additional steps, owing to the fact that certificates provide additional functionality.

gkoz 4 hours ago

Are you pairing computers by copying certificates to visit this site?

zamadatix 3 hours ago

If you have some way to set up the CA config on the box before connecting, then you can use that same access channel to install key-based access without relying on TOFU.

This can be anything from being part of the install script, to a customized deployment image, to physical access, to access via the host in virtualized scenarios.

TOFU only really comes into play when the box is already set up and you have no way to load things onto it other than connecting via SSH. But, again, that would be the same story if you were going the certificate route.

gnufx 40 minutes ago

Life is easier if you can use Kerberos SSO, i.e. GSSAPIAuthentication in OpenSSH. (If we're talking certificates, presumably it is OpenSSH, or does anything else implement them?)

tptacek 27 minutes ago

Why would you do that rather than just hooking SSH up to a real IdP with certificates?

linsomniac 6 hours ago

In our dev/stg environment we reinstall half our machines every morning (largely to test our machine setup automation), and SSH host certificates make that so much nicer than having to persist host keys or remove/replace them in known_hosts. Highly recommended.
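The host-certificate setup behind that is small; a sketch with illustrative names (the `dev01.example.com` hostname, 52-week lifetime, and paths are assumptions, and sshd would additionally need a `HostCertificate` line pointing at the cert):

```shell
# Sign a host key with a host CA, then trust the CA in known_hosts.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -f "$tmp/host_ca" -N '' -C 'host CA'
ssh-keygen -q -t ed25519 -f "$tmp/ssh_host_ed25519_key" -N ''
# -h marks a host certificate; -n restricts the hostnames it covers
ssh-keygen -s "$tmp/host_ca" -I 'dev-host' -h -n 'dev01.example.com' \
  -V +52w "$tmp/ssh_host_ed25519_key.pub"
# Clients need one known_hosts line instead of one per machine,
# so reinstalled hosts never trigger the key-changed warning:
printf '@cert-authority *.example.com %s\n' "$(cat "$tmp/host_ca.pub")"
```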

jamiesonbecker 3 hours ago

SSH certs quietly hurt in prod. Short-lived creds + centralized CA just moves complexity upward without solving the core problem: user management.

The system shifts from many small local states to one highly coupled control point. That control point has to be correct and reachable all the time. When it isn’t, failures go wide instead of narrow.

Example: a few boxes get popped and start hammering the CA. Now what? Access is broken everywhere at once.

Common friction points:

     1. your signer that has to be up and correct all the time
     2. trust roots everywhere (and drifting)
     3. TTL tuning nonsense (too short = random lockouts, too long = what was the point)
     4. limited on-box state makes debugging harder than it should be
     5. failures tend to fan out instead of staying contained
Revocation is also kind of a lie: you're just waiting for expiry and hoping that's good enough.

What actually happens is people reintroduce state anyway: sidecars, caches, agents… because you need it.

We went the opposite direction:

     1. nodes pull over outbound HTTPS
     2. local authorized_keys is the source of truth locally
     3. users/roles are visible on the box
     4. drift fixes itself quickly
     5. no inbound ports, no CA signatures (WELL, not strictly true*!)
You still get central control, but operation and failure modes are local instead of "everyone is locked out right now."

That’s basically what we do at Userify (https://userify.com). Less elegant than certs, more survivable at 2am. Also actually handles authz, not just part of authn.

And the part that usually gets hand-waved with SSH CAs:

     1. creating the user account
     2. managing sudo roles
     3. deciding what happens to home directories on removal
     4. cleanup vs retention for compliance/forensics
Those don't go away; they're just not part of the certificate solution.

* (TLS still exists here, just at the transport layer using the system trust store. That channel delivers users, keys, and roles. The rest is handled explicitly instead of implied.)

ngrilly 3 hours ago

How do you solve TOFU?

longislandguido 2 hours ago

This discussion is full of schizo solutions to "secure" SSH, most of which make no practical sense or have no technical basis.

There really needs to be a definitive best practices guide published by a trusted authority.

kackerlacker 24 minutes ago

In my view it's more important to stop using software keys, so probably use SK (FIDO) keys for both host and user. From there, CAs would be a next step. The level of documentation and example setups is astoundingly poor once you look at step 2 of any feature. I.e. SK keys are reasonably well understood as user keys, but their setup as host keys is vague and needs testing to see if it really works.
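For what it's worth, the user-key side is a one-liner; the host-key side is exactly the under-documented part described above, so treat this as an untested sketch (it requires a physical FIDO2 token, and sk host-key support depends on your OpenSSH build):

```shell
# User key backed by a FIDO2 token (prompts for a touch):
ssh-keygen -t ed25519-sk -f ~/.ssh/id_ed25519_sk

# As a host key (untested here; verify your sshd version accepts it):
ssh-keygen -t ed25519-sk -f /etc/ssh/ssh_host_ed25519_sk_key -N ''
# then in sshd_config:
#   HostKey /etc/ssh/ssh_host_ed25519_sk_key
```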

bobo56539 4 hours ago

With the recent wave of npm hacks stealing private keys, I wanted to limit my keys' lifetimes.

I've set up a couple of yubikeys as SSH CAs on hosts I manage. I use them to create short-lived certs (say, 24h) at the start of the day. This way I only have to enter the yubikey PIN once a day.

I could not find an easy way to limit maximum certificate lifetime in OpenSSH, except for using AuthorizedPrincipalsCommand, which feels very fragile.

Does anyone else have any experience with a similar setup? How do you limit cert max lifetime?

agwa an hour ago

Instead of using a CA, why not set the key's PIN policy to "once" and use an agent (e.g. https://github.com/FiloSottile/yubikey-agent/) that holds an active session to the yubikey? You start the agent at the beginning of the day, enter the PIN once, and then stop the agent at the end of the day.

gunapologist99 4 hours ago

Anyone tried out Userify? It creates/removes SSH pubkeys locally, so (like a CA) no authn server needs to be online. But unlike certs, active sessions and processes are terminated when the user's access is revoked.

jamiesonbecker 3 hours ago

We're in the process of updating the experience to this century! ;)

We've always taken the stance that crusty is better than vulnerable, but it turns out that not having a modern experience after 15 years is starting to feel like maybe we need to step up the features and shininess :)

Thom2000 6 hours ago

Sadly, services such as GitHub don't support these, so it's mostly good for internal infrastructure.

lights0123 5 hours ago

They do, for Enterprise customers only: https://docs.github.com/en/enterprise-cloud@latest/organizat...

They've only rolled their host key once, so there's little reason for them to use certs on the host side.

jcalvinowens 5 hours ago

You can also address TOFU to some extent using SSHFP DNS records.

OpenSSH supports checking the DNSSEC signature in the client, in theory, but it's a compile-time option and I'm not sure whether distros build with it.
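Generating the records is straightforward; a sketch using a throwaway key (on a real server you'd point `-f` at `/etc/ssh/ssh_host_*_key.pub`, and `host.example.com` is illustrative):

```shell
# Emit SSHFP resource records for a host key, ready to paste into a
# DNS zone file.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -f "$tmp/hostkey" -N ''
ssh-keygen -r host.example.com -f "$tmp/hostkey.pub"
# Client side, in ~/.ssh/config or on the command line:
#   VerifyHostKeyDNS ask
```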

jsiepkes 5 hours ago

On top of that, you'd need something to secure DNS, like DNSSEC, or at the very least DNS over TLS or DNS over HTTPS. None of these are typically enabled by default.

jcalvinowens 3 hours ago

Anything that uses systemd-resolved is probably doing DNSSEC validation by default. It's becoming much more common.

Additionally, as I mentioned, openssh itself has support for validating the DNSSEC signature even if your local resolver doesn't. I actually don't think it can use the standard resolver for SSHFP records at all, but I'm not sure.

fc417fc802 3 hours ago

Any idea if there's a standardized location, something like /.well-known/ssh?

moviuro 4 hours ago

All those articles about SSH certificates fall short of explaining how the revocation list can/should be published.

Is that yet another problem I need to solve with Syncthing?

https://man.openbsd.org/ssh-keygen.1#KEY_REVOCATION_LISTS
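The mechanics of the KRL itself are simple (publishing it is the unsolved part); a sketch with throwaway keys:

```shell
# Build a key revocation list (KRL) and test keys against it.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -f "$tmp/good" -N ''
ssh-keygen -q -t ed25519 -f "$tmp/bad"  -N ''
# Create a KRL revoking one key
ssh-keygen -k -f "$tmp/revoked.krl" "$tmp/bad.pub"
# Check keys against the KRL (reports REVOKED / ok)
ssh-keygen -Q -f "$tmp/revoked.krl" "$tmp/bad.pub" || true
ssh-keygen -Q -f "$tmp/revoked.krl" "$tmp/good.pub"
# sshd consumes the list via:  RevokedKeys /etc/ssh/revoked.krl
```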

blipvert 3 hours ago

If you generate short-lived certificates via an automated process/service, then you don't really need to manage a revocation list, as they will have expired in short order.

jamiesonbecker 3 hours ago

But then you can't log in if your box goes offline for any reason.

sqbic 3 hours ago

I've had very good experiences with SSH Communications Security's (the guys who invented SSH) PrivX product for managing secure remote access, including SSH certificates and also cert-based Windows authentication. It supports other kinds of remote targets too, via web UI or with native clients. Great product.

TZubiri 2 hours ago

>then I don’t need to type the target user’s password; instead I enter the key’s passphrase, a hopefully much more complicated combination of words, to unlock the private key.

This sentence is a bit of a red flag: it looks like the author is making a (subtle) mistake in the direction of too much security, or at least misjudging the amount of security (objectively measurable entropy) that is needed. This is, of course, a less consequential error than too little entropy, but anyone who wants to be a cybersecurity professional, especially one with influence, must know exactly how much is needed, because resources are limited. Each additional bit of entropy and each extra security step costs not only the time of the admin who implements it, but also that of the users who have to follow it, and that can itself hurt security by fatiguing users into circumventing measures or ignoring alerts.

On to specifically what's wrong:

Either a key file or a password can be used to log in to a server, or to authenticate to any service in general. Beyond the technical implementation, the main difference is whether the secret is stored on the device or in the user's brain. Neither is more correct than the other; there are tradeoffs: one can carry more bits and is more ergonomic, the other is not stored on the device and so cannot be compromised that way.

That said, a 2FA approach, in whatever format, is (generally speaking) safer than any individual method, in that both secrets are necessary to gain access. In this scenario an attacker needs both the file and the password; even if the password is 4 digits long, that increases the security of the system compared to no password. An attacker would have to set up a brute-force attempt along with a way to verify that decryption was successful. If local decryption confirmation is not possible, such a brute-force attack would require submitting erroneous logins to the server, potentially activating tripwires or alerting a monitoring admin.
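A quick back-of-the-envelope on those numbers (the 7776-word list size is the standard Diceware assumption, introduced here for comparison, not taken from the comment):

```python
import math

# Entropy of a 4-digit PIN: 10^4 equally likely values.
pin_bits = math.log2(10 ** 4)            # ~13.3 bits

# A stolen key file with no passphrase: zero extra guesses needed.
# Key file + 4-digit PIN with no offline check: the attacker must try
# PINs against the server, where rate limits and alerting apply.
worst_case_online_guesses = 10 ** 4

# Compare: a 4-word passphrase from a 7776-word Diceware-style list.
passphrase_bits = 4 * math.log2(7776)    # ~51.7 bits

print(round(pin_bits, 1), worst_case_online_guesses, round(passphrase_bits, 1))
```

The point being: 13 bits is nothing offline, but behind a second factor with online-only verification it changes the attack from "copy a file" to "get noticed ten thousand times".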

There's nothing special about a second factor being equal or equivalent in entropy to the first, and there's especially no requirement that a password have more entropy when it's a second factor; if anything, it's the other way around.

tl;dr: Consider each security mechanism in its wider context rather than in isolation, and you will see security fatigue go down without compromising security.