gemini://warmedal.se/~bjorn/posts/certificate-security.gmi

This is a hard subject, so let's just be clear about one thing going in: I'm not about to cover it in detail :D There's been some discussion about this on the Gemini mailing list and on IRC. Since part of my work revolves around IT security and I have access to senior experts in the field at my workplace, I had a discussion with a couple of them on the subject and thought I should share our conclusions.

The basis of public key cryptography -- a server or client certificate is essentially a public key plus some signed information about its owner -- is that the cryptographic algorithm used is asymmetric. That is: we use one key to encrypt and another to decrypt.

When a client contacts server A it will be presented with A's public key. Server A can then sign messages to the client using the corresponding *private* key, which the client verifies with the public key, while the client can encrypt messages to the server with A's *public* key so that only A can read them.

There's a lot of magic involved in making a connection truly secure, but this is the cornerstone of it. This is why we use a key *pair* instead of just the one key.
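To make the key pair idea concrete, here's a toy sketch of textbook RSA with tiny primes. The numbers and the helper name apply_key are my own picks for illustration. Real keys are thousands of bits long and real implementations add padding, so treat this as a sketch of the principle and nothing more.

```python
# Toy RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53
n = p * q                 # the modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent
d = pow(e, -1, phi)       # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(key, number):
    """Raise number to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65

# Anyone can encrypt with the public key; only the private key undoes it.
ciphertext = apply_key(public_key, message)
recovered = apply_key(private_key, ciphertext)

# Only the private key can sign; anyone can check with the public key.
signature = apply_key(private_key, message)
verified = apply_key(public_key, signature)
```

Whatever one key does, only the other key can undo, which is the whole point of having two.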

If you look at a server certificate on the web by clicking the padlock in the address bar you can see a whole lot of information. First of all there's the certificate itself, which contains a not-valid-before date and a not-valid-after date, a bunch of names the server may be known as ("www.server.com", "server.com", "*.server.com", for example), the organisation behind the server, contact information for the domain owner, etc.

You see, your browser has no reason to trust a random server somewhere on the internet that may as well have been put up minutes ago. How do we know that we're talking to server.com, as opposed to some middle man pretending to be server.com? Here's where the whole mechanism of Certificate Authorities comes into play.

There are a large number of Certificate Authorities out there, and Digicert will be my example (for reasons that will become apparent).

Digicert, like any CA, has a Root Certificate. This certificate is its most closely guarded asset. It's valid for a long time, and its private key must never get lost or leaked, or chaos ensues (and so on and so forth).

Apart from this it also has a bunch of Intermediary Certificates. These are cryptographically signed with the Root Certificate, proving to everyone who trusts the Root Certificate that they're bona fide.

If I want a certificate for my server called server.com from a provider like Digicert I need to pay through the nose and jump through a few hoops (proving my identity, for one, and proving that I own and control the domain for another). If I do this to their satisfaction I can create a certificate for server.com and have it cryptographically signed by one of their Intermediary Certificates. Now everyone who trusts the Digicert Root Certificate and the Intermediary Certificate is likely to trust my Server Certificate.

But what if I contact Digicert and tell them that "look, my private key has been leaked." -- In theory someone can now present *my server certificate* as their own, pretend to be server.com, and thereby be trusted up through the same chain. Chaos now?

Nah, not really. All the certificates in the chain are usually marked with some way to validate them and verify that they have not been revoked. The most common mechanism for this is OCSP. When someone contacts my server their browser will in turn contact Digicert's OCSP server to ask if my certificate has been revoked or not.

There are a bunch of levels of trust at play here. The browser will have a look at my certificate, compare its information to the hostname the browser is contacting and assume -- if they match -- that it's likely trustworthy. But not so fast! This is weak information: anyone can type matching info into their certificate. So the browser checks the Intermediary Certificate. If it already knows and trusts this one (all browsers and even operating systems come with a store of trusted certificates, which is updated with the system) all is well. Almost. It's still going to check every single certificate up the chain, because if any certificate there has been revoked that means all the downstream certificates can be phony.
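The walk up the chain can be sketched roughly like this. Everything here is a made-up model: the Cert class, the function name chain_is_trusted and the "Example ..." CA names are hypothetical, certificates are identified by subject name only, and the actual signature verification and OCSP lookups are replaced by simple set and dictionary lookups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str  # subject of the certificate whose key signed this one


MAX_CHAIN_DEPTH = 10  # arbitrary cap so a malformed chain cannot loop forever


def chain_is_trusted(hostname, leaf, known_certs, trusted_roots, revoked):
    """Walk from the server certificate up to a root, failing on any bad link."""
    if leaf.subject != hostname:
        return False  # weak check: anyone can type a matching name
    current = leaf
    for _ in range(MAX_CHAIN_DEPTH):
        if current.subject in revoked:
            return False  # a revoked link taints everything below it
        if current.issuer == current.subject:
            # Self-signed means we reached the top of the chain.
            return current.subject in trusted_roots
        current = known_certs.get(current.issuer)
        if current is None:
            return False  # issuer is missing from our store
    return False


root = Cert("Example Root CA", "Example Root CA")
intermediate = Cert("Example Intermediate CA", "Example Root CA")
leaf = Cert("server.com", "Example Intermediate CA")
known = {c.subject: c for c in (root, intermediate)}
trusted = {"Example Root CA"}

ok = chain_is_trusted("server.com", leaf, known, trusted, revoked=set())
after_revocation = chain_is_trusted(
    "server.com", leaf, known, trusted, revoked={"Example Intermediate CA"}
)
```

Note that a missing issuer fails the check just like a revoked one does: the client simply cannot follow the chain to something it trusts.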

At the end of November Digicert changed one of their Intermediary Certificates, revoking the old one and putting a new one in place. Then they started signing new certificate signature requests with the new certificate.

Pretty soon server certificates appeared in the wild signed by the new Intermediary Certificate. This was all well and good, because browser truststores had already been updated with it.

Suddenly API requests from a Java service between two companies started failing, showing SSL Handshake Exceptions. Fun times; most Java developers very rarely encounter errors like this, and it can take a couple of days before a junior dev happens to ask the right person what's wrong. The right person will then fire up openssl s_client and connect, and find that something is bad with the Intermediary Certificate: it's not trusted.

The only thing we can do about it is to rebuild our containers, adding a command that adds the new Intermediary Certificate to the Java keystore.

Remember what I said about the browser not taking the facts stated in the server certificate at face value? How anyone can literally put anything there when creating their certificate?

This is true for self-signed certificates too. Even more so, because no third party vouches for the information.

Most gemini browsers will make a fair attempt at validating the certificate: first check if the Common Name or Subject Alternative Names match the requested hostname, then check the not-valid-before and not-valid-after dates, then check if we've visited this host before and if the cert provided now matches the cert provided last time.
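As a rough sketch, those three checks could look something like the following. The SeenCert class and the run_checks function are hypothetical names of mine, not taken from any actual Gemini client. Name matching is exact only (no wildcard handling), and the fingerprint is assumed to be a string the client computed elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SeenCert:
    names: list            # Common Name plus Subject Alternative Names
    not_before: datetime
    not_after: datetime
    fingerprint: str       # e.g. a hex digest of the certificate


def run_checks(cert, hostname, now, known_fingerprints):
    """Return the outcome of each check separately instead of one verdict."""
    previous = known_fingerprints.get(hostname)
    return {
        "name_matches": hostname in cert.names,
        "within_dates": cert.not_before <= now <= cert.not_after,
        # None means we have never visited this host, so there is nothing to compare.
        "same_as_last_time": None if previous is None else previous == cert.fingerprint,
    }


cert = SeenCert(
    names=["server.com", "www.server.com"],
    not_before=datetime(2020, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2030, 1, 1, tzinfo=timezone.utc),
    fingerprint="aa11",
)
now = datetime(2021, 1, 1, tzinfo=timezone.utc)

first_visit = run_checks(cert, "server.com", now, {})
later_visit = run_checks(cert, "server.com", now, {"server.com": "bb22"})
```

Keeping the results separate makes it easy to see which of the checks actually depends on anything an attacker can't just type in.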

Only one of these checks is meaningful in any way in a TOFU scheme. It even says in the name: Trust On First Use.

Sure, we can check that the Common Name etc matches what we expect, but why? If they don't match it tells us that... that... errh... maybe that the sysadmin didn't care? Or misspelled? Or had one cert for another virtual host and decided to use the same? None of these impact the privacy or security of our contact. It's still a cryptographically viable certificate.

And if they *do* match, what does that tell us? That the server is meticulously well configured? That a possible attacker also knows today's date and how to spell the domain name? Again: still a cryptographically viable certificate, and we haven't learned anything that makes it more secure.

Trust On First Use provides NO way to validate a certificate ON THE FIRST USE. On subsequent uses we can only compare with previously seen certificates. If you have access to an out-of-band verification method (tilde.team has a wiki page on https where the sha256 sum of the self-signed gemini cert is listed, for example), by all means use it. But the gemini protocol provides none.
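If you do have a fingerprint published out-of-band, comparing it is simple. Here's a minimal sketch, assuming you already have the certificate as DER-encoded bytes (in Python, ssl.SSLSocket.getpeercert(binary_form=True) is one way to get them) and that the published value is a hex sha256 digest, possibly written with colons or in upper case. The function names are my own.

```python
import hashlib
import hmac


def sha256_fingerprint(der_bytes):
    """Hex sha256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def normalise(fingerprint):
    """Strip colons and spaces and lowercase, so formatting differences don't matter."""
    return fingerprint.replace(":", "").replace(" ", "").lower()


def matches_published(der_bytes, published_fingerprint):
    """Compare the certificate we were served with the value published elsewhere."""
    return hmac.compare_digest(
        sha256_fingerprint(der_bytes), normalise(published_fingerprint)
    )
```

This only helps as much as the other channel can be trusted: if the attacker controls the page listing the fingerprint too, the comparison proves nothing.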

We trust the certificate the first time we encounter it, and if/when it changes we need to somehow verify the new cert on our own. Any time we encounter a new certificate means we open ourselves up to the risk of a man-in-the-middle attack. And the attacker will know how the domain name is spelt, and what today's date is.
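Boiled down, the whole TOFU decision fits in a few lines. This is a sketch with a plain dictionary standing in for the client's store of known hosts; the names tofu_check and accept_new_cert are hypothetical, and a real client would persist the store and ask the user before accepting a changed certificate.

```python
def tofu_check(known_hosts, hostname, fingerprint):
    """Classify a connection as 'first-use', 'match' or 'changed'."""
    pinned = known_hosts.get(hostname)
    if pinned is None:
        # Nothing to compare against: we trust blindly and remember.
        known_hosts[hostname] = fingerprint
        return "first-use"
    if pinned == fingerprint:
        return "match"
    # Do not overwrite silently: this is exactly the risky moment.
    return "changed"


def accept_new_cert(known_hosts, hostname, fingerprint):
    """Call only after the new certificate has been verified some other way."""
    known_hosts[hostname] = fingerprint


store = {}
first = tofu_check(store, "example.org", "aa11")
second = tofu_check(store, "example.org", "aa11")
third = tofu_check(store, "example.org", "bb22")
```

Both "first-use" and "changed" are moments where the client knows nothing about who is really on the other end.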

Certificate Security

Public Key Cryptography in Forty Seconds

Certificate Authorities and Trust

When CA Trust Goes Wrong

What About TOFU and Gemini?