A sledge hammer introduction to X.509 certificates

This post was written by eli on April 30, 2021
Posted Under: crypto,Internet,Server admin

Introduction

First and foremost: Crypto is not my expertise. This is a note to future self for the next time I’ll need to deal with similar topics. This post summarizes my understanding as I prepared worked on a timestamp server, and it shows the certificates used by it.

There are many guides to X.509 certificates out there, however it seems like it’s common practice to focus on the bureaucratic aspects (a.k.a. Public Key Infrastructure, or PKI), and less on the real heroes of this story: The public cryptographic keys that are being certified.

For example, RFC 3647 starts with:

In general, a public-key certificate (hereinafter “certificate”) binds a public key held by an entity (such as person, organization, account, device, or site) to a set of information that identifies the entity associated with use of the corresponding private key. In most cases involving identity certificates, this entity is known as the “subject” or “subscriber” of the certificate.

Which is surely correct, and yet it dives right into organization structures etc. Not complaining, the specific RFC is just about that.

So this post is an attempt to make friends with these small chunks of data, with a down-to-earth, technical approach. I’m not trying to cover all aspects nor being completely accurate. For exact information, refer to RFC 5280. When I say “the spec” below, I mean this document.

Let’s start from the basics, with the main character of this story: The digital signature.

The common way to make a digital signature is to first produce a pair of cryptographic keys: One is secret, and the second is public. Both are just short computer files.

The secret key is used in the mathematical operation that constitutes the action of a digital signature. Having access to it is therefore equivalent to being the person or entity that it represents. The public key allows verifying the digital signature with a similar mathematical operation.

A certificate is a message (practically — a computer file), saying “here’s a public key, and I hereby certify that it’s valid for use between this and this time for these and these uses”. This message is then digitally signed by whoever gives the certification (with is a key different from the one certified, of course). As we shall see below, there’s a lot more information in a certificate, but this is the point of it all.

The purpose of a certificate is like ID cards in real life: It’s a document that allows us to trust a piece of information from someone we’ve never seen before and know nothing about, without the possibility to consult with a third party. So there must be something about this document that makes it trustworthy.

The certificate chain

Every piece of software that works with public keys is installed with a list of public keys that it trusts. Browsers carry a relatively massive list for SSL certificates, but for kernel code signing it consists of exactly one certificate. So the size of this list varies, but is surely very small compared with the number of certificates out there are in general. Keys may be added and removed to this list in the course of time, but its size remains roughly the same.

The common way to maintain this list is by virtue of root certificates: These certificates basically say “trust me, this key is OK”. I’ll get further into this along with the example of a root certificate below.

As the secret keys of these root certificates are precious, they can’t be used to sign every certificate in the world. Instead, these are used to approve the key of another certificate. And quite often, that second certificate approves the key for verifying a third certificate. Only that certificate approves the public key which the software needs to know if it’s OK for use. In this example, these three certificates form a certificate chain. In real life, this chain usually consists of 3-5 certificates.

In many practical applications (e.g. code signing and time stamping) the sender of the data for validation also attaches a few certificates in order to help the validating side. Likewise, when a browser establishes a secure connection, it typically receives more than one certificate.

None of these peer-supplied certificates are root certificates (and if there is one, any sane software will ignore it, or else is the validation worthless). The validating software then attempts to create a valid certificate chain going from its own pool of root certificates (and possibly some other certificates it has access to) to the public key that needs validation. If such is found, the validation is deemed successful.

The design of the certificate system envisions two kinds of keys: Those used by End Entities for doing something useful, and those used by Certificate Authorities only for the purpose of signing and verifying other certificates. Each certificate testifies which type it belongs to in the “Basic Constraints” extension, as explained below.

Put shortly: The certificates that we (may) pay a company to make for us, are all End Entities certificates.

In this post I’ll show a valid certificate chain consisting of three certificates.

A sample End Entity certificate

This is a textual dump of a certificate, obtained with something like:

openssl x509 -in thecertificate.crt -text

Other tools represent the information slightly differently, but the terminology tends to remain the same.

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            65:46:72:11:63:f1:85:b4:3d:95:3d:72:66:e6:ee:c5:1c:f6:2b:6e
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = GB, ST = Gallifrey, L = Gallifrey, O = Dr Who, CN = Dr Who Time Stamping CA
        Validity
            Not Before: Jan  1 00:00:00 2001 GMT
            Not After : May 19 00:00:00 2028 GMT
        Subject: C = GB, ST = Gallifrey, L = Gallifrey, O = Dr Who, CN = Dr Who Time Stamping Service CA
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:d7:34:07:c5:dd:f5:e6:6a:b2:9e:e6:76:e3:ce:
                    af:33:a3:10:60:97:e8:27:f1:62:87:90:a9:21:52:
[ ... ]
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Key Usage: critical
                Digital Signature
            X509v3 Extended Key Usage: critical
                Time Stamping
            X509v3 Subject Key Identifier:
                3A:E5:43:A1:40:3F:A4:0F:01:CE:D3:3F:2A:EE:4E:92:B9:28:5C:3A
            X509v3 Authority Key Identifier:
                keyid:3C:F5:43:45:3B:40:10:BC:3F:25:47:18:10:C4:19:18:83:8C:09:D0
                DirName:/C=GB/ST=Gallifrey/L=Gallifrey/O=Dr Who/CN=Dr Who Root CA
                serial:7A:CF:23:8D:2E:A7:6C:84:52:53:AF:BA:D7:26:7F:54:53:B2:2D:6B

    Signature Algorithm: sha256WithRSAEncryption
         6c:54:88:55:ff:c7:e1:81:73:4e:00:80:46:0d:dc:d9:32:c1:
         53:ba:ff:f9:32:e4:f3:83:c2:29:bb:e5:91:88:8e:6f:46:f4:
[ ... ]

The key owning the certificate

Much of the difficulty to understand certificates stems from the fact that the bureaucratic terminology is misleading, making it look as if it was persons or companies that are certified.

So let’s keep the eyes on the ball: This is all about the cryptographic keys. There’s the key which is included in the certificate (the Subject’s public key) and there’s the key that signs the certificate (the Authority’s key). There are of course someones owning these keys, and their information is the one that is presented to us humans first and foremost. And yet, it’s all about the keys.

The certificate’s purpose is to say something about the cryptographic key which is given explicitly in the certificate’s body (printed out in hex format as “Modulus” above). On top of that, there’s always the “Subject” part, which the human-readable name given to this key.

As seen in the printout above, the Subject is a set of attributes and values assignments. Collectively, they are the name of the key. Which attributes are assigned differs from certificate to certificate, and it may even contain no attributes at all. The meaning of and rules for setting these has to do with the bureaucracy of assigning real-life certificates. From a technical point of view, these are just string assignments.

Usually, the most interesting one is CN, which stands for commonName, and is the most descriptive part in the Subject. And yet, it may be confusingly similar to that of other certificates.

For certificates which certify an SSL key for use by web server, the Subject’s CN is the domain it covers (possibly with a “*” wildcard). It might be the only assignment. For example *.facebook.com or mysmallersite.com.

Except for some root certificates, there’s a X509v3 Subject Key Identifier entry in the certificate as well. It’s a short hex string which is typically the SHA1 hash of the public key, or part of it, but the spec allows using other algorithms. It’s extremely useful for identifying certificates, since it’s easy to get confused between Subject names.

I’ll discuss root authorities and root certificates below, along with looking at a root certificate.

The key that signed the certificate

Then we have the “Authority” side, which is the collective name for whoever signed the certificate. Often called Certificate Authority, or CA for short. Confusingly enough (for me), it appears before the Subject, in the binary blob of the certificate as well as the text output above.

The bureaucratic name of this Authority is given as the”Issuer“. Once again, it consists of a set of attributes and values assignments, which collectively are the name of the key that is used to sign the certificate. This tells us to look for a certificate with an exact match: The exact same set of assignments, with the exact same values. If such issuer certificate is found, and we trust it, and it’s allowed to sign certificates, and the Issuer’s public key validates the signature of the Subject’s certificate — plus a bunch of other conditions — then the Subject certificate is considered valid. In other words, the public key it contains is valid for the uses mentioned in it. This said with lots of fine details omitted.

But looking for a certificate in the database based upon the name is inefficient, as the same entity may have multiple keys and hence multiple certificates for various reasons — in particular because a certificate is time limited. To solve this, all certificates (except root certificates) must point at their Authority with the X509v3 Authority Key Identifier field (though I’ve seen certificates without it). There are two methods for this:

  1. The value of that appears in the Subject Key Identifier field, in the certificate for the key that signed the current certificate (so it’s basically a hash of the public key that signed this certificate).
  2. The serial number of the certificate of the key that signed the current certificate, plus the Issuer name of the Authority’s certificate — that is the Authority that is two levels up in the foodchain. This is a more heavy-weight identification, and gives us a hint on what’s going on higher up.

The first method is more common (and is required if you want to call yourself a CA), and sometimes both are present.

Anyhow, practically speaking, when I want to figure out which certificate approves which, I go by the Subject / Authority Key Identifiers. It’s much easier to keep track of the first couple of hex octets than those typically long and confusing names.

Validity times and Certificate serial number

These are quite obvious: The validity time limits the time period for which the certificate can be used. The validating software uses the computer’s clock for this purpose, unless the validated message is timestamped (in particular with code signing), in which case the timestamp is used to validate all certificates in the chain.

The serial number is just a number that is unique for each certificate. The spec doesn’t define any specific algorithm for generating it. Note that this number relates to the certificate itself, and not to the public key being certified.

The signature

All certificates are signed with the secret key of their Authority. The public key for verifying it is given in the Authority’s certificate.

The signature algorithm appears at the beginning of the certificate, however the signature itself is last. The spec requires, obviously, that the algorithm that is used in the signature is the one stated in the beginning.

The signature is made on the ASN.1 DER-encoded blob which contains all certificate information (except for the signature section itself, of course).

X509v3 extensions

Practically all certificates that are used today are version 3 certificates, and they all have a section called X509v3 extensions. In this section, the creator of the certificate insert data objects as desired (but with some minimal requirements, as defined in the spec). The meaning and structure of each data object is conveyed by an Object Identifier (OID) field at the header of each object, appearing before the data in the certificate’s ASN.1 DER blob. It’s therefore possible to push any kind of data in this section, by assigning an OID for that kind of data.

In addition to the OID, each such data object also has a boolean value called “critical”: Note that some of the extensions in the example above are marked as critical, and some are not. When an extension is critical (the boolean is set true) the certificate must be deemed invalid if the extension is not recognized by its verifying software. Extensions that limit the usage of a certificate are typically marked critical, so that unintended use doesn’t occur because the extension wasn’t recognized.

I’ve already mentioned two x509v3 extensions: X509v3 Subject Key Identifier and X509v3 Authority Key Identifier, none of which are critical in the example above. And it makes sense: If the verifying software doesn’t recognize these, it has other means to figure out which certificate goes where.

So coming up next is a closer look at a few standard X509v3 extensions.

X509v3 Key Usage

As its name implies, this extension defines the allowed uses of the key contained in the certificate. A certificate that is issued by a CA must have this extension present, and mark it Critical.

This extension contains a bit string of 8 bits, defining the allowed usages as follows:

  • Bit 0: digitalSignature — verify a digital signature other than the one of a certificate or CRL (these are covered with bits 5 and 6).
  • Bit 1: nonRepudiation (or contentCommitment) — verify a digital signature in a way that is legally binding. In other words, a signature made with this key can’t be claimed later to be false.
  • Bit 2: keyEncipherment — encipher a private or secret key with the public key contained in the certificate.
  • Bit 3: dataEncipherment — encipher payload data directly with the public key (rarely used).
  • Bit 4: keyAgreement — for use with Diffie-Hellman or similar key exchange methods.
  • Bit 5: keyCertSign — verify the digital signature of certificates.
  • Bit 6: cRLSign — verify the digital signature of CRLs.
  • Bit 7: encipherOnly — when this and keyAgreement bits are set, only enciphering data is allowed in the key exchange process.
  • Bit 8: decipherOnly — when this and keyAgreement bits are set, only deciphering data is allowed in the key exchange process.

X509v3 Extended Key Usage

The Key Usage extension is somewhat vague about the purposes of the cryptographic operations. In particular, when the public key can be used to verify digital signature, surely not all kinds of signatures? If this was the case, this would make the public key valid to sign anything (that isn’t a legal document, a certificate and a CRL, and still).

On the other hand, how can a protocol properly foresee any possible use of the public key? Well, it can’t. Instead, each practical use of the key is given a unique number in the vocabulary of Object Identifiers (OIDs). This extension merely lists the OIDs that are relevant, and this translates into allowed uses. When evaluating the eligibility to use the public key (that is contained in the certificate), the Key Usage and Extended Key Usage are evaluated separately; a green light is given only if both evaluations resulted in an approval.

The spec doesn’t require this extension to be marked Critical, but it usually is, or what’s the point. The spec does however say that “in general, this extension will appear only in end entity certificates”, i.e. a certificate that is given to the end user (and hence with a key that can’t be used to sign other certificates). In reality, this extension is often present and assigned the intended use in certificates in the middle of the chain, despite this suggestion. As I’ve seen this in code signing and time stamping middle-chain certificates, maybe it’s to restrict the usage of this middle certificate for certain purposes. Or maybe it’s a workaround for buggy validation software.

This is a short and incomplete list of interesting OIDs that may appear in this extension:

  • TLS Web Server Authentication: 1.3.6.1.5.5.7.3.1
  • TLS Web Client Authentication: 1.3.6.1.5.5.7.3.2
  • Code signing: 1.3.6.1.5.5.7.3.3
  • Time stamping: 1.3.6.1.5.5.7.3.8

The first two appear in certificates that are issued for HTTPS servers.

X509v3 Basic Constraints

Never mind this extension’s name. It has nothing to do with what it means.

This extension involves two elements: First, a boolean value, “cA”, meaning Certificate Authority. According to the spec, the meaning of this flag is that the Subject of the certificate is a Certificate Authority (as opposed to End Entity). When true, the key included in the certificate may be used to sign other certificates.

But wait, what about the keyCertSign capability in X509v3 Key Usage (i.e. bit 6)? Why the duplicity? Not clear, but the spec requires that if cA is false, then keyCertSign must be cleared (certification signature not allowed). In other words, if you’re not a CA, don’t create certificates that can sign other certificates.

This flag is actually useful for manually analyzing a certificate chain going from the end user certificate towards the root: Given a pile of certificates, look for the one with CA:FALSE. That’s the certificate to start with.

The second element is pathLenConstraint, which usually appears as pathlen in text dumps. It limits the number of certificates between the current one and the final certificate in the chain, which is typically the End Entity certificate. Commonly, pathlen is set to zero in the certificate that authorizes the certificate that someone paid to get.

If there’s a self-issued certificate (Subject is identical to Issuer) in the chain which isn’t the root certificate, forget what I said about pathlen. But first find me such a certificate.

This extension is allowed to be marked critical or not.

X509v3 Subject Alternative Name

(not in the example above)

This extension allows assigning additional names, on top of the one appearing in Subject (or possibly instead of it). It’s often used with SSL certificates in order to make it valid for multiple domains.

A sample non-root CA certificate

This is the text dump of the certificate which is the Authority of the example certificate listed above:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            7a:cf:23:8d:2e:a7:6c:84:52:53:af:ba:d7:26:7f:54:53:b2:2d:6b
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = GB, ST = Gallifrey, L = Gallifrey, O = Dr Who, CN = Dr Who Root CA
        Validity
            Not Before: Jan  1 00:00:00 2001 GMT
            Not After : May 19 00:00:00 2028 GMT
        Subject: C = GB, ST = Gallifrey, L = Gallifrey, O = Dr Who, CN = Dr Who Time Stamping CA
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:b6:bf:46:38:c7:c1:63:58:f1:95:c6:cf:0a:5d:
                    72:d1:11:ce:86:96:04:ce:8f:cb:ab:da:22:b9:e0:
[ ... ]
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:TRUE, pathlen:0
            X509v3 Key Usage: critical
                Certificate Sign, CRL Sign
            X509v3 Extended Key Usage:
                Time Stamping
            X509v3 Subject Key Identifier:
                3C:F5:43:45:3B:40:10:BC:3F:25:47:18:10:C4:19:18:83:8C:09:D0
            X509v3 Authority Key Identifier:
                keyid:98:9A:E3:EF:D8:C5:5C:7F:87:35:87:45:78:3D:51:8D:82:2F:1E:A3
                DirName:/C=GB/ST=Gallifrey/L=Gallifrey/O=Dr Who/CN=Dr Who Root CA
                serial:03:91:DC:F3:FA:8D:5A:CA:D0:3D:B7:EE:1B:71:2D:60:B5:0A:99:DE

    Signature Algorithm: sha256WithRSAEncryption
         13:18:16:99:6a:42:be:22:14:e5:e8:80:5a:ce:be:df:33:c6:
         22:df:d5:35:48:e6:9d:9f:ec:ec:07:72:49:33:ca:ca:3f:22:
[ ... ]

The public key contained in this certificate is pair with the secret key that signed the certificate before. As one would expect, this following fields match:

  • The list of assignments in Subject of this certificate is exactly the same as the Issuer in the previous one.
  • The Subject Key Identifier here with Authority Key Identifier, as keyid, in the previous one.
  • This Certificate’s Serial number appears in Authority Key Identifier as serial.
  • This Certificate’s Issuer appears in Authority Key Identifier in a condensed form as DirName.

Except for the Subject to Issuer match, the other fields may be missing in certificates. There’s a brief description of how the certificate chain is validated below, after showing the root certificate. At this point, these relations are listed just to help figuring out which certificate certifies which.

Note that unlike the previous certificate, CA is TRUE, which means that this a CA certificate (as opposed to End Entities certificate). In other words, it’s intended for the sole use of signing other certificates (and it does, at least the one above).

Also note that pathlen is assigned zero. This means that the it’s used only to sign End Entity certificates.

Note that DirName in Authority Key Identifier equals this certificate’s Issuer. Recall that DirName is the Issuer of the certificate that certifies this one. Hence the conclusion is that the certificate that certifies this one has the same name for Subject and Issuer: So with this subtle clue, we know almost for sure that the certificate above this one is a root certificate. Why almost? Because non-root self-issued certificates are allowed in the spec, but kindly show me one.

Extended Key Usage is set to Time Stamping. Even though this was supposed to be unusual, as mentioned before, this is what non-root certificates for time stamping and code signing usually look like.

And as expected, the Key Usage is Certificate Sign and CRL Sign, as one would expect to find on a CA certificate.

A sample root CA certificate

And now we’re left with the holy grail: the root certificate.

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            03:91:dc:f3:fa:8d:5a:ca:d0:3d:b7:ee:1b:71:2d:60:b5:0a:99:de
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = GB, ST = Gallifrey, L = Gallifrey, O = Dr Who, CN = Dr Who Root CA
        Validity
            Not Before: Jan  1 00:00:00 2001 GMT
            Not After : May 19 00:00:00 2028 GMT
        Subject: C = GB, ST = Gallifrey, L = Gallifrey, O = Dr Who, CN = Dr Who Root CA
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:ce:e5:53:d7:1e:43:28:13:00:eb:b2:81:bb:ff:
                    28:23:98:9a:fd:69:07:ee:49:c5:54:44:66:77:5d:
[ ... ]
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Subject Key Identifier:
                98:9A:E3:EF:D8:C5:5C:7F:87:35:87:45:78:3D:51:8D:82:2F:1E:A3
            X509v3 Authority Key Identifier:
                keyid:98:9A:E3:EF:D8:C5:5C:7F:87:35:87:45:78:3D:51:8D:82:2F:1E:A3
                DirName:/C=GB/ST=Gallifrey/L=Gallifrey/O=Dr Who/CN=Dr Who Root CA
                serial:03:91:DC:F3:FA:8D:5A:CA:D0:3D:B7:EE:1B:71:2D:60:B5:0A:99:DE

            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Key Usage: critical
                Certificate Sign, CRL Sign
    Signature Algorithm: sha256WithRSAEncryption
         30:92:7d:09:e4:ea:4d:81:dd:8e:c2:ba:c0:c4:a6:26:62:4d:
[ ... ]

All in all, it’s pretty similar to the previous non-root certificate, except that its Authority is itself: There is no way to validate this certificate, as there is no other public key to use. As mentioned above, root certificates are installed directly into the validating software’s database as the anchors of trust.

The list of fields that match between this and the previous certificate remains the same as between the previous certificate and its predecessor. Once again, not all fields are always present. Actually, there are a few fields that make no sense in a root certificate, and yet they are most commonly present. Let’s look at a couple of oddities (that are common):

  • The certificate is signed. One may wonder what for. The signature is always done with the Authority’s key, but in this case, its the key contained in the certificate itself. So this proves that the issuer of this certificate has the secret key that corresponds to the public key that is contained in the certificate. The need for a signature hence prevents issuing a root certificate for a public key without being the owner of the secret key. Why anyone would want to do that remains a question.
  • The certificate points at itself in the Authority Key Identifier extension. This is actually useful for spotting that this is indeed a root certificate, in particular when there are long and rambling names in the Subject / Issuer fields. But why the DirName?

How the certificate chain is validated

Chapter 6 of RFC 5280 offers a 20 pages long description of how a certificate chain is validated, and it’s no fun to read. However section 6.1.3 (“Basic Certificate Processing”) gives a concise outline of how the algorithm validates certificate Y based upon certificate X.

The algorithm given in the spec assumes that some other algorithm has found a candidate for a certificate chain. Chapter 6 describes how to check it by starting from root, and advancing one certificate pair at a time. This direction isn’t intuitive, as validation software usually encounters an End Entities certificate, and needs to figure out how to get to root from it. But as just said, the assumption is that we already know.

So the validation always trusts certificate X, and it checks if it can trust Y based upon the former. If so, it assigns X := Y and continues until it reaches the last certificate in the chain.

These four are the main checks that are made:

  1. The signature in certificate Y is validated with public key contained in certificate X.
  2. The Issuer part in certificate Y matches exactly the Subject of certificate X.
  3. Certificate Y’s validity period covers the time for which the chain is validated (the system clock time, or the timestamp’s time if such is applied).
  4. Certificate Y is not revoked.

Chapter 6 wouldn’t reach 20 pages if it was this simple, however much of the rambling in that chapter relates to certificate policies and other restrictions. The takeaway from this list of 4 criteria is where the focus is on walking from one certificate to another: The validation of the signature and matching the Subject / Issuer pairs.

I suppose that the Subject / Issuer check is there mostly to prevent certificates from being misleading to us humans: From a pure cryptographic point of view, no loopholes would have been created by skipping this test.

And this brings me back to what I started this post with: This whole thing with certificates has a bureaucratic side, and a cryptographic side. Both play a role.

Add a Comment

required, use real name
required, will not be published
optional, your blog address