I worked a lot on SSL certificates with VMware products last year, and I started developing a Java tool and a VCO plugin to generate certificate files. I did a lot of research to understand the algorithms and structure of SSL certificates and ended up reading the entire X.509 PKI RFC and the TLS RFC. In this article I am going to explain the principles of SSL certificates and SSL encryption without getting too technical, which I think will be enough to better understand how it all works. This is what we will cover in this article:
- Symmetric Encryption
- Asymmetric Encryption
- Public Key Infrastructure
- SSL Certification Path Processing
- Structure of SSL Certificates
- How encrypted communications work
The screenshot above shows what I will be talking about throughout this long article. This error comes from Internet Explorer. I am sure you have already seen this kind of security warning on your screen; its appearance differs between IE, Firefox, Chrome… These warnings mean that the SSL certificate used to encrypt your data could not be validated, so the communication is not trusted. Before I got really interested in SSL certificates, I didn’t really care about those warnings, and I am pretty sure that most people ignore them because they just don’t know what a certificate is and what it is for.
Nowadays, any time you enter a login and password on your favorite social networking or bank website, it uses an SSL certificate (more accurately, an X.509 certificate) to initiate a secure connection between you and the website so that nobody else can see your personal information. If you get this kind of security warning on a website, I would strongly recommend not providing any personal information, because someone might be listening to the communication and decrypting every single message (this is known as the « Man In The Middle » attack). Pretty scary, huh?!
Basically, SSL exists for two reasons: encryption and identification. In this article, I will explain why we encrypt data and how it works, then I will explain the identification process and how to make sure that we are communicating with the right user or server.
Encryption is the process of hiding what is sent from one computer to another in such a way that anyone sniffing the communication cannot read what is exchanged between those two computers. There are two main methods that I am going to explain: Symmetric Encryption and Asymmetric Encryption.
Until the end of the 1960s, there was no easy way to encrypt a communication between two parties. Encrypting a message sent over an unsecured communication medium relied upon a secret key that the two parties had shared beforehand, typically during a face-to-face meeting. The two parties could then exchange encrypted messages, with encryption and decryption done with the same key: this is called Symmetric Encryption. The problem with this method is that if the two parties cannot physically meet, they have no secure way to agree on the key. Let’s illustrate symmetric encryption with Bob sending a message to Alice while Eve is listening to the communication:
The symmetric encryption process only works if the two parties share the same secret key, and this key must be shared in a trusted way. As I said before, how do you share a key in a trusted way if you cannot physically meet the person you want to communicate with? Do you send the key through the network? If you do that, Eve, who is listening to this communication, will get both the « Hello ! » message and the key:
You can see that this type of encryption is not really convenient to use: the only secure way to exchange the secret key is to physically meet the person you want to talk to and never share that secret key with anyone else.
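To make the idea concrete, here is a minimal Python sketch of symmetric encryption. It is my own toy illustration, not what SSL really uses: it XORs the message with a random key of the same length (a one-time pad), and the names `xor_bytes` and `shared_key` are made up for this example. The point is only that the very same key both encrypts and decrypts.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the matching byte of key."""
    if len(key) != len(data):
        raise ValueError("key must be the same length as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"Hello !"
# The secret that Bob and Alice agreed on during their face-to-face meeting.
shared_key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, shared_key)    # Bob encrypts
recovered = xor_bytes(ciphertext, shared_key)  # Alice decrypts with the same key

print(ciphertext.hex())  # this is all Eve sees if she does not have the key
print(recovered)
```

If Eve also captures `shared_key`, she can run the same `xor_bytes` call as Alice, which is exactly the weakness described above.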
From 1970 to 1974, British cryptographers working for the government conceived the possibility of « non-secret encryption », but none of the cryptosystems developed during this period was put to practical use. In 1976, an asymmetric-key cryptosystem was published by Whitfield Diffie and Martin Hellman. This cryptosystem is known as the Diffie-Hellman key exchange. It was the first published practical method for establishing a shared secret key over a communication channel without using a prior shared secret. In 1977, Ron Rivest, Adi Shamir and Leonard Adleman from MIT invented a « public-key encryption » algorithm that came to be known as RSA, from their initials.
Principle: a public-key cryptosystem is based on two keys, a Public Key and a Private Key. These two keys are generated at the same time and are mathematically linked. Encrypting with the Public Key is easy, but decrypting with only that Public Key is extremely difficult. This is what we call a One-Way Function. To decrypt the message you need the Private Key, which is called the « Trap Door ». This allows two parties to exchange information over a plaintext, unsecured channel without ever meeting or exchanging secrets out of band. The recipient can generate a key pair, send the public key over the insecure channel, and wait for the sender to encrypt something using it. At this point, only the recipient (the holder of the private key) can decrypt the data.
This is how the RSA algorithm works. RSA uses exponentiation modulo a product of two very large primes to encrypt and decrypt. Its security rests on the extreme difficulty of factoring large integers, a mathematical problem for which no efficient technique is known.
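The arithmetic behind this can be sketched in a few lines of Python. This is a toy with textbook-sized primes that I picked for readability (real keys use primes of a thousand bits or more, plus padding that is left out here), so treat it as an illustration of the one-way function and the trap door rather than as usable cryptography.

```python
# Toy RSA key generation with tiny primes (assumed values, for illustration only).
p, q = 61, 53
n = p * q                # modulus, shared by the public and the private key
phi = (p - 1) * (q - 1)  # easy to compute only if you know p and q
e = 17                   # public exponent, chosen coprime with phi
d = pow(e, -1, phi)      # private exponent, the "trap door" (needs Python 3.8+)


def encrypt(m: int) -> int:
    """Anyone holding the public key (n, e) can do this."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Only the holder of the private exponent d can do this."""
    return pow(c, d, n)


message = 65
ciphertext = encrypt(message)
print(ciphertext)
print(decrypt(ciphertext))
```

Eve knows `n` and `e`, but to find `d` she has to factor `n` back into `p` and `q`, which is trivial for 3233 and impractical for a modulus of realistic size.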
With all these explanations in place, I am going to illustrate Public Key encryption with Bob sending a message to Alice while Eve is still listening to the communication:
This is how Public Key encryption is done. Now let’s see what Eve has received by sniffing the communication between Bob and Alice:
This is the power of the One-way functions and trap doors.
As you can understand, the Private Key must NEVER be shared. If Eve manages to get Alice’s private key, she will be able to decrypt all the messages that Alice receives, not only the messages from Bob!
Now you are going to say « OK, I understand this, but I have never used it before ». In fact, public-key algorithms are used everywhere on the Internet:
This indicates that you are using an encrypted communication that uses SSL certificates. But where are the keys?
Public Key: The Public Key is stored in the SSL certificate, as you can see in the pictures below.
Private Key: No one sees the Private Key of a server because it is private. But for this article, I generated a Public Key with its associated Private Key, and this Private Key looks like this:
The Private Key MUST never be shared: this is the key that decrypts your data!
Encrypting a communication is good, but it is not enough. You can use encrypted communication to talk to someone, but that doesn’t necessarily mean that the computer you are talking to is the one you think it is:
This is my Internet Explorer web browser showing the https://www.vmware.com website. Clicking on the padlock next to the URL pops up a window telling you who the information you are about to send will go to. In this case, I created a self-signed certificate for www.vmware.com, imported it into my ESXi server and added a DNS record to redirect www.vmware.com to my ESXi server. As you can see, this site is not trusted because I am sending data to an ESXi server and not to the VMware website. This is the real VMware website:
Behind that there is a whole infrastructure that validates that you are talking to the right person or the right server. The best known solution to this problem is what’s referred to as a « public key infrastructure » (PKI). At the heart of a PKI is a set of trusted authorities who can vouch for the validity of a public key. In this way, if you get a public key from Bob, you just need to check that a trusted authority has vouched that this really is Bob’s public key. If it has been replaced by something else, like the key of my ESXi server, the check fails and you get a warning.
Public Key Infrastructure
A public key infrastructure is the set of components needed to create and manage digital certificates. Digital certificates are at the heart of PKI as they affirm the identity of the certificate subject and bind that identity to the public key contained in the certificate. One widely used PKI product is Active Directory Certificate Services; there are also OpenTrust, EJBCA, OpenCA…
A typical PKI includes the following key elements:
- A certificate authority (CA), which acts as the root of trust and provides services that authenticate the identity of individuals, computers and other entities
- An intermediate CA, certified by a root CA to issue certificates for specific uses permitted by the root CA
- A certificate database, which stores certificate requests and issues and revokes certificates
- A certificate store, which resides on a local computer as a place to store issued certificates and private keys
A CA issues digital certificates to entities after verifying their identity. It signs these certificates using its private key. Just as a public key can be used to encrypt data for the entity that holds the private key, a private key can be used to prove ownership of a public key. Instead of the sender encrypting the data with the public key, the asserting party (the one with the private key) encrypts a bit of data with the private key (in practice, a hash of the data) and sends both that data in the clear and the encrypted version. As it turns out, public-key cryptography works in such a way that only the holder of the private key can do this, so if you have access to the public key, you can use it to decrypt the encrypted version. If the result matches the « token » data, then you can be assured that it was generated by the holder of the private key. Such a token/encrypted data pair is called a digital signature. Each SSL certificate contains a digital signature that proves the link between the public key and the information about the server you are talking to. The picture below shows the digital signature of the VMware website certificate using Firefox:
The digital signature (on the left), generated by a trusted Certificate Authority, guarantees that the certificate’s public key belongs with the associated information (on the right):
CAs use trusted root certificates to create a « chain of trust ». Many root certificates are embedded in web browsers, so browsers have built-in trust of those CAs. This is why the VMware website is already trusted: your web browser trusts by default the authorities involved in signing the VMware website’s certificate, whose root certificate is GTE CyberTrust Global Root.
If you take a look at the certificate store of Firefox, you will find that GTE CyberTrust Global Root is already in the trusted Certificate Authorities repository:
GTE CyberTrust Global Root is the root certificate of the chain of trust, but it did not actually sign the VMware website certificate. To understand why, you need to take a look at the chain of trust:
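The sign-then-verify mechanism described above can be sketched in Python as follows. This is a toy of my own: the key pair is built from two small Mersenne primes, the certificate data is a made-up string, and real signatures add a padding scheme that I leave out. It only shows that the private key produces the signature and the public key checks it.

```python
import hashlib

# Toy CA key pair (assumed values, far too small for real use).
p, q = 2**89 - 1, 2**61 - 1
n, e = p * q, 65537                # public key, known to everybody
d = pow(e, -1, (p - 1) * (q - 1))  # private key, known only to the CA


def digest(data: bytes) -> int:
    """Hash the data and reduce it so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes) -> int:
    """Done by the CA: transform the hash with the private key."""
    return pow(digest(data), d, n)


def verify(data: bytes, signature: int) -> bool:
    """Done by the browser: undo the signature with the public key and compare hashes."""
    return pow(signature, e, n) == digest(data)


# Hypothetical certificate content: a subject name bound to a public key.
cert_info = b"subject=www.example.com;public_key=ABC123"
signature = sign(cert_info)

print(verify(cert_info, signature))
print(verify(b"subject=evil.example.com;public_key=ABC123", signature))
```

Changing even one byte of the signed information changes the hash, so a signature copied from a genuine certificate does not validate forged content.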
This chain of trust is composed of 5 certificates:
- GTE CyberTrust Global Root signed and trusts Baltimore CyberTrust Root
- Baltimore CyberTrust Root signed and trusts DigiCert High Assurance EV Root CA
- DigiCert High Assurance EV Root CA signed and trusts DigiCert High Assurance CA-3
- DigiCert High Assurance CA-3 signed and trusts *.vmware.com
We trust the first one, so *.vmware.com is also trusted.
The VMware website appears trusted in our web browser because the chain of trust is valid. But a trusted chain is not only about identifying the different elements with digital signatures: there is a specific process to validate a chain of trust, called Certification Path Processing.
Certification path processing
First of all, the certification path needs to be built, and there are 2 ways to do that:
- Name chaining: Building a certification path using name chaining means that the Subject Name in one certificate must be the Issuer Name in the next certificate in the path, and so on. The path begins with a self-signed certificate (Root CA) that contains the public key of the trust anchor. The path ends with the end-entity certificate. All other certificates within the path are referred to as intermediate CA certificates. Note that every certificate in the chain except for the last one is a CA certificate. Let’s illustrate name chaining with the VMware website certificate:
If Certification Authorities were guaranteed to have only one public/private signing key pair active at any given time, satisfying the name chaining requirement would be all that is required to construct a certification path. However, it is possible (even likely) that CAs will have more than one valid signing key pair at the same time. This means that name chaining alone may not be sufficient to determine if the certification path is a legitimate candidate that should be submitted to the certification path validation process. This leads us to the notion of « key identifier chaining » as discussed below.
- Key Identifier Chaining: The Authority Key Identifier (AKID) and Subject Key Identifier (SKID) are certificate extensions that can be used to help facilitate the certification path construction process. AKIDs are used to distinguish one public key from another when a given Certification Authority (CA) has multiple signing keys, and SKIDs provide a means to identify certificates that contain a specific public key.
As with « name chaining », between a trust anchor and an end-entity certificate the SKID of one certificate should be the AKID of the next certificate in the path, and so on. You can find the AKID and SKID in the certificates:
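Both chaining methods can be sketched together in Python. The `Cert` class, the certificate names and the identifiers below are all hypothetical, and a real path builder handles many more cases. The sketch only shows why the key identifier is needed when a CA has two active signing keys.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    skid: str            # Subject Key Identifier
    akid: Optional[str]  # Authority Key Identifier (may be absent)


def build_path(end_entity: Cert, pool: List[Cert]) -> List[Cert]:
    """Walk from the end-entity certificate up to a self-signed root."""
    path = [end_entity]
    current = end_entity
    while current.subject != current.issuer:  # a self-signed certificate ends the walk
        candidates = [
            c for c in pool
            if c.subject == current.issuer                        # name chaining
            and (current.akid is None or c.skid == current.akid)  # key identifier chaining
        ]
        if not candidates:
            raise ValueError(f"no issuer certificate found for {current.subject}")
        current = candidates[0]
        if current in path:
            raise ValueError("loop detected in certification path")
        path.append(current)
    return list(reversed(path))  # root first, end entity last


root = Cert("Example Root CA", "Example Root CA", skid="R1", akid=None)
# The same CA name with two different signing keys: name chaining alone cannot tell them apart.
issuing_old = Cert("Example Issuing CA", "Example Root CA", skid="K1", akid="R1")
issuing_new = Cert("Example Issuing CA", "Example Root CA", skid="K2", akid="R1")
website = Cert("www.example.com", "Example Issuing CA", skid="EE", akid="K2")

path = build_path(website, [issuing_old, issuing_new, root])
print([f"{c.subject} ({c.skid})" for c in path])
```

Without the AKID comparison, the builder would pick `issuing_old` simply because it comes first in the pool, and the signature check would later fail.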
Path Validation :
Once the path construction process is done, the web browser needs to check each certificate in the chain, following this process:
- Verifying basic certificate information
- Certificate has a valid signature
- Certificate is in the validity period
- Certificate has not been revoked
- Verify that the SAN extension is consistent
- Verify that the policy set information is consistent
- Process critical extensions
- Verify whether the certificate is a CA certificate
If all of these verifications pass, the website is trusted; if not, an SSL warning will appear.
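Some of the checks in this list can be sketched in Python like this. It is a simplified model of my own: the `Cert` fields are hypothetical, the signature check is reduced to a boolean, revocation is looked up by serial number only, and the SAN, policy and critical-extension checks are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    serial: int
    not_before: datetime
    not_after: datetime
    is_ca: bool
    signature_ok: bool  # stand-in for a real cryptographic signature check


def validate_path(path: List[Cert], revoked: Set[int], now: datetime) -> List[str]:
    """Return the list of problems found; an empty list means these checks passed."""
    problems = []
    for position, cert in enumerate(path):
        is_last = position == len(path) - 1
        if position > 0 and cert.issuer != path[position - 1].subject:
            problems.append(f"{cert.subject}: issuer does not match the previous subject")
        if not cert.signature_ok:
            problems.append(f"{cert.subject}: invalid signature")
        if not (cert.not_before <= now <= cert.not_after):
            problems.append(f"{cert.subject}: outside its validity period")
        if cert.serial in revoked:
            problems.append(f"{cert.subject}: revoked")
        if not is_last and not cert.is_ca:
            problems.append(f"{cert.subject}: not a CA certificate")
    return problems


start = datetime(2010, 1, 1, tzinfo=timezone.utc)
end = datetime(2020, 1, 1, tzinfo=timezone.utc)
now = datetime(2014, 6, 1, tzinfo=timezone.utc)

chain = [
    Cert("Example Root CA", "Example Root CA", 1, start, end, True, True),
    Cert("Example Issuing CA", "Example Root CA", 2, start, end, True, True),
    Cert("www.example.com", "Example Issuing CA", 3, start, end, False, True),
]

print(validate_path(chain, revoked=set(), now=now))
print(validate_path(chain, revoked={3}, now=now))
```

A browser shows the SSL warning as soon as one of these checks fails; here the second call reports the revoked website certificate.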
Structure of SSL Certificates :
An SSL certificate is composed of X.509 basic certificate fields and certificate extensions. In this section, I will explain each of the fields that make up a certificate, to better understand how it works.
The structure of an X.509 v3 digital certificate is as follows:
- Version: describes the version of the encoded certificate. When extensions are used, version MUST be 3.
- Serial Number: The serial number MUST be a positive integer assigned by the CA to each certificate. It MUST be unique for each certificate issued by a given CA. When a certificate is revoked, the serial number is used to identify the certificate to invalidate.
- Algorithm ID: Describes the algorithm identifier for the algorithm used by the CA to sign the certificate.
- Issuer: identifies the entity who has signed and issued the certificate. The issuer field MUST contain a non-empty distinguished name (DN).
- Validity: The certificate validity period is the time interval during which the CA warrants that it will maintain information about the status of the certificate. The field is represented as a SEQUENCE of two dates: the date on which the certificate validity period begins (notBefore) and the date on which the certificate validity period ends (notAfter).
- Subject: The subject field identifies the entity associated with the public key stored in the subject public key field. The subject name MAY be carried in the subject field and/or the subjectAltName extension. The subject name field is defined as the X.501 type Name.
- Subject Public Key Info: This field is used to carry the public key and identify the algorithm with which the key is used (RSA, DSA, or Diffie-Hellman).
- Subject/Issuer Unique Identifier (optional): The subject and issuer unique identifiers are present in the certificate to handle the possibility of reuse of subject and/or issuer names over time. It is recommended that names should not be reused for different entities and that Internet certificates not make use of unique identifiers.
- Extensions (optional): The extensions defined for X.509 v3 certificates provide methods for associating additional attributes with users or public keys and for managing a certification hierarchy. The X.509 v3 certificate format also allows communities to define private extensions to carry information unique to those communities. Each extension in a certificate is designated as either critical or non-critical. A certificate using system MUST reject the certificate if it encounters a critical extension it does not recognize; however, a non-critical extension MAY be ignored if it is not recognized.
  - Authority Key Identifier: The authority key identifier extension provides a means of identifying the public key corresponding to the private key used to sign the certificate. This extension is used where an issuer has multiple signing keys (either due to multiple concurrent key pairs or due to changeover). The identification MAY be based on either the key identifier (the subject key identifier in the issuer’s certificate) or on the issuer name.
  - Subject Key Identifier: The subject key identifier extension provides a means of identifying certificates that contain a particular public key. To facilitate certification path construction, this extension MUST appear in all conforming CA certificates.
  - Key Usage: The key usage extension defines the purpose (e.g., encipherment, signature, certificate signing) of the key contained in the certificate.
    - keyEncipherment: The keyEncipherment bit is asserted when the subject public key is used for key transport.
    - dataEncipherment: The dataEncipherment bit is asserted when the subject public key is used for enciphering user data.
    - keyAgreement: The keyAgreement bit is asserted when the subject public key is used for key agreement.
    - keyCertSign: The keyCertSign bit is asserted when the subject public key is used for verifying a signature on public key certificates. If the keyCertSign bit is asserted, then the cA bit in the basic constraints extension MUST also be asserted.
    - cRLSign: The cRLSign bit is asserted when the subject public key is used for verifying a signature on a certificate revocation list.
  - Certificate Policies: The certificate policies extension contains a sequence of one or more policy information terms, each of which consists of an object identifier (OID) and optional qualifiers.
  - Subject Alternative Names: The subject alternative names extension allows additional identities to be bound to the subject of the certificate. Defined options include an Internet electronic mail address, a DNS name, an IP address, and a uniform resource identifier (URI).
  - Basic Constraints: The basic constraints extension identifies whether the subject of the certificate is a CA and the maximum depth of valid certification paths that include this certificate.
  - Name Constraints: The name constraints extension, which MUST be used only in a CA certificate, indicates a name space within which all subject names in subsequent certificates in a certification path MUST be located. Restrictions apply to the subject distinguished name and apply to subject alternative names.
  - Policy Constraints: The policy constraints extension can be used in certificates issued to CAs. The policy constraints extension constrains path validation in two ways. It can be used to prohibit policy mapping or require that each certificate in a path contain an acceptable policy identifier.
  - Extended Key Usage: This extension indicates one or more purposes for which the certified public key may be used, in addition to or in place of the basic purposes indicated in the key usage extension. In general, this extension will appear only in end entity certificates.
    - Server Authentication: Used for TLS WWW server authentication. Key usage bits that may be consistent with it: digitalSignature, keyEncipherment or keyAgreement.
    - Client Authentication: Used for TLS WWW client authentication. Key usage bits that may be consistent with it: digitalSignature and/or keyAgreement.
  - CRL Distribution Points: The CRL distribution points extension identifies how CRL information is obtained.
  - Authority Information Access: Indicates how to access CA information and services for the issuer of the certificate in which the extension appears.
- Signature Algorithm: The signatureAlgorithm field contains the algorithm identifier for the algorithm used by the CA when the certificate was signed.
- Signature: The signature of the certificate, which is used to authenticate the validity of the certificate information and the associated public key.
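A few of the rules stated in this field list can be expressed as a small Python model. The class names, the set of extensions that this toy verifier « understands » and the flag values are my own assumptions (the flags are not the real bit positions of the encoded extension). Only the rules themselves come from the list above: extensions require version 3, the serial number is positive, the issuer is not empty, an unrecognized critical extension causes rejection, and keyCertSign requires the cA bit.

```python
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import List


class KeyUsage(Flag):
    # Illustrative flags only, not the actual ASN.1 bit positions.
    KEY_ENCIPHERMENT = auto()
    DATA_ENCIPHERMENT = auto()
    KEY_AGREEMENT = auto()
    KEY_CERT_SIGN = auto()
    CRL_SIGN = auto()


@dataclass
class Extension:
    name: str
    critical: bool


@dataclass
class Certificate:
    version: int
    serial_number: int
    issuer: str
    subject: str
    key_usage: KeyUsage
    is_ca: bool  # the cA bit of the Basic Constraints extension
    extensions: List[Extension] = field(default_factory=list)


# Extensions this toy verifier knows how to process (an assumption for the example).
KNOWN_EXTENSIONS = {"keyUsage", "basicConstraints", "subjectAltName"}


def check_certificate(cert: Certificate) -> List[str]:
    """Return the list of rule violations; an empty list means none were found."""
    problems = []
    if cert.extensions and cert.version != 3:
        problems.append("extensions are present but version is not 3")
    if cert.serial_number <= 0:
        problems.append("serial number must be a positive integer")
    if not cert.issuer:
        problems.append("issuer must not be empty")
    for ext in cert.extensions:
        if ext.critical and ext.name not in KNOWN_EXTENSIONS:
            problems.append(f"unrecognized critical extension: {ext.name}")
    if KeyUsage.KEY_CERT_SIGN in cert.key_usage and not cert.is_ca:
        problems.append("keyCertSign is set but the cA bit is not")
    return problems


ca_cert = Certificate(3, 1, "Example Root CA", "Example Issuing CA",
                      KeyUsage.KEY_CERT_SIGN | KeyUsage.CRL_SIGN, True,
                      [Extension("keyUsage", True), Extension("basicConstraints", True)])
bad_cert = Certificate(3, 2, "Example Issuing CA", "www.example.com",
                       KeyUsage.KEY_CERT_SIGN, False,
                       [Extension("madeUpPolicy", True)])

print(check_certificate(ca_cert))
print(check_certificate(bad_cert))
```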
How encrypted communications work
Now that you know the concept of encryption and what a certificate is, I will explain the process of establishing a secure communication when connecting to a website that uses SSL, and how SSL certificates are used in that process. On the Internet, the protocol that does this is called TLS: Transport Layer Security.
Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are cryptographic protocols designed to provide communication security over the Internet. They use X.509 certificates with asymmetric cryptography to authenticate the counterparty with whom they are communicating, and to exchange a symmetric key that will be used to encrypt data.
Let’s take a look at the TLS Handshake:
This is a simplified version of the TLS handshake; if you need more information, please refer to the TLS RFC.
1. TLS runs over a reliable transport (TCP), which means that we must first complete the TCP three-way handshake, which takes one full roundtrip.
3. With the TCP connection in place, the client sends a number of specifications in plain text, such as the version of the TLS protocol it is running, a random number, the list of supported cipher suites, and other TLS options it may want to use. A cipher suite is a named combination of a key exchange method (RSA, Diffie-Hellman), a cipher (RC4, Triple DES, AES) and a hash method (HMAC-MD5, HMAC-SHA…), which together define the security settings used to establish the encrypted communication with TLS.
4. The server picks the TLS protocol version for further communication, decides on a cipher suite from the list provided by the client, attaches its certificate, and sends the response back to the client. Optionally, the server can also send a request for the client’s certificate and parameters for other TLS extensions.
5. Assuming both sides are able to negotiate a common version and cipher suite, and the client is happy with the certificate provided by the server, the client generates the key material for the session (with RSA key exchange, a secret encrypted with the public key from the server’s certificate) and calculates the session key from it and the random numbers exchanged earlier. The client sends the key exchange message to the server and starts encrypting with the session key.
6. The server processes the key exchange parameters sent by the client, calculates the same session key, checks message integrity by verifying the MAC (hash), and returns an encrypted « Finished » message back to the client.
7. The client decrypts the message with the negotiated symmetric key, verifies the MAC, and if all is well, then the tunnel is established and application data can now be sent.
Info: As you can see, both symmetric and asymmetric algorithms are used in a secure communication. The reason is that asymmetric algorithms are expensive to compute, so a symmetric key is created and shared through asymmetrically encrypted communication during the handshake.
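The way the two kinds of encryption work together can be sketched in Python. This is a heavily simplified model of an RSA-style key exchange, written for this article: the server key is a toy built from two small Mersenne primes, the key derivation is a plain SHA-256 instead of the real TLS pseudo-random function, and there is no certificate validation at all. All variable names are my own.

```python
import hashlib
import hmac
import secrets

# Server's toy RSA key pair; the public half (n, e) would travel in its certificate.
p, q = 2**89 - 1, 2**61 - 1        # far too small for real use
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent, never leaves the server

# Hello messages: both sides send a random number in plain text.
client_random = secrets.token_bytes(16)
server_random = secrets.token_bytes(16)


def derive_session_key(secret: int) -> bytes:
    """Mix the shared secret with both random numbers (simplified key derivation)."""
    material = secret.to_bytes(16, "big") + client_random + server_random
    return hashlib.sha256(material).digest()


# Client: pick a secret, encrypt it with the server's public key (asymmetric step).
client_secret = secrets.randbits(128)
key_exchange_message = pow(client_secret, e, n)
client_key = derive_session_key(client_secret)

# Server: recover the secret with its private key and derive the same session key.
server_secret = pow(key_exchange_message, d, n)
server_key = derive_session_key(server_secret)

# "Finished": the server sends a MAC computed with the session key, the client checks it.
transcript = client_random + server_random
finished = hmac.new(server_key, transcript, hashlib.sha256).digest()
expected = hmac.new(client_key, transcript, hashlib.sha256).digest()
client_accepts = hmac.compare_digest(finished, expected)

print(client_accepts)
```

The expensive asymmetric operation happens once, to move one small secret; everything after the handshake is protected with the symmetric session key that both sides now share.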
I hope this article helped to clarify what SSL is and how security works on the Internet. If you need more information, leave me a comment 😉
- RFC x509: https://www.ietf.org/rfc/rfc2459
- RFC TLS: https://tools.ietf.org/html/rfc5246
- Wikipedia: http://en.wikipedia.org/wiki/Public_key_infrastructure
- Digital Signature: http://commandlinefanatic.com/cgi-bin/showarticle.cgi?article=art012