Certificate Problems are a Common Cause of Downtime
The broad adoption of cryptography throughout modern enterprises is an important innovation and a key tool for improving the security of organizational systems and data. However, cryptography creates complexities and dependencies that are often not well accounted for and that can lead to system downtime. The use of cryptographic certificates for encryption and authentication is a key source of such downtime.
In a recent survey, 79% of responding organizations said they had suffered at least one certificate-related system outage during 2016; 38% suffered six or more such outages! We see this from time to time in our own business. Even when an outage is not directly attributable to a certificate problem, it is common for the restoration of a system or service to be significantly delayed by difficulty restoring a certificate or by the need to generate or obtain new certificates.
A recent incident at the Department of Homeland Security underscores the risk here. On the morning of Monday, February 20th, 2017, DHS (specifically, Citizenship and Immigration Services) had a domain controller certificate expire, causing many users to be unable to log in to their accounts. Impressively, DHS was able to restore service by 10am, even though this was a federal holiday, so this was almost a best-case scenario. The majority of the organizations responding to the above-mentioned survey said that it would take their organization at least six hours to respond to such an outage.
Some basic guidelines for managing certificates include:
- Develop organizational policies regarding cryptography, including the requirement that all cryptographic devices and keys be approved by and registered with one central authority (usually IT or the CIO).
- Track all certificates in a central database (could just be a spreadsheet). Key information should include:
  - System(s) on which the certificate is used
  - Internal party responsible for the certs and/or system
  - External party from which the certificate was obtained, if any, including contact info
  - Expiration date
  - Any necessary notes or instructions on the use of the certificate
- Store all certificates and parts of certificates in a backed-up location with strict access control and logging. Where possible, use a two-party rule, where the certificates are stored in portions accessible by different parties.
- Check key services periodically for certificate expiration dates. If an expiration falls within the inspection period (e.g., if you check quarterly and a cert will expire in less than three months), create a task to renew or replace the certificate immediately. Be sure to check dependencies, such as the upcoming expiration of a certificate authority's signing certificate.
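
The tracking and periodic-check guidelines above can be sketched in a few lines of Python. This is only an illustration using the standard library: the `CertRecord` fields, the function names, the host names, and the dates are all hypothetical, chosen to mirror the inventory columns listed above rather than any particular tool. The `fetch_expiry` helper assumes the service speaks TLS directly and presents a certificate that still validates; a certificate that has already expired will raise a verification error instead of returning a date.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone
import socket
import ssl


@dataclass
class CertRecord:
    """One row of the central certificate inventory (field names are illustrative)."""
    system: str          # system(s) on which the certificate is used
    owner: str           # internal party responsible for the cert and/or system
    issuer_contact: str  # external party it was obtained from, if any
    expires: date        # expiration date
    notes: str = ""      # any notes or instructions on use


def expiring_within(records, today, window_days):
    """Return records expiring on or before today + window_days, soonest first.

    Already-expired records are included, since they also need action.
    """
    cutoff = today + timedelta(days=window_days)
    due = [r for r in records if r.expires <= cutoff]
    return sorted(due, key=lambda r: r.expires)


def fetch_expiry(host, port=443, timeout=5.0):
    """Read the expiration date of the certificate a live TLS service presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # notAfter looks like "Jan  5 09:34:43 2018 GMT"
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc).date()


# Hypothetical inventory; note the CA signing certificate is tracked as a dependency.
inventory = [
    CertRecord("dc01.example.internal", "IT Ops", "Internal CA", date(2017, 2, 20)),
    CertRecord("www.example.com", "Web Team", "Public CA support desk", date(2017, 9, 1)),
    CertRecord("Internal CA signing cert", "Security", "", date(2017, 3, 15)),
]

# Quarterly inspection: flag anything expiring within roughly three months.
for rec in expiring_within(inventory, today=date(2017, 1, 2), window_days=90):
    print(f"RENEW: {rec.system} expires {rec.expires} (owner: {rec.owner})")
```

With the sample data, the quarterly check flags the domain controller certificate and the CA signing certificate but not the web certificate. In practice the window should be at least as long as the interval between inspections, and `fetch_expiry` could be used to confirm that the dates recorded in the inventory match what each service is actually presenting.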