We all know that ensuring your services remain online is good for business. Customers cannot transact with you if your systems are offline and, in a world where high availability is expected, organisations that suffer downtime damage the reputation of their brand.
Many organisations have invested in technology that promises high availability, but to ensure resiliency it is important to remember that high availability is not only about the technology. High availability results from both the implementation of technology and the processes that support it.
The checklist below has been collated to help you ensure the solution you have implemented for your business promises and delivers the high availability you expect.
High Availability Checklist
1. Single Point of Failure
If your online availability is dependent on a single system, component or service, you do not have high availability.
The single point of failure may not necessarily be the application infrastructure itself. You may have multiple web, application and database servers constituted in a redundant, highly available configuration, but if these servers are hosted behind a lone router or firewall, you have a single point of failure for the entire solution. Looking at solutions by taking a holistic end-to-end approach can help you identify and mitigate this risk.
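One way to make the end-to-end review concrete is to list every layer a request passes through and count the redundant instances at each. The sketch below is illustrative only: the layer names and instance counts are hypothetical, and a real review would also cover power, network links and third-party services.

```python
def find_single_points_of_failure(layers: dict[str, int]) -> list[str]:
    """Return the layers in the request path that have fewer than two instances."""
    return [name for name, instances in layers.items() if instances < 2]


# Hypothetical end-to-end request path: layer name -> number of redundant instances.
request_path = {
    "firewall": 1,
    "router": 1,
    "web": 3,
    "application": 2,
    "database": 2,
}

spofs = find_single_points_of_failure(request_path)
print(spofs)  # the redundant server tiers pass; the lone firewall and router do not
```

In this made-up inventory the server tiers are redundant, yet the lone firewall and router still put the whole solution at risk.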
2. Service Dependencies
Modern applications and services do not operate in isolation. They are dependent on multiple external systems which provide the features and functionality needed by modern applications to serve their user base.
When deploying a solution for high availability, organisations must be cognisant of these external services and ensure they form part of the application or service’s high availability plan. Take, for example, an application that relies on a third-party email service to send users password resets and similar messages. Should this email service go down, the functionality of your application is compromised even though the application itself is technically available.
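One possible way to bring an external dependency into the plan is to give it a fallback. The sketch below assumes two interchangeable email providers; both provider functions are hypothetical stand-ins, not the API of any real email service.

```python
from typing import Callable, Iterable

# A "sender" is any callable that delivers a message and raises on failure.
Sender = Callable[[str, str], None]


def send_with_fallback(senders: Iterable[tuple[str, Sender]], to: str, body: str) -> str:
    """Try each provider in order and return the name of the one that succeeded."""
    errors = []
    for name, send in senders:
        try:
            send(to, body)
            return name
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all email providers failed: " + "; ".join(errors))


# Hypothetical providers, used only for illustration.
def primary_provider(to: str, body: str) -> None:
    raise ConnectionError("primary email service is down")


sent = []


def secondary_provider(to: str, body: str) -> None:
    sent.append((to, body))


used = send_with_fallback(
    [("primary", primary_provider), ("secondary", secondary_provider)],
    "user@example.com",
    "Your password reset link",
)
print(used)
```

Here the primary provider is simulated as being down, so the message goes out through the secondary one and the password reset still reaches the user.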
3. Multiple Instances
Applications and services must be hosted in a multi-instance configuration to ensure high availability.
If your application needs a web, application and database tier to function, these services need to be configured in a redundant, multi-instance fashion. In this way, should any one component fail, for whatever reason, the application remains functional.
Multiple-instance configurations can also improve performance. With load balancing and scaling in place, your application can absorb increased demand on computing resources with far less impact on user experience.
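The combination of redundancy and load balancing described above can be sketched as a simple round-robin pool that skips failed instances. This is a toy model with hypothetical instance names; a production load balancer would also run health checks and bring recovered instances back into rotation.

```python
from itertools import cycle


class RoundRobinPool:
    """Rotate requests across instances, skipping any marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._rotation = cycle(self.instances)

    def mark_down(self, instance):
        self.healthy.discard(instance)

    def next_instance(self):
        # Look at each instance at most once per call.
        for _ in range(len(self.instances)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


pool = RoundRobinPool(["web-1", "web-2", "web-3"])  # hypothetical instance names
before = [pool.next_instance() for _ in range(3)]
pool.mark_down("web-2")  # simulate one instance failing
after = [pool.next_instance() for _ in range(4)]
print(before, after)
```

After `web-2` is marked down, requests keep flowing to the two remaining instances, which is the behaviour a multi-instance tier is meant to provide.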
4. Geo-Replication
High availability solutions must take all risks into account, including the risk of an entire physical site going offline. To mitigate this risk, run your services in separate physical locations, with data replication between them to preserve system integrity.
A separate, redundant location running identical services has been an expensive option for organisations to adopt in the past. However, with the proliferation of cloud services, this high availability architecture is now affordable and available to even the smallest organisation or startup.
5. Backups and Point-in-Time Restores
Disaster recovery and high availability solutions start with backups. Without backups, you can have no disaster recovery. One could argue that backups are not needed for high availability, as backups are only ever needed to restore data. However, this is a dangerous premise and poses a serious risk.
If an organisation relies on high availability with no backups, it runs the real risk of unrecoverable system corruption. For example, if a human operator deletes an important file on a system configured for high availability, the deletion will replicate to all the available instances. If there are no backups, the system cannot be restored to a point in time before the deletion occurred. You can have high availability without backups. However, backups are an essential part of a resilient solution and are therefore an integral part of this checklist.
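The difference between replication and backup can be shown with a toy model. The class below is a deliberately simplified sketch, not a real storage system; the file name and backup label are made up for illustration.

```python
import copy


class ReplicatedStore:
    """Toy model: every write or delete is applied to all replicas at once."""

    def __init__(self, replica_count: int):
        self.replicas = [dict() for _ in range(replica_count)]
        self.backups = []  # list of (label, snapshot) pairs

    def write(self, key, value):
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key):
        for replica in self.replicas:
            replica.pop(key, None)

    def take_backup(self, label):
        # A backup is a point-in-time copy kept apart from the live replicas.
        self.backups.append((label, copy.deepcopy(self.replicas[0])))

    def restore(self, label):
        for saved_label, snapshot in self.backups:
            if saved_label == label:
                self.replicas = [copy.deepcopy(snapshot) for _ in self.replicas]
                return
        raise KeyError(f"no backup labelled {label!r}")


store = ReplicatedStore(replica_count=3)
store.write("pricing.csv", "v1")
store.take_backup("monday")
store.delete("pricing.csv")  # operator error, replicated everywhere
lost_everywhere = all("pricing.csv" not in r for r in store.replicas)
store.restore("monday")
recovered = all(r.get("pricing.csv") == "v1" for r in store.replicas)
print(lost_everywhere, recovered)
```

Three replicas offer no protection against the accidental delete, because each replica faithfully repeats it. Only the point-in-time snapshot brings the file back.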
6. Failover Testing
Failover testing ensures your high availability solution operates as it should.
Without failover testing your solution is essentially unproven. Regular failover testing is therefore an essential part of any high availability solution checklist. Finding out your solution has a single point of failure or undocumented dependency during a failover test is a great deal better than finding out when an unplanned incident causes unforeseen downtime which detrimentally affects your business.
7. Critical Resource Monitoring
You cannot manage what you cannot measure. Monitoring your critical resources, even when they are deployed in a configuration designed for high availability, is essential to ensuring the integrity of your high availability solution.
For example, consider a RAID disk array. Should a single disk in the array fail, your solution will remain online, demonstrating high availability. However, if that failed disk is not replaced, a second disk failure could prove catastrophic.
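The point of the disk example is that "online" and "healthy" are not the same thing, so monitoring should track how much redundancy is left. A minimal sketch, assuming the monitoring system already knows how many disks have failed and how many failures the array can tolerate (one for RAID 5, for instance):

```python
def array_status(failed_disks: int, tolerated_failures: int) -> str:
    """Classify an array by how much redundancy it has left."""
    if failed_disks < 0 or tolerated_failures < 0:
        raise ValueError("counts must not be negative")
    if failed_disks == 0:
        return "healthy"
    if failed_disks <= tolerated_failures:
        # Still online, but running with reduced protection: replace the disk.
        return "degraded"
    return "failed"


# An array that tolerates a single disk failure, observed with 0, 1 and 2 failed disks.
statuses = [array_status(n, tolerated_failures=1) for n in range(3)]
print(statuses)
```

An alert on the "degraded" state is what gives operators the chance to replace the first failed disk before a second failure takes the array down.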
Ensure always-on availability with cloud-based hosting and backups
This checklist shows the extent of the checks and balances needed to ensure true high availability and guaranteed system resiliency. Outsourcing these checklist items by subscribing to a cloud backup and recovery service can not only enhance your business resiliency but also improve your operational efficiency. Items such as single points of failure, multiple instances, geo-replication and backups are essential parts of Nexon’s cloud service offering.
Ready to be always on?
Discover the benefits of always-on availability to your business with our free whitepaper 5 Important Reasons Why Your Business Needs to Be Always On.