While technology has come a long way in terms of availability and performance, plenty of hiccups still prevent 100 percent uptime. This makes disaster protection planning a key component of any IT strategy, but what elements does such a plan need? Here are the top five must-haves for disaster protection:
1. A communications strategy
When things do go wrong, it is important that all stakeholders know whom to contact. This can take the form of a list of system administrators and their areas of responsibility, whether that is job scheduling or system management. Organizations that rely on third-party services should also maintain a list of vendor support contacts so issues can be escalated and resolved quickly.
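A contact list like this is easiest to use under pressure when it is structured by system rather than by person. The sketch below is a minimal, hypothetical example of that idea; the system names, people, and extensions are all invented for illustration.

```python
# Hypothetical disaster-communications contact list: each system maps
# to its administrator and a vendor-support escalation path.
CONTACTS = {
    "job_scheduling": {
        "admin": "J. Rivera",
        "phone": "x4102",
        "vendor_support": "scheduler vendor hotline",
    },
    "system_management": {
        "admin": "P. Okafor",
        "phone": "x4177",
        "vendor_support": "platform vendor portal",
    },
}

def who_to_call(system: str) -> str:
    """Return the first point of contact for a failing system."""
    entry = CONTACTS.get(system)
    if entry is None:
        # Anything not on the list goes to a default escalation point.
        return "Unknown system: escalate to the on-call duty manager"
    return f"{entry['admin']} ({entry['phone']})"
```

Keeping the vendor-support entry alongside the internal admin means that when an issue turns out to be a third-party problem, the escalation path is already in the same place.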
2. High availability
While no system is perfect, those designed for high availability (HA) are best equipped to cope when a component malfunctions or goes down. As Smart Data Collective contributor Victor Brown pointed out, many organizations turn to the cloud for their disaster protection needs, which helps ensure uptime even during a large-scale on-premises outage. However, the key components of an HA ecosystem (system redundancy, infrastructure swapping and proactive monitoring) are beneficial regardless of where resources are hosted. IBM i operators can leverage PowerHA SystemMirror to keep mission-critical resources available during both planned and unplanned downtime.
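Of those components, proactive monitoring is the simplest to illustrate: poll each component's health and raise an alert the moment one fails, rather than waiting for users to report an outage. The following is a bare-bones sketch of that loop, with made-up component names; a real setup would use an established monitoring tool rather than hand-rolled code.

```python
import time

def monitor(checks, alert, interval_s=30, max_cycles=1):
    """Run each named health check; call alert() for any that fail.

    `checks` maps a component name to a zero-argument callable that
    returns True when the component is healthy.
    """
    for cycle in range(max_cycles):
        for name, is_healthy in checks.items():
            try:
                ok = is_healthy()
            except Exception:
                # A check that crashes counts as a failure, too.
                ok = False
            if not ok:
                alert(name)
        if cycle < max_cycles - 1:
            time.sleep(interval_s)

# Example run with stand-in checks (one healthy, one failing).
failed = []
monitor(
    {"primary_db": lambda: True, "job_scheduler": lambda: False},
    alert=failed.append,
)
```

In practice `alert` would page an administrator or trigger a failover, and the checks would probe real endpoints instead of returning canned values.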
3. Disaster recovery
In an ideal world, mission-critical resources would remain available all the time, but even the best disaster protection plans sometimes fail. This makes it essential to have a disaster recovery (DR) solution in place before downtime strikes. According to a 2013 survey from the Disaster Recovery Preparedness Council, 72 percent of companies scored either a D or an F in disaster readiness.
From a business standpoint, much of the DR challenge comes from the lack of a clear strategy: the majority of businesses surveyed did not have comprehensive documentation of their DR practices. On the technical side, automation could significantly reduce downtime, as analysts identified human error as the top cause of system outages.
4. Frequent testing
Once a comprehensive DR plan and solution are in place, it is critical to test them for effectiveness. The council's survey revealed that 50 percent of the organizations polled tested their strategies only twice per year, and 13 percent never tested them at all. Considering that 70 percent of those that did conduct testing failed to meet their own service-level agreements, most businesses have significant room for improvement.
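One reason so many tests miss their service-level agreements is that the recovery time objective (RTO) is never measured against an actual drill. A minimal sketch of such a drill check is shown below; the 4-hour RTO and the stand-in restore routine are assumptions for illustration, not values from the survey.

```python
import time

RTO_SECONDS = 4 * 60 * 60  # hypothetical 4-hour recovery time objective

def run_drill(restore, rto_s=RTO_SECONDS):
    """Time a restore routine and report whether it met the RTO."""
    start = time.monotonic()
    restore()  # in a real drill: restore from backup, bring systems up
    elapsed = time.monotonic() - start
    return {"elapsed_s": elapsed, "met_rto": elapsed <= rto_s}

# Example run with a stand-in for a real restore procedure.
result = run_drill(lambda: None)
```

Logging the elapsed time from every drill, not just pass/fail, shows whether recovery performance is trending toward or away from the agreed SLA.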
Leveraging virtual servers and other virtualized infrastructure can reduce the time it takes to get systems up and running again. As IT analyst Greg Schulz recently pointed out in a StateTech article, virtualization allows organizations to build predefined configurations, which speeds deployment and minimizes the risk of configuration errors. He suggested creating a master template that can serve as the starting point for any new virtual deployment.
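The master-template idea can be sketched in a few lines: every new virtual server starts from one vetted baseline, with only deliberate per-host overrides layered on top. The field names and values below are invented for illustration; a real environment would express this in its hypervisor's or provisioning tool's own template format.

```python
import copy

# Hypothetical master template for new virtual servers. Cloning it
# gives every deployment the same vetted baseline, which reduces
# configuration drift and manual-entry errors.
MASTER_TEMPLATE = {
    "cpus": 2,
    "memory_gb": 8,
    "os_image": "approved-base-image-v1",
    "monitoring_agent": True,
}

def new_vm_config(hostname, **overrides):
    """Clone the master template, then apply per-host overrides."""
    config = copy.deepcopy(MASTER_TEMPLATE)
    config["hostname"] = hostname
    config.update(overrides)
    return config

# Example: a web server that needs extra memory but keeps the baseline.
web01 = new_vm_config("web01", memory_gb=16)
```

Because every host is derived from the same template, a DR rebuild becomes a matter of replaying these definitions rather than reconstructing each machine's configuration from memory.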