If the onset of Covid-19 has taught us anything, it is that we should expect and plan for the unexpected. A useable and tested IT disaster recovery plan will help ensure that you are prepared for different scenarios.
It is safe to say that many, if not most, were caught short in their planning for a global pandemic that would shut down economies, bring travel to a standstill and require teams to stay at home and adapt to a new way of working.
Over the coming years, many will be asked how they responded to lockdown, and those who did not have a plan, or who failed to execute their plan properly, will be found wanting.
That may be in the form of a performance review, as part of a future interview or as part of a business due diligence exercise.
We are already seeing some business valuations dropping in the aftermath of a poorly executed disaster recovery plan to deal with Covid-19.
As such, we thought it might be useful to highlight the key elements of disaster recovery from a technology perspective and then provide you with access to a free downloadable IT Disaster Recovery Plan Template in Microsoft Word.
In this article, we cover:
- What is an IT disaster recovery plan?
- Why is an IT disaster recovery plan important?
- The short-term and long-term effects of IT disruption on your business.
- How to minimise the cost of down-time.
- How do you create an IT disaster recovery plan?
- What should an IT disaster recovery plan contain?
- Disaster recovery planning tips.
- Disaster recovery KPIs.
- A downloadable IT disaster recovery plan template in Microsoft Word.
Our disaster recovery plan template will help you understand and create a disaster recovery plan that is specific to your organisation. It will help you think about what it would take for your business to return to business as usual in the event of an incident or outage.
What is an IT disaster recovery plan?
An IT disaster recovery plan is a set of procedures your organisation should follow when any of your IT is disrupted.
It forms part of a wider business disaster recovery plan that your business should have for all areas and scenarios.
To be useful when disaster strikes, your IT disaster recovery plan needs to be a living document that is amended whenever changes are made, risks evolve or your business changes.
IT disruption can be caused by various things such as human error, natural disasters, technological failure or cyber threats and attacks.
When a disruption occurs, your disaster recovery plan comes into play and details exactly what needs to be done to recover from or mitigate risk. It also details who is responsible for each action and gives timeframes for action.
Why is an IT disaster recovery plan important?
Planning for the unknown can be difficult, but a comprehensive IT disaster recovery plan will help you be as prepared as you can be. When an incident occurs, you will be a step ahead, with a well-thought-through plan to follow at a moment when you may not have much time to think.
A disruption can happen at any time and the risk when one occurs is often high.
As mentioned above, disruption can be caused by natural disasters, hardware failures and human error.
Disruption can cause data loss and can make your business cease trading, so you need to be as prepared as possible.
Any type of IT disruption can result in one or more of these detrimental outcomes:
- Loss of data
- Loss of access to applications
- Loss of productivity
- Impact on ability to generate revenue
- Impact on reputation
- Potential loss of customers
- Large fines
Dealing with any of these scenarios will have a negative and significant impact on your organisation.
According to Statista, IT disruption is costly for companies around the world: in 2019, 25 percent of respondents worldwide reported an average hourly server downtime cost of between 301,000 and 400,000 U.S. dollars.
The short-term and long-term effects of IT disruption on your business
The short-term impacts of an IT disruption include:
- A drop in your team’s productivity
- The potential loss of your ability to trade
- Customers lost to your competition
- Damage to your organisation’s reputation
All of the above translate to big losses in revenue and custom.
The biggest long-term impact of IT disruption is the negative financial impact that it will have on your organisation.
IT downtime has the potential of snowballing into persistent problems and these usually result in a loss of revenue and subsequent profit.
Organisations in some industries may even face large fines that can be severely damaging to the business. For example, Marriott Hotels was fined £18.4m for a data breach that affected millions of its customers.
How to minimise the cost of down-time
At this point it should be obvious that you minimise the cost and impact of down-time by having a comprehensive IT disaster recovery plan. However, having a comprehensive IT disaster recovery plan is not enough in itself.
For the whole process to work effectively, your comprehensive disaster recovery plan needs to be updated and tested regularly.
As mentioned previously, it needs to be a living document that reflects the true state of your business and your appetite for risk.
How do you create an IT disaster recovery plan?
Creating an IT disaster recovery plan is a lengthy process which requires a lot of thinking. However, when broken down into component parts, it is very achievable.
Creating an IT disaster recovery plan has 10 steps, as follows:
1 – Top management commitment
Composing an IT disaster recovery plan is a resource intensive task. Therefore, you will need to get commitment from top management as they will help you allocate a budget and resources, as well as sign off on your activities on completion.
2 – Establish a planning committee
Top management should help you establish a planning committee. The committee should be made up of employees that represent each department of your organisation.
For example, it could include high-ranking employees from marketing, finance, operations, customer service and sales. Operations managers and data processing managers also need to be included in the committee, as they will have key information and influence.
3 – Perform a risk assessment and business impact analysis
Before you start mapping out your disaster recovery processes, you will first need to identify the potential threats that your organisation could face.
Secondly, you will need to establish the likelihood of those threats. Finally, you will need to determine the damage these threats could cause to your operations.
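One common way to make these three steps comparable is a simple risk score, where each threat's likelihood is multiplied by its impact. The sketch below is illustrative only: the threat names, the 1 to 5 scales and the scores are assumptions made for the example, not figures from a real assessment.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (business-stopping)

    @property
    def risk_score(self) -> int:
        # Simple risk-matrix scoring: likelihood multiplied by impact.
        return self.likelihood * self.impact


# Hypothetical threats and scores, for illustration only.
threats = [
    Threat("Ransomware attack", likelihood=3, impact=5),
    Threat("Office flood", likelihood=1, impact=4),
    Threat("Accidental file deletion", likelihood=4, impact=2),
]

# Highest score first, so the biggest exposures are planned for first.
ranked = sorted(threats, key=lambda t: t.risk_score, reverse=True)

for threat in ranked:
    print(f"{threat.name}: {threat.risk_score}")
```

Your own scales and weightings may well differ; the point is simply that every threat is scored the same way so that the results can be compared.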
4 – Establish processing and operations priorities
It is important to prioritise threats based on their effects on each business function, with some business functions being more important to your operations than others.
By prioritising, you will be able to identify the business functions that would require your attention first.
The criteria on which you prioritise should be based upon the importance of the business function or the length of time required to restore systems to a functional state.
If an outage requires a lot of time to fix, this should be prioritised.
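As a rough illustration, the two criteria above can be combined into a single restore order. The business functions, importance ratings and restore times below are hypothetical, and sorting by importance first and restore time second is just one possible choice.

```python
# Hypothetical business functions:
# (name, importance from 1 to 5, estimated hours to restore).
functions = [
    ("Email", 3, 2),
    ("Online payments", 5, 8),
    ("Internal wiki", 1, 4),
    ("Customer support desk", 5, 3),
]


def restore_order(items):
    # Most important functions first; among equally important functions,
    # the one that takes longest to restore is started first.
    return sorted(items, key=lambda f: (-f[1], -f[2]))


for name, importance, hours in restore_order(functions):
    print(f"{name}: importance {importance}, about {hours}h to restore")
```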
5 – Determine recovery strategies
At this stage you will need to determine how to restore the affected business functions.
To keep things simple, it is best to have a high-level strategy rather than a nitty-gritty detailed explanation on the best ways to restore the affected business functions.
6 – Collect data
Without taking into account your data, you don’t really have an IT disaster recovery plan.
This is an essential step that you need to get right and always keep up to date.
You will need to gather information on how your different business functions operate and which processes will need to be implemented in the event of disruption.
The data you will need to collect includes:
- Contact details of regulators.
- Key vendors (such as your electricity provider).
- Important members of departments (for example, your finance manager in your accounts department).
- Data breach notification checklist (containing information on asset inventories, insurance policies, etc).
- Data flow maps on how your business functions depend on IT.
7 – Organise and document plan
You should now have all the necessary information for you to start composing your IT disaster recovery plan.
You should have a list of threats for each department and can then create a plan to deal with each threat as appropriate.
You will need to:
- Verify the source of the threat
- Make sure any physical premises are secure
- Make sure your employees are safe
- Find a temporary solution
- Begin your recovery
8 – Develop testing criteria and procedures
You need to make sure your plans are appropriate for the threats you have listed, making sure there are no gaps in your plan.
To test your plan, you must first define what makes the test a success or a failure.
The most important success factor is that your organisation recovers from disaster.
Other important KPIs are how much data you retain and how long it took for your organisation to recover from incidents.
9 – Test the plan
Now that you have testing KPIs and criteria in place, you should go ahead and test your disaster recovery plans against different scenarios.
Testing should occur at least once a year, or whenever there is a major change in your IT environment.
If you encounter any gaps with your plan, make sure you document them and deal with them.
10 – Obtain approval
You have now got a comprehensive plan that covers all possible threats and outlines how you will recover from an incident.
You will now need to gain approval from top management as they are ultimately responsible for your organisation’s policies and procedures.
What should an IT disaster recovery plan contain?
To ensure your organisation’s technology, data and employees are safe from a disaster or incident, your disaster recovery plan should include:
- A Register of your organisation’s IT: What part of your IT supports business functions? What are the risks associated with losing these?
- Stakeholders Information: Who are the people affected? How does it affect them?
- An Inventory of Hardware and Software: List your in-use and spare software licences and hardware.
- Supplier Information: Do you need to contact any supplier if you do suffer an outage? For example, your IT provider or your landlord.
- Location Information: If your office space is inaccessible, is there an alternate location you can use?
- Tolerance for downtime and data loss: You need to establish how long your organisation can afford to be down and how much data you can afford to lose before your business is seriously affected.
- Testing Plan: How and when will you test your disaster recovery plan?
- Training Requirements: Is there any training that you need to provide to your team?
Disaster Recovery Planning Tips
Throughout our years of helping organisations with their disaster recovery planning and following accepted best practice, we have identified a number of key areas to concentrate on:
Clear and frequent communications
When an incident occurs, it is essential that you clearly and frequently communicate with your team about what is going on with the outage.
If you have a customer service function in your business, this part of your business should be prioritised as your customer service team will have to regularly update your customers.
A functioning communication plan
This part of the disaster recovery plan is often overlooked but it is actually a very important element.
How should you communicate with your employees and other stakeholders if the email and phone systems are down? Alternate communication methods are required, such as having a WhatsApp group or even sending emails from personal accounts.
A well-crafted communications plan should not only focus on your employees, but it should also include your vendors, suppliers and customers.
Additionally, you should also compose and include a statement that can be used on your website and social media platforms.
Eliminate single points of failure
A quick way of reducing downtime and mitigating costs is by removing single points of failure from your business wherever practical.
For example, you should load balance your servers, follow backup best practices and build technical fail-safes into your deployments.
Defined responsibilities and back up personnel
All disaster recovery plans should clearly define key people and their responsibilities.
Having clearly identified and defined roles will help everyone involved to understand who is responsible for what, who should be contacted and who acts as a backup resource in the event that a key member is not available.
Everyone in the organisation should understand the process and what is expected of them.
Prioritise prevention
Unfortunately, there is no way of preventing outages entirely; they will happen no matter what. However, what you can do is minimise their impact.
You can minimise the impact of outages by replacing outdated systems, updating old security features and fixing any issues that have the potential to become business affecting.
Alternatively, you can implement a proactive IT support strategy to try and identify issues before they become business affecting.
Another way of preventing downtime is by updating and testing your disaster recovery plan regularly or when changes happen to your IT infrastructure.
Incident post-mortem
When an incident takes place, it is important to digest the entire occurrence and learn from any of your findings once you have overcome the challenge.
A post-mortem should cover why the issue occurred, the impact it had on your business, what actions were taken to mitigate the incident, the solution that was applied and, most importantly, what can be done to prevent the issue from occurring again in the future.
Decide how you handle sensitive information
It is important to define procedures that ensure sensitive information remains protected during disaster recovery.
The procedures should explain how sensitive information is retained and accessed.
Test your plan regularly
We can’t stress this enough; it is incredibly important to test your disaster recovery plans.
The technology your organisation relies on is constantly changing, so you should continuously test your plans to make sure your disaster recovery plan performs well during an outage.
Remember, your disaster recovery plan is only as good as your last test.
Disaster recovery KPIs
To measure the success of your IT disaster recovery plan you will need to look at these major key performance indicators:
- Recovery Point Objective (RPO)
- Recovery Time Objective (RTO)
- Recovery Time Actual (RTA)
- Test Frequency
Recovery Point Objective (RPO)
Recovery point objective is the maximum amount of data, measured in time, that your organisation can afford to lose. It is counted back from the incident to the point at which your data was last preserved in a usable format, which is often the most recent backup.
Most organisations back up their data every 24 hours. Therefore, your RPO is most likely going to be 24 hours.
For organisations that rely heavily on data to produce income, such as data centres or investment companies, the RPO would typically be one hour or less.
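To make the idea concrete, the short sketch below compares the gap between the last usable backup and an incident against an RPO. The timestamps and function names are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, incident: datetime) -> timedelta:
    # Anything written after the last usable backup is at risk of being lost.
    return incident - last_backup


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    # The RPO is met when the data at risk is no older than the RPO allows.
    return data_loss_window(last_backup, incident) <= rpo


# Hypothetical example: nightly backup at 01:00, outage at 15:30 the same day.
last_backup = datetime(2024, 1, 10, 1, 0)
incident = datetime(2024, 1, 10, 15, 30)

print(data_loss_window(last_backup, incident))                 # 14:30:00
print(meets_rpo(last_backup, incident, timedelta(hours=24)))   # True
print(meets_rpo(last_backup, incident, timedelta(hours=1)))    # False
```

In this example a daily backup satisfies a 24-hour RPO, but the same backup schedule would fall well short of a one-hour RPO.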
Recovery Time Objective (RTO)
Recovery time objective is the amount of time your applications can be down before they create significant damage to your organisation.
This should include the time your IT provider takes to bring the applications back.
High-priority applications should have high availability systems and options in place that should allow them to be recovered within seconds.
Recovery Time Actual (RTA)
Recovery Time Actual is the real time it takes to resolve your incident. For example, your email services go down and the RTO for such an issue is 3 hours.
However, it is 3am and you might need to wake up your IT manager, who then needs to travel 1.5 hours to the office but finds that her car has broken down and it actually takes her 3 hours to get to the office.
Once she’s at the office it takes her about 2 hours to solve the problem. The RTA is then 5 hours.
It is important to not have a big disparity between your RTO and RTA. The way to get these two metrics as close as possible is by testing.
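The arithmetic in the example above can be written out as a small sketch. The step names are made up for illustration; the durations are the ones used in the example.

```python
from datetime import timedelta

rto = timedelta(hours=3)  # target recovery time from the example

# Steps from the example: the delayed journey, then the fix itself.
steps = {
    "travel to office": timedelta(hours=3),
    "resolve the problem": timedelta(hours=2),
}

# RTA is the total time actually taken, from incident to resolution.
rta = sum(steps.values(), timedelta())
overrun = rta - rto

print(f"RTA: {rta}, RTO: {rto}, overrun: {overrun}")
# RTA: 5:00:00, RTO: 3:00:00, overrun: 2:00:00
```

Recording each step separately during a test shows where the time goes, which makes it easier to see what to change in order to close the gap.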
Test frequency
This KPI measures how often you test your disaster recovery plans.
Regular testing helps to keep processes and plans up to date. It is an easy way of making sure your plans do not have any gaps.
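A simple check for this KPI might look like the sketch below. It assumes the rule suggested earlier in this article, testing at least once a year or after a major change to your IT environment; the function name and dates are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "at least once a year" is treated as 365 days.
MAX_TEST_INTERVAL = timedelta(days=365)


def is_test_overdue(last_test: date, today: date,
                    major_change_since_test: bool = False) -> bool:
    # A test is due if there has been a major change since the last test,
    # or if more than a year has passed since it.
    return major_change_since_test or (today - last_test) > MAX_TEST_INTERVAL


print(is_test_overdue(date(2023, 3, 1), date(2024, 6, 1)))   # True
print(is_test_overdue(date(2024, 3, 1), date(2024, 6, 1)))   # False
print(is_test_overdue(date(2024, 3, 1), date(2024, 6, 1),
                      major_change_since_test=True))          # True
```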
We live in a world where IT is at the core of almost everything we do. Suffering an IT disruption or an IT incident is often unavoidable so the best way to respond is to ensure you have tested systems and processes in place to minimise impact.
For the sake of your business continuity, it is crucial to have a well-kept and tested IT disaster recovery plan in place before your business suffers an IT disruption.
Creating an IT disaster recovery plan can be difficult. It requires knowledge, expertise and an intimate understanding of how your business operates.
Given that, our downloadable IT disaster recovery plan template in Microsoft Word format will help provide you with the guidance needed to establish a playbook of specific actions and responsibilities should anything go wrong in the future.
It will explain how you should go about creating your plan and will outline key elements to include in your disaster recovery plan.
If that isn’t enough, or if you are concerned about your IT disaster recovery plan and would prefer help from experienced experts, please call us on +44 203 034 2244 or +1 323 984 8908. Alternatively, you can contact us online.
We will be happy to have a conversation with you and work closely with you and your team to establish a solid disaster recovery plan and ensure business continuity in the face of any IT incidents.
Cardonet have been helping organisations like yours overcome their technology challenges since 1999.
With a team of highly technical and knowledgeable in-house engineers that are available 24x7x365 in the United Kingdom, Europe and US, we can ensure your business continuity and smooth running of your operations.