When AMAS performs an audit of a University-owned server, the backup procedures are reviewed along with related disaster recovery plans. If you are responsible for the maintenance of servers in your department or school, whether or not they are used for administrative purposes, the following are some procedures you should consider having in place:
- Backup copies of the operating system, application software, and critical data should be made on a regular basis. The frequency of the backup will depend on the frequency of changes made and the criticality of the data. For example, software may not change frequently, but user data may change daily.
A backup copy of the most recent version of the operating system and application software should be available to rebuild a crashed server. These copies are available from the software vendors or ITS (Information Technology and Services). If copies are not available from these sources, the people responsible for the servers should generate them. Some vendors will provide backup copies of software in emergency situations. Sometimes the software is kept in escrow at a third-party location. AMAS suggests that these details be documented during the RFP process so that, in an emergency, the needed backups can be retrieved quickly.
Backup copies of the data should also be available in case the current data on the servers are lost. The data should be backed up at a frequency that is determined by the users of the data. An application with low activity and paper source documents for its transactions may only need to be backed up occasionally, e.g. semi-monthly. An application with high transaction volume and few or no paper source documents may need to be backed up daily, e.g. with incremental backups of only the data that changed. The frequency of the backups should depend on how much effort the users would need to put into manual recovery. There are different schemes used for rotating backups and replacing the oldest backup with the most current. Incremental backups may require many copies of the data (up to a year's worth). The simpler “grandfather, father, son” method requires that three versions be kept; the oldest version is replaced by the newest. Whatever method is used, procedures for generating the backups should be written, and logs should be kept of the dates on which the backups are made.
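The three-version rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`rotate_backups`, `run_backups`), the `backup-YYYY-MM-DD` labels, and the daily interval are hypothetical, and a real procedure would copy data to media rather than track labels in a list.

```python
from datetime import date, timedelta


def rotate_backups(kept, new_backup, keep=3):
    """Add new_backup and drop the oldest so that at most `keep` remain.

    `kept` is ordered newest first, so the oldest version falls off the end.
    """
    return [new_backup, *kept][:keep]


def run_backups(start, count, interval_days=1):
    """Simulate `count` backups and return (versions kept, log of dates)."""
    kept = []
    log = []  # one entry per backup made: (date, label)
    for i in range(count):
        made_on = start + timedelta(days=i * interval_days)
        label = f"backup-{made_on.isoformat()}"  # hypothetical naming scheme
        kept = rotate_backups(kept, label)
        log.append((made_on, label))
    return kept, log


kept, log = run_backups(date(2024, 1, 1), count=5)
print(kept)      # ['backup-2024-01-05', 'backup-2024-01-04', 'backup-2024-01-03']
print(len(log))  # 5
```

After five daily backups only the three newest versions remain, while the log still records all five dates on which a backup was made.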
- Once the frequency of the backups is determined, consideration should be given to storing some of these backups at an offsite location rather than in the server room alongside the servers. However unpleasant and far-fetched it may seem, there is a possibility that after a disaster, the location of your servers, or possibly even the building the servers are located in, may not be accessible. For this reason, it is a common business practice to have a copy of the software and data kept at an offsite location.
Assuming that the operating system and application software are already kept at an offsite location such as ITS or with a vendor, let's concentrate on the data. There are different schemes used for rotating backup data offsite. It is best to have, at a minimum, the most current copy of the data stored offsite. Where this is impracticable, consideration should be given to having the most current copy sent offsite at least weekly or monthly. The criticality of the data and the amount of effort the users are willing to expend on recovery should be part of the decision. If data were backed up daily and a copy sent offsite weekly, the worst-case scenario would be the loss of one week's worth of data.
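The worst-case arithmetic above can be written out as a small Python sketch. The function name and parameters are hypothetical, and the model is deliberately simplified: it assumes the copy sent offsite is made at the time it is sent, and that either the onsite backups survive the incident or they do not.

```python
def worst_case_loss_days(backup_interval_days, offsite_interval_days,
                         onsite_copies_survive):
    """Return the largest amount of data, in days, that could be lost.

    If the onsite backups survive, the loss is bounded by how often
    backups are made. If the site is inaccessible, only the offsite copy
    is usable, so the loss is bounded by how often a copy goes offsite.
    """
    if backup_interval_days <= 0 or offsite_interval_days <= 0:
        raise ValueError("intervals must be positive")
    if onsite_copies_survive:
        return backup_interval_days
    return offsite_interval_days


# Daily backups with a copy sent offsite weekly:
print(worst_case_loss_days(1, 7, onsite_copies_survive=True))   # 1
print(worst_case_loss_days(1, 7, onsite_copies_survive=False))  # 7
```

Comparing the two results for different offsite intervals is one way to frame the discussion with users about how much recovery effort they are willing to accept.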
Another issue to consider is the development and documentation of a disaster recovery/business continuation plan. The exercise of developing a plan affords you the time, while in a “non-emergency” mode, to think through clearly the steps required to restore the systems. Plans are usually somewhat dynamic and should be reviewed and updated periodically. Some things to consider putting in a plan are:
- Hardware requirements and vendor contacts.
- Software products used with version numbers and vendor contracts.
- Backup procedures.
- The location of the backups.
- A list of employees that are critical to the recovery process and how to contact them.
- A list of users to contact.
- Step-by-step procedures required to restore the systems and general operations.
The most important and most difficult step is testing all the plans and procedures in place to recover from a disaster. Testing can be time consuming and requires detailed planning. It should be done once a year at a minimum. As processes and business operations change, the backup plan may also have to be changed, and the best way to determine this is through testing.
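One small part of such a test, checking that restored files match the originals, can be automated. The following Python sketch is an assumption about how that might be done, not a prescribed AMAS procedure: the function names are hypothetical, and it only compares file contents by SHA-256 checksum between an original directory and a directory restored from backup.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(source) != sha256_of(restored):
            problems.append(relative.as_posix())
    return problems
```

An empty list means every original file was found with identical contents in the restored copy. A check like this covers only the data; the step-by-step procedures, contact lists, and hardware arrangements in the plan still need to be exercised by the people involved.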
If you have comments about any of the items contained in this document, or have a suggestion of something else that should be included, please feel free to send them to AMAS at firstname.lastname@example.org