You are the CISO for a publicly traded company. What protections would you implement to ensure availability of your systems (and why)?
Ensuring system availability requires a deliberate set of protections and procedures. The following should be implemented to keep systems consistently available:
Resources play a large part in how well a system functions. If the demand placed on a system exceeds the resources it has, the system will struggle to operate effectively and efficiently. It is therefore important to understand how much data and load the systems can currently handle. Once that capacity is known, it needs to be determined whether it also allows for expected growth. If the systems cannot handle the expected growth, the required upgrades need to be implemented before demand reaches the limit. Memory is one such resource: if too much memory is in use, the system can become overloaded and may fail. By managing resources and ensuring that there is enough room for growth, system availability can be better ensured.
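The growth check described above can be sketched in a few lines of Python. This is a minimal illustration, not a full capacity model: the function name months_of_headroom, the compound monthly growth assumption, and the example figures are all hypothetical and would need to be replaced with measured values.

```python
def months_of_headroom(current_usage, capacity, monthly_growth_rate, horizon_months=120):
    """Return how many whole months usage stays within capacity.

    Assumes usage compounds by monthly_growth_rate each month (an
    illustrative assumption). Returns None if capacity is not exceeded
    within horizon_months.
    """
    if current_usage > capacity:
        return 0  # already over capacity, no headroom left
    usage = current_usage
    for month in range(horizon_months):
        usage *= 1 + monthly_growth_rate
        if usage > capacity:
            return month  # the last month that still fit
    return None


# Hypothetical figures: 600 GB used of 1000 GB, growing 5% per month.
print(months_of_headroom(600, 1000, 0.05))
```

If the result is shorter than the time needed to procure and install an upgrade, the upgrade should already be under way.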
Mitigating risk and anticipating problems also helps to ensure system availability. If a risk can be identified and analyzed before it causes a problem, the system is more likely to remain available. The larger and more complex a system is, the more risks it faces, and the harder it becomes to guarantee 100% availability. All risks should therefore be managed and mitigated as effectively and efficiently as possible. It is equally important to know what to do when a risk does cause a problem, in order to lessen the impact of system downtime. There should be a process in place to identify each risk, determine the response if it occurs, and define how the mitigation will be implemented. If analyzed and addressed properly, many risks can be mitigated before a problem actually occurs. By mitigating risk and anticipating problems, system availability can be better assured.
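One way to make such a process concrete is a simple risk register that records each risk, its planned response, and a rough priority. The sketch below is illustrative only: the Risk class, the scoring rule (likelihood multiplied by estimated downtime), and every entry in the register are assumptions made for the example, not figures from a real assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float      # assumed probability of occurring in a year, 0.0 to 1.0
    downtime_hours: float  # assumed downtime if the risk occurs
    response: str          # what to do if the risk occurs

    @property
    def expected_downtime(self) -> float:
        # Simple scoring rule (an assumption): likelihood times impact.
        return self.likelihood * self.downtime_hours


def prioritize(risks):
    """Order risks from highest to lowest expected downtime."""
    return sorted(risks, key=lambda r: r.expected_downtime, reverse=True)


# Hypothetical register entries.
register = [
    Risk("Expired TLS certificate", 0.30, 1.0, "Renew and redeploy certificate"),
    Risk("Primary database failure", 0.10, 8.0, "Fail over to standby replica"),
    Risk("Data center power loss", 0.02, 24.0, "Switch to secondary site"),
]

for risk in prioritize(register):
    print(f"{risk.name}: {risk.expected_downtime:.2f} expected hours/year -> {risk.response}")
```

Ranking by expected downtime is only one possible convention; what matters is that every identified risk has a documented response before it occurs.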
To prepare for system failures, testing environments should be used to find and fix problems before they affect live systems. Along with the application code, the other components of the system should be tested as well. This can include, but should not be limited to, hardware components, software components, middleware, and other forms of infrastructure. It is also a good idea to run a stress test on the system. A stress test shows whether the system remains available for use when it is pushed to its limits. If the system cannot withstand the stress test, then changes need to be made and implemented. By preparing for system failures and running proper tests, system availability can be better assured.
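A stress test of the kind described above can be sketched with Python's standard library. This is a minimal sketch under stated assumptions: handle_request is a hypothetical stand-in for a real call to the system under test, and the request count and concurrency level are arbitrary example values. In practice it should be pointed at a test environment, not at production.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(i):
    """Hypothetical stand-in for one request to the system under test."""
    time.sleep(0.001)  # simulate a small amount of work
    return True


def stress_test(target, total_requests, concurrency):
    """Call target concurrently and summarize failures and latency."""

    def timed_call(i):
        start = time.perf_counter()
        try:
            ok = bool(target(i))
        except Exception:
            ok = False  # an exception counts as a failed request
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    failures = sum(1 for ok, _ in results if not ok)
    return {
        "requests": total_requests,
        "failures": failures,
        "error_rate": failures / total_requests,
        "max_latency_s": max(elapsed for _, elapsed in results),
    }


report = stress_test(handle_request, total_requests=200, concurrency=20)
print(report)
```

A rising error rate or maximum latency as the concurrency is increased indicates the point at which changes are needed.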