Building technology resilience: aspects and actions

This blog post was authored by Damon Owen - Managing Director, Technology Risk and Resilience and Dave Cozzens - Associate Director, Technology Risk and Resilience on Protiviti's technology insights blog.

This is the second in a two-part series exploring the benefits of technology resilience, its aspects and the steps involved in implementing a technology resilience program. This post describes aspects of a successful technology resilience program and the steps to implementing one, either with external help or using an organisation’s own resources.

Building technology resilience is a continuous process. Technology resilience programs call for diligent monitoring and constant adaptation to a shifting threat landscape. To begin with, they require a strong business case and well-articulated benefits to secure executive commitment and program funding. Then, continuous advancement of technology resilience capabilities is crucial to maintaining robust, secure infrastructure.

Technology resilience encompasses practices that maintain technology service availability even when normal operating conditions are disrupted. Its fundamentals include systems architecture that considers operational requirements and impact tolerances, anticipates failures and integrates countermeasures in its design. Essentials include monitoring and operational awareness to detect impending disruptions, dashboards and reports to track resilience capabilities and trends, and automated scripts designed to respond to failure scenarios. Automated scripts can address failure identification as well as the relocation, restoration and reconfiguration of systems, applications and database architectures, many of which are designed to be self-healing. Technology resilience programs rely on retrospective analysis to restore system configurations to their normal operating states, determine the causes underlying events and propose improvements to systems, while also analysing health metrics to continuously improve resilience.
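
As a rough illustration of an automated script that identifies a failure and relocates traffic, the Python sketch below picks the first healthy instance from a priority-ordered list. The `Endpoint` type, the `select_active` function and the probes are hypothetical names invented for this example; a real environment would rely on its own monitoring and orchestration tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Endpoint:
    """A hypothetical service instance that can take traffic."""
    name: str
    is_healthy: Callable[[], bool]  # health probe supplied by the caller


def select_active(endpoints: Sequence[Endpoint]) -> Optional[Endpoint]:
    """Return the first endpoint whose probe passes, in priority order."""
    for endpoint in endpoints:
        try:
            if endpoint.is_healthy():
                return endpoint
        except Exception:
            # A probe that raises is treated as a failed health check.
            continue
    return None


def failing_probe() -> bool:
    """Simulates a primary system whose health check times out."""
    raise TimeoutError("probe timed out")


# Illustrative setup: the primary is down, the secondary is healthy.
primary = Endpoint("primary", failing_probe)
secondary = Endpoint("secondary", lambda: True)

active = select_active([primary, secondary])
print(active.name if active else "no healthy endpoint")
```

Treating a probe error as a failed check is a design choice in this sketch: it keeps a broken monitor from blocking failover, at the cost of occasionally failing over when only the probe is at fault.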


Aspects of the technology resilience program

Exploring aspects of a sound technology resilience program helps leaders appreciate the scope of the effort and informs the implementation approach for future success:

Strategic

  1. Developing a holistic strategy to align technology resilience with the organisation’s goals and objectives.
  2. Fostering cross-functional collaboration among information technology, security, operations and business departments to ensure the technology resilience strategy encompasses all interests.
  3. Extending resilience practices to include third-party risk management (TPRM) so that vulnerabilities at third parties do not spread to the organisation. Contract language and third- and fourth-party capabilities must ensure trading partners can and will deliver required resilience features.
  4. Building in redundancy: components and systems that will provide automated failover capability.
  5. Investing in technology solutions such as advanced monitoring, artificial intelligence-driven analytics, and predictive modeling to mitigate effects of threats.
  6. Distributing systems across a diverse infrastructure encompassing multiple locations and providers. This practice will reduce the impact of any single point of failure or local disruption.
  7. Integrating robust cybersecurity measures to prepare for and protect against breaches and attacks.
  8. Attesting, validating, simulating: testing technology resilience plans and capabilities regularly. These tests should include scenario-based simulations to ensure constant readiness.
  9. Providing ongoing training, skill development and awareness activities so resources know their roles and responsibilities regarding technology resilience. Aligning with risk and compliance efforts to inform teams about emerging threats. Establishing annual training in crisis management, business continuity and disaster recovery.
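
The effect of redundancy and distribution across locations can be made concrete with a little arithmetic. The Python sketch below assumes that sites fail independently and that the service stays up while at least one site is up, which real deployments only approximate. The function name and the 99% figure are illustrative assumptions, not values from any particular environment.

```python
def combined_availability(site_availability: float, sites: int) -> float:
    """Availability of a service that survives while at least one site is up.

    Assumes every site has the same availability and that sites fail
    independently of one another.
    """
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be between 0 and 1")
    if sites < 1:
        raise ValueError("sites must be at least 1")
    # The service is down only when every site is down at the same time.
    return 1.0 - (1.0 - site_availability) ** sites


# Illustrative figure: each site is available 99% of the time.
for count in (1, 2, 3):
    print(count, f"{combined_availability(0.99, count):.6f}")
```

Shared dependencies such as a common provider, network path or deployment pipeline break the independence assumption, which is one reason the list above recommends spreading systems across multiple providers as well as multiple locations.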

Operations

  1. Conducting regular, comprehensive risk assessments to identify vulnerabilities and help prioritise mitigation strategies and technology resilience efforts.
  2. Enhancing crisis management and incident response plans to deliver rapid, effective action when disruptions occur. Include communication with internal and external stakeholders, public and private authorities and emergency response organisations in these plans.
  3. Automating routine tasks, incident response and system recovery. Automation helps ensure rapid recovery and also delivers a consistent approach.
  4. Enabling resilience mechanisms that switch to backups when primary systems fail. Even better: enabling workload mobility and transaction portability so no such failover is ever needed.
  5. Implementing storage replication and database recovery tools to ensure data availability and consistency, while minimising any lost transactions and data.
  6. Performing regular backups to secure and isolated (off-line, “air gapped”) locations to protect data from encryption, corruption, destruction and loss. Enabling restoration of clean data to predetermined points in time.
  7. Implementing load balancing across multiple servers. Not only will this practice prevent overload and maintain system performance, but it will also result in a more resilient architecture.
  8. Employing monitoring and alerting tools to provide insights into system health in real-time and to signal anomalies and failures.
  9. Establishing technology resilience program metrics and reporting to gauge the effectiveness of efforts and report progress. Consider key performance indicators (KPIs) such as impact tolerance, recovery time objectives (RTO) and recovery point objectives (RPO) versus actuals, issue management and others.
  10. Maintaining clear, up-to-date documentation on system configurations, operational processes, recovery procedures and dependencies among business functions, applications, systems and services.
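
Comparing RTO and RPO against actuals, as suggested in the metrics item above, can be sketched in a few lines of Python. The `RecoveryTest` record, the `objectives_report` function, the service names and all figures are hypothetical and serve only to show the shape of such a report.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Dict, Iterable


@dataclass(frozen=True)
class RecoveryTest:
    """Outcome of one recovery exercise for a single service (hypothetical shape)."""
    service: str
    rto: timedelta               # recovery time objective
    rpo: timedelta               # recovery point objective
    actual_recovery: timedelta   # measured time to restore the service
    actual_data_loss: timedelta  # measured window of data lost

    @property
    def rto_met(self) -> bool:
        return self.actual_recovery <= self.rto

    @property
    def rpo_met(self) -> bool:
        return self.actual_data_loss <= self.rpo


def objectives_report(tests: Iterable[RecoveryTest]) -> Dict[str, Dict[str, bool]]:
    """Summarise, per service, whether each objective was met."""
    return {t.service: {"rto_met": t.rto_met, "rpo_met": t.rpo_met} for t in tests}


# Illustrative figures only; they are not drawn from any real exercise.
tests = [
    RecoveryTest("payments", timedelta(hours=2), timedelta(minutes=15),
                 timedelta(hours=1, minutes=40), timedelta(minutes=5)),
    RecoveryTest("reporting", timedelta(hours=8), timedelta(hours=1),
                 timedelta(hours=9), timedelta(minutes=30)),
]

for service, result in objectives_report(tests).items():
    print(service, result)
```

A report like this could feed the dashboards and trend tracking described earlier, with missed objectives raised through issue management.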

Steps to implement and operate an effective technology resilience program

Leaders must consider these ten key steps when implementing and enhancing technology resilience programs:

  1. Identify technology risks and assess their significance to the organisation.
  2. Validate technology resilience drivers like regulations, reputation, customers and stakeholders.
  3. Develop the value proposition and calculate the program’s return on investment.
  4. Develop a proactive approach to technology risk and its impact on organisational resilience.
  5. Identify the leader accountable for the program, including communication about goals, responsibilities, and progress metrics and reporting.
  6. Establish all roles and responsibilities involved in the program as the basis for performance assessment.
  7. Work with lines of business to gain adoption. Consider using playbooks and performance evaluation processes as tools to promote adoption.
  8. Develop consistent communication. Maintain the program’s visibility, renew executive commitment and conduct operational evaluations.
  9. Strengthen the technology resilience program through periodic reevaluation, drills, testing and reporting against performance metrics, including key performance indicators (KPIs) and key risk indicators (KRIs).
  10. Mature the technology resilience program through continuous improvement and analysis of the evolving risk landscape.

Adapting to an increasingly disruptive threat landscape

Technology resilience programs enable organisations to adapt to an ever-evolving and increasingly disruptive threat landscape. These programs call for continuous advancement of technology resilience capabilities that maintain robust and secure infrastructures. Understanding the aspects of a quality technology resilience program and the actions needed to develop one is key to realising technology resilience benefits.

To learn more about our technology resilience solutions, contact us or download our Guide to Business Continuity and Resilience and refer to Achieving Resilience Starts at the Top.
