Operational Resilience

As the service delivery infrastructure of financial institutions grows more complex due to digitalization and the evolution of ecosystems, it is becoming increasingly challenging for financial institutions to respond to diverse and unpredictable risks, such as system failures, cyberattacks, third-party issues, pandemics, and geopolitical risks. This has led to a growing need for financial institutions to ensure operational resilience, or the ability to respond to incidents and recover operations that are important to customers and society. Regulators across the globe are increasingly urging financial institutions to accelerate work on their operational resilience frameworks.

In this newsletter, we provide you with an overview of operational resilience expectations for financial institutions, along with key insights and the latest from Japan and global regulators.

1. Strengthening the Operational Resilience of Financial Institutions
2. Regulatory Approaches for Operational Resilience
3. Financial Institutions’ Approach to Operational Resilience

1. Strengthening the Operational Resilience of Financial Institutions

Global financial regulators have focused on financial resilience over the last 10 to 12 years, and it has become a board-level agenda item. In the same period, however, the number of operational disruptions impacting firms has increased. Regulators therefore now see the need to elevate operational resilience so that it becomes a peer of financial resilience as a board-level agenda item.

The change that many regulators are trying to bring about here is to move operational risk and resilience from being a second line of defense concern to the first line of defense, embedded in the business and owned by operations, that is, by the people who actually provide the services to customers and the market rather than those assessing impact and likelihood. The first line of defense can better articulate the impact of a service failure on the market or on the customers who use the service. The second line of defense can provide oversight of how the first line of defense implements this. This is a fundamental shift in the way that we look at operational risk and resilience. It is all about how a firm responds and recovers if the risks of the first line activities were to materialize. Operational resilience is an outcome, and the firm will become more operationally resilient through more effective management of operations and some of its operational risks.

In recent years, many financial institutions in Japan and elsewhere have been facing the risks of large-scale business disruption due to systems failure, increasingly diverse cyberattack incidents, terrorism, regional conflicts, and natural disasters caused by climate change.
Following the onset of the COVID-19 pandemic and the accelerated digitalization of business operations, companies have increasingly relied on third or fourth parties, for example by using cloud services and launching new services in collaboration with FinTech companies. Because people, processes and IT systems are now highly interconnected, the risks in them are growing in complexity, and it often takes more time to analyze the root cause or the full scope of the impact of incidents that arise. Cases in which lengthy recovery times following system failures or temporary service outages cause significant customer impact are increasingly observed among financial institutions in Japan and globally.

Given the current complex interconnections of people, processes and systems, there are limitations to the typical risk management framework common in Japanese financial institutions, which focuses on preventive actions and zero-risk approaches. To move beyond this, financial institutions must operate under the assumption that unexpected incidents will inevitably occur, and accordingly deploy proper processes and resources to enable rapid recovery from those incidents; in other words, firms must strive for enterprise-wide operational resilience.

Operational resilience is explained in “Principles for operational resilience” from the Basel Committee on Banking Supervision (“BCBS”) as the ability of financial institutions to respond to and recover from disruptive events such as terrorism, cyberattacks, or natural disasters quickly and flexibly. Operational resilience is realized not by developing a new risk management framework to replace the existing one, but by leveraging the existing framework while supplementing it with a holistic enterprise-wide view of risks (see Figure 1).

[Figure 1: Image of a comprehensive framework that leverages existing risk management framework]

In the case of Business Continuity Plan (“BCP”) management for example, most firms have an established way of (1) categorizing important business processes, (2) analyzing necessary resources – primarily the firm’s own resources, considering financial impacts, reputational risks and others arising from the disruption of important business process – and (3) identifying the internal and external stakeholders critical to business continuity.

To ensure operational resilience, a firm must consider impacts not only on the firm itself but also, from a wider perspective, on customers, market participants and the stability of the financial system. In doing so, it is important to identify the necessary business processes and the interdependencies among third (and fourth) parties, cloud services and APIs on an end-to-end basis.

From the perspective of cyber security management as well, it is becoming increasingly important to implement effective measures for quick recovery of operations and services under the assumption of being compromised to some extent by cyberattacks, and the points raised above with respect to BCPs are also relevant in the context of cyber security; ensuring operational resilience will also strengthen cyber security.

2. Regulatory Approaches for Operational Resilience

In 2018, UK regulators started the discussion by publishing a discussion paper on operational resilience, and approximately 3 years later, BCBS issued its principles. In Japan, the JFSA released a discussion paper in December 2022, which reflects recent market developments and is aligned with the latest approaches of global financial authorities.

1) Overseas Financial Regulatory Authorities
While each region publishes different discussion papers and regulations, the objectives and principles they are trying to convey are fundamentally the same: how firms will respond to and recover from incidents. The language, taxonomy and terminology used in the papers differ, but the key objectives of each regulatory body and what they are looking to achieve are very similar.

KPMG International's gap analysis shows that about 80 to 85% of the content is similar. This has enabled global systemically important financial institutions (G-SIFIs) to establish a comprehensive framework for responding to the regulations of each country.

In Europe and the United States, guidelines and regulations have been developed to increase resilience and enable business continuity. In March 2021, BCBS issued the final version of “Principles for operational resilience”, which sets out 7 principles covering areas such as business continuity, outsourcing to third parties, and technology, all of which are important factors to consider when strengthening operational resilience. It is also important to ensure that existing risk management frameworks, business continuity plans and third-party dependency management are implemented consistently within the organization.
In the United Kingdom, after releasing a joint Discussion Paper in July 2018, the regulatory authorities issued a joint Consultation Paper, "Operational resilience: Impact tolerance for important business services", in December 2019. The final policy was published in March 2021 and went into effect in March 2022. The key points of the UK regulation are senior management’s leadership, identification of important business services, mapping of resources, setting of impact tolerances, scenario testing, establishment of communication plans, and fostering of a resilience culture.
In the European Union, the European Commission proposed the "Digital Operational Resilience Act (DORA)" in September 2020, and it was adopted by the European Parliament and the Council in November 2022. It is designed to ensure resilience to ICT risks and builds on the guidelines of the European Banking Authority (EBA). A key point of DORA is that European supervisors can designate critical third-party service providers and have them overseen by a designated lead supervisor, including through on-site inspections and ongoing monitoring. In July 2022, the UK regulatory authorities also published a discussion paper setting out a possible framework for direct oversight of critical third parties.

In the United States, regulators issued the "Sound Practices to Strengthen Operational Resilience" in October 2020. The Office of the Comptroller of the Currency (OCC) and the Federal Reserve Board (FRB) have made operational resilience a key component of their examination priorities since 2019.

Regulatory authorities are becoming active not only in Europe and the United States, but also in the Asia-Pacific region. The Hong Kong Monetary Authority (HKMA) published its Supervisory Policy Manual module on operational resilience in May 2022. It describes a step-by-step approach to increasing operational resilience and defines what needs to be done at each step. In Australia, the Australian Prudential Regulation Authority (APRA) issued a consultation paper on operational resilience in July 2022, which is scheduled to be finalized in early 2023 and to take effect on January 1, 2024. In Singapore, the Monetary Authority of Singapore (MAS) issued consultation papers on outsourcing, technology risk and business continuity management (BCM) to enhance operational resilience, and revised the related regulations and guidelines. The consultation paper on BCM incorporates key points of operational resilience, such as identifying critical business services and mapping interdependencies, including those with third parties.

2) Financial Services Agency of Japan (JFSA)
The JFSA’s proposals are very closely aligned with the policies and guidelines published by other regulators.

On December 16, 2022, the JFSA released a draft discussion paper entitled "Basic Approach to Ensuring Operational Resilience" (the “Discussion Paper”), which defines operational resilience as "the ability of a financial institution to continue to deliver critical operations at a minimum level of resilience in the event of system failures, terrorism or cyberattacks, infectious diseases, natural disasters, and other events". This definition is similar to the principles of the BCBS and the regulations and guidelines of other countries. The Discussion Paper states that firms should set a minimum level to be maintained ("Tolerance for disruption"), taking into account the impact of operational disruptions on the financial system and customers. Firms should also map the interconnections of critical operations, secure the necessary resources, and periodically verify the appropriateness of the framework through drills and testing to ensure that the expected impacts of disruptions are within the tolerance levels set. While the Discussion Paper is primarily directed at banks, it is also intended to be used by firms that provide critical services or operations within the financial system (including critical third parties). The objective is to ensure operational resilience from the perspective of customers and financial stability by building on existing risk management frameworks to develop a comprehensive framework encompassing end-to-end business processes, including third parties. The Discussion Paper is intended to be used as a basis for dialogue between the JFSA and financial institutions to develop a more practical operational resilience framework.

[Figure 2: Summary and Purpose of the Discussion Paper]

Definition:
• The Discussion Paper defines operational resilience as the ability of a financial institution to continue to deliver critical operations at a minimum level (i.e., the Tolerance for disruption) in the event of a system failure, terrorism or cyberattack, an infectious disease, a natural disaster, or a similar event.
Purpose:
• To present a basic framework for ensuring operational resilience based on an overview of international trends, and to set out the issues to be considered.
• To use this framework as material for dialogue between the JFSA and financial institutions in order to build better practices in ensuring operational resilience.
Scope:
• The Discussion Paper is primarily intended for banks. However, ensuring operational resilience is also an important issue for firms that undertake critical operations in the financial system (including critical third parties).
Approach:
• Operational resilience is ensured by utilizing existing risk management frameworks and developing a comprehensive framework for end-to-end business processes, including third parties.
• The following are indispensable to ensuring operational resilience: a customer perspective; a cross-organizational approach that transcends business, systems, and departments; cooperation with external stakeholders including third parties; acceptance of diversity in values and expertise; and mutual understanding through open dialogue.
• “Identifying critical operations” and “setting Tolerance for disruption” are described in relatively specific terms. However, “mapping of interconnectedness and securing of necessary management resources” and “verification of appropriateness and additional measures” need to be clarified through the search for and discussion of best practices.
Application:
• The JFSA will use the Discussion Paper to exchange views with financial institutions and promote dialogue.
• The JFSA will not formally apply individual items or use them as checklists in the inspection and supervision of financial institutions.

Source: Prepared by KPMG Consulting Co., Ltd. based on the JFSA's "Basic Approach to Ensuring Operational Resilience" (draft)

We identify within the Discussion Paper the following four key points in ensuring operational resilience: (1) identifying critical operations, (2) setting Tolerance for disruption, (3) mapping interconnections and securing necessary resources, and (4) verifying the appropriateness and taking additional measures.

It is important to first identify critical operations across the organization, and then through the lens of these operations or services set tolerance for disruption, secure resources and verify the appropriateness of the framework. This shift to viewing risk through the lens of end-to-end services is what distinguishes operational resilience from existing risk management frameworks.

It is important to set tolerance for disruption from the perspective of customers and financial stability. In order to do so, it is necessary to estimate the impact on customers and the markets more precisely, considering things like the scope of service disruption, the number and volume of impacted transactions, and the number of customers impacted. Detailed planning of prompt external communications including guidance to customers of alternative transaction methods, etc. is another important element to consider.

Mapping of interconnections and interdependencies requires continued exploration of best practices with respect to the scope and granularity of mapping and effective third-party risk management. With regard to third-party management, the Discussion Paper accounts for the increasing complexity of interdependencies between financial institutions and third parties, and is in line with the guidelines of other countries’ supervisors in promoting monitoring of third-party concentration risk and dialogue with critical third parties. Mapping critical operations against internal and external resources helps to clarify the resources required to deliver critical operations, to identify alternative measures and solutions that can serve as temporary substitutes in the event of a service disruption, and to identify resources that are prone to cause disruptions of critical operations. Resources may include people (human resources, third parties), goods (tangible assets, technology, data), and money (investment), and identifying and fortifying against the vulnerabilities of each resource will lead to enhanced operational resilience.
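
As a rough illustration of the mapping described above, the following Python sketch records which resources each critical operation depends on, then lists resources shared by several operations (a simple proxy for concentration risk) and resources for which no temporary substitute has been identified. All operation names, resource names, and the threshold of two operations are hypothetical and chosen only for illustration; neither the Discussion Paper nor any regulator prescribes this structure.

```python
from collections import defaultdict

# Hypothetical mapping of critical operations to the resources they rely on.
# Every name below is illustrative only.
DEPENDENCIES = {
    "domestic_payments": ["core_banking_system", "cloud_provider_a", "payments_ops_team"],
    "atm_withdrawals": ["core_banking_system", "atm_network_vendor"],
    "online_banking": ["cloud_provider_a", "identity_api_vendor"],
}

# Hypothetical register of resources with an identified temporary substitute.
SUBSTITUTES = {
    "atm_network_vendor": "branch_counter_service",
    "payments_ops_team": "backup_site_team",
}


def resources_to_operations(dependencies):
    """Invert the map: for each resource, list the operations relying on it."""
    inverted = defaultdict(list)
    for operation, resources in dependencies.items():
        for resource in resources:
            inverted[resource].append(operation)
    return {resource: sorted(ops) for resource, ops in inverted.items()}


def concentration_points(dependencies, threshold=2):
    """Resources that at least `threshold` critical operations depend on."""
    inverted = resources_to_operations(dependencies)
    return {r: ops for r, ops in inverted.items() if len(ops) >= threshold}


def resources_without_substitute(dependencies, substitutes):
    """Resources for which no temporary substitute has been recorded."""
    all_resources = {r for resources in dependencies.values() for r in resources}
    return sorted(all_resources - set(substitutes))


if __name__ == "__main__":
    print(concentration_points(DEPENDENCIES))
    print(resources_without_substitute(DEPENDENCIES, SUBSTITUTES))
```

In practice the mapping would be far more granular and would extend to fourth parties, but even a simple inverted view like this makes shared dependencies visible at a glance.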

The Discussion Paper calls for periodic review on an enterprise-wide basis to verify the appropriateness of the operational resilience framework and take additional measures for remediation and improvement as necessary, utilizing existing drills and testing under severe but plausible scenarios. In doing so, it is essential to take an outcome-oriented approach; take an enterprise-wide approach that transcends business, systems, and organizational boundaries; collaborate with external stakeholders including third parties; embrace diversity in values and expertise; and promote mutual understanding through open dialogue.
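
The verification step can be sketched as a comparison between the expected impact of each severe but plausible scenario and the tolerance set for a critical operation. The Python example below is a minimal sketch under stated assumptions: the two tolerance dimensions (outage duration and number of customers affected), the thresholds, and the scenario figures are all hypothetical, and a real exercise would draw its estimates from drills and testing rather than fixed numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """Hypothetical Tolerance for disruption for one critical operation."""
    max_outage_hours: float
    max_customers_affected: int


@dataclass(frozen=True)
class Scenario:
    """Expected impact of one severe but plausible scenario (assumed figures)."""
    name: str
    expected_outage_hours: float
    expected_customers_affected: int


def breaches(tolerance, scenario):
    """Return the tolerance dimensions that the scenario is expected to exceed."""
    found = []
    if scenario.expected_outage_hours > tolerance.max_outage_hours:
        found.append("outage_hours")
    if scenario.expected_customers_affected > tolerance.max_customers_affected:
        found.append("customers_affected")
    return found


def run_tests(tolerance, scenarios):
    """Map each scenario name to its list of expected breaches (empty if none)."""
    return {s.name: breaches(tolerance, s) for s in scenarios}


# Illustrative values only; not taken from any regulation or firm.
PAYMENTS_TOLERANCE = Tolerance(max_outage_hours=4.0, max_customers_affected=100_000)
SCENARIOS = [
    Scenario("data_centre_outage", 3.0, 80_000),
    Scenario("cloud_provider_failure", 6.0, 250_000),
    Scenario("ransomware_on_vendor", 5.0, 40_000),
]

if __name__ == "__main__":
    for name, result in run_tests(PAYMENTS_TOLERANCE, SCENARIOS).items():
        print(name, "within tolerance" if not result else f"breaches: {result}")
```

Scenarios that return a non-empty list are the ones for which additional measures for remediation and improvement would need to be considered.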

[Figure 3: Key Points in Ensuring Operational Resilience]

Source: Prepared by KPMG Consulting Co., Ltd. based on the JFSA's "Basic Approach to Ensuring Operational Resilience" (draft)

The content of the Discussion Paper is generally similar to regulations in the UK and other countries, but there are some differences. For example, the Discussion Paper assumes the use of existing frameworks such as BCPs and Recovery and Resolution Plans (“RRPs”). It assumes that operations and critical functions specified in existing BCPs and RRPs will be used as critical operations in the context of operational resilience, and that existing indicators in BCPs such as Recovery Time Objective in the event of disruption will be used for setting Tolerance for disruption.

However, from our point of view, further clarification is required ahead of the final guideline on two points. The first is how critical operations are defined and how they differ from the important business processes or critical functions used for the purposes of BCPs and RRPs. The second is how to define tolerances for disruption and how they might differ from Recovery Time Objectives (RTOs) and other existing BCP metrics, which are usually very internally focused (i.e., the firm’s tolerance for not having an application up and running). What RTOs do not typically articulate is how long the market and the customers who rely on the service can tolerate not having access to it.

The Discussion Paper also allows for verification of appropriateness through existing drills and testing programs, such as BCP drills, and does not require scenario testing as other regulators outside Japan do. Furthermore, the reference to human resource systems to secure the necessary human resources and the fostering of a risk management culture as issues to be addressed to ensure operational resilience is unique to Japan.

The Discussion Paper does not give a regulatory timeline, but firms need to get ahead of it by thinking now about their methodology and framework. International experience suggests that firms typically struggle to embed operational resilience into business as usual (BAU). For example, in addition to a lengthy consultation process, firms in the UK have been given four years to embed operational resilience into BAU, because the required cultural change, the shift in mindset and behaviors, and the new roles and responsibilities are difficult to implement.

Similarities between the Discussion Paper and guidance from other regulators, on the other hand, include the expectation of strong management commitment, calling for operational resilience to be promoted under the clear ownership and accountability of senior management. KPMG's Ashley Harris, Global Lead Architect for Operational Resilience, and Nick Strange, who previously led the development of operational resilience policy at the Bank of England, comment on the expectation of a commitment to operational resilience by top management:

“The commitment of senior management has been a major factor in operational resilience landing well and reaching a degree of maturity in the UK. The Board and CEO see operational resilience as a necessary business strategy not just regulatory compliance. Without operational resilience, the financial services they provide cannot be shown to be fully resilient and they will lose their competitive edge with customers and markets. In addition, there is a potential reputational risk if business disruptions cause customers to experience an impact beyond their tolerance level. Operational resilience is not about avoiding risk, but about how quickly business can be recovered on the assumption that the risk of business disruption will materialize. The first line of defense, and ultimately the management, has ownership.”

In line with the above comments, for operational resilience to become a key agenda item in Japan, we suggest that management must first view operational resilience as a business strategy and foster the mindset that ensuring operational resilience is essential for business. In addition, with the globalization of financial services and the increasing interconnection and interdependence between financial institutions and global Big Tech companies, it is important for business management and for maintaining international competitiveness that the management of Japanese financial institutions recognize operational resilience, including that of critical third parties such as Big Tech companies, as a key agenda item.

3. Financial Institutions’ Approach to Operational Resilience

Many leading financial institutions outside Japan have taken two to three years to prepare and implement their own operational resilience frameworks, indicating that substantial time is likely to be required to achieve an effective framework. Japanese financial institutions should move forward with planning and preparation of their operational resilience frameworks while continuing to follow regulatory developments and maintain dialogue with relevant stakeholders.
Leading financial institutions have taken the following general approach:

1) Start by performing a gap analysis between current processes/organization and regulatory (draft) requirements.

2) Begin to draft key principles and design their own operational resilience framework leveraging existing risk management activities where feasible.

3) Run a pilot for select important business services or critical operations.

4) Create a roadmap for implementation of the operational resilience framework.

5) Begin build-out of the operational resilience framework (expand the scope to cover all relevant important business services or critical operations, develop policies, procedures, and tools for continuous operation, etc.).

Of course, the detailed workload and launch plan should be considered based on each firm’s own unique circumstances, such as size and complexity of the business, systems, and existing risk management processes.
It is important that the operational resilience framework is enterprise-wide in scope, covering not only risk management functions but also inclusive of business, systems, and other relevant functions. Please be sure to read our next article as well, in which we plan to provide specific case studies.

It should also be noted that operational resilience not only helps financial institutions weather disruptions in the provision of financial services, but also contributes to the optimization of the various existing risk management activities. Leading firms have begun to reallocate resources as a result of enterprise-wide risk validation activities undertaken as part of their operational resilience framework, and they expect to realize some cost savings related to enterprise-wide risk management processes.

We recommend that firms secure the buy-in of top management and run their operational resilience framework on an enterprise-wide basis, viewing operational resilience not as just another regulatory compliance exercise, but as a solution towards sustainable management of financial institutions.