What do Silicon Valley Bank, Southwest Airlines, and Norfolk Southern have in common? These organizations, and many others like them, suffered business disasters last year. A run on Silicon Valley Bank led to its closure by regulators. Southwest Airlines' crew scheduling software collapsed, causing more than 16,000 flights to be canceled during the winter holiday season. The Norfolk Southern derailment in East Palestine, Ohio, forced the evacuation of local residents and the controlled burn of toxic chemicals. All three catastrophes likely occurred for the same underlying reasons, and all three could have been prevented.
Such business and industrial disasters are not as rare as one might think. Bank failures in the United States average about 25 per year, major airline disruptions are familiar to any frequent traveler, and railroad derailments occur at a rate of roughly three per day. The increased variability, uncertainty, and complexity of external factors, combined with their instantaneous global spread, make catastrophic events more likely and make preventing them more strategically important than ever for a wide range of organizations.
What is vulnerability drift?
Typically, the subsystems of an organization or industry must work in harmony for the entire system to function effectively. Over time, however, these systems can become vulnerable, and the next unexpected shock can lead to catastrophe. If these systems were originally designed to be resilient to disasters, how do they become vulnerable?
Akhil Bhardwaj of Tilburg University, Joseph Mahoney of the University of Illinois at Urbana-Champaign, and I have developed a new way of thinking to explain the common underlying failure of these systems. We call this explanation "vulnerability drift." In essence, making changes (or "adaptations") to one subsystem without checking whether they affect other subsystems can put the entire system at risk. Much as in the tower-building game Jenga, where moving one block at a time gradually undermines the tower's base, each local adjustment increases instability until disaster strikes.
When vulnerability drift occurs, the system does not appear to be "broken," so it is usually not "fixed." But vulnerability drift means that shocks the system once handled easily are now more likely to cause a catastrophe that requires a recovery plan. For example, failing to replace a chief risk officer who dynamically rebalances risk can leave a bank vulnerable to sudden withdrawals when interest rates change rapidly. As an airline grows and expands geographically, once-reliable software can no longer handle inclement weather. And if rail cars are not sequenced by weight, braking in an unexpected location can cause heavier loaded cars to push into lighter or empty cars and derail them.
Why does vulnerability drift occur?
Operators, executives, and regulators may simply be unaware of it. Without system knowledge and detailed system analysis, a local adaptation in one subsystem can (incorrectly) appear to be a good idea, especially if failure does not occur immediately. Every adjustment has consequences, and not all consequences are immediately apparent.
Additionally, the adaptations that erode resilience are not always unintentional. Operators, leaders, and regulators may pursue "subgoals": to earn bonuses, they may sacrifice long-term system resilience, which is difficult to measure, in favor of short-term goals that are easier to measure, such as improving product features or reducing costs. Or they may succumb to internal and external political pressures and adapt to what others want.
How can we prevent vulnerability drift?
1. Establish awareness. Understanding why vulnerability drift occurs is the first step to preventing it. The goal is to make operators, leaders, and regulators aware of the potential for vulnerability drift, to avoid adaptations that increase vulnerability, and to find ways to discourage the pursuit of the "wrong" subgoals.
To increase awareness, organizations need to conduct regular risk assessments, integrate systems thinking into decision-making, and establish clear communication among cross-functional teams. These steps ensure that subsystem interconnections are considered and promote a culture of proactively identifying and addressing potential areas of vulnerability.
2. Visualize through heatmaps. We created a simple graphical method to help laypeople see which pairs of subsystems are most likely to experience vulnerability drift. For example, we created a heatmap for freight rail transportation. With the help of industry experts, we identified 11 critical freight rail subsystems and built a matrix of all subsystem pairs. Widely distributing and explaining such heatmaps helps everyone become aware of vulnerability drift and know when to perform a stability analysis before implementing an adaptation.
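The pairwise matrix behind such a heatmap can be sketched in a few lines of code. This is a minimal illustration only: the subsystem names and risk scores below are hypothetical placeholders, not the authors' actual freight-rail data or scoring method.

```python
import itertools

# Hypothetical subsystems of a freight rail operation (illustrative only).
subsystems = ["Track", "Braking", "Car sequencing", "Scheduling", "Signaling"]

# Hypothetical drift-risk scores for subsystem pairs (0 = low, 1 = high).
risk = {
    ("Braking", "Car sequencing"): 0.9,
    ("Scheduling", "Signaling"): 0.6,
    ("Track", "Braking"): 0.4,
}

def drift_risk(a, b):
    """Look up the drift risk for a subsystem pair, in either order."""
    return risk.get((a, b), risk.get((b, a), 0.1))  # default: low risk

# Rank all subsystem pairs by drift risk, highest first, so reviewers know
# where a stability analysis is needed before adapting either subsystem.
pairs = sorted(itertools.combinations(subsystems, 2),
               key=lambda p: drift_risk(*p), reverse=True)
for a, b in pairs:
    print(f"{a} <-> {b}: {drift_risk(a, b):.1f}")
```

In practice the scores would come from expert elicitation, and the ranked pairs (or a colored matrix built from the same data) would be circulated so that anyone proposing an adaptation can see which neighboring subsystems it might destabilize.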
3. Ensure system analysis. Building on the heatmaps, adopt policies that require system analysis before any adaptation is made, and create an audit function empowered to reject adaptations that lack the required analysis. Without a direct reporting relationship to the board, that audit function may lack the credibility to keep political influence at bay.
When considering adopting a new perspective like vulnerability drift, we should ask three questions: Is the idea novel and valuable? What are the benefits of adopting it? What are the costs of ignoring it? The benefit of adopting this perspective is the ability to prevent disasters. The cost of ignoring vulnerability drift is perhaps best grasped by imagining yourself in the shoes of the CEOs of Silicon Valley Bank, Southwest Airlines, and Norfolk Southern after their catastrophes, contemplating all the pain caused by the loss of people, profits, and property.