Saturday, October 8, 2022

It takes hard work to create chaos!

Last time I started a short series of posts on Barry Turner's book Man-Made Disasters, which is a classic in the safety science literature.  Turner wrote that "small-scale failures can be produced very rapidly, but large-scale failures can only be produced if time and resources are devoted to them."  In other words, it takes hard work to create chaos!

While the term disaster is often used in the context of natural disasters such as earthquakes, tornadoes, fires, floods, and hurricanes, Turner uses a more specific definition that applies only to these large-scale organizational failures.  Here, a disaster is defined as "an event, concentrated in time and space, which threatens a society or a relatively self-sufficient subdivision of a society with major unwanted consequences as a result of the collapse of precautions that had hitherto been culturally accepted as adequate."  

In this context, the disasters that Turner is describing are completely man-made in origin.  They are defined less in technological terms and more in sociological ones.  Importantly, all organizations operate within a certain set of cultural beliefs, norms, and assumptions that are either formally and explicitly codified within a set of rules and regulations or tacitly taken for granted as "the way we do things around here" ("safety culture").  Man-made disasters, then, are differentiated from accidents (which refer more to random, chance events) and natural disasters (so-called "Acts of God" such as hurricanes and earthquakes) by "the recognition (often accompanied by considerable surprise) that there has been some critical divergence between those assumptions and the 'true' state of affairs."

While Turner eventually reviewed the accident inquiry reports from 84 different events in the United Kingdom, he focused on three major events:

1. Aberfan Coal Tip Slide of 1966:  Incidentally, this event was featured in season 3, episode 3 of the television series "The Crown".  A colliery spoil tip (basically, a pile of the waste material - typically shale, sandstone, and anything that isn't coal - removed during the coal mining process) became unstable over time and slid downhill, eventually covering an elementary school in the town of Aberfan, Wales, killing 116 children and 28 adults.

2.  Hixon Rail Crash of 1968: A semi-trailer carrying a 120-ton electrical transformer attempted to cross a recently installed automatic railroad crossing at Hixon, Staffordshire, England and was struck by a British Rail express train, killing 11 people and injuring 45 others.

3.  Summerland Fire of 1973: A fire spread rapidly through the Summerland leisure center on the Isle of Man, killing 50 people and seriously injuring another 80 people.  The interior and exterior portions of the building had been designed by two different architects who didn't coordinate with each other, creating significant fire risks.  The fire was accidentally started by three boys who were smoking in a vacant part of the building.  

While all three events were quite different, Turner identified a number of similarities that he used to develop his "man-made disaster model":

1. Rigidities in perception and beliefs in organizational settings: In all three cases, the accurate perception of the possibility of a disaster was hindered by both cultural and institutional factors.  I referred last time to the concept of "bounded rationality", which describes how individuals, constrained by limited time, information, and cognitive capacity, settle for decisions that are merely "good enough" rather than optimal (see also "satisficing").  Unfortunately, some important piece of information may be left outside that bounded framework - in other words, leaders in all three cases failed to comprehend that such accidents were even within the realm of possibility.  In the Aberfan case in particular, despite multiple past reports of similar colliery spoil tip slides, no one at the Aberfan site was thinking about the possibility of one.

2. The decoy problem: Here, when some hazard or problem was perceived, the action taken to address it drew attention away from the true problem, which ultimately led to the event.  At Hixon, the operators and owners of the trucking company were more worried about the hazard of the transformer arcing onto the overhead electrical wires at the crossing.  They were so focused on trying to prevent this from occurring that they never considered that a long, slow-moving truck would be at risk of being hit by a train.

3. Organizational exclusivity - the disregard of non-members: In two of the cases Turner reviewed (Aberfan and Hixon), individuals outside of the principal organizations concerned had anticipated the possibility of the accidents occurring.  However, the organizations automatically assumed that they understood the risks far better than any outsider could.

4. Information difficulties: Remember, so-called "wicked problems" lack an easy and obvious fix.  When dealing with complex situations (and wicked problems), Turner suggests that four types of information difficulties are important (see further below).  

5. Involvement of strangers: So-called strangers are those individuals who are present but are not otherwise part of the organization.  As Turner writes, "The problems created in situations where safe operation relies to some extent upon the safe behavior of strangers are intensified by the fact that the strangers are always located at the moment of danger at a site where they have a number of opportunities to manipulate the situation in ways not foreseen by those designing the abstract safety system."  As an example, during the Summerland Fire, no one who designed the evacuation procedures (or even those directing the rescue operation) anticipated that parents would go back into danger to save their children.  These parents fought to reach their children against the flow of the crowd trying to evacuate, creating congested passageways and ultimately preventing people from getting out successfully.

6. Failure to comply with existing regulations: Both at Hixon and Summerland, individuals failed to comply with safety regulations.  

7. Minimizing emergent danger: Most, if not all, complex systems are non-linear.  The property of emergence suggests that multiple small, seemingly insignificant events can interact in such a way as to produce a much bigger and more dramatic event.

8. Nature of recommendations after the disaster - the definition of well-structured problems: I mentioned this in my last post when I was talking about "After Action Reviews".  Recommendations made after the fact almost always overlook the fact that individuals operating during the crisis did so without perfect information.  Information that is relevant to the disaster may be obvious after the disaster has occurred, but that same information may have been completely hidden at the time of the disaster.  Remember, "not every failure which is obvious now, would be obvious before the disaster."

One of Turner's most important concepts is that of the "incubation period" (see my last post, "Failure of Foresight", for an in-depth description).  Briefly, the "incubation period" is the time during which latent failures, misconceptions, and faulty assumptions about the organization's safety culture and/or rules and regulations hide beneath the surface until an inciting or triggering event sets off a chain of events leading to catastrophe.  The build-up of latent failures and faulty assumptions occurs because of the discrepancy between the perceived state of the organization ("We are not in a dangerous situation") and its true state ("We actually are in a dangerous situation").  Turner cites four key reasons to explain how these latent failures lead to active failures:

1. Events go unnoticed or are underappreciated in terms of their significance (recall in particular the decoy problem and the involvement of strangers discussed above).

2. Events go unnoticed or are misunderstood because of the difficulty in handling information in complex situations: Turner calls this the "variable disjunction of information", which he defines as a complex situation in which a number of parties handling a problem are unable to obtain precisely the same information about the problem, so that many differing interpretations of the problem - and, as a result, many potential solutions - exist.  At any one point in time, the information necessary to fully understand the problem is dispersed across many locations and many different individuals.  Variable disjunction of information is compounded further by the poor communication that often exists between the silos of these large and complex organizations.

3. Effective violations of precautions passing unnoticed because of cultural lag in existing precautions: Violations frequently occur when regulations are ambiguous or when they conflict with other goals of the organization.  The best examples of this phenomenon are Scott Snook's concept of "practical drift", which he defines as "the slow and steady uncoupling of practice from written procedure", and Diane Vaughan's "normalization of deviance".

4. Events go unnoticed or are misunderstood because of a reluctance to fear the worst possible outcome: When things do start to go poorly, individuals tend to minimize the danger or deny that the danger affects them.  There's no question that we, as humans, tend to assume that we are invulnerable or infallible.  We may respond to a dangerous situation with the phrase, "We got this!"  That same sense of invulnerability often prevents us from asking for help.  Just as important is the fear of "sounding the alarm" prematurely - we don't want to look foolish or incompetent.  Lastly, we frequently assume that someone else has already made the call for help (a form of the "Bystander Effect").

I want to leave this discussion with a final quote.  The French writer Victor Hugo said, "Great blunders are often made, like large ropes, of a multitude of fibers."  The most important take-home message from Barry Turner's "Man-Made Disasters Model" is the concept of the incubation period.  These events do not occur all of a sudden; rather, a set of long-standing issues creates the conditions favorable to a catastrophic event.  That incubation period is when we as leaders need to intervene.  Next time, we will discuss how to do that.
