This NPR interview with Danielle Ofri, author of a new book on medical errors (and their prevention), had some interesting insight into how human factors play out during a pandemic.
Her new book is “When We Do Harm,” and I was most interested in these excerpts from the interview:
“…we got many donated ventilators. Many hospitals got that, and we needed them. … But it’s like having 10 different remote controls for 10 different TVs. It takes some time to figure that out. And we definitely saw things go wrong as people struggled to figure out how this remote control works from that one.”
“We had many patients being transferred from overloaded hospitals. And when patients come in a batch of 10 or 20, 30, 40, it is really a setup for things going wrong. So you have to be extremely careful in keeping the patients distinguished. We have to have a system set up to accept the transfers … [and] take the time to carefully sort patients out, especially if every patient comes with the same diagnosis, it is easy to mix patients up.”
And my favorite, even though it isn’t necessarily COVID-19 related:
“For example, … [with] a patient with diabetes … it won’t let me just put “diabetes.” It has to pick out one of the 50 possible variations of on- or off- insulin — with kidney problems, with neurologic problems and to what degree, in what stage — which are important, but I know that it’s there for billing. And each time I’m about to write about it, these 25 different things pop up and I have to address them right now. But of course, I’m not thinking about the billing diagnosis. I want to think about the diabetes. But this gets in the way of my train of thought. And it distracts me. And so I lose what I’m doing if I have to attend to these many things. And that’s really kind of the theme of medical records in the electronic form is that they’re made to be simple for billing and they’re not as logical, or they don’t think in the same logical way that clinicians do.”
The passengers on the Lion Air 610 flight were on board one of Boeing’s newest, most advanced planes. The pilot and co-pilot of the 737 MAX 8 were more than experienced, with around 11,000 flying hours between them. The weather conditions were not an issue and the flight was routine. So what caused that plane to crash into the Java Sea just 13 minutes after takeoff?
I’ve been waiting for updated information on the Lion Air crash before posting details. When I first read about the accident it struck me as a collection of human factors safety violations in design. I’ve pulled together some of the news reports on the crash, organized by the types of problems experienced on the airplane.
1. “A cacophony of warnings”
Fortune Magazine reported on the number of warnings and alarms that began to sound as soon as the plane took flight. The same alarms had occurred on its previous flight, and there is some blaming of the victims here when people ask, “If a previous crew was able to handle it, why not this one?”
The alerts included a so-called stick shaker — a loud device that makes a thumping noise and vibrates the control column to warn pilots they’re in danger of losing lift on the wings — and instruments that registered different readings for the captain and copilot, according to data presented to a panel of lawmakers in Jakarta Thursday.
2. New automation features, no training
The plane included new “anti-stall” technology that airlines say was neither well explained nor included in Boeing’s training materials.
In the past week, Boeing has stepped up its response by pushing back on suggestions that the company could have better alerted its customers to the jet’s new anti-stall feature. The three largest U.S. pilot unions and Lion Air’s operations director, Zwingly Silalahi, have expressed concern over what they said was a lack of information.
As was previously revealed by investigators, the plane’s angle-of-attack sensor on the captain’s side was providing dramatically different readings than the same device feeding the copilot’s instruments.
Angle of attack registers whether the plane’s nose is pointed above or below the oncoming air flow. A reading showing the nose is too high could signal a dangerous stall and the captain’s sensor was indicating more than 20 degrees higher than its counterpart. The stick shaker was activated on the captain’s side of the plane, but not the copilot’s, according to the data.
And more from CNN:
“Generally speaking, when there is a new delivery of aircraft — even though they are the same family — airline operators are required to send their pilots for training,” Bijan Vasigh, professor of economics and finance at Embry-Riddle Aeronautical University, told CNN.
Those training sessions generally take only a few days, but they give the pilots time to familiarize themselves with any new features or changes to the system, Vasigh said.
One of the MAX 8’s new features is an anti-stalling device, the maneuvering characteristics augmentation system (MCAS). If the MCAS detects that the plane is flying too slowly or steeply, and at risk of stalling, it can automatically lower the airplane’s nose.
It’s meant to be a safety mechanism. But the problem, according to Lion Air and a growing chorus of international pilots, was that no one knew about that system. Zwingli Silalahi, Lion Air’s operational director, said that Boeing did not suggest additional training for pilots operating the 737 MAX 8. “We didn’t receive any information from Boeing or from regulator about that additional training for our pilots,” Zwingli told CNN Wednesday.
“We don’t have that in the manual of the Boeing 737 MAX 8. That’s why we don’t have the special training for that specific situation,” he said.
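To make the design problem concrete, here is a toy sketch (my own, in Python) of the difference between automation that trusts a single sensor and automation that cross-checks redundant ones. The thresholds and function names are invented for illustration; this is not Boeing’s actual MCAS logic.

```python
# Hypothetical illustration of single-sensor vs. cross-checked automation.
# NOT Boeing's actual MCAS logic; the thresholds are invented for the example.

STALL_THRESHOLD_DEG = 15.0   # assumed angle of attack that signals a stall
DISAGREE_LIMIT_DEG = 5.0     # assumed tolerable disagreement between sensors

def single_sensor_command(captain_aoa: float) -> str:
    """Automation that trusts one angle-of-attack sensor."""
    if captain_aoa > STALL_THRESHOLD_DEG:
        return "TRIM NOSE DOWN"  # acts even if that one sensor is faulty
    return "NO ACTION"

def cross_checked_command(captain_aoa: float, copilot_aoa: float) -> str:
    """Automation that cross-checks redundant sensors and defers to the crew."""
    if abs(captain_aoa - copilot_aoa) > DISAGREE_LIMIT_DEG:
        return "AOA DISAGREE: ALERT CREW, DISABLE AUTO-TRIM"
    if min(captain_aoa, copilot_aoa) > STALL_THRESHOLD_DEG:
        return "TRIM NOSE DOWN"
    return "NO ACTION"

# On Lion Air 610, the captain's sensor read more than 20 degrees higher
# than the copilot's. The single-sensor design keeps commanding nose down:
print(single_sensor_command(captain_aoa=25.0))                   # TRIM NOSE DOWN
print(cross_checked_command(captain_aoa=25.0, copilot_aoa=4.0))  # alert the crew
```

The point is not the aviation engineering but the human factors principle: automation that can act on one faulty input, without telling the crew what it is doing or why, sets operators up to fail.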
Right after the Hawaii false nuclear alarm, I posted about how the user interface seemed to contribute to the error. At the time, sources were reporting it as a “dropdown” menu. Well, that wasn’t exactly true, but in the last few weeks it’s become clear that truth is stranger than fiction. Here is a rundown of the news on the story (spoiler: every step is a human factors issue):
Hawaii nuclear attack alarms are sounded, also sending alerts to cell phones across the state
Alarm is noted as false and the state struggles to get that message out to the panicked public
The actual interface is found and shown – rather than a drop-down menu, it’s closely clustered links on an interface that looks like a 1990s-era website, reading “DRILL-PACOM(CDW)-STATE ONLY” and “PACOM(CDW)-STATE ONLY”
Latest news: the employee who sounded the alarm says it wasn’t an error – he heard this was “not a drill” and acted accordingly to trigger the real alarm
The now-fired employee has spoken up, saying he was sure of his actions and “did what I was trained to do.” When asked what he’d do differently, he said “nothing,” because everything he saw and heard at the time made him think this was not a drill. His firing is clearly an attempt by Hawaii to get rid of a ‘bad apple.’ Problem solved?
It seems like a good time for my favorite reminder from Sidney Dekker’s book, “The Field Guide to Human Error Investigations” (abridged):
To protect safe systems from the vagaries of human behavior, recommendations typically propose to:
• Tighten procedures and close regulatory gaps. This reduces the bandwidth in which people operate. It leaves less room for error.
• Introduce more technology to monitor or replace human work. If machines do the work, then humans can no longer make errors doing it. And if machines monitor human work, they can snuff out any erratic human behavior.
• Make sure that defective practitioners (the bad apples) do not contribute to system breakdown again. Put them on “administrative leave”; demote them to a lower status; educate or pressure them to behave better next time; instill some fear in them and their peers by taking them to court or reprimanding them.
In this view of human error, investigations can safely conclude with the label “human error”—by whatever name (for example: ignoring a warning light, violating a procedure). Such a conclusion and its implications supposedly get to the causes of system failure.
AN ILLUSION OF PROGRESS ON SAFETY
The shortcomings of the bad apple theory are severe and deep. Progress on safety based on this view is often a short-lived illusion. For example, focusing on individual failures does not take away the underlying problem. Removing “defective” practitioners (throwing out the bad apples) fails to remove the potential for the errors they made.
…[T]rying to change your people by setting examples, or changing the make-up of your operational workforce by removing bad apples, has little long-term effect if the basic conditions that people work under are left unamended.
A ‘bad apple’ is often just a scapegoat that makes people feel better by giving blame a focus. Real improvement in safety comes from fixing the system, not from getting rid of employees who were forced to work within a problematic one.
The morning of January 13th, people in Hawaii received a false alarm that the islands were under nuclear attack. One of the messages, delivered to cell phones, said: “BALLISTIC MISSILE THREAT INBOUND TO HAWAII. SEEK IMMEDIATE SHELTER. THIS IS NOT A DRILL.” Today, the Washington Post reported that the alarm was due to an employee pushing the “wrong button” while trying to test the nuclear alarm system.
To sum up the issue: the alarm is triggered by choosing an option from a drop-down menu that contains both “Test missile alert” and “Missile alert.” The employee chose the wrong option, and once the alarm was triggered, the system had no way to reverse it.
A nuclear alarm system should be held to particularly high usability requirements, but this system didn’t even conform to Nielsen’s 10 heuristics. It violates:
User control and freedom: Users often choose system functions by mistake and will need a clearly marked “emergency exit” to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
Error prevention: Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
And those are just the violations I could identify from reading the Washington Post article! Perhaps human factors analysis will become a regulatory requirement for these systems, as it already is for FDA-regulated medical devices.
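For illustration, here is a minimal sketch of what the error-prevention and undo heuristics could look like in an alert-sending system. Everything here (the class, the typed confirmation phrase, the 60-second cancellation window) is my own assumption, not a detail of Hawaii’s actual software.

```python
# A sketch of Nielsen-style error prevention and undo for an alert system.
# All names and the 60-second undo window are assumptions for illustration.

import threading
from typing import Optional

class AlertSystem:
    UNDO_WINDOW_S = 60  # assumed cancellation window

    def __init__(self) -> None:
        self._pending: Optional[threading.Timer] = None

    def send_live_alert(self, typed_confirmation: str) -> bool:
        # Error prevention: a live alert requires typing a distinct phrase,
        # so it cannot be fired by clicking the wrong closely spaced link.
        if typed_confirmation != "SEND LIVE ALERT":
            print("Confirmation mismatch; nothing sent.")
            return False
        # User control and freedom: delay the broadcast so it can be undone.
        self._pending = threading.Timer(self.UNDO_WINDOW_S, self._broadcast)
        self._pending.start()
        print(f"Alert queued; {self.UNDO_WINDOW_S}s to cancel.")
        return True

    def cancel(self) -> None:
        # The clearly marked "emergency exit" Nielsen asks for.
        if self._pending is not None:
            self._pending.cancel()
            self._pending = None
            print("Alert cancelled before broadcast.")

    def _broadcast(self) -> None:
        print("ALERT BROADCAST TO ALL PHONES.")

system = AlertSystem()
system.send_live_alert("SEND LIVE ALERT")  # queued, not yet broadcast
system.cancel()                            # reversible, unlike Hawaii's system
```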
I chose a provocative title for this post after reading the report on what caused the wreck of the USS John S. McCain in August of 2017. In summary: the USS John S. McCain was in high-traffic waters when the crew believed they had lost steering control. Despite attempts to slow or maneuver the ship, it was hit by another large vessel. The bodies of 10 sailors were eventually recovered, and five others were injured.
Today the Navy released its final report on the accident. After reading it, it seems to me the report blames the crew. Here are some quotes from the official Naval report:
Loss of situational awareness in response to mistakes in the operation of the JOHN S MCCAIN’s steering and propulsion system, while in the presence of a high density of maritime traffic
Failure to follow the International Nautical Rules of the Road, a system of rules to govern the maneuvering of vessels when risk of collision is present
Watchstanders operating the JOHN S MCCAIN’s steering and propulsion systems had insufficient proficiency and knowledge of the systems
And a rather devastating passage:
In the Navy, the responsibility of the Commanding Officer for his or her ship is absolute. Many of the decisions made that led to this incident were the result of poor judgment and decision making of the Commanding Officer. That said, no single person bears full responsibility for this incident. The crew was unprepared for the situation in which they found themselves through a lack of preparation, ineffective command and control and deficiencies in training and preparations for navigation.
Ars Technica called my attention to an important cause that the report never specifically calls out: the poor feedback design of the control system. I think it is a problem that the report focused on “failures” of the people involved rather than the design of the machines and systems they used. After my reading, I would summarize the cause of the accident as: “The ship could be controlled from many locations. This control was transferred using a computer interface. That interface did not give sufficient information about its current state or feedback about which station controlled which functions of the ship. This made the crew think they had lost steering control when that control had actually just been moved to another location.” I base this on information from the report, including:
Steering was never physically lost. Rather, it had been shifted to a different control station and watchstanders failed to recognize this configuration. Complicating this, the steering control transfer to the Lee Helm caused the rudder to go amidships (centerline). Since the Helmsman had been steering 1-4 degrees of right rudder to maintain course before the transfer, the amidships rudder deviated the ship’s course to the left.
Even this section calls out the “failure to recognize this configuration.” If the system is designed well, one shouldn’t have to expend any cognitive or physical resources to know from where the ship is being controlled.
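As a thought experiment, here is a small sketch (my own, in Python) of the feedback I mean: a transfer routine that announces every handoff at both stations and keeps a persistent who-controls-what banner. The class and station names are invented; the ship’s real system is certainly far more complex.

```python
# Invented sketch: control transfer with explicit, persistent feedback.
# Station and function names are illustrative, not the ship's real system.

class HelmControl:
    FUNCTIONS = ("steering", "propulsion")

    def __init__(self) -> None:
        # Both functions start at the Helm station.
        self.owner = {func: "Helm" for func in self.FUNCTIONS}

    def transfer(self, func: str, to_station: str) -> None:
        old = self.owner[func]
        self.owner[func] = to_station
        # Feedback at BOTH stations, so no watchstander can believe
        # control was "lost" when it was merely moved.
        print(f"[{old}] {func.upper()} TRANSFERRED AWAY -> {to_station}")
        print(f"[{to_station}] {func.upper()} CONTROL ACCEPTED")

    def status_banner(self) -> str:
        # A persistent display of who controls what, visible at a glance.
        return " | ".join(f"{f}: {s}" for f, s in self.owner.items())

helm = HelmControl()
helm.transfer("steering", "Lee Helm")
print(helm.status_banner())  # steering: Lee Helm | propulsion: Helm
```

Nothing here is hard engineering; the point is that the transfer itself, not just the resulting rudder movement, is an event the interface must communicate.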
Overall, I was surprised at the tone of this report regarding crew performance. Perhaps some of it is deserved, but without a hard look at the systems the crew use, I don’t have much faith we can avoid future accidents. Fitts and Jones launched the human factors field in 1947 when they insisted that the design of the cockpit created accident-prone situations. This went against the belief of the time, which was that “pilot error” was the main factor. Their work ushered in a new era, one where we try to improve the systems people must use as well as their training and decision making. The picture below is of the interface of the USS John S. McCain, commissioned in 1994. I would be very interested to see how it appears in action.
Anne’s research on attention and rock climbing was recently featured in an article in Outside Magazine:
To trad climb is to be faced with hundreds of such split-second micro decisions, the consequences of which can be fatal. That emphasis on human judgment and its fallibility intrigued Anne McLaughlin, a psychology professor at North Carolina State University. An attention and behavior researcher, she set out to model how and why rock climbers make decisions, and she’d recruited Weil and 31 other trad climbers to contribute data to the project.
The idea for the study first came about at the crag. In 2011, McLaughlin, Chris Wickens, a psychology professor at Colorado State University, and John Keller, an engineer at Alion Science and Technology, converged in Las Vegas for the Human Factors and Ergonomics Society conference, an annual event that brings together various professionals practicing user-focused product design. With Red Rocks just a few minutes away, the three avid climbers were eager to get some time on the rock before the day’s sessions, says Keller, even if it meant starting at 3 a.m.
I admit a fascination for reading about disasters. I suppose I’m hoping for the antidote. The little detail that will somehow protect me next time I get into a plane, train, or automobile. A gris-gris for the next time I tie into a climbing rope. Treating my bike helmet as a talisman for my commute. So far, so good.
Rapp tells the story of a chartered plane crash in Bedford, Massachusetts in 2014, a takeoff with so many skipped safety steps and errors that it seemed destined for a crash. There was plenty of time for the pilot to abort before the crash, leading Rapp to say, “It’s the most inexplicable thing I’ve yet seen a professional pilot do, and I’ve seen a lot of crazy things. If locked flight controls don’t prompt a takeoff abort, nothing will.” He sums up the reasons for these pilots’ “deviant” performance via Diane Vaughan’s factors of normalization (with some interpretation on my part):
If rules and checklists and regulations are difficult, tedious, unusable, or interfere with the goal of the job at hand, they will be misused or ignored.
We can’t treat top-down training or continuing education as the only source of information. People pass on shortcuts, tricks, and attitudes to each other.
Reward the behaviors you want. But we tend to punish safety behaviors when they delay secondary (but important) goals, such as keeping passengers happy.
We can’t ignore the social world of the pilots and crew. Speaking out against “probably” unsafe behaviors is at least as hard as calling out a boss or coworker who makes “probably” racist or sexist comments. The higher the ambiguity, the less likely people are to take action (“I’m sure he didn’t mean it that way” or “Well, we skipped that checklist, but it’s been fine the last ten times”).
The cure? An interdisciplinary solution from human factors psychologists, designers, engineers, and policy makers. That last group might be the most important, in that they must recognize that a focus on safety does not necessarily mean more rules and harsher punishments. It means checking that each piece of the system is efficient, valued, and usable, and that those pieces work together in an integrated way.
Thanks to Travis Bowles for the heads-up on this article.
Feature photo from the NTSB report, photo credit to the Massachusetts Police.
I wanted a new helmet that offered some side-impact protection to replace my trusty Petzl Ecrin Roc, especially after a helmet-less Slovenian climber mocked me in Italy for wearing “such a heavy helmet” at a sport climbing crag.
I now own the Petzl Meteor, but after one trip I discovered a strange design flaw.
Most helmets clip together the way car seats or backpack buckles do:
The Petzl Meteor helmet has a similar clip, but also contains magnets that draw the buckle together. Here is how it should work:
I was climbing at Lover’s Leap in California, a granite cliff. Those of you who know your geology might guess what happens when you combine magnets and iron-rich granite. I put the helmet on the ground while sorting gear, put it back on and heard the buckle snap together. A few minutes later, I looked down (which put some strain on the helmet strap), the buckle popped open, and the helmet fell off my head.
When I examined the buckle, there was grit stuck to the magnet.
Wiping it off seemed to work, except that it pushed some of the grit to the sides rather than removing it, and my fingers weren’t small enough to reach it there. So the next time I snapped the buckle shut and checked to make sure it was locked, I couldn’t get it open: the grit on the sides kept the buckle arms from pinching in enough to release. I was finally able to clear the sides by working part of a strap into the crevices.
I made some videos of the phenomenon. It was easy to reproduce: I just had to put my helmet on the ground for a moment and pick it up again. Attached grit was guaranteed – these are strong magnets!
The only issue I had with the buckle came after wearing the Sirocco while bolting and cleaning a granite sport route. Some of the swirling granite dust adhered to the magnets, obstructing the clips. It was easy enough to fix: I just wiped the magnets clean, and it has worked perfectly since.
What we found in our tests of both the Meteor and the Sirocco was that the magnet did not always have enough oomph to click both small arms of the buckle completely closed. About one in four times, only one of the plastic arms would fasten and the buckle would need an extra squeeze to click the other arm in. Another thing our testers noticed was that the magnet would pick up tiny pebbles which would prevent the buckle from fully closing. The pebbles can be easily cleaned by brushing off the exposed part of the magnet, but it adds an extra step to applying the helmet. The bottom line is, we prefer the simplicity of the old plastic buckle. We think that the magnet is a gimmick which potentially makes a less safe helmet.
Safety gear shouldn’t add steps to remember, such as double-checking that the buckle is locked even after getting auditory and tactile feedback when connecting it. Some people may never climb in an area with iron in the ground, but the use case of a granite environment should have been considered. You know, for little climbing areas such as the granite cliffs of Yosemite.
A friend of mine was recently rappelling from a climb, meaning that the rope ran through a device connected to the belay loop on her harness. As she rappelled, she yelled that her harness had broken, and the waistband of the harness slid nearly to her armpits. Fortunately, she remained calm and collected and was still able to rappel safely, if awkwardly, to the ground. There, her partner saw that the waistband with its belay loop had become disconnected from her leg loops. The leg loops were intact, though a keeper strap that helps the leg loops stay centered was no longer connected.
So, what happened?
First, for the non-climbers, a primer. A climbing harness is composed of three major parts, attached to each other in various ways depending on the manufacturer. The first part is the waistband, which is load-bearing, meaning that it is meant to take the weight of a climber.
The second part of the harness is the belay loop, a load-bearing stitched circle that connects the waistband and leg loops and is also used to hold a belay device, to hold the climber’s weight when rappelling, and for anchoring to the ground or a wall when needed.
The last part of the harness is the leg loops, which are also load-bearing in the parts that connect to the belay loop and around the legs themselves.
Figure 1 shows the general composition of climbing harnesses, with these three parts diagrammed in the Base Concept.
Figure 1. Simplified diagrams of climbing harnesses.
On most harnesses, the leg loops are kept connected to the belay loop by a “keeper strap.” This is usually a weak connection not meant to bear weight, only to keep the leg loops centered on the harness (shown in blue in Figure 1). In the case study that prompted this blog post, the keeper strap was connected through the belay loop rather than through the full-strength leg loops (Figure 2). When loaded, it came apart, separating the leg loops from the waistbelt. My own tests found that the keeper strap can be quite strong when it is loaded along the strap itself. But if the leg loops shift so that the keeper buckle is loaded by the belay loop, it comes apart easily.
Figure 2. Harness assembled with keeper strap bearing weight via the belay loop.
There are two ways to mis-attach leg loops to the belay loop of a harness. The first way is by connecting the leg loops back to the harness, after they were removed, using the keeper strap. The video below demonstrates this possibility. Once connected, the harness fits well and gives little indication the leg loops are not actually connected to bear weight.
The second (and I think more likely) way is by having the leg loops disconnected from the back of the harness, usually for a bathroom break or to get in and out of the harness. The leg loops remain connected at the front of the harness, but if a leg loop passes through the belay loop, the keeper strap suddenly becomes load-bearing when the leg loops flip around. The harness does not fit differently, nor does it look particularly different unless carefully inspected. Video below.
The non-load-bearing parts of the harness are what make this error possible. In Figure 1, some harnesses either do not allow disconnection of the leg loops in back or only allow their disconnection in tandem. When the leg loops are connected this way, the front of the leg loops cannot be passed through the belay loop. Video demonstration below.
Returning to Figure 1, some harnesses allow each leg loop to be disconnected individually. If these are disconnected, a loop may be passed through the front belay loop, producing the error in Figure 2.
In sum, this error can be examined for likelihood and severity. It is not likely to occur; however, if it does occur, it is likely to go undiscovered until the keeper strap comes apart. As for severity, the error could be lethal, although that outcome is not likely: the waistbelt will hold the climber’s weight, and having both leg loops and a waistbelt is a (comfortable) redundancy. However, the shock of suddenly losing support from the leg loops could cause loss of control, either on an un-backed-up rappel or while belaying another climber.
What are the alternatives?
Climbing is exploding in popularity, particularly climbing in gyms. “Gym” harnesses, with fewer components and gear loops (Figure 1), are a good option for most climbers now. However, there is little guidance on choosing a harness for the gym versus for outdoor versatility, so few climbers probably know this option exists.
Some harnesses are designed to be load-bearing at all points (e.g., the “SafeTech” below), which makes an error in leg loop attachment impossible.
Harnesses with permanently attached leg loops or loops that attach in the back with a single point are unlikely to result in the error.
Many climbers reading this are thinking “This would never happen to me” or “You’d have to be an idiot to put your harness together like that” or my usual favorite “If you wanted climbing to be perfectly safe, you shouldn’t even go.” Blaming the victim gives us a feeling of control over our own safety. However, there are other instances where gear was assembled or re-assembled incorrectly with tragic consequences. No one (or their child) deserves to pay with their life for a simple mistake that can be prevented through good design.
I’ll be the first to admit that I experience cognitive overload while trying to park. When there are three signs and the information needs to be combined across them, or at least each one needs to be searched, considered, and eliminated, I spend a lot of time blocking the street trying to decide if I can park.
For example, there might be a sign that says “No parking school zone 7-9am and 2-4pm” combined with a “2 hour parking only without residential permit 7am-5pm” and “< —-Parking” to indicate the side of the sign that’s open. It’s a challenge to figure out where and how long I can park at 1pm or what happens at 7pm.
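To make that cognitive load concrete, here is a toy model (in Python) of the interval arithmetic those three signs demand. The rule encodings are my paraphrase of the example, and the model still ignores day-of-week rules and overnight wraparound, which is rather the point.

```python
# Toy model of combining the three example signs; hours use a 24-hour clock,
# and we assume the driver has no residential permit.

NO_PARKING = [(7, 9), (14, 16)]  # school zone: 7-9am and 2-4pm
TWO_HOUR_LIMIT = (7, 17)         # 2-hour max without a permit, 7am-5pm

def can_park(hour: float, stay_hours: float) -> str:
    # Rule 1: am I starting inside a no-parking window?
    for start, end in NO_PARKING:
        if start <= hour < end:
            return "No: school-zone no-parking window."
    # Rule 2: does the 2-hour limit apply to my stay?
    limit_start, limit_end = TWO_HOUR_LIMIT
    if limit_start <= hour < limit_end and stay_hours > 2:
        return "No: exceeds the 2-hour limit without a permit."
    # Rule 3: will my stay run INTO a no-parking window?
    for start, end in NO_PARKING:
        if hour < start < hour + stay_hours:
            return f"Only until {start}:00, when the no-parking window begins."
    return "Yes."

print(can_park(hour=13, stay_hours=2))  # collides with the 2pm school window
print(can_park(hour=18, stay_hours=3))  # evening: yes
```

Three separate rules, each needing its own check, for a single stretch of curb: no wonder we block the street while we think.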
Designer Nikki Sylianteng created new signs for parking in Los Angeles that incorporated all information into a single graphic.
I still have some difficulty in going back and forth to the legend at the bottom, but probably just because I’ve never seen the signs before. Otherwise, one just needs to know the time and day of the week.
An interview with her in LA Weekly describes how she mocked up a laminated example in New York and asked people on the street for feedback via Sharpies. (Yay for paper prototypes!) An NPR story focused on the negative reactions of a few harried LA denizens, who predictably said “I like how it was,” but I’d like to see some timed tests of how quickly people can decide whether it’s OK to park. I’d also suggest using a dual-task paradigm to put parkers under the same cognitive load in the lab as they might experience on the street.