All posts by Anne McLaughlin

Associate Professor, Department of Psychology, North Carolina State University, Raleigh, NC

The current need for enforcement of safety regulations

An NPR article reports on safety violations in Kentucky:

In December 2016, Pius “Gene” Hobbs was raking gravel with the Meade County public works crew when a dump truck backed over him. The driver then accelerated forward, hitting him a second time. Hobbs was crushed to death.

The sole eyewitness to the incident said that the dump truck’s backup beeper wasn’t audible at the noisy worksite. The Kentucky State Police trooper on the scene concurred. Hobbs might not have been able to hear the truck coming.

But when Kentucky Occupational Safety and Health arrived, hours later, the inspector tested the beeper on a quiet street and said it wasn’t a problem.

“These shortcomings are very concerning,” says Jordan Barab, a workplace safety expert who served as Deputy Assistant Secretary of Labor for Occupational Safety and Health under President Barack Obama. “Identifying the causes of these incidents is … vitally important.” Otherwise, the employer doesn’t know how to avoid the next incident, he says.

Gene Hobbs’ case is not the exception. In fact, it’s the norm, according to a recent federal audit.

Kentucky is what’s known as a “state plan,” meaning the federal Occupational Safety and Health Administration has authorized it to run its own worker safety program.

Every year, federal OSHA conducts an audit of all 28 state plans to ensure they are “at least as effective” as the federal agency at identifying and preventing workplace hazards.

According to this year’s audit of Kentucky, which covered fiscal year 2017, KY OSH is not meeting that standard. In fact, federal OSHA identified more shortcomings in Kentucky’s program than any other state.

We know that we must have regulations, and enforcement of those regulations, to have safe environments. Left to our own devices, we tend to choose what appears to be the fastest and easiest option, not the safest one. For an interesting read on the history of safety regulation, see this article from the Department of Labor.

In 1898 the Wisconsin bureau reported that it was often difficult to find safety devices that did not reduce efficiency. Sanitary improvements and fire escapes were expensive, which led many employers to resist their adoption. Constant pressure and attention were needed to obtain compliance. Employers objected to the posting of laws in their establishments and some tore them down. The proprietor of a shoe factory with very poor fire escape routes showed “a disposition to defeat” an inspector’s request for more fire escapes, though he complied in the end. A cloak maker who was also found to have inadequate fire escapes went to the extreme of relocating his operation to avoid compliance. Such delays were not uncommon.

When an inspector found abominable conditions in the dipping rooms of a match factory — poorly ventilated rooms filled with poisonous fumes from the liquid phosphorus which made up the match heads — he tried to persuade the operators to make improvements. They objected because of the costs involved and the inspector “left without expecting to see the changes made.” When a machinery manufacturer equipped his ripsaws with guards after an inspection, a reinspection revealed that the employees had removed the guards.

Without regulation, we’ll be back to 1898 in short order.

Lion Air Crash from October 2018

From CNN:

The passengers on the Lion Air 610 flight were on board one of Boeing’s newest, most advanced planes. The pilot and co-pilot of the 737 MAX 8 were more than experienced, with around 11,000 flying hours between them. The weather conditions were not an issue and the flight was routine. So what caused that plane to crash into the Java Sea just 13 minutes after takeoff?

I’ve been waiting for updated information on the Lion Air crash before posting details. When I first read about the accident, it struck me as a collection of human factors design violations. I’ve pulled together some of the news reports on the crash, organized by the types of problems experienced on the airplane.

1. “a cacophony of warnings”
Fortune reported on the number of warnings and alarms that began to sound as soon as the plane took flight. The same alarms had occurred on the plane’s previous flight, and there is some blaming of the victims in asking, “If a previous crew was able to handle it, why not this one?”

The alerts included a so-called stick shaker — a loud device that makes a thumping noise and vibrates the control column to warn pilots they’re in danger of losing lift on the wings — and instruments that registered different readings for the captain and copilot, according to data presented to a panel of lawmakers in Jakarta Thursday.

2. New automation features, no training
The plane included new “anti-stall” technology that airlines say was neither explained well nor included in Boeing’s training materials.

In the past week, Boeing has stepped up its response by pushing back on suggestions that the company could have better alerted its customers to the jet’s new anti-stall feature. The three largest U.S. pilot unions and Lion Air’s operations director, Zwingly Silalahi, have expressed concern over what they said was a lack of information.

As was previously revealed by investigators, the plane’s angle-of-attack sensor on the captain’s side was providing dramatically different readings than the same device feeding the copilot’s instruments.

Angle of attack registers whether the plane’s nose is pointed above or below the oncoming air flow. A reading showing the nose is too high could signal a dangerous stall and the captain’s sensor was indicating more than 20 degrees higher than its counterpart. The stick shaker was activated on the captain’s side of the plane, but not the copilot’s, according to the data.

And more from CNN:

“Generally speaking, when there is a new delivery of aircraft — even though they are the same family — airline operators are required to send their pilots for training,” Bijan Vasigh, professor of economics and finance at Embry-Riddle Aeronautical University, told CNN.

Those training sessions generally take only a few days, but they give the pilots time to familiarize themselves with any new features or changes to the system, Vasigh said.

One of the MAX 8’s new features is an anti-stalling device, the maneuvering characteristics augmentation system (MCAS). If the MCAS detects that the plane is flying too slowly or steeply, and at risk of stalling, it can automatically lower the airplane’s nose.

It’s meant to be a safety mechanism. But the problem, according to Lion Air and a growing chorus of international pilots, was that no one knew about that system. Zwingli Silalahi, Lion Air’s operational director, said that Boeing did not suggest additional training for pilots operating the 737 MAX 8. “We didn’t receive any information from Boeing or from regulator about that additional training for our pilots,” Zwingli told CNN Wednesday.

“We don’t have that in the manual of the Boeing 737 MAX 8. That’s why we don’t have the special training for that specific situation,” he said.
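Pulling these threads together: the reports describe an automated system that could push the nose down based on angle-of-attack data, while the two angle-of-attack sensors reportedly disagreed by more than 20 degrees. As a thought experiment only (the details of Boeing’s actual MCAS logic are not in these reports), here is a minimal sketch in Python of why cross-checking redundant sensors before letting automation act matters. The threshold values and function names are hypothetical.

```python
# Hypothetical sketch: cross-checking redundant angle-of-attack (AoA) sensors
# before allowing an automated nose-down command. Not Boeing's actual MCAS logic.

AOA_DISAGREE_THRESHOLD_DEG = 5.0   # assumed tolerance between the two sensors
STALL_AOA_DEG = 15.0               # assumed stall-warning threshold

def automation_may_act(captain_aoa_deg: float, copilot_aoa_deg: float) -> bool:
    """Return True only if both sensors agree well enough to trust either one."""
    disagreement = abs(captain_aoa_deg - copilot_aoa_deg)
    if disagreement > AOA_DISAGREE_THRESHOLD_DEG:
        # Sensors disagree: alert the crew and inhibit automatic trim
        # rather than acting on a single, possibly faulty reading.
        print(f"AOA DISAGREE ({disagreement:.1f} deg) - automation inhibited")
        return False
    return True

def mcas_like_logic(captain_aoa_deg: float, copilot_aoa_deg: float) -> str:
    if not automation_may_act(captain_aoa_deg, copilot_aoa_deg):
        return "no action (sensor disagreement)"
    if max(captain_aoa_deg, copilot_aoa_deg) > STALL_AOA_DEG:
        return "command nose-down trim"
    return "no action"

# On the accident flight the captain's sensor reportedly read more than
# 20 degrees higher than the copilot's:
print(mcas_like_logic(captain_aoa_deg=22.0, copilot_aoa_deg=1.5))
```

The point is not that the fix is a few lines of code; it is that acting on a single, possibly faulty sensor, without telling the crew why the automation is acting, is exactly the kind of design decision a human factors analysis is meant to catch.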

Human Factors and the Ballot Box

New NPR story on the non-usability of ballots, voting software, and other factors affecting our elections:

New York City’s voters were subject to a series of setbacks after the election board unrolled a perforated two-page ballot. Voters who didn’t know they had to tear at the edges to get at the entire ballot ended up skipping the middle pages. Then the fat ballots jammed the scanners, long lines formed, and people’s ballots got soaked in the rain. When voters fed the soggy ballots into scanners, more machines malfunctioned.

In Georgia, hundreds blundered on their absentee ballot, incorrectly filling out the birth date section. Counties originally threw out the ballots before a federal judge ordered they be counted.

And in Broward County, Fla., 30,000 people who voted for governor skipped the contest for U.S. Senate. The county’s election board had placed that contest under a block of multi-lingual instructions, which ran halfway down the page. Quesenbery says voters scanning the instructions likely skimmed right over the race.

She has seen this design before. In 2009, King County, Wash., buried a tax initiative under a text-heavy column of instructions. An estimated 40,000 voters ended up missing the contest, leading the state to pass a bill mandating ballot directions look significantly different from the contests below.

“We know the answers,” says Quesenbery. “I wish we were making new mistakes, not making the same old mistakes.”

The story didn’t even mention the issues with the “butterfly ballot” from Florida in 2000. Whitney Quesenbery is right: we do know the answers, and we certainly know the methods for getting the answers. We need the will to apply them in our civics, not just in commercial industry.

Hawaii False Alarm: The story that keeps on giving

Right after the Hawaii false nuclear alarm, I posted about how the user interface seemed to contribute to the error. At the time, sources were reporting it as a “dropdown” menu. Well, that wasn’t exactly true, but in the last few weeks it’s become clear that truth is stranger than fiction. Here is a run-down of the news on the story (spoiler, every step is a human factors-related issue):

  • Hawaii nuclear attack alarms are sounded, also sending alerts to cell phones across the state
  • Alarm is noted as false and the state struggles to get that message out to the panicked public
  • Error is blamed on a confusing drop-down interface: “From a drop-down menu on a computer program, he saw two options: ‘Test missile alert’ and ‘Missile alert.’”
  • The actual interface is found and shown – rather than a drop-down menu, it’s a set of closely clustered links on a 1990s-era-looking web page that read “DRILL-PACOM(CDW)-STATE ONLY” and “PACOM(CDW)-STATE ONLY”
  • It comes to light that part of the reason the wrong alert stood for 38 minutes was that the Governor didn’t remember his Twitter login and password
  • Latest news: the employee who sounded the alarm says it wasn’t an error – he heard this was “not a drill” and acted accordingly, triggering the real alarm

The now-fired employee has spoken up, saying he was sure of his actions and “did what I was trained to do.” When asked what he’d do differently, he said “nothing,” because everything he saw and heard at the time made him think this was not a drill. His firing is clearly an attempt by Hawaii to get rid of a ‘bad apple.’ Problem solved?

It seems like a good time for my favorite reminder from Sidney Dekker’s book, “The Field Guide to Human Error Investigations” (abridged):

To protect safe systems from the vagaries of human behavior, recommendations typically propose to:

    • Tighten procedures and close regulatory gaps. This reduces the bandwidth in which people operate. It leaves less room for error.
    • Introduce more technology to monitor or replace human work. If machines do the work, then humans can no longer make errors doing it. And if machines monitor human work, they can snuff out any erratic human behavior.
    • Make sure that defective practitioners (the bad apples) do not contribute to system breakdown again. Put them on “administrative leave”; demote them to a lower status; educate or pressure them to behave better next time; instill some fear in them and their peers by taking them to court or reprimanding them.

In this view of human error, investigations can safely conclude with the label “human error”—by whatever name (for example: ignoring a warning light, violating a procedure). Such a conclusion and its implications supposedly get to the causes of system failure.

AN ILLUSION OF PROGRESS ON SAFETY
The shortcomings of the bad apple theory are severe and deep. Progress on safety based on this view is often a short-lived illusion. For example, focusing on individual failures does not take away the underlying problem. Removing “defective” practitioners (throwing out the bad apples) fails to remove the potential for the errors they made.

…[T]rying to change your people by setting examples, or changing the make-up of your operational workforce by removing bad apples, has little long-term effect if the basic conditions that people work under are left unamended.

A ‘bad apple’ is often just a scapegoat that makes people feel better by giving them a focus for blame. Real improvements in safety come from improving the system, not from getting rid of employees who were forced to work within a problematic one.

‘Mom, are we going to die today? Why won’t you answer me?’ – False Nuclear Alarm in Hawaii Due to User Interface


Image from the New York Times

The morning of January 13th, people in Hawaii received a false alarm that the state was under nuclear attack. One of the messages people received was via cell phone, and it said: “BALLISTIC MISSILE THREAT INBOUND TO HAWAII. SEEK IMMEDIATE SHELTER. THIS IS NOT A DRILL.” Today, the Washington Post reported that the alarm was due to an employee pushing the “wrong button” when trying to test the nuclear alarm system.

The quote in the title of this post is from another Washington Post article where people experiencing the alarm were interviewed.

To sum up the issue: the alarm is triggered by choosing an option from a drop-down menu, which had options for “Test missile alert” and “Missile alert.” The employee chose the wrong option and, once it was chosen, the system had no way to reverse the alarm.

A nuclear alarm system should be held to particularly high usability requirements, but this system didn’t even conform to Nielsen’s 10 heuristics. It violates:

  • User control and freedom: Users often choose system functions by mistake and will need a clearly marked “emergency exit” to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
  • Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
  • Error prevention: Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
  • Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
And those are just the ones I could identify from reading the Washington Post article! Perhaps human factors analysis will become a regulatory requirement for these systems, as it has for FDA-regulated medical devices.
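To make the “error prevention” and “user control and freedom” points concrete, here is a minimal sketch, in Python, of what an alert-dispatch workflow with a consequence-naming confirmation and a cancellation window could look like. The function names, message wording, and five-minute cancel window are all hypothetical; this is not the actual Hawaii EMA software.

```python
# Hypothetical sketch of an alert workflow with error prevention (an explicit
# confirmation that names the consequence) and an undo path (a follow-up
# correction within a cancel window). Not the actual Hawaii EMA software.

import time

CANCEL_WINDOW_SECONDS = 300  # assumed: time during which a correction can be pushed

def send_alert(is_drill: bool, confirm_text: str) -> dict:
    """Require the operator to type the consequence before anything goes out."""
    expected = "THIS IS A DRILL" if is_drill else "THIS IS A REAL ALERT"
    if confirm_text.strip().upper() != expected:
        return {"sent": False, "reason": f'Confirmation must read "{expected}"'}
    return {"sent": True, "is_drill": is_drill, "sent_at": time.time()}

def cancel_alert(alert: dict) -> dict:
    """Support undo: push a correction to the same channels within the window."""
    if not alert.get("sent"):
        return {"cancelled": False, "reason": "nothing was sent"}
    if time.time() - alert["sent_at"] > CANCEL_WINDOW_SECONDS:
        return {"cancelled": False, "reason": "cancel window expired; escalate"}
    return {"cancelled": True, "correction": "FALSE ALARM. There is no missile threat."}

# The operator must state, in words, what they are about to do:
alert = send_alert(is_drill=False, confirm_text="THIS IS A DRILL")
print(alert)  # blocked: the typed confirmation does not match the live-alert consequence
```

The design choice worth noting is that the confirmation requires typing the consequence (drill versus real alert) rather than clicking an adjacent menu item, and that a correction path exists and is as easy to use as the original send.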

    Did a User Interface Kill 10 Navy Sailors?

I chose a provocative title for this post after reading the report on what caused the collision of the USS John S. McCain in August of 2017. In summary: the ship was in high-traffic waters when the crew believed they had lost steering control. Despite attempts to slow and maneuver, it was struck by another large vessel. The bodies of 10 sailors were eventually recovered, and five others were injured.

Today the Navy released its final report on the accident. After reading it, it seems to me that the report blames the crew. Here are some quotes from the official Navy report:

    • Loss of situational awareness in response to mistakes in the operation of the JOHN S MCCAIN’s steering and propulsion system, while in the presence of a high density of maritime traffic
    • Failure to follow the International Nautical Rules of the Road, a system of rules to govern the maneuvering of vessels when risk of collision is present
    • Watchstanders operating the JOHN S MCCAIN’s steering and propulsion systems had insufficient proficiency and knowledge of the systems

And a rather devastating one:

    In the Navy, the responsibility of the Commanding Officer for his or her ship is absolute. Many of the decisions made that led to this incident were the result of poor judgment and decision making of the Commanding Officer. That said, no single person bears full responsibility for this incident. The crew was unprepared for the situation in which they found themselves through a lack of preparation, ineffective command and control and deficiencies in training and preparations for navigation.

    Ouch.

Ars Technica called my attention to an important cause of the accident that the report does not specifically call out: the poor feedback design of the control system. I think it is a problem that the report focused on “failures” of the people involved rather than on the design of the machines and systems they used. After my reading, I would summarize the reason for the accident as: “The ship could be controlled from many locations. This control was transferred using a computer interface. That interface did not give sufficient information about its current state or feedback about which station controlled which functions of the ship. This made the crew think they had lost steering control when that control had actually just been moved to another location.” I based this on information from the report, including:

    Steering was never physically lost. Rather, it had been shifted to a different control station and watchstanders failed to recognize this configuration. Complicating this, the steering control transfer to the Lee Helm caused the rudder to go amidships (centerline). Since the Helmsman had been steering 1-4 degrees of right rudder to maintain course before the transfer, the amidships rudder deviated the ship’s course to the left.

    Even this section calls out the “failure to recognize this configuration.” If the system is designed well, one shouldn’t have to expend any cognitive or physical resources to know from where the ship is being controlled.
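To illustrate what that could mean in practice, here is a minimal sketch, in Python, of a control-transfer routine in which announcing the new configuration to every station is part of the transfer itself, not something watchstanders must infer. The station names and message text are hypothetical; this is not the Navy’s actual bridge software.

```python
# Hypothetical sketch: transferring helm control between stations, where the
# transfer is not complete until every station has been told who has control.
# Not the actual integrated bridge system aboard DDG 56.

from dataclasses import dataclass, field

@dataclass
class ControlState:
    steering_station: str = "HELM"
    propulsion_station: str = "HELM"
    log: list = field(default_factory=list)

    def transfer_steering(self, new_station: str) -> None:
        old = self.steering_station
        self.steering_station = new_station
        # Feedback is part of the operation, not an afterthought:
        self.broadcast(f"STEERING transferred {old} -> {new_station}. "
                       f"Rudder orders now accepted only at {new_station}.")

    def broadcast(self, message: str) -> None:
        # In a real system this would drive a persistent banner at every console
        # and an audible announcement; here we just record and print it.
        self.log.append(message)
        print(message)

state = ControlState()
state.transfer_steering("LEE HELM")
# Every console now shows the same unambiguous answer to the question
# "who is steering the ship right now?"
```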

Overall, I was surprised at the tone of this report regarding crew performance. Perhaps some of it is deserved, but without a hard look at the systems the crew used, I don’t have much faith that we can avoid future accidents. Fitts and Jones helped start the human factors field in 1947 when they insisted that the design of the cockpit created accident-prone situations. This went against the belief of the time, which was that “pilot error” was the main factor, and it ushered in a new era, one where we try to improve the systems people must use as well as their training and decision making. The picture below shows the helm of the USS John S. McCain, commissioned in 1994. I would be very interested to see how it appears in action.

    US Navy (USN) Boatswain’s Mate Seaman (BMSN) Charles Holmes mans the helm aboard the USN Arleigh Burke Class Guided Missile Destroyer USS JOHN S. MCCAIN (DDG 56) as the ship gets underway for a Friends and Family Day cruise. The MCCAIN is getting underway for a Friends and Family Day cruise from its homeport at Commander Fleet Activities (CFA) Yokosuka Naval Base (NB), Japan (JPN). Source: Wikimedia Commons

    Tesla counterpoint: “40% reduction in crashes” with introduction of Autosteer

    I posted yesterday about the challenges of fully autonomous cars and cars that approach autonomy. Today I bring you a story about the successes of semi-automatic features in automobiles.

Tesla has a feature called Autopilot that assists the driver without being completely autonomous. Autopilot includes car-controlled actions such as collision warnings, automatic emergency braking, and automatic lane keeping. Tesla classifies the Autopilot features as Level 2 automation (Level 5 is considered fully autonomous). Rich has already shared our thoughts on the name “Autopilot” in a previous post. One particular feature, called Autosteer, is described in the NHTSA report as:

    The Tesla Autosteer system uses information from the forward-looking camera, the radar sensor, and the ultrasonic sensors, to detect lane markings and the presence of vehicles and objects to provide automated lane-centering steering control based on the lane markings and the vehicle directly in front of the Tesla, if present. The Tesla owner’s manual contains the following warnings: 1) “Autosteer is intended for use only on highways and limited-access roads with a fully attentive driver. When using Autosteer, hold the steering wheel and be mindful of road conditions and surrounding traffic. Do not use Autosteer on city streets, in construction zones, or in areas where bicyclists or pedestrians may be present. Never depend on Autosteer to determine an appropriate driving path. Always be prepared to take immediate action. Failure to follow these instructions could cause serious property damage, injury or death;” and 2) “Many unforeseen circumstances can impair the operation of Autosteer. Always keep this in mind and remember that as a result, Autosteer may not steer Model S appropriately. Always drive attentively and be prepared to take immediate action.” The system does not prevent operation on any road types.

An NHTSA report looking into a fatal Tesla crash also noted that the introduction of Autosteer corresponded to a roughly 40% reduction in crash rates for the vehicles studied. That’s a lot, considering Dr. Gill Pratt of Toyota said he might be happy with a 1% change.

Autopilot was enabled in October 2015, so there has been a good period of time for post-Autopilot crash data to accumulate.
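For context on where a number like “40%” comes from, here is a back-of-the-envelope sketch. The before-and-after rates below (crashes per million miles) are the figures widely quoted from the NHTSA report; treat them as illustrative inputs rather than an independent analysis.

```python
# Back-of-the-envelope check on the "40% reduction" figure.
# The rates below (crashes per million miles, before and after Autosteer
# installation) are the figures widely quoted from the NHTSA report;
# they are illustrative inputs, not an independent analysis.

rate_before = 1.3   # crashes per million miles, pre-Autosteer
rate_after = 0.8    # crashes per million miles, post-Autosteer

reduction = (rate_before - rate_after) / rate_before
print(f"Relative reduction: {reduction:.0%}")   # ~38%, reported as "about 40%"
```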

    Toyota Gets It: Self-driving cars depend more on people than on engineering

I recommend reading this interview with Toyota’s Dr. Gill Pratt in its entirety. He discusses, point by point, the challenges of a self-driving car that we consider in human factors but don’t hear much about in the media. For example:

    • Definitions of autonomy vary. True autonomy is far away. He gives the example of a car performing well on an interstate or in light traffic compared to driving through the center of Rome during rush hour.
    • Automation will fail. And the less it fails, the less prepared the driver is to assume control.
    • Emotionally, we cannot accept autonomous cars that kill people, even if they reduce overall crash rates and save lives in the long run.
    • It is difficult to run simulations with the autonomous cars that capture the extreme variability of the human drivers in other cars.

    I’ll leave you with the last paragraph in the interview as a summary:

    So to sum this thing up, I think there’s a general desire from the technical people in this field to have both the press and particularly the public better educated about what’s really going on. It’s very easy to get misunderstandings based on words like or phrases like “full autonomy.” What does full actually mean? This actually matters a lot: The idea that only the chauffeur mode of autonomy, where the car drives for you, that that’s the only way to make the car safer and to save lives, that’s just false. And it’s important to not say, “We want to save lives therefore we have to have driverless cars.” In particular, there are tremendous numbers of ways to support a human driver and to give them a kind of blunder prevention device which sits there, inactive most of the time, and every once in a while, will first warn and then, if necessary, intervene and take control. The system doesn’t need to be competent at everything all of the time. It needs to only handle the worst cases.
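Pratt’s “blunder prevention device which sits there, inactive most of the time” maps onto a simple layered policy: stay silent below one risk threshold, warn above it, and intervene only above a higher one. Here is a minimal sketch of that idea in Python; the risk measure and thresholds are invented for illustration and are not Toyota’s Guardian design.

```python
# Hypothetical sketch of a layered "guardian" policy: silent most of the time,
# warn when risk rises, intervene only in the worst cases. The risk score and
# thresholds are invented for illustration.

WARN_THRESHOLD = 0.6
INTERVENE_THRESHOLD = 0.9

def guardian_action(collision_risk: float) -> str:
    """collision_risk: estimated probability of a collision in the next few seconds."""
    if collision_risk >= INTERVENE_THRESHOLD:
        return "intervene: brake / steer away"
    if collision_risk >= WARN_THRESHOLD:
        return "warn the driver"
    return "do nothing (driver remains in control)"

for risk in (0.1, 0.7, 0.95):
    print(f"risk={risk:.2f} -> {guardian_action(risk)}")
```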

    The Patient Writes the Prescription

[Photo: the dashboard message described below]

I took the photo above in my brother-in-law’s 2015 Jeep Grand Cherokee EcoDiesel. It says “Exhaust Filter Nearing Full Safely Drive at Highway Speeds to Remedy.”

I’d never seen anything like that before, and neither had he. It seemed like a terrible idea at first: what if the person couldn’t drive at highway speeds right then? Spending an unknown amount of time driving at highway speeds, wasting gas, also seemed unpleasant. My brother-in-law said that he had been having issues with the car before, but it wasn’t until the Jeep downloaded a software update that it displayed this message on the dashboard.

    My own car will be 14 years old this year (nearing an age where it can get its own learner’s permit?), so I had to adjust to the idea of a car that updated itself. I was intrigued by the issue and looked around to see what other Jeep owners had to say.

    I found another unhappy customer at the diesel Jeep forum:

    At the dealer a very knowledgeable certified technician explained to me that the problem is that we had been making lots of short trips in town, idling at red lights, with the result that the oil viscosity was now out of spec and that the particulate exhaust filter was nearly full and needed an hour of 75 mph driving to get the temperature high enough to burn off the accumulated particulates. No person and no manual had ever ever mentioned that there is a big problem associated with city driving.

    And further down the rabbit hole, I found it wasn’t just the diesel Jeep. This is from a Dodge Ram forum:

    I have 10,000K on 2014 Dodge Ram Ecodiesel. Warning came on that exhaust filter 90% full. Safely drive at highway speeds to remedy. Took truck on highway & warning changed to exhaust system regeneration in process. Exhaust filter 90% full.
    All warnings went away after 20 miles. What is this all about?

It looks like Jeep added a supplement to the owner’s manual in 2015 to explain the problem:

    Exhaust Filter XX% Full Safely Drive at Highway Speeds to Remedy — This message will be displayed on the Driver Information Display (DID) if the exhaust particulate filter reaches 80% of its maximum storage capacity. Under conditions of exclusive short duration and low speed driving cycles, your diesel engine and exhaust after-treatment system may never reach the conditions required to cleanse the filter to remove the trapped PM. If this occurs, the “Exhaust Filter XX% Full Safely Drive at Highway Speeds to Remedy” message will be displayed in the DID. If this message is displayed, you will hear one chime to assist in alerting you of this condition. By simply driving your vehicle at highway speeds for up to 20 minutes, you can remedy the condition in the particulate filter system and allow your diesel engine and exhaust after-treatment system to cleanse the filter to remove the trapped PM and restore the system to normal operating condition.

But now that I’ve had time to think about it, I agree with the remedy. After all, my own car just has a “check engine” light no matter what the issue is. Twenty minutes on the highway is a lot easier than scheduling a trip to a mechanic.

What could be done better is the communication of the warning. It tells you what to do, and sort of why, but not how long you have to act or what the consequences of not acting are. The manual contains a better explanation of why (although its 20-minute estimate does not match the 60-minute estimate of at least one expert), but not many people read the manual. Also, the manual doesn’t match the message: the manual says you’ll see a percentage full, but the message just said “nearing full.” The dash display should direct the driver to more information in the manual, or, with such a modern display, perhaps scroll to reveal more information (showing partial text so the driver knows to scroll). Knowing how long you have to act is more critical, and maybe a percentage would convey that, since the driver can probably assume he or she can keep driving until closer to 100% before taking action. As written, it looks as though the driver needs to find a way to drive at highway speeds right now, but hopefully that is not the case. I can’t say for sure, though, since neither the manual nor the display told me the answer.
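To make that suggestion concrete, here is a minimal sketch, in Python, of a warning that reports the percentage full, the recommended action, and a rough sense of urgency. The 80% trigger and the roughly 20-minute highway-speed remedy come from the owner’s-manual supplement quoted above; the message wording and the escalation levels are my own invention.

```python
# Hypothetical sketch of a more informative exhaust-filter warning.
# The 80% trigger and the ~20-minute highway-speed remedy come from the
# owner's-manual supplement quoted above; the message wording and the
# escalation levels are invented for illustration.

def exhaust_filter_message(percent_full: float) -> str:
    if percent_full < 80:
        return ""  # no message below the trigger threshold
    if percent_full < 90:
        urgency = "soon (within the next few trips)"
    elif percent_full < 100:
        urgency = "as soon as practical"
    else:
        return ("Exhaust filter full. See dealer for service. "
                "Highway driving will no longer clear the filter.")
    return (f"Exhaust filter {percent_full:.0f}% full. "
            f"Drive about 20 minutes at highway speed {urgency} "
            f"to let the filter self-clean. See 'Exhaust Filter' in the manual for details.")

for pct in (75, 83, 95, 100):
    print(exhaust_filter_message(pct))
```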