
For years, it has been the contention of reliability professionals that root cause failure analysis (RCFA) is a skill that must be taught, absorbed, and consistently practiced. We agree with the CEO of a prominent RCFA training company, which, for several decades, has successfully shown plant personnel how to analyze failures and avoid their recurrence. This gentleman made it his business to read all types of outside company and potential-client-company (PCC) newsletters. At some point, he alerted us to one that caused him well-justified concern.

The cited newsletter was from a PCC that had arranged to train six of its key employees in a highly successful method that also happened to be one of many courses the CEO’s company had been developing and teaching for years. As stated in an earlier newsletter from the same PCC, its maintenance manager had designated those six staffers to perform all root-cause analyses at the company.

Some years later, a follow-up newsletter from the same PCC announced a new “best practice.” They had “evolved” (or so they thought) from the established method to an approach whereby the investigator simply asks “why” five consecutive times. In essence, this PCC was now teaching all its workers to ask “why” five times because:

“People can then do their own root-cause analysis and solve their own problems without help from the corporate staff.”

Let’s use the following contrived example to refresh ourselves on how the “five whys” work their way back toward a perceived solution:

1. Why did the Environmental Protection Agency levy a fine? Because the flare went off.

2. Why did the flare go off? Because control valve PV-456 opened.

3. Why did PV-456 open? Because the unit lost feed.

4. Why did the unit lose feed? Because pump P-123 had caught on fire.

5. Why did the pump catch on fire? Because the bearing failed.

There you have it: The bearing failed. But the bearing should have a statistical 90% chance of operating at rated load and speed for 40,000 hours. What if it really failed because the plant uses oil rings at a DN value (shaft diameter times rpm) of 10,900? That is well in the range where instabilities, or even the slightest amount of shaft out-of-horizontality, cause the oil ring to run downhill and contact the inside of the bearing housing. The contact causes slivers of brass to flake off and get into the lube oil, where they destroy the bearing. What if none of the supervisors or mechanics had been taught to use only flinger disks, never oil rings, at these DN values? Why didn’t they know? The true answer to that question may embarrass several layers of supervision and management.
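
As a rough illustration of the DN arithmetic mentioned above, the following Python sketch multiplies shaft diameter by shaft speed and compares the result with a limit. It assumes the diameter is expressed in inches (the article does not state the unit). The function names, the sample shaft (3.0 in at 3,600 rpm), and the `site_limit` value are hypothetical placeholders, not figures taken from the plant described here or from any standard.

```python
def dn_value(shaft_diameter_in: float, speed_rpm: float) -> float:
    """Return the DN value: shaft diameter multiplied by shaft speed.

    Assumption: diameter in inches, speed in revolutions per minute.
    """
    if shaft_diameter_in <= 0 or speed_rpm <= 0:
        raise ValueError("diameter and speed must be positive")
    return shaft_diameter_in * speed_rpm


def exceeds_limit(dn: float, site_limit: float) -> bool:
    """Return True when a DN value is above a limit chosen by the site."""
    return dn > site_limit


if __name__ == "__main__":
    # Hypothetical shaft, for illustration only.
    dn = dn_value(3.0, 3600)
    print(dn)  # 10800.0

    # Hypothetical limit; a real value would come from the site's own
    # lubrication guidelines, not from this sketch.
    site_limit = 8000.0
    print(exceeds_limit(dn, site_limit))  # True
```

A check like this only flags an operating point for review. Deciding whether oil rings or flinger disks are appropriate still depends on the training and experience the paragraph above describes.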

To be clear, “letting people solve their own problems” is far from “best practice.” We recall how some RCFA trainers and teachers had explained in talks and articles why only asking “why” five times and trusting certain other forms of cause-and-effect analysis will not work well. So, there’s really no need to dwell on the issue. Instead, we should highlight the results reported when some plant-floor personnel were taught to merely ask “why” five times as they went about performing investigations.

Granted, people trained in the “ask why five times” method were usually getting beyond the point of simply placing blame on something (or someone). Placing blame may have been a rather common conclusion before they received such training. But they seldom came close to the root causes uncovered by at least one more-detailed approach. And the results of analyzing the same problem varied significantly from one investigator to the next. Why? Because investigators can only use their own past experience to guide them. They frequently stop at symptoms, which they may then proceed to address with ineffective corrective actions. After all, this is what they have always seen and have always done.

EXAMPLES OF INEFFECTIVE RCFA
Personnel at the PCC in question thought they had discovered a best practice. And they weren’t (and are not) alone in this thinking. In many companies, people believe they’re performing RCFA when, in fact, they are merely addressing symptoms. Some real-world examples from facilities applying “simple” RCFA (such as only asking “why” five times in succession) are listed below.

Equipment failure. Defective mechanical seals were viewed as the root cause of an equipment failure and installing new seals was seen as the ultimate corrective action. No effort was made to determine why the seals failed or to acknowledge that whenever a mechanical seal fails in a machine, it does so for a reason. Perhaps that’s why one company in New Mexico had 23 mechanical seal failures on just two pumps over a two-year period.

Inappropriate action by operator. Human error (“they just goofed up”) was listed as the root cause for a serious mistake made by an operator, and additional training was prescribed as the appropriate corrective action. In this instance, nobody asked why the previous training had failed, or if specific, rather than general, training was really the most effective way to prevent the error from being made again.

Bad behavior. Inappropriate behavior was cited as the root cause of an operator not following a certain procedure. The corrective action for this problem was re-emphasizing the need to use procedures. Here, nobody asked about the usability of the procedure; enforcement of procedure usage by management; or if operators were actually rewarded for not using a slightly more time-consuming (albeit far safer) procedure.

The three examples above are very real. These types of errors and failure to address the root causes harbor the seeds of repetition. Since none of the referenced corrective actions cured the sources of the problems, all failure events repeated themselves at the affected sites.

(Note that repetition can occur after years or just months, as evident from the 23 seal failures at the facility in New Mexico. After events such as explosions and fires at major refineries in the U.S., repetition of problems should frighten us.)

When dealing with flammable, explosive, or toxic substances at hydrocarbon-processing plants, we should never allow deviations from the norm to become the new accepted standard. The safety of personnel and profitability of entire plants are at stake when RCFA is not carried out properly or when, for whatever reason, some of the best-documented and well-understood pipe-corrosion mechanisms are ignored.

WHEN ‘GOOD PRACTICE’ IS REALLY ‘BAD PRACTICE’
A corporation is at risk when people think they are improving performance but, in fact, are wasting effort. This happens when they implement ineffective fixes and lead management to believe that progress is being made when, in reality, they continue to misdiagnose underlying causes. Failure to remedy the root causes of problems brings such a company perilously close to major failures that could:

      • maim, or even kill people
      • lead to major production losses
      • cause significant product quality issues
      • result in significant environmental damage
      • lead to serious and difficult-to-regain loss of goodwill, and/or
      • culminate in painful fines from government or regulatory agencies.

The difficult outcomes listed above prove that instead of being a good practice, most (if not all) “quick-and-effortless” analyses are actually bad practices that should be shunned like the plague. For more information on this topic, consider reading the Root Cause Analysis Blog at www.taproot.com/blog.





ABOUT THE AUTHOR
Heinz Bloch’s long professional career included assignments as Exxon Chemical’s Regional Machinery Specialist for the United States. A recognized subject-matter expert on plant equipment and failure avoidance, he is the author of numerous books and articles, and continues to present at technical conferences around the world. Bloch holds B.S. and M.S. degrees in Mechanical Engineering and is an ASME Life Fellow. These days, he’s based near Houston, TX.


Tags: reliability, availability, maintenance, RAM, root-cause analyses, RCA, root-cause-failure analysis, RCFA, Taproot.com