Successful failure-analysis and -avoidance strategies will have structure and be repeatable. Within the scope of this article, the reader is (again) reminded of some straightforward routines that have been in use since the 1960s. These techniques are anchored in two important observations:
1. All failure events can be assigned to one of seven cause categories, 1(a) through 1(g), listed below. We again confirm there is no relevant eighth cause category:
- 1(a) Design Defects (would be replicated in all identical machines or parts)
- 1(b) Material Defects (detected by closely examining fracture surfaces, etc.)
- 1(c) Processing and Manufacturing Deficiencies (e.g., using flawed heat treatment)
- 1(d) Assembly and Installation Defects (inserting skewed gaskets, allowing careless piping procedures, etc.)
- 1(e) Off-design or Unintended Service Conditions (all designs fit into an intended operating range)
- 1(f) Maintenance Deficiencies (neglect, using incorrect procedures)
- 1(g) Improper Operation (disregarding the fact that nothing is indestructible)
2. Mechanical components can only fail due to “FRETT” (Force, Reactive Environment, Time, Temperature).
Whether one is looking for the cause category or, in the case of a failed mechanical part, the prevailing one of the four basic FRETT failure agents, one proceeds by a process of elimination. The one or, at most, two cause categories or failure agents that cannot be eliminated are then investigated more closely (Ref. 1).
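The process of elimination described above can be sketched in a few lines of Python. This is only an illustration, not a tool taken from Ref. 1: the function name `remaining_categories`, the short keys "1a" through "1g", and the rule that every elimination must carry a written reason are assumptions made for the example.

```python
# The seven cause categories, keyed in the order 1(a) through 1(g).
CAUSE_CATEGORIES = {
    "1a": "Design defects",
    "1b": "Material defects",
    "1c": "Processing and manufacturing deficiencies",
    "1d": "Assembly and installation defects",
    "1e": "Off-design or unintended service conditions",
    "1f": "Maintenance deficiencies",
    "1g": "Improper operation",
}


def remaining_categories(eliminated):
    """Return the category keys that have not been ruled out, in list order.

    `eliminated` maps a category key to the evidence used to rule it out.
    Assumption for this sketch: an elimination without stated evidence
    is rejected, because an unsupported elimination is only an opinion.
    """
    for key, evidence in eliminated.items():
        if key not in CAUSE_CATEGORIES:
            raise KeyError(f"unknown cause category: {key}")
        if not evidence.strip():
            raise ValueError(f"category {key} eliminated without evidence")
    return [key for key in CAUSE_CATEGORIES if key not in eliminated]


if __name__ == "__main__":
    # Hypothetical evidence strings, for illustration only.
    ruled_out = {
        "1a": "identical machines elsewhere run without this failure",
        "1b": "fracture surfaces show no material flaw",
    }
    for key in remaining_categories(ruled_out):
        print(key, CAUSE_CATEGORIES[key])
```

The same pattern would apply to the four FRETT agents; only the dictionary contents change. The value of writing the evidence down next to each eliminated category is that the reasoning can be reviewed and repeated by someone else.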
WHEN DEVIATIONS COMBINE
Whenever we investigate why a machine fails repeatedly and, seemingly, at random, it will be important to remember that more than one deviation (or failure cause) may be involved. Letting deviations combine is quite risky, and if acceptance of deviations becomes the “new normal,” failure risks often escalate to the point of danger.
In the great majority of machines operating in today’s process plants, three principles pertain and should be kept in mind:
- When deviations combine, serious failure events are rapidly approaching.
- All failures have causes. Unless causes are found and addressed, more failures will occur.
- Most root causes of machine-component failure can be uncovered and remedied by a properly led and trained workforce (Ref. 1).
To illustrate the point, the following case history describes an incident involving a small fluid machine. It documents how several deviations proved troublesome in a 200-hp, vertically oriented, integrally geared, high-speed, low-flow (HSLF) single-stage centrifugal compressor. The machine had experienced at least five costly bearing failures in the span of two years before an experienced reliability professional was asked to participate in a structured investigation. As is so often the case, more than one deviation was found responsible for the problems encountered with this fluid machine.
SOME POSSIBLE CAUSE CATEGORIES EXPLAINED
To get started, the reliability professional, whom we will here call the Troubleshooter/Failure Analyst (“TFA”), examined parts, pieces, and data after another failure occurred on a single-stage HSLF compressor. Considerable data had been collected by the equipment owner-operator, but the timeline for the different failures was not easy to reconstruct. Understanding the above-mentioned seven categories proved helpful.
There was much evidence that this HSLF had suffered from different issues. Moreover, it was soon evident that each failure incident had been treated along the lines of “the part that failed must be at fault.” True, if defective parts fabrication can be proven, simple part replacement may be a suitable action to take. Using the seven-root-cause process of elimination, the TFA reasoned as follows:
Not a faulty design—(1.a). If there is clear evidence of a faulty design, all such parts or machines would be affected. In that case, re-engineering might be appropriate. But if one redesigns “Component A”, the functionality of other components may be compromised. Caution would then be the wisest course of action and a rather comprehensive reassessment of many interacting factors would be needed. Because many other machines of this type and model were operating elsewhere, it could be concluded that the cause category listed above as 1(a) could be crossed off.
Quality-control issues: when and where? Recall the lead-in paragraphs, which made the point that all machinery failures can be placed in seven cause categories. Early in the chain of separate failure events, the equipment owners had decided that vendor quality control (QC) during fabrication was at fault, but where was the evidence? What measurements were taken to prove which components were defective? The equipment owner had suspected QC problems in three of the seven cause categories, but no “when, why, and where” was explained.
Off-design or unintended service conditions (1.e). Without data, all possibilities are mere hypotheses and opinions, so the TFA insisted on seeing data. When he examined (and re-read) data collected by the plant’s process computer, he quickly found that the small compressor had often been operated outside its allowable design flow range. Time was taken to explain the vulnerabilities of low-flow operation. Whether caused by operators or by automatic controls, these deviations from the original design pointed to a relevant cause category when at least one additional low-flow event was found in the older collection of operating data.
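A review of process-computer data for low-flow operation, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `find_low_flow_events`, the (timestamp, flow) sample format, and the numbers in the usage example are all hypothetical. The minimum allowable flow must come from the manufacturer's data for the specific machine, and a flow reading below that limit only flags a period worth investigating; it does not by itself prove that surge occurred.

```python
def find_low_flow_events(samples, min_allowable_flow):
    """Group consecutive below-minimum flow samples into events.

    samples: iterable of (timestamp, flow) pairs in time order.
    min_allowable_flow: limit in the same units as the flow samples
        (assumed to be taken from the manufacturer's data).
    Returns a list of (start_timestamp, end_timestamp, lowest_flow) tuples.
    """
    events = []
    current = None  # [start, end, lowest] while inside a low-flow period
    for timestamp, flow in samples:
        if flow < min_allowable_flow:
            if current is None:
                current = [timestamp, timestamp, flow]
            else:
                current[1] = timestamp
                current[2] = min(current[2], flow)
        elif current is not None:
            # Flow has recovered, so the low-flow period is closed.
            events.append(tuple(current))
            current = None
    if current is not None:
        # The record ended while flow was still below the limit.
        events.append(tuple(current))
    return events


if __name__ == "__main__":
    # Invented readings: sample index and flow in arbitrary units.
    history = [(0, 100), (1, 60), (2, 55), (3, 100), (4, 70)]
    for start, end, lowest in find_low_flow_events(history, 80):
        print(f"low flow from {start} to {end}, minimum {lowest}")
```

Grouping consecutive samples into events, rather than listing every low reading, makes it easier to line up each excursion against the dates of the recorded bearing failures.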
Low-flow operation (operation in surge) means that the gas flowrate is so low that the gas volume pulsates back-and-forth as it travels from the compressor’s inlet (at suction pressure) to its outlet (at discharge pressure). Pulsating flow would also explain failures of the thrust bearing, although thrust bearing distress could be the result of lube oil foaming and/or lube supply quantity and temperature having been out-of-range. A thorough review of lube oil properties was among the failure investigator’s recommendations; such reviews could be considered a hunt for maintenance-related failure causes.
Maintenance-related problems—(1.f). Such problems were involved because, as the sketchy records showed, a leaking cooling coil in the seal oil circuit had been found in one instance and the plant’s mechanical workforce members had carried out quick repairs. On that occasion, no effort had been made to find the underlying failure cause. However, unless the root cause is known and addressed, a repeat failure is likely. It stands to reason that equipment owners would generally be well advised to establish the root causes of a failure and to implement long-term corrective action whenever possible.
Mechanical and assembly flaws—(1.d). In this instance the plant’s failure records also pointed to the possibility that, on at least one repair occasion, the HSLF’s impeller had not been fully inserted; consequently, the impeller hub probably did not make full contact with the shaft shoulder. The resulting impeller-diffuser contact damaged the impeller and may also have been responsible for a slight bend measured in the final output shaft.
Repair procedures deserve due consideration here. Prior to final assembly, a diligent maintenance technician will make it a practice to install a dial indicator. The indicator readings must confirm that shaft runout is within acceptable limits. There should also be verification that the impeller nut has been re-torqued to the prescribed value.
Since the axial length of the impeller hub will shrink as the previously heated impeller cools, re-torquing should only be attempted after the impeller has returned to near room temperature. At that time, another indicator reading taken at the vane tips should confirm that impeller runout is not excessive. Again, maintenance must be carried out in accordance with step-by-step procedures.
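The runout checks described above reduce to simple arithmetic: the total indicator reading (TIR) is the spread between the highest and lowest dial readings taken over one full revolution, and it is compared against the allowable limit. The sketch below assumes this common definition of TIR; the function names and the sample readings are hypothetical, and the actual limit must be taken from the manufacturer's instructions.

```python
def total_indicator_reading(readings):
    """Return the TIR: highest minus lowest dial-indicator reading
    taken at several positions over one full revolution."""
    if len(readings) < 2:
        raise ValueError("need at least two readings around the circumference")
    return max(readings) - min(readings)


def runout_acceptable(readings, limit):
    """True when the TIR does not exceed the allowable limit.

    `limit` must be in the same units as the readings and is assumed
    to come from the manufacturer's data for the machine in question.
    """
    return total_indicator_reading(readings) <= limit


if __name__ == "__main__":
    # Invented readings in micrometres at four positions around the shaft.
    shaft_readings = [0, 10, 25, 10]
    print("TIR:", total_indicator_reading(shaft_readings))
    print("within limit:", runout_acceptable(shaft_readings, limit=20))
```

Recording the individual readings, and not just a pass or fail result, lets the shaft check before assembly be compared with the vane-tip check made after the impeller has cooled and the nut has been re-torqued.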
Improper operation (1.g). Although there was no reason to attribute failure events to this cause category, the TFA wanted to highlight an important fact: Whenever machines have been designed and fabricated by reputable manufacturers, important data can usually be found in the manufacturer’s Operating and Maintenance Manual. The data existed in this instance, “but who has time to read an entire manual?” Therefore, the TFA recommended that pertinent manufacturer’s instructions be condensed and issued in single-sheet checklist format. Locations and items to check, dimensional listings, and tolerances often fit on a single sheet of paper, which, in turn, can be laminated in plastic. Single-sheet checklists contribute greatly to high rebuild quality and low failure rates at Best-in-Class companies (Ref. 2).
MECHANICAL SEAL UPGRADING
This owner-user company’s maintenance technicians had shop-tested certain mechanical seals and found them to be marginal, at best. So, the experienced TFA saw fit to question the appropriateness of using Teflon wedge secondaries in this service and explained that more pliable Viton O-rings would be preferred as secondary sealing elements.
In fact, as a duty-bound and results-oriented investigator, the TFA proceeded to re-verify his recollection by telephoning a competent mechanical seal expert. The expert confirmed that his company had frequently retrofitted this style and model of HSLF compressor with seal upgrade kits, some at very moderate cost. The user-purchaser was asked to consider an upgrade kit and to communicate with two other mechanical seal manufacturers. The TFA knew that HSLF compressors designed in the 1960s or 1970s will often benefit from recent advances in sealing technology. Such advances can involve alternative material compositions, seemingly minor configurational changes, improved flush plans, and advantageous seal water management systems (Ref. 3).
Because the failure analyst was unable to rule out the (slight) probability of casing deflection due to pipe stress, he recommended that the machine’s inlet and discharge flange bolts be removed during the next scheduled shutdown. Temporary dial indicators could be mounted and casing-movement monitored while the HSLF compressor pipes were being disconnected and reconnected.
During a debriefing meeting, two operators recalled hearing a “chirping noise.” It was thought that a gear tooth-related problem might have been causing the noise. A low-level noise originating in the speed-increaser gear could possibly be managed, and the progress of gear-tooth distress slowed down by using a synthetic oil blend made from polyalphaolefins and diester-based stocks.
In HSLF-compressor gears, modern non-foaming synthetic gear oils are superior to the automatic transmission fluids typically used decades ago. But regardless of circumstances and perceived sounds, it was again confirmed that compressor failure analysts or equipment troubleshooters will often have to deal with more than one deviation. Whenever such deviations combine, a distress or downtime event may urgently need to be addressed. That was the situation in this case study. The story serves as a fitting reminder for plant personnel in any industry sector: Use structured and repeatable approaches to failure analysis and take remedial action early.
1. Bloch, Heinz P., and F.K. Geitner, Machinery Failure Analysis and Troubleshooting, 4th Edition (2012), Butterworth-Heinemann, an imprint of Elsevier, Oxford, UK, and Waltham, MA, USA, ISBN 978-0-12-386045-3.
2. Bloch, Heinz P., and Allan R. Budris, Pump User’s Handbook: Life Extension, 4th Edition (2014), The Fairmont Press, Lilburn, GA 30047, ISBN 0-88273-720-8.
3. Perez, Robert X., and Heinz P. Bloch, Pump Wisdom: Problem Solving for Operators and Specialists, 2nd Edition (2022), De Gruyter, Berlin, Germany, ISBN 978-3-11-074934-2.
ABOUT THE AUTHOR
Heinz Bloch’s long professional career included assignments as Exxon Chemical’s Regional Machinery Specialist for the United States. A recognized subject-matter expert on plant equipment and failure avoidance, he is the author of numerous books and articles, and continues to present at technical conferences around the world. Bloch holds B.S. and M.S. degrees in Mechanical Engineering and is an ASME Life Fellow. These days, he’s based near Houston, TX.
Tags: reliability, availability, maintenance, RAM, FRETT, troubleshooting, failure analysis, failure avoidance