In surveys and studies, as well as actual root-cause analyses (RCA) and troubleshooting activities in plants and facilities, a trend seems to have emerged: focusing on a point of failure or loss and not looking deeper into the system-related issues. Unfortunately, it’s very human to identify and correct a problem, even though it may just be a symptom of a larger problem.
Consider this hypothetical example: A bearing fails shortly after it has been put into operation. It’s replaced and fails again a short time later. After a while, an RCA is performed. The disassembled bearing shows fluting on the inner and outer races. Having read plenty of articles and recommendations, plant personnel install a shaft brush on the drive-end of the motor and an insulated bearing on the opposite drive-end. Problem solved.
The realization that people are “programmed” to take that type of approach becomes apparent when there’s a slight deviation. Consider this twist on the above example: A tech-support call concludes with the verdict that the shaft brush can’t be installed because personnel can’t access the drive-end of the motor and that there is no available insulated bearing for the opposite drive-end. In real life, there would be no reason why the insulated bearing couldn’t be installed on the drive-end of the motor and the shaft brush on the opposite drive-end. Somewhere along the way, you may begin to realize that you’ve focused entirely on the “fuse,” or symptom, and restricted your view to just the motor and, maybe, the related equipment. You’ve not delved further into the system.
The result of that common approach is programmed tunnel vision. We are programmed to attack the problem and get systems back up and running by identifying the point of failure and making that point stronger. This is the same approach as immediately replacing a blown fuse in an electrical circuit with a larger fuse. Sure, the fuse will not open the next time, but the root defect will eventually show itself, sometimes in dramatic fashion.
Awareness of the problem with that approach is growing. Increasing numbers of industry experts are advising plant personnel to understand the “operating context” of the components they are investigating. This “situational awareness” of what could be driving a defect can be enlightening and lead to a broader approach to the problem. But it might not expose the root cause. That could lead to other issues popping up, or to the identification of nuisances that weren’t previously noted.
What would happen if we took a step back from the bearing failure and fluting issue and remembered the following: that we were working at a site with a high-resistance-grounding (HRG) system; the motor was on a soft start; nearby motors were having similar issues; and someone had recently retrofitted a large variable frequency drive (VFD) into the system? How would we change our approach to the problem?
Then, what would happen if, through our next-step systems investigation, we discovered the following: that fuses had blown in the HRG system and not been replaced, and there was high current in the grounding system? Digging further, we might also learn that the transformer feeding several systems, including the new VFD, was not rated for the harmonic environment, and that someone had chosen not to include the harmonic filters in the VFD purchase.
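To put a number on what “not rated for the harmonic environment” can mean, here is a minimal sketch in Python of the commonly used transformer K-factor calculation: the sum of each harmonic current squared times its harmonic order squared, divided by the sum of the harmonic currents squared. The function names and the current spectrum are hypothetical, chosen only for illustration. They are not measurements from the site described above, and a real assessment would use measured data and the transformer manufacturer’s ratings.

```python
from math import sqrt


def k_factor(harmonic_currents):
    """Return the K-factor of a load-current spectrum.

    harmonic_currents maps harmonic order h (1 = fundamental) to the RMS
    current at that order, in any consistent unit.
    """
    total_sq = sum(i ** 2 for i in harmonic_currents.values())
    if total_sq == 0:
        raise ValueError("spectrum contains no current")
    weighted_sq = sum((h ** 2) * (i ** 2) for h, i in harmonic_currents.items())
    return weighted_sq / total_sq


def current_thd(harmonic_currents):
    """Return current total harmonic distortion relative to the fundamental."""
    fundamental = harmonic_currents[1]
    distortion_sq = sum(i ** 2 for h, i in harmonic_currents.items() if h != 1)
    return sqrt(distortion_sq) / fundamental


# Hypothetical spectra (amps RMS per harmonic order), for illustration only.
linear_load = {1: 100.0}
vfd_load = {1: 100.0, 5: 35.0, 7: 20.0, 11: 9.0, 13: 6.0}

if __name__ == "__main__":
    for name, spectrum in (("linear", linear_load), ("VFD", vfd_load)):
        print(f"{name}: K = {k_factor(spectrum):.2f}, "
              f"THD = {current_thd(spectrum):.1%}")
```

With this made-up spectrum, the drive load works out to a K-factor of about 6.5, against 1.0 for a purely linear load. A transformer sized on load current alone would see extra heating that nothing at the failed bearing would reveal.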
As we look away from the bearing fluting to the larger picture, i.e., the systems analysis, we begin to see a series of events and conditions that could lead to a machine component failing. With expanded awareness, we might notice that PLCs aren’t functioning correctly, power supplies are failing, and other conditions (some seemingly unrelated) are occurring, all of which could be symptoms of the true root cause. Sometimes, it takes a fluke incident, such as a technician collecting a random current reading on the ground system and being surprised that more than 16 amps are present, to realize something larger is happening.
Blinders aren’t limited to an individual or a local team: They also can affect entire industries. As human beings, we’re extremely adaptable. What, at first, only appears normal will, eventually, become normal as the problem is accepted and absorbed. For example, have you scheduled that bearing replacement for the upcoming quarter on that motor that has a bearing failure every quarter? If so, you have adapted to an unusual problem. There may be many such issues. Consider this one from the alternative-energy industry: wind-turbine-transformer failures.
Wind-turbine transformers outgas and fail at such a high rate that international standards have been written to address the problem. Among them is IEC/IEEE 60076-16, “Power Transformers – Part 16: Transformers for Wind Turbine Applications,” which identifies unusual per-tower issues associated with turbines. Working groups within the industry are addressing this overarching problem. In the meantime, other working groups are addressing issues related to fracturing bearings and other conditions within the gearbox. Moreover, still others are addressing specific concerns within the generator itself. In effect, silos, including national laboratories, international organizations, and technical groups, each with specific expertise in mechanical, electrical, or component conditions, are all trying to solve what is, essentially, the same problem.
What would happen if the industry took a step back to review the conditions surrounding the failure modes of each component? Or identified a systems approach to look at those conditions? What might be noticed? As mentioned in my May 29, 2021, article, “Reliability & Maintenance With A Neutral Harmonic Filter,” literature research can be quite valuable.
Searching the literature for information on wind-turbine-transformer failures, I learned that several such events, related to known problems in large utility generators, had occurred between 2006 and 2010. Those failures led to the publication of academic papers, which turned into academic studies utilizing MATLAB and Simulink software to simulate the system events and make predictions. One key takeaway from the findings was that a specific series of events can cause a recurrence of the transformer-failure issue, which becomes greater as more utility-scale generation, transmission, and distribution are brought online. With this in mind, I embarked on my own investigation involving several stakeholders.
Next week’s article (Part 9) will focus on that specific wind-turbine-system issue and how we approached it using Electrical Signature Analysis (ESA) and other technologies and information. Moving beyond the academic perspective, I’ll review practical and real data as it relates to what was seen in simulations versus field experience. Among other things, the article will provide an overview of the observations from this in-depth investigation and discuss how the various issues are theoretically related. As with most other situations, once a root cause is identified, real solutions can be developed.
ABOUT THE AUTHOR
Howard Penrose, Ph.D., CMRP, is Founder and President of Motor Doc LLC, Lombard, IL, and, among other things, a Past Chair of the Society for Maintenance and Reliability Professionals, Atlanta (smrp.org). Email him at email@example.com, or firstname.lastname@example.org, and/or visit motordoc.com.
Tags: reliability, availability, maintenance, RAM, wind turbines, wind energy, power transformers, IEC/IEEE 60076-16, Electrical-Signature Analysis, ESA