Part I, The Possibilities Of Probability: In Reliability And Predictive Maintenance

The term predictive maintenance (PdM) has different meanings depending on your background and industry. Within the plant-management world, it means that if your maintenance personnel tell you that something will survive 100 hours, it will operate 100 hours, then fail one minute later. As for engineers and technical trades, when asked how long something will last, they’ll balk: They can’t state an exact number, so will usually answer, “I don’t know.” Because of the concept of absolutes (which is the exact opposite of predictive maintenance), opportunities around PdM are virtually non-existent in areas of reliability and maintenance operations.

So, we now look to concepts such as Artificial/Augmented Intelligence (AI), Machine Learning (ML), the Internet of Things (IoT), and other digital-transformation technologies as our saviors. Yet our past experience has often shown reliability to be a concept that falls a long way short of the target or our expectations.

DEEP IN THE REAL WORLD
The root of reliability engineering is industrial engineering, which is the science of logistics, processes, and statistics. In other branches of engineering and trades, we discuss the concepts of precision with physical measurements such as micrometers and tolerances in mechanical engineering and machining, as well as microamps and tolerances in electrical engineering and electrical trades. This is easy for most managers to understand when dealing with operations and quality assurance, which are often confused with RAM (reliability, availability, and maintenance) endeavors.

On the management side, reliability is viewed through the lens of machine operation: A machine produces something repeatably within a product-design tolerance, and it must operate for a specific time before functional failure or degradation causes that tolerance to be lost. The result is frustration that the technical reliability and maintenance side cannot provide a specific answer on how long equipment will last. In the end, most predictive technologies are not used to track progressing degradation. Instead, those technologies are used as prognostic tools on each set of data. Consequently, people may be given the title “reliability engineer,” but they’ve really been set up as planners and schedulers or prognostic-technology technicians.

Those in the executive suite at a company typically understand those in the safety department. Why? Both groups operate within the realm of statistics, probability, and risk. Their languages are not that different from each other, so when the safety manager says that a process or method is “high risk,” a corporate executive may not ask for the probability. It’s assumed that they are working in the same realm: that the corporate executive would be making business decisions based on level of risk through probabilities and statistics; and that the safety engineer would be making safety recommendations based on level of risk through probabilities and statistics. Unfortunately, when we, as industrial or reliability engineers, discuss the subset of PdM testing with those in other engineering disciplines and operations, there can be a disconnect.

(Note: One area of concentration that an industrial engineer can pursue is safety engineering. In fact, when I was teaching industrial engineering at the University of Illinois at Chicago (UIC), OSHA and other safety-related jobs made up a significant percentage of the offers my students received.)

A common (and continuing) goal among those of us in the reliability and maintenance arena has been to gain access to the boardroom. When we are finally provided this opportunity, though, a big problem is that we present things as if we were talking to the operations manager: in terms of absolutes. This generates a type of barrier that has long plagued the physical-asset-management industry.

On the product-engineering side of reliability, such as development of hybrid vehicles and their components, the discussions are in terms of risk and probabilities. We modify the risk of failure by knowing the failure rate within a specific measurement, such as miles, operating hours, or time, and the precision of engineering and materials. To manage costs, we balance the two by, for example, setting a B15 at 20,000 hours of run-time, which means that 15% of the components of interest are expected to fail during that period. This is much like the L10 life of a bearing, which is the number of revolutions by which 10% of a population of bearings will have failed. Such concepts exist to determine trade-offs and risk through product development, reputation, and aftermarket support.
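As a rough illustration of how a B-life ties a percentage of failures to a point in time, here is a minimal Python sketch. It assumes a two-parameter Weibull life distribution, which the discussion above does not specify; the shape value BETA and the function names are hypothetical and chosen only for the example. Only the B15 of 20,000 hours comes from the text.

```python
import math

def weibull_unreliability(t, beta, eta):
    """Fraction of a population expected to have failed by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def b_life(p, beta, eta):
    """Time by which a fraction p of the population is expected to have failed."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)

def eta_from_b_life(t_p, p, beta):
    """Weibull scale parameter implied by a stated B-life (for example, B15)."""
    return t_p / (-math.log(1.0 - p)) ** (1.0 / beta)

# Assumption for illustration only: a wear-out shape parameter of 2.0.
BETA = 2.0

# From the text: B15 is set at 20,000 hours of run-time.
eta = eta_from_b_life(20_000.0, 0.15, BETA)

# Under the same assumed distribution, the B10 life falls earlier than the B15 life.
b10 = b_life(0.10, BETA, eta)

print(f"Implied scale parameter: {eta:,.0f} h")
print(f"B10 life under the same assumptions: {b10:,.0f} h")
print(f"Fraction failed at 20,000 h: {weibull_unreliability(20_000.0, BETA, eta):.0%}")
```

The point of the sketch is that a B-life is not a promise about any single component. It is a statement about what fraction of a population is expected to have failed by a given time, and the answer shifts with the assumed distribution.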

What’s often missing in physical-asset management is the understanding that we are discussing the probability of survival over a specific period of time. We understand this from the philosophical side of reliability, based on available tools and over-use of the P-F curve. (What is the P-F curve, after all, other than the inverse natural-log curve of the chance of survival following the point of degradation, used to determine the frequency of testing to catch the defect?) We use the P-F curve to identify the interval from the point of detection (P) to the point of defined failure (F) and the probability that the asset will continue to degrade without an abrupt F. To detect a fault, we then set the test interval to no more than half the time between the two points. This curve was originally created as 1 - e^(-λt), which is the complement of the reliability function e^(-λt), or the probability of failure by a given time t knowing the failure rate (λ, or 1/MTBF).
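The failure-probability function and the half-interval rule described above can be sketched in a few lines of Python. This is only an illustration under a constant-failure-rate assumption; the MTBF and P-F interval values, along with the function names, are hypothetical and not taken from any real asset.

```python
import math

def failure_probability(t_hours, mtbf_hours):
    """F(t) = 1 - exp(-lambda * t), with failure rate lambda = 1 / MTBF."""
    failure_rate = 1.0 / mtbf_hours
    return 1.0 - math.exp(-failure_rate * t_hours)

def survival_probability(t_hours, mtbf_hours):
    """R(t) = exp(-lambda * t), the complement of F(t)."""
    return 1.0 - failure_probability(t_hours, mtbf_hours)

def inspection_interval(pf_interval_hours, checks_per_interval=2):
    """Test interval that yields at least two checks between P and F."""
    if checks_per_interval < 2:
        raise ValueError("use at least two checks within the P-F interval")
    return pf_interval_hours / checks_per_interval

# Hypothetical values, for illustration only.
MTBF_HOURS = 10_000.0
PF_INTERVAL_HOURS = 720.0

for t in (1_000.0, 5_000.0, 10_000.0):
    print(f"t = {t:>8,.0f} h: P(failure) = {failure_probability(t, MTBF_HOURS):.1%}")

print(f"Test at least every {inspection_interval(PF_INTERVAL_HOURS):,.0f} h")
```

One consequence worth noting: with a constant failure rate, the probability of failure by t = MTBF is 1 - 1/e, or about 63%. An MTBF is therefore not a time that the typical asset is expected to reach, which is exactly the kind of absolute reading that a probability view corrects.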

In 2009, I used the term “Time to Failure Estimation” (TTFE) for the first time within the Institute of Electrical and Electronics Engineers (IEEE) in relation to electric motors and insulating materials. It was due to the backlash from other engineers when I presented “Predictive Maintenance of Insulation Systems for Electric Machines.” The term predictive maintenance, in turn, was defined by the reviewers and audience as an absolute time to failure following the detection of degradation, which was subsequently argued and rejected.

The change to TTFE not only generated acceptance, but also compelled researchers to take a new look at life expectancies of dielectric materials from a probability standpoint versus absolutes and how life-statistics were used. The concept is an ideal example of the very thing a predictive-maintenance technician or reliability engineer should be doing: presenting, once a fault has been detected, the remaining life of an asset from a probability and statistical point of view.
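One way to express remaining life as a probability rather than an absolute is the conditional chance of surviving a further period, given that the asset has already survived to its current age. The Python sketch below assumes a two-parameter Weibull life model with hypothetical parameters. In practice those parameters would have to come from data on the specific asset and fault, and nothing here represents the TTFE method itself.

```python
import math

def weibull_reliability(t, beta, eta):
    """Probability of surviving to time t under a two-parameter Weibull model."""
    return math.exp(-((t / eta) ** beta))

def conditional_survival(age, horizon, beta, eta):
    """P(survive to age + horizon | already survived to age)."""
    return weibull_reliability(age + horizon, beta, eta) / weibull_reliability(age, beta, eta)

# Hypothetical parameters and current age, for illustration only.
BETA = 2.0            # assumed shape (wear-out behavior)
ETA = 10_000.0        # assumed characteristic life, hours
CURRENT_AGE = 5_000.0 # assumed hours in service at the time of assessment

for horizon in (500.0, 1_000.0, 2_000.0):
    p = conditional_survival(CURRENT_AGE, horizon, BETA, ETA)
    print(f"Next {horizon:>5,.0f} h: {p:.1%} chance of survival")
```

Reporting a result in this form (a chance of surviving the next N hours) gives the decision maker a level of risk to weigh, which is the same language the safety department already uses with the executive suite.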

COMING UP
Over the next several articles, we will discuss the concepts of probability and statistical analysis from a reliability and predictive standpoint when performing traditional techniques with electric motors. We will also explore how these concepts are applied within AI and ML to project TTFE and remaining useful life.

ABOUT THE AUTHOR
Howard Penrose, Ph.D., CMRP, is Founder and President of Motor Doc LLC, Lombard, IL, and, among other things, a Past Chair of the Society for Maintenance and Reliability Professionals, Atlanta (smrp.org). Email him at howard@motordoc.com or info@motordoc.com, and/or visit motordoc.com.

Tags: reliability, availability, maintenance, RAM, electrical systems, predictive maintenance, Artificial/Augmented Intelligence, AI, Machine Learning (ML), Internet of Things, IoT, P-F Curve
