Years ago, I learned how to systematically and successfully tackle equipment-failure events. I later conveyed these learnings to others and was fortunate to see them apply the principles. A key to successful application is to see and learn how others do it. Perhaps we should start by subscribing to the belief that tangible and lasting reliability improvements are achievable only if management and wage earners view every maintenance or downtime occurrence as an opportunity to find the true root cause. In essence, each event triggers efforts to upgrade the equipment, rather than just fixing it in kind.
The next time equipment distress is reported in your operations, why not call together key individuals? The Clout, the Chairperson, the Data Source, the Client, the Recorder, and a Knowledgeable Worker should be present or, at least, represented. By the way, virtual meetings of this type can be just as effective as our meetings were in the late 1960s, when, believe it or not, we only had flip charts. Back then, I was the happy owner of two well-used, dependable Peugeot 504 sedans. Then as today, the Chairperson would designate a Recorder and ask that person to step up to the flip chart.
The first task is to write down a task headline. One that I recall from the late 1960s was “CREATE STANDARD CHECKLIST FOR AGITATOR SYSTEM MAJOR COMPONENT FAILURE,” the subheading of which was “Create Job Plan.” Every meeting participant contributed. (The Clout was watching, and we participants, of course, wanted to impress the Clout.) Up went the list, which looked much like the following:
1. Conduct failure assessment—what gave way, what seemed to be wrong.
2. Conduct failure analysis—failure mode observation, i.e., the how, why, sequence, and root cause (only 7 possibilities: Design; Assembly or Installation; Fabrication or Manufacturing; Material Defect; Maintenance Deficiency; Off-design or Unintended Service Conditions; Improper Operation).
3. Review production history—run time, product properties (viscosity, etc.).
4. Review equipment history—computerized event log.
5. Review/copy all applicable system and component prints.
6. Obtain all current manufacturer’s operating and maintenance manuals.
7. Identify and contact in-plant/outside experts, if necessary; include possibly having to call on manufacturer/contractor/other specialists.
8. Develop job time-line—identify production window.
9. Check parts availability, including “swap options,” and discrete components.
10. Establish and maintain production communications interface.
11. Consult with Safety Personnel and arrange for presence of safety coach or have him/her address all concerns (lockout, tagging, rigging, etc.).
12. Perform comprehensive check of other related components and/or systems to avoid compound/recurring/supplemental problems (lube system, filters, relief valves, coolers, controls, hydraulic systems, pumps, motors); include fluid analysis (water, dirt intrusion passageways and sources).
13. Inspect all critical components: gearboxes, seals, bearings, wiring insulation, relief valve settings, critical instruments, others.
14. Identify manpower availability/requirements.
15. Assign tasks and accountabilities.
16. Check all interlocks and production-control sequences.
17. Review all applicable design specifications.
18. Create startup checklist, including acceptance criteria.
19. Key-up necessary preventive maintenance routines not yet in the system.
20. Identify future similar job assignment responsibilities.
21. Conduct post-job debrief.
22. Create formal procedures and checklists for similar jobs immediately. Keep the momentum going. Capitalize immediately on lessons.
The entire approach described here is worth noting. Failure assessment and analysis replace the traditional reaction of scheduling a work crew with instructions to go for the customary quick fix. A deliberate thinking process is being pursued. Production and equipment histories are scrutinized for clues as to why a repeat failure might have occurred. The equipment manufacturer’s manuals are not disregarded. Before reinventing or guessing, the process approach is to determine if someone else has applicable expertise. Spare parts and safety considerations are being aired. Potential contributing influences are listed. Inspection assignments are made, chain of approval and accountability clarified.
In any well-managed facility, data collection is part of the job. A good equipment surveillance and analysis program is essentially a cost- and product-control stewardship system designed for operating machinery populations. Various forms of such systems are in use worldwide and have been responsible for dramatic savings in equipment maintenance costs. The computerized segment of this type of system is a tool designed to facilitate the collection and interpretation of data; identify the burdensome repeat failures; monitor progress; encourage proactive team participation and worklist compilation; track costs and savings along the way; simplify the reporting function; and provide a compact, centralized file system.
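As a rough illustration of the computerized segment described above, the following minimal Python sketch logs failure events, flags repeat failures, and totals repair cost per root-cause category. The seven categories come from step 2 of the checklist; everything else (the `FailureEvent` fields, the equipment tags such as `AG-101`, the dates, and the cost figures) is hypothetical and invented purely for the example. It is a sketch under those assumptions, not a description of any particular surveillance system.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date
from enum import Enum


class RootCause(Enum):
    """The seven root-cause categories named in step 2 of the checklist."""
    DESIGN = "Design"
    ASSEMBLY_OR_INSTALLATION = "Assembly or Installation"
    FABRICATION_OR_MANUFACTURING = "Fabrication or Manufacturing"
    MATERIAL_DEFECT = "Material Defect"
    MAINTENANCE_DEFICIENCY = "Maintenance Deficiency"
    OFF_DESIGN_SERVICE = "Off-design or Unintended Service Conditions"
    IMPROPER_OPERATION = "Improper Operation"


@dataclass(frozen=True)
class FailureEvent:
    # All field names are assumptions made for this sketch.
    equipment_id: str
    failed_on: date
    component: str
    root_cause: RootCause
    repair_cost: float


def repeat_failures(events, min_count=2):
    """Return {(equipment_id, component): count} for pairs seen at least min_count times."""
    counts = Counter((e.equipment_id, e.component) for e in events)
    return {key: n for key, n in counts.items() if n >= min_count}


def cost_by_root_cause(events):
    """Sum repair cost per root-cause category."""
    totals = defaultdict(float)
    for e in events:
        totals[e.root_cause] += e.repair_cost
    return dict(totals)


# Hypothetical event log; tags, dates, and costs are illustrative only.
EVENTS = [
    FailureEvent("AG-101", date(2023, 1, 10), "mechanical seal",
                 RootCause.ASSEMBLY_OR_INSTALLATION, 4200.0),
    FailureEvent("AG-101", date(2023, 6, 2), "mechanical seal",
                 RootCause.ASSEMBLY_OR_INSTALLATION, 3900.0),
    FailureEvent("AG-101", date(2023, 9, 15), "gearbox bearing",
                 RootCause.MAINTENANCE_DEFICIENCY, 7500.0),
    FailureEvent("P-202", date(2023, 3, 21), "coupling",
                 RootCause.DESIGN, 1100.0),
]

if __name__ == "__main__":
    for (tag, component), n in repeat_failures(EVENTS).items():
        print(f"Repeat failure: {tag} / {component} ({n} events)")
    for cause, total in cost_by_root_cause(EVENTS).items():
        print(f"{cause.value}: {total:.2f}")
```

Grouping on the equipment-and-component pair is only one plausible way to define a "repeat failure"; a real program might also weigh the time between events or the recorded failure mode.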
Benchmarking starts with knowing where you are and how your failures stack up against those experienced by the competition. This implies that you are serious, accurate, and consistent with your failure analysis, data logging, and data retrieval efforts. There has to be a structured, repeatable approach to all this; half-hearted or occasional efforts will not allow a company to be among the pace setters, or Best-in-Class (BiC) performers.
ABOUT THE AUTHOR
Heinz Bloch’s long professional career included assignments as Exxon Chemical’s Regional Machinery Specialist for the United States. A recognized subject-matter expert on plant equipment and failure avoidance, he is the author of numerous books and articles, and continues to present at technical conferences around the world. Bloch holds B.S. and M.S. degrees in Mechanical Engineering and is an ASME Life Fellow. These days, he’s based near Houston, TX.