Reliability Engineers: The Holistic Physicians of Machine Care

Drew Troyer

As a reliability engineering teacher, I frequently draw analogies in my classes between machine care and care of the human body to make important points about managing reliability throughout the entire life cycle of the machine - design, manufacture and installation, commissioning, operation, maintenance, and disposal. I find the metaphors useful in helping people understand the most salient points of the topic. Simply stated, some aspects of health care address the symptoms of illness, while others focus upon the root causes that lead to illness in the first place. The same is true for machine care. Nearly everyone is familiar with and interested in his or her own health, which makes health care a natural analogy for describing the value proposition of reliability engineering and its basic principles. Everyone gets it, no matter their position in the organization.

However, I'm a reliability engineer, not a physician. So rather than rely upon my own limited knowledge of medicine and health care, I've elected to team up with my friend Dr. Katherine E. Anderson, a naturopathic physician at the Cancer Treatment Centers of America's Southwestern Regional Medical Center, which is located in Tulsa, Okla. Dr. Anderson specializes in natural-based medical treatments used integratively with traditional medical treatments such as surgery, chemotherapy, radiation therapy, etc., in the treatment of cancer. As a naturopath, she's an expert in managing the proactive aspects of health care that, if performed correctly, prevent the onset of illness and disease.

Philosophically, reliability engineers perform a similar role to naturopaths but in the domain of managing the health of electromechanical systems and associated manufacturing processes instead of people. Our job is to understand and control the root causes that lead to machine failure. For too long, we addressed the symptoms of machine failure. But the real money for a manufacturing plant is in transitioning from reactive care to proactive care, which requires a major culture change for most organizations.

We hope this article will help you to communicate the benefits of proactive care in a way that anyone can understand using health care analogies. Communication is the critical element to achieving and sustaining culture change in your organization. In this column, we'll address the key analogies relating health care to machine care along the asset life cycle. In future issues, I hope to team with Dr. Anderson to explore in more detail specific manufacturing reliability topics using human body analogies to facilitate understanding.

Figure 1. Stages Along the Continuum

The Philosophy of Proactive vs. Reactive Care
The American Heart Association reported that in 2005 a total of 445,687 people suffered a heart attack. Of these, 37 percent died. In reliability engineering speak, that's a catastrophic functional failure; death is a tough first symptom. The remaining 63 percent survived with damaged tickers - at least to some degree - irreparably shortening and reducing the quality of their lives. There is no elegant way in which to react to a heart attack and avoid some level of negative consequences. Emergency room physicians typically deal with heart attack victims, doing everything they can to save the patients' lives and minimize damage to the body.

As a countermeasure, physicians recommend that we get periodic checkups. Once we get above a certain age or when we possess risk factors, the physician will run an electrocardiogram (EKG) and other tests to detect the early signs of heart disease. If problems are detected prior to a heart attack, the physician has a range of options, including medication, angioplasty, bypass surgery, etc., to interactively deal with the problem before it reaches the catastrophic stage. General practitioners typically provide screening during regular checkups. Cardiology specialists typically provide care once signs of heart disease are detected.

Holistic physicians, on the other hand, encourage us to manage lifestyle in a way that controls the root causes of heart disease. An individual's DNA significantly influences the risk of disease, including heart disease. However, Harvard researchers reported that by proactively maintaining a healthy body weight and composition, getting a modest amount of daily exercise, eating a nutritious diet, getting sufficient sleep and rest, refraining from smoking, avoiding harmful drugs or excessive alcohol consumption, etc., we can avoid 60 percent of heart disease. Likewise, by monitoring cholesterol, triglycerides, blood pressure, body mass index and other metrics, we can determine the degree to which lifestyle management is keeping our risk factors in check. General practitioners, naturopathic physicians, diet and exercise specialists, and others help keep us on track with a healthy lifestyle.

In terms of health management, we decide how we wish to allocate our effort and resources to manage diseases that shorten and reduce the quality of our lives - heart disease in our example, but the same philosophy applies to cancer and other illnesses. We may choose to react to disease, interactively respond to it with early warning or proactively control its root causes.

Maintenance engineers and professionals typically serve in a role that is similar to the ER physician by responding to functional failures after the fact in an attempt to minimize damage and restore the manufacturing process to functionality. As is the case with the human body, a failure that is allowed to propagate to the functional level causes some degree of permanent harm. Predictive maintenance engineers and technologists, maintenance planners and other maintenance pros serve in an interactive role to monitor, diagnose problems and take pre-emptive steps to deal with them prior to the point at which the function is affected. They function much in the same way as the general practitioner and cardiologist who interactively detect and respond to early signs of heart disease, but prior to the onset of a heart attack. Reliability engineers are the holistic practitioners of the machine care world. We seek to understand failure and its root causes and employ proactive control over the forcing functions that lead to failure in the first place.

Reliability engineering in manufacturing industries has always been closely aligned with the maintenance function, a fact that has always baffled me. Reliability engineering, like healthcare management, should address the health of the system throughout its entire life cycle. In a way, the notion of reliability challenges the psychological self-perception of a typical maintenance pro. If you ask maintenance people to describe what they do in a word or two, they usually respond "fix things." In order to fix something, it must first get itself into a broken state. If reliability efforts are effective, this doesn't occur nearly as often, causing maintenance people to wonder about their value contribution and about the future of their jobs; these emotions are very real and very primal. When equipment is reliable and monitored effectively for operational anomalies, things become calm and downright civilized at the plant, eliminating a source of "psychological income" some maintainers feed upon when they get to be the white knight riding in to save the day following a failure.

The point is that maintenance has typically focused upon responding to the symptoms of failure, while reliability is about proactively managing the causes of failure. Maintenance is a vital aspect of managing reliability, but reliability is not a synonym for maintenance. It is much more holistic, encompassing engineering and management decisions across the asset's management life cycle. The social-psychological challenges of changing from a traditional maintenance culture to a more proactive reliability culture demand clear and effective communication. Health care analogies offer a great way to communicate these concepts to everyone.

As an individual, you own the majority of your healthcare choices. The same goes for organizations as it pertains to machine health. So, let's examine the various stages of a machine's life cycle - design, manufacturing and installation, commissioning, operations, maintenance and, finally, disposal - drawing analogies to managing the health of the human body. The table on Page 10 illustrates machine care objectives with associated health care analogies along the six stages of the asset's life cycle.

Design: DNA, the Genetic Code
Little influences an individual's chances to live a long and healthy life more than DNA. DNA, which is largely a function of heredity, predisposes the individual in terms of physical and intellectual capabilities, and largely determines the degree to which the person is susceptible to illness. You can mitigate the effects of it through lifestyle management, but you simply can't get around bad DNA. The same is true for machines and manufacturing systems. Presently, medical researchers and professionals are aggressively pursuing gene therapy and exploring ways to "design" illness out.

Design for reliability, maintainability, operability, flexibility and all the other "abilities" required in machines necessitates advanced planning and a focus on minimizing the overall life-cycle cost of machine ownership. Typically, organizations seek to achieve functional capability at the lowest possible up-front purchase price - often sacrificing significant sums in through-life cost to operate, maintain and ultimately dispose of the machine. In the area of design, we engineers have a significant advantage over our friends in the healthcare industry. Electro-mechanical systems, which obey the laws of physics and chemistry, are much easier to predictably manipulate through the design process than the biological systems that physicians look after. Moreover, there is no social and/or political stigma attached to manipulating the "DNA" of machines through "genetic engineering" (the design process), whereas that's not the case for physicians.

There is no logical reason why designing to minimize the life-cycle cost of ownership of a machine or manufacturing process should not be a top priority for the manufacturing firm. Regrettably, however, this is rarely the case. Medical professionals would rejoice at the opportunity to exert the same degree of control in the design process that we can as machinery engineers.
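
The trade-off between purchase price and through-life cost can be made concrete with a small calculation. The Python sketch below is only an illustration: the two machine options, every dollar figure, the 10-year service life and the `life_cycle_cost` helper are hypothetical assumptions chosen for the example, not data from this column or from any real plant.

```python
from dataclasses import dataclass


@dataclass
class MachineOption:
    """One purchasing option. All figures used below are hypothetical."""
    name: str
    purchase_price: float      # up-front cost
    annual_operating: float    # energy, consumables and labor per year
    annual_maintenance: float  # planned and unplanned maintenance per year
    disposal_cost: float       # cost incurred at end of life
    service_years: int


def life_cycle_cost(option: MachineOption, discount_rate: float = 0.0) -> float:
    """Purchase price plus discounted yearly costs plus discounted disposal cost."""
    total = option.purchase_price
    yearly = option.annual_operating + option.annual_maintenance
    for year in range(1, option.service_years + 1):
        total += yearly / (1 + discount_rate) ** year
    total += option.disposal_cost / (1 + discount_rate) ** option.service_years
    return total


# Hypothetical options: a low-bid machine and one designed for reliability.
low_bid = MachineOption("low bid", 100_000, 30_000, 25_000, 10_000, 10)
designed = MachineOption("designed for reliability", 150_000, 25_000, 10_000, 5_000, 10)

for option in (low_bid, designed):
    print(f"{option.name}: {life_cycle_cost(option):,.0f}")
```

With these assumed figures, the machine that costs 50 percent more to buy is the cheaper one to own over its service life. A nonzero `discount_rate` weights future costs less heavily and narrows the gap, so the comparison is worth running with your organization's own numbers.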

Manufacturing and Installation: Prenatal Care
Apart from DNA, arguably no factor influences the length and quality of human life more than prenatal care. Insufficient prenatal nutrition and immunization and/or exposure to harmful drugs, alcohol and other chemicals are linked to a host of birth defects and other undesirable effects on the baby - physical, mental and emotional. It is damage that often can't be undone with reactive treatments, typically affecting the baby for his or her entire life. Proactive management of prenatal care is absolutely essential and an absolutely controllable factor in the health of an individual. Prenatal care should begin prior to conception and is essential both to the health of the baby and the mother.

Once a machine or process is properly designed, it must be manufactured and installed in a way that manages risks. Substandard, faulty or shelf-degraded parts and components, and poor craftsmanship in the manufacturing and installation process, set the stage for a reliability nightmare - increasing the risk of early life failure, which in turn leads to a veritable maelstrom of trouble throughout the asset's life. Much like prenatal care, these are, fortunately, controllable factors, yet we frequently fail to prioritize them in managing the health of our machines and processes.

Commissioning: The First Year of Life
Beyond DNA and prenatal care, the first year in a baby's life is likely the most important in terms of lifelong health and happiness. Now separated from the protective environment of his or her mother's womb, the baby, who is completely vulnerable and dependent upon caretakers, becomes exposed to the risk of sickness, disease and injury. Proper nutrition, health care and parental nurturing set the stage for a happy, healthy life. Conversely, lack of proper nutrition, health care and nurturing during those early years sets the stage for a lifetime of trouble, both in terms of physical health and emotional happiness and stability. In extreme cases, emotional neglect can lead to the development of socially unacceptable behavioral patterns.

The first couple of years are likewise critical to the health of our machines and manufacturing processes. Failure to learn how to properly operate and maintain the asset in the first couple of years following installation causes avoidable damage that frequently sets the stage for a life of reliability problems, often resulting in an irrecoverable loss of useful life, all of which could be avoided with a little foresight, training and procedure building. Too often, in the interest of generating cash as fast as possible, we neglect the very controllable commissioning factors (like creating standard operating procedures and training operations personnel), often creating significant long-term damage in the process.

Operation: Lifestyle Management
Once past infancy, attention turns to proper nutrition and exercise, sufficient sleep and rest, and supplements where required for normal, healthy children and adults. Likewise, refraining from smoking, drug use and excessive alcohol consumption, and minimizing exposure to harmful chemicals, too much sun, etc., are lifestyle management factors. Failure to proactively manage lifestyle increases the risk of heart disease, cancer, adult onset diabetes and other diseases. Moreover, quality of life is compromised when we're not attentive to health. Lifestyle management goes beyond attention to our physical health. We must manage emotional stress and strive to maintain an appropriate work-life balance to remain healthy and well adjusted. Holistic health of the mind, body and spirit is essential to length and quality of life.

The same is true of machines and processes. Operational factors such as selling within the capabilities of our processes, the supply of sufficient quantities of high-quality raw materials, start-up procedures, changeover procedures, adjustment procedures, shutdown procedures, finished goods inventory, and downstream supply chain effectiveness all influence the health, reliability and profitability of our manufacturing processes and business.

For a manufacturing organization, the equivalent of work-life balance is the tricky balance between the functional "silos" within the organization. For example, if the sales organization oversells the capabilities of the plant or the supply chain, we risk customer dissatisfaction, which can significantly overshadow the business gains associated with securing the sale in the first place. Or, if upstream supply chain managers save money by awarding a contract to a supplier that offers a good price, but the savings come at the expense of delivery or material quality, the "savings" can be eclipsed by losses incurred elsewhere in the value stream. Just as humans must balance the health of the mind, body and spirit, the manufacturing organization must manage the balance between functional groups along the value stream of the organization. It's hard to achieve, but it's critical to success.

Think about the functional groups along the value stream - sales and marketing, upstream supply chain, plant operations, product/process/equipment engineering, plant maintenance, downstream supply chain, etc. Then, ask yourself this question: What is the primary objective of each functional group? In most instances, it will be revenue maximization or cost minimization, which raises the question, "Who's watching the bottom line?" Organizations don't exist to maximize revenue or minimize cost; they exist to maximize the creation of value. Are your operational silos working for or against you in this pursuit?

Maintenance: Proactive, Interactive and Reactive
When we think of a physician, we think of the healthcare professional that we visit when we're sick or injured. Attending to illness and injury is certainly an important element of health care, but there is so much more. Optimized health is not simply the absence of disease. Health care pros are essential to ensuring that we don't become sick in the first place. A physician is, in essence, an individual who's received extensive training to understand the body - how it works, what makes it not work and how to restore it when sick or injured. With this knowledge, health care pros help us get off to a good start in life with prenatal and early life care, show us how to manage our lifestyles so as to minimize the risk of physical or emotional illness, and monitor our progress through life by checking our cholesterol, blood pressure, weight, body fat percentage and other observable parameters known to influence lifespan and quality of life. These are the proactive services provided by naturopathic, osteopathic, chiropractic and allopathic physicians, psychologists and psychiatrists, exercise and nutrition specialists, spiritual advisors, etc. Likewise, when something goes wrong, specialists such as cardiologists, oncologists, neurologists, etc., work to restore health and minimize pain and suffering.

Much like physicians, maintenance engineers, technologists and craftspeople possess special knowledge about machines and manufacturing processes - how they run, how and why they fail, and how to restore them following failure. On the proactive side, we manage the health of our machines and systems through preventive measures such as precision alignment, balance, lubrication, fastener integrity, etc. We monitor the degree to which we've effectively controlled these root cause conditions using oil analysis, thermography, vibration analysis, etc., just as physicians monitor blood pressure and cholesterol for effectively the same purpose - to protect the health of the system. In this respect, the maintenance pro serves to ensure that all our efforts to manage design, manufacture and installation, commissioning and operations are effective. Likewise, when the system has failed or is malfunctioning, the pro's special knowledge enables him or her to restore it to health.

Disposal: Organ Donation
Most states offer individuals the chance to declare themselves an organ donor by making a checkmark on their driver's license application. By doing so, you are making a conscious decision to philanthropically donate your healthy organs to society in the event of your death. Doing so can profoundly affect the life of another.

While not as emotional as organ donation, we need a similar philosophy in the management of machine and process reliability. In some industries, the cultural norm has been to throw out all the capital equipment and replace it with new when a new product model is introduced. This is an unfortunate and wasteful practice that manufacturing firms can't afford to support. When designing a new system, it's essential that the existing plant "donate" as many of its parts and components to the new design as possible. Moreover, when designing the plant or line, we must anticipate which components or subsystems will be used in subsequent product models and, where applicable, design to provide maximum flexibility and sufficient electromechanical robustness.

I mentioned earlier that I've always been baffled as to why manufacturing process and machine reliability management is almost always assigned to the maintenance organization despite the fact that reliability management, much like human health management, is a life-cycle proposition. I've finally concluded that the reason is that once a problem reaches the maintenance department, irrespective of its origin, it has no place else to go.

For the reactive manufacturing firm, the maintenance department is analogous to the ER in a hospital. Irrespective of what led to the heart attack, once it hits the ER, it has no place else to go. The prevailing questions for would-be heart attack victims and manufacturing process reliability engineers and managers become: Are you taking appropriate measures to proactively manage health across all aspects of the asset life cycle? If not, what would it be worth to you as an individual and as a manufacturing organization to do so?
