
Root Cause Failure Analysis (RCFA) is often treated as a technical exercise, and for good reason. It depends on evidence, data, discipline and a structured process. When a failure occurs, the team must preserve the evidence, gather the right information and analyze it carefully enough to separate symptoms from causes.
Today’s RCFA landscape offers more data than ever before. Plants have access to condition monitoring tools, oil analysis reports, inspection records, maintenance histories, process data, alarms, trends and all the other evidence that can help explain what happened. The challenge is no longer whether the data exists. The challenge is whether the team can collect, understand and use it well.
That is where culture enters the picture.
Noria recently covered “Culture of Reliability” on the Gear Talk Podcast, and it is a topic we often consider during consulting work. Plant reliability teams are asked to work with large amounts of technical information and make decisions that affect uptime, safety, cost and long-term asset health. The weight on their shoulders raises an important question: when a team is analyzing a failure, how well does that team actually work together?
Industrial lubrication programs regularly demonstrate the value of RCFA. When evidence is gathered properly and reviewed by a capable team, the result can be more than a repaired machine. A good RCFA can reveal gaps in lubrication practices, maintenance strategy, training, inspection routines, operating procedures or equipment design. It can prevent the same problem from returning under a different name.
But even the best data can be completely wasted if a team’s culture is poor.
Don’t hear the word “culture” and immediately dismiss it. This is not about corporate buzzwords. It is about whether the people involved are truly bought in, not only to the organization’s goals, but to the team’s responsibility to find the truth. A mechanic, operator, lube technician, engineer and supervisor may all see the same failure from different angles. Each may be holding a piece of the puzzle. If the culture encourages open discussion, the pieces can come together. If the culture discourages disagreement, those pieces may never even make it to the table.
It’s been said that a rising tide may lift all boats, but in RCFA, the opposite is also true. One ignored observation, one dismissed concern or one rushed conclusion can drag the entire analysis back toward failure.
Strategies, templates and critical analysis playbooks all have their place. They should be reviewed and improved regularly. But something else matters just as much: the team’s willingness to use data honestly. Does the culture invite open discussion? Can the team handle the occasional uncomfortable truth that someone may be wrong? Can leadership hear disagreement without treating it as defiance?
Most plants have invested at least some effort into gathering data and generating reports. But what happens to that pile of evidence after it is collected is much less universal. The best tool for interpreting that evidence is still the human element. Experience, memory, observation and machine familiarity often provide the crucial context needed to make the data meaningful.
That is where the health of the machine meets the health of the team.
Industrial maintenance teams are often made up of hardworking, capable and resolute people. That is a strength, but it also means personalities can run strong and opinions can run stronger. When a failure investigation begins, the technical process matters, but so does the culture in which that process takes place.
In many cases, the operators and mechanics who work directly with the machinery every day want, more than anyone, for those machines to be humming along happily when they show up for their shifts. There is pride in it. There is also pressure. The fallout from disruption, downtime and difficult startup conditions can be intense. It helps to remember how much experience is already in the room.
The mechanic wants the machine operating correctly and the line running smoothly. Mechanics handle routine and non-routine maintenance under conditions that others may never fully see. They are pulled in multiple directions, asked to balance urgency against safety, and expected to prioritize work based on criticality, cost and production impact. With their hands in so much of what keeps production moving, what these people are capable of sometimes borders on sorcery.
The operator brings a different kind of knowledge. Operators deal with daily hiccups, subtle changes in machine behavior and the little performance issues that may not show up clearly in a report. They are often the first to say, “I just wish the thing would stop doing that.” That comment may sound casual, but it can carry valuable evidence. The more detailed the operator’s logbooks, the more useful those observations become. Operators may also be involved in critical process adjustments and understand how material changes, timing, tolerances or operating technique affect the machine. Some of them can practically whisper to the machine through the PLC and know exactly which burner, pump, valve or bearing will cause trouble by the end of the week.
Some plants also invest in dedicated lubrication staff. These people provide critical information through logged observations, lubrication routes, inspection findings, oil sampling and condition monitoring. Their ability to maintain lubrication hardware, collect representative samples and keep records organized can make a meaningful contribution to machine health. In an RCFA, oil analysis reports should be easy to find, properly labeled and reviewed in context. Sample handling, sample history and timely reporting matter. If the oil analysis data is going to be used as evidence, the team needs confidence that the sample was collected, handled and interpreted correctly.
Then there is the plant engineer. The engineer brings technical depth, problem-solving structure and the ability to connect field observations to design, process and system-level concerns. We need plant engineers to be broad enough to understand the whole process and specialized enough to sharpen the focus when the problem demands it. Much of that ability comes from formal study. Much of it also comes from hard-won experience. Communication, data integration and team-based problem solving are now core tools of the trade.
Leadership also plays a defining role. Supervisors, managers and directors are responsible for setting the tone, establishing priorities and keeping the plates spinning. Many come from years on the floor. Others bring formal training, management experience or a natural ability to organize people around a goal. All of them influence the culture and direction of plant performance. They also carry the pressure of knowing exactly what downtime costs. In many plants, those numbers are staggering, and they are always changing.
The events leading up to machine failure are often observable. Someone saw something, heard something, adjusted something, cleaned something, bypassed something, reported something or lived with something. That is why interviews with frontline personnel are among the most important parts of RCFA. They are not a replacement for physical evidence or data, but they often provide critical context that helps explain what the data is trying to say.
Sometimes the history of a machine has left the team discouraged. A hard-to-access or chronically problematic asset can accumulate unresolved issues simply because the work is difficult, hazardous or easy to postpone. The mechanic and operator often know this better than anyone. They want that disruption cleared as much as leadership does.
RCFA can shine in these moments, but only if it stays focused.
Do not let RCFA turn into a finger-pointing session. Keep the team focused on the goal: understanding what happened, why it happened and how to keep it from happening again. Create a forum where people can offer ideas, concerns and observations without feeling like they are stepping into a courtroom. Review criticality levels when developing preventive maintenance tasks. Revisit strategies when assets change. Explain why certain recommendations may or may not align with organizational goals, safety concerns, production requirements or budget constraints.
The people on the frontline often possess a wealth of knowledge about the machine’s character and are invested in its performance. It should come as no surprise, then, that I regularly see mechanics wince when asked about a critical issue. If I had a dollar for every time someone said they had already mentioned a legitimate concern but felt it was ignored, I could eat quite well for a long time.
Sometimes, simply keeping people informed helps maintain perspective. Remind the team of the goal. Explain again why the team is heading in its current direction. Never miss a chance to brag about someone who contributed well, and when possible, wrap up the discussion on a positive note. Teams need wins.
The same characteristics that make people valuable in a plant can make RCFA discussions intense. Strong opinions are not automatically a problem. In fact, disagreement is often a sign that people care enough to think carefully about the issue.
The key is encouraging disagreement that is respectful, evidence-based and useful.
A mechanic may disagree with an engineer. An operator may challenge a supervisor’s assumption. A lube technician may point to an oil analysis trend that complicates the preferred explanation. Those moments can feel uncomfortable, especially when downtime pressure is high. But if everyone is working toward the same truth, those challenges can strengthen the analysis.
Positions should be clearly stated and supported by evidence whenever possible, and all involved should regularly remind themselves that the goal is not to win the argument. The goal is to understand the failure.
Documentation matters here. The machine’s behavior, maintenance history, repair decisions, inspection results, oil analysis reports and operator observations should remain available throughout the asset’s operating life. Anyone should be able to pick up where another person left off. Even a rejected idea may be worth recording. It may not explain today’s problem, but it could become important in a future analysis.
All this being said, leadership still has to lead. Decisions must be made. Repairs must be completed. Production has to resume. No matter how good a contribution or concern may be, it can still be rejected. But it should be understood before it is dismissed.
The need to act, reach resolution and achieve targets never takes a back seat. Sometimes, though, it may need to scoot over a little so the team can hear something important.
A real RCFA killer is when people stop raising concerns. That usually does not happen all at once. It happens after people conclude that speaking up does not matter, that the decision has already been made or that disagreement will only create trouble. Once that happens, observed evidence stops being reported.
Organizations may need to pay attention not only to the RCFA method, but to the way RCFA discussions are moderated. A good forum is structured enough to stay focused and open enough to let useful information surface. Company leadership sets the tone, but everyone eventually ends up at the table making a case.
If the goal is to arrive at the truth, ask yourself: is everyone hearing the case clearly and honestly?
This all hit me recently during an onsite lubrication program design project. I was starting my morning inside an engineering office where mechanics and specialized technicians often gathered before making their rounds. It is one of the great joys of my job to be amid the old familiar buzz in those rooms: trade wisdom, sharp observations and a few precariously executed jokes.
One morning, a mechanic we will call Bobby walked in smirking, ready to share a quick synopsis of the RCFA meeting he had just left. Following a salvo of colorfully curated adjectives, Bobby described the situation to his captive audience.
The plant was taking a time-sensitive feedstock and sending it through a set of rollers and a cutter. The process involved material weight, roller pressure and a list of clearance criteria along the drive system. Product specifications were managed at the rollers, but the process continued to fall out of spec without a clear remedy.
Several days earlier, Bobby had been called away from home and told to report to the plant immediately to resolve a critical failure suspected in the roller assembly. Other mechanics and leads had been unable to resolve the issue, so Bobby arrived ready to get the repair handled.
He inspected the system and raised a concern about the material itself. Based on what he could observe, the material did not appear to be behaving as expected. He also pointed to recent machine performance issues and deferred time-based maintenance on a particular component.
He was advised to replace the entire feed roller assembly.
Bobby voiced concern that a full replacement could create unnecessary cost and obscure the troubleshooting process. Replacing the whole assembly might get the line moving, but it could also remove or mask evidence before the real cause was understood.
Still, Bobby yielded to the supervisor’s direction and completed the roller replacement. No material analysis was performed to benchmark the material’s characteristics. After the replacement, the operator inspected the material and said it was machinable despite being outside the expected process window. Bobby looked at the material again and was surprised by the operator’s confidence. They tried it, made minor adjustments and got production back online within specification.
The next day, Bobby attended the morning RCFA meeting. An engineer praised him for fixing the issue. Bobby pushed back. He believed the material condition was still in question, and he noted that the roller-related measurements still did not match specification.
After that meeting, Bobby came into the engineering room ready to tell the story. While he was talking, new batches of material were heading into the same conveyor and falling back out of spec.
Now Bobby had at least two major variables still unresolved: the roller measurements from a newly replaced assembly and material inconsistency across multiple batches, both inside and outside the expected process window. All of this was happening under time pressure while other troubleshooting procedures were still being worked through.
That is a short story of shop life, but it is also a representative sample of what happens in plants all the time. My head was swimming with questions, as some readers’ heads may be now. Others already know exactly which way the wind is blowing.
There is a lot to unpack in a story like that. Was the roller assembly actually the problem? Was the material the problem? Were the specifications right? Were the measurements reliable? Did the repair solve the issue, or did it simply restart production long enough for the same condition to reappear? Who was right? Who was wrong? At some point, someone had to make the call.
This is where RCFA culture matters.
A plant can praise the repair and still miss the root cause. A team can get production running and still leave the failure mechanism intact. A mechanic can be right about an unresolved variable and still be overruled for practical reasons. None of that is unusual. The question is whether the concern gets captured, tested and carried forward, or whether it disappears because the line started running again.
The machine is indifferent to synergized global alignment strategies. If ignored long enough, it will happily shut down production at 1:15 a.m. on a Saturday and let everyone walk in bleary-eyed to discover that the last RCFA was left unfinished.
I have had the opportunity to work closely with many plant management teams over the years. The pressure to perform confidently and consistently can be heavy, but most of the people I meet truly want to do the work well. They respect the work. They want to do the right thing the best way they know how. It really is something to watch and participate in. The teamwork and relationships can be remarkable.
Serious people coming together to face serious challenges will always bring intensity. There will be disagreement. There will be frustration. There will be pressure from production, maintenance, engineering and leadership. But with the right team, culture and leadership, there is also an opportunity to make measurable improvement and come out better than before.
Teams need a win.
If I walked into your tech room or mechanic room this morning and asked whether the team feels RCFA meetings are productive and effective, what would they say? Would they say the team works well together, accomplishes goals efficiently and moves the plant in the right direction? Or would they say the meetings are where conclusions are defended, concerns are brushed aside and the same problems return under new work orders?
Bias and pride are often on the line when experts deliberate. That is human. But excellence and integrity should remain at the fore of a healthy culture. The importance of allowing time to discuss, challenge and debate is hard to overstate. We need to periodically examine how we receive the perspectives of people with different kinds of experience and hard-won expertise.
A shutdown, failure or chronic defect is never convenient. It brings cost, pressure and risk. But it is also a rare and data-rich opportunity. The team may be able to access evidence, behavior and observations that are not visible during normal operation. Whether the organization listens intentionally to the team will influence the outcome of the reliability program.
Everyone should understand the importance of maintaining KPIs, but not everyone is responsible for making the final operational decision. That is why the flow of information matters so much. Leaders need reliable evidence. Engineers need context. Mechanics need to know their observations matter. Operators need a way to report what they see without being treated as a nuisance. Lube technicians need their inspection and oil analysis data to be part of the conversation, not an attachment no one opens.
The human element is what brings machine mastery to life. The people on the frontline spend much of their waking lives connected directly to the equipment we are entrusted to optimize. Make it easier for them to contribute observations and concerns. Capture their knowledge. Test it. Document it. Use it.
Data is a powerful tool, but it is only as useful as the culture permits it to be.
In a world of condition monitoring, lab analysis, PM programs, engineering projects, deadlines, new initiatives, performance metrics and AI, can we still hear the people standing right next to the machine? We should be listening closely.