How Onsite Oil Analysis Contributes to Root Cause and Failure Analysis

Wes Cash, Noria Corporation; Dan Walsh, AMETEK Spectro Scientific
Tags: preventive maintenance, wear debris analysis, oil analysis

Despite our best efforts, machine failures still occur, bringing production to a halt and triggering a cascade of investigations. Root Cause Analysis (RCA) and the Failure Reporting, Analysis and Corrective Action System (FRACAS) remain essential tools for diagnosing these events. Increasingly, lubrication data has become central to understanding and preventing failures. Lubrication and contamination issues still account for the majority of premature bearing failures and removals from operation.

Common causes of bearing failures. Source: https://evolution.skf.com/bearing-damage-analysis-iso-15243

Lubrication as a Forensic Tool

Lubrication analysis has evolved far beyond a routine maintenance task; it’s now a powerful forensic tool for understanding why machines fail. By integrating multiple sources of information (visual observations, chemical composition, and particulate contamination), maintenance teams can build a more complete picture of machine health. This not only helps pinpoint the exact cause of a failure but also provides early indicators that allow issues to be addressed before downtime occurs.

Inspection Results: These remain fundamental to any reliability program. When paired with digital logging tools and AI-driven anomaly detection, routine observations can reveal trends and subtle deviations that might otherwise go unnoticed.

Onsite Oil Analysis: Compact systems such as the MiniLab and FieldLab are becoming increasingly common, providing ASTM-compliant results within minutes at the point of use. This capability empowers maintenance teams to act quickly on findings without waiting for offsite lab results. TruVu 360™ software complements these systems by organizing the data into a single, intelligent dashboard.

Grease Sampling: Like oil, grease holds valuable information. While not as common as oil sampling, grease sampling can show how well the grease is maintaining its properties and whether significant amounts of contamination and wear debris are present.

Filter Analysis: Often overlooked, the filter is a vault of information, especially as it relates to failures of lubricated equipment. Wear debris often grows in both particle size and concentration as a failure progresses. Analyzing the shape and metallurgy of the particles in the filter can indicate which specific components may be wearing, while also potentially shedding some light on the wear mechanism.

The Rise of Onsite Oil Analyzers

Oil analysis is one of the most insightful and effective methods for performing failure investigations and understanding machine health. For decades, this process meant drawing a sample, labeling it, and shipping it off to an external laboratory—often waiting several days or even weeks for results. By the time a report arrived, the equipment condition may have already changed, leaving maintenance teams reacting to problems instead of preventing them.

Lab-based testing continues to play an important role, especially for complex analyses or verifying critical findings. However, new technology now bridges the gap between offsite testing and day-to-day reliability needs. Onsite oil analysis systems bring many of the same high-precision capabilities directly to the plant floor, allowing for faster decisions, more frequent sampling, and better visibility into lubricant and machine health. This shift has turned oil analysis from a periodic activity into a continuous part of condition monitoring.

Two key developments are driving this evolution: compact, lab-quality onsite testing instruments and integrated data intelligence platforms that help teams make sense of the information they collect.

MiniLab: Bringing Lab-Quality Accuracy to the Plant Floor

The MiniLab from AMETEK Spectro Scientific has transformed how facilities monitor and respond to lubrication issues. It performs multiple critical tests in minutes and offers valuable data to assist in RCA. 

Key capabilities include:

Benefits of onsite testing:

TruVu 360™: Intelligence Behind the Insights

TruVu 360™ closes the gap between recommendations on the oil analysis report, maintenance actions taken, and findings for continuous improvement.

The growing amount of oil analysis data presents another challenge: interpretation. That’s where platforms like TruVu 360™ add value. Rather than serving as a simple data dashboard, TruVu 360™ functions as the analytical core of a lubrication program by placing data from onsite analysis into a single, coherent picture of asset health.

Core capabilities include:

By transforming data into meaningful insights, TruVu 360™ empowers reliability teams to move from reactive troubleshooting to informed, predictive decision-making, ultimately improving uptime, safety, and confidence in maintenance strategies.

Phases of Failure Analysis

These tools support whichever failure-analysis method makes the most sense for your facility or organization. To simplify, we recommend following five main phases:

1. Data Collection – This includes fact-finding, interviewing witnesses to the event, and determining whether other events occurred in sequence with the failure. During the data-collection phase, it is important to preserve as much evidence as possible. This includes documenting final running conditions, taking photographs of the equipment and components, and securing samples and data such as those described above. Diligence is key to protecting the integrity of the data gathered during this step. Maintenance staff can use TruVu 360™ to timestamp and correlate events.
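The idea of timestamping and correlating evidence can be illustrated with a minimal Python sketch. It is not TruVu 360™ functionality or its API; the `Evidence` record, the source labels, and the sample data are all hypothetical and exist only to show how evidence might be ordered and filtered around a failure time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Evidence:
    timestamp: datetime
    source: str  # hypothetical labels, e.g. "oil sample", "inspection", "photo"
    note: str


def build_timeline(items):
    """Return evidence sorted oldest-first so the sequence of events is easy to read."""
    return sorted(items, key=lambda e: e.timestamp)


def leading_up_to(items, failure_time, hours):
    """Evidence recorded in the `hours` before the failure, oldest-first."""
    start = failure_time - timedelta(hours=hours)
    return [e for e in build_timeline(items) if start <= e.timestamp <= failure_time]


# Hypothetical records for illustration only.
failure = datetime(2024, 3, 5, 14, 30)
records = [
    Evidence(datetime(2024, 3, 5, 14, 45), "photo", "Bearing housing after shutdown"),
    Evidence(datetime(2024, 3, 5, 9, 0), "oil sample", "Last sample before failure"),
    Evidence(datetime(2024, 3, 1, 8, 0), "inspection", "Routine route, nothing noted"),
]

for e in leading_up_to(records, failure, hours=24):
    print(e.timestamp.isoformat(), e.source, e.note)
```

Keeping each record immutable (`frozen=True`) reflects the point about preserving evidence: once logged, an entry is read and sorted but not altered.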

2. Assessment – During the assessment phase, analytical methods such as the five whys may be employed. The overall goal of this step is to analyze the data and determine whether it reveals the root cause of the failure. Root causes are often grouped into categories such as the following:

a. Equipment/Material Problems

b. Design Problems

c. Procedural Problems

d. Human Error

e. Training Deficiency

f. Management Problems

This is not an exhaustive list, and a single failure may have multiple causes that combined to make it catastrophic. For instance, a bearing may not have been lubricated properly because the scheduled PM interval was too long. Some technicians may simply chalk this up to a lubrication issue and not look at everything else that was occurring. Use oil analysis results to validate hypotheses about wear, contamination, or lubricant degradation.
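Because one failure can carry several root-cause categories, it can help to record all of them rather than a single label. The short Python sketch below shows one possible way to tally categories across failures. The category names are shortened from the list above, and the failure IDs and assignments are hypothetical.

```python
from collections import Counter

# Category names shortened from the list above.
CATEGORIES = {
    "Equipment/Material",
    "Design",
    "Procedural",
    "Human Error",
    "Training",
    "Management",
}


def tally_causes(failures):
    """Count how many failures involve each category.

    `failures` maps a failure ID to the list of categories assigned to it.
    """
    counts = Counter()
    for failure_id, causes in failures.items():
        unknown = set(causes) - CATEGORIES
        if unknown:
            raise ValueError(f"{failure_id}: unknown categories {sorted(unknown)}")
        counts.update(set(causes))  # count each category once per failure
    return counts


# Hypothetical assessments for illustration only.
failures = {
    "pump-bearing-01": ["Equipment/Material", "Procedural"],
    "gearbox-02": ["Procedural", "Training"],
}

for category, n in tally_causes(failures).most_common():
    print(category, n)
```

A tally like this makes it harder to stop at "lubrication issue" when the same procedural or training gap keeps appearing alongside it.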

3. Corrective Action – This is the remediation plan to fix the issue and stop it from occurring again. Often, this plan will involve several departments, such as maintenance, reliability, engineering, and operations. Depending on the complexity of the corrective action, a complete redesign or rebuild of the equipment, or of the environment that houses it, may be the most prudent option. These cases are rare but do occur. MiniLab data can support redesign decisions with quantitative evidence.

4. Inform – The actions to prevent recurrence must be reported to the parties responsible for implementing them. It is also good practice to share the information with the departments that affect the future operation of the asset. Sometimes, this may involve planners when a PM or BOM needs to be updated to reflect the changes stemming from this process. Use TruVu 360™ dashboards to communicate changes.

5. Follow-up – As with any process, a verification step is often employed to ensure that the corrective action plan was put into place. This may also include more detailed analysis going forward, such as increasing the rate of lubricant sampling, inspections, and testing of the equipment. Use trend data to confirm resolution.
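As a rough illustration of using trend data to confirm resolution, the Python sketch below compares a wear-metal trend before and after a corrective action. The iron readings are hypothetical, and the pass criterion (post-fix readings no longer rising and averaging lower than before) is an assumption for illustration, not an industry rule.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against sample index (units per sample)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def trend_resolved(before, after, tolerance=0.0):
    """True if post-fix readings are not rising and average lower than before.

    `tolerance` is the largest slope still treated as "not rising" (assumed default 0).
    """
    return slope(after) <= tolerance and mean(after) < mean(before)


# Hypothetical iron readings in ppm, equally spaced samples.
iron_before = [12, 18, 27, 41]
iron_after = [15, 14, 14, 13]

print("slope before:", slope(iron_before))
print("slope after:", slope(iron_after))
print("resolved:", trend_resolved(iron_before, iron_after))
```

The sketch assumes evenly spaced samples; with irregular sampling intervals, the slope would need to be computed against actual sample dates or operating hours.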

Common Lubrication Failures Identified via RCA

Lubrication-related problems often hide beneath the surface of a mechanical failure, only becoming evident once the data is analyzed. These issues may not always appear as the primary cause but frequently act as the underlying contributor that accelerates wear, heat, or contamination. By examining lubricant condition alongside mechanical evidence, RCA teams can uncover patterns that point directly to the root of the problem.

Each of these factors tells a part of the story. When evaluated together, they reveal how lubrication practices directly influence reliability—and how early detection can prevent repeat failures.

Lubrication as a Strategic Asset in RCA

Onsite tools like the MiniLab have made it possible to turn every oil sample into a valuable source of diagnostic information. By measuring key properties such as viscosity, elemental composition, and particle contamination, MiniLab systems provide immediate insight into lubricant condition and the mechanical health of the equipment it protects. These results help reliability teams confirm the root cause of a failure, verify corrective actions, and identify early warning signs long before performance is affected.
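To make the idea of acting on a measured property concrete, here is a minimal Python sketch that flags a viscosity reading by its percent change from the new-oil baseline. It is not MiniLab software. The function name is hypothetical, and the default caution and critical limits (10% and 20%) are placeholder assumptions; actual limits should come from the equipment manufacturer, the lubricant supplier, or the site's own program.

```python
def viscosity_status(measured_cst, baseline_cst, caution_pct=10.0, critical_pct=20.0):
    """Return (percent change from baseline, status) for a viscosity reading.

    Both an increase and a decrease count against the limits, since either
    direction can indicate a problem. The default limits are placeholders.
    """
    if baseline_cst <= 0:
        raise ValueError("baseline viscosity must be positive")
    change = (measured_cst - baseline_cst) / baseline_cst * 100.0
    magnitude = abs(change)
    if magnitude >= critical_pct:
        status = "critical"
    elif magnitude >= caution_pct:
        status = "caution"
    else:
        status = "normal"
    return change, status


# Hypothetical readings against a 46 cSt baseline.
for reading in (46.5, 51.0, 36.0):
    change, status = viscosity_status(reading, baseline_cst=46.0)
    print(f"{reading} cSt: {change:+.1f}% -> {status}")
```

The same pattern of baseline, percent change, and tiered limits could be applied to other trended properties, with limits chosen for each one.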

When paired with TruVu 360™ software, the data from the MiniLab becomes part of a broader reliability strategy. Maintenance teams can log failure events, connect lubricant data to asset history, automate report generation, and use historical trends to suggest potential corrective actions. When used consistently, this reduces downtime, prevents repeat failures, and extends the life of critical assets.

Detect problems sooner and make decisions faster. Learn more about the MiniLab oil analyzer from AMETEK Spectro Scientific.