Proven Methods for Determining the Cause of Machine Failures

Thomas L. Lantz

Machine failures are the bane of every maintenance department. Determining their exact cause can also be rather difficult. The equipment failures discussed in this article defied quick solutions and were quite costly in manpower, material and downtime. Various resolutions were proposed and tried with little success until thoughtful analysis was undertaken. The systematic approach that was employed as well as the results that were achieved in these three examples should provide a better idea of how to analyze similar types of equipment failures in your facility.

Possible Causes of Back-up Bearing Failures

Personnel Related

Rebuilding practices, skill-level training, motivation level

Systems or Operations Related


Incorrect screw-down loading, mill cocked, mill thrusting, excessive water level in oil, low oil pressure (with indicator not on), separator bar loose, water sprays impinging seals

Maintenance Related

Oil line disconnected, bearing chock dimensions out of specification, oil cleanliness, oil temperature, age of bearing, incorrect oil specifications, oil quality not to specification, inadequate pump flow, oil pressure switches not calibrated, improper oil flow to bearing (orifice error), rockers broken, debris in shim pack, cracked sled, incorrect bolt specifications (two-piece chocks), incorrect bolt torque on nuts (two-piece chocks)

Material Related

Oil viscosity too high or low, improper Babbitt quality, incorrect Babbitt thickness, worn housing liners, worn chock liners, worn locator pins (two-piece chocks), bearing too small for mill loads


Failed Back-up Bearings

In a steel rolling mill, small rolls approximately 2 feet in diameter perform the actual rolling of the steel to reduce its thickness. Since surface quality is so important, these rolls must be re-ground often; the small diameter makes them easier to handle. Due to their small diameter, these rolls must be “backed up” by larger 3-foot-diameter rolls to prevent bending and distortion of strip thickness. Normally, the smaller or “work” rolls are driven and have anti-friction bearings, while the back-up rolls (undriven) have Babbitted or plain bearings. The work rolls are lubricated by grease or oil mist. The back-up bearings receive oil from a circulating system. The back-up rolls usually stay in the mill for several weeks of rolling before requiring regrinding. Bearing failures on these rolls are rare. Figure 1 illustrates this arrangement.



Back-up roll bearing failures suddenly began increasing on the last six finishing stands for no apparent reason. Department management instantly blamed the lubricant in the oil circulating system. Although every load of new oil was checked by a laboratory for quality, samples were taken from the system and checked at an outside lab to be sure there was no contamination. Each checked out perfectly. The oil was purchased on specification, every load was sampled, and years of records were on file to prove consistent incoming quality. The oil flow to the bearings was also checked and determined to be on target.


At this point, the mill was losing a bearing per week. Previously, one lost bearing per year was normal. Costs were skyrocketing, and a mill shutdown was a real possibility. A meeting with supervisors and repairmen was held to consider all the ways a back-up bearing could fail. The compiled list is shown in the table above.


This list was meant to be all-inclusive, so some items might not apply to a particular mill. When trying to determine the cause of any bearing failure, you should concentrate on what has changed recently if the problem is of recent origin. Something has changed, and that possibly includes current practices.



Several items on the list were deemed very unlikely or had been recently checked, so they were not considered. The remaining items were divided up among all personnel. After all other items on the list checked out, the bearing chock dimensions were investigated. A bearing chock is a housing into which the bearing is fitted before being placed on the roll. The bore into which the bearing was inserted was 3 feet in diameter. The internal dimensions of the chock are critical. The difference between the vertical and horizontal measurements cannot exceed 0.05 inches or the bearing will not seat properly.
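The seating rule above reduces to a single comparison. The following is a minimal sketch, assuming both readings are taken in inches; the function name and the sample readings are hypothetical and only illustrate the 0.05-inch limit stated in the text.

```python
# Limit taken from the text: vertical and horizontal bore measurements
# may differ by no more than 0.05 inch.
MAX_BORE_DIFFERENCE_IN = 0.05


def chock_within_tolerance(vertical_in: float, horizontal_in: float,
                           limit_in: float = MAX_BORE_DIFFERENCE_IN) -> bool:
    """Return True if the vertical/horizontal bore difference is within the limit."""
    return abs(vertical_in - horizontal_in) <= limit_in


# Hypothetical readings for illustration only (not measurements from the mill).
good_chock = chock_within_tolerance(36.010, 36.030)  # difference of about 0.020 in
worn_chock = chock_within_tolerance(36.090, 36.010)  # difference of about 0.080 in
print(good_chock, worn_chock)
```

A check like this is only as good as the measurements behind it, so recording both readings for every chock at each rebuild would make wear visible before it passes the limit.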

The location of the bearing failures was random, so no pattern could be discerned. The losses occurred on both the drive side and the operator’s side of the mill on six finishing stands.


The bearing shop was fortunate to have a meticulous supervisor who recorded everything on a computer. Each bearing chock was numbered and a record kept of which stands each was in during any rolling schedule. The computer records proved that four chocks were involved in all the bearing failures. This was surprising, but the cause still had to be proven.
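The kind of cross-reference the supervisor's records made possible can be sketched in a few lines. This is an illustrative sketch only: the record layout, stand labels and chock numbers are invented for the example and are not the mill's data. The point is that failures scattered across stands and sides can still trace back to a handful of chocks.

```python
from collections import Counter


def rank_chocks_by_failures(failure_records):
    """Count bearing failures per chock ID, most frequent first."""
    counts = Counter(rec["chock_id"] for rec in failure_records)
    return counts.most_common()


# Hypothetical failure log: stand and side vary, but few chock IDs recur.
failure_records = [
    {"stand": "F2", "side": "drive", "chock_id": "C11"},
    {"stand": "F5", "side": "operator", "chock_id": "C04"},
    {"stand": "F3", "side": "operator", "chock_id": "C11"},
    {"stand": "F6", "side": "drive", "chock_id": "C19"},
    {"stand": "F1", "side": "drive", "chock_id": "C04"},
    {"stand": "F4", "side": "operator", "chock_id": "C23"},
]

ranking = rank_chocks_by_failures(failure_records)
suspect_ids = sorted(chock_id for chock_id, _ in ranking)
print(suspect_ids)
```

Grouping by stand or by side would show no pattern in a log like this one, while grouping by chock ID shows the concentration at once.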


The four chocks were set aside and unused for a time to see what would happen. All bearing failures ceased. Checking the internal dimensions of these four chocks showed wear well beyond tolerances for good bearing seating. These chocks were immediately sent out for rebuilding, and normal bearing life resumed.


The Kepner-Tregoe method was also helpful in the analysis of the failed bearings. Rather than listing all the possible causes of a problem, the Kepner-Tregoe method seeks to describe what the problem is or is not, where it occurs or does not, when it occurs or does not, and its extent. Basically, you are building a fence around the problem to keep good information inside and under consideration and bad information out. You are determining what has changed from the previous “problem-free” condition. The true cause will satisfy all the conditions unearthed by using this method. If one condition cannot be satisfied by the suspected cause, it must be discarded and another considered.

With this bearing problem, the worn chocks satisfied all the conditions of location, timing and capability of causing the problem. None of the other possible causes could do that. Also, setting the suspected chocks aside amounted to changing only one parameter at a time, which prevented confusion of the issue.
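The elimination step of the Kepner-Tregoe method can be sketched as a filter: a candidate cause survives only if it explains every condition. In the sketch below, the three condition names and the verdict for worn chocks come from the text; the other candidates' individual verdicts are assumptions made for illustration.

```python
# Conditions the true cause must satisfy, as named in the text.
CONDITIONS = ("location", "timing", "capability")

# True means the candidate can explain that condition.
# Only the "worn chocks" row reflects the article; the rest are assumed.
candidates = {
    "worn chocks": {"location": True, "timing": True, "capability": True},
    "oil quality": {"location": False, "timing": False, "capability": True},
    "oil flow to bearings": {"location": False, "timing": False, "capability": True},
}


def surviving_causes(candidate_table, conditions=CONDITIONS):
    """Keep only the causes that satisfy every condition."""
    return [
        cause
        for cause, fits in candidate_table.items()
        if all(fits.get(condition, False) for condition in conditions)
    ]


survivors = surviving_causes(candidates)
print(survivors)
```

A missing verdict is treated as a failure to explain, so a cause cannot survive merely because nobody evaluated it.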


Hydraulic Pump Failures

In this particular plant, most of the hydraulic systems used vane pumps. Losses were very high, and it didn’t take much analysis to determine that 80 percent of the pump gang’s time was being occupied changing pumps. It was also necessary to increase the crew size to keep up with the work. A fishbone diagram was prepared that listed all the possible causes of short pump life (see the table below).

Possible Causes of Excessive Pump Failures

Personnel Related

Rebuilding practices, skill-level training, motivation level, inadequate time (rush to complete)

Systems or Operations Related

Improper pump starting by operators, low oil level warnings ignored

Maintenance Related

Dirty oil (filters need to be changed), undersized suction lines, systems receiving dirt during maintenance, loose suction lines (air entering), suction lines are too long, inattention to overheating oil

Material Related

Hydraulic oil quality level, incorrect parts used, inferior parts purchased, incorrect pump for the job, incorrect system design for the job, incorrect fluid used



Because vane pumps are very sensitive to dirt, and steel mills are inherently dirty, it was suspected that the vane pumps might not be the correct type for this environment. An investigation determined that the pump gang was rebuilding failed pumps with parts from other failed pumps. The pump manufacturer advised strongly against this practice and insisted that only new, matched sets of vanes, rotors and wear plates be used when rebuilding units. This clashed with long-standing practice in the plant.


One proposed strategy was to change pump types. Gear pumps are less expensive than vane pumps and much more resistant to dirt. They also fail gradually, giving a warning by moving all cylinders more slowly. Vane pumps fail suddenly without warning.


It was also noticed that all systems in the plant were designed with the pump and motor sitting atop the tank. When the pump is started on this type of system, the mechanics are cautioned to “jog” the pumps, meaning to start and stop the motors at least three times before walking away to ensure the pump has picked up a prime. However, the operators often would start the pumps, and there was no guarantee that they would do it properly. If the pump does not pick up a prime, it runs starved of oil, and the air drawn into the suction line causes cavitation that destroys the pump.


While plant personnel were deciding what to do about these pump losses, anti-wear hydraulic fluids were just coming on the market. Oil companies claimed longer pump life would be possible with these fluids. This strategy was added to the list of possible actions.


Overheating oil was also a persistent problem on all the systems and generally started with a fault in the unloading system. Ensuring the coolers worked properly helped, but quick diagnosis and repair of the unloading system brought only minimal improvement in pump losses.


Possible Causes of Work Roll Bearing Failures

Personnel Related

Rebuilding practices, skill-level training, inadequate greasing, personnel changes

Systems or Operations Related

Duration of rolling schedule, water sprays impinging seals, location of the losses

Maintenance Related

Grease change, grease quality, worn bearing chocks, wear plates on mill, wear plates on bearing chocks

Material Related

Bearing manufacturer, spacer change, age of bearings




Since it appeared that the pump losses had multiple causes, the decision was made to correct the easiest possible cause first: the fluid. Switching to anti-wear fluids made a small improvement.


Next, better filters were installed on each system. Changing these filters on a monthly basis became the routine because the bypass indicators were not trusted. In addition, it was easier to schedule the changes on the same shift. This also resulted in some improvement.


The practice of “filter-fill” was then begun. Previously, in order to fill a system, millwrights would bring a drum of fluid to the site and insert an air pump into the large bung hole. They would pump oil directly from the drum into the system through an opening in the top of the tank. When the drum was empty, they often would place the air pump on the ground, take the empty drum away and return with a full drum.


To counter this, all openings in the top of the tank were plugged except for the breather and a spin-on-type filter attached to an opening in the tank. The mechanic had no choice but to connect the hose to the filter when filling the tank. By this method, all dirt on the pump or in the oil was stopped by the filter. This led to improved pump life.


Subsequently, the vane pumps were changed to gear pumps in order to prevent the pump gang from attempting repairs. When a gear pump fails, it cannot be repaired properly except in a specialty shop. The strategy was to keep low-skill personnel from trying to make repairs. Pump life improved markedly with this action.


Still not satisfied, the plant converted its hydraulic tanks to vertical tanks with the pumps mounted beside them. This gave the pumps a “positive head” and lessened the chances of them being starved for fluid. This move was the most productive of all. Pump life increased so much that the pump gang thought someone else was doing their work.


This problem provides an excellent example of what to do when there’s a multitude of possible causes and all are believed to be contributory. If none can be eliminated as a possible cause, then correcting the easiest one first is a good strategy. Plants frequently live with a problem that should have been corrected years ago. These “lived-with” problems take up a lot of maintenance time and become routine or part of the “how we do things around here” syndrome. Constantly be on the lookout for these problems and eliminate them, but only after proper analysis.
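The easiest-first strategy amounts to ordering the candidate corrections by effort and applying them one at a time. The sketch below uses the five corrections from this example; the effort scores are placeholders assumed to mirror the order in which the plant acted, not measured values.

```python
# (description, effort) pairs; effort 1 = easiest. Scores are assumed
# placeholders that follow the order of the plant's actions.
actions = [
    ("convert to vertical tanks with pumps alongside", 5),
    ("switch to anti-wear hydraulic fluid", 1),
    ("replace vane pumps with gear pumps", 4),
    ("install better filters, change monthly", 2),
    ("fill tanks only through a spin-on filter", 3),
]


def easiest_first(candidate_actions):
    """Sort (description, effort) pairs from least to most effort."""
    return sorted(candidate_actions, key=lambda item: item[1])


plan = easiest_first(actions)
for step, (description, effort) in enumerate(plan, start=1):
    print(f"{step}. {description} (effort {effort})")
```

Working through such a list one item at a time also keeps each change separable, so any improvement in pump life can be credited to the action that produced it.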


Work Roll Bearing Failures

In the mill described previously, the work roll bearings were the anti-friction type. A total of 40 bearings were in the mill at one time, and all were lubricated with grease. The grease’s performance characteristics were specified by the mill, and every load was tested when received. Typically, 15 to 20 bearings were lost each year, primarily on the faster finishing stands. The losses, which usually were attributed to age or misdirected water sprays and only occasionally to grease, were considered normal and difficult to reduce. Figure 2 illustrates the typical bearing.


When the work roll bearings suddenly began failing on the finishing stands at the rate of one per day, a fishbone diagram was prepared to list the possible ways the bearings could fail (see the table above).

Operations management immediately blamed the grease. Even though years of records were available to confirm the grease’s quality, samples were taken. All the results were perfect.


Each item on the list above was then checked or discarded due to recent verification. Only one item stood out as suspicious: the location of the losses. An investigation proved that all the losses had occurred on one particular stand and only on the operator’s side of the mill. This pointed directly to that stand housing as contributing to the failures. The wear plates on the inside of the stand housing, which the bearing chocks rubbed against, were checked and found to be badly worn. They were changed immediately. Failures soon returned to normal levels.
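One way to make a location pattern like this stand out is to look for the attributes that every failure has in common. The sketch below is illustrative only: the field names, stand label and record values are hypothetical, chosen so that stand and side are shared while the other attributes vary.

```python
def common_factors(records):
    """Return {field: value} for every field on which all records agree."""
    if not records:
        return {}
    first = records[0]
    return {
        field: value
        for field, value in first.items()
        if all(rec.get(field) == value for rec in records[1:])
    }


# Hypothetical failure records (not the mill's data).
failures = [
    {"stand": "F5", "side": "operator", "grease_lot": "A", "bearing_maker": "X"},
    {"stand": "F5", "side": "operator", "grease_lot": "B", "bearing_maker": "Y"},
    {"stand": "F5", "side": "operator", "grease_lot": "A", "bearing_maker": "Y"},
]

shared = common_factors(failures)
print(shared)
```

Attributes that vary across the failures, such as the grease lot here, become weaker suspects, while the shared ones show where to inspect first.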


These three examples illustrate the effectiveness of using a fishbone diagram to ensure that all possible causes of a failure are considered. The Kepner-Tregoe method can also help to establish what, where, when and the extent of the problem.


The first and last problems arose suddenly, and too many people panicked and jumped to conclusions. In all three problems, using a fishbone diagram forced personnel to withhold action until all possible causes were at least written down.
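A fishbone list can also be kept as a simple checklist, so nothing written down is forgotten before action is taken. The sketch below is one possible representation. The categories and causes are copied from the work roll bearing table; the dictionary layout and function name are assumptions.

```python
# Categories and causes copied from the work roll bearing table.
FISHBONE = {
    "Personnel": ["rebuilding practices", "skill-level training",
                  "inadequate greasing", "personnel changes"],
    "Systems or operations": ["duration of rolling schedule",
                              "water sprays impinging seals",
                              "location of the losses"],
    "Maintenance": ["grease change", "grease quality", "worn bearing chocks",
                    "wear plates on mill", "wear plates on bearing chocks"],
    "Material": ["bearing manufacturer", "spacer change", "age of bearings"],
}


def open_items(fishbone, cleared):
    """List (category, cause) pairs that have not yet been checked off."""
    return [
        (category, cause)
        for category, causes in fishbone.items()
        for cause in causes
        if cause not in cleared
    ]


# Grease quality was cleared by the sample results described in the text.
cleared = {"grease quality"}
remaining = open_items(FISHBONE, cleared)
print(f"{len(remaining)} causes still to check")
```

Each cause is added to `cleared` only after it has been checked or recently verified, which leaves a visible record of what remains open.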


The segments of the diagram (maintenance, personnel, systems and material) are only four of the possible areas to assess. In special situations, there may be others. Consider all the possibilities with a fishbone diagram and use the Kepner-Tregoe method to help narrow down the list. Finally, always remember to look for something that has changed recently. This will be the best approach to determine the cause of your next equipment failure.
