Designing a World-Class Oil Analysis Program

Noria Corporation

Anyone who has read Practicing Oil Analysis magazine on a regular basis over the past five years should be well versed in the impact oil analysis can have in helping improve equipment reliability and maintaining production uptime. From providing a predictive early warning of impending failure, to seeking a proactive root cause solution, there can be little doubt that oil analysis is an effective condition-monitoring tool.

However, for every success story, there’s a litany of stories recounting problems missed and failures that have occurred despite routine oil sampling. When this occurs, the usual reaction is to blame either the technology or the oil analysis lab for having failed to warn of impending doom. In extreme cases, the temptation may be to seek out a different lab which, rightly or wrongly, is billed as “better” than the incumbent lab.

But is it really the lab’s fault when an oil analysis program goes off track? While it is true that the lab should bear some accountability for the success or failure of the program, oftentimes a good long hard look in the mirror is all it takes to find the true root cause of the problem. So what exactly is involved in designing an oil analysis program that provides maximum payback?

Steps to Designing a World-Class Oil Analysis Program

Developing an effective oil analysis program requires careful planning. All too often when plant personnel decide to invest in oil analysis, they choose a lab and start sending samples without thinking about what they are trying to achieve. This fire, aim, ready approach to oil analysis is a guaranteed recipe for disaster. Instead, the program should be developed with a careful game plan in place based on a stated series of reliability goals.

There are five basic steps to developing an oil analysis program (Table 1).

To maximize the opportunities for success, these steps should be performed in this order so that the program is developed on a sound footing.

Step No. 1. Initial Program Setup

The overall structure and foundation of an oil analysis program should be based on sound reliability engineering goals. These goals should guide the end user through the process of designing and implementing the program.

For example, if the plant has a history of hydraulic pump failures believed to be related to fluid contamination, every aspect of the oil analysis program, from sample valve location to test slate selection and the assignment of targets and limits, should be governed by the stated reliability objective - in this case, extending the mean time between failures (MTBF) of the pumps.

A program built on this type of sound footing requires the development of an overall oil analysis strategy in conjunction with a failure mode and effects analysis (FMEA). This is often performed as part of a more comprehensive reliability-centered maintenance (RCM) program.

The FMEA process looks at each critical asset, and based on component type, application and historical failures, allows test slates, sampling frequencies and targets and limits to be selected. These items will address the most likely or prevalent root cause of the failure.

For example, if the goal of the program is to provide an early predictive warning for a large, slow turning gearbox with a history of failure due to the extreme loads experienced by this gearbox, the test slate selection should be geared toward detecting adhesive wear particles. Because adhesive wear particles are typically too large for elemental spectrometers to detect, a well-designed test slate should include some kind of routine ferrous density type measurement such as direct read (DR) ferrography or PQ index (ferrous wear).

Likewise, the sampling frequency should be selected to provide maximum warning between the start of a potential problem and the functional failure of the gearbox. This is often referred to as the P-F interval. (Additional information about FMEA is available in the May-June 2000 issue of Practicing Oil Analysis magazine in the article authored by Drew Troyer titled “How to Lube-up Your FMEA Process.”)

While the lab’s experience in developing effective oil analysis programs can be used to support the design process, it is ultimately the end user’s responsibility to ensure the program meets the company’s goals and reliability objectives. In particular, attention should be paid to the types of test procedures used by the lab under different circumstances.

In poorly designed programs, where lab and test selection is often driven by the per-sample cost, the lowest bidder typically becomes the lab of choice. However, for the sake of perhaps 50 cents to one dollar per sample, the test slate selection may include tests that provide little to no value.

For example, if the lab is simply asked to perform a test for “water,” the lab is at liberty to run anything from a simple crackle hot plate test, to FTIR, to a Karl Fischer moisture test. So which test should you choose? The correct answer to this question depends on your objectives.

If you simply want a yes/no test for free and emulsified water, then the crackle test is suitable and is probably the cheapest. However, if a quantitative answer is required, a more sophisticated test is needed - usually Fourier transform infrared spectroscopy (FTIR) or Karl Fischer.

Now let’s think about the maximum permissible water, which should be set as a goal-based limit from an FMEA and criticality assessment. If the plant FMEA indicates a need to keep water content below 200 ppm (0.02 percent), then the only option to trend water content over time to ensure compliance would be the Karl Fischer test, because FTIR is typically insensitive below 500 to 1,000 ppm.

However, if the target is 1,500 ppm (0.15 percent), then FTIR is a simple, inexpensive test that is entirely suitable under these circumstances. Again, program design - including test slate and procedure selection - is dependent on end user defined goals.
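The selection logic described above can be sketched as a small function. This is an illustration only: the function name `choose_water_test` is hypothetical, and the 1,000 ppm cutoff is an assumption that takes the conservative end of the 500 to 1,000 ppm range quoted above. It is not a lab standard.

```python
from typing import Optional

# Assumed cutoff: the conservative end of the 500 to 1,000 ppm range
# below which FTIR is described as typically insensitive.
FTIR_MIN_RELIABLE_PPM = 1000


def choose_water_test(target_ppm: Optional[float]) -> str:
    """Suggest a water test for a goal-based limit.

    target_ppm is the maximum permissible water content from the FMEA,
    or None when only a yes/no screen for free and emulsified water is needed.
    """
    if target_ppm is None:
        return "crackle"
    if target_ppm < FTIR_MIN_RELIABLE_PPM:
        return "Karl Fischer"
    return "FTIR"


print(choose_water_test(None))   # yes/no screen only
print(choose_water_test(200))    # tight limit, as in the 200 ppm example
print(choose_water_test(1500))   # looser limit, as in the 1,500 ppm example
```

The point of the sketch is that the test follows from the limit, and the limit follows from the reliability goal, rather than the other way around.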

Step No. 2. Sampling Strategy

Of all the factors involved in developing an effective program, sampling strategy has perhaps the single largest impact on success or failure. With oil analysis, the adage “garbage in, garbage out” definitely applies. While most oil analysis labs can provide advice on where and how to sample different components, the ultimate responsibility for sampling strategy must rest on the end user’s shoulders.

Take the real-life example of a reliability engineer at a plywood plant who had outright rejected oil analysis as an effective condition-monitoring technique. His misguided belief was based on the notion that because the plant he worked at had experienced four hydraulic pump failures in the past two years, none of which had been picked up by oil analysis, the technology simply did not work. But is it really the technology that’s at fault?

When the same engineer was asked from where in the system he was taking the sample, he seemed genuinely shocked that anyone would sample a hydraulic system anywhere other than the reservoir. However, by doing so, any wear debris from the failing pump would show up only in the oil sample bottle after finding its way through the system, which included valve blocks, untold numbers of actuators and a 3-micron return line filter, and into a 5,000-gallon reservoir where it would be diluted!

Of course, the correct location would have been immediately downstream of the pump (or on the case drain depending on pump design). In this example, the sampling strategy should have been driven by his reliability goal - finding an early warning sign of pump failure before it became catastrophic.

Figure 1

While bottom sampling can be useful in determining the presence of unusual levels of water, sludge and other debris, it is unlikely to yield any meaningful data from an oil analysis lab.

Of course, sample strategy involves more than just sampling location. Sampling method and procedure, bottle cleanliness and hardware all factor into the sampling equation.

Perhaps second only to location in importance is the provision of collateral information when the sample is submitted to the lab. For industrial equipment, as few as one sample out of 10 is submitted to the lab with appropriate information about oil type, hours on the oil, filter changes or the addition of make-up oil.

Without suitable information, oil condition parameters such as viscosity or acid number cannot be compared to the new oil and trend analysis cannot be performed effectively.

Consider two identical gearboxes from which samples are taken and sent to the lab. When the samples are shipped, nothing but the customer name and asset ID is supplied. Both samples are analyzed and show 50 ppm of iron (Figure 2).

Now consider that the oil in gearbox No. 1 has been in service for five months, while gearbox No. 2 has recently had an oil change and has been back in service for only two months. Based on wear rates (ppm per month), gearbox No. 2 is wearing at two and a half times the rate of gearbox No. 1 (25 ppm per month compared to only 10 ppm per month).
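The arithmetic behind this comparison can be written out as a short sketch. It assumes, for illustration only, that iron starts near zero at each oil change and accumulates roughly linearly; the function name is hypothetical.

```python
def wear_rate_ppm_per_month(iron_ppm: float, months_on_oil: float) -> float:
    """Average iron accumulation rate since the last oil change.

    Assumes iron was near zero after the oil change and built up
    roughly linearly, which is a simplification.
    """
    if months_on_oil <= 0:
        raise ValueError("months_on_oil must be positive")
    return iron_ppm / months_on_oil


gearbox_1 = wear_rate_ppm_per_month(iron_ppm=50, months_on_oil=5)
gearbox_2 = wear_rate_ppm_per_month(iron_ppm=50, months_on_oil=2)

print(gearbox_1, gearbox_2, gearbox_2 / gearbox_1)  # 10.0 25.0 2.5
```

The same 50 ppm reading gives very different rates once time on the oil is known, which is why the raw concentration alone says little.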

Now let’s examine why the wear rate for gearbox 2 is so high. Let’s assume that during the recent oil change on gearbox 2, an incorrect viscosity grade of oil was used due to poor labeling of a top-off container, resulting in inadequate oil film thickness and increased wear.

Unless the lab is provided both the hours on the oil, and the correct new oil reference upon which to assess the type of oil in use, neither the elevated wear rate, nor the use of an incorrect grade of oil would be picked up by the lab - yet it is likely that the lab would be blamed if a failure were “missed.”

Without exception, it is the responsibility of the end user to ensure that any and all pertinent information that can be used by the lab in the analysis and interpretation of the data be sent to the lab with each and every sample. Failure to do so simply means that the lab is guessing at whether or not any of the data is significant and should be flagged for attention.
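One way an end user might check a submission before it ships is sketched below. The record layout, field names and the customer name are hypothetical; the fields simply mirror the collateral information listed earlier (oil type, hours on the oil, filter changes and make-up oil).

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SampleSubmission:
    """Information sent with a sample; None means 'not supplied'."""
    customer: str
    asset_id: str
    oil_type: Optional[str] = None
    hours_on_oil: Optional[float] = None
    filter_changed: Optional[bool] = None
    makeup_oil_added: Optional[bool] = None


def missing_information(submission: SampleSubmission) -> List[str]:
    """List the fields the lab would otherwise have to guess at."""
    return [
        f.name for f in fields(submission)
        if getattr(submission, f.name) is None
    ]


# A submission with only the customer name and asset ID, as in the gearbox example.
bare = SampleSubmission(customer="Example Plywood Co.", asset_id="GB-3456")
print(missing_information(bare))
```

A non-empty list is a prompt to complete the paperwork before the bottle leaves the plant.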

Figure 2

Step No. 3. Data Logging and Sample Analysis

Figure 3. Oil Analysis Program Design

Assuming the sampling strategy is correct and the program has been designed based on sound reliability engineering goals, it is now up to the lab to ensure the sample provides the necessary information. The first stage is to make sure the sample, and subsequent data, is logged in the correct location so trend analysis and rate-of-change limits can be applied.

That is the lab’s responsibility, right? What if two successive samples are labeled slightly differently? For example, two samples are labeled unit IDs GB-3456 and 3456. While logic might tell us that the prefix GB simply means “gearbox,” imagine the difficulty the lab faces, confronted with as many as 2,000 samples daily. While carelessness and inattentiveness on the part of the lab are inexcusable, it is incumbent on the end user to ensure the consistency of information that is logged and used for diagnostic interpretation.
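A simple in-plant check for this kind of labeling drift is sketched below. It is only one possible approach: it groups labels that share the same digits but are written differently, and the function name is hypothetical. Because two genuinely different assets can share a number, the output is a list for human review, not an automatic merge.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def find_inconsistent_ids(unit_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group labels that contain the same digits but differ in spelling."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for unit_id in unit_ids:
        digits = "".join(re.findall(r"\d+", unit_id))
        if digits:
            groups[digits].add(unit_id.strip())
    # Keep only digit groups that were written more than one way.
    return {
        digits: sorted(labels)
        for digits, labels in groups.items()
        if len(labels) > 1
    }


labels = ["GB-3456", "3456", "HP-0012", "GB-3456"]
print(find_inconsistent_ids(labels))  # {'3456': ['3456', 'GB-3456']}
```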

Once the sample has been properly set up at the lab, the actual sample analysis is next. This is an area where end users are definitely at the mercy of the lab and its quality assurance (QA) and quality control (QC) procedures. For example, how does the lab sequence tests?

If the lab is requested to run a particle count, does it perform this test first to minimize the possibility of further lab procedures contaminating the sample, or is it left until last? How often does the lab run QA samples - samples of known chemical composition inserted in the daily run to ensure test instruments are within acceptable QC limits? Does it run them every 10 samples, every 50, or not at all?

What happens if a QA sample fails? Does the lab retest the customer samples back to the last QA sample that passed, or does it simply recalibrate the instrument and move on?
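The first of these two policies, retesting back to the last QA sample that passed, can be illustrated with a short sketch. The run format and function name are hypothetical, and the sketch assumes the instrument is recalibrated after each failed QA sample.

```python
from typing import List, Tuple, Union

# Each run entry is (kind, value):
#   ("sample", "<sample id>") for a customer sample
#   ("qa", True or False) for a QA sample and whether it passed
RunEntry = Tuple[str, Union[str, bool]]


def samples_to_retest(run: List[RunEntry]) -> List[str]:
    """Return customer samples analyzed since the last passing QA sample
    whenever a QA sample fails."""
    retest: List[str] = []
    since_last_qa: List[str] = []
    for kind, value in run:
        if kind == "sample":
            since_last_qa.append(str(value))
        elif kind == "qa":
            if not value:
                # Everything run since the previous QA sample is suspect.
                retest.extend(since_last_qa)
            # Assumed: a pass confirms the window; a fail triggers recalibration.
            since_last_qa = []
    return retest


daily_run: List[RunEntry] = [
    ("qa", True),
    ("sample", "A1"),
    ("sample", "A2"),
    ("qa", False),
    ("sample", "A3"),
    ("qa", True),
]
print(samples_to_retest(daily_run))  # ['A1', 'A2']
```

The spacing of QA samples sets how many customer results are at risk from a single failure, which is why the every-10 versus every-50 question matters.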

What about the technicians who are actually running the tests? Are they high school graduates who have been hired as cheap labor, or are they chemical technicians or degreed chemists? What about training specific to used oil analysis? Have the lab techs been sent to any training courses and have they obtained any industry-recognized qualifications such as ICML’s LLA (Laboratory Lubricant Analyst) certification?

Does the lab have a well-designed QC program, with written and enforced procedures for each test to ensure uniformity in test procedures between one tech and another? Has the lab obtained any industry-recognized QA accreditation such as ISO 17025 or 10CFR50? It is strongly recommended that before selecting a lab, you visit it to assess its overall commitment to quality. Don’t simply accept the lab’s sales pitch at face value.

Figure 4. Steps to a World-Class Oil Analysis Program

Step No. 4. Data Diagnosis and Prognosis

Diagnostic and prognostic interpretation of the data is perhaps the step where the most antagonistic relationship can develop between the lab and its customers. For some customers, there is a misguided belief that for a $10 oil sample, they should receive a report that indicates which widget is failing, why it is failing and how long that widget can be left in service before failure will occur. If only it were that simple!

The lab’s role is to evaluate the data so that complex chemical concepts such as acid number or the presence of dark-metallo oxides make sense to people who may have many years of maintenance experience, but haven’t taken a chemistry class since high school.

The lab cannot be expected to know - unless it is specifically informed - that a particular component has been running hot for a few months, that the process generates thrust loading on the bearings, or that a new seal was recently installed on a specific component that is now showing signs of excess water in the oil sample.

Evaluating data and making meaningful condition-based monitoring (CBM) decisions is a symbiotic process. The end user needs the lab diagnosticians’ expertise to make sense of the data, while the lab needs the in-plant expertise of the end user who is intimately familiar with each component, its functionality, and what maintenance or process changes may have occurred recently that could impact the oil analysis data. Likewise, evaluating data in a vacuum, without other supporting technologies such as vibration analysis and thermography, can also detract from the effectiveness of the CBM process.

While the end user must bear some responsibility for correctly evaluating the data, the lab does have some culpability. For example, flagging a gearbox for a coolant leak based on elevated sodium, when the gearbox does not have any kind of glycol-based cooling system but is part of the drive train for a conveyor carrying salt, is clearly an oversight on the part of the lab.

Ultimately, it is the lab’s job to explain its findings to the customer, and the customer’s job to use these findings to make the correct maintenance decision, based on all available information, not just oil analysis data.

Step No. 5. Performance Tracking and Cost Benefit Analysis

Oil analysis is most effective when it is used to track metrics or benchmarks set forth in the planning stage. For example, the goal may be to improve the overall fluid cleanliness levels in the plant’s hydraulic press by using improved filtration. In this case, oil analysis - and specifically the particle count data - becomes a performance metric that can be used to measure compliance with the stated reliability goals.

Metrics provide accountability, not just for those directly involved with the oil analysis program, but for the whole plant, sending a clear message that lubrication and oil analysis are an important part of the plant’s strategy for achieving both maintenance and production objectives. The final stage is to evaluate, typically on an annual basis, the effectiveness of the oil analysis program.

This includes a cost benefit evaluation of maintenance “saves” due to oil analysis. Evaluation allows for continuous improvement of the program by realigning the program with either preexisting or new reliability objectives.

Summary

There can be little doubt that oil analysis is an integral part of any condition-based maintenance program. When used effectively, it can warn of impending failure, direct us to the root cause of a problem, or point to areas of opportunity we perhaps didn’t know existed.

However, just like you wouldn’t buy a used car without checking under the hood, taking it for a test drive and kicking the tires, don’t merely assume that filling the sample bottle with oil and sending it to the lab will produce the desired results. Get involved, ask questions, visit the lab and take control of your oil analysis program - it may be the best investment in time you ever made!

Practicing Oil Analysis (1/2004)
