Maintenance and Lubrication Success at Texas Instruments

Paul V. Arnold, Noria Corporation

Steel manufacturing has its heat, cement manufacturing its grit and wood products manufacturing its dust, but if you want a really demanding environment, look no further than semiconductors.

Achieving cleanliness - a precursor to mechanical and tribological performance, equipment reliability and product quality - is a challenge in any industry sector, but chip makers must take it down to the microscopic level. It's a degree of purity found in few places on Earth.

"Anything that gets exposed to the wafer - environment, gases or liquids - needs to have a certain quality that is much higher than any other industry," says Sunil Thekkepat, the operations and maintenance manager for Texas Instruments' DMOS5 wafer fabrication facility in Dallas. "The closest would be pharmaceutical, but they aren't to some of the levels that we are. Our manufacturing is based on purity. We need ultra-pure conditions."


For instance:

  • Water is put through a myriad of steps - reverse osmosis, degasification, demineralization, etc. - to achieve a deionized, ultra-pure form that is utilized in the production process.

  • Ambient and process air is filtered to remove practically all particulate matter.

  • Temperature, humidity and atmospheric pressure are precisely maintained and controlled.

"The magnitude is great. Even molecular contamination can cause a failure on a chip," says Thekkepat.

That's an issue when an affected chip is delivered to a customer and installed into a high-end cell phone or other consumer electronics device. That's unacceptable when it is delivered to a customer and installed in safety and health products such as automotive brakes, air bags or medical monitors.

"In our business, it's not enough to be within specification," says Brenda Harrison, a TI vice president and the company's manager for worldwide facilities. "You have to be within normal operational limits at all times."

Product can be within specifications, but if quality issues, scrap, equipment downtime or supply chain hang-ups lie in the wake of a good final shipment, that's a price customers aren't willing to pay.


Texas Instruments' sprawling North Dallas campus includes six major buildings and an enormous amount of production and support equipment.

Cost of quality is closely watched in this branch of manufacturing, where the product is expected to be better and cheaper every year. To that end, customers such as Toyota and its suppliers regularly audit TI factories. All TI sites manufacturing for the auto sector are certified to the stringent ISO/TS 16949 standard, which necessitates development of a quality management system that provides for continual improvement, emphasizing defect prevention and the reduction of variation and waste.

Pure products come from pure operating conditions. Pure operating conditions come from purely performing mechanical systems (in which lubrication and oil analysis play a key role). Purely performing mechanical systems come from pure processes.

This is the environment in which TI and its facilities group work and succeed.


Vance Black (left) is a maintenance supervisor at the DMOS5 wafer fabrication building.

Drawing the Line
TI's campus in North Dallas contains six major buildings, including four factories, three of which produce semiconductor technologies (analog, digital and wireless). It's tightly packed in a 245-acre footprint, making it one of the densest semiconductor sites in the world. The wafer fabrication buildings are labeled DFAB (for Dallas Wafer Fab), and DMOS5 and DMOS6 (for Dallas Metal Oxide Semiconductors). Between six and 10 facilities professionals at each fab, and 22 at Central Utilities, oversee the critical chillers, boilers, compressors, exhaust fans, scrubbers, pumps and fresh air systems that provide ultra-pure conditions. Buildings can have more than 6,000 pieces of equipment.

The facilities group manages all utilities including power, water, chemicals and gases and maintains precise environmental conditions to support the semiconductor manufacturing tools. The facilities group installs the tools; from that point on, they are operated and maintained by the equipment engineering group. The two groups work hand-in-hand to ensure reliable operating conditions.

The Quest for Zero
With so much equipment and such stringent demands, reliability and uptime have long been areas of focus. That focus was heightened in 1995 when then-vice president Shaunna Black stated that the goal for every TI business, organization and individual should be "zero wasted resources." Injuries and illnesses are waste. Defects are waste. Ineffective use of time and resources is waste. Waste = impurity. Get the picture?

The bar was further raised in 2003 when "zero major interruptions" became a rallying cry. TI defines a major interruption as an event with greater than $1 million in impact (the cost can reach $10 million to $15 million). That can be due to scrapped product, equipment breakdown and its repair/replacement cost, or effect on production at the customer's factories.


Factories on the North Dallas campus manufacture analog, digital and wireless semiconductor technologies.

To address the goals of zero wasted resources and zero major interruptions, the facilities group within the fabs and Central Utilities set out to mitigate risk and identify potential impact within its systems, processes and procedures. Basically, everything was reassessed.

"Reliability requires two disciplines," says Harrison. "One piece is discipline toward continuous improvement and continuous incremental advance. The other piece is the process of saying, 'Let's look at this from a completely clean slate. If we were starting over at Time Zero, how would we do it differently?' When you combine those approaches, magic happens."

According to sustainable development manager Paul Westbrook, a member of the company's international facilities group, "It forces you to look at your practices and ask, 'Why do we do it this way? Is there a better way?'"


TI facilities are, more and more, basing preventive maintenance decisions (including oil changes) on actual condition as opposed to calendar methods.

This was exceptionally important for the facilities group since:

  1. the Dallas site admittedly had a history of over-maintenance (particularly in the area of lubrication) on systems, equipment and components;

  2. the buildings had far too many assets to cover with a traditional maintenance game plan, let alone one set up for over-maintenance;

  3. too much time was spent on non-primary equipment, which increased the vulnerability of more essential assets;

  4. manpower resources were finite and, as the result of retirements and staff shuffling, had been further reduced.


Paul Westbrook is TI's sustainable development manager and a member of the company's international facilities group.

The realization of "established ways not being the best ways" occurred several years back with the introduction of various predictive maintenance technologies.

"That was kind of the beginning of criticality analysis for us," says Matt French, the maintenance technologies leader at Central Utilities. "Vibration analysis was the first technology that we used. We tried to do vibration analysis on everything. Some things just aren't that critical, so why spend the time and effort doing vibration on that equipment? On top of that, vibration data is only useful if you analyze it. You don't have time to analyze data on everything."


"Anything that gets exposed to the wafer - environment, gases or liquids - needs to have a certain quality that is much higher than any other industry," says operations and maintenance manager Sunil Thekkepat.

Managers, mechanics and technicians began to rethink things.

Oil changes on compressors, chillers, etc., were performed on a calendar basis. Why? (See the sidebar "Chill Out: TI Prolongs Oil Changes in its Chillers.")

Workers performed routes with the blanket instruction to put "three squirts of grease" into specific components. Why?

Preventive maintenance tasks were continually performed on pieces of equipment that had shown no signs of degradation/failure or had little chance to impact the product/customer. Why?

"Over-maintenance has been the biggest area of opportunity," says French. "Doing what you think is the right maintenance frequently creates the need for more maintenance. An example is changing oil that doesn't need to be changed. You can add impurities in with new oil that you didn't realize were there. That creates a problem when you took out oil that was in good shape. That's a waste of time and money."

Eric Whitmore, a member of the worldwide facilities organization and the facilities manager for DMOS5, sums it up in dollars and sense.

"We don't want to make every single thing bulletproof. If we do that, we'll put ourselves out of business. The cost is too high," he says. "We have to balance the probability and the criticality and the impact. Is it worth going after and investing capital dollars to reduce or eliminate that risk? Or, is it not worth it?"

New Definitions
The answers to these and other questions were distilled through a process of asset prioritization and criticality analysis. Facilities had exposure to the concepts in past work with failure modes and effects analysis (FMEA), but dynamic, lasting results didn't occur until Central Utilities embarked on a formal initiative in 2003. The success of it led to similar actions at the fab plants.

Unlike past efforts which looked at the criticality and impact of individual pieces of equipment (a pump, a motor, etc.), the new strategy involved examining and determining, first and foremost, the criticality and impact of individual systems (the cooling water system, the air abatement system, the ultra-pure water system, the HVAC system, etc.). A motor may be deemed a critical component of a non-critical system. Why give it the same attention as a critical motor in a critical system?

This process was more time-consuming at the fab plants since they house 10 times as many systems as Central Utilities (whose systems are smaller in number, but bigger in size and scope).

Four categories of criticality were defined:

High criticality: System failure could cause death, severe injury/illness, long-term (greater than one year) environmental damage, or require more than $250,000 to correct and/or settle incurred penalties. System failure will have an immediate impact in excess of $250,000 on production.

Medium criticality: System failure could cause minor injury/illness, short-term (less than one year) environmental damage, or require $1,000 to $250,000 to correct and/or settle incurred penalties. System failure may have an impact in excess of $250,000 on production if left unrepaired or will have an immediate impact of less than $250,000 on production. System failure causes redundancy to be lost.

Low criticality: System failure could not cause injury/illness but could cause environmental damage that can be readily repaired, requiring less than $1,000 to correct and/or settle incurred penalties. System failure may have an impact of less than $250,000 on production if left unrepaired. System failure diminishes system redundancy.

Non-critical: System failure could not cause injury, illness or environmental damage. System failure cannot impact production.
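The four system categories above read like a decision rule, so a rough sketch in Python may help make the thresholds concrete. This is an illustration only, not TI's actual tool: the `SystemFailureImpact` fields, the string labels and the `classify_system` function are all hypothetical names, and the sketch assumes a failure is assigned the most severe category for which any one criterion is met.

```python
from dataclasses import dataclass


@dataclass
class SystemFailureImpact:
    """Hypothetical description of what a system failure could cause."""
    injury: str = "none"          # "none", "minor" or "severe" (severe includes death)
    env_damage: str = "none"      # "none", "readily_repaired", "short_term", "long_term"
    correction_cost: float = 0.0  # dollars to correct and/or settle penalties
    immediate_production_impact: float = 0.0  # dollars, right away
    deferred_production_impact: float = 0.0   # dollars, if left unrepaired
    redundancy: str = "intact"    # "intact", "diminished" or "lost"


def classify_system(f: SystemFailureImpact) -> str:
    """Return the most severe category whose criteria are met (an assumption)."""
    if (f.injury == "severe" or f.env_damage == "long_term"
            or f.correction_cost > 250_000
            or f.immediate_production_impact > 250_000):
        return "high"
    if (f.injury == "minor" or f.env_damage == "short_term"
            or f.correction_cost >= 1_000
            or f.deferred_production_impact > 250_000
            or f.immediate_production_impact > 0
            or f.redundancy == "lost"):
        return "medium"
    if (f.env_damage == "readily_repaired"
            or f.correction_cost > 0
            or f.deferred_production_impact > 0
            or f.redundancy == "diminished"):
        return "low"
    return "non-critical"
```

For example, `classify_system(SystemFailureImpact(redundancy="lost"))` falls into the medium category, while a failure with no consequences at all comes back as non-critical.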

The next step was to define criticality for the various pieces of equipment that comprised a system.

Critical (1): Failure could cause a safety incident or long-term environmental damage, or require more than $35,000 to correct and/or settle incurred penalties. Failure has a major impact on system reliability. Failure has an immediate impact on the system. Failure affects life safety equipment.

Essential (2): Failure could cause minor injury/illness or short-term environmental damage. Failure has some impact on system reliability. Failure will impact the system if left unrepaired. Failure causes system redundancy to be lost.

Important (3): Failure could not cause injury/illness but could cause minor environmental damage that can be readily repaired, requiring less than $1,000 to correct and/or settle incurred penalties. Failure may impact the system if left unrepaired. Failure diminishes system redundancy. Failure causes the loss of remote visibility of the system to operations, but the system control is intact. Failure causes loss of trending of system parameters.

Low criticality (4): Failure couldn't cause injury, illness or environmental damage. Failure has no impact on system reliability. Failure does not impact production.

"Since we have a set amount of resources, this work provides focus on the things that have the biggest impact on the customers and our site," says Thekkepat.

The work had additional benefits. As TI became a supplier to the automotive sector in 2003, it was (as stated earlier) expected to comply with the requirements of ISO/TS 16949. Included heavily in the audits that followed were questions related to the categorization of systems based on criticality.

"We had already done that," says Thekkepat. "Everything had been categorized and determined."

Making a Game Plan
With criticality definitions in place, cross-functional teams at the fabs and Central Utilities selected a master list of equipment. Chosen for evaluation were pumps greater than 50 horsepower, pumps less than 50 horsepower, compressors, chillers, exhaust fans, HVAC system fans and heat exchangers, scrubbers, boilers and motors. This covered systems with criticality greater than "medium" and equipment criticality greater than "important (3)". Teams assessed the criticality rankings along with the replacement value (in dollars) to determine an asset care plan for each system-based asset. (Asset care levels are defined in the sidebar "Levels of Asset Care.")

Gone are the days of asset care based on a calendar and guesswork. In their place ...

  • Systems with high criticality and an equipment criticality ranking of critical (1) require continuous condition monitoring along with comprehensive predictive maintenance (PdM) and preventive maintenance (PM). The integration of operators for inspection is also a requisite.

  • Those with second-level status are prescribed condition monitoring, PdM tasks, PM tasks and operator inspections.

  • Those with third-level status require PM tasks and operator inspections.

  • Those with fourth-tier status are allowed to run to failure.
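The four bullets above amount to a lookup from an asset's tier to its prescribed care tasks. A minimal sketch of that lookup follows; the `ASSET_CARE_PLAN` table and `care_tasks` function are hypothetical names, and numbering the tiers 1 through 4 is an assumption made for illustration, not a description of TI's system.

```python
# Hypothetical tier-to-task table based on the four bullets above.
ASSET_CARE_PLAN = {
    1: ["continuous condition monitoring", "predictive maintenance",
        "preventive maintenance", "operator inspections"],
    2: ["condition monitoring", "predictive maintenance",
        "preventive maintenance", "operator inspections"],
    3: ["preventive maintenance", "operator inspections"],
    4: [],  # run to failure: no scheduled care tasks
}


def care_tasks(tier: int) -> list[str]:
    """Return the care tasks prescribed for a tier (1 = most critical)."""
    if tier not in ASSET_CARE_PLAN:
        raise ValueError(f"unknown tier: {tier}")
    return list(ASSET_CARE_PLAN[tier])
```

Calling `care_tasks(3)` returns only the PM and operator-inspection tasks, and `care_tasks(4)` returns an empty list, which is the run-to-failure case.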

There is a bit of wiggle room on the analysis. A fab may need to more closely maintain low-dollar equipment that is very critical to a process that impacts the product, while Central Utilities may have to raise the maintenance on high-dollar equipment that is not deemed as critical because of the immense replacement value.

Also, the results are not static. Teams are constantly re-evaluating criticality analyses as the result of equipment changes, new product introductions and normal business fluctuations.

"Semiconductor design and manufacturing is a dynamic industry," says French. "As consumer preferences change in buying electronics, the industry has to shift its innovation. This means that you need to have a longer-term vision for maintaining your existing assets and expanding as necessary. You have to stay on top to make the right investment and the right choice."

Unlock Innovation
Besides asset prioritization and criticality analysis, the facilities group used additional methods to pursue M&R purity.

In 2007, representatives from all buildings on the Dallas campus convened in order to compare maintenance tactics (including those for machinery lubrication and oil analysis), share best practices and standardize on the best methods for asset care.

"There had been so much work done in certain buildings, improvements that had been made, and things had not been fully disseminated," says Thekkepat. "We needed to compare and share."

It's all about connections.

"We have experienced people within each group," he says. "They have been here for 25 to 30 years or more. There is a great deal of knowledge built in at each of those facilities. Harnessing that knowledge was one reason to get them together. Before, it was a loose connection. To resolve issues, we need that kind of interaction between people who are experts in vibration or lubrication or oil analysis, compressors - whatever the case might be."

French recounted the benefits of one comparison.

"For air line maintenance, one group was changing filters way more frequently than another group, and one had eliminated the pre-filter system altogether and recognized no difference in quality in the air being supplied to the factories," he says. "An obvious cost savings is to not do maintenance if you don't have to. We decided to eliminate that pre-filter media. You don't have to change it. There's no maintenance. And, there was no change in actual outcome. One group really knew the answer. They had done studies and knew the best way to handle it."

Facilities leaders also communicate with TI sites around the world to share best practices.

"Our plants in Japan have done a considerable amount of work on this," says French. "They have very high reliability scores. What are they doing that's good? Let's learn from them. We have plants in Germany that have very intricate systems in place. Let's figure out what they are doing. How can we get to their level?"

Implementing best practices can be a challenge, especially when it means changing long-established practices.

"In some cases, we've had pushback because it goes against something that we have done for years. In other ways, it has validated the calls for change that people have had," says French. "We may have decided that condition monitoring eliminates the need to perform frequent oil changes on compressors. An oil change on a compressor isn't that expensive. The potential impact is a $150,000 overhaul. But, that's still $150 that doesn't need to be spent. When you have resistance, you have to lay out the facts behind the decision and convince them that it's the right thing to do."

The TI campus in Dallas also has taken an expanded view on the Toyota concept of 5-S to reduce waste and remove the impurities found in plant processes.

Traditional 5-S programs focus on cleaning up physical areas and creating a "factory that talks". Clutter is tossed, organization is restored and signage is posted to convey important information. TI does all that, and then keeps going. In 2008, it ushered in a concept called extended 5-S, or x5-S, that works to eliminate the waste and clutter in processes. It's another way that TI separates the value-added from the non-value-added in order to shift time and effort to that which provides real benefit.

As an example, at the suggestion of a maintenance associate, DMOS5 assembled a team of technicians, operators and engineers to examine maintenance programs for critical equipment, routine operator rounds and documentation. Using x5-S concepts, maintenance practices were redefined to prevent failures, operator rounds were optimized to monitor critical equipment areas and 132 lockout/tagout procedures were winnowed down to 25. The group brought in more individuals to examine issues and provide feedback for changes. They called this activity "Small Group and Free Ideas".

Projects such as these unlock innovation and change the belief of what's possible.

"If every day, every TIer does something to do his or her job better, and we do that collectively over 26,000 people across the globe, how can anybody compete with that?" says Brenda Harrison.

Possibilities Are Endless
What was once thought impossible is now seen as possible.

Zero major interruptions, that key metric that served as the impetus for so much reassessment, was the goal for 2008. From a mechanical perspective, the Dallas campus completed the year with zero. The cost difference, when comparing interruptions for 2007 vs. 2008, exceeded $12 million.

"We set strategic goals that many would believe impossible," Harrison says. "But talented people never really think a goal is impossible for them. We have people who really love impossible goals and love being challenged to solve them. That's fundamentally engineering."

Goals and expectations have been met.

"In order to provide reliable systems, we have spent the money and done the right things," says French. "Now, the cost of reliability is recognized. If we don't have a major interruption, then the cost of reliability was good. If we have multiple interruptions, the cost - whatever we spent - on reliability was wrong. My group was told, 'If we have no interruptions this year, then your cost was OK.' Our costs are OK."

That doesn't mean the job is done.

"Upper management had a close eye on us in the past, but in a different way," says Thekkepat. "It's a positive implication now instead of a negative. They still won't relieve the pressure. Trust me on that."

Clean processes? Clean slate.

The semiconductor industry is tough and demanding, but it's the environment in which TI and its facilities group work and succeed.

"We are being chased and copied, so we need to take our innovation to the next level," says Harrison. "In our business, you can't sit still."

When the competition is out to follow your footsteps, that's leadership, that's excellence, and that's reliability in its purest form.

An Eye on TI
Company: Texas Instruments develops analog, digital signal processing, RF and DLP semiconductor technologies. With 2008 sales of $12.5 billion, it ranked No. 215 on the 2009 Fortune 500 list. It employs approximately 26,000 people at manufacturing, engineering, research and development, testing, and other facilities around the world.

Focus site: TI's Expressway campus in North Dallas includes six major buildings packed in a 245-acre footprint. It features four factories, including three which produce semiconductor technologies for customers in the automobile, consumer electronics and medical products industries. The campus employs approximately 7,000 people, including more than 1,000 at the DMOS5 fab plant. Buildings on the campus range from 10 years old to 50 years old.

FYI: In 2009, Fortune rated Texas Instruments as one of the most admired companies in the world (No. 2 in semiconductors) and as one of the 100 best companies to work for in America (No. 65). ... TI holds more than 35,000 patents. ... TI's Jack Kilby invented the integrated circuit in 1958. ... The company invented the handheld calculator in 1967.

Chill Out: TI Prolongs Oil Changes in its Chillers
Companies pursuing maintenance and reliability improvement frequently focus on what they can "start doing" - start doing oil analysis, start doing root cause analysis, etc. Texas Instruments found that stopping what it was doing was just as important as starting. "One of the biggest changes was this … stop doing oil changes based on time and start doing oil changes based on oil condition," says maintenance technologies leader Matt French. "For our chillers, our OEM said we should be changing oil every year. We are well beyond four years and, in some cases, have achieved seven years on the same oil. The oil analysis is done monthly. We can look at it and say, 'How has the oil condition changed?' We found that you can run it much longer.

"Now we are looking at maintenance practices to extend that oil even beyond what we think is reasonable. We are well into the range of, 'Do we really want to push it more, or just change the oil?' The green part of it, the sustainability part, is that if you continue to eliminate all of the ingress, keep the particles out, keep the dirt and water out, the oil will run forever. The additive packages don't disappear.

"The longer we can run the oil and the better we can make sure that our equipment runs, the more we can reduce the amount of ingress and the less maintenance that you have to do on it. The more you can get out of the equipment, the more competitive you are.

"Changing oil on an oil analysis basis on our chillers, I think, has been a huge success story. We also have been able to detect bearing problems through oil analysis. I think not having to question, 'Is it the oil change or is it really an equipment problem?' is a great example of reducing time and effort." 

Levels of Asset Care
1) Continuous condition monitoring through real-time data collection of system and equipment parameters, including temperatures, pressures, vibration, motor current, etc.

2) Predictive maintenance that includes periodic data collection of machine condition to determine equipment health trends (vibration analysis, oil analysis, ultrasound, thermography, motor circuit testing - online and offline).

3) Preventive maintenance through periodic lubrication, alignment checks, coupling check, oil change, etc.

4) Inspection tasks through operator rounds and readings.

A Sustainable Advantage in Resources
A target of any good improvement initiative should be the maximization of all available resources. Texas Instruments took that to heart a few years back when Amory Lovins, chairman of the Rocky Mountain Institute, challenged the company to get two uses out of every drop of water.

Lovins' message was not about symbolism. He was serious.

"If you can get two uses for every drop, and your competition is only getting one, that is a competitive advantage," says Paul Westbook, Texas Instruments' sustainable development manager and a member of the company's international facilities group.

Facilities leaders have responded.

"We used to dump water down the drain that was cleaner than the water we were getting," says Westbrook. "We recognized, 'Why are we doing this?' We already spent a lot of money to get really clean water. Why are we throwing it away and bringing in water that is 'dirty'? Let's reuse it. Bring it back to the front of our plant and clean it up. I know we aren't getting two uses out of every drop, but we are getting between 1.3 and 1.4. We are moving toward two uses. Then, why not three or four?"

Adds Sunil Thekkepat, an operations and maintenance manager, "In DMOS5, if you take 1,000 gallons per minute of actual ultra-pure water produced, 35 percent is recycled back to the front end of the plant. Another 30 percent is used in cooling towers and fume scrubbers. We're seeking alternatives for the other 35 percent."
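To put Thekkepat's percentages into flow terms, the split can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: the function name `split_upw_flow` and its parameters are hypothetical, and the 35 percent and 30 percent defaults are simply the figures quoted above for DMOS5.

```python
def split_upw_flow(total_gpm: float,
                   recycled_pct: float = 35,
                   reused_pct: float = 30) -> dict[str, float]:
    """Split an ultra-pure water flow (gallons per minute) by destination."""
    recycled = total_gpm * recycled_pct / 100   # back to the front end of the plant
    reused = total_gpm * reused_pct / 100       # cooling towers and fume scrubbers
    remaining = total_gpm - recycled - reused   # still looking for a second use
    return {"recycled": recycled, "reused": reused, "remaining": remaining}
```

For the 1,000 gallons per minute in the quote, that works out to 350 gallons per minute recycled, 300 reused in cooling towers and fume scrubbers, and 350 still awaiting an alternative use.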

