How to minimise maintenance errors

If an evil genius were to create an activity guaranteed to produce an abundance of errors, it would probably involve:

  • Frequent removal and replacement of large numbers of varied components.

  • Cramped and poorly lit spaces.

  • Make-shift equipment.

  • Severe time pressure.

  • Ambiguous or missing service manuals or instructions.

  • Someone starting a job and someone else finishing it.

  • Numerous groups working on the equipment simultaneously.

I would even wager the evil genius would have a name for this activity ...

MAINTENANCE WORK!

The Classic Error Producing Job

It's been said that maintenance work is the “classic error producing job”.

Maintenance errors have been implicated in numerous disasters in safety-critical industries, including:

  • Apollo 13 oxygen tank failure (1970)

  • Three Mile Island nuclear plant failure (1979)

  • Crash of a DC-10 in Chicago (1979)

  • Piper Alpha offshore platform explosion (1988)

  • Esso Longford gas plant explosion (1998)

Most maintenance personnel are highly skilled and dedicated, but a basic error management principle is that “the best people can make the worst mistakes”.

Work pressures can lead people into the same kind of error, regardless of who is doing the job. 

What Is A Maintenance Error?

An error is the failure of planned actions to achieve their desired goal, where this occurs without some unforeseeable or chance interaction.

 
For many tasks, Violation + Error = Disaster.

The Good News – Maintenance Errors Are Predictable

Contrary to popular belief, the majority of maintenance slips, lapses and mistakes fall into predictable patterns.

For example, most errors are triggered by the situation and task circumstances, which are common to maintenance activities.

Maintenance error is managed in the same way that any well-defined business risk is managed.

Complexity Of Maintenance Tasks

Operators can be distracted during critical phases (e.g. during re-assembly or a final safety inspection), and this can result in safety-critical deficiencies being left undetected or uncorrected.

All humans have limited short-term memory capacity, and it is unrealistic to expect maintenance workers never to be distracted or never to make errors with complex sequences.

The simple example below demonstrates the difficulty of many maintenance tasks.

Consider a bolt with eight (8) nuts on it. Each nut is labelled “A” through to “H”. 

There is only one way to take the nuts off, but there are 40,320 possible orders in which to put them back on. Only one of those orders is correct, which leaves more than 40,000 opportunities to get the sequence wrong.

Calculation: 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 40,320
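The arithmetic for the bolt example can be checked with a short Python sketch. It assumes, as the example does, that each of the eight nuts is distinct and that exactly one re-assembly order is correct. The variable names are illustrative only.

```python
import math
from itertools import permutations

# The eight labelled nuts from the example, "A" through "H".
NUTS = "ABCDEFGH"

# Number of distinct orders in which 8 different nuts can be refitted: 8!
total_orderings = math.factorial(len(NUTS))

# Assumption from the example: exactly one order is the correct one.
wrong_orderings = total_orderings - 1

# Cross-check the factorial by enumerating every possible order.
enumerated = sum(1 for _ in permutations(NUTS))

print(total_orderings)  # 40320
print(wrong_orderings)  # 40319
print(enumerated == total_orderings)  # True
```

Adding a ninth nut would multiply the count by nine, which is why even a modest increase in task complexity sharply raises the number of ways a sequence can go wrong.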

The Maintenance Error Approach That Doesn't Work

Traditionally, efforts to prevent recurrence of maintenance errors took the form of:

  • Disciplinary action.

  • Introducing more procedures.

  • Shaming.

  • More training.

Unfortunately, traditional responses to error fail to recognise or address the underlying causes, for example a difficult task performed under difficult circumstances.

Maintenance errors are consequences, not causes.

Situations and systems are easier to change than the human condition. And they're more reliable.

Successful Approaches

Worldwide experience shows that effective control of maintenance error requires multiple safety defences.

Effective safeguards usually involve a balanced mixture of “hard” engineered defences (which are generally more durable) and “soft” administrative safeguards.

An individual’s behaviour can be influenced, but it is very difficult to achieve full compliance consistently across a large organisation.

Human error is not the problem; the unforgiving consequences are.

Planning for maintenance work therefore needs to identify potentially serious outcomes and provide engineering controls where possible.

Local Error Provoking Factors

As we explained earlier, maintenance errors do not emerge randomly. 

Local factors that promote errors include:

  • Inadequate or inappropriate documentation e.g. procedures and checklists.

  • Time pressure or fatigue e.g. perceived or real time constraints.

  • Untidy or disorganised work areas.

  • Poor co-ordination or communication e.g. a task is not completed by the end of a shift, but the next shift puts the equipment back on line.

  • Inadequate tools or equipment.

  • Lack of knowledge or experience.

  • Procedure usage e.g. fewer than 60% of workers open procedures during tasks.

  • Personal beliefs on compliance e.g. feeling that strict compliance is not important.

Error Management

To control errors we need to remember:

  1. Human error is inevitable – plan for it!

  2. People cannot easily avoid those actions they did not intend to commit.

  3. You can't change the human condition, but you can change the conditions in which humans work. 

The Best Defence - Safety Culture

Safety culture can be expressed as an environment where everyone willingly and habitually follows agreed safe work procedures. 

This means all the time, not just when it is convenient. 

Therefore the real test of a safety culture is not observing behaviours in the good times; it is about how people respond in tough times. For example, during difficult maintenance activities under time pressure.

 

The Key Components Of A Safety Culture

Reporting Culture  

All difficulties and incidents are freely reported, without blame or fear of recriminations.

Learning Culture

All identified weaknesses are investigated and acted upon, in the spirit of workplace improvement.

Just Culture      

Mutual trust, e.g. all circumstances are considered when violations or problems occur.
