Properly conducted RCAs consume time and resources, so when we are not getting the expected ROI from our efforts, we have to ask why we are spending so much money doing the same thing over and over again. Let’s look at a typical RCA scenario.
You experience an unexpected shutdown that lasts 6 hours. The threshold to commission a formal RCA (i.e., the trigger) is 4 hours, so the condition has been met. An RCA team is quickly put together amidst the chaos of the outage, and oftentimes the person most familiar with the process involved is appointed the RCA team leader.
Under such conditions, there is an attempt to collect data and evidence, but oftentimes the effort is not as comprehensive as we’d like it to be. This is due to time pressure to safely secure the area and restart production, the reluctance of the parties that control the data to share it at that time, time pressure to conclude the RCA, and the fact that there may be no requirement to validate our conclusions with evidence. The RCA team meets on and off for a week, as this is not their primary job (and that ‘week’ timeframe is being very generous). Then they prepare their final presentation to leadership, seeking approval of their recommended corrective actions.
Production is eventually started up, the RCA is presented and finalized, and corrective actions are approved for implementation.
Two weeks later, the same failure occurs again, and the plant manager is not happy.
While I made up this scenario, it is based on my three (3) decades in this RCA space, and it’s not too far from reality in my experience. Let’s look at this one scenario and see what we can glean from it.
Viewing RCA As Re-work
In the above scenario, you would end up having to do another RCA because the failure recurred. This is akin to ‘re-work’. With RCAs we sometimes tend not to view it that way, and it’s deemed just a cost of doing business. But it’s not…IT’S RE-WORK. If the failure didn’t happen again, we wouldn’t be analyzing it again.
What does that re-work really cost the organization?
For the sake of example, let’s use the following assumptions (for a reality check, just replace my assumptions with your own numbers and see what you come up with).
In our case, let’s assume the following resources and costs were incurred in responding to the failure:
|Resource||# of People||Time||$/Hr or Material Cost||Total Cost|
|Downtime||N/A||6 hours (Total)||$10,000 USD||$60,000 USD|
|Hourly Labor||3||6 hours (Each)||$50 USD||$900 USD|
|Professional Labor||4||4 hours (Each)||$75 USD||$1200 USD|
|Materials||N/A||N/A||$2500 USD||$2500 USD|
|Total RCA Cost||$64,600 USD|
This does NOT include ancillary costs such as the time of people in the storeroom/warehouse, purchasing, expediting parts, use of external Subject Matter Experts (SMEs), executive time for presentations, the cost of implementing RCA corrective actions, RCA training & software, customer complaints, and the time the RCA team members spend meeting and conducting the RCA. Essentially, the costs in the table cover responding to the failure, not solving it. That makes these numbers safe, conservative and very defensible.
TIP: This is important because when you try to make such a business case for re-work (or RCA in general, to be honest), expect that people will try to discredit the integrity of your numbers. Make sure your numbers come from credible sources (like your accounting department).
So, in our very simple case, a single recurrence costs nearly $65,000 USD in re-work. Imagine if this were a chronic failure that happened 4x/year! This is our (and your) business case for doing RCA properly and reducing the risk of recurrence.
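If you want to swap in your own numbers, the arithmetic behind the table can be sketched in a few lines of Python. This is only a minimal sketch under the assumptions in the table above; the names `LaborLine` and `response_cost` are hypothetical helpers made up for illustration, and the 4x/year figure is simply the "imagine if" case mentioned above.

```python
from dataclasses import dataclass


@dataclass
class LaborLine:
    """One labor row of the table: head count, hours per person, hourly rate."""
    people: int
    hours_each: float
    rate_per_hour: float

    def cost(self) -> float:
        return self.people * self.hours_each * self.rate_per_hour


def response_cost(downtime_hours, downtime_rate, labor_lines, materials):
    """Cost of responding to ONE occurrence (not the cost of solving it)."""
    downtime = downtime_hours * downtime_rate
    labor = sum(line.cost() for line in labor_lines)
    return downtime + labor + materials


# Assumed figures taken from the table; replace them with your own.
cost_per_recurrence = response_cost(
    downtime_hours=6,
    downtime_rate=10_000,
    labor_lines=[
        LaborLine(people=3, hours_each=6, rate_per_hour=50),   # hourly labor
        LaborLine(people=4, hours_each=4, rate_per_hour=75),   # professional labor
    ],
    materials=2_500,
)

# Hypothetical chronic case: the same failure recurring four times a year.
cost_if_four_per_year = cost_per_recurrence * 4

print(f"Per recurrence: ${cost_per_recurrence:,.0f}")
print(f"At 4x/year:     ${cost_if_four_per_year:,.0f}")
```

Keeping each table row as its own input makes it easy to show your accounting department exactly where every dollar in the total came from.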
Why Chronic Failures Are More Costly Than Sporadic Failures
Chronic failures are much easier to quantify in terms of ROIs. Think about why this is the case. If you have a sporadic/acute failure that happens once every 5 years, then logically you would have to wait 5 years to see if it happens again. In other words, you’d have to wait that long to take credit for it…that’s not happening. Likely, you wouldn’t even be in the same position five years later.
Contrast this with chronic failures, the ones that happen so often (every shift, for instance) that we don’t even record them in our tracking systems. This is often because it may take longer to enter the event into the system than it does to make the quick fix. These failures are hidden in plain sight and often absorbed into the ‘cost of doing business’ paradigm. It’s not a failure anymore, it’s just your turn to fix it as part of the daily routine.
These are our greatest opportunities though!! ROI is easier to calculate for them because they happen so often. They are actually accommodated in our budgets as a slush fund under something like ‘General’ or ‘Routine’. They even get a cost of living increase every year!!
Let’s take an example and consider a simple chronic event, such as conveyor belts that trip in a mining operation. Individually, each trip may take 15 minutes to locate and reset. This 15-min period requires the attention of one person, which at a typical standard rate ($40/hr. with benefits included) results in a labor cost per event of $10 (0.25 hr. x $40/hr. labor rate).
Because the event simply requires a person to find and reset the tripped conveyor system, generally no additional parts costs are involved. However, the 15-min delay causes a production loss upstream in the processing area, which equates to $5000/hr. Fifteen minutes now is worth $1250/occurrence (0.25 hr. x $5000/hr. production loss). So, each 15-min occurrence is now worth $1260 ($10 labor + $1250 lost production). Still considered a relatively low impact, right?
Now consider that on this particular conveying system we experience 40 such stoppages a week, or 2080 for the year (40 x 52 weeks). We are looking at an annual impact to the bottom line of $2,620,800 ($1260/occurrence x 2080 occurrences). The line item in an Opportunity Analysis may look like this.
|Event||Frequency/Yr||Impact/Occurrence||Total Annual Loss|
|Mining Conveyor Belt Trip||2,080||$1,260||$2,620,800|
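The same line item can be reproduced with a short Python sketch, which may be handy when you want to test other frequencies or rates. It assumes the figures from the conveyor example above; the function name `chronic_annual_loss` and the 52-week year are illustrative assumptions, not part of any standard Opportunity Analysis tool.

```python
def chronic_annual_loss(events_per_week, minutes_per_event,
                        labor_rate, production_loss_rate,
                        weeks_per_year=52):
    """Return (impact per occurrence, occurrences per year, total annual loss)."""
    hours_per_event = minutes_per_event / 60
    # Each occurrence costs labor time plus lost production for the same period.
    impact_per_occurrence = hours_per_event * (labor_rate + production_loss_rate)
    occurrences_per_year = events_per_week * weeks_per_year
    total = impact_per_occurrence * occurrences_per_year
    return impact_per_occurrence, occurrences_per_year, total


# Assumed figures from the conveyor example; replace them with your own.
trip_impact, trips_per_year, trip_annual_loss = chronic_annual_loss(
    events_per_week=40,
    minutes_per_event=15,
    labor_rate=40,               # $/hr, benefits included
    production_loss_rate=5000,   # $/hr of lost production
)

print(f"Impact/occurrence: ${trip_impact:,.0f}")
print(f"Frequency/yr:      {trips_per_year:,}")
print(f"Total annual loss: ${trip_annual_loss:,.0f}")
```

Try halving the frequency or the reset time to see how quickly a "low impact" event moves the annual total.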
This is why chronic failures tend to be way more costly than sporadic failures. Since individual occurrences do not tend to hit an ‘RCA trigger’, there is no requirement to analyze them. We just get good at continually fixing them…faster. Food for thought, my friends!
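To make the trigger point concrete, here is a small Python sketch that compares a single conveyor trip against the 4-hour trigger from the opening scenario, and then does the same for the year's accumulated stoppage time. Note the assumptions: checking cumulative hours against the trigger is only an illustration of how much downtime slips under it, not a practice described in this article, and treating the trigger as "greater than or equal to" is also a guess on my part.

```python
RCA_TRIGGER_HOURS = 4.0   # formal RCA threshold from the opening scenario
TRIP_HOURS = 0.25         # one 15-minute conveyor trip
TRIPS_PER_YEAR = 2080     # 40 per week x 52 weeks


def hits_trigger(duration_hours, trigger_hours=RCA_TRIGGER_HOURS):
    """Assumed rule: an event qualifies once it reaches the trigger duration."""
    return duration_hours >= trigger_hours


# A single trip never comes close to the trigger...
single_trip_flagged = hits_trigger(TRIP_HOURS)

# ...but the stoppage time accumulated over a year is far beyond it.
cumulative_hours = TRIP_HOURS * TRIPS_PER_YEAR
cumulative_flagged = hits_trigger(cumulative_hours)

print(f"Single trip flagged: {single_trip_flagged}")
print(f"Cumulative hours/yr: {cumulative_hours:.0f}")
print(f"Cumulative flagged:  {cumulative_flagged}")
```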