Before you can make progress on avoiding future failures, you first have to work your way back to the root causes. Otherwise, it’s not a question of if a failure will reappear; it’s a question of when.
Fault tree analysis (FTA) is a systematic method that helps you diagram lower-level events to understand how a system can fail so you can find ways to reduce risks.
What is fault tree analysis?
Fault tree analysis is closely connected to root cause analysis and shares the goal of helping you understand systems specifically in terms of how they can fail. In many cases, it can help you understand how specific combinations of events can come together to create failures.
A quick review of root cause analysis makes it easier to understand FTA.
In the basic Five Whys process of root cause analysis, you start with a failed state and keep asking yourself why until you reach the root cause. Generally, but not always, you need to ask five times, with each answer creating the basis of the following question. So, if the failed state is that the engine won’t start, you ask yourself why. The answer is that the battery is dead. So, the next question is “Why is the battery dead?” If the answer to that is a broken alternator, you then ask, “What’s wrong with the alternator?” Eventually, you might arrive at “Because the maintenance department missed the PM to check the alternator.”
FTA is similar because you’re also starting at an undesired state, so it’s possible to use the same example: the engine won’t start. But instead of asking questions, you create a diagram with all the possible reasons for the failure. Throughout the diagram, you include events and gates. Events represent things that can happen in the system that either directly cause or partially contribute to a failure. Gates show how failure moves through the system. At some gates, a single event is enough for the failure to move to the next level. At others, a combination of events must all happen before the failure can pass through the gate and spread further, and at still others only some of the related events are needed.
Because of the relationships between gates and events, you can understand the gates as operating on Boolean logic.
Common FTA events
At the top of your diagram is the top event (TE), the event that you want to figure out how to avoid. It’s the failure that’s at the center of your analysis, but because you’re using a tree diagram, it’s at the top. Basic events (BE) are along the bottom of the tree. Although they can move up, passing gates to connect with other events, there’s nothing below them. Between the two are your intermediate events (IE), which are first caused by the BEs before they move up to, in turn, cause the TE.
Common FTA gates
For an AND gate, the failure can only pass if all the input events happen. The number of inputs is not important; it might be two or two hundred. But for the failure to pass the gate, they all need to happen. With an OR gate, at least one of the input events needs to happen for the failure to pass. With an XOR gate, exactly one of them needs to happen; if more than one happens, the failure can’t pass.
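The gate behavior described above maps directly onto Boolean functions. Here is a minimal sketch in Python, where each gate takes a list of true/false inputs and True means the input event happened. The function names and the sample events ("warning light ignored", "fuel pump failed") are illustrative assumptions, not part of any FTA standard or tool.

```python
def and_gate(inputs):
    """Failure passes only if every input event happened."""
    return all(inputs)


def or_gate(inputs):
    """Failure passes if at least one input event happened."""
    return any(inputs)


def xor_gate(inputs):
    """Failure passes only if exactly one input event happened."""
    return sum(1 for happened in inputs if happened) == 1


# Hypothetical basic events for the "engine won't start" example.
alternator_broken = True
warning_light_ignored = True
fuel_pump_failed = False

# Intermediate event: in this sketch, the battery dies only if both inputs happen.
battery_dead = and_gate([alternator_broken, warning_light_ignored])

# Top event: either path alone is enough to stop the engine from starting.
engine_wont_start = or_gate([battery_dead, fuel_pump_failed])

print(engine_wont_start)
```

Flipping individual basic events between True and False is a quick way to see which combinations let the failure reach the top event.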
What are the benefits of FTA?
One way to understand the benefits is to look at the origins. In the early 1960s, Bell Laboratories developed FTA to ensure the safety and reliability of the Intercontinental Ballistic Missile (ICBM) Launch Control System. From there, it quickly spread to other high-hazard industries, including:
- Aerospace
- Nuclear power
- Pharmaceutical
- Petrochemical
So, the benefits are related to finding risks and reducing them. More specifically, you can:
- Focus on removing risks instead of reactive repairs
- Analyze and improve your overall systems
- Prioritize maintenance and repairs to avoid catastrophic failures
You can use FTA at different points in a process; for example, when designing and implementing a new system or when modifying an existing one.
What are the steps for setting up FTA?
There are six, and you need to complete them in the right order.
Find the failure
Here, you need to know exactly what the failure is.
Map out the system
The next step can be the biggest, because you need to map out the system, its component parts, and their relationships to one another.
Think of possible causes
Now you need not only a list of the possible problems, but also some understanding of their individual probabilities. In other words, you need to know both what could go wrong and how likely each possibility is.
Diagram the tree
Here, you take all the information you’ve gathered and use it to create a fault tree diagram, making sure to include the correct logic gates where events connect to one another.
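Before moving to a drawing tool, it can help to capture the tree as data. The sketch below is one possible representation, not a standard format: the `Node` class, its field names, and the sample events are all hypothetical. It prints the tree as indented text with the top event first and the gate type in brackets.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    gate: str = ""  # "AND", "OR", "XOR", or "" for a basic event
    children: list = field(default_factory=list)


def render(node, depth=0):
    """Return the tree as indented text lines, top event first."""
    label = f"{node.name} [{node.gate}]" if node.gate else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical tree for the "engine won't start" example.
tree = Node("Engine won't start", "OR", [
    Node("Battery dead", "AND", [
        Node("Alternator broken"),
        Node("Warning light ignored"),
    ]),
    Node("Fuel pump failed"),
])

print("\n".join(render(tree)))
```

Events with no gate and no children are the basic events along the bottom of the tree; everything between them and the top event is an intermediate event.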
Calculate the risks
Give each risk a level of probability. You’ll likely need to involve system engineers and operators here because they know the system best.
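Once the basic events have probabilities, you can combine them up through the gates. The sketch below assumes the input events are statistically independent, which real systems do not always satisfy. Under that assumption, an AND gate multiplies its input probabilities, and an OR gate is one minus the chance that none of its inputs happen. The function names and all the numbers are made up for illustration.

```python
import math


def and_probability(probabilities):
    """All independent inputs happen: multiply their probabilities."""
    return math.prod(probabilities)


def or_probability(probabilities):
    """At least one independent input happens: 1 minus 'none happen'."""
    return 1 - math.prod(1 - p for p in probabilities)


# Hypothetical probabilities for the basic events (not real data).
p_alternator_broken = 0.02
p_warning_light_ignored = 0.5
p_fuel_pump_failed = 0.01

# Intermediate event behind an AND gate.
p_battery_dead = and_probability([p_alternator_broken, p_warning_light_ignored])

# Top event behind an OR gate.
p_engine_wont_start = or_probability([p_battery_dead, p_fuel_pump_failed])

print(round(p_engine_wont_start, 4))
```

With these sample numbers, the battery path and the fuel pump path each contribute about 1%, so neither one dominates. With real data, the same arithmetic shows which branch of the tree deserves attention first.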
Lower the risks
Once you know the cause of the failure, you need to take concrete steps to lower the chances of it happening again.
How does an EAM help with fault tree analysis?
FTA offers a long list of benefits, but it also requires an upfront investment of time and expertise. It also requires a lot of reliable data on assets and equipment and their potential failures.
Modern enterprise asset management software helps ensure you get the reliable data you need for FTA. Unlike paper- or spreadsheet-based attempts, a good EAM removes a lot of the manual data entry that can quickly lead to mistakes. And once the data is inside the software, it’s safe, secure, and accessible from any connected device.
Summary
Fault tree analysis is a method for mapping out the events and conditions that can lead systems to fail. Instead of using a series of questions to work backwards to the root cause, here you use tree-shaped diagrams to get a complete understanding of what can lead to different failures in the system. Once you have the paths, you can find ways to proactively prevent a failure from spreading. Although you can find FTA across industries, organizations that need to mitigate large amounts of risk, such as those in aerospace and nuclear power, are more likely to use it. Because the process is dependent on accurate data and the ability to make solid predictions about possible failures, EAM software, with its built-in data capture and safeguarding features, can be an important tool in FTA.