Smart Factory
Predictive maintenance with a digital twin
This article deals with the design of a predictive maintenance algorithm for a triplex pump using data obtained from the simulation of a digital twin of the pump under different fault conditions.
When industrial equipment breaks down, the real problem is often not the cost of replacing the defective part, but the forced downtime. A production line stoppage can mean losses of thousands of euros per minute. Regular maintenance can reduce unplanned downtime, but it does not guarantee that equipment will never break down.
By Steve Miller, Mathworks
What if a device could tell you that one of its components is about to fail? What if the device could even tell you which component needs to be replaced? Unplanned downtime would be significantly reduced. Planned maintenance work would only be carried out when required and not at fixed intervals. This is the goal of predictive maintenance: to avoid downtime by using sensor data to predict when maintenance is required.
At the heart of the development of every predictive maintenance algorithm is sensor data that can be used to train a classification algorithm for fault detection. Meaningful features are extracted from this data in a pre-processing step and used to train a machine learning algorithm for predictive maintenance. This algorithm is exported to simulation software such as Simulink for verification and then provided in the form of code on the device's control unit.
It is not always possible to collect data from physical systems in use under typical fault conditions. Allowing faults to occur in the field can lead to fatal failures and equipment destruction. Intentionally creating faults under more controlled conditions can be time consuming, costly or even unfeasible.
One solution to this challenge is to create a digital twin of the plant and generate sensor data for different fault conditions through simulation. With this approach, engineers can generate all the sensor data needed for a predictive maintenance workflow, including tests with all possible fault combinations and faults of varying severity.
This article describes the design of a predictive maintenance algorithm for a triplex pump using Matlab, Simulink and Simscape (Figure 1). A digital twin of the pump is created in Simscape and calibrated against measured data; the predictive maintenance algorithm is then created using machine learning. The algorithm needs only the discharge pressure of the pump to recognize which components or combinations of components are about to fail.
Creating the digital twin
A triplex pump has three pistons that are driven by a crankshaft (Figure 2). The pistons are arranged in such a way that liquid always emerges from one chamber. This makes the flow more even, reduces pressure fluctuations and thus reduces material stress compared to a single-piston pump. Typical failure conditions of such a pump are worn crankshaft bearings, leaking piston seals and blocked inlets.
CAD models for pumps, which are often available from the manufacturer, can be imported into Simulink and used to create a mechanical model of the pump for 3D multi-body simulation. To model the dynamic behavior of the system, the pump must now be supplemented with the hydraulic and electrical elements.
Some of the parameters required to create a digital twin, such as bore, stroke and shaft diameter, can be found in the manufacturer's data sheet, but others may be missing or only given as ranges. In this example, we need the upper and lower pressures at which the three check valves feeding the outlet open and close. We do not have exact values for these pressures as they depend on the temperature of the liquid being pumped.
The diagram in Figure 3 shows that the simulation of the pump with rough estimates (blue line) does not sufficiently match the data from the application (black line). The blue line resembles the measurement curve to a certain extent, but the differences are obviously large.
We use Simulink Design Optimization to automatically tune the parameter values so that the model produces results that match the measured data. The parameters selected for optimization are located in the Check Valve Outlet block in Simscape (Figure 4). Simulink Design Optimization selects parameter values, runs a simulation and calculates the difference between the simulated curve and the measured curve. Based on this result, new parameter values are selected and a new simulation is performed. The gradient of this difference with respect to the parameters is calculated to determine the direction in which each parameter should be adjusted. In this example, convergence is achieved quickly as only two parameters are optimized. For more complex scenarios with more parameters, it is important to use functions that speed up the optimization process.
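The select, simulate, compare and adjust loop described here can be illustrated with a small sketch. It is not Simulink Design Optimization: `simulate_pressure` is a hypothetical two-parameter stand-in for the pump model, the names `p_open` and `p_close` and all numbers are assumptions, the "measured" curve is generated from known values, and plain finite-difference gradient descent takes the place of the solver the tool uses.

```python
import math


def simulate_pressure(p_open, p_close, n=50):
    """Hypothetical stand-in for one simulation run of the pump model."""
    curve = []
    for i in range(n):
        blend = 0.5 * (1.0 + math.sin(2.0 * math.pi * i / n))
        curve.append(p_close + (p_open - p_close) * blend)
    return curve


def cost(params, measured):
    """Mean squared difference between simulated and measured curve."""
    simulated = simulate_pressure(params[0], params[1], len(measured))
    total = sum((s - m) ** 2 for s, m in zip(simulated, measured))
    return total / len(measured)


def estimate(measured, start, rate=1.0, steps=100, h=1e-4):
    """Finite-difference gradient descent on the curve mismatch."""
    params = list(start)
    for _ in range(steps):
        grad = []
        for k in range(len(params)):
            up = list(params)
            up[k] += h
            down = list(params)
            down[k] -= h
            # Central difference: slope of the cost along parameter k.
            grad.append((cost(up, measured) - cost(down, measured)) / (2.0 * h))
        # Step against the gradient to reduce the mismatch.
        params = [p - rate * g for p, g in zip(params, grad)]
    return params


TRUE_PARAMS = (1.8, 0.5)                 # assumed values, arbitrary units
measured = simulate_pressure(*TRUE_PARAMS)
fitted = estimate(measured, start=(2.5, 0.1))
```

Starting from the rough guess `(2.5, 0.1)`, `fitted` ends up close to `TRUE_PARAMS`. With only two parameters each iteration costs four simulations, which hints at why gradient evaluation becomes the bottleneck as the parameter count grows.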
Creation of the predictive model
Once a digital twin of the pump has been created, the next step is to incorporate the behavior of failed components into the model.
There are various ways to add fault behavior. Many Simulink blocks have drop-down menus for typical faults such as short circuits or open circuits. By simply changing parameter values, effects such as friction or fading can be modeled. In this example, three types of faults are considered: increased friction due to a worn bearing, a reduced flow area due to a blocked inlet and leaking piston seals. Block parameters must be adjusted for the first two faults. To model leakage, we need to add a path to the hydraulic system.
As shown in Figure 5, the selected fault conditions can be activated and deactivated either via a user interface or via the command line in Matlab. In the model presented here, all fault conditions are activated and deactivated via Matlab commands. This allows the entire process to be automated using scripts.
In the pump simulation shown in Figure 6, two faults were activated: a blocked inlet and a leaking seal on piston 3. These faults are marked by the red circles. The diagram in Figure 6 shows the simulation results for the outlet pressure as a continuous line (blue) and as sampled values with noise (yellow). The data generated by the simulation must include quantization noise, as we need to train our fault detection algorithm with data that is as realistic as possible.
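How an ideal simulated pressure trace might be turned into realistic sensor samples can be sketched as follows. The noise model (Gaussian), the quantization step, the sampling ratio and the test signal are all assumptions chosen for illustration; the article does not specify the sensor characteristics.

```python
import math
import random


def sensor_reading(pressure, noise_std, step, rng):
    """Add Gaussian noise, then round to the nearest quantization step."""
    noisy = pressure + rng.gauss(0.0, noise_std)
    return round(noisy / step) * step


def sample_signal(signal, every, noise_std=0.02, step=0.05, seed=0):
    """Keep every `every`-th point of the ideal signal and pass it through
    the assumed sensor model. A fixed seed makes the run repeatable."""
    rng = random.Random(seed)
    return [sensor_reading(p, noise_std, step, rng) for p in signal[::every]]


# Assumed ideal outlet pressure: a periodic trace in arbitrary units.
ideal = [1.0 + 0.3 * math.sin(2.0 * math.pi * i / 200) for i in range(1000)]
samples = sample_signal(ideal, every=10)
```

Changing `seed` yields a different noise realization for the same fault scenario, which is one way to obtain several distinct training traces from a single simulation result.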
The green box in Figure 6 shows the normal value range for the outlet pressure. There are peaks that are well outside the normal range, indicating a fault. This graph alone would tell an engineer or pump operator that something is wrong, but it is not yet possible to judge exactly what the fault is.
We use this simulation to generate pressure data for the pump under all possible combinations of fault conditions. Approximately 200 scenarios were created for the digital twin. Each scenario must be simulated multiple times to account for quantization effects in the sensor. Since this approach requires several thousand simulations, we are looking for ways to speed up data generation.
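A scripted scenario list of this kind could be built along the following lines. The fault names, the severity grid and the number of repeats are hypothetical; they do not reproduce the roughly 200 scenarios used in the article.

```python
from itertools import product

# Hypothetical severity grids; 0.0 means the fault is switched off.
FAULT_LEVELS = {
    "bearing_friction": [0.0, 0.5, 1.0],
    "blocked_inlet": [0.0, 0.5, 1.0],
    "seal_leak": [0.0, 0.5, 1.0],
}


def build_scenarios(levels, repeats):
    """Return one dict per simulation run: every combination of fault
    severities, repeated with a different noise seed each time."""
    names = list(levels)
    scenarios = []
    for values in product(*(levels[name] for name in names)):
        for run in range(repeats):
            scenario = dict(zip(names, values))
            scenario["seed"] = run  # selects the sensor noise realization
            scenarios.append(scenario)
    return scenarios


scenarios = build_scenarios(FAULT_LEVELS, repeats=4)
```

Here 27 severity combinations times 4 repeats give 108 runs. Adding one more severity level per fault raises that to 256, which shows how quickly the run count reaches the thousands.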
A typical approach is to distribute simulations across the available threads on multi-core computers or across several computers or computer clusters. Depending on the complexity of the problem, time constraints and resources, this approach is supported by the Parallel Computing Toolbox and the Matlab Parallel Server.
Another approach is to use the "Fast Restart" function in Simulink, which takes advantage of the fact that many systems require a certain settling time until a steady state is reached. With "Fast Restart", this part of the test only needs to be simulated once. All subsequent simulations start at the point at which the system has reached the steady state. In this example, the settling time would account for around 70% of the simulation time required for a single test (Figure 7). This means that around two thirds of the simulation time can be saved with "Fast Restart". As the "Fast Restart" function can be configured both via the Matlab command line and via scripts, it is ideal for automating the training process.
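The size of the saving can be estimated with a simple back-of-the-envelope model. It assumes that the settling phase is simulated exactly once and reused by every later run, and it ignores any overhead; both are simplifications, so the result is an upper bound rather than a measured figure.

```python
def total_time(runs, run_time, settling_fraction, reuse_settling):
    """Total simulation time for `runs` tests of length `run_time`."""
    settle = run_time * settling_fraction
    steady = run_time - settle
    if reuse_settling:
        # Settle once, then simulate only the steady-state part of each run.
        return settle + runs * steady
    return runs * run_time


def saved_fraction(runs, settling_fraction):
    """Share of the total simulation time saved by reusing the settled state."""
    full = total_time(runs, 1.0, settling_fraction, reuse_settling=False)
    fast = total_time(runs, 1.0, settling_fraction, reuse_settling=True)
    return (full - fast) / full
```

For a single run nothing is saved. As the number of runs grows, `saved_fraction(runs, 0.7)` approaches the settling fraction of 0.7 from below, which is in line with the "around two thirds" quoted above once real-world overhead is taken into account.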
In the next step, the simulation results are used to extract training data for the machine learning algorithm. The Predictive Maintenance Toolbox offers various options for generating features from training data. Since the signal considered here is periodic, a fast Fourier transform (FFT) appears to be the most promising approach. As shown in Figure 8, this results in a small number of clearly separated peaks of different magnitudes for both individual faults and combinations of faults. This is the type of data that a machine learning algorithm can process very well.
The FFT results for each fault scenario are extracted into a table containing the inserted faults and the observed signal frequencies and magnitudes. Because each spectrum reduces to a handful of peaks, only relatively few parameters need to be considered.
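The idea of reducing a periodic pressure trace to a few frequency and magnitude pairs can be sketched as follows. This is not the Predictive Maintenance Toolbox: a naive discrete Fourier transform stands in for the FFT to keep the example dependency-free, and the test signal, its two frequency components and the sampling rate are assumptions.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Naive DFT of the mean-free signal.
    Returns (frequency, amplitude) pairs for the bins below Nyquist."""
    n = len(samples)
    mean = sum(samples) / n
    spectrum = []
    for k in range(1, n // 2):
        acc = sum((x - mean) * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append((k * sample_rate / n, 2.0 * abs(acc) / n))
    return spectrum


def top_peaks(spectrum, count):
    """Keep the `count` largest local maxima as features."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        magnitude = spectrum[i][1]
        if magnitude > spectrum[i - 1][1] and magnitude >= spectrum[i + 1][1]:
            peaks.append(spectrum[i])
    return sorted(peaks, key=lambda p: p[1], reverse=True)[:count]


RATE = 256.0  # assumed sampling rate in Hz
signal = [math.sin(2 * math.pi * 8 * i / RATE)
          + 0.4 * math.sin(2 * math.pi * 24 * i / RATE)
          for i in range(256)]
features = top_peaks(magnitude_spectrum(signal, RATE), count=2)
```

`features` holds the two dominant components, at 8 Hz and 24 Hz. One such short list per scenario, stored next to the faults that were switched on, forms a row of the training table.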
Now all the data required for training a fault detection algorithm is available and can be imported into the Statistics and Machine Learning Toolbox. A subset of the generated data is held back to verify the trained algorithm.
We visualize the results of the training process in the Statistics and Machine Learning Toolbox. Using these visualizations, we can compare the strengths and weaknesses of different algorithms and determine whether additional training data is needed. We select the trained algorithm that achieved the highest accuracy in determining the pump fault based on the measurement data and import it into the digital twin for verification using seven test cases saved for this purpose (Figure 9). As the final results show, the classification algorithm is able to reliably recognize all seven scenarios. It can now be deployed on the control unit.
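Training on one part of the data and verifying on held-back cases can be illustrated with a deliberately simple classifier. The sketch uses a nearest-centroid rule, which is not necessarily the algorithm selected in the article, and the fault labels, feature centres and spread are synthetic assumptions rather than results from the pump model.

```python
import math
import random

# Hypothetical feature centres: (magnitude of 1st peak, magnitude of 2nd peak).
CENTRES = {
    "healthy": (1.0, 0.1),
    "blocked_inlet": (0.6, 0.5),
    "seal_leak": (1.0, 0.6),
}


def make_dataset(per_class, rng):
    """Synthetic labelled feature vectors scattered around each centre."""
    rows = []
    for label, centre in CENTRES.items():
        for _ in range(per_class):
            features = [c + rng.uniform(-0.05, 0.05) for c in centre]
            rows.append((features, label))
    rng.shuffle(rows)
    return rows


def train(rows):
    """Nearest-centroid 'training': average the feature vectors per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid lies closest to the feature vector."""
    return min(model, key=lambda label: math.dist(model[label], features))


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(model, f) == label for f, label in rows) / len(rows)


rng = random.Random(1)
data = make_dataset(per_class=30, rng=rng)
split = int(0.8 * len(data))                 # 80 % for training, rest held back
train_rows, test_rows = data[:split], data[split:]
model = train(train_rows)
holdout_accuracy = accuracy(model, test_rows)
```

Because the held-back rows never influence `model`, `holdout_accuracy` is an honest estimate of how the classifier behaves on unseen traces. A low value would indicate that more scenarios or better features are needed.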
A real application of this workflow is industrial plants that are used worldwide in very different environmental conditions. These systems also change over time: a new seal or valve supplier may be selected, and the pump may be operated with different types of fluids or in new environments with different daily temperature ranges. All of these factors affect the pressure measured by the sensor, potentially making the fault detection algorithm unreliable or even useless. The ability to quickly adapt the algorithm to new conditions is crucial for the use of these systems in new markets.
The workflow described here can be automated with scripts in Matlab and most of the work results can be reused. The only step that needs to be repeated is the data acquisition under comparable conditions to the pump in use.
With the latest advances in smart networking, it will even be possible for machine builders to deliver equipment to customers with preliminary settings, collect data remotely under real-life conditions on site, train the fault detection algorithm and then re-deploy it remotely on the equipment. This will open up new opportunities for customer support, including re-training fault detection on equipment that has been in use for some time under site-specific conditions. The knowledge gained from numerous devices will benefit both customers and manufacturers.
Summary
With predictive maintenance, engineers can determine exactly when systems need to be serviced. This reduces downtime and prevents system failures, as maintenance can be planned according to actual need rather than a predetermined schedule. It would often be too expensive or even impossible to create the necessary fault conditions to train a predictive maintenance algorithm on real equipment. One solution to this challenge is to use data from the use of the fully functional device to optimize a physical 3D model and create a digital twin. The digital twin can then be used to design a predictive maintenance detection algorithm that is deployed on the equipment's control unit. The process can be automated. This enables rapid adaptation to different conditions, materials to be processed and system configurations.