Federated Learning
Innovative method in quality assurance
Federated learning has established itself as an innovative method in the field of quality assurance, especially in scenarios where companies want to protect sensitive production data but still want to benefit from collaborative models.
Industrial production is increasingly facing requirements in terms of data sovereignty, data protection and individual production conditions. Federated learning (FL) offers the advantage that companies can train models together despite strict data protection regulations or competitive data silos. In addition, the diversity of data from different partners can lead to more robust and generalizable models.
In contrast to classic centralized machine learning, which requires all data to be brought together at a central location, FL enables decentralized training of a global model directly at the individual production plants without the need to exchange raw data. This makes the method particularly interesting for companies that have only a limited amount of data for a specific class, for example defective parts, or whose data shows little variance, as is the case when the same camera is always used from a fixed perspective. By virtually expanding the database, FL helps to reduce overfitting to the local training data.
Federated averaging (FedAvg) is the original and one of the most widely used aggregation methods in federated learning. The process takes place in several steps:
- Initialization: A global model is initialized on a central server and then sent to all participating clients
- Local training: Each client trains the model independently on its local data over several epochs
- Model update: After completing the local training, the clients send their updated model parameters back to the central server
- Aggregation: The server aggregates the received model updates, typically by averaging, to create an improved global model
- Distribution of the model: The new, aggregated model is again distributed to all clients to start the next training round
This cycle is repeated until a specified accuracy or convergence is achieved.
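The steps above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not a production implementation: the "model" is a toy linear regression with two parameters, the three clients are simulated in one process, and all names (`local_train`, `fed_avg`, `run_federated_training`, `make_client`) as well as the learning rate and round counts are hypothetical choices made for this example. The aggregation weights each client by its number of samples, which is the usual FedAvg convention.

```python
import random

def local_train(weights, data, epochs=5, lr=0.3):
    """Local training: a few epochs of gradient descent on y ~ w*x + b,
    using only this client's own data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

def fed_avg(client_weights, client_sizes):
    """Aggregation: average the client parameters, weighted by sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

def run_federated_training(clients, rounds=200):
    """Run the full cycle; only parameters move between server and clients."""
    global_weights = [0.0, 0.0]  # initialization on the "server"
    for _ in range(rounds):
        # Distribution + local training + model update (raw data stays local)
        updates = [local_train(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        # Aggregation into the new global model for the next round
        global_weights = fed_avg(updates, sizes)
    return global_weights

def make_client(rng, n, x_low, x_high):
    """Simulated client data: y = 2x + 1 plus small noise, on a local x range."""
    data = []
    for _ in range(n):
        x = rng.uniform(x_low, x_high)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return data

rng = random.Random(0)
clients = [
    make_client(rng, 20, 0.0, 0.5),  # client that only sees small x values
    make_client(rng, 40, 0.5, 1.0),  # client that only sees large x values
    make_client(rng, 10, 0.0, 1.0),  # small client covering the full range
]
final_w, final_b = run_federated_training(clients)
print(f"global model: w={final_w:.3f}, b={final_b:.3f}")
```

A fixed number of rounds stands in for the stopping criterion here; in practice the loop would end once a target accuracy or convergence is reached.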
Opportunities and challenges
Federated learning makes it possible to combine similar processes on comparable machines. This means that environmental influences, such as increased temperature fluctuations on machines exposed to direct sunlight or increased wear on older machines, are inherently taken into account in the model. As a result, experience gained with one machine can be quickly and effectively transferred to others so that they can react in good time.
Nevertheless, FL systems face a variety of challenges. If there is a high degree of data heterogeneity, for example because the data is not uniformly distributed across clients or because individual clients have significantly more data than others, more complex client selection strategies must be used. Such strategies aim to take all data distributions into account as far as possible during training. In addition, model or hardware heterogeneity requires more advanced aggregation methods. In such cases, common aggregation methods such as FedAvg are often not sufficient to integrate widely varying local data distributions efficiently.
Different approaches have been developed to overcome these challenges. The combination of federated learning with ensemble learning, for example, is generally robust to local data variance and delivers more precise results for various quality inspection applications. In addition, sampling strategies have been introduced to select clients whose data has particular added value for the global model, making communication more efficient. Furthermore, personalized aggregation techniques offer the possibility to weight local model updates more individually, which improves the generalization of models across different local conditions.
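The idea behind such sampling strategies can be illustrated as follows. This is a simplified sketch and not a specific published method: it assumes that each client already has a positive utility score (for example the loss of the current global model on that client's data, which is an assumption made here), and it draws a subset of clients per round with probability proportional to that score. The function name `select_clients` and the plant names are hypothetical.

```python
import random

def select_clients(scores, k, rng):
    """Pick up to k distinct clients; higher utility scores are drawn more often."""
    remaining = dict(scores)
    chosen = []
    for _ in range(min(k, len(remaining))):
        ids = list(remaining)
        # A tiny offset keeps the draw valid even if all remaining scores are zero
        weights = [remaining[c] + 1e-9 for c in ids]
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # sample without replacement
    return chosen

# Hypothetical utility scores, e.g. derived from each client's local loss
scores = {"plant_a": 0.9, "plant_b": 0.1, "plant_c": 0.5, "plant_d": 0.05}
selected = select_clients(scores, 2, random.Random(42))
print(selected)
```

Only the selected clients would train and upload in that round, which reduces communication. Because the choice is random rather than a strict top-k, clients with low scores are still included occasionally.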
Additional protective measures can be implemented to further increase security. Differential privacy, for example, adds controlled random noise to the model updates, which prevents conclusions from being drawn about individual data records of a client. Secure multiparty computation (SMPC) allows several parties to compute a joint result, such as the aggregated model, without revealing their individual inputs, so that the central server has no insight into the actual model updates. Homomorphic encryption enables calculations to be performed directly on encrypted data without having to decrypt it, thus ensuring additional privacy.
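Two of these measures can be sketched in simplified form. The first part clips a model update to a maximum L2 norm and adds Gaussian noise, which is the basic mechanism behind differentially private updates; the example does not calibrate the noise to a formal privacy budget. The second part simulates pairwise masking, one common building block of secure aggregation: every pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum. All function names and parameter values are assumptions for illustration. Real protocols derive the masks from shared secrets between clients and use finite-field arithmetic instead of floating-point numbers.

```python
import math
import random

def clip_update(update, max_norm):
    """Scale the update down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def privatize_update(update, max_norm, noise_std, rng):
    """Clip the update, then add independent Gaussian noise to every entry."""
    clipped = clip_update(update, max_norm)
    return [v + rng.gauss(0.0, noise_std) for v in clipped]

def mask_updates(updates, rng, mask_range=100.0):
    """Simulate pairwise masking: client i adds a mask, client j subtracts it.
    Each masked update looks random, but the masks cancel in the sum."""
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.uniform(-mask_range, mask_range) for _ in range(dim)]
            for d in range(dim):
                masked[i][d] += mask[d]
                masked[j][d] -= mask[d]
    return masked

rng = random.Random(1)
updates = [[0.2, -0.1, 0.4], [0.1, 0.3, -0.2], [-0.3, 0.2, 0.1]]

# Differential-privacy style: each client perturbs its own update before upload
noisy = [privatize_update(u, max_norm=1.0, noise_std=0.05, rng=rng) for u in updates]

# Secure-aggregation style: the server only sees masked updates
masked = mask_updates(updates, rng)
true_sum = [sum(u[d] for u in updates) for d in range(3)]
masked_sum = [sum(m[d] for m in masked) for d in range(3)]
print(true_sum, masked_sum)
```

The two sums agree up to floating-point rounding, although no single masked update reveals the original values.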
Predict downtimes and plan maintenance measures
The practical suitability of FL has been proven in various quality assurance scenarios. In predictive maintenance, federated models enable efficient monitoring of the condition of machines and systems at different locations. By pooling knowledge from several production environments without sharing raw data, downtimes can be predicted more precisely and maintenance measures can be planned better.
FL also offers great advantages for visual inspection procedures. Production sites with different lighting conditions, machine types and defect patterns benefit from a globally trained model that is continuously improved through decentralized learning processes. Especially when it comes to detecting fine cracks, scratches or corrosion, federated approaches can achieve significantly more reliable results than locally isolated models.
Future research will focus on adaptive strategies for coping with dynamic production conditions and further optimization of client selection as well as the use of hybrid models. In addition, research and practice will increasingly investigate how AI-supported quality assurance processes can be made even more efficient through the integration of real-time data analysis and edge computing. Federated learning will thus become a key instrument of modern Industry 4.0 strategies and provide sustainable support for the development of flexible, secure and highly precise quality assurance processes.
From the field: Quality assurance at Huawei
Tatjana Legler is leading an FL project with Huawei, in which SmartFactory Kaiserslautern and RPTU Kaiserslautern are working together. She is an AI expert and deputy head of the Chair of Machine Tools and Control at RPTU Kaiserslautern-Landau. In the use case, the surfaces of USB sticks were examined, which differed in manufacturing process, shape and color. A standard industrial camera took pictures, while the pre-trained AI model ran locally on the camera. If a deviation was detected in the connector, the workpiece was automatically ejected. After a relevant training session, new network weights were uploaded. These consisted of millions of values between 0.0 and 0.1, which abstractly described the error but did not allow any conclusions to be drawn about the workpiece. The centrally stored model was optimized with the parameters recorded at the decentralized sites and fed back into the local applications, which could then continue working with the improved model. "The advantages are obvious," explains Legler. "An error that occurred at our site could be detected at Huawei without ever having occurred there before. A classic win-win situation."
In Germany, the use case was first presented at the Hannover Messe in 2023. Internationally, the team of Prof. Martin Ruskowski, Tatjana Legler and Vinit Hegiste presented their initial findings at the AI Hardware & Edge AI Summit in California at the end of 2023.
Automatica, Hall B6, Stand 502










