Optimizing Machine Learning Performance via Dataset Generation for X-ray Image Classification

Mahr F, Schmidt K, Thielen N, Sindel T, Franke J (2024)


Publication Type: Conference contribution

Publication year: 2024

Publisher: IEEE

City/Town: New York City

Conference Proceedings Title: 2024 25th International Conference on Thermal, Mechanical and Multi-Physics Simulation and Experiments in Microelectronics and Microsystems (EuroSimE)

Event location: Catania, IT

DOI: 10.1109/EuroSimE60745.2024.10491564

Abstract

Automated X-ray inspection (AXI) plays a crucial role in ensuring the quality of electronic modules, especially post Surface Mount Technology (SMT) manufacturing. However, the presence of pseudo errors complicates accurate defect classification, necessitating efficient solutions like machine learning (ML) automation. This paper explores the impact of various dataset characteristics on ML model performance enhancement. Leveraging 2D grayscale X-ray images classified into “OK” (pseudo errors) and “not OK” (real errors), this methodology employs statistical tests to detect data drift over time. The hypothesis is that ML models trained on datasets with temporal and grayscale similarities exhibit superior performance. A dataset generation pipeline is proposed, integrating quality validation, grayscale segmentation, and temporal cropping, aiming to enhance model performance. While these optimized datasets demonstrate enhanced performance, they require frequent retraining due to reduced generalization. To address this, advocacy is made for a retraining pipeline and the recommendation is to utilize statistical tests for detecting data drift, enabling timely model retraining. These investigations reveal that datasets optimized for temporal and grayscale congruence significantly enhance classification accuracy. Additionally, the study demonstrates improvements in performance metrics: accuracy (99.03% to 99.63%), precision (98.59% to 99.75%), recall (97.21% to 99.33%), and F1-score (97.9% to 99.54%). The study highlights the importance of dataset optimization for improving the reliability of ML-based AXI-reclassification and suggests practical strategies for maintaining model efficiency over time.
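
The abstract recommends statistical tests for detecting data drift but does not name a specific test. As an illustration only, the following minimal sketch assumes a two-sample Kolmogorov-Smirnov test applied to one scalar feature per image (here, a hypothetical mean grayscale value). The function names `ks_statistic` and `drift_detected`, the choice of feature, and the significance level are assumptions made for this example and are not taken from the paper.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples, evaluated at every observed value.
    """
    ref = sorted(reference)
    cur = sorted(current)
    n, m = len(ref), len(cur)
    d = 0.0
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / n  # share of reference values <= x
        cdf_cur = bisect_right(cur, x) / m  # share of current values <= x
        d = max(d, abs(cdf_ref - cdf_cur))
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the asymptotic critical value.

    Assumption: the large-sample approximation
    c(alpha) * sqrt((n + m) / (n * m)) with c(alpha) = sqrt(-ln(alpha / 2) / 2)
    is adequate for the sample sizes involved.
    """
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


if __name__ == "__main__":
    # Hypothetical mean grayscale values per image (not data from the study).
    training_window = [float(v) for v in range(100)]
    recent_window = [v + 50.0 for v in training_window]  # shifted brightness
    if drift_detected(training_window, recent_window):
        print("Drift detected: consider retraining the classifier.")
    else:
        print("No significant drift detected.")
```

In a retraining pipeline of the kind the abstract advocates, a check like this could run on each new batch of images against the window the current model was trained on, and a positive result could trigger dataset regeneration and retraining.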

How to cite

APA:

Mahr, F., Schmidt, K., Thielen, N., Sindel, T., & Franke, J. (2024). Optimizing Machine Learning Performance via Dataset Generation for X-ray Image Classification. In 2024 25th International Conference on Thermal, Mechanical and Multi-Physics Simulation and Experiments in Microelectronics and Microsystems (EuroSimE), Catania, IT. New York City: IEEE.

MLA:

Mahr, Felix, et al. "Optimizing Machine Learning Performance via Dataset Generation for X-ray Image Classification." Proceedings of the 2024 25th International Conference on Thermal, Mechanical and Multi-Physics Simulation and Experiments in Microelectronics and Microsystems (EuroSimE), Catania, IT. New York City: IEEE, 2024.
