Predictions in Predictive Process Monitoring with Previously Unseen Categorical Values

Roider J, Wang W, Zanca D, Matzner M, Eskofier B (2024)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2024

Publisher: Springer Link

Series: Lecture Notes in Business Information Processing

Book Volume: 533

Page Range: 227-239

Conference Proceedings Title: Process Mining Workshops

Event location: Technical University of Denmark, DK

ISBN: 978-3-031-82225-4

DOI: 10.1007/978-3-031-82225-4_17

Abstract

Predictive process monitoring (PPM) methods provide users with real-time predictions about ongoing process instances. Machine learning models used for such tasks do not account for data variability, such as the occurrence of previously unseen categorical feature values. Concept drift adaptation solutions are suggested in such scenarios. However, adapting to new feature values requires time and a sample size large enough to train a well-generalizing model. Still, users expect seamless communication during the timeframe between the first occurrence of a new value and the availability of an updated model. Dedicated solutions are needed since encoding techniques like one-hot encoding cannot handle previously unseen values by default. In this work, we first introduce and discuss possible solutions from a business perspective, ranging from temporary shutdowns to dedicated manual and technical solutions for an uninterrupted continuation of predictive services. Next, we present five variants of one-hot encoding to handle previously unseen categorical values. This is followed by a case study using six real-world event logs and two machine learning models, XGBoost and LSTM, to identify the variants that produce the most reliable remaining time predictions. The study also includes the evaluation of two baseline models as an alternative to the machine learning models. The results show that previously unseen categorical values can be handled on a technical level without severely affecting the remaining time prediction quality. However, future research is required to provide more practical recommendations.
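
To illustrate the encoding problem the abstract describes, the following minimal Python sketch shows why a vocabulary fitted on training data fails on a new category, and two generic ways of avoiding that failure. It is an illustration only: the abstract does not specify the five variants the paper evaluates, so the two fallbacks shown here (an all-zero vector and a dedicated "unknown" column) are assumptions and may not correspond to any of them. All function names and the example activity labels are hypothetical.

```python
from typing import Dict, List, Sequence


def fit_vocabulary(values: Sequence[str]) -> Dict[str, int]:
    """Map each category seen during training to a column index."""
    vocab: Dict[str, int] = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)
    return vocab


def encode_strict(value: str, vocab: Dict[str, int]) -> List[int]:
    """Default behaviour: a category outside the vocabulary is an error."""
    if value not in vocab:
        raise KeyError(f"unseen category: {value!r}")
    vector = [0] * len(vocab)
    vector[vocab[value]] = 1
    return vector


def encode_all_zeros(value: str, vocab: Dict[str, int]) -> List[int]:
    """Assumed fallback 1: an unseen category becomes an all-zero vector."""
    vector = [0] * len(vocab)
    index = vocab.get(value)
    if index is not None:
        vector[index] = 1
    return vector


def encode_unknown_column(value: str, vocab: Dict[str, int]) -> List[int]:
    """Assumed fallback 2: one extra trailing column flags unseen categories."""
    vector = [0] * (len(vocab) + 1)
    vector[vocab.get(value, len(vocab))] = 1
    return vector


# Hypothetical activity labels from a training event log.
vocab = fit_vocabulary(["Register", "Approve", "Register", "Pay"])

known = encode_all_zeros("Approve", vocab)
unseen_zero = encode_all_zeros("Escalate", vocab)
unseen_flag = encode_unknown_column("Escalate", vocab)
```

Both fallbacks keep the prediction service running when a new value appears. Note that with the extra column, that column is zero for every training example unless unseen values are simulated during training, so the model cannot learn anything from it directly.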

How to cite

APA:

Roider, J., Wang, W., Zanca, D., Matzner, M., & Eskofier, B. (2024). Predictions in Predictive Process Monitoring with Previously Unseen Categorical Values. In Andrea Delgado, Tijs Slaats (Eds.), Process Mining Workshops (pp. 227-239). Technical University of Denmark, DK: Springer Link.

MLA:

Roider, Johannes, et al. "Predictions in Predictive Process Monitoring with Previously Unseen Categorical Values." Proceedings of the 6th International Conference on Process Mining, Technical University of Denmark. Ed. Andrea Delgado, Tijs Slaats. Springer Link, 2024. 227-239.
