Die Auswirkung der Reihenfolge von Mess- und Simulationsdaten auf das Ergebnis der Kreuzvalidierung in KDD Prozessen

Küstner C, Breitsprecher T, Wartzack S (2013)


Publication Language: German

Publication Type: Conference contribution

Publication year: 2013

Publisher: TuTech

City/Town: Hamburg

Pages Range: 175-186

Conference Proceedings Title: Design for X - Beiträge zum 24. DfX-Symposium 2013

Event location: Jesteburg DE

ISBN: 978-3-941492-63-9

URI: https://www.mfk.uni-erlangen.de?file=pubmfk_5638b08c3af84

Abstract

Knowledge Discovery in Databases (KDD) is an emerging computer-aided methodology that enables design engineers to acquire design-relevant knowledge from complex data. Sources of this data are experiments and simulations in nearly every domain related to the product life cycle (e.g. structural mechanics, manufacturing engineering, prototyping). Among other goals, the inter-European research project EUREKA ALARM (assistance system for the development of noise-reduced rotating machines) aims at automatic knowledge acquisition from data sources such as multibody simulation results and noise measurements of modern wind turbines. The design-relevant knowledge is represented by so-called metamodels, also known as behavior or surrogate models. These models enable the design engineer to predict the noise behavior of a wind turbine from parameters such as gear geometry, gear number, etc. Design engineers therefore need to estimate the prediction quality of the models quantitatively, for example by means of performance values such as relative prediction error, root mean squared error or root mean absolute error. Using metamodels with poor prediction behavior is, of course, not the objective of the KDD methodology. The KDD literature suggests different procedures for deriving these performance values, of which k-fold cross-validation is the most common. To ensure a statistically robust performance estimate, it is advised to repeat this procedure, which has led to ten-times ten-fold cross-validation. During the research efforts of ALARM, a new issue appeared: it can be shown that the structure of the input dataset has an impact on the performance estimation. This contribution presents an approach for handling this issue.
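
The effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not specify. The data set (a noise-free quadratic response stored in ascending parameter order), the straight-line metamodel and all function names are assumptions chosen for illustration. The sketch compares a single ten-fold cross-validation that takes the folds in stored order with a ten-times ten-fold cross-validation that reshuffles the records before each repetition.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kfold_rmse(xs, ys, k, order):
    """RMSE over all held-out points; folds are contiguous slices of `order`."""
    n = len(order)
    squared_errors = []
    for fold in range(k):
        test_idx = order[fold * n // k:(fold + 1) * n // k]
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        squared_errors += [(ys[i] - (a + b * xs[i])) ** 2 for i in test_idx]
    return math.sqrt(statistics.fmean(squared_errors))


def repeated_kfold_rmse(xs, ys, k=10, repeats=10, seed=0):
    """Repeat k-fold CV with a fresh shuffle each time; returns (mean, std) of the RMSE."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        order = list(range(len(xs)))
        rng.shuffle(order)
        scores.append(kfold_rmse(xs, ys, k, order))
    return statistics.fmean(scores), statistics.pstdev(scores)


# Hypothetical data set (assumption): records stored sorted by the input parameter.
xs = [i / 10 for i in range(100)]
ys = [x ** 2 for x in xs]

# Folds taken in stored order: the first and last folds force extrapolation.
rmse_as_stored = kfold_rmse(xs, ys, 10, list(range(len(xs))))

# Ten-times ten-fold CV on shuffled records.
rmse_shuffled_mean, rmse_shuffled_std = repeated_kfold_rmse(xs, ys)

print(f"stored order:  RMSE = {rmse_as_stored:.2f}")
print(f"shuffled 10x10: RMSE = {rmse_shuffled_mean:.2f} +/- {rmse_shuffled_std:.2f}")
```

On this toy data the stored-order estimate comes out larger than the shuffled one, because the contiguous folds at both ends of the sorted range are predicted by extrapolation. Whether and how strongly real measurement or simulation data shows such an effect depends on how the records are ordered.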

How to cite

APA:

Küstner, C., Breitsprecher, T., & Wartzack, S. (2013). Die Auswirkung der Reihenfolge von Mess- und Simulationsdaten auf das Ergebnis der Kreuzvalidierung in KDD Prozessen. In D. Krause, K. Paetzold, & S. Wartzack (Eds.), Design for X - Beiträge zum 24. DfX-Symposium 2013 (pp. 175-186). Jesteburg, DE. Hamburg: TuTech.

MLA:

Küstner, Christof, Thilo Breitsprecher, and Sandro Wartzack. "Die Auswirkung der Reihenfolge von Mess- und Simulationsdaten auf das Ergebnis der Kreuzvalidierung in KDD Prozessen." Tagungsband Design for X - 24. DfX-Symposium 2013, Jesteburg. Eds. Dieter Krause, Kristin Paetzold, and Sandro Wartzack. Hamburg: TuTech, 2013. 175-186.
