Strahl S, Müller M (2024)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2024
Publisher: ISMIR
Page Range: 173-181
Conference Proceedings Title: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)
Event location: San Francisco, CA
Automatic piano transcription (APT) transforms piano recordings into symbolic note events. In recent years, APT has relied on supervised deep learning, which demands a large amount of labeled data that is often limited. This paper introduces a semi-supervised approach to APT, leveraging unlabeled data with techniques originally introduced in computer vision (CV): pseudo-labeling, consistency regularization, and distribution matching. The idea of pseudo-labeling is to use the current model for producing artificial labels for unlabeled data, and consistency regularization makes the model's predictions for unlabeled data robust to augmentations. Finally, distribution matching ensures that the pseudo-labels follow the same marginal distribution as the reference labels, adding an extra layer of robustness. Our method, tested on three piano datasets, shows improvements over purely supervised methods and performs comparably to existing semi-supervised approaches. Conceptually, this work illustrates that semi-supervised learning techniques from CV can be effectively transferred to the music domain, considerably reducing the dependence on large annotated datasets.
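To make the three techniques named in the abstract concrete, the following is a minimal NumPy sketch of how they might combine into a loss on unlabeled audio. It is an illustration under stated assumptions, not the authors' implementation: the function names, the ratio-based rescaling used for distribution matching, the weak/strong augmentation split, and the 0.5 threshold are all hypothetical choices borrowed from common CV practice. Inputs are assumed to be frame-wise pitch activation probabilities of shape (frames, pitches).

```python
import numpy as np


def distribution_match(probs, labeled_marginal, running_marginal, eps=1e-8):
    """Rescale per-pitch probabilities toward the labeled marginal.

    Assumption: a simple ratio rescaling followed by clipping, adapted
    for independent binary pitch activations. The paper may differ.
    """
    scaled = probs * (labeled_marginal / (running_marginal + eps))
    return np.clip(scaled, 0.0, 1.0)


def pseudo_labels(probs, threshold=0.5):
    """Binarize model probabilities into artificial (pseudo) labels."""
    return (probs >= threshold).astype(np.float32)


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predictions and targets."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))


def unlabeled_loss(probs_weak, probs_strong, labeled_marginal,
                   running_marginal, threshold=0.5):
    """Consistency loss on one unlabeled batch.

    probs_weak:   predictions on a weakly augmented input (assumed)
    probs_strong: predictions on a strongly augmented input (assumed)
    Pseudo-labels come from the weak view and act as fixed targets
    for the strong view, so predictions become robust to augmentation.
    """
    matched = distribution_match(probs_weak, labeled_marginal,
                                 running_marginal)
    targets = pseudo_labels(matched, threshold)
    return bce(probs_strong, targets)
```

In a training loop, `running_marginal` would typically be a moving average of the model's predictions on unlabeled data, and this loss would be added with a weight to the ordinary supervised loss on labeled data. Both choices are assumptions of this sketch.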
APA:
Strahl, S., & Müller, M. (2024). Semi-Supervised Piano Transcription Using Pseudo-Labeling Techniques. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR) (pp. 173-181). San Francisco, CA, US: ISMIR.
MLA:
Strahl, Sebastian, and Meinard Müller. "Semi-Supervised Piano Transcription Using Pseudo-Labeling Techniques." Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), San Francisco, CA: ISMIR, 2024. 173-181.