Soft Dynamic Time Warping for Multi-Pitch Estimation and Beyond

Krause M, Weiß C, Müller M (2023)


Publication Type: Conference contribution

Publication year: 2023

Publisher: Institute of Electrical and Electronics Engineers Inc.

Book Volume: 2023-June

Conference Proceedings Title: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Event location: Rhodes Island, GR

ISBN: 9781728163277

DOI: 10.1109/ICASSP49357.2023.10095907

Abstract

Many tasks in music information retrieval (MIR) involve weakly aligned data, where exact temporal correspondences are unknown. The connectionist temporal classification (CTC) loss is a standard technique to learn feature representations based on weakly aligned training data. However, CTC is limited to discrete-valued target sequences and can be difficult to extend to multi-label problems. In this article, we show how soft dynamic time warping (SoftDTW), a differentiable variant of classical DTW, can be used as an alternative to CTC. Using multi-pitch estimation as an example scenario, we show that SoftDTW yields results on par with a state-of-the-art multi-label extension of CTC. In addition to being more elegant in terms of its algorithmic formulation, SoftDTW naturally extends to real-valued target sequences.
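For readers unfamiliar with SoftDTW, the idea mentioned in the abstract can be sketched as follows: classical DTW accumulates alignment cost with a hard minimum over the three predecessor cells, and SoftDTW replaces that minimum with a smooth one controlled by a temperature gamma, which makes the final cost differentiable with respect to the inputs. The snippet below is a minimal NumPy illustration of that recursion only, not the authors' implementation. The function names (`softmin`, `soft_dtw`), the squared Euclidean local cost, and the default `gamma` are assumptions chosen for illustration.

```python
import numpy as np


def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    z = -np.asarray(values, dtype=float) / gamma
    z_max = np.max(z)
    return -gamma * (z_max + np.log(np.sum(np.exp(z - z_max))))


def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW cost between sequences x (n, d) and y (m, d).

    Assumption: squared Euclidean distance as the local cost. Both
    sequences may be real-valued and multi-dimensional.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)

    # Pairwise local cost matrix of shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    # Accumulated cost with an extra border row and column set to infinity.
    r = np.full((n + 1, m + 1), np.inf)
    r[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            predecessors = [r[i - 1, j - 1], r[i - 1, j], r[i, j - 1]]
            r[i, j] = cost[i - 1, j - 1] + softmin(predecessors, gamma)
    return r[n, m]


# Usage: a short target sequence and a time-stretched version of it.
target = np.array([[0.0], [1.0], [2.0]])
stretched = np.array([[0.0], [0.0], [1.0], [2.0], [2.0]])
print(soft_dtw(target, stretched, gamma=0.001))
print(soft_dtw(target, stretched + 5.0, gamma=0.001))
```

As gamma approaches zero the smooth minimum approaches the hard minimum, so the value approaches the classical DTW cost; larger gamma gives a smoother objective. Because nothing in the recursion requires discrete symbols, the same code accepts real-valued target sequences, which is the property the abstract highlights.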

How to cite

APA:

Krause, M., Weiß, C., & Müller, M. (2023). Soft Dynamic Time Warping for Multi-Pitch Estimation and Beyond. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island, GR: Institute of Electrical and Electronics Engineers Inc.

MLA:

Krause, Michael, Christof Weiß, and Meinard Müller. "Soft Dynamic Time Warping for Multi-Pitch Estimation and Beyond." Proceedings of the 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023, Rhodes Island, Institute of Electrical and Electronics Engineers Inc., 2023.