Tamer NC, Özer Y, Müller M, Serra X (2023)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2023
Publisher: ISMIR
Pages Range: 223-230
Conference Proceedings Title: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)
A descriptive transcription of a violin performance requires detecting not only the notes but also the fine-grained pitch variations, such as vibrato. Most existing deep learning methods for music transcription do not capture these variations and often need frame-level annotations, which are scarce for the violin. In this paper, we propose a novel method for high-resolution violin transcription that can leverage piece-level weak labels for training. Our conformer-based model works on the raw audio waveform and transcribes violin notes and their corresponding pitch deviations with 5.8 ms frame resolution and 10-cent frequency resolution. We demonstrate that our method (1) outperforms generic systems in the proxy tasks of violin transcription and pitch estimation, and (2) can automatically generate new training labels by aligning its feature representations with unseen scores. We share our model along with a dataset of 34 hours of score-aligned solo violin performances, notably including the 24 Paganini Caprices.
APA:
Tamer, N.C., Özer, Y., Müller, M., & Serra, X. (2023). High-Resolution Violin Transcription Using Weak Labels. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR) (pp. 223-230). Milan, IT: ISMIR.
MLA:
Tamer, Nazif Can, et al. "High-Resolution Violin Transcription Using Weak Labels." Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), Milan: ISMIR, 2023. 223-230.