Deep Representation Learning for Orca Call Type Classification

Bergler C, Schmitt M, Cheng RX, Schröter H, Maier A, Barth V, Weber M, Nöth E (2019)

Publication Type: Conference contribution

Publication year: 2019

Journal

Lecture Notes in Computer Science

Publisher: Springer Verlag

Book Volume: 11697 LNAI

Page Range: 274-286

Conference Proceedings Title: Text, Speech, and Dialogue, 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11–13, 2019, Proceedings

Event location: Ljubljana

ISBN: 9783030279462

DOI: 10.1007/978-3-030-27947-9_23

Abstract

Marine mammals produce a wide variety of vocalizations. There is a growing need for robust automatic classification methods, especially in noisy underwater environments, in order to access large amounts of bioacoustic signals and to replace tedious and error-prone human perceptual classification. In the case of the northern resident killer whale (Orcinus orca), the vocal repertoire consists of echolocation clicks, whistles, and pulsed calls, of which pulsed calls are the most intensively studied. In this study we propose a hybrid call type classification approach that outperforms our previous work on supervised call type classification. It consists of two components: (1) deep representation learning of killer whale sounds, investigating various autoencoder architectures and data corpora, and (2) subsequent supervised training of a ResNet18 call type classifier on a much smaller dataset, using the pre-trained representations. The best semi-supervised classification model achieved a test accuracy of 96% and a mean test accuracy of 94%, outperforming our previous work by 7 percentage points.
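The reported gain over previous work is stated in percentage points, which is an absolute difference between two accuracies and not a relative improvement. The short Python sketch below illustrates the distinction. It assumes, purely for illustration, that the gain is measured against the mean test accuracy of 94%; the resulting earlier accuracy is inferred from that assumption and is not stated in the abstract. The function names are hypothetical helpers.

```python
def percentage_point_gain(new_pct: float, old_pct: float) -> float:
    """Absolute difference between two accuracies given in percent."""
    return new_pct - old_pct


def relative_gain_pct(new_pct: float, old_pct: float) -> float:
    """Relative improvement of new_pct over old_pct, in percent."""
    return (new_pct - old_pct) / old_pct * 100.0


mean_accuracy = 94.0  # mean test accuracy, from the abstract
gain_points = 7.0     # reported gain in percentage points, from the abstract

# ASSUMPTION: the gain refers to the mean accuracy. The abstract does not
# state the earlier accuracy, so this value is inferred, not reported.
implied_previous = mean_accuracy - gain_points

absolute = percentage_point_gain(mean_accuracy, implied_previous)
relative = relative_gain_pct(mean_accuracy, implied_previous)

print(f"absolute gain: {absolute:.1f} percentage points")
print(f"relative gain: {relative:.2f} percent")
```

Under this assumption the relative improvement is slightly larger than the absolute one, which is why the two figures should not be used interchangeably.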

Authors with CRIS profile

Christian Bergler, Lehrstuhl für Informatik 5 (Mustererkennung)

Hendrik Schröter, Lehrstuhl für Informatik 5 (Mustererkennung)

Andreas Maier, Lehrstuhl für Informatik 5 (Mustererkennung)

Elmar Nöth, Professur für Informatik (Mustererkennung)

Related research project(s)

Deep Learning Applied to Animal Linguistics (DeepAL) April 1, 2018 - April 1, 2022

Involved external institutions

Anthro-Media Documentary and iTV Production, Germany (DE)

Leibniz-Institut für Zoo- und Wildtierforschung (IZW), Germany (DE)

How to cite

APA:

Bergler, C., Schmitt, M., Cheng, R.X., Schröter, H., Maier, A., Barth, V., ... Nöth, E. (2019). Deep Representation Learning for Orca Call Type Classification. In Kamil Ekštein (Ed.), Text, Speech, and Dialogue, 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11–13, 2019, Proceedings (pp. 274-286). Ljubljana, SI: Springer Verlag.

MLA:

Bergler, Christian, et al. "Deep Representation Learning for Orca Call Type Classification." Proceedings of the 22nd International Conference on Text, Speech, and Dialogue, TSD 2019, Ljubljana. Ed. Kamil Ekštein, Springer Verlag, 2019. 274-286.
