Bergler C, Schmitt M, Cheng RX, Maier A, Barth V, Nöth E (2019)
Publication Language: English
Publication Type: Conference contribution, Original article
Publication year: 2019
Publisher: International Speech Communication Association
Page Range: 3357-3361
Conference Proceedings Title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
DOI: 10.21437/Interspeech.2019-1857
Call type classification is an important instrument in bioacoustic research for investigating group-specific vocal repertoires, behavioral patterns, and cultures of different animal groups. Robust machine-based techniques are needed to replace human classification because they handle large datasets, deliver consistent results, remove perception-based classification, and minimize human error. The current work is the first to adopt a two-stage, fully unsupervised approach on previously machine-segmented orca data to identify orca sound types, using deep learning together with one of the largest bioacoustic datasets, the Orchive. The proposed method comprises (1) unsupervised feature learning using an undercomplete ResNet18-autoencoder trained on machine-annotated data, and (2) spectral clustering of the compressed orca feature representations. An existing human-labeled orca dataset of 514 signals distributed over 12 classes was clustered. This two-stage, fully unsupervised approach is an initial study to (1) examine machine-generated clusters against human-identified orca call type classes, (2) compare supervised call type classification with unsupervised call type clustering, and (3) verify the general feasibility of a completely unsupervised approach based on machine-labeled orca data, which would represent major progress in the research field of animal linguistics by enabling a deeper understanding and opening up new insights and opportunities.
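Aim (1) of the abstract, examining machine-generated clusters against human-identified call type classes, can be illustrated with a small sketch. The abstract does not state which agreement measure the authors used, so cluster purity is an assumption chosen here for simplicity. The function names `contingency` and `cluster_purity` and the toy labels "A", "B", "C" are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def contingency(cluster_ids, human_labels):
    """Count how often each human label occurs inside each machine cluster."""
    if len(cluster_ids) != len(human_labels):
        raise ValueError("cluster_ids and human_labels must have equal length")
    table = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, human_labels):
        table[cluster][label] += 1
    return table


def cluster_purity(cluster_ids, human_labels):
    """Share of signals that carry the majority human label of their cluster.

    Purity is an assumed measure for this sketch, not necessarily the one
    used in the paper.
    """
    if not cluster_ids:
        raise ValueError("at least one signal is required")
    table = contingency(cluster_ids, human_labels)
    # For every cluster, count the signals that agree with its most common label.
    majority_total = sum(counts.most_common(1)[0][1] for counts in table.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: 8 signals, 3 machine clusters, 3 human call types.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["A", "A", "B", "B", "B", "C", "C", "A"]

print(cluster_purity(clusters, labels))  # 0.75
```

A purity of 1.0 would mean that every machine cluster contains signals of a single human-assigned class. Purity rises trivially as the number of clusters grows, so it is best read next to the per-cluster counts returned by `contingency`.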
APA:
Bergler, C., Schmitt, M., Cheng, R.X., Maier, A., Barth, V., & Nöth, E. (2019). Deep Learning for Orca Call Type Identification – A Fully Unsupervised Approach. In Gernot Kubin, Thomas Hain, Björn Schuller, Dina El Zarka, Petra Hödl (Eds.), Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 3357-3361). Graz, AT: International Speech Communication Association.
MLA:
Bergler, Christian, et al. "Deep Learning for Orca Call Type Identification – A Fully Unsupervised Approach." Proceedings of the 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, Graz. Ed. Gernot Kubin, Thomas Hain, Björn Schuller, Dina El Zarka, Petra Hödl. International Speech Communication Association, 2019. 3357-3361.