Kamp M, Kreutzer P, Philippsen M (2019)
Publication Language: English
Publication Type: Conference contribution, Original article
Publication year: 2019
Publisher: IEEE Press
City/Town: Piscataway, NJ, USA
Pages Range: 529 - 533
Conference Proceedings Title: Proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019)
Event location: Montréal, QC, Canada
ISBN: 978-1-7281-3412-3
URI: https://i2git.cs.fau.de/i2public/publications/-/raw/master/MSR19.pdf
In the past, techniques for detecting similarly behaving code fragments were often evaluated only with small, artificial oracles or with code originating from programming competitions. Such code fragments differ considerably from production code.
To enable more realistic evaluations, this paper presents SeSaMe, a data set of method pairs that are classified according to their semantic similarity. We applied text similarity measures to JavaDoc comments mined from 11 open-source repositories and manually classified a selection of 857 pairs.
APA:
Kamp, M., Kreutzer, P., & Philippsen, M. (2019). SeSaMe: A Data Set of Semantically Similar Java Methods. In Proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019) (pp. 529 - 533). Montréal, QC, Canada. Piscataway, NJ, USA: IEEE Press.
MLA:
Kamp, Marius, Patrick Kreutzer, and Michael Philippsen. "SeSaMe: A Data Set of Semantically Similar Java Methods." Proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019), Montréal, QC, Canada. Piscataway, NJ, USA: IEEE Press, 2019. 529 - 533.