Driving context into text-to-text privatization

Arnold S, Yesilbas D, Weinzierl S (2023)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2023

Pages Range: 15-25

Conference Proceedings Title: Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing

Event location: Toronto CA

URI: https://aclanthology.org/2023.trustnlp-1.2

DOI: 10.18653/v1/2023.trustnlp-1.2

Open Access Link: https://aclanthology.org/2023.trustnlp-1.2

Abstract

Metric Differential Privacy enables text-to-text privatization by adding calibrated noise to the vector of a word derived from an embedding space and projecting this noisy vector back to a discrete vocabulary using a nearest neighbor search. Since words are substituted without context, this mechanism is expected to fall short at finding substitutes for words with ambiguous meanings, such as bank. To account for these ambiguous words, we leverage a sense embedding and incorporate a sense disambiguation step prior to noise injection. We encompass our modification to the privatization mechanism with an estimation of privacy and utility. For word sense disambiguation on the Words in Context dataset, we demonstrate a substantial increase in classification accuracy by 6.05%.
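The mechanism summarized in the abstract (disambiguate the word, add calibrated noise to its sense vector, project back by nearest neighbor search) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional vectors, the sense labels such as `bank%finance`, the centroid-based disambiguation rule, and the function names `disambiguate`, `sample_noise`, and `privatize` are all assumptions made for illustration. The noise is drawn with density proportional to exp(-epsilon * ||z||), a construction commonly used for metric differential privacy on embeddings, and is likewise assumed here rather than taken from the paper.

```python
import math
import random

# Toy 2-d sense embedding. All vectors and sense labels are hypothetical.
SENSE_VECTORS = {
    "bank%finance": (0.9, 0.1),
    "bank%river": (0.1, 0.9),
    "money%1": (1.0, 0.0),
    "loan%1": (0.8, 0.2),
    "water%1": (0.0, 1.0),
    "shore%1": (0.2, 0.8),
}


def lemma(sense_key):
    """Return the word part of a 'word%sense' key."""
    return sense_key.split("%")[0]


def disambiguate(word, context, vectors):
    """Pick the sense of `word` closest to the mean vector of its context words.

    This centroid rule is a simple stand-in for a real disambiguation step.
    """
    senses = [k for k in vectors if lemma(k) == word]
    if not senses:
        raise KeyError(word)
    ctx = [vectors[k] for k in vectors
           if lemma(k) in context and lemma(k) != word]
    if not ctx:
        return senses[0]  # no usable context: fall back to the first sense
    dim = len(ctx[0])
    centroid = tuple(sum(v[i] for v in ctx) / len(ctx) for i in range(dim))
    return min(senses, key=lambda s: math.dist(vectors[s], centroid))


def sample_noise(dim, epsilon, rng):
    """Sample z with density proportional to exp(-epsilon * ||z||).

    Direction: uniform on the unit sphere (normalized Gaussian vector).
    Magnitude: Gamma with shape `dim` and scale 1 / epsilon.
    """
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    radius = rng.gammavariate(dim, 1.0 / epsilon)
    return tuple(radius * x / norm for x in direction)


def privatize(word, context, vectors, epsilon, rng):
    """Disambiguate, perturb the sense vector, and return the nearest sense key."""
    sense = disambiguate(word, context, vectors)
    vec = vectors[sense]
    noise = sample_noise(len(vec), epsilon, rng)
    noisy = tuple(a + b for a, b in zip(vec, noise))
    return min(vectors, key=lambda k: math.dist(vectors[k], noisy))


if __name__ == "__main__":
    rng = random.Random(0)
    for eps in (1.0, 10.0, 100.0):
        print(eps, privatize("bank", ["money", "loan"], SENSE_VECTORS, eps, rng))
```

In this sketch a smaller `epsilon` yields larger noise, so the substitute is more likely to drift away from the disambiguated sense; a very large `epsilon` returns the disambiguated sense itself almost surely.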

How to cite

APA:

Arnold, S., Yesilbas, D., & Weinzierl, S. (2023). Driving context into text-to-text privatization. In Association for Computational Linguistics (Eds.), Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (pp. 15-25). Toronto, CA.

MLA:

Arnold, Stefan, Dilara Yesilbas, and Sven Weinzierl. "Driving context into text-to-text privatization." Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing, Toronto. Association for Computational Linguistics, 2023. 15-25.
