Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation

Nieth B, Altstidl TR, Schwinn L, Eskofier B (2024)


Publication Language: English

Publication Type: Conference Contribution

Publication year: 2024

Event location: Vienna, AT

DOI: 10.48550/arXiv.2406.13283

Open Access Link: https://arxiv.org/abs/2406.13283

Abstract

The vulnerability of deep learning models to small, imperceptible attacks limits their adoption in real-world systems. Adversarial training has proven to be one of the most promising strategies against these attacks, at the expense of a substantial increase in training time. With the ongoing trend of integrating large-scale synthetic data, this cost is only expected to grow further. Thus, there is a need for data-centric approaches that reduce the number of training samples while maintaining accuracy and robustness. While data pruning and active learning are prominent research topics in deep learning, they remain largely unexplored in the adversarial training literature. We address this gap and propose a new data pruning strategy based on extrapolating data importance scores from a small set of data to a larger set. In an empirical evaluation, we demonstrate that extrapolation-based pruning can efficiently reduce dataset size while maintaining robustness.
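The core idea of extrapolation-based pruning, as described in the abstract, can be sketched roughly as follows. Note that everything below is an illustrative assumption rather than the authors' actual method: expensive importance scores are computed only for a small subset, a simple model (here, a least-squares linear fit on hypothetical per-sample features) extrapolates those scores to the full dataset, and the lowest-scoring samples are pruned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: cheap per-sample features for the full dataset
# (e.g., embedding statistics); sizes and names are illustrative only.
n_full, n_small, n_feat = 1000, 100, 8
features = rng.normal(size=(n_full, n_feat))

# Importance scores are assumed to be available only for a small subset,
# where they were computed with some expensive procedure (in the paper's
# setting, scores derived from adversarial training on the small set).
true_w = rng.normal(size=n_feat)
small_idx = rng.choice(n_full, size=n_small, replace=False)
small_scores = features[small_idx] @ true_w + 0.1 * rng.normal(size=n_small)

# Extrapolate: fit a linear model on the scored subset, then predict an
# importance score for every sample in the large dataset.
w_fit, *_ = np.linalg.lstsq(features[small_idx], small_scores, rcond=None)
predicted_scores = features @ w_fit

# Prune: keep only the top fraction of samples by predicted importance.
keep_fraction = 0.7
n_keep = int(keep_fraction * n_full)
keep_idx = np.argsort(predicted_scores)[-n_keep:]
print(len(keep_idx))  # number of retained training samples
```

The expensive scoring step thus runs on 100 samples instead of 1000, while the cheap extrapolation step selects which of the remaining samples to keep for adversarial training.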


How to cite

APA:

Nieth, B., Altstidl, T.R., Schwinn, L., & Eskofier, B. (2024). Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation. In Proceedings of the International Conference on Machine Learning 2024 DMLR Workshop: Data-centric Machine Learning Research. Wien, AT.

MLA:

Nieth, Björn, et al. "Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation." Proceedings of the International Conference on Machine Learning 2024 DMLR Workshop: Data-centric Machine Learning Research, Wien 2024.
