Reliable Machine Learning Methods in Image Forensics

Lorch B (2023)


Publication Type: Thesis

Publication year: 2023

URI: https://nbn-resolving.org/urn:nbn:de:bvb:29-opus4-217322

Abstract

Criminal investigations often involve images that can provide important clues or serve as evidence in court. To validate the authenticity of an image and identify its source, a broad range of image forensic tools has been developed. In recent years, the most powerful of these tools have been based on machine learning. However, concerns about the reliability, security, and opacity of machine learning raise the question of whether such tools can be used in criminal investigations. This thesis explores the practical applicability of machine learning in image forensics, mainly from a technical perspective but also from a legal one. From a technical perspective, a major challenge for machine learning tools is their sensitivity to training-test mismatches. To mitigate this issue, we propose the use of Bayesian detectors. While traditional detectors faced with unfamiliar inputs tend to fail silently, Bayesian detectors communicate the uncertainty in their prediction to the forensic analyst. Based on this predictive uncertainty, the forensic analyst can quantify how much to trust the prediction. We demonstrate the benefits of Bayesian detectors using three popular forensic tasks. First, we study Bayesian linear regression for estimating the scale factor from rescaled images. Bayesian linear regression achieves comparable performance to other methods, and the predictive uncertainty additionally enables detecting images with unknown scale factors and unseen post-processing. Second, we study Bayesian logistic regression for the task of detecting JPEG double compression. The detector achieves high classification accuracy for known JPEG settings. Simultaneously, the predictive uncertainty exposes several pathological failure cases, including training-test mismatches in quantization tables and in the JPEG encoder implementation. Third, we study Gaussian processes for the task of camera model identification.
Many previous methods assume that an image under analysis originates from a given set of known camera models, but in practice a photo can also come from an unknown camera model. To avoid misclassifications, Gaussian processes provide a rejection mechanism through their probabilistic predictions. We demonstrate that a Gaussian process classifier achieves high classification accuracy for known cameras and provides reliable uncertainty estimates for unknown cameras. Another issue from a technical perspective is the vulnerability of machine learning to adversarial attacks. To harden forensic detectors against evasion attacks, previous work proposed combining multiple classifiers into the one-and-a-half-class (1.5C) classifier. To study its security in the white-box scenario, we propose a novel attack that requires only minimal image distortion. Our security analysis reveals three subtle pitfalls that undermine the security of the 1.5C classifier. We demonstrate that replacing the final component of the 1.5C classifier overcomes these pitfalls and achieves greater robustness. From a legal perspective, the recently proposed Artificial Intelligence Act classifies the use of machine learning in law enforcement as high risk. Machine learning tools for high-risk applications are permitted but must meet mandatory requirements. We summarize these requirements and discuss their alignment with recent research on the two forensic applications of license plate recognition and deep fake detection. Our discussion highlights the key challenges and directions for future work towards legal compliance with the Artificial Intelligence Act.
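
Illustrative note: the idea of a detector that reports its own uncertainty can be sketched with textbook Bayesian linear regression. The snippet below is not the thesis's implementation. It assumes a single scalar feature, a zero-mean Gaussian prior with precision `alpha`, and Gaussian noise with precision `beta`. The function names, the toy data, and both parameter values are hypothetical and chosen only for illustration. The point it shows is that the predictive variance grows for inputs far from the training data, which is the signal an analyst could use to flag an unfamiliar input.

```python
def fit_bayesian_linear_regression(xs, ys, alpha=1.0, beta=25.0):
    """Posterior over (bias, slope) for y = w0 + w1 * x + noise.

    alpha: prior precision, beta: noise precision (hypothetical values).
    Returns the posterior mean (m0, m1) and the posterior covariance
    stored as its unique entries (s11, s12, s22).
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Posterior precision A = alpha * I + beta * Phi^T Phi, with Phi = [1, x].
    a11 = alpha + beta * n
    a12 = beta * sx
    a22 = alpha + beta * sxx

    # Posterior covariance S = A^{-1} (closed-form 2x2 inverse).
    det = a11 * a22 - a12 * a12
    s11, s12, s22 = a22 / det, -a12 / det, a11 / det

    # Posterior mean m = beta * S * Phi^T y.
    m0 = beta * (s11 * sy + s12 * sxy)
    m1 = beta * (s12 * sy + s22 * sxy)
    return (m0, m1), (s11, s12, s22)


def predict(x, mean, cov, beta=25.0):
    """Predictive mean and variance at input x."""
    m0, m1 = mean
    s11, s12, s22 = cov
    pred_mean = m0 + m1 * x
    # Noise variance plus phi^T S phi for phi = [1, x].
    pred_var = 1.0 / beta + s11 + 2.0 * s12 * x + s22 * x * x
    return pred_mean, pred_var


# Toy training data on [0, 1] (hypothetical, noise-free for simplicity).
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [0.5 + 2.0 * x for x in train_x]
mean, cov = fit_bayesian_linear_regression(train_x, train_y)

near_mean, near_var = predict(0.5, mean, cov)  # inside the training range
far_mean, far_var = predict(5.0, mean, cov)    # far outside the training range
```

Comparing `near_var` with `far_var` shows the larger predictive variance for the out-of-range input. A threshold on that variance would be one simple way to decide when to defer to the analyst, although the threshold itself would have to be calibrated on real data.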


How to cite

APA:

Lorch, B. (2023). Reliable Machine Learning Methods in Image Forensics (Dissertation).

MLA:

Lorch, Benedikt. Reliable Machine Learning Methods in Image Forensics. Dissertation, 2023.
