Cleaning Up Confounding: Accounting for Endogeneity Using Instrumental Variables and Two-Stage Models

Graf-Vlachy L, Wagner S (2024)


Publication Type: Journal article

Publication year: 2024

Journal: ACM Transactions on Software Engineering and Methodology

Book Volume: 33

Article Number: ART199

Journal Issue: 8

DOI: 10.1145/3674730

Abstract

Studies in empirical software engineering are often most useful if they make causal claims because this allows practitioners to identify how they can purposefully influence (rather than only predict) outcomes of interest. Unfortunately, many non-experimental studies suffer from potential endogeneity, for example, through omitted confounding variables, which precludes claims of causality. In this conceptual tutorial, we aim to transfer the proven solution of instrumental variables and two-stage models as a means to account for endogeneity from econometrics to the field of empirical software engineering. To this end, we discuss causality and causal inference, provide a definition of endogeneity, explain its causes, and lay out the conceptual idea behind instrumental variable approaches and two-stage models. We also provide an extensive illustration with simulated data and a brief illustration with real data to demonstrate the approach, offering Stata and R code to allow researchers to replicate our analyses and apply the techniques to their own research projects. We close with concrete recommendations and a guide for researchers on how to deal with endogeneity.
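The two-stage idea summarized in the abstract can be sketched on simulated data. The following is a minimal illustration, not the authors' Stata or R code: the variable names (`u`, `z`, `x`, `y`), the coefficients, and the sample size are all assumptions chosen for this example. An unobserved confounder `u` biases the naive regression of `y` on `x`, while an instrument `z` that affects `y` only through `x` lets a two-stage least squares (2SLS) estimate recover a value close to the assumed true effect of 2.0.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate data with an omitted confounder (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                       # unobserved confounder
    z = rng.normal(size=n)                       # instrument: drives x, not y directly
    x = 0.8 * z + u + rng.normal(size=n)         # endogenous regressor
    y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # assumed true effect of x is 2.0
    return z, x, y


def add_const(v):
    """Prepend an intercept column to a single regressor."""
    return np.column_stack([np.ones(len(v)), v])


def ols(X, y):
    """Ordinary least squares coefficients via a least-squares solve."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


z, x, y = simulate()

# Naive OLS of y on x: biased because u is omitted and correlated with x.
beta_ols = ols(add_const(x), y)[1]

# Stage 1: regress the endogenous regressor on the instrument.
x_hat = add_const(z) @ ols(add_const(z), x)

# Stage 2: regress the outcome on the fitted values from stage 1.
beta_2sls = ols(add_const(x_hat), y)[1]

print(f"naive OLS estimate: {beta_ols:.2f}")
print(f"2SLS estimate:      {beta_2sls:.2f}")
```

Running the two stages by hand like this yields a consistent point estimate, but the standard errors from the second-stage regression are not valid; a dedicated instrumental variable routine should be used for inference.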

How to cite

APA:

Graf-Vlachy, L., & Wagner, S. (2024). Cleaning Up Confounding: Accounting for Endogeneity Using Instrumental Variables and Two-Stage Models. ACM Transactions on Software Engineering and Methodology, 33(8). https://doi.org/10.1145/3674730

MLA:

Graf-Vlachy, Lorenz, and Stefan Wagner. "Cleaning Up Confounding: Accounting for Endogeneity Using Instrumental Variables and Two-Stage Models." ACM Transactions on Software Engineering and Methodology 33.8 (2024).