Bridging Structural Causal Inference and Machine Learning: The S-DIDML Estimator for Heterogeneous Treatment Effects
DOI:
https://doi.org/10.62177/apemr.v2i5.609
Keywords:
S-DIDML, Methodology, Causal Inference, Difference-in-Differences, Double Machine Learning, Semiparametric Methods
Abstract
In response to the increasing complexity of policy environments and the proliferation of high-dimensional data, this paper introduces the S-DIDML estimator, a structurally grounded and semiparametrically flexible framework for causal inference. By embedding Difference-in-Differences (DID) logic within a Double Machine Learning (DML) architecture, S-DIDML combines the strengths of temporal identification, machine learning-based nuisance adjustment, and orthogonalized estimation. We begin by identifying critical limitations of existing methods: the lack of structural interpretability in ML models, the instability of classical DID under high-dimensional confounding, and the temporal rigidity of standard DML frameworks. Building on recent advances in staggered adoption designs and Neyman orthogonalization, S-DIDML offers a five-step estimation pipeline that enables robust estimation of heterogeneous treatment effects (HTEs) while maintaining interpretability and scalability. Illustrative applications are discussed in labor economics, education, taxation, and environmental policy. The proposed framework contributes to the methodological frontier by offering a blueprint for policy-relevant, structurally interpretable, and statistically valid causal analysis in complex data settings.
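The general idea of placing DID logic inside a DML architecture can be illustrated with a minimal sketch. This is not the paper's five-step pipeline, whose steps are not listed on this page. It assumes a two-period panel, a binary treatment, a constant effect in a partially linear model for the differenced outcome, and a small ridge learner standing in for the machine-learning step. The function names (`ridge_fit_predict`, `did_dml`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Ridge regression on [1, X, X^2]; a stand-in for any ML nuisance learner."""
    def feats(X):
        return np.column_stack([np.ones(len(X)), X, X ** 2])

    A, B = feats(X_train), feats(X_test)
    coef = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y_train)
    return B @ coef


def did_dml(y_pre, y_post, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate on the first-differenced outcome."""
    rng = np.random.default_rng(seed)
    dy = y_post - y_pre                          # DID step: differencing removes unit fixed effects
    folds = rng.permutation(len(dy)) % n_folds   # random fold labels of near-equal size
    res_y = np.empty_like(dy)
    res_d = np.empty_like(dy)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        # Nuisance functions E[dY | X] and E[D | X] are fit on the other folds only.
        res_y[test] = dy[test] - ridge_fit_predict(X[train], dy[train], X[test])
        res_d[test] = d[test] - ridge_fit_predict(X[train], d[train], X[test])
    # Residual-on-residual regression: the Neyman-orthogonal score of the
    # partially linear model dY = theta * D + g(X) + error.
    theta = (res_d @ res_y) / (res_d @ res_d)
    psi = res_d * (res_y - theta * res_d)
    se = np.sqrt(np.mean(psi ** 2) / len(dy)) / np.mean(res_d ** 2)
    return theta, se


# Simulated two-period panel (hypothetical data-generating process).
rng = np.random.default_rng(42)
n = 4000
X = rng.normal(size=(n, 3))
propensity = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.25 * X[:, 1])))
d = rng.binomial(1, propensity).astype(float)
unit_effect = rng.normal(size=n) + X[:, 0]       # fixed effect correlated with covariates
trend = 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2       # trend depends on X: conditional parallel trends
true_effect = 2.0
y_pre = unit_effect + rng.normal(size=n)
y_post = unit_effect + 1.0 + trend + true_effect * d + rng.normal(size=n)

theta, se = did_dml(y_pre, y_post, d, X)
print(f"estimate = {theta:.3f}, standard error = {se:.3f}")
```

In this simulation the covariate-dependent trend would bias a raw comparison of differenced outcomes between treated and untreated units, while the cross-fitted residualization adjusts for it. Capturing heterogeneous effects or staggered adoption, as the abstract describes, would require more structure than this sketch contains.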
License
Copyright (c) 2025 Yile Yu; Anzhi Xu

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
DATE
Accepted: 2025-09-18
Published: 2025-09-26