Publications:
csa2sls: A Complete Subset Approach for Many Instruments using Stata (with Seojeong Lee, Siha Lee and Youngki Shin).
We develop a Stata command, csa2sls, that implements the complete subset averaging two-stage least squares (CSA2SLS) estimator in Lee and Shin (2021). The CSA2SLS estimator is an alternative to the two-stage least squares estimator that remedies the bias issue caused by many correlated instruments. We conduct Monte Carlo simulations and confirm that the CSA2SLS estimator reduces both the mean squared error and the estimation bias substantially when instruments are correlated. We illustrate the usage of csa2sls in Stata with an empirical application. [arXiv][Journal (open access)] December 2023, The Stata Journal
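The averaging idea behind CSA2SLS can be sketched in a few lines. This is an illustration in Python, not the csa2sls Stata command and not its syntax. It assumes the estimator has the form (X'PX)^{-1} X'Py, where P is the equal-weighted average of the projection matrices onto every size-k subset of the instruments, and it assumes there are no exogenous controls. The function name csa2sls_sketch and its arguments are hypothetical.

```python
from itertools import combinations

import numpy as np


def csa2sls_sketch(y, X, Z, k):
    """Illustrative complete subset averaging 2SLS (assumed form, no controls).

    y: (n,) outcome, X: (n, p) endogenous regressors,
    Z: (n, K) instruments, k: subset size with 1 <= k <= K.
    """
    n, K = Z.shape
    X_hat = np.zeros(X.shape, dtype=float)
    n_subsets = 0
    # Average the first-stage fitted values over every size-k instrument subset.
    for cols in combinations(range(K), k):
        Zs = Z[:, list(cols)]
        coef, *_ = np.linalg.lstsq(Zs, X, rcond=None)
        X_hat += Zs @ coef
        n_subsets += 1
    X_hat /= n_subsets
    # X_hat equals P X for the averaged projection P, so this solves
    # (X' P X) beta = X' P y.
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)


# Simulated example with one endogenous regressor and four instruments.
rng = np.random.default_rng(0)
n = 200
Z = rng.normal(size=(n, 4))
u = rng.normal(size=n)
x = Z @ np.array([1.0, 0.5, 0.5, 0.2]) + 0.5 * u + rng.normal(size=n)
y = 2.0 * x + u
X = x[:, None]

beta_k2 = csa2sls_sketch(y, X, Z, k=2)    # averages over 6 subsets
beta_full = csa2sls_sketch(y, X, Z, k=4)  # single subset: ordinary 2SLS
```

With k equal to the total number of instruments there is only one subset, so the sketch reduces to ordinary 2SLS. Enumerating all subsets grows combinatorially in the number of instruments, so this direct loop is only practical for small problems.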
Working Papers:
A Nonparametric Test of Heterogeneous Treatment Effects Under Interference.
Statistical inference for heterogeneous treatment effects (HTEs) across predefined subgroups is challenging when units interact because treatment effects may vary by pre-treatment variables, post-treatment exposure variables (that measure the exposure to other units’ treatment statuses), or both. Moreover, treatment effects may be direct or indirect. In this paper, I develop procedures to infer HTEs and disentangle the drivers of the different forms of treatment effect heterogeneity in clustered populations. Specifically, I incorporate clustered interference into the potential outcomes framework and propose nonparametric tests for the null hypotheses of (i) no HTEs by treatment assignment (or post-treatment exposure variables) for all pre-treatment variable values and (ii) no HTEs by pre-treatment variables for all treatment assignment vectors. I derive the asymptotic properties of the proposed tests and illustrate their use on data from an experiment that evaluates the effect of information on weather insurance purchases.
[arXiv][Summary], Revise & Resubmit, Journal of Business & Economic Statistics.
Statistical Treatment Rules under Social Interaction (with Seungjin Han and Youngki Shin).
In this paper, we study treatment assignment rules in the presence of social interaction. We construct an analytical framework under the anonymous interaction assumption, where the decision problem becomes choosing a treatment fraction. We propose a multinomial empirical success (MES) rule that includes the empirical success rule of Manski (2004) as a special case. We investigate the non-asymptotic bounds of the expected utility based on the MES rule. Finally, we prove that the MES rule achieves asymptotic optimality with the minimax regret criterion. [arXiv][Replication Code], Reject & Resubmit, Journal of Econometrics. New draft coming soon!
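For intuition, the empirical success idea of Manski (2004) picks the option with the highest sample-average outcome. The sketch below applies that idea to a finite grid of candidate treatment fractions. It is a hypothetical illustration only: the function name, the data layout, and the grid of fractions are assumptions, and the MES rule in the paper may differ in its details.

```python
from collections import defaultdict


def empirical_success(observations):
    """Return the option with the highest sample mean, plus all sample means.

    observations: iterable of (option, outcome) pairs, where an option is a
    candidate treatment fraction (hypothetical data layout).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for option, outcome in observations:
        totals[option] += outcome
        counts[option] += 1
    means = {option: totals[option] / counts[option] for option in counts}
    best = max(means, key=means.get)
    return best, means


# Made-up outcomes observed under three candidate treatment fractions.
data = [(0.0, 1.0), (0.0, 2.0), (0.5, 3.0), (0.5, 4.0), (1.0, 2.0)]
best_fraction, sample_means = empirical_success(data)
```

With only two options (treat everyone or treat no one), this reduces to a comparison of two sample means.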
Randomization Inference of Heterogeneous Treatment Effects Under Network Interference.
We develop randomization-based tests for heterogeneous treatment effects in the presence of network interference. Leveraging the exposure mapping framework, we study a broad class of null hypotheses that represent various forms of constant treatment effects in networked populations. These null hypotheses, unlike the classical Fisher sharp null, are not sharp due to unknown parameters and multiple potential outcomes. Existing conditional randomization procedures either fail to control size or suffer from low statistical power in this setting. We propose testing procedures that construct a data-dependent focal assignment set and permit variation in focal units across focal assignments. These features complicate both estimation and inference, necessitating new technical developments. We establish the asymptotic validity of the proposed procedures under general conditions on the test statistic. The procedures are applied to experimental network data and evaluated via Monte Carlo simulations. [arXiv][Slides], Under revision.
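As a point of contrast, the classical Fisher randomization test is simple because the sharp null of no effect for any unit fixes every potential outcome, so the test statistic can be recomputed under any re-drawn assignment. The sketch below shows only that classical baseline, assuming no interference and complete randomization with a fixed number of treated units. It is not the procedure proposed in the paper, and the function name and arguments are hypothetical.

```python
import numpy as np


def fisher_randomization_pvalue(y, d, n_draws=2000, seed=0):
    """Monte Carlo p-value for the sharp null of no effect for any unit.

    Assumes no interference and complete randomization, so permuting the
    treatment labels mimics the assignment mechanism.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=bool)

    def abs_diff_in_means(assign):
        return abs(y[assign].mean() - y[~assign].mean())

    observed = abs_diff_in_means(d)
    draws = np.array(
        [abs_diff_in_means(rng.permutation(d)) for _ in range(n_draws)]
    )
    # Add one to numerator and denominator so the p-value is never zero.
    return (1 + np.sum(draws >= observed)) / (1 + n_draws)


# Made-up data: the four treated units have clearly larger outcomes.
y_obs = [10, 11, 12, 13, 0, 1, 2, 3]
d_obs = [True, True, True, True, False, False, False, False]
p_effect = fisher_randomization_pvalue(y_obs, d_obs)

# Constant outcomes: every re-drawn statistic ties the observed one.
p_null = fisher_randomization_pvalue([0.0] * 8, d_obs)
```

When the null leaves some potential outcomes unknown, as in the non-sharp hypotheses described above, this recomputation step is no longer available for every unit and assignment.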
Causal Identification under Interference: The Role of Treatment Assignment Independence (with Monika Avila Marquez).
Empirical researchers routinely invoke the no-interference or individualistic treatment response (ITR) assumption to identify causal effects in observational studies, despite concerns that interference across units may arise in many economic settings. This paper studies the causal content of standard ITR-based identification formulas when arbitrary interference is present. We show that, under restrictions on dependence between treatment assignments across units, conventional ITR-based identification formulas---including those underlying selection-on-observables, instrumental variables, regression discontinuity designs, and difference-in-differences---identify well-defined causal objects: types of average direct effects (ADEs). These results do not require knowledge of the interference structure or specification of exposure mappings. We also propose a sensitivity analysis framework that quantifies the robustness of statistical inference to violations of treatment-assignment independence under arbitrary interference. [arXiv], Submitted.
Selected Work-in-Progress:
Ranking and Selection of Treatment Scale under Clustered Network Interference (with Youngki Shin).
endivregress: Estimating Treatment Effects with Endogenous Misreporting using Stata (with Augustine Denteh and Pierre Nguimkeu).
Matching under Ambiguity.