Propensity score matching (PSM) – Stata

Propensity score matching (PSM) is a quasi-experimental method in which the researcher uses statistical techniques to construct an artificial control group by matching each treated unit with a non-treated unit that has similar characteristics. Using these matches, the researcher can estimate the impact of an intervention. Matching is useful for estimating the impact of a program or event when randomization is not ethically or logistically feasible. This page provides an overview of PSM and guidelines for implementation.

Read First

The efficacy of a PSM design depends mostly on how well the observed characteristics determine program participation. If the bias from unobserved characteristics is likely to be very small, PSM provides good estimates; if the bias from unobserved characteristics is large, the PSM estimates can be substantially biased.
Compute the propensity score using baseline characteristics. Note that using too many covariates to compute the propensity score may worsen the lack of common support, while using too few may violate the unconfoundedness assumption.
psmatch2 is a useful Stata command for implementing PSM.

Overview

PSM constructs an artificial control group by matching each treated unit with a non-treated unit that has similar characteristics. In particular, PSM computes the probability that a unit will enroll in a program based on observed characteristics. This probability is the propensity score. PSM then matches treated units to untreated units with similar propensity scores. PSM relies on the assumption that, conditional on the observed characteristics, untreated units can be compared to treated units as if the treatment had been fully randomized. In this way, PSM seeks to mimic randomization and overcome the selection bias that plagues non-experimental methods.

PSM requires large samples and good data on both treated and non-treated units. Further, the data must include a sufficient number of untreated units with characteristics that correspond to those of the treated units. Finally, the data should include all characteristics relevant to treatment participation and outcomes. If all of these characteristics are observed in the dataset and known to the researcher, the propensity score will produce valid matches for estimating the impact of an intervention. If, however, the treated and untreated units differ in unobserved characteristics that affect participation and outcomes, PSM will provide biased estimates. Ultimately, PSM estimation results are only as good as the characteristics used for matching.
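The conditions described above can be stated compactly. The notation below is an assumption introduced here for illustration and does not appear in the original text: D is the treatment indicator, X the vector of observed baseline characteristics, and Y(1), Y(0) the potential outcomes with and without treatment.

```latex
% Propensity score: probability of treatment given observed characteristics
p(X) = \Pr(D = 1 \mid X)

% Unconfoundedness: conditional on X, treatment is as good as random
\bigl(Y(0),\, Y(1)\bigr) \perp D \mid X

% Common support (overlap): every X has both treated and untreated units
0 < p(X) < 1

% Target quantity in most PSM applications: the average treatment effect on the treated
\mathrm{ATT} = E\bigl[\, Y(1) - Y(0) \mid D = 1 \,\bigr]
```

Read this way, the sample-size and overlap requirements correspond to the common support condition, and the requirement that all relevant characteristics be observed corresponds to unconfoundedness.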

Implementation

psmatch2 is a useful Stata command that implements a variety of PSM methods and can carry out steps 2-5 in this section. Install this command by typing ssc install psmatch2 in Stata; find more information by typing help psmatch2 in Stata. An overview of the PSM steps follows:

1. Get data. The data must identify which units are treated and untreated, and should include all characteristics relevant to treatment participation and outcome.
2. Estimate the propensity score with a discrete choice model (e.g., logit or probit). The model should include all relevant covariates related to treatment participation and outcomes. The covariates should be baseline characteristics that are not affected by the treatment. Then use the predicted values from the model as the propensity score. Note that while including many covariates may exacerbate the “lack of common support,” that is, an insufficient overlap of the propensity score distributions of the treated and untreated groups, including too few covariates may violate the unconfoundedness assumption.
3. Restrict the sample to common support. This ensures that units with the same covariate values have a positive probability of being both treated and untreated. After running psmatch2, run psgraph, pscore(pc_pscore) to visualize which units fall outside common support and will not be included in PSM.
4. Choose and implement a matching algorithm to match untreated units to treated units (e.g., radius matching, kernel and local-linear matching, or nearest neighbor matching). See Additional Resources for more information.
5. Estimate the impact of the intervention with the matched sample and calculate standard errors. The average difference in outcomes between treated units and their matched untreated (control) units is the estimated impact of the intervention.
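The steps above can be sketched in Stata as follows. This is a minimal illustration under stated assumptions, not a recommended specification: it uses the nlsw88 example dataset that ships with Stata, treats union membership as a hypothetical "treatment" and hourly wage as the outcome, and picks the covariate list, the caliper of 0.05, and one-to-one nearest neighbor matching arbitrarily. The variable name pc_pscore follows the psgraph example in step 3.

```stata
* One-time install of the user-written command (includes psgraph and pstest):
*   ssc install psmatch2

* Step 1: get data. Illustrative assumption: union = treatment, wage = outcome.
sysuse nlsw88, clear
keep if !missing(union, wage, age, grade, married, south, smsa, ttl_exp)

* Step 2: estimate the propensity score with a logit on baseline covariates
logit union age grade married south smsa ttl_exp
predict double pc_pscore, pr

* Steps 3-5: impose common support, match each treated unit to its nearest
* untreated neighbor within a caliper, and report the ATT on wage
psmatch2 union, pscore(pc_pscore) outcome(wage) neighbor(1) caliper(0.05) common

* Visualize the score distributions, including treated units off support
psgraph, pscore(pc_pscore)

* Compare covariate balance before and after matching
pstest age grade married south smsa ttl_exp, both

* Alternative (a different command, not psmatch2): Stata's built-in
* teffects psmatch re-estimates the score itself and reports
* Abadie-Imbens standard errors that account for the estimated score
teffects psmatch (wage) (union age grade married south smsa ttl_exp, logit), atet
```

The standard error that psmatch2 reports for the ATT does not account for the fact that the propensity score is itself estimated, which is why the last line shows one possible alternative. Whether that alternative suits a given study depends on the matching algorithm chosen in step 4.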
