pysynthdid
Synthetic difference in differences for Python
What is Synthetic difference in differences:
Original paper:
Arkhangelsky, Dmitry, et al. Synthetic difference in differences. No. w25532. National Bureau of Economic Research, 2019. https://www.nber.org/papers/w25532
R package:
https://github.com/synth-inference/synthdid
Blog:
https://medium.com/@masa_asami/causal-inference-using-synthetic-difference-in-differences-with-python-5758e5a76909
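As a rough illustration of the idea in the paper cited above, synthetic difference in differences can be read as a weighted double difference: the treated unit is compared with a weighted combination of control units (unit weights omega), and the post-treatment period is compared with a weighted combination of pre-treatment periods (time weights lambda). The sketch below assumes a single treated unit and takes both sets of weights as given rather than estimating them. The function `sdid_point_estimate` and the toy numbers are hypothetical and are not part of the pysynthdid API.

```python
def sdid_point_estimate(treated_pre, treated_post, control_pre, control_post, omega, lam):
    """Weighted double difference for one treated unit.

    treated_pre / treated_post: outcomes of the treated unit by period.
    control_pre / control_post: one list of outcomes per control unit.
    omega: unit weights over control units (assumed given, summing to 1).
    lam: time weights over pre-treatment periods (assumed given, summing to 1).
    """
    def mean(values):
        return sum(values) / len(values)

    def weighted_sum(values, weights):
        return sum(w * v for w, v in zip(weights, values))

    # Post-period gap between the treated unit and the weighted controls.
    post_gap = mean(treated_post) - weighted_sum(
        [mean(row) for row in control_post], omega
    )
    # Pre-period gap, with pre-treatment periods weighted by lam.
    pre_gap = weighted_sum(treated_pre, lam) - weighted_sum(
        [weighted_sum(row, lam) for row in control_pre], omega
    )
    return post_gap - pre_gap


# Toy data: controls and treated unit share a trend; the treated unit
# gets an extra +2 in both post-treatment periods.
control_pre = [[1.0, 2.0], [2.0, 3.0]]
control_post = [[3.0, 4.0], [4.0, 5.0]]
treated_pre = [5.0, 6.0]
treated_post = [9.0, 10.0]

tau_hat = sdid_point_estimate(
    treated_pre, treated_post, control_pre, control_post,
    omega=[0.5, 0.5], lam=[0.5, 0.5],
)
print(tau_hat)
```

In the actual method the weights are chosen by regularized optimization so that the weighted controls track the treated unit before treatment; the sketch only shows how the weights enter the final estimate.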
Installation:
$ pip install git+https://github.com/MasaAsami/pysynthdid
This package is still under development. I plan to publish it on PyPI
once the following are done:
- Refactoring and better documentation
- Completion of the test code
How to use:
Here's a simple example:
- setup
from synthdid.model import SynthDID
from synthdid.sample_data import fetch_CaliforniaSmoking
df = fetch_CaliforniaSmoking()
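PRE_TERM = [1970, 1988]    # pre-treatment period: first and last year
POST_TERM = [1989, 2000]   # post-treatment period: first and last year
TREATMENT = ["California"] # treated unit(s)
- estimation & plot
sdid = SynthDID(df, PRE_TERM, POST_TERM, TREATMENT)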
sdid.fit(zeta_type="base")
sdid.plot(model="sdid")
- Detailed documentation for each method will be added later.
See the Jupyter notebooks below for basic usage:
- ReproductionExperiment_CaliforniaSmoking.ipynb
  - This notebook reproduces the experiment from the original paper using the well-known CaliforniaSmoking dataset.
- OtherOmegaEstimationMethods.ipynb
  - This notebook tries alternative estimation methods for the parameter omega (& zeta). The results confirm the robustness of the estimation method in the original paper.
- ScaleTesting_of_DonorPools.ipynb
  - This notebook checks how the estimation results change when the scale of the donor pool features changes.
  - Adding donor pool units with extremely different scales (e.g., 10x) can significantly bias the estimates.
  - If different scales are mixed, preprocessing such as a logarithmic transformation is likely to be necessary, as is usually the case in traditional regression.
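The point about mixed scales can be illustrated with a small toy calculation. The sketch below uses made-up numbers (not taken from the notebook) and only the standard library: two series follow the same 10% growth path, but one is 10 times larger. On the raw scale the gap between them keeps widening, while after a log transformation the gap becomes a constant shift, which is the kind of difference a unit-level intercept can absorb.

```python
import math


def log_transform(series):
    """Return the natural log of each value (values must be strictly positive)."""
    return [math.log(v) for v in series]


# Hypothetical outcomes: same growth pattern, scales differing by a factor of 10.
base_unit = [100.0, 110.0, 121.0, 133.1]
scaled_unit = [10.0 * v for v in base_unit]

# Gap between the two units on the raw scale: grows over time.
raw_gap = [s - b for b, s in zip(base_unit, scaled_unit)]

# Gap on the log scale: constant and equal to log(10).
log_gap = [
    s - b
    for b, s in zip(log_transform(base_unit), log_transform(scaled_unit))
]

print(raw_gap)
print(log_gap)
```

Whether a log transformation is appropriate depends on the data (for example, it requires strictly positive outcomes), so treat this as one possible preprocessing step rather than a general recommendation.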
Discussions and PR:
- This module is still under development.
- If you have any questions or comments, please feel free to open an issue.