Python Packages for Applied Economists

A comprehensive guide to Python packages for applied economists, organized by functionality to support econometric analysis, data management, visualization, and specialized tasks.

Core Libraries
Econometric Methods and Research Designs
- General Statistical Methods
- Instrumental Variables
- Panel Data Methods
- Regression Discontinuity Designs
- Difference-in-Differences and Synthetic Control Methods
Treatment Effect Estimation Tools
- Sensitivity Analysis
Machine Learning
Time Series Tools
Bayesian Analysis Tools
Data Management and Processing
- DataFrame Libraries
- Record Linkage and Data Matching
- Distance Metrics and String Matching
Visualization and Reporting
- Static Visualization
- Interactive Visualization
- Publication-Ready Outputs
  - Table Export and Formatting
Specialized Tools
- Geospatial Analysis
- Text Analysis
- PDF Processing and Document Analysis
- Web Scraping
Development Tools
- Debugging and Testing
- Cross-Language Integration
Installation Summary

Core Libraries

Before diving into specialized packages, ensure you have the foundational libraries installed:

NumPy
- Description: Fundamental package for numerical computations.
- Installation: pip install numpy
- Link: https://numpy.org/
Pandas
- Description: Essential for data manipulation and analysis.
- Installation: pip install pandas
- Link: https://pandas.pydata.org/
SciPy
- Description: Provides additional statistical functions and tools.
- Installation: pip install scipy
- Link: https://www.scipy.org/
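
As a quick check that the three core libraries work together, the following minimal sketch simulates a small dataset, fits a least-squares regression with NumPy, computes group means with pandas, and runs a two-sample t-test with SciPy. The variable names and the simulated data are illustrative assumptions only.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Simulated data (illustrative only): y depends on x and a treatment dummy.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"x": rng.normal(size=n), "treated": rng.integers(0, 2, size=n)})
df["y"] = 1.0 + 2.0 * df["x"] + 0.5 * df["treated"] + rng.normal(size=n)

# NumPy: least-squares fit of y on a constant, x, and the treatment dummy.
X = np.column_stack([np.ones(n), df["x"], df["treated"]])
beta, *_ = np.linalg.lstsq(X, df["y"].to_numpy(), rcond=None)

# pandas: mean outcome by treatment status.
group_means = df.groupby("treated")["y"].mean()

# SciPy: Welch two-sample t-test comparing y across the two groups.
t_stat, p_value = stats.ttest_ind(
    df.loc[df["treated"] == 1, "y"],
    df.loc[df["treated"] == 0, "y"],
    equal_var=False,
)
print(beta.round(2), group_means.round(2).to_dict(), round(p_value, 4))
```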

Econometric Methods and Research Designs

General Statistical Methods

Statsmodels
- Description: Provides classes and functions for estimating various statistical models, performing statistical tests, and data exploration.
- Capabilities:
  - Linear Regression: Ordinary Least Squares (OLS)
  - Generalized Linear Models (GLM)
  - Discrete Choice Models: Logit, Probit
  - Time Series Analysis: ARIMA, VAR, and state-space models
  - Instrumental Variable Estimation: IV regression
- Installation: pip install statsmodels
- Stata Equivalent: regress, logit, probit, arima, var, ivregress
- Link: https://www.statsmodels.org/
Pingouin
- Description: Statistical package offering statistical tests and plotting functions.
- Capabilities:
  - ANOVAs, t-tests, correlations
  - Effect sizes, power analyses
- Installation: pip install pingouin
- Link: https://pingouin-stats.org/
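
For readers coming from Stata, a minimal Statsmodels sketch may help. It fits an OLS regression through the formula interface with heteroskedasticity-robust (HC1) standard errors, roughly what `regress y x1 x2, robust` does. The data and variable names (log_wage, educ, exper) are simulated assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated wage data (illustrative only).
rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "educ": rng.normal(12, 2, size=n),
    "exper": rng.normal(10, 3, size=n),
})
df["log_wage"] = 0.5 + 0.08 * df["educ"] + 0.02 * df["exper"] + rng.normal(0, 0.3, size=n)

# OLS with HC1 robust standard errors.
ols_res = smf.ols("log_wage ~ educ + exper", data=df).fit(cov_type="HC1")
print(ols_res.params.round(3))  # point estimates
print(ols_res.bse.round(3))     # robust standard errors
```

Calling `ols_res.summary()` prints a regression table similar to Stata's output.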

Instrumental Variables

Linearmodels
- Description: Specialized for panel data econometrics, including fixed effects, random effects, and instrumental variable models.
- Capabilities:
  - Panel Data Analysis: Fixed effects, random effects, between estimators
  - Instrumental Variables: IV estimators, Generalized Method of Moments (GMM)
  - Seemingly Unrelated Regressions: System estimation
- Installation: pip install linearmodels
- Stata Equivalent: xtreg, ivregress, sureg
- Link: https://bashtage.github.io/linearmodels/
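
To make the instrumental variables estimator concrete, here is a NumPy-only sketch of two-stage least squares on simulated data. It does not use the linearmodels API; it only illustrates the point estimate that IV packages compute, and it omits standard errors. All names and the data-generating process are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
z = rng.normal(size=n)                 # instrument
u = rng.normal(size=n)                 # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x + 2.0 * u + rng.normal(size=n)  # true effect of x is 2.0

def ols(X, y):
    """Return least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)
X = np.column_stack([const, x])
Z = np.column_stack([const, z])

# Naive OLS is biased because x is correlated with u.
beta_ols = ols(X, y)

# First stage: project the regressors on the instruments.
X_hat = Z @ ols(Z, X)
# Second stage: regress y on the first-stage fitted values.
beta_2sls = ols(X_hat, y)

print("OLS slope :", round(beta_ols[1], 3))
print("2SLS slope:", round(beta_2sls[1], 3))
```

Running the second stage by hand gives the right coefficients but the wrong standard errors, which is one reason to use a dedicated package for inference.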

Panel Data Methods

PyFixest
- Description: Allows for fast estimation of linear models with multiple fixed effects, inspired by the R package fixest.
- Capabilities:
  - High-dimensional fixed effects models
  - Clustered and robust standard errors
  - Support for instrumental variables and interaction terms
- Installation: pip install pyfixest
- Stata Equivalent: reghdfe, areg
- Link: https://github.com/py-econometrics/pyfixest
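
The idea behind fixed effects estimation can be sketched with pandas alone: subtracting unit-level means (the within transformation) removes a single fixed effect. This is only an illustration on simulated data with assumed names. It is not how PyFixest is called, and it does not cover multiple high-dimensional fixed effects or clustered inference.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n_units, n_periods = 200, 5
unit = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)[unit]            # unit fixed effect
x = 0.7 * alpha + rng.normal(size=unit.size)      # regressor correlated with the fixed effect
y = 1.5 * x + alpha + rng.normal(size=unit.size)  # true slope is 1.5
df = pd.DataFrame({"unit": unit, "x": x, "y": y})

# Within transformation: subtract unit-level means from x and y.
demeaned = df[["x", "y"]] - df.groupby("unit")[["x", "y"]].transform("mean")

# Slope from regressing demeaned y on demeaned x (no constant needed).
beta_fe = (demeaned["x"] @ demeaned["y"]) / (demeaned["x"] @ demeaned["x"])

# Pooled OLS slope for comparison; it picks up part of the fixed effect.
xc, yc = df["x"] - df["x"].mean(), df["y"] - df["y"].mean()
beta_pooled = (xc @ yc) / (xc @ xc)

print(round(beta_fe, 3), round(beta_pooled, 3))
```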

Regression Discontinuity Designs

rdrobust
- Description: Implements local polynomial RD point estimators with robust bias-corrected confidence intervals and inference procedures.
- Capabilities:
  - RD estimation and inference
  - Automatic bandwidth selection
- Installation: pip install rdrobust
- Stata Equivalent: rdrobust
- Link: https://pypi.org/project/rdrobust/
rdlocrand
- Description: Provides tools for local randomization methods in RD designs.
- Capabilities:
  - Inference in RD designs using local randomization
- Installation: pip install rdlocrand
- Stata Equivalent: rdlocrand
- Link: https://pypi.org/project/rdlocrand/
rddensity
- Description: Provides manipulation testing based on density discontinuity.
- Capabilities:
  - Density discontinuity tests at cutoff
- Installation: pip install rddensity
- Stata Equivalent: rddensity
- Link: https://pypi.org/project/rddensity/
rdmulti
- Description: Analysis of RD designs with multiple cutoffs or scores.
- Capabilities:
  - Multivariate RD analysis
- Installation: pip install rdmulti
- Stata Equivalent: rdmulti
- Link: https://pypi.org/project/rdmulti/
rdpower
- Description: Power calculations for RD designs.
- Capabilities:
  - Computes power and sample size for RD designs
- Installation: pip install rdpower
- Stata Equivalent: rdpower
- Link: https://pypi.org/project/rdpower/
lpdensity
- Description: Implements local polynomial density estimation with robust bias-corrected confidence intervals.
- Capabilities:
  - Kernel density estimation
  - Local polynomial estimation
- Installation: pip install lpdensity
- Stata Equivalent: lpdensity
- Link: https://pypi.org/project/lpdensity/
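
The core of a sharp RD estimate is a local linear regression on each side of the cutoff. The NumPy sketch below shows that step on simulated data with a triangular kernel. The bandwidth is hand-picked, which is an assumption: rdrobust selects it from the data and adds bias correction and robust inference, none of which is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4000
score = rng.uniform(-1, 1, size=n)    # running variable, cutoff at 0
treated = (score >= 0).astype(float)
y = 0.5 * score + 1.0 * treated + rng.normal(0, 0.5, size=n)  # true jump is 1.0

def local_linear_intercept(x, y, h):
    """Weighted least-squares intercept at x = 0, triangular kernel with bandwidth h."""
    w = np.clip(1 - np.abs(x) / h, 0, None)
    keep = w > 0
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    sw = np.sqrt(w[keep])
    coef = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)[0]
    return coef[0]

h = 0.3  # hand-picked bandwidth (illustrative assumption)
right = score >= 0
tau_hat = (local_linear_intercept(score[right], y[right], h)
           - local_linear_intercept(score[~right], y[~right], h))
print(round(tau_hat, 3))
```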

Difference-in-Differences and Synthetic Control Methods

CSDID
- Description: Implements the Callaway and Sant'Anna (2020) Difference-in-Differences estimator for staggered adoption designs with treatment effect heterogeneity.
- Capabilities:
  - Estimation of group-time average treatment effects
  - Handles multiple time periods and variation in treatment timing
  - Allows for treatment effect heterogeneity
- Installation:
```
git clone https://github.com/d2cml-ai/csdid.git
cd csdid
pip install .
```
- Stata Equivalent: csdid (user-contributed command)
- Link: https://github.com/d2cml-ai/csdid
synthdid
- Description: Implements synthetic difference-in-differences estimation with inference and graphing procedures.
- Capabilities:
  - Synthetic DiD estimation
  - Multiple inference methods (placebo, bootstrap, jackknife)
  - Plotting tools for outcomes and weights
  - Support for covariates
  - Handles staggered adoption over multiple treatment periods
- Installation: pip install synthdid
- Stata Equivalent: sdid
- Link: https://pypi.org/project/synthdid/
SyntheticControlMethods
- Description: A Python package for causal inference using various Synthetic Control Methods.
- Capabilities:
  - Synthetic Control estimation
  - Placebo tests
  - Support for panel data
- Installation: pip install SyntheticControlMethods
- Stata Equivalent: synth
- Link: https://pypi.org/project/SyntheticControlMethods/
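
For orientation, the canonical two-group, two-period difference-in-differences estimate is a comparison of four cell means. The pandas sketch below computes it on simulated data with assumed column names. It is not the staggered-adoption estimator implemented in CSDID, nor a synthetic control; those packages address settings where this simple comparison is not enough.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 2000
df = pd.DataFrame({
    "treat": rng.integers(0, 2, size=n),  # 1 = treated group
    "post": rng.integers(0, 2, size=n),   # 1 = post-treatment period
})
# Group gap of 1.0, common time trend of 0.5, true treatment effect of 2.0.
df["y"] = (1.0 * df["treat"] + 0.5 * df["post"]
           + 2.0 * df["treat"] * df["post"] + rng.normal(size=n))

# Cell means by group (rows) and period (columns).
means = df.groupby(["treat", "post"])["y"].mean().unstack("post")

# DiD = (treated post - treated pre) - (control post - control pre).
did = (means.loc[1, 1] - means.loc[1, 0]) - (means.loc[0, 1] - means.loc[0, 0])
print(means.round(2))
print("DiD estimate:", round(did, 3))
```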

Treatment Effect Estimation Tools

MarginalEffects
- Description: Provides methods for computing and interpreting marginal effects in statistical models.
- Capabilities:
  - Calculates marginal effects for various models
  - Supports models from scikit-learn, statsmodels, and others
- Installation: pip install marginaleffects
- Link: https://pypi.org/project/marginaleffects/
EconML
- Description: Developed by Microsoft, EconML provides methods for estimating causal effects with machine learning techniques.
- Capabilities:
  - Double Machine Learning (DML)
  - Treatment Effect Estimation: Heterogeneous effects, policy evaluation
  - Support for Machine Learning Models: Integration with scikit-learn, LightGBM, and more
- Installation: pip install econml
- Stata Equivalent: teffects, ddml
- Link: https://econml.azurewebsites.net/
DoubleML
- Description: Implements the Double Machine Learning framework for causal inference in high-dimensional settings.
- Capabilities:
  - Treatment effect estimation using DML
  - Support for various machine learning algorithms
- Installation: pip install doubleml
- Stata Equivalent: ddml
- Link: https://docs.doubleml.org/stable/index.html

Sensitivity Analysis

PySensemakr
- Description: Sensitivity analysis toolkit for regression models.
- Capabilities:
  - Quantify robustness of regression coefficients to unobserved confounding
  - Implements methods similar to the sensemakr R package
- Installation: pip install PySensemakr
- Link: https://github.com/Carloscinelli/PySensemakr

Machine Learning

scikit-learn
- Description: A comprehensive library for machine learning algorithms.
- Capabilities:
  - Supervised Learning: Regression, classification
  - Unsupervised Learning: Clustering, dimensionality reduction
  - Model Selection and Evaluation: Cross-validation, grid search
- Installation: pip install scikit-learn
- Stata Equivalent: Machine learning methods for predictive modeling
- Link: https://scikit-learn.org/
XGBoost
- Description: An optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable.
- Capabilities:
  - High-performance gradient boosting algorithms
  - Support for regression, classification, and ranking problems
- Installation: pip install xgboost
- Stata Equivalent: Advanced machine learning methods
- Link: https://xgboost.readthedocs.io/
LightGBM
- Description: A fast, distributed, high-performance gradient boosting framework.
- Capabilities:
  - Efficient gradient boosting algorithms
  - Support for large-scale data
- Installation: pip install lightgbm
- Link: https://github.com/microsoft/LightGBM

Time Series Tools

Statsmodels Time Series
- Description: Provides extensive time series analysis capabilities.
- Capabilities:
  - ARIMA Models: Autoregressive Integrated Moving Average
  - SARIMAX Models: Seasonal components and exogenous variables
  - Vector Autoregression (VAR): Multivariate time series
  - State Space Models: Flexible modeling of time series
- Installation: Part of statsmodels
- Stata Equivalent: arima, var, dfuller, kpss
- Link: https://www.statsmodels.org/stable/tsa.html
ARCH
- Description: Tools for analyzing financial time series, including volatility modeling.
- Capabilities:
  - ARCH and GARCH models
  - Volatility forecasting
- Installation: pip install arch
- Link: https://arch.readthedocs.io/en/latest/
Ruptures
- Description: A Python library for offline change point detection.
- Capabilities:
  - Multiple change point detection methods
  - Handling univariate and multivariate signals
- Installation: pip install ruptures
- Link: https://centre-borelli.github.io/ruptures-docs/
xarray
- Description: N-D labeled arrays and datasets in Python.
- Capabilities:
  - Work with multi-dimensional arrays (similar to netCDF data)
  - Convenient data structures for time series data
- Installation: pip install xarray
- Link: https://xarray.pydata.org/en/stable/
StatsForecast
- Description: A collection of statistical models for time series forecasting.
- Capabilities:
  - Efficient implementation of forecasting models
  - Support for large-scale time series data
- Installation: pip install statsforecast
- Link: https://github.com/Nixtla/statsforecast
NeuralForecast
- Description: Deep learning models for time series forecasting.
- Capabilities:
  - State-of-the-art neural network architectures
  - Handling of complex seasonality and trends
- Installation: pip install neuralforecast
- Link: https://github.com/Nixtla/neuralforecast
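
The ARIMA-family models listed above build on autoregressions. As a minimal illustration, the NumPy sketch below simulates an AR(1) process, estimates its coefficient by regressing the series on its own lag, and forms a one-step-ahead forecast. The parameter values are assumptions; the time series packages above provide full estimation, diagnostics, and forecasting tools.

```python
import numpy as np

rng = np.random.default_rng(6)
T, phi = 2000, 0.7
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + rng.normal()

# Regress y_t on a constant and y_{t-1}.
X = np.column_stack([np.ones(T - 1), y[:-1]])
const_hat, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# One-step-ahead forecast from the last observation.
forecast = const_hat + phi_hat * y[-1]
print(round(phi_hat, 3), round(forecast, 3))
```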

Bayesian Analysis Tools

PyMC
- Description: Probabilistic programming library for Bayesian modeling and inference.
- Capabilities:
  - Bayesian statistical models
  - Markov Chain Monte Carlo (MCMC)
  - Variational inference
- Installation: pip install pymc
- Link: https://www.pymc.io/
PyStan
- Description: Python interface to the Stan language for statistical modeling and high-performance statistical computation.
- Capabilities:
  - Bayesian inference
  - Customizable statistical models
- Installation: pip install pystan
- Link: https://pystan.readthedocs.io/en/latest/
Bambi
- Description: High-level Bayesian model-building interface in Python.
- Capabilities:
  - Simplifies specification of Bayesian models using formulas
  - Built on top of PyMC
- Installation: pip install bambi
- Link: https://bambinos.org/

Data Management and Processing

DataFrame Libraries

Polars
- Description: Modern, high-performance DataFrame library optimized for performance and memory efficiency.
- Capabilities:
  - Fast parallel execution of data operations
  - Memory-efficient processing
  - Syntax familiar to pandas and R's tidyverse users
  - Strong integration with Apache Arrow
- Installation: pip install polars
- Link: https://pola.rs/
Datatable
- Description: High-performance library for processing large datasets (up to 100GB) on a single machine.
- Capabilities:
  - Superior performance in sorting and grouping operations
  - Efficient memory usage
  - Seamless interoperability with pandas/NumPy
  - Optimized for single-node processing
- Installation: pip install datatable
- Link: https://github.com/h2oai/datatable
Vaex
- Description: Out-of-core DataFrame library for large datasets with lazy evaluation.
- Capabilities:
  - Memory-efficient handling of large datasets
  - Lazy evaluation for optimized performance
  - Built-in visualization capabilities
  - Good for datasets that don't fit in memory
- Installation: pip install vaex
- Link: https://vaex.io/
DuckDB
- Description: SQL database engine with DataFrame-like functionality and exceptional performance for analytical queries.
- Capabilities:
  - Top-tier performance for large-scale data operations
  - SQL interface for data manipulation
  - Efficient handling of large datasets (50GB+)
  - Strong integration with pandas and Arrow
- Installation: pip install duckdb
- Link: https://duckdb.org/

Record Linkage and Data Matching

Recordlinkage
- Description: Python toolkit for linking and deduplicating records.
- Capabilities:
  - Preprocessing and data cleaning
  - Index/blocking methods to reduce comparisons
  - Various comparison methods
  - Classification of record pairs
  - Evaluation metrics
- Installation: pip install recordlinkage
- Stata Equivalent: merge, reclink
- Link: https://recordlinkage.readthedocs.io/en/latest/
Dedupe
- Description: Machine learning powered deduplication and entity resolution.
- Capabilities:
  - Active learning approach to training
  - Scalable blocking methods
  - Automated matching decisions
- Installation: pip install dedupe
- Link: https://github.com/dedupeio/dedupe
Python-Levenshtein
- Description: Fast implementation of Levenshtein distance and string similarity metrics.
- Capabilities:
  - Compute edit distances for fuzzy matching
- Installation: pip install python-Levenshtein
- Link: https://pypi.org/project/python-Levenshtein/
Jellyfish
- Description: Library for approximate and phonetic matching of strings.
- Capabilities:
  - Soundex, Metaphone, and other phonetic algorithms
  - Damerau-Levenshtein distance
- Installation: pip install jellyfish
- Link: https://pypi.org/project/jellyfish/
PyStemmer
- Description: Snowball stemming algorithms for various languages.
- Capabilities:
  - Stemming words to their root forms for better matching
- Installation: pip install PyStemmer
- Link: https://pypi.org/project/PyStemmer/
NameParser
- Description: Parser for human names.
- Capabilities:
  - Splits names into components (first name, last name, etc.)
  - Useful for matching records based on names
- Installation: pip install nameparser
- Link: https://pypi.org/project/nameparser/
Company-Matching
- Description: Toolkit for matching company names.
- Capabilities:
  - Standardizes company names for accurate matching
  - Handles common abbreviations and variations
- Installation: pip install company-matching
- Link: https://github.com/IntelligentSoftwareSystems/Company-Matching
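
The usual record linkage workflow (blocking, comparison, classification) can be sketched with pandas and the standard library. The example below is only an illustration of those three steps on made-up records; the column names and the 0.85 threshold are arbitrary assumptions, and it does not use the Recordlinkage or Dedupe APIs.

```python
from difflib import SequenceMatcher

import pandas as pd

left = pd.DataFrame({
    "id_a": [1, 2, 3],
    "name": ["Jon Smith", "Maria Garcia", "Wei Chen"],
    "zip": ["10001", "60614", "94103"],
})
right = pd.DataFrame({
    "id_b": [10, 11, 12],
    "name": ["John Smith", "Maria Garcia", "Wei Chang"],
    "zip": ["10001", "60614", "73301"],
})

# Blocking: only compare records that share a ZIP code.
pairs = left.merge(right, on="zip", suffixes=("_a", "_b"))

# Comparison: similarity of the two name fields, between 0 and 1.
pairs["name_sim"] = [
    SequenceMatcher(None, a, b).ratio()
    for a, b in zip(pairs["name_a"], pairs["name_b"])
]

# Classification: a simple threshold rule (0.85 is an arbitrary cutoff).
matches = pairs.loc[pairs["name_sim"] >= 0.85, ["id_a", "id_b", "name_sim"]]
print(matches)
```

Blocking on ZIP code reduces nine possible comparisons to two here; the same trade-off between speed and missed matches applies at scale.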

Distance Metrics and String Matching

py_stringmatching
- Description: Comprehensive toolkit for string matching.
- Capabilities:
  - Multiple string similarity measures
  - Phonetic encoding
  - Token-based similarities
- Installation: pip install py_stringmatching
- Link: https://github.com/anhaidgroup/py_stringmatching
pyjarowinkler
- Description: Implementation of Jaro-Winkler distance.
- Capabilities:
  - Jaro similarity
  - Jaro-Winkler similarity
- Installation: pip install pyjarowinkler
- Link: https://pypi.org/project/pyjarowinkler/
RapidFuzz
- Description: Fast string matching library.
- Capabilities:
  - Quick fuzzy string matching
  - Multiple distance metrics
  - Optimized for performance
- Installation: pip install rapidfuzz
- Link: https://github.com/rapidfuzz/RapidFuzz
FuzzyWuzzy
- Description: Fuzzy string matching in Python.
- Capabilities:
  - String similarity matching
  - Partial and token-based ratios
- Installation: pip install fuzzywuzzy
- Link: https://pypi.org/project/fuzzywuzzy/
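
To show what an edit-distance metric measures, here is a pure-Python sketch of the Levenshtein distance and a normalized similarity score. The function names are illustrative assumptions; the libraries above offer much faster implementations and many more metrics.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free if characters match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))                  # 3
print(round(similarity("Acme Corp", "Acme Corp."), 2))   # 0.9
```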

Visualization and Reporting

Static Visualization

Matplotlib
- Description: The foundational plotting library in Python.
- Capabilities:
  - Line plots, scatter plots, histograms, bar charts
  - Highly customizable visualizations
  - Support for LaTeX formatting in labels
- Installation: pip install matplotlib
- Stata Equivalent: Basic plotting functions
- Link: https://matplotlib.org/
Seaborn
- Description: A statistical data visualization library built on top of Matplotlib.
- Capabilities:
  - Enhanced statistical graphics
  - Regression plots, distribution plots, heatmaps
  - Integration with pandas data structures
- Installation: pip install seaborn
- Stata Equivalent: Enhanced plotting functions
- Link: https://seaborn.pydata.org/
Plotnine
- Description: A grammar of graphics for Python, based on ggplot2 in R.
- Capabilities:
  - Declarative syntax for creating complex plots
  - Supports layering, scaling, and theming
  - Ideal for creating publication-quality visualizations
- Installation: pip install plotnine
- Link: https://plotnine.readthedocs.io/
Binsreg
- Description: Provides binned regression methods for RD designs and data visualization.
- Capabilities:
  - Binned scatter plots
  - Regression discontinuity analysis
  - Data-driven bin selection
- Installation: pip install binsreg
- Stata Equivalent: binsreg, binscatter
- Link: https://pypi.org/project/binsreg/
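
The computation behind a binned scatter plot is simple to sketch with pandas: split the x variable into quantile bins and average x and y within each bin. The number of bins below is hand-picked, which is an assumption; Binsreg chooses it from the data and adds confidence bands, which this sketch does not attempt.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 5000
df = pd.DataFrame({"x": rng.uniform(0, 10, size=n)})
df["y"] = np.sin(df["x"]) + rng.normal(0, 1, size=n)

# Split x into 20 quantile-spaced bins and average x and y within each bin.
df["bin"] = pd.qcut(df["x"], q=20, labels=False)
binned = df.groupby("bin")[["x", "y"]].mean()

# `binned` holds one point per bin and can be passed to any plotting library.
print(binned.head())
```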

Interactive Visualization

Plotly
- Description: An interactive, open-source plotting library.
- Capabilities:
  - Interactive plots
  - Support for web-based applications
  - Wide range of chart types
- Installation: pip install plotly
- Link: https://plotly.com/python/
Altair
- Description: Declarative statistical visualization library for Python.
- Capabilities:
  - Grammar of graphics approach
  - Interactive visualizations
- Installation: pip install altair
- Link: https://altair-viz.github.io/
Bokeh
- Description: Interactive visualization library for modern web browsers.
- Capabilities:
  - Interactive plots and dashboards
  - Real-time streaming and data updates
- Installation: pip install bokeh
- Link: https://bokeh.org/

Publication-Ready Outputs

Table Export and Formatting

Stargazer
- Description: A Python package that emulates the R package stargazer, generating LaTeX code for regression tables.
- Capabilities:
  - Formats regression results into LaTeX tables
  - Supports models from statsmodels and linearmodels
- Installation: pip install stargazer
- Link: https://pypi.org/project/stargazer/
PyTableWriter
- Description: A library to write tabular data in various formats.
- Capabilities:
  - Export data to formats like LaTeX, Markdown, Excel, CSV
  - Supports styling and formatting options
- Installation: pip install pytablewriter
- Link: https://pypi.org/project/pytablewriter/
pystout
- Description: A package to create publication-quality LaTeX tables from Python regression output.
- Capabilities:
  - Generates LaTeX tables from regression models
  - Supports models from statsmodels and linearmodels
  - Customizable table appearance and statistics
- Installation: pip install pystout
- Link: https://pypi.org/project/pystout/
tableone
- Description: Produces summary statistics for research papers.
- Capabilities:
  - Generates descriptive statistics tables
  - Supports grouping variables and statistical tests
  - Exports tables to LaTeX and other formats
- Installation: pip install tableone
- Link: https://pypi.org/project/tableone/
GreatTables
- Description: A package for creating beautiful and complex tables in Python.
- Capabilities:
  - Compose tables with headers, footers, stubs, and spanners
  - Format cell values in various ways
  - Integrates with pandas DataFrames
- Installation: pip install great_tables
- Link: https://pypi.org/project/great-tables/
tabulate
- Description: Formats tabular data in plain-text tables and can output in formats like LaTeX.
- Capabilities:
  - Convert arrays or DataFrames into formatted tables
  - Multiple output formats: plain text, GitHub-flavored Markdown, LaTeX, HTML, and more
- Installation: pip install tabulate
- Link: https://pypi.org/project/tabulate/

Specialized Tools

Geospatial Analysis

GeoPandas
- Description: Extends pandas to allow spatial operations on geometric types.
- Capabilities:
  - Reading and writing spatial data
  - Spatial joins and operations
  - Handling geospatial data formats like Shapefiles and GeoJSON
- Installation: pip install geopandas
- Stata Equivalent: Limited geospatial capabilities
- Link: https://geopandas.org/
Geoplot
- Description: A high-level geospatial plotting library.
- Capabilities:
  - Geospatial visualizations
  - Choropleth maps, cartograms, kernel density plots
- Installation: pip install geoplot
- Stata Equivalent: Basic mapping (with limited functionality)
- Link: https://github.com/ResidentMario/geoplot
Geopy
- Description: A Python client for several popular geocoding web services.
- Capabilities:
  - Geocoding addresses (converting addresses to coordinates)
  - Reverse geocoding
  - Calculating distances between points
- Installation: pip install geopy
- Stata Equivalent: Not directly available
- Link: https://geopy.readthedocs.io/
Geocoder
- Description: Geocoding library supporting multiple services.
- Capabilities:
  - Address standardization
  - Geographic entity matching
  - Multiple provider support
- Installation: pip install geocoder
- Link: https://geocoder.readthedocs.io/
libpysal
- Description: Core components of PySAL (Python Spatial Analysis Library).
- Capabilities:
  - Spatial weights matrices
  - Spatial graph analysis
  - Computational geometry
- Installation: pip install libpysal
- Stata Equivalent: spreg, spatial econometrics tools
- Link: https://pysal.org/libpysal/
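
Distance calculations between coordinates, one of the tasks listed for Geopy, can be illustrated with the haversine formula. The standard-library sketch below treats the Earth as a sphere, which is an approximation; the function name and the example coordinates are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates for Washington, DC and New York City.
print(round(haversine_km(38.9072, -77.0369, 40.7128, -74.0060), 1))
```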

Text Analysis

NLTK
- Description: Natural Language Toolkit, a leading platform for building Python programs to work with human language data.
- Capabilities:
  - Tokenization, stemming, tagging, parsing
  - Corpora and lexical resources
- Installation: pip install nltk
- Link: https://www.nltk.org/install.html
LangDetect
- Description: Port of Google's language-detection library.
- Capabilities:
  - Detects language of a text
- Installation: pip install langdetect
- Link: https://pypi.org/project/langdetect/

PDF Processing and Document Analysis

LayoutParser
- Description: A unified toolkit for Deep Learning-based Document Image Analysis.
- Capabilities:
  - Deep Learning Models: Perform layout detection in a few lines of code
  - Layout Data Structures: Optimized APIs for document image analysis tasks
  - OCR Integration: Perform OCR for each detected layout region
  - Visualization Tools: Flexible APIs for visualizing the detected layouts
  - Data Loading: Load layout data stored in JSON, CSV, and even PDFs
- Installation:
```
pip install layoutparser
# For deep learning layout models
pip install "layoutparser[layoutmodels]"
# For OCR toolkit
pip install "layoutparser[ocr]"
```
- Link: https://github.com/Layout-Parser/layout-parser
PyTesseract
- Description: Python wrapper for Google's Tesseract-OCR Engine.
- Capabilities:
  - Optical Character Recognition (OCR)
  - Extract text from images and PDFs
- Installation: pip install pytesseract
- Link: https://pypi.org/project/pytesseract/
Tabula-py
- Description: Simple wrapper of tabula-java, which can read tables in PDF and convert them into pandas DataFrames.
- Capabilities:
  - Extract tables from PDFs
- Installation: pip install tabula-py
- Link: https://pypi.org/project/tabula-py/
Python-PDFBox
- Description: Python interface to Apache PDFBox.
- Capabilities:
  - PDF manipulation (extract text, merge, split)
- Installation: pip install python-pdfbox
- Link: https://pypi.org/project/python-pdfbox/
PDFMiner
- Description: Tool for extracting information from PDF documents.
- Capabilities:
  - Text extraction
  - Layout analysis
- Installation: pip install pdfminer.six
- Link: https://pypi.org/project/pdfminer.six/

Web Scraping

BeautifulSoup
- Description: Library for pulling data out of HTML and XML files.
- Capabilities:
  - Parse and navigate HTML/XML documents
- Installation: pip install beautifulsoup4
- Link: https://pypi.org/project/beautifulsoup4/
Requests
- Description: HTTP library for Python.
- Capabilities:
  - Send HTTP requests
  - Handle HTTP sessions and cookies
- Installation: pip install requests
- Link: https://pypi.org/project/requests/
Requests-HTML
- Description: HTML Parsing for Humans.
- Capabilities:
  - Parse HTML with JavaScript support
  - Simplify web scraping tasks
- Installation: pip install requests-html
- Link: https://github.com/psf/requests-html

Development Tools

Debugging and Testing

StackPrinter
- Description: Debugging tool for printing informative tracebacks.
- Installation: pip install stackprinter
- Link: https://github.com/cknd/stackprinter
Pdb++
- Description: Drop-in replacement for pdb (Python debugger), with additional features.
- Installation: pip install pdbpp
- Link: https://github.com/pdbpp/pdbpp
tqdm
- Description: Fast, extensible progress bar for Python.
- Installation: pip install tqdm
- Link: https://tqdm.github.io/

Cross-Language Integration

RPy2
- Description: Interface to call R functions and use R packages directly from Python.
- Use Case: When specific R packages have no Python equivalent, especially for advanced econometric methods not yet available in Python.
- Example R Packages Accessible via RPy2:
  - did: Implements the Callaway and Sant'Anna (2020) DiD estimator.
    - Link: https://github.com/bcallaway11/did
  - bacondecomp: For the Goodman-Bacon decomposition in DiD settings.
    - Link: https://github.com/evanjflack/bacondecomp
  - fixest: Used for estimation with multiple fixed effects.
    - Link: https://github.com/lrberge/fixest
- Installation: pip install rpy2
- Link: https://rpy2.github.io/

Installation Summary

You can install most of these packages with a single pip command. CSDID is installed from source, as described in its entry above.

pip install numpy pandas scipy statsmodels pingouin pymc pystan bambi linearmodels pyfixest econml doubleml marginaleffects pysensemakr scikit-learn xgboost lightgbm matplotlib seaborn plotnine rpy2 rdrobust rdlocrand rddensity rdmulti rdpower lpdensity synthdid SyntheticControlMethods arch ruptures xarray statsforecast neuralforecast polars datatable vaex duckdb recordlinkage dedupe py_stringmatching pyjarowinkler rapidfuzz fuzzywuzzy nameparser company-matching python-Levenshtein jellyfish PyStemmer nltk langdetect beautifulsoup4 requests requests-html pytesseract tabula-py python-pdfbox pdfminer.six plotly altair bokeh prettytable tabulate stackprinter pdbpp tqdm geopandas geoplot geopy geocoder libpysal binsreg prophet layoutparser stargazer pytablewriter xtable pystout tableone great_tables
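
After installing, it can be useful to confirm which packages are actually present. The standard-library sketch below looks up installed versions by distribution name. The helper name and the short package list are illustrative assumptions; extend the list as needed.

```python
from importlib import metadata

# A few distribution names from the list above; extend as needed.
packages = ["numpy", "pandas", "scipy", "statsmodels", "linearmodels", "pyfixest"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(packages).items():
    print(f"{name:15s} {version or 'not installed'}")
```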

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

You are free to:

Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material for any purpose, even commercially

Under the following terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.
