Jupyter notebooks accompanying the FinDS Python repo, with code examples and results for 30+ financial data science projects

FINANCIAL DATA SCIENCE

Financial data science projects in Jupyter notebooks, built with the FinDS Python package:

  • database engines: SQL, Redis, MongoDB
  • interfaces for
    • structured data from CRSP, Compustat, IBES, TAQ
    • APIs from ALFRED, BEA
    • unstructured data from SEC Edgar, Federal Reserve websites
    • academic websites by Fama and French, Loughran and McDonald, Hoberg and Phillips
  • recipes for econometrics, finance, graphs, event studies, backtesting
  • applications of statistics, machine learning, NLP, neural networks and LLMs.

Topics

| notebook | Financial | Data | Science |
|---|---|---|---|
| stock_prices | Stock distributions, delistings | CRSP stocks | Statistical moments |
| jegadeesh_titman | Overlapping portfolios; Momentum effect | CRSP stocks | Hypothesis testing; Newey-West estimator |
| fama_french | Portfolio sorts; Value effect | CRSP stocks; Compustat | Linear regression |
| fama_macbeth | Cross-sectional Regressions; CAPM | Ken French research library | Non-linear regression; Quadratic optimization |
| weekly_reversals | Mean reversion; Implementation shortfall | CRSP stocks | Structural breaks; Performance evaluation |
| quant_factors | Factor investing; Backtests | CRSP stocks; Compustat; IBES | Cluster analysis |
| event_study | Event studies | S&P key developments | Multiple testing; FFT |
| economic_releases | Economic data revisions; Employment payrolls | ALFRED | Outliers |
| regression_diagnostics | Consumer and producer prices | FRED | Linear regression diagnostics; Residual analysis |
| econometric_forecast | Production and Inflation | FRED | Time series analysis |
| approximate_factors | Approximate factor models | FRED-MD | Unit root test |
| economic_states | State space models | FRED-MD | Gaussian Mixture; HMM |
| term_structure | Interest rates | FRED yield curve | SVD |
| bond_returns | Bond risk factors | FRED bond returns | PCA |
| option_pricing | Binomial tree; Black-Scholes-Merton and the Greeks | simulated data | Monte Carlo simulation |
| conditional_volatility | Value at risk | FRED crypto-currencies | EWMA; GARCH |
| covariance_matrix | Portfolio risk | Fama-French industries | Covariance matrix estimation |
| market_microstructure | Market impact; Liquidity risk | TAQ tick data | High frequency volatility |
| event_risk | Earnings misses | IBES | Poisson regression; GLM |
| customer_ego | Supply chain | Compustat principal customers | Graph networks |
| industry_community | Industry sectors | Hoberg and Phillips research library | Community detection |
| bea_centrality | Input-output tables | Bureau of Economic Analysis | Graph centrality |
| link_prediction | Product markets | Hoberg and Phillips | Link prediction |
| spatial_regression | Earnings surprises | IBES; Hoberg and Phillips | Spatial regression |
| fomc_topics | FOMC meetings | Federal Reserve website | Topic modeling |
| mda_sentiment | 10-K Management Discussion | SEC Edgar; Loughran and McDonald research library | Sentiment analysis |
| business_description | 10-K Business Description | SEC Edgar | POS tagging; Density-based clustering |
| classification_models | Industry classification | SEC Edgar | Classification |
| regression_models | Macroeconomic forecasts | FRED-MD | Regression |
| deep_classifier | Industry classification | SEC Edgar | Neural networks; Word embeddings |
| recurrent_net | Macroeconomic forecasts | FRED-MD | Recurrent Neural Nets; Dynamic factor models |
| convolutional_net | Macroeconomic forecasts | FRED-MD | Convolutional Neural Nets; Vector autoregression |
| reinforcement_learning | Retirement spending | SBBI | Reinforcement learning |
| fomc_language | Fedspeak | FOMC meetings minutes | Language modelling; Transformers |
| sentiment_llm | Financial news sentiment | Kaggle | LLM prompting |
| summarization_llm | 10-K Market Risks | SEC Edgar | Text summarization |
| finetune_llm | Industry classification | SEC Edgar | LLM fine-tuning |
| rag_agent | Corporate philanthropy | text documents | RAG, LLM chatbots and agents |
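
To give a flavor of the topics listed above, the option_pricing entry pairs Black-Scholes-Merton pricing with Monte Carlo simulation on simulated data. The sketch below is a minimal, standard-library illustration of that pairing. It is not taken from the notebooks and does not use the FinDS package; the function names `bsm_call` and `mc_call` and all parameter values are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist


def bsm_call(spot, strike, rate, sigma, maturity):
    """Black-Scholes-Merton price of a European call (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma**2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    d2 = d1 - sigma * math.sqrt(maturity)
    cdf = NormalDist().cdf  # standard normal CDF
    return spot * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


def mc_call(spot, strike, rate, sigma, maturity, n_paths=200_000, seed=0):
    """Monte Carlo estimate of the same call under risk-neutral GBM."""
    rng = random.Random(seed)  # seeded for reproducibility
    drift = (rate - 0.5 * sigma**2) * maturity
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        # Simulate the terminal price in one step, then accumulate the payoff.
        terminal = spot * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(terminal - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * total / n_paths


if __name__ == "__main__":
    # Hypothetical inputs: at-the-money call, 5% rate, 20% volatility, 1 year.
    args = (100.0, 100.0, 0.05, 0.2, 1.0)
    print(f"closed form: {bsm_call(*args):.4f}")
    print(f"monte carlo: {mc_call(*args):.4f}")
```

With enough simulated paths the Monte Carlo estimate should land close to the closed-form price, which makes the pair a convenient sanity check before moving on to payoffs that have no closed form.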

Resources

  1. Online Jupyter-book, or download pdf

  2. FinDS API reference

  3. FinDS repo

  4. Jupyter notebooks repo

Contact

GitHub: https://terence-lim.github.io