financial-data-science
Financial and Investment Data Science: the FinDS Python library and examples for applying quantitative and machine learning methods to structured and unstructured financial data sets
FINANCIAL DATA SCIENCE
FinDS Python package for financial data science projects
- uses database engines: SQL, MongoDB, Redis
- interfaces for
  - structured data from CRSP, Compustat, IBES, TAQ
  - APIs from ALFRED, BEA
  - unstructured data from SEC Edgar, Federal Reserve websites
  - academic websites by Ken French, Loughran and McDonald, Hoberg and Phillips
- recipes for econometrics, finance, graphs, event studies, backtesting
- applications of statistics, machine learning, neural networks and large language models
Resources
Examples
| notebook | Financial | Data | Science |
|---|---|---|---|
| stock_prices | Stock distributions, delistings | CRSP stocks | Statistical moments |
| jegadeesh_titman | Overlapping portfolios; Momentum effect | CRSP stocks | Hypothesis testing; Newey-West estimator |
| fama_french | Portfolio sorts; Value effect | CRSP stocks; Compustat | Linear regression |
| fama_macbeth | Cross-sectional regressions; CAPM | Ken French research library | Non-linear regression; Quadratic optimization |
| weekly_reversals | Mean reversion; Implementation shortfall | CRSP stocks | Structural breaks; Performance evaluation |
| quant_factors | Factor investing; Backtests | CRSP stocks; Compustat; IBES | Cluster analysis |
| event_study | Event studies | S&P key developments | Multiple testing; FFT |
| economic_releases | Economic data revisions; Employment payrolls | ALFRED | Outliers |
| regression_diagnostics | Consumer and producer prices | FRED | Linear regression diagnostics; Residual analysis |
| econometric_forecast | Production and Inflation | FRED | Time series analysis |
| approximate_factors | Approximate factor models | FRED-MD | Unit root test |
| economic_states | State space models | FRED-MD | Gaussian Mixture; HMM |
| term_structure | Interest rates | FRED yield curve | SVD |
| bond_returns | Bond risk factors | FRED bond returns | PCA |
| option_pricing | Binomial tree; Black-Scholes-Merton and the Greeks | Simulations | Monte Carlo simulation |
| conditional_volatility | Value at risk | FRED crypto-currencies | EWMA; GARCH |
| covariance_matrix | Portfolio risk | Fama-French industries | Covariance matrix estimation |
| market_microstructure | Market impact; Liquidity risk | TAQ tick data | High frequency volatility |
| event_risk | Earnings misses | IBES | Poisson regression; GLM |
| customer_ego | Supply chain | Compustat principal customers | Graph networks |
| industry_community | Industry sectors | Hoberg and Phillips research library | Community detection |
| bea_centrality | Input-output tables | Bureau of Economic Analysis | Graph centrality |
| link_prediction | Product markets | Hoberg and Phillips | Link prediction |
| spatial_regression | Earnings surprises | IBES; Hoberg and Phillips | Spatial regression |
| fomc_topics | FOMC meetings | Federal Reserve website | Topic models |
| mda_sentiment | 10-K filings | SEC Edgar; Loughran and McDonald research library | Sentiment analysis |
| business_description | 10-K filings | SEC Edgar | POS tagging; Density-based clustering |
| classification_models | Industry classification | SEC Edgar | Classification |
| regression_models | Macroeconomic forecasts | FRED-MD | Regression |
| deep_classifier | Industry classification | SEC Edgar | Neural networks; Word embeddings |
| recurrent_net | Macroeconomic models | FRED-MD | RNN; Dynamic factor models |
| convolutional_net | Macroeconomic forecasts | FRED-MD | CNN; Vector autoregression |
| reinforcement_learning | Spending policy | SBBI | Reinforcement learning |
| fomc_language | Fedspeak | FOMC meeting minutes | Language modelling; Transformers |
| sentiment_llm | Financial news sentiment | Kaggle | LLM prompt engineering |
| summarization_llm | 10-K filings | SEC Edgar | LLM text summarization |
| finetune_llm | Industry classification | SEC Edgar | LLM fine-tuning |
| rag_agent | Corporate philanthropy | textbooks | LLM RAG, chatbots, agents |
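
As a flavor of the kind of recipe the `option_pricing` row refers to, here is a minimal, self-contained sketch that prices a European call with both a binomial tree and the Black-Scholes-Merton formula. It does not use the FinDS API: the function names `bsm_call` and `crr_call`, the Cox-Ross-Rubinstein tree parameterization, and the input values are all assumptions chosen for illustration, and the notebook itself may be implemented differently.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes-Merton price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def crr_call(S: float, K: float, T: float, r: float, sigma: float,
             steps: int = 500) -> float:
    """European call priced on a Cox-Ross-Rubinstein binomial tree."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor
    d = 1.0 / u                           # down factor
    p = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-r * dt)              # one-step discount factor

    # Option payoffs at expiry, indexed by the number of up moves j
    values = [max(S * u ** j * d ** (steps - j) - K, 0.0)
              for j in range(steps + 1)]

    # Backward induction: discounted risk-neutral expectation at each node
    for _ in range(steps):
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(len(values) - 1)]
    return values[0]


if __name__ == "__main__":
    # Hypothetical inputs: at-the-money call, 1 year, 5% rate, 20% volatility
    args = dict(S=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2)
    print(f"BSM call price:  {bsm_call(**args):.4f}")
    print(f"Tree call price: {crr_call(**args):.4f}")
```

With enough steps the tree price should converge toward the closed-form value, which makes the pair a convenient cross-check before moving on to the Greeks or Monte Carlo simulation.
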
Contact
GitHub: https://terence-lim.github.io