Szilard Pafka
benchm-ml
A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algor...
GBM-perf
Performance of various open source GBM implementations
benchm-databases
A minimal benchmark of various tools (statistical software, databases etc.) for working with tabular data of moderately large sizes (interactive data analysis).
benchm-dl
Playing with various deep learning tools and network architectures
datascience-latency
Latency numbers every data scientist should know (aka the pyramid of analytical tasks) - the order of magnitude of computational time for the most common analytical tasks (SQL-like data munging, linea...
dataset-sizes-kdnuggets
Size of datasets used for analytics based on 10 years of surveys by KDnuggets.
GBM-multicore
GBM multicore scaling: h2o, xgboost and lightgbm on multicore and multi-socket systems
GBM-tune
Tuning GBMs (hyperparameter tuning) and impact on out-of-sample predictions