A minimal benchmark for scalability, speed, and accuracy of commonly used open-source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib, etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks, etc.).
updated at June 18, 2024, 12:32 p.m.
Materials for STATS 418 (Tools in Data Science), a course taught in the Master of Applied Statistics program at UCLA.
updated at April 11, 2024, 7:32 a.m.