Introduction: Fortune 500 Companies in R
A brief introduction to data analysis with R using the Fortune 500 dataset.
A collection of analyses performed in R.
A brief introduction to data analysis with R using the Fortune 500 dataset.
2016 Kaggle Caravan Insurance Challenge (Part 1 of 2). Dealing with imbalanced data.
2016 Kaggle Caravan Insurance Challenge (Part 2 of 2). Dimensionality reduction and feature analysis.
Attempting to quantify happiness. Building clustering models on the 2016 World Happiness Report.
What to do when things go too well. Building and comparing XGBoost and Random Forest models on the Agaricus dataset (Mushroom Database).
Plotting a few common statistical functions, namely the PDF, CDF, and iCDF.
Analyzing the classic sleep dataset using two-sample and paired t-tests, and calculating statistical power.
Getting started with modeling. Multiple approaches to multiple linear regression using the classic Boston Housing dataset.