A brief introduction to data analysis with Python using the Fortune 500 dataset.
Python Analyses List
A collection of analyses performed in Python.
Plotting a few common statistical functions: the PDF, CDF, and inverse CDF (iCDF).
Analyzing a mock voters dataset using ANOVA, t-tests, and Tukey’s range test.
Getting started with modeling. Multiple approaches to Multiple Linear Regression using the US DoT airfare dataset.
2016 Kaggle Caravan Insurance Challenge (Part 1 of 2). Dealing with imbalanced data.
2016 Kaggle Caravan Insurance Challenge (Part 2 of 2). Dimensionality reduction and feature analysis.
What to do when things go too well. Building and comparing XGBoost and Random Forest models on the Agaricus dataset (Mushroom Database).
Attempting to quantify happiness. Building clustering models on the 2019 World Happiness Report.