## Clustering on the World Happiness Report 2019

This site is no longer receiving frequent updates. Please see GitHub for my latest projects: github.com/Rypo.

Attempting to quantify happiness. Building clustering models on the 2019 World Happiness Report.

**Tags:**
AffinityPropagation,
Agglomerative,
clustering,
GMM,
happiness,
KMeans,
python

What to do when things go too well. Building and comparing XGBoost and Random Forest models on the Agaricus dataset (Mushroom Database).

**Tags:**
agaricus,
LIME,
python,
SHAP,
synthetic

2016 Kaggle Caravan Insurance Challenge (Part 2 of 2). Dimensionality reduction and feature analysis.

**Tags:**
dimensionalityReduction,
featureImportance,
featureSelection,
PCA,
python,
RFE,
t-SNE,
UMAP,
unbalanced

2016 Kaggle Caravan Insurance Challenge (Part 1 of 2). Dealing with unbalanced data.

**Tags:**
Bagging,
Boosting,
oversampling,
python,
RandomForest,
SMOTE,
unbalanced,
undersampling

Getting started with modeling. Multiple approaches to Multiple Linear Regression using the US DoT airfare dataset.

**Tags:**
airfare,
linear,
python,
regression

Analyzing a mock voters dataset using ANOVA, t-tests, and Tukey’s Range Test.

**Tags:**
ANOVA,
python,
statistics,
ttests,
voting

Plotting a few common statistical functions: the PDF, CDF, and iCDF.

**Tags:**
functions,
plotting,
probability,
python,
statistics

A brief introduction to data analysis with Python using the Fortune 500 dataset.

**Tags:**
EDA,
fortune500,
introduction,
python

Getting started with modeling. Multiple approaches to Multiple Linear Regression using the classic Boston Housing dataset.

**Tags:**
airfare,
linear,
regression,
r

Analyzing the classic sleep dataset using two-sample and paired t-tests, and calculating statistical power.

**Tags:**
power,
r,
sleep,
statistics,
ttests

Plotting a few common statistical functions: the PDF, CDF, and iCDF.

**Tags:**
functions,
plotting,
probability,
r,
statistics

What to do when things go too well. Building and comparing XGBoost and Random Forest models on the Agaricus dataset (Mushroom Database).

**Tags:**
agaricus,
RandomForest,
r,
synthetic,
XGBoost

Attempting to quantify happiness. Building clustering models on the 2016 World Happiness Report.

**Tags:**
AffinityPropagation,
Agglomerative,
clustering,
GMM,
happiness,
KMeans,
r

2016 Kaggle Caravan Insurance Challenge (Part 1 of 2). Dealing with unbalanced data.

**Tags:**
Bagging,
Boosting,
oversampling,
RandomForest,
r,
SMOTE,
unbalanced,
undersampling

A brief introduction to data analysis with R using the Fortune 500 dataset.

**Tags:**
EDA,
fortune500,
introduction,
r