## Clustering on the World Happiness Report 2019

Attempting to quantify happiness. Building clustering models on the 2019 World Happiness Report.

** Tags: **
AffinityPropagation,
Agglomerative,
clustering,
GMM,
happiness,
KMeans,
python

What to do when things go too well. Building and comparing XGBoost and Random Forest models on the Agaricus dataset (Mushroom Database).

** Tags: **
agaricus,
LIME,
python,
SHAP,
synthetic

2016 Kaggle Caravan Insurance Challenge (Part 2 of 2). Dimensionality reduction and feature analysis.

** Tags: **
dimensionalityReduction,
featureImportance,
featureSelection,
PCA,
python,
RFE,
t-SNE,
UMAP,
unbalanced

2016 Kaggle Caravan Insurance Challenge (Part 1 of 2). Dealing with imbalanced data.

** Tags: **
Bagging,
Boosting,
imbalanced,
oversampling,
python,
RandomForest,
SMOTE,
undersampling

Getting started with modeling. Multiple approaches to Multiple Linear Regression using the US DoT airfare dataset.

** Tags: **
airfare,
linear,
python,
regression

Analyzing a mock voters dataset using ANOVA, t-tests, and Tukey’s Range Test.

** Tags: **
ANOVA,
python,
statistics,
ttests,
voting

Plotting a few common statistical functions, namely the PDF, CDF, and iCDF.

** Tags: **
functions,
plotting,
probability,
python,
statistics

A brief introduction to data analysis with Python using the Fortune 500 dataset.

** Tags: **
EDA,
fortune500,
introduction,
python

** Tags: **
airfare,
linear,
regression,
r

** Tags: **
power,
r,
sleep,
statistics,
ttests

** Tags: **
functions,
plotting,
probability,
r,
statistics

** Tags: **
agaricus,
RandomForest,
r,
synthetic,
XGBoost

** Tags: **
AffinityPropagation,
Agglomerative,
clustering,
GMM,
happiness,
KMeans,
r

** Tags: **
dimensionalityReduction,
featureSelection,
PCA,
RFE,
r,
t-SNE,
UMAP,
unbalanced

** Tags: **
Bagging,
Boosting,
imbalanced,
oversampling,
RandomForest,
r,
SMOTE,
undersampling

** Tags: **
EDA,
fortune500,
introduction,
r