Tools
Current Version: 1.1
Updated: Nov 2022
Included Datasets
IRIS
Input Features:
Sepal Length, Sepal Width, Petal Length, Petal Width
Task:
Classification between three classes (Setosa, Versicolour, Virginica)
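A minimal sketch of loading this dataset, assuming scikit-learn as the backing library (the tool may load it differently):

```python
# Illustrative only: load the IRIS dataset via scikit-learn's built-in copy.
from sklearn.datasets import load_iris

iris = load_iris()
print(iris.feature_names)        # sepal/petal length and width (cm)
print(list(iris.target_names))   # the three target classes
print(iris.data.shape)           # 150 samples, 4 features
```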
Titanic
Input Features:
Class (1 = 1st; 2 = 2nd; 3 = 3rd), Name, Sex, Age, sibsp (Number of Siblings/Spouses Aboard), parch (Number of Parents/Children Aboard), Ticket Number, Fare, Cabin, Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)
Task:
Binary Classification (1=Survived, 0=Did not Survive)
Diabetes
Input Features:
Pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, Age
Task:
Binary Classification (1 = has diabetes, 0 = does not)
Loan
Input Features:
Gender, Marriage Status, Education Level, Dependents, Applicant Income, Co-Applicant Income, Loan Amount, Loan Amount Term, Credit History, Property Area
Task:
Binary Classification (Approved vs Denied)
Reddit
Input Features:
List of words in Reddit comments
Task:
Classification between three classes (Physics Post, Chemistry Post, and Biology Post)
MNIST
Input Features:
16 integer features: the x- and y-coordinates of the pen on a 28x28 grid, sampled at 8 time points while writing a digit
Task:
Classification between 10 classes (0-9)
AI Models
Decision Tree (Link)
A non-parametric supervised learning method for classification and regression. Decision trees predict the target value by learning simple decision rules inferred from the data features.
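A sketch of the decision-rule idea using scikit-learn's DecisionTreeClassifier (illustrative only; not necessarily how the tool trains its trees):

```python
# Fit a shallow decision tree on IRIS and print its learned if-then rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The learned rules are human-readable threshold splits on the features.
print(export_text(tree, feature_names=load_iris().feature_names))
```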
Random Forest (Link)
An ensemble learning method for classification and regression. Random Forests construct many decision trees and infer target values by aggregating the trees' predictions: a majority vote for classification, an average for regression.
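The vote-aggregation idea can be sketched with scikit-learn's RandomForestClassifier (a stand-in for however the tool wires this up):

```python
# Each tree in the forest votes; the forest predicts the majority class.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Inspect the individual trees' votes for one sample.
votes = [t.predict(X[:1])[0] for t in forest.estimators_]
print(forest.predict(X[:1]))   # the aggregated (majority-vote) prediction
```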
XGBoost (Link)
A gradient-boosted ensemble of decision trees. Weak predictors (shallow decision trees) are trained one at a time, each one fit to the errors of the ensemble trained so far.
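A sketch of sequential boosting, using scikit-learn's GradientBoostingClassifier as a stand-in for XGBoost (the xgboost package's own XGBClassifier exposes a similar fit/predict API):

```python
# Trees are added one at a time; each new tree corrects the ensemble's errors.
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)
model = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                   max_depth=2, random_state=0).fit(X, y)
print(model.score(X, y))
```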
MLP (Link)
A feed-forward neural network (multi-layer perceptron) for classification and regression, trained with backpropagation.
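A minimal sketch with scikit-learn's MLPClassifier, assuming standardized inputs (the tool's network architecture and training setup may differ):

```python
# Feed-forward network with one hidden layer; inputs are scaled first,
# which MLPs generally need to train well.
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
).fit(X, y)
print(mlp.score(X, y))
```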
KNN (Link)
A non-parametric supervised learning method for classification and regression. Entries are classified by a plurality vote of their nearest neighbors, weighted by the distance between them in the feature space.
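The distance-weighted vote can be sketched with scikit-learn's KNeighborsClassifier (illustrative; the tool's distance metric and k may differ):

```python
# Classify by a vote of the 5 nearest neighbors; closer neighbors get
# proportionally larger weight via weights='distance'.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5, weights='distance').fit(X, y)
print(knn.predict(X[:1]))
```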
Explainers
LIME
A model-agnostic local explainer for black-box models that trains interpretable local surrogate models to explain individual predictions.
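A from-scratch sketch of the local-surrogate idea: perturb one instance, weight the perturbations by proximity, and fit a simple linear model locally. The real lime package (e.g. LimeTabularExplainer) does considerably more (feature discretization, sampling strategies); this only illustrates the core mechanism.

```python
# LIME's core idea, minimally: explain one prediction of a black-box model
# with a linear surrogate fit on proximity-weighted perturbations.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = load_iris(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

rng = np.random.default_rng(0)
x0 = X[0]                                        # instance to explain
Z = x0 + rng.normal(scale=0.5, size=(500, 4))    # perturbations near x0
target_class = black_box.predict(x0.reshape(1, -1))[0]
probs = black_box.predict_proba(Z)[:, target_class]
weights = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2)  # proximity kernel

# Interpretable surrogate: a weighted linear model, valid only near x0.
surrogate = Ridge(alpha=1.0).fit(Z, probs, sample_weight=weights)
print(surrogate.coef_)   # per-feature local importance for this prediction
```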
SHAP
A model-agnostic local explainer based on Shapley values. SHAP interprets the impact of a feature taking a certain value by comparing the prediction to the one we would make if that feature took some baseline value.
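To make the Shapley-value idea concrete, here is a brute-force computation for a toy additive 3-feature "model" (feature names and payoffs are made up); the shap package approximates this efficiently for real models.

```python
# Exact Shapley values by enumerating feature subsets: each feature's value is
# its weighted average marginal contribution across all subsets of the others.
from itertools import combinations
from math import factorial

def f(present):
    # Toy additive model: each present feature contributes a fixed payoff.
    payoffs = {'glucose': 3.0, 'bmi': 2.0, 'age': 1.0}
    return sum(payoffs[p] for p in present)

def shapley(feature, features):
    n = len(features)
    others = [g for g in features if g != feature]
    total = 0.0
    for k in range(n):
        for S in combinations(others, k):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += w * (f(set(S) | {feature}) - f(set(S)))
    return total

feats = ['glucose', 'bmi', 'age']
values = {g: shapley(g, feats) for g in feats}
print(values)
```

Because the toy model is additive, each Shapley value equals the feature's own payoff, and the values sum to the full-model output (the efficiency property).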
Anchor
An explainer for individual predictions of any black-box classification model. It finds a decision rule that "anchors" the prediction: a rule anchors a prediction if, with the rule's features held fixed, changes to the other feature values do not affect the prediction.
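A minimal sketch of checking whether a candidate rule anchors a prediction: hold the rule's features fixed, resample the rest, and measure how often the prediction stays the same. The anchor choice below (the petal features) is an illustrative assumption, not the algorithm's own rule search.

```python
# Estimate a candidate anchor's "precision": the fraction of perturbed samples
# (with the anchored features held fixed) that keep the original prediction.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]
pred = model.predict(x0.reshape(1, -1))[0]
anchored = [2, 3]    # candidate rule: fix petal length and petal width

rng = np.random.default_rng(0)
Z = X[rng.integers(0, len(X), 1000)].copy()   # resample other feature values
Z[:, anchored] = x0[anchored]                 # ...but hold anchored ones fixed
precision = np.mean(model.predict(Z) == pred)
print(precision)   # high precision -> the rule anchors the prediction
```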