BeXAI
A Benchmark Suite for Explainable AI
BeXAI is a flexible benchmark suite for evaluating AI models and explainers
with explainability metrics across a variety of datasets.
BeXAI has the following features:
State-of-the-art AI models in Python
The latest explainers to interpret model behavior
A ready-to-use Python virtual environment for an easy start
A modular code organization, making it easier to bring your own models, explainers, and datasets
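To illustrate what a modular organization like this can look like, here is a hedged sketch of plug-in interfaces for models and explainers. The class and method names (`Model`, `Explainer`, `predict`, `explain`) are illustrative assumptions, not the actual BeXAI API.

```python
# Hypothetical sketch of plug-in interfaces for a benchmark like BeXAI.
# All names here are illustrative, not the real BeXAI code.
from abc import ABC, abstractmethod

class Model(ABC):
    """Any predictor the benchmark can evaluate."""
    @abstractmethod
    def predict(self, x):
        ...

class Explainer(ABC):
    """Any explainer that assigns importance scores to input features."""
    @abstractmethod
    def explain(self, model, x):
        ...

class MeanModel(Model):
    """Toy model: predicts the mean of the input features."""
    def predict(self, x):
        return sum(x) / len(x)

class UniformExplainer(Explainer):
    """Toy explainer: attributes importance equally to every feature."""
    def explain(self, model, x):
        return [1.0 / len(x)] * len(x)

model = MeanModel()
explainer = UniformExplainer()
print(model.predict([1.0, 2.0, 3.0]))              # 2.0
print(explainer.explain(model, [1.0, 2.0, 3.0]))   # three equal scores
```

With interfaces like these, bringing your own model or explainer reduces to implementing one abstract method, and the benchmark harness can treat every implementation uniformly.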
Motivation
The introduction of sophisticated models such as neural networks and ensemble methods has led to better accuracy, which in turn has enabled compelling AI-powered products. Despite their strong performance, the internal behavior of these models is difficult to understand (the "black box" problem), and this has hindered their deployment at scale in sensitive sectors. As a result, there is growing interest in, and work on, making machine learning models more explainable. However, there is no standard for measuring these new explainability improvements against prior work. BeXAI is a benchmark suite that aims to address this gap. The current implementation compares three model-agnostic explainers.
Use Cases
How does your AI model stack up?
Compare your model's accuracy and speed against existing state-of-the-art models
How interpretable is your data?
See how well existing models extract the trends you expect to find in your datasets
How clear is your new explainer?
Check how clear and faithful your explainer's outputs are compared to those of existing explainers
How well does your hardware perform?
Evaluate your hardware or simulate your new hardware designs with real XAI workloads
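As a concrete illustration of the faithfulness comparison mentioned above, here is a hedged sketch of a perturbation-based check: remove the feature the explainer ranks highest and measure how much the prediction changes. This is one common way to quantify faithfulness, not necessarily the metric BeXAI implements; the toy `predict` and `explain` functions below are assumptions for the example.

```python
# Hedged sketch: a perturbation-based faithfulness check.
# If the explainer's top-ranked feature really matters most, ablating it
# should cause the largest change in the model's prediction.

def predict(x):
    # Toy linear model: feature 0 dominates.
    return 3.0 * x[0] + 1.0 * x[1] + 0.5 * x[2]

def explain(x):
    # Toy attribution: coefficient times input (exact for a linear model).
    return [3.0 * x[0], 1.0 * x[1], 0.5 * x[2]]

def faithfulness(x, baseline=0.0):
    scores = explain(x)
    # Index of the feature the explainer ranks as most important.
    top = max(range(len(x)), key=lambda i: abs(scores[i]))
    ablated = list(x)
    ablated[top] = baseline
    # Larger drop when removing the top feature => more faithful ranking.
    return abs(predict(x) - predict(ablated))

x = [1.0, 1.0, 1.0]
print(faithfulness(x))  # 3.0: ablating the dominant feature drops the output by 3
```

Running the same check with several explainers on the same model gives a simple, model-agnostic way to rank their faithfulness.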