Credit risk interpretability platform using RandomForest + SHAP with MLflow tracking and FastAPI serving
A self-contained Python project that generates a synthetic population of Chilean credit applicants, trains and tracks a credit-risk model with MLflow, and exposes FastAPI endpoints that return predictions plus SHAP-powered explanations for denied customers. The README below walks through each layer so you can study the workflow end-to-end.
| Directory | Purpose |
|---|---|
| src/data | Synthetic Chilean dataset generator, summary helpers, CLI entry point (generate-data). |
| src/model | Model training pipeline (preprocessing, RandomForest, MLflow logging) and CLI (train-model). |
| src/api | FastAPI service that loads the MLflow artifact, caches a SHAP explainer, and serves /predict + /explain. |
| tests | Unit/integration tests plus fixtures derived from a reproducible 500-row sample. |
| notebooks | Narrative analysis (EDA, model performance, interpretability walk-through). |
| docs | Architecture diagrams (Mermaid flows) and future monitoring scaffolds. |
| outputs/ | Runtime artifacts (confusion matrix, metrics, pipeline joblib) produced during training. |
| mlruns/ | Default MLflow tracking store for experiments and model registry stages. |
flowchart TD
subgraph Dataset
A[generate-data CLI] --> B[Chilean customer table]
end
B --> C["Train (+preprocess + RandomForest)"]
C --> D[MLflow run + artifacts]
D --> E[FastAPI service picks latest model]
E --> F[SHAP explainer + /predict + /explain]
F --> G[Documentation/notebooks capture interpretability stories]
- generate-data produces 250k+ rows with demographics, housing, income, credit history, and macroeconomic features tailored to Chilean regions. The dataset embeds a logistic score + approval flag.
- train-model loads the Parquet dataset, preprocesses numerics/categoricals, fits a RandomForestClassifier, computes metrics (ROC-AUC, Brier score, confusion matrix), logs everything to MLflow, and stores artifacts under outputs/.
- run-api launches FastAPI; the service loads the MLflow pipeline, samples background data for SHAP, and exposes inference/explanation endpoints.

| Feature group | Columns | Notes |
|---|---|---|
| Demographics | age, gender, region, education_level, marital_status, housing_status, dependents | Mix of categorical and ordinal values; Chilean regions prioritize Metropolitana/Valparaíso. |
| Financial profile | income, employment_years, credit_history_months, num_loans, avg_monthly_payment | Income is clipped to [200k, 4M] CLP; avg_monthly_payment scales with income. |
| Risk signals | delinquency_rate, credit_inquiries, approval_score, approved | Score combines income, education, regional boost, employment tenure, delinquency; approved is derived via logistic probability. |
| Macros | regional_cpi, regional_unemployment | Drawn from normal distributions around Chilean averages to mimic macro cycles. |
The generator also exposes dataset_summary for quick sanity checks and save_dataset for CSV/Parquet exports.
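
To make the "logistic score + approval flag" idea concrete, here is a minimal standard-library sketch of how one synthetic applicant could be drawn. The distributions, the weights, and the names make_applicant and generate are illustrative assumptions, not the project's actual generator; only the income bounds come from the table above.

```python
import math
import random

# Income bounds in CLP, taken from the feature table above.
INCOME_MIN, INCOME_MAX = 200_000, 4_000_000


def make_applicant(rng: random.Random) -> dict:
    """Draw one synthetic applicant (hypothetical distributions and weights)."""
    # Log-normal income, clipped to the documented CLP range.
    income = min(max(rng.lognormvariate(13.5, 0.6), INCOME_MIN), INCOME_MAX)
    employment_years = rng.randint(0, 30)
    delinquency_rate = rng.betavariate(1, 12)

    # Linear score: income and tenure push up, delinquency pushes down.
    # These weights are made up for illustration.
    z = (
        0.9 * (math.log(income) - 13.5)
        + 0.05 * employment_years
        - 6.0 * delinquency_rate
    )
    approval_score = 1.0 / (1.0 + math.exp(-z))  # logistic probability
    approved = int(rng.random() < approval_score)  # Bernoulli draw from the score

    return {
        "income": income,
        "employment_years": employment_years,
        "delinquency_rate": delinquency_rate,
        "approval_score": approval_score,
        "approved": approved,
    }


def generate(rows: int, seed: int = 42) -> list[dict]:
    """Generate a reproducible list of applicants from a fixed seed."""
    rng = random.Random(seed)
    return [make_applicant(rng) for _ in range(rows)]
```

Seeding a dedicated random.Random instance keeps the sample reproducible, which is the property the 500-row test fixtures rely on.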
| Command | Purpose |
|---|---|
| poetry run generate-data --rows 250000 --prefix data/chilean_credit_data | Create the synthetic dataset (writes Parquet, optional CSV). |
| poetry run train-model --data data/chilean_credit_data.parquet | Fit the RandomForest, log metrics/params to MLflow, save confusion matrix + metrics JSON + pipeline joblib. |
| mlflow ui | Inspect tracked runs, compare experiments, and promote runs into the registry if desired. |
- Model artifact: mlruns/<experiment>/<run_id>/artifacts/model. The FastAPI service expects MODEL_URI to point to this path.
- outputs/confusion_matrix.png, outputs/metrics.json, and outputs/pipeline.joblib are logged with every run.
- Interpretability integration: During training the approval score and feature counts are retained to align with downstream SHAP explanations.
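
For readers who want to see what the logged metrics mean, the sketch below computes ROC-AUC, the Brier score, and confusion-matrix counts by hand on four made-up predictions. It is an illustration only: a training pipeline would normally call sklearn.metrics.roc_auc_score and brier_score_loss instead, and the key names in the resulting dictionary and the `>=` threshold convention are assumptions rather than the contents of outputs/metrics.json.

```python
import json


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP/FP/TN/FN after thresholding the probabilities."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(y_true, y_prob):
        pred = int(p >= threshold)
        # "t"/"f" says whether the prediction was right, "p"/"n" what it was.
        key = ("t" if pred == y else "f") + ("p" if pred else "n")
        counts[key] += 1
    return counts


def roc_auc(y_true, y_prob):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and probabilities, invented for the example.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]

metrics = {
    "roc_auc": roc_auc(y_true, y_prob),
    "brier_score": brier_score(y_true, y_prob),
    **confusion_counts(y_true, y_prob),
}
metrics_json = json.dumps(metrics, indent=2)  # what a metrics file could hold
```

A lower Brier score means better-calibrated probabilities, which matters here because /predict reports the probability alongside the decision.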
| Path | Method | Description |
|---|---|---|
| /predict | POST | Returns { approved: bool, probability: float }. When include_explanation=true, includes SHAP contributions fetched lazily. |
| /explain | POST | Forces SHAP explanation and returns base value + per-feature contributions. Useful for offline denial review. |
{
"customer": {
"age": 45,
"gender": "f",
"region": "Metropolitana",
"education_level": "media",
"marital_status": "casado",
"housing_status": "arriendo",
"dependents": 2,
"income": 800000,
"employment_years": 6,
"credit_history_months": 84,
"num_loans": 2,
"avg_monthly_payment": 150000,
"delinquency_rate": 0.05,
"credit_inquiries": 1,
"regional_cpi": 0.034,
"regional_unemployment": 0.086
},
"include_explanation": true
}
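
A request body like the one above could be sent from Python with nothing but the standard library. This is a sketch under assumptions: the base URL http://localhost:8000 is a guess at a local development server, and build_request and call are hypothetical helper names.

```python
import json
import urllib.request

# Assumed address of a locally running API; adjust to your deployment.
API_URL = "http://localhost:8000"


def build_request(path, customer, include_explanation=False):
    """Build a JSON POST request for /predict or /explain without sending it."""
    body = json.dumps(
        {"customer": customer, "include_explanation": include_explanation}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path, customer, include_explanation=False):
    """Send the request and decode the JSON response (needs a running service)."""
    req = build_request(path, customer, include_explanation)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

With the service running, call("/predict", customer, include_explanation=True) would return the decoded response dictionary.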
- At startup the service loads the MLflow pipeline and samples background rows (from BACKGROUND_DATA_PATH) to build a shap.TreeExplainer.
- /predict uses the pipeline’s predict_proba to return a probability (threshold 0.5) and optionally returns the cached explanation for transparency dashboards.
- /explain recomputes SHAP values for the request, exports the base value, and lists feature contributions so analysts can see why a customer was denied.

flowchart TD
subgraph API startup
A[Load MLflow pipeline] --> B[Sample background data]
B --> C[Initialize SHAP TreeExplainer]
end
subgraph Request
D["/predict"] --> E[Transform input]
E --> F[Predict probability]
F --> G{explanation requested?}
G -->|yes| H[Return SHAP contributions]
end
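
The request flow above can be sketched as two plain functions that assemble the response once a probability and SHAP values are available. The field names explanation, base_value, contributions, feature, and shap_value are hypothetical, and the numbers in the example are invented rather than taken from a trained model. The one property the sketch relies on is SHAP's additivity: the base value plus all contributions reconstructs the model output being explained.

```python
def explanation_payload(base_value, contributions, top_k=None):
    """Rank per-feature SHAP values by absolute impact, largest first."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    return {
        "base_value": base_value,
        "contributions": [
            {"feature": name, "shap_value": value} for name, value in ranked
        ],
    }


def predict_response(probability, base_value=None, contributions=None,
                     include_explanation=False, threshold=0.5):
    """Build a /predict-style response; attach the explanation only on request."""
    response = {"approved": probability >= threshold, "probability": probability}
    if include_explanation and contributions is not None:
        response["explanation"] = explanation_payload(base_value, contributions)
    return response


# Invented values for a denied applicant: 0.62 - 0.21 + 0.05 - 0.04 = 0.42.
base_value = 0.62
contributions = {"delinquency_rate": -0.21, "income": 0.05, "credit_inquiries": -0.04}
probability = base_value + sum(contributions.values())

response = predict_response(probability, base_value, contributions,
                            include_explanation=True)
```

Sorting by absolute value puts the feature that moved the prediction the most at the top, which is usually what an analyst reviewing a denial wants to read first.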
- notebooks/01-eda.ipynb: Validate regional income spreads, approval rates, and highlight the distribution of approved vs. denied cases.
- notebooks/02-model-performance.ipynb: Visualize the confusion matrix, ROC curve, Brier score, and calibration bucket analysis to evaluate classifier reliability.
- notebooks/03-interpretability.ipynb: Simulate a denied customer, call /explain via the FastAPI test client, and record the SHAP force plot values.

Each notebook includes markdown commentary describing what to look for (e.g., “Why is the Maule customer denied?”) so learners can follow the reasoning for interpretability.
| Suite | Command | Notes |
|---|---|---|
| Unit (dataset) | python -m pytest tests/test_data_generator.py | Validates distributions (income bounds, region spread, binary approval). |
| Integration (API) | python -m pytest tests/test_api.py | Boots FastAPI, points to a fixture model, and checks /predict + /explain responses. |
| Notebook execution | poetry run python -m nbconvert --to notebook --execute notebooks/01-eda.ipynb | Ensures narrative notebooks run with current dependencies. |
- Log per-group outcomes for region, gender, and education_level as MLflow metrics and compare them batch-to-batch. Trigger alerts if disparities exceed ~5%. Document findings in notebooks/01-eda.ipynb.
- Promote models through the MLflow registry stages Staging/Production. Only register runs that also preserve interpretable SHAP rankings (e.g., income, debt ratio, delinquency should remain dominant).
- Emit structured logs (e.g., python-json-logger) for every /predict request and record whether SHAP explanations were returned.
- Build a Streamlit or Literate UI that shows denied customers, their SHAP breakdowns, and simulation sliders for income or unemployment.
- Write docs/interpretability-report.md that explains how SHAP features map to business policies (e.g., “High delinquency pushes the model to deny credit”).

# regenerate dataset
poetry run generate-data --rows 250000 --prefix data/chilean_credit_data --csv
# train the RandomForest and log to MLflow
poetry run train-model --data data/chilean_credit_data.parquet
# launch explainability FastAPI
export MODEL_URI="mlruns/0/<run_id>/artifacts/model"
export BACKGROUND_DATA_PATH="data/chilean_credit_data.parquet"
poetry run run-api
# run the test suite and execute the EDA notebook
python -m pytest
poetry run python -m nbconvert --to notebook --execute notebooks/01-eda.ipynb
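
The group-disparity check described in the monitoring notes earlier could be prototyped as follows. This sketch assumes that "disparities exceed ~5%" means a gap of more than 5 percentage points between the highest and lowest group approval rate; that reading, and the helper names approval_rates and disparity_alert, are assumptions. The resulting rates could then be logged as MLflow metrics.

```python
from collections import defaultdict


def approval_rates(rows, group_key):
    """Approval rate per group, e.g. per region, gender, or education_level."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += 1
        approved[row[group_key]] += int(row["approved"])
    return {group: approved[group] / totals[group] for group in totals}


def disparity_alert(rates, max_gap=0.05):
    """Return (alert, gap), where gap is the highest minus the lowest rate."""
    gap = max(rates.values()) - min(rates.values())
    return gap > max_gap, gap


# Tiny invented batch: 3 of 4 approved in one region, 2 of 4 in the other.
batch = (
    [{"region": "Metropolitana", "approved": a} for a in (1, 1, 1, 0)]
    + [{"region": "Maule", "approved": a} for a in (1, 1, 0, 0)]
)
rates = approval_rates(batch, "region")
alert, gap = disparity_alert(rates)
```

Running the same check on each new batch and comparing the gaps over time gives the batch-to-batch view the monitoring notes ask for.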
For a study-oriented walkthrough, follow the notebooks top-to-bottom: start with EDA, move to model diagnostics, and finish by replaying the SHAP explanations that justify why a customer was denied credit. Each notebook references the FastAPI service and MLflow artifacts so you can trace every step from synthetic data to an interpretable denial.