Raj Aryan

AI-Powered Human Health Prediction System

Integrating High-Performance Ensemble Models with Explainable AI (XAI) for Transparent & Trustworthy Medical Disease Prediction.

1. The Core Challenge

Bridging the Gap Between Accuracy and Trust in Medical AI

The Diagnostic Dilemma

  • 📈 Rising Disease Burden: Increasing prevalence of chronic diseases such as heart disease, diabetes, and stroke.
  • Diagnostic Delays: Traditional methods can be slow, costly, and inaccessible.

The AI "Black Box" Problem

  • 🤖 Lack of Transparency: High-accuracy AI models often operate as "black boxes," making their reasoning unclear.
  • Trust Deficit: Clinicians and patients are hesitant to trust AI predictions without clear, understandable explanations.

2. Research Methodology

A Structured Approach to Trustworthy AI

End-to-End Project Workflow

1. Data Collection

Aggregate & Preprocess Public Datasets

2. Ensemble Modeling

Develop Individual, Ensemble & Master Models

3. XAI Integration

Implement SHAP & LIME for Explainability

4. Deployment

Launch as a User-Friendly Web App
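
As a concrete illustration of steps 1 and 2 of this workflow, the following is a minimal Python sketch, assuming a hypothetical aggregated dataset `heart_disease.csv` with a binary `target` column; it shows a simplified baseline, not the exact preprocessing used in the project.

```python
# Minimal sketch of steps 1-2: load a public dataset, preprocess, and fit a baseline model.
# The file name "heart_disease.csv" and the "target" column are illustrative placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("heart_disease.csv")                 # hypothetical aggregated dataset
X, y = df.drop(columns=["target"]), df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling plus a simple baseline classifier; the ensembles in Section 3 replace this step.
baseline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
baseline.fit(X_train, y_train)
print("Baseline test accuracy:", baseline.score(X_test, y_test))
```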

3. Model Performance Analysis

A Comprehensive Comparison of 66 Models

Individual Model Performance Across Diseases

We evaluated 10 different models for each of the four diseases. Below is a summary of the test accuracies, highlighting top performers such as XGBoost and LightGBM.
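
A hedged sketch of what such a per-disease comparison loop can look like; the candidate list, hyperparameters, and cross-validation setup are illustrative rather than the exact 10 models evaluated, and `X_train`/`y_train` reuse the names from the workflow sketch above.

```python
# Illustrative per-disease model comparison via 5-fold cross-validation.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=42),
    "LightGBM": LGBMClassifier(random_state=42),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```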

Ensemble Model Performance (Stroke Prediction)

Ensemble methods were then evaluated to improve upon the individual models. For stroke prediction, bagging emerged as the top-performing ensemble technique.
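
A minimal sketch of a bagging ensemble in scikit-learn; the ensemble size is illustrative, and `X_train`/`y_train`/`X_test`/`y_test` would be the preprocessed stroke data rather than the exact configuration reported above.

```python
# Bagging: many models fit on bootstrap resamples, with predictions combined by vote.
from sklearn.ensemble import BaggingClassifier

bagging = BaggingClassifier(
    n_estimators=100,      # number of bootstrap-resampled learners (default base estimator is a decision tree)
    random_state=42,
)
bagging.fit(X_train, y_train)
print("Bagging test accuracy:", bagging.score(X_test, y_test))
```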

Master Multi-Disease Model Accuracy

Finally, we developed master models capable of predicting all four diseases simultaneously. The Multi-Output XGBoost model achieved the highest overall accuracy.
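
One way to build such a master model is to wrap a gradient-boosted classifier in scikit-learn's `MultiOutputClassifier`, with one label column per disease. The sketch below assumes a hypothetical combined dataset `df_master` and placeholder label-column names, not the project's actual schema.

```python
# Sketch of a multi-output master model: one binary label per disease.
from sklearn.multioutput import MultiOutputClassifier
from xgboost import XGBClassifier

label_cols = ["heart_disease", "diabetes", "stroke", "fourth_disease"]  # placeholder names
Y_multi = df_master[label_cols]                  # hypothetical combined dataset
X_multi = df_master.drop(columns=label_cols)

master = MultiOutputClassifier(XGBClassifier(eval_metric="logloss", random_state=42))
master.fit(X_multi, Y_multi)
print(master.predict(X_multi.head(1)))           # one prediction per disease for one patient
```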

4. Explainable AI (XAI) in Action

Opening the Black Box with SHAP & LIME

To build trust, we integrated XAI techniques to explain *why* our model makes a specific prediction. This transparency is crucial for clinical adoption.

SHAP (SHapley Additive exPlanations)

Provides global and local feature importance. It shows which factors (e.g., high glucose, age) pushed the prediction towards a positive or negative outcome for a specific patient.
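
A minimal sketch of the standard SHAP API for a tree-based model; `xgb_model` stands for any fitted tree ensemble from Section 3, and the plots are illustrative rather than the exact figures produced in the project.

```python
# Global and local SHAP explanations for a fitted tree-based classifier.
import shap

explainer = shap.TreeExplainer(xgb_model)        # xgb_model: any fitted tree ensemble
shap_values = explainer.shap_values(X_test)

# Global view: which features matter most across the whole test set.
shap.summary_plot(shap_values, X_test)

# Local view: how each feature pushed one patient's prediction up or down.
shap.force_plot(explainer.expected_value, shap_values[0], X_test.iloc[0], matplotlib=True)
```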

LIME (Local Interpretable Model-agnostic Explanations)

Explains individual predictions by fitting a simple, interpretable surrogate model in the local neighborhood of that prediction. It highlights the key features influencing a single decision.
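
A minimal LIME sketch for one patient record; the class names are placeholders and `model` stands for any fitted classifier with `predict_proba`, not a specific model from the study.

```python
# Local surrogate explanation for a single prediction with LIME.
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=list(X_train.columns),
    class_names=["no disease", "disease"],       # placeholder class labels
    mode="classification",
)
exp = explainer.explain_instance(
    X_test.values[0],            # one patient record
    model.predict_proba,         # model: any fitted classifier with predict_proba
    num_features=5,
)
print(exp.as_list())             # top features driving this single decision
```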