title: Sentinel - Cancer Risk Assessment Assistant
emoji: 🏥
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8501
pinned: false

LLM-based Cancer Risk Assessment Assistant

This project is an API service that provides preliminary cancer risk assessments based on user-provided data. It is built using FastAPI and LangChain, with a flexible architecture that supports both local and API-based LLMs.

Development Setup

  1. Create the virtual environment and install the dependencies:
uv sync

External API Configuration

For risk models that require external APIs, such as CanRisk (BOADICEA model), fill in the following section of the .env file:

# .env
CANRISK_USERNAME=your_canrisk_username
CANRISK_PASSWORD=your_canrisk_password

Then source it: source .env

To obtain CanRisk API access, register at https://www.canrisk.org/.
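Before running a model that depends on CanRisk, it can help to confirm that both credentials are actually visible to Python. The following is a minimal sketch, not part of the project: the variable names come from the .env example above, while the helper name `missing_canrisk_vars` is hypothetical.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("CANRISK_USERNAME", "CANRISK_PASSWORD")


def missing_canrisk_vars(env=None):
    """Return the required CanRisk variables that are unset or empty.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_canrisk_vars()
    if missing:
        print("Missing CanRisk settings:", ", ".join(missing))
    else:
        print("CanRisk credentials found in the environment.")
```

If the check still reports missing values after `source .env`, the variables may have been set in the shell without being exported to child processes. In bash, `set -a; source .env; set +a` is one way to export them.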

Using a Local LLM (Ollama)

  1. Install Ollama for your platform.
  2. Pull the default model from the command line:
ollama pull gemma3:4b
  3. Ensure the Ollama desktop app or server is running. You can check your installed models with ollama list.
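The same check can be scripted. The sketch below queries Ollama's HTTP endpoint `/api/tags`, which lists locally installed models, and assumes the server listens on its default address `http://localhost:11434`. The helper names are hypothetical and not part of this project.

```python
import json
import urllib.error
import urllib.request


def model_names(payload):
    """Extract model names from the JSON body returned by /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def list_ollama_models(base_url="http://localhost:11434", timeout=3.0):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return None


if __name__ == "__main__":
    names = list_ollama_models()
    if names is None:
        print("Ollama does not appear to be running.")
    elif "gemma3:4b" not in names:
        print("Ollama is running, but gemma3:4b is not installed.")
    else:
        print("Ollama is running and gemma3:4b is available.")
```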

Using API-based LLMs (Google)

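  1. Add your GOOGLE_API_KEY to the .env file in the project root. The >> operator creates the file if it does not exist and appends otherwise, so existing entries such as the CanRisk credentials are preserved:

    echo "GOOGLE_API_KEY=your_key_here" >> .env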
    

    Make sure the Generative AI API is enabled for your Google Cloud project.

  2. Run the command line demo with the Google provider (default):

    uv run python apps/cli/main.py
    

    Switch to the local model with:

    uv run python apps/cli/main.py model=gemma3_4b
    
  3. The model override also works with the Streamlit and FastAPI interfaces.

Interactive Demo

Run a simple command line demo with:

uv run python apps/cli/main.py

Enable developer mode and load user data from a file with:

uv run python apps/cli/main.py dev_mode=true user_file=examples/user_example.yaml

The script collects user data, prints the structured JSON assessment, and then allows follow-up questions in a chat-like loop. Type quit to exit.

The multi-page Streamlit interface, located at apps/streamlit_ui/main.py, is designed for collecting expert feedback. The first page, User Profile, lets you upload or manually create a profile before running assessments. The Configuration page lets you choose the model and knowledge base modules and shows a live preview of the full LLM prompt. The Assessment page runs the model, shows a dashboard of results, and lets you export the report or chat with the assistant.

Exporting Reports

After the initial assessment is displayed in the terminal, you will be prompted to export the full report to a formatted file. You can choose to generate a PDF, an Excel file, or both. The generated files (e.g., Cancer_Risk_Report_20250626_213000.pdf) will be saved in the root directory of the project.
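The example filename suggests a `YYYYMMDD_HHMMSS` timestamp suffix. As a rough sketch of how such a name could be produced (the function name `report_filename` is hypothetical, and the project's own export code may differ):

```python
from datetime import datetime


def report_filename(extension, now=None):
    """Build a timestamped name such as Cancer_Risk_Report_20250626_213000.pdf.

    `now` defaults to the current local time; it is a parameter so the
    result can be reproduced for a fixed timestamp.
    """
    now = datetime.now() if now is None else now
    return f"Cancer_Risk_Report_{now.strftime('%Y%m%d_%H%M%S')}.{extension}"


if __name__ == "__main__":
    print(report_filename("pdf"))
```

Because the timestamp has one-second resolution, two reports exported in different seconds do not overwrite each other.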

Note: This feature requires the openpyxl and reportlab libraries.
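To confirm that both libraries are importable in the current environment before exporting, a small check like the following can be used. It is a sketch built on the standard library's `importlib.util.find_spec`. The helper name `missing_packages` is hypothetical.

```python
from importlib.util import find_spec


def missing_packages(names=("openpyxl", "reportlab")):
    """Return the top-level packages from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Report export unavailable, missing:", ", ".join(missing))
    else:
        print("openpyxl and reportlab are installed.")
```

Run it with `uv run python` so that it inspects the project's virtual environment rather than the system interpreter.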

You can also provide a JSON or YAML file with all user information to skip the interactive prompts:

uv run python apps/cli/main.py user_file=examples/user_example.yaml

To launch the Streamlit interface, run the following command from the root of the project:

uv run streamlit run apps/streamlit_ui/main.py

Note: To expose the locally running app (port 8501) through a public URL, you can use ngrok:

ngrok http 8501

Important Note for Developers

When making changes to the project, check whether the following files should also be updated to reflect them:

  • README.md
  • AGENTS.md
  • GEMINI.md

Available Risk Models

The assistant currently includes the following built-in risk calculators:

  • Gail - Breast cancer risk
  • Claus - Breast cancer risk based on family history
  • Tyrer-Cuzick - Breast cancer risk (IBIS model)
  • BOADICEA - Breast and ovarian cancer risk (via CanRisk API)
  • PLCOm2012 - Lung cancer risk
  • LLPi - Liverpool Lung Project improved model for lung cancer risk (8.7-year prediction)
  • CRC-PRO - Colorectal cancer risk
  • PCPT - Prostate cancer risk
  • Extended PBCG - Prostate cancer risk (extended model)
  • Prostate Mortality - Prostate cancer-specific mortality prediction
  • MRAT - Melanoma risk (5-year prediction)
  • aMAP - Hepatocellular carcinoma (liver cancer) risk
  • QCancer - Multi-site cancer differential

Generating Documentation

The project includes a comprehensive PDF documentation generator that creates detailed documentation of all implemented risk models and their input requirements.

Generate Risk Model Documentation

To generate the PDF documentation:

uv run python scripts/generate_documentation.py

This will create a comprehensive PDF document (docs/risk_model_documentation.pdf) that includes:

  1. Overview Section:

    • Cancer type coverage chart
    • Statistics on implemented risk scores and cancer types covered
  2. Detailed Model Information:

    • Description, interpretation, and references for each risk model
    • Complete input requirements with field details, required status, units, and possible values/choices
  3. Input-to-Cancer Mapping:

    • Reverse mapping showing which cancer types use each input field
    • Possible values for each field
    • Comprehensive coverage analysis

The documentation is generated from the current codebase each time the script runs, so rerunning it after adding risk models or input fields keeps the PDF up to date.
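The input-to-cancer reverse mapping in item 3 can be illustrated with a short sketch. The data structure, model names, and field names below are placeholders invented for the example. They do not reflect the project's actual model definitions or the generator's internals.

```python
from collections import defaultdict

# Hypothetical specs: each model declares its cancer type and input fields.
SAMPLE_SPECS = {
    "model_a": {"cancer": "breast", "inputs": ["age", "family_history"]},
    "model_b": {"cancer": "lung", "inputs": ["age", "smoking_status"]},
    "model_c": {"cancer": "breast", "inputs": ["family_history"]},
}


def inputs_to_cancers(specs):
    """Invert model -> inputs into input field -> sorted list of cancer types."""
    mapping = defaultdict(set)
    for spec in specs.values():
        for field in spec["inputs"]:
            mapping[field].add(spec["cancer"])
    # Sort both the fields and the cancer types for stable output.
    return {field: sorted(cancers) for field, cancers in sorted(mapping.items())}


if __name__ == "__main__":
    for field, cancers in inputs_to_cancers(SAMPLE_SPECS).items():
        print(f"{field}: {', '.join(cancers)}")
```

Using a set per field means a cancer type is listed once even when several models for that cancer share the same input.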

Documentation Features

  • Comprehensive Coverage: Documents all risk models and their input requirements
  • Visual Charts: Includes cancer type coverage visualization
  • Detailed Tables: Shows field specifications, constraints, and valid values
  • Professional Layout: Clean, readable PDF format suitable for sharing
  • Auto-Generated: Stays synchronized with code changes automatically