---
license: ecl-2.0
language:
- en
extra_gated_fields:
  Name: text
  Organization: text
---
# Model Card for Pillar0-ChestCT

## Model Details

### Model Description
Pillar-0 (Chest CT) is a general-purpose radiology foundation model designed for high-resolution, volumetric chest CT understanding. Unlike standard models that process CTs as stacks of 2D slices, Pillar-0 uses a 3D Atlas backbone to capture the volumetric context essential for chest pathologies.

The model is pretrained on a large-scale dataset of chest CTs using asymmetric contrastive learning with Qwen3-Embedding-8B as the text encoder. It delivers strong performance on standard diagnostic tasks and has been adapted for long-horizon tasks such as lung cancer risk prediction (Sybil-1.5), where it establishes a new state of the art.
- Model type: Vision-Language Foundation Model.
- Architecture: Atlas Vision Encoder (Backbone) aligned with Qwen3-Embedding-8B (Text Encoder).
- Language(s) (NLP): English (Radiology Reports).
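
For intuition, here is a minimal PyTorch sketch of an asymmetric contrastive (CLIP-style) alignment objective like the one described above. It is illustrative only: the function name, embedding dimensions, and the assumption that the text tower is kept frozen are not taken from the Pillar-0 code; the actual training recipe lives in the pillar-pretrain repository.

```python
import torch
import torch.nn.functional as F

def asymmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """InfoNCE-style alignment of CT volume embeddings to report embeddings.

    image_emb: (B, D) outputs of the trainable vision encoder (e.g. the Atlas backbone).
    text_emb:  (B, D) report embeddings from the text encoder (e.g. Qwen3-Embedding-8B),
               treated here as fixed targets, hence "asymmetric": only the image-to-text
               direction contributes to the loss. This is an illustrative assumption,
               not the exact Pillar-0 objective.
    """
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1).detach()  # no gradient into the text tower
    logits = image_emb @ text_emb.t() / temperature    # (B, B) similarity matrix
    targets = torch.arange(image_emb.size(0), device=image_emb.device)
    return F.cross_entropy(logits, targets)            # match each CT to its own report

# Toy usage with random embeddings
loss = asymmetric_contrastive_loss(torch.randn(4, 512), torch.randn(4, 512))
```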
### Model Sources
- Repository: https://huggingface.co/collections/YalaLab/pillar-0
- Code: https://github.com/YalaLab/pillar-pretrain
- Paper: Pillar-0: A New Frontier for Radiology Foundation Models
## Evaluation

### Testing Data & Metrics
- Test Set: 10,646 exams (UCSF).
- Protocol: Linear probing via RATE-Evals on 92 clinically grounded findings (e.g., lung masses, aortic dissection, atelectasis); a minimal probing sketch is shown below.
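
As a rough illustration of the linear-probing protocol, the sketch below fits one logistic-regression probe per finding on frozen exam embeddings and reports mean AUROC. The data here are random placeholders, and the array names and probe settings are assumptions; RATE-Evals defines the exact protocol.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical precomputed data: one frozen Pillar-0 embedding per exam,
# plus binary labels for a few example findings (the real benchmark uses 92).
rng = np.random.default_rng(0)
train_emb, test_emb = rng.normal(size=(1000, 512)), rng.normal(size=(200, 512))
train_labels = rng.integers(0, 2, size=(1000, 3))
test_labels = rng.integers(0, 2, size=(200, 3))

aurocs = []
for f in range(train_labels.shape[1]):
    probe = LogisticRegression(max_iter=1000)   # one linear probe per finding
    probe.fit(train_emb, train_labels[:, f])
    scores = probe.predict_proba(test_emb)[:, 1]
    aurocs.append(roc_auc_score(test_labels[:, f], scores))

print(f"mean AUROC over findings: {np.mean(aurocs):.3f}")
```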
### Results

Pillar-0 outperforms 2D baselines and other 3D models on chest CT tasks.
| Model | Mean AUROC | Win Rate vs Pillar-0 |
|---|---|---|
| Pillar-0 (Ours) | 88.0 | - |
| MedGemma | 80.2 | 5.4% |
| MedImageInsight | 79.3 | 1.1% |
| Merlin | 77.2 | 2.2% |
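
For readers unfamiliar with the win-rate column, the following sketch shows one plausible reading of it: the fraction of findings on which a baseline's per-finding AUROC exceeds Pillar-0's. The AUROC values and the tie-handling below are illustrative assumptions, not numbers from the paper.

```python
import numpy as np

# Hypothetical per-finding AUROCs for Pillar-0 and one baseline over the same findings.
pillar0_auroc = np.array([0.91, 0.87, 0.85, 0.90])
baseline_auroc = np.array([0.82, 0.88, 0.79, 0.84])

# Win rate vs Pillar-0: share of findings where the baseline scores higher.
win_rate = float(np.mean(baseline_auroc > pillar0_auroc))
print(f"baseline win rate vs Pillar-0: {win_rate:.1%}")
```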
## Citation
```bibtex
@article{pillar0,
  title  = {Pillar-0: A New Frontier for Radiology Foundation Models},
  author = {Agrawal, Kumar Krishna and Liu, Longchao and Lian, Long and Nercessian, Michael and Harguindeguy, Natalia and Wu, Yufu and Mikhael, Peter and Lin, Gigin and Sequist, Lecia V. and Fintelmann, Florian and Darrell, Trevor and Bai, Yutong and Chung, Maggie and Yala, Adam},
  year   = {2025}
}
```