About the Platform

Mission · Dataset · Evaluation · Organization

MedATLAS-Bench

A Multimodal Medical Benchmark Across Clinical Data Modalities

MedATLAS-Bench evaluates multimodal large language models on diverse clinical inputs, including structured text, 2D images, 3D volumetric data, and video, so that models are assessed under conditions close to real-world medical practice. The benchmark spans multiple clinical tasks, such as classification, generation, and localization, across varied datasets and conditions.

A unified platform to evaluate, compare, and understand the performance of AI models in real-world medical diagnosis scenarios.

🎯

Purpose

The goal of this platform is to provide a standardized benchmark for evaluating multimodal AI models in healthcare. It enables fair comparison across models and promotes transparency in performance.

📊

Dataset

The dataset is divided into three difficulty levels (Easy, Medium, Hard) to reflect the range of real clinical complexity. It includes multimodal inputs such as text, imaging, and video.

⚙️

Evaluation

Each question type is scored with a metric suited to that type, so models are compared accurately and fairly on like-for-like tasks.
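As a rough illustration of scoring by question type, the sketch below routes each item to a metric chosen by its type and averages the results per type and difficulty level. Everything in it is an assumption made for illustration: the metric choices (exact match for classification, token-overlap F1 for generation, box IoU for localization), the item field names, and the `score` helper are hypothetical and are not the benchmark's actual implementation.

```python
from collections import Counter, defaultdict


def exact_match(pred, ref):
    """Classification (assumed metric): 1.0 if labels match, ignoring case."""
    return float(str(pred).strip().lower() == str(ref).strip().lower())


def token_f1(pred, ref):
    """Generation (assumed metric): token-overlap F1 between two strings."""
    p, r = pred.lower().split(), ref.lower().split()
    if not p or not r:
        return float(p == r)
    overlap = sum((Counter(p) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def box_iou(pred, ref):
    """Localization (assumed metric): IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(pred[0], ref[0]), max(pred[1], ref[1])
    ix2, iy2 = min(pred[2], ref[2]), min(pred[3], ref[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    def area(b):
        return max(0, b[2] - b[0]) * max(0, b[3] - b[1])

    union = area(pred) + area(ref) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical mapping from question type to metric function.
METRICS = {
    "classification": exact_match,
    "generation": token_f1,
    "localization": box_iou,
}


def score(items):
    """Return the mean score for each (question type, difficulty) pair."""
    buckets = defaultdict(list)
    for item in items:
        metric = METRICS[item["type"]]
        value = metric(item["prediction"], item["reference"])
        buckets[(item["type"], item["difficulty"])].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}
```

Reporting one average per question type and difficulty level, rather than a single pooled number, keeps scores from different metrics from being mixed together.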

🏥

Organization

This platform is developed at UTHealth Houston with several partners to advance research in AI-powered medical diagnostics and support clinical decision-making.

Why This Matters

As AI becomes more integrated into healthcare, it is critical to evaluate models in realistic scenarios. This platform helps bridge the gap between research and clinical application by providing measurable, comparable results.

Acknowledgments

Funding

This work was supported by the National Institutes of Health (NIH; grant 1R01NS138765-01), Google LLC, and the Ovarian Cancer Research Alliance (OCRA; grant CRDGAI-2023–3-1002).

Collaborators

We are grateful to the following collaborators for their support in data acquisition and clinical expertise: Santiago Aristizabal Ortiz, MD, Mario E. Mahecha, MD, Laura A. Ocasio, Roy F. Riascos-Castaneda, MD, Elaine Stur, PhD, Anil Sood, MD, and Sunil Sheth, MD.

Data Access

For questions or requests regarding access to private datasets, please contact Shayan Shams, PhD at [email protected].