Medical AI research faces a growing credibility crisis as foundation models and large language models proliferate across healthcare applications without standardized reporting practices. This lack of transparency hampers reproducibility and clinical translation of promising artificial intelligence tools.

A 57-member international consortium has developed REFINE, a 44-item reporting framework designed specifically for studies of foundation models and large language models in medical research. The guideline emerged from a modified Delphi process in which experts from 17 countries reached consensus over two structured rounds. The framework covers both unimodal and multimodal AI applications, spanning text analysis, medical imaging, and structured healthcare data, and gives detailed instructions for transparent reporting of methodology.

This represents a significant advance in AI research governance, closing a gap that has allowed inconsistent and incomplete reporting to undermine the field's scientific rigor. Unlike existing guidelines aimed at traditional machine learning, REFINE addresses the distinct challenges of foundation models: large neural networks trained on diverse datasets that can be adapted to many medical tasks. An accompanying online implementation platform improves the framework's prospects for practical adoption across research institutions.

Guidelines alone, however, cannot ensure compliance, and it remains to be seen whether the medical AI community will embrace more rigorous reporting standards. The initiative's success will ultimately depend on journal adoption, institutional enforcement, and researcher buy-in across the rapidly evolving field of medical artificial intelligence.