Medical artificial intelligence research faces a reproducibility crisis that undermines clinical translation and patient safety. When studies using foundation models and large language models lack standardized reporting, clinicians cannot properly evaluate their reliability or implement them safely in practice. The new REFINE framework addresses this gap by establishing the first international consensus guidelines specifically for AI model transparency in healthcare research.

Through a modified Delphi process involving 57 contributors across 17 countries, researchers developed a 44-item checklist spanning six domains. The framework covers reporting elements for both unimodal and multimodal applications, encompassing text analysis, medical imaging, and structured clinical data integration. Each item includes detailed instructions for documenting model architecture, training data characteristics, validation procedures, and performance metrics. The consensus-driven approach is intended to ensure that the guidelines reflect global expertise while addressing the unique challenges of foundation models in medical contexts.

This standardization effort represents a significant advance for evidence-based medicine in the AI era. Currently, many medical AI studies omit crucial details about data preprocessing, model fine-tuning, or bias mitigation strategies, making independent validation nearly impossible. The REFINE checklist should improve the quality of reporting by requiring disclosure of these technical specifications, but successful implementation will depend on journal adoption and researcher compliance. While the framework provides necessary structure for transparent reporting, it cannot correct underlying methodological flaws in study design, nor can it resolve the limits that proprietary models place on full reproducibility.
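
To make the idea of checklist-based reporting concrete, the following is a minimal sketch of how a reviewer or journal might flag undocumented elements in a submission. It is an illustration only: the element names are hypothetical labels drawn from the reporting categories mentioned above, not the actual 44 REFINE items or its six domains, and the `StudyReport` class and helper functions are assumptions introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical reporting elements, named after the categories mentioned in the
# text. These are NOT the actual REFINE checklist items.
REQUIRED_ELEMENTS = (
    "model_architecture",
    "training_data_characteristics",
    "validation_procedures",
    "performance_metrics",
    "data_preprocessing",
    "model_fine_tuning",
    "bias_mitigation",
)


@dataclass
class StudyReport:
    """What a study discloses, keyed by reporting element (hypothetical schema)."""

    title: str
    reported: dict[str, str] = field(default_factory=dict)


def missing_elements(report: StudyReport) -> list[str]:
    """Return required elements that are absent or left blank, in checklist order."""
    return [
        name
        for name in REQUIRED_ELEMENTS
        if not report.reported.get(name, "").strip()
    ]


def completeness(report: StudyReport) -> float:
    """Fraction of required elements that the report documents."""
    documented = len(REQUIRED_ELEMENTS) - len(missing_elements(report))
    return documented / len(REQUIRED_ELEMENTS)


if __name__ == "__main__":
    # Made-up example study used only to exercise the functions above.
    example = StudyReport(
        title="Example imaging study",
        reported={
            "model_architecture": "Vision transformer, fine-tuned",
            "training_data_characteristics": "Single-site chest radiographs",
            "validation_procedures": "Held-out test split",
            "performance_metrics": "AUROC with confidence intervals",
            "bias_mitigation": "   ",  # blank entries count as missing
        },
    )
    print("Missing:", missing_elements(example))
    print(f"Completeness: {completeness(example):.0%}")
```

A check like this only confirms that something was written for each element. It says nothing about whether the disclosed methods are sound, which mirrors the limitation noted above: transparent reporting does not correct flaws in study design.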