Among 5,237 NHANES 2017–2018 participants, 20.5% (n=1,072) met criteria for probable undiagnosed hypertension (mean BP ≥130/80 mmHg, no prior diagnosis). Three machine learning classifiers—Logistic Regression (LR), Random Forest (RF), and XGBoost—were trained on eight non-invasive predictors. LR achieved the best AUC of 0.611 (95% CI: 0.571–0.652) with sensitivity of 0.535 and specificity of 0.594. RF essentially failed on sensitivity (0.047), and XGBoost was intermediate. Diabetes status, sex, and age were the top predictors by permutation importance.
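
The outcome definition above can be written as a small rule, shown here as a minimal sketch rather than the authors' code. It assumes that "≥130/80 mmHg" means systolic ≥130 or diastolic ≥80, the reading used by the 2017 ACC/AHA guideline. The summary does not say whether the preprint used "or" or "and". The `Participant` class and its field names are hypothetical.

```python
from dataclasses import dataclass

SBP_THRESHOLD = 130  # mmHg, systolic cut-off quoted in the summary
DBP_THRESHOLD = 80   # mmHg, diastolic cut-off quoted in the summary


@dataclass
class Participant:
    """Hypothetical record; field names are illustrative, not NHANES variable names."""
    mean_sbp: float        # mean systolic BP across readings, mmHg
    mean_dbp: float        # mean diastolic BP across readings, mmHg
    prior_diagnosis: bool  # True if previously told they have hypertension


def is_probable_undiagnosed(p: Participant) -> bool:
    """Elevated mean BP with no prior diagnosis.

    ASSUMPTION: either threshold being met counts as elevated (logical OR).
    """
    elevated = p.mean_sbp >= SBP_THRESHOLD or p.mean_dbp >= DBP_THRESHOLD
    return elevated and not p.prior_diagnosis


def reported_prevalence(cases: int = 1072, total: int = 5237) -> float:
    """Share of the sample meeting the definition, from the reported counts."""
    return cases / total


if __name__ == "__main__":
    print(is_probable_undiagnosed(Participant(134.0, 76.0, False)))
    print(f"{reported_prevalence():.1%}")
```

The second function only re-derives the reported 20.5% from the counts given (1,072 of 5,237), which is a quick consistency check on the summary's figures.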

With roughly 1.28 billion adults globally living with hypertension and nearly half unaware of their condition, low-cost screening tools have genuine public health appeal, particularly in low-resource settings where laboratory access is limited. However, this preprint, which has not yet been peer-reviewed, reveals a fundamental tension: an AUC of 0.61 sits only modestly above the chance level of 0.5, and a sensitivity of 0.535 means that nearly half (46.5%) of undiagnosed cases would be missed, while a specificity of 0.594 means that roughly 41% of people without the condition would be flagged. For a screening tool, that gap is clinically consequential. The near-total sensitivity collapse in Random Forest (0.047) further suggests that the model defaults to the majority class. With roughly 80% of participants negative, a classifier can look accurate while flagging almost no one, which points to unaddressed class imbalance rather than overfitting, and cross-validation alone does not correct it. The cross-sectional NHANES design also precludes causal interpretation of the predictors, and the single survey cycle limits generalizability. The finding is best characterized as incremental and feasibility-oriented rather than practice-changing. External validation across ethnically diverse, internationally representative cohorts is essential before any clinical deployment. Nonetheless, the framework (non-invasive, lab-free, interpretable) remains a worthwhile research direction if model performance can be meaningfully improved.
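
To make the screening implications concrete, the reported LR operating point can be converted into predictive values with Bayes' rule. This is a rough sketch under one stated assumption: that the population being screened has the same prevalence as the full sample (1,072 of 5,237). The summary does not report the test-set prevalence, so the resulting figures are illustrative only.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a binary test at a given prevalence.

    Works on population fractions, so no sample size is needed.
    """
    tp = sensitivity * prevalence              # true positives
    fn = (1 - sensitivity) * prevalence        # missed cases
    tn = specificity * (1 - prevalence)        # correctly cleared
    fp = (1 - specificity) * (1 - prevalence)  # false alarms
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# ASSUMPTION: prevalence in the screened population equals the overall
# sample prevalence reported in the summary.
PREVALENCE = 1072 / 5237

# Logistic Regression operating point as reported in the summary.
ppv, npv = predictive_values(sensitivity=0.535, specificity=0.594,
                             prevalence=PREVALENCE)

if __name__ == "__main__":
    print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under that assumption the PPV comes out near 0.25 and the NPV near 0.83. In other words, about three in four positive screens would be false alarms, and a negative screen lowers the probability of undiagnosed hypertension only from about 20% to about 17%.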