BACKGROUND & AIMS:
In medical research, receiver operating characteristic (ROC) curves are used to evaluate how well biomarkers diagnose a disease or predict the risk of developing it in the future. The area under the ROC curve (ROC AUC) is a widely used summary measure, especially when multiple ROC curves are compared. In observational studies, estimation of the AUC is often complicated by missing biomarker values, which can bias existing AUC estimators. In this article, we develop robust statistical methods for estimating the ROC AUC. The proposed methods use information from auxiliary variables that are potentially predictive of either the missingness of the biomarkers or the missing biomarker values themselves; we are particularly interested in auxiliary variables of the latter kind. Under missing at random (MAR), that is, when missingness of biomarker values depends only on the observed data, our estimators have the attractive feature of being consistent if, conditional on auxiliary variables and disease status, either the model for the probability of being missing or the model for the biomarker values is correctly specified. Under missing not at random (MNAR), that is, when missingness may depend on the unobserved biomarker values, we propose a sensitivity analysis to assess the impact of MNAR on the estimation of the ROC AUC. We study the asymptotic properties of the proposed estimators and evaluate their finite-sample behavior in simulation studies. The methods are further illustrated using data from a study of maternal depression during pregnancy.
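To make the MAR setting concrete, the following is a minimal Python sketch of an inverse probability weighted (IPW) AUC estimate. It is not the estimator proposed in the article: it relies only on a model for the probability that the biomarker is observed, whereas the proposed estimators are also consistent when the biomarker model is the correctly specified one. The function names `weighted_auc` and `ipw_auc` and the argument `prob_observed` are hypothetical, and the observation probabilities are assumed to come from a separately fitted, correctly specified missingness model given auxiliary variables and disease status.

```python
import numpy as np


def weighted_auc(marker, disease, weights=None):
    """Weighted Mann-Whitney estimate of P(case > control) + 0.5 * P(tie)."""
    marker = np.asarray(marker, dtype=float)
    disease = np.asarray(disease, dtype=bool)
    if weights is None:
        weights = np.ones_like(marker)
    weights = np.asarray(weights, dtype=float)

    x_case, w_case = marker[disease], weights[disease]
    x_ctrl, w_ctrl = marker[~disease], weights[~disease]

    # Kernel over all case/control pairs: 1 if the case value is larger,
    # 0.5 if the two values are tied, 0 otherwise.
    diff = x_case[:, None] - x_ctrl[None, :]
    kernel = (diff > 0) + 0.5 * (diff == 0)

    # Each pair is weighted by the product of the two subjects' weights.
    pair_w = w_case[:, None] * w_ctrl[None, :]
    return float((pair_w * kernel).sum() / pair_w.sum())


def ipw_auc(marker, disease, observed, prob_observed):
    """IPW AUC using only subjects whose biomarker was observed.

    prob_observed (hypothetical input): each subject's probability of having
    the biomarker observed, assumed to come from a correct missingness model.
    """
    observed = np.asarray(observed, dtype=bool)
    # Observed subjects are up-weighted by 1 / P(observed) so that they also
    # stand in for similar subjects whose biomarker is missing.
    w = 1.0 / np.asarray(prob_observed, dtype=float)[observed]
    return weighted_auc(
        np.asarray(marker, dtype=float)[observed],
        np.asarray(disease)[observed],
        w,
    )
```

Calling `weighted_auc(marker, disease)` without weights gives the usual complete-case empirical AUC, which is the estimator that can be biased when missingness depends on observed covariates. Comparing it with `ipw_auc` on the same data gives a rough sense of how much the missingness mechanism matters.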