Princeton University

School of Engineering & Applied Science

Large-scale Multi-output Gaussian Processes for Clinical Decision Support

Li-Fang Cheng
Profs. Engelhardt and Li
402 Computer Science
Tuesday, December 18, 2018 - 3:00pm to 4:30pm


Growing collections of electronic health records (EHRs) are becoming more accessible, enabling retrospective research into how individual physiological states are influenced by diseases, medications, and environments. EHRs contain rich patient information (disease history, demographics, vital signs, and lab results) that clinicians use to diagnose and treat patients. A key motivation for learning from EHR data is to predict a patient's future health condition more accurately. In particular, in real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from vital signs and lab tests is essential to enable successful medical interventions and improve patient outcomes. Developing a computational framework that can learn from large-scale observational EHRs and make accurate real-time predictions is a critical step toward this goal. However, existing EHRs pose several challenges for conventional methods. For instance, many of the covariates are sparsely sampled in time across patients. In addition, there is substantial uncertainty in patient state and disease progression at any given time. These properties make it challenging both to infer the physiological status of a patient and to jointly analyze time series across patients.

In this talk, I will first present MedGP, a statistical framework that provides accurate real-time predictions of physiological states. MedGP improves prediction accuracy by capturing rich temporal structure among clinical traits from noisy and irregularly sampled time series data. I explore several approaches to efficient inference that enable learning from large-scale data sets with hundreds of thousands of patients. Both prediction error and computational complexity are greatly reduced compared with state-of-the-art approaches. MedGP paves the way for various downstream analyses of EHRs and empowers more informative clinical decision-making. Finally, I will discuss two extended frameworks: one focuses on clinical action recommendation, and the other aims to learn the heterogeneous effects of multiple clinical treatments. Both methods show encouraging results in improving clinical practice and advancing toward personalized medicine.