Data Driven Insights for Health Informatics
The move to electronic health records (EHRs) and the rapidly expanding availability of health-related information, from billing data to body sensors, are providing unprecedented opportunities for innovative data-driven solutions to problems in personalized medicine and population health. However, many formidable challenges have so far limited the utility of EHR data for clinical research, including diverse populations, heterogeneous and noisy information, longitudinal data, interpretability, domain constraints, and privacy concerns.
Our work takes a significant step towards fulfilling the promise of exploiting large-scale EHR data for effective population health care and management. We are working on a variety of approaches for the analysis of such data, including 1) high-throughput phenotyping via sparse non-negative tensor factorization of health data tensors, 2) new models that handle very rare classes (e.g., rare conditions or diseases), 3) studies of the utility vs. privacy trade-off in healthcare data analytics, and 4) predictive modeling with multidimensional time series of human physiological measures, e.g., determining whether a patient is likely to go into cardiac arrest in the next few hours.
This research is funded by the Schlumberger Chair and the USAA.
Paper 8: Marble: High-throughput Phenotyping from Electronic Health Records via Sparse Nonnegative Tensor Factorization (to appear)
Paper 9: LUDIA: An Aggregate-Constrained Low-Rank Reconstruction Algorithm to Leverage Publicly Released Health Data (to appear)
Paper 10: Extracting Phenotypes from Patient Claim Records using Non-negative Tensor Factorization (to appear)
Paper 11: A Hierarchical Ensemble of alpha-Trees for Predicting Expensive Hospital Visits (to appear)