Machine learning approaches for risk prediction in vaccine research using routinely collected electronic health records
Background
Accurate identification of individuals at highest risk of disease is key to determining population groups who should be prioritised for vaccination. Routinely collected electronic health record (EHR) data are a critical resource for identifying population groups at high risk of disease, as exemplified by the COVID-19 pandemic. Current studies tend to use curated ‘codelists’ to define population subgroups with clinical conditions. While this approach facilitates statistical analysis and clinical interpretation, it suffers from key drawbacks including: (i) the grouping of conditions that may vary by quality or severity; and (ii) the potential omission of informative diagnostic codes that are absent from curated codelists. A data-driven machine learning approach that treats each diagnostic or medication code as an individual piece of information may offer a powerful complement to existing methods.
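As a purely illustrative sketch of the contrast described above (not the project's actual method), the snippet below builds two feature representations from the same toy records: one binary indicator per curated codelist group, and one binary indicator per individual code. All patient IDs, codes, and group names are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy data: patient ID -> set of recorded codes (codes are invented).
records: Dict[str, Set[str]] = {
    "p1": {"A01", "A02"},
    "p2": {"A02", "Z99"},
    "p3": {"B10"},
}

# Hypothetical curated codelists: condition group -> codes that define it.
codelists: Dict[str, Set[str]] = {
    "condition_a": {"A01", "A02"},
    "condition_b": {"B10"},
}


def codelist_features(codes: Set[str]) -> Dict[str, int]:
    """One binary indicator per curated condition group."""
    return {group: int(bool(codes & members)) for group, members in codelists.items()}


def per_code_features(codes: Set[str], vocabulary: List[str]) -> Dict[str, int]:
    """One binary indicator per individual code observed in the data."""
    return {code: int(code in codes) for code in vocabulary}


# Every distinct code seen in any record, in a stable order.
vocabulary = sorted(set().union(*records.values()))

grouped = {pid: codelist_features(codes) for pid, codes in records.items()}
individual = {pid: per_code_features(codes, vocabulary) for pid, codes in records.items()}
```

In this toy example, patients `p1` and `p2` receive identical codelist features even though their recorded codes differ, and the code `Z99`, which appears in no codelist, is only visible in the per-code representation.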
Project objectives
- Compare risk prediction based on high-dimensional machine learning versus regression-based approaches in a case study
- Evaluate the performance of machine learning for risk prediction across different diseases and population subgroups
Skills/methods
- Machine learning using routine electronic health record data
- Regression-based risk prediction using clinical codelists
- Type: PhD position
- Institution: London School of Hygiene & Tropical Medicine
- City: London
- Country: UK
- Closing date: March 30th, 2025
- Posted on: March 24th, 2025 09:54
- Last updated: March 26th, 2025 17:57