AIPD-DC1-LIH

Multimodal Risk Models for Early Diagnosis of Parkinson's Disease (PD)

  • Host Institution: Luxembourg Institute of Health (LIH)
  • PhD Enrolment: University of Luxembourg
  • Start Date: October 2025
  • Duration: 36 months
  • Official PhD Supervisor: Rejko Krüger

Research Objectives

This thesis project builds on prior work of FH in the Alzheimer’s field. Its overall objective is to develop and evaluate a multimodal AI/ML model that predicts an individual's risk of developing PD, and to assess the putative causal effect of lifestyle on that risk.

Two key questions are:

  1. which data modalities are best suited to build such a model (e.g. genetic predisposition vs. factors associated with patient history), and
  2. how these modalities contribute relative to each other.
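One possible way to make the second question concrete is to treat each data modality as a "player" and compute exact Shapley values over the performance of models trained on every subset of modalities. The sketch below is only an illustration under stated assumptions: the modality names and all AUC values are hypothetical placeholders, not project results, and this modality-level attribution is a different (coarser) use of Shapley values than per-feature SHAP or causal Shapley values.

```python
from itertools import combinations
from math import factorial


def shapley_contributions(players, value):
    """Exact Shapley value of each player for a set function `value`."""
    n = len(players)
    contributions = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        contributions[p] = total
    return contributions


# HYPOTHETICAL validation AUCs for models trained on each modality subset.
# In practice each entry would come from training and evaluating a model.
auc = {
    frozenset(): 0.50,  # no information: chance level
    frozenset({"genetics"}): 0.60,
    frozenset({"history"}): 0.66,
    frozenset({"lifestyle"}): 0.58,
    frozenset({"genetics", "history"}): 0.72,
    frozenset({"genetics", "lifestyle"}): 0.65,
    frozenset({"history", "lifestyle"}): 0.69,
    frozenset({"genetics", "history", "lifestyle"}): 0.75,
}

modalities = ["genetics", "history", "lifestyle"]
phi = shapley_contributions(modalities, auc.__getitem__)

for name in modalities:
    print(f"{name}: {phi[name]:+.3f} AUC")
```

By construction the contributions add up to the gain of the full multimodal model over chance level, so each value can be read as that modality's share of the overall improvement.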

Model explainability and causality are therefore key factors in this project. The specific aim is to develop multimodal predictive machine learning models that combine data from UK Biobank, LuxPARK and ICEBERG, e.g. genomic data, prior diagnoses and medication, as well as lifestyle-associated factors. The project will quantify by how much the prediction performance of multimodal models exceeds that of models trained on single modalities, and will determine the relative contribution of individual data modalities and features within such a multimodal model. For that purpose, we will use Explainable AI (XAI) techniques such as (causal) SHAP, possibly in combination with Bayesian Network techniques, to disentangle the relationships among the most important features.

Furthermore, this project will estimate the causal effect of individual, potentially modifiable lifestyle-associated factors (e.g. BMI, physical exercise) on disease risk, relative to genetic predisposition. For this purpose, recent causal machine learning techniques such as R-, S-, X- and T-learners will be employed and carefully evaluated via refutation tests.

While this project focuses on risk assessment, and in particular on modifiable risk factors, DCs 4, 10 and 11 will focus on different aspects of early disease symptom detection.
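To illustrate the idea behind a T-learner and a simple refutation test, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the single confounder (standing in for a standardized genetic risk score), the binary "lifestyle" treatment, the continuous risk outcome, the true effect of -0.5 and the one-variable least-squares models. It is not the project's method, and real analyses would use richer learners and covariates.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def t_learner(xs, ts, ys):
    """Fit one outcome model per treatment arm; return a CATE function."""
    treated = [(x, y) for x, t, y in zip(xs, ts, ys) if t == 1]
    control = [(x, y) for x, t, y in zip(xs, ts, ys) if t == 0]
    a1, b1 = fit_line(*zip(*treated))
    a0, b0 = fit_line(*zip(*control))
    # Conditional effect estimate: difference of the two arm predictions.
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)


def mean(values):
    return sum(values) / len(values)


random.seed(0)
TRUE_EFFECT = -0.5  # assumed effect of the lifestyle factor on the risk score

xs, ts, ys = [], [], []
for _ in range(5000):
    x = random.gauss(0, 1)  # confounder, e.g. standardized genetic risk
    # Confounding: low-risk individuals are more often "treated".
    p_treated = 0.7 if x < 0 else 0.3
    t = 1 if random.random() < p_treated else 0
    y = 1.0 * x + TRUE_EFFECT * t + random.gauss(0, 0.5)
    xs.append(x)
    ts.append(t)
    ys.append(y)

# Naive comparison of group means ignores the confounder and is biased.
naive = mean([y for t, y in zip(ts, ys) if t == 1]) - mean(
    [y for t, y in zip(ts, ys) if t == 0]
)

# T-learner estimate of the average treatment effect.
cate = t_learner(xs, ts, ys)
ate_estimate = mean([cate(x) for x in xs])

# Placebo refutation: with randomly permuted treatment labels the
# estimated effect should shrink towards zero.
placebo_ts = ts[:]
random.shuffle(placebo_ts)
placebo_cate = t_learner(xs, placebo_ts, ys)
placebo_estimate = mean([placebo_cate(x) for x in xs])

print(f"naive difference : {naive:+.2f}")
print(f"T-learner ATE    : {ate_estimate:+.2f}")
print(f"placebo estimate : {placebo_estimate:+.2f}")
```

In this toy setting the T-learner recovers the assumed effect because the confounder is observed and each arm's outcome model is correctly specified; with unobserved confounding or misspecified models the estimate can be biased, which is why refutation tests matter.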
 

Expected Results

  • Innovative and explainable AI/ML models predicting PD risk on an individual basis.
  • New insights about relevant data modalities and the putative causal effect of lifestyle on disease risk.
  • Pioneering the use of causal AI/ML techniques in the PD field.
     

Planned Secondment(s)

  • Host 1: Petanux
    • Duration: 18 months
    • Purpose: Learning about AI/ML
  • Host 2: Fraunhofer SCAI
    • Duration: 9 months
    • Purpose: Learning about causal AI/ML techniques

This project is part of the "Precision Neurology" work package.

References

  • Khanna, S., Domingo-Fernández, D., Iyappan, A. et al. Using Multi-Scale Genetic, Neuroimaging and Clinical Data for Predicting Alzheimer’s Disease and Reconstruction of Relevant Biological Mechanisms. Sci Rep 8, 11173 (2018). https://doi.org/10.1038/s41598-018-29433-3
  • Birkenbihl, C., Emon, M.A., Vrooman, H. et al. Differences in cohort study data affect external validation of artificial intelligence models for predictive diagnostics of dementia - lessons for translation into clinical practice. EPMA Journal 11, 367–376 (2020). https://doi.org/10.1007/s13167-020-00216-z
  • Heskes, T., Bucur, I.G., Sijben, E., Claassen, T. Causal Shapley values: exploiting causal knowledge to explain individual predictions of complex models. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NIPS '20), Curran Associates Inc., Red Hook, NY, USA, Article 401, 4778–4789 (2020). https://dl.acm.org/doi/abs/10.5555/3495724.3496125
  • Lentzen, M. et al. A Transformer-Based Model Trained on Large Scale Claims Data for Prediction of Severe COVID-19 Disease Progression. IEEE Journal of Biomedical and Health Informatics 27(9), 4548–4558 (2023). https://doi.org/10.1109/JBHI.2023.3288768