Voice-Based Biomarkers and Motor Severity in Parkinson’s Disease
This portfolio project explored whether motor symptom severity in Parkinson’s disease can be estimated from voice-based acoustic biomarkers and basic demographic variables.
The analysis was based on the Oxford Parkinson’s Telemonitoring Dataset and focused on predicting motor_UPDRS, a clinical score describing motor symptom severity.
Rather than building a diagnostic classifier, the goal was to frame the task as a regression problem and examine how much clinically meaningful information voice features can provide.
Project Focus
The project investigated the relationship between voice measurements, patient characteristics, and motor symptom severity in Parkinson’s disease.
The main target variable was motor_UPDRS. Higher values indicate more severe motor symptoms, while lower values suggest milder motor involvement.
Dataset at a Glance
The dataset contained 5,875 records, each derived from a voice recording of a patient with early-stage Parkinson’s disease.
A key methodological point is that these records do not represent 5,875 independent patients: the dataset covers 42 patients, each of whom contributed many recordings over time, so records from the same patient are correlated.
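One practical consequence of repeated measurements is that a train/test split should keep all records from a given patient on the same side; otherwise a model can be evaluated on patients it has effectively already seen. The following is a minimal sketch of such a patient-level split using only the standard library. The `subject` key, the function name `split_by_patient`, and the toy records are illustrative assumptions, not the project's actual code or data.

```python
import random

def split_by_patient(records, id_key="subject", test_fraction=0.25, seed=0):
    """Split records so each patient's rows land entirely in train or in test."""
    patient_ids = sorted({r[id_key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    # Hold out a fraction of patients (not a fraction of records).
    n_test = max(1, round(len(patient_ids) * test_fraction))
    held_out = set(patient_ids[:n_test])
    train = [r for r in records if r[id_key] not in held_out]
    test = [r for r in records if r[id_key] in held_out]
    return train, test

# Toy data: 4 hypothetical patients with 3 recordings each (values are made up).
records = [
    {"subject": pid, "motor_UPDRS": 10.0 + pid + visit}
    for pid in range(1, 5)
    for visit in range(3)
]

train, test = split_by_patient(records)
train_ids = {r["subject"] for r in train}
test_ids = {r["subject"] for r in test}
print("patients in train:", sorted(train_ids))
print("patients in test:", sorted(test_ids))
```

The same idea applies to cross-validation: folds would be formed over patients rather than over individual records.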
Target Variable: motor_UPDRS
The distribution of motor_UPDRS showed that most observations were concentrated in the mild-to-moderate and moderate severity ranges.
The mean and median values were close to each other, suggesting that the distribution of the target variable was not strongly skewed.
Variability in Motor Severity
The boxplot provided a clearer view of the median, interquartile range, and overall spread of motor_UPDRS values.
The central 50% of observations were concentrated around the middle range, while both milder and more severe motor states were also present.
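The quantities read off the histogram and boxplot (mean, median, quartiles, interquartile range) can also be computed directly. The sketch below uses Python's `statistics` module on a small made-up list of scores; the values and the helper name `summarize` are assumptions for illustration and do not reproduce the project's results.

```python
import statistics

def summarize(values):
    """Return the summary statistics that a boxplot visualizes."""
    # method="inclusive" treats the data as the full population of interest.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    return {
        "mean": mean,
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        # A value near zero is consistent with a roughly symmetric distribution.
        "mean_minus_median": mean - median,
    }

# Hypothetical motor_UPDRS values, not taken from the dataset.
toy_scores = [6.0, 12.0, 18.0, 21.0, 24.0, 30.0, 36.0]
summary = summarize(toy_scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A small gap between mean and median is only a rough indication of symmetry; a skewness statistic or the histogram itself gives a fuller picture.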
Age as a Key Background Factor
Age showed a visible positive relationship with motor_UPDRS: older patients generally had higher scores.
At the same time, age alone did not fully explain motor symptom severity, since patients of similar age could still show different motor_UPDRS scores.
Voice Quality and Motor Severity
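Among the acoustic variables, HNR (harmonics-to-noise ratio) was examined as a measure of voice clarity. It expresses the ratio of the harmonic component of the voice signal to the noise component, so higher values indicate a cleaner voice.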
A weak negative relationship was observed between HNR and motor_UPDRS, suggesting that lower voice quality may be associated with greater motor symptom severity.
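The strength and direction of such a relationship are commonly summarized with a Pearson correlation coefficient. The sketch below implements it from its definition and applies it to made-up HNR and motor_UPDRS pairs; the numbers are illustrative assumptions only and are not the correlation found in this analysis.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired observations (not taken from the dataset).
toy_hnr = [18.0, 20.0, 22.0, 24.0, 26.0]
toy_updrs = [26.0, 19.0, 25.0, 17.0, 20.0]

r = pearson_r(toy_hnr, toy_updrs)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

The squared coefficient gives the share of variance in one variable that a straight-line fit on the other would account for, which is why a weak correlation implies little predictive value on its own. Because the records are repeated measurements per patient, a record-level coefficient should also be read with caution.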
Model Interpretation
A Random Forest model was used to estimate motor_UPDRS and examine feature importance.
Age emerged as the strongest predictor, followed by voice-based variables such as HNR, PPE, Shimmer.APQ11, and Jitter.RAP. Sex contributed comparatively little.
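A workflow of this kind can be sketched with scikit-learn as follows. This is not the project's actual code: the data are synthetic, the target formula is invented, the hyperparameters are arbitrary, and the patient-grouped hold-out split is an assumption about how evaluation could be set up. Only the feature names are taken from the text above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in data: 20 hypothetical patients, 30 recordings each.
n_patients, n_per_patient = 20, 30
groups = np.repeat(np.arange(n_patients), n_per_patient)
feature_names = ["age", "sex", "HNR", "PPE", "Shimmer.APQ11", "Jitter.RAP"]

# Age and sex are constant within a patient; acoustic features vary per recording.
age = np.repeat(rng.uniform(40, 85, n_patients), n_per_patient)
sex = np.repeat(rng.integers(0, 2, n_patients), n_per_patient).astype(float)
acoustic = rng.normal(size=(groups.size, 4))
X = np.column_stack([age, sex, acoustic])

# Invented target: depends on age and the first acoustic column, plus noise.
y = 0.3 * age - 1.0 * acoustic[:, 0] + rng.normal(scale=3.0, size=groups.size)

# Hold out whole patients so no patient appears in both train and test.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=groups))

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[train_idx], y[train_idx])

mae = mean_absolute_error(y[test_idx], model.predict(X[test_idx]))
importances = dict(zip(feature_names, model.feature_importances_))

for name, value in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>14}: {value:.3f}")
print(f"held-out MAE: {mae:.2f}")
```

The `feature_importances_` attribute reports impurity-based importances, which sum to 1 and describe how the fitted trees used each feature rather than any causal effect. Permutation importance on held-out patients is a common cross-check.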
Main Takeaways
- Voice-based acoustic features can reflect clinically relevant aspects of Parkinson’s disease.
- Age was the strongest predictor of motor_UPDRS in this analysis.
- HNR, PPE, Shimmer.APQ11, and Jitter.RAP contributed additional information.
- The model’s predictive performance remained limited.
- Voice-based telemonitoring appears promising as a complementary research direction, not as a standalone clinical decision-support tool.
Limitations
- The dataset contained repeated measurement records, so the number of records was not equal to the number of individual patients.
- Voice features may be influenced by recording conditions, microphone quality, background noise, and the patient’s current state.
- The dataset focused on early-stage Parkinson’s disease, which limits generalizability to more advanced disease stages.
Portfolio Summary
This project demonstrated how demographic and voice-based acoustic variables can be used to explore motor symptom severity in Parkinson’s disease.
The results suggest that voice biomarkers may carry complementary information, especially when combined with demographic and clinical variables.
The analysis also highlighted an important practical limitation: voice features alone are not sufficient for accurate clinical-level prediction. More robust models would likely require longitudinal modeling, additional clinical variables, and external validation.
