Enhancing Parkinson’s Disease Classification: Evaluating SVM, Decision Tree and Ensemble Learning with Advanced Preprocessing Strategies
DOI: https://doi.org/10.63075/kg526555
Abstract
Parkinson’s disease (PD), the second most common neurodegenerative disorder, affects millions of people worldwide and encompasses a wide variety of motor and non-motor symptoms that severely impair quality of life. This research aims to improve diagnostic precision for PD using advanced machine learning (ML) algorithms, which can help identify the condition and differentiate it from similar neurodegenerative diseases during its preclinical phase. Employing a quantitative research design, the study uses a comprehensive dataset from the Telemonitoring Database for Parkinson’s disease, which contains clinical, genetic, and neuroimaging information from patients. The dataset consists of 5,875 patient records that include demographic information, assessments of motor and non-motor symptoms, and vocal impairment features vital for PD diagnosis. The stages of PD were classified using five ML models: Support Vector Machine (SVM), Random Forest, Decision Trees, Gradient Boosting, and Neural Networks, each of which was trained and tested for classification performance. The models were evaluated on accuracy, precision, recall, and F1 score, and cross-validation was used to assess how well their performance generalizes. Of the models tested, Decision Trees achieved the highest accuracy at 99.32%, although a figure this high may indicate overfitting. Random Forest and Gradient Boosting also performed well, with over 96% accuracy, demonstrating their effectiveness on complex, high-dimensional data. SVM and Neural Networks were less accurate than the other methods but showed potential for initial screening and for handling nonlinear relationships in the data. The results of this study indicate that ML models can transform PD diagnostic processes through early and precise detection, which can improve patient care and management and help optimize treatment strategies and outcomes.
This study supports the use of these models in clinical practice, as they could provide accurate diagnostics, help track the course of the disease, and enable targeted adjustments to therapy. Developing these models further, broadening the diversity of the datasets, and investigating their practical use to ensure clinical relevance should improve patient outcomes and better address patients’ care needs.
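
As an illustration of the evaluation metrics named in the abstract (accuracy, precision, recall, and F1 score), the following minimal sketch computes them from a list of true and predicted labels. It is not the study's code: the function name `classification_metrics`, the binary labelling (1 = PD, 0 = non-PD), and the example labels are all assumptions made for illustration. For a multi-class task such as staging, the same function could be applied once per class by passing that class as `positive`.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration only (1 = PD, 0 = non-PD).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Reporting precision, recall, and F1 alongside accuracy matters when classes are imbalanced, since a model can reach high accuracy while missing most cases of a rare class.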