Enhancing early detection of dementia using inter-relation-based features and oversampling technique

Yanawut Chaiyo

Please use this identifier to cite or link to this item: http://mfuir.mfu.ac.th:80/xmlui/handle/123456789/1072

Title:	Enhancing early detection of dementia using inter-relation-based features and oversampling technique
Authors:	Yanawut Chaiyo
Advisor:	Punnarumol Temdee
Keywords:	Dementia;Classification;Machine learning;Oversampling Technique
Issue Date:	2025
Publisher:	Mae Fah Luang University. Learning Resources and Educational Media Centre
Abstract:	Dementia affects both individuals and society, making early detection essential for effective management. However, reliance on advanced laboratory tests and specialized expertise limits accessibility and hinders timely diagnosis. To address this challenge, this study proposes an approach that uses readily available biochemical and physiological features from electronic health records to develop a machine learning-based binary classification model, improving accessibility and early detection. The model was built on a dataset from Phachanukroh Hospital in Chiang Rai, Thailand. A hybrid data enrichment framework combining feature augmentation and data balancing was proposed to increase data dimensionality. Inter-Relation-Based Features (IRFs) were introduced to enhance data diversity and promote explainability by applying medical domain knowledge to make features more informative. To balance the data, the K-Means Synthetic Minority Oversampling Technique (K-Means SMOTE) was applied to generate synthetic samples in underrepresented regions of the feature space, improving the handling of class imbalance. Extra Trees (ET) was proposed for model construction because of its noise resilience and ability to manage multicollinearity. Its performance was compared with that of Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Artificial Neural Networks (ANN), Random Forest (RF), and Gradient Boosting (GB). Results revealed that the ET model significantly outperformed the other models on the combined dataset with four IRFs and K-Means SMOTE across key metrics, including accuracy (96.47%), precision (94.79%), recall (97.86%), F1-score (96.30%), and area under the Receiver Operating Characteristic curve (99.51%).
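The F1-score reported in the abstract is the harmonic mean of precision and recall, so the three figures can be checked against each other. The following is a minimal sketch of that check, not code from the dissertation; the function name `f1_score` and the variable names are illustrative assumptions, and only the three percentages come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 94.79
reported_recall = 97.86
reported_f1 = 96.30

# Recompute F1 from the reported precision and recall.
computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.2f}%, reported F1 = {reported_f1:.2f}%")
```

Under these assumptions, the recomputed value rounds to the reported F1-score, which suggests the three metrics are mutually consistent.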
Description:	Dissertation (Ph.D.) -- Computer Engineering, School of Applied Digital Technology. Mae Fah Luang University, 2025
URI:	http://mfuir.mfu.ac.th:80/xmlui/handle/123456789/1072
Appears in Collections:	ดุษฎีนิพนธ์ (Dissertation)

Files in This Item:

File	Description	Size	Format
140079-Fulltext.pdf	Fulltext	6.13 MB	Adobe PDF
140079-Abstract.pdf	Abstract	608.45 kB	Adobe PDF
