Högskolan i Skövde

Comparison of different machine learning models to predict glaucoma
University of Skövde, School of Bioscience.
2023 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 30 credits / 45 HE credits. Student thesis
Abstract [en]

Glaucoma is an important cause of irreversible blindness and has a great impact on patients' quality of life, on society, and on the healthcare budget. Early detection of glaucoma can reduce this impact. Primary open-angle glaucoma is one of the most prevalent types of glaucoma. This study evaluates five machine learning methods, linear discriminant analysis (LDA), logistic regression (LR), k-nearest neighbors (KNN), random forest (RF), and support vector machine (SVM), to find the method with the highest accuracy. A total of 258 antigens were analyzed in 30 patients with exfoliative glaucoma and 30 healthy donors. Principal component analysis and heatmap clustering detected two outlier samples. A moderated t-test was conducted with the limma package in R to identify differential reactivity of antibodies binding to the antigens. The differential antigens were sorted by p-value in ascending order, and three data-groups were created containing the top 10, 20, and 40 most significant antigens. All five machine learning methods were applied to each of the three data-groups. The training accuracy and the area under the ROC curve were calculated. The models were validated with 10-fold cross-validation, and the accuracy of the validated models was computed. Different parameter values were examined for each method to achieve the highest accuracy. RF, SVM, and KNN obtained their highest accuracy with data-group 2 (the top 20 most significant antigens), whereas LR and LDA obtained their highest accuracy with the top 10 most significant antigens. In 10-fold cross-validation, the best accuracies were, in descending order, RF (78.3%), SVM (73.3%), LR (72.7%), LDA (71.3%), and KNN (67.3%). Data-group 3, with the 40 most significant antigens, did not show any notable accuracy advantage over the other groups. In conclusion, applying machine learning methods to our data yielded the highest accuracy for the random forest model using the top 20 most significant antigens.
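
The workflow in the abstract (rank antigens by a two-group test, keep the top 10, 20, or 40, then estimate accuracy with 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the thesis code: the data are synthetic, a pooled two-sample t-statistic stands in for limma's moderated t-test, only KNN of the five methods is shown, and all function names (`pooled_t`, `rank_features`, `cv_accuracy`, and so on) are hypothetical. The sketch also re-ranks features inside each training fold so that held-out samples do not influence feature selection; the abstract does not say whether the thesis did this.

```python
import random


def pooled_t(a, b):
    """Pooled-variance two-sample t-statistic (stand-in for limma's moderated t)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    sp2 = ss / (len(a) + len(b) - 2)
    se = (sp2 * (1 / len(a) + 1 / len(b))) ** 0.5
    return (ma - mb) / se if se > 0 else 0.0


def rank_features(X, y):
    """Feature indices by decreasing |t|; with equal degrees of freedom for
    every feature this is the same order as increasing p-value."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(pooled_t(pos, neg)))
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((u - v) ** 2 for u, v in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0


def stratified_folds(y, n_folds, rng):
    """Split sample indices into folds with balanced class counts."""
    folds = [[] for _ in range(n_folds)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def cv_accuracy(X, y, top_k, n_neighbors=5, n_folds=10, seed=0):
    """10-fold CV accuracy of KNN on the top_k features ranked per training fold."""
    rng = random.Random(seed)
    correct = 0
    for test_idx in stratified_folds(y, n_folds, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        keep = rank_features(train_X, train_y)[:top_k]
        reduced = [[row[j] for j in keep] for row in train_X]
        for i in test_idx:
            x = [X[i][j] for j in keep]
            correct += int(knn_predict(reduced, train_y, x, n_neighbors) == y[i])
    return correct / len(y)


def make_synthetic(n_per_class=30, n_features=258, n_informative=20,
                   shift=1.5, seed=1):
    """Synthetic stand-in data: 30 + 30 samples, 258 features, 20 informative."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += shift * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_synthetic()
results = {k: cv_accuracy(X, y, top_k=k) for k in (10, 20, 40)}
for k, acc in results.items():
    print(f"top {k} features: 10-fold CV accuracy = {acc:.3f}")
```

The accuracies printed here reflect only the synthetic data and say nothing about the thesis results. In practice the same comparison would more likely be run with an established library, and the other four classifiers would be evaluated with the same fold structure.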

Place, publisher, year, edition, pages
2023, p. 28
National Category
Biological Sciences
Identifiers
URN: urn:nbn:se:his:diva-23051
OAI: oai:DiVA.org:his-23051
DiVA, id: diva2:1784284
Subject / course
Systems Biology
Educational program
Biomarkers in Molecular Medicine - Master's Programme 120 ECTS
Available from: 2023-07-26. Created: 2023-07-26. Last updated: 2023-07-26. Bibliographically approved

Open Access in DiVA

fulltext (1196 kB), 78 downloads
File information
File name: FULLTEXT01.pdf
File size: 1196 kB
Checksum (SHA-512): 0ef8003d20049794938d7383c89041d29614174a5e3f5d2d3fb4a1c15ef96e28cdea22d6cfba3ecad8488f1ba1009a83cb897129d3fb3d0b302a1f6a407ccc22
Type: fulltext
Mimetype: application/pdf
