Predicting the Conversion from Mild Cognitive Impairment to Alzheimer's Disease using Evolution Patterns

Andreia Liliana Duarte Fernandes Ferreira

Thesis to obtain the Master of Science Degree in Biomedical Engineering

Supervisors:
Professor Sara Alexandra Cordeiro Madeira
Doctor Alexandre Valério de Mendonça

Examination Committee
Chairperson: Professor Raúl Daniel Lavado Carneiro Martins
Supervisor: Doctor Alexandre Valério de Mendonça
Member of the Committee: Professor Alexandra Sofia Martins de Carvalho

December 2014

Acknowledgments

My first acknowledgement goes to my supervisor, Sara Madeira, for guiding me throughout this work and for the opportunity to be part of this project. I would also like to thank Alexandre Mendonça for the clinical feedback and for the commitment to this project. I express my gratitude to Luís Lemos for the constant availability to clarify my doubts and for the constructive criticism and positive words. I would like to thank Ricardo for the help and the constant words of encouragement. I would also like to thank the NEUROCLINOMICS group for always being promptly available to answer any question, especially Telma for the unconditional support, for never letting me doubt myself, for the caring friendship and for the moments shared over the last months.

I would like to acknowledge my family, especially my mother, the strongest person I have ever known, for the patience, for the guidance in all phases of my life and for teaching me never to give up.

I would like to thank my friends, especially Simone, Serineu and Carolina. To Simone, I am grateful for our friendship, for the support in all the decisions I made, for believing in me and for the countless hours we shared learning from each other and growing up together. To Serineu and Carolina, thank you for the beautiful companionship, for being right next to me to celebrate every accomplishment and for the care and the strength in the worst moments.
Finally, I would like to thank my biomedical colleagues, especially Teresa, my dear friend and companion throughout these five years, for the affection and the comforting words, for the unforgettable moments and the loud laughter we shared.

This work was supported by FCT through the NEUROCLINOMICS (PTDC/EIA-EIA/111239/2009) project.

Abstract

Declines in cognitive function, together with other evidence of neurological degeneration, become increasingly likely as healthy people age [1,3]. Alzheimer's Disease (AD) is a neurodegenerative disease characterized by progressive deterioration of cognitive function and is the most common cause of dementia in elderly people. Mild Cognitive Impairment (MCI) is considered a prodromal state that represents the transitional period between normal ageing and dementia. As such, it warrants special attention, since it carries a higher risk of progression to dementia. Defining this clinical entity is therefore fundamental to the timely administration of pharmaceutical and therapeutic interventions, improving patients' quality of life.

This thesis aims to predict the evolution of MCI patients to AD using two approaches: the first, where all patients are assumed to evolve similarly, and the second, where patient profiles are considered. Time windows of two to five years are used for prediction. Initially, we applied supervised learning methods, with feature selection to reduce the dimensionality of the problem. Then, standard clustering algorithms were applied to study the potential existence of MCI subtypes. The patients were also divided according to their state of depression, based on clinical information.

The results demonstrated the importance of considering longer time intervals when predicting the conversion of MCI patients to AD, and showed that grouping patients according to their depressive symptoms has a positive influence on the prognosis results.
The clustering analyses validated the importance of studying MCI subgroups, taking the different characteristics of this clinical entity into account in the prediction models.

Keywords: Alzheimer's Disease, MCI, Temporal Window, Prognosis, Classification, Clustering.

Summary

Evidence of neurological degeneration becomes increasingly likely with ageing [1,3]. Alzheimer's Disease (AD) is a neurodegenerative disease characterized by progressive cognitive deterioration and is the most common cause of dementia in the elderly. Mild Cognitive Impairment (MCI) is a state that represents the transitional period between normal ageing and dementia. It therefore deserves special attention, since it carries a higher risk of progressing to dementia. Defining this clinical entity is thus fundamental to the timely administration of pharmaceutical products and therapeutic interventions, improving the patient's quality of life.

This thesis aims to predict the progression of MCI patients to AD using two approaches: the first assumes that patients evolve similarly, and the second considers their profiles. Time windows of two to five years are used for prediction. Initially, supervised learning methods and feature selection techniques were used to effectively reduce the dimensionality of the problem. Subsequently, standard clustering algorithms were applied in order to study potential MCI subgroups. Patients were also divided according to their state of depression, based on clinical information.

The results demonstrated the importance of lengthening the time windows when predicting the conversion of MCI patients to AD, and showed that separating patients according to depressive state has a positive influence on the prognosis results. The clustering analyses validated the importance of studying MCI subgroups in predictive models.
Keywords: Alzheimer's Disease, MCI, Time Windows, Prognosis, Classification, Clustering

Contents

1. Introduction
    1.1 Problem Formulation
    1.2 Goals
    1.3 Dissertation Outline
2. Background
    2.1 Alzheimer's Disease and Mild Cognitive Impairment
    2.2 Data Mining techniques
        2.2.1 Data Preprocessing
        2.2.2 Feature Selection
        2.2.3 Unsupervised Learning Methods
        2.2.4 Supervised Learning Methods
        2.2.5 Class Imbalance
    2.3 Result Evaluation
    2.4 Related Work
3. Experimental Methodology
    3.1 Description of the Data
    3.2 Description of the Tools
    3.3 Dataset Preprocessing
        3.3.1 Outlier Detection
        3.3.2 Data Cleaning
        3.3.3 Creating learning examples
        3.3.4 Handling Missing values
        3.3.5 Feature Selection
    3.4 Classification
        3.4.1 Classification Model
4. Predicting conversion from MCI to AD
    4.1 First Last Approach
        4.1.1 Cross-Validation results
        4.1.2 Validation results
    4.2 Two Years Temporal Window
        4.2.1 Cross-Validation results
        4.2.2 Validation results
    4.3 Three Years Temporal Window
        4.3.1 Cross-Validation results
        4.3.2 Validation results
    4.4 Four Years Temporal Window
        4.4.1 Cross-Validation results
        4.4.2 Validation results
    4.5 Five Years Temporal Window
        4.5.1 Cross-Validation Results
        4.5.2 Validation Results
    4.6 Discussion
5. Predicting conversion from MCI to AD based on different MCI characteristics
    5.1 Prognosis prediction based on clinical criteria: depressed/not depressed
        5.1.1 Clustering
        5.1.2 Classification
            5.1.2.1 Two Years Temporal Window
            5.1.2.2 Three Years Temporal Window
            5.1.2.3 Four Years Temporal Window
            5.1.2.4 Five Years Temporal Window
        5.1.3 Discussion
    5.2 Prognosis prediction based on Patient Similarities
        5.2.1 Clustering
        5.2.2 Classification
            5.2.2.1 Two Years Temporal Window
            5.2.2.2 Three Years Temporal Window
            5.2.2.3 Four Years Temporal Window
            5.2.2.4 Five Years Temporal Window
        5.2.3 Discussion
6. Conclusions and Future Work
7. References
A. Appendix Medical Exams
B. Complementary Results: Chapter 4
C. Complementary Results: Chapter 5
    C.1 Clinical criteria: depressed/not depressed
    C.2 Patient Similarities