Advanced Computational Approaches to Alzheimer's Research (selected)
Circular-SWAT for deep learning based diagnostic classification of Alzheimer’s disease: Application to metabolome data
Taeho Jo et al., eBioMedicine (2023)
This study introduces the Circular-Sliding Window Association Test (c-SWAT), a methodology designed to enhance the diagnostic classification of Alzheimer's disease (AD) using serum-based metabolomics data, with a focus on lipidomics. Leveraging data from 997 participants, c-SWAT integrates feature correlation analysis, feature selection via convolutional neural networks (CNNs), and final classification through Random Forest, achieving an accuracy of up to 80.8% and an AUC of 0.808 in distinguishing AD from cognitively normal older adults.
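A minimal sketch of the circular-window idea described above, assuming correlation-ordered features and an illustrative window size; the per-window scorer here is a plain logistic regression stand-in rather than the paper's CNN, and all data and names are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def circular_windows(n_features, window, step):
    """Yield index windows that wrap around the end of the feature axis."""
    for start in range(0, n_features, step):
        yield np.array([(start + k) % n_features for k in range(window)])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 781))           # e.g. 781 lipids per participant (toy data)
y = rng.integers(0, 2, size=200)          # 0 = CN, 1 = AD (toy labels)

# Score each circular window by cross-validated accuracy of a simple classifier.
scores = []
for idx in circular_windows(X.shape[1], window=30, step=30):
    acc = cross_val_score(LogisticRegression(max_iter=1000), X[:, idx], y, cv=5).mean()
    scores.append((acc, idx))

# Keep features from the best-scoring windows and train a final Random Forest on them.
best = sorted(scores, key=lambda s: s[0], reverse=True)[:5]
selected = np.unique(np.concatenate([idx for _, idx in best]))
final_acc = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                            X[:, selected], y, cv=5).mean()
print(f"{selected.size} features selected; toy cross-validated accuracy {final_acc:.3f}")
```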
Deep Learning-based Integration of Neuroimaging and Genetic Data for Classification of Alzheimer's Disease
Presenting Author: Taeho Jo, AAIC (2023)
This study introduces a new deep learning method using CNNs to analyze tau PET images and identify Alzheimer's disease (AD)-related patterns. The method achieved 90.8% accuracy in classifying AD and highlighted significant tau deposition regions associated with AD. Additionally, we used the SWAT method to find AD-related SNPs, uncovering key genetic loci, including the well-known APOE region, and achieved an AUC of 0.82.
Deep Learning-based SWAT-Tab Approach for Identifying Genetic Variants using Whole Genome Sequencing
Presenting Author: Taeho Jo, AAIC (2023)
The study introduces SWAT-Tab, an evolved form of SWAT-CNN optimized for identifying genetic variants in Alzheimer's disease (AD). It uses the TabNet algorithm, which selects relevant features through sequential attention, and was applied to ADSP whole genome sequencing (WGS) data, revealing pivotal genetic features. SWAT-Tab demonstrated enhanced efficiency, offering reduced processing time and easier implementation compared with its predecessor.
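As a rough illustration of TabNet-based feature selection on genotype-like input, here is a hedged sketch assuming the community `pytorch-tabnet` package (pip install pytorch-tabnet) and synthetic dosage data; hyperparameters and variable names are assumptions, not the study's ADSP pipeline.

```python
import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(1000, 500)).astype(np.float32)  # toy SNP dosages (0/1/2)
y = rng.integers(0, 2, size=1000)                            # toy case/control labels

X_train, X_valid = X[:800], X[800:]
y_train, y_valid = y[:800], y[800:]

clf = TabNetClassifier(n_d=8, n_a=8, n_steps=3, seed=0)
clf.fit(X_train, y_train,
        eval_set=[(X_valid, y_valid)],
        max_epochs=50, patience=10)

# Sequential attention yields per-feature importances; rank candidate variants by them.
top = np.argsort(clf.feature_importances_)[::-1][:20]
print("Top-ranked toy features:", top)
```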
Novel circling SWAT for deep learning based diagnostic classification of Alzheimer’s disease: Application to metabolome data
Presenting Author: Taeho Jo, AAIC (2022)
We used serum-based cross-sectional lipidome data with 781 lipids from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), including 216 cognitively normal (CN), 635 MCI, and 382 dementia (AD) participants. Phenotype influence scores (PIS) were derived using a deep learning-based circling Sliding Window Association Test (circling SWAT), an extension of SWAT (Jo et al., 2022) with correlation heatmap and dendrogram analysis for omics data with minimal features.
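For intuition, here is a hedged sketch of one common way a phenotype influence score can be computed, via occlusion: zero out a window of features and measure the change in the model's predicted probability. The model, window size, and scoring details are illustrative stand-ins, not necessarily the study's exact definition.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 120))            # toy lipid features
y = rng.integers(0, 2, size=300)           # toy CN/AD labels

model = LogisticRegression(max_iter=1000).fit(X, y)
baseline = model.predict_proba(X)[:, 1].mean()

window = 10
pis = []
for start in range(0, X.shape[1], window):
    X_occluded = X.copy()
    X_occluded[:, start:start + window] = 0.0      # occlude one window of features
    occluded = model.predict_proba(X_occluded)[:, 1].mean()
    pis.append(abs(baseline - occluded))            # influence = change in mean probability

print("Most influential window starts at feature", int(np.argmax(pis)) * window)
```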
Deep learning-based identification of genetic variants: application to Alzheimer’s disease classification
Taeho Jo et al., Briefings in Bioinformatics (2022)
We propose a novel three-step approach (SWAT-CNN) for the identification of genetic variants using deep learning, which identifies phenotype-related single nucleotide polymorphisms (SNPs) that can be used to develop accurate disease classification models. We tested our approach using GWAS data from ADNI (N = 981; CN = 650, AD = 331). Our approach identified the well-known APOE region as the most significant genetic locus for AD. Our classification model achieved an AUC of 0.82.
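As a small illustration of the kind of input such a model consumes, here is a hedged sketch that encodes genotype calls into additive dosage values (0/1/2 copies of the minor allele); the encoding scheme and function names are assumptions for illustration, not necessarily the paper's exact preprocessing.

```python
import numpy as np

def dosage_encode(genotypes, minor_allele):
    """Encode genotype strings (e.g. 'AG') as minor-allele counts: 0, 1, or 2.
    Missing calls ('./.', 'NN', or empty) are encoded as np.nan."""
    out = np.full(len(genotypes), np.nan)
    for i, g in enumerate(genotypes):
        if g in ("./.", "NN", ""):
            continue
        out[i] = sum(allele == minor_allele for allele in g)
    return out

# Toy example: one SNP across five individuals, minor allele 'G'
print(dosage_encode(["AA", "AG", "GG", "NN", "AG"], minor_allele="G"))
# -> [ 0.  1.  2. nan  1.]
```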
Deep learning–based genome-wide association analysis in Alzheimer’s disease
Presenting Author: Taeho Jo, AAIC (2021)
We used genome-wide genotyping data (12,448,786 SNPs following imputation) from 916 participants in the Alzheimer’s Disease Neuroimaging Initiative (458 cognitively normal controls and 458 AD patients). A convolutional neural network (CNN) consisting of convolutional, pooling and fully connected Softmax layers was used in a two-stage approach.
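A minimal Keras sketch of the layer stack described above (convolution, pooling, and a fully connected softmax output) applied to a window of SNP dosages; the window length, filter counts, and other hyperparameters are illustrative assumptions, not the study's settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

WINDOW_SNPS = 200  # assumed window length; the real width is a tunable hyperparameter

model = models.Sequential([
    layers.Input(shape=(WINDOW_SNPS, 1)),          # SNP dosages as a 1-D sequence
    layers.Conv1D(32, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),          # CN vs. AD class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```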
Deep learning detection of informative features in tau PET for Alzheimer’s disease classification
Taeho Jo et al., BMC Bioinformatics (2020)
We developed a deep learning-based framework to identify informative features for AD classification using tau positron emission tomography (PET) scans. The 3D convolutional neural network (CNN)-based model for classifying AD versus cognitively normal (CN) participants yielded an average accuracy of 90.8% based on five-fold cross-validation. A layer-wise relevance propagation (LRP) model identified the brain regions in the tau PET images that contributed most to the classification of AD from CN.
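A hedged Keras sketch of a 3-D CNN classifier of the kind described, taking a tau PET volume and producing CN/AD probabilities; the input resolution, layer sizes, and dropout rate are placeholders, not the paper's architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed (downsampled) PET volume shape; the real resolution depends on preprocessing.
VOLUME_SHAPE = (64, 64, 64, 1)

model = models.Sequential([
    layers.Input(shape=VOLUME_SHAPE),
    layers.Conv3D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling3D(pool_size=2),
    layers.Conv3D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling3D(pool_size=2),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(2, activation="softmax"),   # CN vs. AD
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```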
Deep learning detection of informative features in [18F] flortaucipir PET for Alzheimer’s disease classification
Presenting Author: Taeho Jo, AAIC (2020)
We downloaded 458 tau PET images (196 CN, 196 MCI, and 66 AD) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), including only one scan per individual. SPM12 was used to process the tau PET data with standard techniques. We used a 3D convolutional neural network (CNN) for classification and applied a layer-wise relevance propagation (LRP) algorithm to identify informative features and visualize the classification results. Five-fold cross-validation was applied, with 70% of the data set used for model training, 20% for model testing, and 10% for independent validation.
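For intuition about the relevance propagation step, here is a hedged NumPy sketch of the LRP epsilon rule for a single fully connected layer; in the actual framework the rule is applied layer by layer back through the CNN, and this toy example (with random weights and a bias folded into the stabilized denominator) is only illustrative.

```python
import numpy as np

def lrp_epsilon_dense(a, W, b, R_out, eps=1e-6):
    """Redistribute output relevance R_out onto the inputs of one dense layer.

    a     : (n_in,)        activations entering the layer
    W, b  : (n_in, n_out), (n_out,) layer weights and biases
    R_out : (n_out,)       relevance assigned to the layer's outputs
    """
    z = a @ W + b                      # pre-activations
    z = z + eps * np.sign(z)           # epsilon stabilizer
    s = R_out / z                      # per-output scaling
    return a * (W @ s)                 # relevance of each input unit

# Toy layer: 4 inputs, 2 outputs, all relevance initially on the "AD" output
rng = np.random.default_rng(0)
a = rng.random(4)
W = rng.normal(size=(4, 2))
b = np.zeros(2)
R_out = np.array([0.0, 1.0])
print(lrp_epsilon_dense(a, W, b, R_out))
```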
Deep Learning in Alzheimer's Disease: Diagnostic Classification and Prognostic Prediction Using Neuroimaging Data
Taeho Jo et al., Frontiers in Aging Neuroscience (2019)
The application of deep learning to early detection and automated classification of AD has recently gained considerable attention, as rapid progress in neuroimaging techniques has generated large-scale multimodal neuroimaging data. A systematic review of publications using deep learning and neuroimaging data for diagnostic classification of AD was performed. A PubMed and Google Scholar search was used to identify deep learning papers on AD published between January 2013 and July 2018. These papers were reviewed, evaluated, and classified by algorithm and neuroimaging type, and the findings were summarized.
Multimodal-3DCNN: Diagnostic Classification of Alzheimer's Disease Using Deep Learning on Neuroimaging, Genetic, and Demographic Data
Presenting Author: Taeho Jo, AAIC (2019)
Demographic information, 3D MRI and PET image data, and APOE data were downloaded from the ADNI data repository (N = 329; 185 CN and 144 AD). In our novel Multimodal-3DCNN approach, we first applied a 3D convolutional neural network (3D-CNN) to multimodal neuroimaging (MRI and PET) and then combined the output of the 3D-CNN with APOE ε4 genotype and demographic information (age, sex, education, handedness, etc.) using a gram matrix method (mCNN; Jo et al., AAIC 2018). Finally, a deep neural network (DNN) was used to distinguish individuals with AD from CN. A 5-fold cross-validation approach was employed to evaluate performance.
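A hedged sketch of one plausible way imaging-derived features can be fused with genetic and demographic variables through a Gram-style (outer-product) encoding before a DNN classifier; the mCNN construction from Jo et al. (AAIC 2018) is not detailed here, so the fusion scheme, dimensions, and names below are assumptions on toy data.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
n_subjects = 128

cnn_features = rng.normal(size=(n_subjects, 32))   # toy 3D-CNN output per subject
clinical = rng.normal(size=(n_subjects, 6))        # toy APOE e4 count + demographics

# Fuse modalities: concatenate, then form a per-subject Gram (outer-product) matrix.
fused = np.concatenate([cnn_features, clinical], axis=1)   # (n_subjects, 38)
gram = np.einsum("ni,nj->nij", fused, fused)               # (n_subjects, 38, 38)

# Simple DNN on the flattened Gram matrices, CN vs. AD
model = models.Sequential([
    layers.Input(shape=gram.shape[1:]),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

y = rng.integers(0, 2, size=n_subjects)                    # toy labels
model.fit(gram, y, epochs=2, batch_size=16, verbose=0)
```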
Multimodal-CNN: Improved Accuracy of MRI-based Classification of Alzheimer’s Disease by Incorporating Clinical Data in Deep Learning
Presenting Author: Taeho Jo, AAIC (2018)
Intermediate layers of the CNN were extracted, and each patient's clinical information was incorporated using the gram matrix method, in which the clinical variables are encoded as 2D matrices. The 2D training images were extracted using hippocampal segmentations downloaded from the LONI ADNI site, which had been generated with Surgical Navigation Technologies (SNT). A CNN with data augmentation was trained on baseline scans from 103 participants with AD and 144 cognitively normal (CN) controls. Global CDR scores and the number of APOE ε4 alleles were included as clinical and genetic data.
Evaluation of Protein Structural Models Using Random Forests
Renzhi Cao, Taeho Jo, Jianlin Cheng, arXiv (2016)
We propose a new protein quality assessment method that can predict both the local and global quality of protein 3D structural models. Our method uses both multi-model and single-model quality assessment methods for global quality assessment, and uses chemical, physical, and geometrical features, together with the global quality score, for local quality assessment. CASP9 targets are used to generate the features for local quality assessment. We evaluate the performance of our local quality assessment method on CASP10, where it is comparable to two state-of-the-art QA methods in terms of the average absolute difference between the real and predicted distances. We blindly tested our method on CASP11, and its good performance shows that combining single-model and multi-model quality assessment methods can be a good way to improve the accuracy of model quality assessment.
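A hedged scikit-learn sketch of the general idea: a random forest regressor trained on per-residue chemical/physical/geometrical features to predict local model quality (a real-versus-predicted distance), evaluated by mean absolute error; the features and targets below are synthetic placeholders, not CASP data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_residues, n_features = 5000, 20
X = rng.normal(size=(n_residues, n_features))        # toy per-residue features
y = np.abs(rng.normal(scale=3.0, size=n_residues))   # toy real-vs-model distances (Å)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

pred = rf.predict(X_test)
print(f"Mean absolute error: {mean_absolute_error(y_test, pred):.2f} Å")
```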
Improving Protein Fold Recognition by Deep Learning Networks
Taeho Jo et al., Scientific Reports (2015)
Improving protein fold recognition by random forest
Taeho Jo et al., BMC Bioinformatics (2014)
RF-Fold consists of hundreds of decision trees that can be trained efficiently on very large datasets to make accurate predictions on a highly imbalanced dataset. We evaluated RF-Fold on the standard Lindahl's benchmark dataset, comprising 976 × 975 target-template protein pairs, through cross-validation. Compared with 17 different fold recognition methods, the performance of RF-Fold is generally comparable to the best performance at each level of difficulty, from the easiest family level through the medium-hard superfamily level to the hardest fold level. Based on the top-one template protein ranked by RF-Fold, the correct recognition rate is 84.5%, 63.4%, and 40.8% at the family, superfamily, and fold levels, respectively. Based on the top-five template protein folds ranked by RF-Fold, the correct recognition rate increases to 91.5%, 79.3%, and 58.3% at the family, superfamily, and fold levels, respectively.
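A hedged sketch of the ranking step: a random forest scores each target-template pair, templates are ranked per target by predicted probability, and the top-1 / top-5 recognition rate counts how often a correct-fold template appears among the highest-ranked hits. The data, pairwise features, and helper function here are synthetic stand-ins evaluated on the training data purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_targets, n_templates, n_features = 50, 80, 12

# Toy pairwise features and same-fold labels for each target-template pair
X = rng.normal(size=(n_targets * n_templates, n_features))
y = rng.integers(0, 2, size=n_targets * n_templates)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
scores = rf.predict_proba(X)[:, 1].reshape(n_targets, n_templates)
labels = y.reshape(n_targets, n_templates)

def top_k_rate(scores, labels, k):
    """Fraction of targets with at least one correct-fold template in the top k."""
    order = np.argsort(-scores, axis=1)[:, :k]
    hits = np.take_along_axis(labels, order, axis=1).max(axis=1)
    return hits.mean()

print("top-1:", top_k_rate(scores, labels, 1), " top-5:", top_k_rate(scores, labels, 5))
```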
Homology Modeling of an Algal Membrane Protein, Heterosigma akashiwo Na+-ATPase
Taeho Jo et al., Membrane (2010)
The three-dimensional structure of Heterosigma akashiwo Na+-ATPase (HANA) was predicted by means of homology modeling based on the crystal structure of the K+-bound form of shark Na+/K+-ATPase (PDB ID: 2ZXE). The overall structure of HANA appears to be similar to that of shark Na+/K+-ATPase. Both contain three characteristic cytoplasmic domains, A, N, and P, which are unique to P-type ATPases. HANA has a long TM7-8 junction as a large extracellular domain, in place of the β-subunit of shark Na+/K+-ATPase. Two putative K+-binding sites in the transmembrane domain of HANA were identified by means of valence mapping based on the constructed structure. The presence of K+-binding sites and the reported ion requirements for ATPase activity and EP formation indicate that HANA may transport K+ ions in the same manner as animal Na+/K+-ATPases.