Medical imaging datasets, such as those from magnetic resonance imaging, are increasingly being used to investigate the genetic architecture of the brain. These images are commonly used as imaging-specific or imaging-derived phenotypes in genotype-phenotype association studies. For this type of phenotype, multivariate genome-wide association study (GWAS) designs are considered better suited than univariate methods because they can account for the inherent correlations between phenotypes of brain structures determined from medical images. The main objective of this work is to establish and evaluate a comprehensive pipeline for investigating genotype-phenotype associations of the human brain using canonical component analysis, a form of multivariate GWAS and machine learning. As a proof-of-principle, the proposed pipeline was used to determine genotype-phenotype associations for cortical brain region volumes in subjects with attention-deficit hyperactivity disorder. Using the developed pipeline, several significant (p-value < 5E−04) single nucleotide polymorphisms were found that reside in or near genes, such as DSCAM and DPYSL2, that are known to be associated with neurological and mental disorders or with substance addiction, a common comorbidity in subjects with attention-deficit hyperactivity disorder. These clinically meaningful results show that the proposed pipeline using canonical component analysis can be used to investigate the genetic architecture of the brain.