Variance component model to account for sample structure in genome-wide association studies
- PMID: 20208533
- PMCID: PMC3092069
- DOI: 10.1038/ng.548
Abstract
Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.
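The inflation of test statistics mentioned above is commonly summarized by a genomic inflation factor, and genomic control (one of the baselines the abstract compares against) rescales the statistics by that factor. The following is a minimal sketch of that baseline, not of EMMAX itself. It assumes 1-degree-of-freedom chi-square association statistics; the function names and the example values are hypothetical and chosen only for illustration.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (approx. 0.4549).
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Observed median statistic divided by the median expected under the null."""
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide every statistic by lambda; a lambda below 1 is clamped to 1."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [s / lam for s in chi2_stats]


# Hypothetical statistics for illustration only (not data from the study).
stats = [0.2, 0.5, 0.91, 1.8, 6.3]
lam = inflation_factor(stats)      # well above 1 here, i.e. inflated
adjusted = genomic_control(stats)  # median is rescaled to the null median
```

Because genomic control applies one uniform correction to every marker, it cannot distinguish markers that are strongly affected by sample structure from those that are not. A variance component model instead accounts for pairwise relatedness between individuals when testing each marker.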
Grants and funding
- N01ES45530/ES/NIEHS NIH HHS/United States
- P30 1MH083268/MH/NIMH NIH HHS/United States
- HL087679-01/HL/NHLBI NIH HHS/United States
- 5PL1NS062410-03/NS/NINDS NIH HHS/United States
- R01 HL087679/HL/NHLBI NIH HHS/United States
- NH084698/NH/NIH HHS/United States
- U01-DA024417/DA/NIDA NIH HHS/United States
- UL1 DE019580/DE/NIDCR NIH HHS/United States
- 6R01HL087679-03/HL/NHLBI NIH HHS/United States
- K25 HL080079-05/HL/NHLBI NIH HHS/United States
- K25 HL080079/HL/NHLBI NIH HHS/United States
- GM053275-14/GM/NIGMS NIH HHS/United States
- U01 DA024417/DA/NIDA NIH HHS/United States
- 1K25HL080079/HL/NHLBI NIH HHS/United States
- PL1 NS062410/NS/NINDS NIH HHS/United States
- 5RL1MH083268-03/MH/NIMH NIH HHS/United States
- 5UL1DE019580-03/DE/NIDCR NIH HHS/United States
- RL1 MH083268/MH/NIMH NIH HHS/United States
- R01 GM053275/GM/NIGMS NIH HHS/United States