Link to original content: http://www.ncbi.nlm.nih.gov/pubmed/29700474
Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits
Nat Genet. 2018 May;50(5):737-745.
doi: 10.1038/s41588-018-0108-x. Epub 2018 Apr 26.

Luke M Evans et al. Nat Genet. 2018 May.

Abstract

Multiple methods have been developed to estimate narrow-sense heritability, $h^2$, using single nucleotide polymorphisms (SNPs) in unrelated individuals. However, a comprehensive evaluation of these methods has not yet been performed, leading to confusion and discrepancy in the literature. We present the most thorough and realistic comparison of these methods to date. We used thousands of real whole-genome sequences to simulate phenotypes under varying genetic architectures and confounding variables, and we used array, imputed, or whole genome sequence SNPs to obtain 'SNP-heritability' estimates. We show that SNP-heritability can be highly sensitive to assumptions about the frequencies, effect sizes, and levels of linkage disequilibrium of underlying causal variants, but that methods that bin SNPs according to minor allele frequency and linkage disequilibrium are less sensitive to these assumptions across a wide range of genetic architectures and possible confounding factors. These findings provide guidance for best practices and proper interpretation of published estimates.
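
As a rough illustration of what binning SNPs by minor allele frequency (MAF) and linkage disequilibrium (LD) can look like, the sketch below assigns each SNP to one of 4 x 4 = 16 bins, one per MAF class and LD-score quartile. The MAF bin edges, the genome-wide quartile rule, and the function names are assumptions made for this example; they are not taken from the paper. In a stratified analysis, each bin would then get its own genetic relationship matrix.

```python
from bisect import bisect_right

# Hypothetical MAF bin edges (assumption, not the paper's values):
# bins are <0.01, [0.01, 0.05), [0.05, 0.25), >=0.25
MAF_EDGES = [0.01, 0.05, 0.25]


def quartile_edges(values):
    """Return three cut points that split the sorted values into quartiles."""
    s = sorted(values)
    n = len(s)
    return [s[(n * q) // 4] for q in (1, 2, 3)]


def assign_bins(mafs, ld_scores):
    """Map (maf_bin, ld_bin) -> list of SNP indices falling in that bin."""
    ld_edges = quartile_edges(ld_scores)  # quartiles over all SNPs (assumption)
    bins = {}
    for idx, (maf, ld) in enumerate(zip(mafs, ld_scores)):
        key = (bisect_right(MAF_EDGES, maf), bisect_right(ld_edges, ld))
        bins.setdefault(key, []).append(idx)
    return bins
```

Each SNP index lands in exactly one bin, so the bins partition the SNP set.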


Conflict of interest statement

COMPETING FINANCIAL INTERESTS

The authors declare no competing financial interests.

Figures

Figure 1
Mean $\hat{h}^2_{\mathrm{SNP}}$ across 100 replicates from GRMs built from WGS SNPs in the least structured subsamples. Methods on the x-axis as follows: Single-component GREML (GREML-SC) with all SNPs or only MAF > 0.01; MAF-stratified GREML (GREML-MS); LD and MAF-stratified GREML (GREML-LDMS-R [regional LD] & -I [individual SNP LD]); Single-component Linkage Disequilibrium-Adjusted Kinships (LDAK-SC) with all SNPs or only MAF > 0.01; MAF-stratified LDAK (LDAK-MS); Extended Genealogy with Thresholded GRMs with all SNPs or only common (MAF > 0.01), presenting both $h^2_{\mathrm{SNP}}$ and $h^2_{\mathrm{Tot}}$ ($= h^2_{\mathrm{SNP}} + h^2_{\mathrm{IBS}>t}$); LD score regression (LDSC) using no PCs as covariates in GWAS, using PCs as covariates, or partitioned using PCs with MAF-stratification. Estimates are from samples of unrelated individuals (relatedness < 0.05) except for those from the Threshold GRM method, which included all individuals. Simulated (true) $h^2 = 0.5$. Colors represent the MAF range of the 1,000 randomly drawn CVs. See Online Methods for descriptions of each method and Supplementary Figures for additional estimates and Supplementary Table 2 for numerical results. Error bars represent 95% confidence intervals.
Figure 2
Mean $\hat{h}^2_{\mathrm{SNP}}$ for four MAF bins across 100 replicates from multi-component approaches in unrelated individuals using WGS SNPs in the least structured subsample. See Fig. 1 for specific methods. Black lines are the true (simulated) $h^2$ values; note that in the top panel, the true $h^2$ values differ across MAF. See Online Methods for descriptions of each method and Supplementary Figures for additional estimates and Supplementary Table 4 for numerical results. Error bars represent 95% confidence intervals.
Figure 3
Mean $\hat{h}^2_{\mathrm{SNP}}$ across 100 replicates from GRMs built from imputed SNPs in the least structured subsamples across different model assumptions (bars) and different ways of simulating CVs (x-axes). The x-axes of each panel show the simulated CV MAF-scaling parameter, $\alpha$, and the CV effect size distribution, $\beta_k$. The four panels show different MAF ranges of the 1,000 randomly-drawn CVs. DHS sites were randomly sampled without respect to MAF. Bar colors indicate the fitted model, with a single GRM used except for the “LDMS” models, which used 16 GRMs ($\alpha = -1$) stratified by MAF and either regional (-R) or individual SNP (-I) LD score. See Online Methods for descriptions of each method and Supplementary Figures for additional estimates and Supplementary Table 6 for numerical results. Error bars represent 95% confidence intervals.
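
To make the role of the MAF-scaling parameter α more concrete, here is a minimal sketch of a genetic relationship matrix (GRM) in which each SNP is weighted by [2p(1-p)]^α, where p is the allele frequency. This is a commonly used parameterization and is assumed here rather than quoted from the paper: α = -1 corresponds to the usual standardized-genotype GRM, while α = 0 gives every SNP equal weight on the allele-count scale. The function name and input layout are hypothetical, and LD-based SNP weights (as used by LDAK) are deliberately left out.

```python
def grm(genotypes, alpha=-1.0):
    """Build an n x n GRM from allele counts (0/1/2).

    genotypes: list of n individuals, each a list of m allele counts.
    alpha: MAF-scaling exponent applied to each SNP's variance 2p(1-p).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency of each SNP
    freqs = [sum(ind[k] for ind in genotypes) / (2 * n) for k in range(m)]
    # Skip monomorphic SNPs: zero variance makes the weight undefined for alpha < 0
    used = [k for k in range(m) if 0.0 < freqs[k] < 1.0]
    weights = {k: (2 * freqs[k] * (1 - freqs[k])) ** alpha for k in used}
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            total = sum(
                (genotypes[i][k] - 2 * freqs[k])
                * (genotypes[j][k] - 2 * freqs[k])
                * weights[k]
                for k in used
            )
            A[i][j] = A[j][i] = total / len(used)
    return A
```

More negative values of α give rare variants larger weight per allele, which is one way a fitted model can encode an assumption about how effect sizes relate to allele frequency.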
Figure 4
Mean $\hat{h}^2_{\mathrm{SNP}}$ across 100 replicates from GRMs built from imputed SNPs in the least structured subsamples across different model assumptions (bars) and different ways of simulating CVs (x-axes). CV effect sizes were simulated from $N(0, \tau_k)$. The x-axes of each panel show the simulated CV MAF-scaling parameter, $\alpha$. The three panels show different MAF ranges of the 1,000 randomly-drawn CVs. Bar colors indicate the fitted model. See Online Methods for descriptions of each method and Supplementary Figures for additional estimates and Supplementary Table 6 for numerical results. Error bars represent 95% confidence intervals.
Figure 5
Boxplots of the absolute bias of heritability estimates ($|E(\hat{h}^2_{\mathrm{SNP}}) - h^2|$) across all simulated phenotypes from Supplementary Figures 24 & 26 using WGS data to estimate GRMs (top), and from Figures 3–4 using imputed variants to estimate the GRMs (bottom). The x-axis indicates the parameters for the estimation model, including the MAF scaling factor, $\alpha$, and the assumed effect size distribution, $\beta_k$, specified in the GRM, and whether imputation scores ($r^2$) were used in the GRM estimation. All used a single GRM except for LD- & MAF-stratified GREML (LDMS), which used 16 GRMs ($\alpha = -1$) stratified by MAF and either regional (-R) or individual SNP (-I) LD score. * Typical GREML-SC parameters. † Typical LDAK-SC parameters. Boxplots show the median and interquartile range, with whiskers extending to 1.5 times the interquartile range and more extreme points shown, for N=22 (WGS) and N=26 (imputed) mean estimates of heritability.
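
The absolute bias summarized in this figure is the absolute difference between the mean estimate across replicates and the simulated (true) heritability. A minimal sketch of that arithmetic follows; the function name and the example numbers are illustrative only and are not results from the paper.

```python
from statistics import mean


def absolute_bias(estimates, true_h2):
    """Absolute bias: |mean of replicate estimates - true heritability|."""
    return abs(mean(estimates) - true_h2)


# Illustrative values only: three replicate estimates for a simulated h2 of 0.5
example = absolute_bias([0.58, 0.61, 0.64], 0.5)
```

Because the mean is taken before the absolute value, over- and under-estimates across replicates can cancel; the quantity measures systematic error, not replicate-to-replicate variability.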
Figure 6
Estimated $\hat{h}^2_{\mathrm{SNP}}$ using multiple methods with imputed variants for six complex traits in the UK Biobank. MAF > 0.01 indicates common SNPs were used to create the GRMs. ∅ = information matrix was not invertible. HM3 indicates that only imputed HapMap3 sites were used in the LDSC analysis. Sample sizes as follows: height N=94,769; BMI N=94,595; impedance N=93,451; trunk fat N=93,414; fluid intelligence N=31,724; neuroticism N=78,565. See Supplementary Table 8 for numerical results. Error bars are 1 S.E.M.

Cited by

  • Genome-wide meta-analysis of myasthenia gravis uncovers new loci and provides insights into polygenic prediction.
    Braun A, Shekhar S, Levey DF, Straub P, Kraft J, Panagiotaropoulou GM, Heilbron K, Awasthi S, Meleka Hanna R, Hoffmann S, Stein M, Lehnerer S, Mergenthaler P, Elnahas AG, Topaloudi A, Koromina M, Palviainen T, Asbjornsdottir B, Stefansson H, Skuladóttir AT, Jónsdóttir I, Stefansson K, Reis K, Esko T, Palotie A, Leypoldt F, Stein MB, Fontanillas P; Estonian Biobank Research Team; 23andMe Research Team; Kaprio J, Gelernter J, Davis LK, Paschou P, Tannemaat MR, Verschuuren JJGM, Kuhlenbäumer G, Gregersen PK, Huijbers MG, Stascheit F, Meisel A, Ripke S. Nat Commun. 2024 Nov 13;15(1):9839. doi: 10.1038/s41467-024-53595-6. PMID: 39537604 Free PMC article.
  • [Single nucleotide polymorphism heritability of non-syndromic cleft lip with or without cleft palate in Chinese population].
    Xue E, Chen X, Wang X, Wang S, Wang M, Li J, Qin X, Wu Y, Li N, Li J, Zhou Z, Zhu H, Wu T, Chen D, Hu Y. Beijing Da Xue Xue Bao Yi Xue Ban. 2024 Oct 18;56(5):775-780. doi: 10.19723/j.issn.1671-167X.2024.05.004. PMID: 39397453 Free PMC article. Chinese.
  • Rare variant contribution to the heritability of coronary artery disease.
    Rocheleau G, Clarke SL, Auguste G, Hasbani NR, Morrison AC, Heath AS, Bielak LF, Iyer KR, Young EP, Stitziel NO, Jun G, Laurie C, Broome JG, Khan AT, Arnett DK, Becker LC, Bis JC, Boerwinkle E, Bowden DW, Carson AP, Ellinor PT, Fornage M, Franceschini N, Freedman BI, Heard-Costa NL, Hou L, Chen YI, Kenny EE, Kooperberg C, Kral BG, Loos RJF, Lutz SM, Manson JE, Martin LW, Mitchell BD, Nassir R, Palmer ND, Post WS, Preuss MH, Psaty BM, Raffield LM, Regan EA, Rich SS, Smith JA, Taylor KD, Yanek LR, Young KA; NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium; Hilliard AT, Tcheandjieu C, Peyser PA, Vasan RS, Rotter JI, Miller CL, Assimes TL, de Vries PS, Do R. Nat Commun. 2024 Oct 9;15(1):8741. doi: 10.1038/s41467-024-52939-6. PMID: 39384761 Free PMC article.
  • Comparison of machine learning methods for genomic prediction of selected Arabidopsis thaliana traits.
    Kelly CM, McLaughlin RL. PLoS One. 2024 Aug 28;19(8):e0308962. doi: 10.1371/journal.pone.0308962. eCollection 2024. PMID: 39196916 Free PMC article.
  • Polygenic scores and Mendelian randomization identify plasma proteins causally implicated in Alzheimer's disease.
    Cammann DB, Lu Y, Rotter JI, Wood AC, Chen J. Front Neurosci. 2024 Jul 23;18:1404377. doi: 10.3389/fnins.2024.1404377. eCollection 2024. PMID: 39108314 Free PMC article.

