Link to original content: https://pubmed.ncbi.nlm.nih.gov/30100822

Nat Methods. 2018 Apr;15(4):233-234.
doi: 10.1038/nmeth.4642. Epub 2018 Apr 3.

Statistics versus machine learning

Danilo Bzdok et al. Nat Methods. 2018 Apr.
No abstract available

Figures

Figure 1
Simulated expression and RNA-seq read counts for 40 genes in which the last 10 genes (A–J) are differentially expressed across two phenotypes (−/+). Simulated quantities and heat maps are log-scaled. (a) Simulated log mean expression levels for the genes, generated by sampling from a normal distribution with mean 4 and s.d. 2. In the + phenotype, differential expression of genes A–J was created by adding a standard normal deviate to each gene's mean expression in the − phenotype. (b) Simulated RNA-seq read counts for ten subjects in each phenotype, generated from an overdispersed Poisson distribution based on the mean expression in (a), with biological variation. The heat map shows z-scores of the read counts, normalized across all 20 subjects for a given gene.
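
The simulation described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The caption does not state the logarithm base, the dispersion, or how overdispersion was introduced, so the natural log, the `DISPERSION` value, and the gamma-Poisson mixture used here are all assumptions, as are the function names.

```python
import math
import random
import statistics

N_GENES = 40       # from the caption
N_DE = 10          # last 10 genes (A-J) are differentially expressed
N_SUBJECTS = 10    # subjects per phenotype, from the caption
DISPERSION = 0.2   # ASSUMPTION: the caption gives no dispersion value


def sample_poisson(rng, lam):
    """Draw a Poisson(lam) variate by counting unit-rate exponential
    arrivals that fall in [0, lam]. Exact, but O(lam) per draw."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_overdispersed_count(rng, mean):
    """ASSUMPTION: overdispersion is modeled as a gamma-Poisson mixture.
    The gamma factor has mean 1 and variance DISPERSION."""
    factor = rng.gammavariate(1.0 / DISPERSION, DISPERSION)
    return sample_poisson(rng, mean * factor)


def simulate(seed=0):
    rng = random.Random(seed)
    # (a) log mean expression in the - phenotype: normal, mean 4, s.d. 2
    log_mean_minus = [rng.gauss(4.0, 2.0) for _ in range(N_GENES)]
    # + phenotype: add a standard normal deviate to the last N_DE genes only
    log_mean_plus = list(log_mean_minus)
    for g in range(N_GENES - N_DE, N_GENES):
        log_mean_plus[g] += rng.gauss(0.0, 1.0)
    # (b) read counts: 10 subjects per phenotype, - columns first, then +
    counts = []
    for g in range(N_GENES):
        mu_minus = math.exp(log_mean_minus[g])  # ASSUMPTION: natural log
        mu_plus = math.exp(log_mean_plus[g])
        row = [sample_overdispersed_count(rng, mu_minus) for _ in range(N_SUBJECTS)]
        row += [sample_overdispersed_count(rng, mu_plus) for _ in range(N_SUBJECTS)]
        counts.append(row)
    return log_mean_minus, log_mean_plus, counts


def z_scores(row):
    """Normalize one gene's counts across all 20 subjects."""
    mu = statistics.fmean(row)
    sd = statistics.pstdev(row)
    return [0.0 if sd == 0 else (x - mu) / sd for x in row]


if __name__ == "__main__":
    _, _, counts = simulate(seed=1)
    heat_map = [z_scores(row) for row in counts]
    print(len(heat_map), "genes x", len(heat_map[0]), "subjects")
```

Each row of `heat_map` corresponds to one gene, with the ten − subjects followed by the ten + subjects, which mirrors the layout the caption describes for panel (b).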
Figure 2
Analysis of gene ranking by classical inference and ML. (a) Unadjusted log-scaled P values from statistical differential expression analysis as a function of effect size, measured by fold change in expression. (b) Log-scaled P values from (a) as a function of gene importance from random forest classification. In (a) and (b), red circles identify the ten differentially expressed genes from Figure 1; the remaining genes are indicated by open circles. (c) Distribution of the number of dysregulated genes correctly identified in 1,000 simulations by inference (gray fill) and random forest (black line).
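
To make the inference side of Figure 2a concrete, the sketch below ranks genes by a per-gene test statistic alongside fold change. It is only an illustration under stated assumptions. The caption does not name the test used, so Welch's t statistic on log2-transformed counts is a stand-in for the P values in the figure, and the random forest importance from panel (b) is not implemented here. The toy counts and the names `TOY_MINUS`, `TOY_PLUS`, `welch_t` and `rank_genes` are hypothetical.

```python
import math
import statistics

# Hypothetical toy read counts (5 subjects per phenotype), for illustration only.
# Genes "A" and "B" are constructed to differ between phenotypes; "g1" and "g2" are not.
TOY_MINUS = {
    "g1": [50, 55, 48, 60, 52],
    "A": [20, 25, 22, 18, 24],
    "g2": [200, 180, 220, 210, 190],
    "B": [300, 280, 310, 290, 320],
}
TOY_PLUS = {
    "g1": [51, 49, 58, 53, 54],
    "A": [80, 95, 70, 88, 102],
    "g2": [205, 195, 185, 215, 200],
    "B": [150, 140, 160, 155, 145],
}


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances.
    Assumes at least one sample has nonzero variance."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(b) - statistics.fmean(a)) / se


def rank_genes(counts_minus, counts_plus):
    """Return (gene, log2 fold change, t) tuples, strongest evidence first."""
    results = []
    for gene, minus in counts_minus.items():
        plus = counts_plus[gene]
        # Test on log2(count + 1) so that multiplicative changes become shifts
        t = welch_t([math.log2(c + 1) for c in minus],
                    [math.log2(c + 1) for c in plus])
        # Effect size: fold change of mean counts (assumes nonzero means)
        log2_fc = math.log2(statistics.fmean(plus) / statistics.fmean(minus))
        results.append((gene, log2_fc, t))
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


if __name__ == "__main__":
    for gene, log2_fc, t in rank_genes(TOY_MINUS, TOY_PLUS):
        print(f"{gene}: log2 fold change = {log2_fc:+.2f}, t = {t:+.2f}")
```

Counting how many of the top-ranked genes are truly differentially expressed, and repeating that over many simulated data sets, is the kind of comparison that panel (c) summarizes for both approaches.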
