Link to original content: https://pubmed.ncbi.nlm.nih.gov/28872630
Sci Data. 2017 Sep 5;4:170114.
doi: 10.1038/sdata.2017.114.

If these data could talk

Thomas Pasquier et al. Sci Data. 2017.

Abstract

In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressingly low rates of reproducibility. Although there are many dimensions to this issue, we believe that a central one is the lack of formalism in describing published results end to end, from the data source to the analysis to the final published results. Even when authors do their best to make their research and data accessible, this lack of formalism reduces the clarity and efficiency of reporting, which contributes to issues of reproducibility. Data provenance aids reproducibility through systematic and formal records of the relationships among data sources, processes, datasets, publications and researchers.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. A simple W3C PROV-DM compliant provenance graph.
In this example, two processes, Process 1 and Process 2, use the data from the inputs File 1 and File 2, respectively. The processes are associated with the users Alice and Bob, respectively. Process 1 informed (transferred information to) Process 2, which generated the output File 3.
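The graph described in this caption can be written down as a small set of labeled edges. The sketch below is only an illustration, not the authors' tooling: the relation names (`used`, `wasAssociatedWith`, `wasInformedBy`, `wasGeneratedBy`) come from W3C PROV-DM, while the edge-list layout and the helper names `build_graph` and `lineage` are assumptions made for this example.

```python
from collections import defaultdict

# Each edge is (subject, relation, object), using W3C PROV-DM relation names.
# The node labels follow the Figure 1 caption; the tuple layout is an
# assumption made for this illustration.
EDGES = [
    ("Process 1", "used", "File 1"),
    ("Process 2", "used", "File 2"),
    ("Process 1", "wasAssociatedWith", "Alice"),
    ("Process 2", "wasAssociatedWith", "Bob"),
    ("Process 2", "wasInformedBy", "Process 1"),
    ("File 3", "wasGeneratedBy", "Process 2"),
]


def build_graph(edges):
    """Group outgoing (relation, object) pairs by subject (hypothetical helper)."""
    graph = defaultdict(list)
    for subject, relation, obj in edges:
        graph[subject].append((relation, obj))
    return graph


def lineage(graph, node):
    """Return every node reachable from `node` by following provenance edges."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for _relation, target in graph.get(current, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


GRAPH = build_graph(EDGES)

# Everything File 3 depends on, directly or indirectly.
print(sorted(lineage(GRAPH, "File 3")))
```

Because provenance edges point from a result back to what produced it, walking them from File 3 reaches both processes, both input files, and both users, which is the kind of end-to-end record the abstract argues for.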
Figure 2. Research teams across the sciences are integrating data provenance methods into their research practices in response to increases in computational demands.
On the left: (Photo Credit: A. Trisovic) The Compact Muon Solenoid (CMS) experiment at CERN during the technical stop in February 2017. On the right: (Photo Credit: M.K. Lau) One of several research towers used for ecological data collection at Harvard Forest. In addition to providing infrastructure for researchers to view the forest at multiple levels in the forest canopy, many instruments for automated observations, such as wind speed, CO2 flux, and leaf phenology, are placed on these towers. Data are relayed to a controlling computer via a wireless network.
