Research Data and Reproducibility
Welcome
Welcome! This guide centralizes information about working with research data. It offers guidance, tools, and resources to help OHSU faculty, students, and staff manage data effectively. Below you'll find an overview of the guide's contents and common definitions of research data.
Navigating OHSU Resources and Data Management Plans
- Learn about data services at OHSU: This section includes information about data services available from the Library, OCTRI, ACC, and ITG.
- Learn about Data Management and Sharing (DMS): This section includes information about creating DMS plans, current trends, policies and governance from funders, journals, publishers, and OHSU, and frameworks for data quality and ethics.
Data Practices and Resources
- Processing Data: Data processing activities include cleaning, transformation, enrichment, validation, exploration, analysis, and visualization. Here you will find a variety of tools and resources to facilitate these activities.
- Organizing Data: Data organization refers to structuring project directories so that files are easy to store and find, naming files to enable logical grouping and/or chronological sorting within directories, and structuring the contents of files to facilitate analysis.
- Documenting Data: Data documentation describes what data is, why variable names were chosen, when data was collected, and other events over the course of a research project.
- Storing Data: Data storage refers to where data is located while it is actively being collected or processed.
- Preserving Data: Data preservation refers to practices and procedures aimed at providing long-term storage of and access to data by conserving and maintaining its safety and integrity.
- Sharing Data: Data sharing is the practice of making research data available to others. It is an integral part of open science, and many funding agencies, publishers, and institutions have policies requiring the sharing of research data.
- Reusing Data: Data reuse occurs when an investigator conducts their own analysis on research data collected by another investigator. To reuse data, one must consider where to find such data, whether it needs to be licensed, and how to cite it in order to give credit to its creators.
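As a small illustration of the file-naming point under Organizing Data, here is a minimal Python sketch of one possible naming convention. It is not an OHSU standard: the function name `build_filename`, the field order, and the example project name are all hypothetical. It assumes that putting an ISO 8601 date (YYYY-MM-DD) and a zero-padded version number in the name is enough for files to group by project and sort chronologically in a directory listing.

```python
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, ext: str) -> str:
    """Build a file name such as 'sleepstudy_2023-05-17_actigraphy-raw_v01.csv'.

    Hypothetical convention: project_date_description_version.ext
    """
    # ISO 8601 dates sort chronologically when sorted as plain text.
    # Zero-padding the version keeps v02 ahead of v10 in a listing.
    return f"{project}_{collected.isoformat()}_{description}_v{version:02d}.{ext}"


# Example file names for an invented project called "sleepstudy".
names = [
    build_filename("sleepstudy", "actigraphy-raw", date(2023, 5, 17), 1, "csv"),
    build_filename("sleepstudy", "actigraphy-raw", date(2022, 11, 3), 10, "csv"),
    build_filename("sleepstudy", "actigraphy-raw", date(2022, 11, 3), 2, "csv"),
]

# A plain alphabetical sort now yields chronological order, then version order.
sorted_names = sorted(names)
for name in sorted_names:
    print(name)
```

Leading with the project name groups related files together, while the date and version fields order them within that group. Other conventions can work equally well as long as they are applied consistently across a project.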
What do we mean by research data?
In general, research data, which may also be referred to as scientific data, underlying data, or the data that is integral to a publication, is the factual information and materials needed to substantiate research findings. This can include raw, processed, and analyzed data as well as code, software, and tangible research materials.
Here are some examples of funders' definitions of research data:
Scientific Data: The recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications. Scientific data do not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens.
Final NIH Policy for Data Management and Sharing
In general, data, software, and tangible research materials are integral to a publication if they are necessary to support the major claims of the publication or to reproduce and verify the published results.
Sharing Published Materials/Responsibilities of HHMI Authors
Underlying data encompasses all primary data, associated metadata, and any additional relevant data necessary to understand, assess, and replicate the reported study findings in totality. Underlying data can be compiled into any file type, including any necessary access instructions, code, or supporting information files, to ensure the file(s) can be accessed and used by others.
Note: We do not require sharing of data that is ethically unsound or legally encumbered.
Bill & Melinda Gates Foundation Open Access Policy Data Sharing Requirements
What kind of research data needs to be shared?
The types of data that may be subject to a data management and sharing policy may include:
- Primary data
- Datasets that have been cleaned and processed for final analysis
- Data that supports findings and figures in publications
- Data required to reproduce or replicate published findings or figures
- Computational artifacts such as models, algorithms, scripts, code, or software
- Metadata that describes all the above
Implementing good data management practices can make data sharing easier.
What kind of research data doesn't need to be shared?
Data management and sharing policies generally do not require you to share:
- Raw data
- Secondary data
- Protocols
- Laboratory notebooks
- Measurements from laboratory or field equipment
- Survey responses, transcripts, and codebooks
- Completed case report forms
- Physical objects, such as slides, artifacts, specimens, or samples
- Preliminary analyses
- Drafts of scientific papers
- Plans for future research
- Peer reviews
- Communications with colleagues
- Trade secrets
- Commercial information
- Materials that must be kept confidential until publication
- Data or information that would unnecessarily invade personal privacy
- Data that could be used to identify individual participants in research studies
- Data that is legally encumbered or protected by law
While data management and sharing policies do not require sharing these types of data, open and reproducible science practices increasingly encourage sharing protocols and samples, and all types of data may benefit from good data management practices.
- Last Updated: May 17, 2023 1:12 PM
- URL: https://libguides.ohsu.edu/research-data-services