Research Data and Reproducibility

Your guide to research data services at OHSU.

What is data reuse?

Data reuse occurs when an investigator conducts their own analysis of research data collected by others. There are many reasons to reuse data: to augment data collected in your own research, to ask new questions by analyzing data obtained from multiple sources, or to attempt to reproduce or replicate the results of a published scientific study.

To reuse data, one must consider where to find such data, whether permission is needed to use it, and how to give credit to its creators. Data reuse is facilitated by data sharing.

Where can data be discovered?

Data journals are specialized publications that focus on publishing data papers to enhance the findability of high-quality datasets. Typically, they do not host data, but they may recommend places to deposit data. Below are some resources for finding data journals.

Does permission need to be obtained to reuse data?

In the United States, facts are not copyrightable, so most data is not protected by copyright. A particular expression of data, such as its curated form, a chart, or a table, might be. Data is licensed to communicate to users how it may be used. A license may place limitations on reuse, such as restricting commercial use, requiring attribution, or protecting the privacy of study participants. Proprietary datasets also exist and may have costs associated with their reuse.

There is a growing movement to publish data under licenses that place datasets into the public domain. Doing so facilitates the widest distribution and avoids licensing problems, such as license incompatibility and attribution stacking, where a chain of reuse triggers a cascade of attribution requirements that becomes difficult to manage.

Contact Technology Transfer if you have intellectual property questions about your data.

Contact the Library if you have any questions about licensing and attribution concerns related to data reuse.

Find more information about data licensing in the Sharing Data section of this guide.

How should reused datasets be cited?

When citing data, the following elements should be included:

  • who created the dataset
  • what the dataset is named
  • what year the dataset was published or released
  • what version of the dataset was used
  • where the dataset is hosted
  • what unique identifiers have been assigned to the dataset, such as a Digital Object Identifier (DOI) or Archival Resource Key (ARK)
  • what date the dataset was accessed

Note that many data repositories offer features that automatically generate formatted citations for the data they host, which can save you the work of creating the citation from scratch.
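
Where a repository does not generate a citation for you, the elements listed above can be assembled by hand or with a short script. The Python sketch below is only an illustration: the `DatasetCitation` class, its field names, the ordering and punctuation of the output, and the example dataset (including its DOI) are all hypothetical and do not correspond to any specific citation style or real dataset. Check the output against the style guide you are required to follow.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class DatasetCitation:
    """Holds the citation elements listed in this guide (hypothetical helper)."""
    creators: List[str]   # who created the dataset
    title: str            # what the dataset is named
    year: int             # year the dataset was published or released
    version: str          # version of the dataset that was used
    repository: str       # where the dataset is hosted
    identifier: str       # unique identifier, such as a DOI or ARK
    accessed: date        # date the dataset was accessed

    def format(self) -> str:
        # The ordering and punctuation here are assumptions for illustration,
        # not the rules of any particular citation style.
        authors = "; ".join(self.creators)
        return (
            f"{authors} ({self.year}). {self.title} (Version {self.version}). "
            f"{self.repository}. {self.identifier}. "
            f"Accessed {self.accessed.isoformat()}."
        )


# Made-up example values; the DOI below is a placeholder, not a real dataset.
example = DatasetCitation(
    creators=["Doe, J.", "Roe, R."],
    title="Example Survey Responses",
    year=2021,
    version="1.2",
    repository="Example Data Repository",
    identifier="https://doi.org/10.1234/example",
    accessed=date(2024, 1, 15),
)
print(example.format())
```

Keeping each element in its own field makes it easier to notice when one is missing, such as the version or access date, before the citation is written up.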

Contact the Library if you have questions about how to format dataset citations for a specific citation style.