Data storage refers to where data is kept while it is actively being collected or processed. Stored data should be backed up periodically to a secure location. Storage and backup differ from preservation and archiving, which govern how data is handled once a research project ends.
Good data storage practices consider:
Here are some tips for storing research data:
Data preservation means saving your data in a way that keeps it safe, usable, and accessible for the long term. This includes using reliable file formats, storing files in secure and stable locations, and keeping copies in more than one place, such as on a local drive and in a trusted data repository.
Preservation usually happens at the end of a project and creates a final, unchanging version of the data. This is different from storage and backup, which happen during the research process while data is still being collected, analyzed, and updated.
Choose file formats that will be easy to open and use in the future. To keep your data readable over time, avoid uncommon, proprietary, or compressed formats that may rely on specific software or hardware.
Planning ahead by choosing the right formats now helps protect your data from becoming unreadable later.
| Type | Preferred | Acceptable | Not Recommended |
|---|---|---|---|
| Structured data and spreadsheets | OpenDocument Spreadsheet (.ods); Microsoft Excel OOXML (.xlsx); SQLite (.sqlite3, .sqlite, .db) | Microsoft Excel (.xls); SPSS (.por, .sav) | |
| Text and word processing documents | Plain Text (.txt); Markdown (.md); XML (.xml); SGML (.sgm, .sgml) | PDF (.pdf); Microsoft Word OOXML (.docx); OpenDocument Text (.odt); LaTeX (.latex); EPUB (.epub); HTML (.htm, .html); Rich Text Format (.rtf); PostScript (.eps, .epsf, .ps) | Microsoft Word (.doc); WordPerfect (.wpd); Google Docs; all other text document formats not listed here |
| Photos, images, and vector graphics | Tagged Image File Format (.tiff); JPEG (.jpeg); Portable Network Graphics (.png); Scalable Vector Graphics (.svg) | Graphics Interchange Format (.gif); Digital Negative (.dng); Bitmap (.bmp); PDF (.pdf) | Adobe Illustrator (.ai); Adobe Photoshop (.psd); all other image formats not listed here |
| Audio | MPEG-1 or MPEG-2 Audio (.mp3); Waveform Audio File Format (.wav); Audio Interchange File Format (.aif, .aiff); Broadcast Wave (.bwf, .bwav) | Standard MIDI (.mid); Free Lossless Audio Codec (.flac); MPEG-4 (.mp4, .m4a); Ogg Vorbis (.ogg); Sun Audio (.au) | AIFF Compressed (.aifc); Windows Media Audio (.asf, .wma); all other audio formats not listed here |
| Video | Audio Video Interleave (.avi); QuickTime Movie (.mov); MPEG-4 (.mp4) | | Windows Media Video (.asf, .wmv); all other video formats not listed here |
| Posters, presentations, and slide decks | PDF (.pdf); OpenDocument Presentation (.odp); Microsoft PowerPoint OOXML (.pptx) | | Microsoft PowerPoint (.ppt); Google Slides; all other presentation formats not listed here |
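As a rough illustration of how the format guidance could be applied to a project folder, the sketch below sorts file names into the table's tiers by extension. The `FORMAT_TIERS` mapping covers only a handful of extensions taken from the text, image, and audio rows, and the names `format_tier`, `audit`, and the `"unlisted"` label are hypothetical helpers introduced here, not part of any tool the guide describes.

```python
from pathlib import Path

# A small, illustrative subset of the table above (text, image, audio rows).
# Extend this mapping with the remaining rows as needed.
FORMAT_TIERS = {
    ".txt": "preferred", ".md": "preferred", ".xml": "preferred",
    ".tiff": "preferred", ".png": "preferred", ".svg": "preferred",
    ".wav": "preferred",
    ".docx": "acceptable", ".odt": "acceptable", ".rtf": "acceptable",
    ".gif": "acceptable", ".bmp": "acceptable", ".flac": "acceptable",
    ".doc": "not recommended", ".wpd": "not recommended",
    ".ai": "not recommended", ".psd": "not recommended",
    ".wma": "not recommended",
}


def format_tier(filename: str) -> str:
    """Return the tier for a file name, or "unlisted" if the extension is unknown."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    return FORMAT_TIERS.get(suffix, "unlisted")


def audit(filenames):
    """Group file names by tier so files needing conversion stand out."""
    report = {}
    for name in filenames:
        report.setdefault(format_tier(name), []).append(name)
    return report
```

Calling `audit(["notes.md", "draft.doc"])` groups `notes.md` under `"preferred"` and `draft.doc` under `"not recommended"`, which flags the latter as a candidate for conversion before the project is archived.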
Digital data are fragile wherever you store them, whether on hard drives, servers, or other media. Over time, files can become damaged or unreadable as the storage medium gradually degrades, a problem known as bit rot. To keep your data safe, use two key strategies: refreshment and replication.
Refreshment means copying your data to a new device every 2–5 years to avoid loss from aging hardware.
Replication means keeping multiple copies of your data in more than one place, for example on a local drive and in a trusted data repository.
Personal computers and external hard drives are not reliable for long-term archiving. For preserving finalized data, networked file servers or trusted data repositories are better options.
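Replication is easier to trust when each copy can be checked against the original. The sketch below copies a finalized file into several folders and later reports any copy that is missing or has changed. The guide itself does not prescribe checksums; using SHA-256 here is an assumption, and `sha256_of`, `replicate`, and `verify` are hypothetical helper names built on Python's standard `hashlib` and `shutil` modules.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a SHA-256 checksum, reading in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, destinations: list[Path]) -> str:
    """Copy source into each destination folder and return the source checksum."""
    expected = sha256_of(source)
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        copy = folder / source.name
        shutil.copy2(source, copy)  # copy2 also preserves timestamps
        if sha256_of(copy) != expected:
            raise IOError(f"copy in {folder} does not match the original")
    return expected


def verify(copies: list[Path], expected: str) -> list[Path]:
    """Return the copies that are missing or whose checksum no longer matches."""
    return [p for p in copies if not p.exists() or sha256_of(p) != expected]
```

Storing the checksum returned by `replicate` alongside the data lets you rerun `verify` on a schedule, or after a refreshment to new hardware, and replace any damaged copy from a good one.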