Data organization refers to structuring project directories so that files are easy to store and find, naming files to enable logical grouping and/or chronological sorting within directories, and structuring the contents of files to facilitate analysis. This page outlines best practices for organizing your research data.
Good data organization considers:
You'll want to plan and use a file and folder organizational structure that is informative and keeps all the research information and materials associated with a project together.
Here are some tips for organizing directories:
Here are some suggestions for naming directories:
Put the source files for any scripts or code in a folder named src or code.
[Figure: two example directory structures, one organized by file type and one organized by analysis]
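A layout organized by file type can also be set up programmatically, which helps keep the structure consistent across projects. The following is only a minimal sketch: the function name `create_project_skeleton` and the folder names `data`, `docs`, and `results` are hypothetical choices for illustration; only `src` comes from the guidance above.

```python
from pathlib import Path
import tempfile

# Hypothetical folder names for illustration; only "src" is taken
# from the guidance on this page. Adapt the list to your project.
SUBFOLDERS = ["data", "docs", "results", "src"]


def create_project_skeleton(root, subfolders=SUBFOLDERS):
    """Create the project root and its subfolders; return the folder paths."""
    root = Path(root)
    created = []
    for name in subfolders:
        folder = root / name
        # exist_ok=True makes the function safe to re-run on an existing project
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_skeleton(Path(tmp) / "my-project"):
            print(folder.name)
```

Keeping the folder list in one place means every new project starts from the same structure, and the list can be changed to reflect an analysis-based layout instead.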
Establishing a file naming convention that produces groupings of related files can help everyone in your lab easily identify the data they're looking for by name. There is no one-size-fits-all naming convention; base the conventions you use on your and your team's needs.
Here are some things to consider when choosing a naming scheme:
The following example filenames follow the guidance listed above:
Note that the components of each file name are meaningfully named, arranged from general to specific, and separated with dashes. The dates use YYYYMMDD format, and the version numbers are zero-padded.
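Generating file names from code is one way to apply such a convention consistently. The sketch below is an illustration under stated assumptions, not a prescribed scheme: the function name `build_filename`, the `v` prefix on the version, the placement of the date before the version, and the example components are all hypothetical.

```python
from datetime import date


def build_filename(components, when, version, extension):
    """Join name components (general to specific) with dashes, then
    append a YYYYMMDD date and a zero-padded version number.

    The "v" prefix and the date-before-version order are assumptions
    made for this example.
    """
    parts = list(components)
    parts.append(when.strftime("%Y%m%d"))  # e.g. 20240305
    parts.append(f"v{version:02d}")        # e.g. v02
    return "-".join(parts) + "." + extension


if __name__ == "__main__":
    # Hypothetical project, dataset, and stage names
    print(build_filename(["projectx", "survey", "raw"], date(2024, 3, 5), 2, "csv"))
    # prints: projectx-survey-raw-20240305-v02.csv
```

Because the date is written as YYYYMMDD and the version is zero-padded, an ordinary alphabetical sort of names that share the same leading components also orders them chronologically and by version.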
In addition to thinking about how to organize files on disk and how to name them, you should also consider how to organize the contents of your files. These links offer some best practices.