Directory structure and file naming
Think of a clear and consistent directory structure.
Directory structure
Organize your data in a clear directory structure. A good structure helps users of your data set navigate it and get an overview of your data quickly. Directory names should be readily understandable but not too long. If a name cannot say enough, put a “README” file in the directory that explains, in at most a couple of sentences, the contents of the files and the purpose they serve in your research.
There is no single correct way to structure your data: it is a project-specific task that depends on the project layout and the types of data you are collecting. The objective is a folder hierarchy that is easy to understand. Some suggestions are:
- Site
  - General
    - Data files (spreadsheets/databases)
    - Photographs
    - Drawings/maps
    - Etc.
  - Material type
    - Data files (spreadsheets/databases)
    - Photographs
    - Drawings/maps
    - Etc.
- General
- Part of research or Publication
  - Data files (spreadsheets/databases)
  - Photographs
  - Drawings/maps
  - Etc.
  - (Sub-research)
    - Data files (spreadsheets/databases)
    - Photographs
    - Drawings/maps
    - Etc.
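As an illustration of the suggestions above, the following minimal Python sketch creates such a hierarchy and drops a README stub into each group folder. The folder names (`site_a/general`, `data_files`, and so on) and the function name `create_structure` are hypothetical choices for this example, not prescribed names; adapt them to your own project.

```python
from pathlib import Path

# Hypothetical subfolder names, loosely following the suggestions above.
SUBFOLDERS = ["data_files", "photographs", "drawings_maps"]


def create_structure(root, groups, subfolders=SUBFOLDERS):
    """Create root/<group>/<subfolder> directories plus a README stub per group.

    Returns the list of subfolder paths that now exist.
    """
    root = Path(root)
    created = []
    for group in groups:
        for sub in subfolders:
            folder = root / group / sub
            # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
        readme = root / group / "README.txt"
        if not readme.exists():
            # Placeholder text only; replace it with a short real description.
            readme.write_text(
                f"Contents of {group}: describe the files and their purpose here.\n",
                encoding="utf-8",
            )
    return created
```

For example, `create_structure("my_project", ["site_a/general", "site_a/pottery"])` would create six subfolders and two README files under `my_project/site_a`.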
File naming convention
A proper file name is unique, consistent, and descriptive. Such names make data easier to identify, locate, and use, because the category, purpose, and version of a file are visible without opening it. Proper file names should:
- be unique (do not reuse the same name in different folders)
- be consistent (for example, in the use of upper and lower case)
- be permanent
- reserve the extension (usually three or four letters, preceded by a dot) for the file type
- not contain any dots other than the one before the extension
- indicate the version of the file in format v1, v2-3 or similar (if applicable)
- be meaningful but brief
- classify the file
- not contain spaces or special characters
- use hyphens “-” or underscores “_” to separate logical elements
- use the format YYYYMMDD for dates (easy to find and sort chronologically)
- only use codes that are documented.
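The rules above can be checked mechanically. The sketch below is one possible way to do so in Python, not a standard: the function names `build_filename` and `is_valid_filename`, the order of the name elements, and the choice of all-lowercase names are assumptions made for this example. The guideline itself only asks that case be used consistently.

```python
import re
from datetime import date

# Assumed convention: lowercase letters and digits, elements separated by
# "_" or "-", and exactly one dot, placed before the extension.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:[_-][a-z0-9]+)*\.[a-z0-9]+")


def build_filename(elements, extension, day=None, version=None):
    """Join descriptive elements, an optional YYYYMMDD date and an optional version."""
    parts = [element.lower() for element in elements]
    if day is not None:
        parts.append(day.strftime("%Y%m%d"))  # sorts chronologically as text
    if version is not None:
        parts.append(f"v{version}")  # e.g. "v1" or "v2-3"
    return "_".join(parts) + "." + extension.lower()


def is_valid_filename(name):
    """Return True if the name follows the assumed convention above."""
    return NAME_PATTERN.fullmatch(name) is not None


name = build_filename(["siteA", "pottery", "counts"], "csv",
                      day=date(2023, 5, 4), version="2")
print(name)                                         # sitea_pottery_counts_20230504_v2.csv
print(is_valid_filename(name))                      # True
print(is_valid_filename("Site A photo.final.JPG"))  # False
```

The check does not cover uniqueness across folders or whether the codes used in a name are documented; those still need a file listing and a README.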