“Fast is fine, but accuracy is everything…”
– Wyatt Earp
OK, so he may have been talking about a gunfight rather than data management, but the premise still holds! Analysis is only as good as your data – it must be accurate and you need to look after it. So how best to do this? Chris from the Research Data Management team has been giving us more tips on keeping data tidy and secure, which I have summarised below.
Firstly, consistency is key, particularly in your directory structure and filenames:
- Use a consistent date format at the beginning of your filenames
- Use underscores rather than blank spaces
- Don’t use special characters
- Keep as brief as possible
- Make a filename meaningful and descriptive
- Separate ongoing projects from those which have already been completed
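To make the naming tips above concrete, here is a minimal Python sketch of a helper that builds filenames along these lines. The function name `make_filename`, the ISO date format (YYYY-MM-DD) and the example description are my own assumptions for illustration, not something prescribed by the Research Data Management team.

```python
import re
from datetime import date


def make_filename(description: str, when: date, extension: str) -> str:
    """Build a filename such as 2024-03-15_soil_ph_survey.csv (hypothetical helper)."""
    # Replace runs of blank space with a single underscore.
    name = re.sub(r"\s+", "_", description.strip())
    # Drop special characters: keep only letters, digits, underscores and hyphens.
    name = re.sub(r"[^A-Za-z0-9_-]", "", name)
    # Put the date first, in one consistent format (ISO 8601 is assumed here).
    return f"{when.isoformat()}_{name.lower()}.{extension.lstrip('.')}"


print(make_filename("Soil pH survey (site A)", date(2024, 3, 15), "csv"))
# prints 2024-03-15_soil_ph_survey_site_a.csv
```

Putting the date first in year-month-day order has the handy side effect that files sort chronologically in a directory listing.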
As well as making sure that you can identify a file without having to open it, it is also important to use a consistent method of version control. This could be a sequential numbering system (e.g. v1, v2) or version control software such as SVN, CVS or Git. This is particularly important if multiple people are likely to be working on the data.
Ask yourself this – when you need to go back to your file in a month or a year from now, will you remember where it is, what version is correct and what the title actually means?
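If you go for the simple sequential numbering approach (v1, v2 etc.), a small helper can work out the next version for you. This is only a sketch: the function name `next_version_name` and the `stem_vN.ext` naming pattern are assumptions made for the example, and for anything collaborative a proper tool such as Git is likely to serve you better.

```python
import re


def next_version_name(existing: list[str], stem: str, extension: str) -> str:
    """Return the next filename in a stem_v1.ext, stem_v2.ext ... sequence."""
    # Match only files that follow the assumed stem_vN.ext pattern.
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+)\.{re.escape(extension)}$")
    # Compare version numbers as integers so that v10 ranks above v2.
    versions = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{stem}_v{max(versions, default=0) + 1}.{extension}"


files = ["survey_v1.csv", "survey_v2.csv", "survey_v10.csv", "notes.txt"]
print(next_version_name(files, "survey", "csv"))
# prints survey_v11.csv
```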
Besides a sensible name and location, there is additional information required to ensure that your data is usable – the contextual information to give the data meaning. This should be documented and stored alongside the data, and include:
- A basic description of the project
- Descriptions of the methodologies, protocols, sampling techniques used etc.
- An explanation of the content of all the data files and how they relate to each other
- A list of any software required to access/analyse the data
- A detailed list of all the variables
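One way to make this a habit is to generate a plain-text README from a template and keep it in the same folder as the data. The sketch below assumes a hypothetical `build_readme` function and made-up project details; the section headings simply mirror the five items listed above.

```python
def build_readme(description, methods, files, software, variables):
    """Return README text with one section per item in the list above."""
    sections = [
        ("PROJECT DESCRIPTION", [description]),
        ("METHODS", [methods]),
        ("FILES", [f"- {name}: {about}" for name, about in files.items()]),
        ("SOFTWARE REQUIRED", [f"- {item}" for item in software]),
        ("VARIABLES", [f"- {name}: {meaning}" for name, meaning in variables.items()]),
    ]
    # Each section is a heading followed by its lines, separated by blank lines.
    blocks = [heading + "\n" + "\n".join(body) for heading, body in sections]
    return "\n\n".join(blocks) + "\n"


# All values below are invented purely for illustration.
readme = build_readme(
    description="Soil pH survey across three field sites.",
    methods="Monthly core samples, measured with a calibrated probe.",
    files={"2024-03-15_soil_ph_survey.csv": "raw readings, one row per sample"},
    software=["Any spreadsheet package or Python 3"],
    variables={"site": "field site code", "ph": "soil pH reading"},
)
print(readme)
```

Saving the result as something like `README.txt` next to the data files means the context travels with the data whenever it is copied or shared.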
Documenting your notebooks is also vital. Whilst physical notebooks do not necessarily carry the same security risks as electronic ones, they can unfortunately get lost or damaged, and take a lot of time to ‘back up’. Electronic notebooks have, however, come a long way, and their benefits include:
- Easily searchable
- Support collaborations from multiple researchers
- Simple to back up and share
- Accessible anywhere on multiple devices
So where’s best to ultimately store your data? This depends on who you want to share it with. At the University of Exeter we are lucky to have several options available:
- University server (U:Drive) – this is secure and regularly backed up
- University OneDrive for Business – cloud storage which is GDPR compliant and supported by the University
- University shared network drive
- Portable storage e.g. memory stick, external hard drive, laptop etc. – these have a finite lifetime and could be susceptible to loss/damage
Other cloud storage services can also be used, although it is worth remembering that GDPR restricts the transfer of personal data outside the EU, and not all cloud providers keep data on EU servers.
The 3-2-1 strategy gives us a handy reminder for backing up:
- Always keep 3 copies of your data
- Store copies on at least 2 different media
- Keep at least one of these in a physically different location
Regarding data security, consider whether your data is sensitive, either because it relates to individuals or from a commercial angle. Set access controls and encryption to reflect this.
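As a rough illustration of the 3-2-1 strategy, the Python sketch below checks a list of backup copies against the three rules. The `Copy` class, the `check_3_2_1` function and the media and location labels are all hypothetical names chosen for the example; it only checks what you tell it, and does not inspect any real storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    medium: str    # e.g. "university_server", "external_drive"
    location: str  # e.g. "campus", "home"


def check_3_2_1(copies: list[Copy]) -> list[str]:
    """Return the 3-2-1 rules that are not yet met (empty list means all met)."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.medium for c in copies}) < 2:
        problems.append("fewer than 2 different media")
    if len({c.location for c in copies}) < 2:
        problems.append("no copy in a physically different location")
    return problems


plan = [
    Copy("university_server", "campus"),
    Copy("external_drive", "campus"),
    Copy("cloud", "provider_datacentre"),
]
print(check_3_2_1(plan))      # prints []
print(check_3_2_1(plan[:2]))  # two copies, both on campus
```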
As always, the Research Data Management team can advise and elaborate on any of the above points – please just ask if needed!