Understanding (Research) Data Storage in the UiL OTS Labs
If you have suggestions on how to improve this document, or find mistakes, please send them to firstname.lastname@example.org
In the UiL OTS Lab you can put your data in various places. These places can essentially be divided into two categories: network storage and local temporary storage.
This how-to describes the differences between these two categories, and helps you figure out where to store your data. If you want to know how you can access your data, see the how-to on Accessing your (Research) Data.
The UiL OTS Lab provides four storage locations (or drives in Windows terminology) on the centralized data server of the lab. These storage locations are accessible anywhere in the lab, as well as from home, so they can be used to share data with others (e.g. when working together in a team), or to store your own data somewhere you can access from any computer with an internet connection.
An additional important advantage is that all data stored on the network storage locations is regularly backed up (with one exception, noted below), which means your data is much safer than if you store it on just one computer, or even on a USB drive.
The various network storage locations provided by the UiL OTS Labs, in addition to the University-wide U-drive, are these:
Your labhome is a private folder on the network that is regularly backed up, generally protected against data loss, and accessible by you from anywhere. Only you have access to this directory. Everybody who has been added to the all users group (through the lab access request form) has such a private labhome folder. You can have only one labhome folder; you cannot request another.
This is where you put data that no one else needs to (or should) see, e.g. personal notes, data from a solo project, or (privacy) sensitive data from the experiment that should be hidden from other researchers in the project.
This directory is similar to the U-drive, but allows you to store more data, making it easier to put all your (private) research data here.
A project folder is where you store data related to a specific project that should be accessible by everybody in your research group. This is typically where your (processed or unprocessed) experiment results and gathered data are stored, as well as your codebook, the minutes of your meetings, schedules, the paper you are writing collectively, etcetera.
This directory is accessible by everybody in your research group (provided they have been explicitly added; see the project folder request form) and by no one else. Apart from the multiple users, this folder works the same as your labhome drive.
You can request – and be a member of (or have access to) – multiple project folders. Generally, every running project in the lab is assigned exactly one project folder.
The data exchange is where you can freely exchange data with other users in the lab or with lab support. Everybody who can log in on the computers in the lab has access to this location. This means you can use it to share (non-sensitive) data with unspecified users. For example, you can put a file there and e-mail everybody who needs to see it to tell them where to find it.
You should only put data here that you don’t mind anybody seeing. If it’s private data, you should not share it here. If it’s the only copy of the data, you should not share it here. If it doesn’t matter if more people than the actual target audience see the data, sharing through this location works quite well.
Do note that you should only put data here that you have also stored somewhere else. Data older than 14 days is automatically removed from this location. This location is meant only for sharing data, not for storing it. This is also the only network location that is not backed up.
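Because of the 14-day removal rule, it can help to check which shared files are getting old. The following is a minimal Python sketch, not an official lab tool. It assumes that a file's age is measured from its last modification time (the text above does not say how the lab measures age), and the function name and the warning margin are made up for this illustration.

```python
import time
from pathlib import Path


def files_at_risk(folder, max_age_days=14, warn_margin_days=4, now=None):
    """Return (path, age_in_days) for files close to or past max_age_days.

    Assumption: age is counted from the last modification time. The lab's
    actual clean-up job may use a different criterion.
    """
    now = time.time() if now is None else now
    threshold_days = max_age_days - warn_margin_days
    at_risk = []
    for path in Path(folder).rglob("*"):
        if path.is_file():
            age_days = (now - path.stat().st_mtime) / 86400  # seconds per day
            if age_days >= threshold_days:
                at_risk.append((path, age_days))
    # Oldest files first, since those are removed soonest.
    return sorted(at_risk, key=lambda item: item[1], reverse=True)
```

You would call it with the path under which the data exchange is mounted on your computer, which depends on your setup and is not given here.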
The fourth location makes available any corpora that can be used for linguistic research and that we are licensed to share.
This location is accessible by everybody in the lab, but you cannot change any of the files on it.
Local Temporary Storage
The network storage locations have one important disadvantage: they are (relatively) slow. Every interaction with files on the network requires additional steps, and the more users are busy with the storage, the slower these interactions become. You may compare this with a highway: a little traffic has no impact on the speed of the individual cars (or files), but as the amount of traffic grows (more users are downloading or uploading files), the cars (or files) move more slowly and have to wait their turn.
When running experiments, timing is of the essence and having to wait for a file is often unacceptable. As a user you would not notice a file taking another 200 milliseconds to open, but if you are running a time-sensitive experiment, and your trial starts 200 milliseconds later than planned, your results may very well become useless.
To counter this effect, local temporary storage is available on all computers in the UiL OTS Labs. This storage does not depend on the “traffic” of the network servers and is fast enough for time-sensitive experiments.
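If you are curious how large the difference is for your own files, you could time a few reads of the same file from both locations. This is only a rough Python sketch with a made-up function name, not a proper benchmark: the operating system caches files after the first read, so the first measurement is usually the most telling one.

```python
import time
from pathlib import Path


def read_times_ms(path, repeats=5):
    """Read the whole file `repeats` times; return each duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        Path(path).read_bytes()
        durations.append((time.perf_counter() - start) * 1000)
    return durations
```

Running it once on a stimulus file in a network folder and once on a copy in local storage gives an impression of the delay a trial could suffer.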
This storage has none of the advantages of the network storage, however. Your data is not backed up, and it is not accessible on any computer other than the one you saved it on. In fact, every once in a while, everything is removed from the local storage to keep our computers clean and fast. Keep the following things in mind when using the local temporary storage:
- Always run experiments from the local storage. See this how-to for the correct procedure.
- Always upload your data to the network storage before logging out or shutting down the computer. Don’t assume it’s safe.
- Don’t assume you can access your data anywhere. You cannot. It’s only stored on that computer. The storage is very local.
- You can’t share or exchange local data. If you want to share data with others, you need to use the network storage.
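The first two points in the list above amount to "copy to local storage, run, copy results back". The Python sketch below illustrates that idea under stated assumptions: the function names are hypothetical, the folder locations are whatever applies on your lab computer, and the size check is only a basic sanity check. It does not replace the procedure in the how-to mentioned above. It needs Python 3.8 or newer because of `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def stage_locally(network_experiment, local_root):
    """Copy an experiment folder from network storage into a local folder."""
    source = Path(network_experiment)
    target = Path(local_root) / source.name
    # dirs_exist_ok lets you re-stage over an earlier copy (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


def upload_results(local_results, network_results):
    """Copy result files back to network storage and check each file's size."""
    local_results = Path(local_results)
    network_results = Path(network_results)
    copied = []
    for src in sorted(local_results.rglob("*")):
        if src.is_file():
            dst = network_results / src.relative_to(local_results)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also keeps timestamps
            if dst.stat().st_size != src.stat().st_size:
                raise IOError(f"Size mismatch after copying {src}")
            copied.append(dst)
    return copied
```

You would call `stage_locally` before the session and `upload_results` before logging out, and only remove the local copy once the upload has succeeded.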
Confused about which storage to use?
DON’T PANIC. It’s okay. You can use the following flow chart to see which storage location works best for you:
You can store your data either on the network, or locally and temporarily on a lab computer.
If you are running an experiment, you should always copy your experiment folder to the local temporary storage of the lab computer you are running the experiment on. The best place to put an experiment is in the Experiments folder on your desktop. Also see this how-to for the complete guide on how to run your experiments properly.
If you want to store data you’ve already collected, you should always put it in one of the network drives:
- In your labhome if only you should have access to the data, and you mainly work at home or in the lab.
- On your U-drive if only you should have access to the data, but you mostly work on University computers outside of the lab.
- In the project folder associated with your research if the data is related to your research project and/or other people in the same project folder should have access to it.
Note: Do not use the data exchange to exchange files that you are still working on. These files are deleted automatically after 14 days, so you might lose important data if you use this folder to work on files.
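The rules above can also be written down as a small decision function. This is just an illustrative Python sketch of the choices described in this section. The function and parameter names are invented here, and the data exchange is left out on purpose because it is meant for sharing copies, not for storing data.

```python
def choose_storage(running_experiment,
                   project_data_or_shared=False,
                   mostly_on_university_computers_outside_lab=False):
    """Suggest a storage location following the rules in this section."""
    if running_experiment:
        # Experiments should run from the lab computer itself.
        return "local temporary storage"
    if project_data_or_shared:
        # Project-related data, or data others in the project need.
        return "project folder"
    if mostly_on_university_computers_outside_lab:
        return "U-drive"
    # Private data, working mainly at home or in the lab.
    return "labhome"


print(choose_storage(running_experiment=False, project_data_or_shared=True))
```

The experiment check comes first because running from local storage applies regardless of where the collected data ends up afterwards.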
It’s quite possible that the options for storing or sharing your data mentioned above are simply not sufficient. There are alternative options provided by Utrecht University here: