When you log into your sandbox, you will have a number of folders available to you. To get started, we will concentrate on the home, library-red and red folders.

This reference page goes through the other folders and explains what they are for and how they should be used.

The following is a high-level overview of the directories in the TRE:

Image showing high-level overview of the TRE

Your Home Folder

Available at /home/ivm in your sandbox, this is your personal folder. You can use this to store any files you want to keep, but it is not backed up. If you delete a file from here, it is gone forever.

Home Folder Uses

You can use your home folder for working files, but because it is not backed up, the best place for any files you want to keep is your red/ folder.

/home/ivm is semi-fast (HDD) storage and is faster than other parts of the sandbox. It may be worth running some jobs here, especially if you are loading large amounts of data. Be aware, however, that this folder is not backed up, so anything you want to keep should be moved to the red/ folder.
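As a rough illustration of this workflow, the sketch below copies a finished results directory from fast scratch space to a backed-up location and checks that the copy matches. The function name save_results and the example paths are hypothetical, and the sketch assumes the destination can be written to with an ordinary cp from the command line, which may not hold in every version of the TRE.

```shell
# Hypothetical helper (not part of the TRE): copy a results directory
# to a backed-up location and verify the copy.
# Usage: save_results SRC_DIR DEST_PARENT
# Plain variables are used (no `local`) to stay POSIX-compatible.
save_results() {
    src=${1%/}          # drop a trailing slash, if any
    dest_parent=${2%/}
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest_parent" || return 1
    # -R copies the directory and everything below it.
    cp -R "$src" "$dest_parent/" || return 1
    # diff -r exits non-zero if the two directory trees differ.
    diff -r "$src" "$dest_parent/$(basename "$src")" >/dev/null
}
```

For example, `save_results /home/ivm/my_job/results /genesandhealth/red/my_name` would be one way to call it, where my_job and my_name are placeholders. Only delete the copy in /home/ivm once the function has returned successfully.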

Types of data sensitivity

Folders are suffixed with red or green to indicate the type of data that is stored there. Red is for potentially sensitive data that should not be shared with the outside world. Green is for data that can be shared with the outside world.

library-red

Available at /genesandhealth/library-red in your sandbox, this is a read-only folder that is shared between all users. This contains the data that you will want to use.

library-red is slower, large-capacity storage (>8 PiB as of Feb 2022). For large files, gcsfuse must read and cache the whole file first; seeking to a particular part of the file is not possible. For high-performance work or large files, it may be better to make a copy in /genesandhealth/red or /home/ivm.
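One way to apply this advice is to copy a large file to local storage once and reuse the local copy afterwards. The sketch below is only an illustration: stage_local is a hypothetical helper, and it deliberately does not check whether the original file has changed since it was copied.

```shell
# Hypothetical helper (not part of the TRE): copy a file to a local
# directory the first time it is requested, then reuse the local copy.
# Usage: stage_local SRC_FILE LOCAL_DIR   (prints the local path)
stage_local() {
    src=$1
    local_dir=${2%/}
    dest="$local_dir/$(basename "$src")"
    mkdir -p "$local_dir" || return 1
    # Copy only if there is no local copy yet; an existing copy is
    # reused as-is, even if the source has changed since.
    if [ ! -f "$dest" ]; then
        cp "$src" "$dest" || return 1
    fi
    printf '%s\n' "$dest"
}
```

A call such as `local_copy=$(stage_local "$big_file" /home/ivm/scratch)` leaves the path of the local copy in local_copy, where big_file is assumed to hold the path of a file under /genesandhealth/library-red. Tools that need to jump around inside the file can then be pointed at the local copy.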

library-red is a Google Cloud Storage bucket, gs://qmul-sandbox-production-library-red/ (admins have read and write access; for everyone else it is read-only).

library-red is where curated and raw data is stored. This is where you will find the data you need to run your analysis. There are a number of subfolders in library-red that contain different types of data, and each folder should be used for a specific purpose. If you find a folder that does not have a readme file, please contact the Genes and Health team to get more information on what the folder is for.
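To find folders that lack a readme, something like the following sketch could be used. It assumes a readme is any file directly inside the folder whose name starts with "readme" in any mix of upper and lower case; that naming convention is an assumption, not something this page specifies, and folders_without_readme is a hypothetical helper.

```shell
# Hypothetical helper (not part of the TRE): print each immediate
# subfolder of DIR that contains no file named readme* (any case).
# Usage: folders_without_readme DIR
folders_without_readme() {
    root=${1%/}
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue       # the glob matched nothing
        found=no
        for f in "$dir"[Rr][Ee][Aa][Dd][Mm][Ee]*; do
            [ -f "$f" ] && found=yes
        done
        [ "$found" = yes ] || printf '%s\n' "${dir%/}"
    done
}
```

Running `folders_without_readme /genesandhealth/library-red` would then list the top-level folders to ask the Genes and Health team about. Listing a large bucket-backed folder this way may be slow.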

red

red is used directly by the virtual machine, and is specific to each sandbox. Users in the same sandbox can all see the contents of the red folder. Most organisations use this folder to store their analyses.

red is slower storage than /home/ivm but is backed up. We strongly recommend that you use this folder to store your data and any analysis files you run frequently, so that you do not lose your work.

We strongly recommend that you make your own directory in the red folder to store your data. This will allow you to share your data with other users in the same sandbox, without the risk of them accidentally deleting it.

In the Old TRE you can do this directly in the File Manager or on the Command Line.
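On the command line, creating your own directory might look like the sketch below. The helper name make_my_dir and the directory name my_name are hypothetical. The name is passed explicitly rather than taken from the login name, on the assumption (suggested by the /home/ivm path) that users in a sandbox may share the same login name.

```shell
# Hypothetical helper (not part of the TRE): create a personal working
# directory under a shared folder and print its path.
# Usage: make_my_dir SHARED_ROOT NAME
make_my_dir() {
    shared_root=${1%/}
    name=$2
    if [ -z "$name" ]; then
        echo "usage: make_my_dir SHARED_ROOT NAME" >&2
        return 1
    fi
    # -p creates missing parents and succeeds if the directory exists.
    mkdir -p "$shared_root/$name" || return 1
    printf '%s\n' "$shared_root/$name"
}
```

For example, `make_my_dir /genesandhealth/red my_name` would create /genesandhealth/red/my_name if it does not already exist.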

In the New TRE, files can only be copied to the red bucket by right-clicking on the file and selecting Upload to red bucket.

Image showing Upload to red bucket option

This can also be done via gsutil from within the TRE, for example:

gsutil cp -r -n my_file "gs://$BUCKET_SANDBOX_IVM/"
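In that command, -r copies directories recursively and -n skips anything that already exists at the destination instead of overwriting it. The sketch below wraps the same command with a check that BUCKET_SANDBOX_IVM is set and an optional dry run. The function name upload_to_red and the DRY_RUN variable are hypothetical conveniences, not part of the TRE or of gsutil.

```shell
# Hypothetical wrapper (not part of the TRE) around the gsutil command
# shown above. Set DRY_RUN=1 to print the command instead of running it.
# Usage: upload_to_red FILE_OR_DIR
upload_to_red() {
    src=$1
    if [ -z "${BUCKET_SANDBOX_IVM:-}" ]; then
        echo "BUCKET_SANDBOX_IVM is not set" >&2
        return 1
    fi
    dest="gs://$BUCKET_SANDBOX_IVM/"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'gsutil cp -r -n %s %s\n' "$src" "$dest"
    else
        # -r: recurse into directories; -n: do not overwrite existing objects.
        gsutil cp -r -n "$src" "$dest"
    fi
}
```

Running `DRY_RUN=1 upload_to_red my_file` first is a cheap way to confirm the destination before anything is copied.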

To remove a file from /genesandhealth/red, right-click on it in the File Manager and select Delete from Red Bucket.

Image showing Delete from red bucket option