Ingest Data into the Project Store (Tutorial)

Abbreviations Key
cvf      create archive, verbose output, filename
ELN      electronic lab notebook
Guid     globally unique identifier
HISE     Human Immune System Explorer
IDE      integrated development environment
UI       user interface
SLIMS    simplified laboratory information management system
tar      tape archive (Linux command)

At a Glance

Each project has a Project Store where all project users can store analyses and upload data that's not associated with an automated pipeline. It's important to include a manifest that links these files with specific cohorts, subjects, and samples. Users can then locate the files in an advanced search, and they become traceable in a certificate of reproducibility. To send data to your Project Store, a designated watchfolder must be set up by the HISE administrator. For watchfolder setup or modification, contact immunology-support@alleninstitute.org.

Description

The Project Store is a read-only space where multiple users can upload data, mark files for deletion, and store insights in HISE. Data analysts use the Project Store UI to browse files and to save or locate certain project files, such as JPG and PDF files. Each project has its own Project Store, which means that the files in a given Project Store correspond to the specified project. 

You can upload files to the Project Store. If the Project Store has a designated watchfolder, drop your files there to ingest the data into HISE. If there is no designated watchfolder, the name of any file you drop into one of the project's other watchfolders must conform to the selected file type. For details, see Use Watchfolders to Ingest Data.

Data does not persist in a watchfolder. Instead, dropping files into the watchfolder triggers a storage folder to upload and save the data. The files are then housed in your Project Store, where anyone with access can download them. You can also associate metadata with files you've already ingested. Doing so is a best practice that helps study collaborators find and access your analysis and results.

Instructions

Ingesting files into your Project Store is a three-step process: create a manifest to tie your files to specific samples, tar up the manifest and sample files for ingestion, and upload the tar file into the designated watchfolder. For details, follow the instructions below.

Step 1: Create the manifest file

1. Create a manifest.csv file declaring the file type and sample reference(s) for each file. (If your project doesn't include the type of file you want to upload, contact immunology-support@alleninstitute.org to request that your file type be added.) Use the .csv file in the following table as an example.

a. The accountGuid and projectGuid link your data to the right account and project. 

b. If you have a simplified ELN experiment, SLIMS can automatically generate a manifest.csv that contains the accountGuid and projectGuid. For details, see Attach Metadata to Files.

accountGuid        16309200-3228-46eb-9a8e-3a4133f4d723
projectGuid        e206cf7a-5b13-478f-b842-a305fe4954d8
file               samples                            fileType
Testfilename.csv   KT00970;KT01245;KT01244;KT00971    Testfiletype
myFile.rds         KT2002;KT1304                      rds-file
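
As a minimal sketch, the example manifest above could also be written from the terminal. The comma-separated layout shown here (two key,value rows, then a header row, then one row per file) is an assumption read off the example table, and the directory name hise_manifest_demo is hypothetical. Compare the result with a manifest generated by SLIMS before relying on it.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory; use your own project location instead.
demo=hise_manifest_demo
mkdir -p "$demo"

# Assumed CSV layout: account and project rows, a header row, then one row per file.
# Multiple sample IDs for one file are separated by semicolons, as in the example table.
cat > "$demo/manifest.csv" <<'EOF'
accountGuid,16309200-3228-46eb-9a8e-3a4133f4d723
projectGuid,e206cf7a-5b13-478f-b842-a305fe4954d8
file,samples,fileType
Testfilename.csv,KT00970;KT01245;KT01244;KT00971,Testfiletype
myFile.rds,KT2002;KT1304,rds-file
EOF

# Print the manifest so it can be checked by eye.
cat "$demo/manifest.csv"
```

Replace the GUIDs, file names, sample IDs, and file types with the values for your own account and project.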

Step 2: Create the tar file

1. To create a new directory for your manifest file and samples, open your terminal and enter the following command. (In this example, the directory is called myTar, but use a name that makes sense for your project.)

mkdir myTar

2. Drag and drop the manifest.csv file and your sample files into the directory you just created.

3. Creating a tar file can accidentally overwrite data. To keep a backup, copy the directory to another location before you proceed.

4. To create the tar file and prevent duplicate metadata, enter the following command. On macOS, COPYFILE_DISABLE=1 keeps tar from adding ._ metadata files to the archive. (In this example, the tar file is called myProjectFiles.tar, but use a name that makes sense for your project.)

COPYFILE_DISABLE=1 tar cvf myProjectFiles.tar myTar/*.*
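
The steps above can be sketched end to end as follows. This is only an illustration: the scratch directory hise_tar_demo, the placeholder data file, and the assumption that file rows in manifest.csv start on line 4 are not part of HISE. The final check lists the archive and reports any file that the manifest names but the archive lacks.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory so the sketch does not touch real project files.
demo=hise_tar_demo
mkdir -p "$demo/myTar"

# Placeholder manifest and data file standing in for your own.
cat > "$demo/myTar/manifest.csv" <<'EOF'
accountGuid,16309200-3228-46eb-9a8e-3a4133f4d723
projectGuid,e206cf7a-5b13-478f-b842-a305fe4954d8
file,samples,fileType
myFile.rds,KT2002;KT1304,rds-file
EOF
printf 'placeholder\n' > "$demo/myTar/myFile.rds"

(
  cd "$demo"

  # Keep a backup copy of the directory before creating the archive.
  rm -rf myTar.backup
  cp -R myTar myTar.backup

  # Same tar command as in the step above.
  COPYFILE_DISABLE=1 tar cvf myProjectFiles.tar myTar/*.*

  # Assumes file rows start on line 4 of the manifest; the first column is the file name.
  tail -n +4 myTar/manifest.csv | cut -d, -f1 | while read -r f; do
    tar tf myProjectFiles.tar | grep -qxF "myTar/$f" || echo "missing from archive: $f"
  done
)

# List the archive contents as a final check.
tar tf "$demo/myProjectFiles.tar"
```

Note that the myTar/*.* pattern only picks up names that contain a dot, so a file without an extension would be left out of the archive.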

Step 3: Ingest the tar file

1. Navigate to HISE, and log in with your organization's email account.

2. From your Personal Space, choose Environment. (Your Personal Space is located below your name in the upper-right corner of your screen.)

3. On the Configure HISE Environment screen, choose the Accounts tab, and click the drop-down menu next to Available Accounts. From the list, choose the account you want to work with.

4. In the Available Projects section, select the checkbox next to each project you want to work with. (To select or deselect all available projects, click the checkbox to the left of the table column headers.)

5. Return to your Personal Space. From the drop-down menu, choose Watch Folders, and click on the watchfolder for your account and project.

6. The watchfolder opens. Click UPLOAD FILES, or drag and drop your files into the watchfolder.

7. To see the status of your uploaded files, from the top navigation menu, choose Data Processing > Ingest Receipts.

a. A "dismiss" error means that the file name does not match the regular expression (regex) expected for the file type. For details, see Use Watchfolders to Ingest Data.

b. A "fail" error means that there was some other problem uploading the file. Try again, and open a ticket (immunology-support@alleninstitute.org) if the issue persists.
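
To reduce "dismiss" errors, file names can be checked locally before upload. The sketch below is only an illustration: the pattern is hypothetical, and the real regex for each file type is whatever is configured for your watchfolder in HISE.

```shell
#!/bin/sh
set -eu

# Hypothetical pattern: letters, digits, underscores, or hyphens, followed by ".rds".
# Replace it with the regex configured for your file type.
pattern='^[A-Za-z0-9_-]+\.rds$'

# Report whether each candidate file name matches the pattern.
for name in "myFile.rds" "my file.rds"; do
  if printf '%s\n' "$name" | grep -Eq "$pattern"; then
    echo "ok: $name"
  else
    echo "no match: $name"
  fi
done
```

With this hypothetical pattern, myFile.rds matches and "my file.rds" does not, because the pattern does not allow spaces.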


Related Resources

Use Watchfolders to Ingest Data

Understand Watchfolders and the Project Store