Create or Delete Metadata
At a Glance
When metadata fields are not provided in SLIMS, we allow users to create their own samples, cohorts, and subjects.
Direct Cohort Metadata Ingest
The Direct Cohort Metadata Ingest is useful when we must create Cohorts, Subjects, Samples, or Specimens that do not originate on-site and so are not recorded in our Laboratory Information Management System (LIMS). This is commonly the case with Reference Data Sets.
Most of the subject and sample metadata we see in HISE is referenced from lab data that exists in a single laboratory information management system (SLIMS). By integrating SLIMS with HISE, we reduce human errors and are able to easily track samples and lab resources.
The limitation arises when partners need to reference samples that exist outside of SLIMS. With the Direct Cohort Metadata Ingest feature, users can create their own cohort, subject, and sample metadata values.
Defining Demographics Scheme
Before a user can create their own metadata values, the HISE support team needs to configure the project by defining the File Type for demographic ingest and the demographics scheme. When you request the ability to create custom metadata, provide the HISE support team with a map between the headers of your file and the metadata fields in HISE.
The following is an example of the kind of map information the HISE support team will need:
The HISE support team will also need the expected date format for any fields that require one. Once the support team receives the required information, it will create a watch folder and share the link with the user who made the request. The user will then be able to create new metadata values by uploading a CSV file.
Note: It's very important that the headers and the date format in the CSV you upload match the information you sent to the HISE support team. Otherwise, errors will occur and no new values will be created.
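Because a mismatch means nothing is created, it can be worth checking a file locally before uploading it. The following is a minimal sketch of such a check, not part of HISE. The names `EXPECTED_HEADERS`, `DATE_COLUMNS`, `DATE_FORMAT`, and `check_csv` are hypothetical, and so are the `collectionDate` column and the date format: substitute whatever you actually agreed on with the HISE support team.

```python
import csv
import io
from datetime import datetime

# Assumptions for illustration only: replace these with the headers and
# date format you sent to the HISE support team.
EXPECTED_HEADERS = ["cohort", "cohortDescription", "subjectGuid", "collectionDate"]
DATE_COLUMNS = ["collectionDate"]  # hypothetical date column
DATE_FORMAT = "%Y-%m-%d"           # hypothetical agreed format


def check_csv(text):
    """Return a list of human-readable problems found in the CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    headers = reader.fieldnames or []
    if headers != EXPECTED_HEADERS:
        problems.append(f"headers {headers} do not match {EXPECTED_HEADERS}")
        return problems
    # Data rows start on line 2 because line 1 holds the headers.
    for line_no, row in enumerate(reader, start=2):
        for col in DATE_COLUMNS:
            value = (row[col] or "").strip()
            if not value:
                continue  # empty cells are allowed in this sketch
            try:
                datetime.strptime(value, DATE_FORMAT)
            except ValueError:
                problems.append(f"line {line_no}: {col}={value!r} is not {DATE_FORMAT}")
    return problems
```

An empty list means the headers and dates are consistent with the values configured at the top of the script. It does not guarantee that the upload will be accepted.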
Creating Custom Metadata Values
To build the CSV file that gets uploaded to a watch folder correctly, it helps to know the levels and groupings of the metadata fields and how they relate to one another.
Level 1 fields: Cohort
The top-level field is Cohort. You can think of this reference field as an umbrella over all other metadata fields. It is the highest level of grouping: all Subjects and Samples are grouped together by a Cohort, and a given Cohort can have many related Subjects and Samples.
The following fields are considered Level 1 fields:
cohort
cohortDescription
Level 2 fields: Subject
Subject fields are the next level of granularity of groupings. A single Subject will be in a single Cohort, but for one Subject there could be multiple Samples or Specimens that are mapped to it.
The following fields are considered Level 2 fields:
subjectGuid
birthYear
ethnicity
race
sex
Level 3 fields: Sample
The third level is sampleKitGuid and Sample metadata. The relationship between Sample and Specimen is one to many. In other words, a single Sample can have multiple Specimens related to it. A single Sample will only be associated with a single Subject.
The following fields are considered Level 3 fields:
sampleKitGuid
daysSinceFirstVisit
visitDetails
visitName
Level 4 fields: Specimen
The final layer and the most granular metadata fields are specimenGuid and Specimen-related fields. A single Subject or Sample can be associated with multiple Specimens. A Specimen is not an umbrella to any other fields and does not group any other metadata fields.
The following fields are considered Level 4 fields:
specimenGuid
totalViableCellCount
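The four levels can be summarized as a small lookup structure. This is only an illustrative sketch, not HISE's internal representation. The field names come from the lists above, with identifier columns spelled as in the example CSV header later in this document, and `LEVELS` and `row_level` are hypothetical names.

```python
# Each level has one identifier column plus its descriptive fields.
# The structure is an assumption made for illustration.
LEVELS = {
    1: {"name": "Cohort", "id_field": "cohort",
        "fields": ["cohortDescription"]},
    2: {"name": "Subject", "id_field": "subjectGuid",
        "fields": ["birthYear", "ethnicity", "race", "sex"]},
    3: {"name": "Sample", "id_field": "sampleKitGuid",
        "fields": ["daysSinceFirstVisit", "visitDetails", "visitName"]},
    4: {"name": "Specimen", "id_field": "specimenGuid",
        "fields": ["totalViableCellCount"]},
}


def row_level(row):
    """Return the deepest level whose identifier is filled in the row, or None."""
    deepest = None
    for level, spec in LEVELS.items():  # dicts keep insertion order: 1, 2, 3, 4
        if (row.get(spec["id_field"]) or "").strip():
            deepest = level
    return deepest
```

For example, a row that fills both `cohort` and `subjectGuid` is treated as a Level 2 (Subject) row, with `cohort` acting as the link to its parent.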
Putting it All Together
Now that we have a project set up to allow creating custom metadata values, let's look at an example CSV file that will create a new Cohort, as well as new Subjects and Samples.
The following is an example CSV that can be dropped into a watch folder:
| cohort | cohortDescription | subjectGuid | sampleKitGuid | birthYear | daysSinceFirstVisit | ethnicity | race | sex | specimenGuid | totalCellCount | visitDetails | visitName |
| Cohort A | Good Cohort | | | | | | | | | | | |
| Cohort A | | subjectA | | 1999 | | other | other | male | | | | |
| | | subjectA | sampleAA | | 0 | | | | | | tired | firstVisit |
| | | | sampleAA | | | | | | specimenA | 9000 | | |
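As a sketch, the same example can be written out as an actual comma-separated file with Python's standard `csv` module. The placement of the empty cells is an assumption based on the row-by-row walkthrough later in this section, and `HEADERS`, `ROWS`, and `build_csv` are hypothetical names.

```python
import csv
import io

HEADERS = [
    "cohort", "cohortDescription", "subjectGuid", "sampleKitGuid", "birthYear",
    "daysSinceFirstVisit", "ethnicity", "race", "sex", "specimenGuid",
    "totalCellCount", "visitDetails", "visitName",
]

# One dict per row; any column not mentioned is written as an empty cell.
ROWS = [
    {"cohort": "Cohort A", "cohortDescription": "Good Cohort"},
    {"cohort": "Cohort A", "subjectGuid": "subjectA", "birthYear": "1999",
     "ethnicity": "other", "race": "other", "sex": "male"},
    {"subjectGuid": "subjectA", "sampleKitGuid": "sampleAA",
     "daysSinceFirstVisit": "0", "visitDetails": "tired",
     "visitName": "firstVisit"},
    {"sampleKitGuid": "sampleAA", "specimenGuid": "specimenA",
     "totalCellCount": "9000"},
]


def build_csv(rows=ROWS):
    """Return the example table as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADERS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Writing the result of `build_csv()` to a file (for instance a hypothetical `demographics.csv`) gives something that can be dropped into the watch folder, provided the headers match the scheme configured for your project.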
Using the example CSV file, the following will be created:
- A Cohort with the name "Cohort A" and a description "Good Cohort".
- A Subject in "Cohort A" with the guid "subjectA", a birth year of "1999", ethnicity of "other", race of "other", and sex of "male".
- A Sample from "subjectA" with guid "sampleAA", daysSinceFirstVisit of "0", visitDetails of "tired", and visitName of "firstVisit".
- A Specimen from "sampleAA" with guid "specimenA" and totalCellCount of "9000".
The first row in this example CSV contains the headers that were defined when the project was configured. Please make sure the headers match the fields you specified when defining the demographics scheme.
The second row defines the Level 1 fields, cohort and cohortDescription. These are the only columns that need to be filled in this row.
The third row is where we map the Cohort to any Subjects we want to create, along with any Subject metadata values. Please make sure to provide the cohort value in these rows so the mapping between Subject and Cohort can be applied correctly. We can create as many Subjects as needed by inserting additional rows that map to a Cohort that was created in the row(s) above.
Note: The top level, in this case, the Cohort, must be defined before you create Subjects that belong in the Cohort. The Cohort could be a new one that is being added via CSV or one that already exists in HISE.
The fourth row in the CSV is where we assign the Level 3 fields: we create the Sample and provide the subjectGuid that the Sample should map to. All other fields in this row should only define the Sample metadata for the Sample we are creating.
Lastly, we assign the Level 4 fields by specifying the sampleKitGuid and mapping that to the specimenGuid and Specimen-related metadata fields.
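The rule that a parent must be defined before its children can be checked locally before upload. The sketch below is one possible check under stated assumptions, and HISE's own validation may differ. It assumes the identifier columns are named as in the example header, that a row defines the deepest level for which it carries an identifier, and that identifiers already present in HISE are passed in by the caller. `ID_COLUMNS` and `find_orphans` are hypothetical names.

```python
import csv
import io

# Identifier columns from the top level down, named as in the example header.
ID_COLUMNS = ["cohort", "subjectGuid", "sampleKitGuid", "specimenGuid"]


def find_orphans(text, existing=None):
    """Return (line number, message) pairs for rows whose parent is unknown.

    `existing` maps a column name to identifiers assumed to exist already.
    """
    seen = {col: set() for col in ID_COLUMNS}
    for col, ids in (existing or {}).items():
        seen.setdefault(col, set()).update(ids)
    problems = []
    # Data rows start on line 2 because line 1 holds the headers.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        filled = [i for i, col in enumerate(ID_COLUMNS)
                  if (row.get(col) or "").strip()]
        if not filled:
            continue
        own = filled[-1]  # deepest identifier present in this row
        own_col = ID_COLUMNS[own]
        own_id = row[own_col].strip()
        if own > 0:
            parent_col = ID_COLUMNS[own - 1]
            parent_id = (row.get(parent_col) or "").strip()
            if parent_id not in seen[parent_col]:
                problems.append((line_no, f"{own_col} {own_id!r} refers to "
                                          f"unknown {parent_col} {parent_id!r}"))
                continue
        seen[own_col].add(own_id)
    return problems
```

Passing `existing={"cohort": {"Cohort A"}}` models the case where the Cohort already exists in HISE and only Subjects are being added.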
If there are questions about how to properly set up this CSV to create demographic data points, please don't hesitate to email immunology-support@allenimmunology.org.
Edit metadata fields example
Users are able to edit metadata fields for Subjects and Samples that they have created. For example, a user can make changes to the Sample metadata fields for any given Sample that was created.
The following is an example of how to set up a CSV that will take sampleAA for subjectAA, and change values for fields like sex, birth year, and days since first visit: edit_demographics_example.csv
To make any edits, we define the Cohort, Subject, or Sample identifier fields. We then assign the metadata fields we want to make changes to.
Soft-Deleting Cohorts, Subjects, and Samples
HISE allows a Cohort, Subject, or Sample to be soft-deleted. Soft-deleting an entry has a cascading effect. For example, if a Level 1 entry such as a Cohort is soft-deleted, then all Subjects, Samples, and other entries associated with that Cohort will also be deleted. Any result files that are mapped to the deleted Cohort will also be deleted.
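The cascading behavior can be pictured with a toy in-memory model. This is only a sketch of the idea, not HISE's implementation. It assumes that "soft-delete" means flagging an entry rather than removing it, which is the usual meaning of the term but is not spelled out here. `Entry`, `soft_delete`, and `active_names` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A Cohort, Subject, Sample, or result file in a toy model."""
    name: str
    children: list = field(default_factory=list)
    deleted: bool = False


def soft_delete(entry):
    """Flag the entry and everything beneath it; nothing is removed."""
    entry.deleted = True
    for child in entry.children:
        soft_delete(child)


def active_names(entry):
    """Names of entries in the tree that are not flagged as deleted."""
    names = [] if entry.deleted else [entry.name]
    for child in entry.children:
        names.extend(active_names(child))
    return names
```

In this model, calling `soft_delete` on a Cohort flags every Subject and Sample beneath it, while calling it on a single Subject leaves the Cohort and its other Subjects active.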
If you want an entry soft-deleted, please email immunology-support@allenimmunology.org with your request.