Use the Read Files SDK Method (Tutorial) - DRAFT

	Abbreviations Key
`bool`	boolean	IDE	integrated development environment
df	DataFrame	obj	object
`desc`	descriptor(s)	SDK	software development kit
`dict`	dictionary	`pd`	pandas
guid	globally unique identifier	`tmp`	temporary
HISE	Human Immune System Explorer	UUID	universally unique ID
`hp`	hisepy

At a Glance

This document explains how to use read_files() to download files to your HISE NextGen IDE.

Description

This function fetches HISE files and returns one or more hise_file objects when you pass in one of the following: a list of file IDs (file_list), a saved search ID (query_id), or a custom search query (query_dict). A hise_file object is a structured data container that holds file contents, metadata, and methods.

hp.read_files(file_list: list = None, query_id: list = None, query_dict: dict = None, to_df: bool = True)

The parameters for this method are listed in the following table. In each key:value pair of query_dict, the value must be of type list.

Parameter	Data type	Description
`file_list`	`list`	List of UUIDs to retrieve
`query_id`	`string`	Value of the queryID from an advanced search
`query_dict`	`dict`	Dictionary that allows users to submit a query
`to_df`	`bool`	Boolean determining whether the result is returned as a DataFrame
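
The parameter rules above can be checked before a call is made. The sketch below is not part of hisepy: build_read_files_kwargs is a hypothetical helper, and it assumes that exactly one of file_list, query_id, or query_dict should be supplied per call and that every value in query_dict must be a list. The fileType key and its value are placeholders for illustration only.

```python
def build_read_files_kwargs(file_list=None, query_id=None, query_dict=None, to_df=True):
    """Hypothetical helper: validate arguments before calling hp.read_files()."""
    sources = {
        "file_list": file_list,
        "query_id": query_id,
        "query_dict": query_dict,
    }
    provided = {name: value for name, value in sources.items() if value is not None}

    # Assumption: one source of files per call.
    if len(provided) != 1:
        raise ValueError(
            "Pass exactly one of file_list, query_id, or query_dict; "
            f"got {sorted(provided) or 'none'}"
        )

    if file_list is not None and not isinstance(file_list, list):
        raise TypeError("file_list must be a list of file IDs")

    if query_dict is not None:
        if not isinstance(query_dict, dict):
            raise TypeError("query_dict must be a dictionary")
        # Assumption: each value in query_dict must be a list.
        bad_keys = [key for key, value in query_dict.items() if not isinstance(value, list)]
        if bad_keys:
            raise TypeError(f"query_dict values must be lists; check keys: {bad_keys}")

    return {**provided, "to_df": to_df}


# Placeholder query; the key name and value are illustrative only.
kwargs = build_read_files_kwargs(query_dict={"fileType": ["example-type"]})
print(kwargs)
# In a HISE IDE you could then call: hp.read_files(**kwargs)
```

Catching a malformed argument locally gives a clearer message than a failed request, but the checks are only as good as the assumptions stated above.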

Instructions

Import libraries

To get started, set up your environment to interact with HISE programmatically and access all available SDK functions. For details, see Use HISE SDK Methods.

1. Navigate to HISE, and use your organizational email address to sign in.

2. Open an IDE. For instructions, see Create Your First HISE NextGen IDE (Tutorial).

3. For programmatic access to HISE functions and efficient handling of tabular data, import the Python SDK and the pandas library.

# Import hisepy and pandas
import hisepy as hp
import pandas as pd

Define file IDs

In this step, we define the file IDs for this notebook. For details, see Use Advanced Search (Tutorial).

1. Retrieve your own set of file IDs, and then define them as shown below. (The example below uses placeholder UUIDs. Replace them with your own.)

# Define the file IDs used in this analysis
FILEIDS = [
    '4551e620-48db-4328-a2b0-122730cd128d',
    '6417a4c5-098b-4d70-8c24-951e1c1c44ce',
]
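
Because file IDs are UUIDs, a quick format check can catch copy-and-paste mistakes before any request is sent. The following is a minimal sketch using Python's standard uuid module; find_malformed_ids is a hypothetical helper, not an SDK function, and it checks only the format of each ID, not whether the file exists in HISE.

```python
import uuid


def find_malformed_ids(file_ids):
    """Return the entries that are not canonical, hyphenated UUID strings."""
    malformed = []
    for file_id in file_ids:
        try:
            parsed = uuid.UUID(file_id)
        except (ValueError, AttributeError, TypeError):
            # Not parseable as a UUID at all (wrong length, bad characters, wrong type).
            malformed.append(file_id)
            continue
        # uuid.UUID also accepts unhyphenated or braced input,
        # so compare against the canonical hyphenated form.
        if str(parsed) != file_id.lower():
            malformed.append(file_id)
    return malformed


FILEIDS = [
    '4551e620-48db-4328-a2b0-122730cd128d',
    '6417a4c5-098b-4d70-8c24-951e1c1c44ce',
]
bad_ids = find_malformed_ids(FILEIDS)
print("Malformed IDs:", bad_ids)
```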

To see what's in a given key of the dictionary that read_files() returns (assigned to tmp in the next section), use the following format:

tmp['key']

For example, let's see what's in the descriptors key:

tmp['descriptors']

Return dictionary output and apply tabular format

When you call read_files() with to_df=True (the default), it returns a dictionary in which each key maps to a pandas DataFrame. Setting to_df=True arranges the data in a tabular format for easier analysis.

1. Pass your list of file IDs to read_files().

# Return dictionary output and print keys from read_files
tmp = hp.read_files(file_list=FILEIDS, to_df=True)
print("Type of tmp:", type(tmp))
print("Keys in tmp:", list(tmp.keys()))

The following output is returned:

# Shows the class of the returned object
Type of tmp: <class 'dict'>
# Lists all keys in the dictionary
Keys in tmp: ['descriptors', 'labResults', 'specimens', 'values', 'errors']

Key	Description
`descriptors`	Project, sample, or subject metadata
`labResults`	Test results and IDs
`specimens`	Status and info on biological specimens
`values`	Raw data metrics
`errors`	File retrieval errors, if any
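
Since the errors key reports file retrieval problems, it is worth checking before you work with the other keys. The sketch below assumes only that the value under errors supports len(), which holds for both lists and DataFrames; the exact shape of that value is not documented here. count_errors is a hypothetical helper, and the stand-in dictionary uses plain lists so the example runs outside HISE.

```python
def count_errors(result):
    """Return the number of entries under the 'errors' key (0 if absent or None)."""
    errors = result.get("errors")
    if errors is None:
        return 0
    return len(errors)


# Stand-in for the dictionary returned by hp.read_files();
# plain lists take the place of DataFrames here.
stand_in = {
    "descriptors": [{"example_field": "a"}],
    "labResults": [],
    "specimens": [],
    "values": [],
    "errors": [],
}

n_errors = count_errors(stand_in)
if n_errors:
    print(f"{n_errors} retrieval error(s) reported; inspect the 'errors' key before continuing")
else:
    print("No retrieval errors reported")
```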

Preview the data

Each key in the tmp dictionary represents a different dataset returned by hp.read_files(). The table above summarizes the content of each key.

1. For each key, use a loop to print the key name, value type, and a preview of the data.

for key, value in tmp.items():
    print(f"Key: {key}")
    print("Type of value:", type(value))
    # If it's a DataFrame, show the first few rows
    try:
        print(value.head())
    except AttributeError:
        # Non-DataFrame types have no head() method, so print them as-is
        print(value)
    print("-" * 40)

2. To see all column headers for a given dataset, use the following line.

print(tmp['descriptors'].columns)

3. To get a summary of the DataFrame, use the following line.

# info() prints its summary directly and returns None, so call it without print()
tmp['descriptors'].info()
print(tmp['descriptors'].describe(include='all'))
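
The preview steps above can also be condensed into one overview table. The following is a minimal sketch, assuming the returned dictionary maps keys to pandas DataFrames; overview is a hypothetical helper, and the stand-in data and its column names (example_id, example_note, example_metric) are invented for illustration and do not reflect real HISE columns.

```python
import pandas as pd


def overview(result):
    """Summarize each DataFrame in a read_files()-style dictionary."""
    rows = []
    for key, value in result.items():
        # Skip anything that is not a DataFrame (for example, an empty list).
        if isinstance(value, pd.DataFrame):
            rows.append({
                "key": key,
                "rows": value.shape[0],
                "columns": value.shape[1],
                "missing_cells": int(value.isna().sum().sum()),
            })
    return pd.DataFrame(rows, columns=["key", "rows", "columns", "missing_cells"])


# Stand-in for the dictionary returned by hp.read_files(); all values are invented.
stand_in = {
    "descriptors": pd.DataFrame({"example_id": ["a", "b"], "example_note": ["x", None]}),
    "values": pd.DataFrame({"example_metric": [1.0, 2.5, None]}),
    "errors": [],
}

summary = overview(stand_in)
print(summary)
```

In a HISE IDE, passing tmp instead of stand_in would give one row per DataFrame key, which makes empty or sparsely populated datasets easy to spot.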

Related Resources

Query SDK fileType

Use HISE SDK Methods

Create Your First HISE NextGen IDE (Tutorial)