Glossary

The following definitions should be considered within the context of the BRAINCommons™ (BC) platform.

BC Data Model: The data structure, including the variables, their possible values, and their relationships, that supports the BC platform. See the article Explore the Data Model for more information. 

Cohort: A group of cases (i.e., study participants) created by the user, based on specific criteria. The criteria are defined using filters which combine demographic, clinical, and project attributes. Each case comprising the cohort is accessed by a unique ID. The cohort can consist of cases from one or more projects, or it can even span the entire gallery.

A Cohort can be created on the Explore>Data page, where it can be saved and later exported to a BC-workspace. If Unstructured Data is associated with the cases in the Cohort, the corresponding manifest (a list of identifiers for the files that contain the data) can be exported directly to a BC-workspace and used to retrieve that data.
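
To make the idea of filter-based cohort selection concrete, here is a minimal sketch in plain Python. It does not use the BC platform or its SDK; the `Case` fields, the sample records, and the `build_cohort` helper are all hypothetical and only illustrate how demographic, clinical, and project criteria might combine to select case IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    # Hypothetical attributes; the real BC Data Model defines its own.
    case_id: str    # unique ID for the case
    project: str    # project attribute
    age: int        # demographic attribute
    diagnosis: str  # clinical attribute


def build_cohort(cases, filters):
    """Return the IDs of the cases that satisfy every filter."""
    return [c.case_id for c in cases if all(f(c) for f in filters)]


cases = [
    Case("case-001", "project-a", 67, "parkinsons"),
    Case("case-002", "project-a", 45, "control"),
    Case("case-003", "project-b", 72, "parkinsons"),
]

# Each filter is a predicate over one case; a case must pass all of them.
filters = [
    lambda c: c.age >= 60,
    lambda c: c.diagnosis == "parkinsons",
]

cohort_ids = build_cohort(cases, filters)
print(cohort_ids)
```

In this sketch the resulting cohort contains cases from two different projects, which mirrors the point that a cohort is not limited to a single project.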

Data: There are two types of data within the BC platform:

  • Structured: Data that was mapped to the BC Data Model during Data Curation. It can be queried within a BC-workspace.
  • Unstructured: Data that was not mapped to the BC Data Model during Data Curation, but is accessible via the BC-workspace and is described within the BC structured data.

Data Contributor: The organization that submits data associated with a study, represented as a dataset, to the BC platform.

Data Curation: The process of deciding and implementing how a new dataset is brought into the BC platform. The following actions can occur:
  • The new data are compared to the BC Data Model as it is currently defined.
  • For data brought into BC as structured data, the model is updated with any additions the new data require, and the complete dataset is then mapped to the updated model. Consideration is given to existing BC datasets and how revisions to the data model will affect them.
  • For data brought in as unstructured data, corresponding structured data are created, added to the model, and used as pointers to the unstructured data files, such as images. Because the files themselves are not mapped to the model, they are also called unmapped data. The unmapped data reside in an Amazon Web Services cloud storage resource known as an “S3 bucket.”

Download: The action of copying data outside of the BC platform. Both structured and unstructured data can be downloaded after approval. Structured data is downloaded in JSON or TSV format, and the download can be performed at the project level using the Project Gallery. Unstructured data is downloaded in the format submitted by the data contributor.
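
Once structured data has been downloaded as JSON or TSV, it can be read with standard tooling. The sketch below uses only the Python standard library; the column names and sample contents are hypothetical, and the actual layout of downloaded files may differ.

```python
import csv
import io
import json


def load_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def load_json(text):
    """Parse JSON text into Python objects."""
    return json.loads(text)


# Hypothetical sample contents standing in for downloaded files.
tsv_text = "case_id\tage\ncase-001\t67\ncase-002\t45\n"
json_text = '[{"case_id": "case-001", "age": 67}, {"case_id": "case-002", "age": 45}]'

tsv_rows = load_tsv(tsv_text)
json_rows = load_json(json_text)

# TSV values are always read as strings; JSON keeps numeric types.
print(tsv_rows[0]["age"], type(tsv_rows[0]["age"]).__name__)
print(json_rows[0]["age"], type(json_rows[0]["age"]).__name__)
```

One practical difference between the two formats is visible here: values read from TSV need explicit type conversion, while JSON preserves numbers as numbers.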

Filter: One or more attributes of the study or study participant, available in the BRAINCommons User Interface (UI), that the user selects to narrow a search to a cohort of participants who meet specific criteria.

Export: Within the UI, the action of pushing a set of case identifiers to a BC-workspace so that data can be retrieved programmatically in that environment for further analysis. Exporting into a BC-workspace does not create copies of the actual data; it creates copies of the case identifiers, with only a subset of properties/attributes for each case. The user can access structured and unstructured data relating to the specific cases (i.e., the cohort) using these case identifiers.
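
The distinction between exporting identifiers and copying data can be sketched as follows. This is an illustration only, not the platform's export mechanism: the record fields and the `export_cohort` helper are hypothetical, and which properties accompany each case identifier in a real export is an assumption.

```python
# Hypothetical full case records as they might exist inside the platform.
cases = [
    {"case_id": "case-001", "project": "project-a", "age": 67, "diagnosis": "parkinsons"},
    {"case_id": "case-003", "project": "project-b", "age": 72, "diagnosis": "parkinsons"},
]


def export_cohort(cases, keep=("case_id", "project")):
    """Return only the identifier and a small subset of properties per case."""
    return [{key: case[key] for key in keep} for case in cases]


exported = export_cohort(cases)
print(exported)
```

The exported records carry enough information to look each case up later, while the underlying data stays where it is until it is imported.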

Import: While within the BC-workspace, the action of retrieving data. See the articles Querying the Database using the SDK and Working with unstructured data in a Workspace for further information.

Manifest: A list of unique identifiers that link unstructured data files (stored in an S3 bucket) to the cases that comprise a cohort. A manifest can be exported to a BC-workspace and used to import the corresponding unstructured files, with either the BC-Client command-line interface or the BRAINCommons SDK, for further analysis. A sample manifest is shown below:
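
As a rough illustration of how a manifest links files to cases, the following sketch models a manifest as a list of records and groups file identifiers by case. The field names `file_id` and `case_id`, the sample values, and the `files_by_case` helper are hypothetical; this is not the platform's actual manifest format.

```python
from collections import defaultdict

# Hypothetical manifest entries: each links one file identifier to one case.
manifest = [
    {"file_id": "file-aaa", "case_id": "case-001"},
    {"file_id": "file-bbb", "case_id": "case-001"},
    {"file_id": "file-ccc", "case_id": "case-003"},
    {"file_id": "file-ddd", "case_id": "case-999"},
]


def files_by_case(manifest, cohort_ids):
    """Group file identifiers by case, keeping only cases in the cohort."""
    wanted = set(cohort_ids)
    grouped = defaultdict(list)
    for entry in manifest:
        if entry["case_id"] in wanted:
            grouped[entry["case_id"]].append(entry["file_id"])
    return dict(grouped)


cohort_files = files_by_case(manifest, ["case-001", "case-003"])
print(cohort_files)
```

Grouping by case makes it easy to see which unstructured files would need to be retrieved for a given cohort before any file is actually imported.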

Project: A collection of data that was generated or obtained during a study or series of studies. Projects are described by metadata that include, for example, their size, objectives, sponsor, and location. The Project Gallery is the collection of all the projects residing within the BC platform (see the article Using the Project Gallery).

Save to Workspace: The action of storing data or files in a BC-workspace directory for further analysis.

Study Participant: A patient, or a healthy control volunteer, offering his or her time and effort in the search for increased knowledge in preventing, treating, or palliating disease.