Define a Search and Save a Cohort

Introduction

The BRAINCommons™ Explore Data feature lets you search the Project Gallery (the collection of studies) through a filter-based user interface (UI). You define your search criteria by selecting filters that represent clinical attributes such as gender, race, diagnosis, and medications.

This article has three sections:

  • Introduction – Overview of the Explore Data functionality.
  • Define Your Search Using Filters – Detailed instructions on how to search the Project Gallery using the UI.
  • References – Deep dives into topics to enhance the effectiveness of your search.

Note

Before reading this article, it helps to be familiar with the following definitions in the Glossary:

  • BC-workspace
  • Cohort
  • Filter
  • Manifest
  • Structured data
  • Unstructured data

To reach the Explore Data landing page, select Explore > Data in the navigation pane on the left-hand side of the screen, as shown below:

Upon entry to the page, notice the initial filter, the total number of cases shown, and the charts, which reflect the unfiltered contents of the Project Gallery.

Charts

There are several charts on this page. For a given cohort, they depict the following clinical attributes:

  • Primary diagnosis
  • Comorbidities
  • Age at enrollment by gender
  • Cases with unstructured data (e.g., image files or wearable data files)

The following bar chart shows the proportion of cases in each category of primary diagnosis. Each color represents a different primary diagnosis, and the colors map to the legend displayed below the chart.

The pie chart shown below displays the number of cases with associated comorbidities, grouped by MedDRA System Organ Class. A case may be linked to more than one comorbidity, so the sum of the cases in this chart can far exceed the total number of cases in the relevant cohort.
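The counting effect described above can be illustrated with a minimal sketch. The case IDs and comorbidity assignments below are invented for illustration and do not come from the Project Gallery; the sketch only assumes that a case is counted once in every group it is linked to.

```python
from collections import Counter

# Hypothetical cases: each made-up case ID maps to the comorbidity
# groups that the case is linked to.
cases = {
    "case-1": ["Nervous system disorders", "Psychiatric disorders"],
    "case-2": ["Nervous system disorders"],
    "case-3": ["Psychiatric disorders", "Cardiac disorders", "Nervous system disorders"],
}

# Count each case once per group it belongs to (assumed chart behavior).
slice_counts = Counter(group for groups in cases.values() for group in set(groups))

total_cases = len(cases)
sum_of_slices = sum(slice_counts.values())

# The slices add up to more than the number of cases in the cohort.
print(total_cases, sum_of_slices)
```

Here three cases produce slice counts that sum to six, because two of the cases appear in more than one slice.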

The following mirror chart shows the distribution of patients in the selected cohort by gender and by age group at enrollment.

The bar chart shown below displays the number of unstructured files linked to cases in the selected cohort, grouped by data type.

As you select and apply filters, the charts are dynamically adjusted to reflect the current cohort.

Define Your Search Using Filters

To proceed, follow the steps below.

  1. Locate the Explore menu in the navigation pane on the left-hand side of the page and select Data. The Explore Data page appears.

    Notice the Filters section and the total number of cases before filtering.

    Note: The charts on this page reflect the contents of all unfiltered data available in the Project Gallery.

  2. Select the filters for your search.
    1. Click the down arrow. A drop-down containing the filtering options appears, as shown below.

    2. Select the desired filter. A drop-down containing the attributes based on that filter appears.
      Note: If you want data from specific projects, you must choose the Project filter.
    3. Click the drop-down and select the relevant attributes.
    4. Repeat steps 1–3 for each filter your search needs. There is no limit to the number of filters you can apply.
      Important!

      Filter selections within each drop-down are joined with a Boolean "OR" operator, whereas filters on different rows are joined with a Boolean "AND" operator. You can find more information about this in the References section of this article.

  3. Click Apply. The Explore Data page refreshes to reflect the data from the cohort you created. Notice in the following screenshots how the number of cases changes.

    Filters are selected but not applied:

    Filters are applied:

    The charts are redrawn to reflect the current cohort.

    Note

    BRAINCommons contains only de-identified patient data. However, to further protect patient confidentiality and to ensure that a patient's identity cannot be derived by combining data, charts that contain a small number of cases (50 or fewer) from projects you do not have access to are not displayed. Instead, a message explains that you are trying to view restricted data.

  4. Save your cohort.
    1. Click Save Cohort. A window asking you for the cohort name appears.
    2. Enter the name and click Save Cohort. Your resulting dataset is saved in My Stuff, as shown below. For more information see Managing Your Cohorts.
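The chart-visibility rule described in the note under step 3 can be sketched as a small function. This is one reading of that note, not the platform's implementation: the function name, its parameters, and the constant are hypothetical, and the sketch assumes the rule is evaluated per chart from a case count and an access flag.

```python
# Threshold taken from the note in step 3 (charts with 50 or fewer cases).
SMALL_COUNT_THRESHOLD = 50


def chart_is_displayed(case_count: int, has_project_access: bool) -> bool:
    """Return True if a chart may be shown under the assumed rule.

    Hypothetical helper: a chart is hidden only when it covers a small
    number of cases AND the user lacks access to the underlying projects.
    """
    if has_project_access:
        return True
    return case_count > SMALL_COUNT_THRESHOLD


print(chart_is_displayed(50, has_project_access=False))
print(chart_is_displayed(51, has_project_access=False))
print(chart_is_displayed(10, has_project_access=True))
```

Under this reading, a count of exactly 50 without project access is still hidden, because the note says "50 or less."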

References

What is a Cohort?

A cohort is a dataset that meets certain criteria. Defining a cohort is a fundamental skill for using the BC platform, so let's take a deep dive with an example. For this discussion, think of a cohort as a music playlist: we can search on several different properties of songs to create playlists. The following table contains the criteria for a few searches and the results of those searches:

Search 1
  Song Filter Criteria:
    Artist: Beyoncé
    Album: "I am Sasha Fierce"
  Resulting Playlist:
    • If I Were a Boy
    • Halo
    • Disappear
    • Broken-Hearted Girl
    • Ave Maria
    • Satellites
    • Single Ladies
    • Diva
    • Sweet Dreams
    • Video Phone
    • Why Don't You Love Me
  BRAINCommons Equivalent:
    Searched for:
    • All cases in a specific study

Search 2
  Song Filter Criteria:
    Artist: Beyoncé
    Album: "I am Sasha Fierce"
    Ranking: #1 single
  Resulting Playlist:
    • Halo
  BRAINCommons Equivalent:
    Searched for:
    • A specific case in a specific study

Search 3
  Song Filter Criteria:
    Artist: Madonna
    Album: All albums
    Music Genre: Opera
  Resulting Playlist:
    Zero results
  BRAINCommons Equivalent:
    Searched for:
    • Specific research organization
    • Across all their studies
    • Looking for a specific value of a single property
    Results? No cases exist

Search 4
  Song Filter Criteria:
    Music Genre: Country
    Year: 2021
    Ranking: Top-10 single
    Ranked by: Billboard
  Resulting Playlist:
    • Famous Friends
    • Forever after All
    • What's Your Country Song
    • Single Saturday Night
    • Better Together
    • Gone
    • Drinkin' Beer
    • Just the Way
    • Almost Maybes
    • We Didn't Have Much
  BRAINCommons Equivalent:
    Searched for:
    • Specific values of four properties across all studies in the Project Gallery
Now that we have a cohort, let's look at the rendering of a single case in JSON, as shown in the following illustration:

Relating this back to our playlist analogy, this single case is like the single song "Halo" from search #2. The project_id can be thought of as the album name and the submitter_id as the artist; together they uniquely identify the case. The property 'ethnicity' can be seen as the music genre, for example. Note that the property 'comorbidity' holds several values, which are a subset of all possible values.

In essence, a cohort is simply a subset of the available data that meets certain criteria.
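As a rough illustration of the case structure discussed above, the sketch below builds a stand-in case record and prints it as JSON. Only the property names project_id, submitter_id, ethnicity, and comorbidity come from this article; the 'gender' key, every value, and the case_key helper are hypothetical and do not reflect real BRAINCommons data.

```python
import json

# Hypothetical case record; every value here is invented for illustration.
case = {
    "project_id": "example-project",      # the "album" in the playlist analogy
    "submitter_id": "example-case-001",   # the "artist" in the playlist analogy
    "gender": "Female",
    "ethnicity": "Not reported",
    "comorbidity": ["Tremor", "Insomnia"],  # one property holding several values
}


def case_key(record: dict) -> tuple:
    """Identify a case by its project_id / submitter_id pair."""
    return (record["project_id"], record["submitter_id"])


print(json.dumps(case, indent=2))
print(case_key(case))
```

A cohort can then be pictured as the list of such records that satisfy the applied filters.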

How Boolean Operators Work in Your Search

OR Operator

Items selected within a drop-down are joined with the OR Boolean operator.
The rule is "OR makes more." Let's take a closer look at this.

In the following example, we filter by project and select a project that contains 2816 cases. After we click Apply, that quantity appears next to the Save Cohort button.

Now let's select another project, this one containing 800 cases. This amount is added to our original total of 2816.

This means the search criteria are "Mjff-LRRK2C OR Mjff-DATATOP." After we click Apply, the new total of 3616 appears, as expected.

So, we have shown that "OR makes more."
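The arithmetic of this example can be sketched with sets, treating OR as a union. The case IDs are stand-ins; only the counts 2816 and 800 come from the example. The sketch assumes that every case belongs to exactly one project, which is why the two counts simply add.

```python
# Stand-in case IDs; only the sizes (2816 and 800) come from the example.
first_project_cases = {f"first-project:{i}" for i in range(2816)}
second_project_cases = {f"second-project:{i}" for i in range(800)}

# OR within one filter behaves like a set union: a case is kept if it
# belongs to either selected project.
cohort = first_project_cases | second_project_cases

print(len(cohort))  # 3616
```

If the two selections could share cases, the union would be smaller than the plain sum, but it would never be smaller than the larger of the two selections.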

AND Operator

The rows of filters are joined with the AND Boolean operator. AND gives you fewer search results. This differs from everyday language, where "and" adds two concepts together, as in "cats and dogs."

Taking a closer look at this, let's start with our first filter. Note that when it is applied, we have 3448 cases:

Let's select our second filter, keeping in mind that filters on different rows are joined with AND:

The search criteria are "Female AND Tremor." As you can see, when the second filter is applied, our search results are reduced to 20 cases, demonstrating that the Boolean AND reduces our results.
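Both rules can be combined in a minimal sketch of a filter matcher: values within one filter row are joined with OR, and rows are joined with AND. The matches function, the attribute names 'gender' and 'symptom', and the three sample cases are hypothetical; they illustrate the logic only and are not the platform's query engine.

```python
def matches(case: dict, filters: dict) -> bool:
    """Return True if the case satisfies every filter row (AND), where a
    row is satisfied by any one of its selected values (OR)."""
    for attribute, selected in filters.items():
        value = case.get(attribute)
        # A property may hold one value or a list of values.
        values = value if isinstance(value, list) else [value]
        if not any(v in selected for v in values):
            return False
    return True


# Hypothetical cases, invented for illustration.
cases = [
    {"gender": "Female", "symptom": ["Tremor"]},
    {"gender": "Female", "symptom": ["Rigidity"]},
    {"gender": "Male", "symptom": ["Tremor", "Rigidity"]},
]

one_row = {"gender": {"Female"}}
two_rows = {"gender": {"Female"}, "symptom": {"Tremor"}}

after_one = [c for c in cases if matches(c, one_row)]
after_two = [c for c in cases if matches(c, two_rows)]

print(len(after_one), len(after_two))  # 2 1
```

Adding the second row can only keep or shrink the result, never grow it, which mirrors the drop from 3448 to 20 cases in the example above.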