Uploading Data

The gen3-client provides an easy-to-use command-line interface for uploading data files to, and downloading them from, the BRAIN Commons using the terminal or command prompt.

This guide has the following sections:

  1. Installation Instructions
  2. Configure a Profile with Credentials
  3. Upload Data Files

1) Installation Instructions


The gen3-client can be downloaded from GitHub for Windows, Linux, or Mac OS, or it can be installed from source using Google’s Go language (instructions are in the GitHub README).

To install, download the correct version for your operating system and unzip the archive. The program must be executed from the command line by running the command gen3-client <options>. For more detailed instructions, see the section below for your operating system.

Note: Do not try to run the program by double-clicking on it. Instead, execute the program from within the shell / terminal / command prompt. The program does not currently provide a graphical user interface (GUI), so commands are sent by typing them into the terminal.

Mac OS X / Linux Installation Instructions

  1. Download the latest Mac OS X or Linux version of the gen3-client here.
  2. Unzip the archive.
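  3. Move the unzipped executable into a directory, for example ~/.gen3, so that the executable is located at ~/.gen3/gen3-client.
  4. Open a terminal window.
  5. Add the directory containing the executable to your PATH environment variable by entering this command in the terminal:
     echo 'export PATH=$PATH:~/.gen3' >> ~/.bash_profile
     If your shell is zsh (the default on recent versions of macOS), append the line to ~/.zshrc instead. Open a new terminal window afterwards so that the change takes effect.

Now you can execute the program by opening a terminal window and entering the command gen3-client.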

Windows Installation Instructions

  1. Download the Windows version of the gen3-client here.
  2. Unzip the archive.
  3. Move the unzipped executable into a directory, for example C:\Program Files\gen3-client, so that the executable is located at C:\Program Files\gen3-client\gen3-client.exe.
  4. Open the Start Menu and type “edit environment variables”.
  5. Open the option “Edit the system environment variables”.
  6. In the “System Properties” window that opens up, on the “Advanced” tab, click on the “Environment Variables” button.
  7. In the box labeled “System Variables”, find the “Path” variable and click “Edit”.
  8. In the window that pops up, click “New”.
  9. Type in the full directory path of the executable file (e.g., C:\Program Files\gen3-client).
  10. Click “OK” in all of the open windows. If a command prompt is already open, close it and open a new one (enter cmd into the Start Menu and press Enter) so that the updated Path takes effect.

View the Help Menu

The tool can now be run on the command line in your terminal or command prompt by typing gen3-client. Typing this alone displays the help menu.

Notes about Working in the Shell

File Paths

When you create or download a file on your computer, that file is located in a folder (or directory) in your computer’s file system. For example, if you create the text file example.txt in your documents folder, the “full path” of that file might be C:\Users\Bob\My Documents\example.txt in Windows or /Users/Bob/Documents/example.txt in Mac OS X.

Present Working Directory

After opening a shell, command prompt, or terminal window, you are “in” a folder known as the “present working directory”. You can change directories with the cd <directory> command in either shell. To view your present working directory, enter the command echo $PWD in a Mac or Linux terminal, or cd alone in the Windows command prompt.

You can list the contents of your present working directory by entering the command ls in the Mac or Linux terminal or dir in the Windows command prompt. Commands can refer to files in the present working directory by filename alone: for example, cat example.txt prints the contents of the file example.txt in the Mac terminal if your present working directory is /Users/Bob/Documents. If you are in a different directory, you must enter the “full path” of the file: for example, if your present working directory is the Downloads folder instead of My Documents, you would enter the command type "C:\Users\Bob\My Documents\example.txt" to print the file’s contents in the Windows command prompt.
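The relationship between a filename, the present working directory, and the full path can be sketched in a few lines of Python. This is only an illustration: the helper name full_path is made up for this example, and the directories are the sample ones used above.

```python
import os


def full_path(filename, working_directory=None):
    """Combine a working directory and a filename into a full path.

    If `filename` is already a full path, it is returned unchanged.
    """
    base = os.getcwd() if working_directory is None else working_directory
    return os.path.normpath(os.path.join(base, filename))


if __name__ == "__main__":
    # The present working directory of this Python process.
    print(os.getcwd())
    # A bare filename is resolved relative to the present working directory.
    print(full_path("example.txt"))
```

On Mac or Linux, full_path("example.txt", "/Users/Bob/Documents") gives /Users/Bob/Documents/example.txt, which is the path a command would need if it were run from any other directory.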

Updating the PATH Environment Variable

When working in your shell, you can define variables that help make work easier. One such variable is PATH, which is a list of directories where executable programs are located. By adding a folder to the PATH, programs in that folder can be executed from any other folder/directory regardless of the present working directory.

So, by adding the directory containing the gen3-client program to your PATH variable, you can run it from any working directory without specifying the “full path” of the program. Simply enter the command gen3-client, and you will run the program.

Note: If you have not added the client to your PATH, the program can still be executed from any directory with the following command: /full/path/to/executable/gen3-client <options>. If you are working in the directory that contains the executable on Mac or Linux, you can replace /full/path/to/executable/ with ./ so that the command becomes ./gen3-client <options>.
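If the gen3-client command is not found, it can help to check whether any directory on the PATH actually contains the executable. The following is a minimal sketch using Python's standard shutil.which function; the helper names find_on_path and path_directories are made up for this example.

```python
import os
import shutil


def find_on_path(program, path=None):
    """Return the full path of `program` if a PATH directory contains it, else None."""
    return shutil.which(program, path=path)


def path_directories(path=None):
    """Split a PATH-style string into its individual directories."""
    raw = os.environ.get("PATH", "") if path is None else path
    return [d for d in raw.split(os.pathsep) if d]


if __name__ == "__main__":
    location = find_on_path("gen3-client")
    if location:
        print("gen3-client found at:", location)
    else:
        print("gen3-client is not on the PATH. Directories searched:")
        for directory in path_directories():
            print("  ", directory)
```

If the directory where you placed the executable is missing from the printed list, the PATH change has not taken effect in the current terminal session.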

Sending Parameters to Programs on the Command Line

Most programs require some sort of user input to run properly. Some programs will prompt you for input after execution, while other programs are sent this input during execution as “arguments” or “options”. The gen3-client uses the latter method of sending user input as command arguments during program execution.

For example, when configuring a profile with the client (details are in the following section), the user must specify the configure command and also specify the profile name, API endpoint, and credentials file by adding the flags --profile, --apiendpoint, and --cred to the end of the command (see the next section for specific examples).


2) Configure a Profile with Credentials


Before the gen3-client can be used to upload or download data, it needs to be configured with API credentials downloaded from the user’s BRAIN Commons Profile (via the Windmill data portal):

  1. To download the “credentials.json” file from the BRAIN Commons, log in to the portal, click “Profile” in the top navigation bar, and create an API key. In the popup window that confirms the API key has been created, click the “Download json” button to save the API key to your local machine.


  2. From the command line, run the gen3-client configure command with the --profile, --cred, and --apiendpoint arguments (see examples below).

Example Usage:

gen3-client configure --profile=<profile_name> --cred=<credentials.json> --apiendpoint=<api_endpoint_url>  

Mac/Linux: 
gen3-client configure --profile=bob --cred=/Users/Bob/Downloads/credentials.json --apiendpoint=https://data.braincommons.org 

Windows: 
gen3-client configure --profile=bob --cred=C:\Users\Bob\Downloads\credentials.json --apiendpoint=https://data.braincommons.org

When successfully executed, this command creates a configuration file in the user folder. The file contains the API keys and URLs associated with each configured commons profile:

Mac/Linux:
/Users/Bob/.gen3/config 

Windows: 
C:\Users\Bob\.gen3\config

NOTE: These keys must be treated like important passwords; never share the contents of the credentials.json or gen3-client config file!


3) Upload Data Files


When data files are uploaded to a Gen3 data commons’ object storage, they should be registered and assigned a unique, 128-bit ID called a ‘GUID’. GUIDs are generated on the back-end, not submitted by users, and they are stored in the property object_id.

When using the gen3-client upload command, a random GUID will be generated and assigned to each data file that is submitted.
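To make the “128-bit” description concrete, the sketch below parses the GUID from the sample upload output in this guide with Python's standard uuid module. It assumes that GUIDs are formatted like that sample (32 hexadecimal digits in five hyphen-separated groups); the helper looks_like_guid is made up for this example and is not part of the gen3-client.

```python
import uuid

# GUID copied from the sample upload output in this guide.
sample_guid = "65f5d77c-1b2a-4f41-a2c9-9daed5a59f14"

parsed = uuid.UUID(sample_guid)

# 32 hexadecimal digits x 4 bits each = 128 bits.
bits = len(parsed.hex) * 4
print(bits)            # 128
print(parsed.version)  # 4, the randomly generated kind of UUID


def looks_like_guid(text):
    """Return True if `text` parses as a 128-bit UUID-style identifier."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True
```

A check like this can be useful when copying GUIDs by hand, since a dropped or extra character makes the string fail to parse.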

Example Usage:

For uploading a single file: 
gen3-client upload --profile=<profile_name> --upload-path=<path_to_files/data.bam> 

For uploading all files within a folder: 
gen3-client upload --profile=<profile_name> --upload-path=<path_to_files/folder/> 

The upload path can also contain wildcard patterns, such as: 
gen3-client upload --profile=<profile_name> --upload-path=<path_to_files/folder/*> 

Or: 
gen3-client upload --profile=<profile_name> --upload-path=<path_to_files/*/folder/*.bam>

Example output:

gen3-client upload --profile=bob --file=test.gif
Uploading data ...
test.gif  3.64 MiB / 3.64 MiB [==========================================================================================] 100.00%
Successfully uploaded file "test.gif" to GUID 65f5d77c-1b2a-4f41-a2c9-9daed5a59f14.
1 files uploaded.
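Before uploading with a wildcard pattern, it can be worth previewing which local files the pattern selects. The sketch below uses Python's standard glob module. It is an assumption, not something this guide confirms, that the gen3-client expands patterns in the same way, so treat the result as an approximation. The helper name preview_matches and the pattern are placeholders.

```python
import glob
import os


def preview_matches(pattern):
    """Return the sorted list of regular files matched by a wildcard pattern."""
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


if __name__ == "__main__":
    # Placeholder pattern; replace it with your own upload path.
    for path in preview_matches("path_to_files/*/folder/*.bam"):
        print(path)
```

Each * matches within a single directory level, so a pattern of the form path_to_files/*/folder/*.bam selects .bam files in a folder named "folder" under every immediate subdirectory of path_to_files.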

Options and User Input Flags

| Flag name | Required? | Default value | Explanation | Sample usage |
| --- | --- | --- | --- | --- |
| profile | Yes | N/A | The profile name from the config file that the user wishes to use. | --profile=Bob |
| upload-path | Yes | N/A | The file, or the directory containing the file(s), to be uploaded. | --upload-path=../data_folder/ |
| batch | No | false | If set to `true`, gen3-client will upload multiple files simultaneously. The maximum number of files that can be uploaded at the same time is specified by the `numparallel` option. | --batch=true |
| numparallel | No | 3 | Number of uploads to run in parallel. Must be used together with the `batch` option. | --numparallel=5 |
| include-subdirname | No | false | Include subdirectory names in the file name. | --include-subdirname=true |
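When uploads are scripted, the flags listed above can be assembled programmatically. The following is a minimal sketch: the function build_upload_command is made up for this example, it only builds the argument list (it does not run the client), and it uses only the flag names and defaults listed in this section.

```python
def build_upload_command(profile, upload_path, batch=False, numparallel=3,
                         include_subdirname=False):
    """Assemble a gen3-client upload command as a list of arguments."""
    command = [
        "gen3-client", "upload",
        f"--profile={profile}",
        f"--upload-path={upload_path}",
    ]
    if batch:
        # numparallel is only meaningful together with batch.
        command.append("--batch=true")
        command.append(f"--numparallel={numparallel}")
    if include_subdirname:
        command.append("--include-subdirname=true")
    return command


if __name__ == "__main__":
    # Prints the command line that would be run for a batch upload.
    print(" ".join(build_upload_command("bob", "../data_folder/",
                                        batch=True, numparallel=5)))
```

The resulting list could be passed to Python's subprocess.run to execute the upload, provided that gen3-client is installed and on the PATH.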

Local Submission History

When uploading, the application keeps track of which local files have already been submitted, to avoid potential duplication in submissions. This information is kept in a JSON file in the same user folder as the config file, for example:

Mac/Linux:
/Users/Bob/.gen3/<your_config_name>_history.json 

Windows: 
C:\Users\Bob\.gen3\<your_config_name>_history.json

Each entry in the history JSON file is a key/value pair that maps the full path of a local file to the GUID associated with it.

Example of a history JSON File:

{   
"/Users/Bob/test.gif":"65f5d77c-1b2a-4f41-a2c9-9daed5a59f14" 
}
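Because the history file is plain JSON, the GUID recorded for a given file can be looked up with a short script. The sketch below assumes the flat path-to-GUID layout shown in the example above. The helper names are made up for this example, and the file name bob_history.json is a placeholder: substitute the actual name of your history file.

```python
import json
import os


def load_history(history_path):
    """Return the {file path: GUID} mapping, or an empty dict if the file is missing."""
    if not os.path.exists(history_path):
        return {}
    with open(history_path, encoding="utf-8") as handle:
        return json.load(handle)


def lookup_guid(history, file_path):
    """Return the GUID recorded for the full path `file_path`, or None."""
    return history.get(file_path)


if __name__ == "__main__":
    # Placeholder location; adjust it to match your own history file.
    path = os.path.join(os.path.expanduser("~"), ".gen3", "bob_history.json")
    history = load_history(path)
    print(lookup_guid(history, "/Users/Bob/test.gif"))
```

The keys are full paths, so the lookup must use the same full path that was recorded at upload time; a bare filename such as test.gif would not match.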