Data Management

Using the data management tools, you can access and manage the output files from sequencing runs and flow executions. Data management interfaces are accessible through different tabs on the Run Details pages. The Run Files tab shows the output files from a sequencing run. Other tabs show files from the executions of flows, such as the FASTQ Files tab for Bases2Fastq executions.

Within the data management interfaces, you can perform the following actions:

  • Download output files from a sequencing run or flow execution.
  • Access flow execution details and output files, including log files.
  • Reexecute a flow using a new run manifest or execution settings.
  • Stop a flow execution that is in progress.

Compatibility

Data management is available for runs and flow executions that use one of the following providers as a storage connection:

File Availability

The availability of a file depends on the file retention status, the cloud bucket settings, and the type of associated run activity.

  • Sequencing files populate throughout a run and are in sync with the run status on the instrument.
  • FASTQ files and other analysis output files become available when a flow execution reaches a final status.
  • Available files depend on the object storage class for files in the cloud bucket. For details on file availability with ElemBio Catalyst, see Data Storage Retention.
  • Archived files are visible but unavailable for download. To make archived files available again, contact Element Technical Support.

Downloading Files

The data management tools in ElemBio Cloud let you download a single file, multiple files, or all files from a sequencing run or flow execution.

  • To download a single file, select the Download icon in the Actions column.
  • To download multiple selected files, use the interface to select files and generate a dynamic download script.
  • To download all files at a prefix, use the AWS CLI or generate a download script.

For information on output files, see Run Output Files and the Bases2Fastq Documentation.

Note:

The Copy URI icon in the Actions column lets you copy the Uniform Resource Identifier (URI) for a single file. You can use the URIs to individually download files using the AWS CLI.
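As a minimal sketch of a single-file download by URI, the function below wraps aws s3 cp. It assumes the copied URI has the s3:// form and that AWS credentials are already configured as described in the AWS CLI section. The function name download_uri and the example URI in the comment are hypothetical.

```shell
# Download one object given the URI copied with the Copy URI icon.
# Usage: download_uri "s3://example-bucket/runs/example/file.fastq.gz" [destination]
download_uri() {
  uri="$1"
  dest="${2:-.}"   # default to the current directory
  case "$uri" in
    s3://*) aws s3 cp "$uri" "$dest" ;;
    *) echo "Not an S3 URI: $uri" >&2; return 1 ;;
  esac
}
```

The prefix check only guards against pasting something other than an S3 URI; it does not verify that the object exists or that the credentials grant access to it.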

Download with Script

ElemBio Cloud lets you download multiple selected files or all files at a prefix using a download script. The script uses the curl command-line tool and relies on presigned URLs that are dynamically generated.

The script contains a metadata header with execution instructions and presigned URLs for the selected files. Executing the script creates a directory on your system and downloads files into it.

Before executing the script, make sure to meet the following requirements:

  • Install curl v8.4 or later, or update an older installation. To check the version, use the command curl -V.
  • If you are using Windows OS, use the curl.exe command, and execute the command in Windows Command Prompt or PowerShell.
  • Execute the script before the presigned URLs expire. The URLs expire 7 days after creation, and the time of expiration appears in the script header. If you do not complete the download within 7 days, you must generate a new script to download the files.
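The curl version requirement above can be checked automatically on macOS or Linux. The following is a sketch, not part of ElemBio Cloud: it assumes a sort that supports the -V (version sort) option and that the first line of curl -V output starts with "curl" followed by the version number. The function names are hypothetical.

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  # sort -V orders version strings, so the smaller one comes first.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print the version number from the first line of `curl -V`.
curl_version() {
  curl -V | head -n 1 | awk '{print $2}'
}

# Report whether the installed curl meets the 8.4 minimum.
check_curl() {
  if version_ge "$(curl_version)" "8.4"; then
    echo "curl version is sufficient"
  else
    echo "curl is older than 8.4; update before running the script" >&2
    return 1
  fi
}
```

Run check_curl in the terminal before executing the download script. The sketch does not apply to Windows Command Prompt, where the curl.exe command is used.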

Download Multiple Selected Files

  1. On the Run Details page, select the tab with the files you want to download.
  2. Select the checkboxes for the files you want to download.
  3. Select Download.
  4. Review the number of files and total download size to make sure you selected the correct files.
  5. Select Download curl config file and review the file in a text editor.
  6. Open a terminal window for the CLI on your OS and navigate to the location of the curl config file.
  7. In ElemBio Cloud, select the tab for your OS: MacOS/Linux or Windows.
  8. Copy the curl command and run it in the CLI.
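Before running the copied command, it can help to compare the number of URLs in the config file with the file count shown in the interface. The sketch below assumes the config file lists each presigned URL on a line of the form url = "...", which is standard curl config syntax but is not confirmed for this file. The function name is hypothetical.

```shell
# Count lines that define a URL in a curl config file.
# Assumption: each download is listed as `url = "..."`.
count_config_urls() {
  grep -c -E '^[[:space:]]*url[[:space:]]*=' "$1"
}
```

If the count does not match the number of files reviewed in step 4, inspect the file in a text editor before downloading.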

Download All Files with the Download Script

  1. On the Run Details page, select the tab with the files you want to download.
  2. Select Download All.
  3. Wait for the interface to prepare the download, and then select Continue to Download.
  4. Select Download with a script.
  5. Review the number of files and total download size to make sure all necessary files are selected.
  6. Select Download curl config file and review the file in a text editor.
  7. Open a terminal window for the CLI on your OS and navigate to the location of the curl config file.
  8. In ElemBio Cloud, select the tab for your OS: MacOS/Linux or Windows.
  9. Copy the curl command and run it in the CLI.

Download with AWS CLI Command

ElemBio Cloud offers an option to use an AWS CLI command to download all files at a prefix. The command uses 36-hour temporary credentials that the user interface provides. Before you download data with the AWS CLI, you must configure three credential variables:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN

To set the credential variables, configure your environment using one of the following methods:

  • Set credentials as local variables that the AWS CLI reads automatically.
  • Set credentials using the AWS configuration and credential file with the aws configure command.
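As a sketch of the second method, the function below stores the temporary credentials in a named profile with aws configure set. The function name and the profile name ebc-temp in the usage comment are hypothetical, and the values come from the credentials that ElemBio Cloud provides.

```shell
# Store temporary credentials in a named AWS CLI profile.
# Usage: set_temp_profile ebc-temp <access-key-id> <secret-access-key> <session-token> <region>
set_temp_profile() {
  profile="$1"
  aws configure set aws_access_key_id "$2" --profile "$profile"
  aws configure set aws_secret_access_key "$3" --profile "$profile"
  aws configure set aws_session_token "$4" --profile "$profile"
  aws configure set region "$5" --profile "$profile"
}
```

Commands that should use these credentials then need the matching --profile option, for example --profile ebc-temp. Because the credentials are temporary, the profile must be updated with new values after they expire.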

Download All Files with the AWS CLI

Select the tab for your preferred method of credential configuration.

When you set the credential variables in the local terminal environment, the AWS CLI automatically detects them as known environment variables. Linux and macOS shells use the export command, while Windows Command Prompt uses the set command to set the variables.

  1. On the Run Details page, select the tab with the files you want to download.
  2. Select Download All.
  3. Wait for the interface to prepare the download, and then select Continue to Download.
  4. Select Download with a CLI.
  5. Review the number of files and total download size to make sure all necessary files are selected.
  6. Open a terminal window to access the AWS CLI.
  7. Set the credentials provided by ElemBio Cloud. Use the commands for your OS.
Setting Example Credentials in Linux OS
export AWS_ACCESS_KEY_ID=ASIA52AWRNEXAMPLE
export AWS_SECRET_ACCESS_KEY=R8GTtGx7WwzQ9L1WQbPnHLEXAMPLE
export AWS_SESSION_TOKEN=R8GTtGx7WwzQ9L1EXAMPLE_SESSION_TOKEN_FROM_EBC
export AWS_DEFAULT_REGION=us-west-2
Setting Example Credentials in Windows OS
set AWS_ACCESS_KEY_ID=ASIA52AWRNEXAMPLE
set AWS_SECRET_ACCESS_KEY=R8GTtGx7WwzQ9L1WQbPnHLEXAMPLE
set AWS_SESSION_TOKEN=R8GTtGx7WwzQ9L1EXAMPLE_SESSION_TOKEN_FROM_EBC
set AWS_DEFAULT_REGION=us-west-2
  8. Use the env command filtered with grep to verify that you set the credentials as expected. In Windows Command Prompt, use set AWS instead.

    If the credentials are set as expected, the CLI lists all the variables.

Verifying Example Credentials
env | grep AWS
AWS_ACCESS_KEY_ID=ASIA52AWRNEXAMPLE
AWS_SECRET_ACCESS_KEY=R8GTtGx7WwzQ9L1WQbPnHLEXAMPLE
AWS_SESSION_TOKEN=R8GTtGx7WwzQ9L1EXAMPLE_SESSION_TOKEN_FROM_EBC
AWS_DEFAULT_REGION=us-west-2
  9. Copy the aws command in ElemBio Cloud and run it in the CLI to download the dataset.

    The copied command downloads the files to the current directory in the CLI. To change the download location, replace the . in the command with the path to the preferred folder.

Example AWS CLI Command
aws s3 cp --recursive s3://elembio-quality-reads-inc-usw2-7b02-d-runs/runs/{Run Name}/ .
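To illustrate changing the download location, the sketch below wraps the copied command in a small function that creates the target folder first and passes any extra options through to aws s3 cp. The function name and the folder in the usage comment are hypothetical. The --dryrun option of aws s3 cp lists the planned operations without transferring data.

```shell
# Download everything under an S3 prefix into a chosen folder.
# Usage: download_prefix "s3://<bucket>/runs/<Run Name>/" ./run_data --dryrun
download_prefix() {
  prefix="$1"
  dest="$2"
  shift 2            # remaining arguments are extra aws s3 cp options
  mkdir -p "$dest" && aws s3 cp --recursive "$@" "$prefix" "$dest"
}
```

Running it once with --dryrun and then again without it is a low-cost way to confirm the prefix and destination before a large transfer.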

If you encounter an error when using the AWS CLI command after setting local variables, use the env command to confirm that the environment variables are set as expected. For more information on environment variables, see Environment variables to configure the AWS CLI.
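The check can also be scripted so that secret values are not printed to the terminal. This is a minimal sketch for macOS or Linux shells, with a hypothetical function name. It reports which of the three required variables are missing from the exported environment.

```shell
# Verify that the three credential variables are exported and non-empty,
# without echoing their values.
check_aws_env() {
  missing=""
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
    # env lists only exported variables, which is what the AWS CLI sees.
    env | grep -q "^${name}=." || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "Not set:$missing" >&2
    return 1
  fi
  echo "All three credential variables are set"
}
```

A passing check only confirms that the variables exist. It does not detect expired or mistyped credentials.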

Flow Executions

In the tabs for flow executions, such as the FASTQ Files tab, you can reexecute a flow or stop an in-progress execution. Using the Execution ID drop-down menu, you can access files and information from previous executions.

Reexecute a FASTQ Flow

Reexecuting a FASTQ flow allows you to resolve errors, address quality issues, apply new optional arguments, or use a corrected run manifest.

  1. On the Run Details page, select FASTQ Files.
  2. Select Reexecute.
  3. When prompted, select the run manifest for the reexecution.
    1. Select Browse, and then browse to the run manifest file.
    2. Select the run manifest file, and then select OK.
  4. Enter any optional arguments in the Parameters field.

    For example, to add the QC only mode optional argument, enter --qc-only.

  5. Select Execute Flow.