Flows
With verified workflows, Element builds and maintains the analysis pipeline for you, which simplifies analysis setup. When you add a flow in ElemBio Cloud, the configured workflow values, such as inputs, outputs, and parameters, can be reused across executions, and the flow can initiate analysis automatically or manually when a run completes.
Requirements
When you use an analysis flow in ElemBio Cloud, a connection to a cloud provider is required for storage and compute activities. Before you create a flow, connect your account to at least one cloud provider. Provider compatibility depends on the flow that you set up. ElemBio Catalyst, a subscription-based native cloud analysis solution within ElemBio Cloud, includes customizable basic flows that Element sets up for you.
The following providers allow you to configure flows:
Add a Flow
To reuse a workflow configuration across many analysis executions, add a flow. When you execute a flow, the global parameters that are set in the configuration are applied automatically unless you override them at launch. Assignment flows can be set to launch manually or automatically when a run completes. All other flows can only launch manually.
To add a flow from the Workflow Library page, select Add a Flow from a workflow card, or select Add a Flow from the Flows page.
Assignment Workflows (Basic)
- Select Add a Flow from an Assignment workflow card, such as Bases2Fastq or Cells2Stats.
- Select the compute provider that will execute the flow.
Additional form fields might appear based on the compute provider that you selected.
- Enter a unique name for the flow.
- (Optional) Enter a description for the flow.
- In the Workflow Version drop-down menu, select Always use latest, or select a specific workflow version.
- When you select Always use latest, the flow automatically upgrades to the most recent version of the workflow. This option can cause executions to use different workflow versions.
- When you select a specific workflow version, the flow is always executed with the same version and stability between executions is maintained.
- (Optional) If required by your provider, enter a workflow ID:
- For AWS Compute, enter a Shared Workflow ID.
- For DNAnexus Compute, enter a Workflow ID.
- Select a storage connection that you want to use for Output Storage.
When analysis completes, the provider writes output data to the bucket for this storage connection.
- Select one of the following Trigger options:
- Automatically on completed runs automatically executes the assignment flow when a run completes.
- Manually trigger on runs requires you to manually start executions from the Analysis Executions tab on Run Details pages.
- If a flow is set to automatically trigger, then select a storage connection for Input Storage.
When a run completes, the provider reads input data from the bucket for this storage connection.
- Review the Workflow Terms of Use, and then select the checkbox to agree.
- Select Next.
- (Optional) In the Parameters step, enter optional parameters in the form fields to apply them globally to all flow executions.
- Parameters can be overridden at the time of execution.
- Leave a parameter empty to keep the default value.
- To view a complete list of available workflow parameters, see the corresponding workflow in the Analysis Workflow section.
- Select Save.
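The precedence rules above (saved flow values apply globally, launch-time values override them, and empty fields keep the defaults) can be sketched as follows. This is a hypothetical illustration, not ElemBio Cloud code; the parameter names and the `resolve_parameters` helper are invented for the example.

```python
def resolve_parameters(flow_defaults, launch_overrides):
    """Return the effective parameters for one execution.

    A value entered at launch wins over the flow's saved value; a
    parameter left empty (None) keeps the saved or workflow default.
    """
    effective = dict(flow_defaults)
    for name, value in launch_overrides.items():
        if value is not None:  # empty launch fields fall back to the default
            effective[name] = value
    return effective

# Saved with the flow and applied to every execution unless overridden:
flow_defaults = {"num-threads": 8, "filter-mask": None}

# Entered in the launch wizard for this one execution:
launch_overrides = {"num-threads": 16}

print(resolve_parameters(flow_defaults, launch_overrides))
# {'num-threads': 16, 'filter-mask': None}
```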
Secondary Analysis Workflows (Premium)
- On the Workflow Library page, select Add a Flow from a Secondary Analysis Workflow card, such as Sentieon Germline DNA or Parabricks Germline DNA.
- Select the compute provider that will execute the flow. Additional form fields appear based on the compute provider that you select.
- Enter a unique name for the flow.
- (Optional) Enter a description for the flow.
- In the Workflow Version drop-down menu, select Always use latest, or a specific workflow version.
- Always use latest automatically upgrades the flow to the most recent version of the workflow. This option might cause executions to use different workflow versions.
- When you select a specific workflow version, the flow always executes with the same version and maintains stability between executions.
- Select a storage connection for Output Storage.
- When analysis completes, the provider writes output data to the bucket for this storage connection.
- Note: This flow is manually triggered and takes FASTQ files as input, which you select at execution time.
- Select Next to move to the Parameters step.
- (Optional) In the Parameters step, you can apply optional parameters globally to all flow executions. Enter parameters in the form fields or edit them as JSON.
- Parameters may be overridden at time of execution.
- Leave a parameter empty to keep the default value.
- To view a list of workflow parameters, see the Analysis Workflow section.
- Select Save.
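The JSON view of the Parameters step might look like the following sketch. The parameter names and values are invented for illustration; see the workflow's parameter list in the Analysis Workflow section for the actual names.

```json
{
  "reference-genome": "hg38",
  "emit-all-variants": false,
  "target-regions": null
}
```

A null (empty) value keeps the workflow default, and any value saved here can still be overridden at execution time.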
Flow Statuses
After you add a flow, a card appears with a summary of the saved configurations on the Flows page. A badge on flow cards indicates one of the following connection statuses:
- Connected: ElemBio Cloud successfully verified the flow.
- Partially Connected: ElemBio Cloud only completed some of the verifications for the flow, the associated provider, and the providers for any associated storage connections. Review the connection status of the associated providers to identify the issue.
- Unverified: ElemBio Cloud cannot verify the flow and the associated features. To resolve, verify the associated providers for the flow and the associated storage connections. Review any error messages that appear.
Managing Flows
The following buttons on the flow cards allow you to manage flows. The available actions are based on your user permissions.
- Edit: Modifies the flow settings.
- Changes take effect immediately and only apply to future executions.
- To apply changes to previous or in-progress executions, relaunch the flow.
- Delete: Removes the flow permanently from ElemBio Cloud. When prompted, enter the name of the flow to confirm deletion.
- Verify: Verifies the permissions for the flow.
- If the flow is successfully connected, a green success message appears.
- If a red error message appears, review the error message to address the issue.
Launch a Flow
To manually launch a flow, select the Launch Analysis button on the flow card, or from the Analysis Executions tab on a Run Details page.
Launch Assignment Workflows (Basic)
- Select Launch Analysis.
- In step 1 of the wizard, name the execution, and then select the assignment flow to launch.
- The cloud provider, compute connection, and output location for the flow appear in a Flow Details table for quick reference.
- Select Next to move to Step 2: Inputs, where you will enter details about the run directory to use as input.
- Select a run name (if you launch from a Run Details page, then the run name is auto-selected).
- In the Run Manifest field, choose to use the original run manifest from the completed run, or upload a corrected run manifest CSV file.
- Select Next to move to Step 3: Parameters, where you can view a list of customizable workflow parameters. Parameters are available in form or JSON views and default values from the saved flow are automatically applied.
- When you launch a Bases2Fastq or Cells2Stats flow, you can modify any optional parameters for the flow execution.
- Select Start Analysis to launch the flow.
- A new execution starts and is accessible from the Executions table.
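A corrected run manifest is a CSV file with the same structure as the original manifest from the run, with the erroneous values fixed. The sketch below assumes a minimal [SAMPLES] section with invented sample names and index sequences; refer to the run manifest documentation for the authoritative schema.

```csv
[SAMPLES],,,
SampleName,Index1,Index2,Lane
Sample_1,ATCACGTT,AGCGCTAG,1
Sample_2,CGATGTTT,GATATCGA,1
```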
Launch Secondary Analysis Workflows (Premium)
- Select the Launch Analysis option.
- In step 1 of the wizard, name the execution, and then select the secondary analysis flow to launch.
- The cloud provider, compute connection, and output location for the flow appear in a Flow Details table for quick reference.
- Select Next to move to Step 2: Inputs, where you will curate a list of sample FASTQ files to use as input.
- Select a Bases2Fastq execution to start from.
- If you launch from a Run Details page, then the latest Bases2Fastq execution is auto-selected.
- You can add more than one Bases2Fastq execution in the auto-complete field.
- Select Load Samples to load a table of FASTQ files based on the specified executions.
- From the Actions column, select the Sample Name field to edit a sample name.
- From the Actions column, select the trashcan icon to delete a sample row.
- After the sample list is ready, select Next to move to Step 3: Parameters, where you can view a list of customizable workflow parameters. Parameters are available in form or JSON views and default values from the saved flow are automatically applied.
- Select Start Analysis to launch the flow.
- A new execution starts and is accessible from the Executions table.
Relaunch a Flow
To resolve errors, address quality issues, apply new optional parameters, or use a corrected run manifest, relaunch a flow. After an execution completes, you can relaunch it from the Actions column of the All Executions table, or from the top of an Execution Details page.
- To relaunch a flow, use one of the following locations:
- Execution Overview section - This section is located at the top of the Execution Details page and displays a Relaunch button after an execution completes.
- All Executions table - This table offers a relaunch option in the Actions column of each row.
- Select the Relaunch option.
- From the Launch wizard form, follow the steps to adjust the inputs and parameters that are autofilled with values from the reference execution.
- Select Start Analysis to relaunch.
- A new execution starts and is accessible from the Executions table.