Amazon Web Services
ElemBio Cloud integrates with AWS as a cloud service provider. The AWS provider enables the following tasks:
- Stream data directly from your AVITI to your own AWS Simple Storage Service (S3) bucket through a storage connection.
- Generate FASTQ files in the cloud through a verified Bases2Fastq flow using Amazon HealthOmics.
- Customize automation with the AWS product suite to analyze data.
- Manage data through data browsing and file download.
- Share data by granting read-only access to an AWS S3 bucket through bucket policies.
Requirements
An AWS provider must have access to an AWS account. To integrate with AWS, the account must have the following components:
- An AWS S3 bucket: The bucket stores output files from runs on AVITI Systems and analysis executions in ElemBio Cloud.
- An authorizing IAM credential: An IAM credential is necessary to authorize actions that ElemBio Cloud performs with your AWS account. To integrate with AWS, you must set up either an IAM role or an IAM user.
- An IAM policy that grants access: The policy defines the permissions granted to ElemBio Cloud to complete actions. You must associate the IAM policy with your IAM credential. Some permissions are optional depending on the connected services.
If you plan to connect an AWS provider with Amazon HealthOmics, make sure to fulfill the additional requirements for Amazon HealthOmics.
Setting Up an AWS S3 Bucket
If you do not already have an AWS S3 bucket, you must create one. An AWS S3 bucket serves as the input and output for run and analysis activities and enables you to stream run data off the instrument. Element recommends the following settings for your bucket:
- ACLs disabled
- Public access blocked
- Default encryption enabled
Consult your IT representative to confirm the appropriate settings for your lab and determine appropriate encryption. Default encryption in transit and at rest protects the run, which includes genomic data. Bucket versioning and tags are not necessary for uploading runs. You cannot rename buckets. Selecting a region close to you increases the data transfer speed.
For more information on setting up an AWS S3 bucket, see Creating a Bucket in the AWS Documentation.
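The recommended settings can also be applied programmatically. The following is a minimal sketch, assuming an existing bucket, the boto3 library, and SSE-S3 (AES256) as the default encryption; the function names are hypothetical, and your lab may require a different encryption type.

```python
import json


def recommended_bucket_settings() -> dict:
    """Return request payloads for the three recommended bucket settings."""
    return {
        # ACLs disabled: bucket-owner-enforced object ownership.
        "ownership_controls": {
            "Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]
        },
        # Public access blocked: all four block settings turned on.
        "public_access_block": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
        # Default encryption enabled: SSE-S3 is an assumption, not a requirement.
        "encryption": {
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    }


def apply_recommended_settings(bucket_name: str) -> None:
    """Apply the settings to an existing bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3")
    settings = recommended_bucket_settings()
    s3.put_bucket_ownership_controls(
        Bucket=bucket_name, OwnershipControls=settings["ownership_controls"]
    )
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration=settings["public_access_block"],
    )
    s3.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration=settings["encryption"],
    )


if __name__ == "__main__":
    print(json.dumps(recommended_bucket_settings(), indent=2))
```

Calling `apply_recommended_settings("my-bucket")` with credentials that can modify the bucket would apply all three settings; confirm the values with your IT representative first.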
Authorizing ElemBio Cloud to Access Your Account
When ElemBio Cloud completes actions on your behalf, AWS generates temporary credentials through the IAM credential to authorize ElemBio Cloud to complete actions. The IAM credential enables ElemBio Cloud to operate on your AWS account as limited by the permissions granted in the associated IAM policy.
Choose one of the following two options to authorize ElemBio Cloud to access your accounts:
- Option 1: Create an IAM role. An IAM role is an identity in your account with specific permissions assigned and is not associated with a specific user. Roles do not have long-term credentials.
- Option 2: Create an IAM user. An IAM user is an identity in your account that enables the creation of access key and secret access key credentials with specific permissions assigned. Access keys are long-term credentials.
For stronger security, use an IAM role as your AWS credential, as the IAM role does not use long-term credentials. If you use an IAM user, regularly rotate the access keys.
Creating an IAM Role Credential
To create an IAM role for ElemBio Cloud, see Creating an IAM Role in the AWS Documentation and apply the following requirements:
- For the trusted entity type, select Custom trust policy.
- Associate the role with the following Trust Relationship policy.
- Replace the required External-ID-Example with the external ID of your choice. The external ID can include alphanumeric characters and the special characters @:,=-./_. Spaces are not permitted.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": ["arn:aws:iam::588258415937:root"]
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "External-ID-Example"
        }
      }
    }
  ]
}
- Give the IAM role a name that is clearly associated with Element instruments, such as Element-ServiceUser.
- After creating the role, edit the role and create an inline policy using the IAM Role policy template. Update the template with your bucket name and restrictions.
- After creating the role, use the AWS Console to set a maximum session duration of 12 hours (43,200 seconds).
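The trust policy above can be generated rather than edited by hand. The following sketch builds the same document for a chosen external ID and rejects values outside the character set listed in this guide. The function name and the validation pattern are illustrative assumptions, and AWS may enforce additional limits, such as length.

```python
import json
import re

# Principal taken from the trust policy shown above.
ELEMBIO_PRINCIPAL = "arn:aws:iam::588258415937:root"

# Characters this guide lists as valid: alphanumerics plus @ : , = - . / _
_EXTERNAL_ID_PATTERN = re.compile(r"[A-Za-z0-9@:,=\-./_]+")


def build_trust_policy(external_id: str) -> dict:
    """Return the trust relationship policy for the given external ID."""
    if not _EXTERNAL_ID_PATTERN.fullmatch(external_id):
        raise ValueError(
            "External ID may contain only alphanumeric characters and @:,=-./_"
        )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": [ELEMBIO_PRINCIPAL]},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # "my-lab_2024" is a placeholder external ID.
    print(json.dumps(build_trust_policy("my-lab_2024"), indent=2))
```

The printed JSON can be pasted into the custom trust policy editor when creating the role.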
Creating an IAM User Credential
To create an IAM user for ElemBio Cloud, see Creating IAM Users in the AWS Documentation and apply the following requirements:
- Give the IAM user a name that is clearly associated with Element instruments, such as Element-ServiceUser.
- Leave the option for AWS management console access unselected.
- After creating the user, edit the user and create an inline policy using the IAM User policy template. Update the template with your bucket name and restrictions.
- Create an access key through the AWS console.
- Select third-party service for the access key use case. Copy the access and secret access keys to use for adding the storage connection.
- Download the .csv file that is generated, or save your keys to a secure location.
Granting Access through IAM Policies
AWS grants access to actions through IAM policies, which determine what actions are allowed or denied on AWS resources. You must associate the policy you create with the IAM credential that authorizes ElemBio Cloud access.
For more information on JSON policies in AWS, see Creating IAM Policies in the AWS Documentation.
JSON Policy Templates
The following JSON policy templates can be used to create inline policies for IAM roles or users during configuration. The templates include both required and optional permissions that the temporary credentials grant to ElemBio Cloud. To limit the permissions of the IAM policy, update the template for your bucket and your planned activities in ElemBio Cloud.
Two templates follow. The first applies to an IAM role. The second applies to an IAM user and adds the sts:GetFederationToken permission.
IAM Role template
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3ObjectWrite",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    },
    {
      "Sid": "S3ObjectRead",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    },
    {
      "Sid": "S3ObjectListing",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Sid": "S3BucketLevelOperations",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetBucketPolicy",
        "s3:PutBucketPolicy",
        "s3:DeleteBucketPolicy"
      ],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Sid": "STSOperations",
      "Effect": "Allow",
      "Action": ["sts:GetCallerIdentity"],
      "Resource": "*"
    },
    {
      "Sid": "OmicsOperations",
      "Effect": "Allow",
      "Action": ["omics:GetWorkflow", "omics:StartRun", "omics:GetRun"],
      "Resource": "*"
    },
    {
      "Sid": "OmicsPassRole",
      "Effect": "Allow",
      "Action": ["iam:PassRole"],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": "omics.amazonaws.com"
        }
      }
    }
  ]
}
IAM User template
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3ObjectWrite",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::BUCKET_NAME/*"
    },
    {
      "Sid": "S3ObjectListing",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Sid": "S3BucketLevelOperations",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetBucketPolicy",
        "s3:PutBucketPolicy",
        "s3:DeleteBucketPolicy"
      ],
      "Resource": "arn:aws:s3:::BUCKET_NAME"
    },
    {
      "Sid": "STSOperations",
      "Effect": "Allow",
      "Action": ["sts:GetCallerIdentity", "sts:GetFederationToken"],
      "Resource": "*"
    },
    {
      "Sid": "OmicsOperations",
      "Effect": "Allow",
      "Action": ["omics:GetWorkflow", "omics:StartRun", "omics:GetRun"],
      "Resource": "*"
    },
    {
      "Sid": "OmicsPassRole",
      "Effect": "Allow",
      "Action": ["iam:PassRole"],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "iam:PassedToService": "omics.amazonaws.com"
        }
      }
    }
  ]
}
Update the JSON Policy Template
- Copy the template policy into the JSON section when creating an IAM policy.
- In all "Resource" sections, replace BUCKET_NAME with your bucket name. Make sure to keep /* after the bucket names for the S3ObjectWrite and S3ObjectRead permissions.
- If your bucket uses a prefix, make the following additional updates to the policy:
  - For the S3ObjectWrite and S3ObjectRead permissions, add the prefix to the bucket name in both "Resource" sections, as in the following example:
    "Resource": "arn:aws:s3:::BUCKET_NAME/OPTIONAL_PREFIX/*"
  - After the "Resource" for the S3ObjectListing permission, add a comma and the following "Condition". Replace OPTIONAL_PREFIX with the prefix.
    "Condition": {
      "StringLike": {
        "s3:prefix": [
          "OPTIONAL_PREFIX/*"
        ]
      }
    }
- If you do not want to include optional permissions, remove them from the template.
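The update steps above can be scripted. The following is a minimal sketch, assuming the template is loaded as a Python dictionary; `customize_policy` and `TEMPLATE_EXCERPT` are hypothetical names, and the excerpt holds only the three S3 object statements to keep the example short.

```python
import copy
import json

# Excerpt of the policy template: only the S3 object and listing statements.
TEMPLATE_EXCERPT = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "S3ObjectWrite", "Effect": "Allow",
         "Action": ["s3:PutObject"], "Resource": "arn:aws:s3:::BUCKET_NAME/*"},
        {"Sid": "S3ObjectRead", "Effect": "Allow",
         "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::BUCKET_NAME/*"},
        {"Sid": "S3ObjectListing", "Effect": "Allow",
         "Action": ["s3:ListBucket"], "Resource": "arn:aws:s3:::BUCKET_NAME"},
    ],
}

# Statements whose "Resource" takes the prefix.
OBJECT_SIDS = {"S3ObjectWrite", "S3ObjectRead"}


def customize_policy(template, bucket, prefix=None, drop_sids=()):
    """Return a copy of the template updated for a bucket and optional prefix."""
    policy = copy.deepcopy(template)  # leave the caller's template untouched
    prefix = prefix.strip("/") if prefix else None
    kept = []
    for stmt in policy["Statement"]:
        sid = stmt.get("Sid")
        if sid in drop_sids:
            continue  # remove optional permissions that are not wanted
        # Replace the placeholder; a "*" resource is left unchanged.
        stmt["Resource"] = stmt["Resource"].replace("BUCKET_NAME", bucket)
        if prefix and sid in OBJECT_SIDS:
            stmt["Resource"] = f"arn:aws:s3:::{bucket}/{prefix}/*"
        elif prefix and sid == "S3ObjectListing":
            stmt["Condition"] = {"StringLike": {"s3:prefix": [f"{prefix}/*"]}}
        kept.append(stmt)
    policy["Statement"] = kept
    return policy


if __name__ == "__main__":
    # "my-bucket" and "runs" are placeholder values.
    print(json.dumps(customize_policy(TEMPLATE_EXCERPT, "my-bucket", "runs"), indent=2))
```

Passing `drop_sids={"S3ObjectRead"}` removes the optional read statement, which mirrors the last step in the list.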
Policy Permissions
Permission | Requirement | Purpose | Required For
---|---|---|---
s3:GetBucketLocation | Required | Determines the region where a bucket resides | Data uploads from an instrument
s3:ListBucket | Required | Lists objects in the bucket as needed by the upload mechanism | Data uploads from an instrument and data browsing in ElemBio Cloud
s3:PutObject | Required | Performs single and multipart uploads | Data uploads from an instrument
sts:GetCallerIdentity | Required | Verifies credentials using the current user or role name | Data uploads from an instrument
sts:GetFederationToken | Required for IAM User only | Allows for the creation of limited temporary credentials for an IAM user | Data uploads from an instrument and the creation of temporary credentials for the AWS CLI
omics:GetWorkflow | Optional | Retrieves the details of a HealthOmics Ready2Run workflow | Creation and use of an AWS HealthOmics verified Bases2Fastq flow
omics:StartRun | Optional | Starts a HealthOmics Ready2Run workflow | Creation and use of an AWS HealthOmics verified Bases2Fastq flow
iam:PassRole | Optional | Passes the execution role to the HealthOmics Ready2Run workflow commands for execution permissions | Creation and use of an AWS HealthOmics verified Bases2Fastq flow
omics:GetRun | Optional | Retrieves details of a HealthOmics run | Creation and use of an AWS HealthOmics verified Bases2Fastq flow
s3:GetBucketPolicy | Optional | Retrieves the bucket policy of an S3 bucket | Data sharing through a bucket-level policy
s3:PutBucketPolicy | Optional | Applies a bucket policy to an S3 bucket | Data sharing through a bucket-level policy
s3:DeleteBucketPolicy | Optional | Deletes the bucket policy of an S3 bucket | Data sharing through a bucket-level policy
s3:GetObject | Optional | Allows for the retrieval of objects from the data browser in AWS | Use of presigned URLs for file downloads from data browsing in ElemBio Cloud
Using an IAM user requires the sts:GetFederationToken permission as a security measure.
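Before attaching an edited policy, it can help to confirm that no required permission from the table above was removed. The following sketch reports required actions that no Allow statement lists explicitly. The function names are hypothetical, the check assumes "Statement" is a list, and it does not expand wildcards such as s3:*.

```python
# Required permissions from the table above.
REQUIRED_ACTIONS = {
    "s3:GetBucketLocation",
    "s3:ListBucket",
    "s3:PutObject",
    "sts:GetCallerIdentity",
}
# Required only when the credential is an IAM user.
USER_ONLY_ACTIONS = {"sts:GetFederationToken"}


def allowed_actions(policy: dict) -> set:
    """Collect lowercased actions from every Allow statement in the policy."""
    actions = set()
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        value = stmt.get("Action", [])
        if isinstance(value, str):  # "Action" may be a string or a list
            value = [value]
        actions.update(action.lower() for action in value)
    return actions


def missing_required_actions(policy: dict, credential_type: str = "role") -> list:
    """Return the required actions that the policy does not explicitly allow."""
    required = set(REQUIRED_ACTIONS)
    if credential_type == "user":
        required |= USER_ONLY_ACTIONS
    granted = allowed_actions(policy)
    # Action names are compared case-insensitively.
    return sorted(action for action in required if action.lower() not in granted)


if __name__ == "__main__":
    # A deliberately incomplete example policy.
    example = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:PutObject"], "Resource": "*"},
            {"Effect": "Allow", "Action": "s3:ListBucket", "Resource": "*"},
        ],
    }
    print(missing_required_actions(example, "user"))
```

An empty list means every required permission for the chosen credential type is present.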
Data Analysis with Amazon HealthOmics
The AWS provider in ElemBio Cloud enables you to connect an analysis flow with workflows available in Amazon HealthOmics. These workflows give you access to the range of analysis options that Amazon HealthOmics provides.
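For the Bases2Fastq Flow, Element provides two ready-to-use options:
- The AWS HealthOmics Ready2Run Bases2Fastq workflow
- A private Bases2Fastq workflow shared by Element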
Requirements
If you plan to create a verified flow that uses Amazon HealthOmics, you must fulfill additional requirements beyond the AWS Provider requirements.
- The IAM policy associated with the IAM role or user for the provider must include all optional policy permissions for Amazon HealthOmics.
- You must create an execution role, a separate IAM role.
- The execution role uses a different IAM policy than the policy for the provider.
- In addition to the permissions policy, the execution role uses a trust relationship policy.
Creating an Execution Role
To integrate a verified flow with Amazon HealthOmics, you must create an additional IAM role known as the service role or execution role. When setting up a verified flow, enter the Amazon Resource Name (ARN) for the execution role.
While creating the execution role, complete the following requirements:
- Associate the execution role with the following Trust Relationship policy.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "omics.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
- Associate the execution role with the following inline IAM policy, replacing BUCKET_NAME with your bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LogsAndECR",
      "Effect": "Allow",
      "Action": [
        "cloudwatch:*",
        "ecr:GetDownloadUrlForLayer",
        "logs:*",
        "ecr:BatchGetImage"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObjectAcl",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObjectAcl"
      ],
      "Resource": ["arn:aws:s3:::BUCKET_NAME/*", "arn:aws:s3:::BUCKET_NAME"]
    }
  ]
}
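The execution role can also be created with a script. The following sketch assumes the boto3 library and credentials that are allowed to create IAM roles. The role name, policy name, and function names are hypothetical; the two documents mirror the trust relationship policy and inline policy shown above.

```python
import json


def execution_role_documents(bucket: str):
    """Return (trust_policy, inline_policy) for the HealthOmics execution role."""
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "omics.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    inline_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "LogsAndECR",
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:*",
                    "ecr:GetDownloadUrlForLayer",
                    "logs:*",
                    "ecr:BatchGetImage",
                ],
                "Resource": "*",
            },
            {
                "Sid": "S3",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObjectAcl",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:PutObjectAcl",
                ],
                "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
            },
        ],
    }
    return trust_policy, inline_policy


def create_execution_role(bucket: str, role_name: str = "Element-HealthOmics-Execution") -> str:
    """Create the role and attach the inline policy; return the role ARN."""
    import boto3  # imported lazily so the builder above works without boto3

    trust_policy, inline_policy = execution_role_documents(bucket)
    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,  # hypothetical name; choose your own
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="ElementHealthOmicsExecution",  # hypothetical policy name
        PolicyDocument=json.dumps(inline_policy),
    )
    return role["Role"]["Arn"]
```

The ARN that `create_execution_role` returns is the value to enter as the execution role ARN in ElemBio Cloud.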
AWS HealthOmics Ready2Run Bases2Fastq Flow
Element has worked with AWS to publish the Bases2Fastq Ready2Run workflow on AWS HealthOmics. The following characteristics apply to the workflow:
- Only supports Bases2Fastq v1.4.0.
- Provides three workflow options:
- Bases2Fastq for 2x75
- Bases2Fastq for 2x150
- Bases2Fastq for 2x300
- Is restricted to certain regions. See the AWS documentation for the most recent available AWS HealthOmics region list.
Private Bases2Fastq Workflow Shared by Element
To work around compatibility limitations of the Bases2Fastq Ready2Run workflow, you can connect a Bases2Fastq Flow to a private workflow that Element shares with you. The following characteristics apply to the workflow:
- Only supports Bases2Fastq v2.0.0.
- Does not support projects. The workflow always applies the --no-projects optional argument.
- Is only available in the region where the share originates.
To obtain access to the private workflow:
- Contact Element Technical Support and ask for Element to share the Bases2Fastq private workflow. In the email, make sure to include your AWS Account ID and your AWS region of operation.
After Element receives and processes this information, Element shares the private Bases2Fastq workflow with your AWS account.
- Accept the workflow share in the AWS console.
- Copy the Resource ID value from the shared workflow, and add it to your Bases2Fastq flow in ElemBio Cloud as the Workflow ID.
Configuring ElemBio Cloud
After fulfilling the provider requirements, you can complete the following tasks to configure ElemBio Cloud. The available actions depend on your user permissions.
Add an AWS Provider
- Review the requirements for an AWS provider.
- On the Providers page, select Amazon Web Services.
- Enter a unique name.
- Select the region associated with the bucket.
If you do not see the region you need, contact Element Technical Support.
- Select the applicable credential type.
- For Role, enter the role ARN and external ID.
- For Access Keys, enter the access key and secret key.
- Select Save.
Add an AWS Storage Connection
A storage connection enables you to transfer data to the AWS bucket you own. Before you create a storage connection, you must set up the provider, including credentials.
- On the Storage page, select Add Storage.
- Select the AWS provider for the storage connection.
- Enter a unique name for the storage connection.
- Select the region associated with the bucket in the Region list.
If you do not see the region you need, contact Element Technical Support.
- Enter the Bucket Name.
- If applicable, enter a Prefix that indicates the folder structure for run data.
- Select the Use for Run Upload setting.
- If you enable the setting, the storage connection becomes available on instruments associated with your organization.
- If you disable the setting, the storage connection is unavailable on instruments. You can only use the storage connection for ElemBio Cloud activities, such as verified flows.
- Select the Use for Data Exploration setting.
- If you enable the setting, the storage connection is available for data browsing.
- If you disable the setting, the storage connection is unavailable for data browsing.
- Select Save.
Add an AWS Compute Connection
- On the Compute Connections page, select Add Compute.
- Select the provider for the compute connection.
- Enter a unique name for the compute connection.
- Enter the execution role ARN.
- Select the region for the compute connection.
You must select a compatible region for AWS HealthOmics. If you do not see the region you need, contact Element Technical Support.
- Select Save.
Add an AWS Bases2Fastq Flow
The Ready2Run Bases2Fastq workflow only supports Bases2Fastq v1.4.0, which is not compatible with Cloudbreak or Cloudbreak Freestyle chemistries. To continue using Amazon HealthOmics with these chemistries, you can instead connect to a private Bases2Fastq workflow shared by Element.
- On the Flows page, select Bases2Fastq Flow.
- Select the AWS provider whose IAM policy includes the necessary permissions for Amazon HealthOmics.
- Enter a unique name for the flow.
- Enter the execution role ARN.
- Enter the Workflow ID.
- For the Ready2Run Bases2Fastq workflow, leave this field blank.
- For the privately shared Bases2Fastq workflow, enter the Resource ID from the shared workflow in AWS.
- In the Bases2Fastq Version drop-down menu, select Always use latest or a specific software version.
- In the Parameters field, enter the Bases2Fastq optional arguments you want to use.
The flow applies these arguments to all executions.
- Select a Trigger option: Automatically on completed runs or Manually trigger on runs.
- Select an AWS storage connection for Input Storage.
- Select an AWS storage connection for Output Storage.
AWS HealthOmics writes data to the bucket for this storage connection when analysis is complete.
- Review the Bases2Fastq Terms of Service, and then select the checkbox to agree.
- Select Save.
Sharing Data Through Bucket Policies
An AWS S3 bucket policy is a JSON-based document that defines permissions for an S3 bucket and its objects. The bucket policy allows you to specify who can access the bucket, what actions they can perform, and under what conditions. Policies are highly customizable, enabling fine-grained control over your data. The policies work in conjunction with other AWS security features, such as IAM roles and IAM users, to ensure secure and efficient management of your S3 resources.
Using ElemBio Cloud, you can grant limited, read-only access to specific types of AWS principals, including IAM roles, IAM users, or AWS accounts. Read-only bucket policies simplify access management for many scenarios where modification of data is not required, such as:
- Sharing data within your own internal accounts
- Sharing run or analysis data directly with research partners
- Sharing your FASTQ files directly with third-party analysis platforms for secondary analysis
Grant Access through an AWS Principal
- On the storage connection card on the Storage page, select More, and then select Manage Data Access.
- Select Add Principal ARN.
- Enter the AWS Principal ARN for the IAM role, IAM user, or account.
- Enter an optional description that represents the account you are granting access to.
- Select Save.
After you add a principal from ElemBio Cloud, the following policy statement is appended to the existing bucket policy.
"Statement": [
  {
    "Sid": "ElemBio_BucketReadAccess_arn:aws:iam::999939710102:user/ExampleUser",
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::999939710102:user/ExampleUser"
    },
    "Action": [
      "s3:GetObject",
      "s3:ListBucket",
      "s3:GetBucketLocation"
    ],
    "Resource": [
      "arn:aws:s3:::BUCKETNAME",
      "arn:aws:s3:::BUCKETNAME/*"
    ]
  },