Introduction
During a sequencing run, the Element AVITI™ System records base calls and associated quality scores (Q scores) in .bases files. Bases2Fastq operates off-instrument through a command-line interface (CLI) and converts the bases files into the FASTQ file format for secondary analysis with the FASTQ-compatible software of your choice.
Analysis begins with demultiplexing, which identifies each sample by the index sequences and assigns polonies to that sample. If samples are not indexed, Bases2Fastq skips demultiplexing and assigns all polonies to one sample. The software converts the demultiplexed bases into FASTQ files, generating one FASTQ file per read (e.g., Read 1 or Read 2) per sample.
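The assignment step described above can be illustrated with a minimal Python sketch. It is not the Bases2Fastq implementation: the index table, the sample names, the one-mismatch tolerance, and the "Unassigned" and "Sample1" labels are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical index-to-sample table (illustrative values, not real manifest data).
SAMPLE_INDEXES = {
    "ACGTACGT": "SampleA",
    "TGCATGCA": "SampleB",
}


def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_indexes, max_mismatches=1):
    """Return the single sample whose index matches within max_mismatches.

    The mismatch tolerance is an assumption of this sketch. Reads that match
    no sample, or more than one sample, are labeled "Unassigned".
    """
    hits = [
        sample
        for index, sample in sample_indexes.items()
        if len(index) == len(index_read)
        and hamming(index, index_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else "Unassigned"


def demultiplex(polonies, sample_indexes):
    """Group (index_read, read_sequence) pairs by assigned sample."""
    if not sample_indexes:
        # No indexes: skip demultiplexing and put every polony in one sample.
        return {"Sample1": [seq for _, seq in polonies]}
    groups = defaultdict(list)
    for index_read, seq in polonies:
        groups[assign_sample(index_read, sample_indexes)].append(seq)
    return dict(groups)
```

In this sketch, each key of the returned dictionary would correspond to one set of FASTQ output files, with one file per read for that sample.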
Bases2Fastq includes the following features:
- Demultiplexing: Identify sequencing libraries by index sequence and generate FASTQ files
- Native Adapter Trimming: Trim adapter sequences during FASTQ generation, including the automated detection of adapter sequences
- QC Report: Generate HTML quality control (QC) reports that summarize run and sample quality
- Unique Molecular Identifiers (UMIs): Generate UMI FASTQ files
Setting Up Automatic FASTQ Generation in ElemBio Cloud
While sequencing run data can be demultiplexed by Bases2Fastq in any local or cloud compute environment, Element provides options to automate FASTQ generation using ElemBio Cloud. The following cloud providers can be used to automate FASTQ generation on run completion in ElemBio Cloud:
- ElemBio Catalyst, the native data storage and analysis add-on within ElemBio Cloud
- AWS HealthOmics, supported by your own AWS account
- DNAnexus, supported by your own DNAnexus account
Run Manifest
A run manifest is a CSV file that specifies demultiplexing settings, FASTQ file settings, and sample information. By default, Bases2Fastq uses the run manifest that the AVITI System outputs into the run folder (RunManifest.csv).
You can execute Bases2Fastq with the original run manifest from a sequencing run or an alternate corrected manifest. Optional arguments provided at execution time override run manifest settings.
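The override rule can be pictured as a simple merge in which execution-time values win. The Python sketch below only illustrates that precedence. It assumes that settings can be modeled as key-value pairs and that an argument left unset is represented by None. The function name is hypothetical, and the mapping between real CLI arguments and manifest settings is not shown here.

```python
def effective_settings(manifest_settings, cli_overrides):
    """Merge settings so that values given at execution time take precedence.

    Both arguments are plain dictionaries in this sketch. A value of None in
    cli_overrides means the argument was not provided, so the manifest value
    is kept.
    """
    merged = dict(manifest_settings)
    for key, value in cli_overrides.items():
        if value is not None:
            merged[key] = value
    return merged
```

For example, merging a manifest that sets both R1Adapter and R2Adapter with an override for R1Adapter alone keeps the R2Adapter value from the manifest.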
For complete information on run manifests, including preparation instructions and use cases for a corrected run manifest, see the Sequencing Run Manifest Documentation.
Adapter Trimming
Bases2Fastq trims adapters using the adapter values (R1Adapter and R2Adapter) and trimming settings in the run manifest or through automatic detection. Automatic detection occurs when the run manifest contains no adapter values or the execution uses the --detect-adapters optional argument. The detection uses a passing filter (PF) rate threshold of > 70% to select reference regions. If you are using the Individually Addressable Lanes add-on, automatic detection leverages data from each lane.
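The two conditions that trigger automatic detection can be written as a small predicate. This is a sketch of the rule as stated above, not code from Bases2Fastq. The function name and the dictionary representation of manifest settings are assumptions.

```python
def uses_automatic_detection(manifest_settings, detect_adapters_flag=False):
    """Return True when adapters would be detected automatically.

    Detection applies when the manifest has no adapter values or when the
    --detect-adapters argument is used (modeled here as a boolean).
    """
    has_adapter_values = any(
        manifest_settings.get(key) for key in ("R1Adapter", "R2Adapter")
    )
    return detect_adapters_flag or not has_adapter_values
```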
For single-end trimming, the software determines where the adapter starts by matching the expected adapter sequences to each position in a read. For paired-end trimming, the software combines the expected adapter sequences with a comparison of Read 1 and Read 2 to determine where the adapters start. If the run manifest contains no adapter information, the software automatically detects adapters using the best estimate from comparing Read 1 and Read 2.
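Both ideas can be illustrated with a short Python sketch. It is a generic illustration of position-wise adapter matching and of read-pair overlap, not the algorithm that Bases2Fastq uses. The function names, the minimum overlap values, and the 10% mismatch tolerance are assumptions.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_adapter_start(read, adapter, min_overlap=3, max_mismatch_rate=0.1):
    """Single-end sketch: test the adapter against each read position.

    Returns the first position where the adapter (or its prefix, near the end
    of the read) matches within the tolerance, or len(read) if none matches.
    """
    for pos in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - pos)
        mismatches = sum(read[pos + i] != adapter[i] for i in range(overlap))
        if mismatches <= max_mismatch_rate * overlap:
            return pos
    return len(read)


def find_adapter_start_paired(read1, read2, min_overlap=8, max_mismatch_rate=0.1):
    """Paired-end sketch: estimate the insert length from read overlap.

    When the insert is shorter than the reads, the first insert_len bases of
    Read 1 equal the reverse complement of the first insert_len bases of
    Read 2, and the adapter starts at insert_len in both reads. Returns None
    when no candidate length matches within the tolerance.
    """
    max_len = min(len(read1), len(read2))
    for insert_len in range(max_len, min_overlap - 1, -1):
        candidate = reverse_complement(read2[:insert_len])
        mismatches = sum(a != b for a, b in zip(read1[:insert_len], candidate))
        if mismatches <= max_mismatch_rate * insert_len:
            return insert_len
    return None
```

The paired-end function needs no adapter sequence at all, which is why a read-pair comparison can still locate adapters when the run manifest provides none. Trimming then amounts to slicing each read at the returned position, for example read[:find_adapter_start(read, adapter)].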
For more information on adapter trimming settings, see the Run Manifest Documentation.
License Agreement
Use of Bases2Fastq is subject to the license agreement available at the Element Biosciences website.