Bases2Fastq Workflow

The bases2fastq-nf workflow by Element Biosciences is a Nextflow workflow that generates FASTQ files from raw sequencing basecall data produced by AVITI Systems.

This workflow runs as part of ElemBio Cloud, but can also be used independently and reproducibly in any Nextflow environment (local or cloud-based).

Workflow Summary

Description

Converting .bases files is the necessary first step in turning raw sequencing data into FASTQ files, the predominant file type required to begin secondary analysis. This single-step pipeline uses the Bases2Fastq Software to demultiplex AVITI System data.

  • Demultiplex reads and convert base calls into FASTQ files
  • Optionally write index and UMI FASTQs
  • Trim adapters before downstream analysis
  • Generate a Bases2Fastq QC HTML report

Workflow Diagram

Bases2Fastq Workflow Diagram

Release Notes

The workflow repository is maintained on GitHub, where you can find tags, release notes, and the latest updates.

Inputs

At minimum, bases2fastq-nf requires an AVITI sequencing run directory. Optional files may also be supplied.

| Input | Description | Constraints |
| --- | --- | --- |
| Run Directory | An AVITI System sequencing run directory. | Required |
| Sequencing Run Manifest | By default, the run manifest in the run directory is used. If required, supply an alternate run manifest to use instead of the one provided in the run directory. | Optional |
| Parameters | In addition to the dataset, parameters can tune the output. | Optional |

Output

Bases2fastq-nf outputs the demultiplexing results and the QC report. The contents of the output directory may differ depending on the parameters used. See Bases2Fastq outputs for specific details of output files.

Representative view of Bases2Fastq output

s3://output-bucket/analyses
└── bases2fastq
    └── wfr_671ae78b2d6a2fc62332f8a3
        ├── DVT-0274_QC.html
        ├── IndexAssignment.csv
        ├── Metrics.csv
        ├── RunManifest.csv
        ├── RunManifest.json
        ├── RunParameters.json
        ├── RunStats.json
        ├── Samples
        │   ├── DefaultProject
        │   │   ├── Sample_1
        │   │   │   ├── Sample_1_R1.fastq.gz
        │   │   │   ├── Sample_1_R2.fastq.gz
        │   │   │   └── Sample_1_stats.json
        │   │   ├── Sample_2
        │   │   │   ├── Sample_2_R1.fastq.gz
        │   │   │   ├── Sample_2_R2.fastq.gz
        │   │   │   └── Sample_2_stats.json
        │   │   ├── DefaultProject_Metrics.csv
        │   │   ├── DefaultProject_QC.html
        │   │   └── DefaultProject_RunStats.json
        │   └── Unassigned
        │       ├── Unassigned_R1.fastq.gz
        │       └── Unassigned_R2.fastq.gz
        ├── UnassignedSequences.csv
        ├── info
        │   ├── Bases2Fastq.log
        │   └── RunManifestErrors.json
        └── run.log
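
For downstream analysis it is often useful to pair up the per-sample FASTQ files from this layout. The following is a minimal Python sketch, not part of the workflow. It assumes the analysis directory has been copied to a local filesystem, that files follow the Samples/<project>/<sample>/<sample>_R1.fastq.gz pattern shown above, and that options which change file naming (for example legacy FASTQ names or lane splitting) were not used. The function name collect_fastq_pairs is hypothetical.

```python
from pathlib import Path


def collect_fastq_pairs(analysis_dir):
    """Map (project, sample) to its R1/R2 FASTQ paths.

    Assumes the Samples/<project>/<sample>/ layout from the
    representative output view; other layouts are not handled.
    """
    samples_dir = Path(analysis_dir) / "Samples"
    pairs = {}
    # Three path components below Samples/ means project/sample/file, so
    # Samples/Unassigned/*.fastq.gz (two components) is skipped on purpose.
    for r1 in sorted(samples_dir.glob("*/*/*_R1.fastq.gz")):
        sample = r1.parent.name
        project = r1.parent.parent.name
        r2 = r1.with_name(r1.name.replace("_R1.fastq.gz", "_R2.fastq.gz"))
        # R2 is None when no mate file exists (e.g. a single-read run).
        pairs[(project, sample)] = {"R1": r1, "R2": r2 if r2.exists() else None}
    return pairs
```

Calling collect_fastq_pairs on a local copy of the wfr_... directory would return one entry per sample under each project, leaving the Unassigned reads out.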

Input Parameters

| Parameter | Type | Result |
| --- | --- | --- |
| legacy_fastq | boolean | Applies the --legacy-fastq option. |
| detect_adapters | boolean | Applies the --detect-adapters option. |
| force_index_orientation | boolean | Applies the --force-index-orientation option. |
| split_lanes | boolean | Applies the --split-lanes option. |
| filter_mask | string | Applies the --filter-mask option with the supplied value. |
| flowcell_id | string | Applies the --flowcell-id option with the supplied value. |
| num_unassigned | integer | Applies the --num-unassigned option with the supplied value. |
| qc_only | string | Applies the --qc-only option; FASTQ files will not be generated. |
| b2f_args | string | Sends a string of arguments to Bases2Fastq. The string is passed as is. Recommended only for local development, advanced use cases, or testing. |
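
To make the parameter-to-option mapping concrete, here is a minimal Python sketch of how a set of workflow parameters could translate into Bases2Fastq command-line options. It illustrates the table only and is not the workflow's actual implementation. The function name build_b2f_options is hypothetical, option names are written in the hyphenated form used by the Bases2Fastq command line (verify against your installed version), and qc_only, although listed as a string, is assumed to enable the option whenever it is set to a non-empty value.

```python
import shlex

# Workflow parameter -> Bases2Fastq option, following the table above.
FLAG_PARAMS = {
    "legacy_fastq": "--legacy-fastq",
    "detect_adapters": "--detect-adapters",
    "force_index_orientation": "--force-index-orientation",
    "split_lanes": "--split-lanes",
    "qc_only": "--qc-only",  # assumption: any non-empty value enables it
}
VALUE_PARAMS = {
    "filter_mask": "--filter-mask",
    "flowcell_id": "--flowcell-id",
    "num_unassigned": "--num-unassigned",
}


def build_b2f_options(params):
    """Return the list of Bases2Fastq options implied by a params dict."""
    options = []
    for name, flag in FLAG_PARAMS.items():
        if params.get(name):
            options.append(flag)
    for name, flag in VALUE_PARAMS.items():
        value = params.get(name)
        if value is not None:
            options.extend([flag, str(value)])
    # b2f_args is appended unchanged, split the way a shell would split it.
    options.extend(shlex.split(params.get("b2f_args") or ""))
    return options


print(build_b2f_options({"detect_adapters": True, "num_unassigned": 50}))
# prints ['--detect-adapters', '--num-unassigned', '50']
```

Keeping flag-style and value-style parameters in separate tables mirrors the distinction the parameter table draws between options that are simply applied and options applied "with the supplied value".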