Bases2Fastq Workflow
The bases2fastq-nf workflow by Element Biosciences is a Nextflow workflow that generates FASTQ files from raw sequencing basecall data produced by AVITI Systems.
This workflow runs as part of ElemBio Cloud, but can also be used independently and reproducibly in any Nextflow environment (local or cloud-based).
Workflow Summary
Description
Converting .bases files is the necessary first step in turning raw sequencing data into FASTQ files, the predominant file type required to begin secondary analysis. This single-step pipeline uses the Bases2Fastq Software to demultiplex AVITI System data.
- Demultiplex reads and convert base calls into FASTQ files
- Optionally write index and UMI FASTQs
- Trim adapters before downstream analysis
- Generate a Bases2Fastq QC HTML report
Inputs
Bases2fastq-nf requires at minimum an AVITI sequencing run directory to run. Optional files may also be supplied.
Input | Description | Constraints |
---|---|---|
Run Directory | An AVITI System sequencing run directory. | Required |
Sequencing Run Manifest | By default, the run manifest in the output run directory is used. If required, an alternate run manifest can be supplied instead of the run manifest provided in the run directory. | Optional |
Parameters | In addition to the dataset, parameters can tune the output. | Optional |
Output
Bases2fastq-nf outputs the results of demultiplexing and the QC report. The contents of the output directory may differ depending on the parameters used. See Bases2Fastq outputs for specific details of output files.
Representative view of Bases2Fastq output
s3://output-bucket/analyses
└── bases2fastq
    └── wfr_671ae78b2d6a2fc62332f8a3
        ├── DVT-0274_QC.html
        ├── IndexAssignment.csv
        ├── Metrics.csv
        ├── RunManifest.csv
        ├── RunManifest.json
        ├── RunParameters.json
        ├── RunStats.json
        ├── Samples
        │   ├── DefaultProject
        │   │   ├── Sample_1
        │   │   │   ├── Sample_1_R1.fastq.gz
        │   │   │   ├── Sample_1_R2.fastq.gz
        │   │   │   └── Sample_1_stats.json
        │   │   ├── Sample_2
        │   │   │   ├── Sample_2_R1.fastq.gz
        │   │   │   ├── Sample_2_R2.fastq.gz
        │   │   │   └── Sample_2_stats.json
        │   │   ├── DefaultProject_Metrics.csv
        │   │   ├── DefaultProject_QC.html
        │   │   └── DefaultProject_RunStats.json
        │   └── Unassigned
        │       ├── Unassigned_R1.fastq.gz
        │       └── Unassigned_R2.fastq.gz
        ├── UnassignedSequences.csv
        ├── info
        │   ├── Bases2Fastq.log
        │   └── RunManifestErrors.json
        └── run.log
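Downstream tools usually need the per-sample FASTQ pairs from this layout. The following is a minimal Python sketch, not part of bases2fastq-nf, that assumes the analysis directory has been copied to a local filesystem and follows the Samples/<project>/<sample>/ structure shown above. The function name `collect_fastq_pairs` is hypothetical.

```python
from pathlib import Path


def collect_fastq_pairs(analysis_dir):
    """Map "<project>/<sample>" to its (R1, R2) FASTQ paths.

    Assumes the layout Samples/<project>/<sample>/<sample>_R1.fastq.gz
    from the representative output view. R2 is None when no mate file
    is present.
    """
    pairs = {}
    samples_root = Path(analysis_dir) / "Samples"
    # The two-level glob skips Samples/Unassigned, whose FASTQ files
    # sit one level higher than the per-sample files.
    for r1 in sorted(samples_root.glob("*/*/*_R1.fastq.gz")):
        sample_dir = r1.parent
        key = f"{sample_dir.parent.name}/{sample_dir.name}"
        r2 = r1.with_name(r1.name.replace("_R1.fastq.gz", "_R2.fastq.gz"))
        pairs[key] = (r1, r2 if r2.exists() else None)
    return pairs
```

For the representative output above, the returned keys would be DefaultProject/Sample_1 and DefaultProject/Sample_2. Unassigned reads are left out and can be read directly from Samples/Unassigned if needed.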
Input Parameters
Parameter | Type | Result |
---|---|---|
legacy_fastq | boolean | Applies the --legacy-fastq option. |
detect_adapters | boolean | Applies the --detect-adapters option. |
force_index_orientation | boolean | Applies the --force-index-orientation option. |
split_lanes | boolean | Applies the --split-lanes option. |
filter_mask | string | Applies the --filter-mask option with the supplied value. |
flowcell_id | string | Applies the --flowcell-id option with the supplied value. |
num_unassigned | integer | Applies the --num-unassigned option with the supplied value. |
qc_only | string | Applies the --qc-only option; FASTQ files are not generated. |
b2f_args | string | Sends a string of arguments to Bases2Fastq. The string is used as is. Recommended only for local development, advanced use cases, or testing. |
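The mapping from workflow parameters to Bases2Fastq options can be pictured with a short Python sketch. This illustrates the table only; it is not how bases2fastq-nf is implemented. The function name `build_b2f_args` is hypothetical, the hyphenated flag spellings are an assumption to check against the Bases2Fastq documentation, and treating qc_only as a simple on/off switch is also an assumption.

```python
import shlex

# Switch-style parameters: the flag is added when the value is truthy.
# Flag spellings are assumed; verify them against the Bases2Fastq docs.
BOOLEAN_FLAGS = {
    "legacy_fastq": "--legacy-fastq",
    "detect_adapters": "--detect-adapters",
    "force_index_orientation": "--force-index-orientation",
    "split_lanes": "--split-lanes",
    "qc_only": "--qc-only",  # assumption: treated as an on/off switch
}

# Parameters that pass their value through to the option.
VALUE_FLAGS = {
    "filter_mask": "--filter-mask",
    "flowcell_id": "--flowcell-id",
    "num_unassigned": "--num-unassigned",
}


def build_b2f_args(params):
    """Translate a dict of workflow parameters into a list of CLI arguments."""
    args = []
    for name, flag in BOOLEAN_FLAGS.items():
        if params.get(name):
            args.append(flag)
    for name, flag in VALUE_FLAGS.items():
        value = params.get(name)
        if value is not None and value != "":
            args.extend([flag, str(value)])
    # b2f_args is appended unchanged, split the way a shell would split it.
    extra = params.get("b2f_args")
    if extra:
        args.extend(shlex.split(extra))
    return args
```

Calling `build_b2f_args({"detect_adapters": True, "flowcell_id": "FC123"})`, where FC123 is a made-up identifier, returns `["--detect-adapters", "--flowcell-id", "FC123"]`. Because b2f_args is passed through without validation, a mistyped option only surfaces when Bases2Fastq itself runs, which is one reason to keep it for development and testing.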