Settings

Settings specify details for Bases2Fastq processes, such as demultiplexing and adapter trimming. The following sections describe the available settings and their default values. Settings that use a Boolean data type accept the case-insensitive values true, false, t, f, 0, and 1, where true, t, and 1 indicate true and false, f, and 0 indicate false.
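
The Boolean rule above can be sketched in Python as follows. The helper name parse_bool is hypothetical and not part of Bases2Fastq, and trimming surrounding whitespace is an assumption.

```python
TRUE_VALUES = {"true", "t", "1"}
FALSE_VALUES = {"false", "f", "0"}


def parse_bool(value: str) -> bool:
    """Interpret a Boolean setting value case-insensitively."""
    # Assumption: whitespace around the value is not significant.
    token = value.strip().lower()
    if token in TRUE_VALUES:
        return True
    if token in FALSE_VALUES:
        return False
    raise ValueError(f"Not a valid Boolean setting value: {value!r}")
```

Under this sketch, TRUE, True, t, and 1 all parse as true, and any other spelling (for example, yes) is rejected rather than guessed.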

Columns

The Settings section includes SettingName and Value columns and an optional Lane column.

Column | Constraints | Value
SettingName | Required | The name of the setting
Value | Required | The value applied to the setting
Lane | Optional | The number of the lane to restrict the setting to: 1, 2, or 1+2 (default)

Base Masks

A base mask specifies a set of cycles for a particular operation in Bases2Fastq. A series of operators indicates which cycles are included in the base mask. A positive integer or asterisk follows each operator to specify the applicable cycles.

  • A Y (yes) operator indicates that a cycle is included in the mask.
  • An N (no) operator indicates that a cycle is excluded from the mask.
  • A positive integer indicates the number of cycles to include or exclude.
  • An asterisk matches any remaining cycles in the read.

For example, Y4N* creates a base mask for the first four cycles in a read. The base mask N3Y2N* excludes the first three cycles of a read, includes the fourth and fifth cycles, and excludes all remaining cycles.
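
To illustrate how the operators combine, the following Python sketch expands a mask into one Y or N flag per cycle. The function name expand_mask is hypothetical. Treating an operator with no count as a single cycle and resolving at most one asterisk per mask are assumptions of this sketch, not documented Bases2Fastq behavior.

```python
import re

# One operator (Y or N), optionally followed by a positive integer or an asterisk.
TOKEN = re.compile(r"([YN])([1-9]\d*|\*)?")


def expand_mask(mask: str, read_length: int) -> str:
    """Expand a mask such as 'N3Y2N*' into one Y/N flag per cycle."""
    tokens = []
    pos = 0
    while pos < len(mask):
        match = TOKEN.match(mask, pos)
        if match is None:
            raise ValueError(f"Unexpected character at position {pos} in {mask!r}")
        # Assumption: an operator without a count covers a single cycle.
        tokens.append((match.group(1), match.group(2) or "1"))
        pos = match.end()

    fixed = sum(int(count) for _, count in tokens if count != "*")
    stars = sum(1 for _, count in tokens if count == "*")
    if stars > 1:
        raise ValueError("This sketch resolves at most one asterisk per mask")
    # The asterisk absorbs whatever cycles the counted operators leave over.
    remaining = read_length - fixed
    if remaining < 0 or (stars == 0 and remaining != 0):
        raise ValueError(f"{mask!r} does not cover exactly {read_length} cycles")
    return "".join(
        op * (remaining if count == "*" else int(count)) for op, count in tokens
    )
```

For an 8-cycle read, expand_mask("N3Y2N*", 8) yields NNNYYNNN: three excluded cycles, two included cycles, and three excluded cycles.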

Read Identifiers

A base mask can include read identifiers that restrict the mask to cycles for Index 1 (I1), Index 2 (I2), Read 1 (R1), or Read 2 (R2). Each read identifier is encoded as the abbreviated read name followed by a colon (e.g., R1:). If the base mask does not include a read identifier, Bases2Fastq uses a default read that depends on the base mask setting.

To specify one read for a base mask, start the base mask with the read identifier. If you are specifying multiple reads for a base mask, enter multiple read sections that each start with the read identifier. Separate each read section with a hyphen.

  • Example base mask that applies to one read: I1:Y3N*
  • Example base mask that applies to two reads: I1:Y3N*-I2:Y2N*
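
A minimal Python sketch of splitting a base mask into its read sections follows. The function name split_read_sections and the default_read parameter are hypothetical, and rejecting a read that appears twice is an assumption.

```python
VALID_READS = ("I1", "I2", "R1", "R2")


def split_read_sections(mask: str, default_read: str) -> dict[str, str]:
    """Map each read identifier in a base mask to its cycle mask."""
    sections: dict[str, str] = {}
    for section in mask.split("-"):
        read, colon, cycles = section.partition(":")
        if not colon:
            # No read identifier: fall back to the default read for the setting.
            read, cycles = default_read, section
        if read not in VALID_READS:
            raise ValueError(f"Unknown read identifier: {read!r}")
        if read in sections:
            # Assumption: each read appears at most once per base mask.
            raise ValueError(f"Read {read} is specified more than once")
        sections[read] = cycles
    return sections
```

With this sketch, I1:Y3N*-I2:Y2N* splits into the cycle mask Y3N* for I1 and Y2N* for I2, and a bare Y3N* is assigned to whichever default read the caller supplies.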

Cycle Lengths

A base mask must define the full cycle length of a read, regardless of whether you include select bases in the read or all bases. A read with a base mask that includes a subset of cycles must still account for the remaining cycles. Otherwise, Bases2Fastq displays a validation error.

For example, if Read 1 consists of 30 cycles and you want a base mask for the first 15 cycles, you must end the base mask with the remaining number of cycles. The base mask R1:Y15N15 includes the first 15 cycles (Y15) of Read 1 (R1:) and excludes the remaining 15 cycles (N15). Alternatively, R1:Y15N* achieves the same goal but uses an asterisk to cover the remaining number of cycles.
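
The cycle-length rule can be checked with a short Python sketch. The function name covers_read is hypothetical. The sketch only counts cycles and does not validate mask syntax, and it assumes that an asterisk may match zero cycles.

```python
import re


def covers_read(cycles: str, read_length: int) -> bool:
    """Return True when a cycle mask accounts for every cycle of the read."""
    # Collect the count that follows each operator ('' when no count is given).
    counts = re.findall(r"[YN]([1-9]\d*|\*)?", cycles)
    # Assumption: an operator without a count covers a single cycle.
    fixed = sum(int(count or 1) for count in counts if count != "*")
    if "*" in counts:
        # The asterisk absorbs the remaining cycles, so the fixed part must fit.
        return fixed <= read_length
    return fixed == read_length
```

For a 30-cycle Read 1, both Y15N15 and Y15N* pass this check, while Y15 alone fails because it leaves 15 cycles unaccounted for.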

Base Mask Settings

Setting | Value | Default
R1FastQMask | A base mask that defines which cycles to record in the Read 1 FASTQ file | R1:Y*N
R2FastQMask | A base mask that defines which cycles to record in the Read 2 FASTQ file | R2:Y*N
I1Mask | A base mask that defines which cycles to use for Index 1 demultiplexing [1] | No indexing: I1:N*; Adept: I1:Y*; Elevate: I1:Y9N*; Elevate Cloudbreak: I1:N3Y9N*
I2Mask | A base mask that defines which cycles to use for Index 2 demultiplexing [1] | No indexing: I2:N*; Adept: I2:Y*; Elevate: I2:Y9N*; Elevate Cloudbreak: I2:Y9N*
UmiMask | A base mask that defines which cycles sequence the unique molecular identifier (UMI). When the UmiMask does not specify a read, Bases2Fastq uses the default read of I1. When the UmiMask does not specify cycles, Bases2Fastq uses the default cycle mask of N*. When the output is not empty, Bases2Fastq includes the UMI sequence in the headers of the Read 1 and Read 2 FASTQ files. | I1:N*

[1] No indexing indicates that indexed libraries are missing or each lane contains only one unindexed library.

Example Base Masks

The following table provides scenarios and example base masks.

Scenario | Base Mask
Create a base mask that includes the first two cycles of Read 1. | R1:Y2N*
Create a base mask that includes the fourth and fifth cycles of the default read (eight cycles in this example). | N3Y2N3
Create a base mask that includes all but the first two and last two cycles of Index 1. | I1:N2Y*N2
Create a base mask that includes all but the last cycle of Read 1 and of Read 2. | R1:Y*N-R2:Y*N
Use a base mask for a library that recommends only 28 base pairs in Read 1. | R1:Y28N*
Use base masks for a 7-base UMI that is in line with Read 1. | R1FastQMask: R1:N7Y*; UmiMask: R1:Y7N*
Use base masks for an 8-base UMI that is in line with Index 1. | I1Mask: I1:N8Y*; UmiMask: I1:Y8N*
Set up base masks for a single-index library that requires Index 2 FASTQ files for secondary analysis. | I2Mask: I2:N*; UmiMask: I2:Y*; UmiFastQ: TRUE
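
The in-line UMI scenario can be illustrated with a short Python sketch that applies two complementary masks to the same read. The helpers expand and select and the example sequence are hypothetical. The sketch does minimal validation and assumes at most one asterisk per mask.

```python
import re


def expand(cycles: str, length: int) -> str:
    """Expand a cycle mask such as 'N7Y*' into one Y/N flag per cycle."""
    tokens = re.findall(r"([YN])([1-9]\d*|\*)?", cycles)
    # Assumption: an operator without a count covers a single cycle.
    fixed = sum(int(n or 1) for _, n in tokens if n != "*")
    return "".join(
        op * (length - fixed if n == "*" else int(n or 1)) for op, n in tokens
    )


def select(sequence: str, cycles: str) -> str:
    """Keep only the bases whose cycle is flagged Y by the mask."""
    flags = expand(cycles, len(sequence))
    return "".join(base for base, flag in zip(sequence, flags) if flag == "Y")


# Made-up Read 1: a 7-base UMI followed by the insert.
read1 = "ACGTACG" + "TTGACCAGT"
umi = select(read1, "Y7N*")     # cycles chosen by UmiMask R1:Y7N*
insert = select(read1, "N7Y*")  # cycles chosen by R1FastQMask R1:N7Y*
```

Because the two masks are complementary, every cycle of the read ends up in exactly one of umi and insert.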

UMI, Index, and Control Settings

Setting | Value | Default
UmiFastQ | A Boolean value that specifies whether to generate a UMI FASTQ file. When true, Bases2Fastq generates the file based on the UmiMask setting. | False
I1FastQ | A Boolean value that specifies whether to generate an Index 1 FASTQ file. When true, Bases2Fastq generates the file based on the I1Mask setting. | False
I2FastQ | A Boolean value that specifies whether to generate an Index 2 FASTQ file. When true, Bases2Fastq generates the file based on the I2Mask setting. | False
I1MismatchThreshold | An integer from 0 through 2 that specifies the number of mismatches Bases2Fastq allows when demultiplexing the Index 1 sequence [1] | 1
I2MismatchThreshold | An integer from 0 through 2 that specifies the number of mismatches Bases2Fastq allows when demultiplexing the Index 2 sequence [1] | 1
SpikeInAsUnassigned | A Boolean value that specifies whether to categorize PhiX Control Library reads as unassigned. When libraries are absent or each lane contains only one unindexed library, the value defaults to true. You can reset it to false. When indexed libraries are present, the value defaults to false. If you reset it to true, Bases2Fastq displays a warning. | True or false

[1] The mismatch threshold is the number of mismatched bases between the observed index read and the expected index sequence that Bases2Fastq tolerates.
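
As an illustration of what a mismatch threshold means, the following Python sketch assigns an observed index read to a sample by counting mismatched bases. The function names, the sample names, and the index sequences are hypothetical. Leaving a read unassigned when it matches more than one sample is an assumption, not documented Bases2Fastq behavior.

```python
def count_mismatches(observed: str, expected: str) -> int:
    """Count positions where the observed index differs from the expected one."""
    return sum(1 for o, e in zip(observed, expected) if o != e)


def assign_sample(observed: str, expected_by_sample: dict[str, str], threshold: int = 1):
    """Return the one sample within the threshold, or None when unassigned."""
    if not 0 <= threshold <= 2:
        raise ValueError("The mismatch threshold must be an integer from 0 through 2")
    matches = [
        sample
        for sample, expected in expected_by_sample.items()
        if len(expected) == len(observed)
        and count_mismatches(observed, expected) <= threshold
    ]
    # Assumption: an ambiguous read (more than one match) stays unassigned.
    return matches[0] if len(matches) == 1 else None


indexes = {"Sample1": "ACGTACGT", "Sample2": "TTGGCCAA"}
```

With the default threshold of 1, an observed index of ACGTACGA (one mismatch from Sample1) is assigned to Sample1, while ACGTACAA (two mismatches) is assigned only when the threshold is raised to 2.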

Adapter Trimming

Library prep adds Read 1 and Read 2 adapters to each sample. When the length of Read 1 or Read 2 exceeds the length of the DNA insert, the run sequences into the adapter. Adapter trimming removes the adapter sequences from the 3' end of each read to prevent adapter-based errors in certain analyses.

Run manifest settings enable adapter trimming and specify the options. When adapter trimming is enabled, Bases2Fastq automatically detects and trims adapter sequences if the run manifest contains no adapter values or the execution uses the --detect-adapters optional argument. For more information, see the Bases2Fastq Documentation.

Figure 3: Trimming adapter sequences from Read 1 and Read 2

Paired-End versus Single-End

Bases2Fastq includes paired-end and single-end adapter trimming. Paired-end adapter trimming aligns the Read 1 and Read 2 inserts to accurately trim short adapters. When a sample includes insertions and deletions (indels), the software accurately trims adapters that are as short as one base. Single-end adapter trimming individually processes each read, removing the adapter sequences without alignment.

Paired-end adapter trimming is more accurate but requires that Read 1 and Read 2 each include at least 17 cycles. Single-end adapter trimming supports applications that do not meet this requirement. Neither type of adapter trimming increases the run time.
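
The idea of using read overlap to locate short adapters can be sketched in Python as follows. This is a generic exact-match illustration, not the Bases2Fastq algorithm: the function names are hypothetical, and a real implementation would tolerate sequencing errors and indels. It relies on the fact that when the insert is shorter than the read, the start of Read 1 and the start of Read 2 are reverse complements of each other over the insert length.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def infer_insert_length(read1: str, read2: str, min_insert: int = 1):
    """Return the insert length implied by read overlap, or None.

    A result shorter than the read means the remaining cycles are adapter.
    """
    longest = min(len(read1), len(read2))
    # Try the longest candidate first so that a one-base adapter is found
    # before shorter, more coincidence-prone overlaps are considered.
    for length in range(longest - 1, min_insert - 1, -1):
        if read1[:length] == reverse_complement(read2[:length]):
            return length
    return None


def trim_pair(read1: str, read2: str):
    """Trim both reads to the inferred insert length when one is found."""
    length = infer_insert_length(read1, read2)
    if length is None:
        return read1, read2
    return read1[:length], read2[:length]
```

Because the two reads corroborate each other, the insert boundary can be found even when only one adapter base was sequenced, which single-read matching cannot do reliably.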

Default Adapter Sequences

The default R1Adapter and R2Adapter values for the Adept Workflow are blank. Consult the third-party library prep documentation for adapter trimming recommendations. If you do not specify values, Read 1 and Read 2 must each include at least 48 cycles. Otherwise, Bases2Fastq cannot detect and trim the adapters.
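
The cycle requirements described in this section can be collected into a small pre-run check. The Python sketch below is illustrative only: the function name trimming_issues and its parameters are hypothetical, and the messages are not Bases2Fastq output.

```python
def trimming_issues(
    trim_type: str, r1_cycles: int, r2_cycles: int, adapters_specified: bool
) -> list[str]:
    """List the cycle requirements that a run would not meet."""
    issues = []
    shortest = min(r1_cycles, r2_cycles)
    if trim_type.lower() == "paired-end" and shortest < 17:
        issues.append(
            "Paired-end adapter trimming requires at least 17 cycles "
            "in both Read 1 and Read 2"
        )
    if not adapters_specified and shortest < 48:
        issues.append(
            "Without adapter values, Read 1 and Read 2 must each include "
            "at least 48 cycles for adapter detection"
        )
    return issues
```

An empty list means the run meets both requirements. A run with 16 cycles in Read 1 and explicit adapter values, for example, reports only the paired-end issue.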

For the Elevate Workflow, the following sequences are the default values:

  • R1Adapter—5' ATGTCGGAAGGTGTGCAGGCTACCGCTTGTCAACT 3'
  • R2Adapter—5' ATGTCGGAAGGTGTCTGGTGAGCCAATCCAGCACG 3'

Adapter Trimming Settings

Setting | Value | Default
AdapterTrimType | A value of Paired-End or Single-End to specify the type of adapter trimming to perform | Paired-End
R1AdapterTrim | A Boolean value that specifies whether to trim the adapter sequence from Read 1 | False
R2AdapterTrim | A Boolean value that specifies whether to trim the adapter sequence from Read 2 | False
R1Adapter | The adapter sequence to trim from Read 1. Valid values are A, C, G, N, and T. Separate multiple entries with a hyphen or a plus sign (e.g., ATTCCGGGGAATTTGCAT-CGGATTTTGCATT or ATTCCGGGGAATTTGCAT+CGGATTTTGCATT). | See Adapter Trimming
R2Adapter | The adapter sequence to trim from Read 2. Valid values are A, C, G, N, and T. Separate multiple entries with a hyphen or a plus sign (e.g., ATTCCGGGGAATTTGCAT-CGGATTTTGCATT or ATTCCGGGGAATTTGCAT+CGGATTTTGCATT). | See Adapter Trimming
R1AdapterNMask | A Boolean value that specifies whether to mask each base in the Read 1 adapter sequence with an N. This N-masking is an alternative to adapter trimming. | False
R2AdapterNMask | A Boolean value that specifies whether to mask each base in the Read 2 adapter sequence with an N. This N-masking is an alternative to adapter trimming. | False
R1AdapterMinimumOverlap | An integer from 1 through the Read 1 length that specifies the minimum length an adapter must be for single-end adapter trimming. Bases2Fastq does not trim adapters shorter than the value. | 3
R2AdapterMinimumOverlap | An integer from 1 through the Read 2 length that specifies the minimum length an adapter must be for single-end adapter trimming. Bases2Fastq does not trim adapters shorter than the value. | The lesser of 3 and the number of Read 2 cycles
R1AdapterMinimumStringency | A value from 0 through 1 that specifies the fraction of bases that must match the Read 1 adapter sequence for single-end adapter trimming | 0.9
R2AdapterMinimumStringency | A value from 0 through 1 that specifies the fraction of bases that must match the Read 2 adapter sequence for single-end adapter trimming | 0.9
R1AdapterMinimumTrimmedLength | An integer from 1 through the Read 1 length that specifies the minimum read length after adapter trimming. If a read is shorter than the value, Bases2Fastq removes the entire read, including the corresponding read from all FASTQ files. | The lesser of 16 and the number of Read 1 cycles
R2AdapterMinimumTrimmedLength | An integer from 1 through the Read 2 length that specifies the minimum read length after adapter trimming. If a read is shorter than the value, Bases2Fastq removes the entire read, including the corresponding read from all FASTQ files. | The lesser of 16 and the number of Read 2 cycles
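
To show how the minimum overlap, minimum stringency, minimum trimmed length, and N-mask settings interact, here is a minimal Python sketch of single-end trimming. It is a simplified illustration, not the Bases2Fastq implementation: the function names are hypothetical, the scan takes the first qualifying position, and treating N in the adapter as a wildcard is an assumption.

```python
def find_adapter(read: str, adapter: str, min_overlap: int = 3,
                 min_stringency: float = 0.9):
    """Return the first read position where the adapter appears to start."""
    for start in range(len(read) - min_overlap + 1):
        # Only the part of the adapter that fits in the read can be compared.
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break
        matches = sum(
            1 for r, a in zip(read[start:], adapter) if a == "N" or r == a
        )
        if matches / overlap >= min_stringency:
            return start
    return None


def trim_read(read: str, adapter: str, *, n_mask: bool = False,
              min_overlap: int = 3, min_stringency: float = 0.9,
              min_trimmed_length: int = 16):
    """Trim or N-mask the adapter; return None when the read is removed."""
    start = find_adapter(read, adapter, min_overlap, min_stringency)
    if start is None:
        return read
    if n_mask:
        # N-masking keeps the read length and replaces the adapter bases.
        return read[:start] + "N" * (len(read) - start)
    trimmed = read[:start]
    return trimmed if len(trimmed) >= min_trimmed_length else None
```

Raising the minimum overlap or stringency makes the search more conservative, so fewer short or imperfect adapter fragments are trimmed. Raising the minimum trimmed length removes more reads outright.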

Analysis Lane

Adding a Lane column to the Settings section restricts each setting to a specified lane. If you are not using the Individually Addressable Lanes add-on, you can use the column to divide samples and enable parallel analysis in secondary analysis software. The following values are valid:

  • 1 for lane 1
  • 2 for lane 2
  • 1+2 for both lanes

If you omit the Lane column, Bases2Fastq applies all settings to both lanes.

[SETTINGS]
SettingName,Value,Lane
AdapterTrimType,Paired-End,1+2
R1AdapterTrim,FALSE,1
R1AdapterNMask,FALSE,1
R2AdapterTrim,FALSE,2
R2AdapterNMask,FALSE,2
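
The lane rules above can be sketched as a small Python parser that resolves a Settings section into per-lane settings. The function name settings_by_lane is hypothetical. Tolerating trailing commas and spaces around values is an assumption, and the sketch covers only the two lanes described in this section.

```python
import csv
import io

VALID_LANES = ("1", "2", "1+2")


def settings_by_lane(section_text: str) -> dict[int, dict[str, str]]:
    """Resolve a Settings section into one settings mapping per lane."""
    lanes: dict[int, dict[str, str]] = {1: {}, 2: {}}
    header = None
    for row in csv.reader(io.StringIO(section_text)):
        cells = [cell.strip() for cell in row]
        while cells and cells[-1] == "":
            cells.pop()  # assumption: trailing empty cells carry no meaning
        if not cells or cells[0].startswith("["):
            continue  # skip blank lines and the section marker
        if header is None:
            header = cells
            continue
        record = dict(zip(header, cells))
        # A missing Lane column or value applies the setting to both lanes.
        lane = record.get("Lane") or "1+2"
        if lane not in VALID_LANES:
            raise ValueError(f"Invalid lane value: {lane!r}")
        for number in lane.split("+"):
            lanes[int(number)][record["SettingName"]] = record["Value"]
    return lanes
```

Applied to the example above, lane 1 receives AdapterTrimType, R1AdapterTrim, and R1AdapterNMask, and lane 2 receives AdapterTrimType, R2AdapterTrim, and R2AdapterNMask.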