Software and System Setup
To set up Bases2Fastq on a system, install the static binary executable or use the Docker container. Whichever distribution you choose, the compute environment must meet the system requirements and be configured to transfer files.
System Requirements
Operating System
Bases2Fastq is compatible with various operating systems (OS). Review the following compatibility matrix to determine the appropriate Bases2Fastq distribution for your OS:
Operating System | Docker Compatibility | Static Binary Compatibility | OS Notes |
---|---|---|---|
Linux OS | | | The static binary executable is compatible with any Linux OS on an x86 architecture that uses glibc v2.19 or later. To verify the glibc version for the static binary, run the following command: ldd --version |
Windows OS | | | Windows OS is not directly compatible with the static binary executable. Install Docker and run using Docker. Make sure to review the system requirements. Element recommends enabling the WSL 2 backend feature. |
Windows OS with Windows Subsystem for Linux (WSL) | | | If you install WSL on Windows OS, you can use the static binary executable in the WSL environment. |
macOS | | | macOS is not compatible with the static binary executable. Install Docker and run using Docker. |
Bases2Fastq is not supported on ARM processors.
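The Linux requirements above can be checked on a host before you install the static binary. The following is a minimal sketch and is not part of Bases2Fastq: the helper name version_ge is hypothetical, the script assumes that a 64-bit x86 host reports x86_64 from uname, and it assumes that ldd prints the glibc version as the last field of its first output line, as glibc-based distributions typically do.

```shell
#!/bin/sh
# Sketch: check a Linux host against the static binary requirements
# (x86 architecture, glibc v2.19 or later). Helper names are illustrative.

# version_ge A B: succeed when version A >= version B.
# Assumes both versions use the major.minor form, such as 2.19.
version_ge() {
  a_major=${1%%.*}; a_minor=${1#*.}; a_minor=${a_minor%%.*}
  b_major=${2%%.*}; b_minor=${2#*.}; b_minor=${b_minor%%.*}
  [ "$a_major" -gt "$b_major" ] && return 0
  [ "$a_major" -lt "$b_major" ] && return 1
  [ "$a_minor" -ge "$b_minor" ]
}

arch=$(uname -m)
# Assumption: on glibc systems, the first line of `ldd --version`
# ends with the glibc version number.
glibc=$(ldd --version 2>/dev/null | awk 'NR==1 {print $NF}')

case "$arch" in
  x86_64) echo "Architecture: $arch (x86)" ;;
  *)      echo "Architecture: $arch (the static binary requires x86)" ;;
esac

if [ -z "$glibc" ]; then
  echo "glibc: version not detected; run ldd --version manually"
elif version_ge "$glibc" 2.19; then
  echo "glibc: $glibc (meets the v2.19 minimum)"
else
  echo "glibc: $glibc (older than the v2.19 minimum)"
fi
```

The script only reports what it finds and does not stop on a failed check, so you can run it on any host, including one where you plan to use Docker instead.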
Memory and Performance
Software performance depends on the resources that are dedicated to the processing environment. For optimal performance, make sure that at least 16 CPU cores are available and enable threading in Bases2Fastq with the -p argument. The following memory requirements apply to both Docker and static binary distributions:
- A 2 x 75 or 2 x 150 AVITI System run requires 4 GB RAM per concurrent thread.
- A 2 x 300 AVITI System run requires 6 GB RAM per concurrent thread.
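The per-thread figures above translate into a total RAM requirement once you choose a thread count. The following sketch shows the arithmetic. The function names are hypothetical and are not Bases2Fastq options, and read lengths other than 75, 150, and 300 are rejected because the requirements above do not cover them.

```shell
#!/bin/sh
# Sketch: estimate RAM needs from the per-thread figures listed above.
# Function names are illustrative, not Bases2Fastq options.

# gb_per_thread READ_LENGTH: print the RAM (GB) that one thread needs.
gb_per_thread() {
  case "$1" in
    75|150) echo 4 ;;
    300)    echo 6 ;;
    *)      echo "unsupported read length: $1" >&2; return 1 ;;
  esac
}

# required_ram_gb READ_LENGTH THREADS: print the total RAM (GB) for a run.
required_ram_gb() {
  per_thread=$(gb_per_thread "$1") || return 1
  echo $(( per_thread * $2 ))
}

# max_threads READ_LENGTH AVAILABLE_GB: print the most threads that fit in RAM.
max_threads() {
  per_thread=$(gb_per_thread "$1") || return 1
  echo $(( $2 / per_thread ))
}

required_ram_gb 150 16   # 2 x 150 run with 16 threads: prints 64
max_threads 300 64       # 2 x 300 run on a 64 GB host: prints 10
```

For example, a 2 x 150 run with 16 threads needs 16 x 4 = 64 GB of RAM, and a 64 GB host can support at most 10 threads for a 2 x 300 run.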
The following benchmarks estimate the time Bases2Fastq takes to execute a 2 x 150 sequencing run:
Setup | Time Estimate |
---|---|
A non-volatile memory express (NVMe) solid-state drive (SSD) that uses 8 threads | 60 minutes |
An NVMe SSD that uses 20 threads | 30 minutes |
An Amazon m5.12xlarge EC2 instance with 48 virtual CPUs and onboard SSD storage | < 30 minutes |
Temporary Directory
When you use cloud storage, Bases2Fastq downloads input files and stages output files in a temporary directory. Intermediate files that are generated during analysis are also stored in the temporary directory. After an execution completes, the temporary directory is cleared.
The temporary directory typically uses 400–500 GB for a 2 x 150 run (approximately 1 billion reads). For some applications, a run can use up to 800 GB. The necessary amount of scratch space depends on the number of polonies and cycles in the run and on the optional arguments in the Bases2Fastq execution.
By default, Bases2Fastq uses the temporary directory of the OS. To change the location of the temporary directory, set the environment variable TMPDIR. Use the following example command and replace /path/to/scratch with the desired directory:
export TMPDIR="/path/to/scratch"
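Because a run can fill several hundred gigabytes of scratch space, it can help to check the free space in the temporary directory before you start. The following is a minimal sketch under stated assumptions: the free_gb helper is hypothetical, the 500 GB threshold is taken from the typical range above rather than from any Bases2Fastq setting, and /tmp is assumed as the fallback when TMPDIR is not set.

```shell
#!/bin/sh
# Sketch: report free scratch space before a run.
# The helper name and the threshold are illustrative.

# free_gb DIR: print the whole gigabytes available on DIR's file system.
# `df -Pk` prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
free_gb() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

scratch=${TMPDIR:-/tmp}   # assumption: /tmp is the OS default
needed_gb=500             # upper end of the typical range for a 2 x 150 run

if [ -d "$scratch" ]; then
  available_gb=$(free_gb "$scratch")
  if [ "$available_gb" -ge "$needed_gb" ]; then
    echo "$scratch: ${available_gb} GB free (at least ${needed_gb} GB)"
  else
    echo "$scratch: ${available_gb} GB free (less than ${needed_gb} GB)"
  fi
else
  echo "$scratch does not exist"
fi
```

Raise needed_gb toward 800 for applications that use more scratch space.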
File Transfer and Storage Setup
To transfer files, Bases2Fastq requires paths to input and output locations. You can store input and output files in a local location or the cloud. For cloud storage, the following providers are compatible:
- Amazon Simple Storage Service (Amazon S3)
- Google Cloud Storage (GCS)
- Any rclone-compatible provider
Amazon S3 and GCS storage connections require credential configuration for Bases2Fastq execution. For more information, see Execute with Amazon S3 and Execute with GCS.
Rclone Requirements
Rclone is a command-line program to manage files on cloud storage. Rclone provides the ability to mount any local, cloud, or virtual file system. Rclone allows Bases2Fastq to access many cloud storage providers. However, Element has not tested every available rclone provider.
To download and install rclone, follow the instructions at rclone.org/install. To communicate with your cloud storage, configure an rclone remote setup. For more information, see the provider-specific instructions at rclone.org/#providers.
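After you configure a remote, you can confirm that rclone recognizes it before you start a long execution. The following sketch uses the rclone listremotes command, which prints each configured remote as a name followed by a colon. The remote name myremote and the helper has_remote are placeholders for illustration only.

```shell
#!/bin/sh
# Sketch: confirm that an rclone remote is configured before a run.
# The remote name "myremote" is a placeholder.

# has_remote NAME: read `rclone listremotes` output on stdin and
# succeed when NAME is listed. Each remote is printed as "name:".
has_remote() {
  grep -qxF "$1:"
}

remote="myremote"

if ! command -v rclone >/dev/null 2>&1; then
  echo "rclone is not installed; see rclone.org/install"
elif rclone listremotes | has_remote "$remote"; then
  echo "Remote $remote is configured"
  # To confirm access, list the top-level directories or buckets:
  # rclone lsd "$remote:"
else
  echo "Remote $remote is not configured; run: rclone config"
fi
```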
Bases2Fastq Installation
To set up Bases2Fastq, use Docker or the static binary. Current and previous versions are available for installation.
The static binary executable for Bases2Fastq is only compatible with specific OS configurations. Before installation, review the OS requirements.
Install the Current Version
Docker
To install Docker, follow the OS-specific instructions at docs.docker.com/get-docker/.
To pull the latest version of the Bases2Fastq image from the Element public registry at DockerHub, run the following command:
docker pull elembio/bases2fastq
To confirm that Bases2Fastq is operational, run the following commands to display the software version and help content:
docker run elembio/bases2fastq bases2fastq --version
docker run elembio/bases2fastq bases2fastq --help
To test the software, you can download simulated data and complete the Getting Started with Bases2Fastq tutorial.
Static Binary
To download the latest version of the static binary, use one of the following methods:
- Visit the Element website and follow the onscreen prompts.
- Run the following curl command:
curl https://bases2fastq-release.s3.amazonaws.com/bases2fastq-latest.tar.gz -o bases2fastq-latest.tar.gz
To extract the file, run the following tar command:
tar -xvf bases2fastq-latest.tar.gz
To confirm that Bases2Fastq is operational, run the following commands to display the software version and help content:
./bases2fastq --version
./bases2fastq --help
Run one of the following sets of commands to install Python 3.9 or newer with the NumPy, Bokeh, and bs4 packages:
Linux dependency installation
sudo apt install python3 python3-pip libjpeg-dev zlib1g-dev
pip3 install 'numpy==1.*' 'bs4==0.*'
pip3 install 'bokeh>=2.3,<3'
pip3 install pip --upgrade
CentOS dependency installation
sudo yum install python3 python3-pip libjpeg-turbo-devel zlib-devel
pip3 install 'numpy==1.*' 'bs4==0.*'
pip3 install 'bokeh>=2.3,<3'
pip3 install pip --upgrade
The Bokeh package requires Pillow, which requires libjpeg-dev and zlib-dev. Also, packages might require you to upgrade pip3. Install packages based on your system.
If you do not install Python 3.9 or newer, or a package is missing, then Bases2Fastq logs a warning and does not generate QC reports.
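Because a missing prerequisite only produces a warning, you might want to check the Python environment before a run. The following sketch tests whether python3 is version 3.9 or newer and whether the numpy, bokeh, and bs4 modules can be imported. The helper name python_ok is hypothetical, and the script is an illustration rather than the check that Bases2Fastq performs.

```shell
#!/bin/sh
# Sketch: check the Python prerequisites for QC report generation.
# The helper name is illustrative.

# python_ok VERSION: succeed when VERSION (major.minor[.patch]) is 3.9 or newer.
python_ok() {
  major=${1%%.*}; minor=${1#*.}; minor=${minor%%.*}
  [ "$major" -gt 3 ] && return 0
  [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]
}

if command -v python3 >/dev/null 2>&1; then
  version=$(python3 -c 'import platform; print(platform.python_version())')
  if python_ok "$version"; then
    echo "Python $version: OK"
  else
    echo "Python $version: older than 3.9"
  fi
  # Try to import each package that the installation commands above provide.
  for module in numpy bokeh bs4; do
    if python3 -c "import $module" >/dev/null 2>&1; then
      echo "$module: found"
    else
      echo "$module: missing"
    fi
  done
else
  echo "python3 not found"
fi
```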
To generate MultiQC reports, run the following command to install the MultiQC dependency:
pip install multiqc
To test the software, download the simulated data, and then complete the Getting Started with Bases2Fastq for Sequencing tutorial or the Getting Started with Bases2Fastq for Cytoprofiling tutorial.
Install a Previous Version
Bases2Fastq follows the semantic versioning specification. All major, minor, and patch versions are available by tag for both the Docker image and the static binary.
To review information for previous versions of Bases2Fastq, see Release Notes and Version Compatibility.
To install a previous version, follow the installation instructions for the current version, but use a command that downloads the version that you want. See the following code examples:
Docker
# Get the latest of version 2
docker pull elembio/bases2fastq:2
# Get the latest minor 2.2 version
docker pull elembio/bases2fastq:2.2
# Get a specific major, minor, patch version
docker pull elembio/bases2fastq:2.2.0
Static Binary
To retrieve the desired version, replace {version} in the URL https://bases2fastq-release.s3.amazonaws.com/bases2fastq-{version}.tar.gz.
# Get the latest of version 2
curl https://bases2fastq-release.s3.amazonaws.com/bases2fastq-v2.tar.gz -o bases2fastq-v2.tar.gz
# Get the latest minor 2.2 version
curl https://bases2fastq-release.s3.amazonaws.com/bases2fastq-v2.2.tar.gz -o bases2fastq-v2.2.tar.gz
# Get a specific major, minor, patch version
curl https://bases2fastq-release.s3.amazonaws.com/bases2fastq-v2.2.0.tar.gz -o bases2fastq-v2.2.0.tar.gz
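The examples above follow one naming pattern, so the Docker tag and the download URL can be derived from a single version string. The following sketch prints the commands for a chosen version without running them. The function names are hypothetical, and the v prefix in the file name is inferred from the examples above.

```shell
#!/bin/sh
# Sketch: build the Docker tag and download URL for a chosen version.
# Function names are illustrative; the "v" prefix follows the examples above.

# release_url VERSION: print the static binary URL for VERSION
# (a major, major.minor, or major.minor.patch value such as 2, 2.2, or 2.2.0).
release_url() {
  echo "https://bases2fastq-release.s3.amazonaws.com/bases2fastq-v$1.tar.gz"
}

# docker_tag VERSION: print the image reference for VERSION.
docker_tag() {
  echo "elembio/bases2fastq:$1"
}

version="2.2.0"
# Print the commands instead of running them so that you can review them first.
echo "docker pull $(docker_tag "$version")"
echo "curl $(release_url "$version") -o bases2fastq-v$version.tar.gz"
```

Pinning a full major, minor, and patch version keeps an analysis reproducible, while a major-only or minor-only value follows the latest matching release.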