Getting Started with Bases2Fastq for Cytoprofiling
Bases2Fastq processes Element AVITI™ System cytoprofiling data and converts base calls into FASTQ files. While ElemBio Cloud™ offers a Bases2Fastq verified flow that executes in cloud environments, Element makes Bases2Fastq available as an executable through a Docker container or a static binary. This tutorial uses simulated data to demonstrate how to manually set up and execute Bases2Fastq in a Linux or Windows environment.
This tutorial covers the following topics:
- The structure of a Bases2Fastq execution command
- How to execute Bases2Fastq for your OS and distribution method
- The creation of a corrected run manifest to troubleshoot an error
- How to re-execute Bases2Fastq with a corrected run manifest
Before You Begin
Make sure that you complete the following necessary prerequisites for this tutorial:
Install Docker or the static binary.
- The static binary is only compatible with Linux OS.
- When installing Docker for Windows OS, review the system requirements. Element recommends enabling the WSL 2 backend feature.
Install tree, a CLI tool that maps folder directories.
- Windows OS includes tree by default.
- For Linux OS, use the following command to install tree:
apt-get install tree
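Before continuing, it can help to confirm that the prerequisite tools are reachable from the CLI. The following is a minimal sketch for a Linux shell. The function name check_tool is hypothetical and is not part of Docker or Bases2Fastq.

```shell
# Report whether a command-line tool is available on the PATH.
# Returns 0 if the tool is found and 1 otherwise.
# check_tool is a hypothetical helper name used only for illustration.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}
```

For example, running check_tool docker and then check_tool tree reports whether each tool is installed.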
Set Up Docker
If you are using Docker for this tutorial, complete the following setup steps:
Make sure Docker Desktop is running on your system.
Open a CLI terminal.
In the CLI terminal, run the following command to pull the latest Bases2Fastq image from the Element public registry hosted on Docker Hub:
docker run elembio/bases2fastq bases2fastq --version
The CLI displays the current Bases2Fastq version.
Create a Directory
Create a bases2fastq-setup folder for this tutorial:
mkdir bases2fastq-setup
To prepare for downloading the simulated data, make the bases2fastq-setup folder the working directory in the CLI:
cd ./bases2fastq-setup
Create a fastq subfolder in the bases2fastq-setup folder:
mkdir fastq
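On Linux, the two folders can also be created in a single step, because mkdir -p creates any missing parent folders and does not fail when a folder already exists. The following is a small sketch of that approach. The function name setup_dirs is hypothetical.

```shell
# Create the tutorial folder and its fastq subfolder in one step.
# $1: base folder name (defaults to bases2fastq-setup).
# setup_dirs is a hypothetical helper name used only for illustration.
setup_dirs() {
    base="${1:-bases2fastq-setup}"
    # -p creates missing parents and succeeds if the folders already exist.
    mkdir -p "$base/fastq" && echo "created: $base/fastq"
}
```

If you use this function, run it from the folder that should contain bases2fastq-setup, and then change into bases2fastq-setup before downloading the data.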
Download the Simulated Data
Download the simulated output data for a Direct In Sample Sequencing (DISS) AVITI24 run to the bases2fastq-setup folder:
curl http://element-public-data.s3.amazonaws.com/bases2fastq-share/bases2fastq-cyto/20250715-bases2fastq-cyto-protein-ops.tar.gz -o cyto-data.zip
An Amazon S3 cloud storage bucket hosts the data. The data uses the standard format and structure for output files from an AVITI24 System.
Extract the simulated data from the downloaded archive. Although the download command saves the file as cyto-data.zip, the file is a gzip-compressed tar archive, so use tar to extract it:
tar -xvf cyto-data.zip
Use tree to visualize the files in a tree format and confirm successful extraction of the data:
tree 20250715-bases2fastq-cyto-protein-ops
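If the extraction fails, the download may be incomplete. As a rough sketch, the following function checks that a file is a readable gzip-compressed tar archive without extracting it. The function name verify_archive is hypothetical.

```shell
# Confirm that a file is a readable gzip-compressed tar archive and
# print how many entries it contains, without extracting anything.
# verify_archive is a hypothetical helper name used only for illustration.
verify_archive() {
    if entries=$(tar -tzf "$1" 2>/dev/null); then
        echo "ok: $(printf '%s\n' "$entries" | wc -l | tr -d ' ') entries"
    else
        echo "not a readable tar.gz archive: $1" >&2
        return 1
    fi
}
```

For example, verify_archive cyto-data.zip checks the file downloaded in this section.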
Execution Command Overview
The structure of the execution command depends on your OS and Bases2Fastq distribution:
- Docker for Windows OS
- Docker for Linux OS
- Static Binary for Linux OS
When executing with Docker, the Bases2Fastq execution commands in this tutorial include the following components:
- docker run invokes the Docker daemon.
- elembio/bases2fastq identifies the image you want to pull.
- bases2fastq invokes the Bases2Fastq software inside the Docker image.
The commands end with the input and output locations for the execution.
When using Docker to run Bases2Fastq, you must mount your test directory to the Docker container. The commands in this tutorial use the -v option to bind mount the present working directory (pwd) to the container. The commands map the present working directory to the /data path inside the container to make the input and output locations for the execution accessible to the Docker container.
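To illustrate how these components fit together, the following sketch prints the Linux form of the Docker command without running it. The function name print_b2f_command and its two arguments are hypothetical. Only the docker run, -v, elembio/bases2fastq, and bases2fastq parts come from this tutorial.

```shell
# Print the Docker command for a Bases2Fastq execution without running it.
# $1: run folder name inside the present working directory
# $2: output folder path relative to the present working directory
# print_b2f_command is a hypothetical helper name used only for illustration.
print_b2f_command() {
    # docker run            starts a container through the Docker daemon
    # -v "<pwd>":/data      bind mounts the working directory at /data
    # elembio/bases2fastq   names the image
    # bases2fastq           runs the software inside the container
    echo "docker run -v \"$(pwd)\":/data elembio/bases2fastq bases2fastq /data/$1 /data/$2"
}
```

For example, print_b2f_command 20250715-bases2fastq-cyto-protein-ops fastq/test2 prints the command with the working directory filled in, which makes it easy to check the mount path before executing.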
Windows includes both the Windows Command Prompt and PowerShell command line interfaces, which use different commands. In PowerShell, ${PWD} references the current directory. In Command Prompt, %cd% references the current directory, and paths may require additional handling, such as enclosing them in double quotes to handle any spaces in the path.
The Windows commands in this tutorial use the PowerShell interface.
For the static binary, the Bases2Fastq execution commands in this tutorial use ./bases2fastq to invoke the executable. The command then states the input and output locations for the execution. The basic structure of the command is as follows:
./bases2fastq <input> <output>
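A common cause of failure with a downloaded binary is a missing file or missing execute permission. As a sketch only, the following wrapper checks both before invoking an executable. The function name run_if_executable is hypothetical.

```shell
# Run a command only if its executable exists and has execute permission.
# $1: path to the executable; all remaining arguments are passed through.
# run_if_executable is a hypothetical helper name used only for illustration.
run_if_executable() {
    exe="$1"
    shift
    if [ -x "$exe" ] && [ ! -d "$exe" ]; then
        "$exe" "$@"
    else
        echo "not executable: $exe (try: chmod +x $exe)" >&2
        return 1
    fi
}
```

For example, run_if_executable ./bases2fastq followed by the input and output locations behaves like the basic command, but reports a clearer message when the binary is not ready to run.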
Execute Bases2Fastq
- Docker for Windows OS
- Docker for Linux OS
- Static Binary for Linux OS
Docker for Windows OS
Execute Bases2Fastq with the following command, which uses ${pwd}:/data to mount the present working directory to the Docker container:
docker run -v ${pwd}:/data elembio/bases2fastq bases2fastq /data/20250715-bases2fastq-cyto-protein-ops /data/fastq/test2
Wait until the execution completes:
The execution is complete when the CLI displays execution timing and output information.
============ Timing =============
Cyto fastq: 0.442s
Stats reports: 0.000s
Total elapsed: 19.292s
=================================
Output stored in /data/fastq/test2
Use tree to visualize the output of the execution:
tree .\fastq\test2 /f
Examine the tree output to make sure that all output files are present. The following is an example of how the CLI lists the files:
./fastq/test2
├── Panel.json
├── RunManifest.csv
├── RunManifest.json
├── RunParameters.json
├── Samples
│   └── WellA1
│       ├── A1_B01_R1.fastq.gz
│       └── A1_B02_R1.fastq.gz
└── info
    └── Bases2Fastq.log
Access info/Bases2Fastq.log in the bases2fastq-setup/fastq/test2 folder to view logs and check if any errors occurred.
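Docker for Linux OS
Execute Bases2Fastq with the following command, which uses $(pwd):"/data" to mount the present working directory to the Docker container:
docker run -v $(pwd):"/data" elembio/bases2fastq bases2fastq /data/20250715-bases2fastq-cyto-protein-ops /data/fastq/test2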
Wait until the execution completes:
The execution is complete when the CLI displays execution timing and output information.
============ Timing =============
Cyto fastq: 0.442s
Stats reports: 0.000s
Total elapsed: 19.292s
=================================
Output stored in /data/fastq/test2
Use tree to visualize the output of the execution:
tree ./fastq/test2
Examine the tree output to make sure that all the output files are present. The following is an example of how the CLI lists the files:
./fastq/test2
├── Panel.json
├── RunManifest.csv
├── RunManifest.json
├── RunParameters.json
├── Samples
│   └── WellA1
│       ├── A1_B01_R1.fastq.gz
│       └── A1_B02_R1.fastq.gz
└── info
    └── Bases2Fastq.log
Access info/Bases2Fastq.log in the bases2fastq-setup/fastq/test2 folder to view logs and check if any errors occurred.
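Instead of comparing the tree output by eye, the check can be scripted. The following is a minimal sketch for a Linux shell that looks for the files shown in the example listing. The function name check_outputs is hypothetical, and the file list is taken only from the example for the simulated data, so other runs may produce different files.

```shell
# Check that an output folder contains the files shown in the example listing.
# Prints each missing item and returns 1 if anything is absent.
# check_outputs is a hypothetical helper name used only for illustration.
check_outputs() {
    out="$1"
    rc=0
    for f in Panel.json RunManifest.csv RunManifest.json RunParameters.json info/Bases2Fastq.log; do
        if [ ! -f "$out/$f" ]; then
            echo "missing: $out/$f" >&2
            rc=1
        fi
    done
    # Expect at least one compressed FASTQ file somewhere under Samples.
    if [ -z "$(find "$out/Samples" -name '*.fastq.gz' 2>/dev/null | head -n 1)" ]; then
        echo "missing: FASTQ files under $out/Samples" >&2
        rc=1
    fi
    return "$rc"
}
```

For example, check_outputs ./fastq/test2 returns success only when every listed file is present.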
Static Binary for Linux OS
Make sure that you are in the home directory in the CLI, where the static binary executable is installed. If you are still in the bases2fastq-setup folder, use the following command to move to the home directory:
cd ..
Execute Bases2Fastq with the following command:
./bases2fastq bases2fastq-setup/20250715-bases2fastq-cyto-protein-ops bases2fastq-setup/fastq/test2
Wait until the execution completes:
The execution is complete when the CLI displays execution timing and output information.
============ Timing =============
Cyto fastq: 0.442s
Stats reports: 0.000s
Total elapsed: 19.292s
=================================
Output stored in /data/fastq/test2
Use tree to visualize the output of the execution:
tree ./bases2fastq-setup/fastq/test2
Examine the tree output to make sure all output files are present. The following is an example of how the CLI lists the files:
./fastq/test2
├── Panel.json
├── RunManifest.csv
├── RunManifest.json
├── RunParameters.json
├── Samples
│   └── WellA1
│       ├── A1_B01_R1.fastq.gz
│       └── A1_B02_R1.fastq.gz
└── info
    └── Bases2Fastq.log
Access info/Bases2Fastq.log in the bases2fastq-setup/fastq/test2 folder to view logs and check if any errors occurred.
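A quick first pass over the log can be done with grep. The following sketch prints any log lines that contain the word "error" in any letter case. It assumes that errors in Bases2Fastq.log include that word, which this tutorial does not confirm, so treat an empty result as a hint and not as proof of a clean run. The function name scan_log is hypothetical.

```shell
# Print log lines that mention "error" (case-insensitive) with line numbers.
# Returns 0 when no such lines exist, 1 when at least one is found,
# and 2 when the log file cannot be read.
# scan_log is a hypothetical helper name, and matching on the word "error"
# is an assumption about the log format.
scan_log() {
    [ -r "$1" ] || { echo "cannot read log: $1" >&2; return 2; }
    if grep -n -i 'error' "$1"; then
        return 1
    fi
    return 0
}
```

For example, scan_log bases2fastq-setup/fastq/test2/info/Bases2Fastq.log scans the log from the static binary execution.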
Troubleshooting
- Docker for Windows OS
- Docker for Linux OS
- Static Binary for Linux OS
Docker for Windows OS
If you receive an error message that states the input files do not exist, make sure the input files are available to the Docker container through the mounted file system.
Run the Docker container in interactive mode and attempt to list the files in the mounted file system:
docker run -i -v ${pwd}:/data elembio/bases2fastq ls /data/20250715-bases2fastq-cyto-protein-ops
Examine the returned list of files. If the input folder for the command is correct, the CLI lists at minimum the following files:
BaseCalls
Filter
Location
RunManifest.csv
RunParameters.json
If the CLI does not list the expected files, complete the following troubleshooting actions:
- Make sure Docker has the correct permissions to access files on your system.
- Make sure the input data is present in your current working directory.
Docker for Linux OS
If you receive an error message that states the input files do not exist, make sure the input files are available to the Docker container through the mounted file system.
Run the Docker container in interactive mode and attempt to list the files in the mounted file system:
docker run -i -v $(pwd):"/data" elembio/bases2fastq ls /data/20250715-bases2fastq-cyto-protein-ops
Examine the returned list of files. If the input folder for the command is correct, the CLI lists at minimum the following files:
BaseCalls
Filter
Location
RunManifest.csv
RunParameters.json
If the CLI does not list the expected files, complete the following troubleshooting actions:
- Make sure Docker has the correct permissions to access files on your system.
- Make sure the input data is present in your current working directory.
Static Binary for Linux OS
If you receive an error message that states the input files do not exist, make sure the input files are available in the input folder.
Attempt to list the files in the input directory:
ls bases2fastq-setup/20250715-bases2fastq-cyto-protein-ops
Examine the returned list of files:
If the input folder for the command is correct, the CLI lists the following files:
BaseCalls Filter Location RunManifest.csv RunParameters.json
If the CLI does not list the expected files, complete the following troubleshooting actions:
- Make sure your current working directory is correct.
- Make sure the input data is present in your current working directory.
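The input check described in this section can also be scripted for a Linux shell. The following sketch reports which of the expected items are absent from a run folder. The function name check_inputs is hypothetical, and the item list is taken only from the listing in this section.

```shell
# Check that a run folder contains the items Bases2Fastq expects as input.
# Prints each missing item and returns 1 if anything is absent.
# check_inputs is a hypothetical helper name used only for illustration.
check_inputs() {
    run="$1"
    rc=0
    for item in BaseCalls Filter Location RunManifest.csv RunParameters.json; do
        # -e accepts either a file or a folder with this name.
        if [ ! -e "$run/$item" ]; then
            echo "missing: $run/$item" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, check_inputs bases2fastq-setup/20250715-bases2fastq-cyto-protein-ops checks the simulated data from the home directory.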
Additional Resources
For more information related to the topics in this tutorial, see the following resources: