Getting Started with Cells2Stats for Cytoprofiling

Cells2Stats processes AVITI24™ System cytoprofiling data to generate detailed statistics and outputs that enable downstream analysis, including visualization using CytoCanvas™. ElemBio Cloud™ offers a Cells2Stats verified flow that executes in ElemBio Catalyst™. Cells2Stats is also available as an executable through a Docker container or static binary.

This tutorial uses simulated data to demonstrate how to manually set up and execute Cells2Stats in a Linux or Windows environment, and covers the following topics:

  1. The structure of a Cells2Stats execution command
  2. How to execute Cells2Stats for your OS and distribution method
  3. The creation of visualization files for CytoCanvas

Before You Begin

Make sure that you complete the following prerequisites for this tutorial:

  1. Install Docker or the static binary.

    • The static binary is only compatible with Linux OS.
    • When you install Docker for Windows OS, review the system requirements. Element Biosciences™ recommends that you enable the WSL 2 backend feature.
  2. Install tree, a Command Line Interface (CLI) tool that maps folder directories.

    • Windows OS includes tree by default.
    • For Debian-based Linux OS, use the following command to install tree:
    apt-get install tree

Set Up Docker

If you are not using Docker for this tutorial, continue to Create a Directory. If you use Docker for this tutorial, complete the following setup steps:

  1. Make sure that Docker Desktop is running on your system.

  2. Open a CLI terminal.

  3. In the CLI terminal, run the following command. If the image is not already on your system, Docker pulls the latest Cells2Stats image from the Element public registry that is hosted on Docker Hub.

    docker run elembio/cells2stats cells2stats --version

    The CLI displays the current Cells2Stats version.
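
If the version command fails, it can help to confirm that the Docker CLI is installed and that the Docker daemon is reachable. The following is a minimal sketch for a Linux or WSL 2 shell and is not part of the Cells2Stats tooling. The function name docker_ready is hypothetical. The sketch relies on docker info, which exits with a nonzero status when it cannot reach the daemon.

```shell
# Hypothetical helper: succeeds only if the docker CLI exists and the daemon answers.
docker_ready() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found on PATH" >&2
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        echo "Docker daemon is not reachable; start Docker first" >&2
        return 1
    fi
    echo "Docker is ready"
}

# Example use:
# docker_ready && docker run elembio/cells2stats cells2stats --version
```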

Create a Directory

  1. Create a cells2stats-setup folder for this tutorial:

    mkdir cells2stats-setup
  2. To prepare for downloading the simulated data, make the cells2stats-setup folder the working directory in the CLI:

    cd ./cells2stats-setup

Download the Simulated Data

  1. Download the simulated output data for a Teton™ MAPK cell cycle run.

    An Amazon S3 cloud storage bucket hosts the data. The data uses the standard format and structure for output files from an AVITI24 System.

    curl https://element-share-data.s3.us-west-2.amazonaws.com/cells2stats-share/20250813_cells2stats_demo-teton-sim-run.tar.gz -o 20250813_cells2stats_demo-teton-sim-run.tar.gz
  2. Extract the simulated data from the downloaded archive:

    tar -xvf 20250813_cells2stats_demo-teton-sim-run.tar.gz
  3. Use tree to visualize the files in a tree format and confirm successful extraction of the data:

    tree 20250813_cells2stats_demo-teton-sim-run
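
If extraction fails, an incomplete download is one possible cause. As a minimal sketch, assuming a Linux or WSL 2 shell with a tar that supports gzip, the archive can be listed without extracting it to confirm that it is a readable gzip-compressed tar file. The function name is_valid_targz is hypothetical.

```shell
# Hypothetical helper: returns 0 if the file is a readable gzip-compressed tar archive.
is_valid_targz() {
    # -t lists the contents without extracting, -z decodes gzip, -f names the file.
    tar -tzf "$1" >/dev/null 2>&1
}

# Example use, where $archive is the path to the file that you downloaded:
# is_valid_targz "$archive" && echo "archive OK"
```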

Execution Command Overview

When you execute with Docker, the Cells2Stats execution commands in this tutorial include the following components:

  • docker run creates and starts a container from an image.
  • elembio/cells2stats identifies the image to run. Docker pulls the image if it is not available locally.
  • cells2stats invokes the Cells2Stats software inside the container.

The commands end with the input locations for the execution.

When you use Docker to run Cells2Stats, you must mount your test directory to the Docker container. The commands in this tutorial use a bind mount (-v) to mount the current working directory (pwd) at the path /data inside the container. This mount makes the input and output locations for the execution accessible to the Docker container.

Adding --rm to the command removes the container after the execution completes. Element recommends using this argument to keep your system clean.
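
To make these components concrete, the following sketch assembles the same kind of command for a Linux shell, where $(pwd) supplies the current working directory. It only prints the command and does not run Docker. The function name c2s_docker_cmd is hypothetical. The image name, --rm, -v, and /data come from this tutorial.

```shell
# Hypothetical helper: print the docker command for a run folder that sits in
# the current working directory. Nothing is executed.
c2s_docker_cmd() (
    run_folder=$1
    # --rm                  remove the container when the execution ends
    # -v HOST:/data         bind-mount the current directory at /data in the container
    # elembio/cells2stats   image to run
    # cells2stats           program to invoke inside the container
    printf 'docker run --rm -v "%s:/data" elembio/cells2stats cells2stats "/data/%s"\n' \
        "$(pwd)" "$run_folder"
)

c2s_docker_cmd 20250813_cells2stats_demo-teton-sim-run
```

Quoting the mount argument keeps the command valid when the working directory path contains spaces.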

Choosing Your Windows Command Line Environment

Windows includes two command line interfaces, Windows Command Prompt and PowerShell, which use different commands. In PowerShell, ${PWD} references the current directory. In Command Prompt, %cd% references the current directory, and paths may require additional handling, such as enclosing them in double quotes when they contain spaces.

Commands in this tutorial use the PowerShell interface for Windows commands.

Execute Cells2Stats

  1. Execute Cells2Stats with the following command, which uses ${pwd}:/data to mount the current working directory to the Docker container:

    docker run --rm -v ${pwd}:/data  elembio/cells2stats cells2stats /data/20250813_cells2stats_demo-teton-sim-run
  2. Wait until the execution completes. The execution is complete when the CLI displays log lines similar to the following:

    20250902T191241Z is an example timestamp in the following output. Your execution will display a different timestamp.

    timestamp [info]: Analysis completed successfully.
    timestamp [info]: Output written to /data/20250813_cells2stats_demo-teton-sim-run/Cytoprofiling/<20250902T191241Z>
  3. Use tree to visualize the execution output. The /f option displays the names of the files in each folder. Replace timestamp with the timestamp of your execution.

    tree ./20250813_cells2stats_demo-teton-sim-run/Cytoprofiling/timestamp /f
  4. Review the tree directory to make sure that all output files are present:

    The following is an example of how the CLI lists the files:

    ./20250813_cells2stats_demo-teton-sim-run/Cytoprofiling/timestamp
    │   AverageNormWellStats.csv
    │   multiqc_report.html
    │   Panel.json
    │   RawCellStats.csv
    │   RawCellStats.parquet
    │   RunManifest.csv
    │   RunManifest.json
    │   RunParameters.json
    │   RunStats.json
    │   Versions.json
    │
    ├───CellSegmentation
    │   ├───WellA1
    │   │       L2R02C03S1_Cell.tif
    │   │       L2R02C03S1_Nuclear.tif
    │   │
    │   └───WellA2
    │           L1R02C02S1_Cell.tif
    │           L1R02C02S1_Nuclear.tif
    │
    ├───Logs
    │       Cells2Stats.log
    │       MultiQCWrapper.log
    │
    ├───multiqc_data
    │       barcoding_bar_Assigned_Reads.txt
    │       barcoding_bar_Mismatches.txt
    │       batch_metrics.txt
    │       batch_metrics_table.txt
    │       BETA-multiqc.parquet
    │       cell_assignment_bar_Batch_Counts.txt
    │       cell_assignment_bar_Batch_Density.txt
    │       cell_assignment_bar_Extracellular_Ratio.txt
    │       cell_assignment_bar_Total_Counts.txt
    │       cell_assignment_bar_Total_Density.txt
    │       cell_segmentation_bar_Cell_Count.txt
    │       cell_segmentation_bar_Cell_Diameter.txt
    │       cell_segmentation_bar_Confluency.txt
    │       cell_segmentation_bar_Nucleated_Cells.txt
    │       controls_bar_Negative_Control_1.txt
    │       controls_bar_Negative_Control_2.txt
    │       controls_bar_Negative_Control_3.txt
    │       controls_bar_Negative_Control_4.txt
    │       controls_bar_Positive_Control_1.txt
    │       multiqc.log
    │       multiqc_citations.txt
    │       multiqc_data.json
    │       multiqc_software_versions.txt
    │       multiqc_sources.txt
    │       well_assignment_plot.txt
    │       well_barcoding_plot.txt
    │       well_control_plot.txt
    │       well_metrics.txt
    │       well_metrics_table.txt
    │       well_segmentation_plot.txt
    │
    └───Wells
        ├───WellA1
        │   ├───B01
        │   │       L2R02C03S1_barcodes.parquet
        │   │
        │   ├───B02
        │   │       L2R02C03S1_barcodes.parquet
        │   │
        │   ├───B03
        │   │       L2R02C03S1_barcodes.parquet
        │   │
        │   ├───B04
        │   │       L2R02C03S1_barcodes.parquet
        │   │
        │   ├───B05
        │   │       L2R02C03S1_barcodes.parquet
        │   │
        │   ├───B06
        │   │       L2R02C03S1_barcodes.parquet
        │   │
        │   └───B07
        │           L2R02C03S1_barcodes.parquet
        │
        └───WellA2
            ├───B01
            │       L1R02C02S1_barcodes.parquet
            │
            ├───B02
            │       L1R02C02S1_barcodes.parquet
            │
            ├───B03
            │       L1R02C02S1_barcodes.parquet
            │
            ├───B04
            │       L1R02C02S1_barcodes.parquet
            │
            ├───B05
            │       L1R02C02S1_barcodes.parquet
            │
            ├───B06
            │       L1R02C02S1_barcodes.parquet
            │
            └───B07
                    L1R02C02S1_barcodes.parquet
  5. Access files in the output.

    • Access Logs/Cells2Stats.log to view logs and check if any errors occurred.
    • Open multiqc_report.html to view the MultiQC report.
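
Because each execution writes to a new timestamped folder, it can be convenient to locate the most recent one from a script. The sketch below assumes a Linux or WSL 2 shell and assumes that folder names follow the pattern of the example timestamp 20250902T191241Z, so that alphabetical order matches chronological order. The function names are hypothetical. Searching the log for the word "error" is a heuristic assumption, not a documented log format.

```shell
# Hypothetical helper: print the newest timestamped folder under <run>/Cytoprofiling.
latest_output_dir() (
    base=$1/Cytoprofiling
    [ -d "$base" ] || exit 1
    newest=
    for d in "$base"/*/; do
        [ -d "$d" ] || continue
        newest=${d%/}   # globs expand in sorted order, so the last match is the newest
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Hypothetical helper: print log lines that mention "error" (case-insensitive).
# Assumption: the log format is not specified in this tutorial, so treat this as a heuristic.
log_errors() {
    grep -i 'error' "$1/Logs/Cells2Stats.log"
}

# Example use:
# out=$(latest_output_dir 20250813_cells2stats_demo-teton-sim-run) && log_errors "$out"
```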

Troubleshooting

If you receive an error message that states the input files do not exist, make sure that the input files are available to the Docker container through the mounted file system.

  1. Run the Docker container in interactive mode and attempt to list the files in the mounted file system:

    docker run --rm -i -v ${pwd}:/data  elembio/cells2stats ls /data/20250813_cells2stats_demo-teton-sim-run
  2. Review the returned list. If the input folder for the command is correct, the CLI lists the following files and folders:

    BaseCalling
    CellSegmentation
    Cytoprofiling
    Panel.json
    Projection
    RunManifest.csv
    RunManifest.json
    RunParameters.json
  3. If the CLI does not list the expected files, complete the following troubleshooting steps:

    • Make sure that the working directory in the CLI is the cells2stats-setup folder that you created for this tutorial, because the command mounts the current working directory.
    • Make sure that the input data is present in your current working directory.
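
A similar check can be made on the host before involving Docker. The sketch below, for a Linux or WSL 2 shell, reports which of the expected entries are missing from a run folder. The function name missing_run_entries is hypothetical. The entry names come from the listing above. Cytoprofiling is left out on the assumption that Cells2Stats creates it during execution.

```shell
# Hypothetical helper: list expected run-folder entries that are missing.
# Returns 0 when nothing is missing.
missing_run_entries() (
    run_dir=$1
    status=0
    for entry in BaseCalling CellSegmentation Panel.json Projection \
                 RunManifest.csv RunManifest.json RunParameters.json; do
        if [ ! -e "$run_dir/$entry" ]; then
            echo "missing: $entry"
            status=1
        fi
    done
    exit "$status"
)

# Example use from the folder that you mount with -v:
# missing_run_entries ./20250813_cells2stats_demo-teton-sim-run && echo "input folder looks complete"
```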

Visualization Files for CytoCanvas

AVITI24 output files require preprocessing by Cells2Stats to be used with CytoCanvas. Preprocessing includes tiling images and building a visualization manifest. To generate visualization files when you run Cells2Stats, use one of the following methods:

  • Use the --visualization argument to regenerate all cell statistics and generate visualization files.
  • Use the --visualization-only argument to generate only the visualization files.

The following instructions demonstrate how to generate visualization files using the simulated data:

  1. Append the optional --visualization or --visualization-only argument to the end of your execution command and rerun the command. The following example command uses --visualization-only:

    docker run --rm -v ${pwd}:/data  elembio/cells2stats cells2stats /data/20250813_cells2stats_demo-teton-sim-run --visualization-only
  2. When the execution completes, the visualization files are available at ./20250813_cells2stats_demo-teton-sim-run/Cytoprofiling/timestamp/visualization.

  3. (Optional) Install CytoCanvas and visualize the data.
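
To confirm from a script that visualization files were written, a check along the following lines may help. It assumes a Linux or WSL 2 shell and only tests that the visualization folder exists and is not empty. It does not validate the contents of the folder. The function name has_visualization is hypothetical.

```shell
# Hypothetical helper: succeed if <output>/visualization exists and is not empty.
has_visualization() (
    vis_dir=$1/visualization
    [ -d "$vis_dir" ] || exit 1
    # ls -A lists every entry except . and .. so empty output means an empty folder.
    [ -n "$(ls -A "$vis_dir")" ]
)

# Example use, where $out is the timestamped output folder of your execution:
# has_visualization "$out" && echo "visualization files found"
```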

Additional Resources

For more information related to the topics in this tutorial, see the following resources: