Output files
The following table lists the files and folders that Cells2Stats outputs. Parquet files are column-based files that efficiently store data. For more information, see the Apache Parquet Documentation.
| File | Directory and File Name | Description | Quantity |
|---|---|---|---|
| Spatial Data Object | {root}/SpatialData/RunID.zarr.zip | Zarr-based structured object containing all modalities from a run for downstream analysis and visualization. For more information, see Spatial Data Object. | One per run |
| CytoCanvas Visualization File | {root}/SpatialData/cyto.viz | File for use by the Element CytoCanvas™ tool | One per run |
| Run Manifest, JSON | {root}/RunManifest.json | JSON file that is reserved for Element Biosciences™ processes | One per run |
| Run Parameters | {root}/RunParameters.json | JSON file that records information about the run configuration | One per run |
| Panel | {root}/Panel.json | JSON file that records information about the targets for the run | One per run |
| Run Stats | {root}/RunStats.json | JSON file that records overall statistics about the run | One per run |
| Average Normalized Well Statistics | {root}/AverageNormWellStats.csv | Filtered and average metrics for each well in the run | One per run |
| Raw Cell Statistics, CSV | {root}/RawCellStats.csv | CSV file that reports values per cell for all morphology features and raw target counts in a run | One per run |
| Versions, JSON | {root}/Versions.json | File that reports the version number for CSV output files and bundled software programs | One per run |
| Barcodes | {root}/Wells/{well}/{batch}/{tile}_barcodes.parquet | Parquet files that provide barcoding information for each polony in a tile | One per tile per batch per well |
| MultiQC Report, HTML | {root}/multiqc_report.html | Interactive HTML quality control (QC) summary that is generated from the RunStats.json summary. Not present if --skip-html-report is used | One per run |
| Run Manifest, CSV | {root}/RunManifest.csv | Run manifest for the Cells2Stats execution | One per run |
| Cell Segmentation Mask | {run}/CellSegmentation/{well}/{tile}_Cell.tif | Cell segmentation masks for a well, where the Cell ID is the value for a pixel in a cell | One per tile per well |
| Nuclear Segmentation Mask | {run}/CellSegmentation/{well}/{tile}_Nuclear.tif | Nuclear segmentation masks for a well, where 1 is the value for a pixel in a nucleus | One per tile per well |
| Target Counts, JSON | {root}/TargetCounts.json | (For Teton Atlas™ runs with Target Cell Assignment enabled) JSON file that records read counts per spacer target across the run. Use this to assess library representation and identify low-count or absent spacers. | One per run |
| Target Cell Assignment Manifest, CSV | {root}/TargetCellAssignmentManifest.csv | Copy of the target cell assignment manifest used in this run. Confirms which spacer sequences and settings were applied. | One per run |
| Target Cell Assignment Manifest, JSON | {root}/TargetCellAssignmentManifest.json | JSON file that is reserved for Element processes | One per run |
| Antibody Screen Kit Report, CSV | {root}/AntibodyScreenKitReport.csv | (For Teton™ Custom Screen runs) Report with cell, quality control, and custom protein information | One per run |
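The table above can double as a checklist after a run. The following is a minimal sketch, not an Element tool: the function name `check_outputs` is hypothetical, and treating every unconditional "One per run" file as always present is an assumption. Files that the table marks as conditional (the HTML report and the Teton Atlas and Teton Custom Screen files) are reported separately instead of being flagged as missing.

```python
from pathlib import Path

# Per-run files from the table above, relative to the output root.
# Assumption: these are written for every run.
ALWAYS_EXPECTED = [
    "RunManifest.json",
    "RunParameters.json",
    "Panel.json",
    "RunStats.json",
    "AverageNormWellStats.csv",
    "RawCellStats.csv",
    "Versions.json",
    "RunManifest.csv",
]

# Files the table describes as depending on run type or command-line options.
CONDITIONAL = [
    "multiqc_report.html",
    "TargetCounts.json",
    "TargetCellAssignmentManifest.csv",
    "TargetCellAssignmentManifest.json",
    "AntibodyScreenKitReport.csv",
]


def check_outputs(root):
    """Summarize which documented output files exist under `root`."""
    root = Path(root)
    missing = [name for name in ALWAYS_EXPECTED if not (root / name).is_file()]
    optional_present = [name for name in CONDITIONAL if (root / name).is_file()]
    # Pattern follows {root}/Wells/{well}/{batch}/{tile}_barcodes.parquet
    barcode_files = sorted(root.glob("Wells/*/*/*_barcodes.parquet"))
    return {
        "missing": missing,
        "optional_present": optional_present,
        "barcode_files": len(barcode_files),
    }
```

Calling `check_outputs("/path/to/output")` returns a dictionary. An empty `missing` list means every unconditional per-run file in the sketch was found.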
Target cell assignment output details
When target cell assignment is enabled for at least one target site in a run, Cells2Stats populates target cell assignment-specific fields in RunStats.json and generates the following additional output files. For more information on how cell assignment works, see Cell assignment logic.
Cells are classified as pure (single spacer detected), mixed (multiple spacers, one dominant), or unassigned based on a confidence score. For a full explanation of these categories and how cell assignment is calculated, see Pure cells vs. mixed cells.
Key target cell assignment output files
| File | What to use it for |
|---|---|
| TargetCounts.json | See total reads detected per spacer. High dropout (many spacers with zero reads) may indicate a manifest or TargetMask issue. |
| TargetCellAssignmentManifest.csv | Confirm which sequences and settings were applied. Compare against your source library file to catch discrepancies. |
| Panel.json | Contains the TargetMask that was applied per target site. Check here if assignment rates are unexpectedly low. |
| RunStats.json | Contains all target cell assignment-specific quality metrics. See Target cell assignment metrics. |
| RawCellStats.csv | Per-cell data including raw target read counts at each target site. Use for downstream per-cell analysis. |
Target cell assignment metrics in RunStats.json
| Field | Description | What to watch for |
|---|---|---|
| PercentAssignedPureCells | Cells assigned to a single dominant spacer (confidence >0.5, meets minimum read count) | Primary metric; aim as high as possible for your MOI |
| PercentAssignedMixedCells | Cells with multiple spacers detected, but one is dominant (confidence >0.5) | Expected to increase with higher MOI; still valid data |
| PercentUnassignedMixedCells | Cells with multiple spacers; no spacer exceeds 50% confidence | High values suggest MOI may be too high for reliable single-guide assignment |
| PercentUnassignedLowCountCells | Spacer was matched but read count is below the minimum threshold | High values suggest low expression or insufficient sequencing depth |
| PercentTargetDropout | Percentage of spacers in the manifest not assigned to any cell | High dropout may indicate library representation problems |
| MedianAbundantTargetCount | Median read count of dominant spacer in assigned cells | Indicator of per-cell sequencing depth at target site |
| MeanUniqueTargetsPerCell | Mean number of unique spacers detected per cell | Values substantially above 1 suggest high MOI |
| UnassignedSequences | List of sequences detected at target site but not matching any manifest entry | Primary troubleshooting resource; see Using UnassignedSequences for troubleshooting |
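To make the four cell categories in the table concrete, the following sketch classifies a single cell from its per-spacer read counts. It is an illustration only and is not the Cells2Stats implementation. Several parts are assumptions: the confidence score is taken to be the dominant spacer's share of all spacer reads in the cell, the minimum read count `min_reads=3` is a hypothetical value, and the order of the checks is a guess. See Cell assignment logic for the authoritative rules.

```python
def classify_cell(spacer_counts, min_reads=3, min_confidence=0.5):
    """Return an assignment category for one cell.

    spacer_counts maps spacer name -> read count in the cell.
    Assumption: confidence = dominant spacer reads / total spacer reads.
    min_reads is a hypothetical threshold, not a documented default.
    """
    detected = {s: c for s, c in spacer_counts.items() if c > 0}
    if not detected:
        # Not one of the four categories in the table; no spacer reads at all.
        return "no_spacer_detected"

    dominant = max(detected.values())
    confidence = dominant / sum(detected.values())

    if confidence <= min_confidence:
        # Several spacers and none exceeds the confidence threshold.
        return "unassigned_mixed"
    if dominant < min_reads:
        # A spacer dominates, but with too few reads to trust.
        return "unassigned_low_count"
    return "assigned_pure" if len(detected) == 1 else "assigned_mixed"
```

Under these assumptions, a cell with reads from one spacer only always has a confidence of 1.0, so it can be either assigned pure or unassigned for low count, but never unassigned mixed.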
Using UnassignedSequences for troubleshooting
The UnassignedSequences field lists actual nucleotide sequences of reads detected at the target locus that did not match any spacer in the manifest. Inspect this field when assignment rates are low.
| What you see | Likely cause | Resolution |
|---|---|---|
| Sequences that look like your spacers but reversed | Orientation error in manifest | Re-enter spacers in 5' to 3' direction |
| Spacer-like sequences of a different length | TargetMask or sequence length mismatch | Confirm the Y cycle count in TargetMask equals spacer length |
| Recognizable spacers from your library | Spacers missing from manifest | Add missing entries to [TARGET INFORMATION] |
| Unrecognizable sequences | Off-target reads or artifacts | Contact Element Technical Support |
For more information on assignment thresholds and additional troubleshooting, see Cell assignment logic.
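The first two rows of the troubleshooting table can be screened automatically before manual inspection. The sketch below compares one unassigned sequence against the manifest spacers. The function name `diagnose` and its return labels are hypothetical. The reverse-complement check is an extra test that the table does not list, included on the assumption that it is another common orientation mistake.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def diagnose(sequence, manifest_spacers):
    """Suggest why an unassigned sequence did not match the manifest."""
    seq = sequence.upper()
    spacers = {s.upper() for s in manifest_spacers}

    if seq in spacers:
        return "in_manifest"
    if seq[::-1] in spacers:
        # Matches a spacer read backwards: likely orientation error.
        return "reversed"
    if seq.translate(COMPLEMENT)[::-1] in spacers:
        # Extra check, not in the table above.
        return "reverse_complement"
    for spacer in spacers:
        # One sequence contains the other but lengths differ:
        # possible TargetMask or spacer length mismatch.
        if len(spacer) != len(seq) and (seq in spacer or spacer in seq):
            return "length_mismatch"
    return "unrecognized"
```

A result of `unrecognized` corresponds to the last two rows of the table, which still need a manual comparison against your source library file.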
MultiQC Reports
MultiQC reports, which are generated by Seqera and designed by Element Biosciences, are available for use with Bases2Fastq™ and Cells2Stats. MultiQC reports analyze results and statistics from bioinformatics tool outputs, such as log files and console outputs. These reports help summarize experiments that contain multiple samples and multiple analysis steps. They are designed to be placed at the end of pipelines or to be run manually after your tools finish. MultiQC reports also present the parsed data in a readable format that is ready for downstream analysis. For more information, see MultiQC Documentation.
Metrics
The output files contain a variety of metrics such as tile-specific and average metrics.
- The RawCellStats.csv and RawCellStats.parquet files contain a full set of morphology and quantification metrics for each target and batch.
- The AverageNormWellStats.csv file provides the averages of these metrics for each well. Metrics that end with .std provide the standard deviation for the metric.
Metrics files report metrics from the following CellProfiler modules, unless you run Cells2Stats with the --skip-cellprofiler argument:
- MeasureObjectSizeShape
- MeasureGranularity
- MeasureObjectIntensity
- MeasureObjectIntensityDistribution
- MeasureTexture
Certain metrics from these modules are not available in these files. For example, the output files do not report Zernike metrics. In some files, columns for Z-axis metrics appear with values of 0. Z-axis metrics are not available in the RawCellStats.csv and RawCellStats.parquet files because they are not relevant to the analysis output.
For more information on cytoprofiling metrics, see the measurement information in the CellProfiler Manuals.
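As described in this section, AverageNormWellStats.csv stores each per-well average alongside a companion column that ends with .std. The sketch below pairs the two using only the Python standard library. The column names `Well` and `AreaShape_Area` are hypothetical placeholders, and the assumption that the average column has the same name as the .std column without the suffix should be checked against your own file header.

```python
import csv
import io


def split_metric_columns(fieldnames):
    """Return (.std columns, mapping of metric -> its .std column)."""
    suffix = ".std"
    std_cols = [c for c in fieldnames if c.endswith(suffix)]
    # Assumption: the average column is the .std name without the suffix.
    paired = {
        c[: -len(suffix)]: c for c in std_cols if c[: -len(suffix)] in fieldnames
    }
    return std_cols, paired


def well_summary(handle, metric, well_column="Well"):
    """Map each well to (average, standard deviation) for one metric.

    well_column is a hypothetical column name; adjust it to your file.
    """
    reader = csv.DictReader(handle)
    _, paired = split_metric_columns(reader.fieldnames or [])
    std_col = paired.get(metric)
    summary = {}
    for row in reader:
        average = float(row[metric])
        std = float(row[std_col]) if std_col else None
        summary[row[well_column]] = (average, std)
    return summary
```

To use it on a real file, open the CSV with `open(path, newline="")` and pass the handle to `well_summary`.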
Run statistics
The RunStats.json file provides overall run statistics. The following table describes the run statistics fields:
| Field | Description | Data Type |
|---|---|---|
| AnalysisID | Unique ID for the analysis | String |
| AnalysisVersion | Version of the analysis software | String |
| AssignedCountPerMM2 | Target counts per mm² of the cell area of a quality control target. Teton and Teton Custom Screen with quality control targets. | Float |
| AssignedCountsPerMM2 | Target counts per mm² of the cell area of all barcoding targets. | Float |
| AverageAssignedCountsPerMM2 | Target counts per mm² of the cell area of all barcoding targets, averaged across wells. | Float |
| BatchName | Name of the batch | String |
| Batches | List of batch-specific statistics | Array of Objects |
| Count | Count of how many times a particular unassigned sequence appears | Integer |
| DemuxStats | Statistics related to demultiplexing | Object |
| ExpectedSequence | Expected DNA sequence for the target | String |
| ExtraCellularRatio | The ratio of extracellular to intracellular assigned target density | Float |
| FileVersion | Version of the file format | String |
| FlowCellID | ID of the flow cell that is used | String |
| ManifestLabel | Label of the custom protein in the manifest when a custom protein is included. Teton Custom Screen with antibody targets. | String |
| MeanAssignedBarcodingCountPerCell | Mean number of polonies assigned to a target sequence in a cell per well | Float |
| MeanUniqueTargetsPerCell | The mean number of unique targets found in a cell. Teton Atlas with Target Cell Assignment enabled. | Float |
| MedianAbundantTargetCount | The median count of the target that is assigned to cells. Teton Atlas with Target Cell Assignment enabled. | Float |
| MedianCellDiameter | Median diameter of a cell after cell segmentation. | Float |
| NuclearFraction | Fraction of the cellular assigned reads of the target that are in the nucleus. Teton Custom Screen with antibody targets. | Float |
| NucleatedRate | The fraction of cells that contain a segmented nucleus | Float |
| NumPolonies (Total) | Number of polonies that are detected | Integer |
| NumPolonies (Targets) | Number of polonies that are assigned to this target | Integer |
| NumPolonies (Wells) | Number of polonies that are in the well | Integer |
| PercentAssigned | Percentage of reads that are assigned to a target within the target site | Float |
| PercentAssignedMixedCells | Percentage of cells that are assigned and mixed with a confidence score above 50% | Float |
| PercentAssignedPureCells | Percentage of cells that are assigned that are pure with a confidence score above 50% | Float |
| PercentAssignedReads (Targets) | Percentage of reads that are assigned to targets | Float |
| PercentAssignedReads (Wells) | Percentage of reads that are assigned to targets in the well | Float |
| PercentCellularAssigned | Percentage of reads that are assigned to cells | Float |
| PercentMismatch (Total) | Percentage of reads with at least one base pair mismatch | Float |
| PercentMismatch (Targets) | Percentage of reads that are assigned to a target with at least one base pair mismatch | Float |
| PercentMismatch (Wells) | Percentage of reads in the well with at least one base pair mismatch | Float |
| PercentTargetDropout | Percent of targets not assigned to any cell. Teton Atlas with Target Cell Assignment enabled. | Float |
| PercentUnassignedLowCountCells | Percentage of cells that are unassigned that are pure but are below the confidence threshold of 50%. | Float |
| PercentUnassignedMixedCells | Percentage of cells that are unassigned and mixed with a confidence score below 50%. | Float |
| RunID | Unique ID for the run | String |
| RunName | Name of the sequencing run | String |
| Sequence | DNA sequence of unassigned reads | String |
| TargetName | Name of the sequencing target | String |
| TargetSiteName | Name of the target site as provided in the Panel.json. Teton Atlas with Target Cell Assignment enabled. | String |
| Targets | List of target-specific statistics | Array of Objects |
| UnassignedSequences | List of unassigned sequence details | Array of Objects |
| WellLocation | Location of the well (for example, A1-F2) | String |
| Wells | List of well-specific statistics | Array of Objects |
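The table lists field names and types but not how the objects nest inside RunStats.json. The sketch below therefore searches for a field at any depth instead of assuming a fixed path. The helper names `find_values` and `top_unassigned` are hypothetical. The only structural assumption is the one the table supports: each `UnassignedSequences` entry is an object with `Sequence` and `Count` fields.

```python
import json


def find_values(node, key):
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_values(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def top_unassigned(run_stats, n=5):
    """Return the n most frequent unassigned sequences as (sequence, count)."""
    entries = [
        entry
        for group in find_values(run_stats, "UnassignedSequences")
        for entry in group
    ]
    entries.sort(key=lambda entry: entry["Count"], reverse=True)
    return [(entry["Sequence"], entry["Count"]) for entry in entries[:n]]


def load_run_stats(path):
    """Read RunStats.json from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `list(find_values(load_run_stats("RunStats.json"), "PercentAssignedPureCells"))` collects that metric wherever it appears, which avoids hard-coding a path through the Wells, Batches, or Targets arrays.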
Versions
The Versions.json file reports the version number for CSV output files and bundled software programs. The following table describes the fields:
| Field | Description | Data Type |
|---|---|---|
| FileVersion | Overall version of the file | String |
| FileVersions | Versions of individual files in the dataset | Object |
| AverageNormWellStats.csv | Version of AverageNormWellStats.csv file | String |
| RawCellStats.csv | Version of RawCellStats.csv file | String |
| ProgramVersions | Versions of programs that are used in the analysis | Object |
| CellProfiler | Version of the CellProfiler program | String |
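The field types suggest that `FileVersions` and `ProgramVersions` are objects that hold the per-file and per-program entries. The sketch below lists every version under that assumption. The nesting is inferred from the table and is not confirmed, and the function name `list_versions` is hypothetical.

```python
import json


def list_versions(versions):
    """Flatten a parsed Versions.json into (label, version) pairs.

    Assumption: FileVersions and ProgramVersions are objects whose keys
    are file or program names and whose values are version strings.
    """
    rows = [("FileVersion", versions.get("FileVersion"))]
    for group in ("FileVersions", "ProgramVersions"):
        for name, version in sorted(versions.get(group, {}).items()):
            rows.append((f"{group}/{name}", version))
    return rows


def load_versions(path):
    """Read Versions.json from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Recording the output of `list_versions(load_versions("Versions.json"))` next to downstream results makes it easier to tell which file format versions an analysis was built against.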
Other files
Some files are copied from the input directory, such as RunManifest.json, RunParameters.json, Panel.json, Cell Segmentation Masks, Nuclear Segmentation Masks, and Barcodes. For more information on these files, see Cytoprofiling Run Output Files.