Output Files
The following table lists the files and folders that Cells2Stats outputs. Parquet files are column-based files that efficiently store data. For more information, see the Apache Parquet Documentation.
File | Directory and File Name | Description | Quantity |
---|---|---|---|
Run Manifest, CSV | {root}/RunManifest.csv | Run manifest for the Cells2Stats execution | One per run |
Run Manifest, JSON | {root}/RunManifest.json | JSON file that is reserved for Element Biosciences™ processes | One per run |
Run Parameters | {root}/RunParameters.json | JSON file that records information about the run configuration | One per run |
Run Stats | {root}/RunStats.json | JSON file that records overall statistics about the run | One per run |
Panel | {root}/Panel.json | JSON file that records information about the targets for the run | One per run |
Cell Segmentation Mask | {root}/CellSegmentation/{well}/{tile}_Cell.tif | Cell segmentation masks for a well, where the Cell ID is the value for a pixel in a cell | One per tile per well |
Nuclear Segmentation Mask | {root}/CellSegmentation/{well}/{tile}_Nuclear.tif | Nuclear segmentation masks for a well, where 1 is the value for a pixel in a nucleus | One per tile per well |
Average Normalized Well Statistics | {root}/AverageNormWellStats.csv | Filtered and average metrics for each well in the run | One per run |
Versions, JSON | {root}/Versions.json | File that reports the version number for CSV output files and bundled software programs | One per run |
Raw Cell Statistics, CSV | {root}/RawCellStats.csv | CSV file that reports values per cell for all morphology features and raw target counts in a run | One per run |
Raw Cell Statistics, Parquet (Viz) | {root}/visualization/RawCellStats.parquet | Parquet file that reports values per cell for all morphology features and raw target counts in a run | One per run |
Barcodes | {root}/Wells/{well}/{batch}/{tile}_barcodes.parquet | Parquet files that provide barcoding information for each polony in a tile | One per tile per batch per well |
Cells2Stats Log | {root}/Log/Cells2Stats.log | File that records logs for the Cells2Stats execution | One per run |
Target Counts, JSON | {root}/TargetCounts.json | (For Teton Atlas™ runs with Target Cell Assignment enabled) JSON file that records information about the target counts | One per run |
Target Cell Assignment Manifest, CSV | {root}/TargetCellAssignmentManifest.csv | Manifest for Teton Atlas runs with Target Cell Assignment enabled | One per run |
Target Cell Assignment Manifest, JSON | {root}/TargetCellAssignmentManifest.json | JSON file that is reserved for Element processes | One per run |
MultiQC Report, HTML | {root}/multiqc_report.html | Interactive HTML quality control (QC) summary that is generated from the RunStats.json summary. Not present if --skip-html-report is used | One per run |
Antibody Screen Kit Report, CSV | {root}/AntibodyScreenKitReport.csv | (For Teton™ Custom Screen runs) Report with cell, quality control, and custom protein information | One per run |
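The paths in the table above can be checked programmatically after a run. The following is a minimal sketch rather than an official Element utility: the helper names `missing_outputs` and `segmentation_masks` are hypothetical, and the list of always-expected files is an assumption that simply takes every per-run entry the table lists without a run-type or flag condition.

```python
from pathlib import Path

# Per-run files that the table lists without a run-type or flag condition.
# Conditional outputs (Target Cell Assignment, Custom Screen, HTML report,
# visualization) are left out on purpose. This selection is an assumption.
ALWAYS_EXPECTED = [
    "RunManifest.csv",
    "RunManifest.json",
    "RunParameters.json",
    "RunStats.json",
    "Panel.json",
    "AverageNormWellStats.csv",
    "Versions.json",
    "RawCellStats.csv",
    "Log/Cells2Stats.log",
]


def missing_outputs(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in ALWAYS_EXPECTED if not (root / rel).is_file()]


def segmentation_masks(root, well):
    """Return sorted (cell, nuclear) mask path lists for one well.

    Follows the {root}/CellSegmentation/{well}/{tile}_Cell.tif and
    {tile}_Nuclear.tif patterns from the table.
    """
    well_dir = Path(root) / "CellSegmentation" / well
    cells = sorted(well_dir.glob("*_Cell.tif"))
    nuclei = sorted(well_dir.glob("*_Nuclear.tif"))
    return cells, nuclei
```

Calling `missing_outputs` on an output directory returns an empty list when every file in `ALWAYS_EXPECTED` is present. Adjust the list for runs where some outputs are intentionally skipped.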
Visualization Files
The visualization folder and visualization files are generated when the --visualization or --visualization-only argument is used. An additional log file is also generated. The following table lists the additional folders and files that are generated when visualization is run:
File | Directory and File Name | Description | Quantity |
---|---|---|---|
cyto.viz | {root}/visualization/cyto.viz | File for use by the Element CytoCanvas™ tool | One per execution, when visualization is executed |
visualization.log | {root}/Log/Visualization.log | File that records logs for the visualization execution | One per execution, when visualization is executed |
cells | {root}/visualization/cells | Additional folder that is used by CytoCanvas for cell boundaries | One per execution, when visualization is executed |
targets | {root}/visualization/targets | Additional folder that is used by CytoCanvas to specify targets and locations | One per execution, when visualization is executed |
locations | {root}/visualization/locations | Additional folder that is used by CytoCanvas to specify targets and locations | One per execution, when visualization is executed |
multiscale_flowcell.zarr | {root}/visualization/multiscale_flowcell.zarr | Additional folder that contains imaging channels in PNG in an OME-Zarr format | One per execution, when visualization is executed |
WellInformation.json | {root}/visualization/WellInformation.json | Additional file that contains information on well location, label, and color in JSON format | One per execution, when visualization is executed |
TileInformation.json | {root}/visualization/TileInformation.json | Additional file that contains information on tile location, label, and color in JSON format | One per execution, when visualization is executed |
RawCellStats.parquet | {root}/visualization/RawCellStats.parquet | Parquet file that reports values per cell for all morphology features and raw target counts in a run | One per run |
MultiQC Reports
MultiQC reports, which are generated by Seqera and designed by Element Biosciences, are available for use with Cells2Stats. MultiQC reports analyze results and statistics from bioinformatics tool outputs, such as log files and console outputs. These reports help summarize experiments that contain multiple samples and multiple analysis steps. They are designed to be placed at the end of pipelines or to be run manually when you finish running your tools. MultiQC reports also contain the parsed data in a structured format that is ready for downstream analysis. For more information, see the MultiQC Documentation.
Metrics
The output files contain a variety of metrics, such as tile-specific and average metrics.
- The RawCellStats.csv and RawCellStats.parquet files contain a full set of morphology and quantification metrics for each target and batch.
- The AverageNormWellStats.csv file provides the averages of these metrics for each well. Metrics that end with .std provide the standard deviation for the metric.
Metrics files report metrics from the following CellProfiler modules, unless the run uses the --skip-cellprofiler argument:
- MeasureObjectSizeShape
- MeasureGranularity
- MeasureObjectIntensity
- MeasureObjectIntensityDistribution
- MeasureTexture
Certain metrics from these modules are not available in these files. For example, the output files do not report Zernike metrics. In some files, columns for Z-axis metrics appear with values of 0. Z-axis metrics are not available in the RawCellStats.csv and RawCellStats.parquet files because they are not relevant to the analysis output.
For more information on cytoprofiling metrics, see the measurement information in the CellProfiler Manuals.
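The .std suffix convention described above can be used to pair each averaged metric with its standard deviation column. The following is a minimal sketch using only the Python standard library. The column names `Well` and `Area` and the sample values are hypothetical, and the sketch assumes that a standard deviation column is named exactly as its metric plus `.std`.

```python
import csv
import io


def pair_std_columns(header):
    """Map each metric column to its '.std' companion column, if present."""
    names = set(header)
    return {
        name: name + ".std"
        for name in header
        if not name.endswith(".std") and name + ".std" in names
    }


def read_metric_with_std(csv_text, metric):
    """Return (mean, std) float pairs for one metric, one pair per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row[metric]), float(row[metric + ".std"])) for row in reader]


# Hypothetical two-well extract; real files have many more columns.
sample = "Well,Area,Area.std\nA1,100.0,5.0\nB1,120.0,7.5\n"
header = sample.splitlines()[0].split(",")
print(pair_std_columns(header))
print(read_metric_with_std(sample, "Area"))
```

For a real run, read the text from {root}/AverageNormWellStats.csv instead of the inline sample.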
Run Statistics
The RunStats.json file provides overall run statistics. The following table describes the run statistics fields:
Field | Description | Data Type |
---|---|---|
AnalysisID | Unique ID for the analysis | String |
AnalysisVersion | Version of the analysis software | String |
AssignedCountPerMM2 | Target counts per mm² of the cell area of a quality control target. Teton and Teton Custom Screen with quality control targets. | Float |
AssignedCountsPerMM2 | Target counts per mm² of the cell area of all barcoding target counts in the CytoStats and Teton Atlas target counts. | Float |
AverageAssignedCountsPerMM2 | Target counts per mm² of the cell area of all barcoding targets, averaged across wells. | Float |
BatchName | Name of the batch | String |
Batches | List of batch-specific statistics | Array of Objects |
Count | Count of how many times a particular unassigned sequence appears | Integer |
DemuxStats | Statistics related to demultiplexing | Object |
ExpectedSequence | Expected DNA sequence for the target | String |
ExtraCellularRatio | The ratio of extracellular to intracellular assigned target density | Float |
FileVersion | Version of the file format | String |
FlowCellID | ID of the flow cell that is used | String |
ManifestLabel | Label of the custom protein in the manifest when a custom protein is included. Teton Custom Screen with antibody targets. | String |
MeanAssignedBarcodingCountPerCell | Mean number of polonies assigned to a target sequence in a cell per well | Float |
MeanUniqueTargetsPerCell | The mean number of unique targets found in a cell. Teton Atlas with Target Cell Assignment enabled. | Float |
MedianAbundantTargetCount | The median count of the target that is assigned to cells. Teton Atlas with Target Cell Assignment enabled. | Float |
MedianCellDiameter | Median diameter of a cell after cell segmentation. | Float |
NuclearFraction | Fraction of the cellular assigned reads of the target that are in the nucleus. Teton Custom Screen with antibody targets. | Float |
NucleatedRate | The fraction of cells that contain a segmented nucleus | Float |
NumPolonies (Total) | Number of polonies that are detected | Integer |
NumPolonies (Targets) | Number of polonies that are assigned to this target | Integer |
NumPolonies (Wells) | Number of polonies that are in the well | Integer |
PercentAssigned | Percentage of reads that are assigned to a target within the target site | Float |
PercentAssignedMixedCells | Percentage of cells that are assigned and mixed with a confidence score above 50% | Float |
PercentAssignedPureCells | Percentage of cells that are assigned and pure with a confidence score above 50% | Float |
PercentAssignedReads (Targets) | Percentage of reads that are assigned to targets | Float |
PercentAssignedReads (Wells) | Percentage of reads that are assigned to targets in the well | Float |
PercentMismatch (Total) | Percentage of reads with at least one base pair mismatch | Float |
PercentMismatch (Targets) | Percentage of reads that are assigned to a target with at least one base pair mismatch | Float |
PercentMismatch (Wells) | Percentage of reads in the well with at least one base pair mismatch | Float |
PercentTargetDropout | Percentage of targets not assigned to any cell. Teton Atlas with Target Cell Assignment enabled. | Float |
PercentUnassignedLowCountCells | Percentage of cells that are unassigned and pure but below the confidence threshold of 50%. | Float |
PercentUnassignedMixedCells | Percentage of cells that are unassigned and mixed with a confidence score below 50%. | Float |
RunID | Unique ID for the run | String |
RunName | Name of the sequencing run | String |
Sequence | DNA sequence of unassigned reads | String |
TargetName | Name of the sequencing target | String |
TargetSiteName | Name of the target site as provided in the Panel.json. Teton Atlas with Target Cell Assignment enabled. | String |
Targets | List of target-specific statistics | Array of Objects |
UnassignedSequences | List of unassigned sequence details | Array of Objects |
WellLocation | Location of the well (for example, A1-F2) | String |
Wells | List of well-specific statistics | Array of Objects |
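Several fields in the table above, such as NumPolonies and PercentMismatch, appear at more than one level (total, per target, and per well). Because the exact nesting of RunStats.json is not described here, one cautious way to inspect it is to search the whole document for a field name. The following is a minimal sketch: the helper name `find_field` is hypothetical, and the nesting and values in the sample fragment are assumptions made only for illustration.

```python
import json


def find_field(node, field, path=""):
    """Yield (path, value) for every occurrence of field in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == field:
                yield child, value
            # Keep descending so that deeper occurrences are also reported.
            yield from find_field(value, field, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_field(item, field, f"{path}[{index}]")


# Hypothetical fragment: the field names come from the table above, but the
# nesting and the values are assumptions made for this example.
sample = json.loads("""
{
  "RunName": "example",
  "DemuxStats": {
    "NumPolonies": 1000,
    "Wells": [
      {"WellLocation": "A1", "NumPolonies": 600},
      {"WellLocation": "B1", "NumPolonies": 400}
    ]
  }
}
""")

for location, value in find_field(sample, "NumPolonies"):
    print(location, value)
```

For a real run, load the data with `json.load` from {root}/RunStats.json. The printed paths show where each occurrence sits, which helps distinguish the total, target, and well variants of a field.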
Versions
The Versions.json file reports the version number for CSV output files and bundled software programs. The following table describes the version fields:
Field | Description | Data Type |
---|---|---|
FileVersion | Overall version of the file | String |
FileVersions | Versions of individual files in the dataset | Object |
AverageNormWellStats.csv | Version of AverageNormWellStats.csv file | String |
RawCellStats.csv | Version of RawCellStats.csv file | String |
ProgramVersions | Versions of programs that are used in the analysis | Object |
CellProfiler | Version of the CellProfiler program | String |
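The version fields can be flattened into a short summary for logging or provenance records. The following is a minimal sketch. It assumes that the per-file entries sit inside the FileVersions object and that CellProfiler sits inside the ProgramVersions object, as the order of the table suggests. The helper name `summarize_versions` is hypothetical, and the version strings are placeholders.

```python
import json


def summarize_versions(versions):
    """Flatten a Versions.json-style dict into 'name: version' lines."""
    lines = [f"FileVersion: {versions.get('FileVersion', 'unknown')}"]
    for group in ("FileVersions", "ProgramVersions"):
        for name, version in sorted(versions.get(group, {}).items()):
            lines.append(f"{group}/{name}: {version}")
    return lines


# Hypothetical content; the nesting is assumed and the versions are placeholders.
sample = json.loads("""
{
  "FileVersion": "1.0.0",
  "FileVersions": {
    "AverageNormWellStats.csv": "1.0.0",
    "RawCellStats.csv": "1.0.0"
  },
  "ProgramVersions": {"CellProfiler": "0.0.0"}
}
""")
print("\n".join(summarize_versions(sample)))
```

Missing groups are skipped rather than treated as errors, so the helper tolerates files that omit a section.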
Other Files
Some files are copied from the input directory, such as RunManifest.json, RunParameters.json, Panel.json, Cell Segmentation Masks, Nuclear Segmentation Masks, and Barcodes. For more information on these files, see Cytoprofiling Run Output Files.