Output Files
The following table lists the files and folders that Cells2Stats outputs. Parquet files are column-based files that efficiently store data. For more information, see the Apache Parquet Documentation.
File | Directory and File Name | Description | Quantity |
---|---|---|---|
Run manifest | {root}/RunManifest.json | JSON file reserved for Element processes | One per run |
Run parameters | {root}/RunParameters.json | JSON file that records information about the run configuration | One per run |
Run Stats | {root}/RunStats.json | JSON file that records overall statistics about the run | One per run |
Panel | {root}/Panel.json | JSON file that records information about the targets for the run | One per run |
Cell Segmentation Mask | {root}/CellSegmentation/{well}/{tile}_Cell.tif | Cell segmentation masks for a well, where the value for a pixel in a cell is the cell ID | One per tile per well |
Nuclear Segmentation Mask | {root}/CellSegmentation/{well}/{tile}_Nuclear.tif | Nuclear segmentation masks for a well, where the value for a pixel in a nucleus is 1 | One per tile per well |
Average Normalized Well Statistics | {root}/AverageNormWellStats.csv | Filtered and average metrics for each well in the run | One per run |
Versions, JSON | {root}/Versions.json | File that reports the version number for CSV output files and bundled software programs | One per run |
Raw Cell Statistics, CSV | {root}/RawCellStats.csv | CSV file that reports values per cell for all morphology features and raw target counts in a run | One per run |
Raw Cell Statistics, Parquet | {root}/RawCellStats.parquet | Parquet file that reports values per cell for all morphology features and raw target counts in a run | One per run |
Barcodes | {root}/Wells/{well}/{batch}/{tile}_barcodes.parquet | Parquet files that provide barcoding information for each polony in a tile | One per tile per batch per well |
Cells2Stats Log | {root}/Log/Cells2Stats.log | File recording logs for Cells2Stats execution | One per run |
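The path templates in the table can be expanded programmatically when locating per-tile outputs. The following is a minimal sketch, assuming Python with only the standard library. The helper names and the root, well, batch, and tile identifiers are hypothetical and used only for illustration; actual identifiers come from the run itself.

```python
from pathlib import Path


def segmentation_mask_paths(root: str, well: str, tile: str) -> dict:
    """Build the cell and nuclear mask paths for one tile of one well."""
    base = Path(root) / "CellSegmentation" / well
    return {
        "cell": base / f"{tile}_Cell.tif",
        "nuclear": base / f"{tile}_Nuclear.tif",
    }


def barcodes_path(root: str, well: str, batch: str, tile: str) -> Path:
    """Build the barcodes Parquet path for one tile of one batch of one well."""
    return Path(root) / "Wells" / well / batch / f"{tile}_barcodes.parquet"


# Hypothetical identifiers, for illustration only.
masks = segmentation_mask_paths("c2s_output", "A1", "T0001")
barcodes = barcodes_path("c2s_output", "A1", "B01", "T0001")
print(masks["cell"].as_posix())
print(masks["nuclear"].as_posix())
print(barcodes.as_posix())
```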
Visualization Files
In addition to these files, running Cells2Stats with the `--visualization` or `--visualization-only` flag generates a visualization folder and an additional log file. The following table lists the files and folders generated when visualization is run.
File | Directory and File Name | Description | Quantity |
---|---|---|---|
cyto.viz | {root}/visualization/cyto.viz | File for use by Element Biosciences CytoCanvas tool | One per execution, when visualization is executed |
Visualization.log | {root}/Log/Visualization.log | File recording logs for the visualization execution | One per execution, when visualization is executed |
cells | {root}/visualization/cells | Additional folder used by CytoCanvas for cell boundaries | One per execution, when visualization is executed |
targets | {root}/visualization/targets | Additional folder used by CytoCanvas to specify targets and locations | One per execution, when visualization is executed |
locations | {root}/visualization/locations | Additional folder used by CytoCanvas to specify targets and locations | One per execution, when visualization is executed |
Multiscale_flowcell.zarr | {root}/visualization/multiscale_flowcell.zarr | Additional folder containing imaging channels in PNG in an OME-Zarr format | One per execution, when visualization is executed |
Metrics
The output files contain a variety of metrics, including tile-specific and average metrics.
- The `RawCellStats.csv` and `RawCellStats.parquet` files contain a full set of morphology and quantification metrics for each target and batch.
- The `AverageNormWellStats.csv` file provides averages of these metrics for each well. Metrics that end with `.std` provide the standard deviation for the metric.
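The `.std` suffix convention makes it possible to pair each averaged metric with its standard deviation column. The following is a minimal sketch using Python's standard `csv` module. The sample header and values are hypothetical; real column names should be read from `AverageNormWellStats.csv` itself.

```python
import csv
import io

# Hypothetical excerpt: these column names and values are invented for
# illustration and are not taken from a real AverageNormWellStats.csv.
sample = io.StringIO(
    "Well,Area,Area.std,Eccentricity,Eccentricity.std\n"
    "A1,100.0,5.0,0.5,0.1\n"
)

reader = csv.DictReader(sample)
columns = reader.fieldnames or []

# Columns that follow the ".std" suffix convention.
std_columns = [name for name in columns if name.endswith(".std")]

# Map each averaged metric to its standard deviation column, keeping only
# pairs where the base metric column is also present.
suffix = ".std"
pairs = {
    name[: -len(suffix)]: name
    for name in std_columns
    if name[: -len(suffix)] in columns
}
print(pairs)
```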
The files report metrics from the following CellProfiler modules, unless the user runs with the `--skip-cellprofiler` flag:
- MeasureObjectSizeShape
- MeasureGranularity
- MeasureObjectIntensity
- MeasureObjectIntensityDistribution
- MeasureTexture
Certain metrics from these modules are not available in these files. For example, the output files do not report Zernike metrics. In some files, columns for Z-axis metrics appear with values of 0. Z-axis metrics are not available in the `RawCellStats.csv` and `RawCellStats.parquet` files because they are not relevant to the analysis output.
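When a CSV output does contain placeholder columns filled with 0, it can help to identify them before downstream analysis. The following is a minimal sketch using Python's standard `csv` module. The function name, sample columns, and values are hypothetical and only illustrate the check.

```python
import csv
import io


def all_zero_columns(handle) -> list:
    """Return the names of columns whose value is numeric zero in every row."""
    reader = csv.DictReader(handle)
    candidates = None
    for row in reader:
        zero_here = set()
        for name, value in row.items():
            try:
                if float(value) == 0.0:
                    zero_here.add(name)
            except (TypeError, ValueError):
                # Non-numeric or missing values never count as zero.
                pass
        # Keep only columns that were zero in every row seen so far.
        candidates = zero_here if candidates is None else candidates & zero_here
    return sorted(candidates or [])


# Hypothetical excerpt: column names and values are invented for illustration.
sample = io.StringIO(
    "Cell,Area,Center_Z\n"
    "1,120.5,0\n"
    "2,98.0,0\n"
)
zero_columns = all_zero_columns(sample)
print(zero_columns)
```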
For more information on the cytoprofiling metrics, see the measurement information in the CellProfiler Manuals.
Run Statistics
The `RunStats.json` file provides overall run statistics.
The following table describes the run statistics fields.
Field | Description | Data Type |
---|---|---|
AnalysisID | Unique ID for the analysis. | String |
AnalysisVersion | Version of the analysis software. | String |
FileVersion | Version of the file format. | String |
FlowCellID | ID of the flow cell used. | String |
RunID | Unique ID for the run. | String |
RunName | Name of the sequencing run. | String |
DemuxStats | Statistics related to demultiplexing. | Object |
NumPolonies | Number of polonies detected. | Integer |
PercentAssignedReads | Percentage of reads assigned to targets. | Float |
PercentMismatch | Percentage of reads with at least one base pair mismatch. | Float |
Batches | List of batch-specific statistics. | Array of Objects |
BatchName | Name of the batch. | String |
Targets | List of target-specific statistics. | Array of Objects |
TargetName | Name of the sequencing target. | String |
ExpectedSequence | Expected DNA sequence for the target. | String |
NumPolonies | Number of polonies assigned to this target. | Integer |
PercentMismatch | Percentage of reads assigned to this target with at least one base pair mismatch. | Float |
Wells | List of well-specific statistics. | Array of Objects |
WellLocation | Location of the well, e.g. A1-F2. | String |
NumPolonies | Number of polonies in this well. | Integer |
PercentAssignedReads | Percentage of reads assigned to targets in well. | Float |
PercentMismatch | Percentage of reads in this well with at least one base pair mismatch. | Float |
UnassignedSequences | List of unassigned sequence details. | Array of Objects |
Count | Count of how many times a particular unassigned sequence appears. | Integer |
Sequence | DNA sequence of unassigned reads. | String |
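Several field names, such as NumPolonies and PercentMismatch, appear at more than one level of the file. The following is a minimal sketch of collecting every occurrence of a field together with its location, using Python's standard `json` module. The helper name is hypothetical, and the sample content, including its nesting and values, is invented for illustration; the actual layout of `RunStats.json` may differ.

```python
import json


def find_field(node, name, path=""):
    """Yield (path, value) for every occurrence of `name` in nested JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == name:
                yield child, value
            yield from find_field(value, name, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_field(item, name, f"{path}[{index}]")


# Hypothetical content: the nesting and values below are assumptions made
# for illustration, not the documented structure of RunStats.json.
sample = json.loads("""
{
  "RunName": "example",
  "DemuxStats": {
    "NumPolonies": 1000,
    "PercentMismatch": 2.5,
    "Wells": [
      {"WellLocation": "A1", "NumPolonies": 600, "PercentMismatch": 2.0},
      {"WellLocation": "A2", "NumPolonies": 400, "PercentMismatch": 3.0}
    ]
  }
}
""")

hits = list(find_field(sample, "PercentMismatch"))
for location, value in hits:
    print(location, value)
```

Because the search is recursive, it does not depend on where a field sits in the hierarchy, which is useful when the exact nesting is not known in advance.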
Versions
The `Versions.json` file reports the version number for CSV output files and bundled software programs.
The following table describes the version fields.
Field | Description | Data Type |
---|---|---|
FileVersion | Overall version of the file. | String |
FileVersions | Versions of individual files in the dataset. | Object |
AverageNormWellStats.csv | Version of AverageNormWellStats.csv file. | String |
RawCellStats.csv | Version of RawCellStats.csv file. | String |
ProgramVersions | Versions of programs used in the analysis. | Object |
CellProfiler | Version of the CellProfiler program. | String |
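The version fields can be flattened into a single mapping, for example to record them alongside analysis results. The following is a minimal sketch using Python's standard `json` module. It assumes that the per-file and per-program entries are nested under FileVersions and ProgramVersions, as the Object data types in the table suggest. The helper name and all version strings are hypothetical.

```python
import json


def flatten_versions(versions: dict) -> dict:
    """Flatten the version groups into one {label: version} mapping."""
    flat = {"FileVersion": versions.get("FileVersion")}
    for group in ("FileVersions", "ProgramVersions"):
        for name, version in versions.get(group, {}).items():
            flat[f"{group}/{name}"] = version
    return flat


# Hypothetical content: the version strings are placeholders, and the nesting
# is an assumption based on the field table.
sample = json.loads("""
{
  "FileVersion": "1.0.0",
  "FileVersions": {
    "AverageNormWellStats.csv": "1.0.0",
    "RawCellStats.csv": "1.0.0"
  },
  "ProgramVersions": {"CellProfiler": "1.0.0"}
}
""")

flat = flatten_versions(sample)
for label, version in flat.items():
    print(label, version)
```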
Other Files
Several of the files are copied from the input directory, including the `RunManifest.json`, `RunParameters.json`, `Panel.json`, Cell Segmentation Masks, Nuclear Segmentation Masks, and Barcodes. For more information on these files, see Cytoprofiling Run Output Files.