Output Files
The following table lists the files and folders that Cells2Stats outputs. Parquet files are column-based files that efficiently store data. For more information, see the Apache Parquet Documentation.
File | Directory and File Name | Description | Quantity |
---|---|---|---|
Run manifest | {root}/RunManifest.json | JSON file that is reserved for Element processes | One per run |
Run parameters | {root}/RunParameters.json | JSON file that records information about the run configuration | One per run |
Run Stats | {root}/RunStats.json | JSON file that records overall statistics about the run | One per run |
Panel | {root}/Panel.json | JSON file that records information about the targets for the run | One per run |
Cell Segmentation Mask | {root}/CellSegmentation/{well}/{tile}_Cell.tif | Cell segmentation masks for a well, where the value for a pixel in a cell is the cell ID | One per tile per well |
Nuclear Segmentation Mask | {root}/CellSegmentation/{well}/{tile}_Nuclear.tif | Nuclear segmentation masks for a well, where the value for a pixel in a nucleus is 1 | One per tile per well |
Average Normalized Well Statistics | {root}/AverageNormWellStats.csv | Filtered and average metrics for each well in the run | One per run |
Versions, JSON | {root}/Versions.json | File that reports the version number for CSV output files and bundled software programs | One per run |
Raw Cell Statistics, CSV | {root}/RawCellStats.csv | CSV file that reports values per cell for all morphology features and raw target counts in a run | One per run |
Raw Cell Statistics, Parquet | {root}/RawCellStats.parquet | Parquet file that reports values per cell for all morphology features and raw target counts in a run | One per run |
Barcodes | {root}/Wells/{well}/{batch}/{tile}_barcodes.parquet | Parquet files that provide barcoding information for each polony in a tile | One per tile per batch per well |
Cells2Stats Log | {root}/Log/Cells2Stats.log | File recording logs for Cells2Stats execution | One per run |
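The path templates in the table can be expanded programmatically. The following is a minimal sketch, not part of Cells2Stats: the function name `expected_tile_outputs` and the root, well, batch, and tile values are hypothetical and only illustrate how the `{root}`, `{well}`, `{batch}`, and `{tile}` placeholders combine.

```python
from pathlib import Path


def expected_tile_outputs(root, well, batch, tile):
    """Build the per-tile output paths from the path templates in the table."""
    root = Path(root)
    return {
        "cell_mask": root / "CellSegmentation" / well / f"{tile}_Cell.tif",
        "nuclear_mask": root / "CellSegmentation" / well / f"{tile}_Nuclear.tif",
        "barcodes": root / "Wells" / well / batch / f"{tile}_barcodes.parquet",
    }


# Hypothetical identifiers, for illustration only.
paths = expected_tile_outputs("output", well="A1", batch="B01", tile="T0001")
for name, path in paths.items():
    print(name, path.as_posix())
```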
Visualization Files
The visualization folder and visualization files are generated when Cells2Stats is run with the `--visualization` or `--visualization-only` flag. An additional log file is also generated. The following table lists the additional folders and files that are generated when visualization is run:
File | Directory and File Name | Description | Quantity |
---|---|---|---|
cyto.viz | {root}/visualization/cyto.viz | File for use by Element Biosciences CytoCanvas tool | One per execution, when visualization is executed |
Visualization.log | {root}/Log/Visualization.log | File recording logs for the visualization execution | One per execution, when visualization is executed |
cells | {root}/visualization/cells | Additional folder that is used by CytoCanvas for cell boundaries | One per execution, when visualization is executed |
targets | {root}/visualization/targets | Additional folder that is used by CytoCanvas to specify targets and locations | One per execution, when visualization is executed |
locations | {root}/visualization/locations | Additional folder that is used by CytoCanvas to specify targets and locations | One per execution, when visualization is executed |
Multiscale_flowcell.zarr | {root}/visualization/multiscale_flowcell.zarr | Additional folder that contains imaging channels as PNG in OME-Zarr format | One per execution, when visualization is executed |
WellInformation.json | {root}/visualization/WellInformation.json | Additional file that contains information on well location, label, and color in JSON format | One per execution, when visualization is executed |
TileInformation.json | {root}/visualization/TileInformation.json | Additional file that contains information on tile location, label, and color in JSON format | One per execution, when visualization is executed |
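After a visualization run, it can be useful to confirm that every entry in the table exists. The following is a minimal sketch under that assumption: the relative paths are taken from the table, while the helper name `missing_visualization_outputs` and the `output` root are hypothetical.

```python
from pathlib import Path

# Relative paths copied from the visualization table above.
VISUALIZATION_OUTPUTS = [
    "visualization/cyto.viz",
    "visualization/cells",
    "visualization/targets",
    "visualization/locations",
    "visualization/multiscale_flowcell.zarr",
    "visualization/WellInformation.json",
    "visualization/TileInformation.json",
    "Log/Visualization.log",
]


def missing_visualization_outputs(root):
    """Return the listed files and folders that do not exist under root."""
    root = Path(root)
    return [rel for rel in VISUALIZATION_OUTPUTS if not (root / rel).exists()]


# "output" is a placeholder for the actual {root} directory.
print(missing_visualization_outputs("output"))
```

An empty list means that all listed files and folders are present. The check does not validate their contents.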
Metrics
The output files contain a variety of metrics such as tile-specific and average metrics.
- The `RawCellStats.csv` and `RawCellStats.parquet` files contain a full set of morphology and quantification metrics for each target and batch.
- The `AverageNormWellStats.csv` file provides the averages of these metrics for each well. Metrics that end with `.std` provide the standard deviation for the metric.
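The `.std` suffix convention makes it straightforward to separate standard-deviation columns from the other columns in `AverageNormWellStats.csv`. The following is a minimal sketch that uses only the Python standard library. The column names (`Well`, `Area`, `Area.std`) and values in the sample are hypothetical and do not reflect the actual file contents.

```python
import csv
import io


def split_std_columns(fieldnames):
    """Separate columns ending in '.std' from all other columns."""
    std = [c for c in fieldnames if c.endswith(".std")]
    other = [c for c in fieldnames if not c.endswith(".std")]
    return other, std


# Hypothetical stand-in for AverageNormWellStats.csv.
sample = io.StringIO(
    "Well,Area,Area.std\n"
    "A1,120.5,14.2\n"
    "A2,98.0,11.7\n"
)

reader = csv.DictReader(sample)
value_columns, std_columns = split_std_columns(reader.fieldnames)
rows = list(reader)

print("value columns:", value_columns)
print("std columns:", std_columns)
```

To process a real file, replace the `io.StringIO` sample with `open(path, newline="")`.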
Metrics files report metrics from the following CellProfiler modules, unless Cells2Stats is run with the `--skip-cellprofiler` flag:
- MeasureObjectSizeShape
- MeasureGranularity
- MeasureObjectIntensity
- MeasureObjectIntensityDistribution
- MeasureTexture
Certain metrics from these modules are not available in these files. For example, the output files do not report Zernike metrics. In some files, columns for Z-axis metrics appear with values of `0`. Z-axis metrics are not available in the `RawCellStats.csv` and `RawCellStats.parquet` files because they are not relevant to the analysis output.
For more information on the cytoprofiling metrics, see the measurement information in the CellProfiler Manuals.
Run Statistics
The `RunStats.json` file provides overall run statistics. The following table describes the run statistics fields:
Field | Description | Data Type |
---|---|---|
AnalysisID | Unique ID for the analysis | String |
AnalysisVersion | Version of the analysis software | String |
FileVersion | Version of the file format | String |
FlowCellID | ID of the flow cell that is used | String |
RunID | Unique ID for the run | String |
RunName | Name of the sequencing run | String |
DemuxStats | Statistics related to demultiplexing | Object |
NucleatedRate | The fraction of cells that contain a segmented nucleus | Float |
NumPolonies | Number of polonies that are detected | Integer |
PercentAssignedReads | Percentage of reads that are assigned to targets | Float |
PercentMismatch | Percentage of reads with at least one base pair mismatch | Float |
Batches | List of batch-specific statistics | Array of Objects |
BatchName | Name of the batch | String |
Targets | List of target-specific statistics | Array of Objects |
TargetName | Name of the sequencing target | String |
ExpectedSequence | Expected DNA sequence for the target | String |
NumPolonies | Number of polonies that are assigned to this target | Integer |
PercentMismatch | Percentage of reads that are assigned to this target with at least one base pair mismatch | Float |
Wells | List of well-specific statistics | Array of Objects |
WellLocation | Location of the well (for example, A1-F2) | String |
NumPolonies | Number of polonies that are in the well | Integer |
PercentAssignedReads | Percentage of reads that are assigned to targets in the well | Float |
PercentMismatch | Percentage of reads in the well with at least one base pair mismatch | Float |
UnassignedSequences | List of unassigned sequence details | Array of Objects |
Count | Count of how many times a particular unassigned sequence appears | Integer |
Sequence | DNA sequence of unassigned reads | String |
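Several field names, such as `NumPolonies` and `PercentMismatch`, appear at the run, target, and well levels. The table does not spell out the exact nesting, so the following sketch searches the whole JSON tree for a field name instead of assuming a fixed layout. The helper `find_field` and the sample values and structure are hypothetical.

```python
import json


def find_field(node, name, path=""):
    """Yield (path, value) for every occurrence of `name` in nested JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == name:
                yield child, value
            yield from find_field(value, name, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_field(item, name, f"{path}[{index}]")


# Hypothetical excerpt; the real RunStats.json layout may differ.
sample = json.loads("""
{
  "RunName": "example-run",
  "NumPolonies": 1000,
  "Wells": [
    {"WellLocation": "A1", "NumPolonies": 600},
    {"WellLocation": "B1", "NumPolonies": 400}
  ]
}
""")

hits = list(find_field(sample, "NumPolonies"))
for location, value in hits:
    print(location, value)
```

For a real run, load the file with `json.load(open(path))` and pass the result to `find_field`.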
Versions
The `Versions.json` file reports the version number for CSV output files and bundled software programs. The following table describes the version fields:
Field | Description | Data Type |
---|---|---|
FileVersion | Overall version of the file | String |
FileVersions | Versions of individual files in the dataset | Object |
AverageNormWellStats.csv | Version of AverageNormWellStats.csv file | String |
RawCellStats.csv | Version of RawCellStats.csv file | String |
ProgramVersions | Versions of programs that are used in the analysis | Object |
CellProfiler | Version of the CellProfiler program | String |
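A downstream script can read `Versions.json` to confirm which file format version it is parsing. The following is a minimal sketch. It assumes that the per-file entries are nested under `FileVersions` and that `CellProfiler` is nested under `ProgramVersions`, as the table ordering suggests. All version strings are made-up placeholders, and `file_version` is a hypothetical helper.

```python
import json

# Hypothetical content; the version numbers are placeholders.
sample_text = """
{
  "FileVersion": "1.0.0",
  "FileVersions": {
    "AverageNormWellStats.csv": "1.0.0",
    "RawCellStats.csv": "1.0.0"
  },
  "ProgramVersions": {"CellProfiler": "0.0.0"}
}
"""


def file_version(versions, filename):
    """Return the recorded version for filename, or None if it is absent."""
    return versions.get("FileVersions", {}).get(filename)


versions = json.loads(sample_text)
print(file_version(versions, "RawCellStats.csv"))
print(file_version(versions, "NotListed.csv"))
```

Returning `None` for an unlisted file lets the caller decide whether a missing entry is an error.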
Other Files
Some files are copied from the input directory, such as `RunManifest.json`, `RunParameters.json`, `Panel.json`, Cell Segmentation Masks, Nuclear Segmentation Masks, and Barcodes. For more information on these files, see Cytoprofiling Run Output Files.