cytonormpy.CytoNorm

class cytonormpy.CytoNorm

Cytometry data are divided into batches. Each batch contains one or more reference files, i.e. files from reference samples that were measured in every batch.

First, the data are set up, either from FCS files or from data stored in an anndata object. The data can then be transformed using common cytometry transformations.
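As an illustration of one such transformation, the following is a minimal sketch of an asinh transform with a cofactor, written with the standard library only. The helper names asinh_transform and asinh_inverse are hypothetical and are not part of cytonormpy; the package's own AsinhTransformer may differ in detail.

```python
import math


def asinh_transform(values, cofactor=5.0):
    """Return asinh(x / cofactor) for every raw intensity in `values`."""
    return [math.asinh(x / cofactor) for x in values]


def asinh_inverse(values, cofactor=5.0):
    """Undo asinh_transform by computing sinh(y) * cofactor."""
    return [math.sinh(y) * cofactor for y in values]


# Hypothetical raw intensities for a single marker.
raw = [0.0, 5.0, 500.0]
transformed = asinh_transform(raw)
restored = asinh_inverse(transformed)
```

The inverse is included because normalized values usually have to be brought back to the original scale before they are written out.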

The reference data are then clustered using the FlowSOM algorithm. The SOM is stored in order to predict the corresponding clusters for the other samples.
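The idea of predicting clusters from a stored SOM can be sketched as a nearest-node lookup: each cell is assigned to the codebook node closest to it. This is a simplified sketch under stated assumptions, not the FlowSOM implementation; the codebook values and the helper nearest_node are hypothetical, and the metaclustering step that FlowSOM performs on the nodes is omitted.

```python
def nearest_node(cell, nodes):
    """Return the index of the node closest to `cell` (squared Euclidean distance)."""
    distances = [
        sum((c - n) ** 2 for c, n in zip(cell, node))
        for node in nodes
    ]
    return distances.index(min(distances))


# Hypothetical two-marker codebook, as if learned from the reference files.
codebook = [(0.0, 0.0), (4.0, 0.5), (0.5, 4.0)]

# Cells from another sample are labelled with the same codebook,
# so cluster identities stay comparable across files.
cells = [(0.2, -0.1), (3.7, 0.9), (1.0, 3.5)]
labels = [nearest_node(cell, codebook) for cell in cells]
```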

For each cluster and batch, the expression values of the reference files at user-defined quantiles are calculated. The goal distribution is calculated by either calculating the mean expression for each cluster over all batches or by choosing one batch as the goal.
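The two ways of defining the goal distribution can be sketched for a single marker in a single cluster. The sketch assumes linearly interpolated quantiles and assumes that the mean is taken per quantile over the batches; the data, the batch names and the helpers quantile and batch_quantiles are hypothetical and do not reflect cytonormpy internals.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def batch_quantiles(values, probs):
    """Quantiles of the reference expression values of one batch."""
    ordered = sorted(values)
    return [quantile(ordered, q) for q in probs]


probs = [0.25, 0.5, 0.75]

# Hypothetical reference expression values, one list per batch.
reference = {
    "batch_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "batch_2": [3.0, 4.0, 5.0, 6.0, 7.0],
}
per_batch = {name: batch_quantiles(vals, probs) for name, vals in reference.items()}

# Option 1: the goal is the mean of each quantile over all batches.
goal_mean = [sum(qs) / len(qs) for qs in zip(*per_batch.values())]

# Option 2: one batch is chosen as the goal.
goal_batch = per_batch["batch_1"]
```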

Next, interpolating functions (spline functions) that map the batch quantiles to the goal distribution are calculated for each batch and cluster. All samples are then normalized by evaluating the respective spline function on their expression values.
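The mapping step can be sketched with a piecewise-linear function through the points (batch quantile, goal quantile). Note that this swaps in linear interpolation for the spline functions described above in order to stay within the standard library; the extrapolation beyond the outermost quantiles is likewise an assumption, and make_mapping and the quantile values are hypothetical.

```python
from bisect import bisect_right


def make_mapping(batch_q, goal_q):
    """Return f(x) that interpolates linearly through (batch_q[i], goal_q[i]).

    `batch_q` must be strictly increasing. Values outside its range are
    extrapolated along the first or last segment (an assumption of this sketch).
    """
    def mapping(x):
        # Index of the right-hand knot of the segment containing x, clamped
        # so that out-of-range inputs reuse the outermost segment.
        i = min(max(bisect_right(batch_q, x), 1), len(batch_q) - 1)
        x0, x1 = batch_q[i - 1], batch_q[i]
        y0, y1 = goal_q[i - 1], goal_q[i]
        return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

    return mapping


# Hypothetical quantiles of one batch/cluster and the matching goal quantiles.
batch_q = [1.0, 2.0, 4.0]
goal_q = [1.5, 3.0, 4.0]
to_goal = make_mapping(batch_q, goal_q)

# Every expression value of a sample in this batch and cluster is passed
# through the mapping to obtain its normalized value.
normalized = [to_goal(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

By construction, each batch quantile is mapped exactly onto its goal quantile, and values in between are moved proportionally.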

Example

>>> import cytonormpy as cnp
>>>
>>> cn = cnp.CytoNorm()
>>>
>>> transformer = cnp.AsinhTransformer(cofactors = 5)
>>> cn.add_transformer(transformer)
>>>
>>> clusterer = cnp.FlowSOM(**flowsom_kwargs)
>>> cn.add_clusterer(clusterer)
>>>
>>> cn.run_fcs_data_setup("metadata.csv")
>>>
>>> # equivalently for the use of anndata:
>>> cn.run_anndata_setup(adata)
>>>
>>> cn.run_clustering(n_cells = 6000,
...                   test_cluster_cv = True)
>>>
>>> cn.calculate_quantiles()
>>> cn.calculate_splines()
>>>
>>> cn.normalize_data()

Methods

add_clusterer(clusterer)

Adds a clusterer instance (e.g. FlowSOM) that is used to cluster the data.

add_transformer(transformer)

Adds a transformer to transform the data to the log, logicle, hyperlog or asinh space.

calculate_cluster_cvs(n_metaclusters[, ...])

Compute per-cluster coefficient of variation (CV) across samples for multiple meta-cluster counts.

calculate_emd([cell_labels, files])

Calculates the EMD on the normalized and unnormalized samples.

calculate_mad([groupby, cell_labels, files])

Calculates the MAD on the normalized and unnormalized samples.

calculate_quantiles([n_quantiles, ...])

Calculates quantiles per batch, cluster and sample.

calculate_splines([limits, goal])

Calculates the spline functions of the expression values and the goal expression.

normalize_data([adata, file_names, batches, ...])

Applies the normalization procedure to the files and writes the data to disk or to the anndata file.

run_anndata_setup(adata[, layer, ...])

Method to set up the data handling for anndata objects.

run_clustering([n_cells, test_cluster_cv, ...])

Runs the clustering step.

run_fcs_data_setup(metadata, input_directory)

Method to set up the data handling for FCS data.

save_model([filename])

Function to save the current CytoNorm instance to disk.
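For readers unfamiliar with the two evaluation metrics listed above, calculate_mad and calculate_emd, the quantities themselves can be sketched in a few lines. This is only an illustration of the median absolute deviation and of the one-dimensional earth mover's distance between equally sized samples; how cytonormpy bins, groups or scales the data is not covered, and the helper names and values are hypothetical.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


def emd_1d(a, b):
    """Earth mover's distance between two equally sized 1-D samples.

    For equal sample sizes this equals the mean absolute difference
    between the sorted values.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles samples of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Hypothetical expression values of one marker.
spread = mad([1.0, 2.0, 3.0, 4.0, 10.0])

# Two batches that differ by a constant shift of 1.0.
shift = emd_1d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

A smaller EMD between batches after normalization than before would indicate that the batch distributions have moved closer together.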