GSForge.operations.normalizations module
Normalization functions inherit from the OperationInterface class.
This means that they can all be called upon an AnnotatedGEM or a GeneSetCollection.
These classes have static methods that implement the transform on a numpy or xarray source.
- class GSForge.operations.normalizations.ReadsPerKilobaseMillion(*args, **params)
Bases: GSForge.models._Interface.Interface, param.parameterized.ParameterizedFunction
RPKM or FPKM – Reads or Fragments Per Kilobase Million.
These methods attempt to compensate for sequencing depth and gene length. The utility of this method is disputed in the literature [cite me].
Parameters inherited from:
GSForge.models._Interface.Interface: gem, gene_set_collection, selected_gene_sets, selected_genes, gene_set_mode, sample_subset, count_variable, annotation_variables, count_mask, annotation_mask, count_transform
length_variable = param.String(readonly=False)
- length_variable = 'lengths'
- static xr_reads_per_kilobase_million(counts, lengths, sample_dim='Sample')
- static np_reads_per_kilobase_million(counts, lengths)
- name = 'ReadsPerKilobaseMillion'
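As an illustration of the depth and length compensation described above, the following is a minimal NumPy sketch of the textbook RPKM formula. It is not taken from the GSForge source: the function name `rpkm_sketch`, the (samples by genes) layout, and the assumption that lengths are given in base pairs are all choices made for this example.

```python
import numpy as np


def rpkm_sketch(counts, lengths):
    """Hypothetical helper: textbook RPKM on a (samples x genes) count matrix.

    Assumes `lengths` holds one gene length per column, in base pairs.
    """
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    # Sequencing depth: total reads per sample, expressed in millions.
    per_million = counts.sum(axis=1, keepdims=True) / 1e6
    # Gene length expressed in kilobases.
    kilobases = lengths / 1e3
    return counts / per_million / kilobases


# Two samples, three genes (illustrative values only).
counts = np.array([[10, 20, 70],
                   [5, 5, 90]])
lengths = np.array([1000, 2000, 500])
rpkm = rpkm_sketch(counts, lengths)
```

In this toy input the first two genes of the first sample end up with the same value, because the second gene has twice the reads but is also twice as long.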
- class GSForge.operations.normalizations.UpperQuartile(*args, **params)
Bases: GSForge.models._Interface.Interface, param.parameterized.ParameterizedFunction
Under this normalization method, genes with zero read counts in all samples are removed first. Each sample's normalization factor is computed from the upper quartile of its nonzero counts; the remaining gene counts are divided by that upper quartile and multiplied by the mean upper quartile across all samples of the dataset. [method_compare]
Original R code:
    uq <- function(X){
      # excluding zero counts in each sample
      UQ <- function(y){ quantile(y, 0.75) }
      X <- X + 0.1
      upperQ <- apply(X, 2, UQ)
      f.uq <- upperQ / mean(upperQ)
      upq.res <- scale(X, center=FALSE, scale=f.uq)
      return(upq.res)
    }
Parameters inherited from:
GSForge.models._Interface.Interface: gem, gene_set_collection, selected_gene_sets, selected_genes, gene_set_mode, sample_subset, count_variable, annotation_variables, count_mask, annotation_mask, count_transform
- static np_upper_quartile(counts)
Perform the upper quartile normalization.
- Parameters
counts – A numpy array containing the raw count values. The shape is assumed to be (samples by genes). Zero counts are expected to be present as zeros.
- Returns
The upper quartile normalized count matrix.
- static xr_upper_quartile(counts)
Perform the upper quartile normalization.
- Parameters
counts – An xarray.DataArray containing the raw count values. The shape is assumed to be (samples by genes). Zero counts are expected to be present as zeros.
- Returns
The upper quartile normalized count matrix.
- name = 'UpperQuartile'
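For readers more comfortable with NumPy than R, the sketch below transliterates the R snippet quoted above onto a (samples by genes) array. It is an assumption-laden illustration, not the GSForge implementation: the name `upper_quartile_sketch` and the `pseudo_count` argument are hypothetical, and, like the R code, it adds 0.1 to every count rather than filtering zeros.

```python
import numpy as np


def upper_quartile_sketch(counts, pseudo_count=0.1):
    """Hypothetical helper mirroring the quoted R code on (samples x genes) data."""
    # Same shift as `X <- X + 0.1` in the R snippet.
    shifted = np.asarray(counts, dtype=float) + pseudo_count
    # Per-sample 75th percentile; samples are rows here, columns in the R code.
    upper_q = np.quantile(shifted, 0.75, axis=1)
    # Scale factors relative to the mean upper quartile across samples.
    factors = upper_q / upper_q.mean()
    # Divide each sample (row) by its factor, as `scale(..., scale=f.uq)` does.
    return shifted / factors[:, np.newaxis]


# Two samples, five genes; the second sample is sequenced twice as deeply.
counts = np.array([[0, 10, 20, 30, 40],
                   [0, 20, 40, 60, 80]])
normalized = upper_quartile_sketch(counts)
```

After this scaling every sample has the same upper quartile, equal to the mean of the per-sample upper quartiles of the shifted data.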