GSForge.operations Package¶

`operations` Package¶

Inheritance diagram of GSForge.operations

GSForge operations can be broken down into three categories:

Analytics: For discrete operations, e.g. chi-squared tests, differential gene expression, etc.
Normalizations: For those operations that are meant to create an entire transform of the GEM.
Prospectors: For non-deterministic operations, used in ranking and comparing gene selections.

class GSForge.operations.get_data(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

Gets the GEM matrix and an optional annotation column.

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. This means that any selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.
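The three gene_set_mode options described above can be illustrated with plain Python sets. This is a minimal sketch of the set logic only, not GSForge's implementation; the gene names, the `gene_sets` dictionary and the `select_genes` helper are hypothetical.

```python
# Hypothetical data: the full gene index of a GEM and two named gene sets.
all_genes = {"g1", "g2", "g3", "g4", "g5"}
gene_sets = {
    "set_a": {"g1", "g2", "g3"},
    "set_b": {"g2", "g3", "g4"},
}


def select_genes(mode, selected_keys):
    """Return the sorted genes picked by a gene_set_mode-like option."""
    supports = [gene_sets[key] for key in selected_keys]
    if mode == "complete" or not supports:
        # Assumption: with no gene sets selected, fall back to every gene.
        return sorted(all_genes)
    if mode == "union":
        return sorted(set.union(*supports))
    if mode == "intersection":
        return sorted(set.intersection(*supports))
    raise ValueError(f"Unknown mode: {mode!r}")


union_genes = select_genes("union", ["set_a", "set_b"])
intersection_genes = select_genes("intersection", ["set_a", "set_b"])
complete_genes = select_genes("complete", ["set_a", "set_b"])
```

Here 'union' keeps any gene found in at least one selected set, 'intersection' keeps only genes shared by all of them, and 'complete' ignores the selection entirely.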

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.get_data'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.get_data'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.
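The data returned here is shaped by the count_mask parameter listed earlier. The sketch below imitates the three documented mask options on a small nested list; it is an illustration under stated assumptions, not the library's code. The matrix, the gene names and the `apply_count_mask` helper are hypothetical.

```python
import math

# Hypothetical count matrix: rows are samples, columns are genes.
genes = ["g1", "g2", "g3"]
counts = [
    [5.0, 0.0, 2.0],
    [3.0, 1.0, math.nan],
    [4.0, 7.0, 6.0],
]


def is_empty(value):
    """True for zero or missing (NaN) counts."""
    return value == 0 or math.isnan(value)


def apply_count_mask(matrix, gene_names, mask="complete"):
    """Return (gene_names, matrix) after applying a count_mask-like option."""
    if mask == "complete":
        # The entire matrix, unchanged.
        return list(gene_names), [row[:] for row in matrix]
    if mask == "masked":
        # Same shape, with zero or missing values replaced by NaN.
        masked = [[math.nan if is_empty(v) else v for v in row] for row in matrix]
        return list(gene_names), masked
    if mask == "dropped":
        # Keep only genes with no zero or missing value in any sample.
        keep = [
            j for j in range(len(gene_names))
            if not any(is_empty(row[j]) for row in matrix)
        ]
        return [gene_names[j] for j in keep], [[row[j] for j in keep] for row in matrix]
    raise ValueError(f"Unknown mask: {mask!r}")


masked_genes, masked_counts = apply_count_mask(counts, genes, "masked")
dropped_genes, dropped_counts = apply_count_mask(counts, genes, "dropped")
```

In this toy matrix only g1 survives the 'dropped' mask, because g2 contains a zero and g3 contains a missing value.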

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

`analytics` Module¶

Inheritance diagram of GSForge.operations.analytics

Analytics are intended to more closely rank or compare a GEM subset, rather than the entire GEM. These functions are intended for analyzing and comparing subsets generated by the functions found in prospectors.

Methods and notation are taken from [method_compare].

\(LS\): Learning Sample, \(n\) instances of input-output values.
\(n\): Number of input-output value pairs in \(LS\).
\(m\): Number of input variables (features or genes) in \(LS\).
\(X_i\): Input array of \(LS\), for \(i = 1, \ldots, m\).
\(LS\): An algorithm that outputs some relevance score, \(s_i\), for each input variable \(X_i\).

method_compare: A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data

class GSForge.operations.analytics.rank_genes_by_model(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

Given some machine learning model, this operation runs the model for n_iterations iterations and returns a summary dataset of the ranking results.
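As a rough illustration of repeated ranking, the sketch below fits a model several times and summarizes the per-feature importance scores. It assumes only that the model exposes fit() and feature_importances_; `ToyModel`, `rank_features` and the summary fields are hypothetical and do not reflect the dataset GSForge actually returns.

```python
import random
import statistics


class ToyModel:
    """Hypothetical stand-in for an estimator with fit() and feature_importances_."""

    def __init__(self, rng):
        self.rng = rng
        self.feature_importances_ = None

    def fit(self, x, y):
        # Bootstrap the samples so that repeated fits give different scores.
        idx = [self.rng.randrange(len(y)) for _ in y]
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        y_mean = statistics.fmean(yb)
        scores = []
        for j in range(len(x[0])):
            col = [row[j] for row in xb]
            col_mean = statistics.fmean(col)
            cov = sum((c - col_mean) * (t - y_mean) for c, t in zip(col, yb))
            scores.append(abs(cov))
        total = sum(scores) or 1.0  # avoid dividing by zero on a degenerate resample
        self.feature_importances_ = [s / total for s in scores]
        return self


def rank_features(model, x, y, n_iterations=1):
    """Fit the model n_iterations times and summarize importances per feature."""
    runs = [list(model.fit(x, y).feature_importances_) for _ in range(n_iterations)]
    per_feature = list(zip(*runs))
    return {
        "mean": [statistics.fmean(values) for values in per_feature],
        "std": [statistics.pstdev(values) for values in per_feature],
    }


# Hypothetical data: six samples, three genes, one numeric annotation.
x = [[1.0, 5.0, 2.0], [2.0, 3.0, 2.5], [3.0, 6.0, 1.5],
     [4.0, 2.0, 2.0], [5.0, 4.0, 3.0], [6.0, 5.0, 2.5]]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]

summary = rank_features(ToyModel(random.Random(0)), x, y, n_iterations=20)
```

Repeating the fit matters for non-deterministic models, where a single run can give an unrepresentative ranking.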

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. This means that any selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.

model = param.Parameter()

n_iterations = param.Integer(default=1, inclusive_bounds=(True, True), time_dependent=False, time_fn=Time(label=’Time’, name=’Time00001’, time_type=<class ‘int’>, timestep=1.0, unit=None, until=Infinity()))

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.rank_genes_by_model'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

class GSForge.operations.analytics.nFDR(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

nFDR (False Discovery Rate) [method_compare].

nFDR trains two models and compares their feature_importances_ attributes to estimate the false discovery rate.

The estimated FDR is the percentage of instances in which a shuffled output feature has a higher feature importance score than the same feature when not shuffled.

This is repeated up to n_iterations.
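One possible reading of the procedure above is sketched below: importance scores obtained with the real output are compared against scores obtained with a shuffled output, and the fraction of iterations in which the shuffled score is higher is reported per feature. This is a sketch under stated assumptions, not the library's implementation; the `importances` function is a toy stand-in for model.fit() followed by model.feature_importances_, and `naive_fdr` is a hypothetical name.

```python
import random
import statistics


def importances(x, y):
    """Toy stand-in for a fitted model's feature_importances_ (absolute covariance)."""
    y_mean = statistics.fmean(y)
    scores = []
    for j in range(len(x[0])):
        col = [row[j] for row in x]
        col_mean = statistics.fmean(col)
        scores.append(abs(sum((c - col_mean) * (t - y_mean) for c, t in zip(col, y))))
    return scores


def naive_fdr(x, y, n_iterations=200, seed=0):
    """Fraction of iterations where the shuffled-output score beats the real score."""
    rng = random.Random(seed)
    exceed = [0] * len(x[0])
    for _ in range(n_iterations):
        real = importances(x, y)           # "model" trained on the real output
        shuffled_y = y[:]
        rng.shuffle(shuffled_y)
        null = importances(x, shuffled_y)  # "model" trained on the shuffled output
        for j, (null_score, real_score) in enumerate(zip(null, real)):
            if null_score > real_score:
                exceed[j] += 1
    return [count / n_iterations for count in exceed]


# Hypothetical data: gene 0 tracks the output exactly, gene 1 is unrelated noise.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [[value, noise] for value, noise in zip(y, [2.0, 9.0, 4.0, 1.0, 7.0, 3.0])]

fdr_estimates = naive_fdr(x, y)
```

A feature that genuinely tracks the output is rarely outscored once the output is shuffled, so its estimate stays near zero, while an unrelated feature is outscored often.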

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. This means that any selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.

model = param.Parameter()

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.analytics.nFDR'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.nFDR'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

class GSForge.operations.analytics.mProbes(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

mProbes [method_compare] works by randomly permuting the feature values in the supplied data. For example, count values are shuffled within each sample's feature (gene) array.

It then ranks the real and shadowed features (for n_iterations) with the supplied model via a call to model.fit(). It then examines model.feature_importances_ for the feature importance values, and then calculates the null rank distribution.

This is repeated up to n_iterations.
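The shadow-feature idea described above can be sketched in a few lines. The example permutes each feature to build shadow copies, scores real and shadow features together, and records how often a real feature is outscored by the best shadow. Everything here is illustrative: `importances` is a toy stand-in for model.fit() and model.feature_importances_, `mprobes_sketch` is a hypothetical name, and the "outscored by the best shadow" summary is an assumption about how the null distribution might be used, not a description of GSForge's output.

```python
import random
import statistics


def importances(x, y):
    """Toy stand-in for a fitted model's feature_importances_ (absolute covariance)."""
    y_mean = statistics.fmean(y)
    scores = []
    for j in range(len(x[0])):
        col = [row[j] for row in x]
        col_mean = statistics.fmean(col)
        scores.append(abs(sum((c - col_mean) * (t - y_mean) for c, t in zip(col, y))))
    return scores


def mprobes_sketch(x, y, n_iterations=100, seed=0):
    """Fraction of iterations in which each real feature scores below the best shadow."""
    rng = random.Random(seed)
    n_features = len(x[0])
    columns = [[row[j] for row in x] for j in range(n_features)]
    beaten = [0] * n_features
    for _ in range(n_iterations):
        # Build one shadow per real feature by permuting its values across samples.
        shadows = []
        for col in columns:
            shadow = col[:]
            rng.shuffle(shadow)
            shadows.append(shadow)
        # Real features first, shadow features after, back in sample-major layout.
        combined = [list(values) for values in zip(*(columns + shadows))]
        scores = importances(combined, y)
        best_shadow = max(scores[n_features:])
        for j in range(n_features):
            if scores[j] < best_shadow:
                beaten[j] += 1
    return [count / n_iterations for count in beaten]


# Hypothetical data: gene 0 tracks the output exactly, gene 1 barely varies.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [[value, flat] for value, flat in zip(y, [0.5, 1.0, 0.5, 1.0, 0.5, 1.0])]

beaten_fraction = mprobes_sketch(x, y)
```

Because shadow features keep each gene's value distribution but break its link to the output, they give a data-driven baseline for what an uninformative feature scores.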

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. This means that any selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.

model = param.Parameter()

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.analytics.mProbes'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.analytics.mProbes'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

`normalizations` Module¶

Inheritance diagram of GSForge.operations.normalizations

Normalization functions inherit from the OperationInterface class. This means that they can all be called upon an AnnotatedGEM or a GeneSetCollection.

These functions (implemented as classes) have static methods that implement the transform on a numpy or xarray source.

class GSForge.operations.normalizations.ReadsPerKilobaseMillion(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

RPKM or FPKM – Reads or Fragments per Kilobase Million.

These methods attempt to compensate for sequencing depth and gene length. The utility of this method is disputed in the literature [cite me].
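The standard RPKM calculation divides each count by the gene length in kilobases and by the sample's total reads in millions. The sketch below applies that formula to a nested list; it is a minimal illustration, not this class's implementation, and the `rpkm` function, the count values and the gene lengths are hypothetical.

```python
import math


def rpkm(counts, lengths_bp):
    """Compute RPKM for a matrix of raw counts.

    counts: rows are samples, columns are genes.
    lengths_bp: gene lengths in base pairs, one per column.
    Assumes every sample has at least one read.
    """
    result = []
    for row in counts:
        total_reads = sum(row)
        result.append([
            # count / (length in kb) / (total reads in millions)
            count * 1e9 / (length * total_reads)
            for count, length in zip(row, lengths_bp)
        ])
    return result


# Hypothetical sample with one million mapped reads over three genes.
counts = [[500, 1500, 998000]]
lengths = [1000, 3000, 2000]

rpkm_values = rpkm(counts, lengths)
```

The first two genes end up with the same RPKM even though their raw counts differ threefold, because the second gene is three times as long.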

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. This means that any selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.

length_variable = param.String(default=’lengths’)

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.ReadsPerKilobaseMillion'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

class GSForge.operations.normalizations.UpperQuartile(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

Under this normalization method, genes with zero read counts in all samples are removed first. Each sample's remaining counts are then divided by that sample's upper quartile of nonzero counts (its normalization factor) and multiplied by the mean upper quartile across all samples of the dataset. [method_compare]

Original R code.

uq<-function(X){

  #excluding zero counts in each sample
  UQ<-function(y){
    quantile(y, 0.75)
  }
  X<-X+0.1
  upperQ<-apply(X,2,UQ)
  f.uq<-upperQ/mean(upperQ)
  upq.res<-scale(X,center=FALSE,scale=f.uq)
  return(upq.res)
}

method_compare: A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data
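The R listing above can be mirrored in NumPy roughly as follows. This is a sketch under stated assumptions, not the body of np_upper_quartile: it follows the R code literally (adding a 0.1 pseudocount rather than dropping zeros), assumes the (samples by genes) layout described for this class, and uses the hypothetical name `upper_quartile_sketch`.

```python
import numpy as np


def upper_quartile_sketch(counts, pseudocount=0.1):
    """Upper quartile scaling for a (samples x genes) count matrix.

    Hypothetical helper mirroring the R listing; not GSForge's own code.
    """
    shifted = np.asarray(counts, dtype=float) + pseudocount
    # Per-sample 75th percentile (NumPy's default interpolation matches
    # the default of R's quantile()).
    upper_q = np.quantile(shifted, 0.75, axis=1)
    # Scaling factor: each sample's upper quartile relative to the mean.
    factors = upper_q / upper_q.mean()
    return shifted / factors[:, np.newaxis]


raw = np.array([[0, 10, 20, 30, 40], [0, 20, 40, 60, 80]])
normalized = upper_quartile_sketch(raw)
```

After scaling, every sample shares the same upper quartile, namely the mean of the original per-sample upper quartiles.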

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. That means that selected lineaments (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.
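The three count_mask options listed above can be pictured with a small NumPy sketch. The helper name `apply_count_mask` is hypothetical, and the sketch assumes that ‘dropped’ removes any gene (column) with a zero or missing value in at least one sample; the interface itself may apply a different rule.

```python
import numpy as np


def apply_count_mask(counts, mode="complete"):
    """Illustrate 'complete', 'masked' and 'dropped' on a (samples x genes) array.

    Hypothetical helper for illustration only, not part of GSForge.
    """
    counts = np.asarray(counts, dtype=float)
    if mode == "complete":
        return counts
    # Entries that are zero or missing.
    missing = (counts == 0) | np.isnan(counts)
    if mode == "masked":
        return np.where(missing, np.nan, counts)
    if mode == "dropped":
        # Assumption: drop a gene if any sample has a zero or missing value.
        return counts[:, ~missing.any(axis=0)]
    raise ValueError(f"Unknown mode: {mode!r}")


example = np.array([[1.0, 0.0, 3.0], [4.0, 5.0, np.nan]])
masked = apply_count_mask(example, "masked")
dropped = apply_count_mask(example, "dropped")
```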

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

static np_upper_quartile(counts)[source]¶

Perform the upper quartile normalization.

Parameters: counts – A numpy array containing the raw count values. The shape is assumed to be (samples by genes). Zero counts are expected to be present as zeros.
Returns: The upper quartile normalized count matrix.

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶

Perform the upper quartile normalization.

Returns: The upper quartile normalized count matrix.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.normalizations.UpperQuartile'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

static xr_upper_quartile(counts)[source]¶

Perform the upper quartile normalization.

Parameters: counts – An xarray.DataArray containing the raw count values. The shape is assumed to be (samples by genes). Zero counts are expected to be present as zeros.
Returns: The upper quartile normalized count matrix.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

`prospectors` Module¶

Inheritance diagram of GSForge.operations.prospectors

Prospector operations return either boolean support arrays or arrays of selected genes. Prospector operations differ from analytics in that they are not required to return a ‘result’ for every gene, or to return the same result on each call.
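The two return forms mentioned above, a boolean support array and an array of selected genes, carry the same information. A minimal NumPy sketch of converting between them follows; the gene names and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical gene index and a boolean support array of the same length.
genes = np.array(["g1", "g2", "g3", "g4"])
support = np.array([True, False, True, False])

# Boolean support -> array of selected genes.
selected = genes[support]

# Array of selected genes -> boolean support over the full gene index.
recovered_support = np.isin(genes, selected)
```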

class GSForge.operations.prospectors.create_random_lineament(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

Creates a random lineament of size k.

Picks from the gene index defined by the Interface options.
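Conceptually, this amounts to drawing k genes at random from the selected gene index. The following standard-library sketch assumes sampling without replacement; the function name `random_gene_selection` and the `seed` argument are hypothetical and not part of the GSForge API.

```python
import random


def random_gene_selection(gene_index, k=100, seed=None):
    """Draw k distinct genes from gene_index.

    Hypothetical helper for illustration only; assumes sampling
    without replacement.
    """
    rng = random.Random(seed)
    return rng.sample(list(gene_index), k)


gene_index = [f"gene_{i}" for i in range(1000)]
selection = random_gene_selection(gene_index, k=100, seed=0)
```

A random selection of this kind is typically useful as a baseline when comparing gene sets produced by other prospectors.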

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. That means that selected lineaments (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.

k = param.Integer(default=100, inclusive_bounds=(True, True), time_dependent=False, time_fn=Time(label=’Time’, name=’Time00001’, time_type=<class ‘int’>, timestep=1.0, unit=None, until=Infinity()))
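The gene_set_mode options in the parameter list above behave like ordinary set operations over the selected gene sets. A plain-Python sketch, using the hypothetical helper name `combine_gene_sets` and assuming each gene set's support is available as an iterable of gene names:

```python
def combine_gene_sets(all_genes, selected_sets, mode="union"):
    """Combine the support of several gene sets.

    Hypothetical helper for illustration only, not part of GSForge.
    """
    if mode == "complete":
        # Ignore the selected sets and return every gene in the GEM.
        return set(all_genes)
    sets = [set(s) for s in selected_sets]
    if mode == "union":
        return set.union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    raise ValueError(f"Unknown mode: {mode!r}")


all_genes = ["g1", "g2", "g3", "g4", "g5"]
set_a = ["g1", "g2", "g3"]
set_b = ["g2", "g3", "g4"]

union_genes = combine_gene_sets(all_genes, [set_a, set_b], "union")
shared_genes = combine_gene_sets(all_genes, [set_a, set_b], "intersection")
every_gene = combine_gene_sets(all_genes, [set_a, set_b], "complete")
```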

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.create_random_lineament'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

GSForge.operations.prospectors.parse_boruta_model(boruta_model, gene_coords, attrs=None, dim='Gene') → xarray.core.dataset.Dataset[source]¶

Convert a boruta model into an xarray.Dataset object.

Parameters

boruta_model – A boruta_py model.
attrs – A dictionary to be assigned to the output dataset attrs.
gene_coords – An array (index) of the genes passed to the boruta_model.
dim – The name of the coordinate dimension.

Returns

An xarray.Dataset object.

class GSForge.operations.prospectors.boruta_prospector(*args, **params)[source]¶

Bases: GSForge.models._OperationInterface.OperationInterface

Runs a single instance of BorutaPy feature selection.

This is just a simple wrapper for a boruta model that produces an xarray.Dataset object suitable for use in the creation of a GSForge.GeneSet object.

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. That means that selected lineaments (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface. + complete Returns the entire gene set of the AnnotatedGEM. + union Returns the union of the selected gene sets support. + intersection Returns the intersection of the selected gene sets support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix. + ‘complete’ returns the entire count matrix as numbers. + ‘masked’ returns the entire count matrix with zero or missing as NaN values. + ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array. + ‘complete’ returns the entire target array. + ‘masked’ returns the entire target array with zero or missing as NaN values. + ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.
estimator = param.Parameter(): A supervised learning estimator with a ‘fit’ method that, once fitted, exposes a feature_importances_ attribute. Important features must correspond to high absolute values in feature_importances_.
n_estimators = param.Parameter(default=1000): If an int, sets the number of estimators in the chosen ensemble method. If ‘auto’, this is determined automatically based on the size of the dataset. Any other parameters of the estimator must be set when it is initialised.
perc = param.Integer(default=100, inclusive_bounds=(True, True), time_dependent=False, time_fn=Time(label=’Time’, name=’Time00001’, time_type=<class ‘int’>, timestep=1.0, unit=None, until=Infinity())): Instead of the maximum, the percentile defined by the user is used as the threshold for comparing shadow and real features. The maximum tends to be too stringent, and perc provides finer control over this. The lower perc is, the more false positives will be picked as relevant, but also the fewer relevant features will be left out. The default is essentially vanilla Boruta, which corresponds to the maximum.
alpha = param.Number(default=0.05, inclusive_bounds=(True, True), time_dependent=False, time_fn=Time(label=’Time’, name=’Time00001’, time_type=<class ‘int’>, timestep=1.0, unit=None, until=Infinity())): Level at which the corrected p-values will get rejected in both correction steps.
two_step = param.Boolean(bounds=(0, 1), default=True): If you want to use the original implementation of Boruta with Bonferroni correction only, set this to False.
max_iter = param.Integer(default=100, inclusive_bounds=(True, True), time_dependent=False, time_fn=Time(label=’Time’, name=’Time00001’, time_type=<class ‘int’>, timestep=1.0, unit=None, until=Infinity())): The maximum number of iterations to perform.
random_state = param.Parameter(): If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
verbose = param.Integer(default=0, inclusive_bounds=(True, True), time_dependent=False, time_fn=Time(label=’Time’, name=’Time00001’, time_type=<class ‘int’>, timestep=1.0, unit=None, until=Infinity())): Controls verbosity of output: - 0: no output - 1: displays iteration number - 2: which features have been selected already
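To make the role of perc more concrete, the comparison between real and shadow feature importances can be sketched in NumPy as below. The function name `shadow_hits` and the importance values are made up for illustration; this shows only the thresholding step under the description above, not the full Boruta procedure with its statistical tests.

```python
import numpy as np


def shadow_hits(real_importances, shadow_importances, perc=100):
    """Mark real features whose importance beats the shadow threshold.

    Hypothetical helper for illustration only. With perc=100 the
    threshold is the maximum shadow importance.
    """
    threshold = np.percentile(shadow_importances, perc)
    return np.asarray(real_importances) > threshold


real = np.array([0.05, 0.3, 0.5])
shadow = np.array([0.1, 0.2, 0.3, 0.4])

strict_hits = shadow_hits(real, shadow, perc=100)
relaxed_hits = shadow_hits(real, shadow, perc=50)
```

Lowering perc lowers the threshold, so more real features register as hits in a given iteration.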

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.operations.prospectors.boruta_prospector'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.
