GSForge.models Package¶

`models` Package¶

There are two ‘core’ data models in GSForge, both of which store their associated data in an xarray.Dataset object under a data attribute. You are encouraged to consult the xarray documentation for how to perform any transform or selection not provided by GSForge. The two ‘core’ data classes are:

AnnotatedGEM: Contains the gene expression matrix, which is indexed by ‘Gene’ and ‘Sample’ coordinates. This xarray.Dataset object can also hold other data, such as phenotype information.
GeneSet: A GeneSet is a set of genes and any associated values. A GeneSet can be a set of ‘supported’ genes, i.e. genes that are ‘within’ a given GeneSet.
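
To make this layout concrete, here is a minimal sketch that builds a comparable dataset directly with xarray rather than through GSForge. The gene and sample labels and the ‘treatment’ variable are made up for illustration; only the ‘counts’ array name and the ‘Gene’ and ‘Sample’ coordinate names follow the defaults described on this page.

```python
import numpy as np
import xarray as xr

# Hypothetical toy data: 3 genes x 4 samples.
counts = np.arange(12, dtype=float).reshape(3, 4)

data = xr.Dataset(
    data_vars={
        # The count matrix, indexed by the 'Gene' and 'Sample' coordinates.
        "counts": (("Gene", "Sample"), counts),
        # A made-up phenotype annotation, indexed by 'Sample' only.
        "treatment": (("Sample",), ["control", "control", "heat", "heat"]),
    },
    coords={
        "Gene": ["gene_a", "gene_b", "gene_c"],
        "Sample": ["s1", "s2", "s3", "s4"],
    },
)

# Plain xarray selection: integer positions of the heat-treated samples,
# then two genes picked by label.
heat_positions = np.flatnonzero(data["treatment"].values == "heat")
subset = data["counts"].isel(Sample=heat_positions).sel(Gene=["gene_a", "gene_c"])
```

Any selection of this kind uses ordinary xarray indexing (`isel` for positions, `sel` for labels), which is why the xarray documentation is the right place to look for operations GSForge does not wrap.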

The interface classes provide patterns of data access and common transformations that researchers may need from the core data classes. They are:

GeneSetCollection: The work-horse of the GSForge package. This object contains an AnnotatedGEM and a python dictionary of {name: GeneSet} objects. This class contains functions for comparing and analyzing GeneSet objects, as well as tools to pass GeneSet-derived subsets to other functions.
Interface: The Interface object provides a common API for interacting with an AnnotatedGEM or GeneSetCollection. It provides functions that facilitate pulling gene or sample subsets and access to any transforms of the count matrix.
OperationInterface: Aside from being abstract, this is the same as the above Interface, except that it calls a single function, defined by the process method of a subclass.

class GSForge.models.AnnotatedGEM(*args, **params)[source]¶

Bases: param.parameterized.Parameterized

A data class for a gene expression matrix and any associated sample or gene annotations.

This model holds the count expression matrix, and any associated labels or annotations, as an xarray.Dataset object under the .data attribute. By default this dataset is expected to have its indexes named “Gene” and “Sample”, although parameters are available to override the array and index names used.

An AnnotatedGEM object can be created with one of the class methods:

from_files(): A helper function for loading disparate GEM and annotation files through pandas.read_csv().
from_pandas(): Reads in a GEM pandas.DataFrame and an optional annotation DataFrame. These must share the same sample index.
from_netcdf(): Reads in from a .nc filepath. Usually this means loading a previously created AnnotatedGEM.

Randomly generate a demo AnnotatedGEM:

>>> import pandas as pd
>>> from sklearn.datasets import make_multilabel_classification
>>> from GSForge.models import AnnotatedGEM
>>> data, labels = make_multilabel_classification()
>>> agem = AnnotatedGEM.from_pandas(pd.DataFrame(data), pd.DataFrame(labels), name="Generated GEM")

>>> agem
<GSForge.AnnotatedGEM>
Name: Generated GEM
Selected GEM Variable: 'counts'
    Gene   100
    Sample 100

View the entire gene or sample index:

>>> agem.gene_index
<xarray.DataArray 'Gene' (Gene: 100)>…

>>> agem.sample_index
<xarray.DataArray 'Sample' (Sample: 100)>…

>>> agem.infer_variables()
{'all_labels': …

data = param.ClassSelector(class_=<class ‘xarray.core.dataset.Dataset’>): An xarray.Dataset object that contains the Gene Expression Matrix, and any needed annotations. This xarray.Dataset object is expected to have a count array named ‘counts’, that has coordinates (‘Gene’, ‘Sample’).
count_array_name = param.String(default=’counts’): This parameter controls which variable from the xarray.Dataset should be considered to be the ‘count’ variable. Consider using this if you require different index names, or wish to control which count array among many should be used by default.
sample_index_name = param.String(default=’Sample’): This parameter controls which variable from the xarray.Dataset should be considered to be the ‘sample’ coordinate. Consider using this if you require different coordinate names.
gene_index_name = param.String(default=’Gene’): This parameter controls which variable from the xarray.Dataset should be considered to be the ‘gene index’ coordinate. Consider using this if you require different coordinate names.

property count_array_names¶

Returns a list of all available count arrays contained within this AnnotatedGEM object.

This is done simply by returning all data variables that have the same dimension set as the default count array.
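
The rule above can be sketched without xarray by using a plain dictionary of variable name to dimension names as a stand-in for the dataset. The function and variable names here are illustrative, and comparing dimensions as an unordered set is an assumption about what "same dimension set" means.

```python
def count_array_names(variable_dims, count_array_name="counts"):
    """Return the variables whose dimension set matches the default count array.

    variable_dims is a stand-in for an xarray.Dataset: a mapping of
    variable name -> tuple of dimension names.
    """
    target = set(variable_dims[count_array_name])
    return [name for name, dims in variable_dims.items() if set(dims) == target]


# Hypothetical dataset layout: two count-like arrays and one sample annotation.
variable_dims = {
    "counts": ("Gene", "Sample"),
    "log_counts": ("Sample", "Gene"),
    "treatment": ("Sample",),
}

names = count_array_names(variable_dims)
```

Here `treatment` is excluded because it only spans the sample dimension, while `log_counts` is kept even though its dimensions are listed in a different order.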

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

classmethod from_files(count_path: str, label_path: str = None, count_kwargs: dict = None, label_kwargs: dict = None, **params)[source]¶

Construct a GEM object from file paths and optional parsing arguments.

Parameters

count_path – The path to the gene expression matrix.
label_path – The path to the gene annotation data.
count_kwargs – Arguments to be passed to pandas.read_csv for the count matrix.
label_kwargs – Arguments to be passed to pandas.read_csv for the annotations.

Returns

An instance of the GEM class.

classmethod from_netcdf(netcdf_path, **params)[source]¶

Construct a GEM object from a netcdf (.nc) file path.

Parameters: netcdf_path – A path to a netcdf file. If this file has different index names than default (Gene, Sample, counts), be sure to explicitly set those parameters (gene_index_name, sample_index_name, count_array_name).

classmethod from_pandas(count_df: pandas.core.frame.DataFrame, label_df: pandas.core.frame.DataFrame = None, **params)[source]¶

Construct a GEM object from pandas.DataFrame objects.

Parameters

count_df – The gene expression matrix as a pandas.DataFrame. This DataFrame is assumed to have genes as rows and samples as columns.
label_df – The annotation data as a pandas.DataFrame. This DataFrame is assumed to have samples as rows and annotation observations as columns.

Returns

An instance of the GEM class.

property gene_index¶

Returns the entire gene index of this AnnotatedGEM object as an xarray.DataArray.

The actual variable or coordinate that this returns is controlled by the gene_index_name parameter.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

infer_variables(quantile_size=10, skip=None) → dict[source]¶

Infer categories for the variables in the AnnotatedGEM’s labels.

Parameters

quantile_size – The maximum number of unique elements before a variable is no longer considered as a quantile-able set of values.
skip – The variables to be skipped.

Returns

A dictionary of the inferred value types.

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

pprint(imports=None, prefix=' ', unknown_value='<?>', qualify=False, separator='')¶: (Experimental) Pretty printed representation that may be evaluated with eval. See pprint() function for more details.

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

property sample_index¶

Returns the entire sample index of this AnnotatedGEM object as an xarray.DataArray.

The actual variable or coordinate that this returns is controlled by the sample_index_name parameter.

save(path)[source]¶

Save as a netcdf (.nc) file at path.

Parameters: path – The filepath to save to. This should use the .nc extension.
Returns: The path to which the file was saved.

script_repr(imports=[], prefix=' ')¶: Variant of __repr__ designed for generating a runnable script.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._AnnotatedGEM.AnnotatedGEM'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

class GSForge.models.GeneSet(*args, **params)[source]¶

Bases: param.parameterized.Parameterized

A data class for the result of a gene selection or analysis.

A GeneSet can also be a measurement or ranking of a set of genes, and this could include all of the ‘available’ genes. In such cases a boolean array ‘support’ indicates membership in the GeneSet.

data = param.Parameter(): Contains a gene-indexed xarray.Dataset object. It should either have only those genes that are considered ‘within’ the GeneSet in its index, or include a boolean variable named ‘support’.
support_index_name = param.String(default=’support’): This parameter controls which variable should be considered to be the (boolean) variable indicating membership in this GeneSet.
gene_index_name = param.String(default=’Gene’): This parameter controls which variable from the xarray.Dataset should be considered to be the ‘gene index’ coordinate. Consider using this if you require different coordinate names.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

classmethod from_GeneSets(*gene_sets, mode: str = 'union', attrs=None, **params)[source]¶

Create a new GeneSet by combining all the genes in the given GeneSets.

No variables or attributes from the original GeneSets are maintained in this process.

classmethod from_netcdf(path, **params)[source]¶: Construct a GeneSet object from a netcdf file path.

property gene_index¶

Returns the entire gene index of this GeneSet object as an xarray.DataArray.

The variable or coordinate that this returns is controlled by the gene_index_name parameter.

Returns: The entire gene index of this GeneSet as an xarray.DataArray.

gene_support() → numpy.core.multiarray.array[source]¶

Returns the list of genes ‘supported’ in this GeneSet.

The value that this returns is (by default) controlled by the self.support_index_name parameter.

Returns: A numpy array of the genes ‘supported’ by this GeneSet.
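
As a rough picture of what a boolean ‘support’ variable means, the sketch below applies a boolean mask to a gene index with plain numpy. It does not call GSForge; the gene names and mask values are invented for illustration.

```python
import numpy as np

# Hypothetical gene index and a boolean 'support' array of the same length.
genes = np.array(["gene_a", "gene_b", "gene_c", "gene_d"])
support = np.array([True, False, True, False])

# Genes whose support value is True are the members of the set.
supported_genes = genes[support]
```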

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

k_best_genes(k=100, score_name=None) → numpy.core.multiarray.array[source]¶

Select the highest scoring genes from the ‘score_name’ variable.

Parameters

k – The number of genes to return.
score_name – The variable name to rank genes by.

Returns

A numpy array of the top k genes based on their scores in score_name.
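
The idea of ranking by a score variable and keeping the top k can be sketched with numpy as follows. This is an illustration of the technique only, not the GSForge implementation; the helper name `k_best` and the example scores are hypothetical.

```python
import numpy as np


def k_best(genes, scores, k=100):
    """Return the k gene labels with the highest scores, best first."""
    # argsort is ascending, so reverse it to rank from highest to lowest.
    order = np.argsort(scores)[::-1]
    return genes[order[:k]]


genes = np.array(["gene_a", "gene_b", "gene_c", "gene_d"])
scores = np.array([0.1, 0.9, 0.5, 0.7])

top_two = k_best(genes, scores, k=2)
```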

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

static parse_GeneSets(*gene_sets, mode: str = 'union', attrs=None, **params)[source]¶

Combines the GeneSet objects given using mode to create a single new GeneSet object.

Since the complete gene index is not necessarily known, it must minimally be the union of all genes included in the provided gene sets.

Parameters

gene_sets – One or more GSForge.GeneSet objects.
mode – Mode by which the gene_sets should be combined. Options are “union” or “intersection”.
attrs – Optional attributes for the combined GeneSet. These attributes are added to the GeneSet.data.attrs attribute.
params – Keyword parameters for the GeneSet object to be initialized with.

Returns

A new GeneSet object that contains genes from the provided gene_sets.
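
The two combination modes correspond to ordinary set operations on the supported genes. The sketch below shows this with Python sets; `combine_support` is a hypothetical helper, not part of GSForge, and it ignores any variables or attributes attached to the gene sets.

```python
def combine_support(*supports, mode="union"):
    """Combine several collections of gene names by union or intersection."""
    sets = [set(s) for s in supports]
    if not sets:
        raise ValueError("At least one gene set is required.")
    if mode == "union":
        combined = set().union(*sets)
    elif mode == "intersection":
        combined = set.intersection(*sets)
    else:
        raise ValueError(f"Unknown mode: {mode!r}")
    return sorted(combined)


lasso = ["gene_a", "gene_b", "gene_c"]
boruta = ["gene_b", "gene_c", "gene_d"]

union_genes = combine_support(lasso, boruta, mode="union")
shared_genes = combine_support(lasso, boruta, mode="intersection")
```

Note that the union is also the smallest gene index that can describe every input set, which matches the remark above about the complete gene index not being known.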

static parse_pandas(dataframe, genes=None, attrs=None, **params)[source]¶: Parse a pandas.DataFrame for use in a GeneSet.

pprint(imports=None, prefix=' ', unknown_value='<?>', qualify=False, separator='')¶: (Experimental) Pretty printed representation that may be evaluated with eval. See pprint() function for more details.

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

q_best_genes(q=0.999, score_name=None) → numpy.core.multiarray.array[source]¶

Returns a numpy array of the q best genes based on the quantile q, and the target variable score_name.

Parameters

q – The quantile cutoff.
score_name – The target variable to judge the genes by.

Returns

A numpy array of the top q quantile genes based on score_name.
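
A quantile cutoff can be sketched with numpy.quantile as shown below. This is an assumption-laden illustration rather than the GSForge code: the helper name is hypothetical, and whether genes exactly at the cutoff are kept (`>=` here) is a choice made for the example.

```python
import numpy as np


def q_best(genes, scores, q=0.999):
    """Return the genes whose score is at or above the q-th quantile of scores."""
    cutoff = np.quantile(scores, q)
    return genes[scores >= cutoff]


genes = np.array(["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"])
scores = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

upper_half = q_best(genes, scores, q=0.5)
```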

save_as_netcdf(target_dir=None, name=None)[source]¶

Save this GeneSet as a netcdf (.nc) file in the target_dir directory.

The default filename will be {GeneSet.name}.nc. If the GeneSet does not have a name, one must be provided via the name argument.

Parameters

target_dir – The directory to place the saved GeneSet into.
name – The name to give the GeneSet upon saving.

Returns output_path

The path to which the file was saved.
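
The naming rule can be sketched with pathlib as below. The helper `output_path_for` is hypothetical, and two details are assumptions made for the example: that an explicit name argument takes precedence over the GeneSet's own name, and that a missing target_dir means the current directory.

```python
from pathlib import Path


def output_path_for(gene_set_name, target_dir=None, name=None):
    """Build the .nc output path for a gene set, following the rule described above."""
    chosen = name if name is not None else gene_set_name
    if chosen is None:
        raise ValueError("The GeneSet has no name, so one must be provided.")
    # Assumption: fall back to the current directory when no target_dir is given.
    directory = Path(target_dir) if target_dir is not None else Path(".")
    return directory / f"{chosen}.nc"


named_path = output_path_for("lasso", target_dir="results")
manual_path = output_path_for(None, name="manual")
```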

script_repr(imports=[], prefix=' ')¶: Variant of __repr__ designed for generating a runnable script.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

set_gene_support(genes)[source]¶: Set this GeneSet support to the given genes.

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSet.GeneSet'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

class GSForge.models.GeneSetCollection(**params)[source]¶

Bases: param.parameterized.Parameterized

A data class that holds an AnnotatedGEM and a dictionary of associated GeneSet objects.

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): A Gene Expression Matrix (GEM) object.
gene_sets = param.Dict(class_=<class ‘dict’>): A dictionary of {key: xarray.DataArray}, boolean arrays indicating support for a given gene.

as_dict(keys=None, exclude=None)[source]¶

Returns a dictionary of {name: supported_genes} for each gene set, or those specified by the keys argument.

Parameters

keys – The list of GeneSet keys to be included in the returned dictionary.
exclude – A list of GeneSet keys to exclude from the returned dictionary.
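
The keys and exclude arguments behave like a simple filter over the dictionary of gene sets. A minimal sketch of that filtering, using a plain dictionary of name to gene list and a hypothetical helper name, could look like this:

```python
def select_gene_sets(gene_sets, keys=None, exclude=None):
    """Return {name: genes} for the requested keys, minus any excluded keys."""
    selected = list(gene_sets) if keys is None else list(keys)
    if exclude is not None:
        selected = [key for key in selected if key not in exclude]
    return {key: gene_sets[key] for key in selected}


# Hypothetical collection contents.
gene_sets = {
    "lasso": ["gene_a", "gene_b"],
    "boruta": ["gene_b", "gene_c"],
    "random": ["gene_d"],
}

everything = select_gene_sets(gene_sets)
without_random = select_gene_sets(gene_sets, exclude=["random"])
only_lasso = select_gene_sets(gene_sets, keys=["lasso"])
```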

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

classmethod from_folder(gem, target_dir, glob_filter='*.nc', filter_func=None, **params)[source]¶: Create a GeneSetCollection from the files in target_dir that match glob_filter. The base file names will be used as the key values.
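
The file discovery step implied by target_dir and glob_filter can be sketched with pathlib. This does not load any GeneSet data; `keyed_paths` is a hypothetical helper, and two points are assumptions: that the key is the file name without its extension, and that filter_func is a predicate applied to each path.

```python
import tempfile
from pathlib import Path


def keyed_paths(target_dir, glob_filter="*.nc", filter_func=None):
    """Map base file name -> path for every file in target_dir matching glob_filter."""
    paths = sorted(Path(target_dir).glob(glob_filter))
    if filter_func is not None:
        # Assumption: filter_func returns True for paths that should be kept.
        paths = [path for path in paths if filter_func(path)]
    return {path.stem: path for path in paths}


with tempfile.TemporaryDirectory() as tmp:
    # Create two empty stand-in .nc files and one unrelated file.
    for file_name in ("lasso.nc", "boruta.nc", "notes.txt"):
        (Path(tmp) / file_name).touch()
    found = keyed_paths(tmp)
    keys = sorted(found)
    lasso_only = sorted(keyed_paths(tmp, filter_func=lambda p: p.stem.startswith("l")))
```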

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

get_support(key) → numpy.core.multiarray.array[source]¶

Get the support array for a given key.

Parameters: key – The GeneSet from which to get the gene support.
Returns: A numpy array of the genes that make up the support of this GeneSet.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

intersection(keys=None, exclude=None)[source]¶: Get the intersection of supported genes in this GeneSet collection.

message(**kwargs)¶: Inspect .param.message method for the full docstring

pairwise_percent_intersection(keys=None)[source]¶: Get the normalized intersection length of each pairwise combination of GeneSets.
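
One way to read "normalized intersection length" is the size of the overlap divided by the size of the first set in each ordered pair. That normalization is an assumption, since this page does not define it, and the sketch below is an illustration on plain Python sets rather than the GSForge implementation.

```python
from itertools import permutations


def pairwise_fraction_shared(supports):
    """For each ordered pair (a, b), return the fraction of a's genes also in b.

    supports maps a gene set name to a non-empty set of gene names.
    """
    return [
        (a, b, len(supports[a] & supports[b]) / len(supports[a]))
        for a, b in permutations(supports, 2)
    ]


# Hypothetical supports for two gene sets.
supports = {
    "lasso": {"gene_a", "gene_b", "gene_c", "gene_d"},
    "boruta": {"gene_c", "gene_d"},
}

overlaps = pairwise_fraction_shared(supports)
```

With this normalization the result is asymmetric: half of the lasso genes appear in boruta, while every boruta gene appears in lasso.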

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

pprint(imports=None, prefix=' ', unknown_value='<?>', qualify=False, separator='')¶: (Experimental) Pretty printed representation that may be evaluated with eval. See pprint() function for more details.

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

save(target_dir, keys=None)[source]¶

Save this collection to target_dir. Each GeneSet will be saved as a separate .netcdf file within this directory.

Parameters

target_dir – The path to which the ‘GeneSet’ xarray.Dataset .netcdf files will be written.
keys – The list of GeneSet keys that should be saved. If this is not provided, all GeneSet objects are saved.

Returns

script_repr(imports=[], prefix=' ')¶: Variant of __repr__ designed for generating a runnable script.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._GeneSetCollection.GeneSetCollection'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

union(keys=None, exclude=None)[source]¶: Get the union of supported genes in this GeneSet collection.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

class GSForge.models.Interface(*args, **params)[source]¶

Bases: param.parameterized.Parameterized

The Interface provides common API access for interacting with the AnnotatedGEM and GeneSetCollection objects. It also accepts an AnnotatedGEM and a single GeneSet for subset selection.

For updating default parameters within subclasses, use the following, although it may cause ‘watching’ parameters to fire.

` self.set_param(key=value) `

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. That means that selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface.
+ ‘complete’ returns the entire gene set of the AnnotatedGEM.
+ ‘union’ returns the union of the selected gene sets’ support.
+ ‘intersection’ returns the intersection of the selected gene sets’ support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that will control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix.
+ ‘complete’ returns the entire count matrix as numbers.
+ ‘masked’ returns the entire count matrix with zero or missing as NaN values.
+ ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array.
+ ‘complete’ returns the entire target array.
+ ‘masked’ returns the entire target array with zero or missing as NaN values.
+ ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.
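
The three count_mask options can be illustrated on a small numpy array. This is a sketch of the described behavior, not the GSForge code: the helper name is hypothetical, and the layout (genes as rows, samples as columns) is an assumption made for the example.

```python
import numpy as np


def apply_count_mask(counts, mode="complete"):
    """Apply a 'complete', 'masked' or 'dropped' mask to a genes x samples array."""
    if mode == "complete":
        return counts
    # Zero and NaN entries are both treated as missing.
    missing = (counts == 0) | np.isnan(counts)
    if mode == "masked":
        return np.where(missing, np.nan, counts)
    if mode == "dropped":
        # Keep only the genes (rows) with no zero or missing values.
        return counts[~missing.any(axis=1)]
    raise ValueError(f"Unknown mask mode: {mode!r}")


counts = np.array([
    [1.0, 2.0, 3.0],
    [0.0, 5.0, 6.0],
    [7.0, np.nan, 9.0],
])

masked = apply_count_mask(counts, "masked")
dropped = apply_count_mask(counts, "dropped")
```

A count_transform would then run on whichever of these arrays was selected, for example `np.log2(dropped)`.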

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array[source]¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.
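
The precedence among the gene selection parameters can be sketched as a small resolver on plain Python data. The function below is hypothetical and simplifies the parameters described for this class; in particular, falling back to the complete gene index when no gene sets are selected is an assumption.

```python
def resolve_genes(all_genes, gene_sets, selected_gene_sets=None,
                  selected_genes=None, gene_set_mode="union"):
    """Return the sorted gene names picked by the given selection parameters."""
    # An explicit gene list takes priority over every other selection method.
    if selected_genes is not None:
        return sorted(selected_genes)
    # Assumption: with no selected gene sets, the complete index is used.
    if gene_set_mode == "complete" or not selected_gene_sets:
        return sorted(all_genes)
    supports = [set(gene_sets[key]) for key in selected_gene_sets]
    if gene_set_mode == "union":
        return sorted(set().union(*supports))
    if gene_set_mode == "intersection":
        return sorted(set.intersection(*supports))
    raise ValueError(f"Unknown gene_set_mode: {gene_set_mode!r}")


all_genes = ["a", "b", "c", "d", "e"]
gene_sets = {"lasso": ["a", "b", "c"], "boruta": ["b", "c", "d"]}
chosen = ["lasso", "boruta"]

union_genes = resolve_genes(all_genes, gene_sets, chosen)
shared_genes = resolve_genes(all_genes, gene_sets, chosen, gene_set_mode="intersection")
explicit_genes = resolve_genes(all_genes, gene_sets, chosen, selected_genes=["e"])
```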

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

get_sample_index() → numpy.core.multiarray.array[source]¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict[source]¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

pprint(imports=None, prefix=' ', unknown_value='<?>', qualify=False, separator='')¶: (Experimental) Pretty printed representation that may be evaluated with eval. See pprint() function for more details.

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Variant of __repr__ designed for generating a runnable script.

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._Interface.Interface'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.

class GSForge.models.OperationInterface(*args, **params)[source]¶

Bases: GSForge.models._Interface.Interface, param.parameterized.ParameterizedFunction

Abstract class for a GEMOperation.

Every GEMOperation undergoes some argument parsing, then calls self.process(), which must be implemented by subclasses.
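
The general pattern (shared argument handling in the base class, one overridable process step in each subclass) can be sketched with the standard library's abc module. This is an analogy only: the class names are hypothetical, and the real class builds on param.ParameterizedFunction rather than on ABC.

```python
from abc import ABC, abstractmethod


class Operation(ABC):
    """Base class: stores keyword arguments, then delegates to process()."""

    def __init__(self, **params):
        # Stand-in for the argument parsing done by the real interface.
        self.params = params

    def __call__(self):
        return self.process()

    @abstractmethod
    def process(self):
        """Subclasses implement the actual operation here."""


class CountGenes(Operation):
    """Hypothetical operation: report how many genes were passed in."""

    def process(self):
        return len(self.params["genes"])


gene_count = CountGenes(genes=["gene_a", "gene_b", "gene_c"])()
```

Trying to instantiate `Operation` directly raises a TypeError, which is how the abstract base enforces that process is implemented.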

gem = param.ClassSelector(class_=<class ‘GSForge.models._AnnotatedGEM.AnnotatedGEM’>): An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(class_=<class ‘GSForge.models._GeneSetCollection.GeneSetCollection’>): A GeneSetCollection object.
selected_gene_sets = param.ListSelector(default=[None], objects=[]): A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(): A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selecting methods. That means that selected gene sets (or combinations thereof) will have no effect.
gene_set_mode = param.ObjectSelector(default=’union’, objects=[‘complete’, ‘union’, ‘intersection’]): Controls how any selected gene sets are returned by the interface.
+ ‘complete’ returns the entire gene set of the AnnotatedGEM.
+ ‘union’ returns the union of the selected gene sets’ support.
+ ‘intersection’ returns the intersection of the selected gene sets’ support.
sample_subset = param.Parameter(): A list of samples to use in a given operation. These can be supplied directly as a list of samples, or can be drawn from a given GeneSet.
count_variable = param.String(): The name of the count matrix used.
annotation_variables = param.Parameter(): The name of the active annotation variable(s). These are the annotation columns that will control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘masked’, ‘dropped’]): The type of mask to use for the count matrix.
+ ‘complete’ returns the entire count matrix as numbers.
+ ‘masked’ returns the entire count matrix with zero or missing as NaN values.
+ ‘dropped’ returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(default=’complete’, objects=[‘complete’, ‘dropped’]): The type of mask to use for the target array.
+ ‘complete’ returns the entire target array.
+ ‘masked’ returns the entire target array with zero or missing as NaN values.
+ ‘dropped’ returns the target array without samples that have zero or missing values.
count_transform = param.Callable(): A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.

property active_count_variable¶: Returns the name of the currently active count matrix.

debug(**kwargs)¶: Inspect .param.debug method for the full docstring

defaults(**kwargs)¶: Inspect .param.defaults method for the full docstring

force_new_dynamic_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

property gene_index_name¶: Returns the name of the gene index.

get_gene_index(count_variable=None) → numpy.core.multiarray.array¶

Get the currently selected gene index as a numpy array.

Parameters: count_variable – The variable to be retrieved.
Returns: A numpy array of the currently selected genes.

get_param_values = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

get_sample_index() → numpy.core.multiarray.array¶

Get the currently selected sample index as a numpy array.

Returns: A numpy array of the currently selected samples.

get_selection_indexes() → dict¶: Returns the currently selected indexes as a dictionary.

get_value_generator = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

inspect_value = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

instance = functools.partial(<function ParameterizedFunction.instance>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

message(**kwargs)¶: Inspect .param.message method for the full docstring

params = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

pprint(imports=None, prefix='\n ', unknown_value='<?>', qualify=False, separator='')¶: Same as Parameterized.pprint, except that X.classname(Y is replaced with X.classname.instance(Y

classmethod print_param_defaults(*args, **kwargs)¶: Inspect .param.print_param_defaults method for the full docstring

print_param_values(**kwargs)¶: Inspect .param.print_param_values method for the full docstring

process()[source]¶: Abstract process.

property sample_index_name¶: Returns the name of the sample index.

script_repr(imports=[], prefix=' ')¶: Same as Parameterized.script_repr, except that X.classname(Y is replaced with X.classname.instance(Y

property selection¶: Returns the currently selected data.

classmethod set_default(*args, **kwargs)¶: Inspect .param.set_default method for the full docstring

set_dynamic_time_fn = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

set_param = functools.partial(<function Parameters.deprecate.<locals>.inner>, <class 'GSForge.models._OperationInterface.OperationInterface'>)¶

state_pop()¶

Restore the most recently saved state.

See state_push() for more details.

state_push()¶

Save this instance’s state.

For Parameterized instances, this includes the state of dynamically generated values.

Subclasses that maintain short-term state should additionally save and restore that state using state_push() and state_pop().

Generally, this method is used by operations that need to test something without permanently altering the objects’ state.

verbose(**kwargs)¶: Inspect .param.verbose method for the full docstring

warning(**kwargs)¶: Inspect .param.warning method for the full docstring

property x_count_data¶

Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.

Returns: An xarray.Dataset selection of the currently active ‘x_data’.

property y_annotation_data¶

Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.

Returns: An xarray.Dataset or xarray.DataArray object of the currently selected y_data.