GSForge.models package¶
Module contents¶
There are two core data models in GSForge, both of which store their associated data in an xarray.Dataset object under a data attribute. You are encouraged to consult the xarray documentation for how to perform any transform or selection not provided by GSForge.
Core Data Classes¶
- AnnotatedGEM
Contains the gene expression matrix, which is indexed by ‘Gene’ and ‘Sample’ coordinates. This xarray.Dataset object also contains (but is not limited to) phenotype information.
- GeneSet
A GeneSet is a set of genes and any associated values. A GeneSet can be a set of ‘supported’ genes, i.e. genes that are ‘within’ a given GeneSet.
These core data classes are constructed with a limited set of packages:
numpy
pandas
xarray
param
This allows the creation of container images without interactive visualization libraries.
- class GSForge.models.AnnotatedGEM(*args, **params)¶
Bases:
param.parameterized.Parameterized
A data class for a gene expression matrix and any associated sample or gene annotations.
This model holds the count expression matrix, and any associated labels or annotations, as an xarray.Dataset object under the .data attribute. By default this dataset is expected to have its indexes named “Gene” and “Sample”, although there are parameters to override the array and index names used.
data = param.ClassSelector(readonly=False)
An xarray.Dataset object that contains the gene expression matrix and any needed annotations. This xarray.Dataset object is expected to have a count array named ‘counts’ that has coordinates (‘Gene’, ‘Sample’).
count_array_name = param.String(readonly=False)
This parameter controls which variable from the xarray.Dataset should be considered to be the ‘count’ variable. Consider using this if you require a different variable name, or wish to control which count array among many should be used by default.
sample_index_name = param.String(readonly=False)
This parameter controls which variable from the xarray.Dataset should be considered to be the ‘sample’ coordinate. Consider using this if you require different coordinate names.
gene_index_name = param.String(readonly=False)
This parameter controls which variable from the xarray.Dataset should be considered to be the ‘gene index’ coordinate. Consider using this if you require different coordinate names.
- data = None¶
- count_array_name = 'counts'¶
- sample_index_name = 'Sample'¶
- gene_index_name = 'Gene'¶
- property gene_index: xarray.core.dataarray.DataArray¶
Returns the entire gene index of this AnnotatedGEM object as an xarray.DataArray.
The variable or coordinate that this returns is controlled by the gene_index_name parameter.
- Returns
The complete gene index of this AnnotatedGEM.
- Return type
xarray.DataArray
- property sample_index: xarray.core.dataarray.DataArray¶
Returns the entire sample index of this AnnotatedGEM object as an xarray.DataArray.
The variable or coordinate that this returns is controlled by the sample_index_name parameter.
- Returns
The complete sample index of this AnnotatedGEM.
- Return type
xarray.DataArray
- property count_array_names: List[str]¶
Returns a list of all available count arrays contained within this AnnotatedGEM object.
This is done simply by returning all data variables that have the same dimension set as the default count array.
- Returns
A list of available count arrays in this AnnotatedGEM.
- Return type
List[str]
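The dimension-matching rule described for count_array_names can be sketched without xarray. This is a minimal illustration, not the GSForge implementation: the function name matching_count_arrays and the variable_dims mapping are hypothetical stand-ins for the data variables of an xarray.Dataset and their dimension names.

```python
from typing import Dict, List, Tuple


def matching_count_arrays(variable_dims: Dict[str, Tuple[str, ...]],
                          default_name: str = "counts") -> List[str]:
    """Return the variables whose dimension set equals that of the default count array."""
    target = set(variable_dims[default_name])
    return [name for name, dims in variable_dims.items() if set(dims) == target]


# Hypothetical variables and their dimensions (assumed for illustration only).
variable_dims = {
    "counts": ("Sample", "Gene"),
    "normalized_counts": ("Gene", "Sample"),
    "treatment": ("Sample",),
    "gene_length": ("Gene",),
}

count_arrays = matching_count_arrays(variable_dims)
print(count_arrays)
```

Comparing sets rather than tuples means a variable stored as (Gene, Sample) still matches a default array stored as (Sample, Gene).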
- infer_variables(quantile_size: int = 10, skip: Optional[bool] = None) Dict[str, numpy.ndarray] ¶
Infer categories for the variables in the AnnotatedGEM’s labels.
- Parameters
quantile_size (int) – The maximum number of unique elements before a variable is no longer considered as a quantile-able set of values.
skip (bool) – The variables to be skipped.
- Returns
A dictionary of the inferred value types.
- Return type
Dict[str, numpy.ndarray]
- classmethod from_netcdf(netcdf_path: Union[str, pathlib.Path, IO], **params) GSForge.models._AnnotatedGEM.AnnotatedGEM ¶
Construct an AnnotatedGEM object from a netcdf (.nc) file path.
- Parameters
netcdf_path (Union[str, Path, IO[AnyStr]]) – A path to a netcdf file. If this file has different index names than the defaults (Gene, Sample, counts), be sure to explicitly set those parameters (gene_index_name, sample_index_name, count_array_name).
- Returns
A new instance of the AnnotatedGEM class.
- Return type
AnnotatedGEM
- classmethod from_pandas(count_df: pandas.core.frame.DataFrame, label_df: Optional[pandas.core.frame.DataFrame] = None, **params) GSForge.models._AnnotatedGEM.AnnotatedGEM ¶
Reads in a GEM pandas.DataFrame and an optional annotation DataFrame. These must share the same sample index.
- Parameters
count_df (pd.DataFrame) – The gene expression matrix as a pandas.DataFrame. This DataFrame is assumed to have genes as rows and samples as columns.
label_df (pd.DataFrame) – The annotation data as a pandas.DataFrame. This DataFrame is assumed to have samples as rows and annotation observations as columns.
- Returns
A new instance of the AnnotatedGEM class.
- Return type
AnnotatedGEM
- static xrarray_gem_from_pandas(count_df: pandas.core.frame.DataFrame, label_df: Optional[pandas.core.frame.DataFrame] = None, transpose_counts: bool = True) xarray.core.dataset.Dataset ¶
Stitch together gene expression and annotation DataFrames into a single xarray.Dataset object.
- Parameters
count_df (pd.DataFrame) – The gene expression matrix as a pandas.DataFrame; assumed to have genes as rows and samples as columns.
label_df (pd.DataFrame) – The annotation data as a pandas.DataFrame; assumed to have samples as rows and annotations as columns.
transpose_counts (bool) – Transpose the count matrix from (genes as rows, samples as columns) to (samples as rows, genes as columns).
- Returns
An xarray.Dataset containing the gene expression matrix and the annotation data.
- Return type
xarray.Dataset
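The layout assumptions above (genes as rows in the count matrix, samples as rows in the annotations) can be checked with pandas alone before handing the frames to GSForge. The sketch below uses made-up gene and sample names and does not call GSForge; it only shows why transposing the counts gives both frames a common sample index.

```python
import pandas as pd

# Hypothetical count matrix: genes as rows, samples as columns.
count_df = pd.DataFrame(
    [[5, 0, 12], [3, 8, 1]],
    index=pd.Index(["gene_a", "gene_b"], name="Gene"),
    columns=pd.Index(["s1", "s2", "s3"], name="Sample"),
)

# Hypothetical annotations: samples as rows, observations as columns.
label_df = pd.DataFrame(
    {"treatment": ["control", "heat", "heat"]},
    index=pd.Index(["s1", "s2", "s3"], name="Sample"),
)

# Transposing puts samples on the rows, so both frames share one index.
sample_major = count_df.T
shared_index = sample_major.index.equals(label_df.index)
print(sample_major.shape, shared_index)
```

If shared_index is False, the two frames do not describe the same samples in the same order and should be reconciled first.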
- classmethod from_files(count_path: Union[str, pathlib.Path, IO], label_path: Optional[Union[str, pathlib.Path, IO]] = None, count_kwargs: Optional[dict] = None, label_kwargs: Optional[dict] = None, transpose_counts: bool = True, **params) GSForge.models._AnnotatedGEM.AnnotatedGEM ¶
Construct an AnnotatedGEM object from file paths and optional parsing arguments.
- Parameters
count_path (Union[str, Path, IO[AnyStr]]) – Path to the gene expression matrix.
label_path (Union[str, Path, IO[AnyStr]]) – Path to the annotation data.
count_kwargs (dict) – A dictionary of arguments to be passed to pandas.read_csv for the count matrix.
label_kwargs (dict) – A dictionary of arguments to be passed to pandas.read_csv for the annotations.
- Returns
A new instance of the AnnotatedGEM class.
- Return type
AnnotatedGEM
- classmethod from_geo_id(geo_id: str, destination: str = './') GSForge.models._AnnotatedGEM.AnnotatedGEM ¶
- save(path: Union[str, pathlib.Path, IO], **kwargs) str ¶
Save as a netcdf (.nc) file at path.
- Parameters
path (Union[str, Path, IO[AnyStr]]) – The filepath to save to. This should use the .nc extension.
- Returns
The path to which the file was saved.
- Return type
str
- name = 'AnnotatedGEM'¶
- class GSForge.models.GeneSet(*args, **params)¶
Bases:
param.parameterized.Parameterized
A data class for the result of a gene selection or analysis.
A GeneSet can also be a measurement or ranking of a set of genes, and this could include all of the ‘available’ genes. In such cases a boolean array ‘support’ indicates membership in the GeneSet.
Create a GeneSet from a netcdf (.nc) file path, pandas.DataFrame, np.ndarray or list of genes:
# Supply any of the above objects along with any other parameters to create a GeneSet.
my_geneset = GeneSet(<pandas.DataFrame, xarray.Dataset, numpy.ndarray, str>)
# One can also explicitly call the constructors for the types above, e.g.:
my_geneset = GeneSet.from_pandas(<pandas.DataFrame>)
Get supported Genes:
my_geneset.get_support()
Set the support with a list or array of genes:
my_geneset.set_support_by_genes(my_genes)
data = param.Parameter(readonly=False)
Contains a gene-indexed xarray.Dataset object. It should have only those genes that are considered ‘within’ the GeneSet in the index, or a boolean variable named ‘support’.
support_index_name = param.String(readonly=False)
This parameter controls which variable should be considered to be the (boolean) variable indicating membership in this GeneSet.
gene_index_name = param.String(readonly=False)
This parameter controls which variable from the xarray.Dataset should be considered to be the ‘gene index’ coordinate. Consider using this if you require different coordinate names.
- data = None¶
- support_index_name = 'support'¶
- gene_index_name = 'Gene'¶
- classmethod from_pandas(dataframe: pandas.core.frame.DataFrame, genes: Optional[numpy.ndarray] = None, attrs=None, **params)¶
Create a GeneSet from a pandas.DataFrame.
- Parameters
dataframe (pd.DataFrame) – A pandas.DataFrame object. Assumed to be indexed by gene names.
genes (np.ndarray) – If you have a separate gene array that corresponds to your data (and is ordered the same way), it can be passed here to be set as the index.
attrs (dict) – A dictionary of attributes to be added to the xarray.Dataset.attrs attribute.
params (dict) – Other parameters to set.
- Returns
A new GeneSet object.
- Return type
GeneSet
- classmethod from_GeneSets(*gene_sets: GSForge.models._GeneSet.GeneSet, mode: str = 'union', attrs=None, **params) GSForge.models._GeneSet.GeneSet ¶
Create a new GeneSet by combining all the genes in the given GeneSets.
No variables or attributes from the original GeneSets are maintained in this process.
- Parameters
*gene_sets (GeneSet) – One or more GSForge.GeneSet objects.
mode (str) – Mode by which to combine the given GeneSet objects.
attrs (dict) – A dictionary of attributes to be added to the xarray.Dataset.attrs attribute.
params (dict) – Other parameters to set.
- Returns
A new GeneSet built from the given GeneSets as described by mode.
- Return type
GeneSet
- classmethod from_bool_array(bool_array: numpy.ndarray, complete_gene_index: numpy.ndarray, attrs=None, **params) GSForge.models._GeneSet.GeneSet ¶
Create a GeneSet object from a boolean support array. This requires a matching gene index array.
- Parameters
bool_array (np.ndarray) – A boolean array representing support within this GeneSet.
complete_gene_index (np.ndarray) – The complete gene index.
attrs (dict) – A dictionary of attributes to be added to the xarray.Dataset.attrs attribute.
params (dict) – Other parameters to set.
- Returns
A new GeneSet object.
- Return type
GeneSet
- classmethod from_gene_array(selected_gene_array: numpy.ndarray, complete_gene_index=None, attrs=None, **params) GSForge.models._GeneSet.GeneSet ¶
Parses arguments for a new GeneSet from an array or list of ‘selected’ genes. Such genes are assumed to be within the optionally supplied complete_gene_index.
- Parameters
selected_gene_array (np.ndarray) – The genes ‘selected’ to be within the support of this GeneSet.
complete_gene_index (np.ndarray) – Optional. The complete gene index to which those selected genes belong.
attrs (dict) – A dictionary of attributes to be added to the xarray.Dataset.attrs attribute.
params (dict) – Other parameters to set.
- Returns
A new GeneSet object.
- Return type
GeneSet
- classmethod from_xarray_dataset(data: xarray.core.dataset.Dataset, **params) GSForge.models._GeneSet.GeneSet ¶
Create a GeneSet from an xarray.Dataset.
- Parameters
data (xr.Dataset) – An xarray.Dataset object. See the .data parameter of this class.
params (dict) – Other parameters to set.
- Returns
A new GeneSet object.
- Return type
GeneSet
- classmethod from_netcdf(path: Union[str, pathlib.Path, IO], **params)¶
Create a GeneSet object from a netcdf file path.
- Parameters
path (Union[str, Path, IO[AnyStr]]) – The path to the netcdf (.nc) file to be used.
params (dict) – Other parameters to set.
- Returns
A new GeneSet object.
- Return type
GeneSet
- property gene_index: xarray.core.dataarray.DataArray¶
Returns the entire gene index of this GeneSet object as an xarray.DataArray.
The variable or coordinate that this returns is controlled by the gene_index_name parameter.
- Returns
A copy of the entire gene index of this GeneSet as an xarray.DataArray.
- Return type
xr.DataArray
- get_support() numpy.ndarray ¶
Returns the list of genes ‘supported’ in this GeneSet.
The value that this returns is (by default) controlled by the self.support_index_name parameter.
- Returns
A numpy array of the genes ‘supported’ by this GeneSet.
- Return type
numpy.ndarray
- property support_exists: bool¶
Returns True if a support array exists and has at least one member; returns False otherwise.
- set_support_by_genes(genes: numpy.ndarray) GSForge.models._GeneSet.GeneSet ¶
Set this GeneSet support to the given genes. This function calculates the boolean support array for the gene index via np.isin(gene_index, genes). Returns an updated copy of the GeneSet.
- Parameters
genes (np.ndarray) – An array of genes which represent the “supported” subset within the entire gene index.
- Returns
An updated copy of the GeneSet.
- Return type
GeneSet
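The np.isin call named above can be tried directly with NumPy. The gene names here are made up for illustration, and the snippet shows only the boolean-mask step, not how GSForge stores the result.

```python
import numpy as np

# Hypothetical complete gene index and a selection of genes within it.
gene_index = np.array(["gene_a", "gene_b", "gene_c", "gene_d"])
genes = np.array(["gene_d", "gene_b"])

# True wherever an entry of the gene index appears in the selected genes.
support = np.isin(gene_index, genes)

# Indexing with the mask recovers the supported genes in gene-index order.
supported_genes = gene_index[support]
print(support, supported_genes)
```

The mask always has the same length as the gene index, and the recovered genes follow the order of the index rather than the order in which they were supplied.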
- set_support_from_boolean_array(boolean_array: numpy.ndarray) GSForge.models._GeneSet.GeneSet ¶
Set this GeneSet support based on the given boolean array, which must be the same length as the existing gene index. Returns an updated copy of the GeneSet.
- Parameters
boolean_array (numpy.ndarray) – A boolean numpy.ndarray.
- Returns
An updated copy of the GeneSet.
- Return type
GeneSet
- get_genes_by_threshold(threshold, score_variable: str, comparison: str = 'ge', within_support: bool = True, absolute: bool = True) numpy.ndarray ¶
- get_top_n_genes(score_variable: str, n: int = 1000, within_support: bool = True, absolute: bool = True) numpy.ndarray ¶
- to_dataframe(only_supported: bool = True) pandas.core.frame.DataFrame ¶
Convert this GeneSet.data attribute to a pandas.DataFrame. By default this restricts the data returned to only those genes that are returned by GeneSet.get_support().
- Parameters
only_supported (bool) – Defaults to True; set to False if you want all GeneSet data to be in the DataFrame returned.
- Returns
A pandas.DataFrame of this GeneSet.data attribute.
- Return type
pandas.DataFrame
- save_as_netcdf(target_dir=None, name=None) str ¶
Save this GeneSet as a netcdf (.nc) file in the target_dir directory.
The default filename will be: {GeneSet.name}.nc, if the GeneSet does not have a name, one must be provided via the name argument.
- Parameters
target_dir (str) – The directory to place the saved GeneSet into.
name (str) – The name to give the GeneSet upon saving.
- Returns
The path to which the file was saved.
- Return type
str
- name = 'GeneSet'¶
- class GSForge.models.GeneSetCollection(**params)¶
Bases:
param.parameterized.Parameterized
An interface class which contains an AnnotatedGEM and a dictionary of GeneSet objects.
gem = param.ClassSelector(readonly=False)
A GSForge.AnnotatedGEM object.
- gem = None¶
- summarize_gene_sets() Dict[str, int] ¶
Summarize this GeneSetCollection; returns a dictionary of {gene_set_name: support_length}. This is used to generate the display shown by the __repr__ function.
- get_support(key: str) numpy.ndarray ¶
Get the support array for a given key.
- Parameters
key (str) – The GeneSet from which to get the gene support.
- Returns
An array of the genes that make up the support of this GeneSet.
- Return type
np.ndarray
- gene_sets_to_dataframes(keys: Optional[List[str]] = None, only_supported: bool = True) Dict[str, pandas.core.frame.DataFrame] ¶
Returns a dictionary of {key: pd.DataFrame} of the GeneSet.data. By default each DataFrame is limited to only those genes that are ‘supported’ within the GeneSet.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return; by default all keys are selected.
only_supported (bool) – Whether to return a subset defined by each GeneSet support, or the complete data frame.
- Returns
A dictionary of {key: pd.DataFrame} of the GeneSet.data attribute.
- Return type
dict
- gene_sets_to_csv_files(target_dir: Optional[str] = None, keys: Optional[List[str]] = None, only_supported: bool = True) None ¶
Writes GeneSet.data as .csv files.
By default this creates a folder within the current working directory and saves the .csv files there. By default only genes that are “supported” by a GeneSet are included.
- Parameters
target_dir – The target directory to save the .csv files to. This defaults to the name of this GeneSetCollection, which creates a folder in the current working directory.
keys (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
only_supported (bool) – Whether to return a subset defined by each GeneSet support, or the complete data frame.
- Returns
- Return type
None
- gene_sets_to_excel_sheet(name: Optional[str] = None, keys: Optional[List[str]] = None, only_supported: bool = True) None ¶
Writes the GeneSet.data within this GeneSetCollection as a single Excel worksheet.
By default this sheet is named using the .name of this GeneSetCollection. By default only genes that are “supported” by a GeneSet are included.
- Parameters
name (str) – The name of the Excel sheet. .xlsx will be appended to the given name.
keys (List[str]) – An optional list of gene_set keys to return; by default all keys are selected.
only_supported (bool) – Whether to return a subset defined by each GeneSet support, or the complete data frame.
- Returns
- Return type
None
- as_dict(keys: Optional[List[str]] = None, exclude: Optional[List[str]] = None, empty_supports: bool = False) Dict[str, numpy.ndarray] ¶
Returns a dictionary of {name: supported_genes} for each GeneSet, or those specified by the keys argument.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned dictionary.
empty_supports (bool) – Whether to include GeneSets that have no support array, or no genes supported within the support array.
- Returns
Dictionary of {name: supported_genes} for each GeneSet.
- Return type
dict
- intersection(keys: Optional[List[str]] = None, exclude: Optional[List[str]] = None) numpy.ndarray ¶
Return the intersection of supported genes in this GeneSet collection.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned dictionary.
- Returns
Intersection of the supported genes within GeneSets.
- Return type
np.ndarray
- union(keys: Optional[List[str]] = None, exclude: Optional[List[str]] = None) numpy.ndarray ¶
Get the union of supported genes in this GeneSet collection.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned dictionary.
- Returns
Union of the supported genes within GeneSets.
- Return type
np.ndarray
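The intersection and union of several support arrays can be reproduced with NumPy set routines. This is a sketch of the set arithmetic only, not the GSForge implementation; the supports dictionary and its gene set names are hypothetical, shaped like the {name: supported_genes} mapping that as_dict is described as returning.

```python
from functools import reduce

import numpy as np

# Hypothetical {name: supported_genes} mapping (names and genes are made up).
supports = {
    "lasso": np.array(["g1", "g2", "g3"]),
    "boruta": np.array(["g2", "g3", "g4"]),
    "deseq": np.array(["g3", "g5"]),
}

# Fold the pairwise NumPy set operations across every support array.
common = reduce(np.intersect1d, supports.values())
combined = reduce(np.union1d, supports.values())
print(common, combined)
```

Both np.intersect1d and np.union1d return sorted, de-duplicated arrays, so the result does not depend on the order of the gene sets.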
- difference(primary_key: str, other_keys: Optional[List[str]] = None, mode: str = 'union') numpy.ndarray ¶
Finds the genes within the GeneSet given by primary_key that are not within the join (as set by mode) of the GeneSets given in other_keys.
If no other_keys are provided, all remaining keys are used. The default mode is union.
- Parameters
primary_key (str) – The key of the primary GeneSet.
other_keys (List[str]) – An optional list of GeneSet keys…
mode (str) – Mode by which to join the GeneSets given by other_keys.
- Returns
…
- Return type
np.ndarray
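A set difference of this kind can be sketched with np.setdiff1d. The helper gene_set_difference and the supports mapping below are hypothetical, and treating "intersection" as a second valid mode is an assumption; the text above only names union as the default.

```python
from functools import reduce

import numpy as np

# Hypothetical {name: supported_genes} mapping (names and genes are made up).
supports = {
    "lasso": np.array(["g1", "g2", "g3"]),
    "boruta": np.array(["g2", "g3", "g4"]),
    "deseq": np.array(["g3", "g5"]),
}


def gene_set_difference(supports, primary_key, other_keys=None, mode="union"):
    """Genes in the primary set that are absent from the joined other sets."""
    if other_keys is None:
        # With no other keys given, compare against every remaining set.
        other_keys = [key for key in supports if key != primary_key]
    # Assumed mode names; only "union" is named in the documentation.
    join = {"union": np.union1d, "intersection": np.intersect1d}[mode]
    others = reduce(join, [supports[key] for key in other_keys])
    return np.setdiff1d(supports[primary_key], others)


unique_to_lasso = gene_set_difference(supports, "lasso")
not_in_all_others = gene_set_difference(supports, "lasso", mode="intersection")
print(unique_to_lasso, not_in_all_others)
```

With mode set to union the result holds genes found in no other set, while the intersection mode only removes genes shared by all of the other sets.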
- joint_difference(primary_keys: List[str], other_keys: Optional[List[str]] = None, primary_join_mode: str = 'union', others_join_mode: str = 'union')¶
- Parameters
primary_keys –
other_keys –
primary_join_mode –
others_join_mode –
- pairwise_unions(keys: Optional[List[str]] = None, exclude: Optional[List[str]] = None) Dict[Tuple[str, str], numpy.ndarray] ¶
Construct pairwise permutations of GeneSets within this collection, and return the union of each pair in a dictionary.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned dictionary.
- Returns
A dictionary of {(GeneSet.name, GeneSet.name): gene support union}.
- Return type
dict
- pairwise_intersection(keys: Optional[List[str]] = None, exclude: Optional[List[str]] = None) Dict[Tuple[str, str], numpy.ndarray] ¶
Construct pairwise combinations of GeneSets within this collection, and return the intersection of each pair in a dictionary.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned dictionary.
- Returns
A dictionary of {(GeneSet.name, GeneSet.name): GeneSet.get_support() intersection}.
- Return type
dict
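Pairwise combinations keyed by a tuple of names can be built with itertools.combinations. This sketch uses a hypothetical supports mapping and is not the GSForge implementation; it only illustrates the shape of the dictionary described above.

```python
from itertools import combinations

import numpy as np

# Hypothetical {name: supported_genes} mapping (names and genes are made up).
supports = {
    "lasso": np.array(["g1", "g2", "g3"]),
    "boruta": np.array(["g2", "g3", "g4"]),
    "deseq": np.array(["g3", "g5"]),
}

# combinations() yields each unordered pair once, in dictionary key order.
pairwise = {
    (left, right): np.intersect1d(supports[left], supports[right])
    for left, right in combinations(supports, 2)
}

for pair, shared in pairwise.items():
    print(pair, shared)
```

Three gene sets give three pairs. itertools.permutations would instead give six ordered pairs, with each intersection appearing twice.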
- pairwise_percent_intersection(keys=None, exclude=None) List[Tuple[str, str, float]] ¶
Construct pairwise permutations of GeneSets within this collection, and return the percent intersection of each pair.
- Parameters
keys (List[str]) – An optional list of gene_set keys to return; by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned list.
- Returns
A list of (GeneSet.name, GeneSet.name, percent gene intersection) tuples.
- Return type
List[Tuple[str, str, float]]
- construct_standard_specification(include: Optional[List[str]] = None, exclude=None) dict ¶
Construct a standard specification that can be used to view unions, intersections and differences (unique genes) of the sets within this collection.
- Parameters
include (List[str]) – An optional list of gene_set keys to return, by default all keys are selected.
exclude (List[str]) – An optional list of GeneSet keys to exclude from the returned dictionary.
- Returns
A specification dictionary.
- Return type
dict
- static merge_specifications(*specs)¶
Merges sets of defaultdict(list) objects with common keys.
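Merging defaultdict(list) objects on common keys can be sketched as below. The function merge_specs and the example keys and values are hypothetical; the layout of a real GSForge specification is not described here, so this shows only the general merging pattern.

```python
from collections import defaultdict


def merge_specs(*specs):
    """Concatenate the lists stored under each key across all given mappings."""
    merged = defaultdict(list)
    for spec in specs:
        for key, values in spec.items():
            merged[key].extend(values)
    return merged


# Hypothetical specifications; the keys and tuples are assumed for illustration.
spec_a = defaultdict(list, {"union": [("lasso", "boruta")]})
spec_b = defaultdict(list, {
    "union": [("lasso", "deseq")],
    "intersection": [("boruta", "deseq")],
})

merged = merge_specs(spec_a, spec_b)
print(dict(merged))
```

Entries under a shared key are concatenated in argument order, and keys present in only one input are carried over unchanged.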
- process_set_operation_specification(specification: Optional[dict] = None) dict ¶
Calls and stores the results from a specification. The specification must declare set operation functions and their arguments.
- Parameters
specification (Dict) –
- classmethod from_specification(source_collection, specification=None, name='processed_specification')¶
- classmethod from_folder(gem: GSForge.models._AnnotatedGEM.AnnotatedGEM, target_dir: Union[str, pathlib.Path, IO], glob_filter: str = '*.nc', filter_func: Optional[Callable] = None, **params) GSForge.models._GeneSetCollection.GeneSetCollection ¶
Create a GeneSetCollection from a directory of saved GeneSet objects.
The file name of each gene_set.nc file will be used as the key in the gene_sets dictionary.
- Parameters
gem (AnnotatedGEM) – A GSForge.AnnotatedGEM object.
target_dir (Union[str, Path, IO[AnyStr]]) – The directory which contains the saved GeneSet .netcdf files.
glob_filter (str) – A glob by which to restrict the files found within target_dir.
filter_func (Callable) – A function by which to filter which xarray.Dataset objects are included. This function should take an xarray.Dataset and return a boolean.
params – Parameters to configure the GeneSetCollection.
- Returns
A new GeneSetCollection.
- Return type
GeneSetCollection
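The file discovery step described for from_folder can be approximated with pathlib. This sketch is not the GSForge implementation: find_gene_set_files is a hypothetical helper, and using the file stem (the name without .nc) as the dictionary key is an assumption about how the file name maps to a key.

```python
import tempfile
from pathlib import Path


def find_gene_set_files(target_dir, glob_filter="*.nc"):
    """Map each matching file's stem to its path, in sorted order."""
    return {path.stem: path for path in sorted(Path(target_dir).glob(glob_filter))}


# Build a throwaway directory with two .nc files and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ("lasso.nc", "boruta.nc", "notes.txt"):
        (Path(tmp) / file_name).touch()
    found = find_gene_set_files(tmp)
    keys = list(found)

print(keys)
```

A narrower glob such as "lasso*.nc" restricts which files are considered, which is the role glob_filter plays above.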
- save(target_dir: str, keys: Optional[List[str]] = None) None ¶
Save this collection to target_dir. Each GeneSet will be saved as a separate netcdf (.nc) file within this directory.
- Parameters
target_dir (str) – The path to which GeneSet xarray.Dataset netcdf files will be written.
keys (List[str]) – The list of GeneSet keys that should be saved. If this is not provided, all GeneSet objects are saved.
- Returns
- Return type
None
- name = 'GeneSetCollection'¶
- class GSForge.models.Interface(*args, **params)¶
Bases:
param.parameterized.Parameterized
The Interface provides common API access for interacting with the AnnotatedGEM and GeneSetCollection objects.
gem = param.ClassSelector(readonly=False)
An AnnotatedGEM object.
gene_set_collection = param.ClassSelector(readonly=False)
A GeneSetCollection object.
selected_gene_sets = param.ListSelector(readonly=False)
A list of keys from the provided GeneSetCollection (stored in gene_set_collection) that are to be used for selecting sets of genes from the count matrix.
selected_genes = param.Parameter(readonly=False)
A list of genes to use in indexing from the count matrix. This parameter takes priority over all other gene selection methods, so selected GeneSets (or combinations thereof) will have no effect when it is set.
gene_set_mode = param.ObjectSelector(readonly=False)
Controls how any selected gene sets are returned by the interface.
- complete: Returns the entire gene set of the AnnotatedGEM.
- union: Returns the union of the selected gene sets' support.
- intersection: Returns the intersection of the selected gene sets' support.
sample_subset = param.Parameter(readonly=False)
A list of samples to use in a given operation. These can be supplied directly as a list of genes, or can be drawn from a given GeneSet.
count_variable = param.ObjectSelector(readonly=False)
The name of the count matrix used.
annotation_variables = param.List(readonly=False)
The name of the active annotation variable(s). These are the annotation columns that control the subset returned by y_annotation_data.
count_mask = param.ObjectSelector(readonly=False)
The type of mask to use for the count matrix.
- complete: Returns the entire count matrix as numbers.
- masked: Returns the entire count matrix with zero or missing values as NaN.
- dropped: Returns the count matrix without genes that have zero or missing values.
annotation_mask = param.ObjectSelector(readonly=False)
The type of mask to use for the target array.
- complete: Returns the entire target array.
- dropped: Returns the target array without samples that have zero or missing values.
count_transform = param.Callable(readonly=False)
A transform that will be run on the x_data that is supplied by this Interface. The transform runs on the subset of the matrix that has been selected.
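The three count_mask options can be illustrated on a small NumPy array. This is a sketch of the described behaviour, not GSForge's code; the matrix values and the samples-as-rows, genes-as-columns orientation are assumptions made for the example.

```python
import numpy as np

# Hypothetical count matrix: two samples (rows) by three genes (columns).
counts = np.array([[4.0, 0.0, 7.0],
                   [2.0, 3.0, np.nan]])

# "complete": the matrix as stored.
complete = counts

# "masked": zeros become NaN, and existing NaN values stay NaN.
masked = np.where(counts == 0, np.nan, counts)

# "dropped": keep only the gene columns with no zero or missing entries.
keep = ~np.isnan(masked).any(axis=0)
dropped = counts[:, keep]

print(masked)
print(dropped)
```

Only the first gene survives the dropped mask here, because the second gene has a zero count and the third has a missing value.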
- gem = None¶
- gene_set_collection = None¶
- selected_gene_sets = [None]¶
- selected_genes = None¶
- gene_set_mode = 'union'¶
- sample_subset = None¶
- count_variable = None¶
- annotation_variables = [None]¶
- count_mask = 'complete'¶
- annotation_mask = 'complete'¶
- count_transform = None¶
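The priority between selected_genes, selected_gene_sets and gene_set_mode described for this class can be sketched as a small function. resolve_gene_selection is a hypothetical helper, not part of GSForge, and falling back to the complete index when no gene sets are selected is an assumption.

```python
from functools import reduce

import numpy as np


def resolve_gene_selection(complete_index, selected_genes=None, supports=None, mode="union"):
    """Hypothetical helper mirroring the documented selection priority."""
    if selected_genes is not None:
        # An explicit gene list overrides any gene set selection.
        return np.asarray(selected_genes)
    if mode == "complete" or not supports:
        # Assumed fallback: use every gene when no gene sets are selected.
        return np.asarray(complete_index)
    join = {"union": np.union1d, "intersection": np.intersect1d}[mode]
    return reduce(join, supports)


complete_index = np.array(["g1", "g2", "g3", "g4"])
supports = [np.array(["g1", "g2"]), np.array(["g2", "g3"])]

by_union = resolve_gene_selection(complete_index, supports=supports)
by_intersection = resolve_gene_selection(complete_index, supports=supports, mode="intersection")
explicit = resolve_gene_selection(complete_index, selected_genes=["g4"], supports=supports)
print(by_union, by_intersection, explicit)
```

The explicit list wins even though gene sets are also supplied, which matches the stated priority of selected_genes.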
- property active_count_variable: str¶
Returns the name of the currently active count matrix.
- property gene_index_name: str¶
Returns the name of the gene index.
- property sample_index_name: str¶
Returns the name of the sample index.
- get_sample_index() numpy.ndarray ¶
Get the currently selected sample index as a numpy array.
- Returns
An array of the currently selected samples.
- Return type
np.ndarray
- property get_selection_indices: dict¶
Returns the currently selected indexes as a dictionary.
- property x_count_data: Optional[xarray.core.dataarray.DataArray]¶
Returns the currently selected ‘x_data’. Usually this will be a subset of the active count array.
Note: In constructing a gene index, the count data is constructed first in order to infer coordinate selection based on masking.
- Returns
The selection of the currently active count data.
- Return type
xarray.DataArray
- get_gene_index() numpy.array ¶
Get the currently selected gene index as a numpy array.
- Returns
An array of the currently selected genes.
- Return type
np.ndarray
- property y_annotation_data: Optional[Union[xarray.core.dataset.Dataset, xarray.core.dataarray.DataArray]]¶
Returns the currently selected ‘y_data’, or None, based on the annotation_variables parameter.
- Returns
An xarray.Dataset of the currently selected y_data.
- Return type
xarray.Dataset or xarray.DataArray, or None
- get_gem_data(single_object=False, output_type='xarray', **params)¶
Returns count [and annotation] data based on the current parameters.
Users should call gsf.get_gem_data
- name = 'Interface'¶
- class GSForge.models.CallableInterface(**kwargs)¶
Bases:
GSForge.models._Interface.Interface, param.parameterized.ParameterizedFunction
Parameters inherited from:
GSForge.models._Interface.Interface: gem, gene_set_collection, selected_gene_sets, selected_genes, gene_set_mode, sample_subset, count_variable, annotation_variables, count_mask, annotation_mask, count_transform
- name = 'CallableInterface'¶