astir.models package
Module contents
Classes
CellTypeModel – Class to perform statistical inference to assign cells to cell types.
CellStateModel – Class to perform statistical inference on the activation of states (pathways) across cells.
AstirModel – Abstract class to perform statistical inference to assign.
TypeRecognitionNet – Type Recognition Neural Network.
StateRecognitionNet – State Recognition Neural Network to get mean of z and standard deviation of z.
-
class astir.models.CellTypeModel(dset, random_seed=1234, dtype=torch.float64)
Bases: astir.models.abstract.AstirModel
Class to perform statistical inference to assign cells to cell types.
- Parameters
dset (SCDataset) – the input gene expression dataframe
random_seed (int, optional) – the random seed for parameter initialization, defaults to 1234
dtype (torch.dtype, optional) – the data type of parameters, should be the same as dset, defaults to torch.float64
Methods
diagnostics(cell_type_assignments, alpha) – Run diagnostics on cell type assignments
fit([max_epochs, learning_rate, batch_size, …]) – Return type: None
fit_yield_loss([max_epochs, learning_rate, …]) – Runs training loops until convergence reaches delta_loss over delta_loss_batch batches, or until max_epochs loops have run
get_assignment() – Get the final assignment of the dataset.
get_celltypes([threshold]) – Get the most likely cell types
get_recognet() – Getter for the recognition net.
plot_clustermap([plot_name, threshold, figsize]) – Save the heatmap of protein content in cells with cell types labeled.
predict(new_dset) – Feed new_dset to the recognition net to get a prediction.
-
diagnostics(cell_type_assignments, alpha)
Run diagnostics on cell type assignments
See astir.Astir.diagnostics_celltype() for full documentation
- Return type
DataFrame
-
fit(max_epochs=50, learning_rate=0.001, batch_size=128, delta_loss=0.001, msg='')
- Return type
None
-
fit_yield_loss(max_epochs=50, learning_rate=0.001, batch_size=128, delta_loss=0.001, msg='')
Runs training loops until convergence reaches delta_loss over delta_loss_batch batches, or until max_epochs loops have run
- Parameters
max_epochs (int) – number of train loop iterations, defaults to 50
learning_rate (float) – the learning rate, defaults to 0.001
batch_size (int) – the batch size, defaults to 128
delta_loss (float) – stops iteration once the loss rate reaches delta_loss, defaults to 0.001
msg (str) – iterator bar message, defaults to empty string
- Return type
None
-
get_assignment()
Get the final assignment of the dataset.
- Returns
the final assignment of the dataset
- Return type
np.array
-
get_celltypes(threshold=0.7)
Get the most likely cell types
A cell is assigned to a cell type if the probability is greater than threshold. If no cell type has a probability higher than threshold, “Unknown” is returned.
- Parameters
threshold – the probability threshold above which a cell is assigned to a cell type
- Return type
DataFrame
- Returns
a data frame with the most likely cell type for each cell
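The thresholding rule described for get_celltypes can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name most_likely_cell_types and the list-of-dicts input are assumptions made for the example, whereas the documented method takes only a threshold and returns a DataFrame.

```python
from typing import Dict, List


def most_likely_cell_types(
    probabilities: List[Dict[str, float]], threshold: float = 0.7
) -> List[str]:
    """Label each cell with its most probable type, or "Unknown".

    `probabilities` holds one dict per cell, mapping a cell type name to its
    assignment probability (a hypothetical input format for this sketch).
    """
    labels = []
    for cell in probabilities:
        # Pick the cell type with the highest probability for this cell.
        best_type = max(cell, key=cell.get)
        # Keep it only if the probability is strictly greater than threshold.
        labels.append(best_type if cell[best_type] > threshold else "Unknown")
    return labels


cells = [
    {"T cell": 0.91, "B cell": 0.09},  # confident assignment
    {"T cell": 0.55, "B cell": 0.45},  # no type clears the threshold
]
print(most_likely_cell_types(cells))
```

Under this reading, a probability exactly equal to the threshold does not count as an assignment, because the description says "greater than".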
-
get_recognet()
Getter for the recognition net.
- Return type
TypeRecognitionNet
- Returns
the trained recognition net
-
plot_clustermap(plot_name='celltype_protein_cluster.png', threshold=0.7, figsize=(7, 5))
Save the heatmap of protein content in cells with cell types labeled.
- Parameters
plot_name (str, optional) – name of the plot; a file extension (e.g. .png or .jpg) is required, defaults to “celltype_protein_cluster.png”
threshold (float, optional) – the probability threshold above which a cell is assigned to a cell type, defaults to 0.7
- Return type
None
-
class astir.models.CellStateModel(dset, const=2, dropout_rate=0, batch_norm=False, random_seed=42, dtype=torch.float64)
Bases: astir.models.abstract.AstirModel
Class to perform statistical inference on the activation of states (pathways) across cells
Methods
diagnostics() – Run diagnostics on cell state assignments
fit([max_epochs, learning_rate, batch_size, …]) – Runs training loops until convergence reaches delta_loss over delta_loss_batch batches, or until max_epochs loops have run
- rtype
array
get_data() – Returns data parameter
get_final_mu_z([new_dset]) – Returns the mean of the predicted z values for each core
get_losses() – Getter for losses
get_recognet() – Getter for the recognition net
Returns the input dataset
Returns all variables
Returns True if the model converged
- Parameters
df_gex – the input gene expression dataframe
marker_dict – the gene marker dictionary
random_seed (int) – seed number to reproduce results, defaults to 42
dtype (dtype) – torch datatype to use in the model
-
diagnostics()
Run diagnostics on cell state assignments
- Return type
DataFrame
- Returns
diagnostics
-
fit(max_epochs=50, learning_rate=0.001, batch_size=128, delta_loss=0.001, delta_loss_batch=10, msg='')
Runs training loops until convergence reaches delta_loss over delta_loss_batch batches, or until max_epochs loops have run
- Parameters
max_epochs (int) – number of train loop iterations, defaults to 50
learning_rate (float) – the learning rate, defaults to 0.001
batch_size (int) – the batch size, defaults to 128
delta_loss (float) – stops iteration once the loss rate reaches delta_loss, defaults to 0.001
delta_loss_batch (int) – the batch size to consider delta loss, defaults to 10
msg (str) – iterator bar message, defaults to empty string
- Return type
List[float]
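The stopping rule used by fit can be illustrated with a small, library-independent sketch. The exact convergence test is not spelled out in this reference, so the criterion below is an assumption: training stops when the relative change between the mean loss of the latest delta_loss_batch epochs and the mean of the window before it drops below delta_loss. The names run_until_converged and train_epoch are hypothetical and exist only for this example.

```python
from typing import Callable, List


def run_until_converged(
    train_epoch: Callable[[int], float],
    max_epochs: int = 50,
    delta_loss: float = 0.001,
    delta_loss_batch: int = 10,
) -> List[float]:
    """Call train_epoch until the loss plateaus or max_epochs is reached.

    Assumed criterion (not taken from the library): compare the mean loss of
    the latest `delta_loss_batch` epochs with the mean of the preceding
    window and stop once their relative difference is below `delta_loss`.
    """
    losses: List[float] = []
    for epoch in range(max_epochs):
        losses.append(train_epoch(epoch))
        if len(losses) >= 2 * delta_loss_batch:
            recent = sum(losses[-delta_loss_batch:]) / delta_loss_batch
            previous = (
                sum(losses[-2 * delta_loss_batch:-delta_loss_batch])
                / delta_loss_batch
            )
            # Guard against division by zero when the previous window is 0.
            if previous != 0 and abs(previous - recent) / abs(previous) < delta_loss:
                break
    return losses


# Toy loss that decays towards 1.0, standing in for a real training epoch.
toy_losses = run_until_converged(lambda epoch: 1.0 + 0.5 ** epoch, delta_loss_batch=3)
print(len(toy_losses), "epochs run before stopping")
```

Like the documented fit, the sketch returns the list of per-epoch losses, so a caller can inspect how quickly the loss flattened out.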
-
get_final_mu_z(new_dset=None)
Returns the mean of the predicted z values for each core
- Parameters
new_dset (Optional[SCDataset]) – the dataset whose z values are predicted with the existing model. If None, the existing dataset is used
- Return type
Tensor
- Returns
the mean of the predicted z values for each core
-
get_losses()
Getter for losses
- Return type
array
- Returns
a torch tensor of losses for each training iteration the model runs
-
get_recognet()
Getter for the recognition net
- Return type
StateRecognitionNet
- Returns
the trained recognition net
-
class astir.models.AstirModel(dset, random_seed, dtype)
Bases: object
Abstract class to perform statistical inference to assign. This class is the superclass of CellTypeModel and CellStateModel and is not supposed to be instantiated.
Methods
-
class astir.models.TypeRecognitionNet(C, G, hidden_size=10)
Bases: torch.nn.modules.module.Module
Type Recognition Neural Network.
- Parameters
C (int) – number of classes
G (int) – number of features
hidden_size – size of hidden layers
Methods
forward(x) – One forward pass.
-
class astir.models.StateRecognitionNet(C, G, const=2, dropout_rate=0, batch_norm=False)
Bases: torch.nn.modules.module.Module
State Recognition Neural Network to get mean of z and standard deviation of z. The neural network architecture looks like this: G -> const * C -> const * C -> G (for mu) or -> G (for std), with batch normalization layers after each activation output layer and dropout applied to the activation units.
- Parameters
C (int) – number of classes
G (int) – number of proteins
const (int) – the size of each hidden layer is const times C
dropout_rate (float) – the dropout rate
batch_norm (bool) – apply batch normalization layers if True
Methods
forward(x) – One forward pass of the StateRecognitionNet
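The layer sizes in the StateRecognitionNet description (G -> const * C -> const * C -> G for mu, or -> G for std) can be worked through with a short calculation. The sketch below only does the arithmetic and does not build the network. The helper names and the layer labels are hypothetical, and the parameter count assumes plain fully connected layers with biases and no batch normalization, which this reference does not confirm.

```python
from typing import List, Tuple

LayerShape = Tuple[str, int, int]  # (label, in_features, out_features)


def state_recognet_layer_shapes(C: int, G: int, const: int = 2) -> List[LayerShape]:
    """List the layer shapes implied by G -> const*C -> const*C -> G."""
    hidden = const * C  # hidden layer width is const times the number of classes
    return [
        ("hidden_1", G, hidden),
        ("hidden_2", hidden, hidden),
        ("mu_head", hidden, G),   # output branch for the mean of z
        ("std_head", hidden, G),  # output branch for the standard deviation of z
    ]


def count_parameters(shapes: List[LayerShape]) -> int:
    # Assumed: each layer has in*out weights plus one bias per output unit.
    return sum(n_in * n_out + n_out for _, n_in, n_out in shapes)


shapes = state_recognet_layer_shapes(C=3, G=10, const=2)
for label, n_in, n_out in shapes:
    print(f"{label}: {n_in} -> {n_out}")
print("parameters under these assumptions:", count_parameters(shapes))
```

Increasing const widens both hidden layers at once, so the parameter count of the middle layer grows quadratically with const while the input and output layers grow linearly.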