{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Loading data into astir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Table of Contents:\n",
    "* [0 Loading necessary libraries and defining paths](#zero)\n",
    "* [1 Starting Astir within python](#one)\n",
    "    * [1.0 Loading marker dictionary and design matrix](#one0)\n",
    "    * [1.1 Loading data as pd.DataFrame](#one1)\n",
    "    * [1.2 Loading data as np.array or torch.tensor](#one2)\n",
    "    * [1.3 Loading data as SCDataset](#one3)\n",
    "* [2 Loading from csv and yaml files](#two)\n",
    "* [3 Loading from a directory of csvs and yaml](#three)\n",
    "* [4 Loading from loom](#four)\n",
    "* [5 Loading from anndata](#five)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Loading necessary libraries and defining paths <a class=\"anchor\" id=\"zero\"></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install -e ../../..\n",
    "import os\n",
    "import sys\n",
    "\n",
    "module_path = os.path.abspath(os.path.join('../../..'))\n",
    "if module_path not in sys.path:\n",
    "    sys.path.append(module_path)\n",
    "module_path = os.path.abspath(os.path.join('../../../astir'))\n",
    "if module_path not in sys.path:\n",
    "    sys.path.append(module_path)\n",
    "    \n",
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "import yaml\n",
    "import pandas as pd\n",
    "import astir as ast\n",
    "import numpy as np\n",
    "import torch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "yaml_marker_path = \"../../../astir/tests/test-data/jackson-2020-markers.yml\"\n",
    "design_mat_path = \"../../../astir/tests/test-data/design.csv\"\n",
    "expression_mat_path = \"../../../astir/tests/test-data/test_data.csv\"\n",
    "expression_dir_path = \"../../../astir/tests/test-data/test-dir-read\"\n",
    "expression_loom_path = \"../../../astir/tests/test-data/basel_100.loom\"\n",
    "expression_anndata_path = \"../../../astir/tests/test-data/adata_small.h5ad\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Starting Astir within python  <a class=\"anchor\" id=\"one\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The input dataset should represent protein expression in single cells, with one row per cell and one column per protein. A marker dictionary that maps the features (proteins) to cell types/states is required unless the input is a pair of `SCDataset` objects. A design matrix is optional; if provided, it should be either an `np.array` or a `pd.DataFrame`.\n",
    "\n",
    "Initializing `Astir` requires the input dataset `input_expr` to be one of `pd.DataFrame`, `Tuple[np.array, List[str], List[str]]` or `Tuple[SCDataset, SCDataset]`.\n",
    "\n",
    "Note: `dtype` and `random_seed` can always be customized. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.0 Loading marker dictionary and design matrix <a class=\"anchor\" id=\"one0\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Marker Dictionary <a class=\"anchor\" id=\"mk\"></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'cell_states': {'RTK_signalling': ['Her2', 'EGFR'], 'proliferation': ['Ki-67', 'phospho Histone'], 'mTOR_signalling': ['phospho mTOR', 'phospho S6'], 'apoptosis': ['cleaved PARP', 'Cleaved Caspase3']}, 'cell_types': {'stromal': ['Vimentin', 'Fibronectin'], 'B cells': ['CD45', 'CD20'], 'T cells': ['CD45', 'CD3'], 'macrophage': ['CD45', 'CD68'], 'epithelial(basal)': ['E-Cadherin', 'pan Cytokeratin', 'Cytokeratin 5', 'Cytokeratin 14', 'Her2'], 'epithelial(luminal)': ['E-Cadherin', 'pan Cytokeratin', 'Cytokeratin 7', 'Cytokeratin 8/18', 'Cytokeratin 19', 'Cytokeratin 5', 'Her2']}, 'hierarchy': {'immune': ['B cells', 'T cells', 'macrophage'], 'epithelial': ['epithelial(basal)', 'epithelial(luminal)']}}\n"
     ]
    }
   ],
   "source": [
    "with open(yaml_marker_path, \"r\") as stream:\n",
    "    marker_dict = yaml.safe_load(stream)\n",
    "print(marker_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some notes:\n",
    "\n",
    "1. The marker dictionary `marker_dict` is not required when `input_expr` is `Tuple[SCDataset, SCDataset]`. Otherwise, it is required and should be a `Dict[str, Dict[str, List[str]]]`.\n",
    "\n",
    "2. The outer dictionary may have at most three keys: `cell_types`, `cell_states` and `hierarchy`, as in the dictionary printed above. `cell_types` and `cell_states` each map to a dictionary from the name of a cell type/state to its list of protein features. `hierarchy` maps to a dictionary that describes the cell type hierarchy.\n",
    "\n",
    "3. If the user intends to classify only cell types or only cell states, only the corresponding marker dictionary should be provided, so `marker_dict` is one of `{\"cell_states\": {...}}`, `{\"cell_types\": {...}}` and `{\"cell_types\": {...}, \"cell_states\": {...}}`.\n",
    "\n",
    "4. The `hierarchy` dictionary should be included when the user intends to call `Astir.assign_celltype_hierarchy()`."
   ]
  },
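  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The layout described in the notes above can be illustrated with a small hand-written dictionary. This is a minimal sketch: the helper `check_marker_layout` and the toy dictionary `type_only_markers` are hypothetical and not part of astir, and the check assumes the key names `cell_types`, `cell_states` and `hierarchy` used by the dictionary printed earlier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical helper (not an astir function): checks the marker layout\n",
    "# described in the notes, assuming the key names of the example file.\n",
    "ALLOWED_KEYS = {\"cell_types\", \"cell_states\", \"hierarchy\"}\n",
    "\n",
    "def check_marker_layout(markers):\n",
    "    unknown = set(markers) - ALLOWED_KEYS\n",
    "    if unknown:\n",
    "        raise ValueError(f\"unexpected top-level keys: {sorted(unknown)}\")\n",
    "    for key in ALLOWED_KEYS & set(markers):\n",
    "        for name, members in markers[key].items():\n",
    "            # Every class name should map to a list of strings.\n",
    "            if not isinstance(members, list) or not all(isinstance(m, str) for m in members):\n",
    "                raise TypeError(f\"{key}[{name!r}] must be a list of strings\")\n",
    "    return True\n",
    "\n",
    "# Toy dictionary that defines cell types only (no states, no hierarchy).\n",
    "type_only_markers = {\"cell_types\": {\"T cells\": [\"CD45\", \"CD3\"], \"B cells\": [\"CD45\", \"CD20\"]}}\n",
    "print(check_marker_layout(type_only_markers))"
   ]
  },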
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Design matrix:\n",
    "\n",
    "Note that the design matrix must have one row per cell, i.e. as many rows as the expression data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(49, 40)\n"
     ]
    }
   ],
   "source": [
    "design_df = pd.read_csv(design_mat_path, index_col=0)\n",
    "print(design_df.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: `design` is not necessary when `input_expr` is `Tuple[SCDataset, SCDataset]`. Otherwise it is optional."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.1 Loading data as `pd.DataFrame`  <a class=\"anchor\" id=\"one1\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When the input is a `pd.DataFrame`, its rows should represent the cells and its columns should represent the features (proteins)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>EGFR</th>\n",
       "      <th>Ruthenium_1</th>\n",
       "      <th>Ruthenium_2</th>\n",
       "      <th>Ruthenium_3</th>\n",
       "      <th>Ruthenium_4</th>\n",
       "      <th>Ruthenium_5</th>\n",
       "      <th>Ruthenium_6</th>\n",
       "      <th>Ruthenium_7</th>\n",
       "      <th>E-Cadherin</th>\n",
       "      <th>DNA1</th>\n",
       "      <th>...</th>\n",
       "      <th>CD45</th>\n",
       "      <th>CD68</th>\n",
       "      <th>CD3</th>\n",
       "      <th>Carbonic Anhydrase IX</th>\n",
       "      <th>Cytokeratin 8/18</th>\n",
       "      <th>Cytokeratin 7</th>\n",
       "      <th>Twist</th>\n",
       "      <th>phospho Histone</th>\n",
       "      <th>phospho mTOR</th>\n",
       "      <th>phospho S6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_1</th>\n",
       "      <td>0.281753</td>\n",
       "      <td>1.319588</td>\n",
       "      <td>0.597380</td>\n",
       "      <td>1.782863</td>\n",
       "      <td>1.757824</td>\n",
       "      <td>1.991857</td>\n",
       "      <td>2.580564</td>\n",
       "      <td>2.287167</td>\n",
       "      <td>1.814309</td>\n",
       "      <td>2.261638</td>\n",
       "      <td>...</td>\n",
       "      <td>0.044733</td>\n",
       "      <td>0.184805</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.928929</td>\n",
       "      <td>0.025526</td>\n",
       "      <td>0.043423</td>\n",
       "      <td>0.209742</td>\n",
       "      <td>0.137454</td>\n",
       "      <td>0.572811</td>\n",
       "      <td>0.215508</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_2</th>\n",
       "      <td>0.303016</td>\n",
       "      <td>1.319588</td>\n",
       "      <td>0.597380</td>\n",
       "      <td>1.782863</td>\n",
       "      <td>1.757824</td>\n",
       "      <td>1.991857</td>\n",
       "      <td>2.580564</td>\n",
       "      <td>2.287167</td>\n",
       "      <td>1.517685</td>\n",
       "      <td>1.613060</td>\n",
       "      <td>...</td>\n",
       "      <td>0.046802</td>\n",
       "      <td>0.080406</td>\n",
       "      <td>0.110806</td>\n",
       "      <td>0.752101</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.032056</td>\n",
       "      <td>0.108013</td>\n",
       "      <td>0.048428</td>\n",
       "      <td>0.539647</td>\n",
       "      <td>0.655731</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_3</th>\n",
       "      <td>0.252374</td>\n",
       "      <td>1.319588</td>\n",
       "      <td>0.597380</td>\n",
       "      <td>1.782863</td>\n",
       "      <td>1.757824</td>\n",
       "      <td>1.991857</td>\n",
       "      <td>2.580564</td>\n",
       "      <td>2.287167</td>\n",
       "      <td>1.246433</td>\n",
       "      <td>2.138744</td>\n",
       "      <td>...</td>\n",
       "      <td>0.028499</td>\n",
       "      <td>0.203248</td>\n",
       "      <td>0.020617</td>\n",
       "      <td>0.740759</td>\n",
       "      <td>0.083311</td>\n",
       "      <td>0.081503</td>\n",
       "      <td>0.119058</td>\n",
       "      <td>0.063097</td>\n",
       "      <td>0.409735</td>\n",
       "      <td>0.437845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_4</th>\n",
       "      <td>0.397732</td>\n",
       "      <td>1.306852</td>\n",
       "      <td>0.534496</td>\n",
       "      <td>1.678217</td>\n",
       "      <td>1.757824</td>\n",
       "      <td>1.961430</td>\n",
       "      <td>2.528551</td>\n",
       "      <td>2.183814</td>\n",
       "      <td>1.839785</td>\n",
       "      <td>1.816015</td>\n",
       "      <td>...</td>\n",
       "      <td>0.069053</td>\n",
       "      <td>0.305200</td>\n",
       "      <td>0.060264</td>\n",
       "      <td>1.095968</td>\n",
       "      <td>0.184603</td>\n",
       "      <td>0.131531</td>\n",
       "      <td>0.160778</td>\n",
       "      <td>0.090666</td>\n",
       "      <td>0.305718</td>\n",
       "      <td>0.132236</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_5</th>\n",
       "      <td>0.426352</td>\n",
       "      <td>1.173439</td>\n",
       "      <td>0.597380</td>\n",
       "      <td>1.589303</td>\n",
       "      <td>1.389839</td>\n",
       "      <td>1.789887</td>\n",
       "      <td>2.343743</td>\n",
       "      <td>2.123334</td>\n",
       "      <td>1.618347</td>\n",
       "      <td>1.355214</td>\n",
       "      <td>...</td>\n",
       "      <td>0.233777</td>\n",
       "      <td>0.135084</td>\n",
       "      <td>0.057195</td>\n",
       "      <td>1.427983</td>\n",
       "      <td>0.035371</td>\n",
       "      <td>0.038448</td>\n",
       "      <td>0.014434</td>\n",
       "      <td>0.127032</td>\n",
       "      <td>0.261205</td>\n",
       "      <td>0.157786</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 45 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                               EGFR  Ruthenium_1  Ruthenium_2  Ruthenium_3  \\\n",
       "BaselTMA_SP41_126_X14Y7_1  0.281753     1.319588     0.597380     1.782863   \n",
       "BaselTMA_SP41_126_X14Y7_2  0.303016     1.319588     0.597380     1.782863   \n",
       "BaselTMA_SP41_126_X14Y7_3  0.252374     1.319588     0.597380     1.782863   \n",
       "BaselTMA_SP41_126_X14Y7_4  0.397732     1.306852     0.534496     1.678217   \n",
       "BaselTMA_SP41_126_X14Y7_5  0.426352     1.173439     0.597380     1.589303   \n",
       "\n",
       "                           Ruthenium_4  Ruthenium_5  Ruthenium_6  Ruthenium_7  \\\n",
       "BaselTMA_SP41_126_X14Y7_1     1.757824     1.991857     2.580564     2.287167   \n",
       "BaselTMA_SP41_126_X14Y7_2     1.757824     1.991857     2.580564     2.287167   \n",
       "BaselTMA_SP41_126_X14Y7_3     1.757824     1.991857     2.580564     2.287167   \n",
       "BaselTMA_SP41_126_X14Y7_4     1.757824     1.961430     2.528551     2.183814   \n",
       "BaselTMA_SP41_126_X14Y7_5     1.389839     1.789887     2.343743     2.123334   \n",
       "\n",
       "                           E-Cadherin      DNA1  ...      CD45      CD68  \\\n",
       "BaselTMA_SP41_126_X14Y7_1    1.814309  2.261638  ...  0.044733  0.184805   \n",
       "BaselTMA_SP41_126_X14Y7_2    1.517685  1.613060  ...  0.046802  0.080406   \n",
       "BaselTMA_SP41_126_X14Y7_3    1.246433  2.138744  ...  0.028499  0.203248   \n",
       "BaselTMA_SP41_126_X14Y7_4    1.839785  1.816015  ...  0.069053  0.305200   \n",
       "BaselTMA_SP41_126_X14Y7_5    1.618347  1.355214  ...  0.233777  0.135084   \n",
       "\n",
       "                                CD3  Carbonic Anhydrase IX  Cytokeratin 8/18  \\\n",
       "BaselTMA_SP41_126_X14Y7_1  0.000000               0.928929          0.025526   \n",
       "BaselTMA_SP41_126_X14Y7_2  0.110806               0.752101          0.000000   \n",
       "BaselTMA_SP41_126_X14Y7_3  0.020617               0.740759          0.083311   \n",
       "BaselTMA_SP41_126_X14Y7_4  0.060264               1.095968          0.184603   \n",
       "BaselTMA_SP41_126_X14Y7_5  0.057195               1.427983          0.035371   \n",
       "\n",
       "                           Cytokeratin 7     Twist  phospho Histone  \\\n",
       "BaselTMA_SP41_126_X14Y7_1       0.043423  0.209742         0.137454   \n",
       "BaselTMA_SP41_126_X14Y7_2       0.032056  0.108013         0.048428   \n",
       "BaselTMA_SP41_126_X14Y7_3       0.081503  0.119058         0.063097   \n",
       "BaselTMA_SP41_126_X14Y7_4       0.131531  0.160778         0.090666   \n",
       "BaselTMA_SP41_126_X14Y7_5       0.038448  0.014434         0.127032   \n",
       "\n",
       "                           phospho mTOR  phospho S6  \n",
       "BaselTMA_SP41_126_X14Y7_1      0.572811    0.215508  \n",
       "BaselTMA_SP41_126_X14Y7_2      0.539647    0.655731  \n",
       "BaselTMA_SP41_126_X14Y7_3      0.409735    0.437845  \n",
       "BaselTMA_SP41_126_X14Y7_4      0.305718    0.132236  \n",
       "BaselTMA_SP41_126_X14Y7_5      0.261205    0.157786  \n",
       "\n",
       "[5 rows x 45 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_expr = pd.read_csv(expression_mat_path, index_col=0)\n",
    "df_expr.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 49 cells.\n"
     ]
    }
   ],
   "source": [
    "a_df = ast.Astir(input_expr=df_expr, marker_dict=marker_dict, design=design_df)\n",
    "print(a_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.2 Loading data as `np.array` or `torch.tensor` <a class=\"anchor\" id=\"one2\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When the input is `Tuple[Union[np.array, torch.tensor], List[str], List[str]]`, the first element (`np.array` or `torch.tensor`) is the input dataset, the second element (`List[str]`) holds the column titles (the names of the proteins) and the third element (`List[str]`) holds the row titles (the names of the cells). The lengths of the second and third lists should equal the number of columns and rows of the first element, respectively."
   ]
  },
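  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The length requirement above can be checked before building the tuple. The following is a minimal sketch on toy data; the names `toy_expr`, `toy_features` and `toy_cells` are made up for illustration and are not part of astir."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Toy data: 3 cells (rows) by 2 proteins (columns).\n",
    "toy_expr = np.array([[0.1, 1.2], [0.0, 0.8], [0.5, 0.3]])\n",
    "toy_features = [\"CD45\", \"CD3\"]\n",
    "toy_cells = [\"cell_1\", \"cell_2\", \"cell_3\"]\n",
    "\n",
    "# Feature names follow the columns and cell names follow the rows.\n",
    "n_cells, n_features = toy_expr.shape\n",
    "assert len(toy_features) == n_features\n",
    "assert len(toy_cells) == n_cells\n",
    "\n",
    "# Same element order as the tuple described above.\n",
    "toy_input = (toy_expr, toy_features, toy_cells)"
   ]
  },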
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 49 cells.\n"
     ]
    }
   ],
   "source": [
    "# Load as np.array\n",
    "np_expr = df_expr.values\n",
    "features = list(df_expr.columns)\n",
    "cores = list(df_expr.index)\n",
    "a_np = ast.Astir(input_expr=(np_expr, features, cores), marker_dict=marker_dict, design=design_df)\n",
    "print(a_np)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 49 cells.\n"
     ]
    }
   ],
   "source": [
    "# Load as torch.tensor\n",
    "t_expr = torch.from_numpy(np_expr)\n",
    "a_t = ast.Astir(input_expr=(t_expr, features, cores), marker_dict=marker_dict, design=design_df)\n",
    "print(a_t)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.3 Loading data as `SCDataset`  <a class=\"anchor\" id=\"one3\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When the input is `Tuple[SCDataset, SCDataset]`, the first `SCDataset` should be the cell type dataset and the second `SCDataset` should be the cell state dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 49 cells.\n"
     ]
    }
   ],
   "source": [
    "type_scd = ast.SCDataset(expr_input=df_expr, marker_dict=marker_dict[\"cell_types\"], \n",
    "                         include_other_column=True, design=design_df)\n",
    "state_scd = ast.SCDataset(expr_input=df_expr, marker_dict=marker_dict[\"cell_states\"], \n",
    "                          include_other_column=False, design=design_df)\n",
    "a_scd = ast.Astir(input_expr=(type_scd, state_scd))\n",
    "print(a_scd)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Loading from csv and yaml files  <a class=\"anchor\" id=\"two\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A data reader, `from_csv_yaml`, is provided for loading `csv` and `yaml` files.\n",
    "\n",
    "Each row of the `csv` file should represent a single cell, and each column should represent the expression of one protein across the cells."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 49 cells.\n"
     ]
    }
   ],
   "source": [
    "a_csv = ast.from_csv_yaml(csv_input=expression_mat_path, marker_yaml=yaml_marker_path, design_csv=design_mat_path)\n",
    "print(a_csv)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some notes:\n",
    "\n",
    "1. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. The format should match the description of the [marker dictionary](#mk).\n",
    "\n",
    "2. `from_csv_yaml` returns an `Astir` object.\n",
    "\n",
    "3. `dtype` and `random_seed` can also be customized. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Tensor"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(a_csv.get_type_dataset().get_exprs())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CD20</th>\n",
       "      <th>CD3</th>\n",
       "      <th>CD45</th>\n",
       "      <th>CD68</th>\n",
       "      <th>Cytokeratin 14</th>\n",
       "      <th>Cytokeratin 19</th>\n",
       "      <th>Cytokeratin 5</th>\n",
       "      <th>Cytokeratin 7</th>\n",
       "      <th>Cytokeratin 8/18</th>\n",
       "      <th>E-Cadherin</th>\n",
       "      <th>Fibronectin</th>\n",
       "      <th>Her2</th>\n",
       "      <th>Vimentin</th>\n",
       "      <th>pan Cytokeratin</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_1</th>\n",
       "      <td>0.207884</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.044733</td>\n",
       "      <td>0.184805</td>\n",
       "      <td>0.134128</td>\n",
       "      <td>0.079956</td>\n",
       "      <td>0.178350</td>\n",
       "      <td>0.043423</td>\n",
       "      <td>0.025526</td>\n",
       "      <td>1.814309</td>\n",
       "      <td>1.039734</td>\n",
       "      <td>0.483007</td>\n",
       "      <td>0.444140</td>\n",
       "      <td>1.187512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_2</th>\n",
       "      <td>0.021506</td>\n",
       "      <td>0.110806</td>\n",
       "      <td>0.046802</td>\n",
       "      <td>0.080406</td>\n",
       "      <td>0.026951</td>\n",
       "      <td>0.066922</td>\n",
       "      <td>0.081147</td>\n",
       "      <td>0.032056</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.517685</td>\n",
       "      <td>1.147644</td>\n",
       "      <td>0.513386</td>\n",
       "      <td>0.270070</td>\n",
       "      <td>0.749379</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_3</th>\n",
       "      <td>0.008878</td>\n",
       "      <td>0.020617</td>\n",
       "      <td>0.028499</td>\n",
       "      <td>0.203248</td>\n",
       "      <td>0.023515</td>\n",
       "      <td>0.186294</td>\n",
       "      <td>0.076112</td>\n",
       "      <td>0.081503</td>\n",
       "      <td>0.083311</td>\n",
       "      <td>1.246433</td>\n",
       "      <td>0.988906</td>\n",
       "      <td>0.633226</td>\n",
       "      <td>0.233909</td>\n",
       "      <td>1.216521</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_4</th>\n",
       "      <td>0.053027</td>\n",
       "      <td>0.060264</td>\n",
       "      <td>0.069053</td>\n",
       "      <td>0.305200</td>\n",
       "      <td>0.114420</td>\n",
       "      <td>0.346273</td>\n",
       "      <td>0.164059</td>\n",
       "      <td>0.131531</td>\n",
       "      <td>0.184603</td>\n",
       "      <td>1.839785</td>\n",
       "      <td>0.842710</td>\n",
       "      <td>0.709272</td>\n",
       "      <td>0.542362</td>\n",
       "      <td>1.354303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_5</th>\n",
       "      <td>0.019127</td>\n",
       "      <td>0.057195</td>\n",
       "      <td>0.233777</td>\n",
       "      <td>0.135084</td>\n",
       "      <td>0.055368</td>\n",
       "      <td>0.124407</td>\n",
       "      <td>0.095323</td>\n",
       "      <td>0.038448</td>\n",
       "      <td>0.035371</td>\n",
       "      <td>1.618347</td>\n",
       "      <td>1.073357</td>\n",
       "      <td>0.482230</td>\n",
       "      <td>0.759944</td>\n",
       "      <td>0.629398</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               CD20       CD3      CD45      CD68  \\\n",
       "BaselTMA_SP41_126_X14Y7_1  0.207884  0.000000  0.044733  0.184805   \n",
       "BaselTMA_SP41_126_X14Y7_2  0.021506  0.110806  0.046802  0.080406   \n",
       "BaselTMA_SP41_126_X14Y7_3  0.008878  0.020617  0.028499  0.203248   \n",
       "BaselTMA_SP41_126_X14Y7_4  0.053027  0.060264  0.069053  0.305200   \n",
       "BaselTMA_SP41_126_X14Y7_5  0.019127  0.057195  0.233777  0.135084   \n",
       "\n",
       "                           Cytokeratin 14  Cytokeratin 19  Cytokeratin 5  \\\n",
       "BaselTMA_SP41_126_X14Y7_1        0.134128        0.079956       0.178350   \n",
       "BaselTMA_SP41_126_X14Y7_2        0.026951        0.066922       0.081147   \n",
       "BaselTMA_SP41_126_X14Y7_3        0.023515        0.186294       0.076112   \n",
       "BaselTMA_SP41_126_X14Y7_4        0.114420        0.346273       0.164059   \n",
       "BaselTMA_SP41_126_X14Y7_5        0.055368        0.124407       0.095323   \n",
       "\n",
       "                           Cytokeratin 7  Cytokeratin 8/18  E-Cadherin  \\\n",
       "BaselTMA_SP41_126_X14Y7_1       0.043423          0.025526    1.814309   \n",
       "BaselTMA_SP41_126_X14Y7_2       0.032056          0.000000    1.517685   \n",
       "BaselTMA_SP41_126_X14Y7_3       0.081503          0.083311    1.246433   \n",
       "BaselTMA_SP41_126_X14Y7_4       0.131531          0.184603    1.839785   \n",
       "BaselTMA_SP41_126_X14Y7_5       0.038448          0.035371    1.618347   \n",
       "\n",
       "                           Fibronectin      Her2  Vimentin  pan Cytokeratin  \n",
       "BaselTMA_SP41_126_X14Y7_1     1.039734  0.483007  0.444140         1.187512  \n",
       "BaselTMA_SP41_126_X14Y7_2     1.147644  0.513386  0.270070         0.749379  \n",
       "BaselTMA_SP41_126_X14Y7_3     0.988906  0.633226  0.233909         1.216521  \n",
       "BaselTMA_SP41_126_X14Y7_4     0.842710  0.709272  0.542362         1.354303  \n",
       "BaselTMA_SP41_126_X14Y7_5     1.073357  0.482230  0.759944         0.629398  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_csv.get_type_dataset().get_exprs_df().head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Loading from a directory of csvs and yaml  <a class=\"anchor\" id=\"three\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The user can also load the data from a directory of `csv` files with `from_csv_dir_yaml`.\n",
    "\n",
    "In this case, each `csv` file should hold the expression data from a different sample. A design matrix will be generated automatically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 40 cells.\n"
     ]
    }
   ],
   "source": [
    "a_dir = ast.from_csv_dir_yaml(input_dir=expression_dir_path, marker_yaml=yaml_marker_path)\n",
    "print(a_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some notes:\n",
    "\n",
    "1. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. The format should match the description of the [marker dictionary](#mk).\n",
    "\n",
    "2. `from_csv_dir_yaml` returns an `Astir` object.\n",
    "\n",
    "3. `dtype` and `random_seed` can also be customized. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Tensor"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(a_dir.get_type_dataset().get_exprs())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CD20</th>\n",
       "      <th>CD3</th>\n",
       "      <th>CD45</th>\n",
       "      <th>CD68</th>\n",
       "      <th>Cytokeratin 14</th>\n",
       "      <th>Cytokeratin 19</th>\n",
       "      <th>Cytokeratin 5</th>\n",
       "      <th>Cytokeratin 7</th>\n",
       "      <th>Cytokeratin 8/18</th>\n",
       "      <th>E-Cadherin</th>\n",
       "      <th>Fibronectin</th>\n",
       "      <th>Her2</th>\n",
       "      <th>Vimentin</th>\n",
       "      <th>pan Cytokeratin</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_1</th>\n",
       "      <td>0.207884</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.044733</td>\n",
       "      <td>0.184805</td>\n",
       "      <td>0.134128</td>\n",
       "      <td>0.079956</td>\n",
       "      <td>0.178350</td>\n",
       "      <td>0.043423</td>\n",
       "      <td>0.025526</td>\n",
       "      <td>1.814309</td>\n",
       "      <td>1.039734</td>\n",
       "      <td>0.483007</td>\n",
       "      <td>0.444140</td>\n",
       "      <td>1.187512</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_2</th>\n",
       "      <td>0.021506</td>\n",
       "      <td>0.110806</td>\n",
       "      <td>0.046802</td>\n",
       "      <td>0.080406</td>\n",
       "      <td>0.026951</td>\n",
       "      <td>0.066922</td>\n",
       "      <td>0.081147</td>\n",
       "      <td>0.032056</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.517685</td>\n",
       "      <td>1.147644</td>\n",
       "      <td>0.513386</td>\n",
       "      <td>0.270070</td>\n",
       "      <td>0.749379</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_3</th>\n",
       "      <td>0.008878</td>\n",
       "      <td>0.020617</td>\n",
       "      <td>0.028499</td>\n",
       "      <td>0.203248</td>\n",
       "      <td>0.023515</td>\n",
       "      <td>0.186294</td>\n",
       "      <td>0.076112</td>\n",
       "      <td>0.081503</td>\n",
       "      <td>0.083311</td>\n",
       "      <td>1.246433</td>\n",
       "      <td>0.988906</td>\n",
       "      <td>0.633226</td>\n",
       "      <td>0.233909</td>\n",
       "      <td>1.216521</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_4</th>\n",
       "      <td>0.053027</td>\n",
       "      <td>0.060264</td>\n",
       "      <td>0.069053</td>\n",
       "      <td>0.305200</td>\n",
       "      <td>0.114420</td>\n",
       "      <td>0.346273</td>\n",
       "      <td>0.164059</td>\n",
       "      <td>0.131531</td>\n",
       "      <td>0.184603</td>\n",
       "      <td>1.839785</td>\n",
       "      <td>0.842710</td>\n",
       "      <td>0.709272</td>\n",
       "      <td>0.542362</td>\n",
       "      <td>1.354303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_126_X14Y7_5</th>\n",
       "      <td>0.019127</td>\n",
       "      <td>0.057195</td>\n",
       "      <td>0.233777</td>\n",
       "      <td>0.135084</td>\n",
       "      <td>0.055368</td>\n",
       "      <td>0.124407</td>\n",
       "      <td>0.095323</td>\n",
       "      <td>0.038448</td>\n",
       "      <td>0.035371</td>\n",
       "      <td>1.618347</td>\n",
       "      <td>1.073357</td>\n",
       "      <td>0.482230</td>\n",
       "      <td>0.759944</td>\n",
       "      <td>0.629398</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               CD20       CD3      CD45      CD68  \\\n",
       "BaselTMA_SP41_126_X14Y7_1  0.207884  0.000000  0.044733  0.184805   \n",
       "BaselTMA_SP41_126_X14Y7_2  0.021506  0.110806  0.046802  0.080406   \n",
       "BaselTMA_SP41_126_X14Y7_3  0.008878  0.020617  0.028499  0.203248   \n",
       "BaselTMA_SP41_126_X14Y7_4  0.053027  0.060264  0.069053  0.305200   \n",
       "BaselTMA_SP41_126_X14Y7_5  0.019127  0.057195  0.233777  0.135084   \n",
       "\n",
       "                           Cytokeratin 14  Cytokeratin 19  Cytokeratin 5  \\\n",
       "BaselTMA_SP41_126_X14Y7_1        0.134128        0.079956       0.178350   \n",
       "BaselTMA_SP41_126_X14Y7_2        0.026951        0.066922       0.081147   \n",
       "BaselTMA_SP41_126_X14Y7_3        0.023515        0.186294       0.076112   \n",
       "BaselTMA_SP41_126_X14Y7_4        0.114420        0.346273       0.164059   \n",
       "BaselTMA_SP41_126_X14Y7_5        0.055368        0.124407       0.095323   \n",
       "\n",
       "                           Cytokeratin 7  Cytokeratin 8/18  E-Cadherin  \\\n",
       "BaselTMA_SP41_126_X14Y7_1       0.043423          0.025526    1.814309   \n",
       "BaselTMA_SP41_126_X14Y7_2       0.032056          0.000000    1.517685   \n",
       "BaselTMA_SP41_126_X14Y7_3       0.081503          0.083311    1.246433   \n",
       "BaselTMA_SP41_126_X14Y7_4       0.131531          0.184603    1.839785   \n",
       "BaselTMA_SP41_126_X14Y7_5       0.038448          0.035371    1.618347   \n",
       "\n",
       "                           Fibronectin      Her2  Vimentin  pan Cytokeratin  \n",
       "BaselTMA_SP41_126_X14Y7_1     1.039734  0.483007  0.444140         1.187512  \n",
       "BaselTMA_SP41_126_X14Y7_2     1.147644  0.513386  0.270070         0.749379  \n",
       "BaselTMA_SP41_126_X14Y7_3     0.988906  0.633226  0.233909         1.216521  \n",
       "BaselTMA_SP41_126_X14Y7_4     0.842710  0.709272  0.542362         1.354303  \n",
       "BaselTMA_SP41_126_X14Y7_5     1.073357  0.482230  0.759944         0.629398  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_dir.get_type_dataset().get_exprs_df().head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Loading from loom  <a class=\"anchor\" id=\"four\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is also possible to load the data from a `loom` file with `from_loompy_yaml`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 100 cells.\n"
     ]
    }
   ],
   "source": [
    "a_loom = ast.from_loompy_yaml(loom_file=expression_loom_path, marker_yaml=yaml_marker_path, \n",
    "                         protein_name_attr=\"protein\", cell_name_attr=\"cell_name\", batch_name_attr=\"batch\")\n",
    "print(a_loom)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some notes:\n",
    "\n",
    "1. The protein and cell names are taken from `ds.ra[protein_name_attr]` and `ds.ca[cell_name_attr]` respectively if specified, and from `ds.ra[\"protein\"]` and `ds.ca[\"cell_name\"]` otherwise.\n",
    "\n",
    "2. If `batch_name_attr` is specified, `ds.ca[batch_name_attr]` is used as the batch variable and turned into a design matrix. Otherwise `ds.ca[\"batch\"]` is used.\n",
    "\n",
    "3. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. Its format should match the description of the [marker dictionary](#mk).\n",
    "\n",
    "4. `from_loompy_yaml` returns an Astir object.\n",
    "\n",
    "5. `dtype` and `random_seed` are also customizable. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`."
   ]
  },
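  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The notes above say that the batch variable is turned into a design matrix. The sketch below shows one common way to do this, a one-hot encoding with one column per batch, using only the standard library. It is an illustrative assumption rather than Astir's actual implementation; `one_hot_design` and the toy batch labels are hypothetical names introduced here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only: Astir's internal encoding may differ.\n",
    "from typing import List, Sequence, Tuple\n",
    "\n",
    "\n",
    "def one_hot_design(batches: Sequence[str]) -> Tuple[List[str], List[List[int]]]:\n",
    "    # Return the sorted batch levels and a cells-by-levels 0/1 matrix.\n",
    "    levels = sorted(set(batches))\n",
    "    column = {level: j for j, level in enumerate(levels)}\n",
    "    design = [[0] * len(levels) for _ in batches]\n",
    "    for i, label in enumerate(batches):\n",
    "        design[i][column[label]] = 1  # mark the batch this cell belongs to\n",
    "    return levels, design\n",
    "\n",
    "\n",
    "# Hypothetical batch labels, one per cell.\n",
    "toy_batches = [\"core_1\", \"core_2\", \"core_1\", \"core_3\"]\n",
    "levels, design = one_hot_design(toy_batches)\n",
    "print(levels)\n",
    "for row in design:\n",
    "    print(row)"
   ]
  },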
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Tensor"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(a_loom.get_type_dataset().get_exprs())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CD20</th>\n",
       "      <th>CD3</th>\n",
       "      <th>CD45</th>\n",
       "      <th>CD68</th>\n",
       "      <th>Cytokeratin 14</th>\n",
       "      <th>Cytokeratin 19</th>\n",
       "      <th>Cytokeratin 5</th>\n",
       "      <th>Cytokeratin 7</th>\n",
       "      <th>Cytokeratin 8/18</th>\n",
       "      <th>E-Cadherin</th>\n",
       "      <th>Fibronectin</th>\n",
       "      <th>Her2</th>\n",
       "      <th>Vimentin</th>\n",
       "      <th>pan Cytokeratin</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_44_X2Y7_726</th>\n",
       "      <td>0.080173</td>\n",
       "      <td>0.010469</td>\n",
       "      <td>0.013425</td>\n",
       "      <td>0.149373</td>\n",
       "      <td>0.128329</td>\n",
       "      <td>1.577395</td>\n",
       "      <td>0.210581</td>\n",
       "      <td>0.661001</td>\n",
       "      <td>1.777394</td>\n",
       "      <td>2.028177</td>\n",
       "      <td>1.606446</td>\n",
       "      <td>0.803987</td>\n",
       "      <td>0.279106</td>\n",
       "      <td>4.026491</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_Liver_X2Y1_1909</th>\n",
       "      <td>0.051691</td>\n",
       "      <td>0.139216</td>\n",
       "      <td>0.061985</td>\n",
       "      <td>0.140193</td>\n",
       "      <td>0.208142</td>\n",
       "      <td>0.129807</td>\n",
       "      <td>0.117969</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.211757</td>\n",
       "      <td>0.642365</td>\n",
       "      <td>0.943650</td>\n",
       "      <td>1.488154</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.743843</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_231_X6Y6_10_798</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.078386</td>\n",
       "      <td>0.144959</td>\n",
       "      <td>0.570016</td>\n",
       "      <td>0.158847</td>\n",
       "      <td>0.325296</td>\n",
       "      <td>0.129998</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.166604</td>\n",
       "      <td>0.699211</td>\n",
       "      <td>2.967311</td>\n",
       "      <td>0.891388</td>\n",
       "      <td>1.845164</td>\n",
       "      <td>0.137033</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_141_X11Y2_4968</th>\n",
       "      <td>0.039043</td>\n",
       "      <td>0.028426</td>\n",
       "      <td>0.089894</td>\n",
       "      <td>0.089386</td>\n",
       "      <td>0.075023</td>\n",
       "      <td>0.294802</td>\n",
       "      <td>0.130921</td>\n",
       "      <td>0.105543</td>\n",
       "      <td>0.721960</td>\n",
       "      <td>1.462543</td>\n",
       "      <td>0.607401</td>\n",
       "      <td>0.847732</td>\n",
       "      <td>0.032395</td>\n",
       "      <td>2.527473</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_141_X11Y2_746</th>\n",
       "      <td>0.079079</td>\n",
       "      <td>0.184354</td>\n",
       "      <td>1.174959</td>\n",
       "      <td>0.297893</td>\n",
       "      <td>0.039844</td>\n",
       "      <td>0.177649</td>\n",
       "      <td>0.129131</td>\n",
       "      <td>0.056974</td>\n",
       "      <td>0.017406</td>\n",
       "      <td>0.391993</td>\n",
       "      <td>1.529043</td>\n",
       "      <td>1.196052</td>\n",
       "      <td>0.816324</td>\n",
       "      <td>0.145410</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP42_25_X3Y2_1178</th>\n",
       "      <td>0.110930</td>\n",
       "      <td>0.022230</td>\n",
       "      <td>0.031842</td>\n",
       "      <td>0.076643</td>\n",
       "      <td>0.128341</td>\n",
       "      <td>0.845475</td>\n",
       "      <td>0.036395</td>\n",
       "      <td>0.143434</td>\n",
       "      <td>1.005111</td>\n",
       "      <td>1.917632</td>\n",
       "      <td>0.455979</td>\n",
       "      <td>0.994131</td>\n",
       "      <td>0.314468</td>\n",
       "      <td>2.820787</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP43_272_X11Y3_460</th>\n",
       "      <td>0.057730</td>\n",
       "      <td>0.124963</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.137121</td>\n",
       "      <td>0.201029</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.075655</td>\n",
       "      <td>0.804087</td>\n",
       "      <td>2.755348</td>\n",
       "      <td>0.286016</td>\n",
       "      <td>1.194052</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP42_192_X8Y5_2214</th>\n",
       "      <td>0.053447</td>\n",
       "      <td>0.080335</td>\n",
       "      <td>0.031003</td>\n",
       "      <td>0.050570</td>\n",
       "      <td>0.095160</td>\n",
       "      <td>0.605501</td>\n",
       "      <td>0.156250</td>\n",
       "      <td>0.896955</td>\n",
       "      <td>1.221076</td>\n",
       "      <td>1.452352</td>\n",
       "      <td>0.302654</td>\n",
       "      <td>2.217377</td>\n",
       "      <td>0.687913</td>\n",
       "      <td>2.132569</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_203_X8Y8_2433</th>\n",
       "      <td>0.028729</td>\n",
       "      <td>0.030617</td>\n",
       "      <td>0.322813</td>\n",
       "      <td>1.180702</td>\n",
       "      <td>0.081800</td>\n",
       "      <td>0.182332</td>\n",
       "      <td>0.083335</td>\n",
       "      <td>0.037738</td>\n",
       "      <td>0.212589</td>\n",
       "      <td>1.547722</td>\n",
       "      <td>2.593907</td>\n",
       "      <td>0.265961</td>\n",
       "      <td>0.074016</td>\n",
       "      <td>1.381932</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>BaselTMA_SP41_249_X3Y9_996</th>\n",
       "      <td>0.228180</td>\n",
       "      <td>0.117249</td>\n",
       "      <td>0.509954</td>\n",
       "      <td>1.180713</td>\n",
       "      <td>0.167626</td>\n",
       "      <td>0.107251</td>\n",
       "      <td>0.116069</td>\n",
       "      <td>0.026734</td>\n",
       "      <td>0.265320</td>\n",
       "      <td>0.502906</td>\n",
       "      <td>1.608899</td>\n",
       "      <td>0.579117</td>\n",
       "      <td>1.673233</td>\n",
       "      <td>1.438507</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>100 rows × 14 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                   CD20       CD3      CD45      CD68  \\\n",
       "BaselTMA_SP41_44_X2Y7_726      0.080173  0.010469  0.013425  0.149373   \n",
       "BaselTMA_SP41_Liver_X2Y1_1909  0.051691  0.139216  0.061985  0.140193   \n",
       "BaselTMA_SP41_231_X6Y6_10_798  0.000000  0.078386  0.144959  0.570016   \n",
       "BaselTMA_SP41_141_X11Y2_4968   0.039043  0.028426  0.089894  0.089386   \n",
       "BaselTMA_SP41_141_X11Y2_746    0.079079  0.184354  1.174959  0.297893   \n",
       "...                                 ...       ...       ...       ...   \n",
       "BaselTMA_SP42_25_X3Y2_1178     0.110930  0.022230  0.031842  0.076643   \n",
       "BaselTMA_SP43_272_X11Y3_460    0.057730  0.124963  0.000000  0.000000   \n",
       "BaselTMA_SP42_192_X8Y5_2214    0.053447  0.080335  0.031003  0.050570   \n",
       "BaselTMA_SP41_203_X8Y8_2433    0.028729  0.030617  0.322813  1.180702   \n",
       "BaselTMA_SP41_249_X3Y9_996     0.228180  0.117249  0.509954  1.180713   \n",
       "\n",
       "                               Cytokeratin 14  Cytokeratin 19  Cytokeratin 5  \\\n",
       "BaselTMA_SP41_44_X2Y7_726            0.128329        1.577395       0.210581   \n",
       "BaselTMA_SP41_Liver_X2Y1_1909        0.208142        0.129807       0.117969   \n",
       "BaselTMA_SP41_231_X6Y6_10_798        0.158847        0.325296       0.129998   \n",
       "BaselTMA_SP41_141_X11Y2_4968         0.075023        0.294802       0.130921   \n",
       "BaselTMA_SP41_141_X11Y2_746          0.039844        0.177649       0.129131   \n",
       "...                                       ...             ...            ...   \n",
       "BaselTMA_SP42_25_X3Y2_1178           0.128341        0.845475       0.036395   \n",
       "BaselTMA_SP43_272_X11Y3_460          0.000000        0.137121       0.201029   \n",
       "BaselTMA_SP42_192_X8Y5_2214          0.095160        0.605501       0.156250   \n",
       "BaselTMA_SP41_203_X8Y8_2433          0.081800        0.182332       0.083335   \n",
       "BaselTMA_SP41_249_X3Y9_996           0.167626        0.107251       0.116069   \n",
       "\n",
       "                               Cytokeratin 7  Cytokeratin 8/18  E-Cadherin  \\\n",
       "BaselTMA_SP41_44_X2Y7_726           0.661001          1.777394    2.028177   \n",
       "BaselTMA_SP41_Liver_X2Y1_1909       0.000000          1.211757    0.642365   \n",
       "BaselTMA_SP41_231_X6Y6_10_798       0.000000          0.166604    0.699211   \n",
       "BaselTMA_SP41_141_X11Y2_4968        0.105543          0.721960    1.462543   \n",
       "BaselTMA_SP41_141_X11Y2_746         0.056974          0.017406    0.391993   \n",
       "...                                      ...               ...         ...   \n",
       "BaselTMA_SP42_25_X3Y2_1178          0.143434          1.005111    1.917632   \n",
       "BaselTMA_SP43_272_X11Y3_460         0.000000          0.075655    0.804087   \n",
       "BaselTMA_SP42_192_X8Y5_2214         0.896955          1.221076    1.452352   \n",
       "BaselTMA_SP41_203_X8Y8_2433         0.037738          0.212589    1.547722   \n",
       "BaselTMA_SP41_249_X3Y9_996          0.026734          0.265320    0.502906   \n",
       "\n",
       "                               Fibronectin      Her2  Vimentin  \\\n",
       "BaselTMA_SP41_44_X2Y7_726         1.606446  0.803987  0.279106   \n",
       "BaselTMA_SP41_Liver_X2Y1_1909     0.943650  1.488154  0.000000   \n",
       "BaselTMA_SP41_231_X6Y6_10_798     2.967311  0.891388  1.845164   \n",
       "BaselTMA_SP41_141_X11Y2_4968      0.607401  0.847732  0.032395   \n",
       "BaselTMA_SP41_141_X11Y2_746       1.529043  1.196052  0.816324   \n",
       "...                                    ...       ...       ...   \n",
       "BaselTMA_SP42_25_X3Y2_1178        0.455979  0.994131  0.314468   \n",
       "BaselTMA_SP43_272_X11Y3_460       2.755348  0.286016  1.194052   \n",
       "BaselTMA_SP42_192_X8Y5_2214       0.302654  2.217377  0.687913   \n",
       "BaselTMA_SP41_203_X8Y8_2433       2.593907  0.265961  0.074016   \n",
       "BaselTMA_SP41_249_X3Y9_996        1.608899  0.579117  1.673233   \n",
       "\n",
       "                               pan Cytokeratin  \n",
       "BaselTMA_SP41_44_X2Y7_726             4.026491  \n",
       "BaselTMA_SP41_Liver_X2Y1_1909         0.743843  \n",
       "BaselTMA_SP41_231_X6Y6_10_798         0.137033  \n",
       "BaselTMA_SP41_141_X11Y2_4968          2.527473  \n",
       "BaselTMA_SP41_141_X11Y2_746           0.145410  \n",
       "...                                        ...  \n",
       "BaselTMA_SP42_25_X3Y2_1178            2.820787  \n",
       "BaselTMA_SP43_272_X11Y3_460           0.000000  \n",
       "BaselTMA_SP42_192_X8Y5_2214           2.132569  \n",
       "BaselTMA_SP41_203_X8Y8_2433           1.381932  \n",
       "BaselTMA_SP41_249_X3Y9_996            1.438507  \n",
       "\n",
       "[100 rows x 14 columns]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_loom.get_type_dataset().get_exprs_df()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Loading from anndata <a class=\"anchor\" id=\"five\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can read in data from the [AnnData](https://anndata.readthedocs.io/en/stable/anndata.AnnData.html) format, along with a `yaml` file containing marker information using the `from_anndata_yaml` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Astir object with 6 cell types, 4 cell states, and 10 cells.\n"
     ]
    }
   ],
   "source": [
    "a_ann = ast.from_anndata_yaml(anndata_file=expression_anndata_path, marker_yaml=yaml_marker_path,\n",
    "                        protein_name=\"protein\",cell_name=\"cell_name\", batch_name=\"batch\")\n",
    "print(a_ann)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some notes:\n",
    "\n",
    "1. The protein and cell names are taken from `adata.var[protein_name]` and `adata.obs[cell_name]` respectively if specified, and from `adata.var_names` and `adata.obs_names` otherwise.\n",
    "\n",
    "2. If `batch_name` is specified, the corresponding column `adata.obs[batch_name]` is used as the batch variable and turned into a design matrix.\n",
    "\n",
    "3. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. Its format should match the description of the [marker dictionary](#mk).\n",
    "\n",
    "4. `from_anndata_yaml` returns an Astir object.\n",
    "\n",
    "5. `dtype` and `random_seed` are also customizable. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`."
   ]
  },
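  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the expected layout concrete, the sketch below builds a small in-memory `AnnData` object with the columns that the call above refers to (`protein`, `cell_name`, `batch`). It assumes `anndata`, `numpy` and `pandas` are installed and that the batch label is stored per cell; the toy names and values are hypothetical, and no file is written."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import anndata\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data: 4 cells (rows) by 3 proteins (columns).\n",
    "cells = [\"cell_1\", \"cell_2\", \"cell_3\", \"cell_4\"]\n",
    "proteins = [\"CD20\", \"CD3\", \"CD45\"]\n",
    "exprs = np.arange(12, dtype=np.float32).reshape(4, 3)\n",
    "\n",
    "# Per-cell annotations: cell names and a batch label for each cell.\n",
    "obs = pd.DataFrame({\"cell_name\": cells, \"batch\": [\"a\", \"a\", \"b\", \"b\"]}, index=cells)\n",
    "# Per-protein annotations: protein names.\n",
    "var = pd.DataFrame({\"protein\": proteins}, index=proteins)\n",
    "\n",
    "toy_adata = anndata.AnnData(X=exprs, obs=obs, var=var)\n",
    "print(toy_adata)\n",
    "print(list(toy_adata.var[\"protein\"]))\n",
    "print(list(toy_adata.obs[\"batch\"]))\n",
    "# toy_adata.write_h5ad(\"toy.h5ad\") would save it to disk in the h5ad format."
   ]
  },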
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Tensor"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(a_ann.get_type_dataset().get_exprs())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CD20</th>\n",
       "      <th>CD3</th>\n",
       "      <th>CD45</th>\n",
       "      <th>CD68</th>\n",
       "      <th>Cytokeratin 14</th>\n",
       "      <th>Cytokeratin 19</th>\n",
       "      <th>Cytokeratin 5</th>\n",
       "      <th>Cytokeratin 7</th>\n",
       "      <th>Cytokeratin 8/18</th>\n",
       "      <th>E-Cadherin</th>\n",
       "      <th>Fibronectin</th>\n",
       "      <th>Her2</th>\n",
       "      <th>Vimentin</th>\n",
       "      <th>pan Cytokeratin</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_1</th>\n",
       "      <td>0.168521</td>\n",
       "      <td>0.090277</td>\n",
       "      <td>0.271871</td>\n",
       "      <td>0.412439</td>\n",
       "      <td>0.087354</td>\n",
       "      <td>0.155710</td>\n",
       "      <td>0.100308</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.096674</td>\n",
       "      <td>0.974271</td>\n",
       "      <td>2.867470</td>\n",
       "      <td>0.552905</td>\n",
       "      <td>2.335253</td>\n",
       "      <td>1.361075</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_2</th>\n",
       "      <td>0.366301</td>\n",
       "      <td>0.352614</td>\n",
       "      <td>0.284034</td>\n",
       "      <td>0.312862</td>\n",
       "      <td>0.152354</td>\n",
       "      <td>0.508728</td>\n",
       "      <td>0.028651</td>\n",
       "      <td>0.029904</td>\n",
       "      <td>0.749755</td>\n",
       "      <td>2.787740</td>\n",
       "      <td>2.174494</td>\n",
       "      <td>1.046198</td>\n",
       "      <td>0.285699</td>\n",
       "      <td>2.454543</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_3</th>\n",
       "      <td>0.177006</td>\n",
       "      <td>0.103808</td>\n",
       "      <td>0.150791</td>\n",
       "      <td>0.122472</td>\n",
       "      <td>0.292241</td>\n",
       "      <td>0.634366</td>\n",
       "      <td>0.090457</td>\n",
       "      <td>0.056627</td>\n",
       "      <td>0.446911</td>\n",
       "      <td>1.927940</td>\n",
       "      <td>2.997043</td>\n",
       "      <td>1.020517</td>\n",
       "      <td>2.887193</td>\n",
       "      <td>2.590460</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_4</th>\n",
       "      <td>0.304068</td>\n",
       "      <td>0.222802</td>\n",
       "      <td>0.219736</td>\n",
       "      <td>0.277622</td>\n",
       "      <td>0.373870</td>\n",
       "      <td>2.212514</td>\n",
       "      <td>0.304824</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.904837</td>\n",
       "      <td>3.175959</td>\n",
       "      <td>1.598163</td>\n",
       "      <td>2.269974</td>\n",
       "      <td>0.877098</td>\n",
       "      <td>4.250308</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_5</th>\n",
       "      <td>0.137789</td>\n",
       "      <td>0.130010</td>\n",
       "      <td>0.105604</td>\n",
       "      <td>1.035280</td>\n",
       "      <td>0.212105</td>\n",
       "      <td>0.144144</td>\n",
       "      <td>0.074692</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.900182</td>\n",
       "      <td>2.326346</td>\n",
       "      <td>0.610897</td>\n",
       "      <td>2.882146</td>\n",
       "      <td>0.275225</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_6</th>\n",
       "      <td>0.182926</td>\n",
       "      <td>0.169596</td>\n",
       "      <td>0.270698</td>\n",
       "      <td>0.257178</td>\n",
       "      <td>0.224863</td>\n",
       "      <td>1.143546</td>\n",
       "      <td>0.189600</td>\n",
       "      <td>0.001542</td>\n",
       "      <td>0.650384</td>\n",
       "      <td>2.580153</td>\n",
       "      <td>1.891692</td>\n",
       "      <td>1.724237</td>\n",
       "      <td>1.931947</td>\n",
       "      <td>2.994441</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_7</th>\n",
       "      <td>0.239257</td>\n",
       "      <td>0.149007</td>\n",
       "      <td>0.351788</td>\n",
       "      <td>0.138080</td>\n",
       "      <td>0.142505</td>\n",
       "      <td>1.415104</td>\n",
       "      <td>0.124484</td>\n",
       "      <td>0.001245</td>\n",
       "      <td>1.091975</td>\n",
       "      <td>2.696699</td>\n",
       "      <td>1.994174</td>\n",
       "      <td>1.796137</td>\n",
       "      <td>0.127125</td>\n",
       "      <td>3.523499</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_8</th>\n",
       "      <td>0.175299</td>\n",
       "      <td>0.153332</td>\n",
       "      <td>0.215698</td>\n",
       "      <td>0.104709</td>\n",
       "      <td>0.237387</td>\n",
       "      <td>2.190369</td>\n",
       "      <td>0.264600</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.457901</td>\n",
       "      <td>2.788996</td>\n",
       "      <td>1.859896</td>\n",
       "      <td>1.726696</td>\n",
       "      <td>0.106661</td>\n",
       "      <td>4.245234</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_9</th>\n",
       "      <td>0.210541</td>\n",
       "      <td>0.118273</td>\n",
       "      <td>0.146135</td>\n",
       "      <td>0.148164</td>\n",
       "      <td>0.362226</td>\n",
       "      <td>1.267224</td>\n",
       "      <td>0.173477</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.842407</td>\n",
       "      <td>2.950440</td>\n",
       "      <td>1.852758</td>\n",
       "      <td>2.183716</td>\n",
       "      <td>0.957369</td>\n",
       "      <td>3.098247</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ZTMA208_slide_11_By5x8_10</th>\n",
       "      <td>0.308899</td>\n",
       "      <td>0.326121</td>\n",
       "      <td>0.224866</td>\n",
       "      <td>0.276182</td>\n",
       "      <td>0.140240</td>\n",
       "      <td>2.032473</td>\n",
       "      <td>0.334358</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.503531</td>\n",
       "      <td>2.938590</td>\n",
       "      <td>2.192502</td>\n",
       "      <td>2.312838</td>\n",
       "      <td>1.337983</td>\n",
       "      <td>4.199266</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               CD20       CD3      CD45      CD68  \\\n",
       "ZTMA208_slide_11_By5x8_1   0.168521  0.090277  0.271871  0.412439   \n",
       "ZTMA208_slide_11_By5x8_2   0.366301  0.352614  0.284034  0.312862   \n",
       "ZTMA208_slide_11_By5x8_3   0.177006  0.103808  0.150791  0.122472   \n",
       "ZTMA208_slide_11_By5x8_4   0.304068  0.222802  0.219736  0.277622   \n",
       "ZTMA208_slide_11_By5x8_5   0.137789  0.130010  0.105604  1.035280   \n",
       "ZTMA208_slide_11_By5x8_6   0.182926  0.169596  0.270698  0.257178   \n",
       "ZTMA208_slide_11_By5x8_7   0.239257  0.149007  0.351788  0.138080   \n",
       "ZTMA208_slide_11_By5x8_8   0.175299  0.153332  0.215698  0.104709   \n",
       "ZTMA208_slide_11_By5x8_9   0.210541  0.118273  0.146135  0.148164   \n",
       "ZTMA208_slide_11_By5x8_10  0.308899  0.326121  0.224866  0.276182   \n",
       "\n",
       "                           Cytokeratin 14  Cytokeratin 19  Cytokeratin 5  \\\n",
       "ZTMA208_slide_11_By5x8_1         0.087354        0.155710       0.100308   \n",
       "ZTMA208_slide_11_By5x8_2         0.152354        0.508728       0.028651   \n",
       "ZTMA208_slide_11_By5x8_3         0.292241        0.634366       0.090457   \n",
       "ZTMA208_slide_11_By5x8_4         0.373870        2.212514       0.304824   \n",
       "ZTMA208_slide_11_By5x8_5         0.212105        0.144144       0.074692   \n",
       "ZTMA208_slide_11_By5x8_6         0.224863        1.143546       0.189600   \n",
       "ZTMA208_slide_11_By5x8_7         0.142505        1.415104       0.124484   \n",
       "ZTMA208_slide_11_By5x8_8         0.237387        2.190369       0.264600   \n",
       "ZTMA208_slide_11_By5x8_9         0.362226        1.267224       0.173477   \n",
       "ZTMA208_slide_11_By5x8_10        0.140240        2.032473       0.334358   \n",
       "\n",
       "                           Cytokeratin 7  Cytokeratin 8/18  E-Cadherin  \\\n",
       "ZTMA208_slide_11_By5x8_1        0.000000          0.096674    0.974271   \n",
       "ZTMA208_slide_11_By5x8_2        0.029904          0.749755    2.787740   \n",
       "ZTMA208_slide_11_By5x8_3        0.056627          0.446911    1.927940   \n",
       "ZTMA208_slide_11_By5x8_4        0.000000          1.904837    3.175959   \n",
       "ZTMA208_slide_11_By5x8_5        0.000000          0.000000    1.900182   \n",
       "ZTMA208_slide_11_By5x8_6        0.001542          0.650384    2.580153   \n",
       "ZTMA208_slide_11_By5x8_7        0.001245          1.091975    2.696699   \n",
       "ZTMA208_slide_11_By5x8_8        0.000000          1.457901    2.788996   \n",
       "ZTMA208_slide_11_By5x8_9        0.000000          0.842407    2.950440   \n",
       "ZTMA208_slide_11_By5x8_10       0.000000          1.503531    2.938590   \n",
       "\n",
       "                           Fibronectin      Her2  Vimentin  pan Cytokeratin  \n",
       "ZTMA208_slide_11_By5x8_1      2.867470  0.552905  2.335253         1.361075  \n",
       "ZTMA208_slide_11_By5x8_2      2.174494  1.046198  0.285699         2.454543  \n",
       "ZTMA208_slide_11_By5x8_3      2.997043  1.020517  2.887193         2.590460  \n",
       "ZTMA208_slide_11_By5x8_4      1.598163  2.269974  0.877098         4.250308  \n",
       "ZTMA208_slide_11_By5x8_5      2.326346  0.610897  2.882146         0.275225  \n",
       "ZTMA208_slide_11_By5x8_6      1.891692  1.724237  1.931947         2.994441  \n",
       "ZTMA208_slide_11_By5x8_7      1.994174  1.796137  0.127125         3.523499  \n",
       "ZTMA208_slide_11_By5x8_8      1.859896  1.726696  0.106661         4.245234  \n",
       "ZTMA208_slide_11_By5x8_9      1.852758  2.183716  0.957369         3.098247  \n",
       "ZTMA208_slide_11_By5x8_10     2.192502  2.312838  1.337983         4.199266  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_ann.get_type_dataset().get_exprs_df()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}