{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Loading data into astir" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Table of Contents:\n", "* [0 Loading necessary libraries and defining paths](#zero)\n", "* [1 Starting Astir within python](#one)\n", "    * [1.0 Loading marker dictionary and design matrix](#one0)\n", "    * [1.1 Loading data as pd.DataFrame](#one1)\n", "    * [1.2 Loading data as np.array or torch.tensor](#one2)\n", "    * [1.3 Loading data as SCDataset](#one3)\n", "* [2 Loading from csv and yaml files](#two)\n", "* [3 Loading from a directory of csvs and yaml](#three)\n", "* [4 Loading from loom](#four)\n", "* [5 Loading from anndata](#five)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Loading necessary libraries and defining paths " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# !pip install -e ../../..\n", "import os\n", "import sys\n", "\n", "# Make the local astir checkout importable when it is not installed\n", "module_path = os.path.abspath(os.path.join('../../..'))\n", "if module_path not in sys.path:\n", "    sys.path.append(module_path)\n", "module_path = os.path.abspath(os.path.join('../../../astir'))\n", "if module_path not in sys.path:\n", "    sys.path.append(module_path)\n", "\n", "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import yaml\n", "import pandas as pd\n", "import astir as ast\n", "import numpy as np\n", "import torch" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "yaml_marker_path = \"../../../astir/tests/test-data/jackson-2020-markers.yml\"\n", "design_mat_path = \"../../../astir/tests/test-data/design.csv\"\n", "expression_mat_path = \"../../../astir/tests/test-data/test_data.csv\"\n", "expression_dir_path = \"../../../astir/tests/test-data/test-dir-read\"\n", "expression_loom_path = \"../../../astir/tests/test-data/basel_100.loom\"\n", 
"expression_anndata_path=\"../../../astir/tests/test-data/adata_small.h5ad\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Starting Astir within python " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The input dataset should represent protein expression in single cells. The rows should represent each cell (one row per cell) and the columns should represent each protein (one column per protein). A marker which maps the features (proteins) to cell type/state may is required. A design matrix is optional. If provided, it should be either `np.array` or `pd.DataFrame`.\n", "\n", "The initialization of `Astir` requires input dataset `input_expr` as one of `pd.DataFrame`, `Tuple[np.array, List[str], List[str]]` and `Tuple[SCDataset, SCDataset]`. \n", "\n", "Note: `dtype` and `random_seed` are always customizable. `dtype` is default to `torch.float64` and `random_seed` is default to `1234`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.0 Loading marker dictionary and design matrix " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Marker Dictionary " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'cell_states': {'RTK_signalling': ['Her2', 'EGFR'], 'proliferation': ['Ki-67', 'phospho Histone'], 'mTOR_signalling': ['phospho mTOR', 'phospho S6'], 'apoptosis': ['cleaved PARP', 'Cleaved Caspase3']}, 'cell_types': {'stromal': ['Vimentin', 'Fibronectin'], 'B cells': ['CD45', 'CD20'], 'T cells': ['CD45', 'CD3'], 'macrophage': ['CD45', 'CD68'], 'epithelial(basal)': ['E-Cadherin', 'pan Cytokeratin', 'Cytokeratin 5', 'Cytokeratin 14', 'Her2'], 'epithelial(luminal)': ['E-Cadherin', 'pan Cytokeratin', 'Cytokeratin 7', 'Cytokeratin 8/18', 'Cytokeratin 19', 'Cytokeratin 5', 'Her2']}, 'hierarchy': {'immune': ['B cells', 'T cells', 'macrophage'], 'epithelial': ['epithelial(basal)', 'epithelial(luminal)']}}\n" ] } ], "source": [ 
"with open(yaml_marker_path, \"r\") as stream:\n", " marker_dict = yaml.safe_load(stream)\n", "print(marker_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some notes:\n", "\n", "1. The marker `marker_dict` is not required when `input_expr` is `Tuple[SCDataset, SCDataset]`. Otherwise, it is required to be `Dict[str, Dict[str, str]]`. \n", "\n", "2. The outer dictionary may have at most three keys: `cell_type`, `cell_state` and `hierarchy`. `cell_type` and `cell_state` maps to the corresponding dictionary which maps the name of cell type/state to protein features. `hierarchy` maps to the dictionary which indicates the cell type hierarchy.\n", "\n", "3. If the user is only intended to classify one of cell type and cell state, only the intended marker dictionary should be provided. So that marker_dict is one of `{\"cell_state\": {...}}`, `{\"cell_type\": {...}}` and `{\"cell_type\": {...}, \"cell_state\": {...}}`.\n", "\n", "4. The `hierarchy` dictionary should be included when the client tends to call `Astir.assign_celltype_hierarchy()`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Design matrix:\n", "\n", "Note that the design matrix must have the same number of rows as there are number of cells." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(49, 40)\n" ] } ], "source": [ "design_df = pd.read_csv(design_mat_path, index_col=0)\n", "print(design_df.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: `design` is not necessary when `input_expr` is `Tuple[SCDataset, SCDataset]`. Otherwise it is optional." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1 Loading data as `pd.DataFrame` " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the input is `pd.DataFrame`, its row and column should respectively represent the cells and the features (proteins). 
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "<thead>\n",
 "<tr style=\"text-align: right;\"><th></th><th>EGFR</th><th>Ruthenium_1</th><th>Ruthenium_2</th><th>Ruthenium_3</th><th>Ruthenium_4</th><th>Ruthenium_5</th><th>Ruthenium_6</th><th>Ruthenium_7</th><th>E-Cadherin</th><th>DNA1</th><th>...</th><th>CD45</th><th>CD68</th><th>CD3</th><th>Carbonic Anhydrase IX</th><th>Cytokeratin 8/18</th><th>Cytokeratin 7</th><th>Twist</th><th>phospho Histone</th><th>phospho mTOR</th><th>phospho S6</th></tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_1</th><td>0.281753</td><td>1.319588</td><td>0.597380</td><td>1.782863</td><td>1.757824</td><td>1.991857</td><td>2.580564</td><td>2.287167</td><td>1.814309</td><td>2.261638</td><td>...</td><td>0.044733</td><td>0.184805</td><td>0.000000</td><td>0.928929</td><td>0.025526</td><td>0.043423</td><td>0.209742</td><td>0.137454</td><td>0.572811</td><td>0.215508</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_2</th><td>0.303016</td><td>1.319588</td><td>0.597380</td><td>1.782863</td><td>1.757824</td><td>1.991857</td><td>2.580564</td><td>2.287167</td><td>1.517685</td><td>1.613060</td><td>...</td><td>0.046802</td><td>0.080406</td><td>0.110806</td><td>0.752101</td><td>0.000000</td><td>0.032056</td><td>0.108013</td><td>0.048428</td><td>0.539647</td><td>0.655731</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_3</th><td>0.252374</td><td>1.319588</td><td>0.597380</td><td>1.782863</td><td>1.757824</td><td>1.991857</td><td>2.580564</td><td>2.287167</td><td>1.246433</td><td>2.138744</td><td>...</td><td>0.028499</td><td>0.203248</td><td>0.020617</td><td>0.740759</td><td>0.083311</td><td>0.081503</td><td>0.119058</td><td>0.063097</td><td>0.409735</td><td>0.437845</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_4</th><td>0.397732</td><td>1.306852</td><td>0.534496</td><td>1.678217</td><td>1.757824</td><td>1.961430</td><td>2.528551</td><td>2.183814</td><td>1.839785</td><td>1.816015</td><td>...</td><td>0.069053</td><td>0.305200</td><td>0.060264</td><td>1.095968</td><td>0.184603</td><td>0.131531</td><td>0.160778</td><td>0.090666</td><td>0.305718</td><td>0.132236</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_5</th><td>0.426352</td><td>1.173439</td><td>0.597380</td><td>1.589303</td><td>1.389839</td><td>1.789887</td><td>2.343743</td><td>2.123334</td><td>1.618347</td><td>1.355214</td><td>...</td><td>0.233777</td><td>0.135084</td><td>0.057195</td><td>1.427983</td><td>0.035371</td><td>0.038448</td><td>0.014434</td><td>0.127032</td><td>0.261205</td><td>0.157786</td></tr>\n",
 "</tbody>\n",
 "</table>\n",
 "<p>5 rows × 45 columns</p>\n",
 "</div>\n", "
" ], "text/plain": [ " EGFR Ruthenium_1 Ruthenium_2 Ruthenium_3 \\\n", "BaselTMA_SP41_126_X14Y7_1 0.281753 1.319588 0.597380 1.782863 \n", "BaselTMA_SP41_126_X14Y7_2 0.303016 1.319588 0.597380 1.782863 \n", "BaselTMA_SP41_126_X14Y7_3 0.252374 1.319588 0.597380 1.782863 \n", "BaselTMA_SP41_126_X14Y7_4 0.397732 1.306852 0.534496 1.678217 \n", "BaselTMA_SP41_126_X14Y7_5 0.426352 1.173439 0.597380 1.589303 \n", "\n", " Ruthenium_4 Ruthenium_5 Ruthenium_6 Ruthenium_7 \\\n", "BaselTMA_SP41_126_X14Y7_1 1.757824 1.991857 2.580564 2.287167 \n", "BaselTMA_SP41_126_X14Y7_2 1.757824 1.991857 2.580564 2.287167 \n", "BaselTMA_SP41_126_X14Y7_3 1.757824 1.991857 2.580564 2.287167 \n", "BaselTMA_SP41_126_X14Y7_4 1.757824 1.961430 2.528551 2.183814 \n", "BaselTMA_SP41_126_X14Y7_5 1.389839 1.789887 2.343743 2.123334 \n", "\n", " E-Cadherin DNA1 ... CD45 CD68 \\\n", "BaselTMA_SP41_126_X14Y7_1 1.814309 2.261638 ... 0.044733 0.184805 \n", "BaselTMA_SP41_126_X14Y7_2 1.517685 1.613060 ... 0.046802 0.080406 \n", "BaselTMA_SP41_126_X14Y7_3 1.246433 2.138744 ... 0.028499 0.203248 \n", "BaselTMA_SP41_126_X14Y7_4 1.839785 1.816015 ... 0.069053 0.305200 \n", "BaselTMA_SP41_126_X14Y7_5 1.618347 1.355214 ... 
0.233777 0.135084 \n", "\n", " CD3 Carbonic Anhydrase IX Cytokeratin 8/18 \\\n", "BaselTMA_SP41_126_X14Y7_1 0.000000 0.928929 0.025526 \n", "BaselTMA_SP41_126_X14Y7_2 0.110806 0.752101 0.000000 \n", "BaselTMA_SP41_126_X14Y7_3 0.020617 0.740759 0.083311 \n", "BaselTMA_SP41_126_X14Y7_4 0.060264 1.095968 0.184603 \n", "BaselTMA_SP41_126_X14Y7_5 0.057195 1.427983 0.035371 \n", "\n", " Cytokeratin 7 Twist phospho Histone \\\n", "BaselTMA_SP41_126_X14Y7_1 0.043423 0.209742 0.137454 \n", "BaselTMA_SP41_126_X14Y7_2 0.032056 0.108013 0.048428 \n", "BaselTMA_SP41_126_X14Y7_3 0.081503 0.119058 0.063097 \n", "BaselTMA_SP41_126_X14Y7_4 0.131531 0.160778 0.090666 \n", "BaselTMA_SP41_126_X14Y7_5 0.038448 0.014434 0.127032 \n", "\n", " phospho mTOR phospho S6 \n", "BaselTMA_SP41_126_X14Y7_1 0.572811 0.215508 \n", "BaselTMA_SP41_126_X14Y7_2 0.539647 0.655731 \n", "BaselTMA_SP41_126_X14Y7_3 0.409735 0.437845 \n", "BaselTMA_SP41_126_X14Y7_4 0.305718 0.132236 \n", "BaselTMA_SP41_126_X14Y7_5 0.261205 0.157786 \n", "\n", "[5 rows x 45 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_expr = pd.read_csv(expression_mat_path, index_col=0)\n", "df_expr.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 49 cells.\n" ] } ], "source": [ "a_df = ast.Astir(input_expr=df_expr, marker_dict=marker_dict, design=design_df)\n", "print(a_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 Loading data as `np.array` or `torch.tensor` " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the input is `Tuple[Union[np.array, torch.tensor], List[str], List[str]]`, the first element (`np.array` or `torch.tensor`) is the expression data, the second element `List[str]` holds the column names (the names of the proteins) and the third element `List[str]` holds the row names (the 
names of the cells). The lengths of the second and third lists should equal the number of columns and the number of rows of the first element, respectively. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 49 cells.\n" ] } ], "source": [ "# Load as np.array\n", "np_expr = df_expr.values\n", "# Column labels (protein names) and row labels (cell names)\n", "features = list(df_expr.columns)\n", "cores = list(df_expr.index)\n", "a_np = ast.Astir(input_expr=(np_expr, features, cores), marker_dict=marker_dict, design=design_df)\n", "print(a_np)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 49 cells.\n" ] } ], "source": [ "# Load as torch.tensor\n", "t_expr = torch.from_numpy(np_expr)\n", "a_t = ast.Astir(input_expr=(t_expr, features, cores), marker_dict=marker_dict, design=design_df)\n", "print(a_t)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3 Loading data as `SCDataset` " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the input is `Tuple[SCDataset, SCDataset]`, the first `SCDataset` should be the cell type dataset and the second `SCDataset` should be the cell state dataset." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 49 cells.\n" ] } ], "source": [ "# Build the cell type and cell state datasets separately, then pass them as a tuple\n", "type_scd = ast.SCDataset(expr_input=df_expr, marker_dict=marker_dict[\"cell_types\"], \n", "                         include_other_column=True, design=design_df)\n", "state_scd = ast.SCDataset(expr_input=df_expr, marker_dict=marker_dict[\"cell_states\"], \n", "                          include_other_column=False, design=design_df)\n", "a_scd = ast.Astir(input_expr=(type_scd, state_scd))\n", "print(a_scd)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. 
Loading from csv and yaml files " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data reader `from_csv_yaml` loads the expression data from a `csv` file and the markers from a `yaml` file. \n", "\n", "Each row of the `csv` file should represent a single cell and each column should hold the expression of one protein across the cells. " ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 49 cells.\n" ] } ], "source": [ "a_csv = ast.from_csv_yaml(csv_input=expression_mat_path, marker_yaml=yaml_marker_path, design_csv=design_mat_path)\n", "print(a_csv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some notes:\n", "\n", "1. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. Its format should match the description in the [marker dictionary](#mk) section.\n", "\n", "2. `from_csv_yaml` returns an `Astir` object. \n", "\n", "3. `dtype` and `random_seed` are also customizable. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a_csv.get_type_dataset().get_exprs())" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "<thead>\n",
 "<tr style=\"text-align: right;\"><th></th><th>CD20</th><th>CD3</th><th>CD45</th><th>CD68</th><th>Cytokeratin 14</th><th>Cytokeratin 19</th><th>Cytokeratin 5</th><th>Cytokeratin 7</th><th>Cytokeratin 8/18</th><th>E-Cadherin</th><th>Fibronectin</th><th>Her2</th><th>Vimentin</th><th>pan Cytokeratin</th></tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_1</th><td>0.207884</td><td>0.000000</td><td>0.044733</td><td>0.184805</td><td>0.134128</td><td>0.079956</td><td>0.178350</td><td>0.043423</td><td>0.025526</td><td>1.814309</td><td>1.039734</td><td>0.483007</td><td>0.444140</td><td>1.187512</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_2</th><td>0.021506</td><td>0.110806</td><td>0.046802</td><td>0.080406</td><td>0.026951</td><td>0.066922</td><td>0.081147</td><td>0.032056</td><td>0.000000</td><td>1.517685</td><td>1.147644</td><td>0.513386</td><td>0.270070</td><td>0.749379</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_3</th><td>0.008878</td><td>0.020617</td><td>0.028499</td><td>0.203248</td><td>0.023515</td><td>0.186294</td><td>0.076112</td><td>0.081503</td><td>0.083311</td><td>1.246433</td><td>0.988906</td><td>0.633226</td><td>0.233909</td><td>1.216521</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_4</th><td>0.053027</td><td>0.060264</td><td>0.069053</td><td>0.305200</td><td>0.114420</td><td>0.346273</td><td>0.164059</td><td>0.131531</td><td>0.184603</td><td>1.839785</td><td>0.842710</td><td>0.709272</td><td>0.542362</td><td>1.354303</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_5</th><td>0.019127</td><td>0.057195</td><td>0.233777</td><td>0.135084</td><td>0.055368</td><td>0.124407</td><td>0.095323</td><td>0.038448</td><td>0.035371</td><td>1.618347</td><td>1.073357</td><td>0.482230</td><td>0.759944</td><td>0.629398</td></tr>\n",
 "</tbody>\n",
 "</table>\n",
 "</div>\n", "
" ], "text/plain": [ " CD20 CD3 CD45 CD68 \\\n", "BaselTMA_SP41_126_X14Y7_1 0.207884 0.000000 0.044733 0.184805 \n", "BaselTMA_SP41_126_X14Y7_2 0.021506 0.110806 0.046802 0.080406 \n", "BaselTMA_SP41_126_X14Y7_3 0.008878 0.020617 0.028499 0.203248 \n", "BaselTMA_SP41_126_X14Y7_4 0.053027 0.060264 0.069053 0.305200 \n", "BaselTMA_SP41_126_X14Y7_5 0.019127 0.057195 0.233777 0.135084 \n", "\n", " Cytokeratin 14 Cytokeratin 19 Cytokeratin 5 \\\n", "BaselTMA_SP41_126_X14Y7_1 0.134128 0.079956 0.178350 \n", "BaselTMA_SP41_126_X14Y7_2 0.026951 0.066922 0.081147 \n", "BaselTMA_SP41_126_X14Y7_3 0.023515 0.186294 0.076112 \n", "BaselTMA_SP41_126_X14Y7_4 0.114420 0.346273 0.164059 \n", "BaselTMA_SP41_126_X14Y7_5 0.055368 0.124407 0.095323 \n", "\n", " Cytokeratin 7 Cytokeratin 8/18 E-Cadherin \\\n", "BaselTMA_SP41_126_X14Y7_1 0.043423 0.025526 1.814309 \n", "BaselTMA_SP41_126_X14Y7_2 0.032056 0.000000 1.517685 \n", "BaselTMA_SP41_126_X14Y7_3 0.081503 0.083311 1.246433 \n", "BaselTMA_SP41_126_X14Y7_4 0.131531 0.184603 1.839785 \n", "BaselTMA_SP41_126_X14Y7_5 0.038448 0.035371 1.618347 \n", "\n", " Fibronectin Her2 Vimentin pan Cytokeratin \n", "BaselTMA_SP41_126_X14Y7_1 1.039734 0.483007 0.444140 1.187512 \n", "BaselTMA_SP41_126_X14Y7_2 1.147644 0.513386 0.270070 0.749379 \n", "BaselTMA_SP41_126_X14Y7_3 0.988906 0.633226 0.233909 1.216521 \n", "BaselTMA_SP41_126_X14Y7_4 0.842710 0.709272 0.542362 1.354303 \n", "BaselTMA_SP41_126_X14Y7_5 1.073357 0.482230 0.759944 0.629398 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a_csv.get_type_dataset().get_exprs_df().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Loading from a directory of csvs and yaml " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The user can also load the data from a directory of `csv` files with `from_csv_dir_yaml`.\n", "\n", "In this case, every `csv` file should represent the expression data from different samples. 
A design matrix is generated automatically, so none needs to be supplied. " ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 40 cells.\n" ] } ], "source": [ "a_dir = ast.from_csv_dir_yaml(input_dir=expression_dir_path, marker_yaml=yaml_marker_path)\n", "print(a_dir)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some notes:\n", "\n", "1. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. Its format should match the description in the [marker dictionary](#mk) section.\n", "\n", "2. `from_csv_dir_yaml` returns an `Astir` object.\n", "\n", "3. `dtype` and `random_seed` are also customizable. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a_dir.get_type_dataset().get_exprs())" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "<thead>\n",
 "<tr style=\"text-align: right;\"><th></th><th>CD20</th><th>CD3</th><th>CD45</th><th>CD68</th><th>Cytokeratin 14</th><th>Cytokeratin 19</th><th>Cytokeratin 5</th><th>Cytokeratin 7</th><th>Cytokeratin 8/18</th><th>E-Cadherin</th><th>Fibronectin</th><th>Her2</th><th>Vimentin</th><th>pan Cytokeratin</th></tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_1</th><td>0.207884</td><td>0.000000</td><td>0.044733</td><td>0.184805</td><td>0.134128</td><td>0.079956</td><td>0.178350</td><td>0.043423</td><td>0.025526</td><td>1.814309</td><td>1.039734</td><td>0.483007</td><td>0.444140</td><td>1.187512</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_2</th><td>0.021506</td><td>0.110806</td><td>0.046802</td><td>0.080406</td><td>0.026951</td><td>0.066922</td><td>0.081147</td><td>0.032056</td><td>0.000000</td><td>1.517685</td><td>1.147644</td><td>0.513386</td><td>0.270070</td><td>0.749379</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_3</th><td>0.008878</td><td>0.020617</td><td>0.028499</td><td>0.203248</td><td>0.023515</td><td>0.186294</td><td>0.076112</td><td>0.081503</td><td>0.083311</td><td>1.246433</td><td>0.988906</td><td>0.633226</td><td>0.233909</td><td>1.216521</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_4</th><td>0.053027</td><td>0.060264</td><td>0.069053</td><td>0.305200</td><td>0.114420</td><td>0.346273</td><td>0.164059</td><td>0.131531</td><td>0.184603</td><td>1.839785</td><td>0.842710</td><td>0.709272</td><td>0.542362</td><td>1.354303</td></tr>\n",
 "<tr><th>BaselTMA_SP41_126_X14Y7_5</th><td>0.019127</td><td>0.057195</td><td>0.233777</td><td>0.135084</td><td>0.055368</td><td>0.124407</td><td>0.095323</td><td>0.038448</td><td>0.035371</td><td>1.618347</td><td>1.073357</td><td>0.482230</td><td>0.759944</td><td>0.629398</td></tr>\n",
 "</tbody>\n",
 "</table>\n",
 "</div>\n", "
" ], "text/plain": [ " CD20 CD3 CD45 CD68 \\\n", "BaselTMA_SP41_126_X14Y7_1 0.207884 0.000000 0.044733 0.184805 \n", "BaselTMA_SP41_126_X14Y7_2 0.021506 0.110806 0.046802 0.080406 \n", "BaselTMA_SP41_126_X14Y7_3 0.008878 0.020617 0.028499 0.203248 \n", "BaselTMA_SP41_126_X14Y7_4 0.053027 0.060264 0.069053 0.305200 \n", "BaselTMA_SP41_126_X14Y7_5 0.019127 0.057195 0.233777 0.135084 \n", "\n", " Cytokeratin 14 Cytokeratin 19 Cytokeratin 5 \\\n", "BaselTMA_SP41_126_X14Y7_1 0.134128 0.079956 0.178350 \n", "BaselTMA_SP41_126_X14Y7_2 0.026951 0.066922 0.081147 \n", "BaselTMA_SP41_126_X14Y7_3 0.023515 0.186294 0.076112 \n", "BaselTMA_SP41_126_X14Y7_4 0.114420 0.346273 0.164059 \n", "BaselTMA_SP41_126_X14Y7_5 0.055368 0.124407 0.095323 \n", "\n", " Cytokeratin 7 Cytokeratin 8/18 E-Cadherin \\\n", "BaselTMA_SP41_126_X14Y7_1 0.043423 0.025526 1.814309 \n", "BaselTMA_SP41_126_X14Y7_2 0.032056 0.000000 1.517685 \n", "BaselTMA_SP41_126_X14Y7_3 0.081503 0.083311 1.246433 \n", "BaselTMA_SP41_126_X14Y7_4 0.131531 0.184603 1.839785 \n", "BaselTMA_SP41_126_X14Y7_5 0.038448 0.035371 1.618347 \n", "\n", " Fibronectin Her2 Vimentin pan Cytokeratin \n", "BaselTMA_SP41_126_X14Y7_1 1.039734 0.483007 0.444140 1.187512 \n", "BaselTMA_SP41_126_X14Y7_2 1.147644 0.513386 0.270070 0.749379 \n", "BaselTMA_SP41_126_X14Y7_3 0.988906 0.633226 0.233909 1.216521 \n", "BaselTMA_SP41_126_X14Y7_4 0.842710 0.709272 0.542362 1.354303 \n", "BaselTMA_SP41_126_X14Y7_5 1.073357 0.482230 0.759944 0.629398 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a_dir.get_type_dataset().get_exprs_df().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Loading from loom " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to load the data from a `loom` file with `from_loompy_yaml`. 
" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 100 cells.\n" ] } ], "source": [ "a_loom = ast.from_loompy_yaml(loom_file=expression_loom_path, marker_yaml=yaml_marker_path, \n", " protein_name_attr=\"protein\", cell_name_attr=\"cell_name\", batch_name_attr=\"batch\")\n", "print(a_loom)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some notes:\n", "\n", "1. The protein and cell names are taken from `ds.ra[protein_name_attr]` and `ds.ca[cell_name_attr]` respectively if specified, and `ds.ra[\"protein\"]` and `ds.cs[\"cell_name\"]` otherwise.\n", "\n", "2. If `batch_name` is sepecified, the corresponding column of `ds.ca[batch_name_attr]` will be assumed as the batch variable and turned into a design matrix. Otherwise it is taken as `ds.ca[\"batch\"]`\n", "\n", "3. The yaml file at `yaml_marker_path` contains the marker which maps protein features to cell type/state classes. The format should match the description of *[marker dictionary](#mk).\n", "\n", "4. `from_loom_yaml` returns an Astir object.\n", "\n", "5. `dtype` and `random_seed` are also customizable. `dtype` is default to `torch.float64` and `random_seed` is default to `1234`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a_loom.get_type_dataset().get_exprs())" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
 "<table border=\"1\" class=\"dataframe\">\n",
 "<thead>\n",
 "<tr style=\"text-align: right;\"><th></th><th>CD20</th><th>CD3</th><th>CD45</th><th>CD68</th><th>Cytokeratin 14</th><th>Cytokeratin 19</th><th>Cytokeratin 5</th><th>Cytokeratin 7</th><th>Cytokeratin 8/18</th><th>E-Cadherin</th><th>Fibronectin</th><th>Her2</th><th>Vimentin</th><th>pan Cytokeratin</th></tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>BaselTMA_SP41_44_X2Y7_726</th><td>0.080173</td><td>0.010469</td><td>0.013425</td><td>0.149373</td><td>0.128329</td><td>1.577395</td><td>0.210581</td><td>0.661001</td><td>1.777394</td><td>2.028177</td><td>1.606446</td><td>0.803987</td><td>0.279106</td><td>4.026491</td></tr>\n",
 "<tr><th>BaselTMA_SP41_Liver_X2Y1_1909</th><td>0.051691</td><td>0.139216</td><td>0.061985</td><td>0.140193</td><td>0.208142</td><td>0.129807</td><td>0.117969</td><td>0.000000</td><td>1.211757</td><td>0.642365</td><td>0.943650</td><td>1.488154</td><td>0.000000</td><td>0.743843</td></tr>\n",
 "<tr><th>BaselTMA_SP41_231_X6Y6_10_798</th><td>0.000000</td><td>0.078386</td><td>0.144959</td><td>0.570016</td><td>0.158847</td><td>0.325296</td><td>0.129998</td><td>0.000000</td><td>0.166604</td><td>0.699211</td><td>2.967311</td><td>0.891388</td><td>1.845164</td><td>0.137033</td></tr>\n",
 "<tr><th>BaselTMA_SP41_141_X11Y2_4968</th><td>0.039043</td><td>0.028426</td><td>0.089894</td><td>0.089386</td><td>0.075023</td><td>0.294802</td><td>0.130921</td><td>0.105543</td><td>0.721960</td><td>1.462543</td><td>0.607401</td><td>0.847732</td><td>0.032395</td><td>2.527473</td></tr>\n",
 "<tr><th>BaselTMA_SP41_141_X11Y2_746</th><td>0.079079</td><td>0.184354</td><td>1.174959</td><td>0.297893</td><td>0.039844</td><td>0.177649</td><td>0.129131</td><td>0.056974</td><td>0.017406</td><td>0.391993</td><td>1.529043</td><td>1.196052</td><td>0.816324</td><td>0.145410</td></tr>\n",
 "<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
 "<tr><th>BaselTMA_SP42_25_X3Y2_1178</th><td>0.110930</td><td>0.022230</td><td>0.031842</td><td>0.076643</td><td>0.128341</td><td>0.845475</td><td>0.036395</td><td>0.143434</td><td>1.005111</td><td>1.917632</td><td>0.455979</td><td>0.994131</td><td>0.314468</td><td>2.820787</td></tr>\n",
 "<tr><th>BaselTMA_SP43_272_X11Y3_460</th><td>0.057730</td><td>0.124963</td><td>0.000000</td><td>0.000000</td><td>0.000000</td><td>0.137121</td><td>0.201029</td><td>0.000000</td><td>0.075655</td><td>0.804087</td><td>2.755348</td><td>0.286016</td><td>1.194052</td><td>0.000000</td></tr>\n",
 "<tr><th>BaselTMA_SP42_192_X8Y5_2214</th><td>0.053447</td><td>0.080335</td><td>0.031003</td><td>0.050570</td><td>0.095160</td><td>0.605501</td><td>0.156250</td><td>0.896955</td><td>1.221076</td><td>1.452352</td><td>0.302654</td><td>2.217377</td><td>0.687913</td><td>2.132569</td></tr>\n",
 "<tr><th>BaselTMA_SP41_203_X8Y8_2433</th><td>0.028729</td><td>0.030617</td><td>0.322813</td><td>1.180702</td><td>0.081800</td><td>0.182332</td><td>0.083335</td><td>0.037738</td><td>0.212589</td><td>1.547722</td><td>2.593907</td><td>0.265961</td><td>0.074016</td><td>1.381932</td></tr>\n",
 "<tr><th>BaselTMA_SP41_249_X3Y9_996</th><td>0.228180</td><td>0.117249</td><td>0.509954</td><td>1.180713</td><td>0.167626</td><td>0.107251</td><td>0.116069</td><td>0.026734</td><td>0.265320</td><td>0.502906</td><td>1.608899</td><td>0.579117</td><td>1.673233</td><td>1.438507</td></tr>\n",
 "</tbody>\n",
 "</table>\n",
 "<p>100 rows × 14 columns</p>\n",
 "</div>\n", "
" ], "text/plain": [ " CD20 CD3 CD45 CD68 \\\n", "BaselTMA_SP41_44_X2Y7_726 0.080173 0.010469 0.013425 0.149373 \n", "BaselTMA_SP41_Liver_X2Y1_1909 0.051691 0.139216 0.061985 0.140193 \n", "BaselTMA_SP41_231_X6Y6_10_798 0.000000 0.078386 0.144959 0.570016 \n", "BaselTMA_SP41_141_X11Y2_4968 0.039043 0.028426 0.089894 0.089386 \n", "BaselTMA_SP41_141_X11Y2_746 0.079079 0.184354 1.174959 0.297893 \n", "... ... ... ... ... \n", "BaselTMA_SP42_25_X3Y2_1178 0.110930 0.022230 0.031842 0.076643 \n", "BaselTMA_SP43_272_X11Y3_460 0.057730 0.124963 0.000000 0.000000 \n", "BaselTMA_SP42_192_X8Y5_2214 0.053447 0.080335 0.031003 0.050570 \n", "BaselTMA_SP41_203_X8Y8_2433 0.028729 0.030617 0.322813 1.180702 \n", "BaselTMA_SP41_249_X3Y9_996 0.228180 0.117249 0.509954 1.180713 \n", "\n", " Cytokeratin 14 Cytokeratin 19 Cytokeratin 5 \\\n", "BaselTMA_SP41_44_X2Y7_726 0.128329 1.577395 0.210581 \n", "BaselTMA_SP41_Liver_X2Y1_1909 0.208142 0.129807 0.117969 \n", "BaselTMA_SP41_231_X6Y6_10_798 0.158847 0.325296 0.129998 \n", "BaselTMA_SP41_141_X11Y2_4968 0.075023 0.294802 0.130921 \n", "BaselTMA_SP41_141_X11Y2_746 0.039844 0.177649 0.129131 \n", "... ... ... ... \n", "BaselTMA_SP42_25_X3Y2_1178 0.128341 0.845475 0.036395 \n", "BaselTMA_SP43_272_X11Y3_460 0.000000 0.137121 0.201029 \n", "BaselTMA_SP42_192_X8Y5_2214 0.095160 0.605501 0.156250 \n", "BaselTMA_SP41_203_X8Y8_2433 0.081800 0.182332 0.083335 \n", "BaselTMA_SP41_249_X3Y9_996 0.167626 0.107251 0.116069 \n", "\n", " Cytokeratin 7 Cytokeratin 8/18 E-Cadherin \\\n", "BaselTMA_SP41_44_X2Y7_726 0.661001 1.777394 2.028177 \n", "BaselTMA_SP41_Liver_X2Y1_1909 0.000000 1.211757 0.642365 \n", "BaselTMA_SP41_231_X6Y6_10_798 0.000000 0.166604 0.699211 \n", "BaselTMA_SP41_141_X11Y2_4968 0.105543 0.721960 1.462543 \n", "BaselTMA_SP41_141_X11Y2_746 0.056974 0.017406 0.391993 \n", "... ... ... ... 
\n", "BaselTMA_SP42_25_X3Y2_1178 0.143434 1.005111 1.917632 \n", "BaselTMA_SP43_272_X11Y3_460 0.000000 0.075655 0.804087 \n", "BaselTMA_SP42_192_X8Y5_2214 0.896955 1.221076 1.452352 \n", "BaselTMA_SP41_203_X8Y8_2433 0.037738 0.212589 1.547722 \n", "BaselTMA_SP41_249_X3Y9_996 0.026734 0.265320 0.502906 \n", "\n", " Fibronectin Her2 Vimentin \\\n", "BaselTMA_SP41_44_X2Y7_726 1.606446 0.803987 0.279106 \n", "BaselTMA_SP41_Liver_X2Y1_1909 0.943650 1.488154 0.000000 \n", "BaselTMA_SP41_231_X6Y6_10_798 2.967311 0.891388 1.845164 \n", "BaselTMA_SP41_141_X11Y2_4968 0.607401 0.847732 0.032395 \n", "BaselTMA_SP41_141_X11Y2_746 1.529043 1.196052 0.816324 \n", "... ... ... ... \n", "BaselTMA_SP42_25_X3Y2_1178 0.455979 0.994131 0.314468 \n", "BaselTMA_SP43_272_X11Y3_460 2.755348 0.286016 1.194052 \n", "BaselTMA_SP42_192_X8Y5_2214 0.302654 2.217377 0.687913 \n", "BaselTMA_SP41_203_X8Y8_2433 2.593907 0.265961 0.074016 \n", "BaselTMA_SP41_249_X3Y9_996 1.608899 0.579117 1.673233 \n", "\n", " pan Cytokeratin \n", "BaselTMA_SP41_44_X2Y7_726 4.026491 \n", "BaselTMA_SP41_Liver_X2Y1_1909 0.743843 \n", "BaselTMA_SP41_231_X6Y6_10_798 0.137033 \n", "BaselTMA_SP41_141_X11Y2_4968 2.527473 \n", "BaselTMA_SP41_141_X11Y2_746 0.145410 \n", "... ... \n", "BaselTMA_SP42_25_X3Y2_1178 2.820787 \n", "BaselTMA_SP43_272_X11Y3_460 0.000000 \n", "BaselTMA_SP42_192_X8Y5_2214 2.132569 \n", "BaselTMA_SP41_203_X8Y8_2433 1.381932 \n", "BaselTMA_SP41_249_X3Y9_996 1.438507 \n", "\n", "[100 rows x 14 columns]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a_loom.get_type_dataset().get_exprs_df()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. 
Loading from anndata " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can read in data from the [AnnData](https://anndata.readthedocs.io/en/stable/anndata.AnnData.html) format, along with a `yaml` file containing marker information, using the `from_anndata_yaml` function:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Astir object with 6 cell types, 4 cell states, and 10 cells.\n" ] } ], "source": [ "a_ann = ast.from_anndata_yaml(anndata_file=expression_anndata_path, marker_yaml=yaml_marker_path,\n", "                              protein_name=\"protein\", cell_name=\"cell_name\", batch_name=\"batch\")\n", "print(a_ann)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some notes:\n", "\n", "1. The protein and cell names are taken from `adata.var[protein_name]` and `adata.obs[cell_name]` respectively if specified, and from `adata.var_names` and `adata.obs_names` otherwise.\n", "\n", "2. If `batch_name` is specified, the corresponding column of `adata.var` is taken as the batch variable and turned into a design matrix.\n", "\n", "3. The yaml file at `yaml_marker_path` contains the marker dictionary, which maps protein features to cell type/state classes. Its format should match the description in the [marker dictionary](#mk) section.\n", "\n", "4. `from_anndata_yaml` returns an `Astir` object.\n", "\n", "5. `dtype` and `random_seed` are also customizable. `dtype` defaults to `torch.float64` and `random_seed` defaults to `1234`." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a_ann.get_type_dataset().get_exprs())" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>CD20</th><th>CD3</th><th>CD45</th><th>CD68</th><th>Cytokeratin 14</th><th>Cytokeratin 19</th><th>Cytokeratin 5</th><th>Cytokeratin 7</th><th>Cytokeratin 8/18</th><th>E-Cadherin</th><th>Fibronectin</th><th>Her2</th><th>Vimentin</th><th>pan Cytokeratin</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_1</th><td>0.168521</td><td>0.090277</td><td>0.271871</td><td>0.412439</td><td>0.087354</td><td>0.155710</td><td>0.100308</td><td>0.000000</td><td>0.096674</td><td>0.974271</td><td>2.867470</td><td>0.552905</td><td>2.335253</td><td>1.361075</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_2</th><td>0.366301</td><td>0.352614</td><td>0.284034</td><td>0.312862</td><td>0.152354</td><td>0.508728</td><td>0.028651</td><td>0.029904</td><td>0.749755</td><td>2.787740</td><td>2.174494</td><td>1.046198</td><td>0.285699</td><td>2.454543</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_3</th><td>0.177006</td><td>0.103808</td><td>0.150791</td><td>0.122472</td><td>0.292241</td><td>0.634366</td><td>0.090457</td><td>0.056627</td><td>0.446911</td><td>1.927940</td><td>2.997043</td><td>1.020517</td><td>2.887193</td><td>2.590460</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_4</th><td>0.304068</td><td>0.222802</td><td>0.219736</td><td>0.277622</td><td>0.373870</td><td>2.212514</td><td>0.304824</td><td>0.000000</td><td>1.904837</td><td>3.175959</td><td>1.598163</td><td>2.269974</td><td>0.877098</td><td>4.250308</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_5</th><td>0.137789</td><td>0.130010</td><td>0.105604</td><td>1.035280</td><td>0.212105</td><td>0.144144</td><td>0.074692</td><td>0.000000</td><td>0.000000</td><td>1.900182</td><td>2.326346</td><td>0.610897</td><td>2.882146</td><td>0.275225</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_6</th><td>0.182926</td><td>0.169596</td><td>0.270698</td><td>0.257178</td><td>0.224863</td><td>1.143546</td><td>0.189600</td><td>0.001542</td><td>0.650384</td><td>2.580153</td><td>1.891692</td><td>1.724237</td><td>1.931947</td><td>2.994441</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_7</th><td>0.239257</td><td>0.149007</td><td>0.351788</td><td>0.138080</td><td>0.142505</td><td>1.415104</td><td>0.124484</td><td>0.001245</td><td>1.091975</td><td>2.696699</td><td>1.994174</td><td>1.796137</td><td>0.127125</td><td>3.523499</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_8</th><td>0.175299</td><td>0.153332</td><td>0.215698</td><td>0.104709</td><td>0.237387</td><td>2.190369</td><td>0.264600</td><td>0.000000</td><td>1.457901</td><td>2.788996</td><td>1.859896</td><td>1.726696</td><td>0.106661</td><td>4.245234</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_9</th><td>0.210541</td><td>0.118273</td><td>0.146135</td><td>0.148164</td><td>0.362226</td><td>1.267224</td><td>0.173477</td><td>0.000000</td><td>0.842407</td><td>2.950440</td><td>1.852758</td><td>2.183716</td><td>0.957369</td><td>3.098247</td></tr>\n",
"    <tr><th>ZTMA208_slide_11_By5x8_10</th><td>0.308899</td><td>0.326121</td><td>0.224866</td><td>0.276182</td><td>0.140240</td><td>2.032473</td><td>0.334358</td><td>0.000000</td><td>1.503531</td><td>2.938590</td><td>2.192502</td><td>2.312838</td><td>1.337983</td><td>4.199266</td></tr>\n",
"  </tbody>\n", "</table>\n", "
" ], "text/plain": [ " CD20 CD3 CD45 CD68 \\\n", "ZTMA208_slide_11_By5x8_1 0.168521 0.090277 0.271871 0.412439 \n", "ZTMA208_slide_11_By5x8_2 0.366301 0.352614 0.284034 0.312862 \n", "ZTMA208_slide_11_By5x8_3 0.177006 0.103808 0.150791 0.122472 \n", "ZTMA208_slide_11_By5x8_4 0.304068 0.222802 0.219736 0.277622 \n", "ZTMA208_slide_11_By5x8_5 0.137789 0.130010 0.105604 1.035280 \n", "ZTMA208_slide_11_By5x8_6 0.182926 0.169596 0.270698 0.257178 \n", "ZTMA208_slide_11_By5x8_7 0.239257 0.149007 0.351788 0.138080 \n", "ZTMA208_slide_11_By5x8_8 0.175299 0.153332 0.215698 0.104709 \n", "ZTMA208_slide_11_By5x8_9 0.210541 0.118273 0.146135 0.148164 \n", "ZTMA208_slide_11_By5x8_10 0.308899 0.326121 0.224866 0.276182 \n", "\n", " Cytokeratin 14 Cytokeratin 19 Cytokeratin 5 \\\n", "ZTMA208_slide_11_By5x8_1 0.087354 0.155710 0.100308 \n", "ZTMA208_slide_11_By5x8_2 0.152354 0.508728 0.028651 \n", "ZTMA208_slide_11_By5x8_3 0.292241 0.634366 0.090457 \n", "ZTMA208_slide_11_By5x8_4 0.373870 2.212514 0.304824 \n", "ZTMA208_slide_11_By5x8_5 0.212105 0.144144 0.074692 \n", "ZTMA208_slide_11_By5x8_6 0.224863 1.143546 0.189600 \n", "ZTMA208_slide_11_By5x8_7 0.142505 1.415104 0.124484 \n", "ZTMA208_slide_11_By5x8_8 0.237387 2.190369 0.264600 \n", "ZTMA208_slide_11_By5x8_9 0.362226 1.267224 0.173477 \n", "ZTMA208_slide_11_By5x8_10 0.140240 2.032473 0.334358 \n", "\n", " Cytokeratin 7 Cytokeratin 8/18 E-Cadherin \\\n", "ZTMA208_slide_11_By5x8_1 0.000000 0.096674 0.974271 \n", "ZTMA208_slide_11_By5x8_2 0.029904 0.749755 2.787740 \n", "ZTMA208_slide_11_By5x8_3 0.056627 0.446911 1.927940 \n", "ZTMA208_slide_11_By5x8_4 0.000000 1.904837 3.175959 \n", "ZTMA208_slide_11_By5x8_5 0.000000 0.000000 1.900182 \n", "ZTMA208_slide_11_By5x8_6 0.001542 0.650384 2.580153 \n", "ZTMA208_slide_11_By5x8_7 0.001245 1.091975 2.696699 \n", "ZTMA208_slide_11_By5x8_8 0.000000 1.457901 2.788996 \n", "ZTMA208_slide_11_By5x8_9 0.000000 0.842407 2.950440 \n", "ZTMA208_slide_11_By5x8_10 0.000000 1.503531 2.938590 \n", 
"\n", " Fibronectin Her2 Vimentin pan Cytokeratin \n", "ZTMA208_slide_11_By5x8_1 2.867470 0.552905 2.335253 1.361075 \n", "ZTMA208_slide_11_By5x8_2 2.174494 1.046198 0.285699 2.454543 \n", "ZTMA208_slide_11_By5x8_3 2.997043 1.020517 2.887193 2.590460 \n", "ZTMA208_slide_11_By5x8_4 1.598163 2.269974 0.877098 4.250308 \n", "ZTMA208_slide_11_By5x8_5 2.326346 0.610897 2.882146 0.275225 \n", "ZTMA208_slide_11_By5x8_6 1.891692 1.724237 1.931947 2.994441 \n", "ZTMA208_slide_11_By5x8_7 1.994174 1.796137 0.127125 3.523499 \n", "ZTMA208_slide_11_By5x8_8 1.859896 1.726696 0.106661 4.245234 \n", "ZTMA208_slide_11_By5x8_9 1.852758 2.183716 0.957369 3.098247 \n", "ZTMA208_slide_11_By5x8_10 2.192502 2.312838 1.337983 4.199266 " ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a_ann.get_type_dataset().get_exprs_df()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }