{ "cells": [ { "cell_type": "markdown", "id": "bc42e37d", "metadata": {}, "source": [ "# Tutorial 1: Building G-Tensors\n", "\n", "This tutorial demonstrates how to build a genomic tensor (G-Tensor) from scratch using MuTopia.\n", "\n", "## Overview\n", "\n", "A G-Tensor is a multi-dimensional data structure that integrates various genomic features and annotations with mutational topography data. This tutorial will walk you through:\n", "\n", "1. **Data Preparation**: Downloading and preprocessing genomic annotations\n", "2. **Configuration**: Setting up a YAML configuration template\n", "3. **G-Tensor Construction**: Building the tensor object and ingesting data\n", "\n", "## Prerequisites\n", "\n", "Before starting this tutorial, ensure you have:\n", "- MuTopia package installed\n", "- Download the pre-compiled data to the `tutorial_data` directory\n", "- Download or symlink the hg38 reference genome to `tutorial_data/hg38.fa.gz`\n", "\n", "## 1. Download and preprocess feature annotations\n", "\n", "The first step in building a G-Tensor is to gather genomic annotations that will serve as features for our model. We'll download:\n", "\n", "- **Gene expression data** from ENCODE for Lung cells\n", "- **ATAC-seq peaks** representing chromatin accessibility\n", "- **Histone modification data** (H3K27me3) for epigenetic state\n", "\n", "A real attempt at modeling would require many more features, usually, but this will suffice for a demo.\n", "\n", "### 1.1 Download gene expression data\n", "\n", "First, we'll download RNA-seq quantification data from ENCODE and convert it into a format suitable for our G-Tensor. The `download_expression` function downloads TSV files and extracts gene coordinates and expression values - we use transcripts per million (called pme_TPM by ENCODE)." 
] }, { "cell_type": "code", "execution_count": 9, "id": "856c0218", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO: Mutopia:Downloading GTF file: MANE.GRCh38.v1.3.ensembl_genomic.gtf ...\n", "INFO: Mutopia:Creating annotation file: MANE.GRCh38.annotation.bed ...\n", "INFO: Mutopia:Writing quantitation file: <_io.TextIOWrapper name='' mode='w' encoding='utf-8'> ...\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "chr1\t65418\t71584\tENSG00000186092\t0.03\t+\n", "chr1\t450739\t451677\tENSG00000284733\t0.06\t-\n", "chr1\t685715\t686653\tENSG00000284662\t0.06\t-\n" ] } ], "source": [ "%%bash\n", "function download_expression {\n", " local url=\"${1}\"\n", "\n", " ( set -o pipefail\n", " curl -fsSL \"$url\" \\\n", " | awk -F '\\t' 'NR>1 { print $1 \"\\t\" $10 }' \\\n", " | gtensor utils make-expression-bedfile\n", " )\n", "}\n", "\n", "mkdir -p data\n", "download_expression \"https://www.encodeproject.org/files/ENCFF769IPE/@@download/ENCFF769IPE.tsv\" > data/LungExpression.bed\n", "head -n3 data/LungExpression.bed" ] }, { "cell_type": "code", "execution_count": 10, "id": "f589ed75", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "chr1\t42005\t42399\tPeak\n", "chr1\t778532\t778983\tPeak\n", "chr1\t817239\t817623\tPeak\n" ] } ], "source": [ "%%bash\n", "function download_atac {\n", " url=$1\n", " wget $url -O - 2> /dev/null \\\n", " | gzip -dc \\\n", " | sort -k1,1 -k2,2n \\\n", " | awk -v OFS=\"\\t\" '{print $1,$2,$3,\"Peak\"}'\n", "}\n", "\n", "download_atac \"https://www.encodeproject.org/files/ENCFF818EYD/@@download/ENCFF818EYD.bed.gz\" > \"data/LungAtac.bed\"\n", "head -n3 \"data/LungAtac.bed\"" ] }, { "cell_type": "markdown", "id": "a55a2b88", "metadata": {}, "source": "## 2. Specify a configuration template\n\nNow we'll create a YAML configuration template that defines how our G-Tensor should be structured. 
This template uses Jinja2 templating to make it flexible and reusable.\n\nThe configuration specifies:\n- **Data type**: SBS\n- **Reference files**: Genome sizes, blacklisted regions, and FASTA sequence\n- **Features**: Different types of genomic annotations with their normalization methods\n- **Region size**: Size of genomic windows (10 kb in this example)\n\nThere are many different normalization strategies for the features, depending on the input data format:\n- [Bed]\n - categorical: A discrete region of the genome that is modeled using the nonlinear GBT estimators\n - mesoscale: A discrete region of the genome that is allowed to influence the signature distributions\n - strand: A discrete region of the genome whose feature values are either \"+\" or \"-\", marking the strand orientation of a genomic element\n- [Bed, Bedgraph, BigWig]\n - log1p_cpm: log1p-transformed counts per million\n - standardize: z-score normalization\n - quantile: uniform quantile normalization\n - power: variance-stabilizing power transformation\n - robust: robust z-score, which trims outliers\n\nFinally, the `gex` normalization is a little bit special. It must be applied when you're working with expression-style data since it normalizes the data per gene rather than across loci. 
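To make that concrete, here is a toy sketch (not MuTopia's actual `gex` implementation) of per-gene normalization: z-scoring along the gene axis puts a lowly and a highly expressed gene on the same scale across cell types.

```python
import numpy as np

# Toy expression matrix: rows = genes, columns = cell types (made-up values)
expr = np.array([
    [1.0, 2.0, 3.0],        # lowly expressed gene
    [100.0, 200.0, 300.0],  # highly expressed gene
])

# Per-gene z-score (axis=1): each gene is centered and scaled across
# cell types, so the 100x difference in absolute expression drops out
per_gene = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)

print(per_gene)  # both rows become identical z-score profiles
```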
If you don't specify `gex` for expression data, the values won't be comparable across cell types.\n" }, { "cell_type": "code", "execution_count": 11, "id": "e5423dbe", "metadata": {}, "outputs": [], "source": [ "import mutopia.analysis as mu\n", "from mutopia.utils import fill_jinja_template\n", "\n", "template=\"\"\"\n", "name: {{name}}\n", "dtype: SBS\n", "chromsizes: {{chromsizes}}\n", "blacklist: {{blacklist}}\n", "fasta: {{fasta}}\n", "region_size: {{region_size}}\n", "\n", "features:\n", " GCContent:\n", " normalization: standardize\n", " sources:\n", " - {{ files[\"GCContent\"] }}\n", "\n", " Transcription:\n", " normalization: gex\n", " column: 5\n", " sources: \n", " - {{ files[\"Transcription\"] }}\n", "\n", " GeneStrand:\n", " normalization: strand\n", " column: 6\n", " sources:\n", " - {{ files[\"Transcription\"] }}\n", "\n", " ATAC_accessible:\n", " normalization: categorical\n", " sources:\n", " - {{ files[\"ATAC_accessible\"] }}\n", "\n", " H3K27ac:\n", " normalization: log1p_cpm\n", " sources:\n", " - {{ files[\"H3K27me3\"] }}\n", "\"\"\"" ] }, { "cell_type": "markdown", "id": "c888ebdd", "metadata": {}, "source": "To configure a feature, the basic format is:\n```\nfeatures:\n  <feature_name>:\n    normalization: <normalization>\n    column: <column_number>\n    null: <null_value>\n    group: <group_name>\n    classes: <class_ordering>\n    sources:\n      - <file_or_url>\n```\nWhere:\n* `normalization` is one of the strategies outlined above.\n* `column` applies to bed files and specifies which column to extract the data from.\n* `null` indicates which value corresponds to no data in a bed or bedGraph file.\n* `classes` specifies a class ordering for the categories found in a bed file.\n* `group` sets exclusive interaction groups between features; by default, no grouping is applied.\n* `sources` is a list of files/URLs.\n\n## 3. Fill the template using annotations\n\nNow we'll populate our template with actual file paths and configuration values. 
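If you're curious what the filling step amounts to, it's ordinary Jinja2 rendering. A standalone sketch with plain `jinja2` (assuming `fill_jinja_template` behaves like `Template.render`; the mini-template here is illustrative, not the full config):

```python
from jinja2 import Template

# Illustrative mini-template mirroring the structure of the config above
mini_template = """name: {{name}}
region_size: {{region_size}}
sources:
  - {{ files["Transcription"] }}"""

rendered = Template(mini_template).render(
    name="example_gtensor",
    region_size=10_000,
    files={"Transcription": "data/LungExpression.bed"},
)
print(rendered)
```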
The `fill_jinja_template` function takes our template and replaces the placeholders with real values.\nI like to keep a single template and fill it with per-cell-type values, so I can dynamically generate config files for different scenarios.\n\n**Note**: If you don't have to preprocess your data in any way (e.g., it's already available as a bigWig, ready to ingest), you can simply provide a URL! MuTopia will download it for you." }, { "cell_type": "code", "execution_count": 12, "id": "50dfc0b1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "name: example_gtensor\n", "dtype: SBS\n", "chromsizes: tutorial_data/hg38.mainchroms.sizes\n", "blacklist: tutorial_data/hg38-blacklist.v2.sorted.bed.gz\n", "fasta: tutorial_data/hg38.fa\n", "region_size: 10000\n", "\n", "features:\n", " GCContent:\n", " normalization: standardize\n", " sources:\n", " - tutorial_data/at_gc_ratio.hg38.bedgraph.gz\n", "\n", " Transcription:\n", " normalization: gex\n", " column: 5\n", " sources: \n", " - data/LungExpression.bed\n", "\n", " GeneStrand:\n", " normalization: strand\n", " column: 6\n", " sources:\n", " - data/LungExpression.bed\n", "\n", " ATAC_accessible:\n", " normalization: categorical\n", " sources:\n", " - data/LungATAC.bed\n", "\n", " H3K27ac:\n", " normalization: log1p_cpm\n", " sources:\n", " - https://www.encodeproject.org/files/ENCFF440HPT/@@download/ENCFF440HPT.bigWig\n" ] } ], "source": [ "from mutopia.utils import fill_jinja_template\n", "\n", "config = fill_jinja_template(\n", " template,\n", " name=\"example_gtensor\",\n", " chromsizes=\"tutorial_data/hg38.mainchroms.sizes\",\n", " fasta=\"tutorial_data/hg38.fa\",\n", " blacklist=\"tutorial_data/hg38-blacklist.v2.sorted.bed.gz\",\n", " region_size=10_000,\n", " files={\n", " \"GCContent\": \"tutorial_data/at_gc_ratio.hg38.bedgraph.gz\",\n", " \"Transcription\": \"data/LungExpression.bed\",\n", " \"ATAC_accessible\": \"data/LungATAC.bed\",\n", " \"H3K27me3\" : 
\"https://www.encodeproject.org/files/ENCFF440HPT/@@download/ENCFF440HPT.bigWig\",\n", " }\n", ")\n", "\n", "print(config)" ] }, { "cell_type": "code", "execution_count": 13, "id": "8ee088b6", "metadata": {}, "outputs": [], "source": [ "with open(\"example_gtensor.yaml\", \"w\") as f:\n", " f.write(config)" ] }, { "cell_type": "markdown", "id": "7496b7c3", "metadata": {}, "source": [ "Save the configuration to a YAML file that we'll use in the next step:\n", "\n", "## 4. Build the G-Tensor object and ingest the annotations\n", "\n", "With our configuration ready, we can now build the G-Tensor using the `gtensor compose` command. This step:\n", "\n", "1. **Creates the tensor structure**: Sets up the multi-dimensional array\n", "2. **Downloads remote files**: Fetches any BigWig or other remote annotations\n", "3. **Ingests features**: Ingests features using the correct paths depending on the filetype and normalization strategy\n", "\n", "The `-w 1` flag specifies using 1 worker process. For larger datasets, you can increase this number to parallelize the work.\n", "This command runs a luigi command in the background which can resume the pipeline upon any failures.\n", "\n" ] }, { "cell_type": "code", "execution_count": 14, "id": "2727fa4d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/allen/miniconda3/envs/my-mutopia/lib/python3.11/site-packages/requests/__init__.py:86: RequestsDependencyWarning: Unable to find acceptable character detection dependency (chardet or charset_normalizer).\n", " warnings.warn(\n", "INFO:luigi-interface:Loaded []\n", "INFO:luigi:logging configured by default settings\n", "INFO: Mutopia:Creating regions with cutouts for: GeneStrand, ATAC_accessible, Transcription\n", "INFO: Mutopia:Creating genomic regions ...\n", "INFO: Mutopia:Using chromosomes: \n", "\tchr1, chr2, chr3, chr4,\n", "\tchr5, chr6, chr7, chr8,\n", "\tchr9, chr11, chr10, chr12,\n", "\tchr13, chr14, chr15, chr16,\n", "\tchr17, 
chr18, chr20, chr19,\n", "\tchr22, chr21\n", "INFO: Mutopia:Making initial coarse-grained regions ...\n", "INFO: Mutopia:Building regions ...\n", "INFO: Mutopia:Wrote 25000 windows ...\n", "INFO: Mutopia:Wrote 50000 windows ...\n", "INFO: Mutopia:Wrote 75000 windows ...\n", "INFO: Mutopia:Wrote 100000 windows ...\n", "INFO: Mutopia:Wrote 125000 windows ...\n", "INFO: Mutopia:Wrote 150000 windows ...\n", "INFO: Mutopia:Wrote 175000 windows ...\n", "INFO: Mutopia:Wrote 200000 windows ...\n", "INFO: Mutopia:Wrote 225000 windows ...\n", "INFO: Mutopia:Wrote 250000 windows ...\n", "INFO: Mutopia:Wrote 275000 windows ...\n", "INFO: Mutopia:Wrote 300000 windows ...\n", "INFO: Mutopia:Wrote 325000 windows ...\n", "INFO: Mutopia:Wrote 350000 windows ...\n", "Window size report\n", "----------------------\n", "Num windows | 353685\n", "Smallest | 101\n", "Largest | 10000 \n", "Quantile=0.1 | 420\n", "Quantile=0.25 | 5174\n", "Quantile=0.5 | 10000\n", "Quantile=0.75 | 10000\n", "Quantile=0.9 | 10000\n", "INFO: Mutopia:Calculating context frequencies ...\n", "Aggregating trinucleotide content: 353685it [08:47, 670.02it/s] \n", "INFO: Mutopia:Formatting G-Tensor ...\n", "INFO: Mutopia:Writing G-Tensor to example_gtensor.nc, and accompanying regions file to example_gtensor.nc.regions.bed ...\n", "Make sure not to leave the regions file behind if you move the G-Tensor!.\n", "INFO: Mutopia:Ingesting data from tutorial_data/at_gc_ratio.hg38.bedgraph.gz ...\n", "INFO: Mutopia:First values of feature GCContent: 0.35, 0.31, 0.31, 0.31, 0.23\n", "First non-null values: 0.35, 0.31, 0.31, 0.31, 0.23\n", "INFO: Mutopia:Added feature: GCContent\n", "INFO: Mutopia:Ingesting data from downloads/ENCFF440HPT.bigWig ...\n", "INFO: Mutopia:First values of feature H3K27ac: 0.07, 0.08, 0.07, 0.14, 0.05\n", "First non-null values: 0.07, 0.08, 0.07, 0.14, 0.05\n", "INFO: Mutopia:Added feature: H3K27ac\n", "INFO: Mutopia:Ingesting data from data/LungExpression.bed ...\n", "INFO: Mutopia:First values 
of feature Transcription: nan, nan, nan, nan, nan\n", "First non-null values: 2.87, 2.87, 2.87, 2.87, 2.87\n", "INFO: Mutopia:Added feature: Transcription\n", "INFO: Mutopia:Ingesting data from data/LungATAC.bed ...\n", "INFO: Mutopia:Found classes: \"None\", \"Peak\"\n", "INFO: Mutopia:Added feature: ATAC_accessible\n" ] } ], "source": [ "!gtensor compose example_gtensor.yaml -w 1" ] }, { "cell_type": "markdown", "id": "6b5c12bd", "metadata": {}, "source": [ "**Note:** Make sure the command above runs to completion; you can check by executing it again and confirming that it does not try to launch additional tasks." ] }, { "cell_type": "code", "execution_count": 15, "id": "c27e456d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/allen/miniconda3/envs/my-mutopia/lib/python3.11/site-packages/requests/__init__.py:86: RequestsDependencyWarning: Unable to find acceptable character detection dependency (chardet or charset_normalizer).\n", " warnings.warn(\n", "INFO:luigi-interface:Loaded []\n", "INFO:luigi:logging configured by default settings\n" ] } ], "source": [ "!gtensor compose example_gtensor.yaml -w 1" ] }, { "cell_type": "markdown", "id": "bfa84071", "metadata": {}, "source": [ "## 5. Ingest samples\n", "\n", "Now we'll add some samples to our G-Tensor. The `gtensor sample add` command processes VCF/BCF files and extracts all of the relevant information from each mutation. Before you start, you need to organize some metadata:\n", "1. For each mutation in each VCF, you must add an `INFO` field for the mutation's apparent VAF. This is simply calculated as: (reads supporting variant)/(total reads). We would calculate this internally, but due to the heterogeneity in how different groups store this information, we leave it to the user.\n", "2. The background mutation rate: this is used as a baseline rate for discovering mutation clusters. 
It need not be tailored to the tissue; the one we provide works fine.\n", "3. (Optional) The purity of each sample (if working with tumors). If you have this on hand you should use it, but it's not necessary.\n", "\n", "Add a sample with `gtensor sample add`, making sure to give each sample a unique ID, and to set the \"sample-weight\" to the reciprocal of purity:\n", "$\\text{sample weight} = \\frac{1}{\\text{purity}}$\n", "\n", "```bash\n", "\n", "vcf_path=\"path/to/your/vcf-or-bcf\"\n", "reciprocal_purity=1.25 # e.g. for a sample with 80% purity\n", "\n", "gtensor sample add \\\n", " -id \"example.001\" \\\n", " --sample-weight \"$reciprocal_purity\" \\\n", " --mutation-rate-file \"tutorial_data/mutation-rate.hg38.bedgraph.gz\" \\\n", " --pass-only \\\n", " example_gtensor.nc \\\n", " \"$vcf_path\"\n", "```" ] }, { "cell_type": "markdown", "id": "fad07aba", "metadata": {}, "source": [ "Let's examine the structure and contents of our newly created G-Tensor:\n" ] }, { "cell_type": "code", "execution_count": 17, "id": "1fe13af4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Num features: 5\n", "Num samples: 0\n", "Dataset name: example_gtensor\n", "Dataset dimensions:\n", "\tconfiguration: 2\n", "\tcontext: 96\n", "\tlocus: 353685\n", "\tclustered: 2\n", "Dataset attributes:\n", "\tdtype: SBS\n", "\tregions_file: example_gtensor.nc.regions.bed\n", "\tgenome_file: tutorial_data/hg38.mainchroms.sizes\n", "\tfasta_file: tutorial_data/hg38.fa\n", "\tblacklist_file: tutorial_data/hg38-blacklist.v2.sorted.bed.gz\n", "\tregion_size: 10000\n", "\tcoordinates: mutation\n" ] } ], "source": [ "!gtensor info example_gtensor.nc" ] }, { "cell_type": "markdown", "id": "58ed2168", "metadata": {}, "source": [ "You can check out the features you ingested with:" ] }, { "cell_type": "code", "execution_count": 18, "id": "d4771c68", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Feature name | normalization | 
group\n", "----------------+---------------+------\n", "GCContent | standardize | all \n", "H3K27ac | log1p_cpm | all \n", "Transcription | gex | all \n", "ATAC_accessible | categorical | all \n", "GeneStrand | strand | all \n" ] } ], "source": [ "!gtensor feature ls example_gtensor.nc" ] }, { "cell_type": "markdown", "id": "c5f2a90a", "metadata": {}, "source": "## 6. Manipulating G-Tensors\n\nThe G-Tensor CLI contains many useful functionalities for editing and munging G-Tensors from the terminal. The CLI structure is hierarchical and (hopefully) self-documenting. \n\\\nFor samples, offsets - more on this later, and features, you can view information, add, remove, and edit:" }, { "cell_type": "code", "execution_count": 19, "id": "6962bc99", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage: gtensor feature [OPTIONS] COMMAND [ARGS]...\n", "\n", " Manage features in G-Tensor datasets.\n", "\n", " Features represent genomic annotations or experimental data values\n", " associated with each genomic region in the G-Tensor.\n", "\n", "Options:\n", " --help Show this message and exit.\n", "\n", "Commands:\n", " add Add features to a G-Tensor\n", " edit Edit feature attributes\n", " ls List features in a G-Tensor\n", " rm Remove features from a G-Tensor\n" ] } ], "source": [ "!gtensor feature --help" ] }, { "cell_type": "markdown", "id": "6fe8bdb0", "metadata": {}, "source": [ "You can also slice and subset G-Tensors using the `slice` commands:" ] }, { "cell_type": "code", "execution_count": 20, "id": "bd6508c7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage: gtensor slice [OPTIONS] COMMAND [ARGS]...\n", "\n", "Options:\n", " --help Show this message and exit.\n", "\n", "Commands:\n", " regions Extract specific genomic regions from a G-Tensor dataset.\n", " samples Extract specific samples from a G-Tensor dataset into a new file.\n" ] } ], "source": [ "!gtensor slice --help" ] }, { 
"cell_type": "markdown", "id": "558c5388", "metadata": {}, "source": [ "**Check out all the functionalities using:**" ] }, { "cell_type": "code", "execution_count": 21, "id": "b87e77d1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage: gtensor [OPTIONS] COMMAND [ARGS]...\n", "\n", " Manage G-Tensor datasets for genomic data analysis.\n", "\n", " G-Tensors are structured genomic datasets that organize samples, features,\n", " and observations across genomic loci for machine learning applications.\n", "\n", "Options:\n", " --help Show this message and exit.\n", "\n", "Commands:\n", " compose Run the G-Tensor construction pipeline\n", " convert Convert a G-Tensor to a different modality\n", " create Create a new G-Tensor\n", " feature Manage features in a G-Tensor\n", " info Display information about a G-Tensor\n", " offsets Manage genomic locus exposure offsets\n", " sample Manage samples in a G-Tensor\n", " set-attr Set attributes on a G-Tensor\n", " slice Slice a G-Tensor by samples or regions\n", " split Split a G-Tensor into training and test sets\n", " utils Utility functions for G-Tensors\n" ] } ], "source": [ "!gtensor --help" ] } ], "metadata": { "kernelspec": { "display_name": "my-mutopia", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.11" } }, "nbformat": 4, "nbformat_minor": 5 }