transforms

Transform helpers and accessors for genome track plotting.

This module provides small utilities used by track plotting, including array transforms that preserve xarray objects, dataset accessors, and clustering helpers.

class mutopia.plot.track_plot.transforms.TopographyTransformer(mutation_order=('C>G', 'C>A', 'T>A', 'T>C', 'T>G', 'C>T'), data_key='predicted_marginal')

Bases: object

Transformer for mutational topography matrices.

Fetches and standardizes a configuration × mutation × context matrix from a dataset, computes an informative row ordering, and provides labels for grouped mutation categories.

fit(dataset)

property labels

transform(dataset)

mutopia.plot.track_plot.transforms.apply_rows(fn)

Create a function that applies an operation along rows (axis=1).

Parameters:

fn (callable) – Function to apply to each row

Returns:

Function that applies fn along axis 1

Return type:

callable
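
The row-wise wrapping described above can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: `apply_rows_sketch` is a hypothetical stand-in, and it assumes the wrapped function maps a 1D row to a 1D row. The real helper may also preserve xarray objects, which this sketch does not attempt.

```python
import numpy as np


def apply_rows_sketch(fn):
    """Hypothetical stand-in for apply_rows: wrap fn so it runs on each row."""

    def wrapper(x):
        # axis=1 means fn receives one 1D row of x at a time
        return np.apply_along_axis(fn, 1, x)

    return wrapper


# Normalize each row so that it sums to 1
row_renorm = apply_rows_sketch(lambda row: row / row.sum())
counts = np.array([[1.0, 3.0], [2.0, 2.0]])
normalized = row_renorm(counts)  # [[0.25, 0.75], [0.5, 0.5]]
```

Returning a wrapped function rather than a result lets the helper slot directly into a chain of single-argument transforms.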

mutopia.plot.track_plot.transforms.clip(min_quantile=0.0, max_quantile=1.0)

Create a clipping function based on quantiles.

Parameters:
  • min_quantile (float, default 0.0) – Lower quantile for clipping (0-1)

  • max_quantile (float, default 1.0) – Upper quantile for clipping (0-1)

Returns:

Function that clips input arrays to specified quantiles

Return type:

callable
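
One way to picture quantile-based clipping is the NumPy sketch below. `clip_sketch` is a hypothetical name, and the sketch assumes the quantiles are computed over the whole input array; whether the library computes them globally or per row, and how it treats NaN values, is not stated here.

```python
import numpy as np


def clip_sketch(min_quantile=0.0, max_quantile=1.0):
    """Hypothetical stand-in for clip: build a quantile-based clipping function."""

    def clipper(x):
        # Turn the two quantiles into concrete lower and upper bounds
        lo, hi = np.quantile(x, [min_quantile, max_quantile])
        return np.clip(x, lo, hi)

    return clipper


signal = np.array([0.0, 1.0, 2.0, 3.0, 100.0])
clipped = clip_sketch(0.0, 0.75)(signal)  # the outlier 100.0 is capped at 3.0
```

Clipping by quantile rather than by fixed value keeps a few extreme loci from dominating the color or height scale of a track.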

mutopia.plot.track_plot.transforms.feature_matrix(*feature_names, source=None)

Accessor function to retrieve multiple features from a dataset as a matrix.

This function creates an accessor that extracts multiple features from a dataset and stacks them into a 2D matrix with features as rows and loci as columns. If no feature names are provided, it automatically selects all numeric features from the dataset.

Parameters:
  • *feature_names (str or iterable) – Names of the features to access. Can be:
      - Multiple string arguments: feature_matrix("feat1", "feat2", "feat3")
      - A single iterable: feature_matrix(["feat1", "feat2", "feat3"])
      - Empty: automatically selects all numeric features

  • source (str, optional) – Feature source or namespace passed through to fetch_features.

Returns:

Function that retrieves the specified features from the dataset and returns them as a DataArray with dimensions (feature, locus). If only one feature is selected, the ‘feature’ dimension is squeezed.

Return type:

callable

Examples

>>> get_features = feature_matrix("gc_content", "cpg_density")
>>> matrix = get_features(dataset)  # Shape: (2, n_loci)
>>> get_all_features = feature_matrix()
>>> all_matrix = get_all_features(dataset)  # All numeric features

mutopia.plot.track_plot.transforms.minmax_scale(x)

Scale array to [0, 1] using min-max normalization.

Parameters:

x (ndarray) – Input array.

Returns:

Rescaled array with values in [0, 1].

Return type:

ndarray
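
Min-max normalization maps the smallest value to 0 and the largest to 1. The sketch below is a hypothetical stand-in (`minmax_scale_sketch`), assuming the scaling is computed over the whole array. It does not guard against constant input, where the range is zero; how the library handles that case is not documented here.

```python
import numpy as np


def minmax_scale_sketch(x):
    """Hypothetical stand-in for minmax_scale: rescale x to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    # Shift so the minimum is 0, then divide by the range so the maximum is 1
    return (x - lo) / (hi - lo)


scaled = minmax_scale_sketch([2.0, 4.0, 6.0])  # [0.0, 0.5, 1.0]
```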

mutopia.plot.track_plot.transforms.passthrough(data)

Create a passthrough function that returns input data unchanged.

This function creates a closure that ignores any arguments passed to it and always returns the original data object. Useful in data processing pipelines where certain steps should be bypassed.

Parameters:

data (any) – Input data to be returned unchanged by the generated function

Returns:

Function that accepts any arguments but always returns the original data

Return type:

callable
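
The closure described above can be sketched in a few lines. `passthrough_sketch` is a hypothetical stand-in written from the description, not the library's source.

```python
def passthrough_sketch(data):
    """Hypothetical stand-in for passthrough: always return the captured data."""

    def accessor(*args, **kwargs):
        # Arguments are accepted for interface compatibility and then ignored
        return data

    return accessor


precomputed = [0.1, 0.2, 0.3]
get_track = passthrough_sketch(precomputed)
result = get_track("any dataset")  # the same list object as precomputed
```

This is handy when a plotting function expects an accessor that takes a dataset, but the values to plot were computed ahead of time.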

mutopia.plot.track_plot.transforms.pipeline(*fns)

Create a data processing pipeline from a sequence of functions.

This function composes multiple functions into a single function that applies them in sequence, passing the output of each function to the next as its input.

Parameters:

*fns (callable) – Variable number of functions to compose into a pipeline. Each function should accept one argument (the data) and return the transformed data for the next function.

Returns:

Composed function that applies all input functions in sequence from first to last

Return type:

callable
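
The first-to-last composition order can be sketched with `functools.reduce`. `pipeline_sketch` is a hypothetical stand-in, assuming each step takes exactly one argument, as the parameter description requires.

```python
from functools import reduce


def pipeline_sketch(*fns):
    """Hypothetical stand-in for pipeline: compose fns from first to last."""

    def run(data):
        # Feed the running value through each function in order
        return reduce(lambda value, fn: fn(value), fns, data)

    return run


add_one = lambda v: v + 1
double = lambda v: v * 2
process = pipeline_sketch(add_one, double)
result = process(3)  # (3 + 1) * 2 == 8
```

Order matters: `pipeline_sketch(double, add_one)(3)` evaluates to 7, because the doubling then runs first.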

Examples

>>> import numpy as np
>>> normalize = lambda x: x / x.max()
>>> log_transform = lambda x: np.log(x + 1)
>>> process = pipeline(normalize, log_transform)
>>> result = process(data)

mutopia.plot.track_plot.transforms.renorm(x)

Renormalize array to sum to 1.

Parameters:

x (array-like) – Input array

Returns:

Normalized array that sums to 1

Return type:

array-like
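
Renormalization divides every element by the total. The sketch below uses a hypothetical `renorm_sketch` and assumes the sum is taken over the entire array; whether the library normalizes along a particular axis is not stated here.

```python
import numpy as np


def renorm_sketch(x):
    """Hypothetical stand-in for renorm: rescale x so its elements sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()


weights = renorm_sketch([1.0, 1.0, 2.0])  # [0.25, 0.25, 0.5]
```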

mutopia.plot.track_plot.transforms.reorder_df(df)

Reorder a DataFrame’s rows using the optimal leaf ordering from hierarchical clustering.

Parameters:

df (pandas.DataFrame) – Input DataFrame with numeric values.

Returns:

Reordered DataFrame according to optimal leaf ordering.

Return type:

pandas.DataFrame
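
Row reordering by optimal leaf ordering can be sketched with SciPy. `reorder_df_sketch` is a hypothetical stand-in; the average linkage method and Euclidean metric are assumptions chosen for illustration, since the settings the library uses are not documented here.

```python
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage


def reorder_df_sketch(df):
    """Hypothetical stand-in for reorder_df: order rows by clustering leaves."""
    # optimal_ordering=True makes SciPy reorder the linkage so that
    # neighboring leaves are as similar as possible
    Z = linkage(df.to_numpy(), method="average", metric="euclidean",
                optimal_ordering=True)
    # leaves_list gives the row positions in left-to-right dendrogram order
    return df.iloc[leaves_list(Z)]


df = pd.DataFrame(
    [[0.0, 0.0], [10.0, 10.0], [0.1, 0.1], [10.1, 10.1]],
    index=["a", "b", "c", "d"],
)
ordered = reorder_df_sketch(df)  # similar rows (a, c) and (b, d) become adjacent
```

Placing similar rows next to each other makes block structure visible when the matrix is drawn as a heatmap.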

mutopia.plot.track_plot.transforms.select(var_name, **sel)

Create an accessor function to extract variables from datasets.

This function creates a closure that extracts a specific variable from a dataset and optionally applies selection criteria. The extracted variable is transposed to ensure ‘locus’ is the last dimension.

Parameters:
  • var_name (str) – Name of the variable to access from the dataset

  • **sel (dict) – Additional selection criteria passed to the .sel() method. Keys are dimension names and values are the labels to select along each dimension.

Returns:

Function that takes a dataset and returns the specified variable with ‘locus’ as the last dimension

Return type:

callable
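
The select-then-transpose behavior can be sketched with xarray. `select_sketch` and the toy dataset are hypothetical, and the sketch assumes the dataset behaves like an `xarray.Dataset` whose variables carry a `locus` dimension.

```python
import numpy as np
import xarray as xr


def select_sketch(var_name, **sel):
    """Hypothetical stand-in for select: fetch a variable with locus last."""

    def accessor(dataset):
        # "..." keeps the other dimensions in their order and moves locus last
        return dataset[var_name].sel(**sel).transpose(..., "locus")

    return accessor


# Toy dataset: 3 loci, 2 contexts, 2 samples
toy = xr.Dataset(
    {"Features/gc_content": (("locus", "context", "sample"),
                             np.arange(12.0).reshape(3, 2, 2))},
    coords={"sample": [0, 1]},
)
gc = select_sketch("Features/gc_content", sample=0)(toy)
# gc.dims is ("context", "locus"): sample was selected away, locus moved last
```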

Examples

>>> get_feature = select("Features/gc_content", sample=0)
>>> feature_data = get_feature(dataset)