transforms
Transform helpers and accessors for genome track plotting.
This module provides small utilities used by track plotting, including array transforms that preserve xarray objects, dataset accessors, and clustering helpers.
- class mutopia.plot.track_plot.transforms.TopographyTransformer(mutation_order=('C>G', 'C>A', 'T>A', 'T>C', 'T>G', 'C>T'), data_key='predicted_marginal')
Bases: object
Transformer for mutational topography matrices.
Fetches and standardizes a configuration x mutation x context matrix from a dataset, computes an informative row ordering, and provides labels for grouped mutation categories.
- property labels
- mutopia.plot.track_plot.transforms.apply_rows(fn)
Create a function that applies an operation along rows (axis=1).
- Parameters:
fn (callable) – Function to apply to each row
- Returns:
Function that applies fn along axis 1
- Return type:
callable
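The row-wise wrapping described above can be sketched with NumPy. This is only an illustration under assumptions: `apply_rows_sketch` is a hypothetical stand-in, not the library's implementation, and it assumes a plain 2D ndarray rather than an xarray object.

```python
import numpy as np

def apply_rows_sketch(fn):
    """Hypothetical stand-in: wrap fn so it runs on each row (along axis 1)."""
    def wrapper(x):
        # np.apply_along_axis passes each 1-D slice along axis 1 (a row) to fn.
        return np.apply_along_axis(fn, 1, x)
    return wrapper

# Center every row independently around its own mean.
center = apply_rows_sketch(lambda row: row - row.mean())
x = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
centered = center(x)
```

Each row is transformed independently, so a statistic such as the row mean is never shared across rows.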
- mutopia.plot.track_plot.transforms.clip(min_quantile=0.0, max_quantile=1.0)
Create a clipping function based on quantiles.
- Parameters:
min_quantile (float, default 0.0) – Lower quantile for clipping (0-1)
max_quantile (float, default 1.0) – Upper quantile for clipping (0-1)
- Returns:
Function that clips input arrays to specified quantiles
- Return type:
callable
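A quantile-based clipping factory of this kind might look roughly like the sketch below. `clip_sketch` is a hypothetical stand-in, and it assumes the quantiles are computed over the whole array; the library may compute them differently (for example per row).

```python
import numpy as np

def clip_sketch(min_quantile=0.0, max_quantile=1.0):
    """Hypothetical stand-in: return a function that clips to the given quantiles."""
    def _clip(x):
        # Assumption: quantiles are taken over the flattened input.
        lo, hi = np.quantile(x, [min_quantile, max_quantile])
        return np.clip(x, lo, hi)
    return _clip

x = np.arange(101, dtype=float)          # values 0, 1, ..., 100
clipped = clip_sketch(0.05, 0.95)(x)     # tails are pulled in to about 5 and 95
```

With the defaults (0.0 and 1.0) the bounds are the array minimum and maximum, so the returned function leaves the data unchanged.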
- mutopia.plot.track_plot.transforms.feature_matrix(*feature_names, source=None)
Create an accessor function that retrieves multiple features from a dataset as a matrix.
This function creates an accessor that extracts multiple features from a dataset and stacks them into a 2D matrix with features as rows and loci as columns. If no feature names are provided, it automatically selects all numeric features from the dataset.
- Parameters:
*feature_names (str or iterable) – Names of the features to access. Can be:
  - Multiple string arguments: feature_matrix("feat1", "feat2", "feat3")
  - Single iterable: feature_matrix(["feat1", "feat2", "feat3"])
  - Empty: automatically selects all numeric features
source (str, optional) – Optional feature source or namespace passed through to fetch_features.
- Returns:
Function that retrieves the specified features from the dataset and returns them as a DataArray with dimensions (feature, locus). If only one feature is selected, the ‘feature’ dimension is squeezed.
- Return type:
callable
Examples
>>> get_features = feature_matrix("gc_content", "cpg_density")
>>> matrix = get_features(dataset)  # Shape: (2, n_loci)
>>> get_all_features = feature_matrix()
>>> all_matrix = get_all_features(dataset)  # All numeric features
- mutopia.plot.track_plot.transforms.minmax_scale(x)
Scale array to [0, 1] using min-max normalization.
- Parameters:
x (ndarray) – Input array.
- Returns:
Rescaled array with values in [0, 1].
- Return type:
ndarray
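As a worked illustration, min-max normalization maps each value v to (v - min) / (max - min). The sketch below assumes a plain NumPy input with at least two distinct values; `minmax_scale_sketch` is a hypothetical stand-in, not the library's code.

```python
import numpy as np

def minmax_scale_sketch(x):
    """Hypothetical stand-in: rescale values linearly so min -> 0 and max -> 1."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    # Assumption: hi > lo. A constant array would divide by zero here.
    return (x - lo) / (hi - lo)

scaled = minmax_scale_sketch([2.0, 4.0, 6.0, 10.0])
# (2-2)/8, (4-2)/8, (6-2)/8, (10-2)/8  ->  0.0, 0.25, 0.5, 1.0
```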
- mutopia.plot.track_plot.transforms.passthrough(data)
Create a passthrough function that returns input data unchanged.
This function creates a closure that ignores any arguments passed to it and always returns the original data object. Useful in data processing pipelines where certain steps should be bypassed.
- Parameters:
data (any) – Input data to be returned unchanged by the generated function
- Returns:
Function that accepts any arguments but always returns the original data
- Return type:
callable
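The closure behaviour described above can be sketched in a few lines of plain Python. `passthrough_sketch` is a hypothetical stand-in written for illustration, not the library's implementation.

```python
def passthrough_sketch(data):
    """Hypothetical stand-in: return a function that always yields `data`."""
    def _return_data(*args, **kwargs):
        # Every argument is ignored; the captured object is returned as-is.
        return data
    return _return_data

precomputed = [0.1, 0.2, 0.3]
accessor = passthrough_sketch(precomputed)
result = accessor("any dataset", extra=1)   # same object as `precomputed`
```

Because the original object is returned rather than a copy, mutating the result also mutates the captured data.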
- mutopia.plot.track_plot.transforms.pipeline(*fns)
Create a data processing pipeline from a sequence of functions.
This function composes multiple functions into a single pipeline function that applies each function in sequence. The output of each function becomes the input to the next function in the pipeline.
- Parameters:
*fns (callable) – Variable number of functions to compose into a pipeline. Each function should accept one argument (the data) and return the transformed data for the next function.
- Returns:
Composed function that applies all input functions in sequence from first to last
- Return type:
callable
Examples
>>> import numpy as np
>>> normalize = lambda x: x / x.max()
>>> log_transform = lambda x: np.log(x + 1)
>>> process = pipeline(normalize, log_transform)
>>> result = process(data)
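Because the functions run from first to last, their order changes the result. The sketch below illustrates this with a hypothetical `pipeline_sketch` built on `functools.reduce`; it is one possible way to write such a composition, not necessarily how the library does it.

```python
from functools import reduce

def pipeline_sketch(*fns):
    """Hypothetical stand-in: compose functions left to right."""
    def _run(data):
        # Feed the running result into each function in turn.
        return reduce(lambda acc, fn: fn(acc), fns, data)
    return _run

add_one = lambda x: x + 1
double = lambda x: x * 2

result_a = pipeline_sketch(add_one, double)(3)   # (3 + 1) * 2 = 8
result_b = pipeline_sketch(double, add_one)(3)   # (3 * 2) + 1 = 7
```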
- mutopia.plot.track_plot.transforms.renorm(x)
Renormalize array to sum to 1.
- Parameters:
x (array-like) – Input array
- Returns:
Normalized array that sums to 1
- Return type:
array-like
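Renormalizing to unit sum amounts to dividing every entry by the total. A minimal sketch, assuming a NumPy input with a nonzero sum normalized over all elements (`renorm_sketch` is a hypothetical stand-in):

```python
import numpy as np

def renorm_sketch(x):
    """Hypothetical stand-in: divide by the total so the entries sum to 1."""
    x = np.asarray(x, dtype=float)
    # Assumption: the sum is nonzero and is taken over the whole array.
    return x / x.sum()

weights = renorm_sketch([1.0, 1.0, 2.0])   # 1/4, 1/4, 2/4
```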
- mutopia.plot.track_plot.transforms.reorder_df(df)
Reorder a DataFrame's rows using the optimal leaf ordering from hierarchical clustering.
- Parameters:
df (pandas.DataFrame) – Input DataFrame with numeric values.
- Returns:
Reordered DataFrame according to optimal leaf ordering.
- Return type:
pandas.DataFrame
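One way to obtain such an ordering is SciPy's hierarchical clustering with optimal leaf ordering enabled. The sketch below is an assumption about the general approach: `reorder_df_sketch` is hypothetical, and the linkage method ("average") and the default Euclidean metric are illustrative choices that the source does not specify.

```python
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage

def reorder_df_sketch(df):
    """Hypothetical stand-in: order rows by the leaves of a clustering tree."""
    # optimal_ordering=True reorders leaves so adjacent rows are as similar as possible.
    tree = linkage(df.values, method="average", optimal_ordering=True)
    order = leaves_list(tree)
    return df.iloc[order]

# Two tight groups of rows: (a, c) near the origin and (b, d) near (5, 5).
df = pd.DataFrame(
    [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)
reordered = reorder_df_sketch(df)
row_order = list(reordered.index)
```

After reordering, similar rows sit next to each other, which is what makes block structure visible in a heatmap.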
- mutopia.plot.track_plot.transforms.select(var_name, **sel)
Create an accessor function to extract variables from datasets.
This function creates a closure that extracts a specific variable from a dataset and optionally applies selection criteria. The extracted variable is transposed to ensure ‘locus’ is the last dimension.
- Parameters:
var_name (str) – Name of the variable to access from the dataset
**sel (dict) – Additional selection criteria passed to the .sel() method. Keys should be dimension names and values should be the coordinate labels to select.
- Returns:
Function that takes a dataset and returns the specified variable with ‘locus’ as the last dimension
- Return type:
callable
Examples
>>> get_feature = select("Features/gc_content", sample=0)
>>> feature_data = get_feature(dataset)
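To make the "locus last" behaviour concrete, the sketch below builds a small xarray dataset and applies a hypothetical `select_sketch` accessor. The variable name `signal`, the dimension names other than `locus`, and the implementation itself are assumptions for illustration only.

```python
import numpy as np
import xarray as xr

def select_sketch(var_name, **sel):
    """Hypothetical stand-in for an accessor factory; not the library's code."""
    def _accessor(dataset):
        data = dataset[var_name]
        if sel:
            data = data.sel(**sel)
        # Move 'locus' to the end while keeping the other dimensions in order.
        return data.transpose(..., "locus")
    return _accessor

# Toy dataset: 5 loci, 2 samples, 3 contexts, with 'locus' as the FIRST dimension.
dataset = xr.Dataset(
    {"signal": (("locus", "sample", "context"), np.zeros((5, 2, 3)))},
    coords={"sample": ["s1", "s2"]},
)
get_signal = select_sketch("signal", sample="s1")
signal = get_signal(dataset)   # dims are now ('context', 'locus')
```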