LLM Hidden-State Topology
LLM hidden-state topological analysis.
- class att.llm.HiddenStateLoader(path)
Bases: object
Load and query LLM hidden-state archives produced by extract_hidden_states.py.
- Parameters:
path (str) – Path to .npz archive containing hidden states.
(N, d) last-token hidden states at the final transformer layer.
- property layer_hidden: ndarray
(N, L+1, d) hidden states at the final token across all layers.
Index 0 is the embedding layer output; index -1 is the final transformer layer.
- property token_trajectories: ndarray
(N,) object array where each element is (T_i, d) token-position hidden states.
Dimensionality of hidden-state vectors.
- get_layer_cloud(layer, levels=None)
Point cloud of hidden states at a specific layer across problems.
- Parameters:
- Return type:
(n_problems, d) array of hidden-state vectors.
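The shapes above suggest that a layer cloud is a slice of the (N, L+1, d) array along the layer axis, optionally restricted to some difficulty levels. The following is a minimal NumPy sketch of that idea, not the loader's implementation: the `difficulty` array and the `layer_cloud` helper are hypothetical stand-ins for whatever the archive actually stores.

```python
import numpy as np

rng = np.random.default_rng(0)
n_problems, n_layers, d = 12, 4, 8

# Hypothetical stand-ins for the archive contents (assumed, not read from a file).
layer_hidden = rng.normal(size=(n_problems, n_layers + 1, d))  # (N, L+1, d)
difficulty = rng.integers(1, 4, size=n_problems)               # (N,) levels in {1, 2, 3}


def layer_cloud(layer_hidden, difficulty, layer, levels=None):
    """Return the (n_selected, d) cloud at `layer`, optionally filtered by level."""
    if levels is None:
        mask = np.ones(len(difficulty), dtype=bool)
    else:
        mask = np.isin(difficulty, levels)
    return layer_hidden[mask, layer, :]


final_cloud = layer_cloud(layer_hidden, difficulty, layer=-1)           # final layer, all levels
easy_cloud = layer_cloud(layer_hidden, difficulty, layer=0, levels=[1])  # embedding output, level 1
```

Index 0 selects the embedding output and index -1 the final transformer layer, matching the convention stated for the (N, L+1, d) array.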
- class att.llm.LayerwiseAnalyzer(n_pca_components=50, max_dim=2, subsample=200, n_permutations=200, seed=42)
Bases: object
Per-layer persistent homology with permutation-based z-score profiles.
Runs PersistenceAnalyzer at each transformer layer for each difficulty level, then computes z-score profiles via label-permutation tests.
- Parameters:
n_pca_components (int) – PCA dimensions before PH computation.
max_dim (int) – Maximum homology dimension (0=components, 1=loops, 2=voids).
subsample (int or None) – Max points to subsample per layer cloud.
n_permutations (int) – Number of permutations for z-score computation.
seed (int) – Random seed for reproducibility.
- fit(loader, levels=None)
Run PH at every layer for each difficulty level.
- Parameters:
loader (HiddenStateLoader) – Loaded hidden-state archive.
levels (list of int or None) – Difficulty levels to analyze. None = all levels.
- Return type:
self
- property results_per_layer: dict[tuple[int, int], dict]
Raw PH results keyed by (level, layer_idx).
- entropy_profile()
Per-layer persistence entropy by difficulty level.
- Return type:
dict mapping level -> (n_layers, max_dim+1) array of entropies.
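Persistence entropy is usually defined as the Shannon entropy of the normalised bar lifetimes in a diagram. The sketch below computes it for a single (n_features, 2) diagram. It assumes the natural logarithm and drops infinite or zero-length bars; the library may use a different convention.

```python
import numpy as np


def persistence_entropy(diagram):
    """Shannon entropy of normalised lifetimes in one (n_features, 2) diagram."""
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    lifetimes = diagram[:, 1] - diagram[:, 0]
    # Assumption: infinite and zero-length bars are ignored.
    lifetimes = lifetimes[np.isfinite(lifetimes) & (lifetimes > 0)]
    if lifetimes.size == 0:
        return 0.0
    p = lifetimes / lifetimes.sum()
    return float(-np.sum(p * np.log(p)))


two_equal_bars = np.array([[0.0, 1.0], [0.5, 1.5]])  # two bars of lifetime 1
one_dominant = np.array([[0.0, 10.0], [0.0, 0.1]])   # one long bar, one short bar
```

Two equal bars give the maximum entropy for two features (log 2), while a diagram dominated by one long bar gives a value close to zero.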
- bottleneck_profile()
Per-layer bottleneck distances between consecutive layers.
- Return type:
dict mapping level -> (n_layers-1,) array of bottleneck distances.
- zscore_profile(loader, metric='wasserstein_1')
Compute per-layer z-score of inter-level topological distance.
Permutes difficulty labels and recomputes pairwise distances at each layer to build a null distribution, then computes z-scores.
- Parameters:
loader (HiddenStateLoader) – Same loader used in fit().
metric (str) – Distance metric ("wasserstein_1" or "bottleneck").
- Returns:
dict with:
z_scores : (n_layers,) per-layer z-scores
p_values : (n_layers,) per-layer p-values
observed : (n_layers,) observed mean pairwise distances
null_mean : (n_layers,) null distribution means
null_std : (n_layers,) null distribution stds
per_dim : dict mapping dim -> (n_layers,) z-scores for each H_dim
- Return type:
dict
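The label-permutation procedure can be illustrated at a single layer. To stay self-contained, the sketch below swaps in a simple stand-in statistic (distance between level centroids) for the Wasserstein or bottleneck distance between persistence diagrams. `permutation_zscore` and `centroid_gap` are hypothetical helpers, and the one-sided p-value with a +1 correction is an assumption.

```python
import numpy as np


def permutation_zscore(points, labels, statistic, n_permutations=200, seed=42):
    """z-score and p-value of `statistic` against a label-permutation null."""
    rng = np.random.default_rng(seed)
    observed = statistic(points, labels)
    null = np.array(
        [statistic(points, rng.permutation(labels)) for _ in range(n_permutations)]
    )
    null_mean, null_std = null.mean(), null.std()
    z = (observed - null_mean) / null_std if null_std > 0 else 0.0
    # One-sided empirical p-value with the +1 correction (assumed convention).
    p = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return {
        "z_score": float(z),
        "p_value": float(p),
        "observed": float(observed),
        "null_mean": float(null_mean),
        "null_std": float(null_std),
    }


def centroid_gap(points, labels):
    """Stand-in statistic: mean pairwise distance between level centroids."""
    levels = np.unique(labels)
    centroids = np.array([points[labels == lv].mean(axis=0) for lv in levels])
    gaps = [
        np.linalg.norm(centroids[i] - centroids[j])
        for i in range(len(levels))
        for j in range(i + 1, len(levels))
    ]
    return float(np.mean(gaps))


rng = np.random.default_rng(0)
labels = np.repeat([1, 2], 40)
# Level 2 is shifted away from level 1, so the labels carry real structure.
separated = rng.normal(size=(80, 5)) + 3.0 * (labels == 2)[:, None]
result = permutation_zscore(separated, labels, centroid_gap)
```

Repeating this at every layer with a topological distance in place of `centroid_gap` would give per-layer arrays analogous to the z_scores, p_values, observed, null_mean and null_std entries listed above.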
- class att.llm.CROCKERMatrix(n_filtration_steps=100, max_dim=1, n_pca_components=50, subsample=200, seed=42)
Bases: object
Compute CROCKER matrices for LLM hidden-state topology.
Produces 2D heatmaps of Betti numbers β_k(ε, p) where ε is the filtration radius and p is a varying parameter (difficulty level or transformer layer index).
- Parameters:
- fit_by_difficulty(loader, layer=-1, levels=None)
Compute CROCKER matrices with difficulty level as parameter axis.
- Parameters:
loader (HiddenStateLoader)
layer (int) – Layer index to analyze (-1 = final layer).
levels (list of int or None) – Levels to include (None = all).
- Return type:
- fit_by_layer(loader, level=1, layers=None)
Compute CROCKER matrices with layer index as parameter axis.
- Parameters:
loader (HiddenStateLoader)
level (int) – Difficulty level to analyze.
layers (list of int or None) – Layer indices to include (None = all).
- Return type:
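A CROCKER matrix stacks Betti curves β_k(ε) for a sequence of parameter values p. As a rough illustration restricted to β_0, the sketch below counts connected components of the Vietoris-Rips graph with a union-find pass and stacks one curve per cloud. It assumes an edge exists when the pairwise distance is at most ε and places the parameter on the first axis; both conventions, and the helper names, are assumptions rather than the class's actual behaviour.

```python
import numpy as np


def betti0_curve(cloud, radii):
    """Number of connected components of the Rips graph at each radius."""
    n = len(cloud)
    dists = np.linalg.norm(cloud[:, None, :] - cloud[None, :, :], axis=-1)
    i_idx, j_idx = np.triu_indices(n, k=1)
    order = np.argsort(dists[i_idx, j_idx])
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merge_radii = []  # radius at which each component merger happens
    for e in order:
        a, b = find(int(i_idx[e])), find(int(j_idx[e]))
        if a != b:
            parent[a] = b
            merge_radii.append(dists[i_idx[e], j_idx[e]])
    merge_radii = np.array(merge_radii)
    return np.array([n - np.sum(merge_radii <= r) for r in radii])


def crocker_betti0(clouds, radii):
    """Stack one Betti-0 curve per parameter value: shape (n_params, n_radii)."""
    return np.stack([betti0_curve(c, radii) for c in clouds], axis=0)


two_clusters = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
one_cluster = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [0.3, 0.0]])
radii = np.array([0.05, 0.5, 6.0])
matrix = crocker_betti0([two_clusters, one_cluster], radii)
```

Here each row plays the role of one parameter value (a difficulty level or a layer index) and each column one filtration step.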
- class att.llm.TopologicalFeatureExtractor(max_dim=1, n_pca_components=50, subsample=200, feature_set='summary', pi_resolution=20, pi_sigma=0.1, seed=42)
Bases: object
Extract fixed-length topological feature vectors from point clouds.
- Parameters:
max_dim (int) – Maximum homology dimension.
n_pca_components (int) – PCA dimensions before PH computation.
subsample (int or None) – Max points per cloud for PH.
feature_set (str) – "summary" (8 features per dim) or "image" (summary + flattened PI).
pi_resolution (int) – Persistence image resolution (only used when feature_set="image").
pi_sigma (float) – Persistence image Gaussian bandwidth.
seed (int) – Random seed.
- extract_single(cloud)
Extract topological features from a single point cloud.
- Parameters:
cloud – (n_points, d) point cloud.
- Return type:
(n_features,) feature vector.
- extract_batch(loader, layer=-1)
Extract features for all problems in a loader, per difficulty level.
Computes PH on the level-cloud at the given layer for each difficulty level, producing one feature vector per level.
- extract_per_problem(loader, layer=-1)
Extract features per problem using token trajectories.
Each problem’s token trajectory (T_i, d) is treated as a point cloud.
- Parameters:
loader (HiddenStateLoader)
layer (int) – Not used for token trajectories (included for API consistency).
- Returns:
X – (n_problems, n_features) feature matrix.
levels – (n_problems,) difficulty levels.
- Return type:
- att.llm.twonn_dimension(cloud, fraction=0.9)
Estimate intrinsic dimension via the TwoNN method (Facco et al. 2017).
Uses the ratio of distances to the second and first nearest neighbours. The ID is estimated as d = 1 / mean(log(mu)) where mu = r2/r1.
- Parameters:
cloud – (n, d) point cloud.
fraction (float) – Fraction of points kept after trimming high-mu outliers; must lie in (0, 1].
- Returns:
Estimated intrinsic dimension.
- Return type:
float
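The estimator described above, d = 1 / mean(log(mu)) with mu = r2/r1, can be sketched directly in NumPy. The version below is a minimal illustration and not the library's code: it uses a brute-force distance matrix, discards points with a duplicate nearest neighbour (r1 = 0), and keeps the lowest-mu `fraction` of points. `twonn_sketch` is a hypothetical name.

```python
import numpy as np


def twonn_sketch(cloud, fraction=0.9):
    """TwoNN-style intrinsic dimension: 1 / mean(log(r2 / r1)) after trimming."""
    cloud = np.asarray(cloud, dtype=float)
    dists = np.linalg.norm(cloud[:, None, :] - cloud[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)           # exclude self-distances
    nearest = np.sort(dists, axis=1)[:, :2]   # first and second neighbour distances
    r1, r2 = nearest[:, 0], nearest[:, 1]
    valid = r1 > 0                            # assumption: drop exact duplicates
    mu = np.sort(r2[valid] / r1[valid])
    keep = max(1, int(np.floor(fraction * len(mu))))
    log_mu = np.log(mu[:keep])                # trim the highest-mu points
    return float(1.0 / np.mean(log_mu))


rng = np.random.default_rng(0)
plane = rng.uniform(size=(600, 2))
# A 2-D sheet embedded in 5-D ambient space: intrinsic dimension is 2.
embedded = np.hstack([plane, np.zeros((600, 3))])
estimate_full = twonn_sketch(embedded, fraction=1.0)
estimate_trimmed = twonn_sketch(embedded, fraction=0.9)
```

With fraction=1.0 the estimate should land near 2 for this sheet. Trimming the largest ratios lowers mean(log(mu)), so this simple trimmed mean can only raise the estimate relative to the untrimmed one.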
- att.llm.phd_dimension(diagrams, dim=1)
Estimate intrinsic dimension from persistence diagram lifetimes.
Based on the observation that in dimension d, the expected lifetime of H_k features scales as n^(-1/d) (Birdal et al. 2021). We estimate d from the distribution of H1 lifetimes using the log-log slope of the survival function.
- Parameters:
diagrams – list of (n_features, 2) arrays (persistence diagrams).
dim (int) – Homology dimension to use (default 1 for loops).
- Returns:
Estimated intrinsic dimension (0.0 if insufficient features).
- Return type:
float
- att.llm.id_profile(loader, levels=None, n_pca_components=50, method='twonn', fraction=0.9)
Compute intrinsic dimension profile across layers for each difficulty level.
- Parameters:
loader (HiddenStateLoader) – Loaded hidden-state archive.
levels (list of int or None) – Difficulty levels to analyze. None = all levels.
n_pca_components (int) – PCA components before ID estimation (avoids curse of ambient dim).
method (str) – "twonn" (default) or "phd".
fraction (float) – Fraction parameter for TwoNN trimming.
- Return type:
dict mapping level -> (n_layers,) array of ID estimates.
- class att.llm.ZigzagLayerAnalyzer(max_dim=1, n_pca_components=50, subsample=100, threshold=None, seed=42)
Bases: object
Zigzag persistent homology across transformer layers.
- Constructs a zigzag filtration:
VR(X_0) <-> VR(X_0 ∪ X_1) <-> VR(X_1) <-> … <-> VR(X_{L-1})
where X_i is the point cloud at layer i. The union complexes use the minimum pairwise distance across both layers’ embeddings of each point.
- Parameters:
max_dim (int) – Maximum homology dimension (default 1 -> H0, H1).
n_pca_components (int) – PCA dimension reduction before computing distances.
subsample (int or None) – Subsample points per layer to manage runtime.
threshold (float or None) – VR complex distance threshold. If None, uses adaptive threshold based on data scale.
seed (int) – Random seed for subsampling.
- fit(loader, level, layer_indices=None)
Compute zigzag persistence across layers for a difficulty level.
- Parameters:
loader (HiddenStateLoader) – Hidden state data.
level (int) – Difficulty level (1-5).
layer_indices (list of int or None) – Which layers to include. If None, uses all layers.
- Return type:
ZigzagResult with barcodes per dimension.
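One way to read the union-complex rule in the class description is as an elementwise minimum of the two layers' pairwise distance matrices, with rows aligned so that row i is the same problem in both layers. The sketch below illustrates that reading only; `pairwise` and `union_distance` are hypothetical helpers, and the zigzag persistence computation itself is not shown.

```python
import numpy as np


def pairwise(cloud):
    """Euclidean distance matrix of an (n, d) cloud."""
    return np.linalg.norm(cloud[:, None, :] - cloud[None, :, :], axis=-1)


def union_distance(cloud_a, cloud_b):
    """Elementwise minimum of the two layers' distance matrices (assumed rule)."""
    return np.minimum(pairwise(cloud_a), pairwise(cloud_b))


rng = np.random.default_rng(42)
layer_a = rng.normal(size=(6, 3))  # same 6 points embedded at layer i
layer_b = rng.normal(size=(6, 3))  # ... and at layer i + 1
threshold = 1.5

edges_a = pairwise(layer_a) <= threshold
edges_b = pairwise(layer_b) <= threshold
edges_union = union_distance(layer_a, layer_b) <= threshold
```

Under this rule an edge is in the union graph exactly when it is in either layer's graph at the same threshold, so both VR(X_i) and VR(X_{i+1}) include into the union complex, which is what a zigzag step needs.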
- class att.llm.TokenPartitioner(tokenizer=None)
Bases: object
Partition token positions into functional regions.
- Regions:
instruction_prefix: system instruction before the problem
problem: the math problem text
instruction_suffix: closing instruction after the problem
operator: tokens within the problem that are math operators/symbols
numeric: tokens within the problem that are numbers
- Parameters:
tokenizer (optional) – A HuggingFace tokenizer for accurate token-level partitioning. If None, uses character-length-based approximation.
- class att.llm.AttentionHiddenBinding(max_dim=1, image_resolution=50, image_sigma=0.1, n_pca_components=50, subsample=100, seed=42)
Bases: object
Measure topological coupling between attention and hidden-state geometry.
Treats 1 - attention_weight as a precomputed distance matrix and computes PH on it. Compares against PH on hidden-state point clouds using persistence image subtraction (same principle as BindingDetector).
- Parameters:
max_dim (int) – Maximum homology dimension.
image_resolution (int) – Resolution of persistence images for comparison.
image_sigma (float) – Gaussian kernel bandwidth for persistence images.
n_pca_components (int) – PCA components for hidden-state clouds.
subsample (int) – Max points for hidden-state PH.
seed (int) – Random seed.
- static attention_to_distance(attn)
Convert attention matrix to symmetric distance matrix.
D = 1 - (A + A^T) / 2, clipped to [0, 1], diagonal zeroed.
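The formula above translates almost line for line into NumPy. The sketch below is a standalone illustration of the stated steps (symmetrise, subtract from 1, clip, zero the diagonal); the function name is hypothetical and this is not the library's own code.

```python
import numpy as np


def attention_to_distance_sketch(attn):
    """D = 1 - (A + A^T) / 2, clipped to [0, 1], with a zero diagonal."""
    attn = np.asarray(attn, dtype=float)
    dist = 1.0 - (attn + attn.T) / 2.0
    dist = np.clip(dist, 0.0, 1.0)
    np.fill_diagonal(dist, 0.0)
    return dist


# Causal, row-stochastic attention over three tokens (made-up values).
attn = np.array(
    [
        [1.0, 0.0, 0.0],
        [0.5, 0.5, 0.0],
        [0.2, 0.3, 0.5],
    ]
)
dist = attention_to_distance_sketch(attn)
```

Strongly attended token pairs end up close together, and the result is symmetric with a zero diagonal, so it can be passed to a persistent-homology routine that accepts a precomputed distance matrix.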
- compute_binding(attention_matrix, hidden_cloud)
Compute binding score between attention topology and hidden-state topology.
- Parameters:
attention_matrix – (n, n) attention weight matrix (head-averaged).
hidden_cloud – (n, d) hidden-state vectors for the same tokens.
- Return type:
BindingResult with binding score and per-dim feature counts / entropy.
- test_significance(attention_matrix, hidden_cloud, n_permutations=100)
Test binding significance via row-permutation surrogates.
Permutes rows (and corresponding columns) of the attention matrix to destroy the attention-hidden correspondence while preserving attention structure. The observed binding score is compared against the null distribution of surrogate scores.
- Parameters:
attention_matrix – (n, n) attention weight matrix.
hidden_cloud – (n, d) hidden-state vectors.
n_permutations (int) – Number of surrogate permutations.
- Return type:
SignificanceResult with observed score, null distribution, p-value, z-score.
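The surrogate construction amounts to applying one random permutation to rows and columns at once, which relabels the tokens without changing any attention value. The sketch below shows that step together with an empirical p-value. Both helper names are hypothetical, and the one-sided test with a +1 correction is an assumption about how the null distribution is used.

```python
import numpy as np


def permuted_surrogate(attn, rng):
    """Apply one random permutation to rows and columns simultaneously."""
    perm = rng.permutation(attn.shape[0])
    return attn[np.ix_(perm, perm)], perm


def empirical_p_value(observed, null_scores):
    """One-sided p-value with the +1 correction (assumed convention)."""
    null_scores = np.asarray(null_scores, dtype=float)
    return float((1 + np.sum(null_scores >= observed)) / (1 + len(null_scores)))


rng = np.random.default_rng(42)
attn = rng.random((5, 5))
attn /= attn.sum(axis=1, keepdims=True)  # make rows sum to 1, like attention
surrogate, perm = permuted_surrogate(attn, rng)
```

The surrogate keeps the same multiset of weights and row sums, so the attention structure is preserved, but token i in the surrogate no longer lines up with row i of the hidden-state cloud.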
- compute_binding_from_diagrams(attn_diagrams, hidden_cloud)
Compute binding from pre-extracted attention PH diagrams and hidden cloud.
- Parameters:
attn_diagrams – list of (n, 2) arrays, one per homology dimension.
hidden_cloud – (n, d) hidden-state vectors.
- Return type:
BindingResult with binding score and feature stats.
- binding_profile(loader, attention_ph_data=None, levels=None, layer_indices=None)
Compute binding scores across difficulty levels and layers.
If attention_ph_data is not available, returns binding scores based on hidden-state self-coupling (within-layer topology consistency). This serves as a template for when attention data becomes available.
- Parameters:
loader (HiddenStateLoader)
attention_ph_data – Optional pre-computed attention PH (from extract_attention_weights.py).
levels – Difficulty levels to analyze.
layer_indices – Which layers to analyze.
- Returns:
dict with:
scores : dict mapping (level, layer) -> binding_score
levels : list of levels
layers : list of layer indices
- Return type:
dict