
The Activation Geometry Program: Twelve Papers on the Mathematical Structure of Neural Network Representations

Synthesis of twelve experiments — from trajectory structure to the terminal measurement limit

Executive Summary

Over twelve papers, we developed a systematic account of how neural network activation spaces encode, separate, and mix domain-specific and structural information. The program began with a practical question — can training compute be reduced by exploiting trajectory structure? — and arrived at a theoretical result: the terminal measurement limit, which establishes that linear interventions at the output layer cannot invert nonlinear mixing at intermediate layers.

The Arc

Papers I-II establish the measurement foundations. Training trajectories have detectable stable and chaotic regimes. Independently initialized runs converge to the same activation manifold (cosine similarity >0.999), making the geometry a reproducible object of study.
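
As a concreteness aid, here is a minimal sketch of the convergence check, assuming a fingerprint is the mean activation vector over a shared probe set; the shapes, noise model, and probe set are illustrative, not the papers' protocol:

```python
import numpy as np

def fingerprint(acts: np.ndarray) -> np.ndarray:
    """Mean activation vector over a probe set; acts has shape (n_probes, d)."""
    return acts.mean(axis=0)

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
manifold = rng.normal(size=(512, 768))  # stand-in for the shared activation manifold
run_a = manifold + 0.01 * rng.normal(size=manifold.shape)  # independently seeded run 1
run_b = manifold + 0.01 * rng.normal(size=manifold.shape)  # independently seeded run 2

print(cosine(fingerprint(run_a), fingerprint(run_b)))      # ~0.999+
```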

Papers III-V show this geometry supports practical applications: parameter-level model composition (93.3% cross-domain win rate), domain-structure decomposition via INLP with double dissociation (shape survival 98.7-100%), and distillation detection (AUC 0.86-1.00).
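
INLP here is iterative nullspace projection: fit a linear domain classifier, project its direction out of the activations, and repeat until nothing linearly decodable remains. Below is a simplified single-direction-per-iteration sketch (the full method projects onto an intersection of nullspaces); shapes, iteration count, and the choice of classifier are placeholders:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X: np.ndarray, y: np.ndarray, n_iters: int = 8) -> np.ndarray:
    """Return a projection P that removes the linearly decodable signal for y.

    X: activations, shape (n_samples, d). y: binary domain labels."""
    d = X.shape[1]
    P = np.eye(d)
    Xp = X.copy()
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(Xp, y)
        w = clf.coef_[0] / np.linalg.norm(clf.coef_[0])
        step = np.eye(d) - np.outer(w, w)  # project out the classifier direction
        P = step @ P
        Xp = Xp @ step                     # step is symmetric, so no transpose needed
    return P
```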

Paper VI proves five sufficient conditions under which stochastic resonance produces net-beneficial noise effects, unifying the SR phenomena observed across the series.
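
The flavor of the effect is the textbook threshold-detector demonstration sketched below: a subthreshold signal becomes detectable only at intermediate noise levels. This is the classic toy setting, not the paper's generative-channel one, and all constants are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 4000)
signal = 0.8 * np.sin(t)   # subthreshold: never crosses 1.0 on its own
threshold = 1.0

# Correlation of the thresholded output with the hidden signal rises with
# noise, peaks, then falls: the stochastic-resonance signature.
for sigma in [0.0, 0.1, 0.3, 0.6, 1.2, 2.5]:
    out = (signal + sigma * rng.normal(size=t.shape) > threshold).astype(float)
    r = 0.0 if out.std() == 0 else float(np.corrcoef(out, signal)[0, 1])
    print(f"sigma={sigma:<4} corr={r:.3f}")
```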

Papers VII-VIII apply shaped noise injection to test whether the geometric structure supports targeted intervention. The central negative result: INLP directions separate domains almost perfectly in classification (>97.5% accuracy), yet perturbation along those same directions produces non-selective behavioral effects.
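
A toy version of why classification and intervention come apart, assuming a generic MLP block as the downstream computation. The "INLP-like" direction is just a fixed unit vector here (in the papers it comes from trained projections); through a generic nonlinear layer it is amplified no differently than a random one:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 768
W1 = rng.normal(size=(4 * d, d)) / np.sqrt(d)
W2 = rng.normal(size=(d, 4 * d)) / np.sqrt(4 * d)

def block(h: np.ndarray) -> np.ndarray:
    """One generic MLP block with ReLU, standing in for downstream layers."""
    return W2 @ np.maximum(W1 @ h, 0.0)

h = rng.normal(size=d)
v_inlp = rng.normal(size=d)
v_inlp /= np.linalg.norm(v_inlp)   # stand-in for a learned INLP direction
v_rand = rng.normal(size=d)
v_rand /= np.linalg.norm(v_rand)   # matched random control direction

for name, v in [("inlp-like", v_inlp), ("random", v_rand)]:
    delta = np.linalg.norm(block(h + 0.5 * v) - block(h))
    print(name, round(float(delta), 3))   # comparable magnitudes either way
```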

Papers IX-XII characterize why. Layer-resolved injection shows that selectivity peaks only modestly at intermediate layers. The layer Jacobian acts as an isotropic amplifier, treating INLP directions identically to random directions. The concentration barrier theorem proves that any k-dimensional subspace captures at most a fraction k/d_eff of the total variance. Domain-specific information content is bounded at roughly 2 bits, against roughly 15 bits of domain-agnostic perturbation.
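
The barrier itself is easy to check numerically. Assuming d_eff denotes the spectral effective dimension tr(Σ)/λ_max (my reading of the theorem's statement, not spelled out above), the bound follows because the best k directions capture at most k·λ_max of variance:

```python
import numpy as np

rng = np.random.default_rng(2)
lam = np.sort(rng.pareto(2.0, size=2048))[::-1]  # synthetic eigenvalue spectrum
d_eff = lam.sum() / lam.max()     # assumed definition: tr(Sigma) / lambda_max

for k in [1, 8, 64]:
    captured = lam[:k].sum() / lam.sum()  # best achievable k-dim variance fraction
    print(f"k={k:<3} captured={captured:.4f}  bound k/d_eff={k / d_eff:.4f}")
```

On a heavy-tailed spectrum like this one, d_eff is large, so the fraction any small-k subspace can capture stays small, which is the quantitative core of the negative result.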

The Collected Finding

A dissociation: classification and intervention operate under different constraints. High-dimensional activation spaces guarantee that fixed linear directions capture only a small fraction of the computation they describe. The geometry that INLP discovers is real, but the causal pathways through which domain-specific computation flows are distributed across dimensions that no fixed subspace can isolate.

This is not an engineering limitation — it is a property of high-dimensional nonlinear computation.

Program Papers

  • I. Leap+Verify: Regime-Adaptive Speculative Weight Prediction
  • II. Training Once Is Enough: Activation Fingerprint Convergence
  • III. Constellation-Indexed Model Composition
  • IV. The Shape of the Problem: Domain-Invariant Structural Signatures
  • V. Capability Manifold Surveillance: Distillation Detection
  • VI. The Generative Lossy Channel: Five Sufficient Conditions for SR
  • VII. GenAI Is Socially Awkward: RLHF Damages Social Cognition
  • VIII. Shaped Noise Injection: The Terminal Measurement Limit
  • IX. Layer-Resolved Response Tensor
  • X. Spectral Geometry of the Forward Pass
  • XI. The Concentration Barrier
  • XII. Channel Capacity of Domain-Specific Stochastic Resonance

Download Full Paper

Access the complete research paper with detailed methodology, empirical evidence, and formal proofs.
