Executive Summary
Structural entanglement, the geometric property that every informative direction in a transformer's activation space carries all concept dimensions simultaneously, was established as generic in prior work. This paper shows that natural language fine-tuning, specifically the B3 protocol (code primary, natural language secondary), selectively destroys this property in Qwen-2.5-Coder-32B, driving entanglement intensity (EI) from 0.667 to 0.000 across 8 independent seeds while preserving domain classification accuracy (0.90–0.98). The collapse is a sharp phase transition occurring at training steps 2000–3500.
The same protocol applied to four other architectures at ~7B scale produces qualitatively different responses: CodeLlama increases EI (+27–39%), DeepSeek modestly decreases it (-14%), Mistral is essentially unchanged (+0.4%), and Qwen-7B shows a partial decrease (-26%). A scale ladder within the Qwen family reveals graded susceptibility, with a phase boundary between 14B and 32B.
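To pin down the convention behind these percentages, here is a minimal sketch of the relative-change computation; only the Qwen-32B pre/post pair (0.667 → 0.000) is stated explicitly above, and the function name is illustrative:

```python
def ei_relative_change(ei_pre: float, ei_post: float) -> float:
    """Relative change in entanglement intensity (EI), in percent.

    Under this convention, Qwen-32B's 0.667 -> 0.000 trajectory
    reads as a -100% change, matching the text.
    """
    return 100.0 * (ei_post - ei_pre) / ei_pre

print(ei_relative_change(0.667, 0.000))  # -100.0
```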
Key Findings
- Phase transition, not gradual decay: B3 and B2 trajectories are indistinguishable through step 1500, then diverge sharply (a sketch of onset detection follows this list). Two of eight seeds transiently recover entanglement before permanent collapse, consistent with oscillation near the non-degeneracy boundary.
- Non-degeneracy margin refuted (Hypothesis A): Qwen-32B has the largest margin of all tested models (1.78 vs <10⁻⁴ for all others), yet is the only one that collapses. Proximity to the boundary is neither necessary nor sufficient.
- Pre-training composition unsupported (Hypothesis B): All models show similar cross-domain activation structure (centroid cosines ~-0.49, binary leave-one-out CV accuracy ≥0.992; see the probe sketch after this list). No model is distinctively separated or integrated.
- Scale dependence confirmed (Hypothesis C): Within the Qwen family, the B3 response intensifies monotonically with scale: -26% at 7B, -76% at 14B, -100% at 32B. The qualitative phase boundary lies between 14B (margin grows to 4.79, EI stabilizes at 0.188) and 32B (margin crossed, EI permanently zero).
- Architecture specificity at matched scale: Four families at ~7B show four qualitatively different responses; the collapse is not a generic consequence of NL fine-tuning.
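The phase-transition finding implies an onset-detection step: locating the first training step from which the B3 EI trajectory permanently departs from B2's. A minimal sketch, assuming paired EI trajectories sampled at common steps; the tolerance `tol` is a hypothetical choice, not a value from the paper:

```python
import numpy as np

def divergence_step(steps, ei_b2, ei_b3, tol=0.05):
    """First step from which |EI_B3 - EI_B2| exceeds `tol` at every
    later sample, i.e. a permanent departure. The trailing-window
    criterion keeps transient excursions (such as the two seeds that
    briefly recover entanglement) from being read as the onset.
    """
    gap = np.abs(np.asarray(ei_b3) - np.asarray(ei_b2))
    beyond = gap > tol
    for i, step in enumerate(steps):
        if beyond[i:].all():
            return step
    return None  # trajectories never permanently diverge
```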
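The two Hypothesis B statistics admit a straightforward implementation. The sketch below assumes per-domain activation matrices of shape (n_samples, d_model); the logistic-regression probe is an assumption, not the paper's stated classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

def centroid_cosine(acts_a, acts_b):
    """Cosine similarity between the two domains' mean activation vectors."""
    ca, cb = acts_a.mean(axis=0), acts_b.mean(axis=0)
    return float(ca @ cb / (np.linalg.norm(ca) * np.linalg.norm(cb)))

def binary_loo_cv(acts_a, acts_b):
    """Leave-one-out cross-validated accuracy of a linear domain probe
    (assumed probe family, not the paper's stated method)."""
    X = np.vstack([acts_a, acts_b])
    y = np.concatenate([np.zeros(len(acts_a)), np.ones(len(acts_b))])
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, X, y, cv=LeaveOneOut()).mean()
```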
Key References
- McEntire (2026) — Structural Entanglement (AI-26): establishes the geometric phenomenon across four architectures
- McEntire (2026) — The Entanglement Theorem (AI-27): formal proof that entanglement is generic under a non-degeneracy condition
- McEntire (2026) — Entanglement-Optimal Fine-Tuning (AI-28): discovers the architecture-specific collapse under B3
- McEntire (2026) — Entangled Directions (AI-25): discovers the discrimination–activation dissociation