
AI Model Autophagy: When AI Systems Consume Themselves


Dale Rutherford, Ph.D.

April 7, 2026



The Problem Hiding in Plain Sight

Every time a large language model is retrained, it consumes a corpus assembled from the internet. And the internet is, increasingly, large language model output. Research repositories, news synthesis layers, social media summarizers, knowledge bases: model-generated text propagates into every channel from which training data is harvested. When that data feeds the next training cycle, the model ingests a distorted reflection of itself.


This is not a hypothetical scenario. Shumailov et al. (2024) documented model collapse dynamics under recursive training conditions. Alemohammad et al. (2024) demonstrated that self-consuming generative models go MAD (their term, and the analogy is precise). Briesch et al. (2023) showed empirically that output diversity degrades under self-consuming training loops. What has been missing is a formal quantitative framework for predicting decay rates, diagnosing multi-scale causal mechanisms, and evaluating the efficacy of governance interventions.


The Model Autophagy research initiative addresses that gap directly.

 

Model autophagy transforms the concept of knowledge degradation from metaphor into measurable governance science.

 

What Model Autophagy Actually Means

The term borrows deliberately from cell biology. In biological systems, autophagy denotes a process by which a cell degrades its own components under stress, a maintenance mechanism that, when dysregulated, accelerates pathological decline rather than recovery. The structural parallel to recursive AI training is tighter than it first appears: both involve self-referential feedback loops, both exhibit threshold-dependent phase transitions, and both produce path-dependent irreversibility once the degradation process passes a critical state.


In the LLM context, the mechanism operates as follows. Users interact with AI systems that reinforce their existing cognitive patterns, narrowing query diversity within sessions. That narrowed output propagates to the web and to research corpora through real-time augmentation pipelines. Data procurement systems, optimizing for cost rather than provenance integrity, select synthetic content at increasing rates. Models retrained on this contaminated corpus inherit the distortions and amplify them into the next generation. Each cycle compounds the previous cycle’s errors. 

Critical Structural Finding

Once errors and biases become embedded in model weights through retraining, they propagate forward with each subsequent cycle. Governance can slow this process; it cannot, under current architectures, reverse it. This path-dependency is the central challenge the framework is designed to characterize.

The Formal State-Space

The research program’s core contribution is a coupled system of differential and recurrence equations governing eight interacting state variables across four temporal scales. This is not a descriptive taxonomy; it is a parametric model producing falsifiable predictions about decay trajectories, intervention thresholds, and governance efficacy under specific parameter regimes.

The primary state variables are:

| Symbol | Name | Description |
|---|---|---|
| I(t) | Corpus Integrity | Primary decay metric. Declines monotonically under autophagy pressure. Target for governance stabilization. |
| B(t) | Bias Index | Amplifies under confirmation-loop feedback. Feeds back into I(t) via training contamination. |
| M(t) | Misinformation Index | Propagates through research and social dissemination channels. Accelerates after the provenance opacity threshold. |
| E(t) | Error Propagation | Compounds across retraining generations. Partially irreversible once embedded in model weights. |
| P(t) | Provenance Integrity | Decays toward 0 as the synthetic share S(t) increases and data lineage becomes untraceable. |
| S(t) | Synthetic Data Share | Proportion of the training corpus that is model-generated. Primary driver of autophagic contamination. |
| H(t) | Homogeneity Index | Measures semantic diversity loss. Rising H(t) signals echo-chamber intensification. |
| Q(t) | LLM Quality Index | Composite output quality. Degrades as I(t) declines and the B, M, and E indices rise. |

Key amplification parameters include the Feedback Injection Factor (FIF), which accelerates generational model drift, and the Bias Reduction Factor (BRF), which serves as the primary governance damping coefficient. The interaction between these parameters and the decay variables defines the core governance problem: under what FIF regimes can BRF-based interventions maintain I(t) above the critical collapse threshold near 0.5?

GENERALIZED CORPUS INTEGRITY RECURRENCE

I(n+1) = I(n) - lambda_I * S(n) * (1 - P(n)) + eta_G2 * w_p

B(n+1) = B(n) + FIF * B(n) * (1 - D(n)) - BRF * D(n)

P(n+1) = P(n) * (1 - gamma * S(n)) / (1 + mu * (N - 1))

The provenance equation’s multiplicative decay form is critical. An additive self-limiting form, whose decay term vanishes as P(t) approaches its baseline of 1.0, never initiates decay under realistic contamination conditions; the multiplicative form does. This is a subtle modeling distinction with significant empirical implications for the parameter estimation phase of the research.
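For readers who want to trace the dynamics concretely, here is a minimal Python sketch of the three recurrences above. Every coefficient value is an illustrative assumption, not a calibrated parameter from the research program; D(n) and N, which the equations reference but this excerpt does not define, are treated as an assumed constant detection signal and an assumed model count.

```python
# Minimal sketch of the recurrences above. All coefficient values
# are illustrative assumptions, not calibrated parameters from the
# research program. D (a governance detection signal) and N (the
# number of models sharing the corpus) are assumed constants.

def simulate(generations=10, governed=True):
    I, B, P, S = 1.0, 0.10, 1.0, 0.30            # initial states (assumed)
    lam_I, gamma, mu, N = 0.8, 0.35, 0.05, 3     # decay coefficients (assumed)
    FIF = 0.25                                   # feedback injection factor (assumed)
    # Governance levers are active only in the governed condition.
    D, BRF, eta_G2, w_p = (0.6, 0.15, 0.02, 0.5) if governed else (0.0,) * 4
    rows = []
    for n in range(1, generations + 1):
        I = I - lam_I * S * (1 - P) + eta_G2 * w_p    # corpus integrity
        B = B + FIF * B * (1 - D) - BRF * D           # bias index
        P = P * (1 - gamma * S) / (1 + mu * (N - 1))  # provenance integrity
        S = min(1.0, S + 0.05)     # synthetic share creeps upward (assumed)
        I = max(0.0, min(1.0, I))
        B = max(0.0, B)
        rows.append((n, I, B, P))
    return rows

for n, I, B, P in simulate(governed=False):
    print(f"gen {n}: I={I:.3f}  B={B:.3f}  P={P:.3f}")
```

Sweeping FIF against BRF in this sketch is one way to probe the governance question posed above: the regime boundary where the governed trajectory holds I(t) above 0.5 falls out directly from the loop.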

A Four-Layer Causal Architecture

The model operates across four temporally stratified layers. Governance interventions must be targeted at the appropriate layer to be effective. Applying a corpus-level control to a session-level dynamic does not slow the decay process; it addresses the wrong feedback loop entirely.

| Layer | Name & Timescale | Description | Governance Node |
|---|---|---|---|
| L1 | User-Agent Interaction (seconds–minutes) | Confirmation bias B(t) amplifies through query-response loops. The Ratchet Effect (the user’s next query shaped by the prior AI response) operates here. | G1: SymPrompt+ Diversity Injection |
| L2 | Training Data Contamination (weeks–months) | I(t), P(t), S(t), and M(t) interact as model output propagates to corpora. Provenance opacity renders lineage untraceable. | G2: Provenance Filter & Deduplication |
| L3 | Multi-Generation Propagation (quarters–years) | Parameter drift Δθ compounds across retraining cycles, amplified by FIF. Q(t) degrades as contamination accumulates in model weights. | G3: MIDCOT Drift Detection (SPC) |
| L4 | Systemic Impact & Governance (continuous) | B(t), M(t), and E(t) operate at civilizational scale. Epistemic reliability of the digital knowledge ecosystem becomes the operative concern. | G4: ALAGF Governance Orchestration |

What Governance Can and Cannot Do

The model’s governance findings are carefully calibrated. Three-tier intervention—session diversity injection at L1, provenance filtering at L2, and SPC-based drift detection at L3—can slow autophagic decay from an exponential to a linear regime. Empirical simulation results show a recognizable plateau forming around the Year 2 intervention point when all three governance nodes are active.


But the plateau does not represent recovery. It represents managed decline. Once contamination is embedded in model weights, it propagates forward. The structural inertia of prior training generations, combined with persistent economic pressure toward low-cost synthetic data acquisition, prevents governance from reversing the trajectory without active restoration mechanisms that do not yet exist at scale. 

Core Governance Finding

Governance interventions operationalized through provenance weighting (w_p), diversity quotas (w_div), and human-to-synthetic balance controls (w_bal) demonstrably slow I(t) decay. They shift the trajectory from exponential collapse to linear decline. The critical threshold where exponential behavior initiates appears near I(t) = 0.5, making early detection, before this threshold is crossed, the key operational requirement for any viable anti-autophagy governance architecture.

The implication for governance architects is direct: the detection problem precedes the intervention problem. SPC-based control charts tied to I(t) and B(t) trajectories, calibrated to trigger before the 0.5 threshold, are the operational prerequisite for effective governance. This is the design target of the Anti-Autophagy Monitor prototype.
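The operational shape of that requirement can be shown with a small control-chart sketch. This is not the MIDCOT implementation; the baseline window, the three-sigma limit, and the 0.55 early-warning floor are all assumptions chosen for illustration.

```python
# SPC-style early-warning sketch for I(t). Not the MIDCOT
# implementation: baseline window, 3-sigma multiplier, and the
# 0.55 early-warning floor are illustrative assumptions.

from statistics import mean, stdev

def drift_alarm(trajectory, baseline_n=5, k=3.0, floor=0.55):
    """Return the first index where I(t) breaches the lower control
    limit or the early-warning floor, or None if it never does."""
    baseline = trajectory[:baseline_n]
    lcl = mean(baseline) - k * stdev(baseline)    # Shewhart lower limit
    for gen, value in enumerate(trajectory[baseline_n:], start=baseline_n):
        if value < lcl or value < floor:          # fire before I(t) = 0.5
            return gen
    return None

# Illustrative trajectory: slow decay, then the exponential knee.
series = [0.98, 0.97, 0.97, 0.96, 0.95, 0.91, 0.84, 0.72, 0.58, 0.49]
print(drift_alarm(series))   # fires at index 5, well before the 0.5 crossing
```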

 


Standards Alignment and Regulatory Anchoring

The governance framework is not free-floating. Every intervention parameter maps to an established standards framework, making the research actionable within existing compliance infrastructures rather than requiring novel regulatory constructs.


ISO/IEC 42001: AI management system controls correspond directly to the governance weight parameters (w_p, w_div, w_bal) and the G1–G4 intervention nodes.


NIST AI Risk Management Framework: The GOVERN, MAP, MEASURE, and MANAGE functions align structurally with the four research phases, with the MEASURE function mapping specifically to Anti-Autophagy Monitor metrics.


EU AI Act: High-risk system requirements for data governance and logging provide the regulatory grounding for the provenance infrastructure argument embedded in the P(t) decay thesis.

IEEE Ethics Guidelines: Transparency and accountability principles underpin the epistemic reliability argument and the L4 civilizational-scale framing.


For AI governance officers, quality engineers, and risk managers working within these frameworks, the model provides something those frameworks currently lack: operational metrics for epistemic integrity. ISO/IEC 42001 and NIST RMF specify that measurable controls are required. They do not specify what to measure when the integrity concern is recursive synthetic contamination. This framework fills that gap.
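In practice, that gap-filling can be as simple as registering each governance weight as a named, measurable control. The mapping below is a hypothetical sketch: the evidence fields and framework groupings are my shorthand, not citations to specific ISO/IEC 42001 clauses or NIST RMF subcategories.

```python
# Hypothetical control registry linking governance parameters to
# compliance frameworks. Framework labels are illustrative
# shorthand, not authoritative clause references.

CONTROL_REGISTRY = {
    "w_p":   {"control": "provenance weighting",
              "evidence": "share of corpus with verified lineage",
              "frameworks": ["ISO/IEC 42001", "EU AI Act data governance"]},
    "w_div": {"control": "diversity quota",
              "evidence": "semantic diversity vs. H(t) baseline",
              "frameworks": ["ISO/IEC 42001", "NIST AI RMF MEASURE"]},
    "w_bal": {"control": "human-to-synthetic balance",
              "evidence": "S(t) per training generation",
              "frameworks": ["ISO/IEC 42001", "NIST AI RMF MANAGE"]},
}

for param, spec in CONTROL_REGISTRY.items():
    print(f"{param}: {spec['control']} -> {', '.join(spec['frameworks'])}")
```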

 

Empirical Validation and Next Steps

The research program is currently in Phase 3, empirical validation. Controlled retraining experiments using GPT-2 as a surrogate model are running across five replicate tracks under non-governed baseline conditions, followed by governed-condition experiments designed to test the efficacy claims formally. The experimental design uses ten retraining generations per track with calibrated synthetic-to-human data blends.


Validation targets Mann-Kendall monotonicity tests for the core decay hypotheses (H1: monotonic I(t) decline; H2: monotonic growth in the B, M, and E indices), alongside parameter estimation for FIF and the decay-rate coefficients from empirical trajectories. Model collapse is hypothesized to manifest near generations 7–8 under baseline conditions, with governed conditions delaying collapse onset by at least three generations at statistically significant margins.
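For a sense of what the H1 test involves, here is a hand-rolled Mann-Kendall check on an illustrative integrity trajectory. The trajectory values are invented for demonstration, and the test omits the tie correction, a reasonable simplification for continuous integrity scores.

```python
# Hand-rolled Mann-Kendall trend test for the H1 check (monotonic
# I(t) decline). No tie correction; the trajectory below is
# illustrative, not an experimental result.

from math import sqrt
from scipy.stats import norm

def mann_kendall(x):
    n = len(x)
    # S counts concordant minus discordant pairs across all i < j.
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - norm.cdf(abs(z)))   # two-sided p-value
    return s, z, p

I_t = [0.98, 0.96, 0.95, 0.91, 0.86, 0.80, 0.71, 0.60, 0.51, 0.44]
s, z, p = mann_kendall(I_t)
print(f"S={s}, Z={z:.2f}, p={p:.5f}")   # negative S: monotonic decline
```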


The Anti-Autophagy Monitor simulator (v10, publication-ready) is deployed at darutherford.github.io/model-autophagy. It implements the full equation system with interactive governance parameter controls, allowing researchers and governance practitioners to explore the model’s behavior under user-specified parameter regimes without replicating the computational infrastructure.

 

References

Alemohammad, S., Casco-Rodriguez, J., Luzi, L., Humayun, A. I., Babaei, H., LeJeune, D., Siahkoohi, A., & Baraniuk, R. G. (2024). Self-consuming generative models go MAD. In Proceedings of the Twelfth International Conference on Learning Representations (ICLR 2024). https://doi.org/10.48550/arXiv.2307.01850


Briesch, M., Sobania, D., & Rothlauf, F. (2023). Large language models suffer from their own output: An analysis of the self-consuming training loop. arXiv. https://doi.org/10.48550/arXiv.2311.16822


International Organization for Standardization & International Electrotechnical Commission. (2023). ISO/IEC 42001:2023: Information technology -- Artificial intelligence -- Management system. https://www.iso.org/standard/81230.html


Rutherford, D. (2026). Large language model autophagy: Quantifying epistemic decay and governance intervention in recursive AI training ecosystems. Ethical AI Review, 1(1.1). https://doi.org/10.5281/zenodo.19453508


Shumailov, I., Shumaylov, Z., Zhao, Y., Papernot, N., Anderson, R., & Gal, Y. (2024). AI models collapse when trained on recursively generated data. Nature, 631(8022), 755-759. https://doi.org/10.1038/s41586-024-07566-y


Tabassi, E. (2023). Artificial intelligence risk management framework (AI RMF 1.0) (NIST AI 100-1). National Institute of Standards and Technology, U.S. Department of Commerce. https://doi.org/10.6028/NIST.AI.100-1

  

About the Author

Dale A. Rutherford, Ph.D., is an AI Governance Strategist and the Principal Investigator at The Center for Ethical AI. He is the architect of MIDCOT, ALAGF, SymPrompt+, and the BME Metric Suite. His research aligns with ISO/IEC 42001, the NIST AI RMF, and the IEEE Ethics Guidelines.



