
A geometric account of why post-alignment fine-tuning can pull model behavior back toward earlier training-history manifolds.
Jun 26, 2026

A talk on how safety can silently degrade when models are adapted after deployment, even without malicious intent.
Feb 9, 2026