Looped language model training cannot control hidden-state norm growth because RMSNorm normalizes scale away before the loss ...
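The scale-invariance behind that claim can be illustrated with a minimal NumPy sketch. It is not the post's training setup: `rms_norm`, `hidden`, and `gain` are hypothetical names, and the example only shows that RMSNorm maps a hidden state and a 1000x larger copy of it to (almost) the same output, so anything computed after the norm carries no information about the input's scale.

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # Divide by the root-mean-square over the last axis, then apply a gain.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return gain * x / rms

rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 16))  # hypothetical hidden states
gain = np.ones(16)                     # hypothetical learned gain, left at 1

small = rms_norm(hidden, gain)
large = rms_norm(1000.0 * hidden, gain)  # same direction, 1000x the norm

# The outputs agree up to the tiny eps term, so a loss computed
# downstream of the norm cannot tell the two inputs apart.
scale_invariant = np.allclose(small, large, atol=1e-4)
```

Under these assumptions, a loss applied after the norm would give essentially no gradient signal along the scale direction of the input, which is the mechanism the headline points to.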
All my agents needed was a few codified workflows to follow ...