Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞): RT @che_shr_cat: 1/ Matrix-aware optimizers like Muon feel like algorithmic black magic. But it turns out they aren't heuristics at all. Th…
Story: Matrix-Aware Optimizers like Muon Explained as Exact Continuous-Time Gradient Flows