t = 0
What it tests: races the candidate architectures against each other on empty grids (16²/4, 24²/6, 32²/10): a homogeneous shared policy, a per-agent identity residual, hardcoded explorer/relay roles, a learned skill-selector (with and without a congestion price), a scale-curriculum arm, and a decentralized-critic ablation. (For the controlled from-scratch vs. warm-started transfer A/B, see the dedicated Warm-start tab.)