What it tests: the single best architecture we have (explorer/relay role split, a learned-Lagrangian SOFT connectivity constraint with no hard mask, a frontier-attention LPAC explorer, and SLAM wall-sensing on), run on all four maps at 32×32 with 10 agents and scored as cover_r = 0 REAL visited coverage. Each map has a different god-view-oracle ceiling (open 72% · rooms 37% · mixed 31% · crowded 40%), so every tile is read as coverage-vs-optimal: not the raw percent, but real coverage divided by the most that is reachable in 100 steps on that map. The question is whether the policy is uniformly good across very different terrain, or whether corridors and clutter break it.
| Map | Oracle optimal | Real coverage | Coverage ÷ optimal | Real connectivity | Seeds |
|---|---|---|---|---|---|
| computing from the rendered rollouts… | | | | | |
Scrub any rollout below to watch the explorer/relay split work the map. The table above is computed live from the rendered rollouts (not hand-typed), so it self-corrects whenever the tiles are re-rendered.
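
The coverage ÷ optimal column can be reproduced offline with a few lines of Python. This is only a sketch, not the page's actual rendering code: the oracle ceilings are the ones quoted above, while the function names, the `(map_name, real_coverage)` input format, and the coverage values in the usage lines are hypothetical and exist purely for illustration.

```python
# Oracle ceilings quoted in the text: best coverage reachable in 100 steps per map.
ORACLE_CEILING = {"open": 0.72, "rooms": 0.37, "mixed": 0.31, "crowded": 0.40}


def coverage_vs_optimal(map_name, real_coverage):
    """Return real coverage as a fraction of the map's oracle ceiling."""
    return real_coverage / ORACLE_CEILING[map_name]


def summarize(rollouts):
    """Aggregate per-seed rollouts into one row per map.

    `rollouts` is a list of (map_name, real_coverage) pairs, one per seed.
    This input format is an assumption, not the page's real data layout.
    """
    by_map = {}
    for map_name, real_coverage in rollouts:
        by_map.setdefault(map_name, []).append(real_coverage)

    rows = {}
    for map_name, coverages in by_map.items():
        mean_coverage = sum(coverages) / len(coverages)
        rows[map_name] = {
            "oracle": ORACLE_CEILING[map_name],
            "real": mean_coverage,
            "ratio": coverage_vs_optimal(map_name, mean_coverage),
            "seeds": len(coverages),
        }
    return rows


if __name__ == "__main__":
    # Hypothetical coverage values, NOT measured results.
    demo = [("open", 0.36), ("open", 0.36), ("rooms", 0.185)]
    for name, row in summarize(demo).items():
        print(name, row)
```

Dividing by the per-map ceiling is what makes the maps comparable: the same raw percentage can be close to optimal on a map with a low ceiling and far from it on the open map.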