Warm-start → bigger · 16×16/4 parent → 32×32/10 child · 2 arms × 2 maps × 2 seeds (8 + 8 runs)

Warm-start → bigger — train small, then scale up

What it tests: train a policy at 16×16 / 4 agents (the parent), then warm-start a 32×32 / 10-agent run from it (the child) and see how far that initialisation carries. Runs cover two maps (open and rooms), two seeds, and both perception arms (base = SLAM-only, occ = occupancy belief). The 16² parents are shown alongside for context (labelled 16² parent). The comparison that matters is the warm-started 32² child against a 32² baseline trained from scratch.

Key finding (measured live from the rendered tiles below): this is the win. The warm-started 16²→32² base run reaches ≈ 60% coverage on open with ≈ 100% connectivity, versus ≈ 44% from scratch, breaking the coverage↔connectivity trade-off that every from-scratch run pays. Occupancy hurts here too: occ trails base on open (see the live line above for the exact gap).
Warm-started 32² children — per arm × map · mean of 2 seeds · rollout coverage + giant-component connectivity (100-step re-roll)
Stage | Arm | Map | Coverage | Connectivity | Seeds
computing from the rendered rollouts…

The run list is grouped by arm, with each warm-started 32² child and its 16² parent adjacent under that arm. Both metric columns (coverage and connectivity) are computed live from the rendered rollouts; connectivity is the giant-component fraction, averaged over the steps of the rollout.
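As a rough illustration of the connectivity metric described above, the sketch below computes a giant-component fraction per step and averages it over a rollout. It is not the page's actual implementation. It assumes that two agents count as linked when they are within a fixed Euclidean communication radius; the source does not state the link rule. The names `giant_component_fraction`, `mean_connectivity`, and `comm_radius` are hypothetical.

```python
import numpy as np


def giant_component_fraction(positions, comm_radius):
    """Fraction of agents in the largest connected component at one step.

    Assumption: agents i and j are linked when their Euclidean distance
    is <= comm_radius. The real link rule may differ.
    """
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    if n == 0:
        return 0.0

    # Pairwise distances -> boolean adjacency matrix.
    diff = positions[:, None, :] - positions[None, :, :]
    adjacent = np.linalg.norm(diff, axis=-1) <= comm_radius

    seen = np.zeros(n, dtype=bool)
    largest = 0
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over one connected component.
        stack = [start]
        seen[start] = True
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in np.flatnonzero(adjacent[node] & ~seen):
                seen[neighbour] = True
                stack.append(int(neighbour))
        largest = max(largest, size)
    return largest / n


def mean_connectivity(rollout, comm_radius):
    """Mean giant-component fraction over all steps of a rollout.

    `rollout` is a sequence of per-step position arrays, each (n_agents, 2).
    """
    return float(np.mean([giant_component_fraction(p, comm_radius)
                          for p in rollout]))


# Two-step toy rollout with 4 agents (made-up positions, not run data).
toy_rollout = [
    [[0, 0], [1, 0], [5, 5], [9, 9]],  # largest component has 2 of 4 agents
    [[0, 0], [1, 0], [2, 0], [3, 0]],  # chain: all 4 agents connected
]
print(mean_connectivity(toy_rollout, comm_radius=1.5))  # 0.75
```

Averaging the per-step fraction, rather than reading it off the final step only, rewards a team that stays connected throughout the rollout, which matches the "mean over the rollout" description.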
