t = 0
What it tests: a controlled ablation that toggles three ingredients of the best recipe one at a time, on three maps (open · rooms · crowded), three seeds each. The four arms are role_soft_bump (explorer/relay roles + soft-λ connectivity + a bump-explore reward term), base_soft_bump (drop the roles → homogeneous), role_lag_bump (swap the soft penalty for a learned-Lagrangian one), and role_soft_flat (drop the bump term → flat reward). Reading the four side by side isolates what each piece is worth.
| Recipe | What it changes | Mean coverage | vs floor | runs |
|---|---|---|---|---|
| computing from the rendered rollouts… | ||||
The run list is grouped by recipe, then map — expand a recipe to compare it across terrain. The table is computed live from the rendered rollouts, so it self-corrects on every re-render.