A team of autonomous agents explores an unknown grid world under partial observability, communicating only over a limited-range, time-varying graph. Each agent acts on its own local sensing; the objective is scored at the mission level. The hard part is the coupling: the swarm must cover ground while keeping the comm graph connected, and pushing harder on one tends to cost the other.
This gallery collects trained-policy rollouts, every one rebuilt from its checkpoint and re-rolled for 100 steps. The headline set — all finished balthar runs, organised into five categorised experiments — answers what gave what at a glance: the best architecture across maps, a recipe ablation, the coverage↔connectivity frontier, an occupancy A/B, and a warm-start scale-up. Below them sit the earlier gallery families for reference. Each card is one self-contained experiment — labelled with what it tests and its key finding — and opens that experiment's own page with a live summary table computed from the rollouts; every page shares the same interactive viewer: press play to watch a rollout, scrub the slider, change speed.