Experiment 03 · obstacle maps
Obstacles — reward & barrier
A factorial study on obstacle worlds: three reward-shaping variants
(baseline, up-weighted coverage, info-gain) crossed with a connectivity barrier (on/off), run
across all three architectures (homogeneous, explorer/relay roles, learned selector) and three
scales (16²/4, 24²/6, 32²/10).
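To make the size of the design concrete, the sketch below enumerates the full factorial grid described above. It is only an illustration: the identifiers (`REWARDS`, `BARRIER`, `ARCHITECTURES`, `SCALES`, `build_conditions`) are hypothetical names, not taken from the experiment's code, and the scale entries are kept as opaque labels copied from the text rather than parsed into fields.

```python
from itertools import product

# Factor levels as listed in the study description.
# All names here are illustrative assumptions, not the experiment's real config keys.
REWARDS = ("baseline", "upweighted_coverage", "info_gain")
BARRIER = (False, True)  # connectivity barrier OFF / ON
ARCHITECTURES = ("homogeneous", "explorer_relay", "learned_selector")
SCALES = ("16^2/4", "24^2/6", "32^2/10")  # labels copied verbatim, left unparsed


def build_conditions():
    """Return one dict per cell of the full factorial design."""
    return [
        {"reward": r, "barrier": b, "architecture": a, "scale": s}
        for r, b, a, s in product(REWARDS, BARRIER, ARCHITECTURES, SCALES)
    ]


conditions = build_conditions()
print(len(conditions))  # 3 rewards x 2 barrier settings x 3 architectures x 3 scales = 54
```

Under these assumptions the design has 54 cells; the best-performing cell reported in the key finding would correspond to `reward="upweighted_coverage"`, `barrier=False`, `architecture="explorer_relay"`.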
Key finding
Explorer/relay roles with the up-weighted coverage reward and the connectivity barrier
off give the best coverage. Switching the barrier on costs coverage across the
board. The info-gain shaping is reward-hacked: agents game the shaping term and coverage
collapses to near zero. Coverage also falls steeply as the grid scales from 16² to 32².