Optimising Worlds to Evaluate and Influence Reinforcement Learning Agents