08-2025

Thu 17 / 08 / 2025

I am working on a loose replication of the experiments in this video
- I love them
- They are excellent to test my setup
The environment has been create already
- 2D arena, grid style
- Agents have 4 discrete actions (up, down, left and right)
- Part of the arena has light, the other is dark
- If the agent is in the dark, it loses 1 health point per turn, 0 otherwise.
- The agents start with 100 health points
- The simulation last for 200 clocks, or until all the agents are dead
The integration of OpenAI-ES, Agent design and the environment worked successfully on the first experiment: Agents have the capability to detect light, can move in 4 directions, and the objective is to “be in the light”.
Following changes to the underlying data-structures in the optimizer broke the system for a while. It is solved now
- This underscores the critical problem of testing
2nd experiment made was successful: the agent detect it’s own health (-1 health point each time step in the dark, 0 in the light). The agents have learned successfully to find the light
3rd experiment was successful: no input at all.