Sat 16 / 08 / 2025

Status update

I am pushing in the direction of Alife and multi-agent modeling at the moment.

  • I am working on a loose replication of the experiments in this video
    • I love them
    • They are an excellent test for my setup
  • The environment has already been created using the python arcade package
    • 2D arena, grid style
    • Agents have 4 discrete actions (up, down, left and right)
    • Part of the arena has light, the other is dark
    • If the agent is in the dark, it loses 1 health point per turn, 0 otherwise.
    • The agents start with 100 health points
    • The simulation lasts for 200 clocks, or until all the agents are dead
    • The agents don’t interact with each other
    • More than 1 agent can exist on the same location
  • Done experiments
    • The experiments in the following points share these fixed parameters:
      • 1000 generations cap
      • 32 individuals
      • 8 to 10 evaluations per generation to assess its performance
      • Each agent has a neural network with a single input (its meaning depends on the experiment), a hidden state of 8 neurons, 1 recurrent connection, and 4 outputs (one per direction)
    • The integration of OpenAI-ES, the agent design, and the environment worked successfully in the first experiment: agents can detect light, can move in 4 directions, and the objective is to “be in the light”.
    • The 2nd experiment was successful: the agents detect their own health (-1 health point each time step in the dark, 0 in the light). They successfully learned to find the light.
    • The 3rd experiment (no input at all) was mostly successful: some runs failed, some succeeded.
    • The 4th experiment failed: the light’s placing is randomized for each evaluation.
      • This is unlikely to be due to a bug in the code at this stage.
      • I suspect the neural network controllers are too small to learn a search pattern, and / or the number of evaluations is not sufficient.
  • After the first experiment, I changed the data structures used in the optimizer, which broke the system for a while. It is fixed now.
    • This underscores how critical testing is.
  • Logging
    • I added simple logging via SQLite
    • I looked into mlflow, but I am concerned that its underlying assumptions are simply different
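To make the setup above concrete, here is a minimal, self-contained sketch of the environment loop. All names, the arena size, and the light geometry are my own illustrative assumptions, not the actual code; the real environment is built with python arcade on top of this kind of logic:

```python
import random

GRID_W, GRID_H = 32, 32          # assumed arena size; the notes don't specify it
LIGHT_X_MIN = GRID_W // 2        # assumption: the right half of the arena is lit
ACTIONS = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}  # up, down, left, right

class Agent:
    def __init__(self):
        self.x = random.randrange(GRID_W)
        self.y = random.randrange(GRID_H)
        self.health = 100            # agents start with 100 health points

    def in_light(self):
        return self.x >= LIGHT_X_MIN

def step(agents, choose_action):
    """One clock tick: move every living agent, then apply the light/dark rule."""
    for a in agents:
        if a.health <= 0:
            continue                 # dead agents no longer act
        dx, dy = ACTIONS[choose_action(a)]
        a.x = max(0, min(GRID_W - 1, a.x + dx))  # clamp to the arena;
        a.y = max(0, min(GRID_H - 1, a.y + dy))  # several agents may share a cell
        if not a.in_light():
            a.health -= 1            # -1 per turn in the dark, 0 in the light

def run_episode(agents, choose_action, max_ticks=200):
    """The simulation lasts 200 clocks, or until all the agents are dead."""
    for t in range(max_ticks):
        if all(a.health <= 0 for a in agents):
            return t
        step(agents, choose_action)
    return max_ticks
```

A fitness score could then be the remaining health summed over agents, averaged over the 8 to 10 evaluations per generation.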

Persisting issues

  • Multiprocessing: I’ve struggled a lot with this. The level of parallelism I want is at the evaluation level. It seems that pyglet (the backend of python arcade), or python arcade itself, doesn’t play well with macOS in headless mode
    • Potential simulation time savings from this are considerable: for 8 evaluations, the time went from ~40 seconds to ~17 seconds when using 10 processes on my laptop
    • Suggestion: re-implement the engine using built-in Python data structures, then integrate it inside python arcade for the sake of visualization and interactive testing. In headless mode, I can then run the bare engine only.
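A rough sketch of that suggestion, with illustrative names and a stand-in evaluation body: keep the engine free of any arcade/pyglet import so each evaluation can run in a worker process, and only wrap it in arcade when a window is actually wanted.

```python
import random
from multiprocessing import Pool

def evaluate(seed):
    """One fitness evaluation. Pure Python, no arcade/pyglet imports,
    so it is safe to run inside a worker process in headless mode.
    The episode body below is a placeholder for the real engine loop."""
    rng = random.Random(seed)
    health = 100
    for _ in range(200):              # a 200-clock episode
        if rng.random() < 0.5:        # stand-in for "agent is in the dark"
            health -= 1
        if health <= 0:
            break
    return health

def evaluate_parallel(n_evals=8, workers=10):
    """Fan the independent evaluations out across processes;
    the ~40 s -> ~17 s saving noted above came from exactly this split."""
    with Pool(workers) as pool:
        return pool.map(evaluate, range(n_evals))

if __name__ == "__main__":
    print(evaluate_parallel(n_evals=8, workers=4))
```

Seeding each evaluation explicitly also keeps runs reproducible regardless of which worker picks up which evaluation.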

TODO

  • Short term
    • Address the multiprocessing problem with the earlier suggestion
    • More experiments
      • Limited resources
        • What if an agent cannot take the place of another agent?
        • Give agents a “kill agent in the direction of forward movement” action. Will it be used?
      • Complex environment
        • What if the light changes location from one evaluation to another (the failed 4th experiment above)? I expect some search pattern to emerge in the agents
        • Add obstacles in the playground
  • Longer term
    • Discretize the agent’s memory and start looking into its content
    • Communication between the agents
    • Implement different optimization algorithms
      • NEAT/HyperNEAT, Genetic Algorithm
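For the “limited resources” idea above (one agent per cell), the rule could be enforced with a simple occupancy check before committing a move. A minimal sketch, with illustrative names of my own:

```python
def try_move(pos, delta, occupied, grid_w=32, grid_h=32):
    """Return the new position, or the old one if the move is blocked.
    `pos` and `delta` are (x, y) tuples; `occupied` is the set of cells
    currently held by *other* agents; the 32x32 grid size is an assumption."""
    nx, ny = pos[0] + delta[0], pos[1] + delta[1]
    if not (0 <= nx < grid_w and 0 <= ny < grid_h):
        return pos                    # blocked by the arena boundary
    if (nx, ny) in occupied:
        return pos                    # target cell already taken
    return (nx, ny)
```

Processing agents in a freshly shuffled order each tick would avoid giving a fixed subset of agents permanent priority over contested cells.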

29/06/2025

This morning I am experimenting with some physics in a car racing game (demo video here). This is changing by the minute. The aim is to build a track and move the car around it as fast as possible without hitting the walls or the boundaries of the screen (with a penalty for doing so). The score will be based on speed and penalties.


28/06/2025

I’ve been working on Lazy Tetris for a while now. Its core is implemented and stable; the main remaining task is to finalize the project.

This is a quick recap:

  1. I’ve tried to use arcade Sections to better manage the View (the main grid, the next shape, etc.), but without success. For some reason, the view started to capture my keystrokes twice, and I couldn’t resolve this from the docs.
  2. I managed to use Nuitka to build an executable on my Apple Silicon (dope!).
    1. To target Windows: it seems easy to do using the MinGW64 compiler.
    2. To target Linux: it requires a Docker container. I think I should build a flow for this later, but I can’t be bothered at the moment.
  3. I removed the initial menu and the leaderboard. It didn’t feel right. I just wanted to play, and all these buttons were distracting me from what I want to do.
    1. I will just keep the top score, and add that to the status bar for reference

Update: So, this is done for now. I’ve released the code and made a very silly video about it.

For the ReadMe file in the repo, I tried for the first time to generate it using GitHub Copilot: I basically gave it the code as context, with a short description of what the game does. The result was satisfying: I had to edit it afterwards, but it was nice all in all.

I am not sure yet how to properly communicate this kind of work, whether on reddit or HN. At least some friends appreciated it.

Anyhow, here we are.