Context

For several weeks now, I’ve been working on building a simple game, to test ideas and emergent behaviors from agents learning playing a game. The learning is done by either evolutionary algorithms (the current status) or probably later down the road, reinforcement learning.

My stack is simple: python-arcade for the game development, pygmo for the evolutionary algorithms, and pytorch for the neural networks.

The first game is a proof of concept: two agents, one of them has to shoot the other. Each agent is controlled by a neural network, and the neural network is trained using evolutionary algorithms. Both agents use the same neural network. For the sake of simplicity, the neural network takes as input the position of both agents, and outputs the action to take (move forward, fire…etc).

The problem

For several weeks now, there was a stagnation in my progress. A big reason for this was the a problem that happened after nearly ~1800*32 evaluation times: the optimization program just freezes.

To describe what happens, there is class holding the problem: evaluate the quality of a solution (in this case, a neural network that represent the tank policy), by playing 32 games (the number of cores I’ve available), and measure the average score. The 32 games are played in parallel, thanks to the multiprocessing library in python.

The parallel evaluation exists inside the class, something like this:

# A snipped from the problem class
class Problem:
    def evaluate(self, solution):
        # ...
        with multiprocessing.Pool(processes=32) as pool:
            results = pool.map(worker, range(32))
        # ...

where the worker is basically a game run.

The issue is that after a while, the whole thing just freezes. The 32 processes are still running, but they are not doing anything. The CPU usage is idle. The memory usage is idle. The processes are still alive, but they are not doing anything. They are just stuck.

This kind of problems is probably the most annoying kind: a ghost in the wires. There are no traces that I can find of why there is a problem, and it is difficult to reproduce.

Without better information, I went through a process of elimination: I started with the pygmo optimization library. So, I built a small optimization library from scratch, with a couple of algorithms, benchmarked them, and then used it instead. Yet, the problem persisted.

I tried to use a signal to kill the processes after a certain time, but the problem persisted.

I then accidentally moved the with multiprocessing.Pool(processes=32) as pool part outside the class, and the problem disappeared. I thought it was a coincidence, but it wasn’t. The problem was gone.

Why is that?

After asking chatgpt, it seems that the use of the multiprocessing library inside a class implies that the class will have to pickled and unpickled. The pickled class is passed to the new processes. This can lead to unexpected behavior or even deadlocks if the class state cannot be pickled or if it has side effects when pickled and unpickled.

By moving the multiprocessing part outside the class, there is no need to pickle the class, and the problem is gone.

According to ChatGPT, to avoid pickling issues in python, the advice is:

  1. Use top-level functions: Functions that are defined at the top level of a module are easier to pickle and unpickle. Avoid using lambda functions or functions defined inside other functions or methods.
  2. Avoid non-picklable objects: Some objects cannot be pickled, including open file or network connections, threads, processes, stack frames, deeply recursive classes, and others. If you need to share such objects between processes, consider using a server process or file-based sharing.

…I hope this is right.