In this master’s degree project, an experiment was conducted to compare how agents trained with reinforcement learning behave during an epidemic. The goal of the project was to answer the question: how does the spread rate of a pandemic virus affect the behavior of agents trained with reinforcement learning? To answer the question, an artefact was created. The artefact consists of an experiment environment populated by agents trained with reinforcement learning. The environment is a large square containing 20 agents and 20 goals, with five of each placed on each edge of the square. The environment runs in multiple episodes, where each episode consists of 3000 actions that the agents can take. At the beginning of each episode, four agents become sick. Each agent’s objective is to reach a goal on the opposite edge without getting sick, and the agents were trained using reinforcement learning to achieve this behavior. Six different agent models were trained in the experiment scenario. Each model was trained with a unique combination of how fast the virus spreads and how strongly the agents should prioritize not getting sick. In the experiment, each combination was run for 300 episodes in the experiment environment. During the experiment, information about each episode was logged, such as distances, collisions and the number of agents reaching a goal. The results of the experiments indicate that the spread rate of the virus made the agents prioritize not getting sick over reaching the goal. They did this by keeping a longer average distance from each other.
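As an illustration only, the episode setup described above can be sketched in Python. This is a minimal sketch and not the thesis implementation: the side length of the square, the even spacing along each edge, the pairing of each agent with one specific goal, and all names (`Agent`, `edge_points`, `reset_episode`) are assumptions introduced here. Only the counts (20 agents, 20 goals, five per edge, four sick agents at the start, 3000 actions per episode) come from the abstract.

```python
import random
from dataclasses import dataclass

SIDE = 100.0              # assumed side length; not stated in the abstract
PER_EDGE = 5              # five agents and five goals per edge (from the abstract)
INITIAL_SICK = 4          # agents that start each episode sick (from the abstract)
STEPS_PER_EPISODE = 3000  # actions per episode (from the abstract)

EDGES = ("bottom", "right", "top", "left")
OPPOSITE = {"bottom": "top", "top": "bottom", "left": "right", "right": "left"}


@dataclass
class Agent:
    position: tuple
    start_edge: str
    goal: tuple
    sick: bool = False


def edge_points(edge, n=PER_EDGE, side=SIDE):
    """Return n evenly spaced points on one edge (even spacing is an assumption)."""
    offsets = [side * (i + 1) / (n + 1) for i in range(n)]
    if edge == "bottom":
        return [(x, 0.0) for x in offsets]
    if edge == "top":
        return [(x, side) for x in offsets]
    if edge == "left":
        return [(0.0, y) for y in offsets]
    if edge == "right":
        return [(side, y) for y in offsets]
    raise ValueError(f"unknown edge: {edge}")


def reset_episode(rng):
    """Place 20 agents, give each a goal on the opposite edge, infect four."""
    agents = []
    for edge in EDGES:
        starts = edge_points(edge)
        # Assumption: each agent is paired with one fixed goal; the abstract
        # only says an agent should reach "a goal on the opposite edge".
        goals = edge_points(OPPOSITE[edge])
        for start, goal in zip(starts, goals):
            agents.append(Agent(position=start, start_edge=edge, goal=goal))
    # Assumption: the initially sick agents are chosen uniformly at random.
    for agent in rng.sample(agents, INITIAL_SICK):
        agent.sick = True
    return agents
```

Calling `reset_episode(random.Random(0))` would return one such starting configuration; a full episode would then step every agent for `STEPS_PER_EPISODE` actions using the trained policy, which is outside the scope of this sketch.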
There is other digital material (e.g. film, image or audio files) or models/artifacts that belong to the thesis and need to be archived.