Högskolan i Skövde

Reinforcement learning behavior during an epidemic
University of Skövde, School of Informatics.
2021 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

This master’s degree project presents an experiment comparing how agents trained with reinforcement learning behave during an epidemic. The goal of the project was to answer the question: how does the spread rate of a pandemic virus affect the behavior of agents trained with reinforcement learning? To answer the question, an artefact was created. The artefact consists of an experiment environment that uses agents trained with reinforcement learning. The environment is a large square containing 20 agents and 20 goals, with five of each placed on each edge of the square. The environment runs in multiple episodes, where each episode consists of 3000 actions the agents can take. At the beginning of each episode, 4 agents become sick. Each agent’s goal is to reach a goal on the opposite edge without getting sick, and the agents were trained using reinforcement learning to achieve this behavior. Six different agent models were trained in the experiment scenario. Each model was trained with a unique combination of how fast the virus spreads and how strongly the agents should prioritize not getting sick. In the experiment, each combination was run for 300 episodes in the experiment environment. During the experiment, information about each episode was logged, such as distance, collisions and agents reaching the goal. The results of the experiments indicate that the spread rate of the virus led agents to prioritize not getting sick over reaching the goal. They did this by keeping a longer average distance from each other.
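The environment layout described in the abstract can be illustrated with a minimal sketch. This is not the thesis artefact: the side length, the even spacing along each edge, the pairing of each agent with a goal directly across the square, and all function and field names are assumptions made for illustration. Only the counts (five agents and five goals per edge, 4 initially sick agents, 3000 actions per episode) come from the abstract.

```python
import itertools
import math
import random

SIDE = 100.0               # assumed side length; the abstract gives no dimensions
PER_EDGE = 5               # five agents and five goals on each edge (from the abstract)
INITIAL_SICK = 4           # agents that are sick at the start of an episode (from the abstract)
STEPS_PER_EPISODE = 3000   # actions per episode (from the abstract); unused in this layout sketch

EDGES = ("bottom", "right", "top", "left")
OPPOSITE = {"bottom": "top", "top": "bottom", "left": "right", "right": "left"}


def edge_points(edge, n=PER_EDGE, side=SIDE):
    """Return n evenly spaced points on one edge (even spacing is an assumption)."""
    offsets = [side * (i + 1) / (n + 1) for i in range(n)]
    if edge == "bottom":
        return [(o, 0.0) for o in offsets]
    if edge == "top":
        return [(o, side) for o in offsets]
    if edge == "left":
        return [(0.0, o) for o in offsets]
    if edge == "right":
        return [(side, o) for o in offsets]
    raise ValueError(f"unknown edge: {edge}")


def reset_episode(rng):
    """Place 20 agents, give each a goal on the opposite edge, infect 4 at random."""
    agents = []
    for edge in EDGES:
        starts = edge_points(edge)
        goals = edge_points(OPPOSITE[edge])  # hypothetical pairing: straight across
        for start, goal in zip(starts, goals):
            agents.append({"pos": start, "goal": goal, "sick": False})
    for idx in rng.sample(range(len(agents)), INITIAL_SICK):
        agents[idx]["sick"] = True
    return agents


def mean_pairwise_distance(agents):
    """Average distance over all agent pairs, one possible per-step distance metric."""
    pairs = list(itertools.combinations(agents, 2))
    return sum(math.dist(a["pos"], b["pos"]) for a, b in pairs) / len(pairs)


agents = reset_episode(random.Random(0))
print(len(agents), sum(a["sick"] for a in agents))  # prints "20 4"
```

A metric such as `mean_pairwise_distance` could be logged at every step to compare how far apart agents stay under different spread rates; how the thesis actually measured distance is not stated in the abstract.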

Place, publisher, year, edition, pages
2021. p. 2, 34
Keywords [en]
Crowd simulation, reinforcement learning, epidemic
National Category
Information Systems, Social aspects
Identifiers
URN: urn:nbn:se:his:diva-20011
OAI: oai:DiVA.org:his-20011
DiVA, id: diva2:1574746
Subject / course
Information Technology
Educational program
Data Science - Master’s Programme
Supervisors
Examiners
Note

There is other digital material (e.g. film, image or audio files) or models/artifacts that belong to the thesis and need to be archived.

Available from: 2021-06-29. Created: 2021-06-29. Last updated: 2021-06-29. Bibliographically approved.

Open Access in DiVA

No full text in DiVA
