Högskolan i Skövde

Reinforcement learning behavior during an epidemic
Högskolan i Skövde, Institutionen för informationsteknologi.
2021 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

This master’s degree project presents an experiment comparing how agents trained with reinforcement learning behave during an epidemic. The goal of the project was to answer the question: how does the spread rate of a pandemic virus affect the behavior of agents trained with reinforcement learning? To answer it, an artefact was created: an experiment environment populated by agents trained with reinforcement learning. The environment is a large square containing 20 agents and 20 goals, with five of each placed on each edge of the square. The environment runs in multiple episodes, and each episode consists of 3000 actions that the agents can take. At the beginning of each episode, 4 agents become sick. Each agent’s objective is to reach a goal on the opposite edge without getting sick, and the agents were trained using reinforcement learning to achieve this behavior. Six different agent models were trained in the experiment scenario, each with a unique combination of how fast the virus spreads and how strongly the agents should prioritize not getting sick. In the experiment, each combination was run for 300 episodes in the experiment environment. During the experiment, information about each episode was logged, such as distance, collisions, and agents reaching the goal. The results indicate that the spread rate of the virus made agents prioritize not getting sick over reaching the goal. They did this by keeping a longer average distance from each other.
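The environment layout described in the abstract can be illustrated with a minimal sketch. This is not the thesis artefact: the side length, the even spacing along each edge, the infection radius, and all function names (`edge_points`, `reset`, `spread`, `mean_pairwise_distance`) are assumptions made for illustration. Only the counts (20 agents, 20 goals, five per edge, 4 initially sick, 3000 actions per episode) come from the abstract. The trained policies themselves are not modeled.

```python
import math
import random
from dataclasses import dataclass

SIDE = 100.0              # assumed side length of the square (not given in the abstract)
AGENTS_PER_EDGE = 5       # from the abstract: five agents and five goals per edge
INITIALLY_SICK = 4        # from the abstract
STEPS_PER_EPISODE = 3000  # from the abstract: actions per episode


@dataclass
class Agent:
    x: float
    y: float
    goal: tuple
    sick: bool = False


def edge_points(edge, n=AGENTS_PER_EDGE, side=SIDE):
    """Return n evenly spaced points on one edge (0=bottom, 1=right, 2=top, 3=left)."""
    offsets = [side * (i + 1) / (n + 1) for i in range(n)]
    if edge == 0:
        return [(o, 0.0) for o in offsets]
    if edge == 1:
        return [(side, o) for o in offsets]
    if edge == 2:
        return [(o, side) for o in offsets]
    return [(0.0, o) for o in offsets]


def reset(rng):
    """Place 20 agents, each with a goal on the opposite edge, and infect 4 of them."""
    agents = []
    for edge in range(4):
        starts = edge_points(edge)
        goals = edge_points((edge + 2) % 4)  # opposite edge
        for start, goal in zip(starts, goals):
            agents.append(Agent(start[0], start[1], goal))
    for agent in rng.sample(agents, INITIALLY_SICK):
        agent.sick = True
    return agents


def spread(agents, rng, spread_rate, radius=5.0):
    """Assumed infection rule: a healthy agent within `radius` of a sick agent
    becomes sick with probability `spread_rate` per nearby sick agent."""
    sick = [a for a in agents if a.sick]  # snapshot, so new cases do not spread this step
    for a in agents:
        if a.sick:
            continue
        for s in sick:
            close = (a.x - s.x) ** 2 + (a.y - s.y) ** 2 <= radius ** 2
            if close and rng.random() < spread_rate:
                a.sick = True
                break


def mean_pairwise_distance(agents):
    """Average distance over all agent pairs, one possible per-episode distance metric."""
    pairs = [(a, b) for i, a in enumerate(agents) for b in agents[i + 1:]]
    return sum(math.hypot(a.x - b.x, a.y - b.y) for a, b in pairs) / len(pairs)
```

In a sketch like this, `spread_rate` stands in for the "how fast the virus spreads" setting that the abstract varies between the six models, and `mean_pairwise_distance` stands in for the logged distance measure; how the thesis actually defines either is not stated in this record.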

Place, publisher, year, edition, pages
2021. p. 2, 34
Keywords [en]
Crowd simulation, reinforcement learning, epidemic
National Category
Identifiers
URN: urn:nbn:se:his:diva-20011
OAI: oai:DiVA.org:his-20011
DiVA, id: diva2:1574746
Subject / course
Informationsteknologi
Educational program
Data Science - Master’s Programme
Supervisors
Examiners
Note

There is other digital material (e.g. film, image, or audio files) or models/artifacts that belong to the thesis and need to be archived.

Available from: 2021-06-29. Created: 2021-06-29. Last updated: 2021-06-29. Bibliographically approved.

Open Access in DiVA

No full text in DiVA
