Högskolan i Skövde

How to train a self-driving vehicle: On the added value (or lack thereof) of curriculum learning and replay buffers
University of Skövde, School of Informatics. University of Skövde, Informatics Research Environment. (Interaction Lab) ORCID iD: 0000-0003-0093-3655
University of Skövde, School of Informatics. University of Skövde, Informatics Research Environment. (Interaction Lab) ORCID iD: 0000-0002-6568-9342
University of Skövde, School of Informatics. University of Skövde, Informatics Research Environment. (Interaction Lab) ORCID iD: 0000-0003-3129-4892
Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands. ORCID iD: 0000-0003-1177-4119
2023 (English). In: Frontiers in Artificial Intelligence, E-ISSN 2624-8212, Vol. 6, article id 1098982. Article in journal (Refereed). Published
Abstract [en]

Learning from only real-world collected data can be unrealistic and time consuming in many scenarios. One alternative is to use synthetic data as learning environments to learn rare situations, and replay buffers to speed up the learning. In this work, we examine how the creation of the environment affects the training of a reinforcement learning agent through auto-generated environment mechanisms. We take the autonomous vehicle as an application. We compare the effect of two approaches to generating training data for artificial cognitive agents. We consider the added value of curriculum learning (just as in human learning) as a way to structure novel training data that the agent has not seen before, as well as that of using a replay buffer to train further on data the agent has seen before. In other words, the focus of this paper is on characteristics of the training data rather than on learning algorithms. We therefore use two tasks that are commonly trained early on in autonomous vehicle research: lane keeping and pedestrian avoidance. Our main results show that curriculum learning indeed offers an additional benefit over a vanilla reinforcement learning approach (using Deep Q-Learning), but the replay buffer actually has a detrimental effect in most (but not all) combinations of data generation approaches we considered here. The benefit of curriculum learning does depend on the existence of a well-defined difficulty metric with which various training scenarios can be ordered. In the lane-keeping task, we can define it as a function of the curvature of the road: the sharper and more frequent the curves on the road, the more difficult the task. Defining such a difficulty metric in other scenarios is not always trivial. In general, the results of this paper emphasize both the importance of considering data characterization, such as curriculum learning, and the importance of defining an appropriate metric for the task.
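
The two data-handling ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Road` class, the `difficulty` formula (peak curvature multiplied by the fraction of curved segments), the `curve_threshold` value, and the `ReplayBuffer` class are all hypothetical names and assumptions chosen for illustration only. The abstract states only that difficulty is some function of road curvature.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Road:
    """Hypothetical road description: one curvature value (1/m) per segment."""
    name: str
    curvatures: List[float]


def difficulty(road: Road, curve_threshold: float = 0.01) -> float:
    """Assumed metric: sharper and more frequent curves give a higher score."""
    curves = [abs(c) for c in road.curvatures if abs(c) > curve_threshold]
    if not curves:
        return 0.0  # a straight road is the easiest case
    sharpness = max(curves)                        # how sharp the worst curve is
    frequency = len(curves) / len(road.curvatures)  # share of curved segments
    return sharpness * frequency


def curriculum(roads: List[Road]) -> List[Road]:
    """Order training scenarios from easiest to hardest."""
    return sorted(roads, key=difficulty)


class ReplayBuffer:
    """Fixed-size store of past transitions; the oldest entries are dropped."""

    def __init__(self, capacity: int) -> None:
        self._items: deque = deque(maxlen=capacity)

    def add(self, transition: Tuple) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Tuple]:
        # Uniform sampling without replacement from the stored transitions.
        k = min(batch_size, len(self._items))
        return rng.sample(list(self._items), k)

    def __len__(self) -> int:
        return len(self._items)


roads = [
    Road("winding", [0.05, 0.08, 0.0, 0.06]),
    Road("straight", [0.0, 0.0, 0.0, 0.0]),
    Road("gentle", [0.0, 0.02, 0.0, 0.0]),
]
ordered_names = [r.name for r in curriculum(roads)]

buffer = ReplayBuffer(capacity=2)
for step in range(3):
    # (state, action, reward, next_state) placeholders
    buffer.add((step, 0, 0.0, step + 1))
batch = buffer.sample(batch_size=2, rng=random.Random(0))
```

In this sketch the curriculum decides the order in which unseen roads are presented, while the buffer re-serves transitions the agent has already experienced, which mirrors the distinction the abstract draws between novel and previously seen training data.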

Place, publisher, year, edition, pages
Frontiers Media S.A., 2023. Vol. 6, article id 1098982
Keywords [en]
data generation, curriculum learning, cognitive-inspired learning, reinforcement learning, replay buffer, self-driving cars
National Category
Computer Sciences; Computer Vision and Robotics (Autonomous Systems); Robotics
Research subject
Interaction Lab (ILAB)
Identifiers
URN: urn:nbn:se:his:diva-22215
DOI: 10.3389/frai.2023.1098982
ISI: 000928959000001
PubMedID: 36762255
Scopus ID: 2-s2.0-85147654896
OAI: oai:DiVA.org:his-22215
DiVA, id: diva2:1732411
Funder
EU, Horizon 2020, 731593
Note

CC BY 4.0

Received 15 November 2022, Accepted 05 January 2023, Published 25 January 2023

This article was submitted to Machine Learning and Artificial Intelligence, a section of the journal Frontiers in Artificial Intelligence

This article is part of the Research Topic Artificial Intelligence and Autonomous Systems

Correspondence: Sara Mahmoud, sara.mahmoud@his.se

Part of this work was funded under the Horizon 2020 project DREAMS4CARS, Grant No. 731593.

Available from: 2023-01-31 Created: 2023-01-31 Last updated: 2023-05-04 Bibliographically approved

Open Access in DiVA

fulltext (1928 kB), 88 downloads
File information
File name: FULLTEXT01.pdf
File size: 1928 kB
Checksum (SHA-512): 0d9ca0e289bff98226ef50ce02e83b660bdc990641d3351444fe41d076e1f9bb6702d3c01326ae4ffe97d289fadc3e4f7473abf4c55082c6e117f5eee3f06a83
Type: fulltext
Mimetype: application/pdf

Authority records

Mahmoud, Sara; Billing, Erik; Svensson, Henrik; Thill, Serge
