Högskolan i Skövde

Humanoids learning to walk: a natural CPG-actor-critic architecture
University of Skövde, School of Humanities and Informatics. University of Skövde, The Informatics Research Centre. ORCID iD: 0000-0002-7236-997X
University of Skövde, School of Humanities and Informatics. University of Skövde, The Informatics Research Centre. ORCID iD: 0000-0002-1525-0745
University of Skövde, School of Humanities and Informatics. University of Skövde, The Informatics Research Centre. ORCID iD: 0000-0001-6883-2450
2013 (English). In: Frontiers in Neurorobotics, ISSN 1662-5218, Vol. 7, no 5. Article in journal (Refereed). Published
Abstract [en]

The identification of learning mechanisms for locomotion has been the subject of much research for some time, but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) offers a promising method to adaptively link the dynamic system to the environment it interacts with via a reward-based value system. In this paper, we propose a model that integrates these perspectives and apply it to the case of a humanoid (NAO) robot learning to walk, an ability that emerges from its value-based interaction with the environment. In the model, a simplified central pattern generator (CPG) architecture inspired by neuroscientific research and DST is integrated with an actor-critic approach to RL (cpg-actor-critic). In the cpg-actor-critic architecture, least-squares temporal-difference based learning converges to the optimal solution quickly by using natural gradient learning and balancing exploration and exploitation. Furthermore, rather than using a traditional (designer-specified) reward, it uses a dynamic value function as a stability indicator that adapts to the environment. The results obtained are analyzed using a novel DST-based embodied cognition approach. Learning to walk, from this perspective, is a process of integrating levels of sensorimotor activity and value.
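
The abstract names least-squares temporal-difference learning as the basis of the critic. As a rough illustration of that one ingredient only, the sketch below runs plain LSTD(0) value estimation on a hypothetical three-state chain. The chain, the one-hot features, the discount factor of 0.9, and all function names are assumptions made for this example. It is not the paper's cpg-actor-critic model and does not cover the CPG, the natural gradient actor update, or the dynamic value function.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: some feature was never active")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def lstd(transitions, features, n_features, gamma):
    """LSTD(0): accumulate A and b over all transitions, then solve A w = b."""
    a = [[0.0] * n_features for _ in range(n_features)]
    b = [0.0] * n_features
    for state, reward, next_state in transitions:
        phi = features(state)
        # A terminal next state (None) contributes a zero feature vector.
        if next_state is None:
            phi_next = [0.0] * n_features
        else:
            phi_next = features(next_state)
        for i in range(n_features):
            b[i] += phi[i] * reward
            for j in range(n_features):
                a[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
    return solve_linear(a, b)


def one_hot(state, n=3):
    """Tabular features: one indicator per state (an assumption of this sketch)."""
    return [1.0 if i == state else 0.0 for i in range(n)]


# Hypothetical chain 0 -> 1 -> 2 -> terminal, with reward 1.0 on the final step.
episode = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
weights = lstd(episode, one_hot, n_features=3, gamma=0.9)
```

With one-hot features the learned weights are the state values themselves, so `weights` can be compared directly with the discounted returns of the chain. LSTD solves for the weights in closed form from the collected transitions instead of taking incremental gradient steps on the value estimate.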

Place, publisher, year, edition, pages
Frontiers Media S.A., 2013. Vol. 7, no 5
Keywords [en]
reinforcement learning, humanoid walking, central pattern generators, actor-critic, dynamical systems theory, embodied cognition, value system
National Category
Computer and Information Sciences
Research subject
Technology
Identifiers
URN: urn:nbn:se:his:diva-8368
DOI: 10.3389/fnbot.2013.00005
ISI: 000209437600005
PubMedID: 23675345
Scopus ID: 2-s2.0-84902356043
OAI: oai:DiVA.org:his-8368
DiVA, id: diva2:639509
Note

CC BY 3.0

Available from: 2013-08-08. Created: 2013-08-08. Last updated: 2024-05-21. Bibliographically approved

Authority records

Li, Cai; Lowe, Robert; Ziemke, Tom
