Högskolan i Skövde

Dynamic multi-tour order picking in an automotive-part warehouse based on attention-aware deep reinforcement learning
School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; Department of Production Engineering, KTH Royal Institute of Technology, Stockholm, Sweden.
School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; State Key Laboratory of Intelligent Manufacturing System Technology, Beijing, China.
Department of Production Engineering, KTH Royal Institute of Technology, Stockholm, Sweden. ORCID iD: 0000-0001-8679-8049
Department of Sustainable Production Development, KTH Royal Institute of Technology, Stockholm, Sweden. ORCID iD: 0000-0003-4180-6003
2025 (English). In: Robotics and Computer-Integrated Manufacturing, ISSN 0736-5845, E-ISSN 1879-2537, Vol. 94, article id 102959. Article in journal (Refereed), Published
Abstract [en]

Dynamic order picking has a significant impact on production efficiency in warehouse management. In the context of an automotive-part warehouse, this paper addresses a dynamic multi-tour order-picking problem using a novel attention-aware deep reinforcement learning (ADRL) method. "Multi-tour" means that one order-picking task must be split into multiple tours because of the cart capacity and the operator's workload constraints. First, the multi-tour order-picking problem is formulated as a mathematical model and then reformulated as a Markov decision process. Second, a novel DRL-based method is proposed to solve it effectively. Unlike existing DRL-based methods, this approach employs multi-head attention to perceive the warehouse situation. Additionally, three improvements are proposed to further strengthen solution quality and generalization: (1) an extra location representation to align the batch length during training, (2) dynamic decoding to integrate real-time information about the warehouse environment during inference, and (3) proximal policy optimization with an entropy bonus to facilitate action exploration during training. Finally, comparison experiments on thousands of order-picking instances from a Swedish warehouse show that the proposed ADRL outperforms twelve other DRL-based methods by up to 40.6% on the optimization objective. Furthermore, the performance gap between ADRL and seven evolutionary algorithms (EAs) stays within 3%, while ADRL solves instances hundreds to thousands of times faster than these EAs.
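
As an illustration of the proximal policy optimization with an entropy bonus mentioned in the abstract, the generic clipped surrogate loss can be sketched as follows. This is the standard textbook form of the objective, not the authors' implementation; the function name `ppo_loss_with_entropy`, the clipping range of 0.2, and the entropy coefficient of 0.01 are assumptions chosen for illustration.

```python
import math


def ppo_loss_with_entropy(new_logp, old_logp, advantages, action_probs,
                          clip_eps=0.2, entropy_coef=0.01):
    """Return a loss to minimize: -(clipped surrogate + entropy bonus).

    new_logp / old_logp: log-probabilities of the taken actions under the
        current and the data-collecting policy (one value per sample).
    advantages: advantage estimates, one per sample.
    action_probs: full action distribution of the current policy per sample.
    clip_eps and entropy_coef are hypothetical default values.
    """
    surrogate_sum = 0.0
    entropy_sum = 0.0
    for lp_new, lp_old, adv, probs in zip(new_logp, old_logp,
                                          advantages, action_probs):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (minimum) choice between the two surrogate terms.
        surrogate_sum += min(ratio * adv, clipped * adv)
        # Shannon entropy of the action distribution; higher entropy
        # means the policy is still exploring.
        entropy_sum += -sum(p * math.log(p) for p in probs if p > 0.0)
    n = len(advantages)
    return -(surrogate_sum / n + entropy_coef * entropy_sum / n)
```

A larger `entropy_coef` rewards more uniform action distributions, which is the mechanism by which an entropy bonus encourages exploration during training.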

Place, publisher, year, edition, pages
Elsevier, 2025. Vol. 94, article id 102959
Keywords [en]
Smart manufacturing system, Industry 5.0, Manual order picking, Deep reinforcement learning, Intelligent decision-making
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:his:diva-24924
DOI: 10.1016/j.rcim.2025.102959
ISI: 001401135400001
Scopus ID: 2-s2.0-85214875132
OAI: oai:DiVA.org:his-24924
DiVA, id: diva2:1940220
Projects
Dynamic Scheduling of Assembly and Logistics Systems using AI (Dynamic SALSA)
Funder
Vinnova
Note

© 2025 Published by Elsevier Ltd.

Corresponding author at: School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China

The authors would like to acknowledge the support of the Swedish Innovation Agency (VINNOVA). This study is part of the Dynamic Scheduling of Assembly and Logistics Systems using AI (Dynamic SALSA) project. This research is also supported by the National Key R&D Program of China (No. 2023YFB3308201).

Available from: 2025-02-25. Created: 2025-02-25. Last updated: 2025-09-29. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Wang, Lihui; Ruiz Zúñiga, Enrique

Search in DiVA

By author/editor
Wang, Lihui; Ruiz Zúñiga, Enrique; Wang, Xi Vincent