Högskolan i Skövde

Real-Time Automatic Checkout via Prompt-Based Product Extraction and Cross-Domain Learning
University of Skövde, School of Engineering Science. University of Skövde, Virtual Engineering Research Environment. Jönköping University, Sweden; Itab Shop Products AB, Jönköping, Sweden. (Virtual Production Development (VPD)) ORCID iD: 0000-0001-8880-7965
Dept. of Computer Science and Informatics, Jönköping University, Sweden. ORCID iD: 0000-0003-2900-9335
Dept. of Computing, Jönköping University, Sweden.
2024 (English). In: Proceedings 2024 International Conference on Machine Learning and Applications ICMLA 2024: Miami, Florida 18-20 December 2024 / [ed] M. Arif Wani; Plamen Angelov; Feng Luo; Mitsunori Ogihara; Xintao Wu; Radu-Emil Precup; Ramin Ramezani; Xiaowei Gu, IEEE, 2024, p. 1396-1403. Conference paper, Published paper (Refereed)
Abstract [en]

Automatic checkout systems are designed to predict a complete shopping receipt from an image of the checkout area. These systems require high classification accuracy across numerous classes and must operate in real time, despite domain differences between training data and real-world conditions. Building on recent advancements, we propose a method that outperforms current solutions and can be applied in real time in automatic checkout systems. Our method leverages the Segment Anything Model to extract high-quality masks from lab product images, which are then transformed into synthetic checkout images and adapted to the real domain using contrastive unpaired translation. We train a product recognition model with data augmentation, named SCA+Y8, and further improve it through fine-tuning with pseudo-labels from unlabeled checkout images, resulting in an improved model called SCAFT+Y8. SCAFT+Y8 substantially improves on the state of the art, reaching an average receipt classification accuracy of 97.58%, and smaller model variants also perform strongly, indicating the potential for deployment on low-cost edge devices.
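
The abstract reports a receipt classification accuracy without defining it in this record. As a minimal sketch, assume a receipt counts as correct only when the predicted product counts match the ground truth exactly for the whole image. The detection format `(label, confidence)`, the `min_conf` threshold, and the function names below are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

# Hypothetical detection format: (product class, detector confidence).
Detection = Tuple[str, float]


def build_receipt(detections: Iterable[Detection], min_conf: float = 0.5) -> Counter:
    """Count detected products per class, ignoring low-confidence detections."""
    return Counter(label for label, conf in detections if conf >= min_conf)


def receipt_accuracy(predicted: Sequence[Counter], expected: Sequence[Counter]) -> float:
    """Fraction of images whose predicted receipt equals the expected receipt.

    Assumption: a single wrong class or count makes the whole receipt wrong.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


if __name__ == "__main__":
    # Two toy checkout images with made-up detections.
    images = [
        [("milk", 0.93), ("milk", 0.88), ("bread", 0.71)],
        [("soda", 0.95), ("chips", 0.32)],  # "chips" falls below min_conf
    ]
    truth = [
        Counter({"milk": 2, "bread": 1}),
        Counter({"soda": 1, "chips": 1}),
    ]
    receipts = [build_receipt(dets) for dets in images]
    print(receipt_accuracy(receipts, truth))  # 0.5
```

Under this strict definition, one missed or duplicated item fails the entire receipt, which is why per-receipt accuracy is typically a harder target than per-object detection metrics.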

Place, publisher, year, edition, pages
IEEE, 2024. p. 1396-1403
Series
International Conference on Machine Learning and Applications (ICMLA), ISSN 1946-0740, E-ISSN 1946-0759
Keywords [en]
Automatic Checkout, Domain Adaptation, Object Detection, YOLOv8, Contrastive Learning, Image enhancement, Image segmentation, Object recognition, Classification accuracy, Cross-domain learning, Domain differences, Objects detection, Real-time, Real-world, Training data
National Category
Computer Sciences; Computer graphics and computer vision
Research subject
Virtual Production Development (VPD)
Identifiers
URN: urn:nbn:se:his:diva-24982
DOI: 10.1109/ICMLA61862.2024.00217
ISI: 001468515500208
Scopus ID: 2-s2.0-105000879245
ISBN: 979-8-3503-7489-6 (print)
ISBN: 979-8-3503-7488-9 (electronic)
OAI: oai:DiVA.org:his-24982
DiVA, id: diva2:1949605
Conference
2024 International Conference on Machine Learning and Applications ICMLA 2024, Miami, Florida, 18-20 December 2024
Funder
Knowledge Foundation, 2020-0044
Swedish Research Council, 2022-06725
Note

© 2024 IEEE

The authors would like to thank ITAB Shop Products AB and Smart Industry Sweden (KKS-2020-0044) for their support. The machine learning training was enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725.

Available from: 2025-04-03 Created: 2025-04-03 Last updated: 2025-12-15 Bibliographically approved
In thesis
1. Product Recognition with OCR Text: Advancing Grocery Product Recognition through Robust Approaches, Fine-Grained Recognition, and Domain Adaptation for Real-Time Performance
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The physical retail sector faces challenges in improving operational efficiency, reducing costs, and enhancing customer experience. Over the past decade, companies have introduced product recognition technology solutions to improve checkout efficiency, inventory management, and fraud detection. However, most initiatives have struggled to scale or achieve sufficient accuracy due to the complex nature of physical retail, which includes a large number of products that continuously change, as well as varied environmental conditions. In parallel, academic research has tackled many of these challenges by providing datasets and new methods to improve recognition performance, but considerable challenges persist.

This thesis addresses three main challenges in grocery product recognition: robust recognition, recognition of visually similar products, and domain adaptation between different retail systems. To address these challenges, the work centers on the use of Optical Character Recognition (OCR) to extract textual information found on product packaging for product recognition. With extensive experiments and the creation of a dataset, the results show that OCR-based methods for product recognition can improve recognition robustness, enable more accurate differentiation between similar products, and also work across different retail systems.

Therefore, the main contribution of this thesis is the development and validation of these OCR text-based methods and approaches, specifically designed to address the requirements in physical retail.
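
As an illustration of how OCR text could drive product recognition, the sketch below matches a noisy OCR string against reference packaging text by string similarity. This is an assumption-laden toy, not the thesis method: the catalog contents, the `match_product` helper, and the `min_score` threshold are all hypothetical, and the OCR step itself is assumed to have already produced the input string.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(text.lower().split())


def match_product(
    ocr_text: str,
    catalog: Dict[str, str],
    min_score: float = 0.6,
) -> Optional[Tuple[str, float]]:
    """Return (product_id, similarity) for the best catalog match, or None.

    Similarity is difflib's ratio in [0, 1]; matches below min_score are rejected.
    """
    query = normalize(ocr_text)
    best_id: Optional[str] = None
    best_score = 0.0
    for product_id, reference in catalog.items():
        score = SequenceMatcher(None, query, normalize(reference)).ratio()
        if score > best_score:
            best_id, best_score = product_id, score
    if best_id is None or best_score < min_score:
        return None
    return best_id, best_score


if __name__ == "__main__":
    # Hypothetical catalog: product id -> text printed on the packaging.
    catalog = {
        "oat-milk": "Oatly Oat Drink 1L",
        "tomato-soup": "Heinz Tomato Soup 400g",
    }
    # OCR output with a dropped character and different casing.
    print(match_product("OATLY oat drnk 1L", catalog))
    # Text unrelated to any catalog entry is rejected.
    print(match_product("xyz", catalog))
```

Plain string similarity tolerates small OCR errors but does not weigh the tokens that separate near-identical products (for example a fat percentage or pack size), which is the kind of fine-grained distinction the thesis targets.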

Abstract [sv]

The physical retail sector faces challenges in improving operational efficiency, lowering costs, and strengthening the customer experience. Over the past decade, companies have introduced solutions based on product recognition technology to streamline checkout processes, improve inventory management, and detect fraud. However, most initiatives have struggled to scale or reach sufficient accuracy because of the dynamic nature of retail, which is characterized by a large and constantly changing product assortment and varying store environments. In parallel, academic research has addressed many of these challenges by providing datasets and new methods to improve recognition performance, but significant challenges remain.

This thesis addresses three main challenges in product recognition in grocery retail: robust recognition, fine-grained recognition of visually similar products, and domain adaptation between different retail systems. To handle these challenges, the work focuses on using optical character recognition (OCR) to extract text information from product packaging for product recognition. Through extensive experiments and the creation of a dataset, the results show that OCR-based methods can improve recognition robustness, enable more accurate differentiation between products, and work across different retail environments.

The main contribution of the thesis is the development and validation of methods and approaches for product recognition with OCR text that meet the unique requirements of physical retail.

Place, publisher, year, edition, pages
Skövde: University of Skövde, 2025. p. xv, 140
Series
Dissertation Series ; 67
National Category
Computer graphics and computer vision; Natural Language Processing; Artificial Intelligence
Research subject
Virtual Production Development (VPD)
Identifiers
urn:nbn:se:his:diva-26062 (URN)
978-91-989080-7-7 (ISBN)
978-91-989080-8-4 (ISBN)
Public defence
2026-01-23, ASSAR Industrial Innovation Arena, Kavelbrovägen 2B, 541 36, Skövde, 09:15 (English)
Opponent
Supervisors
Note

One of six constituent papers (for the others, see the heading Delarbeten/List of papers):

6. Tobias Pettersson, Maria Riveiro, and Tuwe Löfström. “Real-Time OCR-Based Grocery Product Recognition with Orientation Alignment and Embedding-Driven Classification”. In: Accepted and presented at the International Conference on Machine Vision (ICMV 2025). 2025.

PUBLICATIONS WITH LOW RELEVANCE

7. Puneet Mishra, Aneesh Chauhan, and Tobias Pettersson. “Seeing through plastics: A novel combination of NIR hyperspectral imaging and spectral orthogonalization for detecting fresh fruit inside plastic packaging to support automated barcode less checkouts in supermarkets”. In: Food Control 150 (2023), p. 109762.

8. Faeze Zakaryapour Sayyad, Tobias Pettersson, Seyed Jalaleddin Mousavirad, Irida Shallari, and Mattias O’Nils. “AdAPT: Advertisement detector adaptation under newspaper domain shift with null-based pseudo-labeling”. In: Machine Learning with Applications (2025), p. 100806. DOI: https://doi.org/10.1016/j.mlwa.2025.100806.

Available from: 2025-12-15 Created: 2025-12-12 Last updated: 2025-12-15 Bibliographically approved

Authority records

Pettersson, Tobias; Riveiro, Maria
