Högskolan i Skövde

Predicting Customer Churn in Retailing
University of Skövde, School of Informatics. University of Skövde, Informatics Research Environment. ORCID iD: 0000-0001-5378-0862
Dept. of Computing, Jönköping University, Sweden.
Dept. of Information Technology, University of Borås, Sweden.
University of Skövde, School of Informatics. University of Skövde, Informatics Research Environment (Interaction Lab (iLab)). ORCID iD: 0000-0002-7554-2301
2022 (English). In: Proceedings 21st IEEE International Conference on Machine Learning and Applications ICMLA 2022: 12–14 December 2022 Nassau, The Bahamas / [ed] M. Arif Wani; Mehmed Kantardzic; Vasile Palade; Daniel Neagu; Longzhi Yang; Kit-Yan Chan, IEEE, 2022, p. 635-640.
Conference paper, Published paper (Refereed)
Abstract [en]

Customer churn is one of the most challenging problems for digital retailers. With significantly higher costs for acquiring new customers than retaining existing ones, knowledge about which customers are likely to churn becomes essential. This paper reports a case study where a data-driven approach to churn prediction is used for predicting churners and gaining insights about the problem domain. The real-world data set used contains approximately 200 000 customers, describing each customer using more than 50 features. In the pre-processing, exploration, modeling and analysis, attributes related to recency, frequency, and monetary concepts are identified and utilized. In addition, correlations and feature importance are used to discover and understand churn indicators. One important finding is that the churn rate highly depends on the number of previous purchases. In the segment consisting of customers with only one previous purchase, more than 75% will churn, i.e., not make another purchase in the coming year. For customers with at least four previous purchases, the corresponding churn rate is around 25%. Further analysis shows that churning customers in general, and as expected, make smaller purchases and visit the online store less often. In the experimentation, three modeling techniques are evaluated, and the results show that, in particular, Gradient Boosting models can predict churners with relatively high accuracy while obtaining a good balance between precision and recall. 
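The recency, frequency, and monetary (RFM) attributes and the per-segment churn rates described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy transactions, the snapshot date, and the pooling of customers with four or more purchases are all assumptions made for illustration. Only the churn definition (no purchase in the coming year) follows the abstract.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, snapshot):
    """Return {customer: (recency_days, frequency, monetary)} from purchases up to snapshot."""
    feats = {}
    for cust, day, amount in transactions:
        if day > snapshot:
            continue  # purchases after the snapshot are only used for labelling
        last, freq, total = feats.get(cust, (day, 0, 0.0))
        feats[cust] = (max(last, day), freq + 1, total + amount)
    return {c: ((snapshot - last).days, freq, total)
            for c, (last, freq, total) in feats.items()}


def churn_labels(transactions, snapshot, horizon_days=365):
    """Label a customer as churned if they make no purchase within the horizon after snapshot."""
    known = {c for c, day, _ in transactions if day <= snapshot}
    active = {c for c, day, _ in transactions
              if 0 < (day - snapshot).days <= horizon_days}
    return {c: c not in active for c in known}


def churn_rate_by_frequency(features, labels, cap=4):
    """Churn rate per number of previous purchases; counts >= cap are pooled (assumed choice)."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [churners, customers]
    for cust, (_, freq, _) in features.items():
        segment = min(freq, cap)
        counts[segment][0] += labels[cust]
        counts[segment][1] += 1
    return {seg: churned / n for seg, (churned, n) in sorted(counts.items())}


# Hypothetical toy data: (customer, purchase date, amount).
transactions = [
    ("a", date(2021, 3, 1), 20.0),
    ("a", date(2022, 2, 1), 35.0),
    ("b", date(2021, 5, 10), 15.0),
    ("c", date(2020, 11, 1), 50.0),
    ("c", date(2021, 6, 1), 40.0),
    ("c", date(2021, 9, 1), 60.0),
    ("c", date(2021, 12, 1), 30.0),
    ("c", date(2022, 4, 1), 45.0),
    ("d", date(2021, 8, 1), 10.0),
]
snapshot = date(2021, 12, 31)

features = rfm_features(transactions, snapshot)
labels = churn_labels(transactions, snapshot)
rates = churn_rate_by_frequency(features, labels)
print(rates)
```

In a real setting the resulting feature table would feed a classifier such as the Gradient Boosting models mentioned in the abstract; that step is left out here to keep the sketch dependency-free.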

Place, publisher, year, edition, pages
IEEE, 2022. p. 635-640
Keywords [en]
Sales, Case-studies, Churn rates, Correlation, Customer churn prediction, Customer churns, Digital retailing, Feature importance, High costs, RFM analysis, Top probability, Forecasting, correlations, top probabilities
National Category
Software Engineering; Business Administration; Computer Sciences
Research subject
Interaction Lab (ILAB)
Identifiers
URN: urn:nbn:se:his:diva-22430
DOI: 10.1109/ICMLA55696.2022.00105
ISI: 000980994900094
Scopus ID: 2-s2.0-85152214345
ISBN: 978-1-6654-6283-9 (electronic)
ISBN: 978-1-6654-6284-6 (print)
OAI: oai:DiVA.org:his-22430
DiVA, id: diva2:1751909
Conference
2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 12-14 December 2022, Nassau, Bahamas
Funder
Knowledge Foundation, 20160035, 20170215
Note

© 2022 IEEE.

The current research is a part of the Industrial Graduate School in Digital Retailing (INSiDR) at the University of Borås, funded by the Swedish Knowledge Foundation, grants nr. 20160035, 20170215.

Available from: 2023-04-20. Created: 2023-04-20. Last updated: 2023-10-03. Bibliographically approved.
In thesis
1. Data-driven decision support in digital retailing
2023 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

In the digital era and with the advent of artificial intelligence, digital retailing has emerged as a notable shift in commerce. It empowers e-tailers with data-driven insights and predictive models to navigate a variety of challenges, driving informed decision-making and strategic formulation. While predictive models are fundamental for making data-driven decisions, this thesis spotlights binary classifiers as a central focus. These classifiers reveal the complexities of two real-world problems, marked by their particular properties. Specifically, binary decisions are made based on predictions, but relying solely on predicted class labels is insufficient because classification accuracy varies. Furthermore, different prediction mistakes carry different costs, which impacts the utility.

To confront these challenges, probabilistic predictions, often unexplored or uncalibrated, are a promising alternative to class labels. Therefore, machine learning modelling and calibration techniques are explored, employing benchmark data sets alongside empirical studies grounded in industrial contexts. These studies analyse predictions and their associated probabilities across diverse data segments and settings. The thesis found, as a proof of concept, that specific algorithms are inherently calibrated, while others become reliable once their probabilities are calibrated. In both cases, the thesis concludes that utilising the top predictions with the highest probabilities increases the precision level and minimises the false positives. In addition, adopting well-calibrated probabilities is a powerful alternative to mere class labels. Consequently, by transforming probabilities into reliable confidence values through classification with a rejection option, a pathway emerges wherein confident and reliable predictions take centre stage in decision-making. This enables e-tailers to form distinct strategies based on these predictions and optimise their utility.

This thesis highlights the value of calibrated models and probabilistic prediction and emphasises their significance in enhancing decision-making. The findings have practical implications for e-tailers leveraging data-driven decision support. Future research should focus on producing an automated system that prioritises high and well-calibrated probability predictions while discarding others and optimising utilities based on the costs and gains associated with the different prediction outcomes to enhance decision support for e-tailers.
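The ideas of classification with a rejection option and of acting only on the top predictions with the highest probabilities can be sketched as follows. This is an illustrative sketch, not the thesis's implementation: the function names, the 0.8 threshold, and the toy probabilities and labels are assumptions, and the probabilities are assumed to be already calibrated.

```python
def classify_with_reject(probabilities, threshold=0.8):
    """Return 1 (positive), 0 (negative) or None (rejected) per positive-class probability."""
    decisions = []
    for p in probabilities:
        confidence = max(p, 1.0 - p)  # confidence in the more likely class
        if confidence < threshold:
            decisions.append(None)  # too uncertain: abstain from deciding
        else:
            decisions.append(1 if p >= 0.5 else 0)
    return decisions


def precision(decisions, labels):
    """Precision over accepted positive predictions; None if nothing was predicted positive."""
    hits = [y for d, y in zip(decisions, labels) if d == 1]
    if not hits:
        return None
    return sum(hits) / len(hits)


def top_k_precision(probabilities, labels, k):
    """Precision among the k instances with the highest positive-class probability."""
    ranked = sorted(zip(probabilities, labels), key=lambda t: t[0], reverse=True)[:k]
    return sum(y for _, y in ranked) / len(ranked)


# Hypothetical calibrated probabilities of the positive class and true labels.
probs = [0.95, 0.90, 0.70, 0.60, 0.40, 0.15, 0.05, 0.55]
truth = [1, 1, 0, 1, 0, 0, 0, 0]

plain = classify_with_reject(probs, threshold=0.5)    # never rejects
cautious = classify_with_reject(probs, threshold=0.8)

print(precision(plain, truth), precision(cautious, truth))
print(top_k_precision(probs, truth, k=2))
```

On this toy data the cautious classifier abstains on half of the instances and makes no false-positive predictions among the rest, which mirrors the trade-off the abstract describes: fewer decisions, but more reliable ones.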

Place, publisher, year, edition, pages
Skövde: University of Skövde, 2023. p. xiii, 108
Series
Dissertation Series ; 53
Keywords
Digital Retailing, Decision Support, Probabilistic Prediction, Calibration, Product Returns, Customer Churn, Binary Classification, Scikit-Learn
National Category
Other Computer and Information Science; Computer Sciences; Computer Systems; Software Engineering; Business Administration
Identifiers
URN: urn:nbn:se:his:diva-23279
ISBN: 978-91-987906-7-2
Presentation
2023-10-31, G111, Högskolan i Skövde, Skövde, 13:15 (English)
Funder
Knowledge Foundation
Note

The current thesis is a part of the industrial graduate school in digital retailing (INSiDR) at the University of Borås and funded by the Swedish Knowledge Foundation.

Available from: 2023-10-03. Created: 2023-10-03. Last updated: 2023-10-03. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Authority records

Sweidan, Dirar; Alenljung, Beatrice
