Recommender Systems Evaluation
Universidad Autónoma de Madrid, Madrid, Spain.
Högskolan i Skövde, Institutionen för informationsteknologi. Högskolan i Skövde, Forskningscentrum för Informationsteknologi. (Skövde Artificial Intelligence Lab (SAIL)). ORCID iD: 0000-0002-2929-0529
2018 (English). In: Encyclopedia of Social Network Analysis and Mining / [ed] Reda Alhajj, Jon Rokne, Springer, 2018, 2. Chapter in book, part of anthology (Refereed)
Place, publisher, year, edition, pages
Springer, 2018, 2.
HSV category
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
URN: urn:nbn:se:his:diva-15039
DOI: 10.1007/978-1-4939-7131-2_110162
ISBN: 978-1-4939-7130-5 (print)
ISBN: 978-1-4939-7131-2 (electronic)
ISBN: 978-1-4939-7132-9 (print)
OAI: oai:DiVA.org:his-15039
DiVA, id: diva2:1197264
Note

The evaluation of recommender systems (RSs) has been, and still is, the object of active research in the field. Since the advent of the first RS, recommendation performance has usually been equated with the accuracy of rating prediction, that is, estimated ratings are compared against actual ratings, and the differences between them are computed by means of the MAE and RMSE metrics. In terms of the effective utility of recommendations for users, there is, however, an increasing realization that the quality (precision) of a ranking of recommended items can be more important than the accuracy in predicting specific rating values. As a result, precision-oriented metrics are increasingly being considered in the field, and a large amount of recent work has focused on evaluating top-N ranked recommendation lists with such metrics. Besides that, dimensions other than accuracy – such as coverage, diversity, novelty, and serendipity – have recently been taken into account and analyzed when considering what makes a good recommendation (Said et al., 2014b; Cremonesi et al., 2011; McNee et al., 2006; Bellogín and de Vries, 2013; Bollen et al., 2010).

So, what makes a good evaluation? The realization that high prediction accuracy might not translate into higher perceived performance for users has brought a plethora of novel metrics and methods focusing on other aspects of recommendation (Said et al., 2013a; Castells et al., 2015; Vargas and Castells, 2014). Recent trends in evaluation methodologies point towards a shift away from traditional methods based solely on statistical analyses of static data, i.e., raising the precision of algorithms on offline data (Ekstrand et al., 2011b), offline data in this case being recorded user interactions such as movie ratings or product purchases.

Evaluation is the key to identifying how well an algorithm or a system works. Deploying a new algorithm in a system will have an effect on the overall performance of the system, in terms of accuracy as well as other types of metrics. Both prior to deploying the algorithm and after deployment, it is important to evaluate the system's performance. It is in the evaluation of an RS that one needs to decide what should be sought for, e.g., depending on whether the evaluation is to be performed from the users' perspective (accuracy, serendipity, novelty), the vendor's perspective (catalog, profit, churn), or even the technical perspective of the system running the RS (CPU load, training time, adaptability). Given the context of the system, there might be other perspectives as well; in summary, what is important is to define the Key Performance Indicator (KPI) that one wants to measure. Let us imagine an online marketplace where customers buy various goods: an improved recommendation algorithm could result in, e.g., an increased number of goods sold, more expensive goods sold, more goods sold from a specific section of the catalog, customers returning to the marketplace more often, etc. When evaluating a system like this, one needs to decide what is to be evaluated – what the sought-for quality is – and how it is going to be measured.
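To make the rating-prediction view concrete, here is a minimal sketch that computes MAE and RMSE from their standard definitions over held-out predicted/actual rating pairs. The rating values and variable names are made up for illustration; they are not taken from the entry.

```python
import math

# Hypothetical held-out ratings (illustrative values only).
actual = [4.0, 3.5, 5.0, 2.0, 4.5]      # ratings the users actually gave
predicted = [3.8, 3.0, 4.6, 2.5, 4.4]   # ratings estimated by the recommender

n = len(actual)
# MAE: mean of absolute differences between predicted and actual ratings.
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
# RMSE: square root of the mean squared difference; penalizes large errors more.
rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

print(f"MAE:  {mae:.3f}")   # 0.340
print(f"RMSE: {rmse:.3f}")  # 0.377
```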

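For the ranking-oriented view, one common precision-oriented metric for top-N recommendation lists is precision@N: the fraction of the top N recommended items that are relevant to the user. A minimal sketch, with hypothetical item IDs and relevance judgments:

```python
def precision_at_n(ranked_items, relevant_items, n):
    """Fraction of the top-n recommended items that the user found relevant."""
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in relevant_items)
    return hits / n

# Hypothetical ranked recommender output and ground-truth relevant items.
recommended = ["i7", "i2", "i9", "i4", "i1"]
relevant = {"i2", "i4", "i5"}

print(precision_at_n(recommended, relevant, 5))  # 2 hits in top 5 -> 0.4
```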
Available from: 2018-04-12 Created: 2018-04-12 Last updated: 2019-02-14 Bibliographically checked

Open Access in DiVA

Full text is not available in DiVA

Other links

Publisher's full text

Person records

Said, Alan
