Högskolan i Skövde

Accuracy on a Hold-out Set: The Red Herring of Data Mining
School of Business and Informatics, University of Borås, Sweden.
University of Skövde, School of Humanities and Informatics. University of Skövde, The Informatics Research Centre. School of Business and Informatics, University of Borås, Sweden. (Skövde Cognition and Artificial Intelligence Lab)
University of Skövde, School of Humanities and Informatics. University of Skövde, The Informatics Research Centre. (Skövde Cognition and Artificial Intelligence Lab)
2006 (English). In: Proceedings of SAIS 2006: The 23rd Annual Workshop of the Swedish Artificial Intelligence Society / [ed] Michael Minock; Patrik Eklund; Helena Lindgren, Umeå: Swedish Artificial Intelligence Society - SAIS, Umeå University, 2006, p. 137-146. Conference paper, Published paper (Refereed)
Abstract [en]

When performing predictive modeling, the overall goal is to generate models likely to have high accuracy when applied to novel data. A technique commonly used to maximize generalization accuracy is to create ensembles of models, e.g., by averaging the output from a number of individual models. Several more or less sophisticated techniques have been suggested, aimed either at creating ensembles directly or at selecting ensemble members from a pool of available models. Many of these techniques use a part of the available data not used for training the models (a hold-out set) to rank and select either ensembles or ensemble members based on accuracy on that set. The obvious underlying assumption is that increased accuracy on the hold-out set is a good indicator of increased generalization capability on novel data or, put another way, that accuracy on the hold-out set correlates highly with accuracy on novel data. The experiments in this study, however, show that this is generally not the case, i.e., there is little to gain from selecting ensembles using hold-out set accuracy. The experiments also show that this low correlation holds for individual neural networks as well, which makes the use of hold-out sets to compare predictive models questionable altogether.
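
The quantity the abstract reasons about, the correlation between hold-out accuracy and accuracy on novel data across a pool of candidate models, can be illustrated with a toy simulation. The sketch below is not the authors' experimental setup: the function names, the accuracy range 0.80 to 0.84, and the set sizes are all assumptions chosen for illustration. It draws a slightly different true accuracy for each hypothetical model, samples an observed accuracy on a hold-out set and on a separate test set, and reports the Pearson correlation between the two.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; this sketch reports 0.
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def observed_accuracy(true_acc, n_examples, rng):
    """Fraction correct when each example is right with probability true_acc."""
    hits = sum(1 for _ in range(n_examples) if rng.random() < true_acc)
    return hits / n_examples


def holdout_vs_test_correlation(n_models=50, n_holdout=100, n_test=100, seed=0):
    """Correlation between hold-out and test accuracy over a simulated model pool."""
    rng = random.Random(seed)
    # Assumption (not from the paper): models differ only slightly in true accuracy.
    true_accs = [rng.uniform(0.80, 0.84) for _ in range(n_models)]
    holdout = [observed_accuracy(p, n_holdout, rng) for p in true_accs]
    test = [observed_accuracy(p, n_test, rng) for p in true_accs]
    return pearson(holdout, test)


if __name__ == "__main__":
    print(holdout_vs_test_correlation())
```

Varying `n_holdout` or the spread of the assumed true accuracies shows how sampling noise on a small hold-out set can swamp small real differences between models, which is one plausible reading of why ranking by hold-out accuracy may carry little information.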

Place, publisher, year, edition, pages
Umeå: Swedish Artificial Intelligence Society - SAIS, Umeå University, 2006. p. 137-146
Series
Report / UMINF - Umeå University, Department of Computing Science, ISSN 0348-0542 ; 06.19
National Category
Information Systems; Computer Sciences
Identifiers
URN: urn:nbn:se:his:diva-2020; OAI: oai:DiVA.org:his-2020; DiVA, id: diva2:32296
Conference
The 23rd Annual Workshop of the Swedish Artificial Intelligence Society, SAIS 2006, Umeå, Sweden, May 10-12
Available from: 2007-03-22 Created: 2007-03-22 Last updated: 2021-06-28 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

http://sais2006.cs.umu.se/

Authority records

Löfström, Tuve; Niklasson, Lars
