Evolving Decision Trees Using Oracle Guides
School of Business and Informatics, University of Borås, Sweden.
University of Skövde, Department of Communication and Information (Institutionen för kommunikation och information). University of Skövde, Research Centre for Information Technology (Forskningscentrum för Informationsteknologi).
2009 (English). In: 2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2009) Proceedings, IEEE, 2009, pp. 238-244. Conference paper, published paper (refereed).
Abstract [en]

Some data mining problems require predictive models to be not only accurate but also comprehensible. Comprehensibility enables human inspection and understanding of the model, making it possible to trace why individual predictions are made. Since most high-accuracy techniques produce opaque models, accuracy is, in practice, regularly sacrificed for comprehensibility. One frequently studied technique, often able to reduce this accuracy vs. comprehensibility tradeoff, is rule extraction, i.e., the activity where another, transparent, model is generated from the opaque. In this paper, it is argued that techniques producing transparent models, either directly from the dataset, or from an opaque model, could benefit from using an oracle guide. In the experiments, genetic programming is used to evolve decision trees, and a neural network ensemble is used as the oracle guide. More specifically, the datasets used by the genetic programming when evolving the decision trees, consist of several different combinations of the original training data and "oracle data", i.e., training or test data instances, together with corresponding predictions from the oracle. In total, seven different ways of combining regular training data with oracle data were evaluated, and the results, obtained on 26 UCI datasets, clearly show that the use of an oracle guide improved the performance. As a matter of fact, trees evolved using training data only had the worst test set accuracy of all setups evaluated. Furthermore, statistical tests show that two setups, both using the oracle guide, produced significantly more accurate trees, compared to the setup using training data only.
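
The way regular training data and "oracle data" can be combined into one dataset is easy to illustrate in a few lines. The sketch below is not the paper's implementation: the majority-vote oracle is a toy stand-in for the neural network ensemble, and the function names and the three flags (`true_labels`, `oracle_on_train`, `oracle_on_test`) are hypothetical. It does not attempt to reproduce the seven setups evaluated in the paper, which the abstract does not enumerate.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Instance = Tuple[float, ...]
Classifier = Callable[[Instance], int]


def majority_vote_oracle(members: Sequence[Classifier]) -> Classifier:
    """Toy stand-in for an ensemble oracle: majority vote over its members."""
    def predict(x: Instance) -> int:
        votes = Counter(member(x) for member in members)
        return votes.most_common(1)[0][0]
    return predict


def build_dataset(
    train_x: Sequence[Instance],
    train_y: Sequence[int],
    test_x: Sequence[Instance],
    oracle: Classifier,
    *,
    true_labels: bool,
    oracle_on_train: bool,
    oracle_on_test: bool,
) -> List[Tuple[Instance, int]]:
    """Assemble the data a tree learner would see (hypothetical helper)."""
    data: List[Tuple[Instance, int]] = []
    if true_labels:
        # Regular training data: instances with their true labels.
        data.extend(zip(train_x, train_y))
    if oracle_on_train:
        # Oracle data: training instances labeled by the oracle.
        data.extend((x, oracle(x)) for x in train_x)
    if oracle_on_test:
        # Oracle data: test instances labeled by the oracle.
        # No true test labels are used here.
        data.extend((x, oracle(x)) for x in test_x)
    return data


# Three threshold classifiers on the first feature act as ensemble members.
members = [lambda x, t=t: int(x[0] > t) for t in (0.3, 0.5, 0.7)]
oracle = majority_vote_oracle(members)

train_x = [(0.1,), (0.4,), (0.9,)]
train_y = [0, 1, 1]
test_x = [(0.2,), (0.8,)]

augmented = build_dataset(
    train_x, train_y, test_x, oracle,
    true_labels=True, oracle_on_train=False, oracle_on_test=True,
)
```

Here `augmented` holds the three labeled training instances followed by the two test instances with oracle predictions as labels. A tree learner, such as the genetic programming used in the paper, would then be trained on that combined set.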

Place, publisher, year, edition, pages
IEEE, 2009. pp. 238-244
National subject category
Computer and Information Sciences
Research subject
Technology
Identifiers
URN: urn:nbn:se:his:diva-3209
DOI: 10.1109/CIDM.2009.4938655
ISI: 000271487700034
Scopus ID: 2-s2.0-67650505073
ISBN: 978-1-4244-2765-9 (print)
OAI: oai:DiVA.org:his-3209
DiVA, id: diva2:225362
Conference
2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2009), March 30-April 2, 2009, Sheraton Music City Hotel, Nashville, TN, USA
Available from: 2009-06-26. Created: 2009-06-26. Last updated: 2018-01-13. Bibliographically reviewed.

Open Access in DiVA

No full text available in DiVA

Person records (beta)

Niklasson, Lars
