Estimating class probabilities in random forests
Högskolan i Skövde, Institutionen för kommunikation och information. Högskolan i Skövde, Forskningscentrum för Informationsteknologi.
2007 (English). In: ICMLA 2007: Sixth International Conference on Machine Learning and Applications, IEEE Computer Society, 2007, pp. 211-216. Conference paper, published paper (refereed)
Abstract [en]

For both single probability estimation trees (PETs) and ensembles of such trees, commonly employed class probability estimates correct the observed relative class frequencies in each leaf to avoid anomalies caused by small sample sizes. The effect of such corrections in random forests of PETs is investigated, and the use of the relative class frequency is compared to using two corrected estimates, the Laplace estimate and the m-estimate. An experiment with 34 datasets from the UCI repository shows that estimating class probabilities using relative class frequency clearly outperforms both using the Laplace estimate and the m-estimate with respect to accuracy, area under the ROC curve (AUC) and Brier score. Hence, in contrast to what is commonly employed for PETs and ensembles of PETs, these results strongly suggest that a non-corrected probability estimate should be used in random forests of PETs. The experiment further shows that learning random forests of PETs using relative class frequency significantly outperforms learning random forests of classification trees (i.e., trees for which only an unweighted vote on the most probable class is counted) with respect to both accuracy and AUC, but that the latter is clearly ahead of the former with respect to Brier score.
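The three leaf-level estimates compared in the abstract can be illustrated with a minimal sketch. It uses the textbook definitions of the Laplace estimate and the m-estimate, which the paper may parameterize differently. The function names, the value `m=2.0`, the class priors, and the multi-class form of the Brier score are assumptions chosen for illustration and are not taken from the paper.

```python
from typing import Sequence


def relative_frequency(counts: Sequence[int]) -> list[float]:
    """Uncorrected estimate: n_c / n. Assumes the leaf is non-empty."""
    n = sum(counts)
    return [c / n for c in counts]


def laplace_estimate(counts: Sequence[int]) -> list[float]:
    """Laplace estimate: (n_c + 1) / (n + k) for k classes."""
    n, k = sum(counts), len(counts)
    return [(c + 1) / (n + k) for c in counts]


def m_estimate(counts: Sequence[int], priors: Sequence[float],
               m: float = 2.0) -> list[float]:
    """m-estimate: (n_c + m * p_c) / (n + m); m=2.0 is an assumed value."""
    n = sum(counts)
    return [(c + m * p) / (n + m) for c, p in zip(counts, priors)]


def forest_average(per_tree: Sequence[Sequence[float]]) -> list[float]:
    """Average the class distributions returned by each tree's leaf."""
    n_trees = len(per_tree)
    return [sum(col) / n_trees for col in zip(*per_tree)]


def brier_score(probs: Sequence[float], true_class: int) -> float:
    """Squared error against the one-hot true class, for one example."""
    return sum((p - (1.0 if i == true_class else 0.0)) ** 2
               for i, p in enumerate(probs))


# A pure leaf holding three examples of class 0 in a two-class problem.
leaf = [3, 0]
print(relative_frequency(leaf))            # uncorrected
print(laplace_estimate(leaf))              # pulled towards uniform
print(m_estimate(leaf, priors=[0.6, 0.4])) # pulled towards the priors
```

With uniform priors and `m` equal to the number of classes, the m-estimate coincides with the Laplace estimate. Both corrections move a small pure leaf away from a probability of 1, which is the adjustment whose effect on random forests the paper evaluates.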

Place, publisher, year, edition, pages
IEEE Computer Society, 2007. pp. 211-216
National subject category
Computer Sciences
Research subject
Technology
Identifiers
URN: urn:nbn:se:his:diva-1465
DOI: 10.1109/ICMLA.2007.64
ISI: 000252793400035
Scopus ID: 2-s2.0-47349133606
ISBN: 978-0-7695-3069-7
OAI: oai:DiVA.org:his-1465
DiVA, id: diva2:25500
Conference
6th International Conference on Machine Learning and Applications, ICMLA 2007; Cincinnati, OH; 13 December 2007 through 15 December 2007
Available from: 2008-09-29. Created: 2008-09-29. Last updated: 2018-01-13. Bibliographically approved.

Author
Boström, Henrik
