König, Rikard
Publications (10 of 11)
Johansson, U., König, R. & Niklasson, L. (2009). Genetically Evolved kNN Ensembles (1ed.). In: Robert Stahlbock, Sven F. Crone, Stefan Lessmann (Ed.), Data Mining: Special Issue in Annals of Information Systems (pp. 299-313). Springer Science+Business Media B.V.
2009 (English) In: Data Mining: Special Issue in Annals of Information Systems / [ed] Robert Stahlbock, Sven F. Crone, Stefan Lessmann, Springer Science+Business Media B.V., 2009, 1, p. 299-313. Chapter in book (Other academic)
Abstract [en]

Both theory and a wealth of empirical studies have established that ensembles are more accurate than single predictive models. For the ensemble approach to work, base classifiers must not only be accurate but also diverse, i.e., they should commit their errors on different instances. Instance-based learners are, however, very robust with respect to variations of a data set, so standard resampling methods will normally produce only limited diversity. Because of this, instance-based learners are rarely used as base classifiers in ensembles. In this chapter, we introduce a method where genetic programming is used to generate kNN base classifiers with optimized k-values and feature weights. Due to the inherent inconsistency in genetic programming (i.e., different runs using identical data and parameters will still produce different solutions), a group of independently evolved base classifiers tends to be not only accurate but also diverse. In the experimentation, using 30 data sets from the UCI repository, two slightly different versions of kNN ensembles are shown to significantly outperform both the corresponding base classifiers and standard kNN with optimized k-values, with respect to accuracy and AUC.
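
The idea of an ensemble of kNN base classifiers, each with its own k-value and feature weights, can be illustrated with a minimal sketch. This is not the chapter's implementation: the evolutionary search is replaced here by plain random sampling of (k, weights) pairs, and all names (`weighted_knn_predict`, `make_members`, `ensemble_predict`) and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter

def weighted_knn_predict(train, query, k, weights):
    """Classify `query` by majority vote among its k nearest training rows,
    using a feature-weighted squared Euclidean distance."""
    def dist(row):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, row[0], query))
    neighbours = sorted(train, key=dist)[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def make_members(n_members, n_features, k_choices, seed=0):
    """Stand-in for the evolutionary search: draw (k, weights) at random.
    The chapter evolves these with genetic programming instead."""
    rng = random.Random(seed)
    return [(rng.choice(k_choices), [rng.random() for _ in range(n_features)])
            for _ in range(n_members)]

def ensemble_predict(train, query, members):
    """Majority vote over the predictions of all base classifiers."""
    votes = [weighted_knn_predict(train, query, k, w) for k, w in members]
    return Counter(votes).most_common(1)[0][0]

# Toy data (hypothetical): two well-separated classes in two dimensions.
train = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"),
         ((1.0, 0.9), "b"), ((0.9, 1.1), "b"), ((1.2, 1.0), "b")]
members = make_members(n_members=5, n_features=2, k_choices=[1, 3])
prediction = ensemble_predict(train, (0.1, 0.1), members)
```

Because each member uses different feature weights, members can disagree on borderline instances even though they share the same training data, which is the kind of diversity the abstract refers to.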

Place, publisher, year, edition, pages
Springer Science+Business Media B.V., 2009 Edition: 1
Series
Annals of Information Systems, ISSN 1934-3221 ; 8
National Category
Computer and Information Sciences
Research subject
Technology
Identifiers
urn:nbn:se:his:diva-3839 (URN); 10.1007/978-1-4419-1280-0_13 (DOI); 978-1-4419-1279-4 (ISBN); 978-1-4419-1280-0 (ISBN)
Available from: 2010-04-01 Created: 2010-04-01 Last updated: 2018-01-12 Bibliographically approved
König, R. (2009). Predictive Techniques and Methods for Decision Support in Situations with Poor Data Quality. (Licentiate dissertation). Örebro University
2009 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Today, decision support systems based on predictive modeling are becoming more common, since organizations often collect more data than decision makers can handle manually. Predictive models are used to find potentially valuable patterns in the data, or to predict the outcome of some event. There are numerous predictive techniques, ranging from simple techniques such as linear regression, to complex powerful ones like artificial neural networks. Complex models usually obtain better predictive performance, but are opaque and thus cannot be used to explain predictions or discovered patterns. The design choice of which predictive technique to use becomes even harder since no technique outperforms all others over a large set of problems. It is even difficult to find the best parameter values for a specific technique, since these settings also are problem dependent. One way to simplify this vital decision is to combine several models, possibly created with different settings and techniques, into an ensemble. Ensembles are known to be more robust and powerful than individual models, and ensemble diversity can be used to estimate the uncertainty associated with each prediction.

In real-world data mining projects, data is often imprecise, contains uncertainties or is missing important values, making it impossible to create models with sufficient performance for fully automated systems. In these cases, predictions need to be manually analyzed and adjusted. Here, opaque models like ensembles have a disadvantage, since the analysis requires understandable models. To overcome this deficiency of opaque models, researchers have developed rule extraction techniques that try to extract comprehensible rules from opaque models, while retaining sufficient accuracy.

This thesis suggests a straightforward but comprehensive method for predictive modeling in situations with poor data quality. First, ensembles are used for the actual modeling, since they are powerful, robust and require few design choices. Next, ensemble uncertainty estimations pinpoint predictions that need special attention from a decision maker. Finally, rule extraction is performed to support the analysis of uncertain predictions. Using this method, ensembles can be used for predictive modeling, in spite of their opacity and sometimes insufficient global performance, while the involvement of a decision maker is minimized.

The main contributions of this thesis are three novel techniques that enhance the performance of the proposed method. The first technique deals with ensemble uncertainty estimation and is based on a successful approach often used in weather forecasting. The other two are improvements of a rule extraction technique, resulting in increased comprehensibility and more accurate uncertainty estimations.
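
One simple way to picture how ensemble diversity can point out predictions that need a decision maker's attention is to measure member disagreement per instance. The sketch below is an assumption-laden illustration, not the thesis's technique (which the abstract describes only as based on an approach from weather forecasting); the function names and the 0.3 threshold are hypothetical.

```python
from collections import Counter

def vote_with_uncertainty(member_predictions):
    """Return (majority label, disagreement) for one instance.
    Disagreement is the share of members that voted against the majority."""
    counts = Counter(member_predictions)
    label, support = counts.most_common(1)[0]
    return label, 1.0 - support / len(member_predictions)

def flag_uncertain(all_member_predictions, threshold=0.3):
    """Indices of instances whose disagreement exceeds `threshold`."""
    flagged = []
    for i, preds in enumerate(all_member_predictions):
        _, disagreement = vote_with_uncertainty(preds)
        if disagreement > threshold:
            flagged.append(i)
    return flagged

# One row per instance, one column per ensemble member (hypothetical votes).
predictions = [
    ["yes", "yes", "yes", "yes", "yes"],
    ["yes", "no", "yes", "no", "no"],
    ["no", "no", "no", "yes", "no"],
]
needs_review = flag_uncertain(predictions)
```

Only the flagged instances would then be passed on for manual analysis, for example with the help of extracted rules.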

Place, publisher, year, edition, pages
Örebro University, 2009. p. 112
Series
Studies from the School of Science and Technology at Örebro University ; 5
Keywords
Rule Extraction, Genetic Programming, Uncertainty estimation, Machine Learning, Artificial Neural Networks, Data Mining, Information Fusion
National Category
Computer and Information Sciences
Research subject
Technology
Identifiers
urn:nbn:se:his:diva-3208 (URN)
Presentation
(English)
Available from: 2009-06-26 Created: 2009-06-26 Last updated: 2018-01-13 Bibliographically approved
Löfström, T., Johansson, U., Sönströd, C., König, R. & Niklasson, L. (Eds.). (2007). Proceedings of SAIS 2007: The 24th Annual Workshop of the Swedish Artificial Intelligence Society, Borås, May 22-23, 2007. University College of Borås
2007 (English) Conference proceedings (editor) (Other academic)
Place, publisher, year, edition, pages
University College of Borås, 2007
Research subject
Technology
Identifiers
urn:nbn:se:his:diva-3701 (URN)
Available from: 2010-02-16 Created: 2010-02-16 Last updated: 2017-11-27
Johansson, U., Löfström, T., König, R. & Niklasson, L. (2006). Accurate Neural Network Ensembles Using Genetic Programming. In: Proceedings of SAIS: The 23rd Annual Workshop of the Swedish Artificial Intelligence Society. Swedish Artificial Intelligence Society - SAIS, Umeå universitet
2006 (English) In: Proceedings of SAIS: The 23rd Annual Workshop of the Swedish Artificial Intelligence Society, Swedish Artificial Intelligence Society - SAIS, Umeå universitet, 2006. Conference paper, Published paper (Other academic)
Abstract [en]

In this paper we present and evaluate a novel algorithm for ensemble creation. The main idea of the algorithm is to first independently train a fixed number of neural networks (here ten) and then use genetic programming to combine these networks into an ensemble. The use of genetic programming makes it possible to not only consider ensembles of different sizes, but also to use ensembles as intermediate building blocks. The final result is therefore more correctly described as an ensemble of neural network ensembles. The experiments show that the proposed method, when evaluated on 22 publicly available data sets, obtains very high accuracy, clearly outperforming the other methods evaluated. In this study several micro techniques are used, and we believe that they all contribute to the increased performance.

One such micro technique, aimed at reducing overtraining, is the training method, called tombola training, used during genetic evolution. When using tombola training, training data is regularly resampled into new parts, called training groups. Each ensemble is then evaluated on every training group and the actual fitness is determined solely from the result on the hardest part.
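
The fitness rule described for tombola training can be sketched as follows. This is a minimal illustration under stated assumptions: the group count, the dealing scheme, the use of accuracy as the score, and the names `tombola_groups` and `tombola_fitness` are not taken from the paper.

```python
import random

def tombola_groups(data, n_groups, rng):
    """Shuffle the training data and deal it into `n_groups` parts."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]

def accuracy(model, group):
    """Fraction of (input, label) pairs in `group` that `model` gets right."""
    return sum(model(x) == y for x, y in group) / len(group)

def tombola_fitness(model, groups):
    """Fitness is the accuracy on the hardest (lowest-scoring) group."""
    return min(accuracy(model, g) for g in groups)

rng = random.Random(42)
data = [(x, x >= 5) for x in range(10)]          # toy labelled data
groups = tombola_groups(data, n_groups=3, rng=rng)

perfect = lambda x: x >= 5       # candidate that matches every label
always_true = lambda x: True     # candidate that ignores its input

fitness_perfect = tombola_fitness(perfect, groups)
fitness_trivial = tombola_fitness(always_true, groups)
```

During evolution, `tombola_groups` would be called again at regular intervals so that candidates cannot adapt to one fixed partition of the training data.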

Place, publisher, year, edition, pages
Swedish Artificial Intelligence Society - SAIS, Umeå universitet, 2006
Series
UMINF, ISSN 0348-0542
Identifiers
urn:nbn:se:his:diva-1914 (URN)
Available from: 2007-03-22 Created: 2007-03-22 Last updated: 2017-11-27
Löfström, T., König, R., Johansson, U., Niklasson, L., Strand, M. & Ziemke, T. (2006). Benefits of Relating the Retail Domain to Information Fusion. In: 9th International Conference on Information Fusion: IEEE ISIF. Paper presented at 9th International Conference on Information Fusion, ICIF '06, Florence (Italy), July 10-13, 2006 (Article number 4085930). IEEE conference proceedings
2006 (English) In: 9th International Conference on Information Fusion: IEEE ISIF, IEEE conference proceedings, 2006, Article number 4085930. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE conference proceedings, 2006
Identifiers
urn:nbn:se:his:diva-1956 (URN); 2-s2.0-50149109426 (Scopus ID); 0-9721844-6-5 (ISBN)
Conference
9th International Conference on Information Fusion, ICIF '06, Florence (Italy), July 10-13, 2006
Available from: 2008-04-11 Created: 2008-04-11 Last updated: 2017-11-27 Bibliographically approved
Johansson, U., Löfström, T., König, R. & Niklasson, L. (2006). Building Neural Network Ensembles using Genetic Programming. In: The International Joint Conference on Neural Networks 2006. Paper presented at International Joint Conference on Neural Networks 2006, IJCNN '06; Vancouver, BC; 16 July 2006 through 21 July 2006 (pp. 2239-2244). IEEE Press
2006 (English) In: The International Joint Conference on Neural Networks 2006, IEEE Press, 2006, p. 2239-2244. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we present and evaluate a novel algorithm for ensemble creation. The main idea of the algorithm is to first independently train a fixed number of neural networks (here ten) and then use genetic programming to combine these networks into an ensemble. The use of genetic programming makes it possible to not only consider ensembles of different sizes, but also to use ensembles as intermediate building blocks. The final result is therefore more correctly described as an ensemble of neural network ensembles. The experiments show that the proposed method, when evaluated on 22 publicly available data sets, obtains very high accuracy, clearly outperforming the other methods evaluated. In this study several micro techniques are used, and we believe that they all contribute to the increased performance. One such micro technique, aimed at reducing overtraining, is the training method, called tombola training, used during genetic evolution. When using tombola training, training data is regularly resampled into new parts, called training groups. Each ensemble is then evaluated on every training group and the actual fitness is determined solely from the result on the hardest part.

Place, publisher, year, edition, pages
IEEE Press, 2006
Identifiers
urn:nbn:se:his:diva-1806 (URN); 10.1109/IJCNN.2006.246836 (DOI); 000245125902029 (); 2-s2.0-38049049329 (Scopus ID)
Conference
International Joint Conference on Neural Networks 2006, IJCNN '06; Vancouver, BC; 16 July 2006 through 21 July 2006
Available from: 2007-10-10 Created: 2007-10-10 Last updated: 2017-11-27
Johansson, U., Löfström, T., König, R. & Niklasson, L. (2006). Genetically Evolved Trees Representing Ensembles. In: Artificial intelligence and soft computing - ICAISC 2006: 8th international conference, Zakopane, Poland, June 25 - 29, 2006 ; proceedings (pp. 613-622).
2006 (English) In: Artificial intelligence and soft computing - ICAISC 2006: 8th international conference, Zakopane, Poland, June 25 - 29, 2006 ; proceedings, 2006, p. 613-622. Conference paper, Published paper (Refereed)
Abstract [en]

We have recently proposed a novel algorithm for ensemble creation called GEMS (Genetic Ensemble Member Selection). GEMS first trains a fixed number of neural networks (here twenty) and then uses genetic programming to combine these networks into an ensemble. The use of genetic programming makes it possible for GEMS to not only consider ensembles of different sizes, but also to use ensembles as intermediate building blocks. In this paper, which is the first extensive study of GEMS, the representation language is extended to include tests partitioning the data, further increasing flexibility. In addition, several micro techniques are applied to reduce overfitting, which appears to be the main problem for this powerful algorithm. The experiments show that GEMS, when evaluated on 15 publicly available data sets, obtains very high accuracy, clearly outperforming both straightforward ensemble designs and standard decision tree algorithms.
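
The notion of ensembles serving as building blocks inside larger ensembles, together with tests that partition the data, can be pictured as an expression tree. The sketch below is only an illustration: the tuple encoding of nodes, the use of averaging as the combination rule, and the constant-output base models are all assumptions, not the representation language defined in the paper.

```python
def evaluate(node, x, base_models):
    """Evaluate an ensemble tree on input `x`.

    Node forms (hypothetical encoding):
      ("model", i)                           -> output of base model i
      ("avg", child, child, ...)             -> mean of the child outputs
      ("if", feature, threshold, low, high)  -> route on x[feature] <= threshold
    """
    kind = node[0]
    if kind == "model":
        return base_models[node[1]](x)
    if kind == "avg":
        outputs = [evaluate(child, x, base_models) for child in node[1:]]
        return sum(outputs) / len(outputs)
    if kind == "if":
        _, feature, threshold, low, high = node
        branch = low if x[feature] <= threshold else high
        return evaluate(branch, x, base_models)
    raise ValueError(f"unknown node type: {kind}")

# Constant-output stand-ins for trained networks.
base_models = [lambda x: 0.25, lambda x: 0.75, lambda x: 1.0]

# A test on feature 0 chooses between a flat ensemble and a nested one.
tree = ("if", 0, 0.5,
        ("avg", ("model", 0), ("model", 1)),
        ("avg", ("avg", ("model", 0), ("model", 1)), ("model", 2)))

low_side = evaluate(tree, [0.3], base_models)    # flat ensemble of models 0 and 1
high_side = evaluate(tree, [0.9], base_models)   # nested ensemble plus model 2
```

A genetic programming search would then vary such trees and keep those with the best fitness; that search loop is omitted here.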

Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 4029
National Category
Engineering and Technology
Research subject
Technology
Identifiers
urn:nbn:se:his:diva-1587 (URN); 10.1007/11785231_64 (DOI); 000239600000064 (); 2-s2.0-33746239343 (Scopus ID); 978-3-540-35748-3 (ISBN)
Available from: 2008-02-08 Created: 2008-02-08 Last updated: 2017-11-27
König, R., Johansson, U. & Niklasson, L. (2006). Increasing rule extraction comprehensibility. International Journal of Information Technology and Intelligent Computing, 1(2), 303-314
2006 (English) In: International Journal of Information Technology and Intelligent Computing, ISSN 1895-8648, Vol. 1, no 2, p. 303-314. Article in journal (Refereed) Published
Place, publisher, year, edition, pages
Łódź: Academy of Humanities and Economics (WSHE), 2006
Identifiers
urn:nbn:se:his:diva-7190 (URN)
Available from: 2013-02-11 Created: 2013-02-11 Last updated: 2017-11-27 Bibliographically approved
Johansson, U., Löfström, T., König, R., Sönströd, C. & Niklasson, L. (2006). Rule Extraction from Opaque Models: A Slightly Different Perspective. In: 6th International Conference on Machine Learning and Applications (pp. 22-27). IEEE Computer Society
2006 (English) In: 6th International Conference on Machine Learning and Applications, IEEE Computer Society, 2006, p. 22-27. Conference paper, Published paper (Refereed)
Abstract [en]

When performing predictive modeling, the key criterion is always accuracy. With this in mind, complex techniques like neural networks or ensembles are normally used, resulting in opaque models impossible to interpret. When models need to be comprehensible, accuracy is often sacrificed by using simpler techniques directly producing transparent models; a tradeoff termed the accuracy vs. comprehensibility tradeoff. In order to reduce this tradeoff, the opaque model can be transformed into another, interpretable, model; an activity termed rule extraction. In this paper, it is argued that rule extraction algorithms should gain from using oracle data; i.e. test set instances, together with corresponding predictions from the opaque model. The experiments, using 17 publicly available data sets, clearly show that rules extracted using only oracle data were significantly more accurate than both rules extracted by the same algorithm, using training data, and standard decision tree algorithms. In addition, the same rules were also significantly more compact; thus providing better comprehensibility. The overall implication is that rules extracted in this fashion will explain the predictions made on novel data better than rules extracted in the standard way; i.e. using training data only.
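
The construction of oracle data can be made concrete with a small sketch: test inputs are labelled with the opaque model's own predictions, and a transparent rule is then fitted to those pairs. The rule learner here is a one-feature threshold search chosen for brevity, not the extraction algorithm used in the paper, and `opaque_model`, `fit_stump`, and the data points are hypothetical.

```python
def fit_stump(instances):
    """Fit a one-feature threshold rule `x[f] > t -> True` by exhaustive search,
    maximising agreement with the given labels."""
    best = None
    n_features = len(instances[0][0])
    for f in range(n_features):
        for t in sorted({x[f] for x, _ in instances}):
            hits = sum((x[f] > t) == y for x, y in instances)
            if best is None or hits > best[0]:
                best = (hits, f, t)
    _, f, t = best
    return f, t

def opaque_model(x):
    """Placeholder for a trained neural network or ensemble."""
    return 0.7 * x[0] + 0.3 * x[1] > 0.5

test_inputs = [(0.1, 0.2), (0.4, 0.9), (0.6, 0.1),
               (0.8, 0.7), (0.3, 0.4), (0.9, 0.2)]

# Oracle data: test inputs paired with the opaque model's own predictions.
oracle_data = [(x, opaque_model(x)) for x in test_inputs]

feature, threshold = fit_stump(oracle_data)
rule = lambda x: x[feature] > threshold

# Share of oracle instances on which the rule agrees with the opaque model.
fidelity = sum(rule(x) == y for x, y in oracle_data) / len(oracle_data)
```

No true labels are needed for this step, which is why unlabelled test instances can be used.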

Place, publisher, year, edition, pages
IEEE Computer Society, 2006
Identifiers
urn:nbn:se:his:diva-1952 (URN); 10.1109/ICMLA.2006.46 (DOI); 000244477800004 (); 2-s2.0-40349090116 (Scopus ID); 0-7695-2735-3 (ISBN)
Available from: 2008-04-11 Created: 2008-04-11 Last updated: 2017-11-27
Johansson, U., Niklasson, L. & König, R. (2004). Accuracy vs. comprehensibility in data mining models. In: Proceedings of the Seventh International Conference on Information Fusion: 28 June - 1 July 2004 Stockholm Sweden. Paper presented at The 7th International Conference on Information Fusion, June 28 to July 1, 2004 in Stockholm, Sweden (pp. 295-300).
2004 (English) In: Proceedings of the Seventh International Conference on Information Fusion: 28 June - 1 July 2004 Stockholm Sweden, 2004, p. 295-300. Conference paper, Published paper (Other academic)
Abstract [en]

This paper addresses the important issue of the tradeoff between accuracy and comprehensibility in data mining. The paper presents results which show that it is, to some extent, possible to bridge this gap. A method for rule extraction from opaque models (Genetic Rule EXtraction – G-REX) is used to show the effects on accuracy when forcing the creation of comprehensible representations. In addition, the technique of combining different classifiers into an ensemble is demonstrated on some well-known data sets. The results show that ensembles generally have very high accuracy, thus making them a good first choice when performing predictive data mining.

Keywords
data mining, ensembles, rule extraction
Research subject
Technology
Identifiers
urn:nbn:se:his:diva-1500 (URN); 2-s2.0-6344291509 (Scopus ID)
Conference
The 7th International Conference on Information Fusion, June 28 to July 1, 2004 in Stockholm, Sweden
Available from: 2007-10-10 Created: 2007-10-10 Last updated: 2017-11-27 Bibliographically approved