Publications (10 of 19)
Löfström, T., Johansson, U. & Boström, H. (2009). Ensemble Member Selection Using Multi-Objective Optimization. In: Proceedings of IEEE Symposium on Computational Intelligence and Data Mining (CIDM). Paper presented at the 2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2009), March 30 - April 2, 2009, Sheraton Music City Hotel, Nashville, TN, USA (pp. 245-251). IEEE conference proceedings.
Ensemble Member Selection Using Multi-Objective Optimization
2009 (English). In: Proceedings of IEEE Symposium on Computational Intelligence and Data Mining (CIDM), IEEE conference proceedings, 2009, pp. 245-251. Conference paper, published paper (peer-reviewed).
Abstract [en]

Both theory and a wealth of empirical studies have established that ensembles are more accurate than single predictive models. Unfortunately, the problem of how to maximize ensemble accuracy is, especially for classification, far from solved. In essence, the key problem is to find a suitable criterion, typically based on training or selection set performance, highly correlated with ensemble accuracy on novel data. Several studies have, however, shown that it is difficult to come up with a single measure, such as ensemble or base classifier selection set accuracy, or some measure based on diversity, that is a good general predictor for ensemble test accuracy. This paper presents a novel technique that for each learning task searches for the most effective combination of given atomic measures, by means of a genetic algorithm. Ensembles built from either neural networks or random forests were empirically evaluated on 30 UCI datasets. The experimental results show that when using the generated combined optimization criteria to rank candidate ensembles, a higher test set accuracy for the top ranked ensemble was achieved, compared to using ensemble accuracy on selection data alone. Furthermore, when creating ensembles from a pool of neural networks, the use of the generated combined criteria was shown to generally outperform the use of estimated ensemble accuracy as the single optimization criterion.
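As a rough illustration of the idea, a mutation-only evolutionary search (a simplified stand-in for the paper's genetic algorithm) can evolve weights for combining atomic measures and use them to rank candidate ensembles. Everything below, including the two measures and all numbers, is synthetic:

```python
import random

random.seed(1)

# Toy candidate ensembles: two atomic measures (selection-set accuracy
# and a diversity score), plus synthetic validation/test accuracies.
# All names and numbers here are illustrative, not from the paper.
candidates = []
for _ in range(40):
    sel_acc = random.uniform(0.6, 0.9)
    diversity = random.uniform(0.0, 1.0)
    quality = 0.6 * sel_acc + 0.15 * diversity        # hidden "true" quality
    candidates.append(((sel_acc, diversity),
                       quality + random.gauss(0, 0.02),   # validation accuracy
                       quality + random.gauss(0, 0.02)))  # test accuracy

def top_by(weights, which):
    """Accuracy of the candidate ranked first by the weighted
    combination of atomic measures."""
    best = max(candidates,
               key=lambda c: sum(w * m for w, m in zip(weights, c[0])))
    return best[1] if which == "val" else best[2]

def evolve(generations=25, pop_size=12):
    """Mutation-only evolutionary search standing in for the paper's
    genetic algorithm; fitness only ever sees validation accuracy."""
    pop = [[random.random(), random.random()] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: top_by(w, "val"), reverse=True)
        parents = pop[: pop_size // 2]
        pop = parents + [[max(0.0, x + random.gauss(0, 0.1))
                          for x in random.choice(parents)]
                         for _ in range(pop_size - len(parents))]
    return pop[0]

best_weights = evolve()
print(round(top_by(best_weights, "test"), 3))
```

The evolved weight vector is then used exactly as the abstract describes: to rank candidate ensembles and pick the top-ranked one.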

Place, publisher, year, edition, pages
IEEE conference proceedings, 2009
HSV category
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3212 (URN); 10.1109/CIDM.2009.4938656 (DOI); 000271487700035 (); 2-s2.0-67650434708 (Scopus ID); 978-1-4244-2765-9 (ISBN)
Conference
2009 IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2009), March 30 - April 2, 2009, Sheraton Music City Hotel, Nashville, TN, USA
Available from: 2009-06-26. Created: 2009-06-26. Last updated: 2018-01-13 (bibliographically checked)
Deegalla, S. & Boström, H. (2009). Fusion of Dimensionality Reduction Methods: A Case Study in Microarray Classification. In: Proceedings of the 12th International Conference on Information Fusion. Paper presented at Fusion 2009: the 12th International Conference on Information Fusion, Grand Hyatt Seattle, Seattle, Washington, USA, 6-9 July 2009 (pp. 460-465). ISIF.
Fusion of Dimensionality Reduction Methods: A Case Study in Microarray Classification
2009 (English). In: Proceedings of the 12th International Conference on Information Fusion, ISIF, 2009, pp. 460-465. Conference paper, published paper (peer-reviewed).
Abstract [en]

Dimensionality reduction has been demonstrated to improve the performance of the k-nearest neighbor (kNN) classifier for high-dimensional data sets, such as microarrays. However, the effectiveness of different dimensionality reduction methods varies, and it has been shown that no single method constantly outperforms the others. In contrast to using a single method, two approaches to fusing the result of applying dimensionality reduction methods are investigated: feature fusion and classifier fusion. It is shown that by fusing the output of multiple dimensionality reduction techniques, either by fusing the reduced features or by fusing the output of the resulting classifiers, both higher accuracy and higher robustness towards the choice of number of dimensions is obtained.
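The two fusion strategies can be sketched with a toy 1-nearest-neighbour classifier. The two "reduction" functions below are placeholders (real work would use e.g. PCA or random projection), not the methods evaluated in the paper, and the data is made up:

```python
from collections import Counter

# Two stand-in "dimensionality reductions": keep the first d values,
# and keep pairwise means, respectively. Purely illustrative.
def reduce_a(x, d=2):
    return x[:d]

def reduce_b(x, d=2):
    return [(x[i] + x[i + 1]) / 2 for i in range(d)]

def nn_predict(train, query):
    """1-nearest-neighbour prediction (squared Euclidean distance)."""
    return min(train,
               key=lambda t: sum((a - b) ** 2
                                 for a, b in zip(t[0], query)))[1]

def feature_fusion_predict(train_raw, query):
    # Feature fusion: concatenate the reduced representations,
    # then run a single classifier on the fused features.
    train = [(reduce_a(x) + reduce_b(x), y) for x, y in train_raw]
    return nn_predict(train, reduce_a(query) + reduce_b(query))

def classifier_fusion_predict(train_raw, query):
    # Classifier fusion: majority vote over classifiers built
    # on each reduced representation separately.
    votes = [
        nn_predict([(reduce_a(x), y) for x, y in train_raw], reduce_a(query)),
        nn_predict([(reduce_b(x), y) for x, y in train_raw], reduce_b(query)),
    ]
    return Counter(votes).most_common(1)[0][0]

train = [([0.0, 0.0, 0.1, 0.0], "healthy"),
         ([1.0, 1.0, 0.9, 1.0], "tumour")]
print(feature_fusion_predict(train, [0.9, 1.0, 1.0, 0.8]))     # → tumour
print(classifier_fusion_predict(train, [0.1, 0.0, 0.0, 0.1]))  # → healthy
```

The same pattern generalises to any number of reduction methods and to kNN with k > 1.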

Place, publisher, year, edition, pages
ISIF, 2009
Keywords
Nearest neighbor classification, dimensionality reduction, feature fusion, classifier fusion, microarrays
HSV category
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3455 (URN); 000273560000060 (); 2-s2.0-70449375084 (Scopus ID); 978-0-9824-4380-4 (ISBN)
Conference
Fusion 2009: the 12th International Conference on Information Fusion, Grand Hyatt Seattle, Seattle, Washington, USA, 6-9 July 2009
Available from: 2009-10-20. Created: 2009-10-20. Last updated: 2018-01-12 (bibliographically checked)
Dudas, C., Ng, A. & Boström, H. (2009). Information Extraction from Solution Set of Simulation-based Multi-objective Optimisation using Data Mining. In: D. B. Das, V. Nassehi & L. Deka (Eds.), Proceedings of Industrial Simulation Conference 2009. Paper presented at the 7th International Industrial Simulation Conference 2009, ISC'09, June 1-3, 2009, Loughborough, United Kingdom (pp. 65-69). EUROSIS-ETI.
Information Extraction from Solution Set of Simulation-based Multi-objective Optimisation using Data Mining
2009 (English). In: Proceedings of Industrial Simulation Conference 2009 / [ed] D. B. Das, V. Nassehi & L. Deka, EUROSIS-ETI, 2009, pp. 65-69. Conference paper, published paper (peer-reviewed).
Abstract [en]

In this work, we investigate ways of extracting information from simulations, in particular from simulation-based multi-objective optimisation, in order to acquire information that can support human decision makers that aim for optimising manufacturing processes. Applying data mining for analyzing data generated using simulation is a fairly unexplored area. With the observation that the obtained solutions from a simulation-based multi-objective optimisation are all optimal (or close to the optimal Pareto front) so that they are bound to follow and exhibit certain relationships among variables vis-à-vis objectives, it is argued that using data mining to discover these relationships could be a promising procedure. The aim of this paper is to provide the empirical results from two simulation case studies to support such a hypothesis.

Place, publisher, year, edition, pages
EUROSIS-ETI, 2009
Keywords
Output analysis, Data mining, Information extraction
HSV category
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3301 (URN); 000280184200011 (); 2-s2.0-84898467726 (Scopus ID); 9789077381489 (ISBN)
Conference
7th International Industrial Simulation Conference 2009, ISC'09, June 1-3, 2009, Loughborough, United Kingdom
Available from: 2009-07-10. Created: 2009-07-10. Last updated: 2018-01-13 (bibliographically checked)
Dudas, C. & Boström, H. (2009). Using Uncertain Chemical and Thermal Data to Predict Product Quality in a Casting Process. In: Jian Pei, Lise Getoor & Ander De Keijzer (Eds.), Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data. Paper presented at the First ACM SIGKDD International Workshop on Knowledge Discovery from Uncertain Data, Paris, France, June 28, 2009, in conjunction with KDD'09 (pp. 57-61). ACM, Inc.
Using Uncertain Chemical and Thermal Data to Predict Product Quality in a Casting Process
2009 (English). In: Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data / [ed] Jian Pei, Lise Getoor & Ander De Keijzer, ACM, Inc., 2009, pp. 57-61. Conference paper, published paper (peer-reviewed).
Abstract [en]

Process and casting data from different sources have been collected and merged for the purpose of predicting, and determining what factors affect, the quality of cast products in a foundry. One problem is that the measurements cannot be directly aligned, since they are collected at different points in time, and instead they have to be approximated for specific time points, hence introducing uncertainty. An approach for addressing this problem is investigated, where uncertain numeric features values are represented by intervals and random forests are extended to handle such intervals. A preliminary experiment shows that the suggested way of forming the intervals, together with the extension of random forests, results in higher predictive performance compared to using single (expected) values for the uncertain features together with standard random forests.

Place, publisher, year, edition, pages
ACM, Inc., 2009
HSV category
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3418 (URN); 10.1145/1610555.1610563 (DOI); 2-s2.0-70450267831 (Scopus ID); 978-1-60558-675-5 (ISBN)
Conference
The First ACM SIGKDD International Workshop on Knowledge Discovery from Uncertain Data, Paris, France, June 28, 2009, in conjunction with KDD'09
Available from: 2009-10-14. Created: 2009-10-14. Last updated: 2018-01-12
Boström, H. & Norinder, U. (2009). Utilizing Information on Uncertainty for In Silico Modeling using Random Forests. In: Proceedings of the 3rd Skövde Workshop on Information Fusion Topics (SWIFT 2009). Paper presented at the 3rd Annual Skövde Workshop on Information Fusion Topics (SWIFT 2009), 12-13 October 2009, Skövde, Sweden (pp. 59-62). University of Skövde.
Utilizing Information on Uncertainty for In Silico Modeling using Random Forests
2009 (English). In: Proceedings of the 3rd Skövde Workshop on Information Fusion Topics (SWIFT 2009), University of Skövde, 2009, pp. 59-62. Conference paper, published paper (peer-reviewed).
Abstract [en]

Information on uncertainty of measurements or estimates of molecular properties is rarely utilized by in silico predictive models. In this study, different approaches to handling uncertain numerical features are explored when using the state-of-the-art random forest algorithm for generating predictive models. Two main approaches are considered: i) sampling from probability distributions prior to tree generation, which does not require any change to the underlying tree learning algorithm, and ii) adjusting the algorithm to allow for handling probability distributions, similar to how missing values typically are handled, i.e., partitions may include fractions of examples. An experiment with six datasets concerning the prediction of various chemical properties is presented, where 95% confidence intervals are included for one of the 92 numerical features. In total, five approaches to handling uncertain numeric features are compared: ignoring the uncertainty, sampling from distributions that are assumed to be uniform and normal respectively, and adjusting tree learning to handle probability distributions that are assumed to be uniform and normal respectively. The experimental results show that all approaches that utilize information on uncertainty indeed outperform the single approach ignoring this, both with respect to accuracy and area under ROC curve. A decomposition of the squared error of the constituent classification trees shows that the highest variance is obtained by ignoring the information on uncertainty, but that this also results in the highest mean squared error of the constituent trees.
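Approach i) is straightforward to sketch: draw a concrete value for the uncertain feature before growing each tree, leaving the tree learner untouched. The interval data below is made up, and a uniform distribution over each interval is assumed:

```python
import random

random.seed(0)

def resampled_copy(rows, uncertain_idx):
    """Approach i) from the abstract: before growing each tree, draw a
    concrete value for the uncertain feature from its (here uniform)
    interval, so the tree learner itself needs no modification."""
    return [[random.uniform(*v) if i == uncertain_idx else v
             for i, v in enumerate(row)]
            for row in rows]

# Each row: one certain feature and a (lower, upper) 95% interval
# standing in for an uncertain measurement.
data = [[1.2, (0.9, 1.1)],
        [3.4, (2.0, 2.6)]]

# One independently resampled training set per tree in the forest.
copies = [resampled_copy(data, 1) for _ in range(5)]
print(copies[0])
```

Each forest member then sees a slightly different realisation of the uncertain feature, which is what lets the ensemble reflect the measurement uncertainty.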

Place, publisher, year, edition, pages
University of Skövde, 2009
Series
Skövde University Studies in Informatics, ISSN 1653-2325; 2009:3
HSV category
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3542 (URN); 978-91-978513-2-9 (ISBN)
Conference
The 3rd Annual Skövde Workshop on Information Fusion Topics (SWIFT 2009), 12-13 October 2009, Skövde, Sweden
Available from: 2010-01-07. Created: 2010-01-07. Last updated: 2018-01-12
Johansson, R., Boström, H. & Karlsson, A. (2008). A Study on Class-Specifically Discounted Belief for Ensemble Classifiers. In: Proceedings of the IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2008). Paper presented at the 2008 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2008), Seoul, 20-22 August 2008 (pp. 614-619). IEEE Press.
A Study on Class-Specifically Discounted Belief for Ensemble Classifiers
2008 (English). In: Proceedings of the IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2008), IEEE Press, 2008, pp. 614-619. Conference paper, published paper (peer-reviewed).
Abstract [en]

Ensemble classifiers are known to generally perform better than their constituent classifiers. Whereas a lot of work has focused on the generation of classifiers for ensembles, much less attention has been given to the fusion of individual classifier outputs. One approach to fusing the outputs is to apply Shafer's theory of evidence, which provides a flexible framework for expressing and fusing beliefs. However, representing and fusing beliefs is non-trivial, since it can be performed in a multitude of ways within the evidential framework. In a previous article, we compared different evidential combination rules for ensemble fusion. The study involved a single belief representation which involved discounting (i.e., weighting) the classifier outputs with classifier reliability. The classifier reliability was interpreted as the classifier's estimated accuracy, i.e., the percentage of correctly classified examples. However, classifiers may have different performance for different classes, and in this work we assign the reliability of a classifier output depending on the class-specific reliability of the classifier. Using 27 UCI datasets, we compare the two different ways of expressing beliefs and some evidential combination rules. The result of the study indicates that there is indeed an advantage in utilizing class-specific reliability compared to accuracy in an evidential framework for combining classifiers in the ensemble design considered.
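A minimal sketch of discounting followed by Dempster's rule, over a two-class frame. For brevity a single reliability per classifier is used here, whereas the paper makes the discounting class-specific; the mass values are invented:

```python
from itertools import product

FRAME = frozenset({"a", "b"})  # frame of discernment for two classes

def discount(m, alpha):
    """Shafer discounting: scale masses by reliability alpha and move
    the remainder to the whole frame (total ignorance)."""
    out = {A: alpha * v for A, v in m.items() if A != FRAME}
    out[FRAME] = 1 - alpha + alpha * m.get(FRAME, 0.0)
    return out

def dempster(m1, m2):
    """Dempster's rule of combination with conflict normalisation."""
    combined, conflict = {}, 0.0
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            combined[C] = combined.get(C, 0.0) + v1 * v2
        else:
            conflict += v1 * v2
    return {A: v / (1 - conflict) for A, v in combined.items()}

A, B = frozenset({"a"}), frozenset({"b"})
# Two classifier outputs expressed as Bayesian mass functions.
m1 = {A: 0.8, B: 0.2}
m2 = {A: 0.6, B: 0.4}
# Reliabilities 0.9 and 0.7 are arbitrary; class-specific discounting
# would instead pick alpha based on the predicted class.
fused = dempster(discount(m1, 0.9), discount(m2, 0.7))
print(fused[A] > fused[B])  # → True
```

The less reliable classifier's opinion is down-weighted before combination, which is the core mechanism the paper refines per class.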

Place, publisher, year, edition, pages
IEEE Press, 2008
Keywords
ensemble classifiers, random forests, evidence theory, Dempster-Shafer theory, combination rules
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3627 (URN); 10.1109/MFI.2008.4648012 (DOI); 000265022100009 (); 2-s2.0-67650514819 (Scopus ID); 978-1-4244-2144-2 (ISBN)
Conference
2008 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2008), Seoul, 20-22 August 2008
Available from: 2010-02-01. Created: 2010-02-01. Last updated: 2017-11-27
Boström, H. (2008). Calibrating Random Forests. In: Proceedings of the Seventh International Conference on Machine Learning and Applications (ICMLA'08) (pp. 121-126). IEEE Computer Society
Calibrating Random Forests
2008 (English). In: Proceedings of the Seventh International Conference on Machine Learning and Applications (ICMLA'08), IEEE Computer Society, 2008, pp. 121-126. Conference paper, published paper (peer-reviewed).
Abstract [en]

When using the output of classifiers to calculate the expected utility of different alternatives in decision situations, the correctness of predicted class probabilities may be of crucial importance. However, even very accurate classifiers may output class probabilities of rather poor quality. One way of overcoming this problem is by means of calibration, i.e., mapping the original class probabilities to more accurate ones. Previous studies have, however, indicated that random forests are difficult to calibrate by standard calibration methods. In this work, a novel calibration method is introduced, which is based on a recent finding that probabilities predicted by forests of classification trees have a lower squared error compared to those predicted by forests of probability estimation trees (PETs). The novel calibration method is compared to the two standard methods, Platt scaling and isotonic regression, on 34 datasets from the UCI repository. The experiment shows that random forests of PETs calibrated by the novel method significantly outperform uncalibrated random forests of both PETs and classification trees, as well as random forests calibrated with the two standard methods, with respect to the squared error of predicted class probabilities.

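For context, one of the two standard baselines, isotonic regression, can be fitted with the pool-adjacent-violators algorithm. This is a sketch of that baseline only, not the paper's novel PET-based method; the scores and labels below are invented:

```python
def isotonic_fit(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing mapping from raw
    classifier scores to calibrated probabilities."""
    pairs = sorted(zip(scores, labels))
    stack = []  # blocks of [mean label, weight]
    for _, y in pairs:
        stack.append([float(y), 1])
        # Merge backwards while a block violates monotonicity.
        while len(stack) > 1 and stack[-2][0] > stack[-1][0]:
            y2, w2 = stack.pop()
            y1, w1 = stack.pop()
            stack.append([(y1 * w1 + y2 * w2) / (w1 + w2), w1 + w2])
    calibrated = []
    for y, w in stack:
        calibrated.extend([y] * w)
    return [s for s, _ in pairs], calibrated

# The out-of-order positive at score 0.2 gets pooled with its neighbour.
scores, cal = isotonic_fit([0.2, 0.3, 0.7, 0.9], [1, 0, 1, 1])
print(cal)  # → [0.5, 0.5, 1.0, 1.0]
```

At prediction time, a new score is mapped to the calibrated value of the nearest fitted block, giving a monotone score-to-probability transform.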
Place, publisher, year, edition, pages
IEEE Computer Society, 2008
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-2587 (URN); 10.1109/ICMLA.2008.7 (DOI); 000263205800017 (); 2-s2.0-60649102439 (Scopus ID); 978-0-7695-3495-4 (ISBN)
Available from: 2009-01-22. Created: 2009-01-22. Last updated: 2017-11-27
Johansson, U., Sönströd, C., Löfström, T. & Boström, H. (2008). Chipper - A Novel Algorithm for Concept Description. In: Frontiers in Artificial Intelligence and Applications. Paper presented at the 10th Scandinavian Conference on Artificial Intelligence, SCAI 2008, Stockholm, 26-28 May 2008 (pp. 133-140). IOS Press.
Chipper - A Novel Algorithm for Concept Description
2008 (English). In: Frontiers in Artificial Intelligence and Applications, IOS Press, 2008, pp. 133-140. Conference paper, published paper (peer-reviewed).
Abstract [en]

In this paper, several demands placed on concept description algorithms are identified and discussed. The most important criterion is the ability to produce compact rule sets that, in a natural and accurate way, describe the most important relationships in the underlying domain. An algorithm based on the identified criteria is presented and evaluated. The algorithm, named Chipper, produces decision lists, where each rule covers a maximum number of remaining instances while meeting requested accuracy requirements. In the experiments, Chipper is evaluated on nine UCI data sets. The main result is that Chipper produces compact and understandable rule sets, clearly fulfilling the overall goal of concept description. In the experiments, Chipper’s accuracy is similar to standard decision tree and rule induction algorithms, while rule sets have superior comprehensibility.
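The covering strategy described in the abstract can be sketched as a greedy search over single-feature threshold rules, each covering as many remaining instances as possible at a requested accuracy. This illustrates the general idea only; it is not the authors' implementation, and the data is made up:

```python
from collections import Counter

def best_rule(rows, min_acc):
    """Among single-feature threshold conditions, pick the one covering
    the most remaining rows at accuracy >= min_acc (a greedy stand-in
    for Chipper's rule search)."""
    best = None  # (coverage, (feature, op, threshold), predicted label)
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            for op in ("<=", ">"):
                covered = [(x, y) for x, y in rows
                           if (x[f] <= t) == (op == "<=")]
                if not covered:
                    continue
                label, hits = Counter(y for _, y in covered).most_common(1)[0]
                if hits / len(covered) >= min_acc and \
                        (best is None or len(covered) > best[0]):
                    best = (len(covered), (f, op, t), label)
    return best

def build_decision_list(rows, min_acc=1.0):
    """Repeatedly add the best rule and drop the instances it covers;
    the default rule predicts the majority class of whatever remains."""
    rules, remaining = [], list(rows)
    while remaining:
        found = best_rule(remaining, min_acc)
        if found is None:
            break
        _, (f, op, t), label = found
        rules.append(((f, op, t), label))
        remaining = [(x, y) for x, y in remaining
                     if (x[f] <= t) != (op == "<=")]
    default = (Counter(y for _, y in remaining).most_common(1)[0][0]
               if remaining else rules[-1][1])
    return rules, default

def predict(rules, default, x):
    for (f, op, t), label in rules:
        if (x[f] <= t) == (op == "<="):
            return label
    return default

rows = [([1.0], "a"), ([2.0], "a"), ([3.0], "b"), ([4.0], "b")]
rules, default = build_decision_list(rows)
print(predict(rules, default, [1.5]), predict(rules, default, [3.5]))
```

Lowering `min_acc` trades accuracy for shorter, more comprehensible lists, mirroring the tradeoff the abstract discusses.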

Place, publisher, year, edition, pages
IOS Press, 2008
Series
Frontiers in Artificial Intelligence and Applications, ISSN 0922-6389, 1879-8314; 173
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-3614 (URN); 000273520700017 (); 2-s2.0-84867569402 (Scopus ID); 978-1-58603-867-0 (ISBN)
Conference
10th Scandinavian Conference on Artificial Intelligence, SCAI 2008, Stockholm, 26-28 May 2008
Available from: 2010-01-29. Created: 2010-01-29. Last updated: 2017-11-27
Sönströd, C., Johansson, U., Norinder, U. & Boström, H. (2008). Comprehensible Models for Predicting Molecular Interaction with Heart-Regulating Genes. In: Proceedings of the Seventh International Conference on Machine Learning and Applications (pp. 559-564). IEEE Computer Society
Comprehensible Models for Predicting Molecular Interaction with Heart-Regulating Genes
2008 (English). In: Proceedings of the Seventh International Conference on Machine Learning and Applications, IEEE Computer Society, 2008, pp. 559-564. Conference paper, published paper (peer-reviewed).
Abstract [en]

When using machine learning for in silico modeling, the goal is normally to obtain highly accurate predictive models. Often, however, models should also bring insights into interesting relationships in the domain. It is then desirable that machine learning techniques have the ability to obtain small and transparent models, where the user can control the tradeoff between accuracy, comprehensibility and coverage. In this study, three different decision list algorithms are evaluated on a data set concerning the interaction of molecules with a human gene that regulates heart functioning (hERG). The results show that decision list algorithms can obtain predictive performance not far from the state-of-the-art method random forests, but also that algorithms focusing on accuracy alone may produce complex decision lists that are very hard to interpret. The experiments also show that by sacrificing accuracy only to a limited degree, comprehensibility (measured as both model size and classification complexity) can be improved remarkably.

Place, publisher, year, edition, pages
IEEE Computer Society, 2008
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-2575 (URN); 10.1109/ICMLA.2008.130 (DOI); 000263205800082 (); 2-s2.0-60649087754 (Scopus ID); 978-0-7695-3495-4 (ISBN)
Available from: 2009-01-21. Created: 2009-01-21. Last updated: 2017-11-27
Johansson, U., Boström, H. & König, R. (2008). Extending Nearest Neighbor Classification with Spheres of Confidence. In: Proceedings of the Twenty-First International FLAIRS Conference (FLAIRS 2008). Paper presented at the 21st International Florida Artificial Intelligence Research Society Conference, FLAIRS-21, Coconut Grove, FL, 15-17 May 2008 (pp. 282-287). AAAI Press.
Extending Nearest Neighbor Classification with Spheres of Confidence
2008 (English). In: Proceedings of the Twenty-First International FLAIRS Conference (FLAIRS 2008), AAAI Press, 2008, pp. 282-287. Conference paper, published paper (peer-reviewed).
Abstract [en]

The standard kNN algorithm suffers from two major drawbacks: sensitivity to the parameter value k, i.e., the number of neighbors, and the use of k as a global constant that is independent of the particular region in which the example to be classified falls. Methods using weighted voting schemes only partly alleviate these problems, since they still involve choosing a fixed k. In this paper, a novel instance-based learner is introduced that does not require k as a parameter, but instead employs a flexible strategy for determining the number of neighbors to consider for the specific example to be classified, hence using a local instead of a global k. A number of variants of the algorithm are evaluated on 18 datasets from the UCI repository. The novel algorithm in its basic form is shown to significantly outperform standard kNN with respect to accuracy, and an adapted version of the algorithm is shown to be clearly ahead with respect to the area under ROC curve. Similar to standard kNN, the novel algorithm still allows for various extensions, such as weighted voting and axes scaling.
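One way a query-dependent k can work is to grow the neighbourhood until class purity breaks. The sketch below only illustrates the idea of a local k; it is an assumption for illustration, not the paper's exact sphere-based strategy, and the data is invented:

```python
from collections import Counter

def local_knn_predict(train, query, max_k=None):
    """Query-dependent k: walk neighbours outward from the query and
    stop when the next one changes class, so the local k is the size
    of the class-pure shell around the query."""
    ranked = sorted(train,
                    key=lambda t: sum((a - b) ** 2
                                      for a, b in zip(t[0], query)))
    votes = [ranked[0][1]]
    for _, label in ranked[1:max_k]:
        if label != votes[0]:
            break
        votes.append(label)
    return Counter(votes).most_common(1)[0][0], len(votes)

train = [([0.0, 0.0], "x"), ([0.0, 1.0], "x"),
         ([5.0, 5.0], "o"), ([5.0, 6.0], "o")]
label, local_k = local_knn_predict(train, [0.2, 0.5])
print(label, local_k)  # → x 2
```

A query deep inside a class region thus gets a large effective k, while a query near a class boundary gets a small one, which is the property a fixed global k cannot provide.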

Place, publisher, year, edition, pages
AAAI Press, 2008
Keywords
Artificial intelligence, Standards, Algorithms
HSV category
Research programme
Technology
Identifiers
urn:nbn:se:his:diva-2828 (URN); 2-s2.0-55849145096 (Scopus ID); 978-1-57735-365-2 (ISBN)
Conference
21st International Florida Artificial Intelligence Research Society Conference, FLAIRS-21, Coconut Grove, FL, 15-17 May 2008
Available from: 2009-03-23. Created: 2009-03-04. Last updated: 2019-03-07 (bibliographically checked)
Organisations
Identifiers
ORCID iD: orcid.org/0000-0001-8382-0300