Högskolan i Skövde

Publications (10 of 14)
Ventocilla, E., Martins, R. M., Paulovich, F. & Riveiro, M. (2021). Scaling the Growing Neural Gas for Visual Cluster Analysis. Big Data Research, 26, Article ID 100254.
2021 (English). In: Big Data Research, ISSN 2214-5796, E-ISSN 2214-580X, Vol. 26, article id 100254. Article in journal (Refereed) Published
Abstract [en]

The growing neural gas (GNG) is an unsupervised topology learning algorithm that models a data space through interconnected units that stand on the populated areas of that space. Its output is a graph that can be visually represented on a two-dimensional plane and used as a means to disclose cluster patterns in datasets. GNG, however, creates highly connected graphs when trained on high-dimensional data, which in turn leads to highly cluttered representations that fail to disclose any meaningful patterns. Moreover, its sequential learning limits its potential for faster executions on local datasets and, more importantly, its potential for training on distributed datasets while leveraging the computational resources of the infrastructures in which they reside.

This paper presents two methods that improve GNG for the visualization of cluster patterns in large and high-dimensional datasets. The first one focuses on providing more meaningful and accurate cluster pattern representations of high-dimensional datasets, by avoiding connections that lead to high-dimensional graphs in the modeled topology, which may, in turn, lead to visual cluttering in 2D representations. The second method presented in this paper enables the use of GNG on big and distributed datasets with faster execution times, by modeling and merging separate parts of a dataset using the MapReduce model.

Quantitative and qualitative evaluations show that the first method leads to the creation of lower-dimensional graph structures, which in turn provide more accurate and meaningful cluster representations; and that the second method preserves the accuracy and meaning of the cluster representations while enabling its execution in distributed settings.
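
The baseline GNG that this abstract builds on can be illustrated with a minimal sketch. It is not the authors' implementation and includes neither of the paper's two methods; it follows the standard GNG procedure as commonly described (winner adaptation, edge aging, periodic unit insertion). The function name `train_gng` and all parameter names and default values (`lam`, `eps_b`, `eps_n`, `max_age`, `alpha`, `decay`) are assumptions chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_gng(data, max_units=20, lam=50, eps_b=0.05, eps_n=0.006,
              max_age=50, alpha=0.5, decay=0.995, epochs=5, seed=0):
    """Train a basic growing neural gas; returns (units, edges).

    units: dict unit_id -> position (list of floats)
    edges: dict frozenset({i, j}) -> age
    All hyperparameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    units = {0: list(rng.choice(data)), 1: list(rng.choice(data))}
    error = {0: 0.0, 1: 0.0}
    edges = {}
    next_id, step = 2, 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            step += 1
            # Nearest (s1) and second-nearest (s2) units to the input.
            s1, s2 = sorted(units, key=lambda u: sq_dist(units[u], x))[:2]
            error[s1] += sq_dist(units[s1], x)
            # Move the winner toward the input.
            units[s1] = [w + eps_b * (xi - w) for w, xi in zip(units[s1], x)]
            # Age the winner's edges and move its neighbors slightly.
            for e in list(edges):
                if s1 in e:
                    edges[e] += 1
                    (n,) = e - {s1}
                    units[n] = [w + eps_n * (xi - w)
                                for w, xi in zip(units[n], x)]
            # Create (or refresh) the edge between the two nearest units.
            edges[frozenset((s1, s2))] = 0
            # Drop edges that are too old, then units left without edges.
            for e in [e for e, age in edges.items() if age > max_age]:
                del edges[e]
            connected = set().union(*edges)
            for u in [u for u in units if u not in connected]:
                del units[u]
                del error[u]
            # Every `lam` inputs, insert a unit where the error is largest.
            if step % lam == 0 and len(units) < max_units:
                q = max(error, key=error.get)
                nbrs = [next(iter(e - {q})) for e in edges if q in e]
                f = max(nbrs, key=error.get)
                r, next_id = next_id, next_id + 1
                units[r] = [(a + b) / 2 for a, b in zip(units[q], units[f])]
                del edges[frozenset((q, f))]
                edges[frozenset((q, r))] = 0
                edges[frozenset((r, f))] = 0
                error[q] *= alpha
                error[f] *= alpha
                error[r] = error[q]
            # Global error decay.
            for u in error:
                error[u] *= decay
    return units, edges
```

The returned graph (unit positions plus edges) is the structure that would then be laid out on a two-dimensional plane; the layout step itself is outside this sketch.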

Place, publisher, year, edition, pages
Elsevier, 2021
Keywords
Growing neural gas, clustering, cluster patterns, visualization, mapreduce
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL); VF-KDO
Identifiers
urn:nbn:se:his:diva-19460 (URN); 10.1016/j.bdr.2021.100254 (DOI); 000710458600012 (); 2-s2.0-85113545584 (Scopus ID)
Note

CC BY 4.0

Available from: 2021-02-10 Created: 2021-02-10 Last updated: 2023-02-22 Bibliographically approved
Ventocilla, E. (2021). Visualizing Cluster Patterns at Scale: A Model and a Library. (Doctoral dissertation). Skövde: University of Skövde
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Large quantities of data are being collected and analyzed by companies and institutions, with the aim of extracting knowledge and value. When little is known about the data at hand, analysts engage in exploratory data analysis to achieve a better understanding. One approach in doing so is through the modeling and visualization of a dataset's structure, i.e., the neighborhood relations among its data points, and their distribution in the multidimensional space. Such a process allows users to disclose and discover neighborhoods, outliers and cluster patterns - insights that enable more informed subsequent analytical decisions.

Visualizing the structure of multidimensional data (i.e., with four or more dimensions or features) is generally done via two steps: modeling distance or neighborhood relations among data points, and visually encoding those modeled relations. As datasets grow in size (number of data points) and dimensionality (number of features), different scalability challenges arise. High-dimensional datasets, on the one hand, are more sparse in the multidimensional space, making it more difficult to make meaningful assessments about distances during the modeling, which, in turn, hinders the meaningfulness of the visual representations. Large datasets, on the other hand, make it more difficult to maintain the usability of a system in terms of the effectiveness of the visual representation (due to clutter or overplotting), or the efficiency of the solution (time and memory-wise). Different approaches have been proposed to overcome these challenges, but they apply to a particular combination of modeling and visual encoding, and their usability still degrades when dealing with very large, potentially distributed, multidimensional datasets.

Moreover, the availability or format of existing Visual Analytics solutions (i.e., solutions that aid data analysis through Machine Learning, visual and interactive techniques) for visualizing data structure - and, hence, cluster patterns - presents an accessibility challenge to the data science community. Namely, many solutions are either unavailable for use, thus requiring their re-implementation, or come as domain-dependent or standalone applications which are too rigid to use in other scenarios, or to integrate with other data analysis tools. 

This thesis addresses these challenges and makes two contributions: a process model, describing a generic approach for the effective and efficient visualization of cluster patterns in large and multidimensional datasets; and an open source library for the interactive visualization of cluster patterns, even in distributed datasets, packaged in an accessible format that allows its integration with other tools within a data analysis environment. The process model suggests sampling and vector quantization to avoid cluttering and overplotting, as well as to improve the efficiency of the system in terms of memory and latency (i.e., times taken to produce visual feedback from the modeling process and from user interactions). The library instantiates one of the possible configurations of the model, using Apache Spark for distributed computations, the Growing Neural Gas for vector quantization, and Force-directed Placement for constructing the two-dimensional layout. Seven research publications provide empirical and theoretical groundings to the validity of both the model and the library.

Abstract [sv]

Large amounts of data are (today) collected and analyzed to create new knowledge and contribute value in academic and industrial applications. To gain a better understanding of what the data contains, especially when knowledge about its content is lacking, analysts use Visual Analytics (VA). A typical approach within VA is to model and then visualize the underlying structure of the data. This can be done, for example, by visualizing the local relations among the data points in a multidimensional space. Through such an analysis, analysts can discover interesting regions, outliers, and clusters of data points. With these insights, subsequent analyses can be carried out in a better way (and thus yield even greater added value).

Visualizing high-dimensional data, that is, data with more than four dimensions, is usually a two-step process. In the first step, the relations among the data points are modeled. In the second step, the data points are visualized in a low-dimensional space in which as much as possible of the relations among the data points is preserved. Scalability problems arise, however, as datasets grow in size, which covers both the number of data dimensions and the number of data points. High-dimensional data is often sparse and spread out in the high-dimensional space, which makes it hard to model the relations among the data points. This, in turn, makes it hard to produce meaningful and informative visualizations. Datasets with many data points are also hard to visualize and require a lot of computation time and memory to obtain good representations, and there is a large risk that these representations become cluttered and hard to interpret. As a result, the usefulness of VA decreases as the size of the data grows. A good deal of research has been done in this area and solutions have been presented for specific problems, but so far there are no general solutions for large and high-dimensional datasets.

Moreover, many of the solutions that have been developed are not publicly available, or are only available for use in specific domains. This means that these solutions often have to be re-created if they are to be used in other domains or integrated into standardized software. As a result, the whole field of data science faces a major challenge regarding which formats can be used when data is clustered and how easily accessible, public software can be created.

The thesis makes two contributions to this. First, it describes an effective and general approach for processing and visualizing high-dimensional data. In addition, a software library that enables interactive analysis of data, even when the data is high-dimensional, has been published and is available as open source. Because the software is open source, it can be integrated into other analysis software. The approach to modeling the data uses sampling and vector quantization to prevent the visualizations from containing too many points and thereby becoming cluttered. These methods also mean that the modeling and visualization require less memory, and that the model can be part of a process in which it produces visual representations and an analyst gives feedback that the model reacts to.

The published software library contains one of all possible configurations of the proposed model and is implemented to run on Apache Spark. In this implementation, a GNG is used to perform the vector quantization, and force-directed placement (FDP) is then used to construct a two-dimensional representation that can be visualized. To establish validity and provide empirical and theoretical support for the model, seven scientific publications have been published.

Place, publisher, year, edition, pages
Skövde: University of Skövde, 2021
Series
Dissertation Series ; 36 (2021)
Keywords
Visual analytics, cluster patterns, big data, unsupervised learning, multidimensional projections, vector quantization, progressive visual analytics
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19458 (URN); 978-91-984919-0-6 (ISBN)
Public defence
2021-03-12, Insikten, Kanikegränd 3a, 541 34, Skövde, 13:15 (English)
Available from: 2021-02-16 Created: 2021-02-10 Last updated: 2021-02-18 Bibliographically approved
Ventocilla, E. & Riveiro, M. (2020). A comparative user study of visualization techniques for cluster analysis of multidimensional data sets. Information Visualization, 19(4), 318-338
2020 (English). In: Information Visualization, ISSN 1473-8716, E-ISSN 1473-8724, Vol. 19, no 4, p. 318-338. Article in journal (Refereed) Published
Abstract [en]

This article presents an empirical user study that compares eight multidimensional projection techniques for supporting the estimation of the number of clusters, k, embedded in six multidimensional data sets. The selection of the techniques was based on their intended design, or use, for visually encoding data structures, that is, neighborhood relations between data points or groups of data points in a data set. Concretely, we study: the difference between the estimates of k as given by participants when using different multidimensional projections; the accuracy of user estimations with respect to the number of labels in the data sets; the perceived usability of each multidimensional projection; whether user estimates disagree with k values given by a set of cluster quality measures; and whether there is a difference between experienced and novice users in terms of estimates and perceived usability. The results show that: dendrograms (from Ward's hierarchical clustering) are likely to lead to estimates of k that are different from those given with other multidimensional projections, while Star Coordinates and Radial Visualizations are likely to lead to similar estimates; t-Stochastic Neighbor Embedding is likely to lead to estimates which are closer to the number of labels in a data set; cluster quality measures are likely to produce estimates which are different from those given by users using Ward and t-Stochastic Neighbor Embedding; U-Matrices and reachability plots will likely have a low perceived usability; and there is no statistically significant difference between the answers of experienced and novice users. Moreover, as data dimensionality increases, cluster quality measures are likely to produce estimates which are different from those perceived by users using any of the assessed multidimensional projections. 
It is also apparent that the inherent complexity of a data set, as well as the capability of each visual technique to disclose such complexity, has an influence on the perceived usability.

Place, publisher, year, edition, pages
Sage Publications, 2020
Keywords
Cluster patterns, visualization, data structure, user study, multidimensional data
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL); VF-KDO
Identifiers
urn:nbn:se:his:diva-18851 (URN); 10.1177/1473871620922166 (DOI); 000545375800001 (); 2-s2.0-85087437127 (Scopus ID)
Note

CC BY

Available from: 2020-07-20 Created: 2020-07-20 Last updated: 2023-02-22 Bibliographically approved
Ventocilla, E. & Riveiro, M. (2020). A Model for the Progressive Visualization of Multidimensional Data Structure. In: Ana Paula Cláudio; Kadi Bouatouch; Manuela Chessa; Alexis Paljic; Andreas Kerren; Christophe Hurter; Alain Tremeau; Giovanni Maria Farinella (Ed.), Computer Vision, Imaging and Computer Graphics Theory and Applications: 14th International Joint Conference, VISIGRAPP 2019, Prague, Czech Republic, February 25–27, 2019, Revised Selected Papers. Paper presented at 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2019; Prague; Czech Republic; 25 February 2019 through 27 February 2019; Code 237909 (pp. 203-226). Cham: Springer, 1182
2020 (English). In: Computer Vision, Imaging and Computer Graphics Theory and Applications: 14th International Joint Conference, VISIGRAPP 2019, Prague, Czech Republic, February 25–27, 2019, Revised Selected Papers / [ed] Ana Paula Cláudio; Kadi Bouatouch; Manuela Chessa; Alexis Paljic; Andreas Kerren; Christophe Hurter; Alain Tremeau; Giovanni Maria Farinella, Cham: Springer, 2020, Vol. 1182, p. 203-226. Chapter in book (Refereed)
Abstract [en]

This paper presents a model for the progressive visualization and exploration of the structure of large datasets. The model is an abstraction of the components and relations that provide the means for constructing a visual representation of a dataset's structure, with continuous system feedback and user interactions for computational steering, regardless of size. In this context, the structure of a dataset is regarded as the distance or neighborhood relationships among its data points. Size, on the other hand, is defined in terms of the number of data points. To demonstrate the validity of the model, a proof of concept was developed as a Visual Analytics library for Apache Zeppelin and Apache Spark. Moreover, nine user studies were carried out to assess the usability of the library. The results from the user studies show that the library is useful for visualizing and understanding the emerging cluster patterns, for identifying relevant features, and for estimating the number of clusters.

Place, publisher, year, edition, pages
Cham: Springer, 2020
Series
Communications in Computer and Information Science, ISSN 1865-0929, E-ISSN 1865-0937 ; 1182
Keywords
Data structure, Exploratory data analysis, Growing neural gas, Large data analysis, Multidimensional data, Multidimensional projection, Progressive visualization, Visual analytics, Computation theory, Computer vision, Data handling, Data structures, Information analysis, Large dataset, Petroleum prospecting, Visualization, Data visualization
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL); VF-KDO
Identifiers
urn:nbn:se:his:diva-18343 (URN); 10.1007/978-3-030-41590-7_9 (DOI); 000659188700009 (); 2-s2.0-85081622039 (Scopus ID); 978-3-030-41589-1 (ISBN); 978-3-030-41590-7 (ISBN)
Conference
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2019; Prague; Czech Republic; 25 February 2019 through 27 February 2019; Code 237909
Available from: 2020-03-26 Created: 2020-03-26 Last updated: 2023-02-24 Bibliographically approved
Ventocilla, E., Martins, R. M., Paulovich, F. & Riveiro, M. (2020). Progressive Multidimensional Projections: A Process Model based on Vector Quantization. In: Daniel Archambault, Ian Nabney, Jaakko Peltonen (Ed.), Machine Learning Methods in Visualisation for Big Data. Paper presented at MLVis 2020, International Workshop on Machine Learning in Visualisation for Big Data 2020 Co-located with EGEV2020 - Eurographics & Eurovis 2020, May 25, 2020, Norrköping, Sweden (pp. 1-5). Eurographics - European Association for Computer Graphics
2020 (English). In: Machine Learning Methods in Visualisation for Big Data / [ed] Daniel Archambault, Ian Nabney, Jaakko Peltonen, Eurographics - European Association for Computer Graphics, 2020, p. 1-5. Conference paper, Published paper (Refereed)
Abstract [en]

As large datasets become more common, so does the need for exploratory approaches that allow iterative, trial-and-error analysis. Without such solutions, hypothesis testing and exploratory data analysis may become cumbersome due to long waiting times for feedback from computationally intensive algorithms. This work presents a process model for progressive multidimensional projections (P-MDPs) that enables early feedback and user involvement in the process, complementing previous work by providing a lower level of abstraction and describing the specific elements that can be used to provide early system feedback, and those which can be enabled for user interaction. Additionally, we outline a set of design constraints that must be taken into account to ensure the usability of a solution regarding feedback time, visual cluttering, and the interactivity of the view. To address these constraints, we propose the use of incremental vector quantization (iVQ) as a core step within the process. To illustrate the feasibility of the model, and the usefulness of the proposed iVQ-based solution, we present a prototype that demonstrates how the different usability constraints can be accounted for, regardless of the size of a dataset.

Place, publisher, year, edition, pages
Eurographics - European Association for Computer Graphics, 2020
Keywords
Multidimensional projections, progressive visual analytics, vector quantization, process model
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19461 (URN); 10.2312/mlvis.20201099 (DOI); 2-s2.0-85113558878 (Scopus ID); 978-3-03868-113-7 (ISBN)
Conference
MLVis 2020, International Workshop on Machine Learning in Visualisation for Big Data 2020 Co-located with EGEV2020 - Eurographics & Eurovis 2020, May 25, 2020, Norrköping, Sweden
Available from: 2021-02-10 Created: 2021-02-10 Last updated: 2022-09-08 Bibliographically approved
Ventocilla, E. (2019). Big Data programming with Apache Spark. In: Alan Said; Vicenç Torra (Ed.), Data science in Practice (pp. 171-194). Springer
2019 (English). In: Data science in Practice / [ed] Alan Said; Vicenç Torra, Springer, 2019, p. 171-194. Chapter in book (Refereed)
Abstract [en]

In this chapter we give an introduction to Apache Spark, a Big Data programming framework. We describe the framework’s core aspects as well as some of the challenges that parallel and distributed computing entail.

Place, publisher, year, edition, pages
Springer, 2019
Series
Studies in Big Data, ISSN 2197-6503, E-ISSN 2197-6511 ; 46
National Category
Computer and Information Sciences Computer Sciences Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-16812 (URN); 10.1007/978-3-319-97556-6_10 (DOI); 000464719500011 (); 2-s2.0-85105366633 (Scopus ID); 978-3-319-97556-6 (ISBN); 978-3-319-97555-9 (ISBN)
Available from: 2019-04-24 Created: 2019-04-24 Last updated: 2022-12-28 Bibliographically approved
Ventocilla, E. & Riveiro, M. (2019). Visual Growing Neural Gas for Exploratory Data Analysis. In: Andreas Kerren; Christophe Hurter; Jose Braz (Ed.), Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: Volume 3: IVAPP, 58-71, 2019, Prague, Czech Republic. Paper presented at 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, February 25-27, 2019, Prague, Czech Republic (pp. 58-71). SciTePress, 3
2019 (English). In: Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: Volume 3: IVAPP, 58-71, 2019, Prague, Czech Republic / [ed] Andreas Kerren; Christophe Hurter; Jose Braz, SciTePress, 2019, Vol. 3, p. 58-71. Conference paper, Published paper (Refereed)
Abstract [en]

This paper argues for the use of a topology learning algorithm, the Growing Neural Gas (GNG), for providing an overview of the structure of large and multidimensional datasets that can be used in exploratory data analysis. We introduce a generic, off-the-shelf library, Visual GNG, developed using the Big Data framework Apache Spark, which provides an incremental visualization of the GNG training process, and enables user-in-the-loop interactions where users can pause, resume or steer the computation by changing optimization parameters. Nine case studies were conducted with domain experts from different areas, each working on unique real-world datasets. The results show that Visual GNG contributes to understanding the distribution of multidimensional data; finding which features are relevant in such a distribution; estimating the number of clusters, k, to be used in traditional clustering algorithms, such as K-means; and finding outliers.

Place, publisher, year, edition, pages
SciTePress, 2019
Keywords
Growing Neural Gas, Dimensionality Reduction, Multidimensional Data, Visual Analytics, Exploratory Data Analysis
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL); VF-KDO
Identifiers
urn:nbn:se:his:diva-16756 (URN); 10.5220/0007364000580071 (DOI); 000668124000005 (); 2-s2.0-85064748097 (Scopus ID); 978-989-758-354-4 (ISBN)
Conference
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, February 25-27, 2019, Prague, Czech Republic
Available from: 2019-04-08 Created: 2019-04-08 Last updated: 2023-02-24 Bibliographically approved
Ventocilla, E. (2019). Visualizing and Explaining Cluster Patterns: A Framework for the Exploratory Analysis of Large Multidimensional Datasets. Skövde: University of Skövde
2019 (English). Report (Other academic)
Abstract [en]

Large quantities of data are being collected and analyzed by companies and institutions, with the intention of drawing knowledge and value. Advances in storage, computation, automated analysis and visual and interactive techniques have facilitated this process. It is, however, not always clear how these can be brought together for the effective and efficient exploration, monitoring and/or processing of data: which automated techniques and frameworks to use for a given task, how to deploy and integrate them, how to interpret their results and processes, and how to ease their use through visual and interactive techniques.

In the context of exploratory data analysis, where users approach data without preconceived hypotheses, this thesis proposal argues for the benefit of a framework describing the components, techniques and relations that contribute to the visualization and explanation of cluster patterns in large and multidimensional datasets. That is, a Visual Analytics framework aimed at supporting data scientists in their first steps towards understanding the overall structure of a dataset. The problem area is large and, therefore, the scope is limited to the visualization of cluster patterns in table-like datasets with meaningful attributes.

This thesis proposal motivates the relevance of conducting research for the development of such a framework. It formalizes a set of research questions and presents a research plan based on the design science research method. Moreover, it provides a description of preliminary results as well as related background theory and state-of-the-art research.

Place, publisher, year, edition, pages
Skövde: University of Skövde, 2019. p. 48
Keywords
visual analytics, machine learning, dimensionality reduction, clustering, visualization, interaction
National Category
Computer and Information Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-16931 (URN)
Note

Thesis proposal, PhD programme, University of Skövde

Available from: 2019-05-31 Created: 2019-05-31 Last updated: 2023-07-19 Bibliographically approved
Bae, J., Ventocilla, E., Riveiro, M. & Torra, V. (2018). On the Visualization of Discrete Non-additive Measures. In: Vicenç Torra; Radko Mesiar; Bernard De Baets (Ed.), Aggregation Functions in Theory and in Practice AGOP 2017. Paper presented at 9th International Summer School on Aggregation Functions (AGOP), Skövde, Sweden, June 19-22, 2017 (pp. 200-210). Springer
2018 (English). In: Aggregation Functions in Theory and in Practice AGOP 2017 / [ed] Vicenç Torra; Radko Mesiar; Bernard De Baets, Springer, 2018, p. 200-210. Conference paper, Published paper (Refereed)
Abstract [en]

Non-additive measures generalize additive measures, and have been utilized in several applications. They are used to represent different types of uncertainty and also to represent importance in data aggregation. As non-additive measures are set functions, the number of values to be considered grows exponentially. This makes their definition, as well as their interpretation and understanding, difficult. To support understandability, this paper explores the topic of visualizing discrete non-additive measures using node-link diagram representations.
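
The exponential growth mentioned in this abstract can be made concrete with a small sketch: a measure on n elements assigns one value to each of the 2^n subsets, and additivity can be checked by brute force over disjoint pairs. The helper names `powerset` and `is_additive` and the example measure `mu` are hypothetical illustrations, not taken from the paper.

```python
from itertools import combinations

def powerset(elements):
    """All subsets of `elements` as frozensets (2**n of them)."""
    items = sorted(elements)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_additive(mu, elements, tol=1e-9):
    """True if mu(A | B) == mu(A) + mu(B) for all disjoint subsets A, B."""
    subsets = powerset(elements)
    return all(abs(mu[a | b] - mu[a] - mu[b]) <= tol
               for a in subsets for b in subsets if not (a & b))

# Hypothetical example measure on three elements: mu(A) = (|A| / 3) ** 2.
# It is monotone with mu(empty) = 0 and mu(full set) = 1, but not additive.
elements = {"a", "b", "c"}
mu = {s: (len(s) / 3) ** 2 for s in powerset(elements)}
```

Each subset here would correspond to one node in a node-link diagram of the subset lattice, which is why the number of nodes to draw doubles with every added element.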

Place, publisher, year, edition, pages
Springer, 2018
Series
Advances in Intelligent Systems and Computing, ISSN 2194-5357, E-ISSN 2194-5365 ; 581
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
urn:nbn:se:his:diva-15590 (URN); 10.1007/978-3-319-59306-7_21 (DOI); 000432811600021 (); 2-s2.0-85019989762 (Scopus ID); 978-3-319-59306-7 (ISBN); 978-3-319-59305-0 (ISBN)
Conference
9th International Summer School on Aggregation Functions (AGOP), Skövde, Sweden, June 19-22, 2017
Available from: 2018-06-14 Created: 2018-06-14 Last updated: 2023-01-02 Bibliographically approved
Ventocilla, E., Helldin, T., Riveiro, M., Bae, J., Boeva, V., Falkman, G. & Lavesson, N. (2018). Towards a Taxonomy for Interpretable and Interactive Machine Learning. In: David W. Aha, Trevor Darrell, Patrick Doherty, Daniele Magazzeni (Ed.), : . Paper presented at 2nd Workshop on Explainable AI (XAI-18), 27th International Joint Conferences on Artificial Intelligence (IJCAI) July 13-19, 2018, Stockholm, Sweden (pp. 151-157).
2018 (English). In: / [ed] David W. Aha, Trevor Darrell, Patrick Doherty, Daniele Magazzeni, 2018, p. 151-157. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

We propose a taxonomy for classifying and describing papers which contribute to making Machine Learning (ML) techniques interactive and interpretable for users. The taxonomy is composed of six elements – Dataset, Optimizer, Model, Predictions, Evaluator and Goodness – where each can be made available for user interpretation and interaction. We give definitions to the terms interpretable and interactive in the context of user-oriented Machine Learning, describe the role of each of the elements in the taxonomy, and describe papers as seen through the lens of the proposed taxonomy.

Keywords
Machine learning, interpretable machine learning, interactive machine learning
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19457 (URN); 10.13140/RG.2.2.14534.98886 (DOI)
Conference
2nd Workshop on Explainable AI (XAI-18), 27th International Joint Conferences on Artificial Intelligence (IJCAI) July 13-19, 2018, Stockholm, Sweden
Available from: 2021-02-10 Created: 2021-02-10 Last updated: 2021-02-19 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-0864-5247