his.se Publications
Ventocilla, Elio
Publications (8 of 8)
Ventocilla, E. (2019). Big Data programming with Apache Spark. In: Alan Said, Vicenç Torra (Ed.), Data science in Practice (pp. 171-194). Springer
2019 (English). In: Data science in Practice / [ed] Alan Said, Vicenç Torra, Springer, 2019, p. 171-194. Chapter in book (Refereed)
Abstract [en]

In this chapter we give an introduction to Apache Spark, a Big Data programming framework. We describe the framework’s core aspects as well as some of the challenges that parallel and distributed computing entail.

Place, publisher, year, edition, pages
Springer, 2019
Series
Studies in Big Data, ISSN 2197-6503, E-ISSN 2197-6511 ; 46
National Category
Computer and Information Sciences; Computer Sciences; Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-16812 (URN); 10.1007/978-3-319-97556-6_10 (DOI); 000464719500011 (); 978-3-319-97556-6 (ISBN); 978-3-319-97555-9 (ISBN)
Available from: 2019-04-24 Created: 2019-04-24 Last updated: 2019-09-30 Bibliographically approved
Ventocilla, E. & Riveiro, M. (2019). Visual Growing Neural Gas for Exploratory Data Analysis. In: Andreas Kerren, Christophe Hurter, Jose Braz (Ed.), Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: Volume 3: IVAPP, 58-71, 2019, Prague, Czech Republic. Paper presented at 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, February 25-27, 2019, Prague, Czech Republic (pp. 58-71). SciTePress, 3
2019 (English). In: Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: Volume 3: IVAPP, 58-71, 2019, Prague, Czech Republic / [ed] Andreas Kerren, Christophe Hurter, Jose Braz, SciTePress, 2019, Vol. 3, p. 58-71. Conference paper, Published paper (Refereed)
Abstract [en]

This paper argues for the use of a topology learning algorithm, the Growing Neural Gas (GNG), for providing an overview of the structure of large and multidimensional datasets that can be used in exploratory data analysis. We introduce a generic, off-the-shelf library, Visual GNG, developed using the Big Data framework Apache Spark, which provides an incremental visualization of the GNG training process, and enables user-in-the-loop interactions where users can pause, resume or steer the computation by changing optimization parameters. Nine case studies were conducted with domain experts from different areas, each working on unique real-world datasets. The results show that Visual GNG contributes to understanding the distribution of multidimensional data; finding which features are relevant in such distribution; estimating the number of k clusters to be used in traditional clustering algorithms, such as K-means; and finding outliers.

Place, publisher, year, edition, pages
SciTePress, 2019
Keywords
Growing Neural Gas, Dimensionality Reduction, Multidimensional Data, Visual Analytics, Exploratory Data Analysis
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-16756 (URN); 10.5220/0007364000580071 (DOI); 2-s2.0-85064748097 (Scopus ID); 978-989-758-354-4 (ISBN)
Conference
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, February 25-27, 2019, Prague, Czech Republic
Available from: 2019-04-08 Created: 2019-04-08 Last updated: 2019-09-30 Bibliographically approved
Ventocilla, E. (2019). Visualizing and Explaining Cluster Patterns: A Framework for the Exploratory Analysis of Large Multidimensional Datasets.
2019 (English). Report (Other academic)
Abstract [en]

Large quantities of data are being collected and analyzed by companies and institutions, with the intention of drawing knowledge and value. Advances in storage, computation, automated analysis and visual and interactive techniques have facilitated this process. It is, however, not always clear how these can be brought together for the effective and efficient exploration, monitoring and/or processing of data. That is, it is not always clear which automated techniques and frameworks to use for a given task, how to deploy and integrate them, how to interpret their results and processes, and how to ease their use through visual and interactive techniques.

In the context of exploratory data analysis, where users approach data without preconceived hypotheses, this thesis proposal argues for the benefit of a framework describing the components, techniques and relations that contribute to the visualization and explanation of cluster patterns in large and multidimensional datasets. That is, a Visual Analytics framework aimed at supporting data scientists in their first steps towards understanding the overall structure of a dataset. The problem area is large and, therefore, the scope is limited to the visualization of cluster patterns in table-like datasets with meaningful attributes.

This thesis proposal motivates the relevance of conducting research for the development of such a framework. It formalizes a set of research questions and presents a research plan based on the design science research method. Moreover, it provides a description of preliminary results as well as related background theory and state-of-the-art research.

Publisher
p. 48
Keywords
visual analytics, machine learning, dimensionality reduction, clustering, visualization, interaction
National Category
Computer and Information Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-16931 (URN)
Note

Thesis proposal, PhD programme, University of Skövde

Available from: 2019-05-31 Created: 2019-05-31 Last updated: 2019-06-03 Bibliographically approved
Bae, J., Ventocilla, E., Riveiro, M. & Torra, V. (2018). On the Visualization of Discrete Non-additive Measures. In: Torra V, Mesiar R, Baets B (Ed.), Aggregation Functions in Theory and in Practice AGOP 2017: . Paper presented at 9th International Summer School on Aggregation Functions (AGOP), Skövde, Sweden, June 19-22, 2017 (pp. 200-210). Springer
2018 (English). In: Aggregation Functions in Theory and in Practice AGOP 2017 / [ed] Torra V, Mesiar R, Baets B, Springer, 2018, p. 200-210. Conference paper, Published paper (Refereed)
Abstract [en]

Non-additive measures generalize additive measures, and have been utilized in several applications. They are used to represent different types of uncertainty and also to represent importance in data aggregation. As non-additive measures are set functions, the number of values to be considered grows exponentially. This makes their definition difficult, but also their interpretation and understanding. In order to support understandability, this paper explores the topic of visualizing discrete non-additive measures using node-link diagram representations.

Place, publisher, year, edition, pages
Springer, 2018
Series
Advances in Intelligent Systems and Computing, ISSN 2194-5357, E-ISSN 2194-5365 ; 581
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
urn:nbn:se:his:diva-15590 (URN); 10.1007/978-3-319-59306-7_21 (DOI); 000432811600021 (); 2-s2.0-85019989762 (Scopus ID); 978-3-319-59306-7 (ISBN); 978-3-319-59305-0 (ISBN)
Conference
9th International Summer School on Aggregation Functions (AGOP), Skövde, Sweden, June 19-22, 2017
Available from: 2018-06-14 Created: 2018-06-14 Last updated: 2018-10-02 Bibliographically approved
Ventocilla, E., Bae, J., Riveiro, M. & Said, A. (2017). A Billiard Metaphor for Exploring Complex Graphs. In: Marijn Koolen, Jaap Kamps, Toine Bogers, Nick Belkin, Diane Kelly, Emine Yilmaz (Ed.), Second Workshop on Supporting Complex Search Tasks: . Paper presented at Second Workshop on Supporting Complex Search Tasks co-located with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR 2017), Oslo, Norway, March 11, 2017 (pp. 37-40). , 1798
2017 (English). In: Second Workshop on Supporting Complex Search Tasks / [ed] Marijn Koolen, Jaap Kamps, Toine Bogers, Nick Belkin, Diane Kelly, Emine Yilmaz, 2017, Vol. 1798, p. 37-40. Conference paper, Published paper (Refereed)
Abstract [en]

Exploring and revealing relations between the elements is a frequent task in exploratory analysis and search. Examples include that of correlations of attributes in complex data sets, or faceted search. Common visual representations for such relations are directed graphs or correlation matrices. These types of visual encodings are often - if not always - fully constructed before being shown to the user. This can be thought of as a top-down approach, where users are presented with a full picture for them to interpret and understand. Such a way of presenting data could lead to a visual overload, especially when it results in complex graphs with high degrees of nodes and edges. We propose a bottom-up alternative called Billiard where few elements are presented at first and from which a user can interactively construct the rest based on what s/he finds of interest. The concept is based on a billiard metaphor where a cue ball (node) has an effect on other elements (associated nodes) when struck against them.

Series
CEUR Workshop Proceedings, E-ISSN 1613-0073 ; 1798
Keywords
Visualization, interaction, correlation
National Category
Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
urn:nbn:se:his:diva-14775 (URN); 2-s2.0-85019592292 (Scopus ID)
Conference
Second Workshop on Supporting Complex Search Tasks co-located with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR 2017), Oslo, Norway, March 11, 2017
Available from: 2018-02-27 Created: 2018-02-27 Last updated: 2018-09-24 Bibliographically approved
Bae, J., Ventocilla, E., Riveiro, M., Helldin, T. & Falkman, G. (2017). Evaluating Multi-Attributes on Cause and Effect Relationship Visualization. In: Alexandru Telea, Jose Braz, Lars Linsen (Ed.), Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017): Volume 3: IVAPP. Paper presented at 8th International Conference on Information Visualization Theory and Applications (IVAPP), part of the 12th International Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017), February 27-March 1, 2017, in Porto, Portugal (pp. 64-74). SciTePress
2017 (English). In: Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017): Volume 3: IVAPP / [ed] Alexandru Telea, Jose Braz, Lars Linsen, SciTePress, 2017, p. 64-74. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents findings about visual representations of the direction, strength, and uncertainty of cause and effect relationships, based on an online user study. While previous research focuses on accuracy and few attributes, our empirical user study examines accuracy and subjective ratings on three different attributes of a cause and effect relationship edge. The cause and effect direction was depicted by arrows and tapered lines; causal strength by hue, width, and a numeric value; and certainty by granularity, brightness, fuzziness, and a numeric value. Our findings point out that both arrows and tapered cues work well to represent causal direction. Depictions with width showed higher conjunct accuracy and were preferred over those with hue. Depictions with brightness and fuzziness showed higher accuracy and were marked more understandable than those with granularity. In general, depictions with hue and granularity performed less accurately and were not preferred compared to the ones with numbers or with width and brightness.

Place, publisher, year, edition, pages
SciTePress, 2017
Keywords
Cause and effect, uncertainty, evaluation, graph visualization
National Category
Computer and Information Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
urn:nbn:se:his:diva-14190 (URN); 10.5220/0006102300640074 (DOI); 000444939500005 (); 2-s2.0-85040593124 (Scopus ID); 978-989-758-228-8 (ISBN)
Conference
8th International Conference on Information Visualization Theory and Applications (IVAPP), part of the 12th International Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017), February 27-March 1, 2017, in Porto, Portugal
Funder
Knowledge Foundation
Available from: 2017-10-02 Created: 2017-10-02 Last updated: 2018-12-20 Bibliographically approved
Ventocilla, E. (2017). On Making Machine Learning Accessible for Exploratory Data Analysis Through Visual Analytics.
2017 (English). Report (Other academic)
Abstract [en]

Visual Analytics is a field of study that seeks to aid human cognition in the process of analyzing data. It aims at doing so through visual interfaces and automated computations. Such automated computations often translate to Machine Learning algorithms. Users can leverage these algorithms in order to find interesting patterns in massive and unstructured data. These algorithms are, however, still often regarded as black boxes, i.e. mathematical models which are hard to interpret and difficult to interact with. In a single Machine Learning library more than 40 variants of algorithms can be found, and some with up to 14 different parameters. The Visual Analytics community has pointed out a lack of research in helping users set parameters and compare results between algorithms. Moreover, it is argued that most technology in the field, which has aimed at bringing transparency to black boxes, is embedded in domain-specific systems, thus making it hard to reach and use for other users and in other domains. In general, Machine Learning has an accessibility challenge: it is hard to use, to interpret and, if not either of these, to reach. This research proposal motivates the need for research in making Machine Learning more accessible for exploratory data analysis, reviews existing work in the field, presents a research plan and, finally, describes preliminary results.

Publisher
p. 28
Keywords
visual analytics, machine learning, exploratory data analysis, big data, interpretability
National Category
Computer Systems; Interaction Technologies
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
urn:nbn:se:his:diva-14379 (URN)
Note

Research proposal, PhD programme, University of Skövde

Available from: 2017-11-08 Created: 2017-11-08 Last updated: 2018-06-11 Bibliographically approved
Ventocilla, E. & Riveiro, M. (2017). Visual Analytics Solutions as 'off-the-shelf' Libraries. In: Ebad Banissi, Mark W. McK. Bannatyne, Fatma Bouali, Nuno Miguel Soares Datia, Georges Grinstein, Dennis Groth, Weidong Huang, Malinka Ivanova, Sarah Kenderdine, Minoru Nakayama, Joao Moura Pires, Muhammad Sarfraz, Marco Temperini, Anna Ursyn, Gilles Venturini, Theodor G. Wyeld, Jian J. Zhang (Ed.), 2017 21st International Conference Information Visualisation (IV): Computer Graphics, Imaging and Visualisation. Biomedical Visualization, Visualisation on Built and Rural Environments & Geometric Modelling and Imaging, IEETeL2017. Paper presented at 2017 21st International Conference Information Visualisation (IV), London, United Kingdom, July 11-14, 2017 (pp. 281-287). IEEE Computer Society
2017 (English). In: 2017 21st International Conference Information Visualisation (IV): Computer Graphics, Imaging and Visualisation. Biomedical Visualization, Visualisation on Built and Rural Environments & Geometric Modelling and Imaging, IEETeL2017 / [ed] Ebad Banissi, Mark W. McK. Bannatyne, Fatma Bouali, Nuno Miguel Soares Datia, Georges Grinstein, Dennis Groth, Weidong Huang, Malinka Ivanova, Sarah Kenderdine, Minoru Nakayama, Joao Moura Pires, Muhammad Sarfraz, Marco Temperini, Anna Ursyn, Gilles Venturini, Theodor G. Wyeld, Jian J. Zhang, IEEE Computer Society, 2017, p. 281-287. Conference paper, Published paper (Refereed)
Abstract [en]

Visual Analytics has brought forward many solutions to different tasks such as exploring topics, understanding user and customer behavior, comparing genomes, or detecting anomalies. Many of these solutions, if not most, are standalone applications with technological contributions which cannot be easily taken for reuse in other domains, further improvement, benchmarking, or integration and deployment alongside other solutions. The latter can prove especially helpful for exploratory data analysis. This often leads researchers to re-implement solutions and thus to a suboptimal use of skills and resources. This paper discusses further the lack of off-the-shelf libraries for Visual Analytics, and proposes the creation of pluggable libraries on top of existing technologies such as Spark and Zeppelin. We provide an illustrative example of a pluggable Visual Analytics library using these technologies.

Place, publisher, year, edition, pages
IEEE Computer Society, 2017
Series
IEEE International Conference on Information Visualisation, ISSN 1550-6037, E-ISSN 2375-0138
Keywords
visual analytics, machine learning, off-the-shelf libraries
National Category
Computer and Information Sciences; Computer Sciences; Human Computer Interaction
Research subject
Skövde Artificial Intelligence Lab (SAIL); INF301 Data Science
Identifiers
urn:nbn:se:his:diva-14682 (URN); 10.1109/iV.2017.77 (DOI); 000419271000044 (); 2-s2.0-85040618455 (Scopus ID); 978-1-5386-0832-6 (ISBN); 978-1-5386-0831-9 (ISBN)
Conference
2017 21st International Conference Information Visualisation (IV), London, United Kingdom, July 11-14, 2017
Projects
NOVA (20140294), Swedish Knowledge Foundation
Funder
Knowledge Foundation
Available from: 2018-01-25 Created: 2018-01-25 Last updated: 2018-07-31 Bibliographically approved