Högskolan i Skövde

Project

Project type/Form of grant
Project grant
Title [sv]
Disclosure risk and transparency in big data privacy
Title [en]
Disclosure risk and transparency in big data privacy
Abstract [en]
Data privacy aims to ensure that no disclosure of sensitive information takes place, or that the risk of disclosure is minimized. Three different communities work on this topic: one in statistics (statistical disclosure control) and two within computer science (privacy in data mining and databases, and privacy in communications). There is an increasing interest in privacy. Legislation has been updated (at both national and international level), funding agencies include specific calls (see e.g. H2020 calls), and software packages include modules for data privacy. Solutions for data privacy need to be computationally feasible and able to find compromises between privacy and security, and between privacy and data utility. Any discussion on privacy is rooted in the notion of disclosure. A good understanding of its meaning and consequences is fundamental, and accurate estimation of the disclosure risk is essential for any data release. For standard databases, the problem is understood. Nevertheless, in big data, disclosure risk, and more particularly identity disclosure, is a more complex issue. In this project we will study disclosure risk in the context of big data. We will focus on the re-identification model that seems to fit big data best. We will study the effects of transparency and data provenance on disclosure. We will develop masking methods for graph data (social networks) and use them as test beds for our risk analysis.
Publications (10 of 16)
Senavirathne, N. & Torra, V. (2023). Rounding based continuous data discretization for statistical disclosure control. Journal of Ambient Intelligence and Humanized Computing, 14(11), 15139-15157
2023 (English). In: Journal of Ambient Intelligence and Humanized Computing, ISSN 1868-5137, E-ISSN 1868-5145, Vol. 14, no 11, p. 15139-15157. Article in journal (Refereed). Published
Abstract [en]

“Rounding” can be understood as a way to coarsen continuous data. That is, low-level and infrequent values are replaced by high-level and more frequent representative values. This concept is explored as a method for data privacy in the statistical disclosure control literature, with perturbative techniques like rounding and microaggregation and non-perturbative methods like generalisation. Even though “rounding” is well known as a numerical data protection method, it has not been studied in depth or evaluated empirically to the best of our knowledge. This work is motivated by three objectives: (1) to study the alternative methods of obtaining the rounding values to represent a given continuous variable, (2) to empirically evaluate rounding as a data protection technique based on information loss (IL) and disclosure risk (DR), and (3) to analyse the impact of data rounding on machine learning based models. Here, in order to obtain the rounding values, we consider discretization methods introduced in the unsupervised machine learning literature along with microaggregation and re-sampling based approaches. The results indicate that microaggregation based techniques are preferred over unsupervised discretization methods due to their fair trade-off between IL and DR.
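
As an illustration of the microaggregation-based rounding mentioned in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes a single numeric variable, fixed-size grouping over the sorted values, and the group mean as the representative value; the function name `microaggregate_round` and the parameter `k` are hypothetical.

```python
from statistics import mean


def microaggregate_round(values, k):
    """Replace each value by the mean of a group of at least k sorted neighbours.

    Illustrative univariate microaggregation: values are sorted, cut into
    consecutive groups of size k, and a tail shorter than k is merged into
    the last group so that every group has at least k records.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Indices of the records, ordered by their value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    rounded = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        # If fewer than k records would remain, absorb them into this group.
        if n - end < k:
            end = n
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            rounded[i] = centroid
        start = end
    return rounded
```

Larger values of `k` coarsen the variable more, which would be expected to lower disclosure risk at the cost of higher information loss, the trade-off the abstract evaluates.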

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Micro data protection, Rounding for micro data, Unsupervised discretization, Discrete event simulation, Economic and social effects, Machine learning, Numerical methods, Volume measurement, Data protection techniques, Discretization method, Numerical data protection methods, Perturbative techniques, Statistical disclosure Control, Unsupervised machine learning, Data privacy
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-17858 (URN); 10.1007/s12652-019-01489-7 (DOI); 2-s2.0-85074009425 (Scopus ID)
Funder
Swedish Research Council, 2016-03346
Note

CC BY 4.0

Published: 25 September 2019

Correspondence to Navoda Senavirathne.

This work is supported by Vetenskapsrådet project: “Disclosure risk and transparency in big data privacy” (VR 2016-03346, 2017-2020)

DRIAT

Available from: 2019-11-07 Created: 2019-11-07 Last updated: 2024-02-13 Bibliographically approved
Senavirathne, N. & Torra, V. (2021). Systematic evaluation of probabilistic K-anonymity for privacy preserving micro-data publishing and analysis. In: Sabrina De Capitani di Vimercati; Pierangela Samarati (Ed.), Proceedings of the 18th International Conference on Security and Cryptography, SECRYPT 2021. Paper presented at 18th International Conference on Security and Cryptography, SECRYPT 2021, Virtual, Online, 6 July 2021 - 8 July 2021 (pp. 307-320). SciTePress
2021 (English). In: Proceedings of the 18th International Conference on Security and Cryptography, SECRYPT 2021 / [ed] Sabrina De Capitani di Vimercati; Pierangela Samarati, SciTePress, 2021, p. 307-320. Conference paper, Published paper (Refereed)
Abstract [en]

In the light of stringent privacy laws, data anonymization not only supports privacy preserving data publication (PPDP) but also improves the flexibility of micro-data analysis. Machine learning (ML) is widely used for personal data analysis in the present day; thus, it is paramount to understand how to effectively use data anonymization in the ML context. In this work, we introduce an anonymization framework based on the notion of “probabilistic k-anonymity” that can be applied to mixed datasets while addressing the challenges brought forward by the existing syntactic privacy models in the context of ML. Through systematic empirical evaluation, we show that the proposed approach can effectively limit the disclosure risk in micro-data publishing while maintaining a high utility for the ML models induced from the anonymized data.

Place, publisher, year, edition, pages
SciTePress, 2021
Series
International Joint Conference on e-Business and Telecommunications - SECRYPT, ISSN 2184-7711
Keywords
Anonymization, Data Privacy, Privacy Preserving Machine Learning, Statistical Disclosure Control, Cryptography, Information analysis, Data anonymization, Data publishing, Disclosure risk, Empirical evaluations, Privacy models, Privacy preserving, Privacy-preserving data publications, Systematic evaluation, Privacy by design
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-20485 (URN); 10.5220/0010560703070320 (DOI); 000720102500025 (); 2-s2.0-85111886138 (Scopus ID); 978-989-758-524-1 (ISBN)
Conference
18th International Conference on Security and Cryptography, SECRYPT 2021, Virtual, Online, 6 July 2021 - 8 July 2021
Funder
Swedish Research Council, 2016-03346
Note

CC BY-NC-ND 4.0

Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

This work is supported by Vetenskapsrådet project: “Disclosure risk and transparency in big data privacy” (VR 2016-03346, 2017-2020)

DRIAT

Available from: 2021-08-19 Created: 2021-08-19 Last updated: 2022-01-26 Bibliographically approved
Torra, V., Taha, M. & Navarro-Arribas, G. (2021). The space of models in machine learning: using Markov chains to model transitions. Progress in Artificial Intelligence, 10(3), 321-332
2021 (English). In: Progress in Artificial Intelligence, ISSN 2192-6352, Vol. 10, no 3, p. 321-332. Article in journal (Refereed). Published
Abstract [en]

Machine and statistical learning is about constructing models from data. Data is usually understood as a set of records, a database. Nevertheless, databases are not static but change over time. We can understand this as follows: there is a space of possible databases, and a database moves through this space during its lifetime. Therefore, we may consider transitions between databases, and the database space. NoSQL databases also fit with this representation. In addition, when we learn models from databases, we can also consider the space of models. Naturally, there are relationships between the space of data and the space of models. Any transition in the space of data may correspond to a transition in the space of models. We argue that a better understanding of the space of data and the space of models, as well as the relationships between these two spaces, is basic for machine and statistical learning. The relationship between these two spaces can be exploited in several contexts, e.g., in model selection and data privacy. We consider that this relationship between spaces is also fundamental to understanding generalization and overfitting. In this paper, we develop these ideas. Then, we consider a distance on the space of models based on a distance on the space of data. More particularly, we consider distance distribution functions and probabilistic metric spaces on the space of data and the space of models. Our modelization of changes in databases is based on Markov chains and transition matrices. This modelization is used in the definition of distances. We provide examples of our definitions.

Place, publisher, year, edition, pages
Springer, 2021
Keywords
Hypothesis space, Machine and statistical learning models, Probabilistic metric spaces, Space of data, Space of models, Data privacy, Distribution functions, Machine learning, Markov chains, Constructing models, Distance distribution functions, Model Selection, Model transition, Nosql database, Statistical learning, Transition matrices, Database systems
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19666 (URN); 10.1007/s13748-021-00242-6 (DOI); 000639627000001 (); 2-s2.0-85104447939 (Scopus ID)
Funder
Swedish Research Council, 2016-03346; Knut and Alice Wallenberg Foundation
Note

CC BY 4.0

© 2021, The Author(s).

Correspondence Address: Torra, V.; School of Informatics, Sweden; email: vtorra@ieee.org

Published: 12 April 2021

Acknowledgements: This study was partially funded by Vetenskapsrådet project “Disclosure risk and transparency in big data privacy” (VR 2016-03346, 2017-2020), by the Spanish project TIN2017-87211-R, and by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

Available from: 2021-04-29 Created: 2021-04-29 Last updated: 2021-09-13 Bibliographically approved
Salas, J. & Torra, V. (2020). Differentially Private Graph Publishing and Randomized Response for Collaborative Filtering. In: Pierangela Samarati; Sabrina De Capitani di Vimercati; Mohammad Obaidat; Jalel Ben-Othman (Ed.), Proceedings of the 17th International Joint Conference on e-Business and Telecommunications: Volume 3: SECRYPT. Paper presented at The 17th International Conference on Security and Cryptography (SECRYPT 2020), 8-10 July 2020, online streaming, Lieusaint - Paris, France (pp. 415-422). SciTePress, 3
2020 (English). In: Proceedings of the 17th International Joint Conference on e-Business and Telecommunications: Volume 3: SECRYPT / [ed] Pierangela Samarati; Sabrina De Capitani di Vimercati; Mohammad Obaidat; Jalel Ben-Othman, SciTePress, 2020, Vol. 3, p. 415-422. Conference paper, Published paper (Refereed)
Abstract [en]

Several methods for providing edge and node-differential privacy for graphs have been devised. However, most of them publish graph statistics, not the edge-set of the randomized graph. We present a method for graph randomization that provides randomized response and allows for publishing differentially private graphs. We show that this method can be applied to sanitize data to train collaborative filtering algorithms for recommender systems. Our results afford plausible deniability to users in relation to their interests, with a controlled probability predefined by the user or the data controller. We show in an experiment with Facebook Likes data and psychodemographic profiles that the accuracy of the profiling algorithms is preserved even when they are trained with differentially private data. Finally, we define privacy metrics to compare our method for different parameters of ε with a k-anonymization method on the MovieLens dataset for movie recommendations.
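
To make the idea of randomized response on an edge set concrete, here is a minimal sketch and not the method of the paper. It assumes an undirected simple graph on nodes `0..n_nodes-1` and applies the classic (Warner-style) randomized response independently to every possible edge, keeping the true bit with probability e^ε / (1 + e^ε). The function name `randomized_response_graph` and its parameters are hypothetical.

```python
import math
import random
from itertools import combinations


def randomized_response_graph(n_nodes, edges, epsilon, rng=None):
    """Publish a randomized edge set using per-edge randomized response.

    Each potential edge {u, v} keeps its true presence/absence bit with
    probability e^eps / (1 + e^eps) and is flipped otherwise. This is the
    textbook mechanism, used here only as an illustration.
    """
    rng = rng or random.Random()
    # Algebraically equal to e^eps / (1 + e^eps), but avoids overflow for large eps.
    keep_prob = 1.0 / (1.0 + math.exp(-epsilon))
    true_edges = {frozenset(e) for e in edges}
    published = set()
    for u, v in combinations(range(n_nodes), 2):
        bit = frozenset((u, v)) in true_edges
        if rng.random() >= keep_prob:
            bit = not bit  # flip: report the opposite of the truth
        if bit:
            published.add((u, v))  # stored with u < v
    return published
```

Smaller values of `epsilon` flip more edge bits, which gives each user stronger plausible deniability about any single reported edge while adding more noise to the published graph.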

Place, publisher, year, edition, pages
SciTePress, 2020
Series
International Joint Conference on e-Business and Telecommunications - SECRYPT, ISSN 2184-7711
Keywords
Noise-graph Addition, Randomized Response, Edge Differential Privacy, Collaborative Filtering
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19525 (URN); 10.5220/0009833804150422 (DOI); 000615962200040 (); 2-s2.0-85110834027 (Scopus ID); 978-989-758-446-6 (ISBN)
Conference
The 17th International Conference on Security and Cryptography (SECRYPT 2020), 8-10 July 2020, online streaming, Lieusaint - Paris, France
Funder
Swedish Research Council, 2016-03346
Note

CC BY-NC-ND 4.0

This work was partially supported by the Swedish Research Council (Vetenskapsrådet) project DRIAT (VR 2016-03346), the Spanish Government under grants RTI2018-095094-B-C22 “CONSENT”, and the UOC postdoctoral fellowship program.

ICETE: International Conference on E-Business and Telecommunication Networks

Available from: 2021-03-05 Created: 2021-03-05 Last updated: 2021-08-10 Bibliographically approved
Torra, V., Navarro-Arribas, G. & Galván, E. (2020). Explaining Recurrent Machine Learning Models: Integral Privacy Revisited. In: Josep Domingo-Ferrer, Krishnamurty Muralidhar (Ed.), Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Tarragona, Spain, September 23–25, 2020, Proceedings. Paper presented at UNESCO Chair in Data Privacy, International Conference, PSD 2020, Tarragona, Spain, September 23–25, 2020 (pp. 62-73). Cham: Springer
2020 (English). In: Privacy in Statistical Databases: UNESCO Chair in Data Privacy, International Conference, PSD 2020, Tarragona, Spain, September 23–25, 2020, Proceedings / [ed] Josep Domingo-Ferrer, Krishnamurty Muralidhar, Cham: Springer, 2020, p. 62-73. Conference paper, Published paper (Refereed)
Abstract [en]

We have recently introduced a privacy model for statistical and machine learning models called integral privacy. A model extracted from a database or, in general, the output of a function satisfies integral privacy when the number of generators of this model is sufficiently large and diverse. In this paper we show how the maximal c-consensus meets problem can be used to study the databases that generate an integrally private solution. We also introduce a definition of integral privacy based on minimal sets in terms of this maximal c-consensus meets problem. 

Place, publisher, year, edition, pages
Cham: Springer, 2020
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 12276
Keywords
Clustering, Integral privacy, Maximal c-consensus meets, Parameter selection, Data privacy, Database systems, Machine learning models, Privacy models, Machine learning
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19186 (URN); 10.1007/978-3-030-57521-2_5 (DOI); 2-s2.0-85092091090 (Scopus ID); 978-3-030-57520-5 (ISBN); 978-3-030-57521-2 (ISBN)
Conference
UNESCO Chair in Data Privacy, International Conference, PSD 2020, Tarragona, Spain, September 23–25, 2020
Funder
Swedish Research Council, 2016-03346
Note

CC BY 4.0

Also part of the Information Systems and Applications, incl. Internet/Web, and HCI book sub series (LNISA, volume 12276)

Partial support of the project Swedish Research Council (grant number VR 2016-03346) is acknowledged.

DRIAT

Available from: 2020-10-15 Created: 2020-10-15 Last updated: 2021-08-18 Bibliographically approved
Torra, V. (2020). Fuzzy clustering-based microaggregation to achieve probabilistic k-anonymity for data with constraints. Journal of Intelligent & Fuzzy Systems, 39(5), 5999-6008
2020 (English). In: Journal of Intelligent & Fuzzy Systems, ISSN 1064-1246, E-ISSN 1875-8967, Vol. 39, no 5, p. 5999-6008. Article in journal (Refereed). Published
Abstract [en]

Microaggregation is an effective data-driven protection method that permits us to achieve a good trade-off between disclosure risk and information loss. In this work we propose a method for microaggregation based on fuzzy c-means that is appropriate when there are constraints (linear constraints) on the variables that describe the data. Our method leads to results that satisfy these constraints even when the data to be masked do not satisfy them.

Place, publisher, year, edition, pages
IOS Press, 2020
Keywords
clustering statistical disclosure control, data privacy, edit constraints, k-Anonymity, Microaggregation, Economic and social effects, Data driven, Disclosure risk, Fuzzy C mean, Information loss, Linear constraints, Protection methods, Fuzzy clustering
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19308 (URN); 10.3233/JIFS-189074 (DOI); 000595520600004 (); 2-s2.0-85096990170 (Scopus ID)
Funder
Swedish Research Council, 2016-03346
Note

CC BY-NC 4.0

Partial support of the project Swedish Research Council (Vetenskapsrådet) (grant number VR 2016-03346) is acknowledged.

DRIAT

Available from: 2020-12-10 Created: 2020-12-10 Last updated: 2021-08-18 Bibliographically approved
Senavirathne, N. & Torra, V. (2020). On the role of data anonymization in machine learning privacy. In: Guojun Wang, Ryan Ko, Md Zakirul Alam Bhuiyan, Yi Pan (Ed.), Proceedings - 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2020. Paper presented at 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2020, 29 December 2020 – 1 January 2021, Guangzhou, China (pp. 664-675). IEEE
2020 (English). In: Proceedings - 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2020 / [ed] Guojun Wang, Ryan Ko, Md Zakirul Alam Bhuiyan, Yi Pan, IEEE, 2020, p. 664-675. Conference paper, Published paper (Refereed)
Abstract [en]

Data anonymization irrecoverably transforms the raw data into a protected version by eliminating direct identifiers and removing sufficient details from indirect identifiers in order to minimize the risk of re-identification when there is a requirement for data publishing. Nevertheless, data protection laws (e.g., GDPR) do not consider anonymized data as personal data, thus allowing them to be freely used, analysed, shared and monetized without a compliance risk. Motivated by the above advantages, it is plausible that data controllers anonymize the data before releasing them for data analysis tasks such as machine learning (ML), which is applied in a wide variety of domains where personal data are used. Moreover, recent research has shown that ML models are vulnerable to privacy attacks as they retain sensitive information from the training data. Taking all of these facts into consideration, in this work we explore the interplay between data anonymization and ML with the ultimate aim of clarifying whether data anonymization is sufficient to achieve privacy for ML under different adversarial scenarios. We also discuss the challenges and opportunities of integrating these two domains. As per our findings, it is evident that in order to substantially minimize the privacy risks in ML, existing data anonymization techniques have to be applied with high privacy levels that cause a deterioration in model utility.

Place, publisher, year, edition, pages
IEEE, 2020
Series
IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), ISSN 2324-898X, E-ISSN 2324-9013
Keywords
Data anonymization, Data privacy, Privacy preserving machine learning, Deterioration, Machine learning, Data controllers, Data protection laws, Data publishing, Privacy Attacks, Re identifications, Recent researches, Sensitive informations, Privacy by design
National Category
Computer Sciences Computer Systems
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-19522 (URN); 10.1109/TrustCom50675.2020.00093 (DOI); 000671077600079 (); 2-s2.0-85101295825 (Scopus ID); 978-0-7381-4380-4 (ISBN); 978-0-7381-4381-1 (ISBN)
Conference
2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2020, 29 December 2020 – 1 January 2021, Guangzhou, China
Funder
Swedish Research Council, 2016-03346
Note

© 2020 IEEE.

This work is supported by Vetenskapsrådet project: “Disclosure risk and transparency in big data privacy” (VR 2016-03346, 2017-2020).

DRIAT

Available from: 2021-03-04 Created: 2021-03-04 Last updated: 2021-08-20 Bibliographically approved
Salas, J., Megías, D., Torra, V., Toger, M., Dahne, J. & Sainudiin, R. (2020). Swapping trajectories with a sufficient sanitizer. Pattern Recognition Letters, 131, 474-480
2020 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 131, p. 474-480. Article in journal (Refereed). Published
Abstract [en]

Real-time mobility data is useful for several applications such as planning transport in metropolitan areas or localizing services in towns. However, if such data is collected without any privacy protection, it may reveal sensitive locations and pose safety risks to the individual associated with it. Thus, mobility data must be anonymized, preferably at the time of collection. In this paper, we consider the SwapMob algorithm, which mitigates privacy risks by swapping partial trajectories. We formalize the concept of sufficient sanitizer and show that the SwapMob algorithm is a sufficient sanitizer for various statistical decision problems. That is, it preserves the aggregate information of the spatial database in the form of sufficient statistics and also provides privacy to the individuals. This may be used for personalized assistants taking advantage of users’ locations, so they can ensure user privacy while providing accurate responses to the user requirements. We measure the privacy provided by SwapMob as the Adversary Information Gain, which measures the capability of an adversary to leverage their knowledge of exact data points to infer a larger segment of the sanitized trajectory. We test the utility of the data obtained after applying SwapMob sanitization in terms of Origin-Destination matrices, a fundamental tool in transportation modelling.

Place, publisher, year, edition, pages
Elsevier, 2020
Keywords
Intelligent transportation systems, Origin-Destination matrices, Privacy preserving mobility data mining, Real-time mobility data anonymization, Sufficient sanitizer, Trajectory anonymization, Data mining, Intelligent systems, Knowledge management, Matrix algebra, Real time systems, Trajectories, Anonymization, Mobility datum, Origin destination matrices, Data privacy
National Category
Computer Sciences Transport Systems and Logistics
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-18264 (URN); 10.1016/j.patrec.2020.02.011 (DOI); 000521971700064 (); 2-s2.0-85079419408 (Scopus ID)
Funder
Swedish Research Council, 2016-03346
Note

CC BY 4.0

This work is partly funded by the Spanish Government through grants RTI2018-095094-B-C22 “CONSENT” and TIN2014-57364-C2-2-R “SMARTGLACIS”, Swedish VR (project VR 2016-03346). Raaz Sainudiin was partly funded by Combient Competence Centre for Data Engineering Sciences at Uppsala University and the Research Center for Cyber Security at Tel Aviv University established by the State of Israel, the Prime Minister’s Office and Tel-Aviv University. Julián Salas acknowledges the support of a UOC postdoctoral fellowship.

Available from: 2020-02-28 Created: 2020-02-28 Last updated: 2021-06-15 Bibliographically approved
Torra, V. & Salas, J. (2019). Graph Perturbation as Noise Graph Addition: A New Perspective for Graph Anonymization. In: Cristina Pérez-Solà; Guillermo Navarro-Arribas; Alex Biryukov; Joaquin Garcia-Alfaro (Ed.), Data Privacy Management, Cryptocurrencies and Blockchain Technology: ESORICS 2019 International Workshops, DPM 2019 and CBT 2019, Luxembourg, September 26–27, 2019, Proceedings. Paper presented at ESORICS 2019 International Workshops, DPM 2019 and CBT 2019, Luxembourg, September 26–27, 2019 (pp. 121-137). Cham: Springer, 11737
2019 (English). In: Data Privacy Management, Cryptocurrencies and Blockchain Technology: ESORICS 2019 International Workshops, DPM 2019 and CBT 2019, Luxembourg, September 26–27, 2019, Proceedings / [ed] Cristina Pérez-Solà; Guillermo Navarro-Arribas; Alex Biryukov; Joaquin Garcia-Alfaro, Cham: Springer, 2019, Vol. 11737, p. 121-137. Conference paper, Published paper (Refereed)
Abstract [en]

Different types of data privacy techniques have been applied to graphs and social networks. They have been used under different assumptions on intruders’ knowledge, i.e., different assumptions on what can lead to disclosure. The analysis of different methods is also led by how data protection techniques influence the analysis of the data, i.e., information loss or data utility. One of the techniques proposed for graphs is graph perturbation. Several algorithms have been proposed for this purpose. They proceed by adding or removing edges, although some also consider adding and removing nodes. In this paper we propose the study of these graph perturbation techniques from a different perspective. Following the model of standard database perturbation as noise addition, we propose to study graph perturbation as noise graph addition. We think that changing the perspective of graph sanitization in this direction will make it possible to study the properties of perturbed graphs in a more systematic way.

Place, publisher, year, edition, pages
Cham: Springer, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11737
Keywords
Data privacy, Edge removal, Graphs, Noise addition, Social networks, Blockchain, Computer privacy, Electronic money, Perturbation techniques, Social networking (online), Anonymization, Data protection techniques, Data utilities, Information loss, Sanitization
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-18009 (URN); 10.1007/978-3-030-31500-9_8 (DOI); 000558296200008 (); 2-s2.0-85075616311 (Scopus ID); 978-3-030-31499-6 (ISBN); 978-3-030-31500-9 (ISBN)
Conference
ESORICS 2019 International Workshops, DPM 2019 and CBT 2019, Luxembourg, September 26–27, 2019
Funder
Swedish Research Council, 2016-03346
Note

CC BY 4.0

Also part of the Security and Cryptology book sub series (LNSC, volume 11737)

This work was partially supported by the Swedish Research Council (Vetenskapsrådet) project DRIAT (VR 2016-03346), the Spanish Government under grants RTI2018-095094-B-C22 “CONSENT” and TIN2014-57364-C2-2-R “SMARTGLACIS”, and the UOC postdoctoral fellowship program.

Available from: 2019-12-12 Created: 2019-12-12 Last updated: 2021-08-18 Bibliographically approved
Torra, V. & Senavirathne, N. (2019). Maximal c consensus meets. Information Fusion, 51, 58-66
2019 (English). In: Information Fusion, ISSN 1566-2535, E-ISSN 1872-6305, Vol. 51, p. 58-66. Article in journal (Refereed). Published
Abstract [en]

Given a set S of subsets of a reference set X, we define the problem of finding c subsets of X that maximize the size of the intersection among the included subsets. Maximizing the size of the intersection means that they are subsets of the sets in S and they are as large as possible. We can understand the result of this problem as c consensus sets of S, or c consensus representatives of S. From the perspective of lattice theory, each representative will be a meet of some sets in S. In this paper we formally define this problem and present heuristic algorithms to solve it. We also discuss the relationship with other established problems in the literature.

Place, publisher, year, edition, pages
NETHERLANDS: Elsevier, 2019
Keywords
clustering, consensus clustering, heuristic algorithms, Maximal c consensus meets, Cluster analysis, Clustering algorithms, Lattice theory, Set theory, Reference set
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
urn:nbn:se:his:diva-16463 (URN); 10.1016/j.inffus.2018.09.011 (DOI); 000469155600006 (); 2-s2.0-85056612105 (Scopus ID)
Funder
Swedish Research Council, 2016–03346
Note

Partially supported by Vetenskapsrådet project: “Disclosure risk and transparency in big data privacy” (VR 2016–03346).

DRIAT

Available from: 2019-01-30 Created: 2019-01-30 Last updated: 2021-08-18 Bibliographically approved
Principal Investigator
Torra, Vicenc
Coordinating organisation
University of Skövde
Funder
Period
2017-01-01 - 2020-12-31
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
DiVA, id: project:2266
Project, id: 2016-03346_VR