Integral Privacy Compliant Statistics Computation
2019 (English). In: Data Privacy Management, Cryptocurrencies and Blockchain Technology: ESORICS 2019 International Workshops, DPM 2019 and CBT 2019, Luxembourg, September 26–27, 2019, Proceedings / [ed] Cristina Pérez-Solà, Guillermo Navarro-Arribas, Alex Biryukov, Joaquin Garcia-Alfaro, Cham: Springer, 2019, Vol. 11737, p. 22-38. Conference paper, Published paper (Refereed)
Abstract [en]
Data analysis is expected to provide accurate descriptions of the data. However, this is in opposition to privacy requirements when working with sensitive data. In this case, there is a need to ensure that releasing the data analysis results does not disclose sensitive information. Therefore, privacy-preserving data analysis has become significant. Enforcing strict privacy guarantees (e.g., differential privacy) can significantly distort data or the results of the data analysis, thus limiting their analytical utility. In an attempt to address this issue, in this paper we discuss how “integral privacy”, a re-sampling based privacy model, can be used to compute descriptive statistics of a given dataset with high utility. In integral privacy, privacy is achieved through the notion of stability, which leads to the release of the data analysis result that is least susceptible to changes in the input dataset. Here, stability is explained by the relative frequency of different generators (re-samples of data) that lead to the same data analysis results. In this work, we compare the results of integrally private statistics with respect to different theoretical data distributions and real-world data with differing parameters. Moreover, the results are compared with statistics obtained through differential privacy. Finally, through empirical analysis, it is shown that the integral privacy based approach has high utility and robustness compared to differential privacy. Due to the computational complexity of the method, we propose that integral privacy is more suitable for small datasets, where differential privacy performs poorly. However, adopting an efficient re-sampling mechanism can further improve the computational efficiency of integral privacy. © 2019, The Author(s).
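Illustrative sketch (not part of the published abstract): the stability idea described above, where the released result is the one produced by the largest share of re-samples, might look roughly as follows for a mean. The function name `integrally_private_mean`, the sub-sampling fraction, the rounding used to group equal results, and the number of re-samples are all assumptions made for illustration; the paper's actual procedure and parameters may differ.

```python
import random
from collections import Counter
from statistics import mean


def integrally_private_mean(data, n_resamples=2000, sample_frac=0.8,
                            decimals=1, seed=0):
    """Return (value, relative_frequency) for the most recurrent rounded mean.

    Hypothetical helper: each random subset plays the role of a "generator",
    and results are grouped by rounding so that recurrence can be counted.
    """
    rng = random.Random(seed)
    k = max(1, int(len(data) * sample_frac))  # size of each re-sample
    counts = Counter()
    for _ in range(n_resamples):
        subset = rng.sample(data, k)          # re-sample without replacement
        counts[round(mean(subset), decimals)] += 1
    value, freq = counts.most_common(1)[0]    # most stable (most frequent) result
    return value, freq / n_resamples


# Usage on a small synthetic dataset (values are made up for illustration).
toy_rng = random.Random(42)
toy_data = [toy_rng.gauss(50.0, 5.0) for _ in range(30)]
released, stability = integrally_private_mean(toy_data)
print(f"released mean: {released}, relative frequency: {stability:.3f}")
```

The relative frequency returned alongside the value indicates how many re-samples agree on it; a coarser rounding raises that frequency at the cost of precision, which is one way to picture the utility trade-off the abstract refers to.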
Place, publisher, year, edition, pages
Cham: Springer, 2019. Vol. 11737, p. 22-38
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11737
Keywords [en]
Descriptive statistics, Privacy-preserving statistics, Privacy-preserving data analysis, Blockchain, Computational efficiency, Computer privacy, Electronic money, Information analysis, Sampling, Statistics, Data distribution, Differential privacies, Empirical analysis, Privacy preserving, Privacy requirements, Relative frequencies, Sensitive informations, Data privacy
National Category
Computer Sciences
Research subject
Skövde Artificial Intelligence Lab (SAIL)
Identifiers
URN: urn:nbn:se:his:diva-18008 ; DOI: 10.1007/978-3-030-31500-9_2 ; ISI: 000558296200002 ; Scopus ID: 2-s2.0-85075604651 ; ISBN: 978-3-030-31499-6 (print) ; ISBN: 978-3-030-31500-9 (electronic) ; OAI: oai:DiVA.org:his-18008 ; DiVA, id: diva2:1377799
Conference
ESORICS 2019 International Workshops, DPM 2019 and CBT 2019, Luxembourg, September 26–27, 2019
2019-12-12 ; 2019-12-12 ; 2020-09-17 ; Bibliographically approved