Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/58793
Type: Conference paper
Title: Reconstructing Data Perturbed by Random Projections When the Mixing Matrix Is Known
Author: Sang, Y.
Shen, H.
Tian, H.
Citation: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009: Proceedings, Part II / W. Buntine, M. Grobelnik, D. Mladenic, J. Shawe-Taylor (eds.): pp.334-349
Publisher: Springer
Publisher Place: Germany
Issue Date: 2009
Series/Report no.: Lecture Notes in Artificial Intelligence; 5782
ISBN: 3642041736
9783642041730
ISSN: 0302-9743
1611-3349
Conference Name: ECML/PKDD (2009 : Bled, Slovenia)
Editor: Buntine, W.
Grobelnik, M.
Mladenic, D.
Shawe-Taylor, J.
Statement of Responsibility: Yingpeng Sang, Hong Shen and Hui Tian
Abstract: Random Projection (RP) has drawn great interest in privacy-preserving data mining research due to its high efficiency and security. It was proposed in [27], where the original data set, composed of m attributes, is multiplied by a mixing matrix of dimensions k×m (m > k) that is random and orthogonal on expectation, and the k series of perturbed data are then released for mining purposes. To our knowledge, little work has been done from the attacker's point of view, i.e. reconstructing the original data to obtain sensitive information, given the data perturbed by RP and some a priori knowledge, e.g. the mixing matrix and the means and variances of the original data. When the attributes of the original data are mutually independent and sparse, the reconstruction can be treated as a problem of Underdetermined Independent Component Analysis (UICA), but UICA has permutation and scaling ambiguities. In this paper we propose a reconstruction framework based on UICA, together with some techniques to reduce these ambiguities. Cases in which the attributes of the original data are correlated and not sparse are also common in data mining. For the typical case of a Multivariate Gaussian Distribution, we propose a reconstruction method based on Maximum A Posteriori (MAP) estimation. Our experiments show that our reconstructions achieve high recovery rates and outperform reconstructions based on Principal Component Analysis (PCA). © 2009 Springer Berlin Heidelberg.
Keywords: Privacy-preserving Data Mining
Data Perturbation
Data Reconstruction
Underdetermined Independent Component Analysis
Maximum A Posteriori
Principal Component Analysis
Rights: © Springer-Verlag Berlin Heidelberg 2009
DOI: 10.1007/978-3-642-04174-7_22
Grant ID: http://purl.org/au-research/grants/arc/DP0985063
Published version: http://dx.doi.org/10.1007/978-3-642-04174-7_22
Appears in Collections: Aurora harvest
Computer Science publications
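
Illustrative note: the abstract describes perturbing m-attribute data with a known k×m mixing matrix and reconstructing Gaussian-distributed data by MAP estimation. The sketch below shows the standard closed-form Gaussian MAP estimate for that setting. It is not taken from the paper and may differ from the authors' method; the function names (perturb, map_reconstruct), the synthetic data, and the choice of i.i.d. Gaussian matrix entries with variance 1/k are all assumptions made for illustration.

```python
import numpy as np


def perturb(X, R):
    """Release Y = R X. X is m x n (one column per record), R is k x m."""
    return R @ X


def map_reconstruct(Y, R, mu, Sigma):
    """MAP estimate of X from Y = R X under the prior x ~ N(mu, Sigma).

    With a Gaussian prior and a noiseless linear observation the posterior
    is Gaussian, so its mode equals its mean:
        x_hat = mu + Sigma R^T (R Sigma R^T)^{-1} (y - R mu)
    """
    mu = mu.reshape(-1, 1)
    S_Rt = Sigma @ R.T            # m x k
    G = R @ S_Rt                  # k x k, covariance of the released data
    residual = Y - R @ mu         # k x n
    return mu + S_Rt @ np.linalg.solve(G, residual)


# Synthetic example (all values are assumptions, not data from the paper).
rng = np.random.default_rng(0)
m, k, n = 6, 3, 2000

A = rng.normal(size=(m, m))
Sigma = A @ A.T + 0.1 * np.eye(m)                    # correlated attributes
mu = rng.normal(size=m)
X = rng.multivariate_normal(mu, Sigma, size=n).T     # m x n original data

# Entries i.i.d. N(0, 1/k), so E[R^T R] = I (orthogonal on expectation).
R = rng.normal(scale=1.0 / np.sqrt(k), size=(k, m))

Y = perturb(X, R)
X_hat = map_reconstruct(Y, R, mu, Sigma)

# Compare against the trivial attacker who guesses the mean for every record.
mse_map = np.mean((X - X_hat) ** 2)
mse_mean = np.mean((X - mu.reshape(-1, 1)) ** 2)
print("MSE of MAP estimate:", mse_map)
print("MSE of mean-only guess:", mse_mean)
```

The estimate always reproduces the released data exactly (R @ X_hat equals Y up to rounding), so it is consistent with everything the attacker observes; how close it comes to X depends on how strongly the attributes are correlated.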

Files in This Item:
There are no files associated with this item.

