t-Closeness through Microaggregation: Strict Privacy with Enhanced Utility Preservation
Microaggregation is a disclosure limitation technique that protects the privacy of data subjects in microdata releases. It has been used as an alternative to generalization and suppression for generating k-anonymous data sets, in which the identity of each subject is hidden within a group of k subjects. Unlike generalization, microaggregation perturbs the data, and this additional masking freedom can improve data utility in several ways, such as increasing data granularity, reducing the impact of outliers, and avoiding the discretization of numerical data. k-Anonymity, however, does not protect against attribute disclosure, which occurs when the variability of the confidential values within a group of k subjects is too small. To address this issue, several refinements of k-anonymity have been proposed, among which t-closeness stands out as providing one of the strictest privacy guarantees. Existing algorithms to generate t-close data sets are based on generalization and suppression.
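To make the two notions concrete, the following is a minimal illustrative sketch, not the algorithm proposed in this work. It assumes a single numerical quasi-identifier, a simple fixed-size microaggregation heuristic (sort, cut into groups of at least k, replace each value by the group mean), and the ordered-distance form of the Earth Mover's Distance as the measure of closeness; the abstract itself does not fix a distance. The function names `microaggregate`, `ordered_emd`, and `max_distance`, as well as the toy records, are hypothetical.

```python
from collections import Counter


def microaggregate(records, k):
    """Sort (quasi_identifier, confidential) pairs by quasi-identifier, cut them
    into groups of at least k, and replace each quasi-identifier by its group mean."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    starts = list(range(0, n - n % k, k))
    groups = []
    for i, start in enumerate(starts):
        # The last group absorbs the remainder, so every group has >= k records.
        end = starts[i + 1] if i + 1 < len(starts) else n
        chunk = ordered[start:end]
        centroid = sum(q for q, _ in chunk) / len(chunk)
        groups.append([(centroid, c) for _, c in chunk])
    return groups


def ordered_emd(group_values, all_values):
    """Ordered-distance EMD between the distribution of the confidential
    attribute in one group and in the whole data set (assumed distance)."""
    domain = sorted(set(all_values))
    if len(domain) < 2:
        return 0.0
    g, a = Counter(group_values), Counter(all_values)
    cumulative, total = 0.0, 0.0
    for v in domain:
        # Running difference between the two cumulative distributions.
        cumulative += g[v] / len(group_values) - a[v] / len(all_values)
        total += abs(cumulative)
    return total / (len(domain) - 1)


def max_distance(groups, records):
    """Smallest t for which the grouped release would be t-close."""
    all_values = [c for _, c in records]
    return max(ordered_emd([c for _, c in grp], all_values) for grp in groups)


# Toy data: (age, salary in thousands); age is the quasi-identifier.
records = [(24, 30), (27, 50), (30, 30), (40, 50), (42, 70), (47, 70)]
groups = microaggregate(records, k=3)
t_achieved = max_distance(groups, records)
print(groups)
print(t_achieved)
```

In this toy run the release is 3-anonymous with respect to age, yet the largest group-to-global distance is about 0.33, so it would only satisfy t-closeness for t of at least that value. This illustrates why grouping on quasi-identifiers alone does not control attribute disclosure.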