In the era of big data, the secure sharing of sensitive information across domains such as healthcare, finance, and social networks has become increasingly vital. Traditional data anonymization techniques often struggle to balance the competing demands of preserving privacy and maintaining data utility, particularly in complex and dynamic data-sharing environments. This paper presents a novel hybrid approach to data anonymization that integrates differential privacy with adaptive anonymization algorithms, designed to strengthen privacy protection while retaining the analytical value of the data. The proposed methodology tailors anonymization strategies to the specific context of data sharing, addressing the limitations of existing techniques. Extensive experiments on diverse datasets, including healthcare and financial data, demonstrate the superior performance of this approach in reducing re-identification risk while maintaining high data utility. The findings suggest that these advances in anonymization provide a robust solution for secure, privacy-preserving data sharing amid the growing volume and sensitivity of data in modern digital ecosystems. The paper concludes with a discussion of the broader implications for cybersecurity and directions for future research on privacy-preserving technologies.
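To make the differential-privacy component of the hybrid approach concrete, the sketch below shows the standard Laplace mechanism applied to a counting query. This is an illustrative sketch only, not the paper's actual algorithm: the function names (`laplace_noise`, `dp_count`) and the choice of a count query are assumptions made for exposition. A counting query has global sensitivity 1 (adding or removing one record changes the count by at most 1), so Laplace noise with scale 1/ε suffices for ε-differential privacy.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Draw one sample from a zero-mean Laplace(scale) distribution
    via inverse-transform sampling on a uniform draw."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    """Answer a counting query with epsilon-differential privacy.

    The query's global sensitivity is 1, so noise drawn from
    Laplace(1/epsilon) masks any single individual's presence.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical usage: a privacy-preserving count over sensitive ages.
ages = [34, 29, 61, 47, 52, 38]
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0)
```

Smaller ε means larger noise and stronger privacy; the adaptive part of the proposed methodology would, in effect, select ε and the anonymization strategy per sharing context rather than fixing them globally.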