Privacy-preserving machine learning on Apache Spark

doi:10.1109/ACCESS.2023.3332222

Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/90761

Title:	Privacy-preserving machine learning on Apache Spark
Author(s):	Brito, Cláudia Vanessa Martins; Ferreira, Pedro G.; Portela, Bernardo L.; Oliveira, Rui Carlos Mendes de; Paulo, Joao T.
Keywords:	Apache Spark; distributed systems; Intel SGX; machine learning; privacy-preserving; trusted execution environments
Date:	2023
Publisher:	IEEE
Journal:	IEEE Access
Citation:	Brito, C. V., Ferreira, P. G., Portela, B. L., Oliveira, R. C., & Paulo, J. T. (2023). Privacy-Preserving Machine Learning on Apache Spark. IEEE Access. Institute of Electrical and Electronics Engineers (IEEE). http://doi.org/10.1109/access.2023.3332222
Abstract:	The adoption of third-party machine learning (ML) cloud services is highly dependent on the security guarantees and the performance penalty they incur on workloads for model training and inference. This paper explores security/performance trade-offs for the distributed Apache Spark framework and its ML library. Concretely, we build upon a key insight: in specific deployment settings, one can reveal carefully chosen non-sensitive operations (e.g. statistical calculations). This allows us to considerably improve the performance of privacy-preserving solutions without exposing the protocol to pervasive ML attacks. In more detail, we propose Soteria, a system for distributed privacy-preserving ML that leverages Trusted Execution Environments (e.g. Intel SGX) to run computations over sensitive information in isolated containers (enclaves). Unlike previous work, where all ML-related computation is performed at trusted enclaves, we introduce a hybrid scheme, combining computation done inside and outside these enclaves. The experimental evaluation validates that our approach reduces the runtime of ML algorithms by up to 41% when compared to previous related work. Our protocol is accompanied by a security proof and a discussion regarding resilience against a wide spectrum of ML attacks.
Type:	Article
URI:	https://hdl.handle.net/1822/90761
DOI:	10.1109/ACCESS.2023.3332222
Publisher version:	https://ieeexplore.ieee.org/document/10314994
Peer reviewed:	yes
Access:	Open access
Appears in collections:	HASLab - Articles in international journals