Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/74150

Title: A scalable and automated machine learning framework to support risk management
Author(s): Ferreira, Luís; Pilastri, André Luiz; Martins, Carlos; Santos, Pedro; Cortez, Paulo
Keywords: Automated machine learning; Distributed machine learning; Supervised Learning; Risk Management
Date: 2021
Publisher: Springer
Journal: Lecture Notes in Computer Science
Citation: In A. Rocha, L. Steels and J. van den Herik (Eds.), Agents and Artificial Intelligence, 12th International Conference, ICAART 2020, Valletta, Malta, Revised Selected Papers, Lecture Notes in Artificial Intelligence 12613, chapter 14, pp. 291-307, 2021, ISBN 978-3-030-71157-3
Abstract: Due to the growth of data and the widespread use of Machine Learning (ML) by non-experts, automation and scalability are becoming key issues for ML. This paper presents an automated and scalable framework for ML that requires minimum human input. We designed the framework for the domain of telecommunications risk management. This domain often requires non-ML-experts to continuously update supervised learning models that are trained on huge amounts of data. Thus, the framework uses Automated Machine Learning (AutoML) to select and tune the ML models, and distributed ML to deal with Big Data. The modules included in the framework are task detection (to detect classification or regression), data preprocessing, feature selection, model training, and deployment. In this paper, we focus the experiments on the model training module. We first analyze the capabilities of eight AutoML tools: Auto-Gluon, Auto-Keras, Auto-Sklearn, Auto-Weka, H2O AutoML, Rminer, TPOT, and TransmogrifAI. Then, to select the tool for model training, we performed a benchmark with the only two tools that support distributed ML (H2O AutoML and TransmogrifAI). The experiments used three real-world datasets from the telecommunications domain (churn, event forecasting, and fraud detection), as provided by an analytics company. The experiments allowed us to measure the computational effort and predictive capability of the AutoML tools. Both tools obtained high-quality results and did not present substantial predictive differences. Nevertheless, H2O AutoML was selected by the analytics company for the model training module, since it was considered a more mature technology that presented a more interesting set of features (e.g., integration with more platforms). After choosing H2O AutoML for the ML training, we selected the technologies for the remaining components of the architecture (e.g., data preprocessing and web interface).
Type: Conference paper
URI: https://hdl.handle.net/1822/74150
ISBN: 978-3-030-71157-3
e-ISBN: 978-3-030-71158-0
DOI: 10.1007/978-3-030-71158-0_14
ISSN: 0302-9743
Publisher version: https://doi.org/10.1007/978-3-030-71158-0_14
Peer reviewed: yes
Access: Open access
Appears in collections: CAlg - Livros e capítulos de livros/Books and book chapters
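
The abstract lists task detection (deciding between classification and regression) as the first module of the framework but does not state the rule used. As an illustration only, the sketch below shows one common heuristic: non-numeric targets, or integer-valued targets with few distinct levels, are treated as class labels, and everything else as regression. The function name `detect_task` and the `max_classes` threshold are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def detect_task(target: Sequence[object], max_classes: int = 10) -> str:
    """Guess the supervised task from a target column.

    Hypothetical heuristic for illustration; not the rule used in the paper.
    """
    # Ignore missing values before inspecting the column.
    values = [v for v in target if v is not None]
    if not values:
        raise ValueError("target column has no usable values")

    # Strings, booleans and other non-numeric values can only be class labels.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        return "classification"

    # Integer-valued targets with few distinct levels are treated as labels.
    distinct = set(values)
    if len(distinct) <= max_classes and all(float(v).is_integer() for v in distinct):
        return "classification"

    # Everything else (continuous or high-cardinality numeric) is regression.
    return "regression"
```

Under these assumptions, a churn column such as `["churn", "stay"]` or `[0, 1, 1, 0]` would be routed to classification, while a continuous target would be routed to regression. A production system would likely need a more careful rule, since the threshold is arbitrary.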

Files in this record:
File: artigo_LNAI.pdf
Size: 398.39 kB
Format: Adobe PDF

This work is licensed under a Creative Commons License.
