Use this identifier to reference this record: https://hdl.handle.net/1822/86377

Title: Comparison of different deployment approaches of FPGA-based hardware accelerator for 3D object detection models
Author(s): Pereira, Pedro
Linhares Silva, António
Machado, Rui
Silva, João
Durães, Dalila
Machado, José Manuel
Novais, Paulo
Monteiro, João L.
Melo-Pinto, Pedro
Fernandes, Duarte Manuel Azevedo
Keywords: Hardware accelerator
Light detection and ranging (LiDAR)
Object detection
Date: 2022
Publisher: Springer, Cham
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Citation: Pereira, P. et al. (2022). Comparison of Different Deployment Approaches of FPGA-Based Hardware Accelerator for 3D Object Detection Models. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_24
Abstract: GPU servers have been responsible for the recent improvements in the accuracy and inference speed of object detection models targeted at autonomous driving. However, their characteristics, namely power consumption and size, make their integration in autonomous vehicles impractical. Hybrid FPGA-CPU boards emerged as an alternative to server GPUs in the role of edge devices in autonomous vehicles. Despite their energy efficiency, such devices do not offer the same computational power as GPU servers and have fewer resources available. This paper investigates how to deploy deep learning models tailored to object detection in point clouds on edge devices for onboard real-time inference. Different approaches, requiring different levels of expertise in logic programming applied to FPGAs, are explored, resulting in three main solutions: use of software tools for model adaptation and compilation for a proprietary hardware IP; design and implementation of a hardware IP optimized for computing traditional convolution operations; and design and implementation of a hardware IP optimized for sparse convolution operations. The performance of these solutions is compared on the KITTI dataset against computer-based performance. All solutions resort to parallelism, quantization, and optimized memory access control to reduce the usage of FPGA logic resources and improve processing time without significantly sacrificing accuracy. The solutions proved to be effective for real-time inference in power-limited and space-constrained applications.
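The abstract names quantization (alongside parallelism and optimized memory access control) as a technique shared by all three deployment solutions. The sketch below is a minimal, illustrative example of symmetric per-tensor int8 post-training quantization of convolution weights, written in plain NumPy; it is not the paper's actual toolflow, and the function names and kernel shape are hypothetical.

```python
import numpy as np

def quantize_int8_symmetric(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: w_q = round(w / scale).

    Hypothetical helper, not from the paper; shown only to illustrate
    the kind of weight quantization FPGA accelerators commonly rely on.
    """
    scale = float(np.max(np.abs(weights))) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for an all-zero tensor
    w_q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return w_q, scale

def dequantize(w_q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 weights back to float32 for error inspection."""
    return w_q.astype(np.float32) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 3D convolution kernel: 64 filters of shape 3x3x3
    w = rng.normal(0.0, 0.05, size=(64, 3, 3, 3)).astype(np.float32)
    w_q, s = quantize_int8_symmetric(w)
    err = np.abs(w - dequantize(w_q, s)).max()
    print(f"scale={s:.6f}, max abs quantization error={err:.6f}")
```

In a real FPGA flow the quantized weights and the scale would feed the accelerator's fixed-point datapath; this sketch only demonstrates the arithmetic of the mapping and the resulting approximation error.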
Type: Paper in conference proceedings
URI: https://hdl.handle.net/1822/86377
ISBN: 978-3-031-16473-6
e-ISBN: 978-3-031-16474-3
DOI: 10.1007/978-3-031-16474-3_24
ISSN: 0302-9743
Publisher's version: https://link.springer.com/chapter/10.1007/978-3-031-16474-3_24
Peer reviewed: yes
Access: Restricted access (UMinho)
Appears in collections: CAlg - Artigos em livros de atas/Papers in proceedings

Files in this record:
File | Description | Size | Format
paper_101.pdf | Restricted access | 442.68 kB | Adobe PDF
