Comparison of different deployment approaches of FPGA-based hardware accelerator for 3D object detection models

doi:10.1007/978-3-031-16474-3_24

Please use this identifier to cite or link to this record: https://hdl.handle.net/1822/86377

Title:	Comparison of different deployment approaches of FPGA-based hardware accelerator for 3D object detection models
Author(s):	Pereira, Pedro; Linhares Silva, António; Machado, Rui; Silva, João; Durães, Dalila; Machado, José Manuel; Novais, Paulo; Monteiro, João L.; Melo-Pinto, Pedro; Fernandes, Duarte Manuel Azevedo
Keywords:	Hardware accelerator; Light detection and ranging (LiDAR); Object detection
Date:	2022
Publisher:	Springer, Cham
Journal:	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Citation:	Pereira, P. et al. (2022). Comparison of Different Deployment Approaches of FPGA-Based Hardware Accelerator for 3D Object Detection Models. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_24
Abstract:	GPU servers have been responsible for the recent improvements in the accuracy and inference speed of object detection models targeted at autonomous driving. However, their power consumption and physical size make their integration in autonomous vehicles impractical. Hybrid FPGA-CPU boards have emerged as an alternative to server GPUs in the role of edge devices in autonomous vehicles. Despite their energy efficiency, such devices do not offer the same computational power as GPU servers and have fewer resources available. This paper investigates how to deploy deep learning models tailored to object detection in point clouds on edge devices for onboard real-time inference. Different approaches, requiring different levels of expertise in logic programming applied to FPGAs, are explored, resulting in three main solutions: the use of software tools for model adaptation and compilation for a proprietary hardware IP; the design and implementation of a hardware IP optimized for computing traditional convolution operations; and the design and implementation of a hardware IP optimized for sparse convolution operations. The performance of these solutions is compared with computer performance on the KITTI dataset. All the solutions resort to parallelism, quantization and optimized memory access control to reduce the usage of FPGA logic resources and improve processing time without significantly sacrificing accuracy. The solutions proved to be effective for real-time inference in power-limited and space-constrained settings.
Type:	Conference paper
URI:	https://hdl.handle.net/1822/86377
ISBN:	978-3-031-16473-6
e-ISBN:	978-3-031-16474-3
DOI:	10.1007/978-3-031-16474-3_24
ISSN:	0302-9743
Publisher version:	https://link.springer.com/chapter/10.1007/978-3-031-16474-3_24
Peer reviewed:	yes
Access:	Restricted access (UMinho)
Appears in collections:	CAlg - Artigos em livros de atas/Papers in proceedings

Files in this record:

File	Description	Size	Format
paper_101.pdf (restricted access)		442.68 kB	Adobe PDF
