Use this identifier to cite or link to this record:
https://hdl.handle.net/1822/86377
Title: | Comparison of different deployment approaches of FPGA-based hardware accelerator for 3D object detection models |
Author(s): | Pereira, Pedro; Linhares Silva, António; Machado, Rui; Silva, João; Durães, Dalila; Machado, José Manuel; Novais, Paulo; Monteiro, João L.; Melo-Pinto, Pedro; Fernandes, Duarte Manuel Azevedo |
Keywords: | Hardware accelerator; Light detection and ranging (LiDAR); Object detection |
Date: | 2022 |
Publisher: | Springer, Cham |
Journal: | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Citation: | Pereira, P. et al. (2022). Comparison of Different Deployment Approaches of FPGA-Based Hardware Accelerator for 3D Object Detection Models. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_24 |
Abstract: | GPU servers have been responsible for the recent improvements in the accuracy and inference speed of object detection models targeted at autonomous driving. However, their characteristics, namely power consumption and physical size, make their integration in autonomous vehicles impractical. Hybrid FPGA-CPU boards have emerged as an alternative to server GPUs in the role of edge devices in autonomous vehicles. Despite their energy efficiency, such devices do not offer the same computational power as GPU servers and have fewer resources available. This paper investigates how to deploy deep learning models tailored to object detection in point clouds on edge devices for onboard real-time inference. Different approaches, requiring different levels of expertise in logic programming applied to FPGAs, are explored, resulting in three main solutions: use of software tools for model adaptation and compilation for a proprietary hardware IP; design and implementation of a hardware IP optimized for computing traditional convolution operations; and design and implementation of a hardware IP optimized for sparse convolution operations. The performance of these solutions is compared on the KITTI dataset with the performance obtained on a computer. All the solutions resort to parallelism, quantization and optimized memory access control to reduce the usage of FPGA logic resources and improve processing time without significantly sacrificing accuracy. The solutions proved to be effective for real-time inference in power-limited and space-constrained applications. |
Type: | Conference paper |
URI: | https://hdl.handle.net/1822/86377 |
ISBN: | 978-3-031-16473-6 |
e-ISBN: | 978-3-031-16474-3 |
DOI: | 10.1007/978-3-031-16474-3_24 |
ISSN: | 0302-9743 |
Publisher's version: | https://link.springer.com/chapter/10.1007/978-3-031-16474-3_24 |
Peer reviewed: | yes |
Access: | Restricted access (UMinho) |
Appears in collections: |
Files in this record:
File | Description | Size | Format | |
---|---|---|---|---|
paper_101.pdf (restricted access) | | 442.68 kB | Adobe PDF | View/Open |