This paper introduces CORVET, a CORDIC-based vector processing engine designed for resource-constrained edge AI applications, enabling dynamic switching between approximate and accurate computation modes. The design utilizes time-multiplexed execution and flexible precision scaling (4/8/16-bit) to achieve up to 4x throughput improvement within the same hardware resources. ASIC implementation demonstrates a compute density of 4.83 TOPS/mm² and energy efficiency of 11.67 TOPS/W, outperforming prior art, and is validated through object detection and classification tasks on a Pynq-Z2 platform.
Forget bulky GPUs at the edge: CORVET achieves 4.83 TOPS/mm² and 11.67 TOPS/W with a CORDIC-powered, mixed-precision vector engine.
This brief presents a runtime-adaptive, performance-enhanced vector engine featuring a low-resource, iterative CORDIC-based MAC unit for edge AI acceleration. The proposed design enables dynamic reconfiguration between approximate and accurate modes, exploiting the latency-accuracy trade-off across a wide range of workloads. Its resource-efficient approach further enables up to 4x throughput improvement within the same hardware resources by leveraging vectorised, time-multiplexed execution and flexible precision scaling. With a time-multiplexed multi-AF block and a lightweight pooling and normalisation unit, the proposed vector engine supports flexible precision (4/8/16-bit) and high MAC density. The ASIC implementation results show that each MAC stage can save up to 33% in time and 21% in power, and that a 256-PE configuration achieves higher compute density (4.83 TOPS/mm²) and energy efficiency (11.67 TOPS/W) than previous state-of-the-art work. A detailed hardware-software co-design methodology for object detection and classification tasks on Pynq-Z2 is discussed to assess the proposed architecture, demonstrating a scalable, energy-efficient solution for edge AI applications.
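The latency-accuracy trade-off of an iterative CORDIC-based MAC can be illustrated with a minimal Python sketch of textbook linear-mode CORDIC, in which each iteration adds or subtracts a shifted copy of one operand. This is not the CORVET datapath: the function names `cordic_mac` and `cordic_dot`, the use of floating point instead of fixed point, and the iteration counts 4 and 16 are all assumptions chosen for illustration. With `n` iterations and a weight `|w| <= 2`, the product error of one MAC is at most `|x| * 2**-(n - 1)`, so fewer iterations give an approximate result sooner and more iterations give a more accurate one.

```python
def cordic_mac(acc, x, w, iterations):
    """Approximate acc + x * w with linear-mode CORDIC (requires |w| <= 2).

    Each iteration uses only a shift (2**-i), an add/subtract, and a sign
    test, which is why the scheme needs no hardware multiplier.
    """
    y, z = acc, w
    for i in range(iterations):
        step = 2.0 ** -i               # a right shift by i in hardware
        d = 1.0 if z >= 0 else -1.0    # drive the residual z toward zero
        y += d * x * step              # accumulate a shifted copy of x
        z -= d * step                  # consume the matching part of w
    return y


def cordic_dot(xs, ws, iterations):
    """Dot product built from chained CORDIC MAC steps."""
    acc = 0.0
    for x, w in zip(xs, ws):
        acc = cordic_mac(acc, x, w, iterations)
    return acc


if __name__ == "__main__":
    xs = [0.5, -0.25, 0.75]
    ws = [0.5, 0.5, -1.0]
    exact = sum(x * w for x, w in zip(xs, ws))
    # Hypothetical mode settings: 4 iterations (approximate), 16 (accurate).
    for n in (4, 16):
        approx = cordic_dot(xs, ws, n)
        print(f"iterations={n:2d} result={approx:+.6f} error={abs(approx - exact):.6f}")
```

In this sketch, switching between modes amounts to changing the iteration count at run time, which loosely mirrors the reconfiguration between approximate and accurate computation described above.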