UW · Jun 15, 2026 · arXiv:2606.16190

Embedded Arena: Iterative Optimization via Hardware Feedback

Zhihan Zhang, Alexander Le Metzger, Jiuyang Lyu, Chun-Cheng Chang, Jiayi Shao, Yujia Liu, Emmanuel Azuh Mensah, Edward Wang, Kurtis Heimerl, Gregory D. Abowd, Shwetak Patel, Natasha Jaques, Vikram Iyer

AI Summary

This paper introduces a hardware-in-the-loop agent arena in which an LLM agent optimizes AI models for embedded devices under hard memory, power, and temperature constraints while preserving accuracy. The agent iteratively refines both model and firmware using measurements taken on real hardware. Frontier models without hardware feedback reach 0% deployment success, whereas the closed-loop formulation achieves a first successful deployment within three iterations and can surpass human expert results within seven. The approach reaches 250x compression for vision models with <3.3% accuracy loss and is demonstrated in two systems: an elk-detection camera trap and a phonetic-transcription wearable for child development research.

Key Contribution

LLMs without hardware feedback fail to deploy (0% deployment success), whereas the hardware-in-the-loop agent achieves 250x compression for vision models with <3.3% accuracy loss and 400x for audio models with <6% Feature Error Rate loss.

Abstract

Embedded devices from wildlife monitoring stations to clinical wearables require local AI inference due to latency, communication, or privacy constraints. Optimizing models for heterogeneous microcontrollers (MCUs) requires simultaneously satisfying hard physical constraints on memory, power, and temperature while preserving accuracy, a multidimensional optimization that is today performed manually by experts. We ask whether an LLM agent can autonomously navigate this complex, multi-turn pipeline guided by real hardware feedback, and introduce a hardware-in-the-loop agent arena in which the agent iteratively refines both model and firmware -- compiling, flashing, and measuring on real hardware -- to enable closed-loop optimization. Frontier models, including Claude Opus 4.7 and Gemini 3.1 Pro, fail entirely without hardware feedback (0% deployment success), whereas our hardware-in-the-loop formulation achieves the first successful deployment within three iterations and can surpass human expert results within seven. This agentic co-optimization achieves 250x compression for vision models with <3.3% accuracy loss and 400x for audio with <6% Feature Error Rate loss, enabling battery-free operation on a commercial MCU via solar harvesting. We demonstrate practical impact in two real-world systems: an elk-detection camera trap (96.7% accuracy) and a phonetic-transcription wearable (8.44% FER) for child development research.
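The closed loop described in the abstract (propose a model and firmware candidate, compile, flash, measure, then feed the measurement back) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `Measurement`, `Budget`, `optimize`, `toy_propose`, and `toy_measure` are hypothetical names, the LLM agent is replaced by a trivial rule that halves a width multiplier, and the board is replaced by a simulated measurement function with made-up numbers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Measurement:
    """Feedback from one compile/flash/measure cycle (hypothetical fields)."""
    deployed: bool    # did the build fit on the board and run?
    flash_kb: float   # firmware plus model size
    power_mw: float   # average power draw during inference
    accuracy: float   # task accuracy measured on-device


@dataclass
class Budget:
    """Hard constraints the deployed candidate must satisfy."""
    flash_kb: float
    power_mw: float
    min_accuracy: float

    def satisfied_by(self, m: Measurement) -> bool:
        return (m.deployed
                and m.flash_kb <= self.flash_kb
                and m.power_mw <= self.power_mw
                and m.accuracy >= self.min_accuracy)


History = List[Tuple[float, Measurement]]


def optimize(propose: Callable[[History], float],
             measure: Callable[[float], Measurement],
             budget: Budget,
             max_iters: int = 7) -> Tuple[Optional[float], History]:
    """Iterate propose -> measure until the budget is met or iterations run out."""
    history: History = []
    for _ in range(max_iters):
        candidate = propose(history)   # the agent sees all past feedback
        result = measure(candidate)    # stands in for compile, flash, measure
        history.append((candidate, result))
        if budget.satisfied_by(result):
            return candidate, history
    return None, history


def toy_propose(history: History) -> float:
    """Stand-in for the agent: start at full width, halve after each failure."""
    if not history:
        return 1.0
    last_width, _ = history[-1]
    return last_width / 2


def toy_measure(width: float) -> Measurement:
    """Simulated board with 512 KB of flash; all numbers are invented."""
    flash_kb = 2000 * width
    deployed = flash_kb <= 512
    accuracy = 0.97 - 0.01 * (1 / width - 1) if deployed else 0.0
    return Measurement(deployed, flash_kb, 40 * width, accuracy)
```

Calling `optimize(toy_propose, toy_measure, Budget(512, 20, 0.9))` walks through widths 1.0, 0.5, and 0.25 and stops at the first candidate that both deploys and meets the budget. In a real setup, `measure` would wrap the toolchain and instrumentation for the target MCU, and `propose` would be the LLM agent conditioned on the accumulated history.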

Distributed Systems & Hardware · Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
