MVVM: Deploy Your AI Agents-Securely, Efficiently, Everywhere

Yiwei Yang, Aibo Hu, Yusheng Zheng, Brian Zhao, Xinqi Zhang, Dawei Xiang, Kexin Chu, Wei Zhang, Andi Quinn

Published: 2024/10/21

Abstract

The rise of AI agents powered by Large Language Models (LLMs) presents critical challenges: how to securely execute and migrate these agents across heterogeneous environments while protecting sensitive user data, maintaining availability during network failures, minimizing response latency for time-critical decisions, and ensuring output safety in mission-critical applications. We present MVVM, a WebAssembly-based secure container framework that enables transparent live migration of LLM agent workspaces between edge devices and cloud servers with end-to-end privacy guarantees, resilient multi-tier replication, speculative execution for latency optimization, and integrated validation for safety assurance. MVVM introduces two key innovations: (1) a two-way sandboxing framework leveraging hardware enclaves and accelerator extensions that protects both the agent from malicious hosts and the host from compromised agents; (2) an efficient cross platform migration mechanism using WebAssembly and WASI's platform-agnostic design, enabling seamless movement across ARM phones, RISC-V MCUs, x86 servers, and heterogeneous accelerators; and three astonishing use cases: (1) privacy-aware daemon that automatically determines whether to execute locally or remotely based on data sensitivity and resource availability; (2) multi-tier replication with intelligent quality degradation that maintains service availability despite network failures or resource constraints; (3) a comprehensive execution framework combining speculative execution for 10x latency reduction with parallel validation that ensures output safety without compromising responsiveness. Our evaluation demonstrates that MVVM is validated on three separate devices across 18 workloads.

Read Full Paper (arXiv.org)