Beihang | Apr 6, 2026 | arXiv:2604.05072

Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling

Ximing Xing, Ziteng Xue, Zhenxi Li, Weicong Liang, Linqing Wang, Zhantao Yang, Tiankai Hang, Zijin Yin, Qinglin Lu, Chunyu Wang, Qian Yu

AI Summary

The paper introduces HiVG, a hierarchical tokenization scheme for SVG generation that decomposes SVG strings into atomic tokens and segment tokens to improve sequence efficiency while preserving syntactic validity. To address spatial mismatch, the authors propose a Hierarchical Mean-Noise (HMN) initialization strategy that injects numerical ordering signals and semantic priors into new token embeddings. Experiments on text-to-SVG and image-to-SVG tasks show that HiVG improves generation fidelity, spatial consistency, and sequence efficiency compared with byte-level tokenization.
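The two token levels named in the summary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `atomic_tokens` and `segment_tokens`, the regular expression, and the tuple representation of a segment are hypothetical, and HiVG's actual vocabulary and merge rules are not specified on this page. The sketch splits an SVG path string into command and number atoms, then groups each command with the parameters it requires, using the parameter counts from the SVG path grammar.

```python
import re
from typing import List, Tuple

# Hypothetical atom pattern: a path command letter or a plain decimal number.
# Exponent notation and flag packing in arc commands are not handled.
_ATOM = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]|-?\d*\.?\d+")

# Number of numeric parameters per path command (SVG path grammar).
_ARITY = {"M": 2, "L": 2, "H": 1, "V": 1, "C": 6, "S": 4, "Q": 4, "T": 2, "A": 7, "Z": 0}


def atomic_tokens(d: str) -> List[str]:
    """Split a path `d` string into command and number atoms."""
    return _ATOM.findall(d)


def segment_tokens(atoms: List[str]) -> List[Tuple]:
    """Group each command with its parameters into one segment.

    Assumes every segment starts with an explicit command letter;
    implicit command repetition (e.g. "L 1 2 3 4") is rejected.
    """
    segments = []
    i = 0
    while i < len(atoms):
        cmd = atoms[i]
        if cmd.upper() not in _ARITY:
            raise ValueError(f"expected a command, got {cmd!r}")
        n = _ARITY[cmd.upper()]
        params = atoms[i + 1 : i + 1 + n]
        if len(params) != n:
            raise ValueError(f"command {cmd!r} needs {n} parameters")
        # float() raises ValueError if a command letter appears where a number is required.
        segments.append((cmd, *(float(p) for p in params)))
        i += 1 + n
    return segments


d = "M10 20 L30 40 C1 2 3 4 5 6 Z"
atoms = atomic_tokens(d)
segments = segment_tokens(atoms)
# Character count is used only as a crude stand-in for byte-level tokenization.
print(len(d), len(atoms), len(segments))  # 28 14 4
```

Because a segment is only emitted when a command has exactly the number of parameters it needs, malformed groups fail early, which is one simple way to read "preserving syntactic validity" in this toy setting.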

Key Contribution

Ditch the byte-level baggage: HiVG's hierarchical SVG tokenization cuts token redundancy and coordinate hallucination, yielding more efficient and more geometrically consistent vector graphics generation.

Abstract

Recent large language models have shifted SVG generation from differentiable rendering optimization to autoregressive program synthesis. However, existing approaches still rely on generic byte-level tokenization inherited from natural language processing, which poorly reflects the geometric structure of vector graphics. Numerical coordinates are fragmented into discrete symbols, destroying spatial relationships and introducing severe token redundancy, often leading to coordinate hallucination and inefficient long-sequence generation. To address these challenges, we propose HiVG, a hierarchical SVG tokenization framework tailored for autoregressive vector graphics generation. HiVG decomposes raw SVG strings into structured atomic tokens and further compresses executable command-parameter groups into geometry-constrained segment tokens, substantially improving sequence efficiency while preserving syntactic validity. To further mitigate spatial mismatch, we introduce a Hierarchical Mean-Noise (HMN) initialization strategy that injects numerical ordering signals and semantic priors into new token embeddings. Combined with a curriculum training paradigm that progressively increases program complexity, HiVG enables more stable learning of executable SVG programs. Extensive experiments on both text-to-SVG and image-to-SVG tasks demonstrate improved generation fidelity, spatial consistency, and sequence efficiency compared with conventional tokenization schemes. Our code is publicly available at https://github.com/ximinng/HiVG
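The abstract does not give the HMN formula, so the following is only a rough sketch of what a mean-plus-noise initialization with an ordering signal could look like. The function `mean_noise_init`, its parameters, and the linear ordering term are all assumptions for illustration, and the hierarchical aspect of HMN is not modeled. Each new coordinate-token embedding is set to the mean of existing embeddings (an assumed stand-in for the semantic prior), shifted along one shared direction in proportion to the coordinate value (an assumed ordering signal), plus small Gaussian noise.

```python
import random
from typing import List, Tuple


def mean_noise_init(
    base_vectors: List[List[float]],
    num_bins: int,
    noise_std: float = 0.02,
    order_scale: float = 0.1,
    seed: int = 0,
) -> Tuple[List[List[float]], List[float]]:
    """Hypothetical init for `num_bins` new coordinate-token embeddings.

    Returns the new embeddings and the unit direction that carries the
    ordering signal. Not the paper's formula; an illustrative assumption.
    """
    rng = random.Random(seed)
    dim = len(base_vectors[0])

    # Assumed semantic prior: mean of the existing embeddings.
    mean = [sum(v[j] for v in base_vectors) / len(base_vectors) for j in range(dim)]

    # Assumed ordering signal: one random unit direction shared by all bins.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in direction) ** 0.5
    direction = [x / norm for x in direction]

    embeddings = []
    for k in range(num_bins):
        # Offset grows linearly with the coordinate index, centered at zero.
        offset = order_scale * (k / max(num_bins - 1, 1) - 0.5)
        embeddings.append(
            [mean[j] + offset * direction[j] + rng.gauss(0.0, noise_std) for j in range(dim)]
        )
    return embeddings, direction


def project(vec: List[float], direction: List[float]) -> float:
    """Dot product of an embedding with the ordering direction."""
    return sum(a * b for a, b in zip(vec, direction))


# Toy "pretrained" embeddings; real ones would come from the language model.
base = [[0.1, 0.2, 0.3, 0.4], [0.3, 0.0, 0.1, 0.2]]
new_embeddings, direction = mean_noise_init(base, num_bins=5, noise_std=0.0)
print([round(project(e, direction), 4) for e in new_embeddings])
```

With `noise_std=0.0` the projections onto the shared direction increase strictly with the coordinate index, so neighboring coordinates start close together in embedding space; with noise enabled this ordering holds only approximately.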

Architecture Design (Transformers, SSMs, MoE) | Code Generation & Program Synthesis | Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 42

Year: 2026

Venue: N/A
