Sunghwan Hong

Research focus

Multimodal Models (5) · Computer Vision (4) · Robotics & Embodied AI (3) · Recommendation & Information Retrieval (1)

Frequent co-authors

Marc Pollefeys (5) · Sung‐Jin Hong (2) · Jaewoo Jung (2) · Boyang Sun (1)

Papers (6)

Jul 2, 2026

ETH

LIME: Learning Intent-aware Camera Motion from Egocentric Video

LIME learns from ordinary egocentric video how a robot should dynamically adjust its camera pose based on user intent.

Boyang Sun, Jiajie Li, Yung-Hsu Yang +9

Multimodal Models · Robotics & Embodied AI

Jun 15, 2026

ETH · also Oxford

PROSE: Training-Free Egocentric Scene Registration with Vision-Language Models

PROSE achieves state-of-the-art registration accuracy for egocentric scenes without any learned parameters or depth sensors.

Zhiang Chen, Nahyuk Lee, Taein Kwon +3

Computer Vision · Multimodal Models · Robotics & Embodied AI

Jun 15, 2026 · also SNU, Yonsei

Geometric Action Model for Robot Policy Learning

GAM integrates 3D geometric reasoning into robot policy learning, outperforming traditional models in accuracy and efficiency.

Jisang Han, Seonghu Jeon, Jaewoo Jung +7

Multimodal Models · Robotics & Embodied AI

Jun 3, 2026

ETH

ZipSplat: Fewer Gaussians, Better Splats

ZipSplat uses six times fewer Gaussians while surpassing state-of-the-art performance in 3D scene reconstruction.

Alexander Veicht, Sunghwan Hong, Dániel Baráth +1

Computer Vision · Multimodal Models

Apr 9, 2026

Marcel Gröpl +7 · also SNU, Yonsei

Entropy-Gradient Grounding: Training-Free Evidence Retrieval in Vision-Language Models

Forget training wheels: this training-free method leverages uncertainty to guide vision-language models to the right image regions, boosting performance on detail-oriented tasks.

Marcel Gröpl, Jaewoo Jung +5

Computer Vision · Multimodal Models · Recommendation & Information Retrieval

Feb 15, 2026

A. Said Gurbuz +3 · also IBM Research

Moving Beyond Sparse Grounding with Complete Screen Parsing Supervision

Forget sparse annotations: a new dataset and compact VLM show that dense, complete screen parsing supervision unlocks substantial gains in UI understanding and grounding, even for large foundation models.

A. Said Gurbuz, Sunghwan Hong, Ahmed Nassar +1

Computer Vision · Tool Use & Agents
