Search papers, labs, and topics across Lattice.
Introduction With the advancement of multimodal large language models (MLLMs) and coding agents [Claude
1
0
4
2
Today's best multimodal agents still fall into "blind execution" traps when building websites from ambiguous, non-expert user instructions, highlighting a critical gap in intent recognition and adaptive interaction.