National Institute of Advanced Industrial Science and Technology (AIST), Japan
Expert alignment is hard not just because of model limitations, but because human subjective evaluation is a moving target.
Multimodal LLMs suffer a major performance hit when asked to switch from text-based to image-based tasks mid-conversation, revealing a surprising asymmetry in their ability to handle task interference.