Search papers, labs, and topics across Lattice.
The University of Hong Kong
2
0
3
Forget bolting vision onto language models – truly powerful multimodal AI demands rethinking architectures from the ground up.
Expert-level video aesthetics can be captured and improved using a hierarchical rubric and reward models trained with a progressive learning scheme.