Rensselaer Polytechnic Institute, Harvard University
Decoder-based Sense Knowledge Distillation (DSKD) lets generative models learn structured semantics without the inference-time overhead of dictionary lookups.
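The blurb does not spell out DSKD's training objective, but the core of any distillation setup of this kind is matching a student's output distribution to a teacher's. As a minimal, generic sketch (not the paper's actual method; the function names and the temperature value are illustrative assumptions), a temperature-softened KL distillation loss looks like:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Mean KL divergence from teacher to student distributions.

    In a DSKD-style setup, the teacher would have access to dictionary
    sense knowledge at training time; the student decoder learns to
    reproduce its distribution, so no lookup is needed at inference.
    """
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    return float(kl.mean())
```

The dictionary is consulted only while computing the teacher's targets during training, which is what removes the lookup cost at inference time.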
By randomly attending to different time patches and progressively mixing scales, SEMixer achieves state-of-the-art long-term time series forecasting with a lightweight architecture.
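The two ingredients named in the blurb, random attention over time patches and progressive scale mixing, can be sketched in a toy form. This is a rough illustration under stated assumptions (all function names, the patch lengths, and the "mean over kept patches" stand-in for attention are hypothetical, not SEMixer's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def to_patches(series, patch_len):
    """Split a 1-D series into non-overlapping patches of length patch_len."""
    n = len(series) // patch_len
    return series[: n * patch_len].reshape(n, patch_len)

def random_patch_summary(patches, keep_frac=0.5):
    """Stand-in for random patch attention: keep a random subset of
    patches and pool them (here, a simple mean)."""
    n = len(patches)
    keep = rng.choice(n, size=max(1, int(n * keep_frac)), replace=False)
    return patches[keep].mean(axis=0)

def progressive_scale_mix(series, patch_lens=(4, 8, 16)):
    """Mix summaries computed at progressively coarser patch scales."""
    summaries = [random_patch_summary(to_patches(series, p)).mean()
                 for p in patch_lens]
    return float(np.mean(summaries))
```

Because each scale sees only a random subset of patches, the model stays lightweight while still aggregating information across multiple temporal resolutions.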