Looping a language model block four times gives you the effective capacity of only 1.4 additional unique blocks, but costs as much to train as 2.4 of them.