You can slash LLM inference costs without sacrificing quality by strategically pruning experts, quantizing, and swapping full attention for windowed attention, as demonstrated on gpt-oss-120B.
LLMs can significantly boost factual accuracy in long-form generation by strategically "toning down" uncertain details, rather than simply omitting them.