27 papers published.
Tabular foundation model performance hinges on the evaluation metric, revealing that no single pretraining objective is universally optimal across different risk profiles.
Bilingual language models can achieve performance comparable to monolingual models in both languages, challenging the assumption that bilingual input poses significant learning obstacles.
Forget expensive fine-tuning: DUME dynamically combines existing expert LLMs into a powerful MoE *without* additional training, unlocking multi-domain performance at minimal cost.
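A minimal sketch of the general idea, assuming a similarity-based gate over frozen experts; the names (`route`, `domain_embs`) and the gating rule are illustrative assumptions, not DUME's actual interface:

```python
import numpy as np

# Illustrative sketch of training-free expert mixing: route each query to
# frozen expert models via a similarity gate, then mix their outputs.
# The gating rule and names are assumptions, not DUME's actual method.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def route(query_emb, domain_embs, temperature=0.1):
    """Gate weights from cosine similarity between query and domain profiles."""
    sims = domain_embs @ query_emb / (
        np.linalg.norm(domain_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9
    )
    return softmax(sims / temperature)

# Toy 'experts': stand-ins for frozen LLMs, each just scaling its input.
experts = [lambda x, w=w: x * w for w in (0.5, 1.0, 2.0)]
domain_embs = np.eye(3)                # one profile vector per expert domain
query_emb = np.array([0.1, 0.2, 0.9])  # mostly matches the third expert

gate = route(query_emb, domain_embs)
output = sum(g * f(1.0) for g, f in zip(gate, experts))  # weighted mixture
```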
Unlock knowledge equity for underserved languages: L-ReLF offers a reproducible recipe for creating high-quality lexical datasets where they're needed most.
Despite its simple grammar, Esperanto translation still poses challenges for LLMs, with NLLB models preferred in only about half of human evaluations.
Proprietary language models trounce open-source alternatives by 3-6x on a new, large-scale corpus of Sinhala and Pali Buddhist texts.
LLMs can mimic legislative reasoning, but their performance hinges on the proposal's idiosyncrasy, revealing a susceptibility to plausible-sounding confabulation that could mislead policymakers.
Real-time, uncertainty-aware signed distance functions are now possible without sacrificing accuracy, thanks to a novel hybrid of kernel regression and Gaussian process (GP) regression.
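As a rough illustration, a plain GP regressor already yields a predictive mean and variance for signed distances; this sketch shows only that standard GP piece, not the paper's hybrid or its speedup:

```python
import numpy as np

# Standard GP regression sketch (RBF kernel) returning a predicted signed
# distance and its variance. A stand-in for the paper's kernel/GP hybrid,
# whose exact split between the two regressors is not described here.

def rbf(a, b, ls=0.5):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Observed signed distances at 1-D sample points (toy data).
x_train = np.array([-1.0, -0.3, 0.2, 0.8])
y_train = x_train - 0.1            # pretend the surface sits at x = 0.1

K = rbf(x_train, x_train) + 1e-6 * np.eye(len(x_train))
K_inv = np.linalg.inv(K)

def sdf(x_query):
    k = rbf(x_query, x_train)
    mean = k @ K_inv @ y_train                         # predicted distance
    var = rbf(x_query, x_query).diagonal() - np.einsum(
        "ij,jk,ik->i", k, K_inv, k)                    # predictive uncertainty
    return mean, var

mean, var = sdf(np.array([0.0, 1.5]))  # variance grows far from the samples
```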
Unlock new insights into rapid software development and collaboration with a massive dataset of over 100,000 hackathon projects.
Open-source projects are quietly integrating ML models in ways that may violate terms of service and regulations, raising concerns about unchecked ML automation.
VLMs can appear to gain up to 58% F1 on clinical tasks simply by *mentioning* MRI data in the prompt, even when the data is uninformative, revealing a "scaffold effect" that inflates performance metrics.
Random weight initialization is a major source of instability in deep learning, especially for rare classes, but this work shows how to eliminate it entirely with structured orthogonal initialization.
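A minimal sketch of the building block, assuming the scheme reduces to PyTorch's standard orthogonal initializer applied per linear layer; the paper's "structured" variant may differ:

```python
import torch
import torch.nn as nn

# Replace default random init with orthogonal init so every linear layer
# starts from an orthonormal weight matrix. torch.nn.init.orthogonal_ is
# the standard building block; matching the paper's exact structured
# scheme is an assumption.

def init_orthogonal(model: nn.Module, gain: float = 1.0) -> None:
    for module in model.modules():
        if isinstance(module, nn.Linear):
            nn.init.orthogonal_(module.weight, gain=gain)
            if module.bias is not None:
                nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
init_orthogonal(model)
```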
Forget pruning or quantization: MPO decomposition lets you compress a transformer by 13x while retaining 97% accuracy.
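To make the mechanism concrete, here is a two-core sketch of the reshape-plus-truncated-SVD trick behind MPO factorization; real MPO compression chains more cores with per-layer bond dimensions, so this is an illustration, not the paper's pipeline:

```python
import numpy as np

# Two-core MPO sketch: factor a weight matrix W (m1*m2 x n1*n2) into two
# small tensors via a truncated SVD over a reshaped view of W.

def mpo_two_core(W, m=(16, 16), n=(16, 16), bond=8):
    m1, m2 = m
    n1, n2 = n
    T = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3)  # (m1, n1, m2, n2)
    M = T.reshape(m1 * n1, m2 * n2)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    core1 = (U[:, :bond] * S[:bond]).reshape(m1, n1, bond)
    core2 = Vt[:bond].reshape(bond, m2, n2)
    return core1, core2

W = np.random.randn(256, 256)
c1, c2 = mpo_two_core(W)
print(W.size / (c1.size + c2.size))  # per-layer compression ratio (16x here)
```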
LLMs can now reliably transform messy app store reviews into well-formatted user stories, but still fall short of creating truly independent and unique requirements for agile development.
Quantum-proofing your 5G core doesn't have to break the bank: a sidecar proxy can add post-quantum cryptography with a predictable 50ms latency hit.
A task-specific, lightweight transformer can outperform state-of-the-art reasoning LLMs and commercial tools in C code vulnerability detection, at a fraction of the inference cost.
Forget fine-tuning: merging language-specific weights into instruction-tuned LLMs unlocks surprisingly effective instruction following in low-resource languages.
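A minimal sketch of one plausible merge, assuming task-arithmetic-style deltas (language checkpoint minus base, added onto the instruction-tuned weights); whether the paper merges exactly this way is an assumption:

```python
import torch

# Merge a language adaptation delta into an instruction-tuned model.
# All three state dicts must come from the same architecture.

def merge(instruct_sd, base_sd, lang_sd, alpha=1.0):
    merged = {}
    for name, w_instruct in instruct_sd.items():
        delta = lang_sd[name] - base_sd[name]   # language-specific shift
        merged[name] = w_instruct + alpha * delta
    return merged

# Usage with three checkpoints sharing one architecture:
# merged = merge(instruct.state_dict(), base.state_dict(), lang.state_dict())
```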
Blockchain-based federated learning can be made practical by using multi-task peer prediction to overcome the computational bottleneck of contribution measurement.
Synergy's architecture lets agents evolve through experience by proactively recalling rewarded trajectories, hinting at a new way to build agents that learn and adapt in open, collaborative environments.
Securing LLM supply chains requires cryptographically binding training and release claims to artifacts, enabling verifiable enforcement of security policies across teams and stages.
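A toy sketch of the binding step, assuming an Ed25519 signature over a JSON statement that embeds the artifact's SHA-256 digest; the field names are illustrative, and real systems use attestation formats such as in-toto:

```python
import hashlib
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Bind a claim to an artifact: hash the artifact, embed the digest in the
# claim statement, and sign the statement. Verifiers can then check that
# both the claim and the exact artifact bytes are untampered.

def sign_claim(artifact_bytes: bytes, claim: dict, key: Ed25519PrivateKey):
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    statement = json.dumps({"artifact_sha256": digest, **claim},
                           sort_keys=True).encode()
    return statement, key.sign(statement)

key = Ed25519PrivateKey.generate()
statement, sig = sign_claim(b"model-weights...",
                            {"stage": "training", "policy": "pii-filtered"},
                            key)
key.public_key().verify(sig, statement)  # raises InvalidSignature if tampered
```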
Bitcoin can be more than just digital gold: BitSov proposes a composable architecture for a censorship-resistant internet, anchored to Bitcoin's blockchain, that could reshape how we build decentralized applications.
Ditch the command line: these open-source Shiny apps make introductory statistics concepts like hypothesis testing and regression intuitively accessible to students without any programming experience.
Open-source RISC-V microcontrollers are now easier to build, thanks to a streamlined design and fully open RTL-to-GDS flow.
LLMs exhibit polarity illusions without rational inference, suggesting that "good enough" processing and partial grammaticalization may suffice to explain these phenomena in both machines and humans.
Adapting LLMs to low-resource languages might be as simple as teaching them to "speak" bytes, sidestepping the tokenization bottleneck.
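The "speak bytes" idea in miniature: treat raw UTF-8 bytes as the token vocabulary, so every script fits in 256 symbols. This sketches byte-level tokenization in general, not any specific model's pipeline:

```python
# Byte-level 'tokenization': every string maps to its UTF-8 bytes, so the
# vocabulary is 256 symbols regardless of language or script. This avoids
# subword tokenizers that fragment low-resource languages into long, rare
# pieces.

def encode(text: str) -> list[int]:
    return list(text.encode("utf-8"))   # token ids in [0, 255]

def decode(ids: list[int]) -> str:
    return bytes(ids).decode("utf-8")

ids = encode("Saluton, ĉu vi fartas bone?")  # 'ĉ' becomes two byte ids
assert decode(ids) == "Saluton, ĉu vi fartas bone?"
```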
Despite increased discussion around open science, replication studies in computing education research have only seen marginal growth, suggesting a disconnect between espoused values and actual research practices.
AI coding agents are less likely to break your code *except* when they're confidently "maintaining" it, where they're actually twice as risky as humans.