59 papers published across 3 labs.
Turn your Jupyter notebooks into one-click installable desktop apps with LabConstrictor, democratizing access to computational methods for researchers without DevOps expertise.
Despite their general prowess, open-source LLMs still lag behind proprietary models in the nuanced task of dating texts, even after fine-tuning.
Can a dedicated research program keep a smaller, local LLM competitive against global giants in the rapidly evolving AI landscape?
An AI-integrated agile education platform accelerates practice-relevant AI research by closing the theory-practice gap in software development.
Single-domain watermarks are fundamentally insufficient against modern adversarial toolsets, as spatial and latent watermarks exhibit orthogonal vulnerabilities to generative and geometric attacks, respectively.
A fully open-source speech understanding model, OSUM-Pangu, proves that competitive performance is achievable on non-CUDA hardware, challenging the dominance of GPU-centric ecosystems.
Speech-aware LLMs are surprisingly bad at speaker verification, but a simple embedding injection trick closes the gap with dedicated systems while preserving the LLM's language abilities.
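A minimal sketch of the general idea, with hypothetical names and dimensions (this is illustrative, not the paper's actual architecture): project an external speaker embedding into the LLM's hidden width and prepend it as a soft token, leaving the backbone frozen.

```python
import torch
import torch.nn as nn

class SpeakerInjection(nn.Module):
    """Inject a dedicated speaker embedding into an LLM's input stream.

    Hypothetical sketch: `spk_dim` is the size of an external speaker
    vector (e.g., from a dedicated verifier), `hid_dim` the LLM width.
    """
    def __init__(self, spk_dim: int = 192, hid_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(spk_dim, hid_dim)  # the only trainable part

    def forward(self, token_embeds: torch.Tensor, spk_vec: torch.Tensor):
        # Prepend the projected speaker vector as a soft token, so the
        # frozen LLM can condition on speaker identity without retraining,
        # preserving its language abilities.
        spk_token = self.proj(spk_vec).unsqueeze(1)          # (B, 1, H)
        return torch.cat([spk_token, token_embeds], dim=1)   # (B, T+1, H)
```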
A 4B-parameter model, InternVL-U, outperforms 14B-parameter models in multimodal generation and editing, proving that size isn't everything.
GNNs don't just detect time-series anomalies better; they also offer a crucial interpretability boost for real-world diagnosis.
Automating the messy process of turning open-source code into LLM tools unlocks a new level of agent capabilities, outperforming even commercial LLMs.
Pretrained ALiBi transformers suffer from a widespread attention collapse that can be surgically repaired to yield a 25% perplexity improvement, suggesting that standard pretraining leaves performance on the table.
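For context, ALiBi replaces positional embeddings with a per-head linear penalty on attention scores; the sketch below shows the standard bias computation (illustrative background, not the paper's repair procedure).

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Standard ALiBi bias: head h penalizes distant keys with slope 2^-(8h/H)."""
    slopes = 2.0 ** (-8.0 * torch.arange(1, n_heads + 1) / n_heads)  # (H,)
    # dist[q, k] = k - q, so past keys (k <= q) get a non-positive bias.
    dist = torch.arange(seq_len)[None, :] - torch.arange(seq_len)[:, None]
    dist = dist.clamp(max=0).float()     # future positions are masked anyway
    return slopes[:, None, None] * dist  # (H, T, T), added to the QK^T scores
```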
Open-source LLMs can now rival proprietary systems in extracting crucial cancer progression data from radiology reports, unlocking scalable analysis while preserving patient privacy.
Forget parameter conflicts: representational incompatibility is the real culprit behind LLM merging failures, setting fundamental limits on which tasks can be successfully combined.
Forget ensembling or retraining: model merging lets you Frankenstein LLMs for specialized skills at minimal cost.
Don't build a domain-specific model just because you can: fine-tuning a general-purpose model can achieve comparable performance on common tasks, saving significant resources.
Mamba-2's efficiency doesn't require custom CUDA kernels: XLA's compiler optimizations are enough to unlock near-optimal performance across diverse hardware.
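The crux is that the SSM recurrence h_t = a_t * h_{t-1} + b_t is associative, so XLA can parallelize it without hand-written kernels. A minimal JAX sketch of that claim (illustrative, not the paper's implementation):

```python
import jax
import jax.numpy as jnp

def ssm_scan(a: jnp.ndarray, b: jnp.ndarray) -> jnp.ndarray:
    """Parallel evaluation of h_t = a_t * h_{t-1} + b_t via XLA.

    a, b: (T, D) per-step decay and input terms. No custom CUDA needed:
    jax.lax.associative_scan compiles to an efficient parallel scan.
    """
    def combine(left, right):
        a_l, b_l = left
        a_r, b_r = right
        # Composing two affine steps: h -> a_r * (a_l * h + b_l) + b_r
        return a_l * a_r, a_r * b_l + b_r

    _, h = jax.lax.associative_scan(combine, (a, b))
    return h  # (T, D) hidden states

# e.g. h = jax.jit(ssm_scan)(a, b) runs on CPU, GPU, or TPU unchanged
```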
Open-sourcing a fully reproducible, optimized Band-Split RNN for music separation, this paper reveals the surprisingly large gap between published results and what can be achieved with a faithful reimplementation, even with significant effort.
Open-source TTS gets a serious upgrade with Fish Audio S2, offering instruction-following control via natural language and production-ready streaming performance.
AI-powered cyber reasoning can now find real-world bugs in open-source software thanks to a new framework that liberates DARPA's AI Cyber Challenge systems from their inaccessible cloud origins.
Democratized LLM pre-training is now a reality: Covenant-72B proves you can train a competitive 72B model with untrusted peers over the internet, opening the door to broader participation and reduced costs.
Ditch the da Vinci: this open-source surgical robotics platform brings precision and flexibility to autonomous laparoscopic procedures using standard industrial robots.
Turns out, buying stars and downloads for open-source software doesn't actually trick developers into using it.
IronEngine achieves 100% task completion on file operation benchmarks by decoupling planning quality from execution capability via a novel three-phase pipeline and intelligent tool routing.
Bridging the gap between narrative descriptions and workflow implementations, CoPaLink automatically links bioinformatics tools mentioned in papers to their usage in code, boosting reproducibility.
A 4B-parameter model, Meissa, rivals the performance of much larger proprietary models in medical agent tasks, offering a cost-effective and privacy-preserving alternative for clinical applications.
Tabular foundation models, despite excelling in point estimate benchmarks, need proper scoring rules like CRPS to reliably evaluate their probabilistic regression capabilities, revealing a crucial blind spot in current evaluation practices.
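For reference, CRPS for a sample-based predictive distribution follows from the identity CRPS(F, y) = E|X − y| − ½ E|X − X′|; a small NumPy sketch:

```python
import numpy as np

def crps_from_samples(samples: np.ndarray, y: float) -> float:
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'|.

    Lower is better; unlike point metrics such as RMSE, CRPS rewards
    a model for putting calibrated probability mass near the truth.
    """
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return float(term1 - term2)
```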
LLMs can automate and improve thematic analysis of qualitative data, achieving expert-level alignment in clinical domains through iterative codebook refinement.
A new 30B open-weight LLM trained on 34 European languages achieves state-of-the-art performance on low-resource languages with significantly less compute, proving that clever training beats brute force.
Bridge the trust gap in cloud-based LLM services with AFTUNE, a practical framework that lets you audit proprietary fine-tuning and inference without prohibitive overhead.
Zero-shot multilingual TTS models stumble when synthesizing Kashmiri, but a script-aware, flow-based adaptation strategy unlocks intelligible speech.
Steer LLMs like never before with AI Steerability 360, an open-source toolkit that unifies input, structural, state, and output steering methods under a common pipeline.
Chain-of-Thought prompting doesn't always improve LLMs' ability to solve discrete optimization problems, and surprisingly, "disordered" datasets can sometimes boost performance on simpler tasks.
AI models can detect injected thoughts, but they often have no idea *what* those thoughts are, relying on content-agnostic anomaly detection and then guessing common concepts.
LLMs often know the answer long before their "reasoning" concludes, wasting tokens on performative chain-of-thought.
Achieve state-of-the-art autopilot performance with a codebase that's significantly leaner and more modular, unlocking faster iteration for robotics researchers.
Forget slow, end-to-end models: building real-time voice agents hinges on a cascaded streaming pipeline, as demonstrated by a new tutorial achieving sub-second latency.
VLMs can now dynamically adapt to changing deployment environments with user-controlled authorization, thanks to a new framework that protects intellectual property while maintaining performance.
MQED-QD offers a unified, open-source workflow for simulating exciton dynamics in complex nanophotonic environments, enabling the rational design of nanoscale architectures.
A terminal-native coding agent, OPENDEV, achieves robust autonomous software engineering by enforcing explicit reasoning phases and prioritizing context efficiency, offering a blueprint for secure and extensible AI assistance.
Data augmentation can boost a TF-IDF model to near state-of-the-art hate speech detection accuracy on certain datasets, rivaling much larger transformer models.
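A minimal sketch of that recipe, using an illustrative augmentation (random word dropout) rather than the paper's exact scheme:

```python
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def dropout_augment(text: str, p: float = 0.1) -> str:
    """Cheap augmentation: randomly drop words to create noisy variants."""
    words = text.split()
    kept = [w for w in words if random.random() > p]
    return " ".join(kept) if kept else text

texts = ["example post one", "example post two"]   # placeholder corpus
labels = [0, 1]

# Augment: each example contributes itself plus a perturbed copy.
X = texts + [dropout_augment(t) for t in texts]
y = labels * 2

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(X, y)
```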
Censored LLMs offer a surprisingly natural and effective environment for stress-testing methods that aim to elicit truthfulness and detect deception.
Uncovered: six distinct archetypes of Public Sector Open Source Program Offices (OSPOs) that reveal how different organizational structures drive OSS adoption and collaboration.
LLM privacy on shared accelerators doesn't have to break the bank: GELO achieves strong obfuscation with only 20-30% latency overhead, defeating common attacks.
A new lattice-based transaction scheme offers financial institutions a post-quantum secure and auditable distributed ledger solution that existing Ring-CT models can't provide.
Forget specialized models: a single segmentation framework, trained on diverse historical maps, now achieves state-of-the-art performance across collections, scales, and regions.
A new model, MUTEX, achieves 60% token-level F1 score on Urdu toxic span detection, providing the first supervised baseline for a challenging low-resource language.
By merging models on the Fisher-Rao manifold, this work achieves stable and accurate LLM merging even with many heterogeneous models, overcoming the representation collapse issues plaguing simpler weight averaging techniques.
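A simplified cousin of the idea, Fisher-weighted averaging (not the paper's full Fisher-Rao manifold construction), weights each model's parameters by its diagonal Fisher information:

```python
import torch

def fisher_weighted_merge(params: list[dict[str, torch.Tensor]],
                          fishers: list[dict[str, torch.Tensor]],
                          eps: float = 1e-8) -> dict[str, torch.Tensor]:
    """Merge K state dicts, weighting each parameter by its (diagonal)
    Fisher information so weights a model is 'confident' about dominate.

    Simplified sketch; the Fisher-Rao manifold method is more involved.
    """
    merged = {}
    for name in params[0]:
        num = sum(f[name] * p[name] for p, f in zip(params, fishers))
        den = sum(f[name] for f in fishers) + eps
        merged[name] = num / den
    return merged
```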
Ditch matplotlib for blazing-fast, GPU-powered dimensionality reduction and visualization on your Mac with mlx-vis.
Forget text prompts: vector prompt interfaces are the key to unlocking scalable and stable LLM customization.
A compromised 5G network can hijack or disable UAVs, revealing a major security gap in current UAV communication protocols.
Unleash the power of AI-assisted audio annotation with LabelBuddy, the open-source tool that lets you plug in your own models and build richer, more nuanced music representations.
Two-bit quantization can nearly match the performance of larger models on Polish language tasks, but beware: some methods that look good on paper fail catastrophically when generating text.
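To make the failure mode concrete, here is symmetric group-wise 2-bit quantization in its simplest form (an illustrative sketch, not any specific method evaluated in the paper):

```python
import numpy as np

def quantize_2bit(w: np.ndarray, group: int = 64):
    """Affine 2-bit quantization: 4 codes (0..3) per group of weights.

    Assumes w.size is divisible by `group`. With only four representable
    values, a poorly chosen scale can look fine on perplexity-style
    benchmarks yet be catastrophic during free-form generation.
    """
    w = w.reshape(-1, group)
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = (hi - lo) / 3.0                       # span mapped to codes 0..3
    q = np.round((w - lo) / np.maximum(scale, 1e-12))
    return np.clip(q, 0, 3).astype(np.uint8), scale, lo

def dequantize(q: np.ndarray, scale: np.ndarray, lo: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale + lo
```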
LLMs' performance on False Belief Tests isn't just about size: it's profoundly skewed by how you phrase the question, revealing that models learn stereotypical responses to mental-state vocabulary during pre-training.
Think your LLM's code is anonymous? This paper shows you can fingerprint it with high accuracy, even across different programming languages.
Finally, a lightweight, dependency-free Python library streamlines Vietnamese text normalization, handling everything from currency to acronyms without needing GPUs or external APIs.
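The flavor of such a library, in a hypothetical mini-version (the rule table and function names are illustrative, not the library's actual API):

```python
import re

ACRONYMS = {"TP.HCM": "Thành phố Hồ Chí Minh", "ĐH": "Đại học"}  # sample rules

def normalize(text: str) -> str:
    """Toy normalizer: expand acronyms and flatten currency amounts."""
    for short, full in ACRONYMS.items():
        text = text.replace(short, full)
    # "150.000đ" -> "150000 đồng" (digit-to-words omitted for brevity)
    text = re.sub(r"(\d{1,3}(?:\.\d{3})*)\s*đ\b",
                  lambda m: m.group(1).replace(".", "") + " đồng", text)
    return text

print(normalize("Học phí ĐH là 150.000đ"))  # -> Học phí Đại học là 150000 đồng
```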
Nominal techniques, a principled approach to variable binding, are now accessible as a practical Agda library.
Data quality, not just model size, reigns supreme: Phi-4-reasoning-vision-15B proves that smaller, open-weight multimodal models can achieve competitive performance through rigorous data curation and architecture choices.
LLMs aren't just chatbots; they're surprisingly good motivational interviewers, even outperforming human therapists on key metrics and fooling psychiatrists in distinguishability tests.
Ditch gradient sharing: PTOPOFL uses persistent homology to communicate only 48-dimensional topological summaries in federated learning, slashing reconstruction risk by 4.5x while boosting AUC.
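A hedged sketch of the core step, assuming the `ripser` package for persistence computation (the summarization below is illustrative; the paper's actual 48-dimensional descriptor may differ):

```python
import numpy as np
from ripser import ripser  # assumes: pip install ripser

def topo_summary(activations: np.ndarray, bins: int = 24) -> np.ndarray:
    """Reduce client activations to a fixed-length topological summary.

    Computes H0/H1 persistence diagrams and histograms their lifetimes,
    yielding a 2 * bins vector (48-d for bins=24). Clients would share
    only this vector, never gradients, limiting reconstruction attacks.
    """
    dgms = ripser(activations, maxdim=1)["dgms"]
    feats = []
    for dgm in dgms:                      # H0, then H1
        life = dgm[:, 1] - dgm[:, 0]
        life = life[np.isfinite(life)]    # drop the infinite H0 bar
        hist, _ = np.histogram(life, bins=bins, range=(0.0, 1.0))
        feats.append(hist.astype(np.float32))
    return np.concatenate(feats)          # shape (48,)
```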