IBM Research | Mar 8, 2026 | arXiv:2603.07837

AI Steerability 360: A Toolkit for Steering Large Language Models

Erik Miehling, Karthikeyan Natesan Ramamurthy, Praveen Venkateswaran, Irene Ko, Pierre Dognin, Moninder Singh, Tejaswini Pedapati, Avinash Balakrishnan, Matthew Riemer, Dennis Wei, Inge Vejsbjerg, Elizabeth M. Daly, Kush R. Varshney

AI Summary

AI Steerability 360 is an open-source Python library for steering large language models through four control surfaces: input, structural, state, and output. Its unified interface, the steering pipeline, lets users compose multiple steering methods, and its use case and benchmark classes support evaluating and comparing those methods on a given task. Together these lower the barrier to developing and assessing steering techniques.

Key Contribution

AI Steerability 360 is an open-source toolkit that unifies input, structural, state, and output steering methods for LLMs under a common pipeline.

Abstract

The AI Steerability 360 toolkit is an extensible, open-source Python library for steering LLMs. Steering abstractions are designed around four model control surfaces: input (modification of the prompt), structural (modification of the model's weights or architecture), state (modification of the model's activations and attentions), and output (modification of the decoding or generation process). Steering methods exert control on the model through a common interface, termed a steering pipeline, which additionally allows for the composition of multiple steering methods. Comprehensive evaluation and comparison of steering methods/pipelines is facilitated by use case classes (for defining tasks) and a benchmark class (for performance comparison on a given task). The functionality provided by the toolkit significantly lowers the barrier to developing and comprehensively evaluating steering methods. The toolkit is Hugging Face native and is released under an Apache 2.0 license at https://github.com/IBM/AISteer360.
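To make the four control surfaces and their composition concrete, here is a minimal, self-contained sketch. It is a toy illustration only and does not use or reproduce the AISteer360 API: `ToyModel`, `Control`, `SteeringPipeline`, `benchmark`, and the four example controls are all hypothetical names invented for this example, and the "model" is a two-token scorer rather than a real LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Scores = Dict[str, float]


@dataclass
class ToyModel:
    """Hypothetical stand-in for an LLM: scores two candidate tokens."""
    weights: Dict[str, float] = field(default_factory=lambda: {"yes": 0.0, "no": 0.5})
    hooks: List[Callable[[Scores], Scores]] = field(default_factory=list)

    def forward(self, prompt: str) -> Scores:
        words = prompt.lower().split()
        # Score = weight + number of times the token appears in the prompt.
        scores = {tok: w + words.count(tok) for tok, w in self.weights.items()}
        for hook in self.hooks:  # "activations" pass through registered hooks
            scores = hook(scores)
        return scores


class Control:
    """Hypothetical base control: each surface is a no-op unless overridden."""
    def prepare(self, model: ToyModel) -> None:
        pass

    def steer_input(self, prompt: str) -> str:
        return prompt

    def steer_output(self, scores: Scores) -> Scores:
        return scores


class PrefixInput(Control):  # input surface: modify the prompt
    def __init__(self, prefix: str):
        self.prefix = prefix

    def steer_input(self, prompt: str) -> str:
        return f"{self.prefix} {prompt}"


class WeightShift(Control):  # structural surface: modify the weights
    def __init__(self, token: str, delta: float):
        self.token, self.delta = token, delta

    def prepare(self, model: ToyModel) -> None:
        model.weights[self.token] += self.delta


class ScaleState(Control):  # state surface: modify intermediate scores
    def __init__(self, token: str, factor: float):
        self.token, self.factor = token, factor

    def prepare(self, model: ToyModel) -> None:
        model.hooks.append(lambda s: {**s, self.token: s[self.token] * self.factor})


class BanToken(Control):  # output surface: modify decoding
    def __init__(self, token: str):
        self.token = token

    def steer_output(self, scores: Scores) -> Scores:
        return {t: v for t, v in scores.items() if t != self.token}


class SteeringPipeline:
    """Applies any mix of controls to one model through a single interface."""
    def __init__(self, model: ToyModel, controls: List[Control]):
        self.model, self.controls = model, list(controls)
        for c in self.controls:  # structural/state controls attach once
            c.prepare(self.model)

    def generate(self, prompt: str) -> str:
        for c in self.controls:
            prompt = c.steer_input(prompt)
        scores = self.model.forward(prompt)
        for c in self.controls:
            scores = c.steer_output(scores)
        return max(scores, key=scores.get)  # greedy pick of the top token


def benchmark(pipelines: Dict[str, SteeringPipeline],
              prompts: List[str], target: str) -> Dict[str, float]:
    """Fraction of prompts on which each pipeline emits the target token."""
    return {name: sum(p.generate(q) == target for q in prompts) / len(prompts)
            for name, p in pipelines.items()}


baseline = SteeringPipeline(ToyModel(), [])
steered = SteeringPipeline(ToyModel(), [PrefixInput("yes")])
composed = SteeringPipeline(
    ToyModel(), [WeightShift("no", 2.0), ScaleState("yes", 2.0), BanToken("no")]
)
print(benchmark({"baseline": baseline, "steered": steered, "composed": composed},
                ["answer please", "no thanks"], target="yes"))
```

Each pipeline gets its own `ToyModel` because structural and state controls mutate the model they attach to. The design choice worth noting is that every control exposes the same three hooks, so the pipeline can compose controls from different surfaces without knowing which surface each one targets.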

Natural Language Processing | Open-Source Models & Weights | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
