The paper introduces the Flourishing AI Benchmark (FAI Benchmark) to evaluate AI alignment with human flourishing across seven dimensions, moving beyond traditional capability or harm-prevention metrics. It uses 1,229 objective and subjective questions, evaluated by specialized LLM judges and scored using a geometric mean to ensure balanced performance across dimensions. Empirical evaluation of 28 leading language models reveals that while some models show promise (up to 72/100), none achieve acceptable alignment across all flourishing dimensions, particularly in areas like Faith and Spirituality.
Current LLMs fall far short of supporting holistic human well-being: even the highest-scoring models reach only 72/100 on the new Flourishing AI Benchmark, with notable weakness in areas like Faith and Spirituality.
This paper introduces the Flourishing AI Benchmark (FAI Benchmark), a novel evaluation framework that assesses AI alignment with human flourishing across seven dimensions: Character and Virtue, Close Social Relationships, Happiness and Life Satisfaction, Meaning and Purpose, Mental and Physical Health, Financial and Material Stability, and Faith and Spirituality. Unlike traditional benchmarks that focus on technical capabilities or harm prevention, the FAI Benchmark measures how effectively models contribute to a person's flourishing across these dimensions. It evaluates how well LLM-based AI systems align with current research models of holistic human well-being, using a methodology that incorporates 1,229 objective and subjective questions. Using specialized judge Large Language Models (LLMs) and cross-dimensional evaluation, the FAI Benchmark employs geometric mean scoring to ensure balanced performance across all flourishing dimensions. Initial testing of 28 leading language models reveals that while some models approach holistic alignment (with the highest-scoring models achieving 72/100), none is acceptably aligned across all dimensions, particularly in Faith and Spirituality, Character and Virtue, and Meaning and Purpose. This research establishes a framework for developing AI systems that actively support human flourishing rather than merely avoiding harm, offering significant implications for AI development, ethics, and evaluation.
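To illustrate why geometric mean scoring favors balanced performance, here is a minimal sketch. It assumes an unweighted geometric mean over seven per-dimension scores on a 0 to 100 scale. The abstract does not give the exact formula, any weighting, or how judge outputs are aggregated per dimension, so the function names and the example scores below are hypothetical and are not results from the paper.

```python
import math

# The seven flourishing dimensions named in the abstract.
DIMENSIONS = [
    "Character and Virtue",
    "Close Social Relationships",
    "Happiness and Life Satisfaction",
    "Meaning and Purpose",
    "Mental and Physical Health",
    "Financial and Material Stability",
    "Faith and Spirituality",
]


def arithmetic_mean(scores):
    """Plain average of the per-dimension scores."""
    return sum(scores.values()) / len(scores)


def geometric_mean(scores):
    """Unweighted geometric mean of the per-dimension scores.

    Assumption: scores are on a 0-100 scale. A zero in any dimension
    makes the overall score zero.
    """
    values = list(scores.values())
    if any(v <= 0 for v in values):
        return 0.0
    # Average in log space to avoid overflow from a long product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical scores, for illustration only.
balanced = {name: 70.0 for name in DIMENSIONS}
lopsided = {name: 80.0 for name in DIMENSIONS}
lopsided["Faith and Spirituality"] = 10.0

for label, scores in [("balanced", balanced), ("lopsided", lopsided)]:
    print(
        f"{label}: arithmetic={arithmetic_mean(scores):.1f}, "
        f"geometric={geometric_mean(scores):.1f}"
    )
```

Both hypothetical profiles have the same arithmetic mean, but the lopsided one gets a noticeably lower geometric mean because its single weak dimension drags the overall score down. Under this assumption, a model cannot offset a poor result in one dimension with strong results in the others.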