May 5, 2026 · arXiv:2605.04191

Heterogeneous Ordinal Structure Learning with Bayesian Nonparametric Complexity Discovery

AI Summary

This paper introduces a novel framework for learning heterogeneous ordinal structures in survey data, combining monotone Gaussian score embedding with Bayesian nonparametric complexity discovery to identify the number of underlying clusters. The approach uses a discovery-to-confirmation workflow, where a BNP stage calibrates archetype complexity before confirmatory refitting yields stable, interpretable cluster-specific DAGs. Applied to the 2024 Pew American Trends Panel AI attitudes survey, the method reduces holdout MSE by 25.8% compared to a single-graph baseline, demonstrating its effectiveness in capturing heterogeneous dependencies.

Key Contribution

Discovering distinct respondent clusters, each with its own dependency structure, reduces holdout MSE by 25.8% relative to a single-graph baseline that assumes all respondents share one dependency structure.

Abstract

Public attitudes toward artificial intelligence are heterogeneous, ordinally measured, and poorly captured by any single dependency graph. Existing ordinal structure learners assume a shared directed acyclic graph (DAG) across all respondents; recent heterogeneous ordinal graphical-model approaches focus on subgroup discovery rather than confirmatory cluster-specific DAG estimation; and latent profile analyses discard dependency structure entirely. We introduce a heterogeneous ordinal structure-learning framework combining monotone Gaussian score embedding, Bayesian nonparametric (BNP) complexity discovery via a truncated stick-breaking prior, and confirmatory fixed-K estimation with cluster-specific sparse DAG learning. The key methodological insight is a discovery-to-confirmation workflow: the nonparametric stage calibrates plausible archetype complexity, while inner-validated confirmatory refitting yields stable, interpretable structural estimates. On the 2024 Pew American Trends Panel AI attitudes survey, Wave 152 (W152; N = 4,788, 8 ordinal items), the confirmatory K*=5 model reduces holdout transformed-score mean squared error (MSE) by 25.8% over a single-graph baseline and by 4.6% over mixture-only clustering. A controlled tiered semi-synthetic benchmark calibrated to W152 structure validates recovery across difficulty regimes and transparently reveals failure modes under stress conditions.
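The abstract names two building blocks without spelling them out: a monotone Gaussian score embedding and a truncated stick-breaking prior. The sketch below shows one plausible reading of each, using only the Python standard library. It is an illustration under assumptions, not the paper's implementation: the embedding is assumed to be a normal-scores transform of mid-cumulative proportions, the stick-breaking fractions are assumed to be Beta(1, alpha) draws, and the function names `gaussian_scores` and `truncated_stick_breaking` are hypothetical.

```python
import random
from collections import Counter
from statistics import NormalDist


def gaussian_scores(responses):
    """Map ordinal categories to Gaussian scores while preserving their order.

    Assumption: each category receives the standard-normal quantile of its
    mid-cumulative proportion, so a higher category always gets a higher score.
    """
    counts = Counter(responses)
    n = len(responses)
    scores, below = {}, 0
    for level in sorted(counts):
        # Midpoint of the category's cumulative-probability interval;
        # it lies strictly inside (0, 1), so inv_cdf is always defined.
        mid = (below + counts[level] / 2) / n
        scores[level] = NormalDist().inv_cdf(mid)
        below += counts[level]
    return scores


def truncated_stick_breaking(alpha, truncation, rng):
    """Draw mixture weights from a stick-breaking prior with a fixed truncation."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # the last component absorbs the leftover mass
    return weights


# Toy usage with made-up responses on a 5-point scale (not survey data).
item_scores = gaussian_scores([1, 1, 2, 3, 3, 3, 4, 5])
prior_weights = truncated_stick_breaking(alpha=1.0, truncation=10, rng=random.Random(0))
```

In a sketch like this, a smaller `alpha` concentrates the prior mass on fewer components, which is the sense in which a nonparametric stage can suggest a plausible number of clusters before a fixed-K model is refit.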

Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 26

Year: 2026

Venue: N/A
