Vision-Language Models Show Systematic Bias From Embedded Numbers in Images

Numeric anchors embedded in images systematically bias Vision-Language Model quality assessments, according to new research that tested six VLMs across five architectural families.

The study found anchor effects with effect sizes ranging from eta² = 0.18 to 0.77 (all p < 0.001), demonstrating that simple numbers overlaid on images can substantially skew model judgments. The bias proved 2.5 times larger than the effect of severe image quality degradation.
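For readers unfamiliar with eta², it is the share of total score variance explained by the grouping factor (here, the anchor condition): the between-group sum of squares divided by the total sum of squares. The sketch below computes it for a one-way design. The scores and condition names are made up for illustration and are not data from the study.

```python
def eta_squared(groups):
    """One-way eta squared: SS_between / SS_total."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variation of every score around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Variation of each group mean around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Hypothetical quality ratings for the same images under three conditions.
no_anchor = [4.0, 5.0, 6.0, 7.0]
low_anchor = [3.0, 4.0, 5.0, 6.0]    # a low number overlaid on the image
high_anchor = [5.0, 6.0, 7.0, 8.0]   # a high number overlaid on the image

effect = eta_squared([no_anchor, low_anchor, high_anchor])
print(f"eta squared = {effect:.3f}")
```

In this toy case the anchor condition accounts for roughly a third of the variance in ratings, which falls inside the range the study reports.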

Researchers used layer-wise probing, which fits simple predictors to each layer's internal representations, to identify where this bias emerges within model architectures. They discovered a consistent pattern: the layers at which anchor classification saturates (L12-L34) perform poorly at quality prediction, while the best layers for quality assessment sit much deeper in the network (R² = 0.69-0.91).
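As a rough illustration of what layer-wise probing involves, the sketch below fits a least-squares linear probe to each layer's hidden states and compares held-out R² across layers. Everything here is an assumption for illustration: the hidden states are synthetic, the per-layer signal strengths are invented, and the study's actual probe design is not described in this article.

```python
import numpy as np


def probe_r2(hidden, target, n_train):
    """Fit a linear probe on the first n_train rows; return held-out R^2."""
    # Append a constant column so the probe can learn an intercept.
    X = np.hstack([hidden, np.ones((hidden.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X[:n_train], target[:n_train], rcond=None)
    held_out = target[n_train:]
    resid = held_out - X[n_train:] @ w
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((held_out - held_out.mean()) ** 2))
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
n_samples, dim = 300, 8
quality = rng.normal(size=n_samples)  # synthetic stand-in for quality scores

# Hypothetical signal strengths: deeper layers encode quality more strongly.
signal_by_layer = [0.0, 0.3, 1.0, 3.0]
direction = np.ones(dim)

layer_r2 = []
for strength in signal_by_layer:
    # Synthetic hidden states: a quality-aligned component plus unit noise.
    noise = rng.normal(size=(n_samples, dim))
    hidden = strength * np.outer(quality, direction) + noise
    layer_r2.append(probe_r2(hidden, quality, n_train=200))

best_layer = int(np.argmax(layer_r2))
print([round(r, 2) for r in layer_r2], "best layer index:", best_layer)
```

With real models, `hidden` would be replaced by activations extracted from each layer, and a second probe trained to classify the anchor value would show where that information saturates.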

Architecture-Dependent Integration Patterns

The fusion analysis revealed striking differences between model families. Two models showed instant fusion of visual and textual information at layers L1-L2, while three others demonstrated partial or no fusion at these early stages.

This architectural variation explains why some VLMs prove more susceptible to anchoring bias than others. Models with early fusion appear particularly vulnerable to numeric interference in visual quality tasks.

The findings establish what researchers call "a causal account of visual anchoring bias," directly linking behavioral susceptibility to internal representation dynamics. The bias cannot be explained by visual changes alone, as the numeric overlays had minimal impact on actual image quality.

The research has immediate implications for VLM deployment in quality assessment applications. Current models may provide unreliable judgments when images contain embedded numbers, ratings, or scores.

Because the study tested six VLMs from five architectural families, its findings are less likely to be artifacts of a single model implementation. The layer-wise analysis gives model developers a starting point for mitigating anchoring effects in future architectures.
