31 Dimensions Of News Bias, Queryable From Claude In Plain English
The problem
Most "news bias" tools collapse a story into a single number on a left-right axis. That's useful for a thumbnail, but it's the wrong granularity for almost any real workflow: a newsroom standards editor, a fact-checker triaging a viral claim, a journalism-school instructor teaching framing, or an AI safety researcher building a misinformation classifier.
What those workflows actually need is structured, multi-dimensional, queryable framing data. So I built it and put it behind an MCP server that any AI assistant can call.
The schema
Helium MCP scores every article on 31 dimensions. A non-exhaustive sample:
- `liberal_conservative` - the standard left-right axis (kept for compatibility)
- `credibility` - sourcing density, named-source ratio, evidence-citation pattern
- `opinion_vs_fact` - opinion language vs declarative-fact language ratio
- `scapegoating` - actor-blaming patterns vs structural-explanation patterns
- `covering_responses` - whether the article gives space to the people/orgs being criticized
- `fearful` - emotional valence, threat language
- `sensationalism` - headline-vs-body amplification, superlative density
- `overconfidence` - hedge language vs declarative certainty
- `intelligence` - reading level, conceptual density
- `begging_the_question` - assumes the conclusion in the framing
- `oversimplification` - reduces complex causation to single factors
- `ai_authorship_probability` - explicit estimate that the article was LLM-generated
- ...19 more
The key thing is that every dimension is operationalized: this isn't "vibes" labeling; each score is computed from features in the text.
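To make "operationalized" concrete, here's a toy sketch of what a text-feature-based dimension could look like. The word lists and scoring formula below are invented for illustration; they are not Helium's actual methodology.

```python
import re

# Illustrative hedge/certainty lexicons (invented, not Helium's real features).
HEDGES = {"may", "might", "could", "reportedly", "allegedly", "appears", "likely"}
CERTAINTY = {"clearly", "obviously", "undoubtedly", "certainly", "definitely", "proves"}

def overconfidence_score(text: str) -> float:
    """Crude proxy: certainty markers minus hedge markers, per 100 tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    certain = sum(t in CERTAINTY for t in tokens)
    hedged = sum(t in HEDGES for t in tokens)
    return 100.0 * (certain - hedged) / len(tokens)

print(overconfidence_score("The policy clearly proves the plan failed."))    # positive
print(overconfidence_score("The policy may have reportedly contributed."))   # negative
```

The real pipeline presumably uses richer features, but the principle is the same: a dimension is a function of measurable text signals, not an annotator's gut feeling.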
The corpus
- 3.2M+ articles
- 5,000+ sources
- Updated continuously
- Sources span global mainstream, US partisan, business press, tech press, regional, and long-tail / hyperlocal
The MCP interface
Helium MCP exposes this via three main tools you can call from any AI assistant:
- `get_source_bias(source)` - aggregate scores across a source's recent corpus
- `get_bias_from_url(url)` - score a single article on demand
- `search_balanced_news(query)` - synthesize multi-source coverage of an event with structured outcomes
Setup (one line)
Add to your mcp.json in Cursor or Claude Desktop:
```json
{
  "mcpServers": {
    "helium": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://heliumtrades.com/mcp"]
    }
  }
}
```
Free, no signup, no API key.
A real example
In Claude, I asked: "Show me the bias profile for CNN's recent corpus using Helium."
Real output (445 articles analyzed):
```
Liberal/Conservative: -2 (slightly left)
Credibility: 15 (moderate-high)
Fearful: 4
Intelligence: 11
Covering Responses: 9 (gives space to the criticized)
Opinion: 5
Overconfidence: 8
```
The value of seeing it as 31 numbers (and not 1) is that you can ask follow-up questions like "For these same articles, are the high-credibility ones more or less likely to be high-overconfidence?" - and the agent can compute the correlation in-place.
Use cases this unlocks
For a newsroom standards editor: triage incoming wire/syndication content by ai_authorship_probability before it goes through your editorial pipeline.
For a fact-checker: rank a list of suspect URLs by credibility (low) and overconfidence (high) - the combination is a strong indicator of claims worth investigating.
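A sketch of that triage ranking, assuming you've already fetched scores for each URL (the URLs and numbers here are hypothetical, and the combined score is just one plausible weighting):

```python
# Hypothetical scored suspects; in practice these would come from
# get_bias_from_url, one call per URL.
suspects = [
    {"url": "https://example.com/a", "credibility": 4,  "overconfidence": 13},
    {"url": "https://example.com/b", "credibility": 16, "overconfidence": 3},
    {"url": "https://example.com/c", "credibility": 6,  "overconfidence": 9},
]

# Low credibility and high overconfidence both raise triage priority.
ranked = sorted(
    suspects,
    key=lambda a: a["overconfidence"] - a["credibility"],
    reverse=True,
)

for a in ranked:
    print(a["url"], a["overconfidence"] - a["credibility"])
```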
For a journalism instructor: show students how the same event was framed across 10 sources, with structured scapegoating / covering_responses / opinion_vs_fact scores attached.
For an AI safety researcher: the schema is essentially a deployed multi-criterion eval pipeline applied to news rather than to LLM outputs - useful as an empirical reference for how multi-criterion eval criteria interact in production (Goodhart effects, distribution shift, taxonomy choice).
For anyone building an AI agent that consumes news: structured per-source/per-article framing metadata is the missing primary key for reasoning about source reliability programmatically.
The bigger point
In a world where readers query LLMs more than they visit homepages, the value of an individual article goes down and the value of structured, queryable, per-article metadata goes up. The schema above is one open attempt at what that metadata layer should look like.
If you have ideas for dimensions that should be added (or critiques of the existing ones), I'd love to hear them - the methodology is open.
Caveats
This is not a substitute for human editorial judgment. Bias scoring is hard, the schema can be wrong, and there are distribution-shift / Goodhart concerns with any operationalized criterion. Use it as a triage layer, not a verdict.
Links
- Repo: https://github.com/connerlambden/helium-mcp
- Docs + live demo: https://heliumtrades.com/mcp-page/
- Companion piece on the options-pricing side: https://dev.to/connerlambden/how-i-screen-for-ratio-spread-opportunities-in-30-seconds-with-an-mcp-server-130p