How ClankerStatus Works

Methodology, sources, and the fail-honest invariants behind every score.

The problem with AI status pages

Every major AI provider publishes a status page. The problem: those pages are siloed, the providers behind them are incentivized to show green, and each one measures different things at different granularities. When your application breaks at 2 AM, you need one honest answer to a simple question: is this my code, or is the provider down?

ClankerStatus aggregates multiple monitoring sources per provider into a single confidence-weighted health score. When the sources agree, you get a clear number. When they disagree, you get an honest “Conflicting Signals” instead of a misleading average. When data goes stale, you get “Unknown” instead of a cached green.

How we monitor

Three source tiers, fetched every 15 minutes, in priority order:

01
Official status pages
The provider’s own Statuspage v2 JSON feed: Anthropic (status.claude.com), OpenAI (status.openai.com), Groq (groqstatus.com), Fireworks AI, and Together AI. xAI publishes an RSS feed (status.x.ai) instead. This is our highest-trust source because it reflects what the provider has declared. We parse component-level granularity where available (API, Chat, Realtime, Console) so a degraded console doesn’t drag down the API score.
02
Synthetic availability probes
Token-free GET requests to each provider’s /v1/models endpoint — a lightweight liveness signal that requires no API spend. A 2xx response means the API surface is reachable; anything else is recorded as a degraded or unavailable signal at the component level.
03
Community reports
Verified users can submit a status report. Reports are trust-weighted (an account with a history of accurate reports counts more) and protected against brigading (one vote per user per component per hour, with a minimum quorum before a report affects the score). Reports can only adjust the score within bounds; they can’t fabricate a major outage on their own.
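As a rough illustration of the first two tiers, the sketch below maps a Statuspage v2 components payload to per-component severities and classifies a probe's HTTP status. The numeric severity scale, the function names, and the split between "degraded" and "unavailable" for non-2xx responses are assumptions made for this example; the text above only states that a 2xx response means the API surface is reachable.

```python
from typing import Optional

# Assumed severity scale: the status strings are the ones Statuspage v2 uses,
# but the numeric ranking is illustrative, not ClankerStatus's actual scale.
SEVERITY = {
    "operational": 0,
    "degraded_performance": 1,
    "partial_outage": 2,
    "major_outage": 3,
}


def parse_statuspage_components(payload: dict) -> dict[str, Optional[int]]:
    """Map each component name in a components payload to a severity.

    Statuses outside SEVERITY (for example "under_maintenance") map to None,
    so they are never silently treated as operational.
    """
    return {
        c["name"]: SEVERITY.get(c.get("status"))
        for c in payload.get("components", [])
    }


def classify_probe(http_status: Optional[int]) -> str:
    """Classify one liveness probe result.

    Only "2xx means reachable" comes from the text; treating 5xx and
    timeouts as unavailable and everything else as degraded is a guess.
    """
    if http_status is None:  # timeout or connection failure
        return "unavailable"
    if 200 <= http_status < 300:
        return "operational"
    if http_status >= 500:
        return "unavailable"
    return "degraded"
```

Keeping the parser separate per component is what lets a console incident stay out of the API component's signal.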

The scoring engine

The score is computed by deterministic code — no machine learning, no model judgment for the core pipeline. Each observation is a numeric signal; plain arithmetic produces the result. The algorithm has four properties that matter:

Freshness decay: Every source has a soft TTL (the point at which its weight starts dropping) and a hard TTL (the point at which it expires entirely). A source fetched 10 minutes ago counts fully. One fetched 28 minutes ago is down-weighted. When all sources for a component expire, the status becomes Unknown — never a cached green. Staleness is always visible in the UI.
Conflict detection: If active sources disagree by two or more severity levels — say, one reports Operational and another reports Major Outage — the result is Conflicting Signals, not an average. Averaging a real disagreement produces a number that’s confidently wrong. Surfacing the conflict is more honest and more useful.
Confidence: Confidence = coverage × agreement × freshness. A score is emitted only when confidence clears a minimum threshold. Below that threshold the score field is null and the UI renders a dash, never a fabricated number. Pro subscribers see the confidence band alongside the score.
Impact-weighted rollup: Component scores roll up to a provider headline using impact weights, so the critical API surface counts more than the dashboard or console. The headline status is the worst of the provider’s critical components: a degraded API is a degraded provider even if every other surface is green.
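The four properties above can be sketched in a few dozen lines of plain arithmetic. Everything numeric here is hypothetical: the TTL values, the linear decay between soft and hard TTL, the confidence threshold, the expected source count, and the 0-to-100 score mapping are placeholders chosen so the example runs, not published ClankerStatus parameters. The headline function shows only the worst-of rule, not the impact weights.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    severity: int   # 0 = Operational ... 3 = Major Outage (assumed scale)
    age_min: float  # minutes since this source was fetched


# Hypothetical tuning values; the real ones are not stated in the text.
SOFT_TTL_MIN = 15.0
HARD_TTL_MIN = 30.0
MIN_CONFIDENCE = 0.5
EXPECTED_SOURCES = 3


def freshness(age_min: float) -> float:
    """Weight 1.0 up to the soft TTL, linear decay to 0.0 at the hard TTL."""
    if age_min <= SOFT_TTL_MIN:
        return 1.0
    if age_min >= HARD_TTL_MIN:
        return 0.0
    return (HARD_TTL_MIN - age_min) / (HARD_TTL_MIN - SOFT_TTL_MIN)


def evaluate(observations: list[Observation]) -> dict:
    """Score one component, or explain why no score is emitted."""
    weighted = [(o, freshness(o.age_min)) for o in observations]
    active = [(o, w) for o, w in weighted if w > 0.0]
    if not active:  # every source expired: Unknown, never a cached green
        return {"status": "unknown", "score": None, "confidence": 0.0}

    severities = [o.severity for o, _ in active]
    spread = max(severities) - min(severities)
    if spread >= 2:  # real disagreement: surface it instead of averaging
        return {"status": "conflicting_signals", "score": None, "confidence": 0.0}

    coverage = min(len(active) / EXPECTED_SOURCES, 1.0)
    agreement = 1.0 - spread / 3  # spread is 0 or 1 at this point
    total_weight = sum(w for _, w in active)
    confidence = coverage * agreement * (total_weight / len(active))
    if confidence < MIN_CONFIDENCE:  # omit the score rather than fabricate one
        return {"status": "low_confidence", "score": None, "confidence": confidence}

    mean_severity = sum(o.severity * w for o, w in active) / total_weight
    score = round(100 * (1 - mean_severity / 3))
    return {"status": "scored", "score": score, "confidence": confidence}


def provider_headline(critical_severities: list[int]) -> int:
    """Worst-of rollup over a provider's critical components (non-empty list)."""
    return max(critical_severities)
```

Under these placeholder values, a single fresh source yields a null score because coverage alone pulls confidence below the threshold, which mirrors the rule that low confidence omits the score.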

The fail-honest promise

A status product that goes confidently wrong is worse than no status product. Every design decision above is downstream of one invariant: ClankerStatus never shows a confident number when it shouldn’t.

Stale data shows a stale badge, not a cached score.
Conflicting sources show Conflicting Signals, not an average.
No data shows Unknown, not Operational.
Low confidence omits the score entirely rather than fabricating one.
Our own pipeline health is visible at /status — a silently degraded status product is the worst failure mode.

Frontier model overlay

Providers don’t publish per-model status, so ClankerStatus derives a per-model availability overlay from two token-free signals and shows it alongside the provider status — never as a substitute for it.

Routing-edge signal (Tier A): OpenRouter’s public /endpoints API reports each model’s routing status and 24h uptime. A model with a healthy routing status and positive uptime is marked operational at the edge; one with a down status or low uptime is marked degraded or unavailable.
Native catalog signal (Tier B): Each top-ranked model is checked against the provider’s own authenticated /v1/models catalog. A model absent from its provider’s own catalog is retired from the overlay — this caught a real case where a router ranked a model that the provider had removed.

The displayed status is the worse of the inherited provider surface status and the per-model overlay: per-model evidence can only downgrade the status, never raise it above the surface. A model showing Operational means its provider’s API surface is up and the model is reachable at the routing edge.
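The worse-of rule can be written as a one-line comparison over an ordered scale. The three-level scale, the function name, and the choice to fall back to the surface status when no per-model evidence exists are assumptions for illustration; retiring a model that is absent from the native catalog follows the Tier B description above.

```python
from typing import Optional

# Ordered best to worst; the labels are illustrative, not the product's enum.
ORDER = ["operational", "degraded", "unavailable"]


def displayed_status(
    surface: str,
    overlay: Optional[str],
    in_native_catalog: bool,
) -> Optional[str]:
    """Return the status to display for a model, or None if it is retired.

    surface: inherited provider surface status.
    overlay: per-model routing-edge status, or None when there is no evidence.
    in_native_catalog: whether the provider's own catalog still lists the model.
    """
    if not in_native_catalog:
        return None  # retired from the overlay
    if overlay is None:
        return surface  # assumption: no per-model evidence leaves the surface as-is
    # The worse of the two: the overlay can downgrade but never upgrade.
    return max(surface, overlay, key=ORDER.index)
```

For example, displayed_status("operational", "degraded", True) returns "degraded", while displayed_status("degraded", "operational", True) also returns "degraded" because a healthy routing edge cannot lift a model above its provider's surface.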

Providers we track

Each provider name links to its dedicated status page with current health and FAQ.