Experiment Health

An automatic checklist on every results page that tells you whether your numbers are safe to trust yet, and what to fix if they aren't.

Think of it as a friend looking over your shoulder before you call a winner.

On every test’s results page you’ll find a card labelled Experiment health. Before you decide a winner, it quietly checks a handful of things that trip people up — like whether you’ve collected enough visitors and whether your traffic split looks right — and gives each one a simple status. It only looks and reports: it never pauses, stops, or edits your test.

The card starts collapsed, showing a one-line headline like “Looks clean — safe to read” or “A few things worth a look.” Click it to open the full list, then click any single check to see why it matters, what your result means right now, and what to do about it. The headline always reflects the most serious thing the card found.

What the statuses mean

All clear

This check looks healthy. Nothing to do.

Warning

Worth a look before you trust the result.

Issue

Stop — something is likely wrong. Fix it before deciding.

Notice

Just information, not a problem.
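
If you want to picture how the headline picks "the most serious thing the card found," here is a minimal sketch. The ordering of the four statuses (with Notice sitting between All clear and Warning) and the names `Status` and `most_serious` are assumptions made for illustration, not Otter A/B's actual code.

```python
from enum import IntEnum


class Status(IntEnum):
    """The four statuses, ranked by seriousness (assumed ordering)."""
    ALL_CLEAR = 0
    NOTICE = 1   # information only, so ranked just above "all clear"
    WARNING = 2
    ISSUE = 3


def most_serious(statuses):
    """Return the most serious status, or ALL_CLEAR when no checks ran."""
    return max(statuses, default=Status.ALL_CLEAR)


# One warning among otherwise healthy checks decides the headline.
print(most_serious([Status.ALL_CLEAR, Status.WARNING, Status.NOTICE]).name)
```

A single Issue would outrank any number of Warnings under this ordering, which matches the idea that the headline never hides the worst finding.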

The checks it runs

Sample size

Have you collected enough visitors yet?

Compares the visitors you’ve gathered against the number your test was planned to need, and shows roughly how many days are left at your current traffic. Reaching the target is what keeps an exciting-looking result from quietly vanishing once more people arrive. If your score crosses your confidence threshold before you reach the target, the check tells you, so you can choose to wait or ship knowingly.
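
The "days left" estimate is simple arithmetic. The sketch below shows one plausible way to compute it, assuming the estimate is just the remaining visitors divided by your recent daily traffic; the function name and parameters are hypothetical.

```python
import math


def sample_size_progress(visitors_so_far, planned_visitors, visitors_per_day):
    """Return (fraction of the target reached, estimated whole days left).

    Days left is None when there is no traffic to base an estimate on.
    """
    remaining = max(planned_visitors - visitors_so_far, 0)
    fraction = min(visitors_so_far / planned_visitors, 1.0)
    if remaining == 0:
        return fraction, 0
    if visitors_per_day <= 0:
        return fraction, None
    # Round up: a partial day still has to be waited out.
    return fraction, math.ceil(remaining / visitors_per_day)


# 6,000 of a planned 10,000 visitors, arriving at about 500 a day.
print(sample_size_progress(6000, 10000, 500))  # (0.6, 8)
```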

Traffic split

Did each version get its fair share of visitors?

You chose how to split traffic when you set the test up — say 50/50. This counts what actually happened. If the real split is far from what you asked for, visitors aren’t being shared out correctly, which bends every other number on the page. Below about 200 visitors it waits, because a wobbly split that early is just normal randomness.
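
A common way to test whether a split is "far from what you asked for" is a chi-square goodness-of-fit test. The sketch below applies it to a two-version test. Only the 200-visitor minimum comes from this page; the choice of test, the p-value cutoff, and the names are assumptions, and Otter A/B may use a different method.

```python
import math

MIN_VISITORS = 200      # from the text: below about 200 visitors the check waits
P_VALUE_CUTOFF = 0.001  # assumed cutoff, not a documented Otter A/B value


def split_check(count_a, count_b, planned_share_a=0.5):
    """Compare an observed two-way split against the planned one.

    planned_share_a must be strictly between 0 and 1.
    Returns "waiting", "ok", or "mismatch".
    """
    total = count_a + count_b
    if total < MIN_VISITORS:
        return "waiting"
    expected_a = total * planned_share_a
    expected_b = total - expected_a
    chi_sq = ((count_a - expected_a) ** 2 / expected_a
              + (count_b - expected_b) ** 2 / expected_b)
    # Two versions give one degree of freedom, where the p-value has a
    # closed form: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return "mismatch" if p_value < P_VALUE_CUTOFF else "ok"


print(split_check(60, 50))      # too few visitors to judge
print(split_check(5000, 5050))  # a small gap that chance easily explains
print(split_check(5000, 5600))  # a gap far too large for a 50/50 split
```

A strict cutoff like this keeps the check from raising false alarms on ordinary wobble, while still catching a split that is clearly broken.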

Secondary goals

Is a winner quietly hurting your other goals?

A version can win your main goal while harming something else — more sign-ups but fewer purchases, for example. This watches your other goals and warns you if one of them drops meaningfully, so you can decide whether the main win is still worth it.
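
This page doesn't define what counts as a "meaningful" drop. As an illustration only, the sketch below flags a secondary goal when the variant's conversion rate is lower than the control's by more than chance would comfortably explain, using a one-sided two-proportion z-test. The test, the cutoff, and the function name are all assumptions.

```python
import math


def goal_dropped(control_conv, control_n, variant_conv, variant_n, z_cutoff=1.645):
    """Return True if the variant converts noticeably worse than control.

    Uses a one-sided two-proportion z-test; z_cutoff=1.645 corresponds to
    roughly a 5% false-alarm rate (an assumed setting).
    """
    p_control = control_conv / control_n
    p_variant = variant_conv / variant_n
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if se == 0:
        return False  # no conversions (or all conversions): nothing to compare
    z = (p_variant - p_control) / se
    return z < -z_cutoff


# Purchases fall from 10% to 8% on 5,000 visitors each: flagged.
print(goal_dropped(500, 5000, 400, 5000))
# Purchases fall from 10% to 9.9%: within normal noise, not flagged.
print(goal_dropped(500, 5000, 495, 5000))
```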

Mid-test changes

Was the test edited after it started?

Editing a variant, goal, or targeting after a test starts mixes together visitors who saw different things, which can spoil the comparison. This spots edits made during the test window and points you to the version history so you can see exactly what changed.
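
Conceptually, spotting mid-test edits means checking which changes fall inside the test window. Here is a minimal sketch under the assumption that each edit has a timestamp; the function and its parameters are hypothetical and do not describe Otter A/B's version-history API.

```python
from datetime import datetime


def edits_during_test(edit_times, started_at, ended_at=None):
    """Return the edit timestamps that fall inside the test window.

    Edits made at or before the start are setup, not mid-test changes.
    With ended_at=None the test is treated as still running.
    """
    return [
        t for t in edit_times
        if t > started_at and (ended_at is None or t <= ended_at)
    ]


start = datetime(2024, 3, 1, 9, 0)
edits = [
    datetime(2024, 2, 28, 16, 0),  # before launch: fine
    datetime(2024, 3, 4, 11, 30),  # during the test: flagged
]
print(edits_during_test(edits, start))
```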

Active segment filter

Are you looking at everyone, or just a slice?

Appears only when you’ve filtered the results to a slice of visitors (like mobile only, or one country). Slices are smaller and noisier, so a result that looks strong inside one is more likely to be a coincidence. Clear the filter to read the result for all your traffic before deciding.
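
To see why slices are noisier, it helps to look at how the uncertainty around a conversion rate grows as the visitor count shrinks. The sketch below uses the standard normal-approximation margin of error for a proportion; it illustrates the statistics in general, not how Otter A/B computes its score.

```python
import math


def margin_of_error(conversion_rate, visitors, z=1.96):
    """Approximate 95% margin of error for a conversion rate."""
    return z * math.sqrt(conversion_rate * (1 - conversion_rate) / visitors)


all_traffic = margin_of_error(0.05, 10_000)  # everyone in the test
one_slice = margin_of_error(0.05, 1_000)     # a slice one tenth the size

print(f"all traffic: +/- {all_traffic:.4f}")
print(f"one slice:   +/- {one_slice:.4f}")
```

With a tenth of the visitors, the margin of error is a little over three times wider (it grows with the square root of the reduction), so a difference that looks convincing inside a slice may sit well within the noise.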

Getting the most from it

Open it before you call a winner. A glance at the headline tells you whether to trust the page. A red Issue means the numbers can’t be trusted until you fix the cause. For a traffic-split problem, that usually means starting a fresh test once the cause is fixed, because the data already collected can’t be rescued.

A clean card isn’t a guarantee of a real winner. It means the common traps it checks for didn’t fire. You still need a score that’s reached your confidence threshold and a result that makes sense for your business. See Reading Results for how to read the score itself.

Let it do the worrying. The checks update by themselves as new visitors and conversions arrive, so the card always reflects the latest data — you don’t have to re-run anything.

Frequently asked questions

Quick answers to the questions teams ask most about this part of Otter A/B.