The experiment health check provides an at-a-glance view of the sample data entering your experiment. When your health check is green (showing a green heart icon with a checkmark), you can be confident that your metric calculations are trustworthy and reflective of the true value of your metrics for each experiment variation.
The Health Check is found at the top left of an Experiment page, and shows how experiment data conforms to the following criteria:
- Seasonality effect completeness (for sequential testing): A healthy experiment has run for its defined time duration, so the likelihood of a seasonality effect is small.
- Experimental review period completeness (for fixed horizon testing): Fixed horizon testing is designed to identify subtle effects that are detectable only over a longer period of time, so peeking is strongly discouraged until the experimental period is complete (at which time the value will be healthy, showing the green heart symbol).
- Sample ratio: In a healthy experiment, the actual percentage of samples in each experiment variation matches the targeting percentages. Sample ratio agreement indicates that the experiment's variation assignments are truly random and not affected by unanticipated factors.
- Number of sample exclusions: When the experiment is healthy, few samples are excluded from the experiment calculations. (Exclusions can happen when targeting rules change and keys are reallocated. In this case, some keys are excluded to avoid vacillating between experiment variations, because they would not be reflective of a single variation. A high number of exclusions can affect the experiment and indicates that metric results should be interpreted with caution.)
Health check details
The health check details can help you validate your interpretation of experimental statistics. Details also help you understand the severity of any failed check and lead you to the next steps to troubleshoot.
When you click the See details link on the Health Check popup, a slide-out modal shows your seasonality effect or experimental review period, sample ratio, and sample exclusion details. The possible values for these criteria and additional useful references are described in the sections below.
Seasonality effect completeness (for sequential testing)
Possible values:
- Seasonality effect complete
- Seasonality effect incomplete
- Experimental review period not started
Importance of seasonality effect completeness:
While sequential testing is designed to accrue results incrementally, seasonality patterns (such as weekend or holiday spikes or dips in user activity, daily network bandwidth changes, etc.) can affect data. The seasonality effect completeness result tells you if your experiment time duration is complete.
Troubleshooting steps:
The seasonality effect will be complete when the experiment has run to the experiment end date. As a best practice, verify that your experiment duration spans one or more full seasonality cycles.
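As a rough illustration of that best practice, the sketch below checks whether a planned duration covers a whole number of cycles. The function name, the example dates, and the 7-day default (a weekly cycle, matching the weekend example above) are assumptions made for illustration and are not part of the product.

```python
from datetime import date


def spans_full_cycles(start: date, end: date, cycle_days: int = 7) -> bool:
    """Return True if the period from start to end covers one or more whole cycles.

    cycle_days=7 assumes a weekly seasonality pattern (hypothetical default).
    """
    duration_days = (end - start).days
    return duration_days >= cycle_days and duration_days % cycle_days == 0


# A 14-day experiment covers two full weekly cycles.
print(spans_full_cycles(date(2024, 3, 4), date(2024, 3, 18)))
# A 10-day experiment ends partway through its second week.
print(spans_full_cycles(date(2024, 3, 4), date(2024, 3, 14)))
```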
Useful references:
- Reviewing metrics during an experiment
- Experimental review period
- Where did my statistical significance go?
- When are metrics automatically recalculated?
- Sample size and sensitivity calculators
Experimental review period completeness (for fixed horizon testing)
Possible values:
- Experimental review period complete
- Experimental review period incomplete
- Experimental review period not started
Importance of experimental review completeness:
Fixed horizon testing is designed to detect subtle but consistent effects of experiment variations. Allowing the experiment to run longer powers the experiment with more data, which reduces noise and contributes to the accuracy of metric results. For this reason, peeking early at the results of a fixed horizon test is strongly discouraged. The experimental review period completeness result tells you if your experiment time duration is complete.
Troubleshooting steps:
The experimental review period will be complete when the experiment has run to the experiment end date. As a best practice, verify that your experiment duration allows enough traffic (data entering your experiment) to raise the sensitivity (ability to find a minimal percentage impact) of your experiment to your desired level.
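To make the link between traffic and sensitivity concrete, here is a minimal sketch that uses the standard textbook approximation for a two-sided test comparing two conversion rates. It is not the formula used by the product's sample size and sensitivity calculators. The function name, the 5% significance level, and the 80% power are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def relative_mde(baseline_rate, samples_per_variation, alpha=0.05, power=0.80):
    """Approximate the smallest relative change detectable in a conversion rate.

    Textbook two-sample approximation; alpha and power are assumed defaults.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    standard_error = sqrt(
        2 * baseline_rate * (1 - baseline_rate) / samples_per_variation
    )
    return (z_alpha + z_power) * standard_error / baseline_rate


# More traffic per variation lowers the minimum detectable impact.
for n in (1_000, 10_000, 100_000):
    print(n, f"{relative_mde(0.10, n):.1%}")
```

Under these assumptions, the detectable impact shrinks with the square root of the sample size, so a tenfold increase in traffic reduces it by a factor of about 3.2.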
Useful references:
- Using fixed horizons in experimental review periods
- Review periods
- Sample size and sensitivity calculators
Sample ratio
Possible values:
- Sample ratio is valid
- Sample ratio mismatch detected
- Sample ratio not applicable
Importance of sample ratio:
The meaningful analysis of an experiment depends on the impartial distribution of samples between the variations. If the samples are not selected truly at random, then experimental results may be caused by the method used to select the samples and not the change being tested.
Troubleshooting steps:
Look for a design flaw in the experiment that might be preventing random sampling and causing a sample ratio mismatch.
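If you want to sanity-check a sample ratio outside the product, a chi-squared goodness-of-fit test is a common approach. The sketch below uses only the Python standard library. The function names, the example counts, and the 0.001 p-value threshold are assumptions made for illustration. This article does not state which test or threshold the health check uses.

```python
import math


def chi2_sf(x, df):
    """P(X >= x) for a chi-squared distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-y) * sum of y**i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: start from df = 1 and add one term per extra two degrees.
    total = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def sample_ratio_p_value(observed_counts, target_percentages):
    """p-value that observed counts are consistent with the targeting split."""
    if len(observed_counts) < 2 or len(observed_counts) != len(target_percentages):
        raise ValueError("need matching counts and targets for 2+ variations")
    total = sum(observed_counts)
    weight = sum(target_percentages)
    expected = [total * pct / weight for pct in target_percentages]
    statistic = sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(observed_counts, expected)
    )
    return chi2_sf(statistic, len(observed_counts) - 1)


THRESHOLD = 0.001  # assumed cutoff, for illustration only
p = sample_ratio_p_value([5300, 4700], [50, 50])
print("mismatch" if p < THRESHOLD else "valid", p)
```

A very small p-value means the observed split is unlikely under the configured targeting percentages, which is the signal to start looking for a design flaw.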
Useful references:
- Sample ratio check
- How can I troubleshoot a Sample Ratio Mismatch in my feature flag?
- Sample ratio mismatch calculator
Number of sample exclusions
Possible values:
- No exclusions made
- <2% of sample excluded
- n% of sample excluded
- Exclusions not applicable
Importance of the sample exclusions percentage:
High sample exclusions in an experiment can introduce bias and reduce generalizability, especially if the excluded participants differ significantly from those included. For an experiment with a feature flag as the assignment source, a key is excluded if it has been reassigned to a different treatment more than once (for example, a key that has been assigned three treatments is excluded).
Troubleshooting steps:
Examine the sample data that has come into the experiment, and consider why keys may have been reassigned treatments. To reduce the percentage of exclusions, you can introduce new data into the experiment (for example, by increasing feature flag traffic exposure) or redesign and restart the experiment with a new assignment source.
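When examining exported sample data, one way to see which keys were reassigned is to count treatment changes per key. The sketch below is one reading of the exclusion rule described above (more than one reassignment), not the product's attribution logic. The function name and the input shape, a chronological list of (key, treatment) pairs, are hypothetical.

```python
from collections import defaultdict


def exclusion_summary(assignments):
    """Return (excluded_keys, percent_excluded).

    assignments: iterable of (key, treatment) pairs in chronological order.
    A key is treated as excluded when its treatment changed more than once.
    """
    history = defaultdict(list)
    for key, treatment in assignments:
        # Record a treatment only when it differs from the previous one.
        if not history[key] or history[key][-1] != treatment:
            history[key].append(treatment)
    excluded = {key for key, seen in history.items() if len(seen) - 1 > 1}
    percent = 100.0 * len(excluded) / len(history) if history else 0.0
    return excluded, percent


events = [
    ("u1", "on"), ("u1", "on"),                # never reassigned
    ("u2", "on"), ("u2", "off"),               # reassigned once
    ("u3", "on"), ("u3", "off"), ("u3", "on"), # reassigned twice
    ("u4", "off"),
]
excluded, percent = exclusion_summary(events)
print(excluded, percent)
```

Keys that appear in the excluded set are a starting point for asking which targeting rule changes caused the reassignment.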
Useful references:
- Attribution and exclusion (see Exclusions)
- Reallocate
- Export data