The meaningful analysis of an experiment is contingent upon the independent and identical distribution of samples between the treatments. If the samples are not selected truly at random, then any conclusions drawn may be attributable to the method by which the samples were selected and not the change being tested.
To detect a sampling bias in your randomization, ensure that the samples selected by the targeting rules engine match the requested distribution within a reasonable confidence interval.
If you design an experiment with equal percentages (the targeted ratio is 50% in treatment on and 50% in treatment off) and the current sample distribution deviates from the targeted ratio by more than chance alone would explain, the experiment may have an inherent bias, which renders the calculated impact and experimental results invalid.
The size of the deviation that chance alone can explain shrinks as your sample size increases.
As an illustrative example, in the case of a 50/50 rollout, the 95% confidence interval for 1,000 samples lies at 50±3.1%. Simply put, with 1,000 users, if there are more than 531 users in either treatment, there may be a sample ratio mismatch (SRM). This delta shrinks as the sample size increases. With 1,000,000 samples, a variation of more than ±0.098% in the sample distribution is cause for concern. As you can see, it is important to understand this potential bias and evaluate your sample distribution thoroughly to prevent invalid results.
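The figures above can be reproduced with a short calculation. The following is a minimal sketch that assumes a normal approximation to the binomial distribution and a z-value of 1.96 for the 95% level; the function name `srm_tolerance` is hypothetical and is not part of the Split platform.

```python
import math

def srm_tolerance(n_samples, target_share=0.5, z=1.96):
    """Half-width of the normal-approximation interval around one treatment's share.

    Assumption: samples are independent and each lands in the treatment
    with probability `target_share`.
    """
    return z * math.sqrt(target_share * (1 - target_share) / n_samples)

for n in (1_000, 1_000_000):
    half_width = srm_tolerance(n)
    print(f"n={n:>9,}: 50% ± {half_width * 100:.3f}%")
```

With 1,000 samples the half-width comes out at roughly 3.1 percentage points, and with 1,000,000 samples at roughly 0.098 percentage points, matching the example.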
Sample ratio check
The Split platform performs a sample ratio check with each calculation update to monitor for a significant deviation between the targeted and current sample ratios. This sample ratio check is located beneath the summary of key and organization metrics with other key information including duration and last updated.
The table below shows the results of the check and a quick overview of each.
|Result|Overview|
|---|---|
|Valid|The split has a valid sample ratio based on the treatments and targeting rule selected.|
|Mismatch|The split has a sample ratio mismatch based on the treatments and targeting rule you selected. Do not trust the impact shown below.|
|Not Applicable|Sample ratio calculation is not applicable as you have selected the any targeting rule.|
|Not Applicable|Sample ratio calculation is not applicable as you have selected the whitelisted targeting rule.|
|Not Applicable|Sample ratio calculation is not applicable as you have the whitelisted segment targeting rule selected.|
|Not Applicable|Sample ratio calculation is not applicable as you have no baseline selected.|
|Not Applicable|Sample ratio calculation is not applicable as no traffic is in at least one of the treatments.|
|Not Applicable|Sample ratio calculation is not available as there are no users in your samples.|
|Not Applicable|Sample ratio calculation is not available as the calculation has not yet run for this split.|
How do you check when doing multiple treatments?
When performing a sample ratio check in the Split platform, the current treatment pair ratio has to match the targeted treatment pair ratio. For example, say your targeted ratio is 25/25/50 across treatments A/B/C. If you are comparing A to B, their targeted percentages are 25 and 25, so the targeted treatment pair ratio, rescaled to just the two treatments being compared, is 50/50. The sample ratio check is conducted against the targeted 50/50 distribution across A and B.
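The rescaling described above can be sketched in a few lines. This is an illustration only: the helper `pair_ratio` and the dictionary layout are assumptions made for the example, not Split's implementation.

```python
def pair_ratio(targeted, a, b):
    """Rescale two treatments' targeted percentages so that they sum to 1."""
    total = targeted[a] + targeted[b]
    if total == 0:
        raise ValueError("neither treatment is targeted to receive traffic")
    return targeted[a] / total, targeted[b] / total

# Targeted ratio from the example: 25/25/50 across treatments A/B/C.
targeted = {"A": 25, "B": 25, "C": 50}

print(pair_ratio(targeted, "A", "B"))  # (0.5, 0.5)
print(pair_ratio(targeted, "A", "C"))  # one third versus two thirds
```

Comparing A to C under the same rollout would therefore be checked against a targeted pair ratio of roughly 33/67 rather than 50/50.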
What significance threshold does Split use for its sample ratio check?
When conducting its sample ratio check, Split compares the calculated p-value against a threshold of 0.001. This threshold was determined based on the constant and rigorous monitoring performed on the accuracy of our randomization algorithms, and to minimize the impact a false positive would have on the trust of experimental results.
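To make the threshold concrete, here is a minimal sketch of a two-treatment check against 0.001. The text does not say which statistical test Split uses, so this example assumes a chi-squared goodness-of-fit test with one degree of freedom; the names `srm_p_value` and `has_srm` are hypothetical.

```python
import math

SRM_THRESHOLD = 0.001  # significance threshold stated in the text

def srm_p_value(count_a, count_b, target_a=0.5):
    """Chi-squared goodness-of-fit p-value for two treatments (assumed test)."""
    total = count_a + count_b
    if total == 0 or not 0 < target_a < 1:
        raise ValueError("need at least one sample and a target share between 0 and 1")
    expected_a = total * target_a
    expected_b = total * (1 - target_a)
    chi2 = ((count_a - expected_a) ** 2 / expected_a
            + (count_b - expected_b) ** 2 / expected_b)
    # With one degree of freedom, the chi-squared tail probability
    # equals erfc(sqrt(chi2 / 2)), so no statistics library is needed.
    return math.erfc(math.sqrt(chi2 / 2))

def has_srm(count_a, count_b, target_a=0.5, threshold=SRM_THRESHOLD):
    """Flag a mismatch only when the p-value falls below the threshold."""
    return srm_p_value(count_a, count_b, target_a) < threshold

print(has_srm(531, 469))  # False: p is about 0.05, well above 0.001
print(has_srm(560, 440))  # True: p is below 0.001
```

Under these assumptions, a 531/469 split sits at the edge of a 95% interval but is not flagged at the stricter 0.001 threshold, which illustrates how the lower threshold reduces false positives.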