When an experiment starts, metric cards are calculated every 15 minutes. You can see the last time the metrics for an experiment were calculated under Summary of key and organizational metrics on the Metrics Impact tab.
The time between updates scales with how long the version has been running. We do this because, for example, on day 5 the traffic from the previous 15 minutes is unlikely to show a material difference. The interval between calculations increases incrementally over the duration of a version, typically by an extra 1-2 hours per day. Depending on timing on the back end, if a split has been running for more than 12 days it can be up to 48 hours between calculations, and up to 72 hours once the version of a split has been running for more than 48 days.
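As a rough illustration only, the sketch below approximates how the gap between recalculations grows using the figures above. It is purely hypothetical (it assumes a growth rate of 1.5 hours per day, in the middle of the 1-2 hour range) and is not Split's actual scheduling logic; exact timing depends on the back end.

```python
from datetime import timedelta

def approximate_recalculation_interval(days_running: float) -> timedelta:
    """Illustrative approximation of the time between metric card
    recalculations, based on the figures in this article. Not Split's
    actual scheduling logic; exact timing depends on the back end."""
    if days_running < 1:
        # Shortly after a version starts, cards are recalculated every 15 minutes.
        return timedelta(minutes=15)

    # Assume the interval grows by roughly 1.5 hours per day of runtime
    # (the article says an extra 1-2 hours per day).
    interval = timedelta(hours=1.5 * days_running)

    # Upper bounds from the article: up to 48 hours after 12 days,
    # and up to 72 hours after 48 days.
    cap = timedelta(hours=72) if days_running > 48 else timedelta(hours=48)
    return min(interval, cap)

for day in (0.5, 3, 10, 20, 60):
    print(day, approximate_recalculation_interval(day))
```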
We have it on our roadmap to provide an ETA for the next calculation for any experiment you are running. In the meantime, you can contact Split support to request that a calculation be run manually.
In addition, as a best practice, you should establish experimental review periods. Drawing conclusions about the impact on your metrics only during set experimental review periods minimizes the chance of errors and allows you to account for seasonality in your data.
For example, you may see a spike in data on certain days of the week. It would be against best practice to make product decisions based only on the data observed on those days, or without including those days at all. Similarly, a key event, such as arriving for a restaurant reservation, may not happen until a few weeks after the impression.
We will always show your current metrics impact; the review period has no direct effect on the metrics, neither on the ingestion of events nor on the recalculation of metrics. It is there as a guideline, cautioning you against making a decision too early or without accounting for seasonality, even if a card shows statistically significant results.
Experimental review periods are a common practice for sophisticated growth and experimentation teams. For some teams, no decisions can be made until the experiment has run for a set number of days.
Because it is an organization-wide setting, whatever value you set may or may not be applicable to a specific experiment. That is why the warning says the results MAY not be conclusive. In other words, if the review period is set at 14 days, in many cases the results on day 15 are probably as accurate as the results on day 14 (or 16 or 17, and so on). The exception would be if there is a very specific cadence to the results.
This article provides more information on reviewing your metric cards.