If some differences are too small to be detected in a blind A/B test, how small is too small? And who decides?