The Strong Law of Small Numbers (see also Wikipedia) says that “There aren’t enough small numbers to meet the many demands made of them.” It means that when you look at small numbers, it’s easy to see compelling patterns that turn out to be false when you look at larger numbers.
Using complexity theory, we can give a partial account of this phenomenon. The concept that we’ll use is length-conditional complexity: if is an
-bit number,
means the length of the shortest Turing machine that outputs
when given
as an input. In a previous post, I stated the following theorem:
Theorem: Suppose you have a computable property
of
-bit numbers
such that:
- For all
,
,
,
implies
.
- For all
,
, there are at least
![]()
-digit numbers
such that
holds.
Then there is a
such that for all
and
,
holds for every
of length
such that
.
Furthermore, the number of
‘s of length
such that
is at least
.
In this context, might be of the form “
is not close to some special set of numbers”, like “
is not close to a square number” or “
is not close to a prime”, where the meaning of “close” will depend on
and
. Or it could be that some statistic about
deviates from the expected value by some large amount (again, with the meaning of “large” depending on
and
).
Now, suppose we have a bag of computable properties that we’re interested in. Assume that we’re using the same
and
for each and that they are of similar complexity, meaning that the value of
in the above theorem is the same for each of them.
Then assuming that is small, given a random
-bit number
, it is likely (with probability at least
) that
satisfies
, and thus
should hold for all
without exception. If some
doesn’t hold, that means some assumption was violated, in particular that
doesn’t hold for at least
-bit numbers, which is good evidence that there’s some pattern or mathematical law here that you were unaware of (e.g., a large proportion of numbers are close to one in some special set). (Note that it is also possible that the assumption that was violated was that the complexity
of
was higher than expected. However, that
is tied pretty closely to the length of a program that implements
, so, assuming that you know how to compute
, it’s unlikely that it’s much more complex than you originally thought.)
On the other hand, if is large (say,
), then there may be no
‘s such that
. In that case, it may simply be that the properties
are independent for different values of
. In that case, it’s not reasonable to assume that if some
fails, then you
have discovered a possible new mathematical law, it may just be a coincidence instead.
Furthermore, as gets smaller, you will be forced to decrease
, making
larger, since if
, then each
must be trivial, since it must hold of at least
of the
numbers, and the only way to do that is to hold for all of them, making the properties trivial.
I certainly don’t claim this captures everything about the Strong Law of Small Numbers. But I like this account, because it gives a way to think about what “small” means: namely, it means small enough that you’ve chosen an such that
, where
is the complexity of the properties of numbers that you’re interested in.