# A Complexity-Theoretic Account of The Strong Law of Small Numbers

The Strong Law of Small Numbers (see also Wikipedia) says that “There aren’t enough small numbers to meet the many demands made of them.” It means that when you look at small numbers, it’s easy to see compelling patterns that turn out to be false when you look at larger numbers.

Using complexity theory, we can give a partial account of this phenomenon.  The concept that we’ll use is length-conditional complexity: if $x$ is an $n$-bit number, $C(x|n)$ means the length of the shortest Turing machine that outputs $x$ when given $n$ as an input.  In a previous post, I stated the following theorem:

Theorem: Suppose you have a computable property $P_{m,n}(x)$ of $n$-bit numbers $x$ such that:

1.  For all $m_1 < m_2$, $n$, $x$, $P_{m_1,n}(x)$ implies $P_{m_2,n}(x)$.
2. For all $m$, $n$, there are at least $2^n(1-2^{-m})$ $n$-bit numbers $x$ such that $P_{m,n}(x)$ holds.

Then there is a $c$ such that for all $m$ and $n$, $P_{m,n}(x)$ holds for every $x$ of length $n$ such that $C(x|n) > n + c - m$.

Furthermore, the number of $x$'s of length $n$ such that $C(x|n) > n + c - m$ is at least $2^n(1 - 2^{c - m})$.
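The counting argument behind this last bound is easy to check numerically: there are fewer than $2^{n-k}$ programs of length less than $n-k$, so at most that many $n$-bit strings can have $C(x|n) < n-k$. A minimal sketch, with $k$ playing the role of $m - c$:

```python
# Counting sketch for the incompressibility bound used in the theorem.
# There are at most 2^L - 1 binary programs of length < L, so at most
# that many n-bit strings x can have C(x|n) < L.  Taking L = n - k,
# at least 2^n - (2^(n-k) - 1) strings satisfy C(x|n) >= n - k.

def incompressible_lower_bound(n: int, k: int) -> int:
    """Lower bound on the number of n-bit strings with C(x|n) >= n - k."""
    total = 2 ** n
    short_programs = 2 ** (n - k) - 1  # programs of length < n - k
    return total - short_programs

# With k = m - c this recovers the 2^n(1 - 2^(c-m)) count from the theorem.
n, k = 20, 8
bound = incompressible_lower_bound(n, k)
print(f"at least {bound} of {2**n} strings, fraction >= {bound / 2**n:.6f}")
```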

In this context, $P_{m,n}(x)$ might be of the form “$x$ is not close to some special set of numbers”, like “$x$ is not close to a square number” or “$x$ is not close to a prime”, where the meaning of “close” depends on $m$ and $n$.  Or it could be that some statistic about $x$ deviates from the expected value by some large amount (again, with the meaning of “large” depending on $m$ and $n$).
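For small $n$, a property like “not close to a prime” can be checked exhaustively. The sketch below is only an illustration (the choices $n = 12$ and distance $d = 1$ are arbitrary, not derived from any particular $m$); it counts how many $n$-bit numbers are farther than $d$ from every prime:

```python
# Toy instance of such a property: P(x) = "x is not within distance d
# of a prime", evaluated over all n-bit numbers.  n = 12 and d = 1 are
# arbitrary illustrative choices.

def is_prime(p: int) -> bool:
    if p < 2:
        return False
    i = 2
    while i * i <= p:
        if p % i == 0:
            return False
        i += 1
    return True

def not_close_to_prime(x: int, d: int) -> bool:
    """True iff no prime lies within distance d of x."""
    return not any(is_prime(x + delta) for delta in range(-d, d + 1))

n, d = 12, 1
lo, hi = 2 ** (n - 1), 2 ** n          # the n-bit numbers
holding = sum(not_close_to_prime(x, d) for x in range(lo, hi))
print(f"P holds for {holding} of {hi - lo} {n}-bit numbers")
```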

Now, suppose we have a bag of computable properties $P^1_{m,n}, P^2_{m,n},\ldots$ that we’re interested in.  Assume that we’re using the same $m$ and $n$ for each and that they are of similar complexity, meaning that the value of $c$ in the above theorem is the same for each of them.

Then, assuming that $c - m$ is small, a random $n$-bit number $x$ is likely (with probability at least $1 - 2^{c-m}$) to satisfy $C(x|n) > n + c - m$, and thus $P^i_{m,n}(x)$ should hold for every $P^i$ without exception.  If some $P^i_{m,n}(x)$ doesn’t hold, then some assumption was violated; in particular, $P^i_{m,n}$ must fail for more than $2^{n-m}$ $n$-bit numbers, which is good evidence that there’s some pattern or mathematical law here that you were unaware of (e.g., a large proportion of numbers are close to one in some special set).  (Note that it is also possible that the violated assumption was that the complexity $c$ of $P^i_{m,n}$ is higher than expected.  However, that $c$ is tied pretty closely to the length of a program that implements $P$, so, assuming that you know how to compute $P$, it’s unlikely that it’s much more complex than you originally thought.)

On the other hand, if $c - m$ is large (say, $\geq 0$), then there may be no $x$'s such that $C(x|n) > n + c - m$.  In that case, it may simply be that the properties $P^i_{m,n}(x)$ are independent for different values of $i$, and it’s not reasonable to assume that a failing $P^i_{m,n}(x)$ signals a possible new mathematical law; it may just be a coincidence.

Furthermore, as $n$ gets smaller, you will be forced to decrease $m$, making $c - m$ larger.  For if $n < m$, then each $P^i_{m,n}$ must be trivial: it must hold for at least a $1-2^{-m}$ fraction of the $2^n$ numbers, and the only way to do that is to hold for all of them.
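The arithmetic behind this triviality claim is easy to verify: when $n < m$, the required count $2^n(1 - 2^{-m})$ exceeds $2^n - 1$, and since the number of satisfying $x$'s is an integer, it must be all $2^n$ of them. A quick sketch using exact rational arithmetic:

```python
# When n < m, the density requirement forces the property to be trivial:
# 2^n * (1 - 2^-m) > 2^n - 1, so any integer count meeting the bound
# must equal 2^n, i.e. P_{m,n} holds for every n-bit number.
from fractions import Fraction

def required_count(n: int, m: int) -> Fraction:
    """Minimum number of n-bit numbers on which P_{m,n} must hold."""
    return (2 ** n) * (1 - Fraction(1, 2 ** m))

n, m = 4, 6  # an illustrative case with n < m
need = required_count(n, m)
print(need, ">", 2 ** n - 1, "so P must hold for all", 2 ** n, "numbers")
```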

I certainly don’t claim this captures everything about the Strong Law of Small Numbers. But I like this account, because it gives a way to think about what “small” means: namely, it means small enough that you’ve chosen an $m$ such that $m\leq c$, where $c$ is the complexity of the properties of numbers that you’re interested in.