Main menu:

Categories

Site search

 

December 2006
M T W T F S S
« Nov   Jan »
 123
45678910
11121314151617
18192021222324
25262728293031

Archives

Meta

Most numbers are boring, asymptotically speaking.

Let f(n) be the number of Google hits for the integer n. Then f(578) is about 100 million, and f(1156), that is, the number of hits for a number twice as big, is about 40 million, a bit less than half as big. Doubling the input continues to halve the output: f(2312) is about 20 million (half again!), and f(4624) is about 8 million, and f(9248) is about 4 million.

There are about half as many pages talking about numbers that are twice as big. This is an example of a power law, and indeed, a log-log plot of f looks linear to my blurry vision:

Estimated Number of Search Results for Integers

Doing a linear regression in R gives the red line, or in symbols, f(x) \approx 5,800,000,000 / x^{1.029}. Rather humorously, this means that f(a)/f(b) \approx b/a. In the end, this is not so surprising: Zipf’s law says that, in a corpus of naturally occuring text, the frequency of a word is inversely proportional to its rank; here, we have a similar phenomenon at work: roughly, the popularity of a number is inversely proportional to its size.

In other words, while the number of integers expressible with fewer than n bits grows exponentially in n, the number of pages discussing integers expressible with fewer than n bits grows linearly in n; being silly, I’d say that this is an asymptotic version of the claim that most large numbers are uninteresting. After all, popular numbers have a lot of fan sites.

Comments

Comment from D. Eppstein
Time: December 10, 2006, 11:32 am

I’m a little confused. Popularity “inversely proportional to size” would be P(n) ~ 1/log n, where P(n) is the proportion of web pages that mention the number n, right? But the number of pages involving numbers of ≤ k bits would grow linearly with k with something more like P(n) ~ log n / n. And neither of these can be correct asymptotically, as they both lead to diverging sums. So what type of expression for P(n) did you have in mind?

Comment from kisonecat
Time: December 12, 2006, 10:55 pm

You’re right: I should have said “inversely proportional to magnitude,” because I was thinking P(n) ~ 1/n.

The asymptotics worried me, too. But, the linear regression actually suggests P(n) ~ 1/n^1.029, in which case the sum of P(n) will converge. Even so, the sum of P(n) doesn’t have to converge since the events “mentions the number n” aren’t independent. I’m currently taking random samples of searches for two numbers to look at this.

Write a comment