Monday, July 28, 2008

Google Counts More Than 1 Trillion Unique Web URLs

Bert wrote to me:

Google Counts More Than 1 Trillion Unique Web URLs

The number is downright mind-boggling. If we were to print all these URLs, one per line, in pocket-book format, it would take approximately 25 billion pages. That would translate into about 71.4 million 350-page paperbacks, occupying a staggering 1,600 kilometers of shelves. Tightly packed, the books would fill a volume of 30,000 cubic meters (i.e. a cube with 31 m (102 ft) edges).

It seems safe to say that this represents more pages than the entire body of significant literature ever published worldwide (for reference, the Books in Print database contains 7.8 million records, including duplicates/reprints, etc.), and this is only for the URLs...

Regards, Bert.

P.S. I used a typical paperback novel as a reference for the dimensions (108 x 177 x 22mm for 350 pages, including covers; 40 lines per page).
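
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the inputs given above (one trillion URLs, one URL per line, 40 lines per page, 350 pages and 108 x 177 x 22mm per book). The function and variable names are made up for illustration, and the shelf length assumes the books stand side by side so that each one takes up its 22mm thickness. The results should agree with the quoted figures to within rounding.

```python
# Back-of-the-envelope check of the paperback figures.
# All inputs come from the post; the names are illustrative only.

URLS = 10**12            # "more than 1 trillion" unique URLs
LINES_PER_PAGE = 40      # one URL per printed line
PAGES_PER_BOOK = 350

# Paperback dimensions in meters (108 x 177 x 22 mm, covers included).
BOOK_WIDTH_M = 0.108
BOOK_HEIGHT_M = 0.177
BOOK_THICKNESS_M = 0.022


def paperback_estimate(urls=URLS):
    """Return pages, books, shelf length, volume and cube edge for `urls`."""
    pages = urls / LINES_PER_PAGE
    books = pages / PAGES_PER_BOOK

    # Assumption: books shelved upright, each occupying its thickness.
    shelf_km = books * BOOK_THICKNESS_M / 1000

    # Assumption: "tightly packed" means no gaps between books.
    volume_m3 = books * BOOK_WIDTH_M * BOOK_HEIGHT_M * BOOK_THICKNESS_M
    cube_edge_m = volume_m3 ** (1 / 3)

    return {
        "pages": pages,
        "books": books,
        "shelf_km": shelf_km,
        "volume_m3": volume_m3,
        "cube_edge_m": cube_edge_m,
        "cube_edge_ft": cube_edge_m / 0.3048,
    }


if __name__ == "__main__":
    for name, value in paperback_estimate().items():
        print(f"{name}: {value:,.1f}")
```

Calling `paperback_estimate(urls=10**10)` gives the same picture for the "100 times fewer distinct documents" scenario.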

3 comments:

Pascal [P-04referent] said...

Ah, but remember: every news article written these days becomes an internet page. Several thousand pages for each, in fact, because of the redundancy in world news. You cannot compare it to all the books ever written before the internet age. You need to compare it to pretty much EVERYTHING that gets written, including every bit of gossip in every tiny local town paper, every article on every blog, no matter how uninteresting most blogs are ("all about me and my so unique boring life blah blah my mood today drone listening to this music chatter neighbors been arguing again yatta yatta with local chances of rain mumble look what the cat dragged in snore my parents don't understand me frazzle what's with women anyway..."). And every news report, formerly phoned in or faxed. And even accounting books, pretty much. One trillion unique wing flaps in the hive's confused buzz.

Multiply this by the fact that the world was never more chatty than today. From national presidents to the most insignificant town council vice-secretary, every public statement gets "onlined". It's an ever-increasing "more noise, less voice".

Besides, how much of it all is just porn sites, with a page for every image, and one more page counted every time they insert an iframe banner ad on a page? Some pages are each a compilation of several hundred of those. Buzz, buzz...

"The name of Hogwarts headmaster, "Dumbledore", is an old Devon word for "bumblebee" and was picked by J.K. Rowling because she imagines him wandering around the castle humming to himself." (From Wikipedia)

The internet is in large part a titanic Onlinus Doublebore. [kzzt! crackle-frizzle!]

Bert said...

Just to set the record straight, I wasn't attempting to compare URLs to literature in terms of contents, but rather trying to visualize what the number meant, and nothing else.

Even if you were to assume that the number of URLs pointing to distinct documents is 100 times smaller, or even 1,000 times smaller, the numbers are still difficult to grasp...

Pascal [P-04referent] said...

"the numbers are still difficult to grasp"

So's the anaconda in the pants of Chuck Norris. ;-)
"And its head is just another fist."