Yesterday, I wrote about our new private URL shortener at u.gleeson.us. In the subsequent comments to that post, I explained more about how it works. But there are fascinating theoretical implications revolving around the whole topic. For instance, just how many URLs can this application hold before it runs out of room?
Before answering such a question, one must first define precisely what “running out of room” might mean, but as you’ll see, by any definition, the number of URLs we can shorten is… rather a lot.
Here are the facts. The URLs are all stored in a MySQL database table with two fields: ‘id’ and ‘url’. The ‘url’ field holds the text of the URL and the ‘id’ field is a unique integer to identify each one. The “shortened” URLs are really just references to those ID numbers. They all take the form of
http://u.gleeson.us/[#]
where [#] stands for the ID number. But the shortened URLs aren’t decimal numbers. They are written in a custom base-52 numbering system, so each place can have 52 distinct values. Here are the 52 characters this system uses (in order, from 0 to 51):
23456789-bcdfghjklmnpqrstv
wxyz_BCDFGHJKLMNPQRSTVWXYZ
So, in this system, the character ‘2’ is really 0, the ‘Z’ is 51, and all the others are the values between: the ‘k’ is 16, the ‘G’ is 35, and so forth. So the shortened URL
http://u.gleeson.us/G
would really be a reference to the URL with ID number 35 in our database. (We don’t have that many yet, so don’t try it.) We could theoretically encode 52 URLs without even needing a second digit, but in practice, it’s really only 51, because the ID numbers in the database start at 1, not 0.
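The mapping between ID numbers and short codes can be sketched in a few lines. This is only an illustration in Python, not the code the site actually runs; the names `ALPHABET`, `encode`, and `decode` are made up for the example, and the only thing taken from the post is the 52-character ordering.

```python
# The 52 characters from the post, in order: index 0 is '2', index 51 is 'Z'.
ALPHABET = "23456789-bcdfghjklmnpqrstvwxyz_BCDFGHJKLMNPQRSTVWXYZ"
BASE = len(ALPHABET)  # 52


def encode(id_number: int) -> str:
    """Turn a non-negative database ID into its base-52 short code."""
    if id_number < 0:
        raise ValueError("ID numbers must be non-negative")
    if id_number == 0:
        return ALPHABET[0]
    digits = []
    while id_number > 0:
        # Peel off the least significant base-52 digit each time around.
        id_number, remainder = divmod(id_number, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(digits))


def decode(code: str) -> int:
    """Turn a base-52 short code back into the database ID."""
    if not code:
        raise ValueError("empty code")
    value = 0
    for character in code:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * BASE + ALPHABET.index(character)
    return value
```

With these definitions, `encode(35)` gives `'G'` and `decode('k')` gives `16`, matching the examples above. The first two-digit code is `encode(52)`, which is `'32'`.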
Of course, there’s no reason to limit ourselves to just one digit, but there must be some number of digits that we don’t want to exceed, and that number will be the key to answering the question of how many URLs we can hold before we’re full. The formula is
Q = 52 ^ D - 1
or, the maximum quantity Q equals 52 raised to the power of the number of digits D, minus one. (The “minus one” is there because ID number 0 is never used.) As we’ve seen, when D = 1, Q = 51. But Q grows exponentially as you add digits.
2 digits => 2,703 URLs
3 digits => 140,607 URLs
4 digits => 7,311,615 URLs
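These figures are easy to check. Here is a small Python sketch of the formula; the function name `capacity` is just a label chosen for this example.

```python
def capacity(digits: int, base: int = 52) -> int:
    """Number of usable IDs with at most `digits` digits (ID 0 is excluded)."""
    return base ** digits - 1


for d in range(1, 5):
    print(f"{d} digit(s) => {capacity(d):,} URLs")
# 1 digit(s) => 51 URLs
# 2 digit(s) => 2,703 URLs
# 3 digit(s) => 140,607 URLs
# 4 digit(s) => 7,311,615 URLs
```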
So we have already exceeded (by far) the number of URLs that Phoebe and I could ever possibly need to shorten, and the system is nowhere near full. There’s no theoretical or practical reason to stop at four digits. The lowest “limit” I can rationalize is 5 digits, because that would make a URL of 25 characters (the first 20 characters are used for the “http://u.gleeson.us/” part), and 25 seems like a pretty good cutoff point. (For instance, Twitter automatically truncates URLs longer than 25 characters.)
So this will be my answer to the question of how many URLs u.gleeson.us can hold before it runs out of room: at least 52 to the fifth power minus one, or 380,204,031. I doubt that we will ever reach this limit, but even if we do, I can always build us a new shortener at a different URL.
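For the curious, the arithmetic behind that answer can be written out as a short Python sketch. The 25-character cutoff is my own choice from above, not a hard limit of the system, and the names `PREFIX`, `MAX_URL_LENGTH`, `max_digits`, and `capacity` are invented for this example.

```python
PREFIX = "http://u.gleeson.us/"  # the fixed part of every shortened URL
MAX_URL_LENGTH = 25              # the self-imposed cutoff discussed above
BASE = 52                        # size of the custom alphabet


def max_digits(prefix: str = PREFIX, max_length: int = MAX_URL_LENGTH) -> int:
    """How many code characters fit after the prefix within the length cutoff."""
    return max(0, max_length - len(prefix))


def capacity(digits: int, base: int = BASE) -> int:
    """Number of usable IDs with at most `digits` digits (ID 0 is excluded)."""
    return base ** digits - 1


digits = max_digits()
print(len(PREFIX), digits, f"{capacity(digits):,}")
# 20 5 380,204,031
```

Raising the cutoff by a single character would multiply the capacity by roughly 52.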
3:44 PM
I miss the Autorantic Virtual Moonbat, why doesn’t he get a FB page too?! I’m sure you could just replace “Bush” with “Sarah Palin” or “Glenn Beck” for the same, yet updated, deranged reactions! :)
4:03 PM
Hi Beth!
Maybe someday I’ll bring back the AVM, but for now I don’t want it to compete with the PLT. Certainly not before the UK general election.
11:49 AM
Hello. Just as an update, the original statue and all of her “secrets of antiquity” now reside in my family’s private gallery, in a sealed alcove, only opened on rare occasions for those guests who display the proper piece of jewellery or request the correct cigar and whisky combination after dessert has been served. How the statue ended up there is an interesting story of its own. No… I’m just a poor slob with a laptop, but well done on both games and the history. Well done…