I stumbled on this wonderful page which tries to explain computer science topics, simply. Their description of Big O kind of bothered me, though:

> Say you order Harry Potter: Complete 8-Film Collection [Blu-ray] from Amazon and download the same film collection online at the same time. You want to test which method is faster. The delivery takes almost a day to arrive, and the download completes about 30 minutes earlier. Great! So it’s a tight race.
>
> What if I order several Blu-ray movies like The Lord of the Rings, Twilight, The Dark Knight Trilogy, etc. and download all the movies online at the same time? This time, the delivery still takes a day to complete, but the online download takes 3 days to finish.
>
> For online shopping, the number of purchased items (input) doesn’t affect the delivery time. The output is constant. We call this O(1).
>
> From the experiments we know that online shopping scales better than online downloading. It is very important to understand Big O notation because it helps you to analyze the scalability and efficiency of algorithms.

It sounds great, but they leave out a critical detail: Big O is worst-case. Let’s take a quick look at a common sorting algorithm, Quicksort. It goes something like this:

1. Scan down the list until you find a value less than the median or “pivot.”
2. Scan up the list until you find a value greater than or equal to the median.
3. Swap those two values.
4. Continue steps 1 to 3 until both scans meet.
5. That meeting place divides the list in two; perform a Quicksort on each half.
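The steps above are essentially Hoare’s partition scheme. Here’s a minimal runnable sketch in Python — the middle element stands in for the “median,” and the function name is my own:

```python
def quicksort(a, lo=0, hi=None):
    """In-place Quicksort using Hoare's partition scheme (a sketch)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:          # zero or one element: already sorted
        return a
    pivot = a[(lo + hi) // 2]   # middle element stands in for the "median"
    i, j = lo - 1, hi + 1
    while True:
        # Step 1: scan from one end for a value that belongs on the other side.
        i += 1
        while a[i] < pivot:
            i += 1
        # Step 2: scan from the other end likewise.
        j -= 1
        while a[j] > pivot:
            j -= 1
        # Step 4: stop once the two scans meet.
        if i >= j:
            break
        # Step 3: swap the out-of-place pair.
        a[i], a[j] = a[j], a[i]
    # Step 5: the meeting place divides the list; Quicksort each half.
    quicksort(a, lo, j)
    quicksort(a, j + 1, hi)
    return a
```

Note that nothing here actually computes a true median — which is exactly the loophole the next paragraph exploits.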

It’s fairly simple. But how do you determine the median without scanning the whole list? Naive approaches just pick the first value in the list, and for randomly-scattered lists that’s decent. But imagine handing this variant of Quicksort a list that’s already in reverse order: it will use the largest value as the “median,” and divide that list of N values into two partitions of size 1 (the “median” itself) and N-1.

That small partition is already sorted, so it’s ignored. Quicksort looks at the N-1 list and pulls off the first value it sees to act as the median: the second-largest value. The cycle repeats, and we now have partitions of size 1 and (N-1)-1. This carries on, and each pass scans one fewer item than the last: (N-1) + (N-2) + … + 1, which is roughly N²/2 scans in total, or a Big O of O(n²). That doesn’t seem like a very “quick” sort, especially when alternatives like Mergesort and Heapsort are never worse than O(n log(n)) — and yet in practice Quicksort is still one of the quickest.
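You can watch this blow-up happen by instrumenting the naive variant. This sketch (names and the comparison-counting bookkeeping are my own) uses Hoare-style partitioning but always grabs the first element as the “median,” then counts comparisons on a reverse-sorted list:

```python
def quicksort_first_pivot(a, lo=0, hi=None, stats=None):
    """Quicksort that naively uses the first element as the "median".
    `stats` tallies comparisons so the quadratic blow-up is visible."""
    if stats is None:
        stats = {"comparisons": 0}
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return stats
    pivot = a[lo]                       # the naive pivot choice
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while True:                     # scan forward, counting comparisons
            stats["comparisons"] += 1
            if not a[i] < pivot:
                break
            i += 1
        j -= 1
        while True:                     # scan backward, counting comparisons
            stats["comparisons"] += 1
            if not a[j] > pivot:
                break
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    quicksort_first_pivot(a, lo, j, stats)
    quicksort_first_pivot(a, j + 1, hi, stats)
    return stats

# A reverse-sorted list of 100 items: each pass peels off one element,
# so the comparison count grows roughly like N^2/2, not N log N.
data = list(range(100, 0, -1))
stats = quicksort_first_pivot(data)
print(stats["comparisons"])
```

On a shuffled 100-item list the same routine does on the order of a few hundred comparisons; on the reverse-sorted one it does thousands.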

The worst case is rarely a good representative of the typical case.

Leaving out the “worst case” bit can lead to all sorts of absurdities. Wikipedia proudly declares that the best-case scenario for Quicksort is either O(n log(n)) or O(n), depending on which variant you use. In plain English, that translates to “the worst case of the best-case performance is either proportional to N log(N) or N, where N is the number of items in the list.” It’s technically correct, but silly.

And seriously, how hard is it to insert the words “worst-case” into your analogy?