Fun with Algorithms

by Paul Hsieh

The problem

I have always maintained that having a strong math background has helped me when I am programming. When people ask me how that is, or to give me an example, I am usually at a bit of a loss. I'll say things like "Well, I figured out a way to divide by 3 very quickly and that's really useful for 24 bit graphics ..." but that kind of thing generally doesn't satisfy anyone.

Well, just the other day, I had an opportunity to put my pure math skills to work. My brother, who also has a very strong mathematical background and works for an enourmous software company in Redmond, WA called me up and asked me to help him work out a sorting algorithm. Now, at first, I was confused. There are plenty of well known sorting algorithms, so I just responded out of hand with "use radix sort, and you'll be happy." That comment lead off on a tangent about why I thought radix sort was better than other algorithms.

Anyhow, it turns out that isn't exactly what he was looking for. "What if you only have partial ordering information?" That is to say, for two elements a and b you had a comparison function that would determine that a<b, b<a, or that a and b have no known ordering. So the sense of ordering should just be that for any two elements a and b of the final list if a<b, or then a must appear before b in the list.

Ok, so I thought about it for a second and said, "Well what about just incrementally building a ring of chains?" What I meant was that you would grow a data structure that basically was a linked list with the tail joined back at the head (a ring) and that each entry was just a pointer to another resizable array that was suitable for use with any ordinary algorithm. So for each element simply compare it with the head of each chain, until you got either a a<b or b<a, relationship and then use some sort of incremental insertion sort algorithm to enter the element into that chain or else simply build a new chain. (Presumably the chains could use a balanced tree structure for greatest efficiency.)

"That's no good," he said, "An element may belong to more than one chain." What? "But if a<b and b<c then a<c, so if such an element did exist the chains would have had to have been joined in the first place, which is a contradiction so such an element could not exist." I came back. My brother thought for a moment and said "No, that's not what we have. You see, the comparison function is not complete. It would not automatically give you transitivity. Suppose the function only gave ordering information for adjacent elements, and gave an unknown for all other comparisons."

I then tried in vain to save the ring of chains idea, but there didn't seem to be a good way to do it. "Well, without transitivity, you can throw out all the generally used sorting algorithms out there; they all use transitivity as a built-in assumption."

"Why do you think I'm asking you?".

The solution

Ok. So now we have the ground rules. All of a sudden the trivialities of the known algorithms in general computer science have abandoned us. How do we proceed?

Well, the one thing we have going for us is that everything here is finite. So idiotic solution number one is to form every list and perform a O(n*n) test to see if the list is correctly ordered according to the original criteria. So this algorithm is O(n!n*n) which is not exactly usable in the real world.

But it got me thinking along the right lines. In the beginning of my set theory and real analysis, courses we established the soundness of using "proof by induction" by relying on the fact that any finite subset of integers always has a minimum element. This idea is all that is really needed. The only integers we have are list positions, so that directed my line of thinking.

We start by choosing an arbitrary ordering for the list.
Then starting from left to right, identify the first element, a, which is incorrectly ordered (of course we are done if there isn't one.)
Then find any element b in the list that such that b<a (we know there is one, other wise the criteria for choosing a would not exist.)
Swap a and b and go back to step 2

Voila! Of course, we are not quite done. How do we know this algorithm terminates? Well, its pretty easy to see that the position of the chosen value a will never move leftward. So how do we know it ever moves rightward? (If we know it eventually moves right, then we are set.) If it doesn't then it would have to eventually swap values into that position to create a cycle (since there are a finite number of elements.) This would create a contradiction (something like a<b<c<d< ... <a). QED.

Ok, that's just dandy, but what would he need such an algorithm for? Simple, he tells me, he needs to render a number of 3D objects using a Z-sorting algorithm; using a Z-buffer is out of the question because it must run on absolutely the most minimum hardware (a 90Mhz P5 -- remember we cannot all be Michael Abrash's.)

So hey, learning a little mathematics can be useful! That's what book learnin's done for me, anyhow.