Fibonacci Numbers and Bad Teachers
Recursion is a fascinating and essential idea in mathematics, programming and design (among other things.) As a programmer, if you don’t understand this very simple idea (and I mean really understand it) you are pretty much done for. While it’s not a hard concept for any half-intelligent person to grasp, I don’t exactly know how many of us really do grasp it. But it’s not what I’m going to bitch about today.
In programming courses, one of the first examples used to introduce the idea of recursion is the Fibonacci sequence. You know, the series of integers starting with 0 and 1, where every next number is the sum of the two previous ones. (OK, I know that you do know. I was just covering all my bases.)
Anyways, the most simple and elegant introductory way of implementing a function that returns the nth number in the Fibonacci series goes something like this:
1 2 3 4 5 6 7 | // Just making things clear! typedef unsigned long long Integer; Integer Fib (unsigned n) { return n < 2 ? n : Fib(n - 2) + Fib(n - 1); } |
And every half-descent programmer knows that that’s probably one of the worst ways you can implement the idea. I’m not kidding. The teachers just throw this at the students, and the brighter of the students see the elegance and simple beauty of it and it will take years for most of them to realize the problems with this particular implementation. Some of them never do. This implementation performs very poorly for even the small values of n, which of course should be apparent when you contemplate its time complexity and recursion tree (I won’t go into space considerations, because they are not much of a problem until n goes into (many) thousands, at which point the value of the function becomes larger than you can store naively in most languages, which means you should have already given space considerations a serious thought.)
But even later, when the students learn about complexity analysis and Big O notation and crap like that, more often than not, they fail to apply the newly acquired knowledge to the old beliefs. If the teacher is good, and the student is bright and lucky enough, they walk out of their algorithm and data structure course with the new belief that bubble sort is bad, hash tables are good, and disk seek times dominate everything else (all of which are obviously bullshit in the absolute sense.)
For the more astute reader, I need to clarify that the badness of the mentioned implementation stems from the fact that most common languages (including C++) are mainly side-effect-driven languages, in that they rely mostly on the side-effects of expressions and statements to get the intended job done (the usage of the term “statement” is a clear sign of that.) Otherwise, you wouldn’t see so much assignment in our code. A functional and side-effect-free programming language (or more accurately, programming model) could easily cache the result of each invocation of the Fib function for any particular n (because the function call would be without side effects and therefore time-invariant) and would not need to evaluate any Fib(n) more than once. This is something that is being done in some of the better implementations of functional languages, AFAIK.
To emulate this behavior in this case, one can use a hand-coded version of what is known as Memoization, a la:
1 2 3 4 5 6 7 8 9 10 11 12 | Integer Fib (unsigned n) { static std::vector<Integer> Mem; if (n < Mem.size() && 0 != Mem[n]) return Mem[n]; // The second condition is redundant, right? Integer ret = n < 2 ? n : Fib(n - 1) + Fib(n - 2); if (n >= Mem.size()) Mem.resize (n + 1, 0); Mem[n] = ret; return ret; } |
This is not a bad implementation. It does have some issues in multithreaded applications, but I’m willing to overlook that for now. This implementation is linear in both time and space, and it can be faster than simple non-recursive implementations because it never calculates any Fib(n) more than once over the whole lifetime of the process. But it’s not pretty (it has its charms, though!) and this way of doing things is error-prone (because you are adding memory and bookkeeping to a mathematical construct that is otherwise free of this stuff.) Not to mention that it is basically impossible to use as an introductory example; because it needs knowledge from all over the map.
A better implementation might be:
1 2 3 4 5 6 7 8 9 10 11 12 | std::pair<Integer, Integer> _fib (unsigned n) { if (n < 2) return std::make_pair(n, 0); std::pair<Integer, Integer> Fn_1 = _fib(n - 1); return std::make_pair(Fn_1.first + Fn_1.second, Fn_1.first); } Integer Fib (unsigned n) { return _fib(n).first; } |
Of course, the meat of this implementation (such as it is) is the _fib function. The other one is just there to present a nicer and more consistent interface. Let me be the first one to say that there are more ways to implement a better Fibonacci and still stay essentially recursive. This is just one way, and not a particularly great one. But it’s not bad either.
Now, how come no programming instructor ever teaches the likes of this implementation? Not the ones I studied under, at least. Are they afraid that their students’ heads are going to explode? PROGRAMMING IS HARD PEOPLE! If a programmer can’t handle a substantial improvement over a bad implementation of a simple idea, they are in for a big (and not at all nice) surprise. I hereby beg any teachers, instructors, professors and whatnot that have anything to do with teaching starting programmers to quit treating them like idiots, because idiots won’t make good programmers. The idiots are not going to understand what they need anyways, so why take the chance of profound understanding from those who have the capacity to learn, but might not if you treat them like babies?
For the sake of completeness, here’s a non-recursive implementation:
1 2 3 4 5 6 7 8 9 10 11 12 13 | Integer Fib (unsigned n) { if (0 == n) return 0; Integer last = 1, penultimate = 0; for (unsigned i = 1; i < n; ++i) { penultimate += last; std::swap (last, penultimate); } return last; } |
This is the fastest of implementations offered here, with the exception of the memoizing one (and only in amortized sense.) But nothing prevents us from applying the memoization technique to an iterative implementation as well, albeit not in a general and automatic way. It might not be apparent or significant in such a simple sample, but for some class of problems, recursive solutions are usually easier to devise and understand, and therefore easier to get right. In any case, nothing is clear-cut when it comes to programming, and I’m almost offended when people (specially people who should know better) behave as if almost everything is.
I remember during programming contests I used to do memoizing instead of proper dynamic programming. People kept warning me about the overhead of function calls, but it worked all the time.
Fuck them. I bet there wasn’t one of them who knew the actual cost of a function call. Function call overhead used to matter when CPU speeds where in double-digit megahertz, when the cost of memory access was flat and roughly the same as (or less than) a multiplication. These days, what dominates every program’s execution time is cache misses (e.g. the CPU might be stalled 99% of its time for data from memory, and that’s not an unrealistic value.)
To be fair, function call overhead (and other classic stuff like conditionals or integer division or FP operations) is an issue when you are concerned with nanoseconds. For example, in the inner loop of a (software) renderer or anything else that is run several million times a second. Otherwise, just forget about it. There are much bigger fish to fry here.
Anyways, as a wise person has said “Premature optimization is root of all evil.” It follows directly that uninformed decisions about performance are nothing more than superstition and just as correct.