Fun with C++: Metaprogramming, Part 1
Metaprogramming can mean many things, but all of them can be made to (loosely) fit under the description of “writing programs that write or rather describe (a class of) other programs.” A classic and somewhat useless sample of this is writing a program that prints it’s own source code. My first real contact with this field came from working with YACC. The rule file one writes to be fed to YACC, is kind of a metaprogram.
As a better example, if you could write a program that would accept a set of strings and generate a C++ program that would implement a “perfect hash function” that would hash those N input strings onto N collision-less values, then you’d have written a metaprogram. (Note that you are not implementing a hash function; you are writing a program that implements many hash functions based on the input.)
Now, these metaprograms can be in any language and output any language. If you write one in C++ that produces C++, then you have to run your compiler twice to get to the final usable program. Not really convenient, but it is used in the real world. For example, that perfect hash function generator example is exactly how gperf works.
On a cooler side, there’s already a couple of other mini-languages that any C++ compiler package can understand, e.g. the preprocessor language. I’m the first one to admit that writing a program only in preprocessor is like having trying to open a locked door with your head, but it’s not entirely without merits and fantastic experiences.
OK, so let’s get down to it. Before we can program in any language, we should know how some basic ideas like logic, conditions, repetition, etc. are expressed in that language. First off, doing loops a la “for” and gang is really really hard to do in preprocessor (I’d say impossible, but I’ve seen Boost.Preprocessor!) Fortunately, C++ preprocessor language does support a form of recursion (guess how?) and we can use that facility both as loops and as subroutines.
If you think about it, you’ll find out that the preprocessor’s programming model is not unlike digital logic circuits, with all the variables, inputs and outputs being single bits (whether a specific macro has been #defined or not) and the only operations being ANDs, ORs and NOTs. Although the preprocessor can assign arbitrary integers to macros and can compare them, but cannot do arithmetic on them.
As a first example, let’s see how an XOR logical function with two one-bit inputs can be implemented.
1 2 3 4 5 | #if (defined(IN_a) && !defined(IN_b)) || (!defined(IN_a) && defined(IN_b)) #define OUT_a #else #undef OUT_a #endif |
That’s it. Not very complex or revolutionary, but this idea can be taken far. What if we put the above code in a header file, and #include that header whenever and wherever we need a 1-bit XOR? Of course, the parameters to that XOR are always the same (think of them as global variables) but this concept can be use to implement functions and therefore recursion. I’m not going to go into more elaborate examples of recursion. Just let me demonstrate a little more of the preprocessor with another example that actually does something and generates output. I’ve implemented a 1-but full-adder (three inputs: A, B and Carry, and two outputs: A and Carry. The input carry is for reusability.) This example consists of two files; a “OneBitAdder.h” like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | /* OneBitAdder.h Inputs: "IN_X" and "IN_Y" and "IN_C" Outputs: "OUT_X" and "OUT_C" Returns the sum of the inputs. Note that each input and output here is one bit: either a macro is defined (it's 1) or not (it's 0.) */ #if !defined(IN_X) && !defined(IN_Y) && !defined(IN_C) // 0 #undef OUT_X #undef OUT_C #elif defined(IN_X) && defined(IN_Y) && defined(IN_C) // 3 #define OUT_X #define OUT_C #elif defined(IN_X) && !defined(IN_Y) && !defined(IN_C) // 1 #define OUT_X #undef OUT_C #elif !defined(IN_X) && defined(IN_Y) && !defined(IN_C) // 1 #define OUT_X #undef OUT_C #elif !defined(IN_X) && !defined(IN_Y) && defined(IN_C) // 1 #define OUT_X #undef OUT_C #else // 2 #undef OUT_X #define OUT_C #endif |
and the “PreprocessorAdderTest.cpp“:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | // Set input #define IN_X #undef IN_Y #define IN_C // Compute #include "OneBitAdder.h" // Display(!) output # if !defined(OUT_X) && !defined(OUT_C) #pragma message ("--- The answer is 0") #elif defined(OUT_X) && !defined(OUT_C) #pragma message ("--- The answer is 1") #elif !defined(OUT_X) && defined(OUT_C) #pragma message ("--- The answer is 2") #elif defined(OUT_X) && defined(OUT_C) #pragma message ("--- The answer is 3") #endif // main is mainly cosmetic. int main () {return 0;} |
I’ve used #pragma message for output here. I’m not sure which compilers support it (I suspect all that matter) but if you want you can use other methods, like using #error.
Create these two files and compile the “.cpp” file (note: just compile it.) There should be a string “--- The answer is 2” in your compiler output! See what we did? We just run a program by compiling a C++ source file! Cool, eh?
Of course, getting any real use out of the preprocessor (beside the obvious) is really is not worth the effort, but if you want to see whether you have the hang of it, try writing a 2-bit adder using the above 1-bit adder. It’s not as easy (nor as hard) as it may sound. The samples here were not really good examples for metaprogramming, but I think you can extrapolate from here on if you want to.
I originally (i.e when I started writing this post!) wanted to write about template metaprogramming, but I went in a completely different direction. Maybe next time then.
I am fascinated by the idea and meaning of “Meta” followed by something else in general. Like meta-mathematics, meta-physics, meta-programming. It has a very beautiful essence in it. It tries to come out of the mold, look from above, outside the box, find out the truth behind phenomena. Similar to what human beings are interested in doing and are very good at compared to other species, reflection on oneself. The meta-X is related to a higher consciousness… there are always some answers in the meta level … I guess I got a bit carried away here
Very interesting point in your post is the similarity to logic gates which are the foundation for all programs…
yzt wrote “If you write one in C++ that produces C++, then you have to run your compiler twice to get to the final usable program. Not really convenient”
Why isn’t that convenient? This is all about adding a pass to the compiler. Those pass traverses some form of our input c++ program. For example it may traverse Abstract Syntax Tree. I knew, adding a pass to a compiler is not a really straightforward stuff. But doing everything in preprocessor is outrageous to some extent. isn’t it?
Of course it it. I am preparing you for the grand finalé! I am going to introduce template metaprogramming as more viable option, than modifying your compiler and writing metaprograms in preprocessor.
By the way, what I meant by running your compiler twice was not modifying the compiler to add another pass. I meant writing another C++ program to read some specification and generate source code to do what we want. It is actually not uncommon.
@ahf:
The similarity is indeed interesting. And the real question is whether the wise designers of the C preprocessor had envisioned this use and had designed it with that in mind or not.
And each alternative is cooler than the other! Envisioning this obscure and rarely used usage and enabling it with no impact on the rest of the design is one thing, but designing a system such that it can be used in novel and fantastic ways is on a totally another level of wisdom!