Sunday, November 21, 2010

Integrating invisible tools with the shell

I've been working on a build system called Mim for some time. It has some interesting features, the most interesting being that files/artifacts are implicitly built when they are read. In other words, there is never a file on disk that is out of date.

That's pretty neat, but what happens if a file cannot be built for some reason? What happens if there is a syntax error in a .c file, for instance? Since Mim runs in the background, how should such errors be reported? Simply writing them to a log file might suffice, but the experience for the end-user is not very nice. Imagine having to open a log file every time a file fails to compile... Horrible!

This is not a problem unique to Mim. All tools that run in the background have problems with reporting errors, and most solve this by writing to a log. For Mim, however, I chose an entirely different approach.

Most shell utilities, e.g. cat, use functions in the standard library for printing, opening files, etc, but more importantly (for this post) for reporting errors. The error function is what is used to report errors.

That means that we can control how errors are printed by LD_PRELOADing a .so file with another implementation of the error function. This is what I did to implement error reporting for Mim.
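Here is a minimal sketch of what such a preloaded override could look like. This is not Mim's actual implementation; the idea of treating errno values of 512 and above as Mim build errors, and the exact message text, are assumptions made for illustration. It relies on glibc's error function and the glibc-specific program_invocation_short_name variable:
// preload-error.cc -- a sketch, not Mim's real code.
// Build and use roughly like this:
//   g++ -shared -fPIC preload-error.cc -o preload-error.so
//   LD_PRELOAD=$PWD/preload-error.so cat generated-file.txt
#include <cstdarg>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <errno.h>  // program_invocation_short_name (glibc extension)

extern "C" void error(int status, int errnum, const char *format, ...) {
  // Mimic glibc's error(): flush stdout, then print "program: message".
  std::fflush(stdout);
  std::fprintf(stderr, "%s: ", program_invocation_short_name);
  va_list args;
  va_start(args, format);
  std::vfprintf(stderr, format, args);
  va_end(args);
  // Assumed convention: errno values >= 512 are Mim build errors.
  if (errnum >= 512)
    std::fprintf(stderr, ": Mim: error %d when building artifact", errnum);
  else if (errnum != 0)
    std::fprintf(stderr, ": %s", std::strerror(errnum));
  std::fputc('\n', stderr);
  if (status != 0)
    std::exit(status);
}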

This is how Mim works now; if you try to open a file that Mim cannot build because of a build error (e.g., compile error), you'll get the following printed to the console:
$ cat generated-file.txt
cat: Mim: error 512 when building artifact
which is much better than the old error message, which was just a generic message indicating that the file could not be opened.

But there is still room for improvement! It's nice to get an error message that says that the file couldn't be built, but it would be even nicer if the message told us why the artifact couldn't be built. I haven't implemented that yet, but I'm working on it. I imagine something like this:
$ cat generated-file.txt
cat: Mim: error 512 when building artifact generated-file.txt, because:
  input-file.cc failed to build, because:
    input-file.cc:53: Syntax error.
$ emacs input-file.cc
   
As I said, that will require some more work, but it's definitely possible and it's technically not hard.

Saturday, September 4, 2010

The fridge, the door and the fire alarm

Today when I prepared breakfast I realized that the fridge wasn't properly closed. It had been slightly open the whole night. That's not particularly good.

But what's worse is that the fridge did not make any noise indicating that the door was open. In fact, it did the opposite: it turned off the only indication there was that the door wasn't properly closed -- the light inside the fridge. If that light hadn't been turned off it would have been easier to spot that the door was open.

Turning the light off would not be a big problem if there were some way for the fridge to sound an alarm when the door was open. Let's say making a noise when it has been open for more than a minute.

I'm sure that it's a feature that the fridge turns off the light by itself when it has been on for too long. But I can't think of a use-case where that would be useful. There must be a timer somewhere that turns off the light. That timer should instead trigger an alarm.

Actually, there is a noise-making device in the fridge. But it is only used when the temperature in the fridge is above a certain threshold. That noise-making device should be triggered by that timer too. Why isn't it?

I don't know.

Recently I've noticed more and more weird design choices in everyday things. Like having the handle of a door shaped like a pipe when you should push the door, and having the handle shaped like a flat surface when you should pull the door (think about it: a surface is easy to push and a pipe is easy to pull).

Even worse, I've come across fire alarms that sound very much like the break-in alarm.

Friday, September 3, 2010

Mim: the build system you always wanted

The word mim can mean any of the following:
  • an incorrect way of writing 1999 in Roman numerals (as in "we gonna party like it's 1999"),
  • Madame Mim from the Disney movie The Sword in the Stone,
  • Swedish for "mime", meaning "imitating",
  • a figure in Norse mythology renowned for his wisdom, and
  • an acronym for "Mim Isn't Make".
It's also the name of the build system I've been thinking about and working on for a while. This post is about what Mim is now and what it can become. Text in red describes stuff that isn't implemented yet, text in yellow describes things that are kind-of implemented, and text in black describes implemented features. First, I'll describe how Mim is different from other build systems and why it's better.

Problem
The problems with traditional build systems are that it's hard to:
  • make sure the dependencies are correct and built in the right order (think: make depend && make && make install),
  • be sure that every built file you see is up to date,
  • understand how a software project is built (what are the artifacts? where are they stored? what are the dependencies?), and
  • understand which variables a build has (e.g., make DEBUG=1).
Solution
The solution to these problems is (of course) Mim. With Mim you clearly see all the artifacts the build system produces, how they are built, and when they need to be built. In fact, you can see the artifacts, e.g., by doing ls, before they are built. In other words, using your normal command line tools you can browse the project file tree to find the artifact you need (to run, to view, to copy, etc.) without building anything. Sounds like magic? Keep reading...

When you have found the program you need, you simply execute it. There is no need to issue any command yourself to build it. The program will be automatically built for you and then executed. Similarly, if you have a tool that generates a text file as part of the build, you can do emacs file.txt. The file will be automatically generated and then opened in emacs.

If you need to get more information about a file you can do mim ls filename. You'll get something like this written to the console:
-r-xr-xr-x you users filename (g++ filename.cc -o filename)
that is, you can easily discover the access rights, the owner, the group, and the build command of any artifact simply by locating it in the file system.

This is especially powerful when you work with projects where source code generation is part of the build process. Understanding which files are generated, and how they are generated, can be very hard. With Mim, however, this is very easy, because every file (generated or not) looks the same to you when you browse the file tree. Additionally, you can ask Mim for more information about a certain artifact file, e.g., get the command for building the file by doing mim cmd file.

Furthermore, getting the dependencies right when the build involves several steps (generating .d files, generating source code, generating .d files for the generated code, compiling the source code, linking the source code against libraries (which themselves have to be built, including their generated source code)) is also very, very hard. And understanding it a few months later is even harder. Mim solves all these problems.


Let's take an example. Let's assume we have a directory called hello with a file hello.cc containing an implementation of Hello, World. There is also a file called Mimfile that tells Mim how to build this program. Doing mim ls in hello gives you:
greeting.txt
hello.cc     codegen -lang en greeting.txt -o hello.cc
hello-en     g++ hello.cc -o hello-en
Mimfile
which tells you that there is only one physical file, greeting.txt, in this directory (except the obligatory Mimfile) and two artifact files, hello.cc and hello-en, of which the latter is an executable. You can also see that there is some code generation involved by looking at the command line for hello.cc. You can tell all this by issuing one simple command, mim ls, without building anything at all.

Let's say you wish to take a peek at the hello.cc file. All you need to do then is cat hello.cc. There is no need to worry that the code you see is out of date, because Mim makes sure it's up to date before it's opened.

Being lazy
Being lazy is a good attribute, as long as you're not too lazy. Being lazy in the context of a build system means that you only rebuild a file if its inputs have changed. How does Mim achieve this?

Mim works by intercepting all read and write accesses to the file system. This means that Mim has perfect knowledge of 1) which files are read when a given artifact is built, and 2) which files have been updated since the artifact was last built. This means that artifacts are only rebuilt when needed.

This information makes it possible for Mim to easily construct a dependency graph and determine what needs to be built. It also lets Mim start building the leaves of the graph as early as possible to reduce the build time.
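To make the laziness concrete, here is a rough sketch (in C++, with made-up names; this is not Mim's actual code) of the staleness test this bookkeeping enables:
#include <set>
#include <string>

// Bookkeeping recorded by intercepting file accesses.
struct Artifact {
  std::string path;
  std::set<std::string> inputsReadDuringLastBuild;  // 1) files read when built
};

// Files written since the artifact was last built, also recorded by interception.
typedef std::set<std::string> UpdatedFiles;

// An artifact needs to be rebuilt only if one of its recorded inputs
// has been updated since the last build.
bool needsRebuild(const Artifact& artifact, const UpdatedFiles& updated) {
  std::set<std::string>::const_iterator it = artifact.inputsReadDuringLastBuild.begin();
  for (; it != artifact.inputsReadDuringLastBuild.end(); ++it) {
    if (updated.count(*it) > 0)
      return true;
  }
  return false;
}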

Instant messaging
In large projects there are usually variables that control how the project is built. Mim supports such variables in a nice way. To list the variables, simply do mim conf; you'll get something like this printed to the console:
lang      en   en|fr|swe  Controls in which language "Hello, World" is printed.
debug     off  on|off     Compile with or without debug information.
optimize  0    1|2|3      Optimization level as defined by gcc:s -O flag.
The first column is the name of the variable, the second column is its current value, the third is its possible values, and the last column is a textual description of it.

Changing the value of the variable lang to swe is done by doing mim conf lang=swe. Setting a variable has an instant effect. For example, if a variable affects which files are built, you can see the updated list of files immediately by doing ls without building anything. Example:
$ ls hello*
hello.cc  hello-en
$ mim conf lang=swe
$ ls
hello.cc  hello-swe
That is, by changing the variable lang from en to swe we get an artifact file called hello-swe instead of hello-en like we got before.

To aid discoverability even more, you can do mim ls hello* lang==* to list all files matching hello* for all values of lang. For the hello example this outputs:
hello-en   lang=en   g++ hello.cc -o hello-en
hello-fr   lang=fr   g++ hello.cc -o hello-fr
hello-swe  lang=swe  g++ hello.cc -o hello-swe
All these features help you learn how a project is built, how to use its build system, how to change it, how those changes affect the build system, and more.


The Tracker
Since Mim intercepts all accesses to the file system, it can help you when you do potentially bad things, for example, if you delete a file that is needed to build an artifact. Mim also updates your Mimfiles automatically when you make non-destructive changes to your source tree, for instance when you rename an input file to gcc, as the following example illustrates:
$ mim ls hello*
hello.cc  codegen -lang en greeting.txt -o hello.cc
hello-en  g++ hello.cc -o hello-en
$ mv hello.cc hi.cc
$ mim ls hello*
hi.cc     codegen -lang en greeting.txt -o hi.cc
hello-en  g++ hi.cc -o hello-en

Mim can do this automatic update of Mimfiles correctly in many cases, but it's impossible to get it right every time, since that would require deep knowledge of all possible build tools (e.g., gcc, ld, m4, etc). However, Mim makes certain (reasonable) assumptions about how the build tools behave, and as long as the build tools conform to these assumptions, Mim gets it right most of the time. In any case, to check that Mim got it right, you simply build the application, so this isn't a big problem in practice.

Getting general
Mim is built around the concepts of events and actions triggered by events. So far we have implicitly discussed the event open artifact for reading, where the action is update artifact. The possible events that can trigger actions are:
  • opening a physical or artifact file for reading (before-read),
  • closing a physical or artifact file after writing (after-write),
  • deleting a physical or artifact file or directory (delete-file),
  • creating a physical file or directory (create-file), and
  • moving a physical or artifact file (move-file).
Since arbitrary commands can be executed as the result of an event, you can do whatever your heart desires when a file is moved, deleted, created, etc. However, some actions that interact with Mim would be hard to implement yourself and are thus built into Mim:
  • update an artifact (without reading the artifact file),
  • log to the Mim log (e.g., an error message), and
  • reject an event (e.g., reject opening a file).
The reject event action is useful, e.g., if you need to verify the consistency of a saved file: if the file is not consistent it cannot be saved.

Playing nice
So far so good. But what about integrating with other, non-miming tools like, say, make? Since all Mim does to build an artifact file is to execute a command line in the shell, Mim can easily be used as a front-end to make. Here is an example of doing mim ls in a directory of a project that does just that:
main.cc
hello     make hello
Makefile
Mimfile
Furthermore, any program can use Mim for building files, because all the program needs to do to build a file is to (try to) open it for reading. Mim will then kick in and build the file, without the other program ever noticing anything special. This is very powerful, because every program you have on your computer, cat, emacs, Mozilla, etc, can build any file using Mim.

Saving the environment
Do you have ./configure && make && sudo make install aliased to nike on your system? If you do, you're not using Mim.

One reason to use autoconf (i.e., the ./configure script above) is to find the paths to all the necessary tools, libraries, headers, etc, needed to build your project. With autoconf this involves generating Makefiles that are specifically targeted at your system. The problem with autoconf and the generated Makefiles is that you have to be a genius to understand what the heck is going on.

So what does Mim do instead? Easy: you have an artifact file for each tool, library, header, etc, that is needed by your build. Those artifacts are configured (e.g., using autoconf) to link to a specific file on your system. These artifacts are normally placed in a directory called ext (for "external"); thus, to list the dependencies on external tools simply do find ext or ls -R ext. Every file listed is an external dependency. Example:

$ cd my-project
$ ls -R ext
ext/bin:
g++  ld  python

ext/include:
3rdparty.h

ext/lib:
3rdparty.so

In this example we have configured the ext directory to contain a compiler, linker, and a python interpreter. So, when we need to compile a .cc file we have the following command line in the Mimfile ext/bin/g++ file.cc. The first time this command line is executed, the artifacts (ext/bin/g++, etc) are not yet configured, so Mim launches the configuration tool (autoconf). After that, the artifacts in the ext directory point to the correct files on the local system and can be executed easily.

A way of thinking about this is that Mim creates a virtual environment (directory structure) that is ideal(ized) for your specific project. All the tools and libraries you need are right there in the ext directory.


Deep understanding
Mim really, really understands mimfiles. If a mimfile is updated, artifacts potentially need to be rebuilt. However, Mim will only rebuild artifacts whose command lines have changed, or whose dependencies' command lines have changed. Have you ever experienced that your entire project needs to be rebuilt just because you added a comment to a makefile? That will never happen with Mim.

Mim even has support for refactoring mimfiles via your file system! In the hidden directory .mimfile, the artifact files' mimfile data is represented as files. So, if you want to change the name of an artifact file, simply do mv .mimfile/old-name .mimfile/new-name; to create a new artifact file that is identical to an existing one, do cp .mimfile/original .mimfile/copy; to delete an entry from the mimfile, do rm .mimfile/remove-me. Finally, to edit the command line for an artifact, you simply edit the corresponding file under .mimfile, e.g., echo 'gcc input-file.c -o %output' > .mimfile/file-to-be-edited.

As mentioned earlier, Mim also has variables that can be used to configure how your project is built. These variables are also represented as files, under the directory .mimvars. Here you can easily browse the variables using ls, find, grep, etc, or even your graphical file browser. To get the current value of a variable, simply do cat .mimvars/the-variable, and to set the value do something like echo 'new value' > .mimvars/the-variable.


When things go wrong
OK, so Mim is nice; I think I've communicated that by now. But sometimes the world is very, very unnice. Sometimes the world doesn't want to compile your code. Then you need to either 1) fix the world, or 2) fix your code. In my experience the latter is slightly easier than the former, although your mileage may vary.

To fix your code you need to know what the compiler complains about, but how do you do this when the files are compiled behind the scenes? Simply do mim log errors to print the error messages to the console, fix the error, and run the program again.

Doing mim log errors after every build error gets annoying very fast. Thus there is a variant, mim log errors -f, that is useful when you wish to get continuous feedback. Given the -f flag, Mim will run in the background and print error messages to the console as soon as they appear. So, assuming hello.cc contains an error, trying to execute hello will print the following:
$ mim log errors -f
$ ./hello
hello/hello.cc: In function ‘int main()’:
hello/hello.cc:3: error: ‘printf’ was not declared in this scope
which is just what gcc outputs; thus, any tool that parses the output from gcc will understand the error output from Mim as well. This is of course also applicable to other tools like make.

Summing up
I've described the ideas behind Mim, its current state, and possible future features (that, by the way, should be very reasonable to implement :)). You can find Mim here. You'll need Intel Threading Building Blocks and Lua 5.1. I've developed it on Ubuntu and also compiled it on Suse 11.2. Instructions for how to build Mim itself are currently horribly missing, so please contact me if you'd like to try Mim out.

    Friday, July 9, 2010

    DSL: Domain Specific Logging

    A few years ago, I was asked to design a logging framework for our new Product X 2.0. Yeah, I know, "design a logging framework?". That's what I thought too. So I didn't really do much design; all I did was say "let's use java.util.logging", and then I was done designing. Well, kind of, anyway...

    I started to think back on my first impression of the logging framework in Product X 1.0. I remembered that I had a hard time understanding how it was (supposed) to be used from a developer's point-of-view. That is, when faced with a situation where I had to log, I could not easily answer:
    • to which severity category does the entry belong (FINEST, FINER, FINE, INFO, WARNING, SEVERE, ERROR, etc)
    • what data should be provided in the entry and how it should be formatted, and
    • how log entries normally are expressed textually (common phrases, and other conventions used).
    In other words, it was too general (i.e., complex) with too much freedom to define your own logging levels, priorities, etc.

    That's why I proposed to make logging easier; thus, the logging framework in Product X 2.0 has very few severity levels. This is implemented by simply wrapping a java.util.logging.Logger in a new class that only has the most necessary methods.

    This simplification reduced the code noise in the logger interface, which made it a lot easier to choose a log level when writing a new log message in the source. This had an important implication (that we originally didn't think of): given that a certain message had to be logged, we (the developers) agreed more on at which level that message should be logged. No more "oh, this is a log message written by Robert -- he always uses a high log level for unimportant messages".

    Of course, you could still hear comments like "oh, this log message was written by John -- he always uses ':' instead of '=' when printing variables". This isn't much of an issue if a somewhat intelligent being is reading the logs, e.g., a human. However, if the logs are read by a computer program, this can be a big problem -- and this was the case for Product X 2.0.

    This variable-printing business could be solved quite easily; we simply added a method to the logger class (MyLogger) called MyLogger variable(final String name, final Object value) that logged a variable. Example:
    logger.variable("result", compResult).low("Computation completed."); 
    which would result in the following log message:
    
    2008-apr-12 11:49:30 LOW: Computation completed. [result = 37289.32];
    When I did this, I began to think differently about log messages. It put me into a new mind-set -- the pattern matching mind-set. Patterns in log messages started to appear almost everywhere. Actually, most of our logs followed one of the following patterns:
    • Started X
    • Completed X
    • Trying to X
    • Failed to X
    • Succeeded to X
    Here is an example:
    try {
      myLogger.tryingTo("connect to remote host").variable(
                        "address", remoteAddress).low();
      // Code for connecting.
      myLogger.succeededTo("connect to remote host").variable(
                           "address", remoteAddress).low();
    } catch (final IOException e) {
      myLogger.failedTo("connect to remote host").cause(e).medium();
    }
    
    To take this even further, it is possible to only allow certain combinations of succeededTo, tryingTo, cause, etc, by declaring them in different interfaces. Let's assume that it should only be possible to use cause if the failedTo method has been called. Here's the code for doing that:
    
    interface MyLogger {
      // ... other logging methods.
      CauseLogger failedTo(String msg);
    } 
     
    interface CauseLogger {
      LevelChooser cause(Throwable cause);
      LevelChooser cause(String message);
    }
    
    interface LevelChooser {
      void low();
      void medium();
      void high();
    } 
    This is essentially what's called fluent interfaces. It complicates the design of the logging framework, but also makes it impossible to misuse it. Good or bad? It's a matter of taste, I think, but it's a good design strategy to have up your sleeve.

    Thursday, June 17, 2010

    The only design pattern is small solutions

    I just saw this TED talk (which you need to see too!). It's essentially about how we prefer big complex solutions to any problem we face. Why? Because it makes us feel smart and important. As my head is filled with software thoughts, I started to think about how this relates to software design. We software developers really!, really!, really!, like big solutions to small problems: "Oh, you got a program that needs to store some data? You better get yourself a dedicated database machine, a persistence layer, and define an XML schema for the communication data format."

    We don't need big solutions to small problems. Big solutions are easy to find. Big solutions need man-hours but no understanding. We need small solutions to big problems. Small solutions are hard to find. Small solutions need insight into the actual problem we're solving. The actual problem is what's left when we remove all accidental complexity, marketing buzz-words, etc, and think clearly about the original problem.

    Small solutions are orthogonal to each other; big solutions are not, they interact in non-obvious ways. Thus, big solutions create more problems, or as the American journalist Eric Sevareid said:
    The chief cause of problems is solutions
    which is more true in software development than in most other areas. Implement small solutions to problems and your future self will thank you. Implement big solutions and you fall for the sirens' calls of the marketeers, or your own wishes to do seemingly cool stuff while looking smart doing it. Do you really need a DSL? A database? Web interface? Reflection? Operator overloading? Meta-programming? Code generation? Ruby? SOAP?

    Thinking small has a big impact.

    Monday, June 14, 2010

    If it hurts do it more often

    As a kid, most of what you did was related to learning. Some call it "playing" or "being curious", but whatever you call it, it's learning in one form or another. And learning hurts. A lot.

    So you start going to school to make learning not hurt so much. As a 7-8 year-old you enjoy school, because it makes learning much easier. Also, you can play with your class mates, like, say, playing soccer. Oh, by the way, playing soccer makes you run faster and longer, gives you better balance, gives you a better feeling for how flying objects behave (like a ball), and makes you better at timing your own movements to others' (like your team mates'). It also makes you better at winning and losing (you aren't a bad loser, are you? How about a bad winner?), and it makes you better at focusing on a particular task, and at making others (your team mates) focus on the same task as you. It's all learning.

    As you grow older you become more comfortable with most things you do because you know them. When you finally leave school to start working as a programmer you have lost all will to do things that hurt... because you have lost your will to learn (in comparison to how much you wanted to learn stuff as a kid). You have lost your will to discover new things.

    Stupidity hurts

    So, when faced with a problem that forces you to do something you don't know or something you really don't like, you stall. Stall, stall, stall, because you don't want to do it. You want to do something fun. Something you know how to do. Something that makes you feel secure and comfortable. Something that makes you feel less stupid!

    Yes, you are stupid. I'm stupid. We are all stupid. No one knows everything, so everyone is stupid at something. And we will remain stupid at those things unless we learn how to do them better. But... Ouch! Remember? Learning hurts! So we think to ourselves "Ooh, I don't want to do that!", or put slightly differently, "I may get hurt doing that! I'd rather stay stupid!". Imagine if you had thought that way as an 8-year-old kid in school... or at the playground, or on the soccer field. We wouldn't get many marathon runners or Nobel prize winners that way, would we?

    The cure

    It's really easy: dare to be stupid. Dare to ask stupid questions. After a while, you'll notice that you start to ask really hard questions, and before you know it you'll ask questions no one (you know) has answers to. And then, as it happens, people will ask you instead. If they dare to, of course (hint: they should).

    In fact, I think that people will be more likely to ask you (stupid) questions because they'll have seen or heard (of) you asking (stupid) questions. And you know what, people do as other people do. Especially as people they look up to do, and since you know so much (from asking stupid questions) they look up to you. Isn't that neat?

    The workplace

    If you have colleagues who don't mind potentially looking stupid by asking a question, you work at a good company, I think. But I'm not really someone who reads or thinks about this a lot, so I may be wrong. There may be effects here that I'm clueless about. Anyway, my feeling is that I'd be more at home at a company like that than at a company that encourages silence and only really intelligent questions. But I'm just me. You are you, and you may feel entirely differently about this.

    Wednesday, June 9, 2010

    Design for optimizability

    Only optimize when you have found a bottleneck by measuring. That's what we've all been taught, right? It's a good rule. But is it really right? What if you're designing a performance-critical application? Should you really be ignorant about performance until you've measured and found a bottleneck?

    I say: don't optimize, but plan for optimizability. I think Dwight D. Eisenhower said it best, though
    Plans are nothing; planning is everything.

    And now a disclaimer. I am not encouraging over-design in this post. Neither am I proposing that you should analyze every silly detail of your application domain before you start coding. I encourage you to start hacking on your application without worrying about performance, because you know (if you follow the tips that follow and use your brain) that you can optimize it later. Knowing that you can rewrite that slow and memory-hungry Ruby script into a snappy C program (that doesn't use any heap memory at all) makes it so much easier to sleep at night. Believe me.

    Ok, now let's get going!

    Caring about planning

    Planning for optimizability means that you make sure that:
    • seams are placed to allow optimizing,
    • system-wide concerns are encapsulated,
    • it's possible to do design lowering later on,
    • algorithmic decisions are delayed,
    • intelligence can be moved from data to code and from code to data,
    • it's possible to move run-time aspects to compile-time,
    • you know your tools.
    We will now go through the first three of these bullets one by one and discuss them in detail. I will discuss the remaining four in a later post.

    Seams and optimization borders

    A seam in software design is a somewhat vaguely defined concept. If two components of a software design are loosely coupled, we say that there is a seam between them. In other words, seams separate the design into parts that can vary independently of each other. Inheritance and dependency injection are often used to achieve this in object-oriented systems; function pointers play a similar role in procedural programming.

    Since planning for optimizability means that we wish to make it easy to replace slow code with faster code, we should place seams in the design such that they surround potentially slow code. This implies that design seams also are optimization borders: borders over which optimizations cannot easily be done without affecting other parts of the system.

    This doesn't mean that it's impossible to optimize over optimization borders, just that other parts of the system will be affected by it. Thus, optimizing over optimization borders is harder than optimizing within them. Let's take an example.

    Assume that we iterate over a large list in a performance-critical part of our application:
    void Foo::iterateOverLargeList(std::list<Thing> l) {
      for (std::list<Thing>::iterator it = l.begin(); it != l.end(); ++it)
        it->doEasyCalculation();
    }
    Where should we place the design seams in this code? Or, more concretely, how do we use virtual functions to enable us to optimize this piece of code as much as possible without affecting the surrounding code? As I see it, we have the following alternatives:
    1. make Thing::doEasyCalculation virtual,
    2. make Foo::iterateOverLargeList virtual, or
    3. encapsulate std::list<Thing> in a new class ThingList, move iterateOverLargeList to ThingList, and make it virtual.
    Of these three alternatives, 3) gives us the most freedom to increase the performance of the loop, because it allows us not only to optimize how we iterate, but also to change the underlying data structure. For instance, changing from std::list (a linked list) to std::vector (essentially a linear array) will make the access pattern linear, so the (L1, L2, L3) cache can do a better job.

    However, 3) is also the option that is the hardest to refactor to, because we have to change every part of the application that uses the std::list<Thing> list. So, if our application is large, this may involve a lot of refactoring work. Thus, if we choose 3) early in the development, we don't have to refactor so much later. In other words, early understanding of the domain, the application's use-cases, and the application design is important when designing for optimizability.
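    As a rough sketch, alternative 3) could look something like the code below. The class and method names are mine, not from any real code base; the point is only that the seam (the virtual interface) hides both the iteration and the container behind it:
    #include <cstddef>
    #include <list>
    #include <vector>

    class Thing {
    public:
      void doEasyCalculation() { /* ... */ }
    };

    // Alternative 3: the container is hidden behind a new class, so both the
    // iteration strategy and the underlying data structure can vary freely.
    class ThingList {
    public:
      virtual ~ThingList() { }
      virtual void add(const Thing& t) = 0;
      virtual void iterateOverLargeList() = 0;
    };

    // The first implementation keeps the original std::list.
    class LinkedThingList : public ThingList {
      std::list<Thing> things;
    public:
      virtual void add(const Thing& t) { things.push_back(t); }
      virtual void iterateOverLargeList() {
        for (std::list<Thing>::iterator it = things.begin(); it != things.end(); ++it)
          it->doEasyCalculation();
      }
    };

    // A later optimization can swap in a cache-friendly std::vector without
    // touching any code outside this seam.
    class ArrayThingList : public ThingList {
      std::vector<Thing> things;
    public:
      virtual void add(const Thing& t) { things.push_back(t); }
      virtual void iterateOverLargeList() {
        for (std::size_t i = 0; i < things.size(); ++i)
          things[i].doEasyCalculation();
      }
    };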

    This example touches on another important tool when designing for optimizability: encapsulation. We will now discuss this in greater detail.

    Concerning encapsulation

    One of the worst kinds of optimizations you can do is to optimize something that affects the entire system, e.g., the data flow. A global property like that is a system-wide concern, and having the system's performance depend on such a property is very, very bad. We need to encapsulate these global properties to make sure that there are design seams that enable us to optimize them when we need to.

    This means that we need to design our code such that if we need to optimize something, that something is encapsulated into a single class (or a few classes). This does not mean that we should encapsulate every little petty detail, though. It does mean, however, that we need to put a bit more effort into understanding our application, and then think about its design, data flow, etc. This is actually something we should do anyway, even if we're not designing for optimizability, because understanding the current design of the application makes it easier to improve it.

    When we have a firm understanding of the application and how it will be used, then we can identify the system-wide concerns, and only then can we start to encapsulate them to make them possible to optimize in the future.

    Avoid memory copying

    Reading data from memory is one of the most costly things a modern CPU can do. If the data is in (L1) cache, reading memory is fast (3-5 cycles or so), but if the data needs to be read from main memory it takes 100 times longer. That's right: 500 cycles to read a single piece of data. Writing data to main memory is also a very costly operation; thus, copying is an extremely costly operation.

    With this in mind, it may be tempting to return a pointer to a shared data buffer instead of returning a copy of that data. For example:
    class Thingie {
      int* stuff;
    public:
      // Constructor that initializes 'stuff'.
      int* getStuff() { return stuff; }
    };
    This may look efficient since there is no memory copying going on. However, this design is horrible with regard to encapsulation. For instance, what if we need to modify the returned data in some other part of the program without affecting the Thingie instance? In that case we need to make a copy and modify the copied data. But what if that copied data is returned just like the int* stuff above? And what if it needs to be modified in some other part of the program? Then we need to make another copy!

    In situations like this, you need to take a step back and look at the data flow through your application and then try to optimize it to minimize the number of copies. This is not an easy task, so we really wish to avoid doing it, or at least make it easier for us to do. Is there some way to design our application (with little extra effort) from the start to make optimizations like this possible (without huge redesigns)?

    In the example above, the system-wide concern is how the int* stuff data is passed around in the application and how to make sure that it is not copied unnecessarily.

    Let's rewrite this example a bit into:
    class Thingie2 {
      MyBuffer stuff;
    public:
      // Constructor that initializes 'stuff'.
      MyBuffer getStuff() { return stuff; }
    };
    Since the data now is encapsulated in MyBuffer (and the entire application uses MyBuffer to refer to the data), we can now implement a lot of optimizations that affect the data flow of the application. For instance, we can implement copy-on-write semantics and reference counting; this would be extremely hard to do without using MyBuffer to encapsulate the data flow, which is a system-wide concern.
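    As an illustration of the kind of optimization this enables, here is a very small sketch of what a copy-on-write, reference-counted MyBuffer could look like. This is my own simplification (it ignores thread safety and bounds checking, for instance), not a production-ready buffer:
    #include <cstddef>
    #include <cstring>

    // Copying a MyBuffer only copies a pointer and bumps a reference count;
    // the actual data is copied first when someone writes to a shared buffer.
    class MyBuffer {
      struct Rep {
        explicit Rep(std::size_t n) : data(new int[n]), size(n), refs(1) { }
        ~Rep() { delete[] data; }
        int* data;
        std::size_t size;
        int refs;
      };
      Rep* rep;

      void release() {
        if (--rep->refs == 0)
          delete rep;
      }

    public:
      explicit MyBuffer(std::size_t size) : rep(new Rep(size)) { }
      MyBuffer(const MyBuffer& other) : rep(other.rep) { ++rep->refs; }
      MyBuffer& operator=(const MyBuffer& other) {
        if (rep != other.rep) {
          release();
          rep = other.rep;
          ++rep->refs;
        }
        return *this;
      }
      ~MyBuffer() { release(); }

      // Reading is cheap and never copies.
      int read(std::size_t i) const { return rep->data[i]; }

      // Writing copies the data only if it is currently shared (copy-on-write).
      void write(std::size_t i, int value) {
        if (rep->refs > 1) {
          Rep* copy = new Rep(rep->size);
          std::memcpy(copy->data, rep->data, rep->size * sizeof(int));
          release();
          rep = copy;
        }
        rep->data[i] = value;
      }
    };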

    Design lowering

    The term instruction lowering is an optimization technique used in compilers to generate more efficient code. For example, the C code a = 16 * b; can be implemented (in pseudo-assembler) by a multiplying operation A = MULT(16, B) but also by a simpler and faster shift operation A = SHL(4, B).

    With the term design lowering I mean a similar concept: reimplementing the design in a simpler and faster language or platform, or using faster language constructs or data structures. Examples of such design lowering include:
    • rewrite Python code in C,
    • replace reflection-heavy Java code with simpler non-reflection code,
    • using a simple array instead of a hash map or a linked list,
    • using the memory layout of C structs as data format instead of XML in files, over network, etc
    • use hardware acceleration like graphics cards.
    Design lowering can easily increase performance 10-100 times or even much more. Let's take an example of this.

    A story of left and right

    Imagine two small programs, left and right; left generates data and sends it to right, which sums the received data. Here is the first version of this pair of programs, implemented in Python:
    --- left.py ---
    a = { "first":[1, 123], "middle":[1, 2, 456], "last":[1, 2, 3, 789] }
    for i in xrange(0, 1000000):
        print a

    --- right.py ---
    import sys
    total = 0
    for line in sys.stdin:
        a = eval(line)
        total = total + sum(a["first"]) + sum(a["middle"]) + sum(a["last"])
    print total
    Translating this Python code into C++ idioms, it becomes:
    --- left.cc ---
    #include <cstddef>
    #include <cstdio>

    struct Data {
      Data() {
        first[0] = 1; first[1] = 123;
        middle[0] = 1; middle[1] = 2; middle[2] = 456;
        last[0] = 1; last[1] = 2; last[2] = 3; last[3] = 789;
      }
      int first[2];
      int middle[3];
      int last[4];
    };

    int main() {
      Data data;
      for (size_t a = 0; a < 1000000; a++) {
        for (size_t b = 0; b < sizeof(Data); b++)
          putchar(reinterpret_cast<char*>(&data)[b]);
      }
    }

    --- right.cc ---
    #include <cstddef>
    #include <cstdio>

    struct Data {
      int first[2];
      int middle[3];
      int last[4];
    };

    int main() {
      Data data;
      long total = 0;
      for (size_t a = 0; a < 1000000; a++) {
        for (size_t b = 0; b < sizeof(Data); b++)
          reinterpret_cast<char*>(&data)[b] = getchar();
        total += data.first[0] + data.first[1] + data.middle[0] + data.middle[1] +
                 data.middle[2] + data.last[0] + data.last[1] + data.last[2] + data.last[3];
      }
      printf("%ld\n", total);
    }
    And running them we get:
    $ time ./left.py | ./right.py
    1378000000

    real 1m3.284s
    user 1m14.057s
    sys 0m0.256s
    $ g++ -O2 right.cc -o right && g++ -O2 left.cc -o left && time ./left | ./right
    1378000000

    real 0m0.763s
    user 0m1.292s
    sys 0m0.088s
    $ echo 74.067/1.292 | bc
    57
    Thus, the C++ version of these little programs is 57 times faster than the equivalent Python code. But this is not the neat thing... Let's get back to discussing design lowering.

    Where to do design lowering

    The neat thing is that you can implement left and right in Python without worrying about performance, because they are easily design lowered to C++. I can say this because left and right come as a pair; if you redesign one side you also redesign the other. In other words, the details of the implementations are free to change in ways that make them execute more efficiently. Thus, it's easy to do design lowering.

    However, this would not be the case if left sent data to an unknown party via a predefined protocol. If that were so, then reimplementing left in C++ would probably be considerably harder. In other words, design lowering left would be less feasible. It would still be possible to do, though.

    In summary, design lowering can be used as a plan for optimizability when the design can vary in ways that make it feasible to implement faster in another language (or data structure, etc). How the design can vary depends on many things, though, thus design lowering is not always easy to do without first refactoring the design.

    As before, design lowering is not something you get for free: you need to have a good understanding of how and where the application will be used. When you understand your application, you're more likely to make correct assumptions about which parts of it can be design lowered and which parts can't.

    Conclusions

    Just like optimizing, designing for optimizability requires us to understand the problem. We cannot escape it: to do something good, we need to know what "good" is, and to know what "good" is we need to understand the domain. Beware of over-analyzing though: concentrate on system-wide concerns that are likely to affect performance, and know how the design can vary to enable design lowering. Also, make sure that design seams don't hinder optimizability.

    There is still much more to discuss on this topic, so stay tuned!

    Saturday, May 29, 2010

    LLVM talks

    A few very interesting presentations of LLVM from the LLVM developer's meeting 2009.

    Wednesday, May 26, 2010

    Find parent pid given a pid


    A few days ago I needed to get the parent process id (pid) of a given pid on Linux. There are shell commands for that, e.g., ps -o ppid= -p the-pid, but I needed to get the pid programmatically from C.

    For some reason the getppid function can only return the parent pid of the current process, not of any process. I searched the web for an alternative solution but found nothing.

    So I took a peek at the source for the ps command, which revealed that there is a function called get_proc_stats defined in proc/readproc.h that (among other things) returns the parent pid of a given pid. You need to install libproc-dev to get this function. You can then do:
    #include <stdio.h>
    #include <proc/readproc.h>

    void printppid(pid_t pid) {
      proc_t process_info;
      get_proc_stats(pid, &process_info);
      printf("Parent of pid=%d is pid=%d\n", pid, process_info.ppid);
    }
    and compile it with gcc the-file.c -lproc. I hope you find this useful.

    Sunday, May 16, 2010

    A story of being humble with confidence

    I just read a post over at SKORKS, a blog I read every now and then. He writes a bit about The Art Of Unix Programming, which reminded me of when I first heard of that book. I then remembered reading The Art of Humble Confidence and I felt that I really had to write something along those lines. Here goes.

    It was 2006 and I was working with a handful of experienced colleagues on a project trying to radically increase the usefulness of two large applications by making them work together. This was an extremely interesting time for me and they taught me a lot that I still value very highly today.

    One day one of my colleagues who was sitting next door to my office knocked gently on my open door to get my attention. He said, "Sorry, am I interrupting you? Do you have a moment?" He was humble as always, speaking slowly and quietly to make sure that I wouldn't be anxious about what he'd say next.

    "Do you remember what we talked about before?" Even though I wasn't really sure to what he was referring I replied "sure", thinking that I'd get what he means when he starts to talk about it.

    While he slowly pulled up a chair and sat down next to me he said, "Did you consider what we said about the MessageReceiver class?" I now realized that he was referring to our discussion over coffee the day before. I nodded, remembering that he didn't really like how I had designed some part of the system we were working on.

    Though I couldn't really understand his argument from the day before, I had redesigned it anyway to be more in line with his suggestions. Making a poor design made me feel a bit bad, and not understanding why it was bad made me feel worse. But I didn't want to ask him to explain it (again) because I didn't want to look (more) stupid. That would be even worse. Or so I thought.

    I guess he realized my anxiety about not properly understanding his design, because he next said, "I did a pretty crappy job explaining why that class needed to be changed, right?" He smiled and chuckled, "I was always better at instructing computers than people." We laughed.

    "Anyway", he said, "I read this book a bit yesterday and I think chapter 10 explains what I mean much better than I ever can." He handed me the book and said "you can borrow it if you like." He laughed again and added "but not for too long. I need to read it soon again since you ask so incredibly interesting and hard questions." He got up from the chair and said "let's go and get some coffee." He smiled and added "I'm 'starving'".

    I grabbed my cup and we walked over to our colleagues' offices and asked them to join us. As we walked to the coffee machine I felt like I was in the greatest group of developers there was. Everyone was working towards our mutual goal while having fun, learning, and teaching at the same time.

    My colleague had basically just asked me questions, yet managed to tell me something. Yes, he even managed to tell me what to do. But more importantly, he taught me that you will never know everything and that working in software is a constant process of learning.

    Tuesday, May 11, 2010

    My worst mistakes (since two days)


    I just spent two days writing, testing, rewriting, testing, debugging, etc, a piece of code, only to find that the error was five misplaced pixels. Learn from my mistake: never use i and j as indices in nested loops.

    What made this problem worse than it needed to be was that the erroneous code was in a test case. Learn from my mistake: never write complex control flow in your test code.

    What made it take longer than it should have for me to find this mistake was that I didn't assert my assumptions. Learn from my mistake: always assert; never assume anything about any piece of code when you're chasing weird behavior.

    Friday, May 7, 2010

    Threading is not a model

    I just saw the Threading is not a model talk by Joe Gregorio from PyCon 2010, which I found very interesting. It has some history about programming language development, and some discussion about the design of every-day physical stuff and every-day programming language stuff. I especially find the idea of sufficient irritation very funny and interesting. :)

    The main part of the talk is about different ways of implementing concurrency, mainly CSP (Communicating Sequential Processes) and Actors. Interesting stuff presented by a good speaker.

    Must-know tricks for safer macros

    Almost every developer knows to stay away from (C-style) macros unless it's absolutely necessary to use them, or unless it saves a lot! of code to use them. And when they do use them, they make sure to write them properly. I won't go into the details of the traditional ways of writing good macros (as in "harder to misuse, but still possible") because there are several pages like that already (see the links above). Instead, I'll discuss an entirely different way of making macros more safe.

    Why macros are hard

    Let's describe the problems with macros with an example. This simple macro definition multiplies the provided argument with itself to yield the square of the input:
    #define SQUARE(x) x * x
    Looks simple, right? Too bad, because it's not that simple. For example, what happens when the following code is evaluated:
    int a = 2;
    int b = SQUARE(a + 1);
    I tell you what happens: all hell breaks loose! The above code is expanded into:
    int a = 2;
    int b = a + 1 * a + 1;
    thus, b will equal 2 + 1 * 2 + 1 = 5. Not quite what we expected by looking at the code SQUARE(a + 1), right? All in all, macros look simple and harmless enough, but are not at all simple to get right. And definitely not harmless; on the contrary, it's extremely easy to get bitten horribly badly. We are now going to discuss how to make macros a bit safer to work with.

    Making them softer: check your typing

    Types are an important part of the C language, and even more so of C++ with all its classes, templates, and function overloading. Macros, though, are simple text substitution without knowledge of types, so macros fit very badly into the normal type-centric C/C++ world we are used to working with.

    For example, say you give a function the wrong types as arguments. What do you get? A type error. No code is emitted by the compiler. This is good. On the other hand, if you give a macro the wrong types as arguments, what do you get? If you're lucky, some kind of error; maybe a syntax error, maybe a semantic error. If you're unlucky, though, you won't get any error at all. The compiler will just silently emit ill-behaving code into the .o file. This is extremely bad, because we're fooled into believing our code works as we expect it to.

    Luckily, there is a way of making macros more safe in this regard. Let's take a simple, yet illustrative example: a macro called ZERO that takes one argument, which is a variable, and sets it to 0. The first version looks like this:
    #define ZERO(variable) variable = 0;
    and is intended to be used inside a function like this:
    void func() {
      int i;
      ZERO(i);
      // more code here...
    }
    Simple, but not safe enough for our tastes. For example, this macro can be called as ZERO(var0 += var1) and it will produce code the compiler accepts, but that code does not have the behavior the macro was intended to have. The macro will expand this code to var0 += var1 = 0, which (I think) is equivalent to var1 = 0; var0 += 0. Whatever the expanded code does, it's not what we intended ZERO to do. In fact, ZERO was never designed to handle this kind of argument and should thus reject it with a compilation error. We will now discuss how to reject such invalid arguments. Here goes...

    Halt! Identify yourself!

    To make sure that the compiler emits an error when the ZERO macro is given a non-variable as argument, we rewrite it to:
    #define ZERO(variable) \
    { enum variable { }; } \
    variable = 0;
    That is, an enumeration is declared with the same name as the argument, inside a new scope. This makes sure that the argument is a valid identifier and not just any expression, since an expression can't be used as a name for an enumeration. For example, the previously problematic code, ZERO(var0 += var1), will expand to:
    { enum var0 += var1 { }; } var0 += var1 = 0;
    which clearly won't compile. On the other hand, given a correct argument, e.g., the code ZERO(var0), we get
    { enum var0 { }; } var0 = 0;
    which compiles and behaves as we expect ZERO to behave. Neat! Even neater, the compiler won't emit any code (in the resulting .o-file) for the extra "type-checking code" we added, because all it does is to declare a type, and that type is never used in our program. Awesomeness!

    So we now have a pattern for making sure that a macro argument is a variable: declare an enumeration (or a class or struct) inside a new scope with the same name as the variable. We can encapsulate this pattern in a macro VARIABLE and rewrite ZERO using it:
    #define VARIABLE(v) { enum v { }; }
    #define ZERO(variable) \
    VARIABLE(variable) \
    variable = 0;
    Note that with a bit of imagination, the definition of ZERO can be read as the signature (VARIABLE(variable)) followed by the macro body (variable = 0;), making macros look more like the function definitions we are familiar with. This wraps up our discussion about variables as macro arguments. But read on, there's more!

    Constants

    Let's assume that we wish to generalize ZERO into another macro called ASSIGN that sets the provided variable to any constant integer expression, not just zero. For example, 1, 2, and 39 + 3 are valid arguments, but i + 2, 1.0, and f() are not, because they are not constant integers. One way of defining such a macro is as follows:
    #define ASSIGN(variable, value) \
    VARIABLE(variable) \
    variable = value;
    that is, we simply added an argument value that variable is assigned to. Simple, but as usual with macros, very easy to misuse. For example, ASSIGN(myVar, myVar + 1) will assign a non-constant value to myVar, which is precisely what we didn't want ASSIGN to do.

    To solve this problem, we recall that an enumerator (a member of an enumeration) can be assigned a constant integer value inside the enumeration declaration. This is exactly the kind of value we wish ASSIGN to accept; thus, we rewrite it into the following code:
    #define ASSIGN(variable, value) \
    VARIABLE(variable) \
    { enum { E = value }; } \
    variable = value;
    This version of ASSIGN only accepts variable names for its first argument and constant integers for its second argument. Note that the constant can be a constant expression, so things like ASSIGN(var0, 1 + C * D) will work as long as C and D are static const ints. If we extract the pattern for checking that an argument is a constant integer into CONST_INT, we get the following two definitions:
    #define CONST_INT(v) { enum { E = v }; }
    #define ASSIGN(variable, value) \
    VARIABLE(variable) CONST_INT(value) \
    variable = value;
    As for the final version of ZERO, the definition of ASSIGN can be read as the signature of ASSIGN followed by the body of it.

    Types

    Now we will modify ASSIGN into DECLARE: a macro that declares a variable of some type, which is provided as an argument to DECLARE. Similar to ASSIGN, DECLARE initializes the variable to the provided constant integer expression. Our first implementation of such a macro is:
    #define DECLARE(type, variable, value) \
    VARIABLE(variable) CONST_INT(value) \
    type variable = value;
    However, the compiler will accept code like DECLARE(int i =, j, 0) (assuming j is a declared variable and i is not). So, following our habit from the previous examples, we wish to make it a bit safer by making sure the type argument actually is a type, e.g., int, MyClass, or MyTemplate<MyClass>. We do this by having the macro use type as a template argument, as follows:
    template<typename TYPE> class TypeChecker { };
    #define DECLARE(type, variable, value) \
    VARIABLE(variable) CONST_INT(value) \
    { class Dummy : TypeChecker<type> { }; } \
    type variable = value;
    This definition is much safer from misuse than the previous one; code like DECLARE(int i =, j, 0) won't compile. If we extract the argument-checking code into a separate macro, TYPE, we get:
    template<typename TYPE> class TypeChecker { };
    #define TYPE(t) { class Dummy : TypeChecker<t> { }; }
    #define DECLARE(type, variable, value) \
    TYPE(type) VARIABLE(variable) CONST_INT(value) \
    type variable = value;
    As before, note that we can read this definition as two parts: first the macro signature and then the macro body. Compiler-enforced documentation FTW!
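    To see how these macros read at the point of use, here is a small, made-up usage sketch (assuming the TypeChecker template and the TYPE, VARIABLE, CONST_INT, ASSIGN, and DECLARE macros above are in scope):
    void example() {
      DECLARE(int, counter, 0)            // OK: a type, a variable name, and a constant
      ASSIGN(counter, 39 + 3)             // OK: a variable name and a constant expression
      // DECLARE(int, counter2, rand())   // would not compile: rand() is not a constant
      // ASSIGN(counter + 1, 7)           // would not compile: counter + 1 is not a variable
    }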

    Everyone's unique

    To not make this post too long, I'll stop giving background and reasons for the rest of the type-checking macros I'll present from now on. I'll just briefly describe what they do.

    The following macro makes sure that a list of identifiers only contains unique identifiers:
    #define UNIQUE(is...) { enum { is }; }
    Note that this macro requires that the compiler supports macro varargs. It is used as UNIQUE(name1, name2, name3) or UNIQUE(name1, name2, name1), where the former is OK, but the latter will emit an error.

    Comparison

    These macros compare constant integer expressions in various ways. The basic idea here is that the size of an array must not be negative, and that the boolean value true is converted into 1 in an integer context while false is converted to 0. We use this to implement the macro IS_TRUE as follows:
    #define IS_TRUE(a) { struct _ { int d0[!(a)]; int d1[-!(a)]; }; }
    Many comparison macros are then trivially implemented using IS_TRUE, for example:
    #define LESS_THAN(a, b) IS_TRUE(a < b)
    #define EQUAL(a, b) IS_TRUE(a == b)
    #define SMALL_TYPE(t) IS_TRUE(sizeof(t) < 8)
    You may ask yourself why such macros are needed. Shouldn't templates be used here instead? I agree, but there are some of us who are (un-)lucky enough to use C and not C++...

    Let's get general

    The general idea we've used so far is to have two parts of the macro: one part that implements the desired (visible) behavior, and another part that works like type-checking code. The type-checking code is implemented with short, trivial, side-effect free pieces of code that will only compile under the assumptions you make about the arguments. For example, that the argument is a variable, or that the argument is a constant integer expression.

    Of course, it may still be possible to fool the "type-checking" code, but it's much less likely to happen unintentionally, and those are the most important cases to catch.

    Descriptive error message?

    In short: no. The error messages you get from any of these type-checking macros are extremely non-descriptive. However, any error message, even a weird and non-descriptive one, is still better than no error message at all and ill-behaving binary code.

    The sum of all fears

    Does the approach described here solve all problems with macros? No, it does not. It does, however, make them less of an issue. It is possible to write macros that are type-safe and behave in a good way (by which I mean: either compile into correct code or do not compile at all). However, I'm pretty sure there are uses for macros that cannot be covered with this approach.

    Despite this, I highly recommend using this idea when you write macros. It will make the macro better in most ways possible, e.g., safer and better documented. Compiler-enforced documentation, even! Just like type declarations in all your other C/C++ code. Neat, don't you think?

    Sunday, May 2, 2010

    Beautiful Dependency Injection in C++


    Dependency injection is a very nice way of making classes testable and more reusable. An instance of a class Foo that a class NeedsFoo depends on is simply injected into NeedsFoo's constructor. That is, the client of NeedsFoo (the code instantiating it) controls how Foo instances are created; thus, NeedsFoo is more decoupled from the rest of the system. Why? Because any object that is a Foo can be used with NeedsFoo: subclasses of Foo, instances shared with other objects, or a Foo created in a special way (e.g., a singleton or a proxy to a remote object). Compare this to the traditional non-dependency-injection way, where Foo is instantiated by NeedsFoo, making it impossible for the client to control what kind of Foo instance is used by NeedsFoo.

    Object life-time management complicated

    Dependency injection is straightforward to do (correctly) in languages with automatic memory management like Java and Python, but it's much harder to get right in languages like C++ which force you to manage memory manually. Of course, it is possible to simply delete the injected object in NeedsFoo's destructor, like this:
    class NeedsFoo {
    Foo* foo;
    public:
    NeedsFoo(Foo* foo) : foo(foo) { }
    ~NeedsFoo() { delete foo; }
    // Methods using foo that needs to be tested.
    };
    However, this is far from an optimal solution, because now all objects given to any instance of NeedsFoo must be heap allocated, since we delete them in the destructor. So even when NeedsFoo is stack allocated (e.g., for performance reasons), its Foo object must be heap allocated. Besides being hard to get right, heap allocation is considerably more costly than stack allocation. So if we want performance we're screwed, and since we're using C++ I assume that performance is important. If it's not, we could just as well use some other language.

    Another reason, arguably more important, that doing delete in the destructor is bad! bad! bad! is that the injected object cannot easily be shared with some other part of the application: who knows when (and if) the object should be deleted...? NeedsFoo doesn't know, that's for sure.

    How do we fix this? Well, we could simply remove the delete from the destructor:
    class NeedsFoo {
    Foo* foo;
    public:
    NeedsFoo(Foo* foo) : foo(foo) { }
    // Methods using foo that needs to be tested.
    };
    Easy to test, right? Yes. Easy to use in production code? Well, kind of. Easy to use correctly in production code? Definitely not!

    Why? Because, as said before, C++ lacks garbage collection, so the injected object needs to be managed manually somehow -- for example using reference-counting pointers, or by making sure that foo has the same lifetime as the NeedsFoo instance. We will focus on the latter for the remainder of this post. But first, a short discussion of why it is common that the injected objects (e.g., a Foo instance) should be destroyed at the same time as the object they are injected into (e.g., a NeedsFoo instance).
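    For completeness, here is a minimal sketch (mine, not part of the original post) of what the reference-counting option could look like, assuming a shared_ptr implementation such as the one in TR1 or Boost; the rest of this post takes the other route:
    class NeedsFoo {
    std::tr1::shared_ptr<Foo> foo;
    public:
    NeedsFoo(std::tr1::shared_ptr<Foo> foo) : foo(foo) { }
    // Methods using foo that needs to be tested.
    };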

    Let them here on be joined and never parted

    Dependency injection is used a lot to make code more testable. However, in my experience it is often the case that if it wasn't for testing, dependency injection wouldn't be needed, because it's always the same kind of objects that are injected (e.g., the Foo instance is always instantiated the same way for all NeedsFoo instances). In a way, using dependency injection is a kind of over-design: the code is more generic than it needs to be and, thus, more complex than it needs to be.

    Despite this, we still do dependency injection and we still consider it part of a good design. Why? Because, as said, it makes the code testable. We consider testable code to be more important than simple design. I can accept that for languages with a garbage collector, but when you do the memory management manually the design gets much! much! much! more complex. Not good, not good. In fact, it's terrible. The reason it's terrible is that memory management is a program-global problem that isn't easy to encapsulate.

    For example, if I call malloc in my code and pass the pointer to a function defined in a library, how do I know if that memory is deallocated by the library? I can't. Not without manually inspecting the source code of the library. How about using smart pointers? Doesn't help much. The library is most probably a C library. Crap.

    So, how can we do dependency injection to simplify testing while keeping memory management simple? This is what we will discuss now, my friend. Let's go!

    Object life-time management simplified

    One way of making sure that a bunch of objects are created and destroyed at the same time is to stuff them into a class:
    class Concrete : public NeedsFoo {
    Foo foo;
    public:
    Concrete() : NeedsFoo(&foo) { }
    };
    Here, an instance (actually an instance of a subtype) of NeedsFoo is created and injected with a Foo object. Since both objects are held in a Concrete instance, the NeedsFoo and the Foo instances will be destroyed at the same time. Sweetness!

    This approach works well under the assumption that we never need to inject a subclass of Foo into NeedsFoo. (Well, of course it's still possible to inject a subclass into NeedsFoo, but if we do that we are back to square one, since we need to manage those objects' lifetime manually; Concrete is only usable for Foo objects, not subtypes of Foo.)

    So, for classes that are injected with the same kind of instances, this approach solves the memory management problem. Goody-goody!

    Lies, damn lies, and untested source code

    Wait, what's that? Do you hear the cries? Breaking news! Lies on the Internet! Read all about it! It's sad, but true. And I'm a bit ashamed to tell the truth... but here goes.

    In the code for Concrete above, the injected Foo object is not yet constructed when it's injected into NeedsFoo's constructor. Why? Because the constructors of base classes are called before the constructors of derived classes. In other words, NeedsFoo's constructor is called before Concrete's, and thus before Foo's constructor. Similarly, when NeedsFoo's destructor is called, Foo has already been destroyed. Why? Because NeedsFoo's destructor is called after Concrete's.

    So what does this mean for us? Is everything we've done here useless? Fortunately, it is not that bad. All we need to do is make sure Foo isn't used in the constructor or destructor of NeedsFoo. More precisely, we must not in any way touch the memory where Foo will be/was constructed. In fact, this is a good rule in any case, because constructors shouldn't do any work anyway. Constructors should ideally just initialize the class' members. Destructors shouldn't do any work either (except for destroying objects, closing sockets, etc., of course).
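    To illustrate the rule, here is a hypothetical example of mine of what must be avoided (prepare() is a made-up method on Foo):
    NeedsFoo(Foo* foo) : foo(foo) {
    // foo->prepare(); // don't: when called from Concrete's constructor,
    // the Foo member has not been constructed yet
    }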

    Making it testable (my favourite pass-time)

    Let's take a step back before we continue and look at the pieces we have. Note that NeedsFoo uses Foo, which is a concrete class, not an interface. Thus, there is no way to inject an object that does not share any implementation with Foo (e.g., because Foo's constructor is called even for derived classes). But this is precisely what we need for testing! Crap! How to solve this? Templates to the rescue!

    I know what you think: what a stupid template-loving masochistic guy. Yeah, you're half-right. I'm a guy. But not a template-loving masochistic one. Just a guy who actually does think templates can be put to good use (as in: std::vector). However, templates can also be used in ways they weren't designed for (as in: most of Boost). There are better ways of doing so-called "template meta-programming" (affectionately called so by the Boost guys; disgustedly called so by sane people[1]). The D language is a good example of this. Check out this talk by the inventor of D, Walter Bright, about how meta-programming should be done.

    Anyway, if we use templates we get:
    template<class TFOO>
    class NeedsFoo {
    TFOO foo;
    public:
    NeedsFoo(TFOO foo) : foo(foo) { }
    // Methods using foo that needs to be tested.
    };
    and the derived class that manages the injected objects life-time becomes:
    class Concrete : public NeedsFoo<Foo*> {
    Foo foo;
    public:
    Concrete() : NeedsFoo<Foo*>(&foo) { }
    };
    So, now we have a class that contains the code with interesting behavior (NeedsFoo) and a class that contains the construction and life-time management of the injected objects (Concrete). Furthermore, NeedsFoo can be instantiated with any kind of class that looks like a Foo class, e.g., a stub implementation of Foo. This means that the interesting code can be tested easily because of dependency injection, while still being easy to instantiate and pass around.
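    For example, a test could look something like this sketch of mine (FooStub, doSomething and testNeedsFoo are made-up names for illustration):
    class FooStub {
    public:
    int calls;
    FooStub() : calls(0) { }
    void doSomething() { ++calls; }
    };
    void testNeedsFoo() {
    FooStub stub;
    NeedsFoo<FooStub*> testee(&stub);
    // Exercise testee here and assert on stub.calls.
    }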

    Also, note that the template part of the class is never visible in the production source code. That is, all other parts of the production code use Concrete, not NeedsFoo. Furthermore, the implementation of NeedsFoo does not need to be given in the header file, as is traditional for template classes. The reason is that we know all possible types TFOO will refer to. Thus, the build time will not increase significantly using this approach compared to the initial code we had.

    Another important thing to note is that this style of dependency injection is in some ways more powerful than the Java-style ditto, because NeedsFoo can actually instantiate the template parameter class TFOO itself. Of course, that requires TFOO to have a matching constructor. In Java, you would need to write a factory to achieve something roughly equivalent.
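    As a sketch of mine of what that could look like (OwnsFoo is a made-up name, and I assume the template parameter has a constructor taking an int):
    template<class TFOO>
    class OwnsFoo {
    TFOO foo;
    public:
    OwnsFoo(int size) : foo(size) { }
    // Methods using foo that needs to be tested.
    };
    // Usage: OwnsFoo<Foo> real(1024); or, in a test, any stub type with a matching constructor.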

    Pros and Cons

    There are several things to consider about the code we started with and what we ended up with. First of all, the code for NeedsFoo has become more complex. This is bad, but I would argue that the test-cases for the production code will be much simpler. Thus, the total complexity (of production code + test-cases) is less for the templated code than for the code we started with. This is good.

    Second, the production code we ended up with is more generic yet maintains its original simple interface. I say this because the main implementation (NeedsFoo) is now a template; thus, any class that looks like Foo can be used instead of Foo. Yet the interface is simple, since the most common case (when Foo is used) is still simple thanks to the Concrete helper class. This is good.

    Third, the performance of the templated code should be the same as the original code, because there are no unnecessary virtual calls. Virtual methods are traditionally used for making a class testable, but we managed to avoid those here. Instead we use templates, which means that method calls can be statically bound to a concrete method implementation. Thus, the compiler can do a better job at optimizing the code. This is good.

    The major drawback with this approach is that the error messages emitted by the compiler usually get harder to read. This is common for templated code, though, since the error messages contain a bunch of references to the template and the types of the template parameters (here TFOO). For someone who is a bit familiar with the compiler's error messages it shouldn't be very hard to figure out what they mean. However, compared to a non-templated solution, the error messages are much harder to understand for someone not used to them.

    I recommend you to try this way of doing dependency injection in C++. I'm not saying it's the best way of doing it under all circumstances, but I am saying it's a good card to have up your sleeve. A card that may come in handy sometime.

    [1] The Boost guys have my fullest respect for what they are doing. It is a really impressive project that has helped me more times than I can count. But that doesn't change the fact that what they are doing is insane.

    Saturday, May 1, 2010

    Best TED-talk ever

    TED is an extremely good place to find interesting (short) talks about various topics. You can literally spend days there watching really high-quality presentations ranging from musical performances to science talks. Today, though, I saw something that beats every other TED-talk I've ever seen before.

    It was short and witty, it was science and humor, and it was refreshingly different. And best of all, you actually learnt something useful! Stop wasting your time here. Get over there and see it now!

    Saturday, April 24, 2010

    Lambda the Ultimate Joker

    From Lambda the Ultimate Weblog:
    All people are lazy, but very eager when evaluated.
    Funny stuff for all us language nerds. :)

    Sunday, April 18, 2010

    Testing generated code


    I started to write this post roughly one and a half years ago. I never finished writing it until now for whatever reason. Here's the post.

    <one and a half years ago>
    At work I'm currently developing a little tool that generates Java code for decoding messages (deeply nested data structures) received over a network. To be more precise, the tool generates code that wraps a third-party library, which provides generic access to the messages' data structures. Slightly amusing is that the library consists of generated Java code.

    The third-party library is... clumsy to use, to say the least. Its basic problem is that it is so generic/general that it feels like programming in assembler when using it. Ok, I'm exaggerating a bit, but you get the point: its API is so general it fits no one.

    There are several reasons for hiding a library like this:
    • simplifying the API such that it fits your purpose;
    • removing messy boilerplate code that's hard to read, understand, and test;
    • less code to write, debug, and maintain;
    • introducing logging, range checks, improved exception handling, Javadoc, etc, is cheap;
    • removing the direct dependency to the third-party library; and
    • you get a warm fuzzy feeling knowing that 3 lines of trivial DSL code correspond to something like 60-80 lines of messy Java code.
    I'm developing the code generator using Ruby, which has had its pros and cons. The good thing with Ruby is that it's easy to prototype things and you can do fairly complex stuff really easily.

    On the bad side is that I regularly get confused about what types go where. This isn't too much of a problem from a bug point-of-view, since test-cases catch the mistakes I make, but it's a bit of an annoyance if you're used to developing Java with Eclipse like I am. The Ruby IDE I'm using (Eclipse RDT) is not nearly as nice as Eclipse JDT, which is natural since an IDE for a dynamic language has less information available for refactorings, content assist, etc.

    I've discovered that a functional style is really the way to go when writing code generators, especially when there are no requirements on performance. This is a nice fit, because Ruby encourages a functional programming style -- or rather, it encourages me.

    What keeps biting me when it comes to code generators is how to test them. Sure, the intermediate steps are fairly easy to test, that is, while things are still objects and not just a chunk of characters. But how is the output tested?

    Usually I test the output (the generated code) by matching it against a few regular expressions or similar, but this isn't a very good solution, as the test-cases are hard to read. Also, if an assertion fails, the error message isn't very informative (e.g., <true> was not <false>). Furthermore, test-cases like these are still just an approximation of how the code really should look. For example, I've found no general and simple way of testing that all used classes (and no other classes) are imported. So, it is possible that a bug such as:
    import tv.muppets.DanishChef;
    wouldn't be found by my test-cases, even though the resulting code wouldn't even compile (d'oh! The chef's from Sweden, not Denmark!).

    Ok, this could be fixed by having some test-cases that actually compile the output by invoking a Java compiler. Not very beautiful, but still possible. Even better, the generated-and-compiled code could be tested with a few hand-written test-cases. These test-cases would test the generated code, alright, but would not document the code at all (since "the code" here means the Ruby code that generated the executed Java code). This problem, I'm sad to say, I think is impossible to solve with reasonable effort.

    This approach is good and covers a lot of pitfalls. However, it misses one important aspect: generated documentation. The code generator I wrote generated Javadoc for all the methods and classes, but how should such documentation be tested? The generated documentation is definitely an important part of the output, since it's basically the only human-readable part of it (in other words, the generated code that implements the methods is really hard to read and understand).

    Another approach is to simply compare the output with the output from a previous "Golden Run" that is known to be good. The problem here is, of course, to know what is 'good'. Also, when the Golden Run needs to be updated, the entire output of the (potential) new Golden Run has to be manually inspected to make sure it really is good/a Golden Run. The good thing with a "Golden Run" is that the documentation in the generated code is also tested, not only the generated code.

    The approach I'm using is to have a lot of simple test-cases that exercise the entire code generator (from input to output). Each of these test-cases verifies a certain aspect of the generated code, for instance:
    • number of methods;
    • number of public methods;
    • name of the methods;
    • a bunch of key code sequences exists in method implementations (e.g., foo.doIt(), new Bar(beer), baz.ooka(boom), etc);
    • for a certain method the code looks exactly like a certain string (e.g., return (Property) ((SomethingImpl) s).propertyOf(obj););
    • name of the class;
    • name of the implemented interfaces;
    • number of imports;
    • that used classes are imported;
    • key phrases exist in the Javadoc.
    In addition to these test-cases, there are test-cases that simply generate a lot of Java classes and compile them. The test-cases pass if everything compiles. Finally, some of these generated classes are tested using hand-written JUnit test-cases. For me, this approach worked well, and I didn't experience any problems (meaning more problems than normal) with testing the generated code. For the generated documentation, however, there were many more bugs. These bugs didn't get fixed either, because fixing them wasn't high priority. Too bad.
    </one and a half years ago>

    Looking back at this story now, I would have done things differently. First of all, I would not use Ruby. Creating (internal) DSLs with Ruby is really easy and most of the time they turn out really nice. I've tried doing similar things with Python, Java and C++, but the syntax of these languages just isn't as forgiving as Ruby's. However, the internal DSL I created for this tool never used the power of Ruby, so it could just as well be an external DSL instead. An external DSL could just as well be implemented in some other language than Ruby.

    Second, I would consider just generating helper methods instead of entire classes and packages of classes. So instead of doing:
    void setValueFrom(Message m) {
    this.value = new GeneratedMessageWrapper(m).fieldA().fieldB().value();
    }
    you would do
    void setValueFrom(Message m) {
    this.value = getValue(m);
    }
    @MessageDecoder("Message.fieldA.fieldB.value")
    int getValue(Message m) {
    }
    where the code for the getValue method would be generated and inserted into the file when the class is built. This way, much less code would need to be generated, no DSL would be needed (annotations are used instead), and things like name collisions would not be such a big problem as they were (believe it or not).

    At any rate, writing this tool was a really valuable experience, for many reasons, and I wouldn't wish to have it undone. Considering how good it was (meaning how much code it saved us from writing), it was a success. On the other hand, considering how much time we spent on getting it to work, it was a huge failure. As it turns out, this tool has since been replaced with a much simpler variant written entirely in Java. Although simpler, it gives the user much more value, so it's arguably much better than that stuff I wrote 1.5 years ago. My mistake.