Saturday, September 4, 2010

The fridge, the door and the fire alarm

Today when I prepared breakfast I realized that the fridge wasn't properly closed. It had been slightly open the whole night. That's not particularly good.

But what's worse is that the fridge did not make any noise indicating that the door was open. In fact, it did the opposite: it turned off the only indication there was that the door wasn't properly closed -- it turned off the light inside the fridge. If that light had stayed on, it would have been easier to spot that the door was open.

Turning the light off would not be a big problem if the fridge had some way of sounding an alarm when the door was open. Let's say, making a noise when it had been open for more than 1 minute.

I'm sure that it's a feature that the fridge turns off the light by itself when it has been on for too long. But I can't see the use case where that would be useful. There must be a timer somewhere that turns off the light. That timer should instead trigger an alarm.

Actually there is a noise-making device in the fridge. But it's only used when the temperature in the fridge is above a certain level. That noise-making device should be triggered by that timer. Why isn't it?

I don't know.

Recently I've noticed more and more weird design choices in everyday things. Like having the handle of a door be shaped like a pipe when you should push the door, and having the handle be shaped like a flat surface when you should pull the door (think about it: a surface is easy to push and a pipe is easy to pull).

Even worse, I've experienced fire alarms that sound very, very much like the break-in alarm.

Friday, September 3, 2010

Mim: the build system you always wanted

The word mim means any of the following:
  • an incorrect way of writing 1999 in Roman numerals (as in "we gonna party like it's 1999"),
  • Madam Mim from the Disney movie The Sword in the Stone,
  • Swedish for "mime", meaning "imitating",
  • a figure in Norse mythology renowned for his wisdom,
  • an acronym for "Mim Isn't Make".
It's also the name of the build system I've been thinking about and working on for a while. This post is about what Mim is now and what it can become. Text in red describes stuff that isn't implemented yet, text in yellow describes things that are kind-of implemented, and text in black describes implemented features. First, I'll describe how Mim is different from other build systems and why it's better.

Problem
The problems with traditional build systems are:
  • it's hard to make sure the dependencies are correct and built in the right order (think: make depend && make && make install),
  • it's hard to be sure that every built file you see is up to date,
  • it's hard to understand how a software project is built (what are the artifacts? where are they stored? what are the dependencies?),
  • it's hard to understand which variables a build has (e.g., make DEBUG=1).
Solution
The solution to these problems is (of course) Mim. With Mim you clearly see all the artifacts the build system produces, how they are built, and when they need to be built. In fact, you can see the artifacts, e.g., by doing ls, before they are built. In other words, using your normal command line tools you can browse the project file tree to find the artifact you need (to run, to view, to copy, etc.) without building anything. Sounds like magic? Keep reading...

When you have found the program you need built, you simply execute it. No need to issue any command yourself to build it. The program will be automatically built for you and then executed. Similarly, if you have a tool that generates a text file as part of the build, you can do emacs file.txt. The file will be automatically generated and then opened in emacs.

If you need to get more information about a file you can do mim ls filename. You'll get something like this written to the console:
-r-xr-xr-x you users filename (g++ filename.cc -o filename)
That is, you can easily discover the access rights, the owner, the group, and the build command of any artifact simply by locating it in the file system.

This is especially powerful when you work with projects in which source code generation is part of the build process. Understanding which files are generated, and how they are generated, can be very hard. With Mim, however, this is very easy, because every file (generated or not) looks the same to you when you browse the file tree. Additionally, you can ask Mim for more information about a certain artifact file, e.g., get the command for building the file by doing mim cmd file.

Furthermore, getting the dependencies right when the build involves several steps (generating .d files, generating source code, generating .d files for the generated code, compiling the source code, linking the compiled code against libraries (which themselves have to be built, including their generated source code)) is also very, very hard. And understanding it a few months later is even harder. Mim solves all these problems.


Let's take an example. Let's assume we have a directory called hello with a file hello.cc containing an implementation of Hello, World. There is also a file called Mimfile that tells Mim how to build this program. Doing mim ls in hello gives you:
greeting.txt
hello.cc     codegen -lang en greeting.txt -o hello.cc
hello-en     g++ hello.cc -o hello-en
Mimfile
which tells you that there is only one physical file, greeting.txt, in this directory (besides the obligatory Mimfile) and two artifact files, hello.cc and hello-en, of which the latter is an executable. You can also see that there is some code generation involved by looking at the command line for hello.cc. You can tell all this by issuing one simple command, mim ls, without building anything at all.

Let's say you wish to take a peek at the hello.cc file. All you need to do is cat hello.cc. No need to worry that the code you see is out of date, because Mim makes sure it's up to date before it's opened.

Being lazy
Being lazy is a good attribute, as long as you're not too lazy. Being lazy in the context of a build system means that you only rebuild a file if its inputs have changed. How does Mim achieve this?

Mim works by intercepting all read and write accesses to the file system. This means that Mim has perfect knowledge of 1) which files are read when a given artifact is built, and 2) which files have been updated since the artifact was last built. As a result, artifacts are only rebuilt when needed.

This information makes it possible for Mim to easily construct a dependency graph and work out what needs to be built. It also means that Mim can start building the leaves of the graph as early as possible to reduce the build time.
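
To make the idea concrete, here is a minimal sketch in Python of how recorded file reads could be turned into a rebuild list. This is not Mim's actual code: the reads table, the function name stale_artifacts, and the way changes are passed in are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical record of what could be learned by watching file accesses:
# artifact -> set of files that were read while building it.
reads = {
    "hello.cc": {"greeting.txt"},
    "hello-en": {"hello.cc"},
}


def stale_artifacts(reads, changed):
    """Return the artifacts that must be rebuilt, inputs before dependents."""
    # static_order() yields every node after all of the files it reads.
    order = list(TopologicalSorter(reads).static_order())
    dirty = set(changed)
    rebuild = []
    for node in order:
        # An artifact is stale if any file it read is changed or stale.
        if node in reads and reads[node] & dirty:
            dirty.add(node)
            rebuild.append(node)
    return rebuild
```

With the table above, changing greeting.txt marks both hello.cc and hello-en as stale, while changing nothing gives an empty list, which is the "lazy" behaviour described above.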

Instant messaging
In large projects there are usually variables that control how the project is built. Mim supports such variables in a nice way. To list the variables simply do mim conf, and you'll get something like this printed to the console:
lang      en   en|fr|swe  Controls in which language "Hello, World" is printed.
debug     off  on|off     Compile with or without debug information.
optimize  0    0|1|2|3    Optimization level as defined by gcc's -O flag.
The first column is the name of the variable, the second column is its current value, the third is its possible values, and the last column is a textual description of it.

Changing the value of the variable lang to swe is done by doing mim conf lang=swe. Setting a variable has instant effect. For example, if a variable affects which files are built, you can see the updated list of files immediately by doing ls without building anything. Example:
$ ls hello*
hello.cc  hello-en
$ mim conf lang=swe
$ ls hello*
hello.cc  hello-swe
That is, by changing the variable lang from en to swe we get an artifact file called hello-swe instead of hello-en like we got before.

To aid discoverability even more, you can do mim ls hello* lang==* to list all files matching hello* for all values of lang. For the hello example this outputs:
hello-en   lang=en   g++ hello.cc -o hello-en
hello-fr   lang=fr   g++ hello.cc -o hello-fr
hello-swe  lang=swe  g++ hello.cc -o hello-swe
All these features help you learn how a project is built, how to use its build system, how to change it, how those changes affect the build system, and more.
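
One way to picture the listing above is as a template expanded once per value of the variable. The sketch below is Python and purely illustrative: the {lang} placeholder syntax and the names variables, name_template, command_template and expand are my assumptions here, not Mimfile syntax.

```python
# Hypothetical variable table, mirroring the mim conf listing.
variables = {"lang": ["en", "fr", "swe"]}

# Hypothetical templates; "{lang}" stands for the value of the variable.
name_template = "hello-{lang}"
command_template = "g++ hello.cc -o hello-{lang}"


def expand(variables, var, name_template, command_template):
    """List (artifact, setting, command) for every value of one variable."""
    rows = []
    for value in variables[var]:
        binding = {var: value}
        rows.append((
            name_template.format(**binding),
            f"{var}={value}",
            command_template.format(**binding),
        ))
    return rows
```

Calling expand(variables, "lang", name_template, command_template) gives one row per language, in the same shape as the three-line listing for hello-en, hello-fr and hello-swe.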


The Tracker
Since Mim intercepts all accesses to the file system, it can help you when you do potentially bad things, for example, if you delete a file that is needed to build an artifact. Mim also updates your Mimfiles automatically when you make non-destructive changes to your source tree, for instance when you rename an input file to gcc, as the following example illustrates:
$ mim ls hello*
hello.cc  codegen -lang en greeting.txt -o hello.cc
hello-en  g++ hello.cc -o hello-en
$ mv hello.cc hi.cc
$ mim ls h*
hi.cc     codegen -lang en greeting.txt -o hi.cc
hello-en  g++ hi.cc -o hello-en

Mim can do this automatic update of Mimfiles correctly in many cases, but it's impossible to get it right every time since that would require deep knowledge of all possible build tools (e.g., gcc, ld, m4, etc.). However, Mim makes certain (reasonable) assumptions about how the build tools behave, and as long as the build tools conform to these assumptions Mim gets it right most of the time. Anyway, to check that Mim got it right, you simply build your application, so this isn't a big problem in practice.
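
As a rough illustration of one such assumption, the Python sketch below treats every whole command-line token equal to the old file name as a reference to that file and rewrites it. The function rename_in_rules and the rules dictionary are hypothetical; this is not how Mim itself is implemented.

```python
import shlex


def rename_in_rules(rules, old, new):
    """Return rules with every whole-token occurrence of old replaced by new.

    rules maps an artifact name to the command line that builds it.
    Assumption: a token equal to the old name always refers to that file.
    """
    updated = {}
    for artifact, command in rules.items():
        tokens = [new if tok == old else tok for tok in shlex.split(command)]
        target = new if artifact == old else artifact
        updated[target] = shlex.join(tokens)
    return updated
```

Applied to the two rules in the example above with old="hello.cc" and new="hi.cc", this gives the rules shown after the mv. A tool that embeds file names inside a larger argument would break the assumption, which is exactly the kind of case where the automatic update can get it wrong.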

Getting general
Mim is built around the concepts of events and actions triggered by events. So far we have implicitly discussed the event "open artifact for reading" and the action "update artifact". The possible events that can trigger actions are:
  • opening a physical or artifact file for reading (before-read),
  • closing a physical or artifact file after write (after-write),
  • deleting a physical or artifact file or directory (delete-file),
  • creating a physical file or directory (create-file), and
  • moving a physical or artifact file (move-file).
Since arbitrary commands can be executed as the result of an event, you can do whatever your heart desires when a file is moved, deleted, created, etc. However, actions that interact with Mim would be hard to implement yourself and are thus built into Mim:
  • update an artifact (without reading the artifact file),
  • log to the Mim log (e.g., an error message),
  • reject event (e.g., reject opening a file).
The reject event action is useful, e.g., if you need to verify the consistency of a saved file: if the file is inconsistent it cannot be saved.
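
The event/action idea can be sketched in a few lines of Python. Everything here (the Dispatcher class, the Reject exception, the ".bad" consistency check) is made up for illustration and only mimics the behaviour described above; the event names are the ones from the list.

```python
class Reject(Exception):
    """Raised by an action to reject the event that triggered it."""


class Dispatcher:
    def __init__(self):
        self.handlers = {}  # (event, path) -> list of actions
        self.log = []       # stand-in for the Mim log

    def on(self, event, path, action):
        """Register an action to run when event happens to path."""
        self.handlers.setdefault((event, path), []).append(action)

    def fire(self, event, path):
        """Run all actions for the event; return False if one rejects it."""
        for action in self.handlers.get((event, path), []):
            try:
                action(path)
            except Reject as err:
                self.log.append(f"{event} {path}: rejected ({err})")
                return False
        return True


def reject_if_inconsistent(path):
    # Stand-in check: pretend every file ending in ".bad" is inconsistent.
    if path.endswith(".bad"):
        raise Reject("file is inconsistent")


dispatcher = Dispatcher()
dispatcher.on("after-write", "save.bad", reject_if_inconsistent)
dispatcher.on("after-write", "save.ok", reject_if_inconsistent)
```

Firing after-write for save.bad returns False and leaves a line in the log, while save.ok goes through. Events with no registered action are simply allowed.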

Playing nice
So far so good. But what about integrating with other non-miming tools like, say, make? Since all Mim does to build an artifact file is to execute a command line in the shell, Mim can easily be used as a front-end to make. Here is an example of doing mim ls in a directory of a project that does just that:
main.cc
hello     make hello
Makefile
Mimfile
Furthermore, any program can use Mim for building files, because all the program needs to do to build a file is to (try to) open it for reading. Mim will then kick in and build the file, without the other program ever noticing anything special. This is very powerful, because every program you have on your computer (cat, emacs, Mozilla, etc.) can build any file using Mim.

Saving the environment
Do you have ./configure && make && sudo make install aliased to nike on your system? If you do, you're not using Mim.

One reason to use autoconf (i.e., the ./configure script above) is to find the paths to all the necessary tools, libraries, headers, etc, needed to build your project. With autoconf this involves generating Makefiles that are specifically targeted at your system. The problem with autoconf and the generated Makefiles is that you have to be a genius to understand what the heck is going on.

So what does Mim do instead? Easy: you have an artifact file for each tool, library, header, etc., that is needed by your build. Those artifacts are configured (e.g., using autoconf) to link to a specific file on your system. These tools are normally placed in a directory called ext (for "external"); thus, to list the dependencies on external tools simply do find ext or ls -R ext. Every file listed is an external dependency. Example:

$ cd my-project
$ ls -R ext/*
ext/bin:
g++  ld  python

ext/include:
3rdparty.h

ext/lib:
3rdparty.so

In this example we have configured the ext directory to contain a compiler, a linker, and a Python interpreter. So, when we need to compile a .cc file we have the following command line in the Mimfile: ext/bin/g++ file.cc. The first time this command line is executed, the artifacts (ext/bin/g++, etc.) are not yet configured, so Mim launches the configuration tool (autoconf). After that, the artifacts in the ext directory point to the correct files on the local system and can be executed easily.

One way of thinking of this is that Mim creates a virtual environment (directory structure) that is ideal(ized) for your specific project. All the tools and libraries you need are right there in the ext directory.
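
To give a feel for what "configuring" the ext directory could amount to, here is a small Python sketch that creates ext/bin/<tool> as symbolic links to whatever is found on the local system. It uses a plain PATH lookup instead of autoconf, and the function name configure_ext and its resolve parameter are assumptions for this example only.

```python
import os
import shutil


def configure_ext(ext_dir, tools, resolve=shutil.which):
    """Create ext_dir/bin/<tool> symlinks to the tools found on this system.

    resolve maps a tool name to a path or None; it defaults to a PATH
    lookup. Returns the names of the tools that could not be found.
    """
    bin_dir = os.path.join(ext_dir, "bin")
    os.makedirs(bin_dir, exist_ok=True)
    missing = []
    for tool in tools:
        target = resolve(tool)
        if target is None:
            missing.append(tool)
            continue
        link = os.path.join(bin_dir, tool)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from an earlier run
        os.symlink(target, link)
    return missing
```

For the example above you would call something like configure_ext("ext", ["g++", "ld", "python"]); any name in the returned list is an external dependency that isn't satisfied on this machine.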


Deep understanding
Mim really, really understands Mimfiles. If a Mimfile is updated, artifacts potentially need to be rebuilt. However, Mim will only rebuild artifacts whose command lines have changed or whose dependencies' command lines have changed. Have you ever experienced that your entire project needs to be rebuilt just because you added a comment to a makefile? That will never happen with Mim.
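
The rule in the paragraph above can be sketched as a comparison of command lines followed by propagation to dependents. This is a Python illustration under my own assumptions (rules as dictionaries with comments already stripped, a deps table of direct inputs, the name artifacts_to_rebuild), not Mim's real data structures.

```python
def artifacts_to_rebuild(old_rules, new_rules, deps):
    """Return artifacts whose command line changed, plus their dependents.

    old_rules and new_rules map artifact -> command line; deps maps
    artifact -> set of artifacts it is built from.
    """
    changed = {a for a, cmd in new_rules.items() if old_rules.get(a) != cmd}
    # Propagate to dependents until nothing new is added.
    grew = True
    while grew:
        grew = False
        for artifact, inputs in deps.items():
            if artifact not in changed and inputs & changed:
                changed.add(artifact)
                grew = True
    return changed
```

If only a comment was added, the old and new rules compare equal and the result is empty, so nothing is rebuilt. Changing the command for hello.cc, on the other hand, also pulls in hello-en because it is built from hello.cc.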

Mim even has support for refactoring Mimfiles via your filesystem! In the hidden directory .mimfile the artifact files' Mimfile data is represented as files. So, if you want to change the name of an artifact file simply do mv .mimfile/old-name .mimfile/new-name; to create a new artifact file that is identical to an existing one do cp .mimfile/original .mimfile/copy; to delete an entry from the Mimfile do rm .mimfile/remove-me. Finally, to edit the command line for an artifact you simply edit the corresponding file under .mimfile, e.g., echo 'gcc input-file.c -o %output' > .mimfile/file-to-be-edited.

As mentioned earlier, Mim also has variables that can be used to configure how your project is built. These variables are also represented as files, under the directory .mimvars. Here you can easily browse the variables using ls, find, grep, etc., or even your graphical file browser. To get the current value of a variable, simply do cat .mimvars/the-variable, and to set the value do something like echo 'new value' > .mimvars/the-variable.


When things go wrong
OK, so Mim is nice; I think I've communicated that by now. But sometimes the world is very, very unnice. Sometimes the world doesn't want to compile your code. Then you need to either 1) fix the world, or 2) fix your code. In my experience the latter is slightly easier than the former, although your mileage may vary.

To fix your code you need to know what the compiler complains about, but how do you do this when the files are compiled behind the scenes? Simply do mim log errors to print the error messages to the console, fix the error, and run the program again.

Doing mim log errors after every build error gets annoying very fast. Thus there is a variant, mim log errors -f, that is useful when you wish to get continuous feedback. Given the -f flag Mim will run in the background and print the error messages to the console as soon as they appear. So, assuming hello.cc contains an error, trying to execute hello will print the following:
$ mim log errors -f
$ ./hello
hello/hello.cc: In function ‘int main()’:
hello/hello.cc:3: error: ‘printf’ was not declared in this scope
which is just what gcc outputs; thus, any tool that parses the output from gcc will understand the error output from Mim as well. This of course also applies to other tools like make.

Summing up
I've described the ideas behind Mim, its current state, and possible future features (that, by the way, should be very reasonable to implement :)). You can find Mim here. You'll need Intel Threading Building Blocks and Lua 5.1. I've developed it on Ubuntu and also compiled it on Suse 11.2. The instructions for how to build Mim itself are currently horribly missing, so please contact me if you'd like to try Mim out.