Wednesday, January 12, 2011

When all magic goes wrong: std::vector of incomplete type

I have recently been working on an API. I put great effort into separating the implementation from the interface, which in this case means that the header file of the API contains strictly declarations, with no executable code at all. This makes it easier to hide implementation details, which is something we should always aim for, especially for APIs.

In C++ there are several ways to hide implementation. One way is to forward declare types and simply use pointers and references to those types in header files. However, when you need to use a type by value, it is not possible to use a forward-declared type. For example:
class CompleteType { };
class IncompleteType;
class HoldsTheAboveTypes {
  CompleteType value0;     // Ok.
  IncompleteType* pointer; // Ok.
  IncompleteType value1;   // Compilation error!
};
In my experience, there are usually ways to avoid having types by value that are implementation details. Usually it's a matter of thinking hard about the lifetime or ownership of an object. However, when I implemented the API mentioned above I ran into a problem that seemed to be unsolvable.

I needed to have a class with a field of type std::vector of an incomplete type, that is:
class StdVectorOfIncompleteType {
  std::vector<IncompleteType> value;
};
This code fails to compile, though, giving some error message about "invalid use of incomplete type" (just as the code above). However, IncompleteType isn't used anywhere! So it should compile, shouldn't it?

(Well, I guess you could argue that it should compile if C++ were designed properly, but it isn't, so let's not go into that...)

The reason the above code doesn't compile is that the following special methods are automagically generated by the compiler:
  • zero-argument constructor
  • copy constructor
  • destructor
  • assignment operator
The default implementations of these special methods are nice to have in most cases. However, in the example with std::vector<IncompleteType> above, these default implementations don't work at all, because they are generated at a point where IncompleteType is still incomplete. It is these default implementations that cause the compilation error, which is very much non-obvious. All (auto-)magic goes wrong.

So to fix the compilation error given above, we simply need to declare these special methods ourselves, and provide an implementation of them in a separate .cc-file, where the definition of IncompleteType is available.
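Here is a minimal sketch of that fix, squeezed into a single file for brevity. The comments mark which part would live in the header and which in the .cc-file. The add and size helpers, the id field, and exampleUsage are hypothetical additions for illustration. Note that the sketch relies on std::vector accepting an incomplete element type where the field is declared, which the standard only guarantees from C++17 on (common compilers have accepted it in practice for longer).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- Would live in the header: declarations only. ----
class IncompleteType;  // forward declaration, no definition here

class StdVectorOfIncompleteType {
public:
  // Declared here, defined where IncompleteType is complete.
  StdVectorOfIncompleteType();
  StdVectorOfIncompleteType(const StdVectorOfIncompleteType& other);
  StdVectorOfIncompleteType& operator=(const StdVectorOfIncompleteType& other);
  ~StdVectorOfIncompleteType();

  void add(int id);          // hypothetical helper
  std::size_t size() const;  // hypothetical helper

private:
  std::vector<IncompleteType> value;
};

// ---- Would live in the .cc-file. ----
class IncompleteType {
public:
  explicit IncompleteType(int idIn) : id(idIn) {}
  int id;
};

StdVectorOfIncompleteType::StdVectorOfIncompleteType() {}

StdVectorOfIncompleteType::StdVectorOfIncompleteType(
    const StdVectorOfIncompleteType& other)
    : value(other.value) {}

StdVectorOfIncompleteType& StdVectorOfIncompleteType::operator=(
    const StdVectorOfIncompleteType& other) {
  value = other.value;
  return *this;
}

StdVectorOfIncompleteType::~StdVectorOfIncompleteType() {}

void StdVectorOfIncompleteType::add(int id) {
  value.push_back(IncompleteType(id));
}

std::size_t StdVectorOfIncompleteType::size() const { return value.size(); }

// A caller only needs the header part to do this.
void exampleUsage() {
  StdVectorOfIncompleteType original;
  original.add(1);
  original.add(2);
  StdVectorOfIncompleteType copy(original);
  assert(copy.size() == 2);
}
```

The bodies are the same as what the compiler would have generated; the only thing that changes is where they are generated.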

I've been fiddling with programming for more than 15 years (professionally much shorter, though) and I've run into this problem several times before but never tried to understand the cause of it. Today I did.

Sunday, January 9, 2011

Design patterns are to software development what McDonald's is to cooking

I remember reading the GoF design patterns book and thinking gosh, this is really good stuff. Now I can write programs like a real master! I liked the whole idea so much that I went on to read the xUnit Patterns book, and a few more like Refactoring to Patterns.

Looking back on these books now and what I learned from them, I realize that it's not the patterns described in the books that I value the most. It's the reason for their existence: the motivation for using them. For example, the Factory pattern exists because it's often desirable to separate object construction from domain logic. Why? Because it reduces coupling, which means code is easier to enhance, reuse, extend, and test. So when you understand why a pattern exists, you know when to use it and when not to use it.

The problem is that you don't need to understand why a design pattern is needed in order to use a design pattern in your code. Code with a misused design pattern is worse than code without that pattern. As an example, here is some code taken basically directly from an application I worked with:

Thing t = new ThingFactory().create(arg);
with ThingFactory defined as

class ThingFactory {
  Thing create(int arg) { return new Thing(arg); }
}
This is a prime example of code that misuses a design pattern. The factory separates nothing: the caller is still coupled to the concrete factory, which in turn always creates the same concrete Thing. Clearly, (s)he who wrote this code did not understand why and when a Factory should be used; (s)he simply used a Factory without thinking. Probably because (s)he had just read some fancy-named design-pattern book.
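For contrast, here is a rough sketch of a factory that does reduce coupling. It is written in C++ (like the other examples on this blog) rather than in the language of the snippet above, and every name in it except Thing and ThingFactory is made up for illustration. The point is only that the domain logic in report never names a concrete Thing, so a test can hand it a different factory.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Domain logic only ever sees this interface.
class Thing {
public:
  virtual ~Thing() {}
  virtual std::string describe() const = 0;
};

// Two hypothetical concrete Things.
class ProductionThing : public Thing {
public:
  explicit ProductionThing(int arg) : arg_(arg) {}
  std::string describe() const { return "production thing " + std::to_string(arg_); }
private:
  int arg_;
};

class TestThing : public Thing {
public:
  explicit TestThing(int arg) : arg_(arg) {}
  std::string describe() const { return "test thing " + std::to_string(arg_); }
private:
  int arg_;
};

// The factory is an interface too, so construction can be swapped out.
class ThingFactory {
public:
  virtual ~ThingFactory() {}
  virtual std::unique_ptr<Thing> create(int arg) const = 0;
};

class ProductionThingFactory : public ThingFactory {
public:
  std::unique_ptr<Thing> create(int arg) const {
    return std::unique_ptr<Thing>(new ProductionThing(arg));
  }
};

class TestThingFactory : public ThingFactory {
public:
  std::unique_ptr<Thing> create(int arg) const {
    return std::unique_ptr<Thing>(new TestThing(arg));
  }
};

// Domain logic: knows nothing about which concrete Thing it gets.
std::string report(const ThingFactory& factory, int arg) {
  std::unique_ptr<Thing> thing = factory.create(arg);
  return "report for " + thing->describe();
}

void exampleUsage() {
  assert(report(ProductionThingFactory(), 7) == "report for production thing 7");
  assert(report(TestThingFactory(), 7) == "report for test thing 7");
}
```

If there were only ever one concrete Thing and one caller, all of this would be overhead, which is exactly the situation in the snippet above.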

This is one big problem I see with design patterns. They make it easy to write code that looks good and professional, when in fact it's horribly bad and convoluted. The Design Patterns book is the software equivalent of McDonald's Big Book of Burgers: you need to be a good cook/developer already in order to learn anything that will actually make your burger/software skills better. A less-than-good cook/developer will only learn how to make burgers/software that look good on the surface.

I recently read Object-Oriented Design Heuristics by Arthur J. Riel, and I must say that this book is much better than the Design Patterns book. First of all, it's more than just a dictionary of patterns; it's actually a proper book you can read (without being bored to death). Second, the design rules (what the author calls "heuristics") are much deeper and more applicable than design patterns. These rules are like Maxwell's equations for good software design. Understand and apply them, and your software will be well designed.

Let me illustrate with an example how I think Riel is different from GoF. Where GoF says "this is a hammer, use it on nails", Riel says "to attach two wooden objects you can use either hammer+nails or screwdriver+screws; both have pros and cons." Sure, GoF is easier to read and you'll learn some fancy words you can say if you're running out of buzzwords, but Riel actually makes you understand. Understanding is underrated.


But let's cut the GoF book et al. some slack. To be honest, I actually did learn something useful and important from those books, and I couldn't do my daily programming work properly without that knowledge:
  • favor composition over inheritance,
  • separating construction logic from domain logic, and
  • code towards an interface, not an implementation.
To me, these three ideas are the essence of most design patterns that are worth knowing, and if you keep those three ideas in your head all the time you won't screw up too much.

Oh, there is actually one more idea you should keep in your head or you will definitely screw up big time (literately).
But of course there is one more important thing about knowing design patterns. It helps you communicate with your colleagues. Design patterns define a pattern language, so instead of saying "...you know that class that instantiates a bunch of classes and connects them..." you can say "...you know that builder class...". Honestly, for me this language is way more important than the patterns themselves.

Actually, there is one thing that I learned from all those design-patterns books (not including Riel). They taught me something important that I could only have learned from few other places: I learned that if an author tries hard enough, (s)he can write a 500-page book consisting of some common sense and two or three good ideas repeated over and over again. The xUnit Patterns book is the prime example of this. Don't read it. Read Riel.

Thursday, January 6, 2011

Global contracts: mutexes, new/delete, singletons

I had lunch with a former colleague some time ago. We discussed several things, and I would like to write about one idea in particular: global contracts.

A global contract is an agreement that every part of your code needs to adhere to. A good example of a global contract is locks. Locks are used in multi-threaded applications to synchronize threads. The problem is that there is no way to enforce that threads take the locks when they should, or release them when they should. Application code is free to access memory/objects with or without locks. Correct application code, though, has to take the lock, access the memory/object, and then release the lock.
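As a small illustration of why this contract is global, consider the sketch below. SharedCounter and both increment functions are hypothetical names. Both functions compile without complaint; only one of them honors the contract, and nothing in the language tells you which.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical shared state: the mutex and the data it is meant to protect.
struct SharedCounter {
  std::mutex lock;
  int value = 0;
};

// Honors the contract: take the lock, access the data, release the lock
// (std::lock_guard releases it when guard goes out of scope).
void incrementCorrectly(SharedCounter& counter) {
  std::lock_guard<std::mutex> guard(counter.lock);
  ++counter.value;
}

// Breaks the contract, yet compiles just as well. Called from several
// threads at once, this is a data race.
void incrementCarelessly(SharedCounter& counter) {
  ++counter.value;
}

void exampleUsage() {
  SharedCounter counter;
  incrementCorrectly(counter);
  // Harmless here only because this example uses a single thread.
  incrementCarelessly(counter);
  assert(counter.value == 2);
}
```

Every function that touches value has to know about lock, which is what makes the contract global.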

A similar but more common global contract is Singletons. Every part of the application must know how to get the singleton instance, and if the singleton has mutable state every part of the application must also know how to mutate that state correctly.

An even more common global contract is memory management in applications without garbage collection. In such applications, the knowledge of how to allocate and deallocate instances of objects is distributed. There are ways to design your application such that the global contract is encapsulated, but in general the detailed knowledge of the contract is spread out all over the application.

Ok, global contracts are bad for the application design, but how do we get rid of them? We can't simply ignore locks or memory management, so how do we handle them? I see the following possibilities:
  1. Encapsulate the details of the global contract, or
  2. Make the global contract into a local contract.

The first alternative means that you accept that the contract is global, but you make sure that it's simple to adhere to it. An example of this is to use smart pointers, or to hide that there is a singleton within a single class, which also holds state, e.g.,
class SingletonHider {
  // Assumes Singleton::instance() returns a reference to the one instance.
  Singleton& singleton = Singleton::instance();
  int intState;
public:
  void changeState(int i) { intState = i; }
  void doit() { singleton.doit(intState); }
};
And every time some part of the application needs to use the Singleton class it instantiates a SingletonHider, which holds all the state needed. The precise behavior (or any state that absolutely needs to be global) is handled by the Singleton class.
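The smart pointer variant of the first alternative can be sketched in the same spirit. Widget, its alive counter, and makeWidget are hypothetical and exist only to make the deallocation visible. The new/delete contract is still there, but callers no longer have to remember the delete half of it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts how many instances currently exist.
struct Widget {
  static int alive;
  Widget() { ++alive; }
  ~Widget() { --alive; }
};
int Widget::alive = 0;

// The only place that spells out new; the matching delete is hidden
// inside std::unique_ptr.
std::unique_ptr<Widget> makeWidget() {
  return std::unique_ptr<Widget>(new Widget());
}

void exampleUsage() {
  assert(Widget::alive == 0);
  {
    std::unique_ptr<Widget> widget = makeWidget();
    assert(Widget::alive == 1);
  }  // widget goes out of scope here and the Widget is deleted
  assert(Widget::alive == 0);
}
```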

The second alternative is to make the global contract into a local contract. How is this possible? Well, whether a contract is global or local is entirely a question of perspective. If you look at memory management from an operating system's perspective, the details of new and delete are definitely a local contract of that application; it's not an operating-system-global contract. So the idea for making a global contract local is to implement a class that acts like an overseer of the rest of the code. This class encapsulates all the details of the contract and provides a simple interface. This is exactly what a garbage collector is. Let's take Java's garbage collector as an example:
  • it oversees: inspects the running application and determines when a piece of memory is garbage
  • simple interface: every piece of memory is allocated by new. The details of where the memory is allocated (constant pool, generation 1, stack, etc.) are encapsulated.
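A very rough C++ sketch of the overseer idea follows. It is far simpler than a garbage collector: it never works out what is garbage, it just owns everything it hands out and frees it all when the overseer itself is destroyed. Node, NodeOverseer, and liveCount are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical object type managed by the overseer.
struct Node {
  explicit Node(int v) : value(v) {}
  int value;
};

// Owns every Node it creates. Callers get a plain pointer and never
// call delete; the whole new/delete contract lives in this one class.
class NodeOverseer {
public:
  Node* create(int value) {
    owned_.push_back(std::unique_ptr<Node>(new Node(value)));
    return owned_.back().get();
  }

  std::size_t liveCount() const { return owned_.size(); }

private:
  // All Nodes are released when the overseer is destroyed.
  std::vector<std::unique_ptr<Node>> owned_;
};

void exampleUsage() {
  NodeOverseer overseer;
  Node* first = overseer.create(1);
  Node* second = overseer.create(2);
  assert(first->value == 1);
  assert(second->value == 2);
  assert(overseer.liveCount() == 2);
}
```

The rest of the code has a simple interface (create) and the contract becomes local to NodeOverseer, at the price that pointers must not outlive the overseer.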
I'm sure there are more examples and more ways to make a global contract local, but the approaches outlined here should give you a few ideas the next time you wish to refactor away a singleton, or fix new-delete coupling.