Saturday, April 14, 2012

Java initialization chaos

I've recently started using Java again after going from C++ to Java to C++ and now Java again. What bums me out the most with Java is how its pretty surface hides a lot of chaos. Don't get me wrong, it's good that you don't have to see the chaos (too often), but it's terrible that the chaos is there at all. Java's static initialization is a particularly "interesting" topic.

What are a and b in the following code:
class A { public static int a = B.b; }
class B { public static int b = A.a; }
I know it's a contrived example, but non-obvious variants of this code pop up every now and then and wreak havoc on the dependency graph.

So, what happens to a and b in the code above? How does the JVM resolve the cyclic dependency? It does it by initializing a and b to zero. In fact, all static fields are first set to zero (or null if the field holds a reference type) when the class is prepared, before any initializer code runs. Later, when the class is first used, its class initializer is called, and that is where the fields are assigned the values written in the source code. If A's initializer triggers B's initializer, and B's initializer reads A.a, then A is still in the middle of its own initialization, so B simply sees the zero. The same two-step scheme applies even without a cycle, e.g., for the 10 in the code below:
class C { public static int c = 10; }
In other words, the code for C is equivalent to:
class C {
  public static int c = 0;
  static { c = 10; }
}
You can easily see this by running javap -c on C's .class file.
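To make the order dependence concrete, here is a small sketch of my own (the class name InitOrderDemo and the "+ 1" in each initializer are just there so the two values can be told apart; they are not part of the example above). Whichever class is touched first ends up with the larger value, because the other one reads the field while it is still zero:

```java
public class InitOrderDemo {
    // Each class reads the other's field in its static initializer.
    static class A { static int a = B.b + 1; }
    static class B { static int b = A.a + 1; }

    public static void main(String[] args) {
        // Touching A first starts A's initializer, which triggers B's.
        // B reads A.a while it still holds the default 0, so B.b becomes 1.
        // Control then returns to A, which computes 1 + 1.
        System.out.println("A.a = " + A.a); // prints "A.a = 2"
        System.out.println("B.b = " + B.b); // prints "B.b = 1"
    }
}
```

Swap the two println lines and the values swap too, which is exactly the kind of thing that makes these bugs hard to spot.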

Now, what I've been asking myself is why it's done like this. Why are the fields initialized to zero (or null) and later assigned their proper value (10 in the case of C.c)? It would be extremely helpful to have the fields tagged as uninitialized before a proper value is assigned to them. That would not require tonnes of extra logic in the JVM, and would catch circular dependencies upon startup of the application.
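Until the JVM does something like that, the tagging can be approximated by hand. The following is only a sketch of the idea, not anything the JVM or the standard library provides: Checked is a hypothetical helper I made up that remembers whether it has been assigned and throws if it is read too early. With it, the cycle from the beginning of the post fails loudly on first use instead of silently producing zeros:

```java
public class UninitializedTagDemo {
    // Hypothetical helper: a holder that knows whether it has been assigned.
    static final class Checked<T> {
        private final String name;
        private T value;
        private boolean assigned;

        Checked(String name) { this.name = name; }

        void set(T newValue) { value = newValue; assigned = true; }

        T get() {
            if (!assigned) {
                throw new IllegalStateException(name + " read before initialization");
            }
            return value;
        }
    }

    static class A {
        static final Checked<Integer> a = new Checked<>("A.a");
        static { a.set(B.b.get()); }
    }

    static class B {
        static final Checked<Integer> b = new Checked<>("B.b");
        static { b.set(A.a.get()); } // A.a exists but is not assigned yet
    }

    // Touches A and reports what happened.
    static String probe() {
        try {
            return "A.a = " + A.a.get();
        } catch (ExceptionInInitializerError e) {
            // The JVM wraps the exception thrown inside B's static initializer.
            return e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe()); // prints "A.a read before initialization"
    }
}
```

The price is an extra object and a check on every read, and once the initializer has failed the classes stay unusable for the rest of the run, so this is more of a debugging aid than something to sprinkle everywhere.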

The JVM is nice and all, but it's not perfect. But then again, no one is arguing it is, right?
