Friday, October 17, 2008

Don't Repeat Yourself - what does 'repeat' really mean?

It is not uncommon to have some kind of script (e.g., a .sh-script) to start an application. For example, a start script for a Java application could check that the correct version of the JRE is installed, set up the classpath, and then start the application by executing java -cp [classpath] [mainclass].


In a case like this, the start script contains some information that is already embedded in the source code, e.g., the name of the class containing the static void main(String[]) method. Is this a violation of the DRY principle? I certainly think so.


However, you could argue that source code is filled with this kind of violation (referring to something by its textual name), since classes/types are referred to by name everywhere, for example when instantiating a new object in most OO languages. I don't consider this to be a violation of DRY, though.


Why? Because with modern IDEs classes can be renamed/moved and all references to the class will be updated. Thus, effectively, there is no repetition (since you don't manually handle it). So, no violation of the DRY principle.


However, if the application uses reflection, or something similar, then the IDE can't safely handle it. Consequently, you have to handle these repetitions manually, without IDE support. In other words, the DRY principle is violated.
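To make the difference concrete, here is a minimal Java sketch. The names ReflectionRename and ReportPrinter are made up for illustration, and the string literal in main assumes the code is compiled in the default package. The direct reference is checked by the compiler and followed by an IDE rename; the reflective one is just a string that has to be kept in sync by hand.

```java
public class ReflectionRename {
    // Hypothetical class standing in for some application code.
    public static class ReportPrinter {
        public String print() {
            return "report";
        }
    }

    // Compiler-checked reference: an IDE rename of ReportPrinter updates this line.
    public static String direct() {
        return new ReportPrinter().print();
    }

    // String-based reference: the class name is plain text, so a rename
    // leaves it stale and the failure only shows up at run time.
    public static String reflective(String className) throws ReflectiveOperationException {
        Class<?> type = Class.forName(className);
        Object instance = type.getDeclaredConstructor().newInstance();
        return (String) type.getMethod("print").invoke(instance);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(direct());
        // Binary name of the nested class, assuming the default package.
        System.out.println(reflective("ReflectionRename$ReportPrinter"));
    }
}
```

If ReportPrinter were renamed, direct() would be updated (or fail to compile), while the string passed to reflective() would keep compiling and then throw ClassNotFoundException when run.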


The impact of these violations can be minimized by having a good test suite. This way, if you fail to update the code correctly, the tests will tell you so. Reflection-heavy code is no different from any other code in this sense.


Ok, so let's get back to the original example: the start script and its reference to the main class of the application. This is a violation of the DRY principle since the IDE does not update the script's references to classes. But not only that: in most cases the script does not have any test cases. This is very bad, because you'll get no indication that something has gone wrong. (You could argue that you shouldn't rename the main class, but that's beside the point I'm making.)


So, how to fix this? Simple. Either

  1. unit test the start-script (run it, or use some kind of pattern matching), or
  2. generate the start-script by a well-tested generator.
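Both options can be sketched in a few lines of Java. This is only an illustration under assumptions: the class name StartScripts, the script layout (a single java -cp [classpath] [mainclass] line), and the regular expression are all made up for the example. generate() builds the script from a compiler-checked class reference, and mainClassIn() plus hasMainMethod() pattern-match a script and check that the class it names really exists and has a main method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StartScripts {
    // Assumed script layout: a single "java -cp <classpath> <mainclass>" command.
    private static final Pattern JAVA_COMMAND =
            Pattern.compile("\\bjava\\s+-cp\\s+\\S+\\s+([\\w.]+)");

    // Option 2: generate the script from a compiler-checked class reference,
    // so an IDE rename of the main class also changes the generated text.
    public static String generate(Class<?> mainClass) {
        return "#!/bin/sh\n"
                + "exec java -cp \"$CLASSPATH\" " + mainClass.getName() + " \"$@\"\n";
    }

    // Option 1: pattern-match the script text to find the class it launches.
    public static Optional<String> mainClassIn(String scriptText) {
        Matcher m = JAVA_COMMAND.matcher(scriptText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True if the named class can be loaded and has a public static void main(String[]).
    public static boolean hasMainMethod(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> type = Class.forName(className, false, StartScripts.class.getClassLoader());
            Method main = type.getMethod("main", String[].class);
            return Modifier.isStatic(main.getModifiers()) && main.getReturnType() == void.class;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A real test would read the checked-in script file instead of generating it here.
        String script = generate(StartScripts.class);
        String mainClass = mainClassIn(script).orElseThrow(
                () -> new IllegalStateException("no java command found in script"));
        if (!hasMainMethod(mainClass)) {
            throw new IllegalStateException("script refers to a missing main class: " + mainClass);
        }
        System.out.println("start script launches " + mainClass);
    }
}
```

Run as part of the test suite, a check like this fails as soon as the script names a class that has been renamed or moved, which is exactly the indication that was missing.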

It should be possible to test all executable parts of the application, but how about the non-executable parts? How about documentation, e.g., user guides? I don't have a good answer to this besides "generate what can be generated", and that is hard in practice. If you have a good solution, please let me know...
