Saturday, April 12, 2008

Making deactivated logging 100 times faster

I think the java.util.logging is a nice logging framework: it's easy to do simple things yet it is not limited. You can easily tweak it using custom filters, formatters, and handlers. One thing I do not like with it, however, is its performance.
 
The problem with logging
I have no problem with the performance of java.lang.logging when the logging is activated. It's the performance when logging is deactivated that is an issue for me. The problem, as I see it, is that when
  logger.info("someting: " + something.toString());
is executed, the argument to info is created (by concatinating two string, which is computationally heavy) despite logging being deactivated. This means that a string will be created and then directly thrown away without being used. To make things even worse, there is even a greater performance penalty if the toString method of something is computionally heavy.

This is not problem with java.util.logging per se, but rather a problem with the Java language. Don't get me wrong -- I like Java -- but in certain areas Java is simply too limited/limiting. I see at least three way of solving the problem described above:
  1. introducing some kind of macros to the language,
  2. using aspect-oriented programming, or
  3. performing string concatination lazily.
Personly, I think that the common variant of macros (the C/C++-kind) it a Bad Thing. On the other hand, the other variant of macros (the Lisp-kind) does not fit nicely in the Java languange because those kinds of macros operate on the AST (this is perfectly ok in Lisp because Lisp does not have any syntax -- you're actually creating the AST when you write the program).

The second solution to the problem is aspect-oriented programming. To be honest, I don't know enough about that to be able to discuss it here. With the limited knowledge I do have, however, I think that it should be possible to instrument the piece of code above such that you get the following sematic:
  if (logger.logsAtLevel(Level.INFO) {
    logger.info("someting: " + something.toString());
  }

The third solution -- performing string concatination lazily -- is the solution I will discuss for the rest of this post. I'm assuming that the methods used to create the log message, e.g., the toString method, are are pure functions, i.e., has no side-effect. This is a perfectly legitimate assumption because deactivated logging should have no side effect as it is.

Lazy string concatination
Ok, so how can we make string concatination in Java lazy? In C++ we could have overloaded the operator +, but this is not possible in Java. One hacker-ish solution would be to implement new String and StringBuilder classes (which the compiler uses to implement string concatination) which performs concatination lazily, but this not trivial... (I have actually tried... (and failed)). Instead, we can implement a thin wrapper around java.util.logging.Logger with the following methods:
  MyLogger log(Object msg);
  MyLogger log(Object msg1, Object msg2);
  MyLogger log(Object msg1, Object msg2, Object msg3);
  // ... and so on.
  void info(Object msg);
  // ... and all the the other levels.
which is used like this:
  myLogger.log("Received message: ", msg, " from ").info(msgProvider);
which is the equivalent of
  logger.info("Received message: " + msg + " from " + msgProvider);
when using a java.util.logging.Logger. The log methods is simply implemented by storing the references to the objects given as arguments. The info method is implemented by calling toString on its argument and the argument given to log if the logging is actived, otherwise it does nothing.

I (kind of) have implemented such class; the difference is that instead of wrapping a java.util.logging.Logger my class uses a java.util.logging.Handler directly. The interface of this class, which I named Ln4j (pun definitely intended), is the same as MyLogger above, however.

Performance measuments
So, what kinds of performance numbers can we expect? <disclamer>I'm definitely no expert in measuring performance, but I have tried my best to create fair benchmarks.</disclamer> These are the benchmarks:
  • logging single constant string,
  • concatinating two constant string and log the result,
  • concatinating a constant string and a variable string and log the result,
  • concatinating six short (4 characters) variable strings,
  • concatinating six long (40 characters) variable strings,
  • concatinating a constant string and an int and log the result,
  • concatinating a constant string and a List<double> (of length 8) and log the result.

I ran these benchmars with and without the -server switch to the JVM and with logging activated and with logging deactivated. This is the result.

In summary: with logging activated ln4j performs a bit faster than java.util.logging. However, since ln4j is quite simple (e.g., it has no log levels) this small performance advantage would probably disapper if ln4j implemented all functionality provided by java.util.logging.Logger.
When running the benchmarks with logging deactivated, there is usually considerable performance gains (no, the post title is no exaggeration). Of course, exact numbers depend on what is logged. When logging a single constant ln4j is actually somewhat slower. However, in the benchmark that logs a list, ln4j is 600-700 times faster than java.util.logging. That optimzation for ya!

I hope this post was informative and that you have learned something from reading it. I learned a lot when experimenting with lazy string concatination; let's hope it will native in Java 8. :)

Oh, I almost forgot, here and here are the source used in the benchmarks.

7 comments:

Janick said...

You could also just use the public void log(Level level,
String msg,
Object[] params) method on the Logger class.

Torgny said...

You are absolutly correct. That would speed up things alot. I actually never considered that, to be honest.

However, that would require you to allocate a new array every time you need to log. This adds to the noise already present in most code that logs. (By 'noise' I mean stuff that is not relevant for the core function of the code, e.g., logging).

Running my benchmarks using the method you proposed, ln4j is still 1 to 50 times faster. Still quite good, I think, but not close to the the previous results.

Thanks for point this out to me!

Árni Hermann Reynisson said...

Nice. But I think you need to add a little bit of locking or synchronization and rerun your benchmark for obvious thread safety reasons, although it might still be faster, just not -as fast- after you make it thread safe (my bet anyway).

Torgny said...

Árni:
Yes, it definitely needs synchonization to be a proper replacement for java.util.logging.Logger. I'll re-run the benchmarks sometime later when I've made a proper thread-safe implementation.

Of cource, the code is also available for you to do it. :)

(I'm not a big fan of the shared-memory threading model... That why I'm usually so ignorant to the whole threading thing.)

noe said...

Perhaps ropes (http://ahmadsoft.org/ropes/) could be another alternative?

Unknown said...

How about http://slf4j.org/ ? It does exactly this. And can then defer to your logging api of choice.

Torgny said...

noe: I've heard about Ropes before, but never actually looked at it. Seems interessting -- I have to check it out.

Archiveur: I've only locked at LSF4J very quickly, but it looks like it provides mothods for doing what I proposed in my post. Nice.