Thursday, October 20, 2011

std::fstream -- please leave!

During the last half year, there have been several bugs related to C++ file streams, e.g., std::ifstream. Most bugs have been resolved by adding calls to ios::clear to clear the error state of the stream, or similar fixes. Other bugs were fixed by using C FILE instead.
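To illustrate the kind of fix I mean, here is a minimal sketch. It is not the code from any of those bugs: the function name count_lines_twice and the reset_state flag are made up for this example. It reads a file to the end, rewinds, and reads it again. Reading to the end leaves eofbit and failbit set on the stream, and as far as I understand the standard, seekg does nothing while failbit is set, so the second pass only sees data if clear() is called first.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper for illustration: counts the lines in `path` twice
// with the same std::ifstream and returns the second count.
// Returns -1 if the file cannot be opened.
// `reset_state` controls whether clear() is called before rewinding.
int count_lines_twice(const std::string& path, bool reset_state) {
    std::ifstream in(path.c_str());
    if (!in) {
        return -1;
    }

    std::string line;
    int first = 0;
    while (std::getline(in, line)) {
        ++first;
    }
    // The loop ended because getline failed at end of file,
    // so eofbit and failbit are now set on the stream.

    if (reset_state) {
        in.clear();  // reset the error state so the stream is usable again
    }
    in.seekg(0);     // rewind; has no effect while failbit is still set

    int second = 0;
    while (std::getline(in, line)) {
        ++second;
    }
    return second;
}
```

With reset_state set to true, the second pass should count the same number of lines as the first. With it set to false, every getline in the second pass fails immediately and the function returns 0, without any exception or other warning.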

Why, why, why is file I/O so hard to implement in a reliable way in C++? I haven't done much work with files in C, but what I have done works well. Same thing with Python. Java's file I/O is just a joke (why do I need three classes to read a file?), but at least it is more reliable than C++.

I will think twice before I use std::fstream again. Your code might be fine on one implementation of the C++ standard library, but fail on another. Sad.

3 comments:

Frank Jacobs said...

Thanks for the post.

Java's file I/O is just a joke (why do I need three classes to read a file?)

I can see what you mean. In terms of lines of code, the extra levels result in extra lines of code when it seems like it should be a one-liner.

But, the Java I/O API does come in handy. Having File distinct from Reader/InputStream makes it easy to design code that isn't dependent on files. Say I write a class that does some I/O, I can have its input provided via a Reader or InputStream (instead of a File).
This leads to a good unit test as the test can provide something in memory like ByteArrayInputStream for input. Sure, if the input was a File, the test could give it a File. But, then you end up with a unit test that isn't as fast to execute due to file I/O. And, the unit test starts to look more like an integration test.

So, I guess maybe it's the good with the bad here: flexibility traded for a few more lines of code.

Torgny said...

You are right, of course. InputStreams and Readers do make it possible to write unit-tests without having to resort to direct file access.

However, I fail to see the reason why it needs to be more complicated than the Python equivalent (open in production code and StringIO in tests).

Sure, sometimes you need very fine-grained control over how data is written (e.g., what kind of buffering). And in those cases Java's solution is perfect. But that's the exception -- not the common case (in my experience). In most cases you just want to get the data...

I guess what I'm saying is that "flexibility traded for a few more lines of code" must be complemented with a simple way of doing the simple things.

Thank you for your comment!

Frank Jacobs said...

Good points.

...the python equivalent (open in production code and StringIO in tests)

Yeah, that seems like a nice, simple approach. I like that.

Thanks for the response!