Saturday, April 17, 2010

Rethinking the Evil of XML Configuration Files

Do you remember when your system's configuration grew from a simple file containing a few name/value pairs modified by a small set of people, into a large collection of files in multiple locations with their own business processes surrounding them?

When the config files were small and informal, you could easily get by with a YAML file or a Windows .INI file. But now that there are hundreds or thousands of settings, maintaining them all has become challenging.

The main problems with YAML and .INI configuration files are:
  • Lack of Type Safety - If the configuration reading code expects 'true' or 'false' and the configuration file contains '1', how will the code handle that input? Will it log an error? Will it silently misinterpret the '1' as 'false'? Do you want to have to write type-checking code throughout your application? You could write a comment in the configuration file that specified the type of the setting but not everyone understands all the various types and formats (e.g. dates and times).
  • Lack of Range Checking - If the configuration code expects a value between 1 and 4 inclusive, and someone has configured the setting as 5, how will your system react? What if your configuration reading code expects 'high' or 'low' and someone enters 'medium'? That's another form of range violation. You could write comments that specify the range but the comments had better agree with the code that does the actual range checking.
  • Lack of Validation Support - If one of your configuration settings is mandatory for the system to operate correctly (e.g. a web service endpoint) and it's missing, you don't discover the error until run-time. You could add a comment to the configuration file that stated that the setting was mandatory, but will people read it?
  • Lack of Appropriate Defaults - If some of your configuration values have appropriate defaults that you want to communicate to the user, you are stuck writing comments in the YAML or .INI file that list the defaults. Unfortunately, you have now just introduced duplication between the code that must take the default when the configuration value is missing and the comment in the configuration file.
XML files and, more specifically, XSD files, provide for all of the above.
  • XSD allows you to specify the type of a given configuration value. If your configuration value has the wrong type, XSD validation of the configuration file will alert you to your mistake.
  • XSD supports range checking for several types, for example minimum and maximum bounds on numbers and dates, and enumerations of allowed values.
  • XSD by its very nature handles validation. If your application does XSD validation upon startup, you can quickly catch configuration errors.
  • XSD allows you to specify default values for configuration settings.
  • With an XSLT transform you could generate an HTML document that would list all your settings along with their defaults.
In short, XSD codifies and enforces all of the constraints that we would otherwise add to our configuration files as comments.
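
To make the list above concrete, here is a minimal sketch of what such a schema might look like. The setting names (config, serviceEndpoint, cachingEnabled, retryCount, priority) are hypothetical and chosen only to mirror the examples earlier in this post; they do not come from any real system.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical root element holding all settings. -->
  <xs:element name="config">
    <xs:complexType>
      <xs:sequence>
        <!-- Mandatory: minOccurs defaults to 1, so a missing element fails validation. -->
        <xs:element name="serviceEndpoint" type="xs:anyURI"/>

        <!-- Type safety: xs:boolean accepts only true, false, 1 and 0. -->
        <xs:element name="cachingEnabled" type="xs:boolean" default="true"/>

        <!-- Range check on a number: 1 to 4 inclusive. -->
        <xs:element name="retryCount" default="3">
          <xs:simpleType>
            <xs:restriction base="xs:int">
              <xs:minInclusive value="1"/>
              <xs:maxInclusive value="4"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>

        <!-- Range check on a set of names: only 'high' or 'low'. -->
        <xs:element name="priority" default="low">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="high"/>
              <xs:enumeration value="low"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

A configuration file like the following should then be rejected by a validating parser, because retryCount is outside the 1 to 4 range and 'medium' is not one of the allowed priority values:

```xml
<config>
  <serviceEndpoint>https://example.invalid/orders</serviceEndpoint>
  <cachingEnabled>true</cachingEnabled>
  <retryCount>5</retryCount>
  <priority>medium</priority>
</config>
```

Two details are worth knowing. First, xs:boolean treats '1' as true, so the '1' versus 'true' question from earlier gets a defined answer rather than a guess. Second, an element default only applies when the element is present but empty; if you want a default to apply when the setting is left out entirely, an attribute with a default is the usual choice.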

Maybe XML configuration files aren't completely evil. The main complaint I had about XML configuration files was that they were so hard to get right. Isn't that ironic? With YAML and .INI files, it is hard to prove that they are right at all.

How many of my YAML and .INI files are wrong and I just don't know about it?

4 comments:

Matt Doar said...

"hundreds of thousands" of configuration settings? There's one problem - why so many?

Config files by themselves are easy enough, but I find it's managing changes as an app evolves that is hard. The app ends up having to know explicitly how to handle reading and transforming the values for every version. And then how to handle a changed default that now happens to match what the user had set as their custom value - ugg.

Anonymous said...

If you're looking at hundreds and thousands of configuration options, then I can't help but wonder if maybe there's a bit too much complexity up front in whatever solution you've put together.

Something to consider anyway.

Maybe it's time to start looking at the ideas put forward by the convention-over-configuration crowd, in particular the idea of sensible defaults. Comments in your config file that relay the appropriate input types and ranges are probably more than sufficient in most cases, too.

But that's just me. Too much xml makes my eyes bleed.

Tim Stewart said...

Thanks for reading this post.

> There's one problem - why so many?

Different kinds of apps need different amounts of configuration. Your average desktop app or web site might only need dozens or hundreds.

In a system that requires a very high degree of flexibility and must provide numerous fine-grained customizations for a large number of different user groups, thousands, if not tens of thousands, of configuration settings would not be unheard of.

Anonymous said...

nice job! waiting for your new article.