Serializing Configuration Information in Java

Object serialization has many uses in Java, including persisting a particular state, communicating with other applications, and storing an application's configuration while the program is not running. The last use case, storage of configuration information, has its own set of constraints that require special attention when serializing. Simply handing off a configuration object to some serialization service is not enough. Serialization must be planned carefully to produce configuration files that are easy to edit and flexible enough not to break as your application evolves.

Java's Own XMLEncoder

A tempting serialization method uses Java's built-in XMLEncoder, which promises to provide a future-safe XML format for serializing object instances. XMLEncoder uses the getter/setter methods of a JavaBean to save the state of an object in a procedural fashion, effectively recording which getter/setter methods should be called to save or restore an object's state. There's nothing wrong with this approach in principle, but in the context of configuration files XMLEncoder has several drawbacks:

XStream

A marvelous alternative to XMLEncoder is XStream, an open-source library available for free. XStream uses reflection to serialize individual fields, not the result of JavaBean getters and setters. For configuration files XStream offers several advantages:

As an example of how XStream uses reflection to its advantage, a standard list instance such as ArrayList will be serialized simply as <myField><list>…</list></myField> if the type of list involved is obvious from the field. In fact, in most cases XStream will shorten even this representation to an implied form, <myField>…</myField>, in which the list elements are serialized as children of the XML element representing the property itself.
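To see why field-level reflection makes this possible, consider the following toy sketch. It is not XStream and makes no claim about XStream's internals; it uses only java.lang.reflect, and the class names FieldReflectionSketch and Settings and the fields servers and mirrors are invented for illustration. It shows that a reflective serializer can read private fields without getters and can compare each field's declared type with the runtime type of its value, which is the kind of information needed to decide when a type name can be left out of the output.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Toy illustration only: plain java.lang.reflect, NOT the XStream library.
class FieldReflectionSketch {

    // Hypothetical configuration class with private fields and no getters/setters.
    static class Settings {
        private List<String> servers = new ArrayList<>();
        private List<String> mirrors = new LinkedList<>();
    }

    // Lists each field's name, its declared type, and the runtime type of its value.
    static String describe(Object target) throws IllegalAccessException {
        StringBuilder out = new StringBuilder();
        for (Field field : target.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            field.setAccessible(true); // private fields are reachable via reflection
            Object value = field.get(target);
            out.append(field.getName())
               .append(": declared ").append(field.getType().getSimpleName())
               .append(", actual ").append(value.getClass().getSimpleName())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws IllegalAccessException {
        System.out.print(describe(new Settings()));
    }
}
```

Both fields are declared as List, but only one holds the most common implementation. A serializer that can see this difference is in a position to write the common case compactly and to record the concrete class only when it would otherwise be ambiguous.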

Taking Care of Serialization

Because XStream makes it so simple to serialize and deserialize object instances, it is tempting simply to let XStream do its job and ignore the actual output. If all that matters is that objects are saved and restored accurately, this is fine, as XStream is dependable. But for configuration files, information fidelity is not the end of the matter. The application will likely change, as will its configuration file format. Classes will be renamed; new fields will be added; and other fields will be removed. Creating configuration files that are not brittle requires that the configuration classes and resulting files each be examined and the serialization process customized.

In a recent application I worked on, for instance, I had used anonymous inner classes, which XStream serialized and deserialized without complaint. In Java, anonymous inner classes are given compiler-generated names such as com.example.Class$1. In a new version of the application I not only added new anonymous inner classes but also changed their order in the enclosing class, so the numbered names shifted and the wrong classes were deserialized.
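The renaming effect can be reproduced with a small sketch that needs no serialization library at all. The names AnonymousNamingSketch, V1, V2, saveAction, and loadAction are hypothetical stand-ins for two versions of the same enclosing class; the numbering itself is assigned by the compiler, and the behavior described here assumes javac, which numbers anonymous classes in source order.

```java
// Sketch of how anonymous class names shift between versions of a class.
// V1 and V2 are hypothetical stand-ins for "before" and "after" versions.
class AnonymousNamingSketch {

    // Version 1: a single anonymous class.
    static class V1 {
        static Runnable saveAction() {
            return new Runnable() {
                @Override public void run() { /* save */ }
            };
        }
    }

    // Version 2: a new anonymous class is added ABOVE the existing one.
    static class V2 {
        static Runnable loadAction() {
            return new Runnable() {
                @Override public void run() { /* load */ }
            };
        }

        static Runnable saveAction() {
            return new Runnable() {
                @Override public void run() { /* save */ }
            };
        }
    }

    // Returns the binary class name that a serializer would record.
    static String nameOf(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("v1 save: " + nameOf(V1.saveAction()));
        System.out.println("v2 load: " + nameOf(V2.loadAction()));
        System.out.println("v2 save: " + nameOf(V2.saveAction()));
    }
}
```

When compiled with javac, the save action's class name ends in $1 in the first version and in $2 in the second, while the suffix $1 now belongs to the newly added load action. A file written by the first version that records the $1 name would therefore be resolved to the wrong class by the second.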

This anonymous inner class naming issue would arise with whatever serialization library is used. It also illustrates something that I've found to be generally true: the more human-readable the serialization, the more future-proof it is. The more the format conveys the meaning of the configuration information rather than exposing internal implementation details, the longer it will remain valid and usable, because program semantics change less often than implementations.

Below are several serialization rules of thumb that I've found helpful in creating configuration files that are compact, understandable, and flexible with regard to future changes in the application and configuration file format: