Serialized Data Survivability
Developer's Cave, Potential RPG, September 26th, 2008, 12:48pm
Serialized object data is inherently fragile. A change in one class effectively renders any serialized data that includes the class (even deep within the object graph) unusable. This article describes how I have (potentially) addressed this problem in my MMORPG platform.
First, consider the cases in which data serialization (a.k.a. marshalling and unmarshalling bytes) is used on this project. I am serializing objects for client-server network transport, the client data cache, and server persistent storage. Without special care, any class change could break compatibility with old versions of the data. This situation is evident to any Alpha Playtester, as I usually wipe out the database after major updates.
Notice that client-side cache data can be (and is) expunged if any data structure has become incompatible. Also, since I’m using Java Web Start (at least provisionally), I can assume all users are running the latest version of the client software. Therefore, transport compatibility is not a big concern. Nonetheless, other projects may support clients of varying versions. Although I’m not explicitly dealing with this scenario, the technique described below could be used to support forward compatibility.
For the server, serialized data must survive code changes.
The standard advice is to avoid the problem: either maintain yet another external data format, or never change your objects. Since the former is undesirable, and the latter impossible, I’ve attempted to come up with a more robust approach.
Although Potential RPG is (so far) 100% Pure Java, I am not using Java’s built-in Serialization mechanism. Java Serialization is designed (rather elegantly, actually) for convenience at the expense of robustness. Even for some core Java classes (the Swing components, for example), the documentation explicitly warns that serialized objects will not be compatible with future releases. Using Java’s Externalizable interface is an option, but this still binds the developer to Java’s Serialization mechanics.
The approach I’ve developed is an object packing/unpacking system, in which each serializable class has an accompanying byte-handling Packer helper class. This is functionally equivalent to a class-based Externalizable system (but with a cleaner API, IMHO).
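To make the idea concrete, here is a minimal sketch of what a class and its accompanying Packer might look like. This is not the Potential RPG code: the names Item, Packer, ItemPacker, and Packing are hypothetical, and the use of java.io data streams is an assumption made only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical domain class; it knows nothing about its byte layout.
final class Item {
    final String name;
    final int weight;

    Item(String name, int weight) {
        this.name = name;
        this.weight = weight;
    }
}

// Hypothetical helper contract: one Packer per serializable class.
interface Packer<T> {
    void pack(T value, DataOutputStream out) throws IOException;

    T unpack(DataInputStream in) throws IOException;
}

// All byte handling for Item lives here, not in Item itself.
final class ItemPacker implements Packer<Item> {
    @Override
    public void pack(Item item, DataOutputStream out) throws IOException {
        out.writeUTF(item.name);
        out.writeInt(item.weight);
    }

    @Override
    public Item unpack(DataInputStream in) throws IOException {
        // Fields must be read in the same order they were written.
        String name = in.readUTF();
        int weight = in.readInt();
        return new Item(name, weight);
    }
}

// Convenience wrappers that convert between objects and byte arrays.
final class Packing {
    private Packing() {}

    static <T> byte[] toBytes(Packer<T> packer, T value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            packer.pack(value, out);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static <T> T fromBytes(Packer<T> packer, byte[] bytes) {
        try {
            return packer.unpack(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With these pieces, Packing.toBytes(new ItemPacker(), item) produces the bytes and Packing.fromBytes(new ItemPacker(), bytes) rebuilds the object. Keeping the byte layout in a separate helper means the domain class does not have to implement any serialization interface.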
The problem of survivability remains. The one-byte solution is to insert a packing schema version byte before each object’s data. This version is given to the unpacking method when unmarshalling the bytes. The unpacking method can include logic to (a) skip defunct fields and/or (b) set default values for missing fields. This technique provides backward compatibility. (Adapting to future versions is more complex, so I’ll ignore that on this project…)
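A rough sketch of the version byte in action follows. Everything here is assumed for the sake of the example: the Hero class, its fields, and the two schema versions are hypothetical. Version 1 stored a "luck" field that has since been dropped, and version 2 adds a "level" field that old data does not contain.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical class as it looks today (schema version 2).
final class Hero {
    final String name;
    final int gold;
    final int level; // added in version 2; absent from version 1 data

    Hero(String name, int gold, int level) {
        this.name = name;
        this.gold = gold;
        this.level = level;
    }
}

final class HeroPacker {
    static final int CURRENT_VERSION = 2;
    static final int DEFAULT_LEVEL = 1;

    private HeroPacker() {}

    // Packing always writes the current schema, prefixed by its version byte.
    static byte[] pack(Hero hero) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeByte(CURRENT_VERSION);
            out.writeUTF(hero.name);
            out.writeInt(hero.gold);
            out.writeInt(hero.level);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Unpacking reads the version byte first and adapts to older layouts.
    static Hero unpack(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            int version = in.readUnsignedByte();
            if (version < 1 || version > CURRENT_VERSION) {
                throw new IllegalStateException("Unsupported schema version: " + version);
            }
            String name = in.readUTF();
            int gold = in.readInt();
            int level;
            if (version == 1) {
                in.readInt();          // (a) skip the defunct "luck" field
                level = DEFAULT_LEVEL; // (b) default for the missing field
            } else {
                level = in.readInt();
            }
            return new Hero(name, gold, level);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the old version 1 layout; exists only to simulate legacy data.
    static byte[] packLegacyV1(String name, int gold, int luck) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeByte(1);
            out.writeUTF(name);
            out.writeInt(gold);
            out.writeInt(luck);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unpacking bytes produced by packLegacyV1 yields a Hero with level set to DEFAULT_LEVEL, while repacking that Hero emits version 2 bytes. Rejecting versions above CURRENT_VERSION reflects the choice to support backward compatibility only.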
Under this design, maintaining backward data compatibility for a class is a matter of augmenting the unpacking routine. Notice that the packing routine always produces an updated serialized data structure. Therefore, it is possible to write a routine that upgrades the database in one pass; or, the server can be left to lazily update objects as it encounters them.
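Both upgrade strategies can be sketched in a few lines. The example below is only an illustration under some loud assumptions: the database is stood in for by an in-memory Map, each record starts with its version byte, and the repack function (unpack the old bytes, then pack them again) is supplied by the caller. StoreUpgrader and its method names are hypothetical.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

final class StoreUpgrader {
    private StoreUpgrader() {}

    // A record needs upgrading when its leading version byte is not current.
    static boolean needsUpgrade(byte[] record, int currentVersion) {
        if (record.length == 0) {
            throw new IllegalArgumentException("Record has no version byte");
        }
        return (record[0] & 0xFF) != currentVersion;
    }

    // One-pass upgrade: rewrite every outdated record, return how many changed.
    static int upgradeAll(Map<String, byte[]> store, int currentVersion,
                          UnaryOperator<byte[]> repack) {
        int upgraded = 0;
        for (Map.Entry<String, byte[]> entry : store.entrySet()) {
            if (needsUpgrade(entry.getValue(), currentVersion)) {
                entry.setValue(repack.apply(entry.getValue()));
                upgraded++;
            }
        }
        return upgraded;
    }

    // Lazy upgrade: rewrite a record only when it is actually loaded.
    static byte[] loadAndUpgrade(Map<String, byte[]> store, String key,
                                 int currentVersion, UnaryOperator<byte[]> repack) {
        byte[] record = store.get(key);
        if (record != null && needsUpgrade(record, currentVersion)) {
            record = repack.apply(record);
            store.put(key, record);
        }
        return record;
    }
}
```

The one-pass routine keeps the unpacking code from accumulating old branches forever, since old versions can be retired once the whole store is rewritten. The lazy variant avoids downtime but means every historical version must remain readable for as long as untouched records exist.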
For Playtesters, the next Alpha release will impose a mandatory database reset. After that, if I am diligent in my code, the test world should survive future updates.
Further reading: Durable Java: Serialization

September 26th, 2008 at 2:37 pm
Spiffy. This is an obstacle on just about every project I have worked on. You’ll have to let us know how it works out for you.