Thursday, 2 August 2012

DataContractSerializer in WCF



What is Serialization?

Let’s start with the basics.  Serialization has been a key part of .Net since version 1.  It is the process of converting an object instance into a portable and transferable format.  Objects can be serialized into all sorts of formats.  Serializing to Xml is most often used for its interoperability.  Serializing to binary is useful when you want to send the object from one .Net application to another.  .Net even provides the interfaces and base classes to build your own serializers.  There are libraries out there to serialize to comma-delimited strings, JSON, etc.
Deserialization is the reverse of serialization.  It’s the process of taking some data (Xml, binary, etc.) and converting it back into an object.

What is the XmlSerializer?

For those who may not be familiar with System.Xml.Serialization.XmlSerializer, let’s go over it briefly.  This is the xml serializer that has been around since .Net version 1.  To serialize or deserialize an object, you just need to create an instance of the XmlSerializer for the type you want to work with, then call Serialize() or Deserialize().  It works with streams, so you can serialize to any stream such as a MemoryStream, FileStream, etc.
// Create serializer for the type
System.Xml.Serialization.XmlSerializer xmlSerializer =
    new System.Xml.Serialization.XmlSerializer(typeof(MyType)); 
 
// Serialize from an object to a stream
xmlSerializer.Serialize(stream, myInstanceOfMyType); 
 
// Deserialize from a stream to an object
myInstanceOfMyType = (MyType)xmlSerializer.Deserialize(stream);
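
To make the fragment above concrete, here is a minimal round trip through a MemoryStream.  It is only a sketch: the MyType class with its Name and Count properties, and the XmlSerializerDemo class, are stand-ins invented for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for MyType. XmlSerializer needs a public type
// with a public parameterless constructor.
public class MyType
{
    public string Name { get; set; }
    public int Count { get; set; }
}

public static class XmlSerializerDemo
{
    public static MyType RoundTrip(MyType original)
    {
        // Create serializer for the type
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(MyType));

        using (MemoryStream stream = new MemoryStream())
        {
            // Serialize from an object to a stream
            xmlSerializer.Serialize(stream, original);

            // Rewind the stream, otherwise there is nothing left to read
            stream.Position = 0;

            // Deserialize from a stream to an object
            return (MyType)xmlSerializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        MyType copy = RoundTrip(new MyType { Name = "example", Count = 3 });
        Console.WriteLine(copy.Name + " " + copy.Count); // prints "example 3"
    }
}
```

The one detail that is easy to miss is resetting stream.Position before reading from the same stream you just wrote to.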

What is the DataContractSerializer?

The System.Runtime.Serialization.DataContractSerializer is new in .Net 3.0 and was designed for contract-first development and speed.  Specifically, it was brought in to be used by Wcf, but it can be used for general serialization as well.  Using the DataContractSerializer isn’t much different from using the XmlSerializer.  There are a few more options, but the only real difference is that you call WriteObject() to serialize instead of Serialize(), and ReadObject() to deserialize instead of Deserialize().  It works with the same types of streams, so you can write to memory, files, etc.
DataContractSerializer dataContractSerializer =
    new DataContractSerializer(typeof(MyType)); 
 
// Serialize from an object to a stream
dataContractSerializer.WriteObject(stream, myInstanceOfMyType); 
 
// Deserialize from a stream to an object
myInstanceOfMyType = (MyType)dataContractSerializer.ReadObject(stream);
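
The same round trip with the DataContractSerializer might look like the sketch below.  As before, MyType and its members are hypothetical, and the project is assumed to reference the System.Runtime.Serialization assembly.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical stand-in for MyType, marked up as a data contract.
[DataContract]
public class MyType
{
    [DataMember]
    public string Name { get; set; }

    [DataMember]
    public int Count { get; set; }
}

public static class DataContractSerializerDemo
{
    public static MyType RoundTrip(MyType original)
    {
        DataContractSerializer dataContractSerializer =
            new DataContractSerializer(typeof(MyType));

        using (MemoryStream stream = new MemoryStream())
        {
            // Serialize from an object to a stream
            dataContractSerializer.WriteObject(stream, original);

            // Rewind the stream before reading it back
            stream.Position = 0;

            // Deserialize from a stream to an object
            return (MyType)dataContractSerializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        MyType copy = RoundTrip(new MyType { Name = "example", Count = 3 });
        Console.WriteLine(copy.Name + " " + copy.Count); // prints "example 3"
    }
}
```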

The XmlSerializer has been in .Net since version 1.0 and has served us well for everything from Web Services to serializing to a file.  However, in .Net 3.0 the DataContractSerializer came along, and all of a sudden a lot of guidance suggests that we should use it over the old tried and true XmlSerializer.  Wcf even uses it as the default mechanism for serialization.  The question is, “Is it really better?”  The verdict is yes and no.  Like most things, it depends on your implementation and what you need.  For Wcf, you should prefer the DataContractSerializer.  If you need full control over how the xml looks, though, you should go back to the XmlSerializer.
Let’s look at both of these in detail and leave it up to you to decide which is best for your implementation.  Here are a few of the advantages and disadvantages of each of them:


XmlSerializer advantages:
1. Opt-out rather than opt-in properties to serialize.  This means you don’t have to specify each and every property to serialize, only those you don’t want to serialize
2. Full control over how a property is serialized, including whether it should be a node or an attribute
3. Supports more of the XSD standard
XmlSerializer disadvantages:
1. Can only serialize properties and fields
2. Those members must be public
3. Properties must have a get and a set, which can result in some awkward design
4. Supports a narrower set of types
5. Does not understand the DataContractAttribute or DataMemberAttribute; it ignores them and applies its own rules
DataContractSerializer advantages:
1. Opt-in rather than opt-out properties to serialize.  This means you specify what you want to serialize
2. Because it is opt-in, you can serialize not only properties, but also fields.  You can even serialize non-public members such as private or protected members.  And you don’t need a set on a property either (however, without a setter you can serialize, but not deserialize)
3. Is about 10% faster than the XmlSerializer at serializing data.  Because you don’t have full control over how the data is serialized, there is a lot that can be done to optimize the serialization/deserialization process
4. Understands the SerializableAttribute and knows that such a type needs to be serialized
5. More options and control over KnownTypes
DataContractSerializer disadvantages:
1. No control over how the object is serialized outside of setting the name and the order (for example, you cannot emit Xml attributes)
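
To make the opt-out versus opt-in difference concrete, here is a small side-by-side sketch.  The XmlPerson and ContractPerson classes, their members, and the ShapingDemo helper are all hypothetical names made up for this illustration; the attributes themselves are the standard ones from System.Xml.Serialization and System.Runtime.Serialization.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml.Serialization;

// Hypothetical type shaped with XmlSerializer attributes (opt-out).
public class XmlPerson
{
    [XmlAttribute("Id")]        // emitted as an Xml attribute, not an element
    public int Id { get; set; }

    [XmlElement("full_name")]   // element renamed in the output
    public string Name { get; set; }

    [XmlIgnore]                 // explicitly opted out
    public string Secret { get; set; }
}

// Hypothetical type shaped with data contract attributes (opt-in).
[DataContract(Name = "Person")]
public class ContractPerson
{
    [DataMember(Name = "full_name", Order = 1)]
    public string Name { get; set; }

    [DataMember(Order = 2)]
    public int Id { get; set; }

    // No [DataMember], so this is never serialized
    public string Secret { get; set; }
}

public static class ShapingDemo
{
    public static string ToXmlWithXmlSerializer(XmlPerson person)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(XmlPerson));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, person);
            return writer.ToString();
        }
    }

    public static string ToXmlWithDataContractSerializer(ContractPerson person)
    {
        DataContractSerializer serializer =
            new DataContractSerializer(typeof(ContractPerson));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, person);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        Console.WriteLine(ToXmlWithXmlSerializer(
            new XmlPerson { Id = 7, Name = "Ada", Secret = "hidden" }));
        Console.WriteLine(ToXmlWithDataContractSerializer(
            new ContractPerson { Id = 7, Name = "Ada", Secret = "hidden" }));
    }
}
```

In the XmlSerializer output Id appears as an attribute on the root element, while the DataContractSerializer can only rename and order elements.  In both outputs the Secret value is absent, but for opposite reasons: one was opted out, the other was never opted in.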



Below is an example class set up to use the DataContractSerializer.  Notice that I am explicitly setting the DataMemberAttribute on the properties I want to serialize, but not on the others.
[DataContract]
public class Individual
{
    private string m_FirstName;
    private string m_LastName;
    private int m_SocialSecurityNumber; 
 
    [DataMember]
    public string FirstName
    {
        get { return m_FirstName; }
        set { m_FirstName = value; }
    } 
 
    [DataMember]
    public string LastName
    {
        get { return m_LastName; }
        set { m_LastName = value; }
    } 
 
    // No [DataMember] here, so this property is not serialized
    public int SocialSecurityNumber
    {
        get { return m_SocialSecurityNumber; }
        set { m_SocialSecurityNumber = value; }
    } 
 
    public Individual()
    {
    }
    public Individual(string firstName, string lastName)
    {
        m_FirstName = firstName;
        m_LastName = lastName;
    }
}
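
As a quick check of the opt-in behaviour, the sketch below serializes an Individual and reads it back.  It assumes a compacted version of the class above that uses auto-properties to stay short; the IndividualDemo helper and the sample values are made up for the illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Compact version of the Individual class, using auto-properties.
[DataContract]
public class Individual
{
    [DataMember]
    public string FirstName { get; set; }

    [DataMember]
    public string LastName { get; set; }

    // No [DataMember], so this property is not serialized
    public int SocialSecurityNumber { get; set; }
}

public static class IndividualDemo
{
    public static string Serialize(Individual individual)
    {
        DataContractSerializer serializer =
            new DataContractSerializer(typeof(Individual));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, individual);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static Individual Deserialize(string xml)
    {
        DataContractSerializer serializer =
            new DataContractSerializer(typeof(Individual));
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (Individual)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        Individual original = new Individual
        {
            FirstName = "Jane",
            LastName = "Doe",
            SocialSecurityNumber = 123456789
        };

        string xml = Serialize(original);
        Console.WriteLine(xml.Contains("SocialSecurityNumber")); // prints "False"

        Individual copy = Deserialize(xml);
        Console.WriteLine(copy.SocialSecurityNumber); // prints "0"
    }
}
```

The number never reaches the xml, so the deserialized copy simply ends up with the default value for an int.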

One other important thing to talk about with the DataContractSerializer is the pair of ServiceKnownTypeAttribute and KnownTypeAttribute attributes.  These are similar to the XmlIncludeAttribute used by the XmlSerializer.  When used in Wcf, these identify what types should be represented in the WSDL that is generated.
The KnownTypeAttribute specifies types that should be recognized by the DataContractSerializer when serializing and deserializing a type.  It is applied to a class and tells the serializer which additional types may show up at run time where the class only declares a base type, an interface, or object.  You don’t need to specify known .Net types, and a member whose declared type is the concrete type is picked up automatically, but custom classes that can appear in place of a declared base type should be added here.  This attribute can be used multiple times to identify multiple types.
[DataContract]
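// Not strictly required here because TheOtherType is declared as MyOtherType;
// it becomes necessary when the member is declared as object or a base type.
[KnownType(typeof(MyOtherType))]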
public class MyType
{
    [DataMember]
    public MyOtherType TheOtherType;
} 
 
[DataContract]
public class MyOtherType
{
    [DataMember]
    public string MyValue;
}
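
The attribute earns its keep when the runtime type differs from the declared type.  The sketch below uses a hypothetical Shape base class and a Circle derived class (both invented for this illustration) and serializes a Circle through a serializer created for Shape.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical base type. The KnownType attribute tells the serializer
// that a Circle may appear wherever a Shape is expected.
[DataContract]
[KnownType(typeof(Circle))]
public class Shape
{
    [DataMember]
    public string Name { get; set; }
}

// Hypothetical derived type.
[DataContract]
public class Circle : Shape
{
    [DataMember]
    public double Radius { get; set; }
}

public static class KnownTypeDemo
{
    public static Shape RoundTrip(Shape shape)
    {
        // The serializer is created for the base type only
        DataContractSerializer serializer =
            new DataContractSerializer(typeof(Shape));

        using (MemoryStream stream = new MemoryStream())
        {
            serializer.WriteObject(stream, shape);
            stream.Position = 0;
            return (Shape)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        Shape copy = RoundTrip(new Circle { Name = "unit", Radius = 1.0 });
        Console.WriteLine(copy.GetType().Name); // prints "Circle"
    }
}
```

If you remove the KnownType attribute from Shape, the WriteObject call throws a SerializationException because Circle is not an expected type.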
The ServiceKnownTypeAttribute specifies known types to be used by a service when serializing or deserializing.  It is applied to a ServiceContract or to an OperationContract and specifies what types are used in the methods.  Again (like the KnownTypeAttribute), you don’t need to specify known .Net types and this attribute can be used multiple times to identify multiple types.
[ServiceContract]
[ServiceKnownType(typeof(MyType))]
[ServiceKnownType(typeof(MyOtherType))]
public interface MyService
{
    [OperationContract]
    [ServiceKnownType(typeof(YetAnotherType))]
    void MyMethod();
}
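
A slightly more realistic contract is sketched below, where an operation is declared in terms of a base type.  Shape, Circle, and IShapeService are hypothetical names, and the project is assumed to reference System.ServiceModel.  The Main method only uses reflection to list the declared known types; it does not host a service.

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

// Hypothetical data contracts
[DataContract]
public class Shape
{
    [DataMember]
    public string Name { get; set; }
}

[DataContract]
public class Circle : Shape
{
    [DataMember]
    public double Radius { get; set; }
}

// Hypothetical service contract. The operation is declared in terms of
// Shape, but an implementation may return a Circle at run time.
[ServiceContract]
[ServiceKnownType(typeof(Circle))]
public interface IShapeService
{
    [OperationContract]
    Shape GetShape(string name);
}

public static class ServiceKnownTypeDemo
{
    public static void Main()
    {
        // List the known types declared on the contract
        object[] attributes = typeof(IShapeService)
            .GetCustomAttributes(typeof(ServiceKnownTypeAttribute), false);

        foreach (ServiceKnownTypeAttribute attribute in attributes)
        {
            Console.WriteLine(attribute.Type.Name); // prints "Circle"
        }
    }
}
```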

When to use which serializer?

For Wcf, you should prefer the DataContractSerializer.  If you need full control over how the xml looks, though, you should go back to the XmlSerializer.  If you are doing general serialization, it is up to you, but I would weigh the advantages and disadvantages.  I would still prefer the DataContractSerializer for the same reasons I prefer it for Wcf.  I do not recommend using the NetDataContractSerializer unless you have to.  You can lose too much interoperability, and it’s not as descriptive.
If you need some custom xml serializer, by all means go ahead and implement it.  Wcf supports any serializer you can throw at it.  Just be careful not to “reinvent the wheel”.  The XmlSerializer is very configurable and may suit your needs.  If that doesn’t work, then implementing IXmlSerializable (or ISerializable) gives you full control.
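
If you do decide to go back to the XmlSerializer inside Wcf, one way to do it is the XmlSerializerFormatAttribute from System.ServiceModel, which can be placed on a service contract.  The sketch below assumes a hypothetical Order type and IOrderService contract; Main only confirms through reflection that the attribute is present and does not host anything.

```csharp
using System;
using System.ServiceModel;
using System.Xml.Serialization;

// Hypothetical message type shaped with XmlSerializer attributes
public class Order
{
    [XmlAttribute]
    public int Id { get; set; }

    public string Description { get; set; }
}

// Hypothetical contract. XmlSerializerFormat asks Wcf to use the
// XmlSerializer instead of the DataContractSerializer for this contract.
[ServiceContract]
[XmlSerializerFormat]
public interface IOrderService
{
    [OperationContract]
    Order GetOrder(int id);
}

public static class XmlSerializerFormatDemo
{
    public static void Main()
    {
        bool usesXmlSerializer = Attribute.IsDefined(
            typeof(IOrderService), typeof(XmlSerializerFormatAttribute));
        Console.WriteLine(usesXmlSerializer); // prints "True"
    }
}
```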
