Technical Overview
This article discusses the general topic of serialization in .NET and gives an overview of the implementation details in FsPickler.
Serialization in the .NET framework
Serialization in the .NET framework is something often considered as being problematic or even broken. This could be tracked to the following reasons:
-
There is no runtime or type system support for serialization. Rather, serializers are library implementations that mostly depend on the reflection system to extrapolate serialization rules for each type (resulting in runtime errors if this is not possible).
-
There is confusion as to which is the prefered methodology when defining serialization semantics for types. The BCL itself comes with a multitude of approaches most of which are dated, unsafe and inefficient by design.
BCL Serialization methods
The BCL comes with the following patterns for defining serializable types:
-
Field-based serialization: objects are pickled by serializing the contents of their fields. This behaviour is tuned by applying an assortment of attributes such as
NonSerialized
,OnSerializing
,OnDeserialized
and theIDeserializationCallback
interface. This is arguably the most problematic approach, since it is sensitive to internal implementation details. Incidentally, this is the pattern of choice for most F# types. -
Property-based or DataContract serialization: objects are pickled by serializing the contents of specially flagged properties. Properties should be settable and are usually required to be public. In most cases, a parameterless constructor is also necessary. Property-based serialization is probably the most popular approach.
-
ISerializable types: possibly intended as a type-safe replacement of field-based serialization. It fails to live up to this role, mostly due to the nature of interfaces; it still requires a special constructor for deserialization which, if missing, would result in a runtime error. It requires separate implementations for serialization and deserialization, often leading to bugs and introducing lots of boilerplate code. Its dynamic nature also makes it impossible to reason about the components it serializes a priori. It is however an indispensible pattern since it is used by many BCL types, most important being exceptions, whose metadata is impossible to serialize otherwise.
Serialization Libraries
The BCL ships with a couple of general-purpose serializer libraries but these suffer either from obsolescence (e.g. BinaryFormatter) or performance (NetDataContractSerializer). Many third-party libraries have emerged attempting to address these shortcomings, some with success. However, most of them seem to suffer when it comes to properly supporting serialization of .NET objects. In particular many of them:
offer sketchy or no support for reflection types, like
MemberInfo
.- require library-specific dependencies for serializable type definitions.
- do not support inaccessible types/constructors/properties.
- fail to address subtype polymorphism.
- do not support types like multi-dimensional arrays or delegates.
- fail to properly handle cyclic objects.
- have limited or nonexistent support for F# types.
FsPickler : Motivation
The need for a new serialization library was realized while developing for MBrace. MBrace is a framework for distributed computation and big data that is essentially built on the idea of a distributed continuation monad. To be able to distribute continuations, you need to be able to serialize closures. The inherently intrinsic and arbitrary nature of closures means that they are not supported by most serialization libraries. Initially we started using NetDataContractSerializer as our library of choice, but that eventually took its toll on overall performance, so we decided to build a new library from scratch.
FsPickler was created with two goals in mind: performance and completeness in supported objects. Importantly, it was conceived as a .NET serializer, rather than an F# serializer: if it is a .NET object designed to be serialized, it should be serializable in FsPickler. In that sense, it was designed with a goal to embrace the imperfect world of .NET serialization in its totality, albeit with an eye for correctness.
At this point, we should acknowledge Anton Tayanovskyy and his great F# snippet Union-Friendly Generic Binary Serializer which served as the initial inspiration for this library.
Pickler Combinators
FsPickler is founded on the functional programming notion of picklers and pickler combinators. The concept was originally described in Andrew Kennedy's Pickler Combinators. A pickler is essentially the type:
1: 2: 3: 4: 5: |
|
The Write
and Read
functions represent serialization and deserialization
implementations for values of type 'T
. When compared to other .NET serializers,
picklers prove exceptionally fast since they directly handle the serialized value
without any reliance on reflection-based resolution.
Pickler combinators are functions that operate on picklers, generating picklers of more complex types. For instance, the following could be a pickler combinator implementation for rank-1 arrays:
1: 2: 3: 4: 5: |
|
Given a sufficient range of primitive picklers and pickler combinators, it is possible to define pickling rules for most essentially serializable .NET types. However, these would still need to be declared manually by their implementor.
FsPickler attempts to solve this by providing an automated pickler generation framework: picklers are generated at runtime and on demand using a combination of reflection and dynamic IL generation. Picklers are cached for future use, hence the cost of generation has a constant price.
Moreover, it extends the concept of picklers so that features particular to the .NET framework are accommodated: object orientation and subtype polymorphism, null references and cyclic objects.
It should be mentioned for the sake of completeness that a similar effort has been undertaken by the Scala community. There are differences of approach however, since this mostly relies on the metaprogramming facility offered by scala.
Serializable Types
So what qualifies as a serializable type in FsPickler?
The short answer would be any type 'T
for which we present an instance of
Pickler<'T>
.
To make the question more meaningful: for what types does the library auto-generate picklers?
This is something that can only be answered at runtime, typically by calling
1:
|
|
Primitive types and string
are serializable but pointers, COM objects and MarshalByRef types are not.
A managed class or struct is serializable if and only if it carries the Serializable
flag or attribute
and moreover satisfies either of the following:
- is array/nullable/tuple/enum of serializable element types.
implements the
ISerializable
interface and has a matching constructor implementation.-
carries the
DataContract
attribute and allDataMember
members are of serializable type. Properties carrying theDataMember
attribute must be settable. Parameterless constructors or public properties are not required. - is an F# union or record type in which all contained fields are serializable.
carries the
CustomPickler
attribute and an accompanying pickler factory method.-
is neither of the above and all its contained fields are of serializable type. Parameterless constructors or public fields are not required.
Object Oriented Picklers
We now discuss how the functional programming concept of picklers can be reconciled with the object-oriented .NET framework. It is useful to note here that reference type serialization essentially means object graph serialization. Object graphs in .NET can often contain null references or cycles, which ought to be fully supported. FsPickler automatically augments reference types so that such occurences are handled effectively.
Object graphs are serialized by performing depth-first traversal;
this approach improves performance by eliminating unnecessary
boxings in object queues but is sensitive to stack overflow conditions.
FsPickler uses
ObjectIdGenerator
for keeping track of traversed references and detecting cycles.
Cyclic objects are re-instantiated lazily through shallow field copying.
Subtyping
FsPickler discriminates between reference types that are sealed and those that are not sealed. A value of non-sealed type can either be instance of this type or of a proper subtype. In the context of open hierarchies, it is clear that a type-fixed pickler implementation simply cannot cater to arbitrary subtype instances.
For this reason, picklers will only be used to serialize values whose reflected type is identical to the pickler's. When coming across an object of non-sealed type, FsPickler will examine its reflected type; if found to differ, a new pickler will be generated on-the-fly for that particular subtype.
The above implies that obj
is serializable in FsPickler, but Pickler<obj>
will only be used with obj()
instances. Similarly, picklers for interfaces
or abstract classes are possible, but on their own are incapable of serializing anything.
This is how serialization of F# closures is made possible:
for every internal closure type generated by the F# compiler,
a special pickler is generated for handling it.
Pickler Generation
Conceptually, pickler implementations found in the library can roughly be separated into the following categories:
-
Atomic picklers: self containing pickler implementations for types like primitives, strings,
System.DateTime
andSystem.Guid
. These do little other than delegating serialization to the underlying pickle format implementation. -
Pickler combinators: as described earlier, these are parametric functions generating picklers out of simpler components. Examples are option, list, array and tuple picklers.
-
Pickler generators: for nominal types that are not expressions on known types, a different generation strategy is required. The type structure is determined using reflection and a new pickler is constructed from that information. FsPickler can emit dynamic methods to ensure that no reflection or boxing penalty is incurred during actual serialization. Pickler generators can be further separated into the following kinds:
- Field-based pickler generation.
- DataContract pickler generation.
- ISerializable pickler generation.
- F# union/record/exception pickler generation.
Putting it all together
So how does FsPickler generate a pickler (or fail trying) for a given type parameter? I will attempt to explain by giving a simplified example:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: |
|
The above declarations define a notion of "type shape", a partitioning of types according
to certain traits satisfied. This pattern gives naturally rise to a "type shape visitor",
which we will be making use of later. Given an arbitrary type 'T
, we can use reflection to extract
and construct the shape it fits in, albeit in a packaged manner:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: |
|
We can now use the type shape visitor to plug in all required pickler combinators. This will result in a generator function that recursively calls appropriate pickler combinators in a strongly typed manner:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: |
|
The generate
function can now be used for auto-generating derivative picklers:
1: 2: 3: |
|
module Pickler
from MBrace.FsPickler.Combinators
--------------------
type Pickler =
private new : Type -> Pickler
abstract member private Cast : unit -> Pickler<'S>
abstract member private Unpack : IPicklerUnpacker<'U> -> 'U
abstract member private UntypedAccept : state:VisitState -> value:obj -> unit
abstract member private UntypedClone : state:CloneState -> value:obj -> obj
abstract member private UntypedRead : state:ReadState -> tag:string -> obj
abstract member private UntypedWrite : state:WriteState -> tag:string -> value:obj -> unit
abstract member ImplementationType : Type
abstract member IsCacheByRef : bool
abstract member IsCloneableOnly : bool
...
--------------------
type Pickler<'T> =
{ Write: WriteState -> 'T -> unit
Read: ReadState -> 'T }
private new : formatter:IPickleFormatWriter * resolver:IPicklerResolver * reflectionCache:ReflectionCache * isHashComputation:bool * disableSubtypeResolution:bool * ?streamingContext:StreamingContext * ?sifter:IObjectSifter -> WriteState
member private GetObjectId : obj:obj * firstTime:byref<bool> -> int64
member private Reset : unit -> unit
member private CyclicObjectSet : HashSet<int64>
member DisableSubtypeResolution : bool
member private Formatter : IPickleFormatWriter
member IsHashComputation : bool
member private ObjectCount : int64
member private ObjectStack : Stack<int64>
member private PicklerResolver : IPicklerResolver
...
private new : formatter:IPickleFormatReader * resolver:IPicklerResolver * reflectionCache:ReflectionCache * disableSubtypeResolution:bool * disableAssemblyLoading:bool * ?streamingContext:StreamingContext * ?sifted:(int64 * obj) [] -> ReadState
member private EarlyRegisterArray : array:Array -> unit
member private NextObjectId : unit -> int64
member private Reset : unit -> unit
member DisableAssemblyLoading : bool
member DisableSubtypeResolution : bool
member private Formatter : IPickleFormatReader
member private IsUnSifting : bool
member private ObjectCache : Dictionary<int64,obj>
member private ObjectCount : int64
...
val int : value:'T -> int (requires member op_Explicit)
--------------------
type int = int32
--------------------
type int<'Measure> = int
member Clone : unit -> obj
member CopyTo : array:Array * index:int -> unit + 1 overload
member GetEnumerator : unit -> IEnumerator
member GetLength : dimension:int -> int
member GetLongLength : dimension:int -> int64
member GetLowerBound : dimension:int -> int
member GetUpperBound : dimension:int -> int
member GetValue : [<ParamArray>] indices:int[] -> obj + 7 overloads
member Initialize : unit -> unit
member IsFixedSize : bool
...
private new : unit -> FsPickler
static member Clone : value:'T * ?pickler:Pickler<'T> * ?streamingContext:StreamingContext -> 'T
static member ComputeHash : value:'T * ?hashFactory:IHashStreamFactory -> HashResult
static member ComputeSize : value:'T * ?pickler:Pickler<'T> -> int64
static member CreateBinarySerializer : ?forceLittleEndian:bool * ?typeConverter:ITypeNameConverter * ?picklerResolver:IPicklerResolver -> BinarySerializer
static member CreateObjectSizeCounter : ?encoding:Encoding * ?resetInterval:int64 -> ObjectSizeCounter
static member CreateXmlSerializer : ?typeConverter:ITypeNameConverter * ?indent:bool * ?picklerResolver:IPicklerResolver -> XmlSerializer
static member EnsureSerializable : graph:'T * ?failOnCloneableOnlyTypes:bool -> unit
static member GatherObjectsInGraph : graph:obj -> obj []
static member GatherTypesInObjectGraph : graph:obj -> Type []
...
static member FsPickler.IsSerializableType : t:Type -> bool
type AbstractClassAttribute =
inherit Attribute
new : unit -> AbstractClassAttribute
--------------------
new : unit -> AbstractClassAttribute
type TypeShape<'T> =
new : unit -> TypeShape<'T>
abstract member Accept : ITypeShapeVisitor<'R> -> 'R
--------------------
new : unit -> TypeShape<'T>
interface
abstract member VisitInt : unit -> 'R
abstract member VisitList : unit -> 'R
abstract member VisitSet : unit -> 'R
abstract member VisitString : unit -> 'R
abstract member VisitTuple : unit -> 'R
end
type ShapeInt =
inherit TypeShape<int>
new : unit -> ShapeInt
override Accept : v:ITypeShapeVisitor<'e> -> 'e
--------------------
new : unit -> ShapeInt
type ShapeString =
inherit TypeShape<string>
new : unit -> ShapeString
override Accept : v:ITypeShapeVisitor<'d> -> 'd
--------------------
new : unit -> ShapeString
val string : value:'T -> string
--------------------
type string = String
type ShapeTuple<'T,'S> =
inherit TypeShape<'T * 'S>
new : unit -> ShapeTuple<'T,'S>
override Accept : v:ITypeShapeVisitor<'c> -> 'c
--------------------
new : unit -> ShapeTuple<'T,'S>
type ShapeList<'T> =
inherit TypeShape<'T list>
new : unit -> ShapeList<'T>
override Accept : v:ITypeShapeVisitor<'b> -> 'b
--------------------
new : unit -> ShapeList<'T>
type ShapeSet<'T (requires comparison)> =
inherit TypeShape<Set<'T>>
new : unit -> ShapeSet<'T>
override Accept : v:ITypeShapeVisitor<'a> -> 'a
--------------------
new : unit -> ShapeSet<'T>
module Set
from Microsoft.FSharp.Collections
--------------------
type Set<'T (requires comparison)> =
interface IReadOnlyCollection<'T>
interface IComparable
interface IEnumerable
interface IEnumerable<'T>
interface ICollection<'T>
new : elements:seq<'T> -> Set<'T>
member Add : value:'T -> Set<'T>
member Contains : value:'T -> bool
override Equals : obj -> bool
member IsProperSubsetOf : otherSet:Set<'T> -> bool
...
--------------------
new : elements:seq<'T> -> Set<'T>
static member CreateInstance<'T> : unit -> 'T + 9 overloads
static member CreateInstanceFrom : assemblyFile:string * typeName:string -> ObjectHandle + 2 overloads
Activator.CreateInstance(type: Type) : obj
Activator.CreateInstance(type: Type, nonPublic: bool) : obj
Activator.CreateInstance(assemblyName: string, typeName: string) : Runtime.Remoting.ObjectHandle
Activator.CreateInstance(type: Type, [<ParamArray>] args: obj []) : obj
Activator.CreateInstance(assemblyName: string, typeName: string, activationAttributes: obj []) : Runtime.Remoting.ObjectHandle
Activator.CreateInstance(type: Type, args: obj [], activationAttributes: obj []) : obj
Activator.CreateInstance(type: Type, bindingAttr: Reflection.BindingFlags, binder: Reflection.Binder, args: obj [], culture: Globalization.CultureInfo) : obj
Activator.CreateInstance(type: Type, bindingAttr: Reflection.BindingFlags, binder: Reflection.Binder, args: obj [], culture: Globalization.CultureInfo, activationAttributes: obj []) : obj
Activator.CreateInstance(assemblyName: string, typeName: string, ignoreCase: bool, bindingAttr: Reflection.BindingFlags, binder: Reflection.Binder, args: obj [], culture: Globalization.CultureInfo, activationAttributes: obj []) : Runtime.Remoting.ObjectHandle
module Pickler
from MBrace.FsPickler.Combinators
--------------------
type Pickler =
private new : Type -> Pickler
abstract member private Cast : unit -> Pickler<'S>
abstract member private Unpack : IPicklerUnpacker<'U> -> 'U
abstract member private UntypedAccept : state:VisitState -> value:obj -> unit
abstract member private UntypedClone : state:CloneState -> value:obj -> obj
abstract member private UntypedRead : state:ReadState -> tag:string -> obj
abstract member private UntypedWrite : state:WriteState -> tag:string -> value:obj -> unit
abstract member ImplementationType : Type
abstract member IsCacheByRef : bool
abstract member IsCloneableOnly : bool
...
--------------------
type Pickler<'T> =
inherit Pickler
private new : unit -> Pickler<'T>
abstract member Accept : state:VisitState -> value:'T -> unit
abstract member Clone : state:CloneState -> value:'T -> 'T
abstract member Read : state:ReadState -> tag:string -> 'T
abstract member Write : state:WriteState -> tag:string -> value:'T -> unit
override private Unpack : IPicklerUnpacker<'R> -> 'R
override private UntypedAccept : state:VisitState -> value:obj -> unit
override private UntypedClone : state:CloneState -> obj -> obj
override private UntypedRead : state:ReadState -> tag:string -> obj
...