By Ugorji Nwoke
15 Dec 2014
Re-Introducing Go Codec Library: msgpack, binc, cbor, json and more formats
The go-codec library is a high-performance, feature-rich and idiomatic
Go encoding/decoding library for binc, msgpack, cbor and json,
with support for runtime reflection or compile-time code generation.
View the source at http://github.com/ugorji/go.
Sometime in 2013, we announced go-codec
as a library for msgpack. The go-codec library has come a long way since then.
NOTE: *This is the first article of a series on go-codec.*
To whet your appetite, I will show you some benchmark numbers gathered comparing
encoding/json from the standard library to json support offered by go-codec.
See below for the raw data and further quick analysis of the results.
New Features provided by go-codec
Currently, go-codec provides best-of-breed support for the following formats:
messagepack: binary
cbor: binary, streaming, explicitly delimited maps/arrays (NEW)
json: text, streaming, explicitly delimited (NEW)
binc: binary, symbols
All these formats inherit the go-codec features described below.
This update provides the following new features:
Much improved performance
Fast (no-reflection) encoding/decoding of common maps and slices.
The fast non-reflection support is enabled for all combinations
of maps and slices of builtin types e.g. map[string]uint32, []int16, etc.
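To give a feel for what a non-reflection fast path means, here is a minimal sketch. It is not go-codec's implementation: the function names (encodeFast, encodeInt16s, encodeStrings) and the JSON-like output are assumptions made up for illustration. The idea is simply that a type switch recognises common concrete types and handles them with straight-line code, and everything else falls back to the slower reflection path.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encodeInt16s is a hypothetical fast path: it writes a []int16 as a
// JSON-style array without touching the reflect package.
func encodeInt16s(v []int16) string {
	var sb strings.Builder
	sb.WriteByte('[')
	for i, n := range v {
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(strconv.FormatInt(int64(n), 10))
	}
	sb.WriteByte(']')
	return sb.String()
}

// encodeStrings is a second hypothetical fast path, for []string.
func encodeStrings(v []string) string {
	quoted := make([]string, len(v))
	for i, s := range v {
		quoted[i] = strconv.Quote(s)
	}
	return "[" + strings.Join(quoted, ",") + "]"
}

// encodeFast tries the known fast paths. ok is false when the caller
// should fall back to a generic, reflection-based encoder.
func encodeFast(v interface{}) (out string, ok bool) {
	switch x := v.(type) {
	case []int16:
		return encodeInt16s(x), true
	case []string:
		return encodeStrings(x), true
	}
	return "", false
}

func main() {
	fmt.Println(encodeFast([]int16{1, -2, 3}))
	fmt.Println(encodeFast(struct{}{}))
}
```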
Support for code generation
This gives a 2X to 20X performance improvement on top of the already stellar performance.
Support for text-based formats i.e. json
Support for IETF proposed Internet-Of-Things format i.e. cbor
Support for indefinite-length formats to enable true streaming
Read only what is needed
This allows a stream to contain some encoded data, and other data e.g.
a stream contains some msgpack encoded data, then \r\n delimiter, then some json-encoded
data. go-codec supports that efficiently, as it never reads more from the stream
than it needs, and it doesn’t do buffering.
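To see why this matters, it helps to look at a decoder that does buffer. The sketch below uses the standard library's encoding/json (not go-codec) on a stream holding a JSON value, a \r\n delimiter and then other data. The helper name decodeFirst is an assumption for illustration. After one Decode call the underlying reader has been drained, and the bytes that follow the JSON value can only be recovered from the decoder's own buffer.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// decodeFirst decodes one JSON value from the front of stream with the
// standard library decoder. It returns the decoded value, the number of
// bytes still unread in the underlying reader, and the bytes the decoder
// read ahead and is holding in its internal buffer.
func decodeFirst(stream string) (map[string]int, int, string, error) {
	src := strings.NewReader(stream)
	dec := json.NewDecoder(src)

	var value map[string]int
	if err := dec.Decode(&value); err != nil {
		return nil, 0, "", err
	}
	held, err := io.ReadAll(dec.Buffered())
	if err != nil {
		return nil, 0, "", err
	}
	return value, src.Len(), string(held), nil
}

func main() {
	value, unread, held, err := decodeFirst("{\"a\":1}\r\nrest")
	if err != nil {
		panic(err)
	}
	fmt.Printf("value=%v unread=%d held=%q\n", value, unread, held)
}
```

A decoder that reads only what it needs leaves the delimiter and the trailing data in the stream, so the next reader can pick up exactly where the value ended.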
NEVER silently skip data when decoding
User decides whether to return an error or silently skip data when keys or indexes
in the data stream do not map to fields in the struct.
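The same choice can be illustrated with the standard library, which exposes it as Decoder.DisallowUnknownFields. This is a sketch with encoding/json, not go-codec's own option; the User type and the decodeUser helper are assumptions for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// User has a single known field; any other key in the input is "unknown".
type User struct {
	Name string `json:"name"`
}

// decodeUser decodes data into a User. With strict set, keys that do not
// map to a struct field cause an error instead of being skipped.
func decodeUser(data string, strict bool) (User, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	if strict {
		dec.DisallowUnknownFields()
	}
	var u User
	err := dec.Decode(&u)
	return u, err
}

func main() {
	const in = `{"name":"ann","extra":1}`

	u, err := decodeUser(in, false)
	fmt.Println("lenient:", u.Name, err)

	_, err = decodeUser(in, true)
	fmt.Println("strict error:", err != nil)
}
```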
Drop-in replacement for encoding/json. The json: key in struct tags is supported.
This is in addition to the features already supported, but now made more robust
and fully supported via all encoding/decoding paths i.e. runtime reflection, code generation
and fast-path for common maps and slices:
Decode based on the destination data structure.
For example,
decode a uint64 from any kind of number in the data stream
(float, unsigned integer, signed integer, etc)
decode a string from a string or binary byte array in the data stream
Support NIL in the stream in multiple contexts,
decoding it as the zero-value of the data structure.
Full support for encoding.(Text|Binary)(M|Unm)arshaler interfaces.
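As a reminder of what these interfaces look like, here is a small sketch of a type that implements encoding.TextMarshaler and encoding.TextUnmarshaler. It is exercised through the standard library's encoding/json, which honours the same interfaces; the Level and Config types and the helper functions are assumptions for the example.

```go
package main

import (
	"encoding"
	"encoding/json"
	"fmt"
)

// Level is stored as an int but travels on the wire as text.
type Level int

const (
	Low Level = iota
	High
)

// Compile-time checks that Level satisfies both interfaces.
var (
	_ encoding.TextMarshaler   = Level(0)
	_ encoding.TextUnmarshaler = (*Level)(nil)
)

func (l Level) MarshalText() ([]byte, error) {
	switch l {
	case Low:
		return []byte("low"), nil
	case High:
		return []byte("high"), nil
	}
	return nil, fmt.Errorf("unknown level %d", int(l))
}

func (l *Level) UnmarshalText(b []byte) error {
	switch string(b) {
	case "low":
		*l = Low
	case "high":
		*l = High
	default:
		return fmt.Errorf("unknown level %q", b)
	}
	return nil
}

type Config struct {
	Level Level `json:"level"`
}

func encodeConfig(c Config) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

func decodeConfig(s string) (Config, error) {
	var c Config
	err := json.Unmarshal([]byte(s), &c)
	return c, err
}

func main() {
	s, err := encodeConfig(Config{Level: High})
	fmt.Println(s, err)
	c, err := decodeConfig(`{"level":"low"}`)
	fmt.Println(c.Level == Low, err)
}
```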
Decoding without a schema (into an interface{}).
This means decoding into a nil interface{}. Users can specify the type of maps and slices
to use; these default to map[interface{}]interface{} and []interface{} respectively.
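For comparison, this is what schema-less decoding looks like with the standard library's encoding/json, shown here as a sketch rather than as go-codec code. Note that encoding/json always picks map[string]interface{} and []interface{} for JSON objects and arrays, whereas the defaults described above for go-codec are map[interface{}]interface{} and []interface{}. The decodeAny helper is an assumption for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeAny decodes data into a nil interface{} and lets the decoder
// choose the concrete Go types.
func decodeAny(data string) (interface{}, error) {
	var v interface{}
	err := json.Unmarshal([]byte(data), &v)
	return v, err
}

func main() {
	v, err := decodeAny(`{"name":"ann","ids":[1,2]}`)
	if err != nil {
		panic(err)
	}
	// Inspect the dynamic types chosen by the decoder.
	m := v.(map[string]interface{})
	fmt.Printf("%T %T %T\n", v, m["name"], m["ids"])
}
```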
RPC Server and Client Codecs for integration with net/rpc.
This allows the seamless use of the go-codec library for rpc. You do not have to use gob.
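The hook that makes this possible is net/rpc's codec interface: a server accepts any rpc.ServerCodec and a client any rpc.ClientCodec. The sketch below shows where a codec plugs in, using the standard library's jsonrpc codec over an in-memory pipe as a stand-in; the Arith service and the multiplyOverRPC helper are assumptions for the example.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"net/rpc/jsonrpc"
)

// Args and Arith form a tiny example service.
type Args struct{ A, B int }

type Arith struct{}

func (*Arith) Multiply(args *Args, reply *int) error {
	*reply = args.A * args.B
	return nil
}

// multiplyOverRPC runs one call through net/rpc with a non-gob codec.
// The two codec constructors are the only lines that would change when
// swapping in a different wire format.
func multiplyOverRPC(a, b int) (int, error) {
	srv := rpc.NewServer()
	if err := srv.Register(new(Arith)); err != nil {
		return 0, err
	}

	clientConn, serverConn := net.Pipe()
	go srv.ServeCodec(jsonrpc.NewServerCodec(serverConn))

	client := rpc.NewClientWithCodec(jsonrpc.NewClientCodec(clientConn))
	defer client.Close()

	var product int
	err := client.Call("Arith.Multiply", &Args{A: a, B: b}, &product)
	return product, err
}

func main() {
	fmt.Println(multiplyOverRPC(6, 7))
}
```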
Standard field renaming via tags
Support for omitting empty fields during encoding.
If a field has the zero value, it can be skipped. This will reduce the encoded length
and reduce the decoding time. A skipped field is absent from the stream, so decoding leaves
the corresponding destination field untouched. Make sure that the value being decoded into
is a zero-value or a struct which has all fields initialized to their zero-values.
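The caveat about decoding into a zero value is easy to trip over, so here is a sketch of it using the standard library's encoding/json and its omitempty tag option (not go-codec itself). The Settings type and the helper names are assumptions for the example. Decoding into a struct that already holds data keeps the old value of any field that was omitted from the stream.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Settings struct {
	Name    string `json:"name"`
	Retries int    `json:"retries,omitempty"` // skipped when zero
}

func encodeSettings(s Settings) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

// decodeInto decodes data on top of whatever dst already contains.
func decodeInto(dst *Settings, data string) error {
	return json.Unmarshal([]byte(data), dst)
}

func main() {
	// Retries is zero, so it is left out of the encoded form.
	encoded, err := encodeSettings(Settings{Name: "new"})
	if err != nil {
		panic(err)
	}
	fmt.Println(encoded)

	// Decoding into a zero value gives the expected result.
	var fresh Settings
	if err := decodeInto(&fresh, encoded); err != nil {
		panic(err)
	}

	// Decoding into a populated value keeps the stale Retries.
	stale := Settings{Name: "old", Retries: 5}
	if err := decodeInto(&stale, encoded); err != nil {
		panic(err)
	}
	fmt.Println(fresh.Retries, stale.Retries)
}
```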
Extensions to support efficient encoding/decoding of any named types.
For example, type XYZ [8]uint8, type Point3D struct { X, Y, Z uint8 }.
A user can encode and decode XYZ or Point3D above to/from a single unsigned integer.
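The conversion an extension would perform for these two types can be sketched in plain Go. This shows only the value mapping, not how an extension is registered with go-codec; the function names and the byte layout (big-endian, X in the highest used byte) are assumptions, and the struct is called Point3D here because Go identifiers cannot start with a digit.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

type XYZ [8]uint8

type Point3D struct{ X, Y, Z uint8 }

// xyzToUint packs the 8 bytes into one uint64, big-endian.
func xyzToUint(v XYZ) uint64 {
	return binary.BigEndian.Uint64(v[:])
}

// uintToXYZ is the inverse of xyzToUint.
func uintToXYZ(n uint64) XYZ {
	var v XYZ
	binary.BigEndian.PutUint64(v[:], n)
	return v
}

// pointToUint packs X, Y and Z into the low 24 bits of a uint64.
func pointToUint(p Point3D) uint64 {
	return uint64(p.X)<<16 | uint64(p.Y)<<8 | uint64(p.Z)
}

// uintToPoint is the inverse of pointToUint.
func uintToPoint(n uint64) Point3D {
	return Point3D{X: uint8(n >> 16), Y: uint8(n >> 8), Z: uint8(n)}
}

func main() {
	p := Point3D{X: 1, Y: 2, Z: 3}
	n := pointToUint(p)
	fmt.Printf("%#x %v\n", n, uintToPoint(n) == p)

	v := XYZ{0, 0, 0, 0, 0, 0, 1, 2}
	fmt.Println(xyzToUint(v), uintToXYZ(xyzToUint(v)) == v)
}
```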
Encode a struct as an array, and decode struct from an array in the data stream.
This is more compact and efficient, but requires that the exported fields of the
struct stay in the same order.
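To make the trade-off concrete, here is a sketch of the struct-as-array idea done by hand with the standard library's encoding/json. go-codec offers this as a built-in feature; the custom MarshalJSON/UnmarshalJSON methods, the Point type and the helpers below are assumptions used only to show the resulting wire shape and why field order matters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Point struct{ X, Y, Z int }

// MarshalJSON writes the fields positionally: [X, Y, Z]. No field names
// are written, which is where the space saving comes from.
func (p Point) MarshalJSON() ([]byte, error) {
	return json.Marshal([]int{p.X, p.Y, p.Z})
}

// UnmarshalJSON reads the fields back by position, so the order here
// must match the order used when encoding.
func (p *Point) UnmarshalJSON(b []byte) error {
	var a []int
	if err := json.Unmarshal(b, &a); err != nil {
		return err
	}
	if len(a) != 3 {
		return fmt.Errorf("expected 3 elements, got %d", len(a))
	}
	p.X, p.Y, p.Z = a[0], a[1], a[2]
	return nil
}

func encodePoint(p Point) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

func decodePoint(s string) (Point, error) {
	var p Point
	err := json.Unmarshal([]byte(s), &p)
	return p, err
}

func main() {
	s, err := encodePoint(Point{X: 1, Y: 2, Z: 3})
	fmt.Println(s, err)
	p, err := decodePoint(s)
	fmt.Println(p, err)
}
```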
Comprehensive support for anonymous fields.
Whether the anonymous field is a pointer or a value, codec will handle it.
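For readers less familiar with anonymous (embedded) fields, the sketch below shows both the value and the pointer form, using the standard library's encoding/json as a stand-in for go-codec. The Base, Meta and Item types and the helper functions are assumptions for the example. The embedded fields are promoted into the outer object, and a nil embedded pointer is allocated on decode when its fields appear in the input.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Base struct{ ID int }

type Meta struct{ Tag string }

// Item embeds Base by value and Meta by pointer.
type Item struct {
	Base
	*Meta
	Name string
}

func encodeItem(it Item) (string, error) {
	b, err := json.Marshal(it)
	return string(b), err
}

func decodeItem(s string) (Item, error) {
	var it Item
	err := json.Unmarshal([]byte(s), &it)
	return it, err
}

func main() {
	s, err := encodeItem(Item{Base: Base{ID: 1}, Meta: &Meta{Tag: "x"}, Name: "n"})
	fmt.Println(s, err)

	it, err := decodeItem(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(it.ID, it.Meta != nil, it.Name)
}
```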
You can see that, even without code generation, the performance of the go-codec library is extremely
impressive. The standard library takes 20% more time and uses double the allocations during encode,
and almost double the time during decode.
Getting this level of performance was no easy feat. But it was possible because we built the
go-codec library to be high-performance and to support pluggable Handles. Each of these Handles comes
to about 500 lines of code.