View articles in the go-codec series, source at http://github.com/ugorji/go
go-codec supports compile-time generation of encoders and decoders for named types,
which does not incur the overhead of reflection in the typical case,
giving 40% to 100% performance improvement over the idiomatic runtime introspection mode.
Idiomatic encoding and decoding of types within go typically relies on the reflection capabilities of the go runtime. This affords flexibility without the need for a pre-compilation step; the go types contain all the information needed and the runtime exposes the full types via reflection. However, introspecting the runtime to get this information has a noticeable overhead, which can be eliminated by a pre-compilation/code-generation step.
To eliminate that overhead, a pre-compilation step must be done to create the code
which would have been inferred at runtime.
This is why Protocol Buffers, Avro, etc. have better performance than runtime-based systems.
go-codec now provides the same capabilities, with the accompanying 2X-20X performance improvement
depending on the size and structure of the named type.
Let us start with some benchmark numbers to whet your appetite.
Encoding - Runtime
Benchmark__Msgpack____Encode-8         	   14095	     84318 ns/op	    3192 B/op	      44 allocs/op
Benchmark__Binc_______Encode-8         	   14058	     85184 ns/op	    3192 B/op	      44 allocs/op
Benchmark__Simple_____Encode-8         	   13978	     85796 ns/op	    3192 B/op	      44 allocs/op
Benchmark__Cbor_______Encode-8         	   13983	     87215 ns/op	    3192 B/op	      44 allocs/op
Benchmark__Json_______Encode-8         	    6051	    188551 ns/op	    3256 B/op	      44 allocs/op
Benchmark__Std_Json___Encode-8         	    5514	    218973 ns/op	   74474 B/op	     444 allocs/op
Benchmark__Gob________Encode-8         	    6646	    177393 ns/op	  170414 B/op	     591 allocs/op
Benchmark__Bson_______Encode-8         	    4936	    239069 ns/op	  222828 B/op	     364 allocs/op
Encoding - CodeGen
Benchmark__Msgpack____Encode-8         	   28369	     41501 ns/op	     288 B/op	       2 allocs/op
Benchmark__Binc_______Encode-8         	   26284	     45098 ns/op	     288 B/op	       2 allocs/op
Benchmark__Simple_____Encode-8         	   26959	     44700 ns/op	     288 B/op	       2 allocs/op
Benchmark__Cbor_______Encode-8         	   26628	     44320 ns/op	     288 B/op	       2 allocs/op
Benchmark__Json_______Encode-8         	    8064	    141844 ns/op	     352 B/op	       2 allocs/op
Decoding - Runtime
Benchmark__Msgpack____Decode-8         	    5866	    203320 ns/op	   67387 B/op	     913 allocs/op
Benchmark__Binc_______Decode-8         	    5438	    223080 ns/op	   67390 B/op	     913 allocs/op
Benchmark__Simple_____Decode-8         	    5958	    203158 ns/op	   67360 B/op	     913 allocs/op
Benchmark__Cbor_______Decode-8         	    5793	    206755 ns/op	   67373 B/op	     913 allocs/op
Benchmark__Json_______Decode-8         	    3105	    390624 ns/op	   89300 B/op	    1041 allocs/op
Benchmark__Std_Json___Decode-8         	    1365	    855218 ns/op	  138558 B/op	    3032 allocs/op
Benchmark__Gob________Decode-8         	    4135	    296280 ns/op	  156140 B/op	    2242 allocs/op
Benchmark__Bson_______Decode-8         	    2582	    467415 ns/op	  183853 B/op	    4085 allocs/op
Decoding - CodeGen
Benchmark__Msgpack____Decode-8         	    9934	    121373 ns/op	   64070 B/op	     871 allocs/op
Benchmark__Binc_______Decode-8         	    9210	    131006 ns/op	   64072 B/op	     871 allocs/op
Benchmark__Simple_____Decode-8         	    9733	    122189 ns/op	   64068 B/op	     871 allocs/op
Benchmark__Cbor_______Decode-8         	    9968	    123628 ns/op	   64085 B/op	     871 allocs/op
Benchmark__Json_______Decode-8         	    4257	    283405 ns/op	   87471 B/op	    1002 allocs/op
We see that the encoding and decoding times for the binary formats supported by go-codec
are pretty similar, so we will just use cbor as representative of the binary formats,
and also compare json benchmark numbers.
The table below compares encode using runtime support only against a baseline of code generation.
| Format | Time | Memory | Allocations |
|---|---|---|---|
| Cbor | 2.0 X | 11 X | 22 X | 
| Json | 1.3 X | 9 X | 22 X | 
The table below compares decode using runtime support only against a baseline of code generation.
| Format | Time | Memory | Allocations |
|---|---|---|---|
| Cbor | 1.7 X | 1.05 X | 1.05 X | 
| Json | 1.4 X | 1.02 X | 1.04 X | 
There is a very clear benefit to code generation. Code generation gives you better performance in clock time, cpu time and memory usage/allocations. The benefits, especially in memory use, are more pronounced during encoding than during decoding.
I call shenanigans! Reflection in go is not slow. In fact, interfaces/type-switch/etc use the same runtime introspection mechanism under the hood that reflection does.
Let me explain. In go, reflection is a thin layer of runtime introspection support. There is a small computational cost to compute or expose requested information about the types already known to the runtime, or to create new values and return a wrapper (reflect.Value) around them.
However, that thin layer requires that most values be allocated on the heap, and the use of interfaces prevents benefits of escape analysis and inlining. We see a consistent overhead of about 35% added by the runtime.
Note that reflection is an intrinsic part of the go runtime,
and used in core foundational packages like fmt.
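To make the difference concrete, here is a minimal sketch, not taken from go-codec, that encodes the same value twice: once by discovering the fields through reflection, and once with code that already knows the fields at compile time. The `Point` type and both function names are hypothetical, and the output format is invented for the illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// Point is a hypothetical type used only for this illustration.
type Point struct {
	X, Y int64
}

// encodeReflect discovers the fields at run time. The value travels as an
// interface and is wrapped in a reflect.Value before any field is read.
func encodeReflect(v interface{}) string {
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	parts := make([]string, 0, rv.NumField())
	for i := 0; i < rv.NumField(); i++ {
		parts = append(parts, rt.Field(i).Name+"="+strconv.FormatInt(rv.Field(i).Int(), 10))
	}
	return strings.Join(parts, ",")
}

// encodeDirect is the kind of code a generator could emit: field names and
// types are known at compile time, so no introspection is needed.
func encodeDirect(p *Point) string {
	return "X=" + strconv.FormatInt(p.X, 10) + ",Y=" + strconv.FormatInt(p.Y, 10)
}

func main() {
	p := Point{X: 1, Y: 2}
	fmt.Println(encodeReflect(p)) // X=1,Y=2
	fmt.Println(encodeDirect(&p)) // X=1,Y=2
}
```

Both functions produce the same bytes; only the route taken to get there differs.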
codecgen works off a single interface.
type Selfer interface {
	CodecEncodeSelf(*Encoder)
	CodecDecodeSelf(*Decoder)
}
When encoding or decoding a type, if it implements the codec.Selfer interface above,
then it will handle its own encoding and decoding. The Encoder/Decoder checks for this before
checking for extension support or for an implementation of the encoding.(Text|Binary)(M|Unm)arshaler
interfaces.
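The ordering described above can be sketched with a plain type switch. This is only an illustration of the idea, not the library's code: `selfEncoder` is a simplified, hypothetical stand-in for codec.Selfer, and the extension check is left out.

```go
package main

import (
	"encoding"
	"fmt"
)

// selfEncoder is a simplified, hypothetical stand-in for codec.Selfer.
type selfEncoder interface {
	EncodeSelf() string
}

// A implements both interfaces; B only implements encoding.TextMarshaler.
type A struct{}

func (A) EncodeSelf() string           { return "self" }
func (A) MarshalText() ([]byte, error) { return []byte("text"), nil }

type B struct{}

func (B) MarshalText() ([]byte, error) { return []byte("text"), nil }

// encode checks for the self-encoding interface first, then falls back to
// encoding.TextMarshaler, and finally to a placeholder for the reflection path.
func encode(v interface{}) string {
	switch t := v.(type) {
	case selfEncoder:
		return t.EncodeSelf()
	case encoding.TextMarshaler:
		b, err := t.MarshalText()
		if err != nil {
			return "error: " + err.Error()
		}
		return string(b)
	default:
		return "reflection"
	}
}

func main() {
	fmt.Println(encode(A{}), encode(B{}), encode(42)) // self text reflection
}
```

Because cases in a type switch are tried in order, a type that satisfies both interfaces, like `A`, always takes the self-encoding path.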
NOTE: the Canonical option is ignored (not supported AT THIS TIME). If you need Canonical support (e.g. for cbor), then do not use codecgen.
codecgen uses this knowledge to generate type-safe code which does exactly what the regular runtime introspection code does at run-time. It is an amazing feat.
With codecgen, the full feature-set of codec is still supported, including:
codecgen builds fully atop the go-codec package. We needed it to work exactly as the runtime introspection works, so we can leverage all the IP built into the package already.
go-codec at runtime will parse each type needed and create an in-memory structure specifying all important information about the type. codecgen uses all that information and replicates the runtime logic exactly.
codecgen runs in multiple phases:
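1. Parse the input files and gather the types which need a codec.Selfer implementation
2. Create a transient file which calls the codec.Gen(...) function, passing in all the types gathered
3. Run the transient file: go run -tags=XYZ transient-file.go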
The transient file looks like this (error handling removed for conciseness):
	fout, err := os.Create("values_codecgen_generated_test.go")
	var out bytes.Buffer
	var typs []reflect.Type 
	var t0 codec.AnonInTestStruc
	typs = append(typs, reflect.TypeOf(t0))
	var t1 codec.AnonInTestStrucIntf
	typs = append(typs, reflect.TypeOf(t1))
	// <snip>
	codec.Gen(&out, "codecgen", "codec", false, typs...)
	bout, err := format.Source(out.Bytes())
	fout.Write(bout)
The generated file looks like this (details elided):
func (x *MyType) CodecEncodeSelf(e *Encoder) {
	// generated encoding logic for MyType (elided)
}
func (x *MyType) CodecDecodeSelf(d *Decoder) {
	// generated decoding logic for MyType (elided)
}
Using codecgen is very straightforward.
Download and install the tool
go get -u github.com/ugorji/go/codec/codecgen
Run the tool on your files
The command line format is:
codecgen [options] (-o outfile) (infile ...)
% codecgen -?
Usage of codecgen:
  -c string
    	codec path (default "github.com/ugorji/go/codec")
  -d int
    	random identifier for use in generated code
  -nr string
    	regex for type name to exclude (default "^$")
  -nx
    	do not support extensions - support of extensions may cause extra allocation
  -o string
    	out file
  -r string
    	regex for type name to match (default ".*")
  -rt string
    	tags for go run
  -st string
    	struct tag keys to introspect (default "codec,json")
  -t string
    	build tag to put in file
  -x	keep temp file
% codecgen -o values_codecgen.go values.go values2.go moretypedefs.go
That is it.
| Option | Description | 
|---|---|
| -o | Output file. codecgen generates a single output file for all the input files. | 
| -c | If you have vendored the codec package into a different place, use this option to specify a different package path for the codec package. Most users do not need this. | 
| -t | Users may want to only use the code generated file when specific build tags are specified. You can pass some tags and the generated file will have them. | 
| -st | Users can customize the struct tag keys to introspect (default "codec,json"). | 
| -rt | codecgen runs by creating a temporary file, and then using go run to execute it. If the file that you are generating values against needs a build tag, specify it to the codecgen tool. | 
| -x | This is a debugging switch to not delete the transient file which must be passed to go run. | 
| -d | Specify the random integer used during codecgen. This helps reduce churn in generated output, etc. | 
| -r | Specify regex for type name to match (default ".*") | 
| -nr | Specify regex for type name to exclude (default "^$") | 
| -nx | Do not support extensions in generated files. This may help reduce some allocation if you know that you never use extensions. | 
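
To get a feel for how the -r and -nr defaults interact, here is a small sketch using the standard regexp package. It assumes a type is selected when its name matches the -r pattern and does not match the -nr pattern; that rule, the `selectTypes` helper and the sample type names are assumptions for illustration, not codecgen's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// selectTypes keeps the names that match include and do not match exclude.
func selectTypes(names []string, include, exclude string) ([]string, error) {
	inc, err := regexp.Compile(include)
	if err != nil {
		return nil, err
	}
	exc, err := regexp.Compile(exclude)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, n := range names {
		if inc.MatchString(n) && !exc.MatchString(n) {
			out = append(out, n)
		}
	}
	return out, nil
}

func main() {
	names := []string{"TestStruc", "AnonInTestStruc", "internalState"}

	// Defaults: ".*" matches every name, "^$" only matches the empty string.
	all, _ := selectTypes(names, ".*", "^$")
	fmt.Println(all) // [TestStruc AnonInTestStruc internalState]

	// Narrower patterns: names ending in Struc, except those starting with Anon.
	some, _ := selectTypes(names, "Struc$", "^Anon")
	fmt.Println(some) // [TestStruc]
}
```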
Yes.
codecgen can be used easily with go generate.
The easiest way is to create a file, add the generate tag to it, and call codecgen in it. A sample file looks like this:
//+build generate

package mypackage

//go:generate codecgen -o values.generated.go file1.go file2.go file3.go
Run go generate in the directory containing the file.
go-codec updates an internal version each time an incompatible change occurs to the library.
Within an init function, we check that the generated code matches the current supporting library.
If the check fails, we panic in the init so that the application never starts until
the user updates.
The error message looks like:
codecgen version mismatch: current: 1, need 2. Re-generate file: /home/ugorji/depot/repo/src/ugorji.net/codec/values_codecgen_generated_test.go
If you get a similar panic message, either use the older version of the library that the file was generated against, or regenerate your file.
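The shape of such a check is easy to sketch. In the example below the constant `libraryVersion`, the function names and the file name are all hypothetical; generated code would call something like `checkGenVersion` from its init function, and `tryCheck` exists only so the demonstration can show the panic without crashing.

```go
package main

import "fmt"

// libraryVersion is a hypothetical stand-in for the library's internal version.
const libraryVersion = 2

// checkGenVersion panics when the generated code targets another version.
func checkGenVersion(generatedFor int, file string) {
	if generatedFor != libraryVersion {
		panic(fmt.Sprintf("codecgen version mismatch: current: %d, need %d. Re-generate file: %s",
			generatedFor, libraryVersion, file))
	}
}

// tryCheck runs the check and turns a panic into an error for demonstration.
func tryCheck(generatedFor int, file string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	checkGenVersion(generatedFor, file)
	return nil
}

func main() {
	fmt.Println(tryCheck(2, "values_generated.go")) // versions agree: <nil>
	fmt.Println(tryCheck(1, "values_generated.go")) // stale file: mismatch message
}
```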
There are a few other code-generation libraries created for specific formats. They had issues which I will list below:
msgp https://github.com/philhofer/msgp/
The others were non-starters, as they failed to generate implementations for TestStruc.
megajson https://github.com/benbjohnson/megajson
Error: Field contains no name: &{<nil> [] AnonInTestStruc <nil> <nil>}:
ffjson https://github.com/pquerna/ffjson
panic: runtime error: index out of range
bsongen http://godoc.org/github.com/youtube/vitess/go/cmd/bsongen
&{Struct:696 Fields:0xc208063380 Incomplete:false} is not a simple type