The Most Powerful Feature of Go Is The Least Sexy

You’re probably tired of hearing about Go by now: it is the node.js of 2013, the subject of a constant stream of first-impression posts making their way to the top of software development boards.

Sure, there are a few people doing considerable work with it, but by and large the landscape is dominated by language tourists who spend a few hours with it, spit out a blog post, and call the mission accomplished.

It deserves more. It deserves an extended visit, if not permanent occupancy. It has a level of maturity and utility far more refined than one might expect given its young age.

To recap the same advantages that you’ve probably read elsewhere, here are a few of the reasons you should take a deep dive with Go-

  • It is a hybrid of high-level and low-level language features, giving you niceties such as garbage collection and first-class functions with closures, yet…
  • …the language is restricted enough, and its features so well considered, that even this nascent generation of compilers generates extremely efficient, competitive-with-C code — very small and very fast. Go avoids most of the pitfalls of garbage collection (an issue that tends to surface when a solution runs at scale) through very intelligent use of the stack.
  • The tools build stand-alone, self-contained executables very quickly.
  • The supporting library is surprisingly rich.
  • The language is concise and easy to master. It has few edge cases and fits comfortably within the working memory of most developers.
  • The language has some simple niceties that solve decades of programming irritants — for instance, multiple return values from functions, and visibility determined by identifier case.
  • It has excellent cross-platform support, with strong performance on a wide variety of systems.
  • Goroutines are very lightweight and allow for efficient, easy-to-develop concurrency. The native message-passing support (channels) is intuitive, and it makes the normally complex task of communicating between “threads” (or in this case goroutines) extremely simple; see the sketch after this list.
  • The community and language are thus far unburdened with the crippling cargo-cult pattern-worship of some other communities (namely the Java and .NET worlds), allowing you to create robust, maintainable solutions without defensively layering shrouds of DI and factory factories, building shielding surrogates instead of actual quality and maintainability. To be fair, with either Java or C# — both platforms I love and regularly use — you can build lean, mean, problem-solving machines, but the inertia of those communities makes it more likely that you’ll end up with an onion of redundant LINQ queries over your IEnumerable, abstracting out a horrifically inefficient solution to what should be a single-line problem or a coherent data structure.
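To make the goroutine point concrete, here is a minimal sketch (the scenario work is purely illustrative): one goroutine is launched per item of work, and the results come back over a channel.

    package main

    import "fmt"

    // score stands in for whatever per-scenario work needs doing; each
    // invocation runs in its own lightweight goroutine.
    func score(id int, results chan<- string) {
        results <- fmt.Sprintf("scenario %d done", id)
    }

    func main() {
        results := make(chan string)

        // Launch one goroutine per scenario; each costs only a few
        // kilobytes of stack rather than an OS thread.
        for i := 0; i < 4; i++ {
            go score(i, results)
        }

        // Collect the results as they arrive over the channel.
        for i := 0; i < 4; i++ {
            fmt.Println(<-results)
        }
    }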

And the disadvantages of Go as it stands today-

  • The language in some ways feels obsolete. The lack of generics or function overloads demands some code ugliness, which is a bit unfortunate, though this is a facet of the first advantage listed above.
  • There are no exceptions. Is this a disadvantage? Maybe, but multiple return values make this less of a problem than it might be, and they avoid the common (and expensive) use of exceptions for flow control; see the error-handling sketch after this list.
  • The name “Go” has no search selectivity, making simple investigations an exercise in search pain. I’ve seen people argumentatively refute this on Hacker News, but those people are simply lying: one cannot have done considerable work with Go without coming to appreciate this stark reality. The search situation with Go is similar to searching for C# resources back when Google ignored the #. Some advise searching for golang instead, but that unfortunately serves primarily to filter by domain name and excludes a lot of pertinent information.
  • The supporting ecosystem is still somewhat limited. Tooling such as the profiler is workable but not much more.
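And to illustrate the point about exceptions, an error in Go is simply a second return value that the caller inspects on the spot. The parsing function below is invented for the example:

    package main

    import (
        "fmt"
        "strconv"
    )

    // parseRate returns a value and an error rather than throwing; the
    // caller decides immediately how a failure should be handled.
    func parseRate(s string) (float64, error) {
        r, err := strconv.ParseFloat(s, 64)
        if err != nil {
            return 0, fmt.Errorf("bad rate %q: %v", s, err)
        }
        return r / 100, nil
    }

    func main() {
        if rate, err := parseRate("4.25"); err == nil {
            fmt.Println("rate:", rate)
        }

        if _, err := parseRate("oops"); err != nil {
            fmt.Println("handled without an exception:", err)
        }
    }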

Recently I’ve been at work building a high-performance financial computation engine as a component of a solution. The principal development goals are rapid development with provable, tested code that supports easy scale-out deployment across cores and machines, doing the most with the least hardware.

Or rather, doing the most with the hardware, whatever its scale.

In this case the deployed hardware is a cluster of dual-CPU 12-core Ivy Bridge machines, each with 256GB of main memory. Regardless of the scale of the hardware, solving as many financial scenarios as possible in a given timeframe is the goal, and doing more with less gives you leeway to do even more. Optimizing hard and early puts you in a place where you have an open road ahead of you, and incredible flexibility in your deployments.

Efficiency is priority #1, people, because waste is a thief.

This isn’t the web, where a 600ms turnaround time to generate simple markup for the occasional visitor is acceptable. There are very large, islanded sets of data that yield atomic results and can’t easily be scaled out in classic map/reduce fashion.

So Go is an efficient player in this puzzle. Goroutines make for very easy scale-out.

But Go wasn’t enough, despite offering surprisingly high performance for the test financial implementations. For this case there is a need to fully leverage the underlying platform, including automatic vectorization and AVX2-targeting optimizations (AVX for now, AVX2 once deployed to Haswell). This is a situation where four million IRR calculations per core per second is of substantially greater value than 500,000 of the same.

Which brings me to what I think is the most powerful feature of Go — it plays very well with C. Of course most languages play with C fairly well, but never with the utter ease and efficiency you can achieve with Go, where shared data structures can be passed, zero copy, with ease and confidence, and values don’t go through expensive boxing/unboxing or security-gating procedures. IPC — even on the same machine — was a complete non-starter, as the performance requirements wouldn’t have come close to being met.
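Here is a minimal cgo sketch of what that zero-copy hand-off looks like; the C routine is a throwaway stand-in rather than the actual computation library, but the mechanics are the same: C reads the Go slice’s backing array in place, with no marshalling in between.

    package main

    /*
    // Stand-in C routine: reads the caller's array in place.
    double net(const double *xs, int n) {
        double total = 0;
        for (int i = 0; i < n; i++) {
            total += xs[i];
        }
        return total;
    }
    */
    import "C"

    import (
        "fmt"
        "unsafe"
    )

    func main() {
        cashflows := []float64{-1000, 300, 420, 680}

        // Hand C a pointer to the slice's backing array: no copy, no
        // boxing, just the memory Go already holds.
        total := C.net(
            (*C.double)(unsafe.Pointer(&cashflows[0])),
            C.int(len(cashflows)),
        )

        fmt.Println("net cash flow:", float64(total))
    }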

In this particular code I leverage the Intel compiler and its Math Kernel Library, generating libraries that are used within the Go code with utter simplicity via the support provided by cgo. The computations exist as a very high-performance C library, while the orchestration, communications, and data management live within the friendly world of Go.
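The Go side of such a wrapper looks roughly like the following. To be clear, the header, library name, and function signature are invented for illustration (the real MKL-backed interface isn’t shown here); the point is how little ceremony cgo requires to link a natively compiled library and hand it Go-managed memory.

    package irr

    // #cgo LDFLAGS: -lirr
    // #include "irr.h"  /* hypothetical header declaring: double irr(const double *flows, int n); */
    import "C"

    import "unsafe"

    // Rate hands a cash-flow series to the optimized C library and
    // returns the internal rate of return it computes; the slice's
    // backing array is shared with C only for the duration of the call.
    func Rate(cashflows []float64) float64 {
        if len(cashflows) == 0 {
            return 0
        }
        return float64(C.irr(
            (*C.double)(unsafe.Pointer(&cashflows[0])),
            C.int(len(cashflows)),
        ))
    }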

The result is very easily maintained and expanded, and provides the best of both worlds: I pass first-class functions around, with intelligent closure state and easy, safe data management, exposing the subsystem via Go’s generous, feature-rich communications library (with zero-block goroutines), while getting pinnacle performance out of the processors. The result can be trivially deployed on machines big and small, on one or many.
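As a rough sketch of that orchestration style (the real engine’s internals aren’t shown here), a first-class closure capturing shared state can be fanned out across goroutines and joined with a WaitGroup:

    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        scenarios := [][]float64{
            {-1000, 300, 420, 680},
            {-500, 120, 260, 310},
        }
        results := make([]float64, len(scenarios))

        var wg sync.WaitGroup

        // compute is a first-class value; the closure captures results
        // so each goroutine writes its answer into its own slot.
        compute := func(i int, flows []float64) {
            defer wg.Done()
            var net float64
            for _, f := range flows {
                net += f // stand-in for the real C-backed calculation
            }
            results[i] = net
        }

        for i, flows := range scenarios {
            wg.Add(1)
            go compute(i, flows)
        }
        wg.Wait()

        fmt.Println(results)
    }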

Go has proven a very valuable technology partner for this subsystem, and I imagine I’ll be utilizing more of it in the days ahead.

* – Given that I always get asked about this, OpenCL / CUDA remains of limited value for general financial work. This may change with zero-copy shared-memory systems, but as of now the closest analogue that shows value is the Xeon Phi.