The Dark Clouds of Choice

I enjoy Go (the programming language/tooling). It rapidly facilitates efficient, concise solutions. Whether I’m processing some data, building a simple connector web service, or hosting a system monitoring tool (where I might layer it on some C libraries), it makes it a joy to bang out a solution. The no-dependency binary outputs are a bonus.

It’s a great tool to have in pocket, and is fun to leverage.

I recently pondered why this is: why I don’t have the same feeling of delight building solutions on any of the other platforms that I regularly use. C# and Java, for instance, are both absolutely spectacular languages with extraordinary ecosystems. (I’d talk about C and C++ here, as both are in my regular mix, but while there are usually very practical reasons to drop down to them, they don’t really fit in this conversation, even though from a build-a-fast-native-executable perspective they’re the closest.)

Goroutines? Syntactic sugar over thread pools. Channels? That’s a concurrent queue. Syntactic sugar is nice and lubricates usage, but once you’ve done it enough times it just becomes irrelevant. A static native build? A bonus, but not a critical requirement.
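That equivalence is easy to demonstrate. A minimal sketch (illustrative only, not from any of my production code): an unbuffered channel drained by a fixed set of goroutines behaves exactly like a concurrent queue feeding a thread pool.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares distributes the integers 1..n over a channel consumed by a
// fixed pool of worker goroutines: a concurrent queue feeding a thread pool.
func sumSquares(n, workers int) int {
	jobs := make(chan int)
	results := make(chan int, n)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ { // the "thread pool"
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs { // the channel is the concurrent queue
				results <- j * j
			}
		}()
	}

	for i := 1; i <= n; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	close(results)

	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(sumSquares(10, 4)) // prints 385
}
```

Pleasant to write, certainly, but nothing you couldn’t express with an ExecutorService or a BlockingCollection.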

There is nothing I build in Go that I can’t build at a similar pace in the other languages. And those languages have rich IDEs and toolsets, while Go remains remarkably spartan.

The reason I enjoy it, I think, is that Go is young enough that it isn’t overloaded with the paradox of choice: You throw together some Go using the basic idioms and widely acknowledged best practices, compile it, and there’s your solution. Move on. A lot of strongly held opinions are appearing (about dependencies and versioning, etc — things Go should have gotten right in v1), and an evolutionary battle is happening between a lot of supporting libraries, but ultimately you can mostly ignore it. Happily build and enjoy.

Doing the same project in Java or C#, on the other hand, is an endless series of diverging multi-path forks. Choices. For the most trivial of needs you have countless options of implementation approaches and patterns, and supporting libraries and dependencies. With each new language iteration the options multiply as more language elements from newer languages are grafted on.

Choices are empowering when you’re choosing between wrong and right options, where you can objectively evaluate and make informed, confident decisions. Unfortunately our choices are often more a matter of taste, with countless ways to achieve the same goal and primarily subjective differences (I’ll anger 50% of C# programmers by stating that LINQ is one of the worst things to happen to the language, and is an inelegant hack that is overwhelmingly used to build terrible, inefficient, opaque code).

We’re perpetually dogged by the sense that we could have gone a different way. Done it a different way. I actually enjoy C++ (I admit it…Stockholm syndrome?), but with each new standard there are more bolted-on ways to achieve existing solutions in slightly different ways.

I of course still build many solutions on those other platforms, and am incredibly thankful they exist, but I never have the completely confident sense that it is optimal in all ways, or that someone couldn’t look at it and ask “Couldn’t you have…” and I could firmly retort. I continue an internal debate about forks not taken.

The Best of the Best of the Best…Maybe?

I’ve talked about the consulting thing on here a lot, and the trouble involved with the pursuit. While I’ve been working with a fantastic long-running client, and have primarily been focused on speculative technology builds, I considered keeping skills diverse and fresh by mixing things up and working occasionally through a freelancing group that purports to have the best of the best. Doing this would theoretically remove the bad parts: chasing engagements, pre-sales, badgering for payments, etc. The parts that are a giant pain when you hang your own shingle.

If it was just challenging engagements with vetted clients, cool. Just the fun and rewarding parts, please and thank-you.

Freelancing groups almost always end up being a race to the bottom, generally becoming dens of mediocrity, so the notion of a very selective group made it more interesting. I like a challenge, and if someone wants to build a collection of Boss Levels for me to battle, the barriers yielding a group that customers would pay a premium for, maybe it’d be interesting.

So I took a look, and found that one of their requirements — not mandatory, but strongly recommended — is that you post on your blog how much you want to work with them. This is before you know anything about the credibility of their process, rates, the quality of peers, etc. And this isn’t something you can easily find: they demand that you don’t talk about their process or virtually anything involved with working with them, so in the absence of any information about them (beyond some very negative threads I later found on HN, primarily posts by throwaway accounts), you need to tell the world how eager you are to join them.

This is a profound asymmetry of motives.

Who would do that? It is an adverse selection criterion, and it instantly deflated my impression of their selectivity and had me clicking the back button: the sort of desperation where someone would pander like that (while certainly possible among fantastic talents in bad situations) is not a good filter. Trying to harvest cheap link and namespace credibility like that itself makes the service look lame, like a cheap 1990s long-distance carrier.

I still want those challenging smaller efforts, however — variety keeps me fresh and in love with this field, and some extra cash and contacts are always nice — so instead I’m going to start offering micro-consulting via yafla, pitching it on here as well: Pay some set amount (e.g. $500), describe your problem and supply supporting data, and I’ll spend from four to eight hours on your problem, offering up appropriate deliverables (utility, shell of a solution, decision input, directions, analysis, etc). I’ll get that online quickly.

Going for the smaller chunk efforts, of the sort that someone with a corporate credit card can easily pay for, should prove much more tenable than seeking significant engagements where the sales process just drags on forever.

It also cross-motivates me to spend more time on this blog: if I’m pitching bite-sized consulting on posts and actually seeing uptake, it’ll keep me authoring content that people want to see.

While I have always enjoyed the cathartic aspect of blogging, it has never been profitable for me. Every engagement has come via referral and word of mouth, and even when I’ve sought traditional employment I’ve been amazed that no one ever does so much as a simple Google search, much less reads anything on here. I never expected any return, and adore that anyone reads these things at all, but at times it’s tough to justify the time spent making content.

go, first-class functions, panics and closures

A simple hack-functional-programming challenge was posted to Hacker News a few nights ago, quickly rising on the front-page, just as quickly disappearing (penalized for getting too many comments too quickly without corresponding upvotes).

Given that I’d been working on a solution using Go at the time, and that language was tagged as “likely impossible”, it grabbed my focus for a bit.

So I whipped up a solution.

It is a bit iffy whether it fully conforms to the guidelines (the score function type has a String method that functions such as fmt.Println recognize and use, via the Stringer interface, for seemingly automatic conversion; however, that is by convention, and the r-value is a func, not a string: you can’t simply cast or assign the func to a string without calling the String method), but I thought it a fun diversion.

Of pertinence for those who are less accustomed to go-

  • g() creates a closure that is held and used by the returned function for its lifetime. This allows it to cleanly hold per-instance state without hacks like globals
  • g() returns a function of the type score. That function type returns itself
  • both f() and g() take …string arguments (variadic), meaning 0..n values. It appends “o” on 0 arguments, the supplied string on 1, and panics on more than one, the panic being overloaded to extract the closure value
  • the type score has a method called String defined on it, which by convention will be used by various library functions when a non-string type is encountered, calling String() to get the string representation (similar to Object.ToString/toString in C#/Java)
  • as it’s a simple function type, without any further payload, and the closure was created in g(), getting at the value of the base variable in the closure from String was a fun challenge. In the end I abused panic/recover to extract the value, but still search for better methods. Do you know something better?
Anyways, it was a fun brief diversion. Go is an enjoyable language, and the whole closure quandary amused me.
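For those who want the shape of it without reading the full solution, here is a minimal sketch of the techniques the bullets describe. It is not the original solution: the score type and g() names follow the post, while the string-building rules and the panic/recover extraction are reconstructed assumptions.

```go
package main

import "fmt"

// score is a function type that returns itself, allowing chained calls.
type score func(...string) score

// String implements fmt.Stringer. Calling the function with more than one
// argument triggers the overloaded panic path, which carries the closure's
// accumulated state out; recover captures it here.
func (s score) String() (out string) {
	defer func() {
		if v := recover(); v != nil {
			out = v.(string)
		}
	}()
	s("", "") // two arguments: provoke the state-carrying panic
	return
}

// g creates the closure that holds per-instance state for the returned
// function's lifetime. The appending rules here are assumptions for the
// sketch, not the contest's actual rules.
func g(init ...string) score {
	state := ""
	if len(init) == 1 {
		state = init[0]
	}
	var self score
	self = func(args ...string) score {
		switch len(args) {
		case 0:
			state += "o"
		case 1:
			state += args[0]
		default:
			panic(state) // overloaded: extract the closure value
		}
		return self
	}
	return self
}

func main() {
	s := g("hell")() // seed the closure, then append "o"
	fmt.Println(s)   // fmt sees the Stringer and prints "hello"
}
```

The per-instance state lives entirely in the closure, with no globals; the only way back out, absent a struct payload, is the panic/recover trick.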

Database Performance, LVM Snapshots, golang and sqlite3

Snapshots for Beginners

Snapshots are a very useful mechanism found in database products, disk management layers, and virtualization tools. Instead of making a complete copy of the source, snapshot functionality allows you to create the copy virtually instantly.

Snapshots can be used to perform backups without locks or transaction contention, or to save volume contents at a particular point in time, for instance because you’re about to make some risky changes that you want the ability to quickly revert, or because you need to access the data as it was at a particular point in time.

The speed of snapshots is courtesy of trickery: Snapshot technology generally doesn’t copy anything at the outset, but instead imposes overhead atop all I/O operations going forward.

VirtualBox snapshot volumes, for instance, contain the changed disk contents from the point of the snapshot forward (contrary to what one would intuitively think), leaving the base image unchanged. When you delete a snapshot it merges all of those changes down, whereas if you roll back the snapshot it simply deletes the change data and reverts to the root. It adds a layer of indirection to disk access, as all activities need to determine first whether they apply to the changeset or the root set.

LVM (Logical Volume Manager, a Linux volume layer that adds considerable flexibility to storage on the platform) snapshots — the topic of this entry — use a very different but more common approach. The snapshot is effectively a sparse copy of the original data, acting simply as an indirection to the root volume. As you change data in the root volume it performs a copy-on-write, moving the original contents of the changed disk sectors to the snapshot volume.

If you had a 40GB root volume and then created a snapshot, at the outset your root volume will of course take 40GB of space, while the snapshot takes 0GB. If you then changed all 40GB of data, the root volume would still take 40GB, while the snapshot would also take 40GB, every block of the original data having been copied over before completing each write operation.
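That accounting can be modeled in a few lines. The following Go sketch is a toy model (not LVM code): it treats one byte as one "block" and shows how rewriting all 40 blocks of the root leaves the snapshot holding all 40 original blocks.

```go
package main

import "fmt"

// volume is a toy model of an LVM root volume: one byte per "block".
type volume struct {
	blocks []byte
}

// snapshot stores the original contents of any root block changed after
// the snapshot was taken (the copy-on-write discipline).
type snapshot struct {
	saved map[int]byte
}

func (v *volume) snap() *snapshot {
	// Creation copies nothing: the snapshot starts at "0GB".
	return &snapshot{saved: map[int]byte{}}
}

// write updates a root block, first preserving the original contents in
// every snapshot that hasn't already captured that block.
func (v *volume) write(snaps []*snapshot, block int, val byte) {
	for _, s := range snaps {
		if _, done := s.saved[block]; !done {
			s.saved[block] = v.blocks[block]
		}
	}
	v.blocks[block] = val
}

func main() {
	v := &volume{blocks: make([]byte, 40)} // pretend each block is 1GB
	s := v.snap()
	for i := range v.blocks {
		v.write([]*snapshot{s}, i, 1) // rewrite all 40 "GB"
	}
	fmt.Println(len(s.saved)) // prints 40: every original block was copied over
}
```

Note that the copy happens on the first write to each block only; rewriting the same block a hundred times costs one copy.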

Snapshot Performance Concerns

There is an obvious performance concern when using snapshots. Aside from read activities needing the additional indirection, all write operations against the root now require an analysis of which snapshots it impacts, and then a copy of the original data to those affected snapshots.

The overhead can be onerous, and there are many dire warnings out there, driven from artificial benchmarks. While to-the-metal, reductionist benchmarks are important, it’s often useful to assess how this impacts the whole of operations at a higher level: The impact of seemingly enormous performance changes is often very different from expectations.

Benchmarking Snapshots

So with a couple of Amazon EC2 instance types (one a c3.2xlarge running Ubuntu 14.04 under HVM, with dual local SSDs; the other an m1.xlarge running Ubuntu 14.04 under PV, with local magnetic drives) and a simple benchmarking tool I built with Go 1.3 and sqlite3 (using the excellent sqlite3 driver, itself a great demonstration that Go’s most powerful feature is its least sexy), I set about determining the impact of snapshots.

800 LVM Snapshots? Why Would Anyone Ever…

Why would this matter? As a bit of background, much of my professional life is spent in the financial industry, building very high performance, broad functionality solutions for leading financial firms.

One common need in the industry is as-it-was-at-a-particular-point-in-time data browsing: To be able to re-generate reports with varying parameters and investigate the source data that yielded those outputs. In effect, a version control system for data.

That data is constantly changing as new data comes in, and building an entire data history in the database can yield enormous volumes of data, and creates a database schema that is incredibly difficult to query or utilize with any level of performance.

When it’s at the system level, such as with Oracle’s Flashback Query, it can yield a much more tenable solution, but then much of the implementation is a black box, and you don’t have the ability to flag exactly when such an archive should be built — all data churn becomes a part of the history. As if you were committing your source after every keystroke.

I needed the ability to maintain historical versions in a manner that was immediately usable, in a high performance fashion, but didn’t redundantly copy 100% of the data for each archive, which would quickly become unmanageable.

To explain, in one solution I engineered, the generated reporting set was exported to sqlite3, for a wide variety of interesting and compelling reasons. That data store is periodically incrementally refreshed with new data as the various participants certify it and gold plate it for dissemination.

While I seriously considered LVM as one component of a solution (a part of a more complex historical platform), I ended up adding block level versioning directly into sqlite3 (it is a fantastic foundation for derived solutions), but nonetheless found LVM to be a fascinating solution, and wanted to dig deeper into the performance considerations of the technology.

To the natural question one would ask about the utility of this benchmark: if you were really starting a small database from scratch, the easiest solution is simply to make a 100% copy of the database (presuming you have the reasonable ability to transactionally pause and fsync), quickly and easily, for every historical copy. Imagine instead that you’ve already amassed a very large volume of data and are dealing with the daily change of that data: the daily delta is very small against a large volume of data.

The Benchmark

The benchmark creates a table with the following definition and indexes-

create table valuation (id integer not null primary key, 
    investment_id integer not null, 
    period_id integer not null, 
    value real not null);
create index valuation_investment_idx ON 
    valuation(investment_id, period_id);
create index valuation_period_idx ON 
    valuation(period_id, investment_id);

It then iterates 95 times, adding a new “investment”, and then individually adding 10,000 period records for each investment (26 years of daily valuations), the value varied randomly between records. On the second and subsequent iteration it then updates the entire period history of the prior investment, as if updated details had come in. All of these changes occur within a single transaction per investment iteration (meaning 95 transactions in all).

Running this benchmark without the indexes, for those curious, became impossibly slow almost immediately due to the update stage needing to endlessly table scan.

For the snapshot-enabled tests it creates a snapshot of the root LVM volume between investment iterations, meaning that at the end of the run there are 95 snapshots, each of which contains the data as it was at the moment of creation (such that there is a snapshot where only the first investment exists; then a snapshot with the first and the second, the details of the first changed; then a third, second, and first; and so on), those databases immediately usable, providing a glimpse of historic data.
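For reference, each between-iteration snapshot amounts to a single LVM2 command. A sketch, with hypothetical volume group, volume, and capacity names:

```shell
# Create a copy-on-write snapshot of the database volume; --size caps how
# much change data this snapshot can absorb before it fills (see below).
lvcreate --snapshot --name dbsnap42 --size 5G /dev/vg0/dbvol

# The snapshot is immediately mountable as a regular volume, exposing the
# database exactly as it was at creation time.
mount /dev/vg0/dbsnap42 /mnt/snap42
```

The benchmark simply shells out to the equivalent of the first command at the end of each investment iteration.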

The first test was run on an Amazon EC2 c3.2xlarge, which is an 8vCPU/15GB RAM instance chosen because it features two local 80GB SSDs. One SSD was mounted as an LVM volume, hosting the database and snapshots, while the other hosted the sqlite3 journal (which required changes to the sqlite3 source code, but yielded much better performance, and dramatically reduced what would have been churn across snapshots. Churn is the enemy on all systems).


A few things are evident from this.

  • The more data you insert, the slower the process is. This isn’t terribly surprising, and is primarily a facet of index maintenance.
  • LVM itself imposes a small overhead. The gray represents straight to the SSD, with no snapshots, against a non-LVM volume (LVM2 not even installed). Red is with LVM installed, still with no snapshots. Blue represents snapshots on every investment iteration.
  • Snapshots impose far less of an overhead than I expected, at least against SSDs. At iteration 95, each insert had the overhead of maintaining 94 snapshots as well, yet it only halved the transaction speed.

Flash memory really is the solution here, as the primary overhead of maintaining snapshots is the distinct I/Os, seeking to the various locations on the disk. I captured the iostats at every iteration, and it really is fascinating seeing the IOPS ramp up.

As one aside, at first consideration of the methodology you might logically think that the high degree of performance is explained by the limited overlap of the snapshots — e.g. on iteration 50, only investments 49 and 50 are altered, and thus shouldn’t impact prior snapshots that didn’t even include this accumulated data. Unfortunately this isn’t true: when new and novel data is added to the database, those earlier snapshots still receive the copy of the “original” data, which happened to be empty space on the hard drive.

After 95 iterations, for instance, the first snapshot had 52GB of changes written to it, even though the original data captured on that snapshot was in the sub-MB range.

Before looking at how snapshots impact performance on magnetic drives, here’s a quick comparison of the benchmark run on the SSD-equipped machine, compared to running it on a magnetic-drive hosting m1.xlarge, again hosting the benchmark on one of the local drives, the journal on another.


On pure IOPS and reductionist benchmarks, the flash drives hold an advantage of orders of magnitude, but in this practical implementation they managed around a 5x speed advantage in most cases.

So how do snapshots impact the magnetic drive machine?


The situation is dramatically worse. Instead of a 2x slowdown, there is more in the range of a 5x slowdown, yielding a whole that is some 8.5x slower than the SSD equipped machine in the snapshot scenario.

Snapshot Performance – Not Too Bad

Under a somewhat realistic test scenario, on a modern machine (which now simply must mean flash-equipped), the performance overhead of LVM snapshots was actually significantly lower than I expected. It would be much better still if it took into account file system usage, and didn’t replicate unused file blocks to snapshots that simply don’t care about them.

Further, I’ve tested up to 850 LVM2 snapshots, hitting a configuration ring-buffer limit that I could circumvent if I wanted to really push limits.

Having said that, LVM snapshots have a deadly, critical fault: when you create a snapshot, you have to allocate non-shared space to hold its changeset. As mentioned above, one test saw 52GB of changes pushed to one snapshot. Not only is this allocation fixed in size, but if you underestimate the capacity and the snapshot fills, LVM2 degrades in a very unpleasant manner.

This kills the machine.

Quite seriously, though, several such incidents required hard resets and a rebuild of the LVM2 volume set.

In any case, under controlled conditions snapshots can be a tool in the belt. How much they impact your specific workload depends greatly upon the sorts of IOPS you generate. But it’s worth looking into.

Using sqlite3 for financial calculation purposes

SQLite Is Everywhere

Even if you know nothing about the SQLite project, you’re probably using it right now.

It’s used in Apple’s Safari browser, and their iOS and OSX operating systems. It’s used in Mozilla Firefox. It’s used in Google’s Chrome browser and OS, and the Android OS.

SQLite is everywhere.

The wide usage is a result of a very liberal license that allows carte blanche use of a thoroughly tested embedded database solution built on a highly-portable, very efficient codebase.

It offers a surprisingly robust set of functionality. It supports fully ACID operations. It supports massive databases. It is easily extensible with custom functionality, such as very high performance calculations.

Don’t Burn Your Oracle License Just Yet

It’s not going to replace your big enterprise database solution: It’s an embedded project with limited writer concurrency, more often used to manage browser bookmarks and cookies than to run large OLTP platforms.

But it can certainly help with some of the edges. The edges that in practice end up being one of the biggest headaches of most financial systems, but are often the primary value of the platform: Reporting and analysis.

Most financial systems are built around collecting and aggregating data from disparate sources, using both manual input and automated processes to collect the data in the OLTP information store.

Transaction details. Security market close numbers. Fund compositions and holdings. Etc.

A very high degree of write concurrency is a necessity, though individual transactions tend to impact or draw upon a relatively small set of data (assuming you don’t attempt to front-load derived-value computations). This domain is often backed by a clustered copy of one of the big database products (Oracle, DB2, SQL Server, etc), where I/O is the primary potential limitation (seldom an issue anymore given the enormous IOPS potential of flash solutions), and where horizontal scaling isn’t a concern beyond ensuring redundancy needs are met via a mechanism such as simple mirroring.

On the output side of the equation — the purpose for doing all of that data work — is reporting and analysis, which is a process that generally has marginal write concurrency needs, but often draws upon enormous sets of data to build those results. It’s a process that is ill-suited for most relational databases, especially when attempts are made to shoehorn it onto the same platform hosting the OLTP, causing destructive contention for limited resources, and a lock system that is overloaded trying to rationalize the competing demands.

Which is, of course, why most database vendors offer data warehousing products. Products more suitable for read-heavy jobs, where data is organized optimally for reporting purposes (for instance column-oriented). The data warehousing model is generally that you periodically or incrementally publish data from your OLTP system to your data warehouse, and then via technologies like OLAP allow users and agents to construct and generate results and what-if scenarios.

Data warehousing solutions work brilliantly for many markets, for instance to populate the scorecards and back the drilldowns of a chain of retail stores. It is less suitable for financial systems for a wide variety of implementation reasons, where attempting to make the solution fit often ends up causing more grief than satisfaction. The cube model, for instance, is unsuitable for many complex hierarchical holding scenarios.

HP’s Vertica offers a compelling option among column-oriented databases, notably for its R integration, however it can be overkill for some purposes.

In between are industry-specific solutions such as Advent’s Geneva. Geneva offers novel technical solutions to service the financial space (with a unique and compelling in-memory, knowledge-date capable database system), and is a market leader, but like any product in the real world it isn’t a fit for all needs.

So many financial organizations end up with something unique and custom-made. Often a solution is pursued by trying to bolt something onto the OLTP system, throwing all of the eggs into one basket, creating an inefficient, unreliable platform that no longer serves any need well.

On such a hydra there is seldom the ability to scale economically. Security is usually a fingers-crossed affair. Data integrity is often compromised. Licensing costs often end up being a significant hindrance to optimal deployments. Maintainability… forget about it. Performance is usually abysmal.

On the other end of the custom-built spectrum are non-database solutions hosting maps of maps of dictionaries of hashes, data pulled, on-demand, into in-memory constructs by various .NET / Java / Python instances, loaded into purpose built structures and solutions. While they allow for trivial, low-cost scale-out and can offer optimal performance (many orders of magnitude over the performance generally achieved bolting into the RDBMS), the problem with these solutions is that they tend towards being overspecialized, every alteration turning into a significant coding project. They’re very difficult to bring to a project success state given the number of assumptions and analysis steps that need to be absolutely correct for the end result to yield the desired effect.

But assume you haven’t chosen either of those paths. Instead you have a clean and well-designed OLTP platform that only stores canonical data (nothing derived, keeping your data tight and hyper-efficient), every transaction as short and efficient as possible. You fully understand your data, and the universe of beneficial uses and calculations you can do on that data. What then?

You don’t want to be tied into the RDBMS, with the costs and limitations/serious performance restrictions that entails. You don’t want to build something entirely from scratch, essentially trying to build a specialized database product in the process, reinventing a problem solved many times over.

SQLite Enters The Scene

This is where SQLite comes into play as an element of a bounding solution.

  • It has a robust, well-proven, high-performance, low overhead set of data storage and SQL-enabled data retrieval functionality that you can embed into your solutions, including in-memory databases. This gives you optimized, specialized computing, but with the versatility that you can flex the data processed with simple SQL alterations
  • The SQLite library is very easily extensible. The example I give here is a simple IRR (internal rate of return) aggregate function extension, but the opportunities are endless, with our own library hosting a number of financial computations, all extremely high performance. With the foundation of SQLite you can add artificial table sources, unique native date functions, even custom data storage, etc.
  • The file format is well known and very widely supported. Assuming you do a reporting/analysis dump into a SQLite database file (or many files — scale out is unbounded, so you can build data islands to whatever extent makes sense), you can use that data in your specialized, SQL-enabled analysis solution hosting your customized SQLite library, or in any other tool that supports such files.
  • With those islands of sqlite databases you now have the capacity to use very efficient page-level file versioning, giving you the capacity to do state-in-time analysis (in the vein of “give me the performance metrics for this hierarchy of holdings as they would have been reported given what was known on January 5th”). This allows you to report numbers exactly as they were.

Adding a Custom Function to SQLite

As an example of a simple implementation of a custom, native aggregate function within the SQLite library, an IRR variant (note that in Excel parlance the following implementation is actually analogous to the more versatile XIRR function) might be added to SQLite via-

/*
** The following structure helps keep track of irr information for the
** irr aggregate function. Transactions are accumulated in fixed-size
** groups chained into a linked list.
*/
#define TRANSACTION_SET_GROUP_COUNT 64  /* rows per group; value assumed, elided in the original */
#define IRR_MAX_ITERATIONS 100          /* value assumed, elided in the original */

typedef struct TransactionSet TransactionSet;
struct TransactionSet {
  int period[TRANSACTION_SET_GROUP_COUNT];
  double amount[TRANSACTION_SET_GROUP_COUNT];
  TransactionSet *nextTransactionSet;
} __attribute__ ((aligned (16)));

typedef struct NpvCtx NpvCtx;
struct NpvCtx {
  i64 cnt;
  i64 minPeriod;
  i64 maxPeriod;
  TransactionSet *baseTransactions;
  TransactionSet *headTransactions;
};

/*
** Compute the net present value of the accumulated transactions at the
** given rate, or its first derivative with respect to rate.
*/
static double calcNpv(NpvCtx *p, double rate, int derivative){
  double result = 0.0;
  i64 i;
  TransactionSet *t = p->baseTransactions;
  rate += 1.0;
  for (i = 0; i < p->cnt; i++) {
    int offset = (int)(i % TRANSACTION_SET_GROUP_COUNT);
    if (offset == 0 && i > 0) t = t->nextTransactionSet;
    double period = (double)(t->period[offset] - p->minPeriod);
    if (derivative) {
      result += -period * t->amount[offset] / pow(rate, period + 1);
    } else {
      result += t->amount[offset] / pow(rate, period);
    }
  }
  return result;
}

/*
** Routines to implement daily irr.
*/
static void irrStep(sqlite3_context *context, int argc, sqlite3_value **argv){
  int atype, ptype;
  NpvCtx *p;
  p = sqlite3_aggregate_context(context, sizeof(NpvCtx));

  atype = sqlite3_value_numeric_type(argv[0]);
  ptype = sqlite3_value_numeric_type(argv[1]);

  if( p && atype!=SQLITE_NULL && ptype!=SQLITE_NULL ){
    if( atype==SQLITE_FLOAT && ptype==SQLITE_INTEGER ){
      int period = sqlite3_value_int(argv[1]);
      int offset = (int)(p->cnt % TRANSACTION_SET_GROUP_COUNT);

      if( offset==0 ){
        /* first row, or the current group is full: chain on a new group */
        TransactionSet *newSet = malloc(sizeof(TransactionSet));
        newSet->nextTransactionSet = 0;

        if( p->headTransactions ){
          p->headTransactions->nextTransactionSet = newSet;
          p->headTransactions = newSet;
        }else{
          p->baseTransactions = newSet;
          p->headTransactions = newSet;
        }
      }

      if( p->cnt==0 ){
        p->maxPeriod = period;
        p->minPeriod = period;
      }else if( period > p->maxPeriod ){
        p->maxPeriod = period;
      }else if( period < p->minPeriod ){
        p->minPeriod = period;
      }

      p->headTransactions->period[offset] = period;
      p->headTransactions->amount[offset] = sqlite3_value_double(argv[0]);
      p->cnt++;
    }
  }
}

#define IRR_CLOSE_ENOUGH 0.000001

static void irrFinalize(sqlite3_context *context){
  NpvCtx *p;
  p = sqlite3_aggregate_context(context, 0);
  if( p ){
    int iterations = 0;
    double guess, lguess, result, differential, diff;
    guess = 0.1;

    /* Newton-Raphson: refine the guess until successive estimates converge */
    do {
      result = calcNpv(p, guess, 0);
      differential = calcNpv(p, guess, 1);
      if( differential==0 ){
        /* flat derivative: no further progress is possible */
        iterations = IRR_MAX_ITERATIONS;
        break;
      }
      lguess = guess;
      guess = guess - result / differential;

      diff = guess - lguess;
      if( diff < 0 ) diff *= -1;
      if( diff < IRR_CLOSE_ENOUGH ) break;
      if( iterations++ > IRR_MAX_ITERATIONS ) break;
    } while( 1 );

    if( iterations < IRR_MAX_ITERATIONS ){
      sqlite3_result_double(context, guess);
    }else{
      /* no solution found */
      sqlite3_result_null(context);
    }

    /* manually free the temporary chain of transaction data */
    TransactionSet *current, *next;
    current = p->baseTransactions;
    while( current ){
      next = current->nextTransactionSet;
      free(current);
      current = next;
    }
  }else{
    sqlite3_result_double(context, 0.0);
  }
}

The function is then registered alongside SQLite’s built-in aggregates (in the function table in func.c) with-

    AGGREGATE(irr, 2, 0, 0, irrStep, irrFinalize),

The aggregate function takes two parameters per row: value and period. Period is an integer value of your choosing, whether year, week from some origin, days from some origin, etc. If it were called with the data (-300000.0, 0), (150000.0, 1), (150000.0, 2), (150000.0, 3), (10000.0, 4), say via a query such as SELECT irr(amount, period) from cashtransactions where investment_id = 768;, it would return 24.3118538739616% (which is the rate at which you would discount prior cash flows to yield a result approximately equal to 0).
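The Newton-Raphson search that irrFinalize performs can be sketched in Go. This is an illustrative reimplementation (the irr helper here is mine, not the library code), run against the sample cash flows above:

```go
package main

import (
	"fmt"
	"math"
)

// irr finds the rate at which the net present value of the (amount, period)
// pairs is approximately zero, via Newton-Raphson refinement, mirroring the
// C aggregate's finalize step.
func irr(amounts []float64, periods []int) (float64, bool) {
	minPeriod := periods[0]
	for _, p := range periods {
		if p < minPeriod {
			minPeriod = p
		}
	}
	// npv computes the net present value at the given rate, or its first
	// derivative with respect to rate when derivative is true.
	npv := func(rate float64, derivative bool) float64 {
		r := rate + 1.0
		result := 0.0
		for i, a := range amounts {
			period := float64(periods[i] - minPeriod)
			if derivative {
				result += -period * a / math.Pow(r, period+1)
			} else {
				result += a / math.Pow(r, period)
			}
		}
		return result
	}
	guess := 0.1
	for i := 0; i < 100; i++ {
		d := npv(guess, true)
		if d == 0 {
			return 0, false // flat derivative: no progress possible
		}
		next := guess - npv(guess, false)/d
		if math.Abs(next-guess) < 0.000001 {
			return next, true
		}
		guess = next
	}
	return 0, false
}

func main() {
	rate, ok := irr(
		[]float64{-300000, 150000, 150000, 150000, 10000},
		[]int{0, 1, 2, 3, 4},
	)
	fmt.Println(ok, rate*100) // ≈ 24.3118538740
}
```

Because Newton-Raphson converges quadratically near the root, the loop typically terminates in a handful of iterations.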

I’ve removed some error-checking/memory-management and hand-coded vectorization for simplification, though note that even the simplified version takes full advantage of SSE and even AVX through the auto-vectorization of any decent compiler, and can compute enormous data counts extremely quickly. With this very simple addition to SQLite, you have IRR and NPV functionality natively in the library, and thus in any of your apps that use that library (through the normal SQLite API).

As some implementation comments, note that this does need to copy subject data to temporary in-memory storage, which can mean wastefully copying from, in some cases, an in-memory table into an in-memory array. It does this both to allow for vectorization through alignment and placement, and because the nature of IRR calculations requires iterating over the same data repeatedly, performing refined approximations of the NPV rate.
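As a loose illustration of that storage strategy (a sketch in Go with invented names, not the extension’s actual C structures), the idea is a chain of fixed-size blocks that each NPV pass can walk sequentially:

```go
package main

import "fmt"

// A Go analogue of the chunked TransactionSet chain: rows are appended
// into fixed-size blocks so repeated Newton-Raphson passes scan
// contiguous arrays instead of chasing per-row allocations.
const setSize = 4 // small for illustration; a real block would be large

type transactionSet struct {
	amount [setSize]float64
	period [setSize]int
	used   int
	next   *transactionSet
}

type store struct{ head *transactionSet }

// add appends a transaction, chaining a fresh block when the head fills.
func (s *store) add(amount float64, period int) {
	if s.head == nil || s.head.used == setSize {
		s.head = &transactionSet{next: s.head}
	}
	s.head.amount[s.head.used] = amount
	s.head.period[s.head.used] = period
	s.head.used++
}

// total walks every block, as each NPV evaluation must.
func (s *store) total() float64 {
	var sum float64
	for ts := s.head; ts != nil; ts = ts.next {
		for i := 0; i < ts.used; i++ {
			sum += ts.amount[i]
		}
	}
	return sum
}

func main() {
	var s store
	for _, a := range []float64{-300000, 150000, 150000, 150000, 10000} {
		s.add(a, 0)
	}
	fmt.Println(s.total()) // -300000 + 450000 + 10000 = 160000
}
```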

You can also add functionality like this via loadable extensions, but in our case the changes are so substantial that we add our customizations right in the SQLite library.

There are other open-source database solutions, of course, and another approach would be building a custom, financial-industry-focused version of PostgreSQL or some other data product — and such products exist. Needs vary; in our case, however, the benefit of using SQLite was specifically that it is a self-contained, very efficient and lightweight, pluggable shared library that allowed us to very easily incorporate the non-relational parts of the application in the host application without trying to shoehorn them into the relational design. SQLite is used as a data manager for an analysis and reporting engine, with some rudiments of financial functionality right in that library, but it is just a part of the solution.

In the financial calculation solutions that my organization provides to clients, we leverage this heavily customized, extremely high-performance SQLite library via Go (golang) as the orchestrator, using mechanisms I’ve mentioned on here before.

The end result is a powerful, flexible, efficient, easily modified and expanded, very easily scaled and migrated solution that can be pushed to thin cloud instances or hyper-threaded on mega servers (or even smartphones or embedded devices). One that offers performance that is simply unmatched by any competitive product, and is many orders of magnitude faster than the in-house solutions often built for the same purpose. One that allows you to deploy to armies of computers with ease and at minimal cost or legal complexity.

Like The Sound of This

These are the sorts of solutions we @ yafla build for our clients. They enable efficiency and scale and flexibility of operations that make small and medium organizations into computing Goliaths.

These are solutions that endure.

Contact me today and let’s make your organization’s technology a formidable force by using and building upon the awesome, incredible technology around us.

Go (“golang”) and Android

EDIT: Own an Android device? Check out Gallus on the Play Store. It’s like Instagram brought Hyperlapse to Android, only a lot better.

Readers know that I am a fan of Go: The ease with which you can build highly-concurrent apps is an absolute delight.

Let me broaden that a bit: The ease with which you can build pretty much any kind of system or console style app, whether a web service layer, a utility to connect A⇌B, or a financial calculation system, is a delight. That Go makes concurrency painless and incredibly robust is a nice bonus.
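To show what “painless” means in practice, here is a small fan-out/fan-in sketch of my own: it squares numbers across a handful of goroutines and collects the results over channels, with no explicit thread or lock management.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares fans work out to `workers` goroutines over an unbuffered
// channel and collects the squared results on a second channel.
func sumSquares(nums []int, workers int) int {
	jobs := make(chan int)
	results := make(chan int)
	var wg sync.WaitGroup

	// Workers: consume jobs until the channel is closed.
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				results <- n * n
			}
		}()
	}

	// Feeder: push every input, then signal no more work.
	go func() {
		for _, n := range nums {
			jobs <- n
		}
		close(jobs)
	}()

	// Closer: once all workers finish, close results to end the drain loop.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 3)) // 1+4+9+16 = 30
}
```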

Not every project benefits from Go, but it’s worth learning about and considering in the stack.

One realm where Go has been considered the odd man out, however, is Android development, which is a bit strange given their shared Google lineage. Indeed, in conversations about Go it’s surprising how often people hold the odd belief that Go is either some sort of lock-in to the Google App Engine (it is supported there, but so are other languages; I use and abuse Go but have zero interest in the App Engine), or a tool for Android, where it actually has effectively no official support at all.

It is neither.

Go is a modern language with a surprisingly mature toolset. It builds native executables for x86, AMD64 (x86_64), and ARM, targeting the Linux, Windows, Darwin (OSX), and FreeBSD platforms. These executables have very few dependencies, meaning you can usually deploy a simple executable and…that’s it…the magic happens.

Looking at the above list you might notice that Go builds for Linux, which Android is of course based upon (Android 4.4 is based upon the Linux 3.8 kernel). And it can build for ARM, the architecture of most Android devices.

And you can build for all of these options on one single machine. If you’re running on Windows — it’s my base operating system — do yourself a favor and fire up a VM running Ubuntu or some other Linux distro for Go work. Once you have that you can easily build Go from source, configuring it for full cross-platform builds (my only additions to that page are to ensure that you’ve installed mercurial and build-essential first — e.g. sudo apt-get install mercurial build-essential — and to use the release branch on the hg clone: hg clone -u release).

From that single box with simple alterations of environment variables you can build to a huge variety of targets, including that ARM/Linux combo that most Android devices run.

export GOARCH=arm
export GOOS=linux
export GOARM=7
go build

… out pops an ARM executable that you can actually run on Android!

Just to take a moment to talk about the capabilities of smartphones: my Nexus 5 features 3x the memory of the virtual machine running this blog (a platform effortlessly running MySQL, WordPress, Ubuntu 12.04.4 LTS Server, and PHP5-fpm, among a wide variety of other system services). In practice my smartphone is also running a faster processor.

But every now and then I run an app on it — a bona fide app that I built for a desktop or server — and I have that moment of disbelief and awe. “How is this possible?” I ask myself. It’s just a smartphone!

It still hasn’t entirely sunk in that it’s a powerful little computer of its own.

Anyways, back to the practical details of that executable. If you want to actually see it in practice, copy the file to your Android device somewhere where you can set the executable bit (meaning not under sdcard), for instance /data/local/tmp. Open an ADB shell, chmod the file to 755 (setting the execute bit), and run it.

adb push myexecutable /data/local/tmp/myexecutable
adb shell
cd /data/local/tmp
chmod 755 myexecutable
./myexecutable

Magic! A Go app running, in all probability perfectly, on your Android device.

Of course this isn’t terribly practical. You aren’t going to deploy executables and ask your clients to fire up ADB.

You can also package these stand-alone executables as part of your encompassing APK. The Go app itself cannot (with current tooling) be the primary app, but it can be an assisting app if the situation calls for it.

Scenarios and possibilities vary dramatically, however imagine that you’ve built a financial engine in Go. It takes a set of inputs and runs a set of complex rules and out pops, for instance, investment allocations. You run this on your server, but you’re putting together some mobile apps and it’d be optimal to have a local instance.

But you don’t want to port all of that code to Java. It’s complex and fully tested, making heavy use of concurrency primitives that would be difficult to port anyway, so such a task, plus the added burden of trying to keep two or more versions in sync, is a high price to pay.

So don’t. Deploy your Go app.

Copy myexecutable under your Android project’s ./libs/armeabi/ folder. If you’re targeting one of the rare x86 Android devices, build for x86/Linux and copy the resulting binary to ./libs/x86 (and yes, you can include both in a single APK, and it will deploy whichever one is appropriate).

Here’s where things get a little hackish, but to good effect. Your executable needs to be prefaced with lib and have the extension .so to be bundled into the APK: for instance, libmyexecutable.so. The mechanisms of the tooling are intended for shared libraries generated by the NDK, but for our case that doesn’t matter. With a filename that satisfies the packaging regular expression, your executable will be bundled and deployed when the app is installed.

You can avoid the hackishness by using make files and customizing the APK construction, but ultimately it doesn’t matter: that the file isn’t a shared library is of no consequence, beyond looking weird. It is a bag of bits that will have the execute bit automatically set on deployment.

You can query where this binary baggage can be found at runtime via getApplicationInfo().nativeLibraryDir.

So now you have your main app running on the Android device, and you know the location of a Go executable that came along for the ride. Get that party started.

Process p = Runtime.getRuntime().exec(getApplicationInfo().nativeLibraryDir + "/libmyexecutable.so");

How things work from here on out is up to you. Whether you use stdin/stdout to talk to the process, RPC, HTTP web service calls, raw TCP…whatever you do, you’re talking to another process. That other process has the same restrictions that your main Android app has (if you don’t have internet access, neither does it. It has the same file system limits. And so on. Going native doesn’t get you outside of the sandbox).
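As a sketch of the stdin/stdout route (my own illustration; this line protocol is an invented stand-in for real engine inputs), the Go side can simply answer one request per line:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// handleLine parses one whitespace-separated list of numbers and returns
// their sum as the reply. A real engine would run its calculation rules
// here instead.
func handleLine(line string) (string, error) {
	var total float64
	for _, field := range strings.Fields(line) {
		v, err := strconv.ParseFloat(field, 64)
		if err != nil {
			return "", err
		}
		total += v
	}
	return fmt.Sprintf("%g", total), nil
}

func main() {
	// The Java side writes requests to our stdin and reads replies from
	// our stdout; we exit cleanly when the pipe closes.
	scanner := bufio.NewScanner(os.Stdin)
	out := bufio.NewWriter(os.Stdout)
	defer out.Flush()
	for scanner.Scan() {
		if reply, err := handleLine(scanner.Text()); err == nil {
			fmt.Fprintln(out, reply)
		} else {
			fmt.Fprintln(out, "ERR "+err.Error())
		}
	}
}
```

On the Java side, the same Process object’s getOutputStream() and getInputStream() become the two ends of this conversation.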

And of course you need to manage the lifetime of that second process. If your app is killed, so is it. If you want to preserve resources, shut it down. And so on.

None of this is to say that such a deployment strategy is ideal for what you are doing, but it’s interesting that you can do it, and it is one of those things that opens up possibilities that you might not have realized were there.

Edit: It’s worth noting that deploying a Go app as a service obviously isn’t ideal in many cases — a better solution would be a Go-developed library that could be used through JNI. The Go team is hard at work on that, and hopefully in the not too distant future the solutions you build in Go can be used without the hackery. Additionally note that Android L, in some builds, makes use of SELinux functionality that restricts executables to those built as position-independent code. Go does not currently generate such code, though surely the toolchain will add it in the near future.