Keeping Tabs: SQL Server 2014

[EDIT 2014-04-01: The final release version of SQL Server 2014 is now available on TechNet]

SQL Server 2014 is available in beta form via Community Technology Preview (CTP) 2, with the production release expected in a few short months. This release is mostly of interest to so-called Microsoft shops, especially those who already have an investment in prior versions of the platform, or who benefit from its integration with .NET and the Windows platform.

A Costly Option

Over the years, SQL Server’s market position has gone from “a cheap, maligned, not entirely secure alternative that you started using because it came in your MSDN subscription and was easy to install” to “a well-regarded RDBMS that often costs as much as an Oracle deploy”. Add to that the licensing discussions and corresponding liabilities for every deployment strategy.

It is a rock-solid, high-performance, feature-rich database system, but it comes with a large price tag, coupled with the requirement that you leverage Windows somewhere in your platform, which isn’t always desired.

While it is worth the price for many scenarios, it is quite a contrast from the “add a couple more rack servers running NDB Cluster” type flexibility some other options provide. Instead you engage purchasing and financing and the guy in charge of license compliance, try to dig through all of the caveats and disclaimers to determine what you really can do with a given license set — a seemingly simple task that will often yield many conflicting answers from authoritative sources — and then write a really big check.

You might try to find a way to live with the lower-priced options; however, the non-Enterprise editions are limited given the evolution of the computing platform, where very large memory (128GB+) is cheap and cores are plentiful. A 64GB memory limit per instance, and a cap of 16 cores (the lesser of four sockets or 16 cores), as found in SQL Server Standard Edition, looks pretty restrictive next to a $7,000 rack server that can be purchased with 16+ blazing cores and 192GB+ of memory. That’s ignoring that many of the compelling features, some of which might be a necessity for your needs, are only found in the Enterprise edition.

The per-core pricing strongly favors prioritizing the fastest per-core performance option you can find that can handle your workload, such as the 6-core Xeon E5-2643 v2 or even the 4-core Xeon E5-2637 v2. Obviously both of those E5-2600 v2 options will be supplanted when the v3 Haswell options arrive, so if you’re travelling from the near future consider those instead.

Note that the license demands that you have core licenses for all cores in the server, so you can’t buy fewer and engage a governor. Virtual machine licensing has its own complexities.

For a conceptual two-server, dual-CPU Xeon E5-2643 v2 configuration running in an AlwaysOn Availability Group in an active/active configuration (the second server providing not only failover redundancy, but also putting the hardware into use by hosting read loads, performing backups, etc.), you would be looking at ~$83,000 in SQL Server licensing costs for Enterprise edition, or $22,000 for Standard edition1. Consider the same but with the 12-core E5-2697 v2 in a 2-way SMP arrangement, the two machines again in an active/active group, and you’re looking at some $330,000 in SQL Server licensing fees, Standard no longer an option due to the core count. Such server hardware could be had in the $10,000 range each (Dell R620 with dual E5-2697 v2s and 256GB of memory) — the hardware is cheap, though I’ve rather conveniently ignored storage. Stick a 1.4TB Micron P420m or two in each of those servers and you get impossibly high performance, exceeding 700,000 IOPS.

Halve the SQL Server licensing costs if your second box serves as only a failover.

Database servers have historically been I/O bound, but with the explosion and rapid evolution of flash-based storage, the weak link in some scenarios has moved back to the processor, especially where business logic is put in stored procedures. It’s worth noting that you also have 4-way SMP options: those with very deep pockets can keep scaling up to serve enormous loads, and can scale out in various ways, most easily if those servers service read loads.

Microsoft does provide reduced pricing for some AMD architectures, while penalizing one- and two-core processors. Those core factors will likely change with SQL Server 2014.

In any case, this entry isn’t intended as an itemized list of everything new or changed in the 2014 outing, as Microsoft is a much better source of such a rundown, which you can find on TechNet. If you’ve been out of the loop on SQL Server for a while, you can also look at what was new in SQL Server 2012, before that 2008 R2, and then back to 2008.

Instead I wrote this summary from my experiences as someone who has deployed a number of OLTP and data warehousing systems on the SQL Server platform over the years, from small and simple to massive and enormously complex, noting those things that I think are the most interesting for those cases. Given that SQL Server 2014 isn’t final yet, features and pricing may change before release. This is based upon a period of evaluating the product and investigating the new functionality.

The Most Interesting New Features of SQL Server 2014

In-Memory OLTP – Enterprise Edition Only

This entails several rather interesting technologies, developed under the Hekaton umbrella code name. The two primary elements of this initiative are native tables and native stored procedures.

And they really are native. When you create a native table or stored procedure, the system actually generates supporting C artifacts and manifest details, using the included Microsoft C/C++ v17 (Visual Studio 2012) compiler and v11 linker to generate actual native extension DLLs that are then hooked into the system. I imagine at some point this will be leveraged for more external integration.

You have the option of creating memory-optimized tables either as in-memory only, where just the schema is persisted to disk and a restart will see the table empty, or as persisted tables, where the structures remain in the memory-optimized state but are also saved to disk. In both cases you need to have enough memory available to host all memory-optimized tables.
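As a minimal sketch of the two durability modes (the database name, filegroup, path, and table definitions here are hypothetical), the choice is made per table at creation time:

    -- Memory-optimized tables require a MEMORY_OPTIMIZED_DATA filegroup first
    ALTER DATABASE Sales ADD FILEGROUP SalesInMemory CONTAINS MEMORY_OPTIMIZED_DATA;
    ALTER DATABASE Sales ADD FILE (NAME = 'SalesInMemoryData', FILENAME = 'D:\Data\SalesInMemory')
        TO FILEGROUP SalesInMemory;

    -- In-memory only: the schema survives a restart, the contents do not, and no data I/O occurs
    CREATE TABLE dbo.SessionCache
    (
        SessionId BIGINT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
        Payload VARBINARY(4000) NOT NULL,
        TouchedAt DATETIME2 NOT NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

    -- Persisted: still fully memory-resident, but logged and recovered on restart
    CREATE TABLE dbo.AccountBalance
    (
        AccountId BIGINT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
        Balance DECIMAL(19,4) NOT NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);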

The memory-optimized tables use hash strategies for storage and lookups, and are effectively like having a copy of Redis inside your database, usable at a significantly higher performance level from within the database given the lack of IPC: you can query memory-optimized tables at a rate of many millions of discrete lookups per second per thread.

I previously wrote an entry about key/value lookups and whether to use an RDBMS or in-memory structure, yet this new option in SQL Server completely upsets the pattern — it offers equivalent performance to that .NET solution from within SQL Server. If you’re trying to use this same mechanism on a key by key basis from a separate application, performance will obviously drop significantly courtesy of the tabular data stream chatter, so this is an architectural choice based on the need.
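To illustrate the sort of in-database lookup path this enables, here is a hedged sketch of a natively compiled procedure against the hypothetical dbo.SessionCache table from above; native compilation requires SCHEMABINDING, an EXECUTE AS clause, and an ATOMIC block:

    CREATE PROCEDURE dbo.GetSession
        @SessionId BIGINT
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
        -- Point lookup resolved against the in-memory hash index, running as compiled native code
        SELECT SessionId, Payload, TouchedAt
        FROM dbo.SessionCache
        WHERE SessionId = @SessionId;
    END;

Called key by key from a separate application this still pays the TDS round trip described above; the win comes from doing the chatty work inside procedures like this, within the database.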

Memory-optimized structures impose a significant number of restrictions on what you can do, so this isn’t a panacea, but even in the trivial cases I’ve tested the performance improvement is very dramatic. With servers hosting many GBs of data, this is a fantastic new bit of functionality in SQL Server, allowing for some incredibly high-performance implementation options.

Backup Encryption

Having streaming, high-performance, built-in encryption at the source significantly improves the practical application of backup encryption as databases grow larger, saving you from backing up to huge files and then post-processing outside of SQL Server through yet another tool, which itself means unnecessarily exposing unencrypted data outside of the database. This is similar to the benefit when backup compression was added to the database server in SQL Server 2008, again saving a lot of post-backup work.

My tests have shown a minimal bump in CPU usage during large backups with AES encryption; though I can’t determine whether SQL Server uses AES-NI or not, I have to imagine that it does.

This simple change removes most scenarios where backups end up becoming the source of data compromise. Obviously you need to ensure your keys are archived in a secure, protected location.
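As a rough sketch (the certificate, database, and file names here are hypothetical), the encryption is driven by a certificate or asymmetric key held in master, and that certificate is exactly the key material that needs to be archived:

    USE master;
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';
    CREATE CERTIFICATE BackupCert WITH SUBJECT = 'Backup encryption certificate';

    -- Archive the certificate and its private key offline; without them the backup cannot be restored
    BACKUP CERTIFICATE BackupCert
        TO FILE = 'D:\Keys\BackupCert.cer'
        WITH PRIVATE KEY (FILE = 'D:\Keys\BackupCert.pvk',
                          ENCRYPTION BY PASSWORD = '<another strong password>');

    BACKUP DATABASE Sales
        TO DISK = 'D:\Backups\Sales.bak'
        WITH COMPRESSION,
             ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert);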

Backup Encryption doesn’t make sense if you already use Transparent Data Encryption on your database; however, many avoid TDE (going without encryption of attached storage, or using storage-based encryption instead) because it imposes a sometimes considerable performance hit on most database operations, and makes database backups incompressible. TDE only makes sense if you have a high-security application where your storage devices/controllers cannot perform hardware encryption transparently (and often with no performance loss) themselves.

Buffer Pool Extensions – Enterprise Edition Only

Buffer Pool Extensions allow you to use an SSD as an application-layer hybrid disk cache atop the underlying database storage, expanding the buffer pool (the pages of the database cached in memory). If your database is on a speedy PCI-E flash device, this is of little value; however, if you’re drawing from magnetic drives, or over the network to an iSCSI device, and the active set is larger than available memory, it can offer dramatic read speed-ups.
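Configuration is a single server-level statement; a sketch, with the file path and size as assumptions:

    -- Point the extension at a file on a local SSD; it can be sized several times larger than RAM
    ALTER SERVER CONFIGURATION
    SET BUFFER POOL EXTENSION ON
        (FILENAME = 'E:\SSDCache\BufferPoolExtension.bpe', SIZE = 128 GB);

    -- Verify the current state and size
    SELECT path, state_description, current_size_in_kb
    FROM sys.dm_os_buffer_pool_extension_configuration;

    -- To remove it again:
    -- ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;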

It potentially has ramifications for writes as well: when you write data to objects in SQL Server, the pages are modified in the buffer pool, that change is persisted in the linear transaction log, and the transaction is considered durably applied, so hypothetically the engine could write those dirty pages to the extension SSD while awaiting the write to primary storage. Occasionally (by default roughly once a minute, or after various other metrics are satisfied) the system raises a checkpoint on the database and those dirty pool pages are written to disk. If your server then crashed before those changed buffer pool pages were written to disk, on restart it rolls forward through the transaction log, applying changes, and nothing is lost, which is the common design of most high-durability databases.

This allows it to apply changes to disk in batches, avoiding slower random I/O.
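For context, that checkpoint cadence is tunable; a sketch of the relevant knobs (database name hypothetical), using the server-wide recovery interval and the per-database indirect checkpoint target:

    -- Server-wide: automatic checkpoints governed by the recovery interval
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'recovery interval (min)', 1;   -- 0 is the default, targeting roughly a minute
    RECONFIGURE;

    -- Per database: indirect checkpoints with an explicit recovery-time target
    ALTER DATABASE Sales SET TARGET_RECOVERY_TIME = 60 SECONDS;

    -- Or flush dirty pages for the current database immediately
    CHECKPOINT;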

However, I cannot find scenarios, outside of extremely limited memory or enormous quantities of writes with an extended checkpoint time, where Buffer Pool Extensions would assist write performance. Consider it read-focused.

If you have a 10TB database on an external iSCSI enclosure hosting a RAID-6 set of disks, BPE is a simple way to drop an SSD into your database server and improve read performance, possibly substantially if a wide range of data is frequently accessed. If, on the other hand, you have a 10GB database that you’re hosting on your 192GB server, BPE will do nothing for you. Nor will it if you have your database sitting on a Micron P420m device.

I am a little surprised that BPE is only available in the pricey Enterprise edition, given that most scenarios where it would be applicable are on more economical build-outs: the organizations paying hundreds of thousands in SQL Server licensing fees are likely to already have massive quantities of server memory, and to be using flash storage or at minimum hardware hybrid storage, including flash caching on their controller cards.

Delayed Durability

One of the tricks that some database products use to post impressive benchmark results is to loosen durability. MongoDB, as one famous example, will happily and speedily confirm writes at a seemingly impossible pace, the actual persistence to disk lagging far behind, and given that the eventual flush is the coalesced combination of many transactions, it tends to be much more efficient than writing each individually. A software or hardware fault while such a queue has pending writes means inevitable data loss. On the other end of the spectrum, most RDBMS systems have strict durability defaults: your transaction doesn’t return until the data is confirmed written to disk2, and the disk subsystem has responded back with a confirmation that the write to media was successful.

It doesn’t need to be all or nothing. While there are many scenarios where durability needs to be ensured, there may be cases where durability can be loosened, on a transaction-by-transaction basis, benefiting from less contention and fewer delays during bursty activity.

If data is derived, for instance, and can be reconstructed from the source data in the rare case of a failure, it’s a good candidate for delayed durability. Alternately, it fits if all data CRUD activity is initiated from a durable external message queue, where in the case of a failure you can simply find the last durably applied message and process forward. Such a scenario is an ideal architecture for a platform such as Hacker News, where every vote, moderation, addition, edit, or delete can be synchronously processed through JSON messages, those messages and their archive serving as the durable mechanism so the database needs no intrinsic durability.

With SQL Server 2014, durability can be forcibly loosened at the database level, or allowed as an option and then selected on a transaction-by-transaction basis.
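A sketch of both levers (database, table, and column names hypothetical): the database-level setting, the per-commit option, and the on-demand log flush.

    -- Database level: DISABLED (the default), ALLOWED (opt in per transaction), or FORCED
    ALTER DATABASE Sales SET DELAYED_DURABILITY = ALLOWED;

    -- Transaction level: opt in on the commit itself
    BEGIN TRANSACTION;
        INSERT INTO dbo.PageViewLog (ViewedAt, Url)
        VALUES (SYSUTCDATETIME(), N'/some/derived/path');
    COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);

    -- Flush any pending delayed-durable log records to disk on demand
    EXEC sys.sp_flush_log;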

In my own testing when configuring the data for the entry Your Query Planner and You, loosening durability significantly improved insert performance, with a corresponding drop in IOPS. This impact would be much greater in the case of significant contention and bursty activity, where transactions traditionally line up in a queue to await the fsync.

Delayed durability is a fantastic new option that will yield great benefits for many workloads, especially those that are essentially hand-coded Data Warehousing solutions.

Conclusion

These new additions to SQL Server 2014 add tremendous flexibility to the platform, allowing it to bridge the performance gap with various competitors without fundamentally compromising the data integrity or functionality of the platform…unless you specifically choose to do so. It is a very compelling upgrade, especially for those shops that already use SQL Server and likely sit several versions behind.

1 – That is SQL Server 2012 pricing, which 2014 will likely mirror. Note that the Standard Edition per-core pricing listed on that page is wrong, and should be $1,793 per Microsoft’s licensing datasheet.

2 – Worth noting that some disk subsystems have write-back caches that report a successful write before actually performing the write to media. Presuming that the controller has an adequate backup battery, such a setup can provide a very safe yet high-performance platform.