Android’s “Secure Enclave” / Private Content and Strong Encryption

Recent iterations of the Android OS have exposed more of the ARM Trusted Execution Environment or Secure Element, allowing you to use encryption that can be strongly tied to a piece of hardware. It’s a subsystem where you can create strongly protected keys (symmetric and asymmetric), protected against extraction and rate limited (via user authentication) against brute force attacks, and then use them to encrypt or decrypt streams of data securely.

The private elements of these keys can’t be extracted from the device (theoretically, at least), regardless of whether the application or even operating system were compromised or subverted.

A foe can’t do an adb backup or flash copy and find a poorly hidden private key to extract confidential data.

In an idealized world this would defend against even nation state levels of resources, though that isn’t necessarily the case.  Implementations slowly move towards perfection.

Imagine that you’re making an app to capture strongly encrypted video+audio (a sponsored upgrade solicitation I was recently offered for a prior product I built). The reasons could be many and are outside of the technical discussion: field intelligence gathering or secret R&D, catching corruption as a whistle blower, romantic trysts where the parties don’t really want their captures to be accidentally viewed by someone looking at holiday pictures on their device or automatically uploaded to the cloud or shared, etc.

There are nefarious uses for encryption, as there are with all privacy efforts, but there are countless entirely innocent and lawful uses in the era of the “Fappening”. We have credible reasons for wanting to protect data when it’s perilously easy to share it accidentally and unintentionally.

Let’s start with a modern Android device with full disk encryption. As a start you’re in a better place than without, but this still leaves a number of gaps (FDE becomes irrelevant when you happily unlock and hand your device to a family member to play a game, or when the Android media scanner decides to enumerate your media and it wasn’t appropriately protected, or you pocket shared to Facebook, etc).

So you have some codec streams emitting HEVC and AAC stream blocks (or any other source of data, really), and you want to encrypt them in a strong, device coupled fashion, above and beyond FDE. You accept that if the device is lost or broken, that data is gone, unless you are also live uploading the original streams (presumably over an encrypted connection) to some persistence location, which obviously brings up a litany of new concerns and considerations and may undermine this whole exercise.

Easy peasy, at least on Android 6.0 or above (which currently covers about 27% of the active Android market, which sounds small until you consider that this accounts for hundreds of millions of devices, which by normal measures is a massive market).

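final String keyIdentifier = "codecEncrypt";
final KeyStore ks = KeyStore.getInstance("AndroidKeyStore");
ks.load(null); // the keystore must be loaded before any lookup

SecretKey key = (SecretKey) ks.getKey(keyIdentifier, null);
if (key == null) {
   // create the key
   KeyGenerator keyGenerator = KeyGenerator.getInstance(
      KeyProperties.KEY_ALGORITHM_AES, "AndroidKeyStore");
   keyGenerator.init(
      new KeyGenParameterSpec.Builder(keyIdentifier,
         KeyProperties.PURPOSE_ENCRYPT | KeyProperties.PURPOSE_DECRYPT)
         // block modes and paddings this key may be used with
         .setBlockModes(KeyProperties.BLOCK_MODE_CBC, KeyProperties.BLOCK_MODE_GCM)
         .setEncryptionPaddings(KeyProperties.ENCRYPTION_PADDING_PKCS7,
            KeyProperties.ENCRYPTION_PADDING_NONE)
         // require a user authentication within the last 30 seconds
         .setUserAuthenticationRequired(true)
         .setUserAuthenticationValidityDurationSeconds(30)
         .build());
   key = keyGenerator.generateKey();

   // verify that key is in secure hardware.
   SecretKeyFactory factory = SecretKeyFactory.getInstance(key.getAlgorithm(), "AndroidKeyStore");
   KeyInfo keyInfo = (KeyInfo) factory.getKeySpec(key, KeyInfo.class);
   if (!keyInfo.isInsideSecureHardware()) {
      // is this acceptable? Depends on the app
   }
}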

The above sample is greatly simplified, and there are a number of possible exceptions and error states that need to be accounted for, as does the decision of whether secure hardware is a necessity or a nicety (in the nicety case the OS still acts with best efforts to protect the key, but has less of a barrier to exploitation if someone compromised the OS itself).

In this case it’s an AES key that will allow for a number of block mode and padding uses. Notably the key demands that the user has authenticated within the 30 seconds before use: for a device with a fingerprint, passcode or pattern, the key won’t allow for initialization in a cipher unless that requirement has been met, so your app has to prompt for a re-authentication when the resulting exception is thrown. Whether this is a requirement for a given use is up to the developer.

You can’t pull the key materials as the key is protected from extraction, both through software and hardware.

Using the key is largely normal cipher operations.

Cipher cipher = Cipher.getInstance("AES/CBC/PKCS7Padding");
cipher.init(Cipher.ENCRYPT_MODE, key);
// the iv is critically important. save it.
byte[] iv = cipher.getIV();
// encrypt the data.
byte[] encryptedBytes = cipher.update(dataToEncrypt);
// ... persist and repeat with subsequent blocks.
// finish the stream, flushing the final padded block.
encryptedBytes = cipher.doFinal();

And to decrypt

Cipher decryptCipher = Cipher.getInstance("AES/CBC/PKCS7Padding");
AlgorithmParameterSpec IVspec = new IvParameterSpec(iv);
decryptCipher.init(Cipher.DECRYPT_MODE, key, IVspec);
byte[] decryptedBytes = decryptCipher.update(encryptedBlock);
// ... repeat with subsequent blocks, then finish the stream.
decryptedBytes = decryptCipher.doFinal();
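
The encrypt and decrypt steps above can be exercised off-device as a round trip. This is only a sketch under a few assumptions: a random software AES key stands in for the AndroidKeyStore key, the stock JDK padding name PKCS5Padding is used in place of PKCS7Padding, and the class and method names (CbcRoundTrip, newSoftwareKey, encrypt, decrypt) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class CbcRoundTrip {
   // Assumption: a random software AES key stands in for the hardware-backed one.
   static SecretKey newSoftwareKey() {
      byte[] raw = new byte[32];
      new SecureRandom().nextBytes(raw);
      return new SecretKeySpec(raw, "AES");
   }

   // Encrypts the blocks in order and returns { iv, ciphertext }.
   static byte[][] encrypt(SecretKey key, byte[][] blocks) {
      try {
         // PKCS5Padding is the stock JDK name for what Android calls PKCS7Padding.
         Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
         cipher.init(Cipher.ENCRYPT_MODE, key); // the provider picks a random IV
         byte[] iv = cipher.getIV();
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         for (byte[] block : blocks) {
            byte[] chunk = cipher.update(block);
            if (chunk != null) { // null when no full cipher block is ready yet
               out.write(chunk, 0, chunk.length);
            }
         }
         byte[] tail = cipher.doFinal(); // flushes the final padded block
         out.write(tail, 0, tail.length);
         return new byte[][] { iv, out.toByteArray() };
      } catch (GeneralSecurityException e) {
         throw new IllegalStateException(e);
      }
   }

   // Decrypts a whole ciphertext using the IV saved at encryption time.
   static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext) {
      try {
         Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
         cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
         return cipher.doFinal(ciphertext);
      } catch (GeneralSecurityException e) {
         throw new IllegalStateException(e);
      }
   }
}
```

Note that update may hand back nothing for a short block, which is why the streaming loop checks for null before persisting.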

Pretty straightforward, and the key is never revealed to the application, nor often even to the OS. If you had demanded that the user be recently authenticated and that requirement isn’t satisfied (e.g. the timeout had elapsed), the Cipher init call would throw a UserNotAuthenticatedException, which you could deal with by calling-

Intent targetIntent = keyguardManager.createConfirmDeviceCredentialIntent(null, null);
startActivityForResult(targetIntent, 0);

Try again on a successful authentication callback, the secure “enclave” doing the necessary rate limiting and lockouts as appropriate.

And of course you may have separate keys for different media files, though ultimately nothing would be gained by doing that.

Having the key in secure hardware is a huge benefit, and ensuring that an authentication happened recently is crucial, but if you take out your full-disk-encrypted Android device, unlock it with your fingerprint, and then hand it to your grandmother to play Townships, she might accidentally hit recent apps and click to open a video where you’re doing parkour at the tops of city buildings (family controversy!). It would open because all of the requirements have been satisfied and the hardware key would happily be allowed to decrypt the media stream.

It’d be nice to have an additional level of protection above and beyond simple imperative fences. One thing that is missing from the KeyStore implementation is the ability to add an imperative password to each key (which the traditional key vaults have).

Adding a second level password, including per media resource (without requiring additional keys), is trivial. Recall that each Cipher starts with a randomly generated IV (an initialization vector which, as the name states, is the initial state of the cipher) that ultimately is not privileged information, and generally is stored in the clear (the point of an IV is that if you re-encrypt the same content again and again with the same key, an unwanted observer could discern that it’s repeating, so adding a random IV as the starting point makes each encrypted message entirely different).
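
To make the role of the IV concrete, here is a small sketch that encrypts the same content twice with the same key. It assumes a software AES key and plain JDK ciphers rather than the AndroidKeyStore, and IvDemo and its methods are hypothetical names.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class IvDemo {
   // Assumption: a random software key, only for demonstration.
   static SecretKey newKey() {
      byte[] raw = new byte[32];
      new SecureRandom().nextBytes(raw);
      return new SecretKeySpec(raw, "AES");
   }

   // A fresh random 16-byte IV, one AES block in size.
   static byte[] randomIv() {
      byte[] iv = new byte[16];
      new SecureRandom().nextBytes(iv);
      return iv;
   }

   // Encrypts the plaintext with an explicitly supplied IV.
   static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) {
      try {
         Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
         cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
         return cipher.doFinal(plaintext);
      } catch (GeneralSecurityException e) {
         throw new IllegalStateException(e);
      }
   }
}
```

Encrypting the same bytes under the same key and IV yields identical output, which is exactly the repetition an observer could spot. A fresh IV per message changes every byte of the ciphertext.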

Without the IV you can’t decrypt the stream. So let’s encrypt the IV with a pass phrase.

SecretKeyFactory factory = SecretKeyFactory.getInstance("PBEwithSHAandTWOFISH-CBC");
SecretKey ivKey = factory.generateSecret(new PBEKeySpec(password.toCharArray(), new byte[] {0}, 32, 256));

Cipher ivCipher = Cipher.getInstance("AES/GCM/NoPadding");
ivCipher.init(Cipher.ENCRYPT_MODE, ivKey);
byte [] ivIv = ivCipher.getIV();
byte [] encryptedIv = ivCipher.doFinal(iv);

In this case I used GCM, which cryptographically tags the encrypted data with an integrity digest and on decryption validates the payload against corruption or modification. A successful GCM decryption is a clear thumbs up that the password was correct. You could of course use GCM for the media streams as well, and if it’s individual frames of audio or video each in distinct sessions that would probably be ideal, but for large messages GCM has the downside that the entirety of the output is buffered until completion to allow it to compute and validate the GCM tag.

Now we have an encrypted IV (the 16-byte IV becoming a 32-byte encrypted output given that it contains the 16-byte GCM tag, plus we need to save the additional 16-byte IV for this new session, so 48-bytes to store our protected IV). Note that I used a salt of a single 0 byte because it doesn’t add value in this case.

You can do time-intensive many-round KDFs, but in this case where it’s acting as a secure hardware augmentation over already strong, rate-limited encryption, it isn’t that critical.

To decrypt the IV-

ivCipher = Cipher.getInstance("AES/GCM/NoPadding");
AlgorithmParameterSpec IVspec = new IvParameterSpec(ivIv);
ivCipher.init(Cipher.DECRYPT_MODE, ivKey, IVspec);
byte [] decryptedIv = ivCipher.doFinal(encryptedIv);
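
For trying the idea outside of Android, the following is a portable sketch rather than the code above. It assumes PBKDF2WithHmacSHA256 swapped in for the PBE algorithm used in the text (keeping the single zero byte salt, 32 rounds and a 256-bit key), supplies its own 16-byte GCM nonce to mirror the sizes discussed, and stores the nonce in front of the sealed IV. IvWrapper, wrap and unwrap are hypothetical names.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class IvWrapper {
   private static final int NONCE_BYTES = 16;
   private static final int TAG_BITS = 128;

   // Assumption: PBKDF2 replaces the PBE algorithm from the text.
   static SecretKey deriveKey(char[] password) {
      try {
         SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
         PBEKeySpec spec = new PBEKeySpec(password, new byte[] {0}, 32, 256);
         return new SecretKeySpec(factory.generateSecret(spec).getEncoded(), "AES");
      } catch (GeneralSecurityException e) {
         throw new IllegalStateException(e);
      }
   }

   // Returns nonce || encrypted IV || GCM tag.
   static byte[] wrap(char[] password, byte[] iv) {
      try {
         byte[] nonce = new byte[NONCE_BYTES];
         new SecureRandom().nextBytes(nonce);
         Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
         cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password),
            new GCMParameterSpec(TAG_BITS, nonce));
         byte[] sealed = cipher.doFinal(iv);
         byte[] record = new byte[nonce.length + sealed.length];
         System.arraycopy(nonce, 0, record, 0, nonce.length);
         System.arraycopy(sealed, 0, record, nonce.length, sealed.length);
         return record;
      } catch (GeneralSecurityException e) {
         throw new IllegalStateException(e);
      }
   }

   // Returns the original IV, or null if the tag check fails
   // (wrong passphrase or a modified record).
   static byte[] unwrap(char[] password, byte[] record) {
      try {
         Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
         cipher.init(Cipher.DECRYPT_MODE, deriveKey(password),
            new GCMParameterSpec(TAG_BITS, record, 0, NONCE_BYTES));
         return cipher.doFinal(record, NONCE_BYTES, record.length - NONCE_BYTES);
      } catch (AEADBadTagException e) {
         return null;
      } catch (GeneralSecurityException e) {
         throw new IllegalStateException(e);
      }
   }
}
```

A 16-byte IV wrapped this way produces a 48-byte record, and a wrong passphrase surfaces as a failed tag check rather than as garbage output. A 12-byte nonce is the more common GCM choice if matching those sizes is not a goal.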

We store the encrypted streams and the encrypted IV (including its IV), and now to access that media stream the user needs to authenticate with the OS rate-and-try limited authentication, the hardware needs the associated trusted environment key, and the user needs the correct passphrase to access the IV to successfully decrypt.

In the end it’s remarkably simple to add powerful, extremely effective encryption to your application. This can be useful simply to protect you from your own misclicks, or even to defend against formidable, well-resourced foes.

Micro-benchmarks as the Canary in the Coal Mine

I frequent a number of programming social news style sites as a morning ritual: You don’t have to chase every trend, but being aware of happenings in the industry, learning from other people’s discoveries and adventures, is a useful exercise.

A recurring source of content is the micro-benchmark of some easily understood sliver of our problem space, the canonical example being trivial web implementations in one’s platform of choice.

A Hello World for HTTP.

package main

import (
   "fmt"
   "net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
   fmt.Fprintf(w, "Hello world!")
}

func main() {
   http.HandleFunc("/", handler)
   http.ListenAndServe(":8080", nil)
}

Incontestable proof of the universal superiority of whatever language is being pushed. Massive numbers of meaningless requests served by a single virtual server.

As an aside that I should probably add as a footnote, I still strongly recommend that static and cached content be served from a dedicated platform like nginx (use lightweight unix sockets to the back end if on the same machine), itself very likely layered by a CDN. This sort of trivial type stuff should never be in your own code, nor should it be a primary focus of optimizations.

Occasionally the discussion will move to a slightly higher level and there’ll be impassioned debates about HTTP routers (differentiating URLs, pulling parameters, etc, then calling the relevant service logic), everyone optimizing the edges. There are thousands of HTTP routers on virtually every platform, most differentiated by tiny performance differences.

People once cut their teeth by making their own compiler or OS, but now everyone seems to start by making an HTTP router. Focus moves elsewhere.

In a recent discussion where a micro-benchmark was being discussed (used to promote a pre-alpha platform), a user said in regards to Go (one of the lesser alternatives compared against)-

“it’s just that the std lib is coded with total disregard for performance concerns, the http server is slow, regex implementation is a joke”

Total disregard. A joke. Slow.

On a decently capable server, that critiqued Go implementation, if you’re testing it in isolation and don’t care about doing anything actually useful, could serve more requests than seen by the vast majority of sites on these fair tubes of ours. With a magnitude or two to spare.

100s of thousands of requests per second is simply enormous. It wasn’t that long ago that we were amazed at 100 requests per second for completely static content cached in memory. Just a few short years ago most frameworks tapped out at barely double digit requests per second (twas the era of synchronous IO and blocking a thread for every request).

As a fun fact, a recent implementation I spearheaded attained four million fully robust web service financial transactions per second. This was on a seriously high-end server, and used a wide range of optimizations such as a zero-copy network interface and secure memory sharing between service layers, and ultimately was just grossly overbuilt unless conquering new worlds, but it helped a sales pitch.

Things improve. Standards and expectations improve. That really was a poor state of affairs, and not only were users given a slow, poor experience, it often required farms of servers for even modest traffic needs.

Choosing a high performance foundation is good. The common notion that you can just fix the poor performance parts after the fact seldom holds true.

Nonetheless, the whole venture made me curious what sort of correlation trivial micro-benchmarks hold to actual real-world needs. Clearly printing a string to a TCP connection is an absolutely minuscule part of any real-world solution, and once you’ve layered in authentication and authorization and models and abstractions and back-end microservices and ORMs and databases, it becomes a rounding error.

But does it indicate choices behind the scenes, or a fanatical pursuit of performance, that pays off elsewhere?

It’s tough to gauge because there is no universal web platform benchmark. There is no TPC for web applications.

The best we have, really, are the TechEmpower benchmarks. These are a set of relatively simple benchmarks that vary from absurdly trivial to mostly trivial-

  • Return a simple string (plaintext)
  • Serialize an object (containing a single string) into a JSON string and return it (json)
  • Query a value from a database, and serialize it (an id and a string) into a JSON string and return it (single query)
  • Query multiple values from a database and serialize them (multiple queries)
  • Query values from a database, add an additional value, and serialize them (fortunes)
  • Load rows into objects, update the objects, save the changes back to the database, serialize to json (data updates)

It is hardly a real world implementation of the stacks of dependencies and efficiency barriers in an application, but some of the tests are worlds better than the trivial micro-benchmarks that dot the land. It also gives developers a visible performance reward, just as SunSpider led to enormous JavaScript performance improvements.

So here’s the performance profile of a variety of frameworks/platforms against the postgres db on their physical test platform, each clustered in a sequence of plaintext (blue), JSON (red), Fortune (yellow), Single Query (green), and Multiple Query (brown) results. The vertical axis has been capped at 1,000,000 requests per second to preserve detail, and only frameworks having results for all of the categories are included.

When I originally decided that I’d author this piece, my intention was to actually show that you shouldn’t trust micro-benchmarks because they seldom have a correlation with more significant tasks that you’ll face in real life. While I’ve long argued that such optimizations often indicate a team that cares about performance holistically, in the web world it has often been the case that products that shine at very specific things are often very weak in more realistic use.

But in this case my core assumption was only partly right. The correlation between the trivial micro-benchmark speed (simply returning a string) and the more significant tasks that I was sure would be drowned out by underlying processing (when you’re doing queries at a rate of 1000 per second, an overhead of 0.000001s is hardly relevant) is much higher than I expected.

  • 0.75 – Correlation between JSON and plaintext performance
  • 0.58 – Correlation between Fortune and plaintext performance
  • 0.646 – Correlation between Single query and plaintext performance
  • 0.21371 – Correlation between Multiple query and plaintext performance

As more happens in the background, outside of the control of the framework, invariably the raw performance advantage is lost, but my core assumption was that there would be a much smaller correlation.
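
For anyone wanting to repeat the exercise on their own numbers, a correlation like the ones listed above can be computed in a few lines. This sketch assumes the figures are Pearson coefficients, which is the usual meaning but an assumption here, and the Correlation class name is invented. Each array would hold one test's requests per second, one entry per framework.

```java
final class Correlation {
   // Pearson correlation coefficient of two equally sized samples.
   static double pearson(double[] x, double[] y) {
      if (x.length != y.length || x.length < 2) {
         throw new IllegalArgumentException("need two samples of equal length >= 2");
      }
      int n = x.length;
      double meanX = 0;
      double meanY = 0;
      for (int i = 0; i < n; i++) {
         meanX += x[i];
         meanY += y[i];
      }
      meanX /= n;
      meanY /= n;

      double covariance = 0;
      double varianceX = 0;
      double varianceY = 0;
      for (int i = 0; i < n; i++) {
         double dx = x[i] - meanX;
         double dy = y[i] - meanY;
         covariance += dx * dy;
         varianceX += dx * dx;
         varianceY += dy * dy;
      }
      // The 1/n factors cancel between numerator and denominator.
      return covariance / Math.sqrt(varianceX * varianceY);
   }
}
```

A result near 1 means frameworks that are fast at one test tend to be fast at the other, a result near 0 means the plaintext number tells you little.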

So in the end this is simply a “well, that’s interesting” post. It certainly isn’t a recommendation for any framework or the other — developer aptitude and suitability for task reign supreme — but I found it interesting.


Link Rot Pt 2

Over the years I’ve moved between a number of content management systems, URL schemes, and even whole domain name changes. So when I did a move a while back I put in a large number of URL redirects for all of those ancient URLs in use around the tubes. A year ago I announced that I was removing them, but happenstance had me installing an nginx variant that had perl shortly after, so they lived for a while longer.

I finally moved on, removing the surface area risk of that rewrite subsystem. Those URLs, for which I had served 301 permanent redirects for years, are now dead. I have added to the global accumulation of link rot. I see dozens to hundreds of people coming from old HN or other links daily, get redirected to the front page, and click back (interesting to note that no one just searches for whatever it was that they were looking for once on the site. Attention spans have fallen to essentially zero).

I considered just making the 404 page a search of the wrong URL. Not worth it. Technically it’s an easy problem to solve, but that doesn’t mean it’s worth solving.

A pretty boring tale about link rot, but really it’s an observation about technology and simple solutions being ignored: Again, for years those old links were being responded to with a courtesy note that the URL has changed and a new URL should be used permanently. Link rot didn’t have to become rot if any of those systems ever did any verification at all on their links, remembering the new location for the future.

While that sort of link verification and upkeep is a slightly more involved task with something like social news or comment links (and has some wrinkles that would need to be considered, like PageRank gaming, where a bunch of old links are all spaghetti fed to some spam site after gaining credibility, though that’s already the case with a persistent redirect, so simply recording the new location is no regression), it’s so bad that even feed readers ignored 301 redirects for years. They followed them, but on the next request they were back at the original URL, once again awaiting the onset of link rot.
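
The verification described above is small enough to sketch. This is a hypothetical outline, not how any particular site or feed reader works: LinkChecker, urlToStore and verify are invented names, and it treats 308 as permanent alongside 301. The decision is kept separate from the network call so it can be checked without touching the network.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class LinkChecker {
   // Decides which URL to keep after a check: on a permanent redirect,
   // remember the new location, otherwise keep what we had.
   static String urlToStore(String storedUrl, int status, String location) {
      boolean permanent = status == HttpURLConnection.HTTP_MOVED_PERM || status == 308;
      if (!permanent || location == null || location.isEmpty()) {
         return storedUrl;
      }
      // Location may be relative, so resolve it against the stored URL.
      return URI.create(storedUrl).resolve(location).toString();
   }

   // Issues a HEAD request without following redirects and applies the rule above.
   static String verify(String storedUrl) throws IOException {
      HttpURLConnection connection =
         (HttpURLConnection) URI.create(storedUrl).toURL().openConnection();
      connection.setInstanceFollowRedirects(false);
      connection.setRequestMethod("HEAD");
      try {
         return urlToStore(storedUrl, connection.getResponseCode(),
            connection.getHeaderField("Location"));
      } finally {
         connection.disconnect();
      }
   }
}
```

Run occasionally over stored links, something like verify would rewrite them while the courtesy redirect is still being served, instead of discovering the 404 years later.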

The Dark Clouds of Choice

I enjoy Go (the programming language/tooling). It rapidly facilitates efficient, concise solutions. Whether it’s processing some data, building a simple connector web service, hosting a system monitoring tool (where I might layer it on some C libraries), it just makes it a joy to bang out a solution. The no-dependency binary outputs are a bonus.

It’s a great tool to have in pocket, and is fun to leverage.

I recently pondered why this was. Why I don’t have the same feeling of delight building solutions in any of the other platforms that I regularly use. C# or Java, for instance, are both absolutely spectacular languages with extraordinary ecosystems (I’d talk about C or C++ here, using both in the regular mix, but while there are usually very practical reasons to drop to them they don’t really fit in this conversation, despite the fact that from a build-a-fast-native-executable perspective they’re the closest).

Goroutines? Syntactic sugar over thread pools. Channels? That’s a concurrent queue. Syntactical sugar is nice and lubricates usage, but once you’ve done it enough times it just becomes irrelevant. A static native build? A bonus, but not a critical requirement.
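
As a rough illustration of that equivalence, here is the Java spelling of the goroutines-plus-channel shape: workers on a thread pool pushing results into a bounded concurrent queue that the caller drains. It is only a sketch, with PoolAndQueue and sumOfSquares as invented names and the pool and queue sizes picked arbitrarily.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolAndQueue {
   // Squares each input on a pool thread and sums the results on the caller's thread.
   static long sumOfSquares(int[] inputs) {
      ExecutorService pool = Executors.newFixedThreadPool(4);      // the "goroutines"
      BlockingQueue<Long> results = new ArrayBlockingQueue<>(8);   // the "channel"
      try {
         for (int input : inputs) {
            final long value = input;
            pool.execute(() -> {
               try {
                  results.put(value * value); // blocks while the queue is full
               } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
               }
            });
         }
         long sum = 0;
         for (int i = 0; i < inputs.length; i++) {
            sum += results.take(); // blocks until a worker delivers
         }
         return sum;
      } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
         throw new IllegalStateException(e);
      } finally {
         pool.shutdown();
      }
   }
}
```

It is more ceremony than the Go version would be, but the moving parts are the same.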

There is nothing I build in Go that I can’t build at a similar pace in the other languages. And those languages have rich IDEs and toolsets, while Go remains remarkably spartan.

The reason I enjoy it, I think, is that Go is young enough that it isn’t overloaded with the paradox of choice: You throw together some Go using the basic idioms and widely acknowledged best practices, compile it, and there’s your solution. Move on. A lot of strongly held opinions are appearing (about dependencies and versioning, etc — things Go should have gotten right in v1), and an evolutionary battle is happening between a lot of supporting libraries, but ultimately you can mostly ignore it. Happily build and enjoy.

Doing the same project in Java or C#, on the other hand, is an endless series of diverging multi-path forks. Choices. For the most trivial of needs you have countless options of implementation approaches and patterns, and supporting libraries and dependencies. With each new language iteration the options multiply as more language elements from newer languages are grafted on.

Choices are empowering when you’re choosing between wrong and right options, where you can objectively evaluate and make informed, confident decisions. Unfortunately our choices are often more a matter of taste, with countless ways to achieve the same goal, with primarily subjective differences (I’ll anger 50% of C# programmers by stating that LINQ is one of the worst things to happen to the language, and is an inelegant hack that is overwhelmingly used to build terrible, inefficient, opaque code).

We’re perpetually dogged with the sense that you could have gone a different way. Done it a different way. I actually enjoy C++ (I admit it…Stockholm syndrome?), but with each new standard there are more bolted-on ways to achieve existing solutions in slightly different ways.

I of course still build many solutions on those other platforms, and am incredibly thankful they exist, but I never have the completely confident sense that it is optimal in all ways, or that someone couldn’t look at it and ask “Couldn’t you have…” and I could firmly retort. I continue an internal debate about forks not taken.

The Best of the Best of the Best…Maybe?

I’ve talked about the consulting thing on here a lot, and the trouble involved with the pursuit. While I’ve been working with a fantastic long-running client, and have primarily been focused on speculative technology builds, I considered keeping skills diverse and fresh by mixing things up and working occasionally through a freelancing group that purports to have the best of the best. Doing this would theoretically remove the bad parts of chasing engagements, pre-sales, badgering for payments, etc. The parts that are a giant pain when you hang your own shingle.

If it was just challenging engagements with vetted clients, cool. Just the fun and rewarding parts, please and thank-you.

Freelancing groups almost always end up being a race to the bottom, generally becoming dens of mediocrity, so the notion of a very selective group made it more interesting. I like a challenge, and if someone wants to build a collection of Boss Levels for me to battle, the barriers yielding a group that customers would pay a premium for, maybe it’d be interesting.

So I took a look to find that one of their requirements (not mandatory, but strongly recommended) is that you post on your blog how much you want to work with them. This is before you know anything about the credibility of their process, rates, the quality of peers, etc. And this isn’t something you can easily find: they demand that you don’t talk about their process or virtually anything involved with working with them, so in the absence of any information about them (beyond some very negative threads I later found on HN, primarily posts by throwaway accounts), you need to tell the world how eager you are to join them.

This is a profound asymmetry of motives.

Who would do that? It is an adverse selection criterion, and it instantly deflated my impression of their selectivity and had me clicking the back button: The sort of desperation where someone would pander like that (while certainly possible among fantastic talents in bad situations) is not a good criterion. To try to harvest cheap link and namespace credibility like that itself makes the service look lame, like a cheap 1990s long distance carrier.

I still want those challenging smaller efforts, however — variety keeps me fresh and in love with this field, and some extra cash and contacts are always nice — so instead I’m going to start offering micro-consulting via yafla, pitching it on here as well: Pay some set amount (e.g. $500), describe your problem and supply supporting data, and I’ll spend from four to eight hours on your problem, offering up appropriate deliverables (utility, shell of a solution, decision input, directions, analysis, etc). I’ll get that online quickly.

Going for the smaller chunk efforts, of the sort that someone with a corporate credit card can easily pay for, should prove much more tenable than seeking significant engagements where the sales process just drags on forever.

It also is a cross-motivation desire to encourage me to spend more time on this blog, and if I’m pitching bite-sized consulting on posts and actually seeing uptake, it’ll keep me authoring content that people want to see.

While I have always enjoyed the cathartic aspect of blogging, it has never been profitable for me. Every engagement has always come via referral and word of mouth, and even when I’ve sought traditional employment I’ve always been amazed that no one ever even does a simple Google search, much less read anything on here. I never expected any return, and adore that anyone reads these things at all, but it does make it so at times it’s tough to justify time spent making content.

Paid For Solutions, Not The Pursuit

A fun read via HN this morn – You Are Not Paid to Write Code.

I’ve touched on this many times here, including an entry a decade ago where I called SLOC the “Most Destructive Metric in Software Development”. A decade of experience has only made me double (nay, quadruple) my belief in that sentiment: outside of truly novel solutions, SLOC often has a negative correlation with productivity, and high SLOC shops almost universally slow to a crawl until they hit the rock bottom of no progress. Eventually a new guard is brought in, a giant volume of code is trashed, and the same futile pursuit started fresh again.

This time, you see, it’s a giant node.js codebase instead of that silly giant python codebase they pursued the last time, replacing the giant Delphi codebase that replaced the giant Visual Basic codebase that…

This time it’s different.

An entry on here that gets a number of daily visitors is one from 2005 – Internal Code Reuse Considered Dangerous (I’ve always wondered why, and my weak assumption is it’s coworkers trying to convince peers that they need to move on from the internal framework mentality). That piece came from firsthand observations of a number of shops that had enormous volumes of internal frameworks and libraries that were second-rate, half-complete duplications of proven industry options. But it was treated as an asset, as if just a bit more code would give them the swiss army knife that would allow them to annihilate competitors. Every one of those shops eventually failed outright or did a complete technology shift to leave the detritus in the past.

It isn’t hard to figure out how this happens. If someone asks you to process a file from A to B — the B is the part they care about, not particularly how you do it — and you present a solution including some free and appropriate ETL tools and options, there is no glory in that. If, on the other hand, you make a heavily abstracted, versatile, plug-in engine that can (hypothetically and in some future reality where it ever gets finished) process any form of file to any form of output with a pluggable reference engine and calculation derivative, you can pitch the notion of IP. That instead of just providing a solution, you’ve also built an asset.

This is a lie almost all of the time. There is incredibly little code theft in this industry. That giant internal framework, when uncoupled from the internal mythology of a shop, suddenly has negligible or even negative value to outsiders. A part of that is the not-invented-here syndrome endemic in this industry (I’ve been in the business of trying to sell completed, proven solutions, and even then it’s tough to find customers because everyone imagines that they can build it themselves better), but a larger part is simply that broadly usable, generalized solutions don’t happen by accident.

This is one of those sorts of entries where some might contrive exceptions, and of course there are exceptions. There are cases where the business says “turn A to B” and you discover that you really need to turn A to B-Z, and 0-9, and… There are many situations where novel solutions are necessary. But so many times they simply aren’t.

Some of the most fulfilling consulting engagements I’ve taken on have been fixed deliverable / fixed price contracts. These are a sort that everyone in this industry will tell you never to do, but really if you have a good grip on your abilities, the problem, and you can obtain a clearly understood agreement of scope and capabilities and deficiencies of the proposed build, it is incredibly liberating. Being literally paid for the solution leads to some of the most frictionless gigs.