Updates: Pay Apps / Date-Changing Posts / Random Projects

Recently a couple of readers noticed that some posts seemed to be reposted with contemporary dates. The explanation might be broadly interesting, so here goes.

I host this site on Amazon’s AWS, as that’s where I’ve done a lot of professional work, I trust the platform, etc. It’s just a personal blog, so I actually host it on spot instances — instances that are bid upon and can be terminated at any moment — and late in the week there was a dramatic spike in the pricing of c3 instances, exceeding my bid maximum. My instance was terminated with extreme prejudice. I still had the EBS volume, and could easily have replicated the data onto a new instance for zero data loss (just a short period of unavailability). However, I was heading out, so I simply spun up an AMI image that I’d previously saved, reposted a couple of the lost posts from Google’s cached text, and let it be. Apologies.

Revisiting Gallus

Readers know I worked for a while on a speculative app called Gallus — a gyroscope-stabilized video solution with a long litany of additional features. Gallus came close to being sold as a complete technology twice, and was the source of an unending amount of stress.

Anyway, I recently wanted the challenge of frame-versus-frame image stabilization and achieved some fantastic results, motivated by my Galaxy S8: it features OIS (about which it provides no developer-accessible metrics), but given the short correction range of in-camera OIS it can still yield an imperfect result. The idea was a combination of EIS and OIS, and the result of that development benefits everything. I rolled it into Gallus to augment the existing gyroscope feature, coupling the two for fantastic results (it gets rid of the odd gyro mistiming issue, while keeping the benefit of fully stabilizing highly dynamic and complex scenes). Previously I pursued purely a big-pop outcome — I only wanted a technology purchase, coming perilously close — but this time it occupies a much more sedate place in my life and my hopes are relaxed. Nonetheless it will return as a pay app, with a dramatically simplified and optimized API. I am considering restricting it to devices I directly test on first hand. If there are zero installs, or a dozen, that’s fine; it’s a much different approach and expectation.

Project with my Son

Another project approaching release is a novelty app with my son, primarily to acclimate him to “team” working with git. Again, expectations are amazingly low and it’s just for fun, but it might make for the source of some content.

Gallus / Lessons In Developing For Android

I recently unpublished Gallus — the first and only gyroscope stabilizing hyperlapse app on Android, capturing and stabilizing 4K/60FPS video in real time with overlaid instrumentation like route, speed, distance, etc — from the Google Play Store. It had somewhere in the range of 50,000 installs at the time of unpublishing, with an average rating just under 4.

It was an enjoyable project to work on, primarily because it was extremely challenging (75% of the challenge being dealing with the deficiencies of Android devices), and in the wake of the change a previous negotiation to sell the technology/source code, and possibly my involvement, was revived. That old discussion had been put on indefinite hold at an advanced stage when the buyer was acquired by a larger company, which quite reasonably led them to regroup and analyze how their technologies overlapped and what their ongoing strategy would be.

I put far too many eggs into that deal going smoothly, leading to significant personal costs. It taught me a lesson about contingencies. I’m pleased it is coming to fruition now.

I never had revenue plans for the free app by itself (no ads, no data collection, no coupling with anything else…just pure functionality); ultimately the technology sale was my goal.

Nonetheless, the experience was illuminating, so I’m taking a moment to document my observations for future me to search up later-

  • Organic search is useless as a means of getting installs. Use a PR-style approach. 50,000 installs is two orders of magnitude below my expectation.
  • No matter how simple and intuitive your interface is, people will misunderstand it. When a product is widely used we naturally think “I must be doing something wrong” when faced with even the worst interfaces, putting in the time and effort to learn their idioms and standards. When a product isn’t widely used we instead think “this product is doing something wrong” and blame the product. I abhor the blame-the-user mentality, but at the same time it was eye opening how little time people would spend on even the most rudimentary investigation before giving up.
  • The higher a user’s expectations for your product — the more they want to leverage it and see it adding value to their life — the more vigorous and motivated their anger/disappointment is if for some reason they can’t. Over the app’s time on the market, some of the feedback I got because it wouldn’t work on someone’s obscure brand of device that I’d never heard of, with a chipset I’d never seen before, was extraordinary. A sort of “you owe me this working perfectly on my device”.
  • The one-star-but-will-five-star-if-you… users are terrible people. There’s simply no other way to put it. Early on I played along, but quickly learned that they’ll be back with a one star in the future with a new demand.
  • I have spoken about the exaggeration of Android fragmentation on here many times before: For most apps it is a complete non-issue. Most apps can reasonably target 4.4 and above with little market impact. Cutting out obsolete devices will often just save you from negative feedback later — it is Pyrrhic to reach back and support people’s abandoned, obsolete devices — but if you do try to reach back for 100% coverage it’s made easier with the compatibility support library.
    At the same time, though, if you touch the camera pipeline it is nothing but a world of hurt. The number of defects, non-conformances, broken codecs, and fragile systems that I encountered on devices left me shell-shocked: wrong pixel formats, codecs that die in certain combinations with other codecs, the complete and utter mystery of what a given device can handle (resolution, bitrate, i-frame interval, frame rate, etc.; see the capability-probing sketch after this list). For a free little app I certainly couldn’t maintain a suite of test devices (for a given Samsung Galaxy S device there are often dozens of variations, each with unique quirks), and could only base my work on the six or so devices I do have, so it ends up being a game of “if someone complains, just block the device”, and then I’d notice that months earlier someone on a slight variation of the same device had given a five star review and reported great results.
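
To make that last point concrete, here is the sort of capability probing that helps (a minimal sketch, API 21+; the 4K/30 and 20 Mbps thresholds are just illustrative, and a passing check is still no guarantee against device quirks):

import android.media.MediaCodecInfo;
import android.media.MediaCodecList;
import android.media.MediaFormat;

// Ask what the device's HEVC encoder claims to support before configuring it,
// rather than discovering the failure at runtime on a user's phone.
static boolean supports4kHevcEncode() {
   MediaCodecList list = new MediaCodecList(MediaCodecList.REGULAR_CODECS);
   for (MediaCodecInfo info : list.getCodecInfos()) {
      if (!info.isEncoder()) continue;
      for (String type : info.getSupportedTypes()) {
         if (!MediaFormat.MIMETYPE_VIDEO_HEVC.equalsIgnoreCase(type)) continue;
         MediaCodecInfo.VideoCapabilities caps =
            info.getCapabilitiesForType(type).getVideoCapabilities();
         // Claimed support at least filters out the obviously impossible configurations.
         if (caps.areSizeAndRateSupported(3840, 2160, 30)
               && caps.getBitrateRange().contains(20_000_000)) {
            return true;
         }
      }
   }
   return false;
}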

I enjoyed the project, and finally it looks like it will pay off financially, but the experience dramatically changed my future approach to Android development.

Android’s “Secure Enclave” / Private Content and Strong Encryption

Recent iterations of the Android OS have exposed more of the ARM Trusted Execution Environment or Secure Element, allowing you to use encryption that is strongly tied to a piece of hardware. It’s a subsystem where you can create strongly protected keys (symmetric and asymmetric), protected against extraction and rate limited (via user authentication requirements) against brute force attacks, and use them against streams of data to encrypt or decrypt securely.

The private elements of these keys can’t be extracted from the device (theoretically, at least), regardless of whether the application or even operating system were compromised or subverted.

A foe can’t do an adb backup or flash copy and find a poorly hidden private key to extract confidential data.

In an idealized world this would defend against even nation-state levels of resources, though that isn’t necessarily the case: implementations slowly move towards perfection.

Imagine that you’re making an app to capture strongly encrypted video+audio (a sponsored upgrade solicitation I was recently offered for a prior product I built). The reasons could be many and are outside of the technical discussion: field intelligence gathering or secret R&D, catching corruption as a whistle blower, romantic trysts where the parties don’t really want their captures to be accidentally viewed by someone looking at holiday pictures on their device or automatically uploaded to the cloud or shared, etc.

There are nefarious uses for encryption, as there are with all privacy efforts, but there are countless entirely innocent and lawful uses in the era of the “Fappening”. We have credible reasons for wanting to protect data when it’s perilously easy to share it accidentally and unintentionally.

Let’s start with a modern Android device with full disk encryption. That’s a better starting place than going without, but it still leaves a number of gaps (FDE becomes irrelevant when you happily unlock and hand your device to a family member to play a game, or when the Android media scanner decides to enumerate media that wasn’t appropriately protected, or you pocket-share to Facebook, etc).

So you have some codec streams emitting HEVC and AAC blocks (or any other source of data, really), and you want to encrypt them in a strong, device-coupled fashion, above and beyond FDE. You accept that if the device is lost or broken, that data is gone, presuming you aren’t live-uploading the original streams (presumably over an encrypted connection) to some persistence location, which brings up a litany of new concerns and considerations and may undermine this whole exercise.

Easy peasy, at least on Android 6.0 or above (which currently entails about 27% of the active Android market, which sounds small until you consider that it accounts for hundreds of millions of devices: a massive market by normal measures).

final String keyIdentifier = "codecEncrypt";
final KeyStore ks = KeyStore.getInstance("AndroidKeyStore");
ks.load(null);

SecretKey key = (SecretKey) ks.getKey(keyIdentifier, null);
if (key == null) {
   // create the key
   KeyGenerator keyGenerator = KeyGenerator.getInstance(
      KeyProperties.KEY_ALGORITHM_AES, "AndroidKeyStore");
   keyGenerator.init(
      new KeyGenParameterSpec.Builder(keyIdentifier,
         KeyProperties.PURPOSE_ENCRYPT | KeyProperties.PURPOSE_DECRYPT)
            .setBlockModes(
               KeyProperties.BLOCK_MODE_CBC,
               KeyProperties.BLOCK_MODE_CTR, 
               KeyProperties.BLOCK_MODE_GCM)
            .setEncryptionPaddings(
               KeyProperties.ENCRYPTION_PADDING_PKCS7, 
               KeyProperties.ENCRYPTION_PADDING_NONE)
            .setUserAuthenticationRequired(true)
            .setUserAuthenticationValidityDurationSeconds(30)
            .build());
   key = keyGenerator.generateKey();

   // verify that key is in secure hardware.
   SecretKeyFactory factory = SecretKeyFactory.getInstance(key.getAlgorithm(), "AndroidKeyStore");
   KeyInfo keyInfo = (KeyInfo) factory.getKeySpec(key, KeyInfo.class);
   if (!keyInfo.isInsideSecureHardware()) {
      // is this acceptable? Depends on the app
   }
}

The above sample is greatly simplified, and there are a number of possible exceptions and error states that need to be accounted for, as does the decision of whether secure hardware is a necessity or a nicety (in the nicety case the OS still acts with best efforts to protect the key, but has less of a barrier to exploitation if someone compromised the OS itself).

In this case it’s an AES key that allows a number of block modes and padding choices. Notably, the key demands user authentication within 30 seconds before use: for a device with a fingerprint or passcode or pattern, the key won’t allow initialization in a cipher unless that requirement has been met, so your app must imperatively trigger a re-authentication when it hits the resulting exceptions. Whether this is a requirement for a given use is up to the developer.

You can’t pull out the key material, as the key is protected from extraction both by software and by hardware.

Using the key is largely normal cipher operations.

Cipher cipher = Cipher.getInstance("AES/CBC/PKCS7Padding");
cipher.init(Cipher.ENCRYPT_MODE, key);
// the iv is critically important. save it.
byte[] iv = cipher.getIV();
// encrypt the data.
byte[] encryptedBytes = cipher.update(dataToEncrypt);
// ... persist and repeat with subsequent blocks ...
encryptedBytes = cipher.doFinal();
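
If you’re streaming the encrypted output straight to storage, one option (a sketch; the file name and block variables are illustrative) is to wrap the destination in a CipherOutputStream so the update/doFinal bookkeeping is handled for you:

import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.crypto.CipherOutputStream;

// Writes ciphertext to disk as the codec emits blocks; closing the stream
// performs the final padded doFinal() internally.
OutputStream out = new CipherOutputStream(new FileOutputStream("encrypted.bin"), cipher);
out.write(firstBlock);   // repeat for each HEVC/AAC block as it arrives
out.write(nextBlock);
out.close();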

And to decrypt

Cipher decryptCipher = Cipher.getInstance("AES/CBC/PKCS7Padding");
AlgorithmParameterSpec ivSpec = new IvParameterSpec(iv);
decryptCipher.init(Cipher.DECRYPT_MODE, key, ivSpec);
byte[] decryptedBytes = decryptCipher.update(encryptedBlock);
// ... repeat update() with subsequent blocks ...
decryptedBytes = decryptCipher.doFinal();

Pretty straightforward, and the key is never revealed to the application, nor often even to the OS. If you demanded that the user be recently authenticated and that requirement isn’t satisfied (e.g. the timeout has elapsed), the Cipher init call will throw a UserNotAuthenticatedException, which you can deal with by calling-

Intent targetIntent = keyguardManager.createConfirmDeviceCredentialIntent(null, null);
startActivityForResult(targetIntent, 0);

Try again on a successful authentication callback; the secure “enclave” does the necessary rate limiting and lockouts as appropriate.
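
To round that out, a minimal sketch of the retry flow in the calling Activity (the request code and retryCipherInit() helper are illustrative placeholders, not part of any particular app):

private static final int REQUEST_AUTH = 0;

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
   super.onActivityResult(requestCode, resultCode, data);
   if (requestCode == REQUEST_AUTH && resultCode == RESULT_OK) {
      // The keyguard confirmed the user, so the key is usable again for the
      // configured validity window; retry the cipher initialization.
      retryCipherInit();
   }
   // Otherwise the user backed out and the key stays locked until they authenticate.
}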

And of course you may have separate keys for different media files, though ultimately nothing would be gained by doing that.

Having the key in secure hardware is a huge benefit, and ensuring that an authentication happened recently is crucial, but if you take out your full-disk-encrypted Android device, unlock it with your fingerprint, and then hand it to your grandmother to play Townships, she might accidentally hit recent apps and click to open a video of you doing parkour on the tops of city buildings (family controversy!). It would open, because all of the requirements have been satisfied and the hardware key would happily be used to decrypt the media stream.

It’d be nice to have an additional level of protection above and beyond simple imperative fences. One thing missing from the KeyStore implementation is the ability to attach a password to each key (something the traditional key vaults have).

Adding a second-level password, including per media resource (without requiring additional keys), is trivial. Recall that each Cipher starts with a randomly generated IV (an initialization vector which, as the name states, is the initial state of the cipher), which ultimately is not privileged information and is generally stored in the clear (the point of an IV is that if you re-encrypt the same content again and again with the same key, an unwanted observer could discern that it’s repeating, so adding a random IV as the starting point makes each encrypted message entirely different).

Without the IV you can’t decrypt the stream. So let’s encrypt the IV with a pass phrase.

SecretKeyFactory factory = SecretKeyFactory.getInstance("PBEwithSHAandTWOFISH-CBC");
SecretKey ivKey = factory.generateSecret(new PBEKeySpec(password.toCharArray(), new byte[] {0}, 32, 256));

Cipher ivCipher = Cipher.getInstance("AES/GCM/NoPadding");
ivCipher.init(Cipher.ENCRYPT_MODE, ivKey);
byte [] ivIv = ivCipher.getIV();
byte [] encryptedIv = ivCipher.doFinal(iv);

In this case I used GCM, which cryptographically tags the encrypted data with an integrity digest and, on decryption, validates the payload against corruption or modification. A successful GCM decryption is a clear thumbs up that the password was correct. You could of course use GCM for the media streams as well, and if it’s individual frames of audio or video, each in distinct sessions, that would probably be ideal, but for large messages GCM has the downside that the entirety of the output is buffered until completion so it can compute and validate the tag.

Now we have an encrypted IV (the 16-byte IV becoming a 32-byte encrypted output given that it contains the 16-byte GCM tag, plus we need to save the additional 16-byte IV for this new session, so 48 bytes to store our protected IV). Note that I used a salt of a single 0 byte because it doesn’t add value in this case.
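
For concreteness, here is one way to pack and unpack those pieces (a sketch; the one-byte length prefix is purely an illustrative choice that keeps the read side independent of the provider’s GCM IV length, at the cost of a single byte over the 48 mentioned above):

import java.nio.ByteBuffer;

// Pack the session IV and the encrypted (IV + GCM tag) payload into one blob.
byte[] protectedIv = ByteBuffer.allocate(1 + ivIv.length + encryptedIv.length)
   .put((byte) ivIv.length)   // length prefix for the session IV
   .put(ivIv)                 // stored in the clear
   .put(encryptedIv)          // ciphertext + GCM tag
   .array();

// To unpack later:
ByteBuffer buf = ByteBuffer.wrap(protectedIv);
byte[] sessionIv = new byte[buf.get()];
buf.get(sessionIv);
byte[] encryptedIvOut = new byte[buf.remaining()];
buf.get(encryptedIvOut);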

You can use time-intensive, many-round KDFs, but in this case, where the passphrase is acting as an augmentation over already strong, rate-limited, secure-hardware-backed encryption, it isn’t that critical.

To decrypt the IV-

ivCipher = Cipher.getInstance("AES/GCM/NoPadding");
// GCMParameterSpec (with the standard 128-bit tag length) is the conventional
// way to hand the IV back for GCM decryption.
ivCipher.init(Cipher.DECRYPT_MODE, ivKey, new GCMParameterSpec(128, ivIv));
byte[] decryptedIv = ivCipher.doFinal(encryptedIv);

We store the encrypted streams and the encrypted IV (including its own IV), and now to access that media stream the user needs to authenticate with the OS’s rate- and retry-limited authentication, the hardware needs to hold the associated trusted-environment key, and the user needs the correct passphrase to recover the IV and successfully decrypt.

In the end it’s remarkably simple to add powerful, extremely effective encryption to your application. This can be useful simply to protect you from your own misclicks, or even to defend against formidable, well-resourced foes.

Optical vs Electronic Image Stabilization

The recently unveiled Google Pixel smartphones feature electronic image stabilization in lieu of optical image stabilization, with Google reps offering up some justifications for their choice.

While there is some merit to their arguments, the contention that optical image stabilization is primarily for photos is inaccurate, and is at odds with the many excellent video solutions that feature optical image stabilization, including the competing iPhone 7.

Note as well that video is nothing more than a series of pictures, a seemingly trite observation that will prove relevant later in this piece.

This post is an attempt to explain stabilization techniques and their merits and detriments.

Why should you listen to me? I have some degree of expertise on this topic. Almost two years ago I created an app for Android1 that featured gyroscope-driven image stabilization, with advanced perspective correction and rolling shutter compensation. It offers sensor-driven Electronic Image Stabilization for any Android device (with Android 4.4+ and a gyroscope).

It was long the only app that did this for Android (and to my knowledge remains the only third-party app to do it). Subsequent releases of Android included extremely rudimentary EIS functionality in the system camera. Now with the Google Pixel, Google has purportedly upgraded the hardware, paid attention to the necessity of reliable timing, and offered a limited version for that device.

They explain why they could accomplish it with some hardware upgrades-

“We have a physical wire between the camera module and the hub for the accelerometer, the gyro, and the motion sensors,” Knight explains. “That makes it possible to have very accurate time-sensing and synchronization between the camera and the gyro.”

We’re talking about gyroscope readings arriving 200 times per second, and video frames in the 30-per-second range. The timing sensitivity is in the 5ms range (meaning the event sources need to be timestamped within that range of accuracy). That is a trivial timing need, and should require no special hardware upgrades to accomplish. The iPhone has had rock solid gyroscope timing information going back many generations, along with rock solid image frame timing. It simply wasn’t a need for Google, so sloppy, unreliable timing was the foundation Android provided (and let me be clear that I’m very pro-Android. I’m pro all non-murdery technology, really. This isn’t advocacy or flag-waving for some alternative: it’s just an irritation that something so simple became so troublesome and wasted so much of my time).
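
To make the timing requirement concrete, correlating the two streams is nothing more exotic than interpolating the ~200 Hz gyro-derived orientation to each frame’s timestamp (a sketch with an illustrative Sample holder; real code first integrates the gyro rates into per-axis orientations):

import java.util.List;

// Estimate the orientation at a frame's timestamp by linearly interpolating
// between the two surrounding gyro samples. This only works if both clocks
// are accurate to a few milliseconds, which is the whole point above.
static double orientationAt(long frameTimeNs, List<Sample> samples) {
   for (int i = 1; i < samples.size(); i++) {
      Sample a = samples.get(i - 1);
      Sample b = samples.get(i);
      if (a.timeNs <= frameTimeNs && frameTimeNs <= b.timeNs) {
         double t = (frameTimeNs - a.timeNs) / (double) (b.timeNs - a.timeNs);
         return a.angle + t * (b.angle - a.angle);
      }
   }
   // Frame falls outside the sampled window; return the nearest known value.
   return samples.get(samples.size() - 1).angle;
}

static final class Sample {
   final long timeNs;    // event timestamp, same clock as the frame timestamps
   final double angle;   // integrated rotation about one axis, in radians
   Sample(long timeNs, double angle) { this.timeNs = timeNs; this.angle = angle; }
}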

Everyone is getting in on the EIS game now, including Sony, one of the pioneers of OIS, with several of their new smartphones, and even GoPro with their latest device (their demos again under the blazing midday sun, and still unimpressive). EIS lets you use a cheaper, thinner, less complex imaging module, reducing the number of moving parts (so better yields and better reliability over time; speaking of which, I’ve had two SLR camera bodies go bad because the stabilized sensor system broke in some way).

A number of post-processing options have also appeared (e.g. using only frame v frame evaluations to determine movement and perspective), including Microsoft’s stabilization solution, and the optional solution built right into YouTube.

There are some great white papers covering the topic of stabilization.

Let’s get to stabilization techniques and how EIS compares with OIS.

With optical image stabilization, a gyro sensor package is coupled with the imaging sensor. Some solutions couple this with some electromagnets to move the lens, other solutions move the sensor array, while the best option (there are optical consequences of moving the lens or sensor individually, limiting the magnitude before there are negative optical effects) moves the entire lens+sensor assembly (frequently called “module tilt”), as if it were on a limited range gimbal. And there are actual gimbals2 that can hold your imaging device and stabilize it via gyroscope directed motors.

A 2-axis OIS solution corrects for minor movements of tilt or yaw — e.g. tilting the lens down or up, or tilting it to the sides. The Nexus 5 came with 2-axis stabilization, although it was never well used by the system software, and later updates seem to have simply disabled it altogether.

More advanced solutions add rotation (roll), which is twisting the camera, upping it to a 3-axis solution. The pinnacle is 5-axis, which also incorporates accelerometer readings and compensates for minor movements left and right, up and down.

EIS also comes in software 2-, 3- and 5-axis varieties: Correlate the necessary sensor readings with the captured frames and correct accordingly. My app is 3-axis (adding the lateral movements was too unreliable across devices, not to mention that while rotational movements could be very accurately corrected and perspective adjusted, the perspective change of lateral movements is a non-trivial consideration, and most implementations are naive).

With an OIS solution the module is trying to fix on a static orientation so long as it concludes that any movement is unintended and variations fall within its range of movement. As you’re walking and pointing down the street, the various movements are cancelled out as the lens or sensor or entire module does corrective counter-movements. Built-in modules have a limited degree of correction — one to two degrees in most cases, so you still have to be somewhat steady, but it can make captures look like you’re operating a steadicam.

An OIS solution does not need to crop to apply corrections, and doesn’t need to maintain any sort of boundary buffer area. The downside, however, is that the OIS system is trying to guess, in real time, the intentions of movements, and will often initially cancel out the onset of intentional movements: As you start to pan the OIS system will often counteract the motion, and then rapidly try to “catch up” and move back to the center sweet spot where it has the maximum range of motion for the new orientation.

The imaging sensor in OIS solutions is largely looking at a static scene, mechanically kept aligned.

With an EIS solution, in contrast, the sensor is fixed, and is moving as the user moves. As the user vibrates around and sways back and forth, and so on, that is what the camera is capturing. The stabilization is then applied either in real-time, or as a post-processing step.

A real-time EIS system often maintains a fixed cropping to maintain a buffer area (e.g. only a portion of the frame is recorded, allowing the active capture area to move around within the buffer area without changing digital zoom levels), and as with OIS solution it predictively tries to infer the intentions of movements. From the demo video Google gave, their system is real-time (or with a minimal number of buffer frames), yielding the displeasing shifts as it adjusts from being fixed on one orientation to transitioning to the next fixed orientation (presumably as range of movement started to push against the edge of the buffer area), rather than smoothly panning between.

A sensor-driven post-processing EIS system, which is what Gallus is, captures the original recording as-is, correlating the necessary sensor data; then in post-processing, using attributes of the device (focal length, sensor size, field of view, etc), it evaluates the motion with knowledge of the entire sequence, applying low-pass filters and other smoothing techniques to produce a movement spline within the set variability allowance.
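
As a flavor of what that smoothing means in practice, here is a minimal sketch (a simple exponential moving average over per-frame camera angles, clamped to the crop buffer; an offline pass can also filter forward and backward or fit an actual spline, and this is only an illustration, not the Gallus implementation):

// Smooth the recorded per-frame orientation into an "intended" camera path,
// then correct each frame by the difference between actual and smoothed,
// clamped so the reprojected frame never leaves the capture buffer.
static double[] computeCorrections(double[] frameAngles, double alpha, double maxCorrection) {
   double[] corrections = new double[frameAngles.length];
   double smoothed = frameAngles[0];
   for (int i = 0; i < frameAngles.length; i++) {
      // Exponential moving average: lower alpha = heavier smoothing.
      smoothed = alpha * frameAngles[i] + (1 - alpha) * smoothed;
      double correction = smoothed - frameAngles[i];
      corrections[i] = Math.max(-maxCorrection, Math.min(maxCorrection, correction));
   }
   return corrections;
}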

Let’s start with an illustrative sample. Moments before writing this entry, the sun beginning to set on what was already a dreary, dim day, I took a walk in the back of the yard with my app, shooting a 4K video on my Nexus 6p. Here it is (it was originally recorded in h265 and was transcoded to h264, and then YouTube did its own re-encoding, so some quality was lost) –

This is no “noon in San Francisco” or “ride on the ferry” sort of demo. It’s terrible light, subjects are close to the camera (and thus have a high rate of relative motion in frame during movements) and the motions are erratic and extreme, though I was actually trying to be stable.

Here’s what Gallus — an example of sensor-driven post-processing EIS — yielded when it stabilized the result.

I included some of the Gallus instrumentation for my own edification. Having that sort of informational overlay on a video is an interesting concern because it conflicts with systems that do frame-v-frame stabilization.

Next up is YouTube and their frame-v-frame stabilization.

YouTube did a pretty good job, outside of the random zooming and jello effect that appears in various segments.

But ultimately this is not intended to be a positive example of Gallus. Quite the opposite: I’m demonstrating exactly what is wrong with EIS, where it fails, and why you should be very wary of demonstrations that always occur under perfect conditions. And this is a problem that is common across all EIS solutions.

A video is a series of photos. While some degree of motion blur in a video is desirable when presented as is, with all of the original movements, as humans we have become accustomed to blur correlating with motion — a subject is moving, or the frame is moving. We filter it out. You likely didn’t notice that a significant percentage of the frames were blurry messes (pause at random frames) in the original video, courtesy of the lower-light induced longer shutter times mixed with device movements.

Stabilize that video, however, and motion blur of a stabilized frame3 is very off-putting. Which is exactly what is happening above: Gallus is stabilizing the frame perfectly, but many of the frames it is stabilizing were captured during rapid motion, the entire frame marred with significant motion blur.

Frames blurred by imaging device movement are fine when presented in their original form, but are terrible when the motion is removed.

This is the significant downside of EIS relative to OIS. Where individual OIS frames are usually ideal under even challenging conditions, such as the fading sun of a dreary fall day, captured with the stability of individual photos, EIS is often working with seriously compromised source material.

Google added some image processing to make up for the lack of OIS for individual photos — taking a sequence of very short shutter time photos in low light, minimizing photographer motion, and then trying to tease out a usable image from the noise — but this isn’t possible when shooting video.

An EIS system could try to avoid this problem by using very short exposure times (which itself yields a displeasing strobe light effect) and wide apertures or higher, noisier ISOs, but ultimately it is simply a compromise. To yield a usable result other elements of image capture had to be sacrificed.

“But the Pixel surely does it better than your little app!” you confidently announce (though they’re doing exactly the same process), sitting on your Pixel order and hoping that advocacy will change reality. As someone who has more skin in this game than anyone heralding whatever their favorite device happens to have, I will guarantee you that the EIS stabilization in the Pixel will be mediocre to unusable in challenging conditions (though the camera will naturally be better than the Nexus 6p, each iteration generally improving upon the last, and is most certainly spectacular in good conditions).

Here’s a review of the iPhone 7 (shot with an iPhone 7), and I’ll draw your attention to the ~1:18 mark — as they walk with the iPhone 7, the frame is largely static with little movement, and is clear and very usable courtesy of OIS (Apple combines minimal electronic stabilization with OIS, but ultimately the question is the probability that static elements of the scene are capturing the majority of a frame’s shutter time on a fixed set of image sensors, and OIS vastly improves the odds). As they pan left, pause and view those frames. Naturally, given the low light, with significant relative movement of the scene it’s a blurry mess. On the Pixel every frame will be like this under that sort of situation presuming the absence of inhuman stability or an external stabilizer.

I’m not trying to pick specifically on the Pixel, and it otherwise looks like a fantastic device (and would be my natural next device, having gone through most Nexus devices back to the Nexus One, which replaced my HTC Magic/HTC Dream duo), but in advocating their “an okay compromise in some situations” solution, they went a little too far with the bombast. Claiming that OIS is just for photos is absurd in the way they intended it, though perhaps it is true if you consider a video a series of photos, as I observed at the outset.

A good OIS solution is vastly superior to the best possible EIS solution. There is no debate about this. EIS is the cut-rate, discount, make-the-best-of-a-bad-situation compromise. That the Pixel lacks OIS might be fine on the streets of San Francisco at noon, but it’s going to be a serious impediment during that Halloween walk, in Times Square at night, or virtually anywhere else where the conditions aren’t ideal.

The bar for video capture has been raised. Even for single frame photography any test that uses static positioning is invalid at the outset: It doesn’t matter if the lens and sensor yield perfect contrast and color if it’s only in the artificial scenario where the camera and subject are both mounted and perfectly still, when in the real world the camera will always be swaying around and vibrating in someone’s hands.

Subjects moving in frame of course will yield similar motion blur on both solutions, but that tends to be a much smaller problem in real world video imaging, and tends to occur at much smaller magnitudes. When you’re swaying back and forth with a fixed, non-OIS sensor, the entire frame is moving across differing capture pixels at a high rate of speed, versus a small subject doing a usually small in frame motion. They are a vastly different scale of problem.

The days of shaky cam action are fast fading, and the blurry cam surrogates are the hangers-on. The best option is a stabilized rig (but seriously). Next up is 5-axis optical image stabilization, and then its 3-axis cousin. Far behind is sensor-driven EIS. In last place are post-processing frame-versus-frame comparison options (they often falter in complex scenes, but will always be demoed with a far-off horizon line in perfect conditions, with gentle, low frequency movements).

Often on-camera OIS will be augmented with minimal EIS — usually during transition periods when OIS predicted the future wrong (to attempt to resolve the rapid catch-up), and also to deal with rolling shutter distortion.

To explain rolling shutter distortion: each line of the CMOS sensor is captured and read individually and sequentially, so during heavy movement the frame can skew, because the bottom of the scene was pulled from the sensor as much as 25ms after the first line (as you pan down things compress to be smaller, grow when panning up, and skew left and right during side movements). So during those rapid transition periods the camera may post-process to do some gentle de-skewing, with a small amount of overflow capture resolution to provide the correction pixels. Rolling shutter distortion is an interesting effect because it’s a pretty significant problem with every CMOS device, yet it didn’t become obvious until people started stabilizing frames.
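
The correction follows directly from that timing model: every row has its own effective capture time, and with that time you can look up the device orientation for that row and de-skew accordingly (a sketch; the readout duration is whatever your sensor reports, on the order of the 25ms mentioned above):

// Each sensor row is read out sequentially, so its effective capture time is
// offset from the start of the frame readout by its position in the frame.
static long rowCaptureTimeNs(long frameStartNs, long readoutDurationNs,
                             int row, int totalRows) {
   // Row 0 is read at frameStartNs; the last row roughly readoutDurationNs later.
   return frameStartNs + (readoutDurationNs * row) / totalRows;
}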

And to digress for a moment, the heroes of an enormous amount of technological progress over the past ten years are the simple, unheralded MEMS gyroscopes. These are incredible devices, driven by a fascinating principle (vibrating structures reacting to the Coriolis effect), and they’re the foundation of enormous technology shifts. They’re a very recent innovation as well. Years ago it was stabilized tank turrets that had this technology (okay, some murdery technology is pretty cool), courtesy of giant, expensive mechanical gyroscopes. Now we have cheap little bike-mount gimbals doing the same thing.

For the curious, here’s the unzoomed original corrections as applied by Gallus. It was a fun, technically challenging project, despite the frustrations and sleepless nights. It began with some whiteboard considerations of optics and how they work, then field of view, sensor sizes, offset calculations, and so on.

1 – Developing Gallus presented unnecessary challenges. Android lacks reliable event timing, though that has improved somewhat in recent iterations. A lot of the necessary imaging metadata simply didn’t exist, because Google had no need for it (and this is a hazard of sharecropping on someone else’s platform). As Google developed new interests, new functionality would appear that exposed a little more of the basic underlying hardware (versus starting with a logical analysis of what such a system would consist of and designing a decent foundation). The whole camera subsystem is poorly designed and shockingly fragile, and the number of Android handsets with terrible hardware is so high that building such an advanced, hardware-coupled application is an exercise in extreme frustration.

And given some cynical feedback, note that this post is not a plea for people to use that app (this is not a subtle ad), which should be obvious given that I start by telling you that the very foundation of EIS is flawed. I go through occasional spurts of updating the app (occasionally breaking it in the process), and having users bitching because an update notice inconvenienced their day kind of turned me off the whole “want lots of users” thing, at least for an app as “edge of the possible” as Gallus.

2 – While this post was ostensibly about the Pixel EIS claims, I was motivated to actually write it after seeing many of the comments on this video. That bike run, shot with a “Z1-ZRider2” actively stabilized gimbal (not a pitch for it — there are many that are largely identical) is beautifully smoothed, so it’s interesting to see all of the speculation about it being smoothed in post (e.g. via some frame-v-frame solution). Not a chance. If it was shot unstabilized or with EIS (which is, for the purpose of discussion, unstabilized) it would have been a disastrous mess of blurred foliage and massive cropping, for the reasons discussed, even under the moderate sun. Already objects moving closer to the lens (and in a frame relative sense faster) are heavily blurred, but the entirety of the scene would have that blur or worse minus mechanical stabilization.

There is an exaggerated sense of faith in what is possible with post-process smoothing. Garbage in = garbage out.

3 – One of my next technical challenge projects relates to this. Have a lot of cash and want to fund some innovation? Contact me.

h.265 (HEVC) Encoding and Decoding on Android

I periodically give some attention to Gallus1 (an awesome-in-every-way stabilized hyperlapse app for Android, doing what many said was impossible), tonight enabling h.265 (HEVC) support for devices surfacing hardware encoding/decoding for that codec, still packaging it in a traditional MP4 container.

The Nexus 6p, for instance, has hardware HEVC encoding/decoding via the Snapdragon 810 (one of its benefits over the 808); however, it was inaccessible for third-party developer use until the Android N (7.0) release. I had done trials of it through some of the 7.0 betas, but until recently it was seriously unstable. With the final release it seems pretty great.

h.265 is a pretty significant improvement over the h.264 (AVC) codec that we all know and love, promising about the same quality at half the bitrate. Alternately, much better quality at the same bitrate. It also features artifacts that are arguably less jarring when the compression is aggressive or breaks down. And on a small, mobile processor it’s encoded efficiently in real time at up to 4K resolutions.
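
A minimal sketch of spinning up a hardware HEVC encoder through MediaCodec (the 4K/30 resolution, bitrate, and Surface input are illustrative choices; real code also needs to feed input, drain output, and mux into the MP4 container via MediaMuxer):

import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.view.Surface;

// Configure a hardware HEVC encoder; the container handling is unchanged
// from the AVC path.
MediaFormat format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_HEVC, 3840, 2160);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
   MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
format.setInteger(MediaFormat.KEY_BIT_RATE, 20_000_000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

MediaCodec encoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_HEVC);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
// When feeding frames from the camera or a GL pipeline, the input Surface
// must be created between configure() and start().
Surface inputSurface = encoder.createInputSurface();
encoder.start();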

One aspect of Android development that might surprise many is just how fragile some of the subsystems of the platform are. If you fail to use the camera API in specific patterns and sequences (I am not talking about documented behaviors, but rather rudely discovered quirks of operation, where some sequences of events require literal imperative pauses, found through trial and error, to avoid defects), the entire camera subsystem will crash and be unrecoverable by any means other than a full restart. The same is true of the h.265 encoder and decoder on the Nexus 6p, and in implementing the new functionality I had to restart the device a dozen-plus times to recover from subsystem failures as I massaged the existing code to yield good behavior from the new codecs. Ultimately I find Android to be an incredible, amazing platform, but it remains surprising that so much of it is a perilous house of cards of native code behind a very thin API.

1 – I’ve never paid much attention to the Play Store listing, and it still features screenshots from the much more primitive (in UI, not in algorithmic awesomeness) initial version, and I’ve never made a tutorial or real demonstration (it is absolutely incredible on the Nexus 6p). It always seemed like there was one more thing I wanted to fix before I really made a deal out of it. But it is pretty awesome, and I’ve finally started committing serious time to finishing what I think are the “worth it” features that put it over the top. By the end of October.

And to answer a question that no one has asked (okay, one or two have asked it, including, recently, a major manufacturer whom I’m trying to entice into committing): yes, Gallus (the code, technology, git history, and optionally my participation in getting you to success with it) is available for a very reasonable price.