The Reports of HTML’s Death Have Been Greatly Exaggerated…?

Feedback

Yesterday's post titled "Android Instant Apps / The Slow, Inexorable Death of HTML" has accumulated some 35,000 uniques thus far, which was a surprise (when log files start ballooning, I notice). It has yielded feedback containing recurring sentiments that are worth addressing.

it is weird the article trying to sell the idea that apps are better posted XKCD images stating otherwise

While there are situations where a native app can certainly do things that a web app can’t, and there are some things it can simply do better, the prior entry wasn’t trying to “sell” the idea that apps are inherently better (and I have advocated the opposite on here and professionally for years where the situation merits). It was simply an observation of Google’s recent initiative, and what the likely outcome will be.

Which segues to another sentiment-

The reverse is happening. Hybrid apps are growing in number. CSS/JS is becoming slicker than ever.

The web is already a universal platform, so why the ████ would you code a little bit of Java for Android instead of writing it once for everything?

In the prior entry I mentioned that some mobile websites are growing worse. The cause of this decline isn’t that HTML5/JS/CSS or the related stack is somehow rusting. Instead it’s that many of these sites are so committed to getting you into their native app that they’ll sabotage their web property for the cause.

No, I don’t want to install your app. Seriously.

Add that the mobile web has seen a huge upsurge in advertising dark patterns. The sort of nonsense that has mostly disappeared from the desktop web, courtesy of the nuclear threat of ad blockers. Given that many on the mobile web don't utilize these tools, the domain is rife with endless redirects, popovers, intentionally delayed page re-flows to encourage errant clicks (a strategy that is purely self-destructive in the longer term, as every user will simply hit back, undermining the CPC), overridden swipe behaviors, background space turned into ad clicks, and so on.

The technology of the mobile web is top notch, but the implementation is an absolute garbage dump across many web properties.

So you have an endless list of web properties that desperately want you to install their app (which they already developed, often in duplicate, triplicate…this isn’t a new thing), and who are fully willing to make your web experience miserable. Now offer them the ability to essentially force parts of that app on the user.

The uptake rate is going to be incredibly high. It is going to become prevalent. And with it, the treatment of the remaining mobile webfugees is going to grow worse.

On Stickiness

I think it’s pretty cool to see a post get moderate success, and enjoy the exposure. One of the trends that has changed in the world of the web, though, is in the reduced stickiness of visitors.

A decade or so ago, getting a front page on Slashdot — I managed it a few times in its heyday — would yield visitors who would browse around the site, often for hours on end, subscribe to the RSS feed, etc. It was a very sticky success, and the benefits echoed long after the initial exposure died down. A part of the reason is that there simply wasn't a lot of content, so you couldn't just refresh Slashdot and browse to the next 10 stories while avoiding work.

Having had a few HN and Reddit success stories over the past while, I've noticed a very different pattern. People pop on and read a piece, their time on site equaling the time it takes to read to the end, and then they leave. I would say less than 0.5% look at any other page.

There is no stickiness. When the exposure dies down, it’s as if it didn’t happen at all.

Observing my own habits, this is exactly how I use the web now: I jump to various programming forums, visit the papers and entries and posts they link, and then click back. I never really notice the author, I don't bookmark their site, and I don't subscribe to their feed. The rationale is that when they have another interesting post, maybe it'll appear on the sites I visit.


This is just the new norm. It’s not good or bad, but it’s the way we utilize a constant flow of information. The group will select and filter for us.

While that's not a very interesting observation, I should justify those paragraphs: I believe this is the cause of both the growing utilization of dark patterns on the web (essentially you're to be exploited as much as possible during the brief moment they have your attention, and the truth is you probably won't even remember the site that tricked you into clicking six ads and sent you on a vicious loop of redirects), and the desperation to have you install their app, where they think they'll gain a more permanent space in your digital world.

Android Instant Apps / The Slow, Inexorable Death of HTML

Android Instant Apps were announced at the recent Google I/O. Based upon available information[1], Instant Apps offer the ability for website links to instead transparently open as a specific activity/context in an Android app, the device downloading the relevant app modules (e.g. only the specific fragments and activities necessary for that context) on demand.

The querystring app execution functionality already exists in Android. If you have the IMDB app, for instance, and open an IMDB URL, you will find yourself in the native app, often without prompting: from the Google Search app it is automatic, although on third party sites it will prompt whether you want to use the app or not, offering to always use the association.

www.imdb.com/title/tt0472954/

Click on that link in newer versions of Android (in a rendering agent that leverages the standard launch intents), with IMDB installed, and you’ll be brought to the relevant page in that app.

Instant Apps presumably entail a few basic changes-

  • Instead of devices individually having a list of app links (e.g. “I have apps installed that registered for the IMDB, Food Network and Buzzfeed domains, so keep an eye out for ACTION_VIEW intents for any of the respective domains“), there will be a Google-managed master list that will be consulted and likely downloaded/cached regularly. These link matches may be refined to a URL subset (where the current functionality is for a full domain).
  • An update to Android Studio / the build platform will introduce more granular artifact analysis/dependency slicing. Something like this already exists in that an APK is a ZIP of the various binary dependencies (e.g. one per target processor if you're using the NDK), resources, and so on; presumably the activities, classes and compiled resources will now be split into finer-grained modules, their dependencies documented.
  • When you open a link covered by the master list, the device will check for the relevant app installed. If it isn’t found, it will download the necessary dependencies, cache them in some space-capped instant app area, initialize a staged environment area, and then launch the app.

They promise support, via Google Play Services, all the way back to Android 4.1 (Jelly Bean), which encompasses 95.7% of active users. Of course individual apps and their activities may use functionality leveraging newer SDKs, and may mandate a newer minimum, so this doesn't mean that all instant apps will work on all 95.7% of devices.

The examples given include opening links from a messaging conversation, and from the Google Search app (which is a native implementation, having little to do with HTML).

The system will certainly provide a configuration point allowing a device to opt out of this behavior, but it clearly will become the norm. Google has detailed some enhanced restrictions on the sandbox of such an instant app — no device identification or services, for instance — but otherwise it utilizes the on-demand permission model and all of the existing APIs like a normal app (detailed here). As is always the case, those who don't understand this are fear mongering about it being a security nightmare, just as when auto app-updates were rolled out there were a number of "can you say bricked?" responses.

And to clear up a common misconception, these apps are not run “in the cloud”, with some articles implying that they’re VNC sessions or the like. Aside from some download reductions for the “instant” scenario (Instant Apps are apparently capped at 4MB for a given set of functionality, and it’s tough to understand how the rest of the B&H app fills it out to 37MB), the real change is that you’re no longer asked — the app is essentially forced on you by default — and it doesn’t occupy an icon on your home screen or app drawer. It also can’t launch background services, which is a bonus.

Unfortunately, the examples given demonstrate little benefit over the shared-platform HTML web — the BuzzFeed example is a vertical list of videos, while the B&H example’s single native benefit was Android Pay — though there are many scenarios where the native platform can admittedly provide an improved, more integrated and richer experience.

It further cements the HTML web as a second class citizen (these are all web service powered, so simply saying “the web” seems dubious). I would cynically suggest that the primary motivation for this move is the increased adoption of ad blockers on the mobile HTML web: It’s a much more difficult proposition to block ads within native apps, while adding uBlock to the Firefox mobile browser is trivial, and is increasingly becoming necessary due to the abusive, race-to-the-bottom behaviors becoming prevalent.

And it will be approximately one day before activities that recognize they’re running as instant apps start endlessly begging users to install the full app.

Ultimately I don’t think this is some big strategic shift, and such analyses are usually nonsensical. But it’s to be seen what the impact will be. Already many sites treat their mobile HTML visitors abusively: one of the advocacy articles heralding this move argued that it’s great because look at how terrible the Yelp website has become, which is a bit of a vicious cycle. If Yelp can soon lean on a situation where a significant percentage of users will automatically find themselves in the app, their motivations for presenting a decent web property decline even further.

1 – I have no inside knowledge of this release, and of course I might be wrong in some of the details. But I’m not wrong. Based upon how the platform is implemented, and the functionality demonstrated, I’m quite confident my guesses are correct.

Achieving a Perfect SSL Labs Score with C(++)

A good article making the rounds details how to achieve a perfect SSL Labs Score with Go. In the related discussion (also on reddit) many noted that such a pursuit was impractical: if you’re causing connectivity issues for some of your users, achieving minor improvements in theoretical security might be Pyrrhic.

A perfect score is not a productive pursuit for most public web properties, and an A+ with a couple of 90s is perfectly adequate and very robustly secure for most scenarios.

Striving for 100 across the board is nonetheless an interesting, educational exercise. The Qualys people have done a remarkable job educating and informing, increasing the prevalence of best practice configurations, improving the average across the industry. It’s worth understanding the nuances of such an exercise even if not practically applicable for all situations.

It’s also worth considering that not all web endpoints are publicly consumable, and there are scenarios where cutting off less secure clients is an entirely rational choice. If your industrial endpoint is called from your industrial management process, it really doesn’t matter whether Android 2.2 or IE 6 users are incompatible.

[screenshot: SSL Labs score]

So here's how to create a trivial implementation of a perfect-score HTTPS endpoint in C(++). It's wordier than the Go variant, though it's easy to parameterize and componentize for reuse. And as anyone who visits here regularly knows, in no universe am I advocating creating HTTPS endpoints in C++: I'm a big fan and (ab)user of Go, C#, Java, and various other languages and platforms, but it's nice to have the option available when appropriate.

This was all done on an Ubuntu 16.04 machine with the typical build tools installed (e.g. make, git, build-essential, autoconf), though of course you could do it on most Linux variants, OSX, Ubuntu on Windows, etc. This exercise presumes that you have certificates available at /etc/letsencrypt/live/example.com/

(where example.com is replaced with your domain; replace it in the code as appropriate, or make it an argument)

Note that if you use the default letsencrypt certificates, which are currently 2048 bits, the code below will still yield an A+ on the SSL Test; however, the score will be slightly imperfect, with only 90 for the key exchange. In practice a 2048-bit cert is considered more than adequate, so whether you sweat this and update to a 4096-bit cert is up to you (as mentioned in the Go entry, you can obtain a 4096-bit cert via the lego Go app, using the

--key-type "rsa4096"

argument).

1 – Install openssl and the openssl development library.

sudo apt-get update && sudo apt-get install openssl libssl-dev

2 – Create a DH param file. This is used by OpenSSL for the DH key exchange.

sudo openssl dhparam -out /etc/letsencrypt/live/example.com/dh_param_2048.pem 2048

3 – Download, build, and install libevent v2.1.5 "beta". Install as root and refresh the library cache (e.g. sudo ldconfig).

https://github.com/libevent/libevent/releases/tag/release-2.1.5-beta
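
A typical flow for that step, building from a checkout of the tagged release (a sketch only; autogen.sh will also want automake and libtool installed), looks something like-

git clone --branch release-2.1.5-beta --depth 1 https://github.com/libevent/libevent.git
cd libevent
./autogen.sh && ./configure
make
sudo make install
sudo ldconfig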

4 – Start a new C++ application linked to libcrypto, libevent, libevent_openssl, libevent_pthreads and libssl.
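
If you're building outside of an IDE, a minimal build line for a single source file (main.cpp is just an arbitrary name here) might look like-

g++ -std=c++11 -O2 main.cpp -o https_server -levent -levent_openssl -levent_pthreads -lssl -lcrypto -pthread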

5 – Add the necessary includes-

#include <cstdlib>
#include <iostream>

#include <openssl/ssl.h>
#include <openssl/err.h>
#include <openssl/rand.h>
#include <openssl/stack.h>
// used by the ECDHE/DH setup below; pulled in transitively by ssl.h on many
// builds, but explicit is safer
#include <openssl/ec.h>
#include <openssl/dh.h>
#include <openssl/pem.h>

#include <event.h>
#include <event2/listener.h>
#include <event2/bufferevent_ssl.h>
#include <evhttp.h>

6 – Initialize the SSL context-

SSL_CTX *
ssl_init(void) {
    SSL_CTX *server_ctx;

    SSL_load_error_strings();
    SSL_library_init();

    if (!RAND_poll())
        return nullptr;

    server_ctx = SSL_CTX_new(SSLv23_server_method());

    // Load our certificates
    if (!SSL_CTX_use_certificate_chain_file(server_ctx, "/etc/letsencrypt/live/example.com/fullchain.pem") ||
            !SSL_CTX_use_PrivateKey_file(server_ctx, "/etc/letsencrypt/live/example.com/privkey.pem", SSL_FILETYPE_PEM)) {
        std::cerr << "Couldn't read chain or private key" << std::endl;
        return nullptr;
    }

    // prepare the PFS context: a strong curve for the ECDHE key exchange
    EC_KEY *ecdh = EC_KEY_new_by_curve_name(NID_secp384r1);
    if (!ecdh) return nullptr;

    if (SSL_CTX_set_tmp_ecdh(server_ctx, ecdh) != 1) {
        EC_KEY_free(ecdh);
        return nullptr;
    }
    EC_KEY_free(ecdh); // the context keeps its own copy

    // load the DH parameters generated in step 2
    bool pfsEnabled = false;
    FILE *paramFile = fopen("/etc/letsencrypt/live/example.com/dh_param_2048.pem", "r");
    if (paramFile) {
        DH *dh2048 = PEM_read_DHparams(paramFile, NULL, NULL, NULL);
        if (dh2048 != NULL) {
            if (SSL_CTX_set_tmp_dh(server_ctx, dh2048) == 1) {
                pfsEnabled = true;
            }
            DH_free(dh2048); // the context keeps its own copy
        }
        fclose(paramFile);
    }

    if (!pfsEnabled) {
        std::cerr << "Couldn't enable PFS. Validate DH Param file." << std::endl;
        return nullptr;
    }
    
    SSL_CTX_set_options(server_ctx,
            SSL_OP_SINGLE_DH_USE |
            SSL_OP_SINGLE_ECDH_USE |
            SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3 | SSL_OP_NO_TLSv1 | SSL_OP_NO_TLSv1_1);

    if (SSL_CTX_set_cipher_list(server_ctx, "EECDH+ECDSA+AESGCM:EECDH+aRSA+AESGCM:EECDH+ECDSA+SHA384:EECDH+ECDSA+SHA256:AES256:!DHE:!RSA:!AES128:!RC4:!DES:!3DES:!DSS:!SRP:!PSK:!EXP:!MD5:!LOW:!aNULL:!eNULL") != 1) {
        std::cerr << "Cipher list could not be initialized." << std::endl;
        return nullptr;
    }

    return server_ctx;
}

The most notable aspects are the setup of PFS, including a strong, 384-bit elliptic curve. Additionally, deprecated transport options are disabled (in this case anything under TLSv1.2), as are weak ciphers.

[screenshot: negotiated protocols and cipher suites]
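
To spot check the transport restrictions locally (not a substitute for the SSL Labs test itself), openssl's s_client is handy: with the configuration above the first command should fail to negotiate, while the second should connect with one of the permitted ECDHE suites-

openssl s_client -connect example.com:443 -tls1_1

openssl s_client -connect example.com:443 -tls1_2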

7 – Prepare a libevent callback that attaches a new SSL connection to each libevent connection-

struct bufferevent* initializeConnectionSSL(struct event_base *base, void *arg) {
    struct bufferevent* r;
    SSL_CTX *ctx = (SSL_CTX *) arg;
    r = bufferevent_openssl_socket_new(base,
            -1,
            SSL_new(ctx),
            BUFFEREVENT_SSL_ACCEPTING,
            BEV_OPT_CLOSE_ON_FREE);
    return r;
}

8 – Hook it all together-

int main(int argc, char** argv) {
    SSL_CTX *ctx;
    ctx = ssl_init();
    if (ctx == nullptr) {
        std::cerr << "Failed to initialize SSL. Check certificate files." << std::endl;
        return EXIT_FAILURE;
    }

    auto base = event_base_new();
    if (!base) {
        std::cerr << "Failed to init libevent." << std::endl;
        return EXIT_FAILURE;
    }
    auto https = evhttp_new(base);

    void (*requestHandler)(evhttp_request *req, void *) = [] (evhttp_request *req, void *) { 
         auto *outBuf = evhttp_request_get_output_buffer(req); 
         if (!outBuf) return; 
         switch (req->type) {
            case EVHTTP_REQ_GET:
                {
                    auto headers = evhttp_request_get_output_headers(req);
                    evhttp_add_header(headers, "Strict-Transport-Security", "max-age=63072000; includeSubDomains");
                    evbuffer_add_printf(outBuf, "<html><body><center><h1>Request for - %s</h1></center></body></html>", req->uri);
                    evhttp_send_reply(req, HTTP_OK, "", outBuf);
                }
                break;
            default:
                evhttp_send_reply(req, HTTP_BADMETHOD, "", nullptr);
                break;
        }
    };

    // add the callbacks
    evhttp_set_bevcb(https, initializeConnectionSSL, ctx);
    evhttp_set_gencb(https, requestHandler, nullptr);
    auto https_handle = evhttp_bind_socket_with_handle(https, "0.0.0.0", 443);
    if (!https_handle) {
        std::cerr << "Failed to bind to port 443 (root or CAP_NET_BIND_SERVICE required)." << std::endl;
        return EXIT_FAILURE;
    }

    if (event_base_dispatch(base) == -1) {
        std::cerr << "Failed to run message loop." << std::endl;
        return EXIT_FAILURE;
    }

    return 0;
}

Should you strive for 100? Maybe not. Should you even have SSL termination in your C(++) apps?  Maybe not (terminate with something like nginx and you can take advantage of all of the modules available, including compression, rate limiting, easy resource ACLs, etc). But it is a tool at your disposal if the situation is appropriate. And of course the above is quickly hacked together, non-production ready sample code (with some small changes it can be made more scalable, achieving enormous performance levels on commodity servers), so use at your own risk.

Just another fun exercise. The lightweight version of this page can be found at https://dennisforbes.ca/index.php/2016/05/23/achieving-a-perfect-ssl-labs-score-with-c/amp/, per “Hanging Chads / New Projects / AMPlified“.

Note that this is not the promised "Adding Secure, Authenticated HTTPS Interop to a C(++) Project" piece, which is still in the works. That undertaking is more involved, covering secure authentication and authorization, custom certificate authorities, and client certificates.

Disappearing Posts / Financing / Rust

While in negotiations I have removed a few older posts temporarily. The “Adding Secure, Authenticated HTTPS Interop to a C(++) Project” series, for instance.

I can’t make the time to focus on it at the moment and don’t want it to sit like a bad promise while the conclusion awaits (and for technical pieces I really try to ensure 100% accuracy which is time consuming), and will republish when I can finish it. I note this given a few comments where helpful readers thought some sort of data corruption or transactional rollback was afoot. All is good.

Rust

Occasionally I write things on here that lead some to inaccurately extrapolate more about my position. In a recent post, for instance, I noted that Rust (the systems language) seems to be used more for advocacy — particularly of the "my big brother is tougher than your big brother" anti-Go sort — than for creating actual solutions.

This wasn't a criticism of Rust. So it was a bit surprising when I was asked to write a "Why Go demolishes Rust" article (paraphrasing, but that was the intent) for a technical magazine.

I don’t think Go demolishes Rust. Rust is actually a very exciting, well considered, modern language. It’s a bit young at the moment, but has gotten over the rapid changes that occurred earlier in its lifecycle.

Language tourism is a great pursuit for all developers. Not only do we learn new tools that might be useful in our pursuits; at a minimum we'll look at the languages we use and leverage daily in a different way, often coming to understand their design compromises and benefits through comparison.

I would absolutely recommend that everyone give Rust a spin. The tutorials are very simple, the feedback fast and rewarding.

Selling Abilities

When selling oneself, particularly in an entrepreneurial effort where you’re the foundation of the exercise and your abilities are key, you can’t leverage social lies like contrived self-deprecation or restraint. It’s pretty much a given that you have to be assertive and confident in your abilities, because that’s ultimately what you’re selling to people.

This doesn't mean claims of infallibility, or some sad demonstration of the Dunning–Kruger effect. It means, instead, that you have a good understanding of what you are capable of doing based upon empirical evidence, and are willing and hoping to be challenged on it.

A few days ago I had to literally search whether Java passes array members by reference or value (it was a long day of jumping between a half dozen languages and platforms). I’m certainly fallible. Yet I am fully confident that I can quickly architect and/or build an excellent implementation of a solution to almost any problem. Because that’s what my past has demonstrated.

Generally that goes well. Every now and then, however, I've encountered someone who is so offended by pitch confidence that, without bothering to know a thing about me, my accomplishments, or taking me up on my open offer to demonstrate it, they respond negatively or dismissively. This seems to be particularly true among Canadians (I am, of course, a Canadian; this country has a widely subscribed-to crab mentality, with a "who do you think you are?" sort of natural reaction among many). Not all Canadians by any measure, but enough that it becomes notable when you regularly deal with people from other countries and start to notice that stark difference.

Beautiful Code

All code is born ugly.

It starts disorganized and inconsistent, with overlaps and redundancies and gaps.

We begin working it into an imperfect solution for an often poorly defined problem.

As we start building it up like clay, a solution takes form. The feedback guides us in moving, removing and adding material. It allows us to add and remove details. We learn from our mistakes.

As we iterate, the problem itself becomes clearer. We come to view the problem through the optics of possible solutions.

Every project follows this path. All code is born ugly. This is the technical debt that every project incurs in those early days, and that is only paid off through iterations. For many projects the final form is an elusive goal that's always just out of grasp.

Occasionally someone will believe that they have so much experience that they can circumvent the ugly code step. Extensive up-front design, planning, standards and guidelines. Start as a swan.

This yields the ugliest code of all. It yields the poorly suited, overly abstracted solutions that solidify like concrete, forever ill-suited for the problem because the feedback loop was circumvented. Grotesquely overwrought solutions for the most trivial of problems, enormous line-counts of boilerplate, unoriginal code for the most banal of needs. These are projects that attempted to begin as swans, but instead became the ugliest ducklings of all.

Intel’s Decelerating Mobile Push -or- Maybe Bet Against Intel?

A year and a half ago I wrote an entry on here regarding Intel in the mobile space. The argument was basically that Intel was finally getting their stuff together, and the market had gotten ready for Intel and x86[1] (as well as x86_64) to be a fully supported platform.

From Unity to the NDK to AVDs, Intel is now a first-class platform on Android.

But the industry runs at a very different cost and profit model from what Intel was accustomed to. The highest-end ARM SoCs run from $30 – $70 per unit. Intel has long lived in a world where their solutions net hundreds to thousands of dollars per unit. But the market changes, and the ARM world isn't going away if Intel just looks the other way.

Yet Intel seems to have just killed off their aspirations for the market. Their intentionally sabotaged Atom solutions are being bested by small competitors, and they can’t make the finances work.

Bizarre. I find it hard to believe, especially given that Intel has made significant noise about targeting the IoT market. The conclusion people are drawing from Intel killing off the mobile Atom devices and a noncompetitive radio chipset — that Intel is crawling back into its desktop and server processor shell, ceding defeat — seems highly unlikely.

More likely, I would guess that Intel is going to follow Nvidia’s lead, as there’s no way they’re simply giving up on mobile devices. Nvidia once had separate mobile and desktop engineering, with the duplicated costs that entailed, but with their Maxwell chipset the same designs, architectures and processes are used on both sides of the fold.

I expect Intel to pursue the same approach, simply scaling up and down their common contemporary core to all needs. There are Skylake processors available right now with a TDP of 7.5W (which is the going range for tablet SoCs). Core M processors with a TDP below 4W. The Atom processors didn’t serve a particular need beyond being sabotaged just enough that they didn’t threaten the more expensive markets. That approach doesn’t work anymore.

1 – As an aside, it’s impossible to discuss x86(_64) without someone confidently announcing that it’s a derelict bad design that deserves to die, etc, carrying on an argument from literally the late 1980s to the early 1990s. This betrays a general ignorance about the state of x86_64 vs ARM64, or the enormous complexities of modern ARM chips (with absolutely staggering transistor counts). They’re both great solutions.

Email Addresses Need A Checksum

I get other people’s email.

I grabbed a Google account very early, in the invitation days, and got the first.last@gmail.com gold standard (which by Google rules means I also have firstlast@gmail.com, fir.st.l.a.s.t@gmail.com, etc. These derivatives can be powerful, but often just confuse people).

Since then I’ve gotten thousands of emails intended for other people. From grocery stores. Art dealers. Hairdressers. Car rental agencies. Hoteliers. Flight itineraries. School newsletters and personal appeals. Square receipts. Alumni groups.

Where possible, when email is sent by a real human being and not a black-hole noreply source, I try to alert people to update their addresses, though it’s surprising how often the issue repeats anyways.

All of these were presumably intended for people sharing variations of my name (e.g. Denis), or with the same name but who had to resort to some sort of derivative such as firstMlast@gmail.com.

Many of the errant emails have privileged or time sensitive information, and a lot of them are actionable.

Square receipts allowing me to rate the retailer and leave feedback, alongside some CC details. Hotel reservations that allow me to cancel or change the reservation with absolutely no checks or controls beyond that the email is in hand. Rewards cards through which I can redeem or transfer points.

Some have highly personal, presumably confidential information.

In many if not most of these cases the email address was likely transmitted verbally[1]. To the retailer, grocery store clerk, or over a reservation phone line to a travel agent or hotel representative. Alternately it might have been entered on some second-screen device (my iCloud account receives the email for more than one stranger's Facebook account).

For a vanity domain it usually means it goes to some ignored catch-all, but on a densely populated host like gmail it yields deliveries of possibly sensitive data to the wrong people, as almost every variation is occupied.

Email addresses should have a checksum. A simple mechanism through which human beings can confirm that information was conveyed properly. Even the most trivial of checksums would provide value, eliminating the vast majority of simple mistakes.

For instance, calculate a CRC32 over a variety of email address derivatives and display, in base32 (32 possible digits, so 5 bits per digit, whereas the 32 in CRC32 refers to bits), the bottom 5 bits of the most significant byte and then of the least significant byte (a totally arbitrary scheme, but sound; this is extremely trivial in a world where launch vehicles are landing on floating barges). This would yield-

first.last@gmail.com EW
first.lst@gmail.com 6U
firstMlast@gmail.com XM
frst.last@gmail.com ZS

“My email address is f  i  r  s  t   period   l a s t @ g m a i l . c o m”

“Okay, got it. 6U?”

“Nope, I must have misspoken. Let me restate that – … ”

“Okay, got it. EW?”

“Perfect!”

(and of course every user would quickly know and remember their checksum. This wouldn’t be something the user is calculating on demand)

When I’m forced to use my atrophied hand-writing to chicken scratch an email address on a form, a simple two digit checksum should yield a “go / no go” processing of the email address: If it isn’t a valid combination (whether because the email address or the checksum aren’t being interpreted correctly), contact me to verify, and certainly don’t start sending sensitive information.

Two digits of base32 yield 10 bits of entropy, or 1024 variations. Obviously this is useless against intentional collisions, but against accidental data "corruption" it would catch errors 99.9%+ of the time.
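
As a concrete sketch of that scheme (this one uses zlib's crc32 and the RFC 4648 base32 alphabet, both arbitrary choices on my part, so its output won't necessarily match the example digits above)-

// Hypothetical two-digit email address checksum. Build: g++ -std=c++11 checksum.cpp -lz
#include <iostream>
#include <string>
#include <zlib.h>

std::string emailChecksum(const std::string &address) {
    // Any agreed-upon 32-character alphabet would do; assumes the address
    // has already been lower-cased (see the aside below).
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    uLong crc = crc32(0L, Z_NULL, 0);
    crc = crc32(crc, reinterpret_cast<const Bytef *>(address.data()), address.size());
    // Bottom 5 bits of the most significant byte, then of the least significant byte.
    std::string digits;
    digits += alphabet[(crc >> 24) & 0x1F];
    digits += alphabet[crc & 0x1F];
    return digits;
}

int main() {
    std::cout << "first.last@gmail.com " << emailChecksum("first.last@gmail.com") << std::endl;
    std::cout << "frst.last@gmail.com " << emailChecksum("frst.last@gmail.com") << std::endl;
    return 0;
}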

Technical Aside: Email addresses can theoretically contain mixed case, but in practice the vast majority of the email infrastructure is case-insensitive.

The Pragmatic Footer

Gmail and the other vendors aren’t going to start displaying email address checksums. Forms and retailers and Square aren’t going to start changing their apps and forms to capture or display email data entry checksums.

As with prior "improve the system" exercises, this is more a theoretical discussion of concepts that we regularly encounter. While it doesn't help for telephone exchanges, more data transfer should be happening via NFC or temporary QR codes than being verbally relayed.

It's a fun thought exercise to go back and think of how the system could have been improved from the outset, given the reality that information transfer is often human and thus imperfect. For instance, all email addresses could have a standardized checksum suffix – first.last+EW@gmail.com. Or whatever.

If you develop a system where humans verbally or imperfectly transmit information, and it’s important that it is stated and understood correctly, consider a checksum.

1 – I had a speech impediment as a young child, courtesy of a Jamie Oliver-esque mega tongue that was trying to escape the confines of my mouth. This made me more aware of the general sloppiness of verbal data transmissions as a problem, later noticing that it’s a fairly universal issue.

Code: It’s Trivial

Everyone is going crazy about a purported $1.4 million random arrow app for the TSA. It didn't take long before a developer "duplicated" it in 10 minutes. With some practice they could easily get it down to twenty seconds.

$252 million an hour!

Not that such a demonstration means much. Developers can whip up a veneer facsimile of almost anything not overly computationally complex in short order. I could spin out a superficial Twitter "clone" in a few hours. Where's my billions in valuation?

As Atwood said a few years ago (as everyone declared how easily they could make Stack Overflow clones) – Code: It's Trivial (his article making my choice of title trivial). The word trivial is used and abused in developer circles everywhere, deployed to easily deride almost every solution, each of us puffing up our chests and declaring that we could totally make Facebook in a weekend, Twitter in an afternoon, and YouTube the next morning. We could make the next Angry Birds, and with Unity we could totally storm the market with a new 3D shooter if we wanted.

Because it’s all trivial. We could all do everything with ease.

It later turned out the app itself actually cost $47,000, which is still a pretty good chunk of change for such a seemingly simple app. Only $8,460,000 per hour.

But the amount of time spent in the IDE is close to irrelevant, as anyone who has ever worked in a large organization knows. These sorts of exercises are universally nonsensical. This method of evaluating the cost of a solution is pure nonsense.

I’m not defending the TSA, their security theater, the genesis or criteria for this app, or even saying that it isn’t trivial — by all appearances it seems to be. But knowing that the TSA decided that this is what they were going to do, $47,000 doesn’t sound particularly expensive at all.

Some senior security guy didn’t say “We need x. Do x.” and a day later they had an arrow app. As two large organizations they most certainly had planning meetings, accessibility meetings. They likely argued aesthetics of arrows. They put in checks and conditions to lock the user in the app. They likely allow for varying odds ratios (total conjecture on my part, but I doubt it was a fixed 50:50, and likely had situational service-based variations depending upon overrides for manpower restrictions), etc. Still not in any universe a significant application, but the number of things that people can talk about, question, probe, and consider grows exponentially. The number of possible discussions explodes.

Then documentation, training material (yes, line-level workers really need to be trained on all software), auditing to ensure it actually did what it said it did (developers regularly mess up things as simple as "random number" usage), etc.
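
To illustrate the sort of subtle issue such an audit looks for (a hypothetical example, not anything from the actual app, which I have never seen): reseeding on every draw is a classic way to end up with a "random" arrow that isn't-

#include <cstdlib>
#include <ctime>
#include <random>

// Hypothetical bug: reseeding per call means every draw within the same
// second returns the same arrow.
bool arrowLeftBuggy() {
    srand(static_cast<unsigned>(time(nullptr)));
    return rand() % 2 == 0;
}

// Seed once, then draw; <random> makes the 50:50 intent explicit.
bool arrowLeft() {
    static std::mt19937 gen{std::random_device{}()};
    static std::bernoulli_distribution coin(0.5);
    return coin(gen);
}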

In the end, $47,000 for a piece of software deployed in an enormous organization, in a security capacity… I'm surprised that the floor for something like this isn't a couple of orders of magnitude higher.

Nothing — nothing — in a large organization is trivial. Nothing is cheap. Ever.

The Full Stack Developer / Computer (Un)science

I have nothing technically interesting to discuss right now, but various half-finished, lazily conceived thought essays have sat around for a few weeks, so here they are in somewhat rough form. These are not viable for general consumption — being wordy and contentious and not terribly interesting — and are not intended for social news (if you found your way here from social news, click back and maintain your innocence), but are contemplation pieces for existing readers looking for a bit of time filler.

As with most narrative-style content on here, this is very subjective. I expect that many disagree with some of the suppositions, and of course welcome disagreement. You can even send me an email if so inclined.

The Full Stack Developer

Small shops and rapid growth startups need full-stack developers.

They need people who can ply the craft, with motivation and self-direction, all the way from the physical hardware (including VM provisioning and automation as appropriate), to the networking, OS, application server, database, API and presentation level, selecting technologies and securing it all appropriately. They need practitioners who might jump from building some Go server-side business logic — clustering the backing databases and applying appropriate indexes and optimizations to maximize the TPS — to working on the Angular presentation JavaScript, to building some Puppet automated deployment scripts and researching new AWS products to see how they impact the plans, then rebuilding nginx from source to include some custom updates to a module, installing some SSL certs configured for an A score and PFS, and adding some SIMD to calculation logic with dedicated source variations for NEON, AVX and SSE. Maybe they'll jump into Xcode to spin some Swift for the new iOS app, and then develop some ETL tasks to pull in import data. Then they relax and collaborate on some slides for an investor proposal until they're pulled aside to integrate DTLS into an Android app that was using unsecured UDP.

Some definitions of “full stack” add or remove layers — at some shops, simply being able to make a stored procedure in addition to the web app qualifies — but the basic premise is that there isn’t the traditional separation of concerns where each individual manages one small part of their project. Instead everyone does essentially everything.

Full Stack Rock Pile

Having full stack developers is absolutely critical when the headcount is small (or if you’re doing your own venture) and you just need to Get Things Done. You need people who aren’t constantly spinning while waiting for various other people to do things that they depend upon but can’t do for themselves, throwing their arms up under the comforting excuse of Someone Else’s Problem.

But doing full stack development in practice sucks. It is sub-optimal on the longer timeline. It’s an emergency measure that should not continue for longer than absolutely necessary. It can be a crutch for staffing issues.

Each context switch imposes a massive productivity penalty. It’s bad enough when you’re working on some presentation code and you get distracted by a meeting, losing your train of concentration and mental context. Now consider the penalty jumping from presentation code to an entirely different language, platform, and paradigm to put out a small fire, then returning back to where you were.

When voluntary and controlled, these sorts of project gear shifts can be hugely rewarding and beneficial (many developers hate working constantly on the same thing, growing malaise and carelessness from boredom), but these redirections are productivity sapping and stressful when they regularly occur as crises or due to poor planning, or simply because the headcount is so small that you have no other choice. It may be critical to the firm, but it’s detrimental to focused productivity.

Such is the life of the full stack developer.

I’ve done the full stack thing for much of my career. Not always because the headcount was small (though I do prefer to work on small teams), but sometimes just due to knowledge gaps elsewhere in organizations: When software developers flame out they often get moved to network administration / engineering, or database administration. With the right motivation and aptitude it can be a great fit, but often it’s just avoiding some temporary unpleasantness by making everything more of a hassle for everyone else as someone acts as a gatekeeper for a role they’re ill-suited for and unmotivated to learn competency in.

When the "DBA" only knows how to back up and restore a database, but imposes themselves as overhead on every activity purely out of turf defensiveness, and the network admin has a seeming strategic incompetence forcing you to do everything yourself (being the lead architect of an organization, with the top salary in the firm… fixing Exchange issues), it's just bad for everyone. Everything gets a little worse and a lot slower.

Being a full stack developer imposes an enormous amount of overhead trying to stay on top of everything covering the entire domain, and it’s simply impossible to know it all comprehensively. So every area, by the simple limits of time, will be compromised to some degree. Many “full stack” implementations have betrayed this reality, with security rife with amateur mistakes, or a gross misuse of the database (which is incredibly common. If your database is a recurring problem point and you’re searching for silver bullet fixes, it’s more likely the developers that are the weakness).

There is no way I can coerce and optimize AWS as much as someone focused on it. Or monitor and tune and finesse the database to the same completeness of someone dedicated to doing that in a project where it might comprise 5% of my time. Nor can I spend the time to analyze every facet of security that someone focused purely on intrusion evaluations can. There is only so much attention and mental focus to go around.

While a developer tasked with the API might build out a well planned, focused, coherent API, the full stack developer is busy adding things on a need basis, tightly coupled to their specific immediate needs.

And forget about documentation. Or comprehensive tests. Did you notice I just automated the cluster deployments and their monitoring, configured the IPSec intra-machine security, and built the shared-memory module for nginx? And then I dealt with the consequences of the messaging server failing, after migrating the presentation code from Angular to Angular 2. And you’re bugging me about documentation? Let me just finish up that disaster recovery solution first.

I do this because I have to, and because I get paid to do it (often on projects at a stage where a very rapid but focused build-out is necessary), and it's a necessity of circumstances unique to the roles I fill. But on a long enough timeline (the one on which the survival rate for everyone drops to zero) you should have dedicated people focused on getting the most out of domains as granular as possible, the focus intensifying as the team grows. If you have two dozen generalists all doing everything, you chose the wrong path somewhere along the way.

Though let me add the caveat that as people specialize, they must become actual experts (in knowledge and practice and not just in theory) who provide service levels and responsiveness. Not just flamed out dregs filling a seat, acting as conceptual gatekeepers while imposing lengthy useless delays.

Oh how I dream of having DBAs who would actually alert me to database hot spots or suggest optimal indexes, partitioning schemes and usage patterns that would improve performance. Or of having some security experts actually kick the tires and look at the protocols in depth and give a better sense of comfort, instead of just that clichéd "some guy who earned the benefits of the Dilbert principle and now imposes multi-week delays on your project because they're the security `gatekeeper', but whose analysis will be so superficial that it adds no utility or value" (true story! That was at a large banking group, as an aside, and was the glorious cargo cult illusion of security. The more onerous and inconvenient, the more the illusion of security was realized).

Get actual experts, specialized in their domains, working together on a common solution, with common motivations. Not generalists focused on making their presence known in minor turf wars.

Having said all of that, some shops ask for full stack developers when they actually want you to specialize. Meaning that they want developers to understand the workings of modern hardware, how the operating system functions, how the network, database, proxy server, application server, and all of the other parts of the platform work. And then they want you to focus on your specific domain and solution with that knowledge in the back of your mind, considering cache levels and their impact on performance, the overhead of I/O and network communications, and how UDP and TCP and sliding windows impact your work. How to make vectorizable code, and how what you're doing impacts other projects, etc. The basics of the major facets of encryption (symmetric, asymmetric, elliptic curve versus RSA, the modes of each, etc). Facebook is the poster child for demanding this, and I have absolutely no criticisms or complaints with that. Their full stack developers aren't really full stack developers.

An expectation that developers understand the consequences of their design choices, based upon a good knowledge of the platforms and systems they’re developing on, should be universal.

Computer Unscience

The nutrition and software development fields have a lot in common.

In both, flawed/incomplete dogma makes the rounds and headlines. "Studies"[1] — often agenda-driven — that show some correlation or Hawthorne effect are held up as critical proofs that change everything.

We want quick fixes, loosening our skepticism in their pursuit. Something that we can adopt and quickly become a competent manager, 10x programmer or team, eradicate all errors and security concerns, clear blemishes, lose weight, have more energy, and eliminate those persistent headaches.

The easiest way to find yourself in someone's favor often is to parrot their current quick-fix beliefs. "Couldn't have said it better myself!" they'll exclaim, declaring you the smartest person they know — barely concealed self-congratulation — because you support their current notions about NoSQL, Rust[2], gluten or fat. Whether it's their fervent advocacy of TDD, or of the evils of carbohydrates, the same ego-driven, "the more I believe and the more I advocate, the more it's true!" flawed motive comes into play.

People in the sales industry know how to exploit this mimicry effect well, and it's one aspect of the consulting world (where sales are a fundamental element of the role) that I find most unpalatable: many people seek outside assistance primarily to confirm their beliefs, often while empire building or to position allies in internal turf wars.

The hiring process in many firms has sadly been diminished to a group of people with their current set of pseudo-science beliefs and cargo cult behaviors searching for someone who aligns with their biases. Who hasn’t sat in an interview where a coworker repeatedly asks specific trivia about some technology, philosophy or dogma that they very recently adopted, looking for validation of some new Belief structure?

You can usually determine the current trends by following the tech social news sites for a week or so. Then wait for the "boy, that really wasn't a silver bullet!" follow-up cycle of blog posts a year later as trends come in and out of favor, the early adopters' euphoria turning into "a period of wallowing in the depths of cynicism" (taken from James Bezdek's glorious editorial in IEEE Transactions on Fuzzy Systems), just as superfoods and macronutrients and dietary evils fall in and out of favor in the world of nutrition, as waves of converts to various trends then seek to vilify them to explain their personal failure.

What ground up ancient berry should you be mainlining today? What methodology or language or tooling is going to turn your team into superstars?

A 20 Year Comparison

My 11-year-old son — who has been gaining competence in C# and JavaScript via the Unity platform for a couple of years now, motivated by the urge to create fun things for and with his friends — recently asked me what has changed in software development over my career: what innovations and progress have shot the field forward, making plying the craft today different from then.

The silver bullets[PDF], so to speak.

So I sat in a darkened room, Beethoven’s Piano Concerto No. 5 quietly playing in the background, contemplating a 20 year contrast (which was pretty much when I entered the industry as a professional developer).

2016 versus 1996, from the perspective of widespread software development practices during those two periods (e.g. that something was used in a university somewhere, or was a nascent technology or methodology or approach, made it a non-factor in the 1996 consideration). I am not considering niche development fields, so how software is developed for NASA, the military, nuclear power plants or for unique snowflake projects. Nor am I considering “sweatshop” style development where some low complexity project is farmed out across hordes of low cost, often low skill factory-style development groups. These have unique needs and patterns, and are not the subject of this conversation.

I should also explain that none of this is motivated by resistance to change or "all one has is a hammer" motives: over the years I've utilized many of the innovations hyped at the time, but at a later point realized (and continue to realize) that everything old is new again, and that this is an industry of perpetual hype cycles. Object-oriented, aspect-oriented, CASE, UML, every ERD variant, Slack, DI, TDD, IRC, XP, pair programming, standing desks, office work, remote work, open plan, private offices, functional programming, COM/DCOM/CORBA, SOAP, document-oriented, almost every variation of RDBMS and NoSQL solution, and on and on and on. I've plied them all.

So what are the factors that, in my personal opinion, really changed the field? If I were to compare work practices in 1996 versus today, what would be the things that stand out the most? The things that I would miss the most if I forced myself into a re-live-1996 programming exercise?

I’m excluding platforms as that’s a wholly separate discussion, irrelevant in this context, so of course I would miss Android and iOS and Windows 10 and modern Linux and LVM and HyperV and all of the related technologies that greatly enhance our pursuit of excellence. Here I’m talking purely about the process of crafting software, and the tools and techniques that we use.

The Things I Would Miss Most

  • Source control
  • The internet
  • The scope and quality of libraries available
  • Free tooling
  • Concurrency and thread safety

Source control has existed in some form for many decades (the best known earlier iteration being SCCS), but didn’t become widespread until closing on the turn of the century. Prior to this, many teams and individuals used a shared folder of the current source (and sadly some still do!), occasionally creating a point-in-time archive.

Source control is the enabler of collaboration, and the liberator of change. Even when working as an individual developer, source control frees us from the paranoia that we’re always on the precipice of destroying our project, allowing us at any moment to investigate what happened when, understanding the creeping change in our creations. I check in frequently, and it is the foundation that enables very rapid progress, and the metadata to recall the motives and intentions of my prior activities.

The widespread adoption of source control hugely enhanced productivity, accountability and quality across the industry. We've gone through several dominant tools during that period (SCCS, RCS, CVS, SourceSafe, Subversion, TFS, Hg, git), and while incremental improvements bring massive advances to certain types of work (e.g. Linux kernel scale of projects), the general value was there from early on.

The internet brought obvious benefits because it allowed for close to real-time collaboration with peers across the industry, whether via Usenet newsgroups or, more recently, on sites like StackOverflow. A world of documentation and libraries and code examples came available at our fingertips (which I could contrast with a giant stack of Visual Studio manuals I started with, memorization of every API a requirement to have any sort of velocity).

Of course the Internet existed in 1996, but the ability to find people who’ve faced the same unique problem set quickly, and to learn from and adopt their discoveries, is an enormous productivity boost. Projects could get hung up on minor issues for days to weeks — some unloved Usenet newsgroup post lying ignored for weeks — where now it’s often seconds away from a fix.

Libraries allow us to develop on the backs of giants. I specifically say libraries rather than frameworks because the former is pure gain, while the latter is often much more nuanced, and the gains are often hard to qualify. Many frameworks exist primarily as a structural approach rather than beneficial code (e.g. libraries are steroids. Frameworks are a training regime), and are often the manifestation of developer ego.

Libraries allow us to create a program in minutes, on virtually any platform in virtually any language, that can receive files via HTTP/2, decompress them, decompose it, analyze it (e.g. computer vision, OCR, etc), reprocess it, and push it via XML to a far off system. The scope, scale and quality of the library universe is so enormous that almost anything is made easy.

Free tooling raised the status quo across all developers. There are a number of fantastically good IDEs and compilers and libraries that in 1996 were a significant expense. Even if you worked in a money-rich corporate space, the process of procuring tools was often so laborious and ridiculous that many teams simply hung back with sub-par tooling and outdated IDEs/compilers. Now everyone is a download away from the best tools and platforms in the world.

Concurrency and thread safety obviously weren't a real problem in 1996; however, modern languages and tooling offer an enormous number of solutions for concurrency and thread safety, including in modern variants of C++. It would be crippling to develop 1996-style without these benefits while targeting modern hardware.
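
As a trivial sketch of what that means in practice (C++11's std::thread, std::mutex and std::lock_guard, none of which existed in portable, standard form in 1996)-

// Build: g++ -std=c++11 counter.cpp -pthread
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    int counter = 0;
    std::mutex m;
    std::vector<std::thread> workers;

    for (int i = 0; i < 4; ++i) {
        workers.emplace_back([&] {
            for (int j = 0; j < 100000; ++j) {
                std::lock_guard<std::mutex> lock(m); // scoped locking, no manual unlock
                ++counter;
            }
        });
    }
    for (auto &t : workers) {
        t.join();
    }

    std::cout << counter << std::endl; // deterministically 400000
    return 0;
}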

But What About…

Early in my career, one of the hottest this-changes-everything developments was CASE (Computer-Aided Software Engineering) tools. These very high-priced tools (advertisements for which dominated every developer magazine) promised to change the field, allowing the program manager to drag and drop some requirements and generate high-quality, complete solutions.

UML later came and promised the same. An architect would contrive some UML diagrams, and the rest would be easy.

Both are close to irrelevant now. Both brought very little benefit, but everyone was chasing the silver bullet.

And of course I said nothing about C# (Java of course existed in 1996, though with much more rudimentary tooling), garbage collection in general, Go, C++14 or any of the other iterations, Python, and countless other languages. There are a lot of things that I love and enjoy about modern languages, but the truth is that their benefit is significantly oversold. A huge bulk of solutions we enjoy today, and many of the critical libraries and technologies that we enjoy, continue to be developed in a 1990s, if not 1980s, variation of C. Of course some newer features are used, but if for some reason C/C++ compilers all mysteriously reverted to circa-1996 variations (from a language perspective. Obviously not having newer optimizations and target language support would be detrimental), it would be relatively simple to adapt.

None of that is to say “give me the old timey ways…this new fangled stuff stinks”, but rather is simply that when developing real solutions it’s surprising how little of a difference it really makes. Whether I’m using C# or 1996 level Object Pascal, Go or C++ circa 1996…it just isn’t that big of a difference. With each iteration of the C++ spec I initially have a “ooh wow that’s great” enthusiasm, but in retrospect a lot of it feels like moving deck chairs around.

It just doesn’t make a significant difference. But we hype each iteration up as a This Changes Everything…Again! revolution.


1 Virtually every study in the programming field is some variation of “get four university juniors together and have them create a trivial project using technique 1 and technique 2. Compare and contrast and then project these results across the industry.”

In the same way many tools and techniques are heralded for the cost savings in the first hours and days of a project (which is completely irrelevant over the lifespan of real-world projects) — e.g. “schemaless” database projects where you amortize that initial savings, with a very high rate of interest, over the entire project, or the development solution that was chosen not because it offers the best results or long term success, but because the developer had a good understanding of it within the first ten minutes. None hold much predictive power regarding real projects in real scenarios.

2 Checked HN while typing this and one of the top posts was advocating rewriting some standard library bit in Rust. Rust, like Haskell, is one of those solutions that is proposed as a sure-win easy solution to almost everything, resolving all impediments to development, curing all security ills, inflating productivity and awesomeness…followed by crickets. The number of actual solutions built with them puts them on endangered lists, but in inflated rhetoric they’re a cure all for everything. And once again, someone makes some cheap commentary about fixing everything, promises some future resolution, and if it follows the pattern of all that came before, positively nothing will come of it.

Rust is hardly alone in being a silver bullet solution. Go, which I enjoy and have posted about on here multiple times, was the topic of a key value implementation a few days ago. The performance was truly terrible, the implementation questionable, but because it was in Go it got that “ooooh” hype and push to the top.

And just to be clear, Rust is a very exciting, elegant language that seeks to blend the best aspects of C (performance, predictable memory allocation and de-allocation, with scope lifetimes and reference counting — I have always been a critic of garbage collection as it's essentially, in my mind, a hack worst-case solution to the problem of memory management; garbage collection should be something that exists in debug mode, with an orphaned object indicating some sort of failure of the platform) with the best aspects of higher-level functional and object-oriented languages. I have done the tutorials and have a middling understanding of the language, and it looks great, so please don't take this as a criticism of it. However it is caught in that void where many of the people picking it up seem to primarily just want some platform to advocate. Many of them seem to simply advocate it as an alternative to Go, and then go back to their day job using Java or whatever. It is one of those exciting languages that needs to make it over the hump of practicality.

The iPhone SE

Apple released a number of iterative product updates yesterday, including the 9.7″ iPad Pro[1], and an iteration of the iPhone 5S: the iPhone SE (Special Edition).

This new device features the same A9 processor, 2GB of RAM and the very well reviewed back optics and imaging sensor as the iPhone 6s.

For $399.

(though it's a little bit scammy in that they're still pushing a 16GB device — below what anyone should buy — but then skipping the optimal 32GB version, forcing you $100 up to the 64GB version at $499)

That is an insane amount of device for the money. I normally don’t post about incremental Apple product updates, but this will have a much bigger splash than the media response seems to predict. It is an outrageous value. It is a top tier device for people who prefer a smaller body.

It is going to become the “smartphone for my kid(s)” of 2016. It’s going to hurt Apple’s ASP, but will put them more into the conversation for no-contract/no-plan devices.

The screen is too small for my tastes (my daily driver right now is a Nexus 6p. I thought it would be too large but now it just seems normal, making the Nexus 5 feel almost quaintly small in comparison. Best feature of the 6p, as an aside: front-facing speakers), but on the flip side the ratio of GPU power to screen resolution is so overwhelming that the thing will be absolutely market leading. Aside from the controls issue that remains a problem with all smartphones, it is a gaming colossus.

And for the many comment boards full of people who really bought into the "Apple doesn't care about specs" nonsense: the A9 remains market leading. It destroys my Nexus 6p. It destroys the Galaxy S7. The GPU is absurd. The CPU is outrageous. It remains the best mobile processor available. It might be underclocked in the SE (though the early benchmarks show it matching the iPhone 6S in CPU tasks, and of course beating it in on-screen GPU tasks given that it has fewer pixels to sling), but even if it were pruned 30% it would remain a leading device.

And now it’s in a $399 device. What a world. All hail competition!


1 – I added a 3rd generation iPad to the household some four years ago. With four children, their friends and visitors, and occasional parental use, it has seen many thousands of hours of use, and a ridiculous number of recharges. It is still going strong, still works and looks great, and the battery still lasts for hours on end. Ultimately I have to peg it as the best value purchase I've ever made. The 9.7″ "Pro" is a tempting update.