Email Addresses Need A Checksum

I get other people’s email.

I grabbed a Google account very early, in the invitation days, and got the first.last@gmail.com gold standard (which by Google rules means I also have firstlast@gmail.com, fir.st.l.a.s.t@gmail.com, etc. These derivatives can be powerful, but often just confuse people).

Since then I’ve gotten thousands of emails intended for other people. From grocery stores. Art dealers. Hairdressers. Car rental agencies. Hoteliers. Flight itineraries. School newsletters and personal appeals. Square receipts. Alumni groups.

Where possible, when email is sent by a real human being and not a black-hole noreply source, I try to alert people to update their addresses, though it’s surprising how often the issue repeats anyways.

All of these were presumably intended for people sharing variations of my name (e.g. Denis), or with the same name but who had to resort to some sort of derivative such as firstMlast@gmail.com.

Many of the errant emails have privileged or time sensitive information, and a lot of them are actionable.

Square receipts allowing me to rate the retailer and leave feedback, alongside some CC details. Hotel reservations that allow me to cancel or change the reservation with absolutely no checks or controls beyond that the email is in hand. Rewards cards through which I can redeem or transfer points.

Some have highly personal, presumably confidential information.


In many if not most of these cases the email address was likely transmitted verbally[1]. To the retailer, grocery store clerk, or over a reservation phone line to a travel agent or hotel representative. Alternately it might have been entered on some second screen device (my iCloud account receives the email for more than one stranger’s Facebook account).

For a vanity domain a misaddressed email usually goes to some ignored catch-all, but on a densely populated host like gmail it yields deliveries of possibly sensitive data to the wrong people, as almost every variation is occupied.

Email addresses should have a checksum. A simple mechanism through which human beings can confirm that information was conveyed properly. Even the most trivial of checksums would provide value, eliminating the vast majority of simple mistakes.

For instance, calculate a CRC32 over a variety of email address derivatives, then display in base32 (32 possible digits, 5 bits per digit; the 32 in CRC32, by contrast, refers to bits) the bottom 5 bits of the most and then the least significant bytes. Totally arbitrary, but sound, and extremely trivial in a world where launch vehicles are landing on floating barges. It would yield-

first.last@gmail.com EW
first.lst@gmail.com 6U
firstMlast@gmail.com XM
frst.last@gmail.com ZS
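
To make the scheme concrete, here is a minimal sketch in Go. The IEEE CRC32 polynomial and the RFC 4648 base32 alphabet are arbitrary stand-ins of mine (the scheme only requires that everyone agree on the choices), so its output won’t necessarily reproduce the exact digits above-

package main

import (
	"fmt"
	"hash/crc32"
	"strings"
)

// Any agreed-upon 32-character set would do; RFC 4648 is assumed here.
const base32Digits = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

// checksum returns a two-digit check code: the bottom 5 bits of the most
// and least significant bytes of the address's CRC32. Addresses are
// lowercased first, since email is case-insensitive in practice.
func checksum(email string) string {
	c := crc32.ChecksumIEEE([]byte(strings.ToLower(email)))
	hi := (c >> 24) & 0x1F // bottom 5 bits of the most significant byte
	lo := c & 0x1F         // bottom 5 bits of the least significant byte
	return string([]byte{base32Digits[hi], base32Digits[lo]})
}

// validate is the "go / no go" test a form processor could apply.
func validate(email, code string) bool {
	return checksum(email) == code
}

func main() {
	for _, e := range []string{
		"first.last@gmail.com",
		"first.lst@gmail.com",
		"firstMlast@gmail.com",
		"frst.last@gmail.com",
	} {
		fmt.Println(e, checksum(e))
	}
}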

“My email address is f  i  r  s  t   period   l a s t @ g m a i l . c o m”

“Okay, got it. 6U?”

“Nope, I must have misspoken. Let me restate that – … ”

“Okay, got it. EW?”

“Perfect!”

(and of course every user would quickly know and remember their checksum. This wouldn’t be something the user is calculating on demand)

When I’m forced to use my atrophied handwriting to chicken scratch an email address on a form, a simple two digit checksum should yield a “go / no go” processing of the email address: If it isn’t a valid combination (whether because the email address or the checksum isn’t being interpreted correctly), contact me to verify, and certainly don’t start sending sensitive information.

Two digits of base32 yield 10 bits of entropy, or 1024 variations. Obviously this is useless against intentional collisions, but against accidental data “corruption” it would catch errors 99.9%+ of the time.

Technical Aside: Email addresses theoretically can contain mixed case, but in practice the vast majority of the email infrastructure is case-insensitive.

The Pragmatic Footer

Gmail and the other vendors aren’t going to start displaying email address checksums. Forms and retailers and Square aren’t going to start changing their apps and forms to capture or display email data entry checksums.

As with prior “improve the system” exercises, this is more a theoretical discussion of concepts that we regularly encounter than a practical proposal. While it doesn’t work for telephone exchanges, more data transfer should be happening via NFC or temporary QR codes than being verbally relayed.

It’s a fun thought exercise to go back and think of how the system could have been improved from the outset, given the reality that information transfer is often human and thus imperfect. For instance, all email addresses could have carried a standardized checksum suffix – first.last+EW@gmail.com. Or whatever.

If you develop a system where humans verbally or imperfectly transmit information, and it’s important that it is stated and understood correctly, consider a checksum.


[1] – I had a speech impediment as a young child, courtesy of a Jamie Oliver-esque mega tongue that was trying to escape the confines of my mouth. This made me more aware of the general sloppiness of verbal data transmissions as a problem, later noticing that it’s a fairly universal issue.

Code: It’s Trivial

Everyone is going crazy about a purported $1.4 million random arrow app for the TSA. It didn’t take long before a developer “duplicated” it in 10 minutes. With some practice they could easily get it down to twenty seconds.

$252 million an hour!

Not that such a demonstration means much. Developers can make a veneer facsimile of almost anything not overly computationally complex in short order. I could spin out a superficial Twitter “clone” in a few hours. Where’s my billions in valuation?

As Atwood said a few years ago (as everyone declared how easily they could make Stack Overflow clones) – Code: It’s Trivial (his article making my choice of title trivial). The word trivial is used and abused in developer circles everywhere, used to easily deride almost every solution, each of us puffing up our chests and declaring that we could totally make Facebook in a weekend, Twitter in an afternoon, and YouTube the next morning. We could make the next Angry Birds, and with Unity we could totally storm the market with a new 3D shooter if we wanted.

Because it’s all trivial. We could all do everything with ease.

It later turned out the app itself actually cost $47,000, which is still a pretty good chunk of change for such a seemingly simple app. Only $8,460,000 per hour.

But the amount of time spent in the IDE is close to irrelevant, as anyone who has ever worked in a large organization knows. These sorts of exercises are universally nonsensical as a method of evaluating the cost of a solution.

I’m not defending the TSA, their security theater, the genesis or criteria for this app, or even saying that it isn’t trivial — by all appearances it seems to be. But knowing that the TSA decided that this is what they were going to do, $47,000 doesn’t sound particularly expensive at all.

Some senior security guy didn’t say “We need x. Do x.” and a day later they had an arrow app. As two large organizations they most certainly had planning meetings and accessibility meetings. They likely argued the aesthetics of arrows. They put in checks and conditions to lock the user in the app. They likely allow for varying odds ratios (total conjecture on my part, but I doubt it was a fixed 50:50, and it likely had situational service-based variations depending upon overrides for manpower restrictions), etc. Still not in any universe a significant application, but the number of things that people can talk about, question, probe, and consider grows exponentially.

Then documentation, training material (yes, line level workers really need to be trained in all software), auditing to ensure it actually did what it said it did (developers regularly mess up things as simple as “random number” usage), etc.

In the end, $47,000 for a piece of software deployed in an enormous organization, in a security capacity…I’m surprised that the floor for something like this isn’t a couple of orders of magnitude higher.

Nothing — nothing — in a large organization is trivial. Nothing is cheap. Ever.


The Full Stack Developer / Computer (Un)science

I have nothing technically interesting to discuss right now, but various half-finished, lazily conceived thought essays have sat around for a few weeks, so here they are in somewhat rough form. These are not viable for general consumption — being wordy and contentious and not terribly interesting — and are not intended for social news (if you found your way here from social news, click back and maintain your innocence), but are contemplation pieces for existing readers looking for a bit of time filler.


As with most narrative-style content on here, this is very subjective. I expect that many disagree with some of the suppositions, and of course welcome disagreement. You can even send me an email if so inclined.

The Full Stack Developer

Small shops and rapid growth startups need full-stack developers.

They need people who can ply the craft, with motivation and self-direction, all the way from the physical hardware (including VM provisioning and automation as appropriate), to the networking, OS, application server, database, API and presentation level, selecting technologies and securing appropriately. They need practitioners who might jump between building some Go server-side business logic — clustering the backing databases and applying appropriate indexes and optimizations to maximize the TPS — to working on the Angular presentation JavaScript, to building some Puppet automated deployment scripts and researching some new AWS products to see how they impact the plans, then rebuilding nginx from source to include some custom updates to a module, installing some SSL certs configured for an A score and PFS, adding some SIMD to some calculation logic with dedicated source variations for NEON, AVX and SSE. Maybe they’ll jump into Xcode to spin some Swift for the new iOS app, and then develop some ETL tasks to pull in some import data. Then they relax and collaborate on some slides for an investor proposal until they’re pulled aside to integrate DTLS into an Android app that was using unsecured UDP.

Some definitions of “full stack” add or remove layers — at some shops, simply being able to make a stored procedure in addition to the web app qualifies — but the basic premise is that there isn’t the traditional separation of concerns where each individual manages one small part of their project. Instead everyone does essentially everything.

Full Stack Rock Pile

Having full stack developers is absolutely critical when the headcount is small (or if you’re doing your own venture) and you just need to Get Things Done. You need people who aren’t constantly spinning while waiting for various other people to do things that they depend upon but can’t do for themselves, throwing their arms up under the comforting excuse of Someone Else’s Problem.

But doing full stack development in practice sucks. It is sub-optimal on the longer timeline. It’s an emergency measure that should not continue for longer than absolutely necessary. It can be a crutch for staffing issues.

Each context switch imposes a massive productivity penalty. It’s bad enough when you’re working on some presentation code and get distracted by a meeting, losing your train of concentration and mental context. Now consider the penalty of jumping from presentation code to an entirely different language, platform, and paradigm to put out a small fire, then returning to where you were.

When voluntary and controlled, these sorts of project gear shifts can be hugely rewarding and beneficial (many developers hate working constantly on the same thing, growing malaise from boredom), but these redirections are productivity sapping and stressful when they regularly occur as crises or due to poor planning, or simply because the headcount is so small that you have no other choice. It may be critical to the firm, but it’s detrimental to focused productivity.

Such is the life of the full stack developer.

I’ve done the full stack thing for much of my career. Not always because the headcount was small (though I do prefer to work on small teams), but sometimes just due to knowledge gaps elsewhere in organizations: When software developers flame out they often get moved to network administration / engineering, or database administration. With the right motivation and aptitude it can be a great fit, but often it’s just avoiding some temporary unpleasantness by making everything more of a hassle for everyone else as someone acts as a gatekeeper for a role they’re ill-suited for and unmotivated to learn competency in.

When the “DBA” only knows how to backup and restore a database, but imposes themselves as overhead on every activity purely as a turf defensiveness, and the network admin has a seeming strategic incompetence forcing you to do everything yourself (being the lead architect of an organization, with the top salary in the firm….fixing Exchange issues), it’s just bad for everyone. Everything gets a little worse and a lot slower.

Being a full stack developer imposes an enormous amount of overhead trying to stay on top of everything covering the entire domain, and it’s simply impossible to know it all comprehensively. So every area, by the simple limits of time, will be compromised to some degree. Many “full stack” implementations have betrayed this reality, with security rife with amateur mistakes, or a gross misuse of the database (which is incredibly common. If your database is a recurring problem point and you’re searching for silver bullet fixes, it’s more likely the developers that are the weakness).

There is no way I can coerce and optimize AWS as much as someone focused on it. Or monitor and tune and finesse the database to the same completeness of someone dedicated to doing that in a project where it might comprise 5% of my time. Nor can I spend the time to analyze every facet of security that someone focused purely on intrusion evaluations can. There is only so much attention and mental focus to go around.

While a developer tasked with the API might build out a well planned, focused, coherent API, the full stack developer is busy adding things on a need basis, tightly coupled to their specific immediate needs.

And forget about documentation. Or comprehensive tests. Did you notice I just automated the cluster deployments and their monitoring, configured the IPSec intra-machine security, and built the shared-memory module for nginx? And then I dealt with the consequences of the messaging server failing, after migrating the presentation code from Angular to Angular 2. And you’re bugging me about documentation? Let me just finish up that disaster recovery solution first.

I do this because I have to, and because I get paid to do it (often on projects at a state where a very rapid but focused build out is necessary), and it’s a necessity of circumstances unique to the roles I fill. But on a long enough timeline (the survival rate for everyone drops to zero) you should have dedicated people focused on getting the most out of as granular a domain as possible, the focus intensifying as the team grows. If you have two dozen generalists all doing everything, you chose the wrong path somewhere along the way.

Though let me add the caveat that as people specialize, they must become actual experts (in knowledge and practice and not just in theory) who provide service levels and responsiveness. Not just flamed out dregs filling a seat, acting as conceptual gatekeepers while imposing lengthy useless delays.

Oh how I dream of having DBAs who would actually alert me to database hot points or suggest optimal indexes, partitioning schemes and usage patterns that would improve performance. Or to have some security experts actually kick the tires and look at the protocols in depth and give a better sense of comfort, instead of just that clichéd “some guy who earned the benefits of the Dilbert principle and now imposes multi-week delays on your project because they’re the security ‘gatekeeper’, but whose analysis will be so superficial that it adds no utility or value” (true story! That was at a large banking group, as an aside, and was the glorious cargo cult illusion of security. The more onerous and inconvenient, the more the illusion of security was realized).

Get actual experts specialized in working together on common solutions, with common motivations. Not generalists focused on making their presence known in minor turf wars.

Having said all of that, some shops ask for full stack developers but actually want you to specialize. Meaning that they want developers to understand the workings of modern hardware, how the operating system functions, and how the network, database, proxy server, application server, and all of the parts of the platform work. And then they want you to focus on your specific domain and solution with that knowledge in the back of your mind, considering cache levels and their impact on performance, the overhead of I/O and network communications, and how UDP and TCP and sliding windows impact your work. How to make vectorizable code, and how what you’re doing impacts other projects, etc. The basics of the major facets of encryption (symmetric, asymmetric, elliptic curve versus RSA, the modes of each, etc). Facebook is the poster child for demanding this, and I have absolutely no criticisms or complaints with that. Their full stack developers aren’t really full stack developers.

An expectation that developers understand the consequences of their design choices, based upon a good knowledge of the platforms and systems they’re developing on, should be universal.

Computer Unscience

The nutrition and software development fields have a lot in common.

In both, flawed/incomplete dogma makes the rounds and headlines. “Studies”[1] — often agenda driven — that show some correlation or Hawthorne effect are held up as critical proofs that change everything.


We want quick fixes, loosening our skepticism in their pursuit. Something that we can adopt and quickly become a competent manager, 10x programmer or team, eradicate all errors and security concerns, clear blemishes, lose weight, have more energy, and eliminate those persistent headaches.

The easiest way to find yourself in someone’s favor often is to parrot their current quick-fix beliefs. “Couldn’t have said it better myself!” they’ll exclaim, declaring you the smartest person they know — barely concealed self-congratulation — because you support their current notions about NoSQL, Rust[2], gluten or fat. Whether it’s their fervent advocacy of TDD, or of the evils of carbohydrates, the same ego-driven, “the more I believe and the more I advocate, the more it’s true!” flawed motive comes into play.

People in the sales industry know how to exploit this mimicry effect well, and it’s one aspect of the consulting world (where sales are a fundamental element of the role) that I find most unpalatable: Many people seek outside assistance primarily to confirm their beliefs, often while empire building or positioning allies in internal turf wars.

The hiring process in many firms has sadly been diminished to a group of people with their current set of pseudo-science beliefs and cargo cult behaviors searching for someone who aligns with their biases. Who hasn’t sat in an interview where a coworker repeatedly asks specific trivia about some technology, philosophy or dogma that they very recently adopted, looking for validation of some new Belief structure?

You can usually determine the current trends by following the tech social news sites for a week or so. Then wait for the “boy, that really wasn’t a silver bullet!” follow-up cycle of blog posts a year later as trends come in and out of favor, the early adopter’s euphoria turning into a “a period of wallowing in the depths of cynicism” (taken from James Bezdek’s glorious editorial in IEEE Transactions on Fuzzy Systems), just as superfoods and macronutrients and dietary evils fall in and out of favor in the world of nutrition, as waves of converts to various trends then seek to vilify it to explain their personal failure.

What ground up ancient berry should you be mainlining today? What methodology or language or tooling is going to turn your team into superstars?

A 20 Year Comparison

My 11 year old son — who has been gaining competence in C# and JavaScript via the Unity platform for a couple of years now, motivated by the urge to create fun things for and with his friends — recently asked me what has changed in software development over my career: What innovations and progress have shot the field forward, making plying the craft today different from then.

The silver bullets, so to speak.

So I sat in a darkened room, Beethoven’s Piano Concerto No. 5 quietly playing in the background, contemplating a 20 year contrast (which was pretty much when I entered the industry as a professional developer).

2016 versus 1996, from the perspective of widespread software development practices during those two periods (e.g. that something was used in a university somewhere, or was a nascent technology or methodology or approach, made it a non-factor in the 1996 consideration). I am not considering niche development fields, so how software is developed for NASA, the military, nuclear power plants or for unique snowflake projects. Nor am I considering “sweatshop” style development where some low complexity project is farmed out across hordes of low cost, often low skill factory-style development groups. These have unique needs and patterns, and are not the subject of this conversation.

I should also explain that none of this is motivated by resistance to change or “all one has is a hammer” motives: Over the years I’ve utilized many of the innovations hyped at the time, but at a later point realized (and continue to realize) that everything old is new again, and that this is an industry of perpetual hype cycles. Object-oriented, aspect-oriented, CASE, UML, every ERD variant, Slack, DI, TDD, IRC, XP, pair programming, standing desks, office work, remote work, open plan, private offices, functional programming, COM/DCOM/CORBA, SOAP, document oriented, almost every variation of RDBMS and NoSQL solution, and on and on and on. I’ve plied them all.

So what are the factors that, in my personal opinion, really changed the field? If I were to compare work practices in 1996 versus today, what would be the things that stand out the most? The things that I would miss the most if I forced myself into a re-live-1996 programming exercise?

I’m excluding platforms as that’s a wholly separate discussion, irrelevant in this context, so of course I would miss Android and iOS and Windows 10 and modern Linux and LVM and HyperV and all of the related technologies that greatly enhance our pursuit of excellence. Here I’m talking purely about the process of crafting software, and the tools and techniques that we use.

The Things I Would Miss Most

  • Source control
  • The internet
  • The scope and quality of libraries available
  • Free tooling
  • Concurrency and thread safety

Source control has existed in some form for many decades (the best known earlier iteration being SCCS), but didn’t become widespread until closing on the turn of the century. Prior to this, many teams and individuals used a shared folder of the current source (and sadly some still do!), occasionally creating a point-in-time archive.

Source control is the enabler of collaboration, and the liberator of change. Even when working as an individual developer, source control frees us from the paranoia that we’re always on the precipice of destroying our project, allowing us at any moment to investigate what happened when, understanding the creeping change in our creations. I check in frequently, and it is the foundation that enables very rapid progress, and the metadata to recall the motives and intentions of my prior activities.

The widespread adoption of source control hugely enhanced productivity, accountability and quality across the industry. We’ve gone through several dominant tools during that period (SCCS, RCS, CVS, SourceSafe, Subversion, TFS, Hg, git), and while incremental improvements bring massive advances to certain types of work (e.g. Linux kernel scale of projects), the general value was there from early on.

The internet brought obvious benefits because it allowed for close to real-time collaboration with peers across the industry, whether via Usenet newsgroups or, more recently, on sites like StackOverflow. A world of documentation and libraries and code examples became available at our fingertips.

Of course the Internet existed in 1996, but the ability to find people who’ve faced the same unique problem set quickly, and to learn from and adopt their discoveries, is an enormous productivity boost. Projects could get hung up on minor issues for days to weeks — some unloved Usenet newsgroup post lying ignored for weeks — where now it’s often seconds away from a fix.

Libraries allow us to develop on the backs of giants. I specifically say libraries rather than frameworks because the former is pure gain, while the latter is often much more nuanced, and the gains are often hard to qualify. Many frameworks exist primarily as a structural approach rather than beneficial code (e.g. libraries are steroids. Frameworks are a training regime), and are often the manifestation of developer ego.

Libraries allow us to create a program in minutes, on virtually any platform in virtually any language, that can receive files via HTTP/2, decompress them, decompose it, analyze it (e.g. computer vision, OCR, etc), reprocess it, and push it via XML to a far off system. The scope, scale and quality of the library universe is so enormous that almost anything is made easy.
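
As a small illustration of just the first two of those steps, here is a sketch in Go using nothing but the standard library (the URL is a hypothetical placeholder)-

package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Fetch a (hypothetical) gzip-compressed resource; the standard
	// library handles TLS, redirects, connection pooling, etc.
	resp, err := http.Get("https://example.com/data.gz")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Stream-decompress as the bytes arrive.
	zr, err := gzip.NewReader(resp.Body)
	if err != nil {
		panic(err)
	}
	defer zr.Close()

	// Discard the output for this demo; real code would decompose
	// and analyze it here.
	n, err := io.Copy(io.Discard, zr)
	if err != nil {
		panic(err)
	}
	fmt.Println("decompressed bytes:", n)
}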

Free tooling raised the status quo across all developers. There are a number of fantastically good IDEs and compilers and libraries that in 1996 were a significant expense. Even if you worked in a money-rich corporate space, the process of procuring tools was often so laborious and ridiculous that many teams simply hung back with sub-par tooling and outdated IDEs/compilers. Now everyone is a download away from the best tools and platforms in the world.

Concurrency and thread safety: obviously this wasn’t a real problem in 1996, but modern languages and tooling offer an enormous number of solutions for concurrency and thread safety, including in modern variants of C++. It would be crippling to develop 1996-style without these benefits when targeting modern hardware.
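
For flavor, a minimal Go sketch of the kind of primitive that simply wasn’t on the menu then: fan work out across goroutines and collect the results over a channel, with the race detector (go run -race) available to police the thread safety-

package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := []int{1, 2, 3, 4, 5}
	results := make(chan int, len(jobs))

	var wg sync.WaitGroup
	for _, j := range jobs {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			results <- n * n // stand-in for real work
		}(j)
	}
	wg.Wait()
	close(results)

	for r := range results {
		fmt.Println(r)
	}
}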

But What About…

Early in my career, one of the hottest this-changes-everything developments was CASE (Computer-Aided Software Engineering) tools. These very high priced tools, advertisements for which dominated every developer magazine, promised to change the field, allowing the program manager to drag and drop some requirements and generate high quality, complete applications.

UML later came and promised the same. An architect would contrive some UML diagrams, and the rest would be easy.

Both are close to irrelevant now. Both brought very little benefit, but everyone was chasing the silver bullet.

And of course I said nothing about C# (Java of course existed in 1996, though with much more rudimentary tooling), garbage collection in general, Go, C++14 or any of the other iterations, Python, and countless other languages. There are a lot of things that I love and enjoy about modern languages, but the truth is that their benefit is significantly oversold. A huge bulk of solutions we enjoy today, and many of the critical libraries and technologies that we enjoy, continue to be developed in a 1990s, if not 1980s, variation of C. Of course some newer features are used, but if for some reason C/C++ compilers all mysteriously reverted to circa-1996 variations (from a language perspective. Obviously not having newer optimizations and target language support would be detrimental), it would be relatively simple to adapt.

None of that is to say “give me the old timey ways…this new fangled stuff stinks”, but rather is simply that when developing real solutions it’s surprising how little of a difference it really makes. Whether I’m using C# or 1996 level Object Pascal, Go or C++ circa 1996…it just isn’t that big of a difference. With each iteration of the C++ spec I initially have an “ooh wow that’s great” enthusiasm, but in retrospect a lot of it feels like moving deck chairs around.

It just doesn’t make a significant difference. But we hype each iteration up as a This Changes Everything…Again! revolution.


[1] Virtually every study in the programming field is some variation of “get four university juniors together and have them create a trivial project using technique 1 and technique 2. Compare and contrast and then project these results across the industry.”

In the same way many tools and techniques are heralded for the cost savings in the first hours and days of a project (which is completely irrelevant over the lifespan of real-world projects) — e.g. “schemaless” database projects where you amortize that initial savings, with a very high rate of interest, over the entire project, or the development solution that was chosen not because it offers the best results or long term success, but because the developer had a good understanding of it within the first ten minutes. None hold much predictive power regarding real projects in real scenarios.

[2] Checked HN while typing this and one of the top posts was advocating rewriting some standard library bit in Rust. Rust, like Haskell, is one of those solutions that is proposed as a sure-win easy solution to almost everything, resolving all impediments to development, curing all security ills, inflating productivity and awesomeness…followed by crickets. The number of actual solutions built with them puts them on endangered lists, but in inflated rhetoric they’re a cure all for everything. And once again, someone makes some cheap commentary about fixing everything, promises some future resolution, and if it follows the pattern of all that came before, positively nothing will come of it.

Rust is hardly alone in being a silver bullet solution. Go, which I enjoy and have posted about on here multiple times, was the subject of a key-value store implementation a few days ago. The performance was truly terrible, the implementation questionable, but because it was in Go it got that “ooooh” hype and push to the top.

The iPhone SE

Apple released a number of iterative product updates yesterday, including the 9.7″ iPad Pro[1], and an iteration of the iPhone 5S: the iPhone SE (Special Edition).

This new device features the same A9 processor, 2GB of RAM and the very well reviewed back optics and imaging sensor as the iPhone 6s.

For $399.

(though it’s a little bit scammy in that they’re still pushing a 16GB device — below what anyone should buy — but then skip the optimal 32GB version, forcing you $100 up to the 64GB version at $499)

That is an insane amount of device for the money. I normally don’t post about incremental Apple product updates, but this will have a much bigger splash than the media response seems to predict. It is an outrageous value. It is a top tier device for people who prefer a smaller body.

It is going to become the “smartphone for my kid(s)” of 2016. It’s going to hurt Apple’s ASP, but will put them more into the conversation for no-contract/no-plan devices.

The screen is too small for my tastes (my daily driver right now is a Nexus 6p. I thought it would be too large but now it just seems normal, making the Nexus 5 feel almost quaintly small in comparison. Best feature of the 6p, as an aside: front facing speakers), but on the flip side the GPU power to screen resolution is so overwhelming, the thing will be absolutely market leading. Aside from the controls issue that remains a problem with all smartphones, it is a gaming colossus.

And given that many comment boards are full of people who really bought into the “Apple doesn’t care about specs” nonsense, the A9 remains market leading. It destroys my Nexus 6p. It destroys the Galaxy S7. The GPU is absurd. The CPU is outrageous. It remains the best mobile processor available. It might be underclocked on the SE (though the early benchmarks show it matching the iPhone 6S in CPU tasks, and of course beating it in on-screen GPU tasks given that it has fewer pixels to sling), but even if it were pruned 30% it would remain a leading device.

And now it’s in a $399 device. What a world. All hail competition!


[1] – I added a 3rd generation iPad to the household some four years ago. With four children, their friends and visitors, and occasional parental use, it has seen many thousands of hours of use, and a ridiculous number of recharges. It is still going strong, still works and looks great, and the battery still lasts for hours on end. Ultimately I have to peg it as the best value purchase I’ve ever made. The 9.7″ “Pro” is a tempting update.

Understanding the Hiring Process / anti-.NET bigotry

This morning atop Hacker News[1]: We only hire the trendiest.

In one of the comments discussing the post-

There have been a number of posts about hiring practices lately. And a lot of them contradict each other.

Here’s the secret to understanding hiring: It’s largely random[2].

The people making the decisions — the gatekeepers — are an assorted collection of flawed, opinionated people, often trying to mold the world in their image. These are people with limited knowledge, biases, bigotry and stereotypes, subjective tastes and desires…the foibles of all of humanity.

Some of them are smart. Some of them are not. Some of them are open minded, and some are not. They come from or represent organizations that may be brilliant…or just lucky. Organizations that often stumbled on success by accident, then spending years trying to recreate that spark: One of the thousands of things they did hit pay dirt, but without knowing what was the key (if not just right time/right place luck) they stratify every identifiable behavior of the firm.


When someone asks me anything about hiring — it has occasionally happened given that I’ve been in the position of influence or final say on a hundred or so hires over my career — I often have nothing to say because….who knows. No, seriously, your questions about how much detail, what to emphasize, what to downplay, how to approach it, who to contact and how aggressively, if you should include a cover letter and if so what should it say, how to approach the interview: It is utterly impossible to give a general answer that can be at all useful, beyond the most obvious of basic human behavior (e.g. don’t show up under the influence is probably a decent rule for most situations).

One person’s passion and drive is another person’s desperation. Pride of work…or arrogance/self-centeredness. Confidence or egotism. Independent drive or unmanageable lone wolf. Loyal worker or lazy careerist. Focused or “all you have is a hammer”. Diversely skilled or scattered and overly broad. Detailed or windy. Concise or terse.

One person’s variety of experience is another person’s trend chasing.

When people complain about some sort of hiring feedback, they often generalize, but more often than not they’re outraged that a single person — some random, flawed, myopic hiring agent at some random company — gave feedback that went against how they project what people should want.

And to be fair, the person complaining is also some random, flawed, myopic, defensive candidate trying to mold reality in their own image, projecting a world where any hiring process inevitably favors exactly them. We’re all endlessly lobbying.

Eh.[76]

(As an aside, and pertinent to the topic, Atwood recently posted a great article about hiring. The root cause of a lot of the hyper-selective hiring nonsense — which often becomes an egotistical “evaluate the world in my image” exercise — is this notion that once someone is in the door they’re there for life. We know that is seldom the case, and when situations don’t work out both parties usually want to diverge, but we hire as if you’re making a lifetime commitment)

The Purported “Anti-.NET” Sentiment

With the growing contingent of Microsoft-ecosystem developers populating HN, there has been this repeated slur about “trendy” technologies, and a defensive observation about an anti-.NET (or anti-Microsoft technologies) sentiment among many technology companies.

To go back in history a bit to set the context for what follows, for a decade+ I was primarily a Microsoft platform developer. I developed on, and for, the Windows platform and the Microsoft stack. I acquired a variety of Microsoft certifications along the way (including MCSD, MCSE, MCDBA, among others. They were a vehicle to pay attention to things I might not have otherwise, and it made me a much better developer knowing every aspect of the platforms I was developing against), and was published in Microsoft’s premier MSDN magazine. I’m extremely adept at SQL Server, C# and many other Microsoft technologies. I spend probably 60% of my day on a Windows 10 desktop, often in Visual Studio (which, as an aside, has very decent support for the modern C++ standards now), the rest on a Ubuntu laptop.

Microsoft technologies are candidates of potential solution sets.

Yet if I am given a resume that is nothing but the pablum that Microsoft served up for the candidate, the odds that it goes to the rejection bin dramatically increase (I’m some random guy with my own set of biases and bigotry). Having C# or SQL Server on a resume is a positive, but they turn to a negative if it’s coupled with only the creations of Redmond. It makes it seem like the candidate has limited their technology horizon to whatever Microsoft gets around to adopting, often belatedly and only if it doesn’t threaten any other element of their empire. It has an extremely high correlation with professional ennui.

Many years ago I was working at a small engineering shop when the .NET platform was first introduced as a very restricted, low quality beta. The manager of a team (who later went to work in a sales capacity for Microsoft directly, which was a natural, more honest outcome) was advocating that we should embrace this beta wholeheartedly, giving a list of reasons why it would be a winner.

I remember reading the list and asking “If this is all so great….why isn’t and hasn’t Java been in the discussion?” Not that it should be the choice, but a query as to why it has never been within the realm of possible candidates.

Every benefit of this nascent product from Microsoft was already available in Java. And to move to the new Microsoft platform meant new languages and tools, new APIs, new libraries, and so on, so it wasn’t like there was some transitional advantage.

But this one was from Microsoft, so it legitimized those choices and turned those frowns upside down. We were a Microsoft “shop”, you see, and we were Microsoft developers. All of those prior criticisms, fears and concerns about Java simply didn’t apply anymore because Microsoft has now anointed those choices. They were now Microsoft Shop approved.

I grew weary of the Devout Microsoft Platform Developer (DMPD from here on in), and started to always evaluate my own perception of technology to ensure I didn’t grow that single-vendor malaise, lazily outsourcing technology selection and implementation to some self-serving entity whose best interests may not correspond with my own[3].

The DMPD was sure that the web was just a short term fad, soon to be usurped by WinForms and later by WPF. That ActiveX and Internet Explorer would own the Internet future. That MVC became a favored acronym when ASP.NET MVC came out in betas, and that mixed rendering was the bees knees with Razor. That AJAX became viable when update panels appeared, and functional programming was no longer just a fad the moment it came out in F# form (just as every language element picked up by C#, turning it into the Frankenstein it has become, was unnecessary until its inclusion). That JavaScript became credible when TypeScript entered the scene, and in-memory databases became worthwhile with Hekaton.

The DMPD was sure that Windows Phone would be a success, and are probably still clutching their unloved Lumia, sure that some new strategy is going to be the thing that puts it over the top (we’ve been hearing that one for  years now, to the point that I literally get the tingles of deja vu when I see it repeated, opening the desktop calendar to confirm the date). That Windows 10 on the Xbox is going to really reshape the market and turn back the clocks, making us all embrace tiles and make the wholesale transition to Windows Apps and Surface devices.

Even when you’re working within the Microsoft stack itself, this DMPD syndrome leads developers to assume that random quality solutions from Microsoft are preferable to alternatives (whether other vendors or self-engineered). Microsoft’s AppFabric solution, for instance, was a catastrophic disaster of a project (so profoundly inefficient it was effectively an app decelerator), but you couldn’t talk about any caching strategy without the DMPD’s endlessly pushing AppFabric. Every half-considered initiative and solution of Microsoft’s, however half-baked and obviously doomed to fail, is embraced and evangelized.

Which makes the “trendy” slurs a bit farcical really — Microsoft has sent the faithful on a hundred futile ventures, then abandoned them in the scraggy rocks of doom[4].

Do you see yourself in any of the things I just described? If you do, diversify. Take intellectual ownership over the decisions you make. They may still lead you to the Microsoft-fold because it really does offer the optimal set of solutions for your particular situation and requirements, but make it an open, honest consideration.

Others will certainly see peers they’ve worked with in those descriptions.

When people have a bigotry about the “.NET developer”, the DMPD is the root cause: No one thinks anyone is tainted by C# or defiled by .NET, but people have encountered enough DMPDs — effectively Microsoft salespeople doing that org’s bidding on someone else’s payroll — that they start to paint all MPDs with the same brush, which is unfair and unfortunate.


[1] The server logs betray that this post received some moderate exposure from Hacker News (today – March 22nd). That was not my doing and it certainly wasn’t my intention: I don’t write “reaction pieces” hoping to grab the tail of someone else’s success, but instead because the referenced article brought some observations to mind.

There was a time when “making it” on Hacker News made me absolutely giddy, and I targeted pieces specifically to achieve that (mostly through pandering, which works wonders), to moderate success. Now, not so much. I remain convinced that HN is detrimental to the tech community, and the occasional visit only increases that belief: the small community circle the wagons and often engage in discussions that could best be described as inventing a reality.

As an aside, my $12/month AWS VM had absolutely zero problems or even marginal loading (though to be fair it was only a moderate load of readers). Every connection TLS encrypted, many with HTTP/2, some even using Brotli. Cool. What an awesome platform world we’re in.

[2] The least random hiring processes are the ones that we in the tech industry generally revile: the HR-professional laundry list of requirements, predictable “What are your weaknesses” interviews, rote evaluations of cover letters, etc. In the HR world, where the gatekeepers aren’t really tech adept at all and have all been trained by the same book of best practices, there are some expectations that you can pander to. But generally these are the sorts of orgs that no one wants to work at anyways, so if you fail the vanilla HR gauntlet it’s often a good thing.

[3] Obviously this doesn’t just happen in the Microsoft fold. Already there are those people who seem to embrace whatever poorly-considered abomination Google pushes out in the same sort of vendor advocacy.

[4] A lot of developers don’t realize how influential Microsoft once was to tech. I remember anxiously awaiting TechNet subscription discs where I’d get pre-alphas of various considered SDKs and APIs and technologies. Most of them, of course, died, but it was necessary to be on the leading edge of what Microsoft was doing, or be left behind. When some rumored Microsoft change was afoot (e.g. the “dumping Win32” ruse repeated for well over a decade now), it caused serious panic and concern.

[76] Regular readers know that I’m fairly cynical about almost everything — 95%+ of everything is just self-serving bullshit.

Ageism and Enablers

Hollywood is brutal to older actresses.

Leading ladies are almost always in their twenties, even when the onscreen romantic interest of a sixty year old man (Maggie Gyllenhaal, as an example, was purportedly turned down as “too old” for a role that would have her play the love interest of a 55 year old. Maggie is 37). So as signs of aging appear, many actresses start getting bit parts, often as the mothers (or grandmothers) of the leads (in the {nsfw} famous Comedy Central bit, they point out how Sally Field was Tom Hanks’ love interest in Punchline, but became his mother by Forrest Gump). Some go off the deep end trying to turn back the clock, becoming a cartoonish caricature of youth.

And occasionally these now ostracized women get together and point out this issue and complain about the lack of work.

But they seldom complain about it when they benefit. It isn’t the young actress — the next generation putting the last generation out of work — who’s railing about ageism in Hollywood. It isn’t the young actress saying “is it really believable that I’m 22 and somehow the head DA for a major city, even if it works for the inevitable workplace love interest?”

Of course they don’t. That’s human nature, and we all suffer the same selfish perspective lens: When the system benefits us, it gets a pass and we’ll find ways to justify it. When the same system is detrimental to us, we complain and shake our fists at the clouds.

First they came for the…

Which brings me to the software development industry and ageism. I’m in a position where I don’t really need to worry about this, but there was an earlier stage of my career where it was a considerable concern (though it took a back seat to the imminent peril of every development job moving overseas, which was the prevailing paranoia circa-2000): When would I make the transition to the fuzzy fields like middle management? Should I become a “project manager”? Should I get my MBA, or PhD, or become a teacher?

In many software development shops that Logan’s Run renewal chamber is running at capacity, and many pursue these transitions regardless of what they really want to do. Before their hand crystal starts blinking red.

I love software development and deep technical challenges too much. There was no way I was softening my approach to pander to other people’s issues or stereotypes. I reworked my path to evade other people’s nonsense to the greatest extent possible.

So I have the clarity of being on the outside looking in. And it’s fascinating because you have the 30- and 40-something developers complaining about rampant ageism. Complaining about being turned down for prospective jobs because they fall above the mean age (which, to be fair, probably isn’t really the issue much of the time and instead is an excuse. But it does happen), left out often through the ruse of workplace “culture”: Do they look nerf gun-ish enough? Or they complain about excessive hours that prevent having a life, or policies and expectations that are incompatible with having a family or outside interests. About the efforts to try to act young.

On the other side you have the 26-year old developers parroting the noise that older developers are somehow stuck in their ways (which is one of those things where people see what they expect or want to see, in the glorious tunnel vision of bigotry. I know older developers still stuck to Borland Delphi because they used it 20 years ago. I know young developers glued to Ruby, trying desperately to make it fit for problems where it’s grossly inappropriate. It just turns out that a lot of people don’t like change, and they’ll avoid it whenever they can, whether 25, 35, or 55. It takes root quickly). Those same young developers will tell you that families are a big distraction and that you need the dedication of 80 hour workweeks, accounting for the 60 hours spent browsing social media sites. That health issues and family coverage costs are just a big deadweight on the corporate health plan.

You can see this exact set of discussions happen regularly on sites like Hacker News. Where people see the world exactly how it benefits them right now. And if you denigrate older developers, or people in situations other than yours…well, less competition, right?

Which always leaves me marveling. Do these 25 year old developers realize that before they know it they’ll be 35 year old developers? And in an instant more they’ll be 45 year old developers? Or they’ll grow a family, or gain some hobbies or outside interests? That the same sharp-tongued patter that they serve right now will haunt them later in life? I mean, unless they have an exit strategy young, or live under the delusion that their side project has a high probability of hitting it mega big, they’re really sabotaging their own future self, or at a minimum allowing other people to sabotage it.

It’s kind of remarkable, really. Such thoroughly self-destructive noise. But if it’s somewhere off in the future…eh. Deal with it then.

Hanging Chads / New Projects / AMPlified

The Journey of a Thousand COMMITS

One of the most challenging parts of most projects, for me at least, is taking the first step, and if there’s a procrastination quagmire it’s always in those early days.

To explain, over my career I’ve seldom done projects that implement only, or even significantly, things I already know. Instead I’m always drawn to, or into, projects that involve learning and conquering new things, doing something not (widely) done before, pushing envelopes, achieving very difficult requirements that might be perceived as impossible (and sometimes turn out to actually be impossible), etc.

This keeps it fun and challenging and rewarding (and it’s one of the reasons I’ve never been a big disciple of Test Driven Development — if I were repeating patterns and techniques in project after project, it would be of obvious value, but when I’m endlessly jumping technologies, using new techniques and libraries and protocols, with projects that are speculative and often based upon research while under way, etc…it’s less of a win). But invariably there are those stages where I have to learn significant new things (not like “learn this one thing”, but rather “learn an entire field to winnow down to the specific choices for this project”). I often avoid it for a while, often whittling time away on the trivial stuff (which recently might be configuring containers and configuration scripts, etc).

Things learned incrementally, with continuous feedback and rewards, are never a problem — and a lot of tools and technologies are built around this model — but many others demand an enormous up-front investment of time and intellectual effort before it can begin to offer any value at all.

I’ve written about this before, and I’ll include the graphic that I made then-

[graphic from the earlier post]

And that’s where a fun project I’ve started with my ten year old son sits right now. Mobile, voice communications, with real-time, low latency, high quality, bit-efficient streams of hundreds of people. But progress continues and it should make for some fun posts on here given my inability to talk about other engagements.

The “Expert”

This post is mostly just a fun little exercise to freshen the cache while updating the site, so excuse me while I jump around a bit, as the prior section reminded me of a bit of a peeve regarding this industry, and it’s one that I constantly encounter.

I was once in a shop (a very large financial firm) where I was the “Borland Delphi” expert. I still recall a conversation where a project being conceived in C# was being discussed, and the manager on the project commented that I was probably not interested/that it would be outside of my expertise, because I’m a “Delphi guy”.

At the time this happened I had a fairly popular published open source project in C#/.NET. I had a published technical article on the .NET platform making the rounds, had been recognized nationally by Microsoft, and had my MCSD and an assortment of other Microsoft certifications (hey…I never bragged about them, displayed them on my wall, etc — they were instead decent motivations for digging into the edge material that I might not have bothered with, and I’ve never regretted gaining them). I had done more in .NET and C# and the Microsoft stack than the entirety of the room, and, to my knowledge, might have been the most experienced on the platform in the entire organization.

But in this industry people want to pigeon-hole you. They want everyone to fit in a place in the technology tree. To have a camp, so to speak. Now often this is sadly true, where the developers get a hammer and then for the rest of their career try to make everything a nail. But many times it isn’t.

If you aren’t guarding against it, it’ll “typecast” you into a role that might not be interesting going forward, might not be lucrative or have long term viability, etc.

I’ve seen the same thing, to much amusement, in the database world: I’ve done an enormous amount of work and projects with relational databases, and with NoSQL databases, and with column stores, and with document and KV stores, and with graph databases, and with in-memory databases. I’ve built large projects on Microsoft SQL Server, pgsql, Vertica, Datomic, Couchbase, and on and on and on. I built a large scale solution using LVM snapshots and sqlite.

But if you do a project in any of them, you get typecast to those participants as only knowing that, and by association as being fervently against the alternatives. Do a project in SQL Server and you must be some anti-NoSQL old timer who should see what the kids are doing nowadays in their new fangled tools (I saw exactly this sort of feedback when I critiqued Digg’s hilariously poor RDBMS use years ago: People in the RDBMS camp declared me one of their champions, and the NoSQL camp declared me the enemy). Use a Couchbase Cluster and you surely have lots of angry, opinionated words ready to attack relational databases.

The whole flag waving thing gets incredibly boring. I develop for Windows and Linux and Android and iOS, in C++ and Go and C and Java and C#, on the web and in services and in fat apps and in native apps.

As an aside, this same thing happens in the “blogging” and online world. After I wrote a couple of critical pieces about the rose-coloured glasses of some in the NoSQL camp, I was cheered on and referenced endlessly by the anti-NoSQL crowd (many of whom just fear change). If I criticize iOS or Apple I must be a member of the pro-Google crowd. And on and on. And then I’ll post something that goes exactly contrary to those assumptions, get some angry emails about being a turncoat, lose subscribers, etc. Eh. Whatever.

Use the best tools. Apply them well. Make great solutions. Seems pretty simple.

Accelerated Mobile Pages

I’d of course heard about AMP (most of it with a conspiratorial slant), a recent initiative of Google’s for creating a faster web, and from the context of the way people talked I presumed that it was some sort of new WAP-like protocol, completely decoupled and simplified from the web as we know it.

Instead it’s restricted HTML, very restricted JavaScript use, and restrictions on how things like media (e.g. IMGs) are used. For instance images require a fixed sizing, allowing for the page to immediately be laid out in its forever state regardless of slow loading media (seriously if browsers all refused to display pages with unsized media, forcing makers to stop being so profoundly lazy and abusive of technology, the web would be in much better shape). And if your page is validated as good AMP, it can then be cached by Google’s AMP cache (and, for what it’s worth, the moment Google noticed this site supported AMP they started indexing those resources like crazy).
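
For illustration, a hypothetical AMP image declaration (not taken from any real page): the mandatory width and height let the final layout be computed before a single media byte arrives-

<amp-img src="/images/example.jpg"
         width="640" height="480"
         layout="responsive"
         alt="An example image"></amp-img>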

A pretty good idea, really, at least for content sites that aren’t heavily leveraging JavaScript or effects (and it’s worth noting that Facebook Instant Articles are on the same theme). HTML’s flexibility is in many ways its undoing, and when people favor apps over the web, it’s usually because the web is being abused by many sites now (my favorite recent encounter being TripAdvisor — their site would be perfectly fine if it weren’t for the endless, repeating demands that you install their app).

Interestingly enough, when I finally spent the time I was a bit surprised to see how trivial the whole AMP thing really is. Via a simple plug-in this site supports AMP: just append /amp to the end of a URL and voila, a valid, cacheable AMP page.

Android N

Google announced an iOS-like beta program for Android N yesterday, adopting early public availability and the hopefully useful feedback it provides. Enrolling compatible devices (essentially the newer Nexus devices) is a simple webpage away. This is a dramatically better model than the forced-wipe, manual install of prior iterations, where the images often didn’t even come out until they were almost fully baked. Android N is most certainly not fully baked at this point, but it’s fun to play with.

I dropped it on my Nexus 6p, and from a normal usage angle it’s mostly a graphical refresh (in this iteration going more “simplified”, with some borders removed, layouts flattened, etc. In this industry we endlessly cycle from simplified to bedazzled and back again, each time perceiving it as “more modern”). On the programmer side it brings Java 8 support and adds multi-window support (which I suspect was prioritized and given a lot of executive pressure after the criticisms of the Pixel C), a la what Samsung has done for several years, and so on.

At this stage I would not recommend installing Android N. Aside from serious malfunctions in various existing APIs (causing a number of apps to misbehave or fail entirely around various codecs, camera APIs, and so on), it has a significant number of rendering issues (many clearly related to the multi-window support), and generally extremely poor graphical performance. It’s a very early test version, so this should be expected, but it’s just not a good daily driver. Happily, the same easy mechanism used to enroll can also be used to unenroll, and it’s very elegantly implemented.

Eat Your Brotli / Revisiting Why You Should Use Nginx In Your Solutions

Google recently deployed brotli lossless transport compression in the Canary and Dev channels of Chrome. This is the compression algorithm they introduced late last year, hyping it up against competitors.

If your Chrome variant is equipped, you can enable it via (in the address bar)-

chrome://flags/#enable-brotli

It is currently limited to HTTPS, presumably to avoid causing issues with poorly built proxies.

Brotli is already included in the stable releases of Chrome and Firefox, albeit only to support the new, more compressible WOFF 2.0 web font standard. The dev channel updates just extend its use a bit, allowing the browser to declare a new Accept-Encoding option, br (it was originally “bro”, but this was changed for obvious reasons). Google has also authored support for servers to serve up brotli-compressed data in the form of an nginx module (itself a very lightweight wrapper around the brotli library. Nginx really is a study in elegant design).
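The negotiation is ordinary HTTP content negotiation. A hypothetical exchange (host made up, headers trimmed) looks like-

GET /index.html HTTP/1.1
Host: example.com
Accept-Encoding: gzip, deflate, br

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: br

A server that doesn’t recognize br simply never sends it, which is what makes the extension so painless.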

One of the great things about these on-demand extensible web standards is that they enable incremental progress without disruption — you aren’t cutting anyone out by supporting them (browsers that don’t support this can remain oblivious, with no ill effect), but you can enhance the experience for users on capable devices. This is true for both HTTP/2 and brotli.

Overhyped Incremental Improvements

Most of the articles about the new compression option are over the top-

“Google’s new algorithm will make Chrome run much faster” exclaims The Verge. “Google Chrome Is Getting a Big Speed Boost” declares Time.

Brotli will not reduce the size of the images. It will not reduce the size of the auto-play video. It can reduce the size of the various text-type resources (HTML, JavaScript, CSS), however the improvement over the already widely entrenched deflate/gzip is maybe 20-30%. Unless your connection is incredibly slow, in most cases the difference will be imperceptible. It will help with data caps, but once again it’s unlikely that text-based content is what’s ballooning usage; it’s the meaty videos and images and animated GIFs that eat up the bulk of your transfer allocation.
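If you want to eyeball the difference yourself, here’s a quick sketch in Python using the pip-installable brotli binding (the file name is hypothetical, and ratios vary enormously with the input)-

import gzip
import brotli  # assumed: pip install brotli (the binding of Google's library)

# "page.html" stands in for any text-type resource
data = open("page.html", "rb").read()

gz = gzip.compress(data, compresslevel=9)
br = brotli.compress(data, quality=11)

print(len(data), len(gz), len(br))
# Typical text lands brotli modestly below gzip; useful, not transformative.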

Other articles have opined that it’ll save precious mobile device space, but again, brotli is for transport compression. Every browser that I’m aware of caches files locally in file-native form (e.g. a PNG at rest stays compressed with deflate because that’s a format-specific internal compression, just as most PDFs are internally compressed, but an HTML page or JavaScript file transport-compressed with brotli or gzip or deflate is decompressed on the client and cached decompressed).
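The same sketch run against an already-compressed format makes the point: deflate over a PNG’s internal deflate accomplishes roughly nothing (again, the file name is made up)-

import gzip

png = open("photo.png", "rb").read()
print(len(png), len(gzip.compress(png, compresslevel=9)))
# Expect nearly identical sizes: the PNG payload is already deflate-compressed internally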

In the real world, it’s unlikely to make much difference at all to most users on reasonably fast connections, beyond those edge type tests where you make an unrealistically tiny sample fit in a single packet. But it is a small incremental improvement, and why not.

One “why not” might be if compression time is too onerous, and many results have found that brotli’s compression stage is much slower than existing options. I’ll touch on working around that later regarding nginx.

But still it’s kind of neat. New compression formats don’t come along that often, so brotli deserves a look.

Repetitions == Compressibility

Brotli starts with LZ77, the “find and reference repetitions” algorithm seen in virtually every mainstream compression algorithm.

LZ77 implementations work by looking back some window (usually 32KB) in the file to see if any runs of data have repeated, and if they have, replacing the repetitions with much smaller references to the earlier data. Brotli is a bit different in that every implementation lugs along a 119KB static dictionary of phrases that Google presumably found to be the most common across the world of text-based compressible documents. So when it scans a document for compression, it not only looks for duplicates in the past 32KB window, it also uses the static dictionary as a source of matches. They enhanced this a bit by adding 121 “transforms” of each of those dictionary entries (which in the code looks incredibly hack-ish: things like checking for matches against a dictionary word plus the suffix “ and”, for instance, or against capitalization variations of the dictionary words).

As a quick detour, Google has for several years heavily used another compression algorithm: Shared Dictionary Compression for HTTP. SDCH is actually very similar to brotli, however instead of a 119KB universal static dictionary, SDCH allows every site to define its own domain-specific dictionary (or dictionaries) to use as the reference. For instance a financial site might have a reference dictionary loaded with financial terminology, disclaimers, clauses, etc.

However SDCH requires some engineering work and saw extremely little uptake outside of Google. The only other major user is LinkedIn.

So Brotli is like SDCH without the confusion (or flexibility) of server-side dictionary generation.
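The shared-dictionary concept is easy to play with using deflate’s own preset-dictionary support in Python’s zlib. A minimal sketch; the toy dictionary below is mine, purely illustrative, not Brotli’s-

import zlib

# A toy stand-in for a shared static dictionary of common phrases
zdict = b"the Netherlands the most common the Ottoman Empire Secretary of State"

sample = b"Relations between the Ottoman Empire and the Netherlands were the most common topic."

plain = zlib.compress(sample, 9)

# Prime deflate with a preset dictionary, conceptually what Brotli and SDCH do
c = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS, 8, zlib.Z_DEFAULT_STRATEGY, zdict)
primed = c.compress(sample) + c.flush()

print(len(sample), len(plain), len(primed))  # primed is typically smaller via dictionary matches

# The receiving side must hold the exact same dictionary
d = zlib.decompressobj(zlib.MAX_WBITS, zdict)
assert d.decompress(primed) == sample

The short sample never repeats those phrases internally, so plain LZ77 finds nothing in its window; the preset dictionary supplies the matches instead. That is the entire trick, whether the dictionary is Brotli’s baked-in one or an SDCH per-site one.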

The Brotli dictionary makes for a fascinating read. Remember that this is a dictionary that is the basis for potentially trillions of data exchanges, and that sits at rest on billions of devices.

Here is a sampling of phrases that Brotli can handle exceptionally well-

the Netherlands
the most common
background:url(
argued that the
scrolling="no"
included in the
North American
the name of the
interpretations
the traditional
development of
frequently used
a collection of
Holy Roman Emperor
almost exclusively
" border="0" alt="
Secretary of State
culminating in the
CIA World Factbook
the most important
anniversary of the
style="background-
<li><em><a href="/
the Atlantic Ocean
strictly speaking,
shortly before the
different types of
the Ottoman Empire
under the influence
contribution to the
Official website of
headquarters of the
centered around the
implications of the
have been developed
Federal Republic of

Thousands of basic words across a variety of languages, along with collections of words and phrases such as the examples above, comprise the Brotli standard dictionary. With the transforms previously mentioned, it supports any of these in variations such as pluralization, varied capitalization, suffixes like “ and” or “ for”, and assorted punctuation.

So if you’re talking about the Federal Republic of the Holy Roman Emperor against the Ottoman Empire, Brotli has your back.

For really curious readers, I’ve made the dictionary available in 7z-compressed (fun fact – 7z uses LZMA) text file format if you don’t want to extract it from the source directly.

Should You Use It? And Why I Love Nginx

One of the most visited prior entries on here is Ten Reasons You Should Still Use Nginx, from two-plus years ago. In it I exclaim how I love having nginx sitting in front of solutions because it offers a tremendous amount of flexibility at very little cost or risk: it is incredibly unlikely that the nginx layer, even acting as a reverse proxy across a heterogeneous solution built in a mix of technologies (old and new), will be a speed, reliability, or deployment weakness. Generally it will be the most robust, efficient part of your solution.
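As a hedged sketch of that arrangement (the hostname, ports, and paths are all hypothetical), nginx terminates TLS and HTTP/2 up front and fans requests out to whatever sits behind it-

server {
    listen 443 ssl http2;
    server_name example.com;

    location /legacy/ { proxy_pass http://127.0.0.1:8081; }   # aging app on an old stack
    location /api/    { proxy_pass http://127.0.0.1:8082; }   # newer service
    location /        { root /var/www/site; }                 # static content
}

The backends can be anything at all; clients only ever negotiate with nginx.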

The nginx source code is a joy to work with as well, and the Google nginx module (a tiny wrapper around Mozilla’s brotli library, itself a wrapper around the Google brotli project) is a great example of the elegance of extending nginx.

In any case, another great benefit of nginx is that it often gains support for newer technologies very rapidly, in a manner that can be deployed on almost anything with ease (e.g. IIS from Microsoft is a superb web server, but if you aren’t ready to upgrade to Windows Server 2016 across your stack, you aren’t getting HTTP/2. The coupling of web servers with OS versions isn’t reasonable).

Right now this server that you’re hitting is running HTTP/2 for users whose browsers support it (which happens to be most), improving speeds while actually reducing server load. This server also supports brotli because…well, it’s my plaything, so why not. It supports a plethora of fun and occasionally experimental things.

Dynamic brotli compression probably isn’t a win, though. As Cloudflare found, the extra compression time required for brotli nullifies the reduced transfer time in many situations: if the server churns for 30ms that could have been transfer milliseconds, it’s a wash (at 10 Mbps, 30ms of transfer moves roughly 37KB, so the extra compression has to save at least that much to pay for itself). Not to mention that under significant load it can seriously impair operations.

Where brotli makes a tonne of sense, however (and this holds for deflate/gzip as well), is when static resources are precompressed in advance on the server, often with the most aggressive compression possible. At rest the JavaScript file might sit in native, gzip, and brotli forms, the server streaming whichever one matches the client’s capabilities. Nginx of course supports this for gzip, and the Google brotli module fully supports this static-variation option as well. No additional computation on the server at all, the bytes start being delivered instantly, and if anything it reduces server IO. Just about every browser supports gzip at a minimum, so this static-at-rest-compressed strategy is a no-brainer; the limited downside is the redundant storage of a file in multiple forms, and the administration of ensuring that when you update these files you update the variations as well.
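In nginx terms that strategy is just a pair of directives; a minimal sketch, assuming the ngx_brotli module is compiled in and the gzip_static module was built (paths hypothetical, files precompressed offline at maximum effort)-

http {
    gzip_static   on;    # stock module: serves app.js.gz when the client accepts gzip
    brotli_static on;    # ngx_brotli: serves app.js.br to brotli-capable clients

    server {
        listen 443 ssl http2;
        root /var/www/site;
    }
}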

Win win win. Whether brotli or gzip, a better experience for everyone.

Link Rot and Permanent Redirects

I’ve changed “engines” several times over the lifespan of this blog, changed TLDs a couple of times, moved from HTTP to HTTPS, and gone through several URL scheme changes. I was generating heaps of link rot: links from wikis and Stack Overflow, from fantastic blogs like Atwood’s Coding Horror, and from media and countless message boards, all going to 404s.

In my most recent move (from a very efficient blogging engine that I wrote, primarily to facilitate running on the tiniest server imaginable, to a WordPress blog for its content management advantages) I leveraged a significant number of nginx rewrite rules to try to avoid this problem. Rules crossing domains, covering differing URL structures, redirecting RSS feed users, etc. Rules leveraging perl to decompose URLs and recast them in their more contemporary form.

For all of these deprecated URLs I served up a permanent redirect, telling every caller: “The URL you actually want is over there. Use it from this point forward.”
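The rules themselves are mundane. A sketch with hypothetical patterns (not my actual ruleset)-

# Old scheme /post/1234-some-title redirected to /1234/some-title
location ~ ^/post/(\d+)-(.+)$ {
    return 301 https://example.com/$1/$2;
}

# A relocated feed URL
location = /rss.xml {
    return 301 https://example.com/feed/;
}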

But of course the static links on other sites (e.g. in comments on Stack Overflow) never change, forever pointing to the original, time-limited link, awaiting the day they inevitably turn to link rot. In an ideal world these static content sites would run a perpetual bot validating links, updating them to contemporary forms where appropriate, or flagging them as unavailable when they turn to rot (or when they revert to placeholder pages as domains get scooped up after being abandoned).
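Such a bot is not much code. A toy sketch in Python (the helper name and URL are mine, and a real crawler would need rate limiting, retries, and loop detection)-

import requests

def classify_link(url):
    # Don't auto-follow, so permanent moves are visible to us
    r = requests.head(url, allow_redirects=False, timeout=10)
    if r.status_code in (301, 308):
        return "update", r.headers.get("Location")
    if r.status_code in (404, 410):
        return "rot", None
    return "ok", None

# A hypothetical dated link of the sort embedded in old comments
print(classify_link("https://example.com/post/1234-some-title"))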

The static content is a bit of an issue, but what really surprises me is the number of RSS readers and other automated consumers that completely ignore permanent redirects, treating every one as something to be immediately forgotten. Literally years after I switched to this platform, after serving 301 permanent redirects to millions of requests in the meantime, these readers still keep slamming the no-longer-available URLs.

No longer available because I finally dropped the rewrite rules. They added complexity and risk to the ruleset, and required a module I didn’t want to install now that I’ve switched to the mainline version of nginx (primarily to have some fun with the HTTP/2 functionality).

If you’ve hit the root page of this blog because you were sent to one of these very dated links — I apologize. The web’s mechanism of dealing with link change is far from ideal.

Okay, got it

Earlier this year (2015) I put out an app for Android called Gallus. As regulars know, it was a stabilizing, hyperlapse-capable video recorder (unlike Instagram’s solution on iOS, it wasn’t just for stabilizing hyperlapses, and offered a variety of other features including interval frames, locked focus/exposure, and advanced filters). Everyone else will know nothing about it, given that it was incredibly obscure, but regardless it captured the attention of a couple of industry heavyweights, making the whole effort worthwhile.

I’m very proud of what I built. The Nexus 6p — a recent addition to my device portfolio — is the first device to really showcase what the app is capable of, and I consider the project an enormous personal success.

Gallus was very technically challenging, and did something claimed impossible (still not achieved by anyone else). It doesn’t work great on every device, though, because not every device in Android land is great. And when someone has a device with a terrible gyroscope or a system image with invalid field of view settings, they’re going to blame Gallus and not their device. Such is the fun of the Android world if you’re doing anything more than a couple of basic forms1.

But it had a terrible interface, as the technology was my focus, not the UI. The settings page was awash in obscure settings (owing to the extreme variance of Android devices, an issue that makes such a solution magnitudes harder than the same thing on iOS), and the main interface…well, personally I thought it was perfectly fine, but a lot of users seemed to have incredible trouble figuring it out.

While I understood every complaint about the mess of a settings page, the difficulty with the main interface surprised me. The iconography wasn’t completely literal, but it was built with the intention that users would just try the buttons and quickly discover what each does. This was a wrong assumption: a recurring feedback trend was users toggling basic settings off and then complaining about the result. Stabilization, for instance, is a toggleable button with a “shaking camera” icon, available only during playback/preview, yet a number of users would disable it, apparently inadvertently, and then complain that stabilization no longer worked. Another button enables/disables smartzoom (a toggleable button with “crop” iconography), which when disabled yields a result where the frame jumps around within the render (although the actual scene remains perfectly stabilized; personally I thought it was a fantastic solution and I prefer it over forced zoom), and again people would disable it and then complain about weird black bars appearing around their video.

Many people have more time and motivation to complain than to spend seconds trying out buttons to figure out what they do. This isn’t a “resent the user” statement, but a learned observation, and I suppose it could be considered beneficial: for every user who complained about a trivial interface, how many more just uninstalled and moved on?

I always intended to add some sort of inline help. The “wizard” sort of thing with bouncing arrows and walkthroughs of the interface as a series of steps: this is how you enable/disable stabilization; this is how you render to a video file that you can share; etc. It was never financially worth the time or effort (for a free app with no ads or monetization beyond being a vehicle to pitch some technology), but if I were going to do something, that was how I imagined doing it. In a way it would be like the obnoxious “tutorial” stage of many games, where you try clicking past each forced interaction as quickly as possible.

Which was all a big egotistical way of getting to the real subject of this post: Google recently started filling all of their Android apps with “Okay, Got It” staged tutorials. When you open that newly updated camera app it has a multi-step tutorial. The same for Gmail, Google Maps, and so on. Generally these appear on first run, though they seem to retrigger on minor updates even when the information hasn’t changed, and they force you to step through the various pages before you can use the app.

It’s amazing how quickly “Okay, got it” fatigue sets in: it’s a concept that works in isolation, but diminishes in value at scale. While I imagine it benefits green users, to most established users it quickly becomes a nuisance. As I’ve talked about before, the worst time to pester users is when they start the app. When I’m sitting on the side of the road in Niagara Falls trying to find my way out of some suburban enclave, having my navigation request bring me to a Google Maps tutorial wizard is not wanted, needed, or beneficial. When I pull up the camera app to capture a quickly passing moment, having the camera use that moment to teach me how to use it is…well, it’s the worst possible moment for it.

There has to be a better solution: some sort of master help interaction on the platform, triggered and observed by any app that opts in. The Okay, Got It approach does not scale, and already I’d say I’ve seen less than 5% of the content of these screens.

1 – Recently there was a bit of a hoopla around various camera apps not working correctly with the Nexus 5x. To explain the issue: the image sensor can be mounted in either of two orientations relative to the natural portrait position of the device. It just turns out that for the rear camera every single device did it one way, to the point that it was assumed to be a given, and many apps didn’t even have a code path for the alternative. The obvious conclusion is to standardize this (it should never have been a variation to begin with, but at this point it should be entrenched), and if the sensor is inverted, flip it at the system level before presenting it to the application, minimizing the hassle every single app would otherwise have to go through. I’ve complained about this before: instead of doing the work at the system level, it becomes a problem for every app developer to deal with, often poorly.

Not for Google though, and you see this endlessly throughout Android: simple things that should be standardized, either as a demand of the hardware or as a shim standardization offered by the system, are not, so every app has to carry countless permutations. To make it worse, you then have to just hope that your permutation actually works, or obtain every possible variation of device. I have sensor rotation code in Gallus, but I have no idea if it actually works correctly on the Nexus 5x (I mean, I know it should work, but many times there has been a schism between how I think things will work and how they actually do, so I say that with no confidence).