It’s a good smart watch

I’m a big fan of the Apple Watch and its health metrics[1]. It’s a great platform for health and activity tracking, and is just generally a great device. The SE devices are a wonderful value if you’re okay with a mediocre, yet workable, battery life.

That is actually my single real complaint about the devices: competitors sometimes offer weeks of usage with a similar feature set, yet Apple keeps boasting about making the Watch lighter or giving it a slightly bigger screen while barely improving on the 18-hour battery life. It would be awesome if you could go away for the weekend, or even the week, without worrying about the special needs of charging your watch. And of course, for children it’s one of those annoyance/benefit trade-offs where they’ll often end up with a dead device.

Yes, you can charge it while showering and during other activities and make its limitation work with minimal fuss. It still would be better if you didn’t have to.

And my suspicion is that Apple keeps such a marginal battery in the device to ensure that, as the battery fades, you upgrade purely to get back to a workable battery life. If the device started with a two-week battery life, few would be motivated to upgrade when, a couple of years in, it’s down to a week. Seeing your 18 hours drop to 9 hours, on the other hand, is a very different story.

Regardless, one measure I’ve paid attention to is VO2Max (pdf), using it as a general measure of overall health. With my Polar H10 and its VO2Max feature I get very different numbers (much better numbers, for what it’s worth), but I only use the H10 occasionally, so it’d be nice if the Watch proved useful on this measure, at least for seeing trends.

But don’t read too much into the VO2Max Measures

I’ve become suspicious of the value of the VO2Max measure, however, at least with regard to its movements in one direction or the other.

I’ve had activities where my heart rate has stayed relatively low and consistent, and yet my VO2Max inexplicably drops. Others where I’m a bit under the weather and my heart rate is higher than normal, yet my VO2Max improves. I’ve had walks where I’m carrying 40 lbs of groceries[2] and expecting a big VO2Max penalty (the watch has no idea of the burden I’m carrying, so my body is suddenly working harder for what looks like the same effort), yet it still rises.

This matters to me because there is the implication that the calculations are seeing something deeper. So when my VO2Max drops, it makes me concerned. It makes me wonder if there is some underlying health condition that the Apple Watch picked up on early.

This led me to export the heart rate data from a set of activities to try to find any correlation with the movements of the VO2Max measure. I found none, and the movements seemed almost random. In broad strokes, my overall fitness level justifies the broad band within which my VO2Max sits, but the number rises and drops by up to 20% seemingly based upon essentially nothing.
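
For anyone who wants to repeat that kind of check, here is a minimal sketch. It assumes you have already reduced each exported activity to an average heart rate and the VO2Max value reported after it. The field names (avg_hr, vo2max), the helper hr_vs_vo2_change and every number below are placeholders made up for illustration, not my exported data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def hr_vs_vo2_change(activities):
    """Pair each activity's average heart rate with the VO2Max change it produced."""
    avg_hrs = [a["avg_hr"] for a in activities[1:]]
    changes = [b["vo2max"] - a["vo2max"] for a, b in zip(activities, activities[1:])]
    return avg_hrs, changes


# Placeholder values only (hypothetical, in chronological order).
activities = [
    {"avg_hr": 112, "vo2max": 41.2},
    {"avg_hr": 118, "vo2max": 40.6},
    {"avg_hr": 109, "vo2max": 41.5},
    {"avg_hr": 121, "vo2max": 41.9},
    {"avg_hr": 115, "vo2max": 40.8},
    {"avg_hr": 110, "vo2max": 41.1},
]

avg_hrs, changes = hr_vs_vo2_change(activities)
print(f"correlation between average HR and VO2Max change: {pearson(avg_hrs, changes):+.2f}")
```

A value near zero would match what I describe here; a clearly negative value (higher heart rate for the same effort, lower score) is what you would expect if the heart rate response were actually driving the number.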

Over the spring and early summer I was doing a significant number of activities for which the Watch reports VO2Max. Every day there would be at least two sessions.

My VO2Max kept improving, jumping almost 20% over a month or two. These were pretty low effort sessions so I was a bit surprised that they seemed to have such an outsized health benefit.

As the heat of the summer settled in — and Toronto summers are very, very hot and humid — I stopped doing those activities and moved to alternate cardio exercises that weren’t relevant for Apple’s VO2 estimates, yet were even more potent for actual cardio training. To be clear, I was absolutely doing activities that maintain if not improve VO2Max, but not under the watch of the Apple Watch.

So on the rare occasion where I did do an Apple Watch monitored VO2 effort, my score kept dropping and dropping. Every week it was a new low. Again I exported actual heart rate data for the relevant activities and it didn’t support this drop, and seemed comparable to heart rate response from the spring when my VO2Max was hitting personal highs.

What gives?

The Man Behind The Curtain

The Apple paper linked above heralds the tight correlation between the Apple Watch’s VO2Max measures and formal measurements using a lab test with a max effort, mask-monitored session. While individuals often differed between measurement devices significantly, it averaged out across the set.

I think that correlation might be overvalued a bit. It is one of those cases where just having rudimentary measures about a person (their age, sex, weight, height, etc.) is enough to estimate a set of people to a close to 1.0 correlation: exceptional people will fall above or below your guess, but it will average out. Getting a good correlation seems to be mostly a parlour trick, in the same way that you can guess someone’s age of death using mortality tables and also hit an ~1.0 correlation. Some will die much earlier, some much later, but the whole point of mortality tables is that it averages out.
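
To make that argument concrete, here is a small simulation sketch. Everything in it is an assumption for illustration: the population, the coefficients in demographic_estimate and the size of the individual variation are invented, and none of it comes from Apple’s paper. It only shows that an estimate built from age and sex alone can account for most of the spread across a population while still missing individuals by a wide margin.

```python
import random
from statistics import pvariance

random.seed(0)  # fixed seed so the run is repeatable


def demographic_estimate(age, is_female):
    """Guess VO2Max from age and sex only (invented coefficients)."""
    return 50.0 - 0.35 * (age - 20) - 6.0 * is_female


people = []
for _ in range(2000):
    age = random.uniform(20, 70)
    is_female = random.randint(0, 1)
    estimate = demographic_estimate(age, is_female)
    # "True" value = demographic trend plus individual variation (assumed SD of 3).
    true_value = estimate + random.gauss(0, 3.0)
    people.append((estimate, true_value))

true_values = [true for _, true in people]
errors = [true - estimate for estimate, true in people]

# Share of the population's spread that the demographics-only guess accounts for.
explained = 1 - pvariance(errors) / pvariance(true_values)
# Largest miss for any single person.
worst_miss = max(abs(err) for err in errors)

print(f"share of variance explained: {explained:.2f}")
print(f"worst individual miss: {worst_miss:.1f}")
```

In this toy setup the population-level agreement looks impressive even though the estimate contains no measurement of the individual at all, which is the sense in which a strong correlation across a group says little about whether one person’s number is tracking anything.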

Apple documentation on the VO2Max measure makes a big deal about doing relevant exercises frequently, purportedly to give it more data to analyze. After months of monitoring this, I am confident that frequency of activity as observed by Apple Watch is the most important input into their equation, regardless of the heartbeat response to an activity.

It starts by setting you to an average VO2Max for your particulars, then maybe adjusting based upon significant heart rate variations from estimates. As you do Apple Watch activities frequently, it will improve your score regardless of heart rate response. If you do activities less frequently, it will drop it.
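
As a toy illustration of that hypothesis (and only that: this is my guess at the behaviour, not Apple’s algorithm), the sketch below moves a score up or down purely on how many sessions were logged each week. The function name simulate_score, the weekly target, the step size and the 20% band are all invented parameters, and heart rate is deliberately not an input.

```python
def simulate_score(baseline, weekly_sessions, target_sessions=7, step=0.5, band=0.2):
    """Toy frequency-driven score: heart rate response is ignored entirely.

    baseline        -- starting estimate from age, sex, weight, etc.
    weekly_sessions -- number of qualifying sessions logged in each week
    target_sessions -- sessions per week needed for the score to rise (assumed)
    step            -- weekly change in the score (assumed)
    band            -- score stays within +/- this fraction of baseline (assumed)
    """
    low, high = baseline * (1 - band), baseline * (1 + band)
    score = baseline
    history = []
    for sessions in weekly_sessions:
        # Frequent weeks nudge the score up, sparse weeks nudge it down.
        score += step if sessions >= target_sessions else -step
        score = max(low, min(high, score))
        history.append(round(score, 1))
    return history


spring = [14] * 8  # two sessions a day
summer = [1] * 8   # the occasional monitored walk
history = simulate_score(40.0, spring + summer)
print(history)
```

With these made-up numbers the score climbs through the eight busy weeks and then gives all of it back over the eight sparse ones, which is the same shape as the rise and fall I saw, without any change in fitness being modelled.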

Conclusion

Clearly Apple was concerned with wide variations in VO2Max scaring users or making the measure look unreliable, so they put in a longer-term averaging that makes general trends almost worthless. If I am feeling sick and am carrying 40 lbs of groceries, I would fully expect my VO2Max to drop significantly, yet instead I see it actually improve because for the two weeks prior I took daily walks. And so on. None of that shows up in the measure: the averaging massages it out, and the number seems to be used mostly to gamify doing activities, acting as a bit of a fake feedback loop.

That is unfortunate. It makes it, in my opinion, a poor measure of health, and at best a measure of “how many VO2Max-qualified activities have you done in the preceding period?”.

Footnotes

  1. As detailed in the opinions page

  2. I love walking and getting groceries as a bonus of being in a suburb that is near many amenities. Occasionally I underestimate the weight of the groceries I’m carrying and it gets a bit ridiculous.