Friday, September 20, 2019

Why is the Garmin Firstbeat VO2 estimate inaccurate, can we do better?

As the final post in the "VO2 max estimation" series, I would like to take a stab at why the Garmin-Firstbeat prediction is, at least in my case, inaccurate.  Since I don't have access to a large number of paired gas exchange and Garmin watch VO2 measurements, I can only speak to my own experience.
To start with, as discussed before, the Firstbeat algorithm takes advantage of the near linear relationship between VO2 and heart rate.  This was explored previously but at the time I did not have my personal VO2 data, so let's take a look now and see if that holds.  Here is the gas exchange VO2 vs heart rate (last 30 sec average) from my ramp study:

  • The regression fit is quite good, with the observed points deviating by only about ±2 BPM.  I did not use the last point from the final stage since the data was suspect (see prior post).
  • The VO2 to heart rate relationship seems very clear from this ramp study.
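For readers who want to run the same kind of check on their own ramp data, here is a minimal sketch of a linear fit and its deviation in BPM. The numbers are invented for illustration (they are not the values from the ramp study), and the variable names are my own choices.

```python
import numpy as np

# Hypothetical ramp data (NOT the measurements from this post):
# heart rate in BPM and VO2 in L/min, one point per stage.
vo2 = np.array([1.5, 1.9, 2.3, 2.7, 3.1, 3.5, 3.9])
heart_rate = np.array([100.0, 112.0, 125.0, 137.0, 150.0, 161.0, 174.0])

# Fit heart rate as a straight line in VO2: hr ≈ slope * vo2 + intercept
slope, intercept = np.polyfit(vo2, heart_rate, 1)
predicted_hr = slope * vo2 + intercept
residuals = heart_rate - predicted_hr

# Largest distance of any observed point from the fit line, in BPM
max_abs_dev = float(np.max(np.abs(residuals)))

# Coefficient of determination for the straight-line fit
ss_res = float(np.sum(residuals ** 2))
ss_tot = float(np.sum((heart_rate - heart_rate.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

print(f"slope: {slope:.1f} BPM per L/min, intercept: {intercept:.1f} BPM")
print(f"max deviation: {max_abs_dev:.2f} BPM, R^2: {r_squared:.4f}")
```

A suspect point, such as the final stage mentioned above, can be dropped from both arrays before fitting.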

Since we can't measure VO2 on the road, does the VO2 to power curve seem close enough to use Watts as a surrogate marker?

  • The correlation is very good with a solid relationship of power in watts to VO2.

Finally, does power still match up with heart rate?  It should if VO2 tracks power.

  • There is an excellent fit of heart rate to power, except for the last point which was excluded.

As another example of the close relation of heart rate to power, here is a graph of an indoor ramp done with 5 minute stages to above the MLSS, but not to exhaustion.  

  • This fit is as tight as the one from the VO2 max ramp, with heart rate within 1-2 BPM of the fit line.

So if VO2 were substituted for watts, we would have a very nice, near linear association of heart rate and VO2.  

So what's the problem then?  Well, at least four things come to mind.
  • The heart rate may be different if the stages were longer.
  • Ambient temperature/humidity will affect heart rate to a variable degree.
  • Maximum heart rate needs to be reached for a true VO2 max.  Maximum rate prediction equations may or may not be valid for the given individual.
  • We are still stuck without an accurate VO2 to power equation.  In another discussion, the various formulas for watts/run speed to VO2 conversion were reviewed.  Perhaps the best is the Storer formula, but it was derived from continuous cycling ramps to failure with 1 minute stages and 30 W increments, a protocol that Garmin does not use.  Even if the Garmin formula's corrections for temperature, humidity and exercise time were perfect, the critical missing link is the actual VO2 at a given max power.

An example of field conditions.
I recently did a series of 5 minute intervals on the road (80°F and humid) and wanted to see how they matched up with the 5 minute indoor ramp from above.  The power values were 180, 240 and 300 W, shown in red below:

The three points in red are plotted along the indoor ramp.  The 300 watt value sits close to the ramp line, but the other two are quite a bit off.

This translated into a fair amount of watt differential:

The reason I show this is to put the "Performance condition" metric that Garmin displays in perspective.  My guess is that they are simply using a predefined heart rate to power formula, and then comparing your current heart rate to that.  As noted above, even a mild change in temperature or humidity can have a major effect, especially at the lower power figures.
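The comparison I am guessing at can be sketched in a few lines. This is only an illustration of the idea (a baseline heart rate to power line, then an offset for a field effort), not Garmin's or Firstbeat's actual algorithm, which is not public. All numbers and names below are hypothetical.

```python
import numpy as np

# Hypothetical indoor baseline (NOT the data from this post):
# stage power in watts and the heart rate in BPM at each stage.
baseline_power = np.array([120.0, 160.0, 200.0, 240.0, 280.0, 320.0])
baseline_hr = np.array([105.0, 117.0, 129.0, 141.0, 153.0, 165.0])

# Baseline line: hr ≈ slope * power + intercept
slope, intercept = np.polyfit(baseline_power, baseline_hr, 1)


def expected_hr(power_w):
    """Heart rate the indoor baseline predicts for a given power."""
    return slope * power_w + intercept


def watt_equivalent(hr_bpm):
    """Power the indoor baseline associates with a given heart rate."""
    return (hr_bpm - intercept) / slope


# Hypothetical outdoor interval in warm, humid conditions
field_power, field_hr = 180.0, 129.0

hr_offset = field_hr - expected_hr(field_power)      # BPM above baseline
watt_offset = watt_equivalent(field_hr) - field_power  # "watt differential"

print(f"HR is {hr_offset:.1f} BPM above baseline, "
      f"equivalent to {watt_offset:.0f} W of indoor effort")
```

With these made-up numbers a 6 BPM drift reads as a 20 W difference, which shows how a modest shift in heart rate turns into a sizeable apparent change in "condition".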

Is there a better surrogate than heart rate for VO2?
Although the relationship of heart rate to VO2 is superb in a lab setting or even at home on a trainer, in the real world it is not.  Do we have another choice for a physiologic surrogate?
Historically, the plot of ventilation (Ve) to VO2 is also very solid.  However, instead of a nice linear relation, it's curvilinear.  

Gastinger et al. looked at the relationships of Ve and heart rate to VO2 and felt that Ve provided a better correlation to VO2.


Notice that the Ve response deviates from a simple linear regression (the red line I drew), whereas the HR response stays straight.  The curvature in Ve would have been even more pronounced if they had tested at higher work rates:

Let's examine the Ve to VO2 graph from my test:
This is a plot leaving out the last stage with its questionable value.

This was based on the following third order equation:

Here is the Ve to power plot for the 5 minute stage ramp:
On this one I ran a third order fit as well as separate linear regressions below and above LT1.  The two approaches are actually pretty close, but the curvilinear trace seems better.
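Here is a minimal sketch of that comparison, a third order fit versus two straight lines split at LT1. The Ve values and the LT1 power are invented for illustration and are not my test results.

```python
import numpy as np

# Hypothetical 5 minute stage ramp (NOT the data from this post):
# power in watts and ventilation (Ve) in L/min.
power = np.array([100.0, 130.0, 160.0, 190.0, 220.0, 250.0, 280.0, 310.0])
ve = np.array([30.0, 36.0, 42.0, 49.0, 58.0, 70.0, 85.0, 104.0])
lt1_power = 200.0  # assumed LT1, hypothetical

# Scale power so the cubic fit stays numerically well conditioned
x = power / 100.0

# Third order (cubic) fit over the whole ramp
cubic_coeffs = np.polyfit(x, ve, 3)
cubic_resid = ve - np.polyval(cubic_coeffs, x)

# Two separate straight lines: one below LT1, one above it
below = power < lt1_power
line_below = np.polyfit(x[below], ve[below], 1)
line_above = np.polyfit(x[~below], ve[~below], 1)
piecewise_pred = np.where(below,
                          np.polyval(line_below, x),
                          np.polyval(line_above, x))
piecewise_resid = ve - piecewise_pred


def rmse(residuals):
    """Root mean square error of a set of residuals."""
    return float(np.sqrt(np.mean(residuals ** 2)))


cubic_rmse = rmse(cubic_resid)
piecewise_rmse = rmse(piecewise_resid)

print(f"cubic RMSE: {cubic_rmse:.2f} L/min, "
      f"piecewise linear RMSE: {piecewise_rmse:.2f} L/min")
```

The two-line version uses the same number of free parameters as the cubic (four), so the comparison is reasonably fair, but it also depends on picking LT1 correctly.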

Now let's place some other cycling session data points on the graph.  The following are two different outdoor sessions with comparison to the indoor ramp study.  The first was done in warm humid weather and the second at about room temp but outside.

     Equal ambient temperature:

    • There is a mild difference in watt equivalence between indoor and warm outdoor efforts (as expected).
    • The comparison of indoor to outside at equal temperature is closer especially at the higher ranges.
    • Any formula needs to take conditions as well as hydration into account.

    • Although in a lab or controlled setting VO2, heart rate, Ve and power are all well correlated, in actuality, there can be major differences to the indoor benchmark values out on the road.  
    • Although Ve may be a better choice than HR for relative VO2 assessment, there is no accurate conversion formula to derive O2 usage from Ve.
    • The Garmin Firstbeat methods do have some solid science behind them but the final product almost has to be flawed.  Not only does the heart rate relationship markedly change with conditions, it will also be affected by elapsed time and hydration.  These types of changes may vary from individual to individual, making a blanket correction difficult.
    • Caution should be used with the concept of "performance condition" as well as recommendations of your Garmin gear as to how to train.  In perfectly matched conditions, cycling at higher power with lower HR certainly implies an improvement in fitness, but conditions are seldom matched.
    • However, the biggest Achilles heel of the Firstbeat formula must be the conversion of power to VO2.  Somewhere in their calculations, above and beyond heart rate as a measure of the fraction of VO2 max reached, there must be some internal transformation of watts/run speed to oxygen usage.  My personal guess is that this is the main reason the estimation fails.  Remember, I used an accurate HRM/power meter during my VO2 max testing and the Garmin estimate was off by over 11%.
    • Until an accurate formula that converts watts/run speed to VO2 is available (probably never) no prediction algorithm can be precise for VO2 max.  
    • As a better way to track your VO2 max "status", surrogate tests such as a 4 or 5 minute steady state max power/speed effort or the Wingate 60 are reasonable alternatives.



      1. Fascinating series! I googled whether Firstbeat's training program was accurate, and I ended up here. Interesting insights and clear conclusions that even a non-scientific person could understand. Thank you for the time you put in.

      2. I made some tests using the latest Garmin 6 + HRM belt, and it seems an Anaerobic Training Effect is recorded only past the Anaerobic Threshold (top of Z3). However, as I understand it, Z3 is a mix of aerobic + anaerobic effort. So why does no training effect happen in their model?

      3. The Garmin models are not ideal in my opinion for many reasons (see my series of posts). Don't pay attention to them.