Wednesday, July 7, 2021

AI Endurance - DFA a1 accuracy initial review

Updated 7/13/21

It's with great pleasure that I'm updating this post after the AI Endurance team made a series of adjustments to their DFA a1 calculation methods (especially that troublemaker, detrending).  As the following data will show, their numbers are now spot on with Kubios.  The developers should be commended on this achievement and have my thanks.

A new implementation of DFA a1 has just become available through the web coaching site AI Endurance.  Since precise DFA a1 values are needed for accurate physiologic metrics (and therefore coaching suggestions), I wanted to take a quick preliminary look at their accuracy.  The data came from a 2 hour session of mixed zone 1 and zone 3 work that I did a few days ago.  For comparison's sake, I included Fatmaxxer data since, in my previous testing experience, that app has the closest similarity to Kubios for DFA a1.  Artifacts were present, but below 1-2% at all times (mostly zero).  This was based on data given to me by the AI Endurance team.
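For readers who want a feel for what is being compared, here is a minimal Python sketch of the textbook short-term DFA calculation (integrate, detrend within boxes, fit a log-log slope). It is not the Kubios, AI Endurance, or Fatmaxxer code. It assumes the usual 4 to 16 beat box range for a1 and an artifact-free RR series, and it leaves out the preprocessing (such as the detrending of the RR series itself) that those tools apply beforehand. The function names and the synthetic test series are made up for illustration.

```python
import math
import random


def _slope(x, y):
    """Least-squares slope of y against x."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / sxx


def _detrended_ss(segment):
    """Sum of squared residuals after removing a straight line from segment."""
    x = list(range(len(segment)))
    slope = _slope(x, segment)
    intercept = sum(segment) / len(segment) - slope * sum(x) / len(x)
    return sum((v - (slope * i + intercept)) ** 2 for i, v in zip(x, segment))


def dfa_alpha1(rr_ms, min_box=4, max_box=16):
    """Short-term DFA exponent of an RR series (ms); box range is an assumption."""
    if len(rr_ms) < 2 * max_box:
        raise ValueError("need at least 2 * max_box RR intervals")

    # Step 1: integrate the mean-centred RR series into a "profile".
    mean_rr = sum(rr_ms) / len(rr_ms)
    profile, running = [], 0.0
    for rr in rr_ms:
        running += rr - mean_rr
        profile.append(running)

    log_n, log_f = [], []
    for n in range(min_box, max_box + 1):
        n_boxes = len(profile) // n
        # Step 2: remove a linear trend inside each non-overlapping box.
        ss = sum(_detrended_ss(profile[b * n:(b + 1) * n]) for b in range(n_boxes))
        if ss <= 0:
            raise ValueError("RR series shows no variability")
        # Step 3: log of the RMS fluctuation F(n) for this box size.
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(ss / (n_boxes * n)))

    # Step 4: a1 is the slope of log F(n) against log n.
    return _slope(log_n, log_f)


# Synthetic series only (not real recordings): uncorrelated noise vs a random walk.
random.seed(1)
noise = [800 + random.gauss(0, 20) for _ in range(2000)]
walk, level = [], 800.0
for _ in range(2000):
    level += random.gauss(0, 5)
    walk.append(level)
print(round(dfa_alpha1(noise), 2), round(dfa_alpha1(walk), 2))
```

The uncorrelated series should come out far lower than the random walk. Real apps typically run this over a rolling window of a couple of minutes, and small differences in preprocessing are enough to move the result, which is why agreement with Kubios has to be checked rather than assumed.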

Here is the DFA a1 over time:

AI Endurance:


  • The a1 values of AI Endurance are now essentially the same as Kubios, especially in that all-important 0.75 area of the AeT (first blue circle).  They also reach appropriate nadir levels during HIT.
  • The post-HIT, fatigue-induced suppression of a1 also shows excellent agreement (purple circle).
  • Fatmaxxer data is very close to Kubios through the entire series as well.

A zoomed look at the all-important a1 = 0.75 zone, during cycling near the aerobic threshold:

  • We see superb agreement between methods (Kubios premium vs AI Endurance).
  • On an interesting note, a change of just a few watts of power is enough to knock the DFA a1 off the 0.75 threshold.  Without trust in the a1 calculation methodology/precision, erroneous conclusions could easily be made.


Bland Altman Analysis of AI Endurance

This plots the difference between each pair of points (AI Endurance vs Kubios) against the average of that pair.

Fatmaxxer for comparison
  • The correlation of AI Endurance with Kubios is excellent (r of .96, with a mean difference of only about 5% at low DFA a1).
  • We can easily see that AI Endurance has very close agreement with Kubios; the 5% mean difference is trivial.
  • Job well done by the AI Endurance team!!
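For anyone wanting to run the same kind of check on their own exports, the Bland Altman numbers can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the a1 values below are invented, not the data from this session, and the function name and the conventional 1.96 SD limits of agreement are my choices.

```python
import statistics


def bland_altman(reference, test):
    """Per-pair means and differences, plus bias and 95% limits of agreement."""
    if len(reference) != len(test) or len(reference) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [t - r for r, t in zip(reference, test)]      # y-axis of the plot
    means = [(t + r) / 2 for r, t in zip(reference, test)]  # x-axis of the plot
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }


# Invented a1 values for illustration only.
kubios_a1 = [1.0, 0.8, 0.6, 0.4]
other_a1 = [1.1, 0.8, 0.7, 0.4]
result = bland_altman(kubios_a1, other_a1)
print(result["bias"], result["loa_lower"], result["loa_upper"])
```

The bias is the average offset from Kubios and the limits of agreement show how far individual pairs scatter around it. A bias near zero with narrow limits is what "close agreement" means here.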

What about heart rate?  

Did I line the data up incorrectly?  If the data point pairs really were time shifted, we should see substantial HR discrepancy:

This looks very good (really can't do better) - r of 1.0 and about a .1 bpm mean difference!
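The same sanity check can be done numerically. The sketch below (hypothetical helper names, invented heart rate values) computes the Pearson r and mean difference between two HR series, then tries a few sample shifts to see which one lines them up best. If the best shift is not zero, the a1 point pairs were probably misaligned as well.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def mean_difference(reference, other):
    """Average of (other - reference) over all pairs."""
    return sum(o - r for r, o in zip(reference, other)) / len(reference)


def best_lag(reference, other, max_lag=3):
    """Return (lag, r) pairing reference[i] with other[i + lag] for the highest r."""
    n = min(len(reference), len(other))
    best = None
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = reference[:n - lag], other[lag:n]
        else:
            a, b = reference[-lag:n], other[:n + lag]
        r = pearson_r(a, b)
        if best is None or r > best[1]:
            best = (lag, r)
    return best


# Invented HR samples (bpm); "shifted" is the same trace moved by two samples.
hr_ref = [100, 105, 112, 120, 131, 140, 138, 129, 118, 110, 104, 101]
shifted = hr_ref[2:] + [100, 100]
print(best_lag(hr_ref, hr_ref)[0], best_lag(hr_ref, shifted)[0])
```

A lag of zero with r close to 1 and a mean difference near zero is the result to hope for.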

Another day's data - although I don't have the raw data, I wanted to share a rough comparison of Kubios, Fatmaxxer, Runalyze and AI Endurance during a session I did yesterday: a 2 hour cycling session with 2 HIT intervals.  The important point is that the nadir values during HIT and the post-HIT data are very close to Kubios in all methods.

The first is AI Endurance:


The others:

  • The red circles showing the DFA a1 nadirs are all very similar across methods.
  • Other portions of the data plot are also very similar.

 Update 10/23 - new findings of interest:


  • The AI Endurance DFA a1 calculation appears to closely match that of Kubios premium.  This is a remarkable achievement for the AI Endurance development team.  They now join the "elite club" of Runalyze and Fatmaxxer for Kubios-like accuracy.
  • My compliments to the AI Endurance team for pursuing this difficult goal.
  • Since DFA a1 is a dimensionless index, trustworthy data values are essential to threshold calculation as well as monitoring health and fatigue.  Although we don't need to "calibrate" a1 to lactate or gas exchange (the blessing), we do need to have these numbers consistent with what Kubios displays (the curse) since that is what we used for threshold and other published studies.
  • See this for ramp protocol comments - Ramp slope and HRV a1 thresholds - does it matter?

Heart rate variability during dynamic exercise


  1. Hello Bruce,

    I did the ramp test from AI Endurance with a Polar H10 on Zwift yesterday (starting at 105W and then +10W every minute) and imported my data from my Garmin Edge (including the HRV data). During the test I watched my DFA Alpha1 on the Garmin widget "DFA Alpha 1" and in parallel on my smartphone via Fatmaxxer.
    During the test I noticed that my DFA Alpha1 values in the Fatmaxxer app were lower at the beginning of the ramp, but more constant (not jumping around as much), than my values on the Garmin (DFA Alpha 1 widget).
    So I crossed my VT1 (0.75) on Fatmaxxer about one minute earlier than I did on my Garmin (185W against 194W, that's OK). But about 1.5 minutes later my Fatmaxxer values started to read higher than on my Garmin (but still more believable), and I crossed my VT2 (0.5) around 4 minutes later than I did on the Garmin (272W against 235W on my Garmin).

    So all in all I'm finding the Fatmaxxer numbers very reliable and plausible.
    In later analysis comparing Fatmaxxer and Runalyze (Garmin HRV data), I side more with the Fatmaxxer data (for example, my DFA Alpha 1 went down to 0.2 in Runalyze, but only down to 0.3 on Fatmaxxer, which I trust more).

    BUT the analysis from AI Endurance (also using Garmin HRV data, as in Runalyze) looks nearly the same as the Fatmaxxer numbers, so it is also very trustworthy.

    I'm relatively new to this DFA Alpha1 thing, so I don't really understand why there are such different numbers from the same data source. I would love to use the Garmin widget DFA Alpha 1 because it's much easier than using Fatmaxxer on the smartphone, especially when riding outside, but I don't really trust it, for whatever reason.

    I would appreciate it very much if you could find a little time to give a short explanation of this :)
    PS: If you would like my data or diagrams, I can make them available to you.

    THANKS for the great and interesting Blog!

    1. I'm glad you are getting decent results with Fatmaxxer and the 2 web apps. When a new app comes out aiming to measure a1, I always assume it is erroneous until proven otherwise. Although I could be wrong, I have difficulty imagining that a widget on a Garmin device can mimic the other methods. As to why they are subtly different, the algorithms are close but not identical.