Tuesday, July 30, 2019

HRV artifact avoidance vs correction, getting it right the first time

HRV analysis is a heavily researched subject that has been touched on in some of the previous posts on this blog.  However, one aspect of RR interval interpretation that is sometimes overlooked is the way artifacts are corrected.  In this post we will briefly examine artifact types and their impact on HRV interpretation, but most importantly we will come to realize that avoiding artifacts is far preferable to correcting them.  In the same way that a master chef attempts to compensate for inferior ingredients, HRV software will try to compensate for various RR artifacts and still produce a superior end product.  Unfortunately, just as the chef cannot produce a high quality meal from inferior material, the HRV analysis may also be tainted by artifact, despite correction algorithms.

Types of artifacts:
Here is a graphical look at some types of artifact, taken from a recent study comparing the Polar H10 vs an ECG based device:

The top tracing shows a missed RR interval, middle tracing shows a misinterpreted RR peak and the bottom shows a section of noise masking any usable data.

Another study classified them in a more detailed fashion:

 With examples of the RR interval change:

  • The type 4 artifact is the missed RR peak so the RR length is effectively doubled.

  • The missed beat (type 4) is also the most commonly seen, according to that study:
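To make the "doubled RR length" idea concrete, here is a minimal Python sketch that flags intervals roughly twice as long as their neighbours and reports an artifact percentage. The function names, the 1.7 to 2.3 ratio band, the window size and the sample RR values are all illustrative assumptions, not values taken from either study or from any HRV package.

```python
from statistics import median

def flag_missed_beats(rr_ms, window=5, low=1.7, high=2.3):
    """Return indices of RR intervals that look like one missed beat.

    An interval is flagged when it is roughly double the median of its
    neighbours. The ratio band (low, high) is an assumption for illustration.
    """
    flagged = []
    for i, rr in enumerate(rr_ms):
        # Neighbours on both sides, excluding the interval under test.
        neighbours = rr_ms[max(0, i - window):i] + rr_ms[i + 1:i + 1 + window]
        if not neighbours:
            continue
        ratio = rr / median(neighbours)
        if low <= ratio <= high:
            flagged.append(i)
    return flagged

def artifact_percent(rr_ms, flagged):
    """Artifact rate as a percentage of recorded intervals."""
    return 100.0 * len(flagged) / len(rr_ms)

# Hypothetical series near 150 bpm (about 400 ms) with one missed beat.
rr = [402, 398, 405, 400, 803, 399, 401, 404, 397, 400]
missed = flag_missed_beats(rr)
print(missed, artifact_percent(rr, missed))
```

A simple ratio test like this only targets the missed beat pattern; the other artifact types in the classification would need their own rules.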

Does the missing of beats matter?
Before attempting to answer this question, let's first look at the "hierarchy" of RR recording device quality.  The most basic heart rate monitor attribute we generally think of is simple heart rate tracking.  Does the monitor actually measure the heart rate under challenging conditions?  As discussed in a previous post, optical, wrist worn HRMs are not even able to track basic heart rate with precision during intense exercise.  Some forehead based optical devices track fairly well (Moov sweat), but some do not (Polar OH1).  Since we need absolute precision in where the RR peak is, the "Plethysmographic" sensor approach with optical methods is not going to work for us.  

That leaves us with chest belts like the Polar H10, ECG capable garments like the Hexoskin or a dedicated ambulatory ECG monitor.  

One recent paper pitted the mobile Schiller AR12 ECG system against the Polar H10 during various activities.  

They found that the AR12 ECG system had many more artifacts than the Polar H10:
The total RR interval artifacts while jogging were a whopping 12% with the AR12 system and 0.72% with the Polar.  One takeaway is that the AR12 system can't handle activity.  The major question one is left with is whether the H10, with an artifact rate near 1%, is ideal for HRV purposes.  There is no doubt that it is fine for rate tracking, but does even a low artifact rate affect HRV?

Avoidance vs Correction:
Probably the most definitive evidence that correction of artifacts can't fix some HRV parameters is the study by Giles and Draper.  I would strongly recommend reading their paper.  
The purpose of the study:
However, the methods that are used for
the correction of artefacts in recent literature vary considerably.
Although several studies have used various interpolation
techniques (8,12), others simply deleted offending
intervals (20) or, most frequently, do not mention correction
at all (13,14,31). Equally, research that has specifically examined
the validity of HRMs has used several methods (4,32).
As such, there is a need for standard practice in the collection
and processing of RR interval data recorded using
HRMs in research as, currently, there are no standard criteria.

Therefore, this study aimed to examine current methods
for error correction, along with the extent of alterations in
artefact occurrence during exercise, in comparison with
simultaneously recorded ECG
The subjects performed a VO2 max test on a treadmill wearing both a Polar H7 and a set of ECG leads.

Artifact correction techniques:
Various methods were used including the Kubios formulas -
The methods used for artefact correction were (a) Uncorrected,
no correction applied to any intervals; data were left as
recorded. (b) Deletion, erroneous RR interval(s) were simply
deleted from the time series. Deletion can have a significant
effect on HRV parameters because of changes in the length
of the signal, particularly in short-term recordings and
frequency-domain parameters (23,26). Deletion introduces
step-like shapes into the RR interval time series resulting
in changes in variability as well as decreasing the length of
the signal, producing false low frequency (LF) and high frequency
(HF) components (26). Interpolation: Interpolation
methods, in contrast to the deletion, replace erroneous nonnormal
RR intervals with interpolated intervals. Critically,
interpolation allows for the length of the recording to remain
the same, mitigating the issue of reduced signal length. (c)
Degree Zero (average), substitution of artefact(s) with a mean
value that is calculated from surrounding RR intervals. On
longer sections of artefacts, degree zero interpolation results
in the same averaged value over a whole segment, resulting
in a flat shape, introducing false trends, and increasing LF
and very low frequency (VLF) power (5,26). (d) Degree One
a straight line is fitted over the irregular intervals to
obtain new values. As with degree zero, on longer sections of
artefacts, slope-like shapes occur, introducing false trends
and potentially increasing LF and VLF components (5,26).
(e) Cubic, cubic interpolation uses 4 datum points to compute
the polynomial; there are no constraints on the derivatives.
Cubic interpolation does not result in flat sections of
data. However, as nonlinear analysis is concerned with the
complexity and irregularity of heartbeat series, the introduction
of a potentially falsely correlated signal is of concern,
particular if there is a significant number of erroneous
intervals (23,26). (f) Spline, cubic spline interpolation.
Smooth values are estimated through a number of datum
points by fitting a third-degree polynomial. Cubic spline
interpolation computes a third-order polynomial from only
2 datum points with the additional constraint that the first
and second derivatives at the interpolation points are continuous.
As with cubic interpolation, spline interpolation
may also introduce false correlations into the signal. (g)
Kubios HRV software (Version 2.2), Kubios software provides
options for the detection and correction of ectopic
beats and artefacts
 As noted in the table above, type 4 errors were the most prevalent, almost 97% of the total.
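As a rough illustration of the simpler methods quoted above, here is a Python sketch of deletion (b), degree zero (c) and degree one (d) correction applied to a flagged interval. It is not the code used by Giles and Draper or by Kubios; the function names, the neighbour window and the sample data are assumptions made for the example.

```python
def correct_deletion(rr, bad):
    """(b) Deletion: drop flagged intervals, so the series gets shorter."""
    return [v for i, v in enumerate(rr) if i not in bad]

def correct_degree_zero(rr, bad, window=3):
    """(c) Degree zero: replace with the mean of surrounding good intervals."""
    out = list(rr)
    for i in bad:
        near = [rr[j]
                for j in range(max(0, i - window), min(len(rr), i + window + 1))
                if j not in bad]
        if near:
            out[i] = sum(near) / len(near)
    return out

def correct_degree_one(rr, bad):
    """(d) Degree one: straight line between the nearest good intervals."""
    out = list(rr)
    for i in bad:
        left = next((j for j in range(i - 1, -1, -1) if j not in bad), None)
        right = next((j for j in range(i + 1, len(rr)) if j not in bad), None)
        if left is None or right is None:
            continue  # cannot interpolate at the edges, leave as recorded
        frac = (i - left) / (right - left)
        out[i] = rr[left] + frac * (rr[right] - rr[left])
    return out

# Hypothetical series with one missed beat (802 ms) at index 3.
rr = [400, 404, 398, 802, 402, 396]
bad = {3}
deleted = correct_deletion(rr, bad)
flat = correct_degree_zero(rr, bad)
linear = correct_degree_one(rr, bad)
print(deleted, flat[3], linear[3])
```

Note that each of these replaces the doubled interval with at most one value, so about one beat of recording time disappears from the corrected series.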

How did the error correction methods affect the HRV parameters?
For the most part pretty well as long as the effort was below 50% VO2 max and the indexes were not the non linear variety (entropy, SD1):

However, as noted below, artifact correction of any kind has major difficulty restoring accurate values for the non linear indexes such as SD1 and sample entropy.  
Linear interpolation produced corrected
intervals with the lowest bias and ES. However, values
of RMSSD, LF:HF ratio, SD1, and SampEn at 60%+
VO2max exercise intensities showed large bias and ES, and
increased LoA and reduced ICC, regardless of correction
They did not look at DFA a1, but I will have more on that shortly.
 Some points from the discussion:
Briefly, artefacts increased relative to exercise intensity, to a peak of 4.46%
during recordings made at 80–100% VO2max. Artefact correction
was necessary, with large percentage bias and ES of
HRV parameters in all but supine and standing recordings;
correction resulted in reduced bias and ES for resting and 60% VO2max recordings with all methods.
Caution should be given to the interpretation of RMSSD, LF:HF ratio,
SD1, and SampEn at high (60% + VO2max) exercise intensities,
as, even when correction methods were applied, large
amounts of bias were still present.
As most recent HRM validation studies were
conducted either at rest or during an exercise that did not
involve upper body movement (e.g., cycle ergometry), the
observed occurrence of artefacts has previously been very
low, at less than 1% (12,20,31). The criterion for the identification
of artefacts also varies considerably among studies
and, as such, they may also be significantly underreported.
Inserting linearly interpolated
intervals can create flat sections with little or no variability
and it is advised that large sections of ectopic beats should
not be edited using these techniques
Kubios HRV software provides options for the automatic
detection and correction of ectopic beats and artefacts.
However, despite Kubios appearing to accurately detect
most artefacts, the results showed larger bias and ES
compared with both the ECG and the manually corrected
cubic spline interpolation. Both correction methods used the
same interpolation calculation in Matlab. Figure 2 shows
a short section of data with a single type 4 artefact present;
it is apparent that although Kubios software does accurately
identify the artefact (Figure 2a), the erroneous beat is only
replaced with a single interval, rather than the required 2
(Figure 2b)
Personal note - Hopefully Kubios has corrected this.
As the results of this
study demonstrate, artefact correction is necessary for RR
intervals obtained from HRMs. Correction of artefacts
with a simple linear interpolation reduced bias and ES
and increased ICC, in most but not all cases: caution
should be given to RMSSD, LF:HF ratio, SD1, and
SampEn at high (60%+ VO2max) exercise intensities
Where possible, select sections of RR intervals that are
artefact free.
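To show why a single missed beat matters so much for the beat to beat indexes, here is a small Python sketch that computes RMSSD and SD1 on a clean series, on the same series with two beats merged into one, and after splitting the doubled interval back into two. The data are invented, and splitting into two equal halves is a simplifying assumption (it happens to recover the original values here only because the two merged beats were identical).

```python
import math

def rmssd(rr):
    """Root mean square of successive RR differences."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def sd1(rr):
    """Poincare SD1: SD of successive differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    mean = sum(diffs) / len(diffs)
    var = sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)
    return math.sqrt(var / 2)

def split_missed_beat(rr, i):
    """Replace one doubled interval with two equal halves (an assumption)."""
    half = rr[i] / 2
    return rr[:i] + [half, half] + rr[i + 1:]

# Hypothetical data: the two 401 ms beats are merged into one 802 ms interval.
clean = [400, 404, 398, 401, 401, 402, 396]
with_artifact = [400, 404, 398, 802, 402, 396]
repaired = split_missed_beat(with_artifact, 3)

for label, series in (("clean", clean), ("artifact", with_artifact),
                      ("repaired", repaired)):
    print(label, round(rmssd(series), 1), round(sd1(series), 1))
```

One uncorrected missed beat in a short window inflates both indexes many times over, which is consistent with the study's advice to pick artefact free sections where possible.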

My experience
Recently the Hexoskin team has released a new improved shirt design.  It is essentially a simplified version of the Astroskin garment but containing only a single lead ECG (not 3).  The elastic straps along the chest and abdominal areas are gone, replaced by an integrated adjustable elastic band:

In the past I have had good luck with the Hexoskin as far as heart rate tracking is concerned.  I never looked at the artifact percent of either the Hexoskin or Polar H10 but let's do so now.  
The conditions of use included pre wetting the Hexoskin/H10 and applying conductive cream to all sensors.

Power profile:
My goal here was to do about 3 minutes at my maximal aerobic power, then reduce to just under the MLSS (lactate steady state) for training both VO2 max as well as lactate disposal.  The heart rate was high, between 140 and 170 bpm.  According to some previous measurements, DFA a1 complexity should have stayed near the white noise value of .5 throughout.

Here is the Polar H10 tracing with the corresponding DFA a1 (rolling 60 second window tracking, update graph points every 10 seconds of the activity).  This is from Kubios V3.3, premium version with their automatic artifact correction.
The vertical lines in the top tracing are the artifact corrections
  • The loss of RR complexity  (DFA a1 drop below .5) occurs during the VO2 max/MAP interval of 360 watts (as it should).
  • There is regain of complexity, DFA a1 rise, during the subsequent moderate to high power portion (it should not have done this).
  • Artifacts (circled in black) were 1.81%.
Does this pattern hold up?
The Hexoskin shows a much different picture:
  • Although the initial DFA a1 drop is similar, the values never regain complexity throughout the complex interval.  I drew a line along the .5 limit (white noise), which is far different than the Polar tracing above.
  • Almost no vertical lines - perhaps 4 artifacts:
  • Despite a relatively low artifact rate on the H10 and automatic correction by Kubios premium V3.3, there is a major discrepancy in the non linear DFA a1 parameter.
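For readers curious about what is behind the DFA a1 number, here is a bare bones Python sketch of detrended fluctuation analysis over short box sizes (4 to 16 beats). This is a textbook style version written for illustration; Kubios may differ in details such as box overlap and detrending options, and the function names and test signals are my assumptions.

```python
import math
import random

def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

def _detrended_ss(box):
    """Sum of squared residuals after removing a straight line from one box."""
    xs = list(range(len(box)))
    slope = _slope(xs, box)
    mx, my = sum(xs) / len(xs), sum(box) / len(box)
    return sum((y - (my + slope * (x - mx))) ** 2 for x, y in zip(xs, box))

def dfa_alpha1(rr, scales=range(4, 17)):
    """Short-term scaling exponent: slope of log F(n) against log n."""
    mean = sum(rr) / len(rr)
    profile, total = [], 0.0
    for value in rr:                      # integrate the mean-subtracted series
        total += value - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        boxes = len(profile) // n         # non-overlapping boxes of n beats
        if boxes == 0:
            continue
        ss = sum(_detrended_ss(profile[b * n:(b + 1) * n]) for b in range(boxes))
        log_n.append(math.log(n))
        log_f.append(0.5 * math.log(ss / (boxes * n)))  # log of RMS fluctuation
    return _slope(log_n, log_f)

# Synthetic check signals (not RR data): uncorrelated noise and a random walk.
random.seed(1)
white = [random.gauss(0, 1) for _ in range(1200)]
walk, level = [], 0.0
for step in white:
    level += step
    walk.append(level)
alpha_white = dfa_alpha1(white)
alpha_walk = dfa_alpha1(walk)
print(round(alpha_white, 2), round(alpha_walk, 2))
```

Uncorrelated noise gives a low exponent and a random walk a high one. Because the exponent depends on the fine beat to beat structure, a few wrongly placed or interpolated beats inside a 60 second window can plausibly move it.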

Updated data - 11/26/19
I decided to compare the Hexoskin to the Polar H10 worn at the same time over two 7 minute intervals.  The first was at 173 watts, immediately followed by 7 min at 195 watts.
Here is a table of both SampEn and DFA a1, along with the artifacts and correction methods:

  • The DFA a1 at 195 watts derived from a zero artifact tracing (Hexoskin) is below .5, yet the moderate artifact Polar file yields values .8 or higher.  The discrepancy is present at the low watt interval as well (.8 with Hexoskin vs 1.4 with Polar).  I tried both the standard medium correction method as well as the auto method that comes with the premium version.
  • The artifact levels encountered are within the range accepted in the literature.  I will continue to work on getting this issue defined and resolved.  The obvious risks are failure to reproduce study results and athletes training at the wrong intensity.

Sample Entropy
Here the tracings are also very disparate (as pointed out in Giles and Draper's work):

  •  Sample entropy is well below 1 throughout the complex interval.
The Polar H10:
  • Sample entropy is close to 2, well above and almost double what the Hexoskin showed.
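Here is a compact Python sketch of sample entropy, to show what the index is measuring. The settings (m = 2, tolerance r = 0.2 times the standard deviation) are common defaults that I am assuming here; I have not confirmed they match the Kubios configuration used for the figures, and the test signals are synthetic.

```python
import math
import random

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A / B), where B counts template pairs of length m that
    match within r, and A counts those that still match at length m + 1."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * sd

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return -math.log(a / b)

# Synthetic signals: a perfectly regular pattern and uncorrelated noise.
regular = [0, 1] * 100
random.seed(0)
noisy = [random.gauss(0, 1) for _ in range(300)]
print(sample_entropy(regular), round(sample_entropy(noisy), 2))
```

A perfectly predictable series scores at zero and random noise scores high, so anything that adds irregularity to the RR series, including imperfectly corrected artifacts, would be expected to push the value up.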

HF peak frequency:
Some data indicate that HF peak can be helpful as a marker of exercise load and respiratory rate.  Does a small artifact rate affect this as well?

Hexoskin HF peak:

Polar H10 HF peak:
  • They are both about .8 to 1.0, with the Polar having some drop outs.
  • The values are much closer here indicating that artifacts for this metric are more correctable.
What makes the Hexoskin so good?
Theoretically, the H10 should produce comparable RR interval data.  In real life, on the road, the Hexoskin has the advantage of being designed as a stable 1 lead ECG monitoring device.  With proper sensor contact the newer Hexoskin garment produces a noise free tracing.
Here are 2 raw extracts from the session I did for the above Kubios figures:

This was at the end of the 3 minute, 360 watt beginning segment of the above figures

This was at the end of a Wingate 60 (maximal all out 60s) later in the morning.  This has always been the most challenging to get right.  There is tremendous back and forth body motion, arm and torso muscular activity with a rapidly changing ECG:

  • Both tracings look as if they were made from a patient at rest with a good quality ECG machine.
  • If there was an "artifact" we would be able to classify it.
I am not criticizing the Polar, it is a great unit.  The artifact rate was still below 2% which many would say is fine.  What I am pointing out is that one needs to be cautious using a "blind" system like the Polar.  There is no visual tracing to evaluate.  For instance, if the Hexoskin analysis reported high artifacts we can see what they actually were.  The presence of even relatively low artifact rates can be problematic with the non linear HRV metrics.  Since RR complexity may be useful as a training intensity demarcation, getting it done accurately is essential to a proper zone limit.

For the older athlete:
Perhaps the "artifacts" were not noise or missed RR intervals.  Masquerading in the mass of artifacts could be a potentially severe arrhythmia, not something we would want to miss especially in the realm of aging athletes.  Part of my post ride analysis is a beat by beat review of the intense intervals, looking for any sign of arrhythmia. In time perhaps, more leads will become available with the Hexoskin and ischemic change could be looked for as well.

  • Artifacts in the recording of RR intervals are common.
  • The most common type of artifact is that of a missing beat.
  • Many correction algorithms exist to "compensate" for these artifacts.
  • Most formulas will do a reasonable job at correction, producing accurate HRV results with the exception of the non linear indexes.
  • According to the study by Giles and Draper, RMSSD, LF:HF ratio, SD1, and
    SampEn at high (60%+ VO2max) exercise intensities will be problematic.
  • My data indicates that DFA a1 will also be misrepresented at an artifact rate below 2%.
  • The Polar H10 is an excellent device for heart rate tracking but in real world usage is prone to a low but significant artifact rate.  This should be examined carefully by the researcher or end user.
  • The new Hexoskin garment with integrated straps produced almost artifact free tracings even under the most challenging conditions.  This device should be considered for those interested in artifact free recordings.  
  • It is better to avoid the artifact than correct it!

Heart rate variability during dynamic exercise

Friday, July 19, 2019

First and second Lactate thresholds, why the confusion, how to measure?

There have been innumerable articles, posts and comments published on lactate thresholds.  Recently I was interested in figuring out what my "threshold" was but was astounded to see the multitude of opinions as to what this even means.  Even time honored concepts such as the MLSS are controversial in what they identify, what they correspond to and of course how best to measure them.  The situation becomes even murkier lower down the lactate curve at the point where small amounts of the substance start to accumulate as work loads pass a certain point.  It seems that the aerobic to anaerobic transition is not an all or none phenomenon but a smooth integrated response involving components of each system.  This post will review some of the published work regarding the relationship of work load induced lactate rise, "threshold" definitions and most importantly how to measure the cycling power (or running speed) associated with the first initial rise in lactate.  

One of the best overall reviews was by Faude several years ago which I recommend reading.  They reviewed the literature extensively and found:
A total of 25 different LT concepts were located. All concepts were divided
into three categories. Several authors use fixed bLa during incremental
exercise to assess endurance performance (category 1). Other LT concepts
aim at detecting the first rise in bLa above baseline levels (category 2). The
third category consists of threshold concepts that aim at detecting either the
MLSS or a rapid/distinct change in the inclination of the blood lactate curve
(category 3).
In addition they recognized the distinction between gas exchange thresholds and lactate (as discussed in a previous post):
It has to be emphasized that this text focuses
on LTs only. Although a close link between lactate
and gas exchange markers has often been
proposed,[21,31,33-36] there is still controversial
debate with regard to the underlying physiological
Differences in test design:
Depending on the length of ramp, prior exercise and carbohydrate availability, lactate results can differ.
It is of note that the specific GXT protocol can
vary considerably with regard to starting and
subsequent work rates, work rate increments and
stage duration. A recent review focused on the
influence of varying test protocols on markers
usually used in the diagnosis of endurance performance.[47]
For instance, varying stage duration
or work rate increments may lead to relevant
differences in blood lactate curves and LTs. 
Test interpretation and curve fitting 
This is a huge issue which initially confused me.  For some parameters (like a fixed lactate of 4 as a cutoff) there is minimal interpretation, however several other techniques rely upon mathematical modeling.  Most importantly, the initial rise in lactate at lower work loads may not be obvious due to the inherent error in the assay (strip error):
In addition, there has been great debate on the
best fitting procedure for the obtained bLa data
set. For instance, a single-[51] or double-phase
model[52] using two or three linear regression
segments, a double-log model,[53] a third-order
polynomial[54] or an exponential function[55] have
been used in previous studies. Up to now, no
generally accepted fitting procedure has been
established.[47] Thus, it seems appropriate that
test design as well as data fitting procedures
should be chosen (and reported) as has been
originally described for a certain LT
In addition, the method of measurement may introduce variation of values:
From a methodological point of view, the site
(earlobe, fingertip) as well as the method (venous,
arterial, capillary) of blood sampling[56,57] and
the laboratory methods (lactate analyser, analysed
blood medium)[58-60] may also affect the test
result. Samples taken from the earlobe have uniformly
been shown to result in lower bLa than
samples taken from the fingertip
Definition of the first transition
At lower work loads there is a gradual rise in lactate that has been called the anaerobic threshold.  Some consider this a misnomer and refer to it as the aerobic threshold.  This intensity would be a valuable metric to know since many consider it to be the upper limit of zone 1 training, the zone that endurance athletes should be spending the bulk of their sessions in.
This model consists of two typical breakpoints
that are passed during incremental exercise. In
the low intensity range, there is an intensity at
which bLa begin to rise above baseline levels.
This intensity was originally determined using gas
exchange measurements,[21,22] and Wasserman
called it the ‘anaerobic threshold’. This term has
since been used for various LTs, particularly those
with a different physiological background,[33,75]
and, thus, has caused considerable confusion.
Kindermann et al.[30] and Skinner and McLellan[34]
suggested this intensity be called the
‘aerobic threshold’ (LTAer), because it marks the
upper limit of a nearly exclusive aerobic metabolism
and allows exercise lasting for hours. This
intensity might be suitable for enhancing cardiorespiratory
fitness in recreational sports, for
cardiac rehabilitation in patients or for low-intensity
and regenerative training sessions in
high level endurance athletes

Definition of the second transition:
Above the point of initial lactate rise, a situation occurs where higher work loads induce more lactate accumulation.  Up to a certain point, lactate disposal can keep pace with increased production, but past the "Maximal lactate steady state" power, levels can not maintain equilibrium:
Exercise intensities only slightly above the
LTAer result in elevated but constant bLa during
steady-state exercise and can be maintained for
prolonged periods of time (~4 hours at intensities
in the range of the first increase in bLa[82-84] and
45–60 minutes at an intensity corresponding to
the maximal lactate steady state [MLSS
Although anaerobic glycolysis is enhanced, it is
speculated that such intensities may induce a
considerable increase in the oxidative metabolism
of muscle cells.[30,87] Theoretically, a high stimulation
of oxidative metabolism for as long a period
of time as is possible in this intensity range
might be an appropriate load for endurance
training. The highest constant workload that still
leads to an equilibrium between lactate production
and lactate elimination represents the MLSS.
Some authors suggested that this intensity be
called the ‘anaerobic threshold’

In addition, since the MLSS can only be sustained for so long, indexes like the FTP or critical power have been used as approximate surrogates.  I will not get into the nit picking that exists in the literature on which is a "better" parameter.  

Clearly, some individuals can withstand higher lactate levels than others making a fixed cutoff problematic.  Some prior posts have discussed non invasive methods to approximate MLSS.  If a more exact determination is needed, more lengthy sessions are necessary but will introduce (perhaps undesired) training stress.
The gold standard for the determination of the
MLSS is performing several constant load trials
of at least 30 minutes’ duration on different days
at various exercise intensities (in the range of
50–90% VO2max, An increase
in bLa of not more than 1 mmol/L between
10 and 30 minutes during the constant load trials
appears to be the most reasonable procedure for
MLSS determination
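The criterion in the extract above (a rise of no more than 1 mmol/L between minutes 10 and 30) is easy to express in a few lines of Python. This is only a sketch of that rule; the function names, the data layout and the trial values are hypothetical.

```python
def is_lactate_steady_state(samples, max_rise=1.0, start_min=10, end_min=30):
    """samples: list of (minute, bLa in mmol/L) from one constant load trial.

    Returns True when bLa rises by no more than max_rise between
    start_min and end_min, the criterion quoted above.
    """
    by_minute = dict(samples)
    if start_min not in by_minute or end_min not in by_minute:
        raise ValueError("need lactate samples at both time points")
    return by_minute[end_min] - by_minute[start_min] <= max_rise

def estimate_mlss(trials):
    """trials: {power in watts: samples}. Highest power that stays steady."""
    steady = [power for power, samples in trials.items()
              if is_lactate_steady_state(samples)]
    return max(steady) if steady else None

# Hypothetical 30 minute trials done on separate days.
trials = {
    240: [(10, 3.1), (20, 3.4), (30, 3.6)],
    255: [(10, 3.8), (20, 4.2), (30, 4.5)],
    270: [(10, 4.6), (20, 5.4), (30, 6.3)],
}
print(estimate_mlss(trials))
```

The resolution of the answer is limited by the spacing of the trial powers, which is why a precise MLSS takes so many sessions.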

The confusion around various lactate threshold definitions and concepts:
Obtaining the work load associated with the "aerobic" threshold has been discussed by numerous authors.  Here is an extract from Faude's review:
Table I shows an overview of LT concepts that
could be categorized as the first rise in bLa above
baseline levels (LTAer). Several researchers described
the procedure to determine this threshold
with terms like ‘‘the first significant/marked/
systematic/non-linear/sharp/abrupt sustained increase
in bLa above baseline’’.[30,110,126-133,138]
Although the visual determination of the first rise
of bLa above baseline levels seems obvious and
simple, in practice it is associated with considerable
problems because of the only slight changes
in bLa on the first steps during GXTs.
et al.[142] demonstrated that the visual detection
of the LTAer (in that study called ‘anaerobic
threshold’) led to relevant differences between
observers. Therefore, it does not seem appropriate
to determine this threshold by simple visual
  • They definitely emphasize that visual inspection of the lactate curve is not an optimal method to obtain threshold transitions

Here is a table listing the methods for the first threshold :

Again, the take home lesson here is the comment about not being able to use visual methods to get that first transition point.  Mathematical modeling seems the most objective approach.
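As one example of what "mathematical modeling" can mean for the first transition, here is a Python sketch of a two segment linear regression: every allowable split of the data is fitted with two straight lines, and the split with the smallest total squared error gives the breakpoint where the lines cross. This is just one of the fitting approaches mentioned in the review, the function names are mine and the lactate values are invented.

```python
def _fit_line(xs, ys):
    """Least-squares line; returns slope, intercept and squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse

def first_breakpoint(power, lactate, min_points=3):
    """Power at the intersection of the best fitting pair of line segments."""
    best = None
    for k in range(min_points, len(power) - min_points + 1):
        m1, b1, sse1 = _fit_line(power[:k], lactate[:k])
        m2, b2, sse2 = _fit_line(power[k:], lactate[k:])
        if m1 == m2:
            continue  # parallel segments never intersect
        total = sse1 + sse2
        if best is None or total < best[0]:
            best = (total, (b2 - b1) / (m1 - m2))
    return None if best is None else best[1]

# Hypothetical ramp: 25 W steps, lactate in mmol/L.
power = [100, 125, 150, 175, 200, 225, 250, 275]
lactate = [1.0, 1.0, 1.1, 1.1, 1.6, 2.3, 3.2, 4.3]
breakpoint_w = first_breakpoint(power, lactate)
print(round(breakpoint_w, 1))
```

With only a handful of stages and strip error on each reading, the result can shift noticeably if a single value changes, so it is worth treating the output as an estimate.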

The second transition:
Although a fixed lactate of 4 can be used as a cutoff, it seems more complicated than this:
It was soon recognized that a fixed
bLa does not take into account considerable
interindividual differences and that LT4 may frequently
underestimate (particularly in anaerobically
trained subjects) or overestimate (in
aerobically trained athletes) real endurance capacity.[88,96,97,146]
Therefore, several so-called ‘individualized’
LT concepts were developed. For
instance, Keul et al.[96] and Simon et al.[97] determined
the individual anaerobic threshold (IAT)
at a certain inclination of the lactate curve (tangent
of 51° and 45°, respectively). However, it
seems questionable whether the use of a fixed inclination
may reflect individual lactate kinetics
better than a fixed bLa
Bunc et al.[143] determined the LTAn as the intersection
between the exponential regression of the lactate
curve and the bisector of the tangents on the upper
and lower parts of the regression. A comparable
model was established by Cheng et al.[54] and
called the Dmax method. Those authors determined
the maximal perpendicular distance of the
lactate curve from the line connecting the start
with the endpoint of the lactate curve. It is obvious
that these threshold models are dependent
on the start intensity as well as the maximal effort
spent by the subjects. To eliminate the influence
of the start point of the GXT, Bishop et al.[140]
connected the LTAer with the endpoint of the
lactate curve and observed that this modified
Dmax threshold (Dmod) was also highly correlated
with performance during a 1-hour time trial in 24
female cyclists.
In other words, multiple methods exist to calculate the second transition.  The analytic methods will be reviewed shortly.

Cautions on using LT4 (lactate of 4 cutoff) as MLSS 
Although some studies have shown reasonable correlation between a lactate cutoff of 4 and the MLSS, stage duration should be 5 minutes (or more) and even so the correspondence is not absolute:
Most researchers analyzed the relationship of
LT4 with MLSS.[49,72,90,92,112,117] For instance,
Heck and colleagues[49,50,72] found strong correlations
between LT4 and MLSS during running
as well as during cycling exercise. However, the
fitness level of their subjects was quite heterogeneous
and, therefore, the high correlations to
some extent might be spurious. Additionally,
they observed that the velocity at LT4 was higher
than MLSS velocity when stage duration during
the GXT was 3 minutes, whereas this was not
the case with 5-minute stages. Therefore, these
authors concluded that LT4 gives a valuable
estimate of the MLSS when stage duration is
at least 5 minutes. Also, Jones and Doust[112]
found a high correlation between LT4 and the
MLSS in a homogenous group of trained runners
with LT4 being higher than MLSS (3-minute
stages). Lower correlations were found by
van Schuylenbergh et al.[92] in elite cyclists as
well as by Beneke[117] in a homogenous group
of rowers. Also, LT4 and MLSS did not differ
significantly with 6-minute stages,[92] whereas
LT4 was considerably higher than MLSS with
3-minute stages.[117] Lajoie et al.[90] evaluated
whether the intensity corresponding to 4 mmol/L
lactate during a GXT with 8-minute stages and
30W increments is appropriate to estimate the
MLSS in nine cyclists. Average power output at
MLSS and LT4 was not significantly different.
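For a fixed cutoff such as LT4, the power can be read off the test data by interpolating between the two stages that bracket 4 mmol/L. The Python sketch below uses straight line interpolation for simplicity (an assumption on my part; studies often fit a smooth curve instead), with a hypothetical function name and invented data.

```python
def power_at_fixed_lactate(power, lactate, target=4.0):
    """Power at which lactate first reaches `target`, by linear
    interpolation between the two bracketing stages."""
    stages = list(zip(power, lactate))
    for (p0, l0), (p1, l1) in zip(stages, stages[1:]):
        if l0 < target <= l1:
            return p0 + (target - l0) * (p1 - p0) / (l1 - l0)
    return None  # the target was never reached during the test

# Hypothetical ramp: 25 W steps, lactate in mmol/L.
power = [100, 125, 150, 175, 200, 225, 250, 275]
lactate = [1.0, 1.0, 1.1, 1.1, 1.6, 2.3, 3.2, 4.3]
lt4 = power_at_fixed_lactate(power, lactate)
print(round(lt4, 1))
```

As the extract points out, the number this produces depends on stage duration, so the same rider can get a different LT4 from a 3 minute and a 5 minute protocol.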

Reproducibility and correlation with performance:
A recent study looked at the multiple methods of lactate value interpretation with a graded exercise test (25 W every 5 min) compared to a climb up Mt Ventoux as well as 2 simulated time trials. 

Here are their definitions of the methods that will be used to calculate the anaerobic  threshold:
LT1. Similar to what Tanaka described [13] we plotted bLa (mmol/L) versus power (W). Three authors (JH, WdMK and PG) were asked to independently select the first point in the BLC that marks a substantial increase above resting level. LT1 was defined as the power value corresponding to the point selected by at least two researchers, or in cases without consensus,
the three researchers discussed until consensus was reached.

LT2. Coyle et al. [14] determined LT as 1 mmol/L above a visually determined baseline in the BLC. We took the lactate measurement chosen as LT1 and calculated the mean of the measurements
preceding this point to create an average baseline value. The power value belonging to the first measured lactate value after baseline that supersedes the baseline value plus 1 mmol/L was considered LT2.

LT3. As Dickhuth et al., [15] we determined the minimum lactate equivalent (the lowest value when bLa is divided by work intensity) using third-order polynomial fitting and added 1.5 mmol/L to the corresponding bLa, termed individual anaerobic threshold in the paper, to find the power value on the fitted polynomial of the BLC and termed it LT3.

LT4. As described by Amann et al., [16] we calculated the first rise of 1 mmol/L or more between two bLa measurements where the next rise was similar or larger than 1 mmol/L. The measurement that preceded this first increase was considered LT4.
LT5. Based on the method described by Dickhuth et al., [17] we divided bLa (mmol/L) by the 30 second average VO2 (mL/min/kg) and plotted it against power. These values were interpolated with a third-order polynomial and the power value at the lowest point in this curve was considered LT5.
LT-4mmol. A widely used concept is the LT-4mmol method, as described for example by Sjodin et al. [18] The power in the interpolated third-order polynomial BLC that corresponds to a bLa of 4 mmol/L was considered LT-4mmol.

Dmax and Dmax modified. Similar to the method proposed by Cheng et al., [19] we plotted bLa versus power, interpolated with a third-order polynomial and plotted a line from the first measurement to the last measurement. The point in the interpolated BLC that has the maximum perpendicular distance with that line was considered Dmax. A modified version as described by Bishop et al., [20] uses the measurement that precedes an increase of at least 0.4mmol/L instead of the first bLa measurement to draw the line to the last measurement, which is termed Dmax modified (Dmax-mod).
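The Dmax and Dmax-mod definitions above translate fairly directly into code. Here is a Python sketch that fits a third-order polynomial by least squares and then searches in 1 W steps for the point on the curve furthest below the connecting line. It is my own illustration rather than the study's implementation, and the function names, the 1 W grid and the lactate values are assumptions.

```python
def fit_cubic(xs, ys):
    """Return a function evaluating the least-squares cubic through (xs, ys)."""
    lo, span = min(xs), max(xs) - min(xs)
    us = [(x - lo) / span for x in xs]            # rescale to 0..1 for stability
    size = 4
    # Augmented normal equations (A^T A | A^T y) for the terms 1, u, u^2, u^3.
    rows = [[sum(u ** (i + j) for u in us) for j in range(size)]
            + [sum(y * u ** i for u, y in zip(us, ys))] for i in range(size)]
    for col in range(size):                        # Gaussian elimination
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    coef = [0.0] * size
    for i in reversed(range(size)):                # back substitution
        known = sum(rows[i][j] * coef[j] for j in range(i + 1, size))
        coef[i] = (rows[i][size] - known) / rows[i][i]
    return lambda x: sum(c * ((x - lo) / span) ** k for k, c in enumerate(coef))

def dmax(power, lactate, start_index=0, step=1.0):
    """Power where the fitted curve lies furthest below the line joining the
    start measurement to the last measurement."""
    curve = fit_cubic(power, lactate)
    x0, y0 = power[start_index], lactate[start_index]
    x1, y1 = power[-1], lactate[-1]
    slope = (y1 - y0) / (x1 - x0)
    best_x, best_gap = x0, float("-inf")
    x = float(x0)
    while x <= x1:
        # Vertical gap is a constant multiple of the perpendicular distance
        # to a straight line, so both peak at the same power.
        gap = y0 + slope * (x - x0) - curve(x)
        if gap > best_gap:
            best_x, best_gap = x, gap
        x += step
    return best_x

def dmax_mod_start(lactate, rise=0.4):
    """Index of the measurement preceding the first rise of at least `rise`."""
    for i in range(len(lactate) - 1):
        if lactate[i + 1] - lactate[i] >= rise:
            return i
    return 0

# Hypothetical ramp: 25 W steps, lactate in mmol/L.
power = [100, 125, 150, 175, 200, 225, 250, 275]
lactate = [1.0, 1.0, 1.1, 1.1, 1.6, 2.3, 3.2, 4.3]
dmax_w = dmax(power, lactate)
dmax_mod_w = dmax(power, lactate, start_index=dmax_mod_start(lactate))
print(dmax_w, dmax_mod_w)
```

Because the line for Dmax starts at the first stage, the result moves if the test starts at a different power, which is the dependence that the modified version tries to remove.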
The methods produced widely differing threshold values:

Here are the performance comparisons to the time trials and race up Mt Ventoux:

They also reviewed literature comparing the various computation methods with performance:

 The conclusion:
In this study we compared eight different representative LT concepts on the same large cycling performance dataset to evaluate repeatability and predictive properties. All concepts showed high repeatability, and correlated with endurance performance. However, LT3, LT-4mmol, Dmax and Dmax-mod showed the best repeatability, and had the highest correlation
with time trial performance.
  As correlation with performance was consistently high for Dmax and Dmax-mod, also with the uphill road race, the latter performing slightly better on each criterion, and because Dmax-mod was previously shown to be a valid estimate of MLSS, we
would recommend using Dmax-mod when analyzing the blood lactate curve.

Time for a recap:
  • The definitions of the first and second lactate thresholds (aerobic and anaerobic) are somewhat controversial.  
  • A progressive ramp test is generally done to obtain lactate levels at each power range.  
  • Each stage of the test should be 5 minutes (or more).  
  • The first threshold is helpful in determining the upper limit of the easy training zone.
  • The second breakpoint is useful as an index of future performance, maximal pace as well as intense training zone commencement.
  • LT-4mmol, Dmax and modified Dmax all give reasonable time trial performance prediction as well as an estimate of MLSS for a given individual.  No, they are not the same as a true MLSS.  However, since it takes repeated 30 minute sessions to obtain a more exact MLSS, the shorter, less intense methods are more practical.

How to generate the multiple lactate parameters from a table of power vs lactate?
About 10 years ago, Newell and colleagues wrote a paper that not only reviewed the different lactate threshold methodologies but also provided free, public domain software for graphing each method.  Once you enter the lactate value at each level of power (or speed) reached during the ramp test, the software objectively calculates the threshold results and plots the curve.  Since hand-drawn threshold curves have been criticized in the literature, I was particularly intrigued by this approach.  In addition, for the non-math-oriented folks doing these tests (me), the guesswork in doing log-log or Dmax calculations is taken away.  I would like to review the paper and offer some tips on using the template created in "R" by Higgins and Newell.

Note - After I wrote up the tips for R below, Dr Newell informed me that there is a web-based app that will do this.  You do not need R or any software libraries.  Here is the screen you will be greeted with to enter your data:


The authors review and define the algorithms as follows:

Lactate threshold
Traditionally, the lactate threshold was determined subjectively from plots of the lactate concentration versus work rate by visually identifying the treadmill velocity or work rate that best corresponds to a departure from a linear baseline pattern. Lundberg, Hughson, Weisiger, Jones, and Swanson (1986) proposed fitting a linear spline where the lactate threshold is the estimated work rate corresponding to the location of the knot (i.e. the point of intersection between the two linear splines). The location of the knot and the parameters of the lines are estimated by minimizing the sum of the squared differences between the observed lactate values and the fitted [...]
The value of the lactate threshold can be estimated using simple linear regression by fitting model and identifying the work rate LT corresponding to the model with minimum mean squared error.
A log transformation of both the work rate and blood lactate concentration has been suggested (LTloglog) in an attempt to gain a better estimate of the lactate threshold (Beaver et al., 1985).
  • What they are saying here is that the first threshold (LT1) can be estimated by either a log-log plot or a variation of linear regression and they will provide results for both.
  • Both parameters are potential markers for the first (aerobic) lactate threshold, the work load at which lactate just starts to rise past baseline.
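
To give a feel for the two-line idea, here is a minimal Python sketch. It is an assumption-laden simplification, not the Newell and Higgins software: it fits two separate straight lines rather than a constrained spline, tries every possible split of the data, and reports where the best pair of lines cross. The function names and the sample values are invented for illustration.

```python
import math


def line_fit(xs, ys):
    """Ordinary least squares; returns slope, intercept and squared error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    icpt = my - slope * mx
    sse = sum((y - (icpt + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, icpt, sse


def two_line_breakpoint(xs, ys):
    """Try every split (at least 2 points per side, so 4+ points overall),
    keep the split with the lowest total squared error, and return the
    x value where the two fitted lines intersect."""
    best = None
    for k in range(2, len(xs) - 1):
        s1, i1, e1 = line_fit(xs[:k], ys[:k])
        s2, i2, e2 = line_fit(xs[k:], ys[k:])
        if s1 != s2 and (best is None or e1 + e2 < best[0]):
            best = (e1 + e2, (i1 - i2) / (s2 - s1))
    return best[1]


def lt_loglog(power, lactate):
    """Same search on log-transformed data, converted back to power units."""
    lx = [math.log(p) for p in power]
    ly = [math.log(v) for v in lactate]
    return math.exp(two_line_breakpoint(lx, ly))


# Invented example values (watts, mmol/L), not real test data
power = [135, 160, 185, 210, 235, 260]
lactate = [1.1, 1.2, 1.4, 2.0, 3.1, 5.0]
print("LT (two lines):", two_line_breakpoint(power, lactate))
print("LT (log-log):", lt_loglog(power, lactate))
```

With only a handful of stages the chosen split can jump around from test to test, which is one more reason to keep the ramp protocol identical each time.
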

Fixed blood lactate concentration (FBLC) or the OBLA
This marker is the work rate corresponding to a fixed blood lactate concentration, typically 4 mmol (Heck et al., 1985; Kindermann et al., 1979). It is calculated using inverse prediction by finding the work rate w (in model 2.2) corresponding to a lactate value equal to the FBLC.
  • This is a conventional marker, well established in literature, estimated by the software.

Fixed rise post baseline (FRPB)
This marker (Thoden, 1991) corresponds to a work rate preceding an increase in lactate concentration of a fixed rise post baseline (e.g. 1 mmol from baseline). Let Lbaseline represent the lactate reading at baseline. The FRPB marker is calculated by finding the work rate w corresponding to a selected rise from baseline (e.g. 1 mmol).
  • Another commonly used marker, the work rate at which lactate rises 1 mmol above baseline.
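
Both the fixed concentration and the fixed rise markers come down to the same lookup: find the power at which the lactate curve reaches a target value. The Python sketch below illustrates that with straight-line interpolation between stages, which is a simplification of the fitted curve the paper uses. The function names are hypothetical, the first stage is assumed to be the baseline reading, and the sample numbers are invented.

```python
def power_at_lactate(power, lactate, target):
    """Lowest power at which the interpolated lactate reaches `target`,
    or None if the target is never reached during the test."""
    points = list(zip(power, lactate))
    for (p0, l0), (p1, l1) in zip(points, points[1:]):
        if l0 <= target <= l1 and l1 > l0:
            return p0 + (target - l0) * (p1 - p0) / (l1 - l0)
    return None


def fblc(power, lactate, level=4.0):
    """Power at a fixed blood lactate concentration (4 mmol by default)."""
    return power_at_lactate(power, lactate, level)


def frpb(power, lactate, rise=1.0):
    """Power at a fixed rise above baseline.
    Assumption: the first stage reading is treated as the baseline."""
    return power_at_lactate(power, lactate, lactate[0] + rise)


# Invented example values (watts, mmol/L), not real test data
power = [135, 160, 185, 210, 235, 260]
lactate = [1.1, 1.2, 1.4, 2.0, 3.1, 5.0]
print("FBLC 4 mmol:", fblc(power, lactate))
print("FRPB +1 mmol:", frpb(power, lactate))
```

If the ramp stops before lactate reaches 4 mmol, the lookup returns None, which is a reminder to carry the test far enough for the marker you want.
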

Dmax
This marker corresponds to the work rate corresponding to the point that yields the maximum perpendicular from a line L2, joining the first and last lactate measurements to the estimated lactate curve L3 (Cheng et al., 1992).
 At first glance this sounds confusing but a picture illustrates the concept:

  • I have drawn the perpendicular lines over to the lactate curve which yield both Dmax and Dmax mod.  Clearly, this is not something that can be done by easy hand calculation or visual inspection.

D2LMax
This marker (Newell et al., 2005, 2006) represents the work rate corresponding to the point of maximum acceleration of the estimated underlying lactate curve (i.e. the maximum of the second derivative of the lactate curve).
It should be noted that the D2LMaxDiscrete will always correspond to a work rate where data were [...]
  • Yet another proposed idea, one of maximum acceleration.   This is not Dmax mod from the above figure but somewhat similar in theory.
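
To make the "maximum acceleration" idea concrete, here is a small Python sketch of a discrete version. It uses a simple finite-difference estimate of the second derivative at each measured stage, which is an assumption on my part and not the smoothed curve approach of the paper. The function name and sample values are invented for illustration.

```python
def d2lmax_discrete(power, lactate):
    """Measured power at which the second difference of lactate is largest.
    Needs at least 3 stages; only interior stages can be returned."""
    best_i, best_d2 = None, float("-inf")
    for i in range(1, len(power) - 1):
        h1 = power[i] - power[i - 1]
        h2 = power[i + 1] - power[i]
        # Change in slope across stage i, scaled for uneven stage spacing
        d2 = 2 * ((lactate[i + 1] - lactate[i]) / h2
                  - (lactate[i] - lactate[i - 1]) / h1) / (h1 + h2)
        if d2 > best_d2:
            best_i, best_d2 = i, d2
    return power[best_i]


# Invented example values (watts, mmol/L), not real test data
power = [135, 160, 185, 210, 235, 260]
lactate = [1.1, 1.2, 1.4, 2.0, 3.1, 5.0]
print("D2LMax (discrete):", d2lmax_discrete(power, lactate))
```

Because it only looks at measured stages, this version can never return a power between two stages, and a single noisy lactate reading can shift the answer by a whole stage.
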

This is what the software will yield when power and the corresponding lactate levels are known (from the paper by Newell):

Tips for plotting your own results:
So how do we get our results to graph out like the above?  One should read the tutorial but here are a few tips.

Download and install R 2.3.1.  Do not use the newer versions (they won't work; it seems the pspline library is not compatible).  Run the Windows .exe installer to install R.
Go to this page and download "lactatemarkers.zip" 
Unzip the file to the root of the C drive.
This is what it will look like:

Follow the directions in the tutorial:

Although at this point, you could run the default entered power/lactate values, let's put our own numbers in.

For my particular lactate study, I used 5 minute ramp intervals, temporarily getting off the bike and testing fingertip lactate about 90 sec after the stage.  After a 10 minute warmup, I started at about 135 watts and went to just over 260.  For a potentially more accurate look, higher power/lactate values can be added.

Now we are simply going to edit "samplerunner.txt" and put our values in.
The first column should not be changed ("1" "2" and so on).  The second column is power or speed, the third column is lactate:

Save the file and go back to R.

Paste in "sample.runner<-read.table('C:\\Lactatemarkers\\samplerunner.txt', header=TRUE)"
and hit return

Paste in "sample.runner" and hit return.

Paste in "Lactate.markers(sample.runner)" and hit return:

You will get the following:

There are ways to plot both multiple samples and historical results as well (if the tutorial is studied further).
A unique feature, though, is an objective look at the first LT plus the log-log relation, giving us an estimate of that first lactate threshold.  For me it is about 200 watts, which fits with where I fail the talk test.

I would like to personally thank the authors for this handy tool and hope it is used more often. 

Why the focus on knowing where the "easy zone" training limit is?

Although this has been discussed before, most if not all coaches and sports professionals emphasize the creation and maintenance of a huge "aerobic base".  This generally revolves around some form of polarized training with the vast bulk of the time spent in the "easy zone".  A paper looking at this topic just came out, and I want to briefly discuss the findings.
The study evaluated the real world running performance of a group of elite subjects.  Performance score was best correlated with the total volume of easy run training.  

The study group:
Eighty-five male elite- and international-standard long-distance runners took part. The age range was between 18 and 43 years, with a mean age of 28 years (65). All subjects were specialists in the 5,000, 10,000 m, half-marathon (21.195 km), or marathon (42.195 km) events.
The relevant training activities included were cross-training, flexibility training, weight training, work with the coach, easy runs, tempo runs, long-interval training, short-interval training, and competition and time trials (9,41). For each of the latter 5 activities, subjects were further instructed to account for total weekly distance (km). The latter 4 activities (i.e., not including easy runs) were the activities that subjects considered more important and, for this study, were considered DP. This consideration was taken because the same subjects of this study rated these activities with high values (mean superior to 7 in a 10-point Likert-type scale) and significantly higher than 5 on the scale for relevance, physical and mental effort, and enjoyment in 2 previous studies. Easy runs were considered mentally effortless because its rating was not significantly higher than 5 on the Likert scale for [...]

What was found:
The best predictor of finishing time was the total distance run during training, and in particular the volume of easy runs:

  • Short interval training did have a modest correlation, but long interval training had almost no correlation.  This seems like yet another piece of evidence supporting polarized training especially keeping the volume of zone 1 (easy runs) quite high.  
  • One could also speculate that the long interval training took place in zone 2, which is an area of questionable value (as far as stress to benefit ratio).

The study conclusions:
Practical Applications
The first important finding that coaches should note was that the strongest relationships found for performance scores were with total distance run after 3, 5, and 7 years of systematic [training]. There is thus a fundamental need for athletes to run over considerable distances (>100 km per week) to compete with world-class athletes and even with those who are below this highest standard. It is not possible to always train at high intensities, particularly over these long distances, so the large associations found between easy runs and performance scores are welcome in terms of managing training intensity in long distance running regimens, notwithstanding their central role in developing cardiovascular fitness.

Final thoughts:
  • Lactate threshold concepts and nomenclature are variable, sometimes arbitrary and often confusing.  The "aerobic threshold" has been labeled as the "anaerobic threshold" and there are multiple definitions for both low and high breakpoints.
  • Doing your own lactate test is not difficult.  You will need accurate speed or cycling power (treadmill or indoor trainer), a lactate meter and the willingness to stick your finger for blood after each stage.
  • The software developed by Newell and Higgins is an ideal tool to objectively calculate the first and second breakpoint equivalents.  Agreed-upon methods such as log-log, Dmax and other curve fitting techniques are applied automatically.  The web-based app makes calculation even simpler.
  • When doing the test, use the same bottle of strips, same site of blood, same meter, same ramp intervals, same time from end of stage to blood sample.  Uniformity of each stage is key.  Therefore, don't do a 5 minute stage on one power interval but a 7 minute interval on another.  Don't wait 3 minutes after the stage to test, when you were previously waiting 1 minute etc.
  • Although the second breakpoint/threshold is most spoken of (MLSS, OBLA, 4 mmol etc), in my opinion the first threshold is perhaps just as critical, if not more so.  Since it is an objective marker of the beginning of a more intense work load, it gives one an upper limit for an "easy training zone".  As reviewed above, a recent study showed that the quantity of easy training was more predictive of endurance performance than the amount of interval training.  In order to do that volume of easy training, one needs to know the power or heart rate at which that ceiling lies.  I would propose that the upper limit of zone 1 be set just below the first lactate threshold.

See also:

Update 10/23