Sunday, April 14, 2019

Data Presentation and Analysis - Plotting

The next step once we have a solid spreadsheet of our data is to graph and analyze it.  If you have a background in stats or engineering, your knowledge base is vastly superior to mine so ignore this post.  This is intended as a simple, no cost way of data presentation for users with little or no data manipulation background.

A graphing web app by the name of Plot.ly is my personal favorite.  If you like to use a dedicated graphics or data package that may work even better.

You will need to sign up for a free account, then we are ready to begin.

Time, Power, Heart rate graph for the entire session:
The first step is to import the spreadsheet we just created.  Click the Import button to the right (in red):


You will get this dialog:

 Upload the spreadsheet (takes a few seconds).  You will then have this:

As you can see, all our data is there with the headers on top.  To plot a single column, click "+ Trace" (next to red arrow above).

The following will appear:
 We want to select the line graph (above red arrow).

The next step is to tell the app what X and Y values to graph.  The X has a pull down list, select "Time" and the Y should be the "power".  In the future, the X is usually Time (the same value on the list, time is the same for all sensors), and the Y will be each sensor data range we want to graph.
Notice we now have a graph in purple of watts.

Let's add heart rate to the curve.  Click the "+Trace" again to add another sensor series:
The X axis is again the time, the Y is heart rate.  However, we can have a separate scaling on the Y (a second Y axis).  If you click the "+" button next to the Y axis, a second range is created (far right side).

Here is the final result so far:
Not very pretty or readable but we can clean this up:
First, lets add text to the axis labels, title and trace labels (top R).  The trace labels list can also be moved around.  To change the labels just click on them and type some appropriate text.

To boost the font size, press the "Style" button on the left side of the screen:
Here I darkened the font color and changed the size so we can read it.

Although it is possible to zoom and analyze the plot, I prefer to copy out a section of the original master spreadsheet and load it independently.   The web app is just not robust enough to handle over 100,000 data points.

Go back to openoffice, open the original sheet and find the segment of interest.  For the above, I wanted the 5000 to 6000 second range:
I will then click to first box of interest on the far left, scroll down to the end of range I want and shift click on the far right (last data series) to select the whole list of values


Then control C to copy.


Go to menu, open a new sheet, then paste into the second row:


You can then go back to the original sheet to copy the first row title headers and paste into the new sheet:

Save it with an appropriate name, go back to the Plot.ly app and import it.

Now we will do a full graph of SmO2, power, heart rate and ventilation for this training interval.

What I have done here is to get back to where we were above with power/HR over time and added Ventilation (using the second Y axis for HR, far right side):



Let's now look at a couple of SmO2 fields (RF and Costal HHb):
The same Y axis was used for watts and ventilation and the right Y axis for HHb.

Here I dimmed out the Watts for a better look (using the Style tab on the far left toolbar.

Some more color changes to put Ventilation in the background (in yellow).
This is good for a final.  

To save the image as a jpeg, press the export button (red arrow below):


Here is the final:
  • At this point if you are satisfied, a screen shot copy, or export as jpeg can get a high res image.  One could also save this in the web app (the blue button on the far left).




Curve fitting:
The web app has some primitive curve fitting which is nice if you want a line drawn through your data objectively.
  • Why would you need to do so?  For example - looking for MLSS on a long interval.  Using either SmO2 (RF/costal) or Ventilation rate over a > 5 minute constant power interval can be used to determine this.  Having an accurate regression line with slope may help determine which power level causes a take off in slope.
Under Analysis (left menus), choose "Curve fitting" and select a Trace (I did Ventilation) on the top drop down.  You can only curve fit a data series if it is using the left Y axis.
Enable advanced to get the calculation range to show (X min and X max range of values).


We need to populate the X min max fields (both the calculation and display range).  If you hover over the line trace data you will see the X values on the curve as well:
Put in a 0 placeholder in the M and X fields.
I put in 5600 to 5800 seconds for the Ranges.  Make sure the "matching data" box is in blue.

Then press the Run button
You should have a curve fit line just through the range of data selected:

I took the liberty of running curve fits for both costal and RF HHb.

Averages:
Both the Hexoskin web app as well as Garmin do not allow for easy calculation of average values.  
  • For example, let's say you were looking at your long MLSS power interval again and wanted to know at what average SmO2 or Vent rate this was attained.  Even though you may know the power figure, there is no way of getting the corresponding Vent rate average from the Hexoskin site, or the SmO2 average from Garmin.
Although I'm not going to be getting into spreadsheet calculation detail, lets at least see how to get an average for a range.   We are going to get the average power and vent rate for the second interval on the sheet.
Here is a section of the sheet (the end of the range I want to average) we will work on.
To start, I created a blank column next to the "power", column C.  Next to the last value we want to average is the box we will use for the Function.  It really does not matter where you put it, but this looks convenient.


While in that black box, click the Function button again.
We have a new one to try now, "Average", double click it and the following will appear:


Press the circled red box to bring up another range entry input box.  

Although we can just put the range in here directly, it's easy to just shift-click the first and last values to get our range.




Here is what you will see after clicking back into the sheet on the last value (215 watts) in cell B647:




Then scroll up the sheet to find the first value (226watts-B287), then shift-click into that value (make sure it is in the B column):

The range is put in properly.  If this is correct, now press the enter key to put it away and you are back to this:

You could have typed it in manually as you now see.  Press the OK and you have your result:


I went back to cycling analytics web site and double checked the average:
Looks good!


I repeated the process with the Vent rate with the results.  We now have an average ventilation vs power for a given interval:


Summary:
There are many other interesting metrics to explore including, running averages, comparing line slopes and of course multiple sensor value combos in one graph.
For now I think this is enough to get started, especially if this is all new to you.  To the data scientists out there, my apologies for oversimplification, omissions but to the untrained folks out there (like me) this will get you up and running.  As old Dr McCoy said on Start Trek many times, "I'm a doctor not a ......"
Although sites like Garmin, cycling analytics, Hexoskin, Humon and Golden Cheetah will present data, they don't combine all sensors of interest, nor provide the flexibility that a plain old spreadsheet with appropriate graphing does.  However, for monitoring metrics over time, providing meaningful peak values, tracking training loads and such, these sites are essential.  I encourage athletes to use a data repository you are comfortable with, as well as learning to extract the individual sensor info for more in depth analysis.

No comments:

Post a Comment