Tuesday, 30th September, 2014

(Plotting GPS tracks and heart rate data)

Link to this post: http://kotakoski.fi/blog.php?blogpost=20140929-RRRunningWithR

I have recently been toying with the idea of getting better versed in the
modern tools used in data analysis and so-called big data. One of the most
often mentioned tools in this arena is **R**, the free software environment for statistical
computing and graphics. Since I like to practice with new programming
languages, I have been looking for a couple of months for a good excuse
to try R in a real application.

Until recently, I have been logging all my running activities on online
services, which have provided a nice overview of each run individually
and a convenient way to browse through old entries. However, this has
always felt a bit silly: while having the data online is good for sharing
on social media, it is crazy for anything that interests only me.
Nevertheless, unaware of any good offline tools for this purpose, I was stuck
with the online solutions. As you can guess, this is where R came to the
rescue. I can't remember how it happened, but this Sunday I landed on
**Mollie Taylor**'s blog, where she discussed Mapping
GPS Tracks in R. Now I had the long-awaited excuse for getting my hands
dirty with R.

I already had the GPS and heart rate data from my *Garmin Forerunner
405* transferred to the hard drive with **Braiden Kindt**'s python-ant-downloader,
and Mollie pointed out in her blog that the TCX files I had could be converted into
CSV using GPSBabel. What I still needed
to figure out myself was how to use OpenStreetMap instead of data from one
specific colossal corporation, how to include several plots within one figure,
and how to populate those plots with the data I wanted.

The trickiest part turned out to be calculating accurate distances from the GPS
coordinates. For reasons I can't fully comprehend, it turns out that almost
everybody assumes that the Earth is a perfect sphere when calculating the
distance. This is obviously not true, and since I often run on different
continents, I wanted to get this one nailed down. Of course, I wasn't the first
one looking for a solution to this problem. The most reasonable implementation
for R, by **Mario Pineda-Krch**, can be found on r-bloggers.com.
It is based on JavaScript code by **Chris Veness** available at movable-type.co.uk
(attribution license). The theory behind the
method is by **Thaddeus Vincenty**. This formula, however, also
disregards altitude variations, which can be significant if you like to run in
hilly areas like I do. So, I needed to add that part myself.
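
One simple way to account for altitude is to treat each step between two consecutive track points as the hypotenuse of a right triangle, with the horizontal distance as one leg and the altitude change as the other. The sketch below assumes this approach; it is not necessarily what `runmap` does. The function name `distance_with_altitude` and the sample numbers are made up for illustration, and the horizontal distances are assumed to have been computed already (for example with Vincenty's formula).

```r
# Hypothetical helper (not taken from runmap): combine horizontal distances
# with altitude changes using the Pythagorean theorem.
#   horizontal_m: horizontal distance between consecutive track points (metres)
#   altitude_m:   altitude at every track point (metres); one element longer
distance_with_altitude <- function(horizontal_m, altitude_m) {
  stopifnot(length(altitude_m) == length(horizontal_m) + 1)
  climb_m <- diff(altitude_m)        # altitude change for each step
  sqrt(horizontal_m^2 + climb_m^2)   # straight-line length of each step
}

# Made-up sample data: three steps between four track points.
horizontal_m <- c(30, 40, 12)
altitude_m   <- c(100, 140, 110, 105)

segment_m <- distance_with_altitude(horizontal_m, altitude_m)
total_m   <- sum(segment_m)
```

On flat ground the correction vanishes, while on steep sections it can add a noticeable amount to each step.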

Another point to consider was that the point-by-point pace information was far too noisy to be plotted directly from the data. I ended up solving this by calculating the average pace for each 100 m stretch and plotting those averages instead of each individual point. The result agrees reasonably well with the pace variations I noticed while running.
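
The averaging can be sketched roughly as follows, assuming the track is available as cumulative distance in metres and elapsed time in seconds for each point. The function name `pace_per_segment`, its arguments, and the sample data are hypothetical; the actual `runmap` code may do this differently.

```r
# Hypothetical helper (not taken from runmap): average pace per fixed-length
# stretch of the run, given cumulative distance and elapsed time per point.
pace_per_segment <- function(cum_dist_m, elapsed_s, segment_m = 100) {
  stopifnot(length(cum_dist_m) == length(elapsed_s))
  step_dist <- diff(cum_dist_m)   # metres covered in each step
  step_time <- diff(elapsed_s)    # seconds spent on each step
  # Assign each step to the stretch in which it ends (1 = first 100 m, ...).
  segment <- pmax(1, ceiling(cum_dist_m[-1] / segment_m))
  dist_sum <- tapply(step_dist, segment, sum)
  time_sum <- tapply(step_time, segment, sum)
  data.frame(
    segment = as.integer(names(dist_sum)),
    pace_min_per_km = as.numeric((time_sum / 60) / (dist_sum / 1000))
  )
}

# Made-up sample data: 200 m in 70 s, slowing down in the second half.
cum_dist_m <- c(0, 50, 100, 150, 200)
elapsed_s  <- c(0, 15, 30, 50, 70)

pace <- pace_per_segment(cum_dist_m, elapsed_s)
```

A step that straddles a segment boundary is counted entirely in the stretch where it ends, which is a simplification that matters little when points are recorded every few metres.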

The end result of my Sunday coding is R code that produces results such as this one:

The `runmap` code itself is available on GitHub under the MIT
license. I hope it will be useful to others as well.