I have recently been toying with the idea of getting better versed in the modern tools used in data analysis and so-called big data. One of the tools often mentioned in this arena is R, the free software environment for statistical computing and graphics. Since I like to practice new programming languages, I had for a couple of months been looking for a good excuse to try R in a real application.
Until recently, I logged all my running activities on online services, which provided a nice overview of each run individually and a convenient way to browse through old entries. However, this has always felt a bit silly: while having the data online is good for sharing on social media, for data that interests only me it is overkill. Nevertheless, unaware of any good offline tools for the purpose, I was stuck with the online solutions. As you can guess, this is where R came to the rescue. I can't remember how it happened, but this Sunday I landed on Mollie Taylor's blog, where she discussed Mapping GPS Tracks in R. Now I had the long-awaited excuse for getting my hands dirty with R.
I already had the GPS and heart rate data from my Garmin Forerunner 405 transferred to the hard drive with Braiden Kindt's python-ant-downloader, and Mollie pointed out in her blog that the TCX files I had could be converted into CSV with GPSBabel. What I still needed to figure out for myself was how to use OpenStreetMap instead of data from one specific colossal corporation, how to include several plots within one figure, and how to populate those plots with the data I wanted.
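For reference, the conversion step can be scripted from R itself. A minimal sketch, assuming gpsbabel is on the PATH and using hypothetical file names ("gtrnctr" is GPSBabel's name for the Garmin Training Center TCX format, "unicsv" its generic CSV writer):

```r
# Hypothetical file names for illustration
tcx_file <- "activity.tcx"
csv_file <- "activity.csv"

# Build the GPSBabel argument list: read track data (-t) from a TCX
# file and write it out as universal CSV
args <- c("-t",
          "-i", "gtrnctr", "-f", tcx_file,
          "-o", "unicsv",  "-F", csv_file)

# system2("gpsbabel", args)  # uncomment to actually run the conversion
cat("gpsbabel", args, "\n")
```

The same invocation works directly from the command line, of course; wrapping it in `system2()` is just convenient when batch-converting a directory of runs from within an R script.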
Another point to consider was that the point-by-point pace information was much too noisy to plot directly. I ended up solving this by calculating the average pace over each 100 m segment and plotting those averages instead of the individual points. The result agrees reasonably well with my subjective experience of pace variation while running.
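A minimal sketch of that averaging in R, assuming the track has already been reduced to vectors of cumulative distance (in metres) and elapsed time (in seconds); the function name and the synthetic data are my own, not from the original code:

```r
# Average pace per fixed-length segment, in min/km.
# Uses the within-bin distance span (max - min), which slightly
# underestimates each segment but is fine as a smoothing sketch.
avg_pace_per_segment <- function(dist_m, time_s, seg_len = 100) {
  seg <- floor(dist_m / seg_len)                    # 100 m bin per point
  d <- tapply(dist_m, seg, function(x) max(x) - min(x))
  t <- tapply(time_s, seg, function(x) max(x) - min(x))
  keep <- d > 0                                     # drop single-point bins
  (t[keep] / 60) / (d[keep] / 1000)                 # minutes per kilometre
}

# Synthetic example: a steady 5 min/km run sampled every 25 m
dist <- seq(0, 1000, by = 25)
time <- dist * 0.3                                  # 0.3 s per metre
pace <- avg_pace_per_segment(dist, time)            # ~5 for every segment
```

Plotting `pace` against the segment midpoints then gives a readable pace curve instead of the noisy point-by-point values.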
The end result of my Sunday coding is a piece of R code that produces figures such as this one:
I have been terrible about updating the blog recently, and I'm offering no excuses for it. The reason for this update is the open notebook I established on the wiki of the University of Vienna. I will keep the notebook in the spirit of Open Notebook Science for projects where the approach seems suitable. I wish I could do this with all of my projects, and I may indeed start to at some point, but for now I will just test the concept. The first project where I'll try it is the exfoliation and characterization of 2D crystals, which we are carrying out with a couple of students here at the university. We'll start with mechanical exfoliation and then carry out optical microscopy, Raman spectroscopy, atomic force microscopy and transmission electron microscopy on the samples. Let's see how this goes.
The main reason for trying this out is the project's need for a centralized notebook, together with the appeal of open science. The final push came from a presentation by Peter Murray-Rust given yesterday (3 June 2014) in Vienna, organized by the Austrian Science Fund.