Many GPS devices and apps have the capability to track your current position via GPS. If you go walking, running, cycling, flying or driving, you can take a look at your exact route and your average speed.

Some of these devices or apps also allow you to export your routes in various formats, e.g., the popular XML-based GPX format.

I want to show you my attempts to
- read in a GPX file using R and its XML package,
- calculate distances and speeds between points,
- plot elevation and speed,
- plot a track,
- plot a track on a map.

Here are the results. Everything below the plots shows you how to do it in R. Drop me a comment if you have any questions.


The altitude of the track in time, with smoother. Click to enlarge.

The speed during the run, with smoother. Click to enlarge.

The raw track without a map in the background. Click to enlarge.

The track with an OpenStreetMap of Stuttgart in the background.

The track with a Bing map of Stuttgart in the background. Click to enlarge.

The track with a Mapquest map of Stuttgart in the background. Click to enlarge.

Again the same track, now with Skobbler map of Stuttgart in the background. Click to enlarge.


If you want to replicate this script, you can use an example GPX file I uploaded. It's one of my runs through Stuttgart. The file name is "run.gpx".

We gonna need some packages.

library(XML)
library(OpenStreetMap)
library(lubridate)

And a function that shifts vectors conviniently (we gonna need this later):

shift.vec <- function (vec, shift) {
  if(length(vec) <= abs(shift)) {
    rep(NA ,length(vec))
  }else{
    if (shift >= 0) {
      c(rep(NA, shift), vec[1:(length(vec)-shift)]) }
    else {
      c(vec[(abs(shift)+1):length(vec)], rep(NA, abs(shift))) } } }

Now, we're reading in the GPX file. If you want help on parsing XML files, check out this (German) tutorial I made a while ago.

# Parse the GPX file
pfile <- htmlTreeParse("run.gpx",
                      error = function (...) {}, useInternalNodes = T)
# Get all elevations, times and coordinates via the respective xpath
elevations <- as.numeric(xpathSApply(pfile, path = "//trkpt/ele", xmlValue))
times <- xpathSApply(pfile, path = "//trkpt/time", xmlValue)
coords <- xpathSApply(pfile, path = "//trkpt", xmlAttrs)
# Extract latitude and longitude from the coordinates
lats <- as.numeric(coords["lat",])
lons <- as.numeric(coords["lon",])
# Put everything in a dataframe and get rid of old variables
geodf <- data.frame(lat = lats, lon = lons, ele = elevations, time = times)
rm(list=c("elevations", "lats", "lons", "pfile", "times", "coords"))
head(geodf)
       lat      lon ele                   time
1 48.78457 9.217300 312 2014-08-17T17:25:07.45
2 48.78457 9.217300 312 2014-08-17T17:25:07.52
3 48.78466 9.217295 311 2014-08-17T17:25:10.53
4 48.78475 9.217335 307 2014-08-17T17:25:13.50
5 48.78480 9.217410 310 2014-08-17T17:25:19.51
6 48.78486 9.217550 306 2014-08-17T17:25:22.49

We already have nice dataframe now with all the information available in the GPX file for each position. Each position is defined by the latitude and longitude and we also have the elevation (altitude) available. Note, that the altitude ist quite noisy with GPS, we will see this in a minute.

Now, let's calculate the distances between successive positions and the respective speed in this segment.

# Shift vectors for lat and lon so that each row also contains the next position.
geodf$lat.p1 <- shift.vec(geodf$lat, -1)
geodf$lon.p1 <- shift.vec(geodf$lon, -1)
# Calculate distances (in metres) using the function pointDistance from the 'raster' package.
# Parameter 'lonlat' has to be TRUE!
geodf$dist.to.prev <- apply(geodf, 1, FUN = function (row) {
  pointDistance(c(as.numeric(row["lat.p1"]),
  as.numeric(row["lon.p1"])),
                c(as.numeric(row["lat"]), as.numeric(row["lon"])),
                lonlat = T)
})
# Transform the column 'time' so that R knows how to interpret it.
geodf$time <- strptime(geodf$time, format = "%Y-%m-%dT%H:%M:%OS")
# Shift the time vector, too.
geodf$time.p1 <- shift.vec(geodf$time, -1)
# Calculate the number of seconds between two positions.
geodf$time.diff.to.prev <- as.numeric(difftime(geodf$time.p1, geodf$time))
# Calculate metres per seconds, kilometres per hour and two LOWESS smoothers to get rid of some noise.
geodf$speed.m.per.sec <- geodf$dist.to.prev / geodf$time.diff.to.prev
geodf$speed.km.per.h <- geodf$speed.m.per.sec * 3.6
geodf$speed.km.per.h <- ifelse(is.na(geodf$speed.km.per.h), 0, geodf$speed.km.per.h)
geodf$lowess.speed <- lowess(geodf$speed.km.per.h, f = 0.2)$y
geodf$lowess.ele <- lowess(geodf$ele, f = 0.2)$y

Now, let's plot all the stuff!

# Plot elevations and smoother
plot(geodf$ele, type = "l", bty = "n", xaxt = "n", ylab = "Elevation", xlab = "", col = "grey40")
lines(geodf$lowess.ele, col = "red", lwd = 3)
legend(x="bottomright", legend = c("GPS elevation", "LOWESS elevation"),
       col = c("grey40", "red"), lwd = c(1,3), bty = "n")

# Plot speeds and smoother
plot(geodf$speed.km.per.h, type = "l", bty = "n", xaxt = "n", ylab = "Speed (km/h)", xlab = "",
     col = "grey40")
lines(geodf$lowess.speed, col = "blue", lwd = 3)
legend(x="bottom", legend = c("GPS speed", "LOWESS speed"),
       col = c("grey40", "blue"), lwd = c(1,3), bty = "n")
abline(h = mean(geodf$speed.km.per.h), lty = 2, col = "blue")

# Plot the track without any map, the shape of the track is already visible.
plot(rev(geodf$lon), rev(geodf$lat), type = "l", col = "red", lwd = 3, bty = "n", ylab = "Latitude", xlab = "Longitude")

We will now use the OpenStreetMap package to get some maps from the internet and use them as a background for the track we just converted. There are several blocks for each map type. Check the comments in the first block what the different calls do.

First, we need to get the specific map. The 'type' argument controls which type of map we get.

map <- openmap(as.numeric(c(max(geodf$lat), min(geodf$lon))),
               as.numeric(c(min(geodf$lat), max(geodf$lon))), type = "osm")

This next step is important and it took me a while to figure this out. We need to convert the maps we got with the function openmap() to a projection that fits our longitude-latitude format of positions. This will distort the maps in the plots a little. But we need this step because the track has to fit the map!

transmap <- openproj(map, projection = "+proj=longlat")

# Now for plotting...
png("map1.png", width = 1000, height = 800, res = 100)
par(mar = rep(0,4))
plot(transmap, raster=T)
lines(geodf$lon, geodf$lat, type = "l", col = scales::alpha("red", .5), lwd = 4)
dev.off()

map <- openmap(as.numeric(c(max(geodf$lat), min(geodf$lon))),
               as.numeric(c(min(geodf$lat), max(geodf$lon))), type = "bing")

transmap <- openproj(map, projection = "+proj=longlat")
png("map2.png", width = 1000, height = 800, res = 100)
par(mar = rep(0,4))
plot(transmap, raster=T)
lines(geodf$lon, geodf$lat, type = "l", col = scales::alpha("yellow", .5), lwd = 4)
dev.off()

map <- openmap(as.numeric(c(max(geodf$lat), min(geodf$lon))),
               as.numeric(c(min(geodf$lat), max(geodf$lon))), type = "mapquest")

transmap <- openproj(map, projection = "+proj=longlat")
png("map3.png", width = 1000, height = 800, res = 100)
par(mar = rep(0,4))
plot(transmap, raster=T)
lines(geodf$lon, geodf$lat, type = "l", col = scales::alpha("yellow", .5), lwd = 4)
dev.off()

map <- openmap(as.numeric(c(max(geodf$lat), min(geodf$lon))),
               as.numeric(c(min(geodf$lat), max(geodf$lon))), type = "skobbler")

transmap <- openproj(map, projection = "+proj=longlat")
png("map4.png", width = 1000, height = 800, res = 100)
par(mar = rep(0,4))
plot(transmap, raster=T)
lines(geodf$lon, geodf$lat, type = "l", col = scales::alpha("blue", .5), lwd = 4)
dev.off()

map <- openmap(as.numeric(c(max(geodf$lat), min(geodf$lon))),
               as.numeric(c(min(geodf$lat), max(geodf$lon))), type = "esri-topo")

transmap <- openproj(map, projection = "+proj=longlat")
png("map5.png", width = 1000, height = 800, res = 100)
par(mar = rep(0,4))
plot(transmap, raster=T)
lines(geodf$lon, geodf$lat, type = "l", col = scales::alpha("blue", .5), lwd = 4)
dev.off()
5

View comments

  1. You can also use the simple "readGPX" function in the plotKML package to put your GPX file into a data frame.
    Thanks for the demo, this is great!
    Rock

    ReplyDelete
  2. Hey Rock, thanks for the tip! That would have saved me some typing, indeed...

    ReplyDelete
  3. This comment has been removed by a blog administrator.

    ReplyDelete
  4. A reader of my blog asked me about how to enhance the resolution of the maps. You can do this by setting the Parameter 'minNumTiles' in the function 'openmap' to a higher value (I tried 100 which took a while to load the map). Then you increase the width, height and resolution parameters of your png call and the map should have a better resolution.

    ReplyDelete
  5. Hey Sascha, I am wondering if there is a typo/error in pointDistance() usage?

    It seems to me it should be:

    pointDistance(c(lon1, lat1), c(lon2, lat2), lonlat = TRUE)

    whereas the code under "Now, let's calculate the distances between successive positions and the respective speed in this segment" uses:

    pointDistance(c(lat1, lon1), c(lat2, lon2), lonlat = TRUE)

    ReplyDelete

Hi all, this is just an announcement.

I am moving Rcrastinate to a blogdown-based solution and am therefore leaving blogger.com. If you're interested in the new setup and how you could do the same yourself, please check out the all shiny and new Rcrastinate over at

http://rcrastinate.rbind.io/

In my first post over there, I am giving a short summary on how I started the whole thing. I hope that the new Rcrastinate is also integrated into R-bloggers soon.

Thanks for being here, see you over there.

Alright, seems like this is developing into a blog where I am increasingly investigating my own music listening habits.

Recently, I've come across the analyzelastfm package by Sebastian Wolf. I used it to download my complete listening history from Last.FM for the last ten years. That's a complete dataset from 2009 to 2018 with exactly 65,356 "scrobbles" (which is the word Last.FM uses to describe one instance of a playback of a song).
3

Giddy up, giddy it up

Wanna move into a fool's gold room

With my pulse on the animal jewels

Of the rules that you choose to use to get loose

With the luminous moves

Bored of these limits, let me get, let me get it like

Wow!

When it comes to surreal lyrics and videos, I'm always thinking of Beck. Above, I cited the beginning of the song "Wow" from his latest album "Colors" which has received rather mixed reviews. In this post, I want to show you what I have done with Spotify's API.

Click here for the interactive visualization

If you're interested in the visualisation of networks or graphs, you might've heard of the great package "visNetwork". I think it's a really great package and I love playing around with it. The scenarios of graph-based analyses are many and diverse: whenever you can describe your data in terms of "outgoing" and "receiving" entities, a graph-based analysis and/or visualisation is possible.
12

Here is some updated R code from my previous post. It doesn't throw any warnings when importing tracks with and without heart rate information. Also, it is easier to distinguish types of tracks now (e.g., when you want to plot runs and rides separately). Another thing I changed: You get very basic information on the track when you click on it (currently the name of the track and the total length).

Have fun and leave a comment if you have any questions.
3

So, Strava's heatmap made quite a stir the last few weeks. I decided to give it a try myself. I wanted to create some kind of "personal heatmap" of my runs, using Strava's API. Also, combining the data with Leaflet maps allows us to make use of the beautiful map tiles supported by Leaflet and to zoom and move the maps around - with the runs on it, of course.

So, let's get started. First, you will need an access token for Strava's API.

I've been using the ggplot2 package a lot recently. When creating a legend or tick marks on the axes, ggplot2 uses the levels of a character or factor vector. Most of the time, I am working with coded variables that use some abbreviation of the "true" meaning (e.g. "f" for female and "m" for male or single characters for some single character for a location: "S" for Stuttgart and "M" for Mannheim).

In my plots, I don't want these codes but the full name of the level.

It's been a while since I had the opportunity to post something on music. Let's get back to that.

I got my hands on some song lyrics by a range of artists. (I have an R script to download all lyrics for a given artist from a lyrics website.
4

Lately, I got the chance to play around with Shiny and Leaflet a lot - and it is really fun! So I decided to catch up on an old post of mine and build a Shiny application where you can upload your own GPX files and plot them directly in the browser.

Of course, you will need some GPX file to try it out. You can get an example file here (you gonna need to save it in a .gpx file with a text editor, though). Also, the Shiny application will always plot the first track saved in a GPX file.
9
Blog Archive
BlogRoll
BlogRoll
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.