I recently came across some functionality in R that many of you probably already know. I want to share it here anyway, because it might come in handy for those who don't.
Suppose you want to read a large number of very large text tables into R. The fread() function in the data.table package is really fast at reading such tables. However, it is still under development and sometimes fails (e.g., if an entry contains unbalanced quotes).
I guess this will be fixed in the future. In the meantime, I wrote a little function that catches the error and tries something else.
The following function reads a file with fread() (I stored a test file on some private webspace in case you want to try this out). fread() will fail on this file, so the function then falls back to good old read.table() with the appropriate parameters set. read.table() is much slower, but it also copes with unbalanced quotes.
The function try() does the trick:
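library(data.table)  # provides fread() and as.data.table()
read.file <- function (file.name) {
  # Try the fast reader first; try() returns a "try-error" object on failure.
  file <- try(fread(file.name))
  if (inherits(file, "try-error")) {
    cat("Caught an error during fread, trying read.table.\n")
    file <- as.data.table(read.table(file.name, sep = " ", quote = ""))
  }
  file
}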
# Let's try this (R's messages translated from German)
> read.file("http://www.wolferonline.de/test/test.txt")
trying URL 'http://www.wolferonline.de/test/test.txt'
Content type 'text/plain' length 72 bytes
opened URL
downloaded 72 bytes
Error in fread(file.name) :
Unbalanced " observed on this line: "Unbalanced.quotes some.entry some.other.entry
Caught an error during fread, trying read.table.
V1 V2 V3
1 No.quotes entry1 entry2
2 "Unbalanced.quotes some.entry some.other.entry
The cool thing about this: whether you are reading a file or doing something else that has a fast and a slow way to do it, you can try the fast way first. If it fails, you can still fall back on the more stable (but slower) way. You can also use try() as often as you like, so if the slower way fails as well, you can return something that the rest of your script can still work with.
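The idea of trying several approaches in turn can be sketched as a small helper. This is only a sketch under a few assumptions: the names read.with.fallbacks and readers are made up for this example, the "fast" reader is a dummy that always fails, and NULL is just one possible choice for "something your script can use further on".

```r
# Try a list of reader functions in order and return the first result that works.
# `read.with.fallbacks` and `readers` are hypothetical names for this sketch.
read.with.fallbacks <- function (file.name, readers) {
  for (reader in readers) {
    result <- try(reader(file.name), silent = TRUE)
    if (!inherits(result, "try-error")) {
      return(result)
    }
  }
  # Every attempt failed: return NULL so the calling script can check for it.
  NULL
}

# Example with base R only: a small test file with an unbalanced quote.
tmp <- tempfile(fileext = ".txt")
writeLines(c("No.quotes entry1 entry2",
             "\"Unbalanced.quotes some.entry some.other.entry"), tmp)

readers <- list(
  function (f) stop("pretend the fast reader failed"),  # stands in for fread()
  function (f) read.table(f, sep = " ", quote = "")     # the slow, robust way
)
tab <- read.with.fallbacks(tmp, readers)
print(tab)
```

In a real script, the calling code would check the result with is.null() before going on.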
Good luck!
EDIT [17/07/2015]: Please note that if the object returned in the successful case has a class vector of length greater than 1, a check of the form if (class(file) == "try-error") triggers a warning (and, since R 4.2.0, an error). This is the case for data.tables: class(data.table()) returns "data.table" "data.frame". To avoid this, you can write if (class(file)[1] == "try-error") or, more robustly, if (inherits(file, "try-error")).
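To see the difference between the checks mentioned in the note above, here is a minimal sketch in base R. The class name "my.table" is made up; it only stands in for an object with two classes, such as a data.table, so that the example runs without loading any package.

```r
# A failed try() returns an object of class "try-error".
failed <- try(stop("something went wrong"), silent = TRUE)

# An object with two classes, standing in for a data.table ("my.table" is made up).
ok <- structure(data.frame(x = 1:2), class = c("my.table", "data.frame"))

length(class(ok))              # 2, so class(ok) == "try-error" has length 2
class(ok)[1] == "try-error"    # FALSE: compares only the first class
inherits(ok, "try-error")      # FALSE: looks at the whole class vector
inherits(failed, "try-error")  # TRUE
```

inherits() always returns a single TRUE or FALSE, no matter how many classes the object has, so it is safe to use inside if().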

Rcrastinate is moving.
Hi all, this is just an announcement.
I am moving Rcrastinate to a blogdown-based solution and am therefore leaving blogger.com. If you're interested in the new setup and how you could do the same yourself, please check out the all shiny and new Rcrastinate over at
In my first post over there, I am giving a short summary on how I started the whole thing. I hope that the new Rcrastinate is also integrated into R-bloggers soon.
Thanks for being here, see you over there.
10 years of playback history on Last.FM: "Just sit back and listen"
Alright, seems like this is developing into a blog where I am increasingly investigating my own music listening habits.
Recently, I've come across the analyzelastfm package by Sebastian Wolf. I used it to download my complete listening history from Last.FM for the last ten years. That's a complete dataset from 2009 to 2018 with exactly 65,356 "scrobbles" (which is the word Last.FM uses to describe one instance of a playback of a song).
This dance, it's like a weapon: Radiohead's and Beck's danceability, valence, popularity, and more from the LastFM and Spotify APIs
Giddy up, giddy it up
Wanna move into a fool's gold room
With my pulse on the animal jewels
Of the rules that you choose to use to get loose
With the luminous moves
Bored of these limits, let me get, let me get it like
When it comes to surreal lyrics and videos, I always think of Beck. Above, I quoted the beginning of the song "Wow" from his latest album "Colors", which has received rather mixed reviews. In this post, I want to show you what I have done with Spotify's API.
Network visualization of football transfers using the 'visNetwork' package
Click here for the interactive visualization
If you're interested in the visualisation of networks or graphs, you might've heard of the great package "visNetwork". I think it's a really great package and I love playing around with it. The scenarios of graph-based analyses are many and diverse: whenever you can describe your data in terms of "outgoing" and "receiving" entities, a graph-based analysis and/or visualisation is possible.
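As a rough sketch of what "outgoing" and "receiving" entities look like as data, here is a tiny edge list built in base R. The clubs and transfers are made up for illustration. visNetwork expects a nodes data frame with an id column and an edges data frame with from and to columns; the final call is only attempted if the package is installed.

```r
# Hypothetical transfers: each row is one move from a selling to a buying club.
transfers <- data.frame(
  from = c("Club A", "Club A", "Club B"),
  to   = c("Club B", "Club C", "Club C"),
  stringsAsFactors = FALSE
)

# One node per club that appears on either side of a transfer.
clubs <- sort(unique(c(transfers$from, transfers$to)))
nodes <- data.frame(id = clubs, label = clubs, stringsAsFactors = FALSE)

# Count outgoing and incoming transfers per club.
nodes$outgoing <- as.integer(table(factor(transfers$from, levels = clubs)))
nodes$incoming <- as.integer(table(factor(transfers$to, levels = clubs)))
print(nodes)

# Build the network object only if visNetwork is installed.
# Printing `network` in an interactive session displays the graph.
if (requireNamespace("visNetwork", quietly = TRUE)) {
  network <- visNetwork::visNetwork(nodes, transfers)
}
```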
Send tweets from R: A very short walkthrough
There are a few reasons why you might want to send tweets from R. You might want to write a Twitter bot or - as in my case - you want to send yourself a tweet when a very long computation finishes.
So, here I will run you through all the steps you have to take using
- Twitter's API and
- the twitteR package written by Jeff Gentry
The setup to send myself tweets is the following: I have my main twitter account and an additional account I am only using to tweet from R.
Get your tracks from the Strava API and plot them on Leaflet maps
Here is some updated R code from my previous post. It doesn't throw any warnings when importing tracks with and without heart rate information. Also, it is easier to distinguish types of tracks now (e.g., when you want to plot runs and rides separately). Another thing I changed: You get very basic information on the track when you click on it (currently the name of the track and the total length).
Have fun and leave a comment if you have any questions.
Where do you run to? Map your Strava activities on static and Leaflet maps.
So, Strava's heatmap has caused quite a stir over the last few weeks. I decided to give it a try myself. I wanted to create some kind of "personal heatmap" of my runs, using Strava's API. Also, combining the data with Leaflet maps allows us to make use of the beautiful map tiles supported by Leaflet and to zoom and move the maps around, with the runs on them, of course.
So, let's get started. First, you will need an access token for Strava's API.
Substitute levels in a factor or character vector
I've been using the ggplot2 package a lot recently. When creating a legend or tick marks on the axes, ggplot2 uses the levels of a character or factor vector. Most of the time, I am working with coded variables that use some abbreviation of the "true" meaning (e.g. "f" for female and "m" for male, or a single character for a location: "S" for Stuttgart and "M" for Mannheim).
In my plots, I don't want these codes but the full name of the level.
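One way to swap the codes for full names in base R is sketched below. It is not necessarily the approach taken in the post itself: the variable names and data values are made up, and the lookup-vector pattern is just one common option.

```r
# Coded variables as they might appear in the raw data (made-up values).
sex <- c("f", "m", "m", "f")
location <- c("S", "M", "S", "S")

# Named lookup vectors: names are the codes, values are the full labels.
sex.labels <- c(f = "female", m = "male")
location.labels <- c(S = "Stuttgart", M = "Mannheim")

# factor() maps each code in `levels` to the label at the same position.
sex.full <- factor(sex, levels = names(sex.labels),
                   labels = unname(sex.labels))
location.full <- factor(location, levels = names(location.labels),
                        labels = unname(location.labels))

levels(sex.full)       # "female" "male"
levels(location.full)  # "Stuttgart" "Mannheim"
```

Because the result is a factor whose levels are the full names, ggplot2 would pick these up for legends and axis labels.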
What's in the words? Comparing artists and lyrics with R.
It's been a while since I had the opportunity to post something on music. Let's get back to that.
I got my hands on some song lyrics by a range of artists. (I have an R script to download all lyrics for a given artist from a lyrics website.
Plotting GPX tracks with Shiny and Leaflet
Lately, I got the chance to play around with Shiny and Leaflet a lot - and it is really fun! So I decided to catch up on an old post of mine and build a Shiny application where you can upload your own GPX files and plot them directly in the browser.
Of course, you will need a GPX file to try it out. You can get an example file here (you will need to save it as a .gpx file with a text editor, though). Also, the Shiny application will always plot the first track saved in a GPX file.