09 January 2018

Another R Book

It should come as no surprise that I, being a codger, prefer my books hardcovered, with pages sewn into a spine. For the last few decades those have been mostly about relational databases and things quant, R being the latest obsession. The pile never seems to diminish.

One kind of interesting aspect of the R world is that there is a bomb thrower amongst them, not wholly unlike what's going on in the political world. That would be Hadley Wickham. Some years ago there were reports of naughty words directed at him. You can let your fingers do the walking through the Intertubes (you'll even find a bit of prose from Your Humble Servant in one such thread).

Which brings me to his latest (with co-author Garrett Grolemund), "R for Data Science". I don't think it's the best learning text for R per se; Crawley is still (though, yes, a little long in the tooth) my preferred intro, supplemented with ggplot2 for up-to-date graphics. But as a reference on Wickham's tidyverse, it's canonical.

Here's a recent view:
The tidyverse is an 'opinionated' collection of R packages that duplicate and seek to improve upon numerous base R functions for data manipulation (e.g. dplyr) and graphing (e.g. ggplot2). As the tidyverse has grown increasingly more comprehensive, it has been suggested that it be taught first to new R users. The debate between which R dialect is better has generated a lot of heat, but not much light.

Here's one point from the book that, at one time, likely would have gotten major flames:
R is an old language, and some things that were useful 10 or 20 years ago now get in your way. It's difficult to change base R without breaking existing code, so most innovation occurs in packages. Here we will describe the tibble package, which provides opinionated data frames that make working in the tidyverse a little easier.

Anyway, recommended. O'Reilly spent some extra moolah on color graphics and text to make a pretty book. On the whole, I'd rather they'd spent the money to put it in a RepKover binding.
