Following our initial and very gratifying Shiny Developer Conference this past January, which sold out in a few days, RStudio is very excited to announce a new and bigger conference today!

rstudio::conf, the conference about all things R and RStudio, will take place January 13 and 14, 2017 in Orlando, Florida. The conference will feature talks and tutorials from popular RStudio data scientists and developers like Hadley Wickham, Yihui Xie, Joe Cheng, Winston Chang, Garrett Grolemund, and J.J. Allaire, along with lightning talks from RStudio partners and customers.

Preceding the conference, on January 11 and 12, RStudio will offer two days of optional training. Training attendees can choose from Hadley Wickham’s Master R training, a new Intermediate Shiny workshop from Shiny creator Joe Cheng, or a new workshop from Garrett Grolemund based on his soon-to-be-published book with Hadley: Introduction to Data Science with R.

rstudio::conf is for R and RStudio users who want to learn how to write better Shiny applications, explore all the new capabilities of the R Markdown authoring framework, apply R to big data and work effectively with Spark, understand the RStudio toolchain for data science with R, discover best practices and tips for coding with RStudio, and investigate enterprise-scale development and deployment practices and tools, including the new RStudio Connect.

Not to be missed, RStudio has also reserved Universal Studios’ The Wizarding World of Harry Potter on Friday night, January 13, for the exclusive use of conference attendees!

Conference attendance is limited to 400. Training is limited to 70 students for each of the three 2-day workshops. All seats are available on a first-come, first-served basis.

Please go to http://www.rstudio.com/conference to purchase tickets.

We hope to see you in Florida at rstudio::conf 2017!

For questions or issues with registration, please email conf@rstudio.com. To ask about sponsorship opportunities, contact anne@rstudio.com.

UseR! 2016 has arrived and the RStudio team is at Stanford to share our newest products and latest enhancements to Shiny, R Markdown, dplyr, and more. Here’s a quick snapshot of RStudio-related sessions. We hope to see you at as many of them as you can attend!

Monday June 27

Morning Tutorials

Afternoon Tutorials

Afternoon short talks moderated by Hadley Wickham

Tuesday June 28

Wednesday June 29

Thursday June 30

Stop by the booth!
Don’t miss our table in the exhibition area during the conference. Come talk to us about your plans for R and learn how RStudio Server Pro and Shiny Server Pro can provide enterprise-ready support and scalability for your RStudio IDE and Shiny deployments.

Note: Although UseR! is sold out, arrangements have been made to stream the keynote talks from https://aka.ms/user2016conference. Video recordings of the other sessions (where permitted by speakers) will be made available by UseR! organizers after the conference.


I’m very pleased to announce that dplyr 0.5.0 is now available from CRAN. Get the latest version with:

install.packages("dplyr")

dplyr 0.5.0 is a big release with a heap of new features, a whole bunch of minor improvements, and many bug fixes, both from me and from the broader dplyr community. In this blog post, I’ll highlight the most important changes:

  • Some breaking changes to single table verbs.
  • New tibble and dtplyr packages.
  • New vector functions.
  • Replacements for summarise_each() and mutate_each().
  • Improvements to SQL translation.

To see the complete list, please read the release notes.

Breaking changes

arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.

mtcars %>% 
  group_by(cyl) %>% 
  arrange(desc(mpg))
#> Source: local data frame [32 x 11]
#> Groups: cyl [3]
#> 
#> # A tibble: 32 x 11
#>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1  33.9     4  71.1    65  4.22 1.835 19.90     1     1     4     1
#> 2  32.4     4  78.7    66  4.08 2.200 19.47     1     1     4     1
#> 3  30.4     4  75.7    52  4.93 1.615 18.52     1     1     4     2
#> 4  30.4     4  95.1   113  3.77 1.513 16.90     1     1     5     2
#> 5  27.3     4  79.0    66  4.08 1.935 18.90     1     1     4     1
#> ... with 27 more rows

If you give distinct() a list of variables, it now only keeps those variables (instead of, as previously, keeping the first value from the other variables). To preserve the previous behaviour, use .keep_all = TRUE:

df <- data_frame(x = c(1, 1, 1, 2, 2), y = 1:5)

# Now only keeps x variable
df %>% distinct(x)
#> # A tibble: 2 x 1
#>       x
#>   <dbl>
#> 1     1
#> 2     2

# Previous behaviour preserved all variables
df %>% distinct(x, .keep_all = TRUE)
#> # A tibble: 2 x 2
#>       x     y
#>   <dbl> <int>
#> 1     1     1
#> 2     2     4

The select() helper functions starts_with(), ends_with(), etc. are now real exported functions. This means that they have better documentation, and there’s an extension mechanism if you want to write your own helpers.

Tibble and dtplyr packages

Functions related to the creation and coercion of tbl_dfs (“tibbles” for short) now live in their own package: tibble. See vignette("tibble") for more details.

Similarly, all code related to the data.table backend has been separated out into a new dtplyr package. This decouples the development of the data.table interface from the development of the dplyr package, and I hope it will spur improvements to the backend. If both data.table and dplyr are loaded, you’ll get a message reminding you to load dtplyr.
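As a minimal sketch (assuming dtplyr is installed; tbl_dt() is the wrapper it inherited from dplyr), using the backend looks like this:

library(dplyr)
library(dtplyr)

# Wrap a data frame as a data.table-backed tbl, then use dplyr verbs as usual
mtcars_dt <- tbl_dt(mtcars)
mtcars_dt %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))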

Vector functions

This version of dplyr gains a number of vector functions inspired by SQL. Two functions make it a little easier to eliminate or generate missing values:

  • Given a set of vectors, coalesce() finds the first non-missing value in each position:
    x <- c(1,  2, NA, 4, NA, 6)
    y <- c(NA, 2,  3, 4,  5, NA)
    
    # Use this to piece together a complete vector:
    coalesce(x, y)
    #> [1] 1 2 3 4 5 6
    
    # Or just replace missing value with a constant:
    coalesce(x, 0)
    #> [1] 1 2 0 4 0 6
  • The complement of coalesce() is na_if(): it replaces a specified value with an NA.
    x <- c(1, 5, 2, -99, -99, 10)
    na_if(x, -99)
    #> [1]  1  5  2 NA NA 10

Three functions provide convenient ways of replacing values. In order from simplest to most complicated, they are:

  • if_else(), a vectorised if statement, takes a logical vector (usually created with a comparison operator like ==, <, or %in%) and replaces TRUEs with one vector and FALSEs with another.
    x1 <- sample(5)
    if_else(x1 < 5, "small", "big")
    #> [1] "small" "small" "big"   "small" "small"

    if_else() is similar to base::ifelse(), but has two useful improvements.
    First, it has a fourth argument that will replace missing values:

    x2 <- c(NA, x1)
    if_else(x2 < 5, "small", "big", "unknown")
    #> [1] "unknown" "small"   "small"   "big"     "small"   "small"

    Secondly, it has stricter semantics than ifelse(): the true and false arguments must be the same type. This gives a less surprising return type, and preserves S3 vectors like dates and factors:

    x <- factor(sample(letters[1:5], 10, replace = TRUE))
    ifelse(x %in% c("a", "b", "c"), x, factor(NA))
    #>  [1] NA NA  1 NA  3  2  3 NA  3  2
    if_else(x %in% c("a", "b", "c"), x, factor(NA))
    #>  [1] <NA> <NA> a    <NA> c    b    c    <NA> c    b   
    #> Levels: a b c d e

    Currently, if_else() is very strict, so you’ll need to carefully match the types of true and false. This is most likely to bite you when you’re using missing values, and you’ll need to use a specific NA: NA_integer_, NA_real_, or NA_character_:

    if_else(TRUE, 1, NA)
    #> Error: `false` has type 'logical' not 'double'
    if_else(TRUE, 1, NA_real_)
    #> [1] 1
  • recode(), a vectorised switch(), takes a numeric vector, character vector, or factor, and replaces elements based on their values.
    x <- sample(c("a", "b", "c", NA), 10, replace = TRUE)
    
    # The default is to leave non-replaced values as is
    recode(x, a = "Apple")
    #>  [1] "c"     "Apple" NA      NA      "c"     NA      "b"     NA     
    #>  [9] "c"     "Apple"
    # But you can choose to override the default:
    recode(x, a = "Apple", .default = NA_character_)
    #>  [1] NA      "Apple" NA      NA      NA      NA      NA      NA     
    #>  [9] NA      "Apple"
    # You can also choose what value is used for missing values
    recode(x, a = "Apple", .default = NA_character_, .missing = "Unknown")
    #>  [1] NA        "Apple"   "Unknown" "Unknown" NA        "Unknown" NA       
    #>  [8] "Unknown" NA        "Apple"
  • case_when() is a vectorised set of if and else if statements. You provide it a set of test-result pairs as formulas: the left hand side of each formula should return a logical vector, and the right hand side should return either a single value or a vector the same length as the left hand side. All results must be the same type of vector.
    x <- 1:40
    case_when(
      x %% 35 == 0 ~ "fizz buzz",
      x %% 5 == 0 ~ "fizz",
      x %% 7 == 0 ~ "buzz",
      TRUE ~ as.character(x)
    )
    #>  [1] "1"         "2"         "3"         "4"         "fizz"     
    #>  [6] "6"         "buzz"      "8"         "9"         "fizz"     
    #> [11] "11"        "12"        "13"        "buzz"      "fizz"     
    #> [16] "16"        "17"        "18"        "19"        "fizz"     
    #> [21] "buzz"      "22"        "23"        "24"        "fizz"     
    #> [26] "26"        "27"        "buzz"      "29"        "fizz"     
    #> [31] "31"        "32"        "33"        "34"        "fizz buzz"
    #> [36] "36"        "37"        "38"        "39"        "fizz"

    case_when() is still somewhat experimental and does not currently work inside mutate(). That will be fixed in a future version.

I also added one small helper for dealing with floating point comparisons: near() tests for equality with numeric tolerance (abs(x - y) < tolerance).

x <- sqrt(2) ^ 2

x == 2
#> [1] FALSE
near(x, 2)
#> [1] TRUE

Predicate functions

Thanks to ideas and code from Lionel Henry, a new family of functions improves upon summarise_each() and mutate_each():

  • summarise_all() and mutate_all() apply a function to all (non-grouped) columns:
    mtcars %>% group_by(cyl) %>% summarise_all(mean)    
    #> # A tibble: 3 x 11
    #>     cyl      mpg     disp        hp     drat       wt     qsec        vs
    #>   <dbl>    <dbl>    <dbl>     <dbl>    <dbl>    <dbl>    <dbl>     <dbl>
    #> 1     4 26.66364 105.1364  82.63636 4.070909 2.285727 19.13727 0.9090909
    #> 2     6 19.74286 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
    #> 3     8 15.10000 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
    #> ... with 3 more variables: am <dbl>, gear <dbl>, carb <dbl>
  • summarise_at() and mutate_at() operate on a subset of columns. You can select columns with:
    • a character vector of column names,
    • a numeric vector of column positions, or
    • a column specification with select() semantics generated with the new vars() helper.
    mtcars %>% group_by(cyl) %>% summarise_at(c("mpg", "wt"), mean)
    #> # A tibble: 3 x 3
    #>     cyl      mpg       wt
    #>   <dbl>    <dbl>    <dbl>
    #> 1     4 26.66364 2.285727
    #> 2     6 19.74286 3.117143
    #> 3     8 15.10000 3.999214
    mtcars %>% group_by(cyl) %>% summarise_at(vars(mpg, wt), mean)
    #> # A tibble: 3 x 3
    #>     cyl      mpg       wt
    #>   <dbl>    <dbl>    <dbl>
    #> 1     4 26.66364 2.285727
    #> 2     6 19.74286 3.117143
    #> 3     8 15.10000 3.999214
  • summarise_if() and mutate_if() take a predicate function (a function that returns TRUE or FALSE when given a column). This makes it easy to apply a function only to numeric columns:
    iris %>% summarise_if(is.numeric, mean)
    #>   Sepal.Length Sepal.Width Petal.Length Petal.Width
    #> 1     5.843333    3.057333        3.758    1.199333

All of these functions pass ... on to the individual funs:

iris %>% summarise_if(is.numeric, mean, trim = 0.25)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1     5.802632    3.032895     3.934211    1.230263

A new select_if() allows you to pick columns with a predicate function:

df <- data_frame(x = 1:3, y = c("a", "b", "c"))
df %>% select_if(is.numeric)
#> # A tibble: 3 x 1
#>       x
#>   <int>
#> 1     1
#> 2     2
#> 3     3
df %>% select_if(is.character)
#> # A tibble: 3 x 1
#>       y
#>   <chr>
#> 1     a
#> 2     b
#> 3     c

summarise_each() and mutate_each() will be deprecated in a future release.
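For example, a summarise_each() call translates directly into the new style (both forms work in this release; this is just an illustrative pair):

# Old style, to be deprecated
mtcars %>% group_by(cyl) %>% summarise_each(funs(mean), mpg, wt)

# New equivalent
mtcars %>% group_by(cyl) %>% summarise_at(vars(mpg, wt), mean)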

SQL translation

I have completely overhauled the translation of dplyr verbs into SQL statements. Previously, dplyr used a rather ad hoc approach that tried to guess when a new subquery was needed. Unfortunately this approach was fraught with bugs, so I have now implemented a richer internal data model. In the short term, this is likely to lead to some minor performance decreases (as the generated SQL is more complex), but dplyr is much more likely to generate correct SQL. In the long term, these abstractions will make it possible to write a query optimiser/compiler in dplyr, which would make it possible to generate much more succinct queries. If you know anything about writing query optimisers or compilers and are interested in working on this problem, please let me know!
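If you’re curious how individual R expressions are rendered, translate_sql() (exported by dplyr) gives a quick window into the translation layer. A small illustration; the exact SQL produced depends on the database backend:

library(dplyr)

# Translate R expressions into their SQL equivalents
translate_sql(x == 1 & !is.na(y))
translate_sql(mean(x, na.rm = TRUE))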

I’m pleased to announce tidyr 0.5.0. tidyr makes it easy to “tidy” your data, storing it in a consistent form so that it’s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the tidy data vignette. Install it with:

install.packages("tidyr")

This release has three useful new features:

  1. separate_rows() separates values that contain multiple observations separated by a delimiter into multiple rows. Thanks to Aaron Wolen for the contribution!
    df <- data_frame(x = 1:2, y = c("a,b", "d,e,f"))
    df %>% 
      separate_rows(y, sep = ",")
    #> Source: local data frame [5 x 2]
    #> 
    #>       x     y
    #>   <int> <chr>
    #> 1     1     a
    #> 2     1     b
    #> 3     2     d
    #> 4     2     e
    #> 5     2     f

    Compare with separate(), which separates into (named) columns:

    df %>% 
      separate(y, c("y1", "y2", "y3"), sep = ",", fill = "right")
    #> Source: local data frame [2 x 4]
    #> 
    #>       x    y1    y2    y3
    #> * <int> <chr> <chr> <chr>
    #> 1     1     a     b  <NA>
    #> 2     2     d     e     f
  2. spread() gains a sep argument. Setting this will name the new columns “key<sep>value”. This is useful when you’re spreading based on a numeric column:
    df <- data_frame(
      x = c(1, 2, 1), 
      key = c(1, 1, 2), 
      val = c("a", "b", "c")
    )
    df %>% spread(key, val)
    #> Source: local data frame [2 x 3]
    #> 
    #>       x     1     2
    #> * <dbl> <chr> <chr>
    #> 1     1     a     c
    #> 2     2     b  <NA>
    df %>% spread(key, val, sep = "_")
    #> Source: local data frame [2 x 3]
    #> 
    #>       x key_1 key_2
    #> * <dbl> <chr> <chr>
    #> 1     1     a     c
    #> 2     2     b  <NA>
  3. unnest() gains a .sep argument. This is useful if you have multiple columns of data frames that have the same variable names:
    df <- data_frame(
      x = 1:2,
      y1 = list(
        data_frame(y = 1),
        data_frame(y = 2)
      ),
      y2 = list(
        data_frame(y = "a"),
        data_frame(y = "b")
      )
    )
    df %>% unnest()
    #> Source: local data frame [2 x 3]
    #> 
    #>       x     y     y
    #>   <int> <dbl> <chr>
    #> 1     1     1     a
    #> 2     2     2     b
    df %>% unnest(.sep = "_")
    #> Source: local data frame [2 x 3]
    #> 
    #>       x  y1_y  y2_y
    #>   <int> <dbl> <chr>
    #> 1     1     1     a
    #> 2     2     2     b

    It also gains a .id argument that makes the names of the list explicit:

    df <- data_frame(
      x = 1:2,
      y = list(
        a = 1:3,
        b = 3:1
      )
    )
    df %>% unnest()
    #> Source: local data frame [6 x 2]
    #> 
    #>       x     y
    #>   <int> <int>
    #> 1     1     1
    #> 2     1     2
    #> 3     1     3
    #> 4     2     3
    #> 5     2     2
    #> 6     2     1
    df %>% unnest(.id = "id")
    #> Source: local data frame [6 x 3]
    #> 
    #>       x     y    id
    #>   <int> <int> <chr>
    #> 1     1     1     a
    #> 2     1     2     a
    #> 3     1     3     a
    #> 4     2     3     b
    #> 5     2     2     b
    #> 6     2     1     b

tidyr 0.5.0 also includes a bumper crop of bug fixes, including fixes for spread() and gather() in the presence of list-columns. Please see the release notes for a complete list of changes.

“How can I make my code faster?” If you write R code, then you’ve probably asked yourself this question. A profiler is an important tool for doing this: it records how the computer spends its time, and once you know that, you can focus on the slow parts to make them faster.

The preview releases of RStudio now have integrated support for profiling R code and for visualizing profiling data. R itself has long had a built-in profiler, and now it’s easier than ever to use the profiler and interpret the results.

To profile code with RStudio, select it in the editor, and then click on Profile -> Profile Selected Line(s). R will run that code with the profiler turned on, and then open up an interactive visualization.

In the visualization, there are two main parts: on top, there is the code with information about the amount of time spent executing each line, and on the bottom there is a flame graph, which shows what R was doing over time. In the flame graph, the horizontal direction represents time, moving from left to right, and the vertical direction represents the call stack, the set of functions that are currently being called. (Each time a function calls another function, it goes on top of the stack, and when a function exits, it is removed from the stack.)

[Screenshot: profile visualization with code and flame graph]

The Data tab contains a call tree, showing which function calls are most expensive:

[Screenshot: profiling data pane]

Armed with this information, you’ll know what parts of your code to focus on to speed things up!

The interactive profile visualizations are created with the profvis package, which can be used separately from the RStudio IDE. If you use profvis outside of RStudio, the visualizations will open in a web browser.
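For example, a minimal stand-alone use looks like this (an illustrative sketch; any slow code of your own can go inside the braces):

library(profvis)

# Profile an expression; the result opens as an interactive visualization
profvis({
  df <- data.frame(x = rnorm(5e4), y = rnorm(5e4))
  fit <- lm(y ~ x, data = df)
  plot(df$x, df$y)
  abline(fit, col = "red")
})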

To learn more about interpreting profiling data, check out the profvis website, which has interactive demos. You can also find out more about profiling with RStudio there.

Today we’re excited to announce flexdashboard, a new package that enables you to easily create flexible, attractive, interactive dashboards with R. Authoring and customization of dashboards is done using R Markdown and you can optionally include Shiny components for additional interactivity.

[Screenshot: neighborhood diversity flexdashboard]

Highlights of the flexdashboard package include:

  • Support for a wide variety of components including interactive htmlwidgets; base, lattice, and grid graphics; tabular data; gauges; and value boxes.
  • Flexible, easy-to-specify row- and column-based layouts. Components are intelligently resized to fill the browser and adapted for display on mobile devices.
  • Extensive support for text annotations to include assumptions, contextual narrative, and analysis within dashboards.
  • Storyboard layouts for presenting sequences of visualizations and related commentary.
  • By default dashboards are standard HTML documents that can be deployed on any web server or even attached to an email message. You can optionally add Shiny components for additional interactivity and then deploy on Shiny Server or shinyapps.io.

Getting Started

The flexdashboard package is available on CRAN; you can install it as follows:

install.packages("flexdashboard", type = "source")

To author a flexdashboard you create an R Markdown document with the flexdashboard::flex_dashboard output format. You can do this from within RStudio using the New R Markdown dialog.

Dashboards are simple R Markdown documents where each level 3 header (###) defines a section of the dashboard. For example, here’s a simple dashboard layout with 3 charts arranged top to bottom:

---
title: "My Dashboard"
output: flexdashboard::flex_dashboard
---

### Chart 1
 
```{r}

```
 
### Chart 2

```{r}

```

### Chart 3

```{r}

```

You can use level 2 headers (-----------) to introduce rows and columns into your dashboard, and section attributes to control their relative size:

---
title: "My Dashboard"
output: flexdashboard::flex_dashboard
---

Column {data-width=600}
-------------------------------------
 
### Chart 1
 
```{r}

```
 
Column {data-width=400}
-------------------------------------
 
### Chart 2

```{r}

``` 
 
### Chart 3
 
```{r}

```

Learning More

The flexdashboard website includes extensive documentation on building your own dashboards, including:

  • A user guide for all of the features and options of flexdashboard, including layout orientations (row vs. column based), chart sizing, the various supported components, theming, and creating dashboards with multiple pages.
  • Details on using Shiny to create dashboards that enable viewers to change underlying parameters and see the results immediately, or that update themselves incrementally as their underlying data changes (see the minimal YAML sketch after this list).
  • A variety of sample layouts which you can use as a starting point for your own dashboards.
  • Many examples of flexdashboard in action (including links to source code if you want to dig into how each example was created).
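As a minimal sketch, the key step in making a dashboard interactive with Shiny is adding runtime: shiny to the YAML front matter (the rest of the document stays the same):

---
title: "My Dashboard"
output: flexdashboard::flex_dashboard
runtime: shiny
---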

The examples below illustrate the use of flexdashboard with various packages and layouts:

  • d3heatmap: NBA scoring
  • ggplotly: ggplot2 geoms
  • Shiny: biclust example
  • dygraphs: linked time series
  • highcharter: sales report
  • Storyboard: htmlwidgets showcase
  • rbokeh: iris dataset
  • Shiny: diamonds explorer

Try It Out

The flexdashboard package provides a simple yet powerful framework for creating dashboards from R. If you know R Markdown you already know enough to begin creating dashboards right now! We hope you’ll try it out and let us know how it’s working and what else we can do to make it better.


We are happy to announce a new series of tutorials that will take your Shiny apps to the next level. In the tutorials, Herman Sontrop and Erwin Schuijtvlot of FRISS will teach you how to create custom JavaScript widgets and embed them into your Shiny apps.

The JavaScript language is a powerful tool when combined with Shiny.  You can use JavaScript code to create highly sophisticated actions, and the code can be run by your user’s web browser. Best of all, JavaScript comes with a host of amazing visualization libraries that are ready to use out of the box, like c3.js, d3.js, intro.js and more.
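To give a flavour of the combination (a minimal sketch of our own, not taken from the tutorials), a Shiny app can receive values computed in the browser via Shiny.onInputChange():

library(shiny)

ui <- fluidPage(
  tags$button(id = "js_btn", "Click me"),
  # JavaScript running in the browser reports the click count
  # back to the server as input$js_clicks
  tags$script(HTML("
    var clicks = 0;
    document.getElementById('js_btn').onclick = function() {
      clicks += 1;
      Shiny.onInputChange('js_clicks', clicks);
    };
  ")),
  textOutput("count")
)

server <- function(input, output, session) {
  output$count <- renderText({
    clicks <- if (is.null(input$js_clicks)) 0 else input$js_clicks
    paste("JavaScript reported", clicks, "clicks")
  })
}

shinyApp(ui, server)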

The first tutorial is ready now, and we will publish each new lesson at the Shiny Development Center as it becomes available.

About FRISS


FRISS (friss.eu) is a software company with a 100% focus on fraud, risk & compliance for insurers worldwide. Shiny is an important component of the analytics framework employed by FRISS for its clients. In these tutorials, FRISS shares its expertise in developing Shiny apps with JavaScript.

This past January, we held the first ever Shiny Developer Conference. It was a chance to gather together a group of intermediate to advanced Shiny users, and take their skills to the next level.

It was an exciting event for me in particular, as I’ve been dying to share some of these intermediate and advanced Shiny concepts for years now. There are many concepts that aren’t strictly required to be productive with Shiny, but make a huge difference in helping you write efficient, robust, maintainable apps—and also make Shiny app authoring a lot more satisfying.

The feedback we received from conference attendees was overwhelmingly positive: everyone from relative novices to the most advanced users told us they gained new insights into how to improve their Shiny apps, or had a perspective shift on concepts they thought they already understood. The user-contributed lightning talks were also a big hit, helping people see what’s possible using Shiny and inspiring them to push their own apps further.

If you weren’t able to attend but are still interested in building your Shiny skills, we’re happy to announce the availability of videos of the tutorials and talks:

Shiny Developer Conference 2016 Videos

At the moment, these videos are our best sources of info on the topics of reactive programming, Shiny gadgets, Shiny modules, debugging Shiny apps, and performance. If you’re at all serious about writing Shiny apps, we highly recommend you take the time to watch!

If you’re interested in attending next year’s conference, you can sign up for our email list using the subscription form at the top of the Shiny Developer Conference 2016 Videos page, and we’ll let you know when more details are available.

On May 19 and 20, 2016, Hadley Wickham will teach his two-day Master R Developer Workshop in the centrally located European city of Amsterdam.

This is the first time we’ve offered Hadley’s workshop in Europe, and it’s a rare chance to learn from Hadley in person. Only three public Master R Developer Workshop classes are offered per year, and no further European classes are planned for 2016 or 2017.

If you don’t want to miss this opportunity, register now to secure your seat!

For the convenience of those who may travel to the workshop, it will be held at the Hotel NH Amsterdam Schiphol Airport.

We look forward to seeing you soon!

testthat 1.0.0 is now available on CRAN. Testthat makes it easy to turn your existing informal tests into formal automated tests that you can rerun quickly and easily. Learn more at http://r-pkgs.had.co.nz/tests.html. Install the latest version with:

install.packages("testthat")

This version of testthat saw a major behind-the-scenes overhaul. This is the reason for the 1.0.0 release, and it will make it easier to add new expectations and reporters in the future. As well as the internal changes, there are improvements in four main areas:

  • New expectations.
  • Support for the pipe.
  • More consistent tests for side-effects.
  • Support for testing C++ code.

These are described in detail below. For a complete set of changes, please see the release notes.

Improved expectations

There are five new expectations:

  • expect_type() checks the base type of an object (with typeof()), expect_s3_class() tests that an object is S3 with a given class, and expect_s4_class() tests that an object is S4 with a given class. I recommend using these more specific expectations instead of the generic expect_is(), because they more clearly convey intent.
  • expect_length() checks that an object has the expected length.
  • expect_output_file() compares the output of a function with a text file, optionally updating the file. This is useful for regression tests of print() methods.
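For instance, a quick illustration of the new expectations (each of these passes silently):

expect_type(1L, "integer")
expect_s3_class(factor("a"), "factor")
expect_length(1:3, 3)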

A number of older expectations have been deprecated:

  • expect_more_than() and expect_less_than() have been deprecated. Please use expect_gt() and expect_lt() instead.
  • takes_less_than() has been deprecated.
  • not() has been deprecated. Please use the explicit individual forms expect_error(..., NA), expect_warning(..., NA), etc.

We also did a thorough review of the documentation, ensuring that related expectations are documented together.

Piping

Most expectations now invisibly return the input object. This makes it possible to chain together expectations with magrittr:

factor("a") %>% 
  expect_type("integer") %>% 
  expect_s3_class("factor") %>% 
  expect_length(1)

To make this style even easier, testthat now imports and re-exports the pipe so you don’t need to explicitly attach magrittr.

Side-effects

Expectations that test for side-effects (i.e. expect_message(), expect_warning(), expect_error(), and expect_output()) are now more consistent:

  • expect_message(f(), NA) will fail if a message is produced (i.e. it’s not missing), and similarly for expect_output(), expect_warning(), and expect_error().
    quiet <- function() {}
    noisy <- function() message("Hi!")
    
    expect_message(quiet(), NA)
    expect_message(noisy(), NA)
    #> Error: noisy() showed 1 message. 
    #> * Hi!
  • expect_message(f(), NULL) will fail if a message isn’t produced, and similarly for expect_output(), expect_warning(), and expect_error().
    expect_message(quiet(), NULL)
    #> Error: quiet() showed 0 messages
    expect_message(noisy(), NULL)

There were three other changes made in the interest of consistency:

  • Previously testing for one side-effect (e.g. messages) tended to muffle other side effects (e.g. warnings). This is no longer the case.
  • Warnings that are not captured explicitly by expect_warning() are tracked and reported. These do not currently cause a test suite to fail, but may do in the future.
  • If you want to test a print method, expect_output() now requires you to explicitly print the object: expect_output("a", "a") will fail, expect_output(print("a"), "a") will succeed. This makes it more consistent with the other side-effect functions.

C++

Thanks to the work of Kevin Ushey, testthat now includes a simple interface to unit test C++ code using the Catch library. Using Catch in your packages is easy – just call testthat::use_catch() and the necessary infrastructure, alongside a few sample test files, will be generated for your package. By convention, you can place your unit tests in src/test-<name>.cpp. Here’s a simple example of a test file you might write when using testthat + Catch:

#include <testthat.h>
context("Addition") {
  test_that("two plus two equals four") {
    int result = 2 + 2;
    expect_true(result == 4);
  }
}

These unit tests will be compiled and run during calls to devtools::test(), as well as R CMD check. See ?use_catch for a full list of functions supported by testthat, and for more details.

For now, Catch unit tests will only be compiled when using the gcc and clang compilers – this implies that the unit tests you write will not be compiled + run on Solaris, which should make it easier to submit packages that use testthat for C++ unit tests to CRAN.
