We have released the R package bookdown (v0.3) to CRAN. It may be old news to some users, but we are happy to make an official announcement today. To install the package from CRAN, run:

install.packages("bookdown")

The bookdown package provides an easier way to write books and technical publications than traditional tools such as LaTeX and Word. It inherits the simplicity of syntax and flexibility for data analysis from R Markdown, and extends R Markdown for technical writing, so that you can make better use of document elements such as figures, tables, equations, theorems, citations, and references. Similar to LaTeX, you can number and cross-reference these elements with bookdown.
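As a rough illustration of numbering and cross-referencing, the sketch below writes a one-chapter book source to a temporary directory. The file contents, the chunk label `cars-plot`, and the choice of the gitbook output format are illustrative assumptions; `\@ref(...)` is bookdown's cross-reference notation.

```r
# Write a minimal, hypothetical bookdown source file to a temporary directory.
book_dir <- file.path(tempdir(), "bookdown-demo")
dir.create(book_dir, showWarnings = FALSE)
index_rmd <- file.path(book_dir, "index.Rmd")

fence <- strrep("`", 3)  # code-chunk delimiter used by R Markdown

writeLines(c(
  "---",
  "title: \"A Minimal Book\"",
  "site: bookdown::bookdown_site",
  "---",
  "",
  "# Introduction {#intro}",
  "",
  "See Figure \\@ref(fig:cars-plot) in Chapter \\@ref(intro).",
  "",
  paste0(fence, "{r cars-plot, fig.cap=\"Speed versus stopping distance.\"}"),
  "plot(cars)",
  fence
), index_rmd)

# Rendering needs bookdown and pandoc, so it is guarded here.
if (interactive() && requireNamespace("bookdown", quietly = TRUE)) {
  old_wd <- setwd(book_dir)
  bookdown::render_book("index.Rmd", "bookdown::gitbook")
  setwd(old_wd)
}
```

The figure is numbered from its chunk label, and the chapter from the `{#intro}` identifier on its heading.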

Want to Master R? There’s no better time or place than Hadley Wickham’s workshop on December 12th and 13th at the Cliftons in Melbourne, VIC, Australia.

Register here: https://www.eventbrite.com/e/master-r-developer-workshop-melbourne-tickets-22546200292   (Note: Prices are in $US and VAT is not collected)

Discounts are still available for academics (students or faculty) and for 5 or more attendees from any organization. Email training@rstudio.com if you have any questions about the workshop that you don’t find answered on the registration page.

Hadley has no Master R Workshops planned in the region for 2017 and his next one with availability won’t be until September in San Francisco. If you’ve always wanted to take Master R but haven’t found the time, Melbourne, the second most fun city in the world, is the place to go!

P.S. We’ve arranged a “happy hour” reception after class on Monday the 12th. Be sure to set aside an hour or so after the first day to talk to your classmates and Hadley about what’s happening in R.

I’m very pleased to announce ggplot2 2.2.0. It includes four major new features:

  • Subtitles and captions.
  • A large rewrite of the facetting system.
  • Improved theme options.
  • Better stacking.

It also includes numerous bug fixes and minor improvements, as described in the release notes.

The majority of this work was carried out by Thomas Pedersen, who I was lucky to have as my “ggplot2 intern” this summer. Make sure to check out his other visualisation packages: ggraph, ggforce, and tweenr.

Install ggplot2 with:

install.packages("ggplot2")

Subtitles and captions

Thanks to Bob Rudis, you can now add subtitles and captions to your plots:

ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE, method = "loess") +
  labs(
    title = "Fuel efficiency generally decreases with engine size",
    subtitle = "Two seaters (sports cars) are an exception because of their light weight",
    caption = "Data from fueleconomy.gov"
  )

 

These are controlled by the theme settings plot.subtitle and plot.caption.

The plot title is now aligned to the left by default. To return to the previous centered alignment, use theme(plot.title = element_text(hjust = 0.5)).
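A small sketch of these settings in use. The styling values (italic subtitle, left-aligned caption, font size) are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  labs(
    title = "Fuel efficiency generally decreases with engine size",
    subtitle = "Two seaters (sports cars) are an exception",
    caption = "Data from fueleconomy.gov"
  ) +
  theme(
    # Restore the old centered title
    plot.title = element_text(hjust = 0.5),
    # Style the new subtitle and caption elements
    plot.subtitle = element_text(face = "italic", colour = "grey30"),
    plot.caption = element_text(hjust = 0, size = 8)
  )

p
```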

Facets

The facet and layout implementation has been moved to ggproto and received a large rewrite and refactoring. This will allow others to create their own facetting systems, as described in vignette("extending-ggplot2"). Along with the rewrite, a number of features and improvements have been added, most notably:

  • You can now use functions in facetting formulas, thanks to Dan Ruderman.
    ggplot(diamonds, aes(carat, price)) + 
      geom_hex(bins = 20) + 
      facet_wrap(~cut_number(depth, 6))

  • Axes are now drawn under the panels in facet_wrap() when the rectangle is not completely filled.
    ggplot(mpg, aes(displ, hwy)) + 
      geom_point() + 
      facet_wrap(~class)

  • You can set the position of the axes with the position argument.
    ggplot(mpg, aes(displ, hwy)) + 
      geom_point() + 
      scale_x_continuous(position = "top") + 
      scale_y_continuous(position = "right")

  • You can display a secondary axis that is a one-to-one transformation of the primary axis with sec.axis.
    ggplot(mpg, aes(displ, hwy)) + 
      geom_point() + 
      scale_y_continuous(
        "mpg (US)", 
        sec.axis = sec_axis(~ . * 1.20, name = "mpg (UK)")
      )

     

  • Strips can be placed on any side, and the placement with respect to axes can be controlled with the strip.placement theme option.
    ggplot(mpg, aes(displ, hwy)) + 
      geom_point() + 
      facet_wrap(~ drv, strip.position = "bottom") + 
      theme(
        strip.placement = "outside",
        strip.background = element_blank(),
        strip.text = element_text(face = "bold")
      ) +
      xlab(NULL)

Theming

  • The theme() function now has named arguments so autocomplete and documentation suggestions are vastly improved.
  • Blank elements can now be overridden again so you get the expected behavior when setting e.g. axis.line.x.
  • element_line() gets an arrow argument that lets you put arrows on axes.
    arrow <- arrow(length = unit(0.4, "cm"), type = "closed")
    
    ggplot(mpg, aes(displ, hwy)) + 
      geom_point() + 
      theme_minimal() + 
      theme(
        axis.line = element_line(arrow = arrow)
      )

  • Control of legend styling has been improved. The whole legend area can be aligned with the plot area and a box can be drawn around all legends:
    ggplot(mpg, aes(displ, hwy, shape = drv, colour = fl)) + 
      geom_point() + 
      theme(
        legend.justification = "top", 
        legend.box = "horizontal",
        legend.box.margin = margin(3, 3, 3, 3, "mm"), 
        legend.margin = margin(),
        legend.box.background = element_rect(colour = "grey50")
      )

  • panel.margin and legend.margin have been renamed to panel.spacing and legend.spacing respectively, as this better indicates their roles. A new legend.margin actually controls the margin around each legend.
  • When computing the height of titles, ggplot2 now includes the height of the descenders (i.e. the bits of g and y that hang underneath). This improves the margins around titles, particularly the y axis label. I have also very slightly increased the inner margins of axis titles, and removed the outer margins.
  • The default themes have been tweaked by Jean-Olivier Irisson, making them better match theme_grey().
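To make the renaming concrete, here is a sketch that uses the new names side by side. The specific sizes are arbitrary, illustrative values.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  facet_wrap(~class) +
  theme(
    # Formerly panel.margin: the gap between facet panels
    panel.spacing = unit(1, "lines"),
    # Formerly legend.margin: the gap between separate legends
    legend.spacing = unit(0.5, "cm"),
    # The new legend.margin: the margin around each individual legend
    legend.margin = margin(4, 4, 4, 4, "pt")
  )

p
```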

Stacking bars

position_stack() and position_fill() now stack values in the reverse order of the grouping, which makes the default stack order match the legend.

library(dplyr)

avg_price <- diamonds %>% 
  group_by(cut, color) %>% 
  summarise(price = mean(price)) %>% 
  ungroup() %>% 
  mutate(price_rel = price - mean(price))

ggplot(avg_price) + 
  geom_col(aes(x = cut, y = price, fill = color))

(Note also the new geom_col() which is short-hand for geom_bar(stat = "identity"), contributed by Bob Rudis.)
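The equivalence can be checked on a toy data frame. The `sales` data below is made up for illustration.

```r
library(ggplot2)

# Hypothetical pre-summarised data: one row per bar
sales <- data.frame(
  region = c("north", "south", "east"),
  units = c(12, 7, 9)
)

p_col <- ggplot(sales, aes(region, units)) + geom_col()
p_bar <- ggplot(sales, aes(region, units)) + geom_bar(stat = "identity")

# Both layers should compute the same bar heights
heights_col <- layer_data(p_col)$y
heights_bar <- layer_data(p_bar)$y
isTRUE(all.equal(heights_col, heights_bar))
```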

If you want to stack in the opposite order, try forcats::fct_rev():

ggplot(avg_price) + 
  geom_col(aes(x = cut, y = price, fill = forcats::fct_rev(color)))

Additionally, you can now stack negative values:

ggplot(avg_price) + 
  geom_col(aes(x = cut, y = price_rel, fill = color))

The overall ordering cannot necessarily be matched in the presence of negative values, but the ordering on either side of the x-axis will match.

Labels can also be stacked, but the default position is suboptimal:

series <- data.frame(
  time = c(rep(1, 4),rep(2, 4), rep(3, 4), rep(4, 4)),
  type = rep(c('a', 'b', 'c', 'd'), 4),
  value = rpois(16, 10)
)

ggplot(series, aes(time, value, group = type)) +
  geom_area(aes(fill = type)) +
  geom_text(aes(label = type), position = "stack")

You can improve the position with the vjust parameter. A vjust of 0.5 will center the labels inside the corresponding area:

ggplot(series, aes(time, value, group = type)) +
  geom_area(aes(fill = type)) +
  geom_text(aes(label = type), position = position_stack(vjust = 0.5))


Today we are pleased to release a new version of svglite. This release fixes many bugs, includes new documentation vignettes, and improves font support.

You can install svglite with:

install.packages("svglite")

Font handling

Fonts are tricky with SVG because they are needed at two stages:

  • When creating the SVG file, the fonts are needed in order to correctly measure the amount of space each character occupies. This is particularly important for plots that use plotmath.
  • When drawing the SVG file on screen, the fonts are needed to draw each character correctly.

For the best display, that means you need to have the same fonts installed on both the computer that generates the SVG file and the computer that draws it. By default, svglite uses fonts that are installed on pretty much every computer. svglite’s font support is now much more flexible thanks to two new arguments: system_fonts and user_fonts.

  1. system_fonts allows you to specify the name of a font installed on your computer. This is useful, for example, if you’d like to use a font with better CJK support:
    svglite("Rplots.svg", system_fonts = list(sans = "Arial Unicode MS"))
    plot.new()
    text(0.5, 0.5, "正規分布")
    dev.off()
  2. user_fonts allows you to specify a font installed in an R package (like fontquiver). This is needed if you want to generate identical plots across different operating systems, and it is used in the upcoming vdiffr package, which provides graphical unit tests.

For more details, see vignette("fonts").

Text scaling

This update also fixes many bugs. The most important is that text is now properly scaled within the plot, and we provide a vignette that describes the details: vignette("scaling"). It documents, for instance, how to include an svglite graphic in a web page with the figure text consistently scaled with the surrounding text.

Find a full list of changes in the release notes.

On October 12, RStudio launched R Views with great enthusiasm. R Views is a new blog for R users about the R Community and the R Language. Under the care of editor-in-chief and new RStudio ambassador-at-large, Joseph Rickert, R Views provides a new perspective on R and RStudio that we like to think will become essential reading for you.

You may have read an R Views post already. In the first, widely syndicated, post, Joseph interviewed J.J. Allaire, RStudio’s founder, CEO and most prolific software developer. Later posts by Mine Cetinkaya-Rundel on Highcharts and thoughtful book reviews, new R package picks, and a primer on Naive Bayes from Joseph rounded out the first month. Each post was entirely different from anything you could have read here, on what we now call our Developer Blog at rstudio.org.

Fortunately, you don’t have to choose. Each has its purpose. Our Developer Blog is the place to go for RStudio news. You’ll find product announcements, events, and company happenings – like the announcement of a new blog – right here. R Views is about R in action. You’ll find stories and solutions and opinions that we hope will educate and challenge you.

Subscribe to each and stay up to date on all things R and RStudio!

Thanks for making R and RStudio part of your data science experience and for supporting our work.

Shiny Server 1.5.1.834 and Shiny Server Pro 1.5.1.760 are now available.

The Shiny Server 1.5.x release family upgrades our underlying Node.js engine from 0.10.47 to 6.9.1. The impetus for this change was not stability or performance, but the fact that the 0.10.x release family has reached the end of its life.

We highly recommend that you test on a staging server before upgrading production Shiny Server 1.4.x machines to 1.5. You should always do this for any production-critical software, but it’s particularly important for this release, due to the magnitude of changes to Node.js that we’ve absorbed in one big gulp. (We’ve done thorough end-to-end testing of this release, but there’s no substitute for testing with your own apps, on your own servers.)

Some small bug fixes are also included in this release. See the release notes for more details.

The beginning of the end for Ubuntu 12.04 and Red Hat 5

While we still support Ubuntu 12.04 and Red Hat 5 today, we’ll be moving on from these very old releases in a few months. Both of these distributions will reach end-of-life in April 2017, and will stop receiving bug fixes and security fixes from their vendors at that time. If you’re using Shiny Server with one of these platforms, we recommend that you start planning your upgrade.

Today we’re very pleased to announce the availability of RStudio Version 1.0! Version 1.0 is our 10th major release since the initial launch in February 2011 (see the full release history below), and our biggest ever! Highlights include:

  • Authoring tools for R Notebooks.
  • Integrated support for the sparklyr package (R interface to Spark).
  • Performance profiling via integration with the profvis package.
  • Enhanced data import tools based on the readr, readxl and haven packages.
  • Authoring tools for R Markdown websites and the bookdown package.
  • Many other miscellaneous enhancements and bug fixes.

We hope you download version 1.0 now and as always let us know what you think.

R Notebooks

R Notebooks add a powerful notebook authoring engine to R Markdown. Notebook interfaces for data analysis have compelling advantages including the close association of code and output and the ability to intersperse narrative with computation. Notebooks are also an excellent tool for teaching and a convenient way to share analyses.

Interactive R Markdown

As an authoring format, R Markdown bears many similarities to traditional notebooks like Jupyter and Beaker. However, code in notebooks is typically executed interactively, one cell at a time, whereas code in R Markdown documents is typically executed in batch.

R Notebooks bring the interactive model of execution to your R Markdown documents, giving you the capability to work quickly and iteratively in a notebook interface without leaving behind the plain-text tools, compatibility with version control, and production-quality output you’ve come to rely on from R Markdown.

Iterate Quickly

In a typical R Markdown document, you must re-knit the document to see your changes, which can take some time if it contains non-trivial computations. R Notebooks, however, let you run code and see the results in the document immediately. They can include just about any kind of content R produces, including console output, plots, data frames, and interactive HTML widgets.

You can see the progress of the code as it runs, and you can preview the results of individual inline expressions, too. Even your LaTeX equations render in real time as you type.

This focused mode of interaction doesn’t require you to keep the console, viewer, or output panes open. Everything you need is at your fingertips in the editor, reducing distractions and helping you concentrate on your analysis. When you’re done, you’ll have a formatted, reproducible record of what you’ve accomplished, with plenty of context, perfect for your own records or sharing with others.

Spark with sparklyr

The sparklyr package is a new R interface for Apache Spark. RStudio now includes integrated support for Spark and the sparklyr package, including tools for:

  • Creating and managing Spark connections
  • Browsing the tables and columns of Spark DataFrames
  • Previewing the first 1,000 rows of Spark DataFrames
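Outside the IDE pane, the same steps can be sketched in code. The helper name `preview_spark_table()` is hypothetical, and the sketch assumes that sparklyr and dplyr are installed and that a local Spark installation is available (for example via `sparklyr::spark_install()`).

```r
# Hypothetical helper: connect to a local Spark instance, copy a data frame
# into it, list the available tables, and return the first rows of the copy.
preview_spark_table <- function(df, name, n = 10) {
  if (!requireNamespace("sparklyr", quietly = TRUE) ||
      !requireNamespace("dplyr", quietly = TRUE)) {
    stop("This sketch needs the sparklyr and dplyr packages.")
  }

  sc <- sparklyr::spark_connect(master = "local")
  on.exit(sparklyr::spark_disconnect(sc), add = TRUE)

  # Copy the local data frame into Spark as a table called `name`
  remote_tbl <- dplyr::copy_to(sc, df, name, overwrite = TRUE)

  # List the tables visible through this connection
  print(dplyr::src_tbls(sc))

  # Bring the first n rows back into R
  dplyr::collect(head(remote_tbl, n))
}

# Usage (requires a working local Spark installation):
# preview_spark_table(mtcars, "mtcars")
```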

Once you’ve installed the sparklyr package, you should find a new Spark pane within the IDE. This pane includes a New Connection dialog which can be used to make connections to local or remote Spark instances.

Once you’ve connected to Spark you’ll be able to browse the tables contained within the Spark cluster.

The Spark DataFrame preview uses the standard RStudio data viewer.

Profiling with profvis

“How can I make my code faster?”

If you write R code, then you’ve probably asked yourself this question. A profiler is an important tool for doing this: it records how the computer spends its time, and once you know that, you can focus on the slow parts to make them faster.

RStudio now includes integrated support for profiling R code and for visualizing profiling data. R itself has long had a built-in profiler, and now it’s easier than ever to use the profiler and interpret the results.

To profile code with RStudio, select it in the editor, and then click on Profile -> Profile Selected Line(s). R will run that code with the profiler turned on, and then open up an interactive visualization.

In the visualization, there are two main parts: on top, there is the code with information about the amount of time spent executing each line, and on the bottom there is a flame graph, which shows what R was doing over time. In the flame graph, the horizontal direction represents time, moving from left to right, and the vertical direction represents the call stack, which is the set of functions that are currently being called. (Each time a function calls another function, it goes on top of the stack, and when a function exits, it is removed from the stack.)

The Data tab contains a call tree, showing which function calls are most expensive.

Armed with this information, you’ll know what parts of your code to focus on to speed things up!
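The same kind of data can be collected without the IDE by driving R's built-in profiler directly. The sketch below uses base R's `Rprof()` and `summaryRprof()`; the workload function `slow_summary()` is a made-up example, and the sampling interval is an arbitrary choice.

```r
# Hypothetical workload: repeatedly generate, sort, and summarise random data
slow_summary <- function(reps = 200) {
  out <- numeric(reps)
  for (i in seq_len(reps)) {
    x <- rnorm(1e5)
    out[i] <- median(sort(x))
  }
  mean(out)
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)  # start sampling the call stack
result <- slow_summary()
Rprof(NULL)                        # stop profiling

# Summarise where the time went, by function
prof <- summaryRprof(prof_file)
head(prof$by.self)

# With the profvis package installed, an interactive view of the same code
# could be opened with: profvis::profvis(slow_summary())
```

`by.self` ranks functions by the time spent in their own code, which is usually the first place to look for a bottleneck.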

Data Import

RStudio now integrates with the readr, readxl, and haven packages to provide comprehensive tools for importing data from many text file formats, Excel worksheets, as well as SAS, Stata, and SPSS data files. The tools are focused on interactively refining an import then providing the code required to reproduce the import on new datasets.

For example, here’s the workflow we would use to import the Excel worksheet at http://www.fns.usda.gov/sites/default/files/pd/slsummar.xls.

First provide the dataset URL and review the import in preview mode (notice that this file contains two tables and as a result requires the first few rows to be removed).

We can clean this up by skipping 6 rows from this file and unchecking the “First Row as Names” checkbox.

The file is looking better but some columns are being displayed as strings when they are clearly numerical data. We can fix this by selecting “numeric” from the column drop-down.

The final step is to click “Import” to run the code displayed under “Code Preview” and import the data into R. The code is executed within the console and the imported dataset is displayed automatically.

Note that rather than executing the import we could have just copied and pasted the import code and included it within any R script.
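A script version of this import might look roughly like the sketch below. It is not the exact code the dialog generates: the helper name `import_lunch_summary()` is hypothetical, and the per-column "numeric" overrides are left out because they depend on the column layout shown in the preview.

```r
# Hypothetical helper mirroring the dialog settings described above:
# skip the first 6 rows and do not treat the first row as column names.
import_lunch_summary <- function(path, skip = 6) {
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("This sketch needs the readxl package.")
  }
  readxl::read_excel(path, col_names = FALSE, skip = skip)
}

# Usage (needs network access; read_excel() reads local files, so the
# worksheet is downloaded first):
# url <- "http://www.fns.usda.gov/sites/default/files/pd/slsummar.xls"
# local_copy <- tempfile(fileext = ".xls")
# download.file(url, local_copy, mode = "wb")
# slsummar <- import_lunch_summary(local_copy)
```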

RStudio Release History

We started working on RStudio in November of 2008 (8 years ago!) and had our first public release in February of 2011. Here are highlights of the various releases through the years:

Version Date Highlights
0.92 Feb 2011
  • Initial public release
0.93 Apr 2011
  • Interactive plotting with manipulate
  • Source editor themes
  • Configurable workspace layout
0.94 Jun 2011
  • Enhanced plot export
  • Enhanced package installation and management
  • Enhanced history management
0.95 Jan 2012
  • RStudio project system
  • Code navigation (typeahead search, go to definition)
  • Version control integration (Git and Subversion)
0.96 May 2012
  • Enhanced authoring for Sweave
  • Web publishing with R Markdown
  • Code folding and many other editing enhancements
0.97 Oct 2012
  • Package development tools
  • Vim editing mode
  • More intelligent R auto-indentation
0.98 Dec 2013
  • Interactive debugging tools
  • Enhanced environment pane
  • Viewer pane for web content / htmlwidgets
0.98b Jun 2014
  • R Markdown v2 (publish to PDF, Word, and more)
  • Integrated tools for Shiny application development
  • Editor support for XML, SQL, Python, and Bash
0.99 May 2015
  • Data viewer with support for large datasets, filtering, searching, and sorting
  • Major enhancements to R and C/C++ code completion and inline code diagnostics
  • Multiple cursors, tab re-ordering, enhanced Vim mode
0.99b Feb 2016
  • Emacs editing mode
  • Multi-window source editing
  • Customizable keyboard shortcuts
  • RStudio Addins
1.0 Nov 2016
  • Authoring tools for R Notebooks
  • Integrated support for sparklyr (R interface to Spark)
  • Enhanced data import tools
  • Performance profiling via integration with profvis

The RStudio Release History page on our support website provides a complete history of all major and minor point releases.

 

It’s nearly summeRtime in Australia! Join RStudio Chief Data Scientist Hadley Wickham for his popular Master R workshop in Melbourne.

Register here:  https://www.eventbrite.com/e/master-r-developer-workshop-melbourne-tickets-22546200292

Melbourne will be Hadley’s first and only scheduled Master R workshop in Australia. Whether you live or work nearby or you just need one more good reason to visit Melbourne in the Southern Hemisphere spring, consider joining him at the Cliftons Melbourne on December 12th and 13th. It’s a rare opportunity to learn from one of the R community’s most popular and innovative authors and package developers.

Hadley’s workshops usually sell out. This is his final Master R in 2016 and he has no plans to offer another in the area in 2017. If you’re an active R user and have been meaning to take this class, now is the perfect time to do it!

We look forward to seeing you in Melbourne!

We are excited to announce that submissions for lightning talks at rstudio::conf are now open! Lightning talks are short (5-minute), high-energy presentations that give you the chance to talk about an interesting project that you’ve tackled with R. Short talks, or demos of your R code, R packages, and Shiny apps are great options. See some of the great lightning talks from the Shiny Developer Conference (scroll down to user talks).

Submit your lightning talk proposal here!

Submissions are due December 1, 2016. We’ll announce the accepted talks on December 15. (You must be a registered attendee of rstudio::conf to present a lightning talk.)

Shiny Server 1.4.7.815 and Shiny Server Pro 1.4.7.736 are now available! This release includes new features to support Shiny 0.14. It also updates our Node.js to 0.10.47, which includes important security fixes for SSL/TLS.

Connection robustness (a.k.a. grey-outs)

Shiny’s architecture is built on top of websockets, which are long-lived network connections between the browser and an R session on the server. If this connection is broken for any reason, the browser is no longer able to communicate with its R session on the server. Shiny indicates this to the user by turning the page background grey and fading out the page contents.

In Shiny 0.14 and Shiny Server 1.4.7, we’ve done work at both the server and package levels to minimize the number of grey-outs users will see. Simply by upgrading Shiny Server, transient (<15sec) network interruptions should no longer disrupt Shiny apps. And for many Shiny apps, a secondary, opt-in reconnection mechanism should all but eliminate grey-outs. This article on shiny.rstudio.com has all the details.
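As a sketch of what opting in can look like, the app below calls `session$allowReconnect(TRUE)` in its server function. The post itself does not name the call, so treat it as an assumption to check against the linked article; the app's inputs and outputs are made up for illustration.

```r
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("scatter")
)

server <- function(input, output, session) {
  # Opt in: let the browser try to reconnect after a dropped connection
  session$allowReconnect(TRUE)

  output$scatter <- renderPlot({
    plot(rnorm(input$n), rnorm(input$n))
  })
}

app <- shinyApp(ui, server)
# runApp(app)  # uncomment to launch the app
```

Reconnecting starts a fresh R session for the client, so this suits apps whose output depends only on the current input values.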

Bookmarkable state

Shiny 0.14 introduced a “bookmarkable state” feature that made it possible to snapshot the state of a running Shiny app, and send it to someone as a URL to try in their own browser. At the app author’s option, the app state could either be fully encoded in the URL, or written to disk and referred to by a short ID. This latter approach requires support from the server, and that support is now officially provided by Shiny Server and Shiny Server Pro 1.4.7. (This functionality is not yet available for ShinyApps.io, however.)
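For reference, a minimal bookmarkable app might look like the sketch below. The inputs are invented for illustration. It uses URL-encoded state; passing `enableBookmarking = "server"` instead selects the save-to-disk approach that needs server support.

```r
library(shiny)

# For bookmarking, the UI must be a function that takes the request
ui <- function(request) {
  fluidPage(
    textInput("name", "Name"),
    checkboxInput("caps", "Upper case"),
    verbatimTextOutput("greeting"),
    bookmarkButton()
  )
}

server <- function(input, output, session) {
  output$greeting <- renderText({
    txt <- paste("Hello,", input$name)
    if (input$caps) toupper(txt) else txt
  })
}

# "url" encodes the state in the URL itself
app <- shinyApp(ui, server, enableBookmarking = "url")
# runApp(app)  # uncomment to launch the app
```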

Coming soon: Shiny Server 1.5.0

Just a heads up: Shiny Server (Pro) 1.5.0 is coming in a few weeks. Shiny Server was originally written using Node.js 0.10, which is nearing the end of its lifespan. This release will move to Node.js 6.x.

Due to the complexity of this upgrade, Shiny Server 1.5.0 will not add any new features, except for supporting perfect forward secrecy for SSL/TLS connections. The focus will be entirely on ensuring a smooth and stable release.