We’re excited to announce a new release of Packrat, a tool for making R projects more isolated and reproducible by managing their package dependencies.

This release brings a number of exciting features to Packrat that significantly improve the user experience:

  • Automatic snapshots ensure that new packages installed in your project library are automatically tracked by Packrat.
  • Bundle and share your projects with packrat::bundle() and packrat::unbundle() — whether you want to freeze an analysis, or exchange it for collaboration with colleagues.
  • Packrat mode can now be turned on and off at will, allowing you to navigate between different Packrat projects in a single R session. Use packrat::on() to activate Packrat in the current directory, and packrat::off() to turn it off.
  • Local repositories (i.e., directories containing R package sources) can now be specified for projects, allowing local source packages to be used in a Packrat project alongside CRAN, Bioconductor and GitHub packages (see this and more with ?"packrat-options").
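
As a rough sketch of how bundling and unbundling might fit together, the helper below packs up a Packrat project and unpacks it somewhere else. The name share_project and its arguments are hypothetical and chosen only for illustration; the sketch assumes the project has already been initialised with Packrat.

```r
# Hypothetical helper (not part of Packrat): bundle a project, then unpack
# the bundle into another directory, e.g. to hand it to a colleague.
share_project <- function(project = ".", dest = tempdir()) {
  # bundle() writes a tarball of the project and returns the file path
  bundle_path <- packrat::bundle(project = project)
  # unbundle() unpacks the tarball into `dest`
  packrat::unbundle(bundle = bundle_path, where = dest)
  invisible(bundle_path)
}
```

Calling share_project("path/to/project") would leave a copy of the project under the destination directory.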

In addition, Packrat is now tightly integrated with the RStudio IDE, making it easier to manage project dependencies than ever. Download today’s RStudio IDE 0.98.978 release and try it out!

Packrat RStudio package pane integration

You can install the latest version of Packrat from GitHub with:

    devtools::install_github("rstudio/packrat")

Packrat will be coming to CRAN soon as well.

If you try it, we’d love to get your feedback. Leave a comment here or post in the packrat-discuss Google group.

 

tidyr is a new package that makes it easy to “tidy” your data. Tidy data is data that’s easy to work with: it’s easy to munge (with dplyr), visualise (with ggplot2 or ggvis) and model (with R’s hundreds of modelling packages). The two most important properties of tidy data are:

  • Each column is a variable.
  • Each row is an observation.

Arranging your data in this way makes it easier to work with because you have a consistent way of referring to variables (as column names) and observations (as row indices). When you use tidy data and tidy tools, you spend less time worrying about how to feed the output from one function into the input of another, and more time answering your questions about the data.

To tidy messy data, you first identify the variables in your dataset, then use the tools provided by tidyr to move them into columns. tidyr provides three main functions for tidying your messy data: gather(), separate() and spread().

gather() takes multiple columns, and gathers them into key-value pairs: it makes “wide” data longer. Other names for gather include melt (reshape2), pivot (spreadsheets) and fold (databases). Here’s an example of how you might use gather() on a made-up dataset. In this experiment we’ve given three people two different drugs and recorded their heart rate:

library(tidyr)
library(dplyr)

messy <- data.frame(
  name = c("Wilbur", "Petunia", "Gregory"),
  a = c(67, 80, 64),
  b = c(56, 90, 50)
)
messy
#>      name  a  b
#> 1  Wilbur 67 56
#> 2 Petunia 80 90
#> 3 Gregory 64 50

We have three variables (name, drug and heartrate), but only name is currently in a column. We use gather() to gather the a and b columns into key-value pairs of drug and heartrate:

messy %>%
  gather(drug, heartrate, a:b)
#>      name drug heartrate
#> 1  Wilbur    a        67
#> 2 Petunia    a        80
#> 3 Gregory    a        64
#> 4  Wilbur    b        56
#> 5 Petunia    b        90
#> 6 Gregory    b        50

Sometimes two variables are clumped together in one column. separate() allows you to tease them apart (extract() works similarly but uses regexp groups instead of a splitting pattern or position). Take this example from stackoverflow (modified slightly for brevity). We have some measurements of how much time people spend on their phones, measured at two locations (work and home), at two times. Each person has been randomly assigned to either treatment or control.

set.seed(10)
messy <- data.frame(
  id = 1:4,
  trt = sample(rep(c('control', 'treatment'), each = 2)),
  work.T1 = runif(4),
  home.T1 = runif(4),
  work.T2 = runif(4),
  home.T2 = runif(4)
)

To tidy this data, we first use gather() to turn columns work.T1, home.T1, work.T2 and home.T2 into a key-value pair of key and time. (Only the first eight rows are shown to save space.)

tidier <- messy %>%
  gather(key, time, -id, -trt)
tidier %>% head(8)
#>   id       trt     key    time
#> 1  1 treatment work.T1 0.08514
#> 2  2   control work.T1 0.22544
#> 3  3 treatment work.T1 0.27453
#> 4  4   control work.T1 0.27231
#> 5  1 treatment home.T1 0.61583
#> 6  2   control home.T1 0.42967
#> 7  3 treatment home.T1 0.65166
#> 8  4   control home.T1 0.56774

Next we use separate() to split the key into location and time, using a regular expression to describe the character that separates them.

tidy <- tidier %>%
  separate(key, into = c("location", "time"), sep = "\\.") 
tidy %>% head(8)
#>   id       trt location time    time
#> 1  1 treatment     work   T1 0.08514
#> 2  2   control     work   T1 0.22544
#> 3  3 treatment     work   T1 0.27453
#> 4  4   control     work   T1 0.27231
#> 5  1 treatment     home   T1 0.61583
#> 6  2   control     home   T1 0.42967
#> 7  3 treatment     home   T1 0.65166
#> 8  4   control     home   T1 0.56774
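
For comparison, here is a minimal sketch of extract(), which was mentioned above as the regexp-group alternative to separate(). The small data frame and the column names location and period are made up for illustration.

```r
library(tidyr)

# Made-up data with two variables clumped together in `key`
df <- data.frame(
  key = c("work.T1", "home.T2"),
  value = c(0.1, 0.2),
  stringsAsFactors = FALSE
)

# Each capture group in the regular expression becomes one new column
parts <- tidyr::extract(
  df, key,
  into = c("location", "period"),
  regex = "^([a-z]+)\\.(T[0-9]+)$"
)
parts
```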

The last tool, spread(), takes two columns (a key-value pair) and spreads them into multiple columns, making “long” data wider. Spread is known by other names in other places: it’s cast in reshape2, unpivot in spreadsheets and unfold in databases. spread() is used when you have variables that form rows instead of columns. You need spread() less frequently than gather() or separate(), so to learn more, check out the documentation and the demos.
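
As a small illustration, spread() can undo the gather() from the heart rate example. This sketch rebuilds the long data by hand so that it stands on its own.

```r
library(tidyr)

# The long form of the made-up heart rate data
long <- data.frame(
  name = rep(c("Wilbur", "Petunia", "Gregory"), times = 2),
  drug = rep(c("a", "b"), each = 3),
  heartrate = c(67, 80, 64, 56, 90, 50),
  stringsAsFactors = FALSE
)

# Each value of `drug` becomes a column holding the matching `heartrate`
wide <- spread(long, drug, heartrate)
wide
```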

Just as reshape2 did less than reshape, tidyr does less than reshape2. It’s designed specifically for tidying data, not general reshaping. In particular, existing methods only work for data frames, and tidyr never aggregates. This makes each function in tidyr simpler: each function does one thing well. For more complicated operations you can string together multiple simple tidyr and dplyr functions with %>%.
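
Because tidyr never aggregates, a summary is a job for dplyr. Here is a minimal sketch of such a pipeline, reusing the made-up heart rate data; the column name mean_heartrate is just an illustrative choice.

```r
library(tidyr)
library(dplyr)

messy <- data.frame(
  name = c("Wilbur", "Petunia", "Gregory"),
  a = c(67, 80, 64),
  b = c(56, 90, 50)
)

# tidyr reshapes, dplyr aggregates
means <- messy %>%
  gather(drug, heartrate, a:b) %>%
  group_by(drug) %>%
  summarise(mean_heartrate = mean(heartrate))
means
```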

You can learn more about the underlying principles in my tidy data paper. To see more examples of data tidying, read the vignette, vignette("tidy-data"), or check out the demos, demo(package = "tidyr"). Alternatively, check out some of the great stackoverflow answers that use tidyr. Keep up-to-date with development at http://github.com/hadley/tidyr, report bugs at http://github.com/hadley/tidyr/issues and get help with data manipulation challenges at https://groups.google.com/group/manipulatr. If you ask a question specifically about tidyr on stackoverflow, please tag it with tidyr and I’ll make sure to read it.

We’ve added a new section of articles to the Shiny Development Center. These articles explain how to create interactive documents with Shiny and R Markdown.

You’ll learn how to

  • Use R Markdown to create reproducible, dynamic reports. R Markdown offers one of the most efficient workflows for writing up your R results.

  • Create interactive documents and slideshows by embedding Shiny elements into an R Markdown report. The Shiny + R Markdown combo does more than just enhance your reports; R Markdown provides one of the quickest ways to make lightweight Shiny apps.

  • Take advantage of RStudio’s built-in features that support R Markdown.


Learn more at shiny.rstudio.com/articles

The RStudio team recently rolled out new capabilities in RStudio, shiny, ggvis, dplyr, knitr, R Markdown, and packrat. The “Essential Tools for Data Science with R” free webinar series is the perfect place to learn more about the power of these R packages from the authors themselves.

Click to learn more and register for one or more webinar sessions. You must register for each separately. If you miss a live webinar or want to review them, recorded versions will be available to registrants within 30 days.

The Grammar and Graphics of Data Science
Live! Wednesday, July 30 at 11am Eastern Time US  Click to register

  • dplyr: a grammar of data manipulation – Hadley Wickham
  • ggvis: Interactive graphics in R – Winston Chang

Reproducible Reporting 
Live! Wednesday, August 13 at 11am Eastern Time US  Click to register

  • The Next Generation of R Markdown – Jeff Allen
  • Knitr Ninja – Yihui Xie
  • Packrat – A Dependency Management System for R – J.J. Allaire & Kevin Ushey

Interactive Reporting
Live! Wednesday, September 3 at 11am Eastern Time US  Click to register

  • Embedding Shiny Apps in R Markdown documents – Garrett Grolemund
  • Shiny: R made interactive – Joe Cheng

 

RStudio will teach the new essentials for doing data science in R at this year’s Strata NYC conference, Oct 15 2014.

R Day at Strata is a full day of tutorials that will cover some of the most useful topics in R. You’ll learn how to manipulate and visualize data with R, as well as how to write reproducible, interactive reports that foster collaboration. Topics include:

9:00am – 10:30am
A Grammar of Data Manipulation with dplyr
Speaker: Hadley Wickham

11:00am – 12:30pm
A Reactive Grammar of Graphics with ggvis
Speaker: Winston Chang

1:30pm – 3:00pm
Analytic Web Applications with Shiny
Speaker: Garrett Grolemund

3:30pm – 5:00pm
Reproducible R Reports with Packrat and Rmarkdown
Speaker: JJ Allaire & Yihui Xie

The tutorials are integrated into a cohesive day of instruction. Many of the tools that we’ll cover did not exist six months ago, so you are almost certain to learn something new. You will get the most out of the day if you already know how to load data into R and have some basic experience visualizing and manipulating data.

Visit strataconf.com/stratany2014 to learn more and register! Early bird pricing ends July 31.

Not available on October 15? Check out Hadley’s Advanced R Workshop in New York City on September 8 and 9, 2014.

 

Shiny v0.10 comes with a quick, handy guide. Use the Shiny cheat sheet as a quick reference for building Shiny apps. The cheat sheet will guide you from structuring your app, to writing a reactive foundation with server.R, to laying out and deploying your app.

 


 

You can find the Shiny cheat sheet along with many more resources for using Shiny at the Shiny Dev Center, shiny.rstudio.com.

(p.s. Visit the RStudio booth at useR! today for a free hard copy of the cheat sheet.)

The R User Conference 2014 is coming up fast in Los Angeles!

RStudio will be there in force to share the latest enhancements to shiny, ggvis, knitr, dplyr, R Markdown, packrat and more. Here’s a quick snapshot of our scheduled sessions. We hope to see you in as many of them as you can attend!

Monday, June 30

Morning Tutorials

  • Interactive graphics with ggvis - Winston Chang
  • Dynamic Documents with R and knitr – Yihui Xie

Afternoon Tutorials

  • Data manipulation with dplyr – Hadley Wickham
  • Interactive data display with Shiny and R – Garrett Grolemund

Tuesday, July 1

Session 1 10:30 Room – Palisades
ggvis: Interactive graphics in R - Winston Chang

Session 2 13:00 Room - Palisades
Shiny: R made interactive – Joe Cheng

Session 3 16:00 Room - Palisades
dplyr: a grammar of data manipulation – Hadley Wickham

Wednesday, July 2

Session 5 16:00 Room - Palisades
Packrat – A Dependency Management System for R – J.J. Allaire

Thursday, July 3

Session 6 10:00 Room - Palisades
The Next Generation of R Markdown – J.J. Allaire
Knitr Ninja – Yihui Xie
Embedding Shiny Apps in R Markdown documents – Garrett Grolemund

Every Day

Don’t miss our table in the exhibition area during the conference. Come talk to us about your plans for R and learn how RStudio Server Pro and Shiny Server Pro can provide enterprise-ready support and scalability for your RStudio IDE and Shiny deployments.

Our first public release of ggvis, version 0.3, is now available on CRAN. What is ggvis? It’s a new package for data visualization. Like ggplot2, it is built on concepts from the grammar of graphics, but it also adds interactivity, a new data pipeline, and it renders in a web browser. Our goal is to make an interface that’s flexible, so that you can compose new kinds of visualizations, yet simple, so that it’s accessible to all R users.

Update: there was an issue affecting interactive plots in version 0.3. Version 0.3.0.1 fixes the issue. The updated source package is now on CRAN, and Windows and Mac binary packages will be available shortly.


ggvis integrates with Shiny, so you can use dynamic, interactive ggvis graphics in Shiny applications. We hope that the combination of ggvis and Shiny will make it easy for you to create applications for interactive data exploration and presentation. ggvis plots are inherently reactive and they render in the browser, so they can take advantage of the capabilities provided by modern web browsers. You can use Shiny’s interactive components for interactivity as well as more direct forms of interaction with the plot, such as hovering, clicking, and brushing.

ggvis works seamlessly with R Markdown v2 and interactive documents, so you can easily add interactive graphics to your R Markdown documents:


And don’t worry — ggvis isn’t only meant to be used with Shiny and interactive documents. Because the RStudio IDE is also a web browser, ggvis plots can display in the IDE, like any other R graphics:

ggvis in RStudio IDE

There’s much more to come with ggvis. To learn more, visit the ggvis website.

Please note that ggvis is still young, and lacks a number of important features from ggplot2. But we’re working hard on ggvis and expect many improvements in the months to come.

Shiny 0.10 is now available on CRAN.

Interactive documents

In this release, the biggest changes were under the hood to support the creation of interactive documents. If you haven’t had a chance to check out interactive documents, we really encourage you to do so—it may be the easiest way to learn Shiny.

New layout functions

Three new functions—flowLayout(), splitLayout(), and inputPanel()—were added for putting UI elements side by side.

  • flowLayout() lays out its children in a left-to-right, top-to-bottom arrangement.
  • splitLayout() evenly divides its horizontal space among its children (or unevenly divides if the cellWidths argument is provided).
  • inputPanel() is like flowLayout(), but with a light grey background, and is intended for encapsulating small input controls wherever vertical space is at a premium.

A new logical argument inline was also added to checkboxGroupInput() and radioButtons() to arrange check boxes and radio buttons horizontally.
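
Here is a minimal sketch of how these layout functions might be combined. The input and output IDs and the choice of the built-in faithful data set are assumptions made for illustration.

```r
library(shiny)

ui <- fluidPage(
  # Small controls grouped on a light grey background
  inputPanel(
    selectInput("bins", "Number of bins:", choices = c(10, 20, 35, 50)),
    # inline = TRUE arranges the radio buttons horizontally
    radioButtons("colour", "Colour:",
                 choices = c("grey", "steelblue"), inline = TRUE)
  ),
  # Uneven split: 70% for the plot, 30% for the text summary
  splitLayout(
    cellWidths = c("70%", "30%"),
    plotOutput("hist"),
    verbatimTextOutput("summary")
  )
)

server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$eruptions, breaks = as.numeric(input$bins),
         col = input$colour, main = "Eruption duration")
  })
  output$summary <- renderPrint(summary(faithful$eruptions))
}

# Build the app object; print(app) or runApp(app) would launch it
app <- shinyApp(ui, server)
```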

Custom validation error messages

Sometimes you don’t want your reactive expressions or output renderers in server.R to proceed unless certain input conditions are satisfied, e.g. a select input value has been chosen, or a sensible combination of inputs has been provided. In these cases, you might want to stop the render function quietly, or you might want to give the user a custom message. In shiny 0.10.0, we introduced the functions validate() and need() which you can use to enforce validation conditions. This won’t be the last word on input validation in Shiny, but it should be a lot safer and more convenient than how most of us have been doing it.

See the article Write error messages for your UI with validate for details and examples.
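
As a quick sketch of the idea, the app below shows a custom message until a data set has been chosen. The input ID, the message text and the list of data sets are illustrative assumptions.

```r
library(shiny)

ui <- fluidPage(
  selectInput("dataset", "Data set:", choices = c("", "cars", "pressure")),
  plotOutput("plot")
)

server <- function(input, output) {
  output$plot <- renderPlot({
    # Stop here, showing the message in place of the plot,
    # until the condition passed to need() holds
    validate(
      need(input$dataset != "", "Please choose a data set")
    )
    plot(get(input$dataset, "package:datasets"))
  })
}

# Build the app object; print(app) or runApp(app) would launch it
app <- shinyApp(ui, server)
```

need() returns its message when the condition fails and NULL otherwise, and validate() stops the render function if any message comes back.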

Server-side processing for Selectize input

In the previous release of Shiny, we added support for Selectize, a powerful select box widget. At that time, our implementation passed all of the data to the web page and used JavaScript to do any paging, filtering, and sorting. It worked great for small numbers of items but didn’t scale well beyond a few thousand items.

For Shiny 0.10, we greatly improved the performance of our existing client-side Selectize binding, but also added a new mode that allows the paging, filtering, and sorting to all happen on the server. Only the results that are actually displayed are downloaded to the client. This approach works well for hundreds of thousands or millions of rows.

For more details and examples, see the article Using selectize input on shiny.rstudio.com.
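
A minimal sketch of the server-side mode follows, assuming a made-up vector of 100,000 labels. The input starts out empty and is filled from the server function with server = TRUE.

```r
library(shiny)

# Made-up choices, far more than we would want to send to the browser at once
many_choices <- sprintf("item-%06d", seq_len(1e5))

ui <- fluidPage(
  # Start with no choices; they are supplied from the server below
  selectizeInput("item", "Pick an item:", choices = NULL)
)

server <- function(input, output, session) {
  # server = TRUE keeps the choices on the server and sends only matches
  updateSelectizeInput(session, "item", choices = many_choices, server = TRUE)
}

# Build the app object; print(app) or runApp(app) would launch it
app <- shinyApp(ui, server)
```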

htmltools

We also split off Shiny’s HTML generating library (tags and friends) into a separate htmltools package. If you’re writing a package that needs to generate HTML programmatically, it’s far easier and safer to use htmltools than to paste HTML strings together yourself. We’ll have more to share about htmltools in the months to come.
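
To give a flavour of what that looks like, here is a small sketch that builds a fragment of HTML with tags. The class name and text are made up; note that the "<" in the text is escaped for us.

```r
library(htmltools)

# Build nested HTML from R function calls instead of pasting strings
note <- tags$div(
  class = "note",
  tags$h3("Results"),
  tags$p("Is 3 < 5? ", tags$strong("Yes"))
)

# Convert the tag object to an HTML string
html <- as.character(note)
cat(html)
```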

Other changes

  • New actionLink() input control: behaves like actionButton() but looks like a link
  • renderPlot() now calls print() on its result if it’s visible–no more explicit print() required for ggplot2
  • Sliders and select boxes now use a fixed horizontal size instead of filling up all available horizontal space; pass width="100%" if you need the old behavior
  • The session object that can be passed into a server function is now documented: see ?session
  • New reactive domains feature makes it easy to get callbacks when the current session ends, without having to pass session everywhere
  • Thanks to reactive domains, by default, observers now automatically stop executing when the Shiny session that created them ends
  • shinyUI and shinyServer

For the full list, you can take a look at the NEWS file. Please let us know if you have any comments or questions.

R Markdown’s new interactive documents provide a quick, light-weight way to use Shiny. An interactive document embeds Shiny elements in an R Markdown report. The report becomes “live”, a choose your own adventure that readers can control and explore. Interactive documents are easy to create and easy to share.

Create an interactive document

To create an interactive document use RStudio to create a new R Markdown file, choose the Shiny document template, then click “Run Document” to show a preview:



Embed R code chunks in your report where you like. Interactive documents use the same syntax as R Markdown and knitr. Set echo = FALSE. Your reader won’t see the code, just its results.

 


Include Shiny widgets and outputs in your code chunks. R Markdown will insert the widgets directly into your final document. When a reader toggles a widget, the parts of the document that depend on it will update instantly.
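
As a sketch, the contents of one such code chunk might look like the following. It assumes the chunk lives in an R Markdown file whose YAML header sets runtime: shiny; the input ID and the faithful data set are illustrative choices.

```r
library(shiny)

# A widget: in an interactive document, printing it places it in the page
controls <- inputPanel(
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20)
)
controls

# An output that re-renders whenever the slider changes
histogram <- renderPlot({
  hist(faithful$eruptions, breaks = input$bins, main = "Eruption duration")
})
histogram
```

Outside an interactive document these lines simply print the two objects; the reactive behaviour only appears once the document is run.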


That’s it! No extra files are needed.

Note that in order to use interactive documents you should be running the latest version of RStudio (v0.98.932 or higher). Alternatively if you are not using RStudio be sure to follow the directions here to install all of the required components.

Share your document

Interactive documents can be run locally on the desktop or be deployed to Shiny Server v1.2 or ShinyApps just like any other Shiny application. See the R Markdown v2 website for more details on deploying interactive documents.

Use pre-packaged tools

Interactive documents make it easy to insert powerful tools into a report. For example, you can insert a kmeans clustering tool into your document with one line of code, as below. kmeans_cluster is a widget built from a Shiny app and intended for use in interactive documents.


You can build your own widgets with shinyApp, a new function that repackages Shiny apps as functions. shinyApp is easy to use. Its first argument takes the code that appears in an app’s ui.R file. The second argument takes the code that appears in the app’s server.R file. The source of kmeans_cluster reveals how simple this is.
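
Here is a minimal sketch of a home-made widget in that style. The function histogram_widget and its arguments are hypothetical and are not the source of kmeans_cluster.

```r
library(shiny)

# Hypothetical widget: wraps a small Shiny app in an ordinary function
histogram_widget <- function(data, variable) {
  shinyApp(
    # First argument: what would otherwise live in ui.R
    ui = fluidPage(
      sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
      plotOutput("hist")
    ),
    # Second argument: what would otherwise live in server.R
    server = function(input, output) {
      output$hist <- renderPlot({
        hist(data[[variable]], breaks = input$bins, main = variable)
      })
    }
  )
}

# In an interactive document, one line in a code chunk embeds the widget
widget <- histogram_widget(faithful, "eruptions")
```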

Be a hero

Ready to be a hero? You can use the shinyApp function to make out-of-the-box widgets that students, teachers, and data scientists will use every day. Widgets can

  • fit models
  • compare distributions
  • visualize data
  • demonstrate teaching examples
  • act as quizzes or multiple choice questions
  • and more

These widgets have not been made yet; they are low-hanging fruit for any Shiny developer. If you know how to program with Shiny (or want to learn), and would like to make your mark on R, consider authoring a package that makes widgets available for interactive documents.

Get started!

To learn more about interactive documents visit http://rmarkdown.rstudio.com/authoring_shiny.html.

 


RStudio is an affiliated project of the Foundation for Open Access Statistics
