We’re excited to announce the release of RStudio Connect: version 1.4.6. This is an incremental release which features significantly improved startup time and support for server-side Shiny bookmarks.

Creating a server-side Shiny bookmark in RStudio Connect

Improved Startup & Job Listing Time

We now track R process jobs in the database which allows us to list and query jobs much more quickly. This decreases the startup time of the RStudio Connect service — allowing even the busiest of servers to spin up in a matter of seconds. Additionally, operations that involve listing jobs such as viewing process logs for a particular application should be noticeably faster.

Server-Side Shiny Bookmarks

Shiny v0.14 introduced a feature by which users could bookmark the current state of the application by either encoding the state in the URL or saving the state to the server. As of this release, RStudio Connect now supports server-side bookmarking of Shiny applications.
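
As a rough illustration of what this looks like on the application side, here is a minimal sketch using Shiny's bookmarking API (`enableBookmarking = "server"` and `bookmarkButton()`). The inputs `n` and `label` are made up for this example, and nothing in it is specific to RStudio Connect.

```r
library(shiny)

# The UI must be a function of `request` so Shiny can restore bookmarked state.
ui <- function(request) {
  fluidPage(
    sliderInput("n", "Number of observations", min = 10, max = 500, value = 100),
    textInput("label", "Plot title", value = "Histogram"),
    bookmarkButton(),
    plotOutput("hist")
  )
}

server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = input$label)
  })
}

# "server" saves the state on the server and hands back a short URL;
# "url" would instead encode the state in the URL itself.
app <- shinyApp(ui, server, enableBookmarking = "server")
```

Printing `app` at the console (or calling `runApp(app)`) starts the application; clicking the bookmark button then produces a URL that restores the saved inputs.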

Other notable changes this release:

  • BREAKING: Changed the default for Authorization.DefaultUserRole from publisher to viewer. New users will now be created with a viewer account until promoted. The user roles documentation explains the differences. To restore the previous behavior, set DefaultUserRole = publisher. Because viewer users cannot be added as collaborators on content, this means that in order to add a remote user as a collaborator on content you must first create their account, then promote them to a publisher account.
  • Fixed a bug in the previous release that had broken Applications.ViewerOnDemandReports and Applications.ViewerCustomizedReports. These settings are again functional and allow you to manage the capabilities of a viewer of a parameterized report on the server.
  • Tune the number of concurrent processes to use when building R packages. This is controlled with the Server.CompilationConcurrency setting and passed as the value to the make flag -jNUM. The default is to permit four concurrent processes. Decrease this setting in low memory environments.
  • The /etc/rstudio-connect/rstudio-connect.gcfg file is installed with more restrictive permissions.
  • Log file downloads include a more descriptive file name by default. Previously, we used the naming convention <jobId>.log, which resulted in file names like GBFCaiPE6tegbrEM.log. Now, we use the naming convention rstudio-connect.<appId>.<reportId>.<bundleId>.<jobType>.<jobId>.log, which results in file names like rstudio-connect.34.259.15.packrat_restore.GBFCaiPE6tegbrEM.log.
  • Bundle the admin guide and user guide in the product. You can access both from the Documentation tab.
  • Implemented an improved pop-out filtering panel for filtering content, which offers a better experience on small/mobile screens.
  • Improvements to the parameterized report pane when the viewer does not have the authority to render custom versions of the document.
  • Database improvements that should speed up the server in high-traffic environments.
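
To make the new log naming convention from the list above concrete, the following sketch assembles a file name from its parts. The helper `connect_log_name()` is hypothetical and is not part of RStudio Connect; the values are the ones from the example file name above.

```r
# Hypothetical helper (not an RStudio Connect API): build a log file name
# following rstudio-connect.<appId>.<reportId>.<bundleId>.<jobType>.<jobId>.log
connect_log_name <- function(app_id, report_id, bundle_id, job_type, job_id) {
  sprintf("rstudio-connect.%d.%d.%d.%s.%s.log",
          app_id, report_id, bundle_id, job_type, job_id)
}

log_name <- connect_log_name(34L, 259L, 15L, "packrat_restore", "GBFCaiPE6tegbrEM")
print(log_name)
```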

Upgrade Planning: The migration of jobs from disk to the database may take a few minutes. The server will be unavailable during this migration which will be performed the first time RStudio Connect v1.4.6 starts. Even on the busiest of servers we would expect this migration to complete in under 5 minutes.

If you haven’t yet had a chance to download and try RStudio Connect we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) with collaborators, colleagues, or customers.

You can find more details or download a 45 day evaluation of the product at https://www.rstudio.com/products/connect/. Additional resources can be found below.

We’re excited to announce the release of RStudio Connect: version 1.4.4.1. This release includes the ability to manage different versions of your work on RStudio Connect.

Managing old versions of deployed content in RStudio Connect

Rollback / Roll Forward
The most notable feature of this release is the ability to “rollback” to a previously deployed version of your work or “roll forward” to a more recent version of your work.

You can also download a particular version, perhaps as a starting place for a new report or application, and delete old versions that you want to remove from the server.

Other important features allow you to:

  • Specify the number of versions to retain. You can alter the setting Applications.BundleRetentionLimit to specify how many versions of your applications you want to keep on disk. By default, we retain all bundles indefinitely.
  • Limit the number of scheduled reports that will be run concurrently using the Applications.ScheduleConcurrency setting. This setting will help ensure that your server isn’t overwhelmed by too many reports all scheduled to run at the same time of day. The default is set to 2.
  • Create a printable view of your content with a new “Print” menu option.
  • Notify users of unsaved changes before they take an action in parameterized reports.

The release also includes numerous security and stability improvements.

If you haven’t yet had a chance to download and try RStudio Connect we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) with collaborators, colleagues, or customers.

You can find more details or download a 45 day evaluation of the product at https://www.rstudio.com/products/connect/. Additional resources can be found below.

We’re excited to announce the latest release of RStudio Connect: version 1.4.2. This release includes a number of notable features including an overhauled interface for parameterized R Markdown reports.

Enhanced Parameterized R Markdown Reports

The most notable feature in this release is the ability to publish parameterized R Markdown reports that are easier for anyone to customize. If you’re unfamiliar, parameterized R Markdown reports allow you to inject input parameters into your R Markdown document to alter what analysis the report performs. The parameters of your R Markdown report are now visible on the left-hand sidebar, allowing users to easily tweak the inputs to the document and quickly view the output in the browser.

Users even have the opportunity to create private versions of the report which they can schedule to run again, email, or save and revisit in the browser. Of course, you can continue to use the wide variety of output formats (notebooks, dashboards, books, and others) while using parameterized R Markdown.
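
For readers new to the format, here is a minimal sketch of a parameterized R Markdown document and how it could be rendered with different inputs from R. The parameter names `region` and `year` and their values are invented for illustration; the `params` field in the YAML header and the `params` argument of `rmarkdown::render()` are the standard rmarkdown mechanism. Rendering is skipped if rmarkdown or pandoc is not available.

```r
# Write a small parameterized report to a temporary file.
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Sales report\"",
  "output: html_document",
  "params:",
  "  region: \"East\"",
  "  year: 2016",
  "---",
  "",
  "This report covers the `r params$region` region in `r params$year`."
), rmd_path)

# Render with non-default parameter values (illustrative values).
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd_path,
                           params = list(region = "West", year = 2017),
                           quiet = TRUE)
  print(out)  # path to the rendered HTML file
}
```

On RStudio Connect the same parameters are what appear in the sidebar, so a viewer changes them in the browser rather than by calling `render()`.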

In addition to the parameterized report overhaul, there are some other notable features included in this release.

  • Content private by default – Content is set to private (“Just Me”) by default. Users can still change the visibility of their content before publishing, as before.
  • Execute R as the authenticated viewer – You can now choose to have some applications execute their underlying R process as the authenticated viewer currently looking at the app. This allows applications to access any data or resource that the associated user has access to on the server. Requires PAM authentication. More details here.

Other important features include:

  • Show progress indicator when updating a report.
  • Users can now filter content to include only items that they can edit or view.
  • Users now only count against the named user license limit after they log in for the first time.
  • Added support for global “System Messages” that can display an HTML message to your users on the landing pages. Details here.
  • Updated packrat to gain more transparency on package build errors.
  • Updated the list of SSL ciphers to correspond with modern best-practices.

If you haven’t yet had a chance to download and try RStudio Connect we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) with collaborators, colleagues, or customers.

You can find more details or download a 45 day evaluation of the product at https://www.rstudio.com/products/connect/. Additional resources can be found below.

We’re thrilled to officially introduce the newest product in RStudio’s product lineup: RStudio Connect.

You can download a free 45-day trial of it here.

RStudio Connect is a new publishing platform for all the work your teams do in R. It provides a single destination for your Shiny applications, R Markdown documents, interactive HTML widgets, static plots, and more.

RStudio Connect Settings

RStudio Connect isn’t just for R users. Now anyone can interact with custom built analytical data products developed by R users without having to program in R themselves. Team members can receive updated reports built on the same models/forecasts which can be configured to be rebuilt and distributed on a scheduled basis. RStudio Connect is designed to bring the power of data science to your entire enterprise.

RStudio Connect empowers analysts to share and manage the content they’ve created in R. Users of the RStudio IDE can publish content to RStudio Connect with the click of a button and immediately be able to manage that content from a user-friendly web application: setting access controls and performance settings and viewing the logs of the associated R processes on the server.

Deploying content from the RStudio IDE into RStudio Connect

RStudio Connect is on-premises software that you can install on a server behind your firewall, ensuring that your data and R applications never have to leave your organization’s control. We integrate with many enterprise authentication platforms, including LDAP/Active Directory, Google OAuth, PAM, and proxied authentication. We also provide an option to use an internal username/password system complete with user self-sign-up.

RStudio Connect Admin Metrics

RStudio Connect has been in Beta for almost a year. We’ve had hundreds of customers validate and help us improve the software in that time. In November, we made RStudio Connect generally available without significant fanfare and began to work with Beta participants and existing RStudio customers eager to move it into their production environments. We are pleased that innovative early customers, like AdRoll, have already successfully introduced RStudio Connect into their data science process.

“At AdRoll, we have used the open source version of Shiny Server for years to great success but deploying apps always served as a barrier for new users. With RStudio Connect’s push button deployment from the RStudio IDE, the number of shiny devs has grown tremendously both in engineering and across teams completely new to shiny like finance and marketing. It’s been really powerful for those just getting started to be able to go from developing locally to sharing apps with others in just seconds.”

– Bryan Galvin, Senior Data Scientist, AdRoll

We invite you to take a look at RStudio Connect today, too!

You can find more details or download a 45 day evaluation of the product at https://www.rstudio.com/products/connect/. Additional resources can be found below.

On October 12, RStudio launched R Views with great enthusiasm. R Views is a new blog for R users about the R Community and the R Language. Under the care of editor-in-chief and new RStudio ambassador-at-large, Joseph Rickert, R Views provides a new perspective on R and RStudio that we like to think will become essential reading for you.

You may have read an R Views post already. In the first, widely syndicated, post, Joseph interviewed J.J. Allaire, RStudio’s founder, CEO and most prolific software developer. Later posts by Mine Cetinkaya-Rundel on Highcharts and thoughtful book reviews, new R package picks, and a primer on Naive Bayes from Joseph rounded out the first month. Each post was entirely different from anything you could have read here, on what we now call our Developer Blog at rstudio.org.

Fortunately, you don’t have to choose. Each has its purpose. Our Developer Blog is the place to go for RStudio news. You’ll find product announcements, events, and company happenings – like the announcement of a new blog – right here. R Views is about R in action. You’ll find stories and solutions and opinions that we hope will educate and challenge you.

Subscribe to each and stay up to date on all things R and RStudio!

Thanks for making R and RStudio part of your data science experience and for supporting our work.

rstudio::conf 2017, the conference on all things R and RStudio, is only 90 days away. Now is the time to claim your spot or grab one of the few remaining seats at Training Days – including the new Tidyverse workshop.

REGISTER NOW

Whether you’re already registered or still working on it, we’re delighted today to announce the full conference schedule, so that you can plan your days in Florida.

rstudio::conf 2017 takes place January 12-14 at the Gaylord Resorts in Kissimmee, Florida. There are over 30 talks and tutorials to choose from that are sure to accelerate your productivity in R and RStudio. In addition to the highlights below, topics include the latest news on R notebooks, sparklyr, profiling, the tidyverse, Shiny, R Markdown, HTML widgets, data access, and the new enterprise-scale publishing capabilities of RStudio Connect.

Schedule Highlights

Keynotes
– Hadley Wickham, Chief Scientist, RStudio: Data Science in the Tidyverse
– Andrew Flowers, Economics Writer, FiveThirtyEight: Finding and Telling Stories with R
– J.J. Allaire, Software Engineer, CEO & Founder: RStudio Past, Present and Future

Tutorials
– Winston Chang, Software Engineer, RStudio: Building Dashboards with Shiny
– Charlotte Wickham, Oregon State University: Happy R Users Purrr
– Yihui Xie, Software Engineer, RStudio: Advanced R Markdown
– Jenny Bryan, University of British Columbia: Happy Git and GitHub for the UseR

Featured Speakers
– Max Kuhn, Senior Director Non-Clinical Statistics, Pfizer
– Dirk Eddelbuettel, Ketchum Trading: Extending R with C++: A Brief Introduction to Rcpp
– Hilary Parker, Stitch Fix: Opinionated Analysis Development
– Bryan Lewis, Paradigm4: Fun with htmlwidgets
– Ryan Hafen, Hafen Consulting: Interactive plotting with rbokeh and crosstalk
– Julia Silge, Datassist: Text mining, the tidy way
– Bob Rudis, Rapid7: Writing readable code with pipes

Featured Talk
– Joseph Rickert, R Ambassador, RStudio: R’s Role in Data Science

Be sure to visit https://www.rstudio.com/conference/ for the full schedule and latest updates and don’t forget to download the RStudio conference app to help you plan your days in detail.

Special Reminder: When you register, make sure you purchase your ticket for Friday evening at Universal’s Wizarding World of Harry Potter. The park is reserved exclusively for rstudio::conf attendees. It’s an extraordinary experience we’re sure you’ll enjoy!

We appreciate our sponsors and exhibitors!

Today we’re excited to announce R Notebooks, which add a powerful notebook authoring engine to R Markdown. Notebook interfaces for data analysis have compelling advantages including the close association of code and output and the ability to intersperse narrative with computation. Notebooks are also an excellent tool for teaching and a convenient way to share analyses.

You can try out R Notebooks today in the RStudio Preview Release.

Interactive R Markdown

As an authoring format, R Markdown bears many similarities to traditional notebooks like Jupyter and Beaker. However, code in notebooks is typically executed interactively, one cell at a time, whereas code in R Markdown documents is typically executed in batch.

R Notebooks bring the interactive model of execution to your R Markdown documents, giving you the capability to work quickly and iteratively in a notebook interface without leaving behind the plain-text tools and production-quality output you’ve come to rely on from R Markdown.

Feature | R Markdown Notebooks | Traditional Notebooks
Plain text representation | |
Same editor/tools used for R scripts | |
Works well with version control | |
Focus on production output | |
Output inline with code | |
Output cached across sessions | |
Share code and output in a single file | |
Emphasized execution model | Interactive & Batch | Interactive

This video provides a bit more background and a demonstration of notebooks in action:

Iterate Quickly

In a typical R Markdown document, you must re-knit the document to see your changes, which can take some time if it contains non-trivial computations. R Notebooks, however, let you run code and see the results in the document immediately. They can include just about any kind of content R produces, including console output, plots, data frames, and interactive HTML widgets.

You can see the progress of the code as it runs:

You can preview the results of individual inline expressions, too:

Even your LaTeX equations render in real-time as you type:

This focused mode of interaction doesn’t require you to keep the console, viewer, or output panes open. Everything you need is at your fingertips in the editor, reducing distractions and helping you concentrate on your analysis. When you’re done, you’ll have a formatted, reproducible record of what you’ve accomplished, with plenty of context, perfect for your own records or sharing with others.

Batteries Included

R Notebooks can run more than just R code. You can run chunks written in other languages, like Python, Bash, or C++ (Rcpp).

It’s even possible to run SQL directly:

This makes an R Notebook an excellent tool for orchestrating a reproducible, end-to-end data analysis workflow; you can easily ingest data using your tool of choice, and share data among languages by using packages like feather, or ordinary CSV files.

Reproducible Notebooks

While you can run chunks (and even individual lines of R code!) in any order you like, a fully reproducible document must be able to be re-executed start-to-finish in a clean environment. There’s a built-in command to do this, too, so it’s easy to test your notebooks for reproducibility.

Rich Output Formats

Since they’re built on R Markdown, R Notebooks work seamlessly with other R Markdown output types. You can use any existing R Markdown document as a notebook, or render (knit) a notebook to any R Markdown output type.

The same document can be used as a notebook when you’re quickly iterating on ideas and later rendered to a wholly different format for publication – no duplication of code, data, or output required.

Share and Publish

R Notebooks are easy to share with collaborators. Because they’re plain-text files, they work well with version control systems like Git. Your collaborators don’t even need RStudio to edit them, since notebooks can be rendered in the R console using the open source rmarkdown package.
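
As a sketch of that console workflow, the snippet below writes a tiny notebook source file and renders it with `rmarkdown::render()`. The file contents are illustrative; `html_notebook` is the rmarkdown output format for notebooks. Rendering is skipped when rmarkdown or pandoc is not installed.

```r
# Chunk delimiter (three backticks), built programmatically to keep the example readable.
fence <- strrep("`", 3)

# Write a minimal notebook source file; it is ordinary plain-text R Markdown.
nb_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example notebook\"",
  "output: html_notebook",
  "---",
  "",
  "A short summary of the built-in cars data:",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), nb_path)

# Render from the R console; no RStudio session is needed for this step.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  nb_out <- rmarkdown::render(nb_path, output_format = "html_notebook", quiet = TRUE)
  print(nb_out)  # path to the rendered notebook file
}
```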

Rendered notebooks can be previewed right inside RStudio:

While the notebook preview looks similar to a rendered R Markdown document, the notebook preview does not execute any of your R code chunks; it simply shows you a rendered copy of the markdown in your document along with the most recent chunk output. Because it’s very fast to generate this preview (again, no R code is executed), it’s generated every time you save the R Markdown document.

The generated HTML file has the special extension .nb.html. It is self-contained, free of dependencies, and can be viewed locally or published to any static web hosting service.

It also includes a bundled copy of the R Markdown source file, so it can be seamlessly opened in RStudio to resume work on the notebook with all output intact.

Try It Out

To try out R Notebooks, you’ll need to download the latest RStudio Preview Release.

You can find documentation on notebook features on the R Notebooks page on the R Markdown website, and we’ve also published a video tutorial in our R Notebooks Webinar.

We believe the R Notebook will become a powerful new addition to your toolkit. Give it a spin and let us know what you think!

We’re excited today to announce sparklyr, a new package that provides an interface between R and Apache Spark.

Over the past couple of years we’ve heard time and time again that people want a native dplyr interface to Spark, so we built one! sparklyr also provides interfaces to Spark’s distributed machine learning algorithms and much more. Highlights include:

  • Interactively manipulate Spark data using both dplyr and SQL (via DBI).
  • Filter and aggregate Spark datasets then bring them into R for analysis and visualization.
  • Orchestrate distributed machine learning from R using either Spark MLlib or H2O Sparkling Water.
  • Create extensions that call the full Spark API and provide interfaces to Spark packages.
  • Integrated support for establishing Spark connections and browsing Spark data frames within the RStudio IDE.

We’re also excited to be working with several industry partners. IBM is incorporating sparklyr into their Data Science Experience, Cloudera is working with us to ensure that sparklyr meets the requirements of their enterprise customers, and H2O has provided an integration between sparklyr and H2O Sparkling Water.

Getting Started

You can install sparklyr from CRAN as follows:

install.packages("sparklyr")

You should also install a local version of Spark for development purposes:

library(sparklyr)
spark_install(version = "1.6.2")

If you use the RStudio IDE, you should also download the latest preview release of the IDE which includes several enhancements for interacting with Spark.

Extensive documentation and examples are available at http://spark.rstudio.com.

Connecting to Spark

You can connect to both local instances of Spark as well as remote Spark clusters. Here we’ll connect to a local instance of Spark:

library(sparklyr)
sc <- spark_connect(master = "local")

The returned Spark connection (sc) provides a remote dplyr data source to the Spark cluster.

Reading Data

You can copy R data frames into Spark using the dplyr copy_to function (more typically though you’ll read data within the Spark cluster using the spark_read family of functions). For the examples below we’ll copy some datasets from R into Spark (note that you may need to install the nycflights13 and Lahman packages in order to execute this code):

library(dplyr)
iris_tbl <- copy_to(sc, iris)
flights_tbl <- copy_to(sc, nycflights13::flights, "flights")
batting_tbl <- copy_to(sc, Lahman::Batting, "batting")
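
The paragraph above notes that data is more typically read inside the cluster. A minimal sketch of that path, assuming a CSV file that the cluster can reach, might look like the following. The wrapper name `read_flights_csv()` and the table name `flights_csv` are illustrative; `spark_read_csv()` is one member of the spark_read family.

```r
# Hypothetical wrapper: load a CSV file directly into Spark rather than
# copying an R data frame. `path` must be readable by the Spark cluster
# (for example a local path in local mode, or an hdfs:// URI).
read_flights_csv <- function(sc, path) {
  sparklyr::spark_read_csv(sc, name = "flights_csv", path = path, header = TRUE)
}

# Usage, given an open connection `sc`:
# flights_csv_tbl <- read_flights_csv(sc, "path/to/flights.csv")
```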

Using dplyr

We can now use all of the available dplyr verbs against the tables within the cluster. Here’s a simple filtering example:

# filter by departure delay
flights_tbl %>% filter(dep_delay == 2)

Introduction to dplyr provides additional dplyr examples you can try. For example, consider the last example from the tutorial which plots data on flight delays:

delay <- flights_tbl %>% 
  group_by(tailnum) %>%
  summarise(count = n(), dist = mean(distance), delay = mean(arr_delay)) %>%
  filter(count > 20, dist < 2000, !is.na(delay)) %>%
  collect()

# plot delays
library(ggplot2)
ggplot(delay, aes(dist, delay)) +
  geom_point(aes(size = count), alpha = 1/2) +
  geom_smooth() +
  scale_size_area(max_size = 2)

Note that while the dplyr functions shown above look identical to the ones you use with R data frames, with sparklyr they use Spark as their back end and execute remotely in the cluster.

Window Functions

dplyr window functions are also supported, for example:

batting_tbl %>%
  select(playerID, yearID, teamID, G, AB:H) %>%
  arrange(playerID, yearID, teamID) %>%
  group_by(playerID) %>%
  filter(min_rank(desc(H)) <= 2 & H > 0)
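
To see what the `min_rank(desc(H)) <= 2` filter selects, here is a small local analogue in base R on made-up data (no Spark involved). It keeps each player's two best seasons by hits, with ties sharing the lowest rank as `min_rank()` does, and drops seasons with no hits.

```r
# Made-up batting data: two players, a few seasons each.
batting <- data.frame(
  playerID = c("a", "a", "a", "b", "b"),
  yearID   = c(2001, 2002, 2003, 2001, 2002),
  H        = c(10, 30, 20, 0, 5)
)

# Rank hits within each player, largest first; ranking -H ascending is the
# same as ranking H descending, and ties.method = "min" mirrors min_rank().
batting$hit_rank <- ave(-batting$H, batting$playerID,
                        FUN = function(h) rank(h, ties.method = "min"))

# Keep the top two seasons per player, excluding hitless seasons.
top <- batting[batting$hit_rank <= 2 & batting$H > 0, ]
print(top)
```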

For additional documentation on using dplyr with Spark see the dplyr section of the sparklyr website.

Using SQL

It’s also possible to execute SQL queries directly against tables within a Spark cluster. The spark_connection object implements a DBI interface for Spark, so you can use dbGetQuery to execute SQL and return the result as an R data frame:

library(DBI)
iris_preview <- dbGetQuery(sc, "SELECT * FROM iris LIMIT 10")

Machine Learning

You can orchestrate machine learning algorithms in a Spark cluster via either Spark MLlib or via the H2O Sparkling Water extension package. Both provide a set of high-level APIs built on top of DataFrames that help you create and tune machine learning workflows.

Spark MLlib

In this example we’ll use ml_linear_regression to fit a linear regression model. We’ll use the built-in mtcars dataset, and see if we can predict a car’s fuel consumption (mpg) based on its weight (wt) and the number of cylinders the engine contains (cyl). We’ll assume in each case that the relationship between mpg and each of our features is linear.

# copy mtcars into spark
mtcars_tbl <- copy_to(sc, mtcars)

# transform our data set, and then partition into 'training', 'test'
partitions <- mtcars_tbl %>%
  filter(hp >= 100) %>%
  mutate(cyl8 = cyl == 8) %>%
  sdf_partition(training = 0.5, test = 0.5, seed = 1099)

# fit a linear model to the training dataset
fit <- partitions$training %>%
  ml_linear_regression(response = "mpg", features = c("wt", "cyl"))

For linear regression models produced by Spark, we can use summary() to learn a bit more about the quality of our fit, and the statistical significance of each of our predictors.

summary(fit)
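
For comparison, roughly the same model can be fit locally with base R's `lm()`. This is only a local analogue for intuition, not what sparklyr runs in the cluster, and it uses all filtered rows rather than the random training partition, so the estimates will differ from the Spark fit.

```r
# Local analogue of the Spark model above: mpg as a linear function of
# wt and cyl, restricted to cars with at least 100 horsepower.
local_data <- subset(mtcars, hp >= 100)
local_fit <- lm(mpg ~ wt + cyl, data = local_data)

# Coefficients, standard errors, and p-values for each predictor.
print(summary(local_fit))
```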

Spark machine learning supports a wide array of algorithms and feature transformations, and as illustrated above it’s easy to chain these functions together with dplyr pipelines. To learn more see the Spark MLlib section of the sparklyr website.

H2O Sparkling Water

Let’s walk through the same mtcars example, but in this case use H2O’s machine learning algorithms via the H2O Sparkling Water extension. The dplyr code used to prepare the data is the same, but after partitioning into test and training data we call h2o.glm rather than ml_linear_regression:

# load the Sparkling Water bridge and the H2O client
library(rsparkling)
library(h2o)

# convert to h2o_frame (uses the same underlying rdd)
training <- as_h2o_frame(partitions$training)
test <- as_h2o_frame(partitions$test)

# fit a linear model to the training dataset
fit <- h2o.glm(x = c("wt", "cyl"),
               y = "mpg",
               training_frame = training,
               lambda_search = TRUE)

# inspect the model
print(fit)

For linear regression models produced by H2O, we can use either print() or summary() to learn a bit more about the quality of our fit. The summary() method returns some extra information about scoring history and variable importance.

To learn more see the H2O Sparkling Water section of the sparklyr website.

Extensions

The facilities used internally by sparklyr for its dplyr and machine learning interfaces are available to extension packages. Since Spark is a general purpose cluster computing system there are many potential applications for extensions (e.g. interfaces to custom machine learning pipelines, interfaces to 3rd party Spark packages, etc.).
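
As a sketch of what such an extension can look like, the function below calls the Spark API directly through sparklyr's `invoke()` interface to count the lines in a text file. The function name `count_lines()` is illustrative, and it assumes an open connection `sc` and a path that the cluster can read.

```r
# Illustrative extension function: call SparkContext.textFile(path, 1) and
# then count() on the result, both through sparklyr's invoke() interface.
count_lines <- function(sc, path) {
  ctx <- sparklyr::spark_context(sc)
  lines <- sparklyr::invoke(ctx, "textFile", path, 1L)
  sparklyr::invoke(lines, "count")
}

# Usage, given an open connection `sc`:
# count_lines(sc, "path/to/file.txt")
```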

The sas7bdat extension enables parallel reading of SAS datasets in the sas7bdat format into Spark data frames. The rsparkling extension provides a bridge between sparklyr and H2O’s Sparkling Water.

We’re excited to see what other sparklyr extensions the R community creates. To learn more see the Extensions section of the sparklyr website.

RStudio IDE

The latest RStudio Preview Release of the RStudio IDE includes integrated support for Spark and the sparklyr package, including tools for:

  • Creating and managing Spark connections
  • Browsing the tables and columns of Spark DataFrames
  • Previewing the first 1,000 rows of Spark DataFrames

Once you’ve installed the sparklyr package, you should find a new Spark pane within the IDE. This pane includes a New Connection dialog which can be used to make connections to local or remote Spark instances:

Once you’ve connected to Spark you’ll be able to browse the tables contained within the Spark cluster:

The Spark DataFrame preview uses the standard RStudio data viewer:

The RStudio IDE features for sparklyr are available now as part of the RStudio Preview Release. The final version of RStudio IDE that includes integrated support for sparklyr will ship within the next few weeks.

Partners

We’re very pleased to be joined in this announcement by IBM, Cloudera, and H2O, who are working with us to ensure that sparklyr meets the requirements of enterprise customers and is easy to integrate with current and future deployments of Spark.

IBM

“With our latest contributions to Apache Spark and the release of sparklyr, we continue to emphasize R as a primary data science language within the Spark community. Additionally, we are making plans to include sparklyr in Data Science Experience to provide the tools data scientists are comfortable with to help them bring business-changing insights to their companies faster,” said Ritika Gunnar, vice president of Offering Management, IBM Analytics.

Cloudera

“At Cloudera, data science is one of the most popular use cases we see for Apache Spark as a core part of the Apache Hadoop ecosystem, yet the lack of a compelling R experience has limited data scientists’ access to available data and compute,” said Charles Zedlewski, vice president, Products at Cloudera. “We are excited to partner with RStudio to help bring sparklyr to the enterprise, so that data scientists and IT teams alike can get more value from their existing skills and infrastructure, all with the security, governance, and management our customers expect.”

H2O

“At H2O.ai, we’ve been focused on bringing the best of breed open source machine learning to data scientists working in R & Python. However, the lack of robust tooling in the R ecosystem for interfacing with Apache Spark has made it difficult for the R community to take advantage of the distributed data processing capabilities of Apache Spark.

We’re excited to work with RStudio to bring the ease of use of dplyr and the distributed machine learning algorithms from H2O’s Sparkling Water to the R community via the sparklyr & rsparkling packages”

The JSM conference in Chicago, July 31 through August 4, 2016, is one of the largest to be found on statistics, with many terrific talks for R users. We’ve listed some of the sessions that we’re particularly excited about below. These include talks from RStudio employees, like Hadley Wickham, Yihui Xie, Mine Cetinkaya-Rundel, Garrett Grolemund, and Joe Cheng, but also include a bunch of other talks about R that we think look interesting.

When you’re not in one of the sessions below, please visit us in the exhibition area, booth #126-128. We’ll have copies of all our cheat sheets and stickers, and it’s a great place to learn about the other stuff we’ve been working on lately: from sparklyr and R Markdown Notebooks to the latest in RStudio Server Pro, Shiny Server Pro, shinyapps.io, RStudio Connect (beta) and more!

Another great place to chat with people interested in R is the Statistical Computing and Graphics Mixer at 6pm on Monday in the Hilton Stevens Salon A4. It’s advertised as a business meeting in the program, but don’t let that put you off – it’s open to all.

SUNDAY

Session 21: Statistical Computing and Graphics Student Awards
Sunday, July 31, 2016 : 2:00 PM to 3:50 PM, CC-W175b

Session 47 Making the Most of R Tools
Hadley Wickham, RStudio (Discussant)
Sunday, July 31, 2016: 4:00 PM to 4:50 PM, CC-W183b

Thinking with Data Using R and RStudio: Powerful Idioms for Analysts
Nicholas Jon Horton, Amherst College; Randall Pruim, Calvin College; Daniel Kaplan, Macalester College
Transform Your Workflow and Deliverables with Shiny and R Markdown
Garrett Grolemund, RStudio

Session 54 Recent Advances in Information Visualization
Yihui Xie, RStudio (organizer)
Sunday, July 31, 2016: 4:00 PM to 4:50 PM, CC-W183c

Session 85 Reproducibility Promotes Transparency, Efficiency, and Aesthetics
Richard Schwinn
Sunday, July 31, 2016 : 5:35 PM to 5:50 PM, CC-W176a

Session 88 Communicate Better with R, R Markdown, and Shiny
Garrett Grolemund, RStudio (Poster Session)
Sunday, July 31, 2016: 6:00 PM to 8:00 PM, CC-Hall F1 West

MONDAY

Session 106 Linked Brushing in R
Hadley Wickham, RStudio
Monday, August 1, 2016 : 8:35 AM to 8:55 AM, CC-W196b

Session 127 R Tools for Statistical Computing
Monday, August 1, 2016 : 8:30 AM to 10:20 AM, CC-W196c

8:35 AM The Biglasso Package: Extending Lasso Model Fitting to Big Data in R — Yaohui Zeng, University of Iowa ; Patrick Breheny, University of Iowa
8:50 AM Independent Sampling for a Spatial Model with Incomplete Data — Harsimran Somal, University of Iowa ; Mary Kathryn Cowles, University of Iowa
9:05 AM Introduction to the TextmineR Package for R — Thomas Jones, Impact Research
9:20 AM Vector-Generalized Time Series Models — Victor Miranda Soberanis, University of Auckland ; Thomas Yee, University of Auckland
9:35 AM New Computational Approaches to Large/Complex Mixed Effects Models — Norman Matloff, University of California at Davis
9:50 AM Broom: An R Package for Converting Statistical Modeling Objects Into Tidy Data Frames — David G. Robinson, Stack Overflow
10:05 AM Exact Parametric and Nonparametric Likelihood-Ratio Tests for Two-Sample Comparisons — Yang Zhao, SUNY Buffalo ; Albert Vexler, SUNY Buffalo ; Alan Hutson, SUNY Buffalo ; Xiwei Chen, SUNY Buffalo

Session 270 Automated Analytics and Data Dashboards for Evaluating the Impacts of Educational Technologies
Daniel Stanhope, Joyce Yu, and Karly Rectanus
Monday, August 1, 2016 : 3:05 PM to 3:50 PM, CC-Hall F1 West

TUESDAY

Session 276 Statistical Tools for Clinical Neuroimaging
Ciprian Crainiceanu
Tuesday, August 2, 2016 : 7:00 AM to 8:15 AM, CC-W375a

Session 332 Doing More with Data in and Outside the Undergraduate Classroom
Mine Cetinkaya-Rundel, Duke University (organizer)
Tuesday, August 2, 2016 : 10:30 AM to 12:20 PM, CC-W184bc

Session 407 Interactive Visualizations and Web Applications for Analytics
Tuesday, August 2, 2016 : 2:00 PM to 3:50 PM, CC-W179a

2:05 PM Radiant: A Platform-Independent Browser-Based Interface for Business Analytics in R — Vincent Nijs, Rady School of Management
2:20 PM Rbokeh: An R Interface to the Bokeh Plotting Library — Ryan Hafen, Hafen Consulting
2:35 PM Composable Linked Interactive Visualizations in R with Htmlwidgets and Shiny — Joseph Cheng, RStudio
2:50 PM Papayar: A Better Interactive Neuroimage Plotter in R — John Muschelli, The Johns Hopkins University
3:05 PM Interactive and Dynamic Web-Based Graphics for Data Analysis — Carson Sievert, Iowa State University
3:20 PM HTML Widgets: Interactive Visualizations from R Made Easy! — Yihui Xie, RStudio ; Ramnath Vaidyanathan, Alteryx

WEDNESDAY

Session 475 Steps Toward Reproducible Research
Yihui Xie, RStudio (Discussant)
Wednesday, August 3, 2016 : 8:30 AM to 10:20 AM, CC-W196c

8:35 AM Reproducibility for All and Our Love/Hate Relationship with Spreadsheets — Jennifer Bryan, University of British Columbia
8:55 AM Steps Toward Reproducible Research — Karl W. Broman, University of Wisconsin – Madison
9:15 AM Enough with Trickle-Down Reproducibility: Scientists, Open This Gate! Scientists, Tear Down This Wall! — Karthik Ram, University of California at Berkeley
9:35 AM Integrating Reproducibility into the Undergraduate Statistics Curriculum — Mine Cetinkaya-Rundel, Duke University

Session 581 Mining Text in R
David Marchette, Naval Surface Warfare Center
Wednesday, August 3, 2016 : 2:05 PM to 2:40 PM, CC-W180

THURSDAY

Session 696 Statistics for Social Good
Hadley Wickham, RStudio (Chair)
Thursday, August 4, 2016 : 10:30 AM to 12:20 PM, CC-W179a

Session 694 Web Application Teaching Tools for Statistics Using R and Shiny
Jimmy Doi, Gail Potter, Jimmy Wong, Irvin Alcaraz, and Peter Chi
Thursday, August 4, 2016 : 11:05 AM to 11:20 AM, CC-W192a

Following our first Shiny Developer Conference this past January, which sold out in a few days, RStudio is very excited to announce a new and bigger conference today!

rstudio::conf, the conference about all things R and RStudio, will take place January 13 and 14, 2017 in Orlando, Florida. The conference will feature talks and tutorials from popular RStudio data scientists and developers like Hadley Wickham, Yihui Xie, Joe Cheng, Winston Chang, Garrett Grolemund, and J.J. Allaire, along with lightning talks from RStudio partners and customers.

Preceding the conference, on January 11 and 12, RStudio will offer two days of optional training. Training attendees can choose from Hadley Wickham’s Master R training, a new Intermediate Shiny workshop from Shiny creator Joe Cheng, or a new workshop from Garrett Grolemund that is based on his soon-to-be-published book with Hadley: Introduction to Data Science with R.

rstudio::conf is for R and RStudio users who want to learn how to write better Shiny applications, explore all the new capabilities of the R Markdown authoring framework, apply R to big data and work effectively with Spark, understand the RStudio toolchain for data science with R, discover best practices and tips for coding with RStudio, and investigate enterprise-scale development and deployment practices and tools, including the new RStudio Connect.

Not to be missed: RStudio has also reserved Universal Studios’ The Wizarding World of Harry Potter on Friday night, January 13, for the exclusive use of conference attendees!

Conference attendance is limited to 400. Training is limited to 70 students for each of the three 2-day workshops. All seats are available on a first-come, first-served basis.

Please go to http://www.rstudio.com/conference to purchase.

We hope to see you in Florida at rstudio::conf 2017!

For questions or issues registering, please email conf@rstudio.com. To ask about sponsorship opportunities, contact anne@rstudio.com.