If big data is your thing, you use R, and you’re headed to Strata + Hadoop World in San Jose March 13 & 14th, you can experience in person how easy and practical it is to analyze big data with R and Spark.
In a beginner-level talk by RStudio’s Edgar Ruiz and an intermediate-level workshop by Win-Vector’s John Mount, we cover the spectrum: what R is, what Spark is, how sparklyr works, and what is required to set up and tune a Spark cluster. You’ll also learn practical applications, including how to quickly set up a local Spark instance, store big data in Spark and then connect to the data with R, use R to apply machine-learning algorithms to big data stored in Spark, and filter and aggregate big data stored in Spark and then import the results into R for analysis and visualization.
2:40pm–3:20pm Wednesday, March 15, 2017
Sparklyr: An R interface for Apache Spark
Edgar Ruiz (RStudio)
Primary topic: Spark & beyond
Location: LL21 C/D
Secondary topics: R
1:30pm–5:00pm Tuesday, March 14, 2017
Modeling big data with R, sparklyr, and Apache Spark
John Mount (Win-Vector LLC)
Primary topic: Data science & advanced analytics
Location: LL21 C/D
Secondary topics: R
While you’re at the conference be sure to look us up in the Innovator’s Pavilion – booth number P8 during the Expo Hall hours. We’ll have the latest books from RStudio authors, t-shirts to win, demonstrations of RStudio Connect and RStudio Server Pro and, of course, stickers and cheatsheets. Share with us what you’re doing with RStudio and get your product and company questions answered by RStudio employees.
See you in San Jose! (https://conferences.oreilly.com/strata/strata-ca)
RStudio’s data viewer provides a quick way to look at the contents of data frames and other column-based data in your R environment. You invoke it by clicking on the grid icon in the Environment pane, or at the console by typing View(mydata).
No Row Limit
While the data viewer in 0.98 was limited to the first 1,000 rows, you can now view all the rows of your data set. RStudio loads just the portion of the data you’re looking at into the user interface, so things won’t get sluggish even when you’re working with large data sets.
We’ve also added fixed column headers, and support for column labels imported from SPSS and other systems.
Sorting and Filtering
RStudio isn’t designed to act like a spreadsheet, but sometimes it’s helpful to do a quick sort or filter to get some idea of the data’s characteristics before moving into reproducible data analysis. Towards that end, we’ve built some basic sorting and filtering into the new data viewer.
Click a column once to sort data in ascending order, and again to sort in descending order. For instance, how big is the biggest diamond?
To clear all sorts and filters on the data, click the upper-left column header.
Click the new Filter button to enter Filter mode, then click the white filter value box to filter a column. You might, for instance, want to look only at smaller diamonds:
Not all data types can be filtered; at the moment, you can filter only numeric types, characters, and factors.
You can also stack filters; for instance, let’s further restrict this view to small diamonds with a Very Good cut:
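Once a quick look in the viewer has told you what you want, the same sort and stacked filters can be written down as reproducible R. The following is a minimal sketch, not what the viewer runs internally: the `gems` data frame and its values are made up to stand in for the diamonds data, and the 0.5 carat cutoff for "small" is an assumption.

```r
# Hypothetical stand-in for the diamonds data; all values are made up.
gems <- data.frame(
  carat = c(0.23, 1.52, 0.31, 0.90, 0.29, 2.10),
  cut   = c("Ideal", "Premium", "Very Good", "Good", "Very Good", "Fair"),
  price = c(326, 9800, 420, 3100, 402, 15000),
  stringsAsFactors = FALSE
)

# Sort by carat in descending order (like clicking the column header twice).
by_size <- gems[order(gems$carat, decreasing = TRUE), ]
biggest <- by_size$carat[1]

# Stacked filters: small stones (assumed cutoff of 0.5 carat) with a Very Good cut.
small_very_good <- subset(gems, carat < 0.5 & cut == "Very Good")

print(biggest)
print(small_very_good)
```

Keeping these steps in a script means the exploration you did by clicking can be rerun later on fresh data.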
You can search the full text of your data frame using the new Search box in the upper right. This is useful for finding specific records; for instance, how many people named John were born in 2013?
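A rough scripted analogue of the Search box is sketched below. It assumes a case-sensitive substring match against every column read as text; the viewer's own matching rules may differ. The `births` data frame and the `search_rows` helper are hypothetical names introduced only for this illustration.

```r
# Hypothetical records; names and years are made up.
births <- data.frame(
  name = c("John", "Mary", "Johnny", "Ana", "John"),
  year = c(2013, 2013, 2012, 2013, 2011),
  stringsAsFactors = FALSE
)

# Return TRUE for each row in which any column, read as text, contains `term`.
search_rows <- function(df, term) {
  per_column <- lapply(df, function(col) {
    grepl(term, as.character(col), fixed = TRUE)
  })
  Reduce(`|`, per_column)
}

# Full-text style search: also picks up "Johnny" because it contains "John".
matches <- births[search_rows(births, "John"), ]

# Narrow down to an exact name and a birth year.
john_2013 <- subset(matches, name == "John" & year == 2013)

print(matches)
print(nrow(john_2013))
```

The text search is deliberately loose, so an exact comparison on the relevant columns is what answers the counting question.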
If you invoke the data viewer on a variable as in View(mydata), the data viewer will (in most cases) automatically refresh whenever data in the variable changes.
You can use this feature to watch data change as you manipulate it. It continues to work even when the data viewer is popped out, a configuration that combines well with multi-monitor setups.
We hope these improvements help you understand your data more quickly and easily. Try out the RStudio Preview Release and let us know what you think!
RStudio’s code editor includes a set of lightweight Vim key bindings. You can turn these on in Tools | Global Options | Code | Editing:
For those not familiar, Vim is a popular text editor built to enable efficient text editing. It can take some practice and dedication to master Vim-style editing, but those who have done so typically swear by it. RStudio’s “vim mode” enables the use of many of the most common keyboard operations from Vim right inside RStudio.
As part of the 0.99 preview release, we’ve included an upgraded version of the ACE editor, which has a completely revamped Vim mode. This mode extends the range of Vim key bindings that are supported, and implements a number of Vim “power features” that go beyond basic text motions and editing. These include:
- Vertical block selection via Ctrl + V. This integrates with the new multiple cursor support in ACE and allows you to type in multiple lines at once.
- Macro recording and playback.
- Marks, which allow you to drop markers in your source and jump back to them quickly later.
- A selection of Ex commands, such as :%s, that allow you to perform editor operations as you would in native Vim.
- Fast in-file search.
We’ve also added a Vim quick reference card to the IDE that you can bring up at any time to show the supported key bindings. To see it, switch your editor to Vim mode (as described above) and type :help in Command mode.
Whether you’re a Vim novice or power user, we hope these improvements make the RStudio IDE’s editor a more productive and enjoyable environment for you. You can try the new Vim features out now by downloading the RStudio Preview Release.
Sometimes the universe surprises us. In this case, it was in a good way and we genuinely appreciated it.
Earlier this week we learned that the InfoWorld Testing Center staff selected RStudio as one of 32 recipients of the 2015 Technology of the Year Award.
We thought it was cool because it was completely unsolicited, we’re in very good company (some of our favorite technologies, like Docker, GitHub, Node.js, and even my Dell XPS 15 Touch, were also award winners), and the description of our products was surprisingly elegant: simple and accurate.
We know InfoWorld wouldn’t have known about us if our customers hadn’t brought us to their attention.
Great news for Shiny and R Markdown enthusiasts!
An Interactive Reporting Workshop with Shiny and R Markdown is coming to a city near you. Act fast as only 20 seats are available for each workshop.
You can find out more / register by clicking on the link for your city!
| East Coast | West Coast |
| --- | --- |
| March 2 – Washington, DC | April 15 – Los Angeles, CA |
| March 4 – New York, NY | April 17 – San Francisco, CA |
| March 6 – Boston, MA | April 20 – Seattle, WA |
You’ll want to take this workshop if…
You have some experience working with R already. You should have written a number of functions, and be comfortable with R’s basic data structures (vectors, matrices, arrays, lists, and data frames).
You will learn from…
The workshop is taught by Garrett Grolemund. Garrett is the Editor-in-Chief of shiny.rstudio.com, the development center for the Shiny R package. He is also the author of Hands-On Programming with R as well as Data Science with R, a forthcoming book by O’Reilly Media. Garrett works as a Data Scientist and Chief Instructor for RStudio, Inc.
As R users know, we’re continuously improving the RStudio IDE. This includes RStudio Server Pro, where organizations that want to deploy the IDE at scale will find a growing, recently enhanced set of features.
If you’re not already familiar with RStudio Server Pro, here’s an updated summary page and a comparison to RStudio Server worth checking out. Or you can skip all of that and download a free 45-day evaluation right now!
WHAT’S NEW IN RSTUDIO SERVER PRO (v0.98.1091)
Naturally, the latest RStudio Server Pro has all of the new features found in the open source server version of the RStudio IDE. They include improvements to R Markdown document and Shiny app creation, making R package development easier, better debugging and source editing, and support for Internet Explorer 10 and 11 and RHEL 7.
Recently, we added even more powerful features exclusively for RStudio Server Pro:
- Load balancing based on factors you control. Load balancing ensures R users are automatically assigned to the best available server in a cluster.
- Flexible resource allocation by user or group. Now you can allocate cores, set scheduler priority, control the version(s) of R and enforce memory and CPU limits.
- New security enhancements. Leverage PAM to issue Kerberos tickets, move Google Accounts support to OAuth 2.0, and allow administrators to disable access to various features.
For a full, in-depth list of what’s changed, be sure to read the RStudio Server Pro admin guide.
THE RSTUDIO SERVER PRO BASICS
In addition to the newest features above, there are many more that make RStudio Server Pro an upgrade over the open source IDE. Here’s a quick list:
- An administrative dashboard that provides insight into active sessions, server health, and monitoring of system-wide and per-user performance and resources
- Authentication using system accounts, ActiveDirectory, LDAP, or Google Accounts
- Full support for the Pluggable Authentication Module (PAM)
- HTTP enhancements add support for SSL and keep-alive for improved performance
- Ability to restrict access to the server by IP
- Customizable server health checks
- Suspend, terminate, or assume control of user sessions for assistance and troubleshooting
That’s a lot to discover! Please download the newest version of RStudio Server Pro and as always let us know how it’s working and what else you’d like to see.
Are you headed to Strata? It’s just around the corner!
We particularly hope to see you at R Day on October 15, where we will cover a raft of current topics that analysts and R users need to pay attention to. The R Day tutorials come from Hadley Wickham, Winston Chang, Garrett Grolemund, J.J. Allaire, and Yihui Xie, who are all working on fascinating new ways to keep the R ecosystem abreast of the challenges facing those who work with data.
If you plan to stay for the full Strata Conference+Hadoop World be sure to look us up in the Innovator Pavilion booth P14 during the Expo Hall hours. We’ll have the latest books from RStudio authors and “shiny” t-shirts to win. Share with us what you’re doing with RStudio and get your product and company questions answered by RStudio employees.
See you in New York City!
Please join us for our popular Introduction to R course for data scientists and data analysts in San Francisco on April 28 and 29. This is a two-day workshop, designed to provide a comprehensive introduction to R that will have you analyzing and modeling data with R in no time. We will cover practical skills for visualizing, transforming, and modeling data in R. You will learn how to explore and understand data as well as how to build linear and non-linear models in R.
The course will be led by RStudio Master Instructor and author Dr. Garrett Grolemund.
We offer introductory R training only a few times a year. The Boston course in January sold out quickly. Space is limited. We encourage you to register (rstudio-sfbay.eventbrite.com) as soon as you can.
“The instructor was amazing. He knew so much and could answer any questions. His expertise was obvious and he was also very clear about how to explain it to a varied audience.” – Workshop Student, January 2014
“Very well organized and at a good pace. The example datasets were very helpful. Excellent teachers!” – Workshop Student, January 2014