Spring 2015

Introducing R

  • R is a language and environment for statistical computing and graphics
  • Freely available and maintained by volunteers
  • R is extensible; can be expanded by installing "packages" or by creating your own "functions"
  • Frequently used with RStudio, a free IDE (integrated development environment) for R

How to get R and RStudio

R: http://cran.r-project.org/ (or better yet, Google “Download R”)

RStudio: http://www.rstudio.com/

Both are available for Windows, Mac, and Linux and free to install (no catches).

If you install R and RStudio, then you only need to run RStudio.

Using R

  • command-line driven (very little point-and-click)
  • use "functions" to work with data
  • can work interactively (one function at a time) or write a script to submit multiple functions at once
  • requires time and effort to learn, but RStudio makes the process easier

R basics - functions

R uses functions to perform operations.

Functions take arguments to specify how operations are performed.

Example: read.csv(file="scores.csv")

read.csv is a function to import a CSV file; file is an argument.

R has many functions. Fortunately about 20% of them do about 80% of the work.

R basics - assignment

We often need to save a function's result or output. For this we use the assignment operator: <-

For example, when you import a CSV file you need to give it a name:

scoreData <- read.csv(file="scores.csv")

Now we can use scoreData as an argument to other functions. For example: summary(scoreData) which computes summaries for each column in the data.

Note: Use Alt + - in RStudio to quickly insert <-. You can also use = for assignment.

R basics - objects

R creates and manipulates objects. Examples of objects are vectors, data frames, matrices and lists.

You can create as many objects as memory allows.

create vector of character strings:
gender <- c("Male","Female","Male")

create vector of numbers:
x <- seq(0, 10, by=2)

create data frame:
df <- read.csv("scores.csv")

R basics - packages

All functions belong to packages. R comes with about 25 packages, but as of January 2015, there are over 6200 user-contributed packages!

To use functions in a package, the package must be installed and loaded.

You only install a package once. You load a package whenever you want to use its functions.

Examples of packages that come with R:

  • The stats package contains statistical functions
  • The foreign package contains functions for reading data from other statistical programs
  • The graphics package contains functions for creating graphics

R in a Nutshell

Use functions in various packages to create and manipulate objects.

Running R

When you have R and RStudio installed, you only need to run RStudio.

RStudio displays four panels:

  1. Script
  2. Console
  3. Environment (what's in memory)
  4. Help/Plots

These can be rearranged. (Tools…Global Options…Pane Layout)

Creating an R Script

You can use R interactively, entering one line at a time in the console. But you'll eventually want to save that code and be able to run it again, or at least review it.

An R script is just a text file with a .R extension.

Ctrl + Shift + N will start a new script on a PC. Command + Shift + N will start a new script on a Mac.

Be sure you know how to start and save an R script. I will ask you to turn in R scripts for assignments.

Submitting code from a script

To submit code from a script to the console, place your cursor in the line of code you want to submit and hit Ctrl + Enter, or Command + Enter on a Mac.

You can also highlight multiple lines of code and use the same keyboard shortcut to submit the code. Tip: Shift + down arrow will highlight lines

To switch from script to console, type Ctrl + 2

To switch from console to script, type Ctrl + 1

Comment your code

To add a comment to R code, enter a # followed by your comment. For example:

# create vector of proportions
pr <- c(0.2,0.3,0.4,0.1)

RStudio displays comments in green font.

To quickly comment/uncomment a block of text or code, highlight it and hit Ctrl+Shift+C, or Command+Shift+C on a Mac.

To re-flow comments (ie, automatically add line breaks), hit Ctrl+Shift+/, or Command+Shift+/ on a Mac

Your working directory

It's imperative you understand the concept of a working directory.

Your working directory is where R looks for input files, and where it sends output files (unless you specify a path in your code). For example, the code:

dat <- read.csv("file.csv")

looks in your working directory for file.csv. For the code to work, you need to set your working directory to be the directory where that file is located.

Setting your working directory - R code

The function to see your working directory: getwd()

The function to set your working directory is setwd. For example:

setwd("~/_clients/Allen/")

sets my working directory to the Allen folder, in the _clients folder, in my computer's home directory.

The ~ means my computer's home directory. On Windows, this is usually My Documents.

Setting your working directory - Completion

RStudio allows you to use Tab completion when working with directories and typing R code. Take advantage of this!

When setting your working directory:

  • Type setwd() (notice RStudio automatically adds the closing parens)
  • Type "" in the parens (notice RStudio automatically adds the closing double-quote)
  • Type ~/ in the double quotes and hit Tab; you should see a selection of available directories in your computer's home directory.
  • arrow up and down to find the desired directory and hit Tab.

Setting your working directory - RStudio

You can set your working directory via point-and-click in RStudio by hitting Ctrl + Shift + H.

This is fine for an exploratory session or following along in class. But when you write a script, you'll want to be able to have a line of code that sets your working directory for you.

In this class I will provide you data sets for lectures and assignments.

To follow along with lectures, you must know how to download the files and set your working directory to where you downloaded the files.

My suggestion is to create a folder for this course and then create a subfolder called data.

RStudio function completion

As I mentioned earlier, RStudio can auto-complete your typing.

Start typing a command and hit Tab. All functions in currently loaded packages that begin with the letters you typed will appear in a pick-list.

After you typed your command and opened a parens, hit Tab again. In most cases (not all) you will get a pick-list of arguments for that function.

Using Help

All R users occasionally need help. Every last one. Know how to use R Help.

help(function)
?function

A help page for a function has the name of the function at the top left next to the package it's in. For example:

abline {graphics}

The function abline is part of the graphics package.

Use the example() function to run the examples in a help page: example(abline)

Vignettes

Some packages come with vignettes, which are basically tutorials to help you get up and running with a package.

These will be available on the package's index (or home) page through the link User guides, package vignettes and other documentation.

If no vignette is available, check the package DESCRIPTION file. It will sometimes have a URL that serves as the home page for the package.

Other keyboard shortcuts

Insert section header: Ctrl + Shift + R

Run the current section: Ctrl + Alt + T

Clear the console: Ctrl + L

Quickly create an R Notebook of your code: Ctrl + Shift + K

More RStudio Keyboard Shortcuts:
Alt + Shift + K

Books of Note

  • R in Action, Robert I. Kabacoff

  • Data Manipulation using R, Phil Spector

  • R for Everyone, Jared Lander

  • R Cookbook, Paul Teetor

Web Sites to explore