R introduction

From BITS wiki

Slides R part I training

Slides R part II training

Slides

Demo scripts

  • R script for chapter 4: Graphs
  • R script for chapter 5: Assumptions
  • R script for chapter 6: Correlations
  • R script for chapter 7: Regression
  • R script for chapter 8: Regression extended
  • R script for chapter 9: Comparing two means
  • R script for chapter 10: Generalised linear models
  • R script for chapter 12: Factorial selection
  • R script for chapter 13: Randomized complete block designs
  • R script for chapter 15: Nonparametric tests
  • R script for chapter 18: Categorical variables

Exercise scripts

  • R script for chapter 4: Graphs
  • R script for chapter 5: Assumptions
  • R script for chapter 6: Correlations
  • R script for chapter 7: Regression
  • R script for chapter 9: Comparing two means
  • R script for chapter 10: Generalised linear models
  • R script for chapter 11
  • R script for chapter 12: Factorial selection
  • R script for chapter 13: Randomized complete block designs
  • R script for chapter 15: Nonparametric tests
  • R script for chapter 18: Categorical variables

Solutions to exercises

Files R part I training

Data sets

R scripts

Files Experiment Design training

Files R part II training

On this page you can find an introduction to R, the statistical programming language.

We will assume that you're working in RStudio, although most of the things we show will also work in the R editor.

RStudio

Although you can work directly in the R editor, most people find it easier to use RStudio on top of R. RStudio is free, separate software that makes R more user-friendly. It is essentially a graphical user interface for R.

Check out this video tutorial on the RStudio user interface.

Extra info and tips:

  • Features of the console that make life easier, e.g. retrieving previous commands…
  • How to search the History
  • Find and replace can be opened using Ctrl+F.
  • RStudio supports the automatic completion of words using the Tab key.
    For example, if you have an object named relfreq (relative frequencies) in your workspace, you can type r and then Tab and RStudio will automatically show a list of possibilities to complete the full name of the object.


Installation

Install R (and RStudio)

R is available at the CRAN website. Upon choosing a CRAN mirror, you can download R.
R is available for Linux, Mac and Windows.

You can download RStudio from the RStudio website.

Install R packages

Check out this video tutorial on installing packages in R.

R owes much of its popularity to the enormous diversity of packages that are available for it. A package is a collection of R functions for a certain type of analysis, e.g. Matrix is a package that contains the R code you need for creating and working with matrices. R packages are available from the CRAN and Bioconductor websites.
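As a minimal sketch of installing from CRAN, the snippet below installs a package only when it is not yet present. The helper name install_if_missing and the mirror URL are illustrative assumptions, not part of the training material.

```r
# Hypothetical helper: install a CRAN package only if it is not yet installed.
# The default mirror is an assumption; any CRAN mirror will do.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report whether the package is available after the attempt.
  requireNamespace(pkg, quietly = TRUE)
}

# Matrix ships with most R installations, so this usually installs nothing.
matrix_ok <- install_if_missing("Matrix")
print(matrix_ok)
```

Bioconductor packages are not installed this way: you first install the BiocManager package from CRAN and then call BiocManager::install() with the package name.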


Issues with installation of R/R packages

Mac: error message: Setting LC_CTYPE failed, using “C” during package installation

  • Close RStudio
  • Open Terminal
  • Type:
    defaults write org.R-project.R force.LANG en_US.UTF-8
  • Close Terminal
  • Start RStudio and retry installation


Creating a project in RStudio

An R project is a folder in which all your work on one project (e.g. a chapter in your PhD dissertation) is gathered. Projects help you to stay organized. When you open a project, RStudio sets the working directory to this folder, so R loads data and scripts from it and saves results to it.
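The effect of a project folder can be sketched with the working directory, which is where R resolves relative file names. In this example a temporary folder stands in for the project folder; the names my_project and first_rows.csv are hypothetical.

```r
# A temporary folder stands in for the project folder (hypothetical name).
project_dir <- file.path(tempdir(), "my_project")
dir.create(project_dir, showWarnings = FALSE)

# RStudio does this step for you when you open a project.
old_wd <- setwd(project_dir)
print(getwd())

# Relative file names now resolve inside the project folder.
write.csv(head(mtcars), "first_rows.csv")
first_rows <- read.csv("first_rows.csv", row.names = 1)
print(dim(first_rows))

# Restore the previous working directory.
setwd(old_wd)
```

Inside a real project you would normally not call setwd() yourself, since the project already fixes the working directory.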


Creating and running scripts

A script is a text file that contains all the commands you will use. You can not only write and run scripts but also save them, so that the next time you need to do a similar analysis you can change and re-run the script with minimal effort. An R project can contain multiple scripts.

Commands

Scripts consist of a list of commands. Commands in R have a certain format:

output <- method(list of arguments)

Alternatively, you may use the following format:

output = method(list of arguments)

For example:

p = ggplot(mtcars, aes(wt, mpg))

In this example ggplot() is the method, also called a function. It generates a plot, so the plot p is the output of the function. Before a function can start the actions and calculations that it encodes, it needs prior information: input data and parameter settings. These are called the arguments of the function. In this example the arguments are:

  • mtcars: a data frame with 11 columns containing the input data
  • aes(wt, mpg): defines the columns you want to plot (wt and mpg) and how you want to plot them (wt on the X-axis and mpg on the Y-axis)
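The same command format can be tried without installing anything, using functions that come with R. The sketch below uses the built-in mtcars data; the variable names avg_mpg, n_columns and rounded_mpg are chosen for illustration only.

```r
# output <- function(arguments): store the result of mean() in avg_mpg.
avg_mpg <- mean(mtcars$mpg)

# The = operator works the same way for assignment at the top level.
n_columns = ncol(mtcars)

# Arguments can be passed by name; digits is an argument of round().
rounded_mpg <- round(avg_mpg, digits = 1)

print(avg_mpg)
print(n_columns)
print(rounded_mpg)
```

Most R style guides prefer <- for assignment and reserve = for naming arguments inside a function call, as in digits = 1.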

Creating a new script


Loading packages in R

You only need to install a package once, but you have to load it (activate its functions) in every R session in which you want to use it.
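A minimal sketch of loading a package, assuming the Matrix package is already installed (it ships with most R installations). The object names m and d are illustrative.

```r
# Attach the package so its functions can be called directly.
library(Matrix)

# Matrix() is a function from the Matrix package.
m <- Matrix(c(2, 0, 0, 3), nrow = 2)
print(m)

# Alternative: call a single function without attaching the whole package.
d <- Matrix::Diagonal(3)
print(dim(d))
```

The package::function form is handy when you need only one function, or when two loaded packages define a function with the same name.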


Getting help / documentation in R

You can find a lot of documentation online:

  • The documentation section of the R website
    Unfortunately this section is neither easily accessible nor well structured, and it can be quite a challenge to consult the help files of different R packages and functions online. By far the most user-friendly interface for searching the R documentation is the Rdocumentation website.
  • Documentation of RStudio
  • Quick R: for those who would like to make the transition to R (from SAS, SPSS, Stata)
  • R-bloggers: R-news and tutorials contributed by bloggers
  • Inside-R: a website created by people who are using R containing examples, how-to’s, packages…

Some R commands allow you to consult the documentation:

You can also ask for information on specific topics, e.g. find out what arguments a method needs...
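As a sketch of how the documentation can be consulted from within R, the snippet below uses the standard help functions. The functions mean() and sd() and the search term are arbitrary examples.

```r
# Help pages open in a viewer, so only request them in an interactive session.
if (interactive()) {
  print(help("mean"))                # same as typing ?mean at the console
  print(help.search("correlation"))  # same as typing ??correlation
}

# args() shows the arguments a function accepts.
print(args(sd))

# formals() returns the arguments and their defaults as a list.
sd_arguments <- names(formals(sd))
print(sd_arguments)
```

At the console you would usually just type ?mean, or press F1 in RStudio with the cursor on a function name.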


Data types in R

A data frame contains several elements. It essentially has a matrix structure but some of the columns of the data frame might be matrices themselves. The different elements are allowed to contain different data types.

All columns in a matrix must have the same data type and length.
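The difference between the two structures can be sketched as follows. The data are made up for illustration, and the object names patients, numbers and mixed are hypothetical.

```r
# A data frame can hold a different data type in each column.
patients <- data.frame(
  name    = c("A", "B", "C"),
  weight  = c(61.5, 72.0, 80.3),
  treated = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE  # keep text as character in older R versions too
)
print(sapply(patients, class))

# A matrix holds a single data type.
numbers <- matrix(1:6, nrow = 2)
print(typeof(numbers))

# Mixing numbers and text in a matrix converts everything to character.
mixed <- cbind(c(1, 2), c("x", "y"))
print(typeof(mixed))
```

Calling str() on an object is a quick way to check which structure and data types you are dealing with.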