R introduction
R part I training
Slides R part I training
- Slides for the course
- Tutorial with exercises
- Demo exercises of day1
- Demo exercises of day2
- Demo exercises of day3: plots
- Demo exercises of day3: statistics
- Solutions of the exercises of the first day of the training
- Solutions of the exercises of the second day of the training
- Solutions of the exercises of the third day of the training
- Script to calculate average of two vectors
- Script to calculate average of any number of vectors
- Script to calculate standard error of the mean
- Script to calculate confidence intervals
- R script to start the second day of the training
Files R part I training
Cheat sheets
Data sets
- Babies data set for graphing
- Heatmap data set
- Hormone data set
- GH levels in teens data set
- GH levels in adults data set
- qPCR data set
- DE genes from RNA-Seq experiment
- data on ALL patients
- Proteins data set
- data set with 4 variables
- Fly genetics data set
- T cell data set
- weight and height data set for the first group exercise
- Hormone concentrations for 2 groups of 9 patients for the second group exercise
- Hormone before and after data set for the second group exercise
- Hormone during activity data set for the third group exercise
- log10 transformed CNRQs of control samples for qPCR analysis and visualization in R
- log10 transformed CNRQs of treated samples for qPCR analysis and visualization in R
- Microarray data set
- RNA-Seq data set
- metadata for the RNA-Seq data set
R scripts
Exercises during the class
- R script to calculate the average of two vectors to complete
- R script to calculate the average of any number of vectors to complete
Extra exercises
- R script to complete for the first exercise: statistical analyses
- R script to complete for the second exercise: recap on graphs + a few new tricks
- R script to complete for the third exercise: advanced graphs
- R script with solutions for the first exercise: statistical analyses
- R script with solutions for the second exercise: recap on graphs + a few new tricks
- R script with solutions for the third exercise: advanced graphs
Specific analyses
- R script to complete for qPCR analysis
- R script to complete for microarray analysis, check our microarray wiki page for a detailed description of the workflow.
- R script to complete for bulk RNA-Seq analysis with DESeq2, check our RNA-Seq wiki page for more info
- R script with solutions for qPCR analysis
- R script with solutions for microarray analysis
- R script with solutions for bulk RNA-Seq analysis with DESeq2
- R script for bulk RNA-Seq analysis with edgeR
- R script for single cell RNA-Seq analysis with Seurat
- R script with extra functions for single cell RNA-Seq analysis with Seurat
- exercise on metagenomics analysis with vegan
Statistical thinking training
Files Statistical thinking training
Slides as ppt
Cases:
Experiment Design training
Files Experiment Design training
- slides as ppt
- data set for demo RSM
- data set for demo variance components
- Rmarkdown html with explanations of the exercises
- small R package in zip format
- small R package in tar.gz format
Software Experiment Design training
You need the latest version of R and RStudio and the following packages:
- reshape2
- tidyverse
- lme4
- AlgDesign
- agricolae
FlowSOM
R and RStudio
On this page you can find an introduction to R, the statistical programming language. We will assume that you're working in RStudio, although most of the things we show will also work in the R editor.
Although you can work directly in the R editor, most people find it easier to use RStudio on top of R. RStudio is free, separate software that makes R more user-friendly. It's essentially a graphical user interface for R.
Check out this video tutorial on the RStudio user interface.
Extra info and tips:
- Features of the console that make life easier e.g. retrieving previous commands…
- How to search the History
- Find and replace can be opened using Ctrl+F.
- RStudio supports the automatic completion of words using the Tab key.
For example, if you have an object named relfreq (relative frequencies) in your workspace, you can type r and then Tab and RStudio will automatically show a list of possibilities to complete the full name of the object.
Installation
Install R (and RStudio)
R is available at the CRAN website. Upon choosing a CRAN mirror, you can download R.
R is available for Linux, Mac and Windows.
You can download RStudio from the RStudio website.
Install R packages
Check out this video tutorial on installing packages in R.
Part of the reason that R is so popular is the enormous diversity of packages that are available for it. Packages are collections of R programs that perform a certain kind of analysis, e.g. Matrix is a package that contains R code for creating and working with matrices. R packages are available at the CRAN and Bioconductor websites.
How to install packages in R ? |
---|
Open R or RStudio as administrator and install packages as follows:
Once you start typing the name of a package, RStudio tries to autocomplete it:
Select the package you want to install (in this case ggplot2) and click the Install button.
|
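Packages can also be installed from the console or from a script with install.packages(). The lines below are a minimal sketch of that route, assuming you want ggplot2; missing_packages() is a hypothetical helper name introduced here for illustration, not a function that ships with R.

```r
# Return the names in `pkgs` that are not installed yet.
# `missing_packages` is a hypothetical helper, not part of R itself.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# "stats" ships with R, so only ggplot2 can show up as missing here.
to_install <- missing_packages(c("ggplot2", "stats"))
print(to_install)

# Installing needs an internet connection, so in this sketch it is
# only attempted in an interactive session.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling packages that are already present.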
Issues with installation of R/R packages
Mac: error message: Setting LC_CTYPE failed, using “C” during package installation
- Close RStudio
- Open Terminal
- Type:
defaults write org.R-project.R force.LANG en_US.UTF-8
- Close Terminal
- Start RStudio and retry installation
Creating a project in RStudio
An R project is a folder in RStudio where all your work on one project (e.g. a chapter in your PhD dissertation) is gathered. Projects help you to stay organized. When you open a project, R loads data and scripts from this folder and saves results to it.
How to create a project ? |
---|
Select New Project from the Project dropdown menu in the top right corner.
Next you have to specify if the project should reside in a new or in an existing directory on your computer.
|
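Behind the scenes, opening a project makes the project folder the working directory, so file names without a path refer to that folder. The sketch below imitates this with a temporary folder; the folder name my_project and the file name cars_subset.csv are made up for illustration.

```r
# A temporary folder stands in for a project folder (assumption).
project_dir <- file.path(tempdir(), "my_project")
dir.create(project_dir, showWarnings = FALSE)

# RStudio does this step for you when a project is opened.
old_wd <- setwd(project_dir)

# Relative file names now point inside the project folder.
write.csv(head(mtcars), "cars_subset.csv")
cars_subset <- read.csv("cars_subset.csv", row.names = 1)
saved_ok <- file.exists(file.path(project_dir, "cars_subset.csv"))

# Restore the previous working directory.
setwd(old_wd)
print(saved_ok)
```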
Creating and running scripts
A script is a text file that contains all the commands you will use. You can not only write and run scripts but also save them, so the next time you need to do a similar analysis you can change and re-run the script with minimal effort. An R project can contain multiple scripts.
Commands
Scripts consist of a list of commands. Commands in R have a certain format:
output <- method(list of arguments)
Alternatively, you may use the following format:
output = method(list of arguments)
For example:
p = ggplot(mtcars, aes(wt, mpg))
In this example ggplot() is the method. It generates a plot so the plot p is the output of the method. Before a function can start the actions and calculations that it encodes, it needs prior information: input data and parameter settings. These are called the arguments of the function. In this example the arguments are:
mtcars: a data frame consisting of 11 columns containing the input data
aes(wt,mpg): defines the columns you want to plot (wt and mpg) and how you want to plot them (wt on the X-axis and mpg on the Y-axis)
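The same command format can be tried without installing anything by using a function from base R. This is a small sketch with made-up numbers: mean() is the method, the vector and the na.rm setting are its arguments, and the variable names are chosen only for illustration.

```r
# Made-up input data for illustration; NA marks a missing value.
values <- c(2, 4, 6, NA)

# output <- method(list of arguments)
avg_arrow <- mean(values, na.rm = TRUE)

# output = method(list of arguments), with the arguments named explicitly
avg_equals = mean(x = values, na.rm = TRUE)

print(avg_arrow)
print(identical(avg_arrow, avg_equals))
```

Both assignment forms store the same result; na.rm = TRUE is a parameter setting that tells mean() to ignore the missing value.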
Creating a new script
How to create a new script in RStudio ? |
---|
Click File in the top menu: New File > R Script
|
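To give an idea of what a script looks like and how it runs as a whole, the sketch below writes a tiny script to a temporary file and runs it with source(). The file name and the vectors a and b are assumptions for illustration; in RStudio you would normally type the commands in the script editor and save the file in your project.

```r
# Write a small script to a temporary file (stand-in for a script saved from RStudio).
script_path <- file.path(tempdir(), "average_two_vectors.R")
writeLines(c(
  "# average of two vectors, element by element",
  "a <- c(1, 2, 3)",
  "b <- c(3, 4, 5)",
  "avg <- (a + b) / 2"
), script_path)

# source() runs every command in the script, from top to bottom.
source(script_path)
print(avg)
```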
Loading packages in R
You only need to install a package once. But each time you want to use a package you have to load it (activate its functions).
How to load packages in R ? |
---|
Loading a package is done by typing the following command directly in the console or as part of a script:
# load packages
library("packagename") |
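As a concrete sketch, the lines below load tools, a package that ships with R, so nothing has to be installed first. The file names are made up for illustration.

```r
# tools is part of every R installation.
library("tools")

# After loading, the functions of the package can be called by name.
ext <- file_ext("analysis.R")
print(ext)

# A single function can also be called without loading the package first.
ext2 <- tools::file_ext("counts.csv")
print(ext2)
```

The packagename::function() form is handy when you need only one function from a package.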
Getting help / documentation in R
You can find a lot of documentation online:
- The documentation section of the R website
Unfortunately this section is neither easily accessible nor well-structured, and it can be quite a challenge to consult the help files of different R packages and functions online. By far the most user-friendly interface for searching the R documentation is the Rdocumentation website.
- Documentation of RStudio
- Quick R: for those who would like to make the transition to R (from SAS, SPSS, Stata)
- R-bloggers: R-news and tutorials contributed by bloggers
- Inside-R: a website created by people who are using R containing examples, how-to’s, packages…
Some R commands allow you to consult the documentation:
How to show the documentation of a package ? |
---|
You can ask R to open a browser with a documentation file for a complete package:
help(package="packagename") |
You can also ask for information on specific topics, e.g. find out what arguments a method needs...
How to show the documentation of specific topics ? |
---|
You can ask for documentation of individual functions or topics:
help(topicname)
Try some examples to see what help() does:
help(help)
help(matrix)
help(print)
Alternatively you can use:
?topicname |
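If you only want to know which arguments a function takes, you can also inspect it directly from the console. This is a minimal sketch using matrix() as the example; the help calls are wrapped in interactive() because help pages are meant to be read in an interactive session.

```r
# Names of the arguments of matrix(), without opening the help page.
matrix_args <- names(formals(matrix))
print(matrix_args)

# Full signature, including default values.
print(args(matrix))

# Help pages are only opened in an interactive session.
if (interactive()) {
  help(matrix)
  ?print
  help(package = "stats")
}
```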
Data types in R
How to check which data type a variable belongs to ? |
---|
To know which class an object belongs to:
class(objectname) |
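A few quick calls illustrate what class() reports for common kinds of objects. The variable names and values are made up; mtcars is a data set that ships with R.

```r
measurement <- 3.14
count <- 5L          # the L suffix makes this an integer
label <- "qPCR"
flag <- TRUE

class(measurement)   # "numeric"
class(count)         # "integer"
class(label)         # "character"
class(flag)          # "logical"
class(mtcars)        # "data.frame"
```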
A data frame contains several elements. It essentially has a matrix structure but some of the columns of the data frame might be matrices themselves. The different elements are allowed to contain different data types.
How to list the elements of a data frame ? |
---|
To find out which elements are available for a data frame, use the following command:
names(dataframename) |
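The sketch below builds a small data frame whose columns hold different data types and then lists its elements. The patients data are invented for illustration.

```r
# Invented example data: each column has its own data type.
patients <- data.frame(
  id = c("p1", "p2", "p3"),
  age = c(34, 51, 46),
  treated = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Names of the elements (columns) of the data frame.
print(names(patients))

# Data type of each column.
column_types <- sapply(patients, class)
print(column_types)

# One element can be extracted by name with $.
print(patients$age)
```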
All columns in a matrix must have the same data type and length.
How to list the names of the rows or columns of a matrix ? |
---|
To list the row or column names of a matrix, use the following commands:
rownames(matrixname)
colnames(matrixname) |
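As a small sketch, the lines below create a named matrix with base R's matrix(), read its names back with rownames() and colnames(), and show what happens when values of different types are put into one matrix. The gene and sample names are made up for illustration.

```r
# A 2 x 3 matrix with invented row and column names.
expr <- matrix(1:6, nrow = 2,
               dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

print(rownames(expr))
print(colnames(expr))

# Mixing numbers and text: everything is converted to text,
# because all cells of a matrix must share one data type.
mixed <- matrix(c(1, "two", 3, "four"), nrow = 2)
print(typeof(mixed))
```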