Exercises: Handling R
From BITS wiki
The first session: basic handling
- Start R.
- Load the pre-installed dataset rivers by typing data(rivers) at the command line. Use the function call ls() to verify that the dataset has been loaded.
- Type ?rivers to get some information on the data.
- Look at the values in the dataset by typing rivers at the command line. Is the dataset sorted?
- Calculate the mean, median, quartiles, and the smallest and largest values through the command summary(rivers).
- Draw a histogram of the data via hist(rivers).
- Save the plot as a file by right-clicking and selecting XXXX from the pop-up menu. Make sure to remember or write down the name and directory of the plot file.
- Save the dataset as a binary data file by typing save(rivers, file="myRivers.RData").
- Quit R without saving the workspace image.
Congratulations! You have finished your first R session!
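For reference, the steps of this first session can also be run as plain commands. This is only a sketch: the file name riversHist.png is an assumption made for illustration, and png()/dev.off() are used here as a command-line alternative to saving the plot through the pop-up menu.

```r
# Load the built-in dataset into the workspace
data(rivers)

# Numerical summary: minimum, quartiles, median, mean, maximum
print(summary(rivers))

# is.unsorted() returns TRUE when the values are not in increasing order
print(is.unsorted(rivers))

# Write the histogram to a PNG file in the current working directory
# (the file name is hypothetical; choose any name you like)
png("riversHist.png")
hist(rivers)
dev.off()

# Save the dataset as a binary data file
save(rivers, file = "myRivers.RData")
```

Both files end up in the current working directory, which getwd() reports.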
Re-loading and modifying data
- Start R again.
- Get the name of the current working directory by typing getwd() at the command line.
- Get the content of the current working directory by typing dir() at the command line. You should see the plot file and the file myRivers.RData that you generated previously.
- Load the previously stored data file through the command load("myRivers.RData").
- Use the function ls() to check that the data was loaded, and use the function summary to re-calculate the mean, median etc.
- The lengths in rivers are given in miles. To compute the lengths in km and assign the converted lengths to the object riversKm, use the command riversKm = rivers*1.609344. Display the new object and calculate a numerical summary of the converted lengths as before.
- Draw a boxplot of the converted data using the command boxplot(riversKm). How many rivers are longer than 1000 km?
- Save the plot as before.
- From the boxplot and the earlier histogram, the river lengths appear to be heavily skewed. In this situation, a logarithmic transformation (log transform) is often useful in making the data more symmetrical. Apply the function log10 to calculate the base-10 logarithm of the river lengths, and save the transformed lengths as object logRivers.
- Draw a histogram of the log-transformed river lengths. Has the skewness been reduced? Save the histogram to a plot file as before.
- Use ls() to verify that there are currently three objects in your workspace.
- Use the command save to save all three objects to the same RData file as before (which is thereby overwritten).
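The conversion and transformation steps above might look as follows when typed in one go. It is a sketch under two assumptions: the first three lines recreate myRivers.RData so that the example runs on its own, and the logarithm is taken of the lengths in miles (the exercise does not say which unit to use).

```r
# Recreate the file from the first session so this sketch is self-contained
data(rivers)
save(rivers, file = "myRivers.RData")
rm(rivers)

# Restore the object 'rivers' from the binary data file
load("myRivers.RData")

# rivers is recorded in miles; 1 mile = 1.609344 km, so multiply
riversKm <- rivers * 1.609344
print(summary(riversKm))
boxplot(riversKm)

# A comparison yields a logical vector; sum() counts its TRUE values
print(sum(riversKm > 1000))

# Base-10 logarithm of the lengths (in miles, by assumption)
logRivers <- log10(rivers)
hist(logRivers)

# Save all three objects, overwriting the earlier file
save(rivers, riversKm, logRivers, file = "myRivers.RData")
```

At the top level, riversKm <- ... and riversKm = ... do the same thing; the arrow is simply the more common style in R code.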
Getting more help
- Start the HTML help by typing help.start() at the command line.
- Find out which vignettes are available by typing vignette(). Open one that you think sounds interesting, using vignette("name of vignette"). Were you right?
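The vignette list can also be inspected as data rather than read off the screen. The sketch below assumes only that vignette() returns an object with a results matrix, as it does in current R versions; the variable names v and vigs are chosen for illustration, and which vignettes appear depends on the packages installed on your machine.

```r
# vignette() without arguments returns a listing object; its $results
# matrix has one row per available vignette
v <- vignette()
vigs <- v$results[, c("Package", "Item", "Title"), drop = FALSE]
print(head(vigs))

# Open the first listed vignette, but only in an interactive session
# and only if at least one vignette is installed
if (interactive() && nrow(vigs) > 0) {
  print(vignette(vigs[1, "Item"], package = vigs[1, "Package"]))
}
```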
Load and install packages in R
- Install the package DAAGbio through the menu system.
- Load the package, either through the menu system or through library(DAAGbio) at the command line.
- From the help system, find out what you can about the data frame coralTargets.
- Load the data frame through data(coralTargets) and verify that it agrees with the help information.
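A command-line version of these steps could look like the sketch below. It deliberately does not install anything: if DAAGbio is missing it only prints a hint. install.packages("DAAGbio") is the usual command-line counterpart of the menu entry, though depending on your setup the package's dependencies may need additional repositories.

```r
# Check whether the package is available before trying to load it
if (requireNamespace("DAAGbio", quietly = TRUE)) {
  library(DAAGbio)

  # Load the data frame and inspect its structure; compare the output
  # with the description shown by help("coralTargets")
  data(coralTargets)
  str(coralTargets)
} else {
  message("DAAGbio is not installed; install it first, ",
          "e.g. via the menu system or install.packages(\"DAAGbio\")")
}
```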
- Rivers01.png: Histogram of rivers
- Rivers02.png: Boxplot of riversKm
- Rivers03.png: Histogram of logRivers