How to create chip pseudo-images
Chip pseudo images in affy
Chip pseudo-images are very useful for detecting spatial artifacts on individual arrays; they are not meant for comparing arrays with each other.
Pseudo-images are generated by fitting a probe-level model (PLM) to the data. The model assumes that all probes of a probe set behave the same way across samples: probes that bind well to their target should do so on all arrays, and probes that bind with low affinity should do so on all arrays.
You can create pseudo-images based on the residuals or the weights that result from a comparison of the model (the ideal data, without any noise) to the actual data. These weights or residuals may be graphically displayed using the image() function in Bioconductor.
The model consists of a probe level (assuming that each probe should behave the same on all arrays) and an array level (taking into account that a gene can have different expression levels in different samples) parameter.
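As a rough sketch of the two-parameter model described above, and assuming the additive log2-scale form that is common for probe-level models (the symbols are illustrative and do not come from this tutorial), the intensity of probe i of one probe set on array j can be written as:

```latex
\begin{aligned}
\log_2 y_{ij} &= \alpha_i + \beta_j + \varepsilon_{ij} \\
r_{ij} &= \log_2 y_{ij} - \left(\hat{\alpha}_i + \hat{\beta}_j\right)
\end{aligned}
```

Here \alpha_i is the probe-level parameter (the same on every array), \beta_j is the array-level parameter (the expression level in sample j), \varepsilon_{ij} is noise, and r_{ij} is the residual that a pseudo-image displays. The exact model that is fitted depends on the settings of the fitting function.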
How to fit a probe-level model to the data?
The Bioconductor method for fitting probe-level models is fitPLM() from the affyPLM package. It takes an AffyBatch object as input, fits a probe-level model to the probe intensities, and returns a PLMset object. The PLMset object stores the weights and residuals obtained by comparing the model with the observed data.
library(affyPLM)
Pset = fitPLM(data)
The model that is used by default is the one that was described in the slides. You can use other models if you want. More info in the documentation:
?fitPLM
The method to compute the model is quite intensive, requiring a lot of resources. If you have a large data set, you might receive an out-of-memory error when fitting the model.
How to fit a probe-level model when you have a large number of arrays?
If you have a large data set, you can tell fitPLM() to skip some calculations that you are not going to use in your plots anyway, by setting varcov to "none" in the output.param list. This speeds up the calculations and helps to avoid out-of-memory errors.
Pset = fitPLM(data, output.param=list(varcov="none"))
Based on weights
Weights represent how much the original data contribute to the model: outliers are strongly downweighted because they differ so much from the ideal data. Weights have values between 0 and 1. So the smaller the weight of a probe
-> the less the probe shows the typical behavior that it shows on the other arrays
-> the more its intensity can be considered an outlier
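To make the idea of downweighting concrete, here is a minimal base-R sketch. It assumes a Huber-type weight function with tuning constant 1.345 and a made-up residual scale of 0.5; the function name huber_weight and the numbers are illustrative, and the weight function that fitPLM() actually applies depends on its settings (see ?fitPLM).

```r
# Illustrative Huber-type weight function (an assumption for this sketch,
# not necessarily the M-estimator used in your own fit).
huber_weight <- function(resid, scale, k = 1.345) {
  u <- abs(resid) / scale     # standardized absolute residual
  ifelse(u <= k, 1, k / u)    # full weight inside the band, shrinking outside
}

# Made-up residuals of one probe on six arrays; the fifth is an outlier.
resid <- c(0.10, -0.20, 0.05, 0.15, 3.00, -0.10)
w <- huber_weight(resid, scale = 0.5)
print(round(w, 3))
```

The five well-behaved values keep a weight of 1, while the outlier on the fifth array gets a weight well below 1, which is what shows up as a colored spot in a weight-based pseudo-image.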
How to create a pseudo-image based on weights?
For each of the 6 arrays we generate a figure containing the pseudo-image. By default, the image() method generates a pseudo-image based on the weights. The which argument specifies which array is drawn. If which is not specified, pseudo-images of all the arrays are drawn one after the other.
for (i in 1:6) {
  name = paste("pseudoimage",i,".jpg",sep="")
  jpeg(name)
  image(Pset,which=i,main=ph@data$sample[i])
  dev.off()
}
Again, this code was used to analyze an experiment containing 6 arrays. If you have a different number of samples, you have to adjust the numbers in the code, e.g. if you have 8 arrays:
for (i in 1:8) {
  name = paste("pseudoimage",i,".jpg",sep="")
  jpeg(name)
  image(Pset,which=i,main=ph@data$sample[i])
  dev.off()
}
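Instead of editing the loop bounds by hand, the number of arrays can be derived from the sample labels. The sketch below is one possible way to do that; pseudoimage_file() and save_pseudoimages() are hypothetical helper names (not part of affyPLM or oligo), and the sketch assumes that labels holds one entry per array in the same order as in Pset.

```r
# Hypothetical helpers, not part of affyPLM or oligo.

# Build the output file name(s) for array index/indices i.
pseudoimage_file <- function(i, prefix = "pseudoimage") {
  paste0(prefix, i, ".jpg")
}

# Draw one pseudo-image per label and save each to its own JPEG file.
# Extra arguments (e.g. type) are passed through to image().
save_pseudoimages <- function(Pset, labels, ...) {
  files <- pseudoimage_file(seq_along(labels))
  for (i in seq_along(labels)) {
    jpeg(files[i])
    # Close the device even if image() fails, so no device is left open.
    tryCatch(image(Pset, which = i, main = labels[i], ...),
             finally = dev.off())
  }
  invisible(files)
}
```

With the objects from this tutorial, a call such as save_pseudoimages(Pset, ph@data$sample) would then work for 6, 8 or any other number of arrays.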
In the figure below, all pseudoimages were plotted on a single figure. Again, this only works for small arrays like the ATH arrays.
Small weights (outliers) are indicated in green on the figure.
If you have small arrays and you want to plot all pseudoimages on a single figure, use the following code:
op = par(mfrow = c(2,3))
for (i in 1:6){image(Pset,which=i,main=ph@data$sample[i])}
par(op)  # restore the previous plot layout
Based on residuals
Residuals are the second quantity used for chip pseudo-images. They represent the difference between the original data and the ideal data according to the model. So the more a residual deviates from 0, the larger the difference between the data and the model. Residuals can be
- positive: the intensity of the probe is larger than the ideal value according to the model
- negative: the intensity of the probe is smaller than the ideal value according to the model
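A small base-R sketch can make the sign convention concrete. It uses median polish (medpolish() from the stats package) on invented log2 intensities as a simple stand-in for a robust probe-plus-array fit; this is not the estimator that fitPLM() uses, and the numbers are made up for illustration.

```r
# Made-up log2 intensities: rows are probes of one probe set, columns are arrays.
y <- matrix(c( 8.0,  8.5,  9.0,
              10.0, 10.5, 11.0,
               6.0,  6.5,  9.5),   # the last value is a deliberate artifact
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("probe", 1:3), paste0("array", 1:3)))

# Fit an additive probe effect + array effect model by median polish.
fit <- medpolish(y, trace.iter = FALSE)

# Residual = observed value minus the value expected under the model.
res <- fit$residuals
print(res)
print(sign(res))   # +1: above the model, -1: below the model, 0: on the model
```

The artifact on probe3/array3 is brighter than the additive model expects, so it ends up with a positive residual, while the cells that follow the model have residuals close to 0.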
How to create a pseudo-image based on residuals?
Since residuals can be positive or negative, you can create four different images of residuals: based on residuals, positive residuals, negative residuals or sign of residuals. The type argument of image() controls which of these images is drawn:
- type="resids": residuals
- type="pos.resids": positive residuals
- type="neg.resids": negative residuals
- type="sign.resids": sign of the residuals
for (i in 1:6) {
  name = paste("pseudoimage",i,".jpg",sep="")
  jpeg(name)
  image(Pset,which=i,type="resids",main=ph@data$sample[i])
  dev.off()
}
This code was used to analyze an experiment containing 6 arrays. If you have a different number of samples, you have to adjust the numbers in the code, e.g. if you have 8 arrays:
for (i in 1:8) {
  name = paste("pseudoimage",i,".jpg",sep="")
  jpeg(name)
  image(Pset,which=i,type="resids",main=ph@data$sample[i])
  dev.off()
}
If you have small arrays, you might fit all six pseudoimages on one plot as shown below:
op = par(mfrow = c(2,3))
for (i in 1:6){image(Pset,which=i,type='resids',main=ph@data$sample[i])}
par(op)  # restore the previous plot layout
Positive residuals are plotted in red, negative residuals in blue.
Chip pseudo images in oligo
The fitPLM() method comes from the affyPLM package and does not work on FeatureSets, so in oligo we need to use the alternative fitProbeLevelModel() method. This method fits robust probe-level linear models to all the probe sets in a FeatureSet.
Pset = fitProbeLevelModel(data)
Creating pseudoimages based on weights is done exactly the same as in affy.
Creating pseudoimages based on residuals is slightly different from affy: the type argument is "residuals" instead of "resids".
for (i in 1:6) {
  name = paste("pseudoimage",i,".jpg",sep="")
  jpeg(name)
  image(Pset,which=i,type="residuals",main=ph@data$sample[i])
  dev.off()
}