NGS data analysis




This wiki page is dedicated to the series of trainings that will lead you through the various workflows for the analysis of next generation sequencing data.
Have fun solving the exercises!



During these training sessions, you will do exercises using free software running locally on your PC. Since many of the tools for analysis of NGS data run on Linux, we will use a Linux installation (Linux Mint 17) for most of the exercises. Because most of you have used or will use the Illumina platform to generate your data, we will use Illumina data sets in all exercises.

 

Training 1: Introduction to the analysis of NGS data

Periodically repeated Sessions (Janick Mathys)

Slides

Exercises

This training gives you the background knowledge you need to follow the more advanced trainings on variant analysis, RNA-Seq and ChIP-Seq.

Download the data sets for this training:

Now you can try the exercises.


FAQ

Q&A added during the intro to NGS data analysis

File formats


Training 2: NGS variant analysis

Session of October 2017 (Stéphane Plaisance)

Session of 2017 using GenePattern

Training archive

Q&A pages

HowTo Pages related to this training


 

Training 3: RNA-Seq analysis for differential expression

Training material



Summer school 2018

Prep Course

Linux

R

bulk RNA-Seq: from fastq to count files

bulk RNA-Seq: finding differentially expressed genes

List of R packages used in the training:

  • ggplot2
  • gplots
  • RColorBrewer
  • reshape2
  • pheatmap
  • Bioconductor
  • Bioconductor: DESeq2

bulk RNA-Seq: variant analysis

scRNA-Seq: introduction to 10x Genomics

scRNA-Seq: introduction to cell ranger

Analysis and exploration of single cell RNA-seq data

List of R packages used in the Seurat training:

  • Seurat
  • dplyr
  • Matrix
  • gridExtra
  • limSolve
  • mvoutlier
  • Bioconductor: scater
  • Bioconductor: scran

Scenic

Experimental design

Integration of omics data

What to do after the summer school?

Bulk RNA-Seq - from raw reads to counts:

  • We have two GenePattern servers running that contain all the tools discussed in the training. Send an email to bits@vib.be to get an account.
  • We can provide a snapshot of the server you worked on during the training. You can then make your own server on Google cloud (it's easy starting from a snapshot). You will have to pay for that.

Bulk RNA-Seq - finding DE genes:

  • You can do the R analysis on your own computer: see this section for the list of packages you need to install.
  • We can provide a snapshot of the server you worked on during the training. You can then make your own server on Google cloud (it's easy starting from a snapshot). You will have to pay for that.

Single cell RNA-Seq:

  • You can do the Seurat analysis on your own computer: see this section for the list of packages you need to install.
  • We can provide a snapshot of the server you worked on during the training. You can then make your own server on Google cloud (it's easy starting from a snapshot). You will have to pay for that.
  • In the future you can get support from Niels and Liesbet. Contact scRNAseq@irc.vib-ugent.be for more information.
  • We will check whether cell ranger is installed on the KULeuven VSC (accessible to people from KULeuven and UHasselt).


A Git page has been started where you can post your issues and share them with us. You can reach it at https://github.com/BITS-VIB/Summer_school_2018

  • NGS_data_analysis_tools: a page listing tools encountered during the day that you may want to install on your computer

Archive

Session of March 20th and 23rd, 2015 (Stéphane Plaisance)

repeated September 25, 2015

Hands-on_introduction_to_NGS_RNASeq_DE_analysis - the pages of the actual training
containing a hands-on workflow of RNA-Seq analysis for differential expression using command line tools.


Creating ENV variables for the training

Create a new file as root, for example with "sudo nano /etc/profile.d/bits.sh" (any text editor will do), and paste the following content:

# system wide ENV variables to ease path in training exercises
export SUMMER=/usr/summer
export SOFT=$SUMMER/software
export REFS=$SUMMER/refs
export DATA=/mnt/userdata/$(whoami)

Source (= execute in the current shell) the file by typing ". /etc/profile.d/bits.sh".

You now have shortcuts (env variables) that you can type to reach the very long exercise locations, as follows:

  • $SUMMER leads to /usr/summer
  • $SOFT leads to $SUMMER/software
  • $REFS leads to $SUMMER/refs
  • $DATA leads to /mnt/userdata/<your username>
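
A quick way to confirm that the shortcuts work is to check that each variable is set and points to an existing directory. The function below is a minimal sketch; its name, check_training_env, is hypothetical and not part of the training material, and it assumes only the four variables defined in bits.sh.

```shell
# Hypothetical helper: report whether each training variable is set and
# whether the directory it points to exists.
# Returns 0 when all four are usable, 1 otherwise.
check_training_env() {
    status=0
    for name in SUMMER SOFT REFS DATA; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ -d "$value" ]; then
            echo "$name=$value (directory found)"
        else
            echo "$name=$value (directory missing)"
            status=1
        fi
    done
    return $status
}
```

After sourcing bits.sh, type "check_training_env" to print one line per variable. A "not set" message usually means the file has not been sourced in the current terminal yet.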

edgeR / DESeq2

Exercises
Slides

Archive

Session of January 20th and 27th, 2014 using Galaxy (Joachim Jacob)

Training 4: ChIP-Seq analysis

http://morgane.bardiaux.fr/chip-seq-training/ session 2015-06-01
link to the gene annotations (download this bed file and use it in IGV)

Session of February 24th, 2014 (Morgane Thomas-Chollier)

Introduction

The aims of this session are to:

  • Have an understanding of the nature of ChIP-Seq data
  • Perform a complete analysis workflow including QC, read mapping, visualization in a genome browser and peak-calling.
  • Use command line and open source software for each step of the workflow and get a feel for the complexity of the task
  • Have an overview of possible downstream analyses
  • Perform a motif analysis with online web programs

This training gives an introduction to ChIP-seq data analysis, covering the processing steps starting from the reads to the peaks. Among all possible downstream analyses, the practical aspect will focus on motif analyses. A particular emphasis will be put on deciding which downstream analyses to perform depending on the biological question. This training does not cover all methods available today. It does not aim at bringing users to a professional NGS analyst level but provides enough information to allow biologists to understand what DNA sequencing practically is and to communicate with NGS experts for more in-depth needs.

For this training, we will use a dataset produced by Myers et al. [1] on factors involved in the regulation of gene expression under anaerobic conditions in bacteria. We will focus on one factor: FNR.

Suggested reading:

  • Bailey et al. Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data. PLoS Comput Biol 9, e1003326 (2013) [2]. PDF
  • Thomas-Chollier et al. A complete workflow for the analysis of full-size ChIP-seq (and similar) data sets using peak-motifs. Nature Protocols 7, 1551–1568 (2012) [3]. PDF

Raw data:


Exercises


Links


HowTo Pages related to this training

 

Training 5: Metagenomics

Data files

Tools