Motif analysis
For the motif analysis, you first need to extract the sequences corresponding to the peaks. There are several ways to do this (as usual...). If you work on a UCSC-supported organism, the easiest is to use RSAT fetch-sequences. Here, we will use BEDTools, since we have the genome of interest at our disposal (Escherichia_coli_K12.fasta). However, we first have to index the FASTA file so that sequences can be retrieved quickly by position.
Which tool can be used to index the FASTA file? |
---|
When you search for modules containing the word fasta, you find a tool called SAMtools.FastaIndex. It indexes a reference sequence in FASTA format, which is exactly what we need. |
Use this tool to index the E. coli genome and copy the resulting .fai file to the Files tab (in the same folder as the fasta file).
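To make clear what the .fai file contains, here is a minimal Python sketch that builds the same kind of index by hand. It is not the SAMtools implementation, only an illustration of the five tab-separated columns of a .fai file (sequence name, length, byte offset of the first base, bases per line, bytes per line). The function names `build_fai` and `write_fai` are hypothetical, and the sketch assumes that all lines of a sequence (except the last) have the same length.

```python
def build_fai(fasta_path):
    """Return one (name, length, offset, linebases, linewidth) tuple per sequence."""
    records = []
    name = None
    length = offset = linebases = linewidth = 0
    position = 0  # byte position of the start of the next line
    with open(fasta_path, "rb") as handle:
        for raw in handle:
            position += len(raw)
            if raw.startswith(b">"):
                # A new header closes the previous record, if there is one.
                if name is not None:
                    records.append((name, length, offset, linebases, linewidth))
                fields = raw[1:].split()
                name = fields[0].decode() if fields else ""
                # The first base of this sequence starts right after the header line.
                length, offset, linebases, linewidth = 0, position, 0, 0
            elif name is not None:
                bases = raw.rstrip(b"\r\n")
                if linebases == 0 and bases:
                    # The first sequence line defines the line layout.
                    linebases, linewidth = len(bases), len(raw)
                length += len(bases)
    if name is not None:
        records.append((name, length, offset, linebases, linewidth))
    return records


def write_fai(fasta_path):
    """Write the index next to the FASTA file and return the path of the index."""
    index_path = fasta_path + ".fai"
    with open(index_path, "w", newline="\n") as out:
        for record in build_fai(fasta_path):
            out.write("\t".join(str(field) for field in record) + "\n")
    return index_path
```

Calling `write_fai("Escherichia_coli_K12.fasta")` would write `Escherichia_coli_K12.fasta.fai` in the same folder. With the offset and the line layout, a tool can jump directly to any coordinate without reading the whole genome.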
How do you extract the sequences corresponding to the peaks? |
---|
Use the BEDTools.fastaFromBed module for this.
|
Save the resulting .fa file to your computer.
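The extraction step can also be sketched in plain Python, to show what happens to each peak. This is an illustration under stated assumptions, not the BEDTools code: the function names `read_fasta` and `extract_peaks` are hypothetical, the whole genome is loaded into memory (fine for E. coli, not for large genomes), and strand information is ignored. BED coordinates are 0-based with an exclusive end, which maps directly onto Python slicing.

```python
def read_fasta(path):
    """Load a FASTA file into a dict mapping sequence name to sequence string."""
    parts = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Keep only the first word of the header as the sequence name.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                parts[name] = []
            elif name is not None and line:
                parts[name].append(line)
    return {seq_name: "".join(chunks) for seq_name, chunks in parts.items()}


def extract_peaks(fasta_path, bed_path, out_path):
    """Write one FASTA record per BED interval; return the number written."""
    genome = read_fasta(fasta_path)
    written = 0
    with open(bed_path) as bed, open(out_path, "w") as out:
        for line in bed:
            # Skip blank lines and BED header or comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            start, end = int(start), int(end)
            if chrom not in genome:
                continue  # peak on a sequence that is absent from the FASTA file
            # BED start is 0-based and end is exclusive, like a Python slice.
            sequence = genome[chrom][start:end]
            out.write(f">{chrom}:{start}-{end}\n{sequence}\n")
            written += 1
    return written
```

For example, `extract_peaks("Escherichia_coli_K12.fasta", "peaks.bed", "peaks.fa")` would write one sequence per peak, where `peaks.bed` and `peaks.fa` are placeholder file names. The sequence names in the BED file must match those in the FASTA headers exactly, otherwise no sequence is written for that peak.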
To detect transcription factor motifs, you will use the Regulatory Sequence Analysis Tools (RSAT). It has a dedicated teaching server, recommended for training sessions: http://pedagogix-tagc.univ-mrs.fr/rsat/
You will use the program peak-motifs.
How do you find the peak-motifs program? |
---|
In the left menu, click on NGS ChIP-seq and then on peak-motifs. A new page opens with a form. |
The default peak-motifs web form only displays the essential options. There are only two mandatory parameters.
Fill the mandatory options |
---|
|
We will now modify some of the advanced options in order to fine-tune the analysis according to your data set.
Fill the advanced options |
---|
|
Launch the analysis |
---|
|
The web page also displays a link to the report. You can click on this link right away: the report is progressively updated while the workflow runs.