Homer

From BITS wiki
Jump to: navigation, search


pic2.gif

Homer [1] (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis. It is a collection of command line programs for unix-style operating systems written in Perl and C++. HOMER was primarily written as a de novo motif discovery algorithm and is well suited for finding 8-20 bp motifs in large scale genomics data. HOMER contains many useful tools for analyzing ChIP-Seq, GRO-Seq, RNA-Seq, DNase-Seq, Hi-C and numerous other types of functional genomics sequencing data sets.

HOMER Program Index
Below is a quick introduction to the different programs included in HOMER.  Running each program without any arguments will provide basic instructions and a list of command line options.
FASTA file Motif Discovery
 
findMotifs.pl - performs motif analysis with lists of Gene Identifiers or FASTA files (See FASTA file analysis)
 
homer2 - core component of motif finding (Called by everything else , See FASTA file analysis)
Gene/Promoter-based Analysis
findMotifs.pl - performs motif and gene ontology analysis with lists of Gene Identifiers, both promoter and mRNA motifs (See Gene ID Analysis Tutorial)
 
findGO.pl - performs only gene ontology analysis with lists of Gene Identifiers (Called by findMotifs.pl, See Gene Ontology Analysis)
loadPromoters.pl - setup custom promoter sets for specialized analysis (See Customization)
Next-Gen Sequencing/Genomic Position Analysis
findMotifsGenome.pl - performs motif analysis from genomic positions (See Finding Motifs from Peaks)
 
makeTagDirectory - creates a "tag directory" from high-throughput sequencing alignment files, performs quality control (See Creating a Tag Directory)
makeUCSCfile & makeBigWig.pl - create bedGraph file for visualization with the UCSC Genome Browser (See Creating UCSC file)
findPeaks - find peaks in ChIP-Seq data, regions in histone data, de novo transcripts from GRO-Seq (See Finding ChIP-Seq Peaks)
analyzeChIP-Seq.pl - automation of programs found above (See Automation of ChIP-Seq analysis) 
 
annotatePeaks.pl - annotation of genomic positions, organization of motif and sequencing data, histograms, heatmaps, and more... (See Annotating Peaks, Quantification)
analyzeRNA.pl - quantification of RNA levels across transcripts (See RNA quantification)
analyzeRepeats.pl - quantification of RNA levels across repeats (documentation coming soon...)
 
mergePeaks - find overlapping peak positions (See Comparing ChIP-Seq Peaks)
 
homerTools - basic sequence manipulation (See Sequence Manipulation)
tagDir2bed.pl - output tag directory as an alignment BED file (See Miscellaneous)
bed2pos.pl, pos2bed.pl - convert between HOMER peak file format and BED file format (See Miscellaneous)
 
checkPeakFile.pl - use this to see if your peak file is in the correct format
removeOutOfBoundsReads.pl - remove reads found outside acceptable chromosome limits
 
annotateTranscripts.pl - annotation of de novo identified transcripts
 
Motif Manipulation
compareMotifs.pl - checks a library of motifs for known motifs, creates in HTML output summarizing motif results (described here).
motif2Logo.pl - creates a PNG or PDF logo from any motif file.
revoppMotif.pl - creates a new motif file reflecting the nucleotide preferences of the opposite strand. 
seq2profile.pl - creates a new motif file from a consensus sequence
 
Hi-C Analysis Programs
 
analyzeHiC - primary analysis program - generates interaction matrices, normalization, identification of significant interactions, clustering of domains, generates Circos plots (most of the following programs use this one internally, See Hi-C analysis)
 
runHiCpca.pl - automated PCA analysis on Hi-C data to identify "compartments" (see Hi-C PCA analysis)
getHiCcorrDiff.pl - calculates the difference in correlation profiles between two Hi-C experiments (see Hi-C PCA analysis)
findHiCCompartments.pl - find continuous or differential regions from PCA/corrDiff results that describe what compartment regions of DNA belong to (see Hi-C PCA analysis)
 
findHiCInteractionsByChr.pl - helps automate the finding of high-resolution intra-chromosomal interactions (see Finding Hi-C Interactions)
annotateInteractions.pl - program for re-analysis of significant interactions, such as relating them to ChIP-Seq peaks (see Annotating Interactions)
 
SIMA.pl - Novel tool to boost sensitivity by pooling features together when performing interaction calculations. (see SIMA analysis)
 
Additional Utilities that may be useful (and sub-programs used by those above)
 
addData.pl, addDataHeader.pl, mergeData.pl - tools for joining/merging tab separated flat files
 
homerTools extract - extract genomic sequence for peaks from a peak file.
homerTools freq - finds nucleotide/dinucleotide frequencies of a collection of sequence and GC/CpG content.
getPeakTags - finds sequencing tags associated with genomic positions.
scanMotifGenomeWide.pl - look for all instances of a motif in the genome.
 
 
tagDir2bed.pl - convert *.tags.tsv file directory into a BED file for use with other programs
 
homer2 - new motif finding program
homer - original motif finding program (not used anymore)
 
getTopPeaks.pl - return peaks with the best peak scores.
getFocalPeaks.pl - return peaks with the highest focus ratios.
assignGenomeAnnotation - assign peaks to specific annotations in the genome
 
fasta2tab.pl, tab2fasta.pl - convert between HOMER-style sequence file and a FASTA file.
 
changeNewLine.pl - converts mac and dos formated text files (new lines of "\r" and "\r\n") to UNIX style ("\n").



References:
  1. Sven Heinz, Christopher Benner, Nathanael Spann, Eric Bertolino, Yin C Lin, Peter Laslo, Jason X Cheng, Cornelis Murre, Harinder Singh, Christopher K Glass
    Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities.
    Mol Cell: 2010, 38(4);576-89
    [PubMed:20513432] ##WORLDCAT## [DOI] (I p)



[ BioWare | Main_Page ]