HOMER
HOMER [1] (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for motif discovery and next-generation sequencing analysis. It is a collection of command-line programs for Unix-style operating systems, written in Perl and C++. HOMER was written primarily as a de novo motif discovery algorithm and is well suited to finding 8-20 bp motifs in large-scale genomics data. It also contains many useful tools for analyzing ChIP-Seq, GRO-Seq, RNA-Seq, DNase-Seq, Hi-C and numerous other types of functional genomics sequencing data sets.
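The "hypergeometric" in the name refers to the kind of enrichment statistic used to ask whether a motif occurs more often in a target set of sequences than expected by chance. The following is a minimal sketch of that idea only, not HOMER's actual implementation: the function name, its parameters, and the example counts are all hypothetical and chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(total, total_with_motif, targets, targets_with_motif):
    """Upper-tail hypergeometric probability P(X >= targets_with_motif).

    total              -- number of sequences overall (targets + background)
    total_with_motif   -- how many of those contain the motif
    targets            -- number of target sequences drawn from the total
    targets_with_motif -- how many target sequences contain the motif

    All names are illustrative assumptions, not HOMER parameters.
    """
    # Number of ways to choose the target set from all sequences.
    denominator = comb(total, targets)
    # The overlap cannot exceed either the target set or the motif set.
    upper = min(targets, total_with_motif)
    tail = sum(
        comb(total_with_motif, i) * comb(total - total_with_motif, targets - i)
        for i in range(targets_with_motif, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Made-up counts: 1000 sequences, 100 with the motif,
    # 50 targets of which 15 contain the motif.
    print(hypergeom_enrichment_pvalue(1000, 100, 50, 15))
```

A small p-value here means the motif is over-represented in the target set relative to the full collection of sequences.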
HOMER Program Index

Below is a quick introduction to the different programs included in HOMER. Running each program without any arguments will provide basic instructions and a list of command line options.

FASTA file Motif Discovery
- findMotifs.pl - performs motif analysis with lists of Gene Identifiers or FASTA files (See FASTA file analysis)
- homer2 - core component of motif finding (Called by everything else, See FASTA file analysis)

Gene/Promoter-based Analysis
- findMotifs.pl - performs motif and gene ontology analysis with lists of Gene Identifiers, both promoter and mRNA motifs (See Gene ID Analysis Tutorial)
- findGO.pl - performs only gene ontology analysis with lists of Gene Identifiers (Called by findMotifs.pl, See Gene Ontology Analysis)
- loadPromoters.pl - sets up custom promoter sets for specialized analysis (See Customization)

Next-Gen Sequencing/Genomic Position Analysis
- findMotifsGenome.pl - performs motif analysis from genomic positions (See Finding Motifs from Peaks)
- makeTagDirectory - creates a "tag directory" from high-throughput sequencing alignment files, performs quality control (See Creating a Tag Directory)
- makeUCSCfile & makeBigWig.pl - create bedGraph file for visualization with the UCSC Genome Browser (See Creating UCSC file)
- findPeaks - finds peaks in ChIP-Seq data, regions in histone data, de novo transcripts from GRO-Seq (See Finding ChIP-Seq Peaks)
- analyzeChIP-Seq.pl - automation of programs found above (See Automation of ChIP-Seq analysis)
- annotatePeaks.pl - annotation of genomic positions, organization of motif and sequencing data, histograms, heatmaps, and more (See Annotating Peaks, Quantification)
- analyzeRNA.pl - quantification of RNA levels across transcripts (See RNA quantification)
- analyzeRepeats.pl - quantification of RNA levels across repeats (documentation coming soon...)
- mergePeaks - finds overlapping peak positions (See Comparing ChIP-Seq Peaks)
- homerTools - basic sequence manipulation (See Sequence Manipulation)
- tagDir2bed.pl - outputs a tag directory as an alignment BED file (See Miscellaneous)
- bed2pos.pl, pos2bed.pl - convert between HOMER peak file format and BED file format (See Miscellaneous)
- checkPeakFile.pl - checks whether your peak file is in the correct format
- removeOutOfBoundsReads.pl - removes reads found outside acceptable chromosome limits
- annotateTranscripts.pl - annotation of de novo identified transcripts

Motif Manipulation
- compareMotifs.pl - checks a library of motifs for known motifs, creates an HTML output summarizing motif results
- motif2Logo.pl - creates a PNG or PDF logo from any motif file
- revoppMotif.pl - creates a new motif file reflecting the nucleotide preferences of the opposite strand
- seq2profile.pl - creates a new motif file from a consensus sequence

Hi-C Analysis Programs
- analyzeHiC - primary analysis program: generates interaction matrices, normalization, identification of significant interactions, clustering of domains, generates Circos plots (most of the following programs use this one internally, See Hi-C analysis)
- runHiCpca.pl - automated PCA analysis on Hi-C data to identify "compartments" (see Hi-C PCA analysis)
- getHiCcorrDiff.pl - calculates the difference in correlation profiles between two Hi-C experiments (see Hi-C PCA analysis)
- findHiCCompartments.pl - finds continuous or differential regions from PCA/corrDiff results that describe which compartment regions of DNA belong to (see Hi-C PCA analysis)
- findHiCInteractionsByChr.pl - helps automate the finding of high-resolution intra-chromosomal interactions (see Finding Hi-C Interactions)
- annotateInteractions.pl - program for re-analysis of significant interactions, such as relating them to ChIP-Seq peaks (see Annotating Interactions)
- SIMA.pl - novel tool to boost sensitivity by pooling features together when performing interaction calculations (see SIMA analysis)

Additional Utilities that may be useful (and sub-programs used by those above)
- addData.pl, addDataHeader.pl, mergeData.pl - tools for joining/merging tab-separated flat files
- homerTools extract - extracts genomic sequence for peaks from a peak file
- homerTools freq - finds nucleotide/dinucleotide frequencies of a collection of sequences and GC/CpG content
- getPeakTags - finds sequencing tags associated with genomic positions
- scanMotifGenomeWide.pl - looks for all instances of a motif in the genome
- tagDir2bed.pl - converts a *.tags.tsv file directory into a BED file for use with other programs
- homer2 - new motif finding program
- homer - original motif finding program (not used anymore)
- getTopPeaks.pl - returns peaks with the best peak scores
- getFocalPeaks.pl - returns peaks with the highest focus ratios
- assignGenomeAnnotation - assigns peaks to specific annotations in the genome
- fasta2tab.pl, tab2fasta.pl - convert between HOMER-style sequence file and a FASTA file
- changeNewLine.pl - converts Mac and DOS formatted text files (new lines of "\r" and "\r\n") to UNIX style ("\n")
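As an illustration of the kind of transformation revoppMotif.pl is described as performing, the sketch below computes the opposite-strand (reverse complement) version of a position probability matrix. It is a hypothetical stand-alone example, not HOMER's code: it assumes the matrix is a list of rows, one per motif position, with columns ordered A, C, G, T, and the function name is invented for illustration.

```python
def opposite_strand_matrix(matrix):
    """Return the reverse-complement version of a motif matrix.

    Assumes each row holds the probabilities for A, C, G, T (in that order)
    at one motif position. Reversing the row order reads the motif from the
    other end; reversing each row swaps A with T and C with G.
    """
    return [list(reversed(row)) for row in reversed(matrix)]


if __name__ == "__main__":
    # A two-position motif that strictly matches "AC".
    motif_ac = [
        [1.0, 0.0, 0.0, 0.0],  # position 1: A
        [0.0, 1.0, 0.0, 0.0],  # position 2: C
    ]
    # The opposite strand of "AC" reads "GT".
    for row in opposite_strand_matrix(motif_ac):
        print(row)
```

Applying the function twice returns the original matrix, which is a quick sanity check for any implementation of this operation.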
References:
[1] Sven Heinz, Christopher Benner, Nathanael Spann, Eric Bertolino, Yin C Lin, Peter Laslo, Jason X Cheng, Cornelis Murre, Harinder Singh, Christopher K Glass. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38(4):576-89. PubMed: 20513432.