Bioperl introductory training

Bioperl

Bioperl is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. As such, it does not include ready-to-use programs in the sense that many commercial packages and free web-based interfaces (e.g. Entrez, SRS) do. On the other hand, Bioperl does provide reusable Perl modules that facilitate writing Perl scripts for sequence manipulation, for accessing databases in a range of data formats, and for running various molecular biology programs and parsing their results, including BLAST, ClustalW, T-Coffee, Genscan, ESTScan and HMMER. Consequently, Bioperl makes it possible to develop scripts that analyze large quantities of sequence data in ways that are typically difficult or impossible with web-based systems.

Training outline

See [1]

Exercises

Day 1

Bioperl Training Exercise 1: perldoc

Bioperl Training Exercise 2: thou shalt not forget

Bioperl Training Exercise 3: arrays

Bioperl Training Exercise 4: hashes

Bioperl Training Exercise 5: packages and modules 1

Bioperl Training Exercise 6: packages and modules 2

Bioperl Training Exercise 7: complex data structures

Bioperl Training Exercise 8: OOP

Bioperl Training Exercise 9: inheritance, polymorphism

Bioperl Training Exercise 10: aggregation, delegation

Scratch pad

References: create a module containing a subroutine to sort arrays by length. Show the Schwartzian Transform solution as well.
A pragma is a module which influences some aspect of the compile-time or run-time behaviour of Perl, e.g. use strict
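
The references note above could be sketched roughly as follows. This is only an illustration, not the course solution: the package name SortUtils and both subroutine names are made up for the example, and "arrays" is assumed to mean array references sorted by their number of elements. In a real project the package would live in its own file (SortUtils.pm) and be loaded with use SortUtils. The strict and warnings pragmas are switched on at the top.

```perl
#!/usr/bin/perl
use strict;      # pragma: enforce declared variables, no symbolic refs
use warnings;    # pragma: report suspicious constructs

# Hypothetical module name, defined inline to keep the sketch in one file.
package SortUtils;

# Plain version: takes a list of array references and returns them ordered
# from shortest to longest. The length is recomputed on every comparison.
sub sort_by_length {
    my @array_refs = @_;
    return sort { scalar(@$a) <=> scalar(@$b) } @array_refs;
}

# Schwartzian Transform version: compute each length once, sort on the
# cached value, then strip the cached value again.
sub sort_by_length_st {
    my @array_refs = @_;
    return
        map  { $_->[0] }                 # 3. unwrap the original reference
        sort { $a->[1] <=> $b->[1] }     # 2. sort on the cached length
        map  { [ $_, scalar @$_ ] }      # 1. pair each reference with its length
        @array_refs;
}

package main;

my @data   = ( [ 1, 2, 3 ], [1], [ 1, 2 ] );
my @sorted = SortUtils::sort_by_length_st(@data);

# Print the number of elements of each array, in sorted order.
print join( ' ', map { scalar @$_ } @sorted ), "\n";    # prints "1 2 3"
```

For a sort key as cheap as an array length the transform brings no real speed-up; it is shown here because the map/sort/map pattern is the same when the key is expensive to compute.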

Day 2

Bioperl ships with a number of ready-to-use scripts, all of which start with bp_. On the command line, type bp_<tab> to list them all.
Use regular perldoc for information, e.g. perldoc bp_split_seq

scripts that might be useful:
- bp_sreformat: sequence and alignment format conversion. bp_seqconvert is similar but only allows sequence conversions.
- bp_search2table: generate table output from, e.g., a BLAST report
- bp_taxid4species: return a list of taxon IDs for requested organisms
- bp_mutate: randomly mutagenize a single protein or DNA sequence
- etc ...

Bioperl Training Exercise 11: IO, add annotation, run EMBOSS application

Bioperl Training Exercise 12: Create a fuzzpro processor module

Bioperl Training Exercise 13: Create a taxon processor module

Bioperl Training Exercise 14: Create a reference processor module

Bioperl Training Exercise 15: Magical processor module

References, Sources, ...

Dependencies

Class::Inspector
Getopt::Long
Pod::Usage
perltidy
Geany
EMBOSS

Author

Marc Logghe

Inspired by

Probably the best Perl book in the world: Object Oriented Perl by Damian Conway.
Randal Schwartz's Perls of Wisdom
Perl Training Australia