In this exercise we will improve Bioperl Training Exercise 11. The code in fuzzpro.pl should be put in a module, because it is perfectly reusable.
Steps to take:
- Create the module (class) BITS::Training::SeqProcessor::Fuzzpro
- In the constructor, add a fuzzpro Bio::Tools::Run::EMBOSSApplication object to your object (aggregation, remember?).
- Create the process_seq() method, which basically contains the code from the while loop of the original fuzzpro.pl script. The method should run fuzzpro (delegation!), add the features as before, and return the sequence object. It should produce no output itself.
- Refactor the original fuzzpro.pl (Bioperl Training Exercise 11) or create a new script. This script should use the BITS::Training::SeqProcessor::Fuzzpro module instead of running fuzzpro itself.
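The aggregation and delegation pattern asked for in these steps can be sketched without BioPerl or EMBOSS installed. The class names My::StubTool and My::SeqProcessor below are hypothetical stand-ins, not part of the exercise: the stub only records the arguments it is run with, where the real module would hold the fuzzpro application object.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the EMBOSS application object.
# It only records the arguments of each run() call.
package My::StubTool;

sub new {
    my ($class) = @_;
    return bless { runs => [] }, $class;
}

sub run {
    my ($self, $args) = @_;
    push @{ $self->{runs} }, $args;
    return scalar @{ $self->{runs} };
}

# Hypothetical processor class: it aggregates a tool object in its
# constructor and delegates the actual work to it in process_seq().
package My::SeqProcessor;

sub new {
    my ($class, %args) = @_;
    my $self = { tool => $args{tool} || My::StubTool->new };
    return bless $self, $class;
}

sub process_seq {
    my ($self, $seq) = @_;
    # delegation: the aggregated tool does the work
    $self->{tool}->run({ -sequence => $seq });
    # no output here, just hand the sequence back
    return $seq;
}

package main;

my $tool      = My::StubTool->new;
my $processor = My::SeqProcessor->new(tool => $tool);
my $result    = $processor->process_seq('MKNGSA');

print "returned: $result\n";
print "tool runs: ", scalar @{ $tool->{runs} }, "\n";
```

Passing the tool into the constructor is a design choice of this sketch only; it makes it easy to swap in a stub when no EMBOSS installation is available.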
fuzzpro.pl
|
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Bio::SeqIO;
    use BITS::Training::SeqProcessor::Fuzzpro;
    use Data::Dumper;

    # io object to read in the fasta from 'proteins.fa'
    my $in = Bio::SeqIO->new(-format => 'fasta',
                             -file   => '< proteins.fa');

    my $fuzzpro = BITS::Training::SeqProcessor::Fuzzpro->new;

    # io object to write genbank to STDOUT
    my $out = Bio::SeqIO->new(-format => 'genbank',
                              -fh     => \*STDOUT);

    #print Data::Dumper->Dump([$fuzzpro->acd],['fuzzpro']);exit;

    # for every sequence
    while (my $seq = $in->next_seq) {
        # run fuzzpro and attach the resulting features
        $fuzzpro->process_seq($seq);
        # write the annotated sequence
        $out->write_seq($seq);
    }
|
BITS::Training::SeqProcessor::Fuzzpro
|
    package BITS::Training::SeqProcessor::Fuzzpro;

    use strict;
    use warnings;
    use Bio::Factory::EMBOSS;
    use Bio::Tools::GFF;

    # N-glycosylation pattern argument for fuzzpro
    my $PATTERN = 'N-{P}-[ST]';

    sub new {
        my ($class, @args) = @_;
        # just to prove that you can use a scalar as object as well
        my $fuzzpro = Bio::Factory::EMBOSS->new->program('fuzzpro');
        my $self = \$fuzzpro;
        # bless into $class so that subclasses end up in their own package
        bless $self, $class;
        return $self;
    }

    sub process_seq {
        my ($self, $seq) = @_;

        # create temporary file for fuzzpro output
        my ($fh, $gffile) = $$self->io->tempfile(UNLINK => 0);

        # run fuzzpro
        $$self->run({
            -sequence => $seq,
            -pattern  => $PATTERN,
            -rformat  => 'GFF',
            -outfile  => $gffile,
        });

        # create reader/parser GFF object
        my $gffio = Bio::Tools::GFF->new(-fh => $fh);

        # loop over the feature stream
        while (my $feature = $gffio->next_feature) {
            # attach feature to sequence
            $seq->add_SeqFeature($feature);
            # change feature type
            $feature->primary_tag('protein_match');
            # add notes
            $feature->add_tag_value(note => 'algorithm:fuzzpro');
            $feature->add_tag_value(note => "pattern:$PATTERN");
            # add label
            $feature->add_tag_value(label => 'N-glycosylation site');
        }

        # close stream
        $gffio->close;

        # return sequence object
        return $seq;
    }

    1;
|
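The module stores its only piece of state in a blessed scalar reference and reaches it with `$$self`. A minimal sketch of that idea in isolation, using the hypothetical classes My::Counter and My::LoudCounter (not part of the exercise), might look like this. It uses the two-argument form of bless so that a subclass calling the inherited constructor is blessed into its own package.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class whose object is a blessed scalar reference.
package My::Counter;

sub new {
    my ($class, $start) = @_;
    my $value = defined $start ? $start : 0;
    my $self  = \$value;
    # two-argument bless: use the class the constructor was called on
    return bless $self, $class;
}

# the object itself is the scalar, so it is dereferenced with $$self
sub increment { my ($self) = @_; return ++$$self; }
sub value     { my ($self) = @_; return $$self; }

# Hypothetical subclass that inherits the constructor unchanged.
package My::LoudCounter;
our @ISA = ('My::Counter');

package main;

my $counter = My::LoudCounter->new(5);
$counter->increment;
print ref($counter), " holds ", $counter->value, "\n";
```

A scalar-based object can hold exactly one value, which is enough here because the module wraps a single application object. A hash-based object is the more common choice once a second attribute is needed.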