Bioperl Training Exercise 7

  • Create an array to store primer information. Per primer you should store:
    • sequence
    • length
    • degenerate or not
    • composition: number of A's, C's, G's and T's.
  • Loop over the structure and print the primer info in a table-like manner.
  • Some hints:
    • for the composition you need to count the nucleotides. Use the transliteration operator (tr///), which returns the number of characters it matched; see perldoc perlop.
    • degenerate or not: store 1 or 0 for that attribute. You can combine a regular expression with the ternary operator:
      /[^ACGT]/i ? 1 : 0
    • use slicing when printing the rows
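
The hints above can be tried out on a single primer before filling in the template. The following is a minimal sketch, not the reference solution: the helper name primer_info and the choice of a hash reference per primer are assumptions made for illustration, and the exercise can be solved with other data structures.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the exercise template):
# builds one primer record as a hash reference.
sub primer_info {
    my ($seq) = @_;
    my %primer = (
        sequence   => $seq,
        length     => length($seq),
        # 1 if any character other than A, C, G or T occurs, else 0
        degenerate => ( $seq =~ /[^ACGT]/i ? 1 : 0 ),
        # tr/// with an empty replacement list only counts the listed
        # characters; $seq itself is left unchanged
        A => ( $seq =~ tr/Aa// ),
        C => ( $seq =~ tr/Cc// ),
        G => ( $seq =~ tr/Gg// ),
        T => ( $seq =~ tr/Tt// ),
    );
    return \%primer;
}

my $primer = primer_info('GRATTACCGCGGCWGCTG');

# Hash slice on a reference: fetch several values at once,
# in the same column order as the table header
my @columns = qw/sequence length degenerate A C G T/;
print join( "\t", @{$primer}{@columns} ), "\n";
```

For this primer, taken from the data section of the template, the sketch prints the sequence followed by 18, 1, 2, 5, 6 and 3, separated by tabs. The two degenerate bases (R and W) are not counted in the composition columns, so the four counts add up to 16 rather than 18.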


  • primer.pl template
#!/usr/bin/perl
use strict;
use warnings;
 
# container for primers
my @primers;
 
# fill container
while (<DATA>)
{
  # remove newline
 
  # skip empty lines
 
  # create primer data structure and add to @primers array
 
}
 
# print table output
# header
print join("\t", qw/sequence length degenerate? A C G T/),"\n";
# rows
# loop over the @primers array
 
 
__DATA__
CGCTGCGTTCTTCATCG
TCGATGAAGAACGCAGCG
ACCCGCTGAACTTAAGC
GGTTGGTTTCTTTTCCT
TTTTCAAAGTTCTTTTC
AAGAACTTTGAAAAGAG
CCGTGTTTCAAGACGGG
GTCTTGAAACACGGACC
ACCAGAGTTTCCTCTGG
TCCTGAGGGAAACTTCG
CGCCAGTTCTGCTTACC
TACTACCACCAAGATCT
GCAGATCTTGGTGGTAG
CACCTTGGAGACCTGCT
AGCAGGTCTCCAAGGTG
AGAGCACTGGGCAGAAA
AGTCAAGCTCAACAGGG
GACCCTGTTGAGCTTGA
GCCAGTTATCCCTGTGGTAA
GACTTAGAGGCGTTCAG
CTGAACGCCTCTAAGTCAGAA
CGTAACAACAAGGCTACT
AGCCAAACTCCCCACCTG
TAAATTACAACTCGGAC
TTCCACCCAAACACTCG
TAACCTATTCTCAAACTT
GTGAGACAGGTTAGTTTTACCCT
ACTTCAAGCGTTTCCCTTT
CCTCACGGTACTTGTTCGCT
GRATTACCGCGGCWGCTG
CCGTCAATTCVTTTPAGTTT
ACGGGCGGTGTGTPC
CTTAAAGGAATTGACGGAA
GTACACACCGCCCGTCG
TACCTGGTTGATQCTGCCAGT
ATTACCGCGGCTGCT
CGGCCATGCACCACC
GAAAGTTGATAGGGCT
AAACCAACAAAATAGAA
GTGCCCTTCCGTCAATT
TGTTACGACTTTTACTT
AAGWAAAAGTCGTAACAAGG
GTTCAACTACGAGCTTTTTAA
AGTTAAAAAGCTCGTAGTTG
GAACCAGGACTTTTACCTT
TTTGACTCAACACGGG
GTAGTCATATGCTTGTCTC
GGCTGCTGGCACCAGACTTGC
GCAAGTCTGGTGCCAGCAGCC
CTTCCGTCAATTCCTTTAAG
AACTTAAAGGAATTGACGGAAG
GCATCACAGACCTGTTATTGCCTC
GAGGCAATAACAGGTCTGTGATGC
TCCGCAGGTTCACCTACGGA