Exercises for the Basics tutorial
GenBank searches
Search for the longest human BRCA1 mRNA sequence. |
---|
Perform a Search on NCBI:
Click the header of the Length column in the results table twice to sort the sequences by length in descending order. Download and save the sequence. |
How many times was the record updated? |
---|
Go to the GenBank record view and look at the VERSION section. Each time a record is updated, it keeps the same accession number but the version number is increased by one. In the "VERSION" section of the record you see that the version number is NM_000059.3. The ".3" means that this is the third version of this record, so it has been updated twice. |
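The version arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `updates_from_version` is made up for this example, and it assumes the identifier has the form ACCESSION.VERSION with version 1 being the original record.

```python
def updates_from_version(versioned_accession: str) -> int:
    """Return how many times a record was updated, given 'ACCESSION.VERSION'."""
    accession, _, version = versioned_accession.rpartition(".")
    if not accession or not version.isdigit():
        raise ValueError(f"expected ACCESSION.VERSION, got {versioned_accession!r}")
    # Assumption: version 1 is the original record, so updates = version - 1.
    return int(version) - 1


print(updates_from_version("NM_000059.3"))  # third version, so two updates
```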
Select and save the CDS of BRCA1 in a separate file. |
---|
Go back to the linear view. In the Side Panel, go to the "Annotation type" settings and remove all types except "CDS". Save the sequence. |
What is the UniSTS ID of the longest STS in the BRCA1 CDS? |
---|
STS stands for Sequence Tagged Site. STSs are short (200-500 nt) sequences that are known to occur only once in the genome and whose genomic location is known. The UniSTS database contains the sequences of the primers that can be used to selectively amplify them. You can use STSs to screen a library of uncharacterized clones for clones that contain a specific region of the genome. If you find pairs of overlapping clones, you can use them to construct contigs (STSs are unique, so two clones that contain the same STS must overlap). STSs are also used to map mutations (deletions or insertions, not point mutations) in unique regions of the genome: PCR generates a product of a different size when such a mutation is present, which makes it possible to trace mutations through families. |
In the Side Panel, go to the "Annotation type" settings and select "STS". Choose Fit width to see an overview of all STSs. |
How many introns does the BRCA1 gene have? |
---|
For this we need the genomic sequence instead of the mRNA sequence, so repeat the GenBank search but look for a genomic sequence that contains the complete CDS. Annotate mRNA and zoom in to a level that allows you to count the introns. There are 22 introns in this gene. |
Translate the BRCA1 CDS to obtain the protein sequence. |
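Counting introns by eye works, but the same count follows from the exon coordinates: introns are the gaps between consecutive exons. The sketch below assumes you have the exon positions as 1-based, inclusive (start, end) pairs; the function name and the coordinates are invented for illustration and are not taken from the BRCA1 record.

```python
def introns_from_exons(exons):
    """Given exon (start, end) pairs (1-based, inclusive), return the intron intervals."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # A gap between two consecutive exons is an intron.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Made-up coordinates: three exons, so two introns.
toy_exons = [(1, 100), (251, 400), (901, 1000)]
print(introns_from_exons(toy_exons))
```

A gene with 23 non-adjacent exons yields 22 introns with this approach.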
---|
Right-click the CDS and select "Translate CDS/ORF". |
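To see what translating a CDS involves, here is a minimal Python sketch that reads a coding sequence codon by codon using the standard genetic code and stops at the first stop codon. It is not how the Workbench implements the tool; the function name `translate_cds` and the short input sequence are made up for this example.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate a CDS in frame 1 and stop at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        # Codons with ambiguous bases are reported as X.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate_cds("ATGGATTTATCTGCTCTTCGCTAA"))
```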
UniProt searches
Download the protein sequence of BRCA1 from UniProt. |
---|
Perform a UniProt search. Open the BRCA1 protein and save it. |
Pairwise alignment
Align the two protein sequences to see if they correspond. |
---|
Go to Toolbox - Alignments and Trees - Create Alignment. Select the two proteins: BRCA1 Protein and the translation of the BRCA1 CDS. Deselect all Alignment Info in the Side Panel and zoom out to see the alignment on a single page. You can see that the UniProt protein is a part of the translation of the GenBank CDS. |
BLAST searches
Using the UniProt sequence, perform a stringent BLAST search on SwissProt to find homologs of BRCA1. |
---|
Go to Toolbox - BLAST - BLAST at NCBI. Select the BRCA1 Protein sequence as input. Many potential homologs are found. |
Download the sequences that were found by the BLAST search. |
---|
Go to Toolbox - General Sequence Analysis - Extract Sequences. Select the BLAST results file as the input file. Note that only the parts that are similar to human BRCA1 are saved, not the full sequences! |
What is the function of the best non-human hit? |
---|
Switch to the BLAST table view of the BLAST results. Double-click the second hit (in dark grey on the figure) to open the sequence. |
Pairwise alignment: dot plot
Create a dot plot of the BRCA1 gene and the BRCA1 mRNA. |
---|
Dot plots provide a graphical way of detecting similarity. Go to Toolbox - General Sequence Analysis - Create Dot Plot. Look for blue diagonal lines: they correspond to regions that are the same in the mRNA and the gene sequence, i.e. the exons.
In the longest exon you see a number of internal repeats (visible as lines parallel to the diagonal of the exon). |
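The idea behind a dot plot can be sketched in Python: place a dot at (i, j) whenever a short window starting at position i of one sequence is identical to the window starting at position j of the other. This is a simplified, exact-match version and is not the algorithm or scoring used by the Workbench; the function name `dot_plot` and the toy sequences are invented for illustration.

```python
def dot_plot(seq_x, seq_y, window=4):
    """Return the set of (i, j) where seq_x[i:i+window] == seq_y[j:j+window]."""
    # Index every window of seq_y by its sequence for fast lookup.
    index = {}
    for j in range(len(seq_y) - window + 1):
        index.setdefault(seq_y[j:j + window], []).append(j)
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in index.get(seq_x[i:i + window], []):
            dots.add((i, j))
    return dots


# Toy data (made up): a "gene" with two exons around an intron, and its spliced "mRNA".
exon1, intron, exon2 = "ACGTACCA", "GGGGGGTTTT", "TGCATCGA"
gene = exon1 + intron + exon2
mrna = exon1 + exon2

dots = dot_plot(mrna, gene)
for i, j in sorted(dots):
    print(i, j)
```

Runs of dots where i and j increase together form the diagonal lines seen in the plot. In the toy data the second exon starts a new diagonal that is shifted by the length of the intron.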
Searching for annotations based on their sequence
We will use an annotated vector sequence to annotate unannotated vector sequences. The Workbench comes with a list of vector sequences, some of them annotated, some not. In the "Example Data" folder, expand the "Cloning vector library". As you can see, it contains many vector sequences, but when you open and circularize the first vector, "M13mp8_pUC8", or the vector "pAT153", you can see that they do not contain any annotations.
When you open and circularize "PCS19" on the other hand, you see that it is well annotated.
To get an idea about the number of unannotated vectors, search for all vectors that contain an ampicillin resistance gene. |
---|
You can see that the search finds only 3 vectors that contain an ampicillin resistance gene, which is far fewer than you would expect. |
Extract the annotations from PCS19. |
---|
For this you first need to install the "Extract annotations" plugin if you haven't done so already:
|
After installing the plugin you can use it:
This generates a sequence list containing all annotations extracted from pCS19. Save the list and export it as a FASTA file. |
Transform the list of extracted annotations into a motif list. |
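For reference, a FASTA file is plain text: each record has a header line starting with ">" followed by the sequence, usually wrapped at a fixed width. The sketch below shows that layout in Python; the function name `to_fasta`, the annotation names and the sequences are made up and do not come from pCS19.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text, wrapping sequences at `width`."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Split the sequence into chunks of at most `width` characters.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical extracted annotations (names and sequences invented for illustration).
annotations = [("Amp", "ATGC" * 20), ("Tet", "GGATCCAA")]
print(to_fasta(annotations), end="")
```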
---|
|
Please note that the Workbench comes with a list of motifs for annotations. You can find them in the "Example Motifs" file in the "Enzyme lists" folder in the "Cloning" section of the "Example Data". When you open this file, you see that it contains, among others, motifs for the start codon, the hexaHis tag, primers for various promoters...
Search for the extracted annotations in the following unannotated vector sequences:
|
---|
If annotations are found in the unannotated vector sequences they will be added. For instance, circularize pAT153 and you will see that both Tet and Amp have been annotated on the sequence. |
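Conceptually, this step looks for each extracted annotation sequence inside the vector. Because vectors are circular, a match may span the point where the linear representation starts and ends. The Python sketch below illustrates that with an exact, forward-strand search; it is a simplification and not the Workbench's motif search, and the function name `find_motif_circular` and the sequences are invented for this example.

```python
def find_motif_circular(sequence: str, motif: str):
    """Return 0-based start positions of `motif` in a circular sequence (forward strand, exact match)."""
    seq = sequence.upper()
    motif = motif.upper()
    if not motif or len(motif) > len(seq):
        return []
    # Append the first len(motif) - 1 bases so matches can wrap around the origin.
    extended = seq + seq[:len(motif) - 1]
    hits = []
    start = extended.find(motif)
    while start != -1:
        hits.append(start)
        start = extended.find(motif, start + 1)
    return hits


toy_vector = "GGATCCAAAA"  # made-up circular sequence
print(find_motif_circular(toy_vector, "GGATCC"))
print(find_motif_circular(toy_vector, "AAGG"))  # this match wraps around the origin
```

A fuller version would also search the reverse complement and tolerate mismatches.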
By default, the Workbench contains a list of vector sequences, but this list is not complete and some vectors are not annotated. Since adding vector sequences to the Workbench and annotating them is a cumbersome task, you can download an extensive list of annotated vector sequences, compiled by BITS. You can simply import this list into a folder in your Workbench.