.embl


source

EMBL entries (see the example below) are structured to be usable by human readers as well as by computer programs. Each entry in the database is composed of lines. Different types of lines, each with its own format, are used to record the various types of data that make up the entry. Some entries do not contain all of the line types, and some line types occur many times in a single entry. Each entry begins with an identification line (ID) and ends with a terminator line (//).
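A minimal Python sketch of how a program might read this layout. It assumes, as in the example on this page, that each non-sequence line starts with a two-letter line code followed by three blanks, so the data begins in column 6, and that sequence lines start with blanks. The function names `split_entries` and `group_by_line_code`, the key `seq`, and the shortened sample record are hypothetical and used only for illustration.

```python
from collections import defaultdict


def split_entries(text):
    """Yield each entry as a list of lines, ending at the '//' terminator."""
    entry = []
    for line in text.splitlines():
        if line.startswith("//"):
            if entry:
                yield entry
            entry = []
        elif line.strip():
            entry.append(line)


def group_by_line_code(entry_lines):
    """Map each two-letter line code to the list of its data fields.

    Lines that start with whitespace (the sequence data) are stored
    under the hypothetical key 'seq'.
    """
    groups = defaultdict(list)
    for line in entry_lines:
        if line[0].isspace():
            groups["seq"].append(line.strip())
        else:
            # Assumption: data starts in column 6 (index 5).
            groups[line[:2]].append(line[5:].rstrip())
    return dict(groups)


# Shortened, hypothetical record used only for illustration.
sample = """\
ID   X00001; SV 1; linear; mRNA; STD; MUS; 10 BP.
XX
AC   X00001;
XX
DE   Example description, first line,
DE   second line.
XX
SQ   Sequence 10 BP; 3 A; 2 C; 3 G; 2 T; 0 other;
     acgtacgtag                                                        10
//
"""

for entry in split_entries(sample):
    fields = group_by_line_code(entry)
    print(fields["ID"][0])          # the identification line data
    print(" ".join(fields["DE"]))   # DE occurs twice, so join the parts
```

Storing a list per line code reflects the point above that some line types occur many times in a single entry, while a missing line type is simply absent from the dictionary.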

example

ID   L36435; SV 1; linear; mRNA; STD; MUS; 1458 BP.
XX
AC   L36435;
XX
DT   30-JAN-1995 (Rel. 42, Created)
DT   17-APR-2005 (Rel. 83, Last updated, Version 6)
XX
DE   Mus Musculus basic domain/leucine zipper transcription factor mRNA,
DE   complete cds.
XX
KW   basic domain/leucine zipper transcription factor.
XX
OS   Mus musculus (house mouse)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi; Muroidea;
OC   Muridae; Murinae; Mus; Mus.
XX
RN   [1]
RP   1-1458
RX   DOI; 10.1016/0092-8674(94)90033-7.
RX   PUBMED; 8001130.
RA   Cordes S.P., Barsh G.S.;
RT   "The mouse segmentation gene kr encodes a novel basic domain-leucine zipper
RT   transcription factor";
RL   Cell 79(6):1025-1034(1994).
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..1458
FT                   /organism="Mus musculus"
FT                   /mol_type="mRNA"
FT                   /db_xref="taxon:10090"
FT   5'UTR           1..169
FT   mRNA            1..1458
FT   CDS             170..1141
FT                   /codon_start=1
FT                   /product="basic domain/leucine zipper transcription factor"
FT                   /db_xref="GOA:P54841"
FT                   /db_xref="InterPro:IPR004826"
FT                   /db_xref="InterPro:IPR004827"
FT                   /db_xref="InterPro:IPR008917"
FT                   /db_xref="InterPro:IPR013592"
FT                   /db_xref="MGI:104555"
FT                   /db_xref="UniProtKB/Swiss-Prot:P54841"
FT                   /protein_id="AAA65689.1"
FT                   /translation="MAAELSMGQELPTSPLAMEYVNDFDLLKFDVKKEPLGRAERPGRP
FT                   CTRLQPAGSVSSTPLSTPCSSVPSSPSFSPTEPKTHLEDLYWMASNYQQMNPEALNLTP
FT                   EDAVEALIGSHPVPQPLQSFDGFRSAHHHHHHHHPHPHHGYPGAGVTHDDLGQHAHPHH
FT                   HHHHQASPPPSSAASPAQQLPTSHPGPGPHATAAATAAGGNGSVEDRFSDDQLVSMSVR
FT                   ELNRHLRGFTKDEVIRLKQKRRTLKNRGYAQSCRYKRVQQKHHLENEKTQLIQQVEQLK
FT                   QEVSRLARERDAYKVKCEKLANSGFREAGSTSDSPSSPEFFL"
FT   3'UTR           1142..1458
XX
SQ   Sequence 1458 BP; 263 A; 518 C; 452 G; 225 T; 0 other;
     gcgccgccgc gtccccagac aaaggcttgg ccggcggccc cggcccgctg cgccctcggc        60
     tccccgcctc cccggcttgc cgctcttcgc ccccgcgttt ggctcggcgc gtcccggccg       120
     gccgcaaagt tttccccgcg gcagcggcgg ctgagcctcg cttttagcga tggccgcgga       180
     gctgagcatg gggcaagagc tgcccaccag cccgctggcc atggagtacg tcaacgactt       240
     cgaccttctc aagttcgacg tgaagaagga gcccctgggg cgcgcggagc gtccgggccg       300
     gccatgcaca cgcctgcagc ctgctggctc ggtgtcgtcc accccgctca gcactccgtg       360
     cagctccgtg ccttcttctc ccagcttcag tccgactgaa ccgaagaccc atctcgagga       420
     cctgtactgg atggcgagca actaccagca gatgaacccc gaggcactca acctgacgcc       480
     cgaggacgcg gtggaggcgc tcatcggttc gcacccagtg ccccagccgc tgcagagctt       540
     cgacggcttc cgtagtgcgc accaccatca ccaccaccac caccctcatc cgcaccacgg       600
     gtacccagga gcaggtgtga ctcacgatga cctgggccag cacgctcacc cgcaccatca       660
     ccatcatcac caagcgtcgc ccccgccgtc cagcgctgcc agtcccgcgc aacagctacc       720
     cactagccac ccggggccgg gaccgcacgc aacagccgcg gcgacggctg cgggcggcaa       780
     cggtagtgtg gaggaccgct tctctgatga ccagctggtg tccatgtcgg tgcgtgagct       840
     gaaccgccac ctgcggggct tcaccaagga cgaggtgatc cgcctgaagc agaagcggcg       900
     gaccctgaag aaccggggct acgcccagtc gtgcaggtat aaacgcgtcc agcagaaaca       960
     tcacctggag aacgagaaga cgcagctcat tcagcaggtg gagcagctta agcaggaggt      1020
     gtcccggctg gcccgcgaga gagacgccta caaggtcaag tgcgagaaac tcgccaactc      1080
     cggcttcagg gaggcgggct ccaccagcga cagcccctcc tctcctgagt tctttctgtg      1140
     agtcctggcg ggtccggccc ccgcccttgc ccttgccctg gcccagactc cctattctgc      1200
     gcccctagcc ctggactccc tgtccctgcc atggccccgg ccttgacctg tttgacttga      1260
     gctagaggga ggaaggacgc gcgggtcgcg ggagtcaggc gggagcacgg gcgggcagag      1320
     aaccttggct aagaagaggg cagctcaggg cggcgcagcc tcttagactt gggcagagtt      1380
     agagaaaccc gggcgggtgc gaggtccggg agtaactttt ctccaagctg gaaggccgcg      1440
     aggcttattc caaggagt                                                    1458
//
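
The SQ line in the entry above declares the sequence length and base composition (263 + 518 + 452 + 225 + 0 = 1458, matching the 1458 BP on the ID line). A program can use this as a consistency check on the sequence lines that follow. The sketch below shows one possible way to do that in Python; the function name `read_sequence` and the three-line test record (built from the last sequence line of the example) are hypothetical, and the regular expression assumes the SQ line has the form shown above.

```python
import re
from collections import Counter


def read_sequence(entry_lines):
    """Return (declared_length, sequence) for one entry.

    The declared length is read from the SQ line; the sequence is built
    from the lines between SQ and '//', dropping blanks and the
    position counter at the end of each line.
    """
    declared = None
    chunks = []
    in_sequence = False
    for line in entry_lines:
        if line.startswith("SQ"):
            declared = int(re.search(r"Sequence (\d+) BP", line).group(1))
            in_sequence = True
        elif line.startswith("//"):
            break
        elif in_sequence:
            # Keep only letters; this removes spaces and the counter.
            chunks.append(re.sub(r"[^A-Za-z]", "", line))
    return declared, "".join(chunks)


# Hypothetical 18-base record used only for illustration.
lines = [
    "SQ   Sequence 18 BP; 5 A; 3 C; 5 G; 5 T; 0 other;",
    "     aggcttattc caaggagt                                                    18",
    "//",
]

declared, seq = read_sequence(lines)
counts = Counter(seq.upper())
print(declared == len(seq))
print(counts["A"], counts["C"], counts["G"], counts["T"])
```

Applied to the full entry above, the same check would compare the reconstructed sequence against the 1458 BP and the per-base counts given on its SQ line.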