Kinetoplast New Year
alicat at sanger.ac.uk
Tue Jan 23 17:41:42 BRST 2001
T. brucei Genome Survey Sequence Data
Just before Christmas 2000, the Sanger Centre Pathogen Sequencing Unit
submitted > 43,000 GSS sequences from the 2-kb sheared genomic DNA
clones from the TIGR library.
As an aid to the community, all GSS sequences were subjected to a BLASTX
analysis against the Swissprot/TrEMBL databases. This has now been
completed. The summary data are shown below:
Applying a probability cut-off of 1e-10 to the BLAST output:
o 8196 had a hit (~21 percent), of which, according to their
  description lines:
    o 1095 were probably INGI-related (ORF 1, 2)
    o 441 were adenylate cyclases
    o 77 were described as "ESAG"
    o 632 were VSGs
    o 112 were ribosomal proteins
    o 66 were helicases
    o 1454 showed similarity to hypothetical proteins
    o 4170 did not fall into the above "classes"
o 2025 had no hits at all
o species-by-species tally of top BLASTX hits (Note: T. brucei brucei
  and T. brucei are treated as separate items)
Each of these datasets is available, either by clicking on the above
links or from the GSS ftp site. The entire set of Sanger GSS is also
available as a FASTA database. Additional output data analyses are
available on request.
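For anyone wanting to repeat this kind of summary on their own BLASTX output, the cut-off, description-line classification and species tally described above could be sketched roughly as follows. This is only an illustration, not the script used at Sanger: the record layout (query id, E-value, description line, species), the keyword rules, the decision to keep hits at or below the cut-off, and the example records are all assumptions invented for the sketch.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-10  # cut-off quoted in the post

# Hypothetical keyword rules, checked in order; the first match wins.
# The real classification criteria are not given in the post.
CLASS_RULES = [
    ("INGI-related", ("ingi",)),
    ("adenylate cyclase", ("adenylate cyclase",)),
    ("ESAG", ("esag",)),
    ("VSG", ("vsg", "variant surface glycoprotein")),
    ("ribosomal protein", ("ribosomal protein",)),
    ("helicase", ("helicase",)),
    ("hypothetical protein", ("hypothetical",)),
]


def classify(description):
    """Assign a top-hit description line to one class by keyword match."""
    text = description.lower()
    for label, keywords in CLASS_RULES:
        if any(keyword in text for keyword in keywords):
            return label
    return "other"  # did not fall into any of the above classes


def summarise(top_hits):
    """Tally classes and species for top hits passing the cut-off.

    top_hits: iterable of (query_id, evalue, description, species) tuples,
    one per GSS read (an assumed layout, not a real BLAST output format).
    """
    classes = Counter()
    species = Counter()
    for query_id, evalue, description, organism in top_hits:
        if evalue > E_VALUE_CUTOFF:
            continue  # hit too weak; treated as "no hit" at this cut-off
        classes[classify(description)] += 1
        # Species names are counted exactly as written, so
        # "Trypanosoma brucei" and "Trypanosoma brucei brucei" stay separate.
        species[organism] += 1
    return classes, species


# Invented example records, for illustration only.
example = [
    ("gss001", 1e-35, "VARIANT SURFACE GLYCOPROTEIN MITAT 1.2", "Trypanosoma brucei brucei"),
    ("gss002", 3e-12, "PUTATIVE ATP-DEPENDENT RNA HELICASE", "Trypanosoma brucei"),
    ("gss003", 0.002, "60S RIBOSOMAL PROTEIN L5", "Leishmania major"),
    ("gss004", 1e-20, "HYPOTHETICAL 45.1 KDA PROTEIN", "Leishmania major"),
]
classes, species = summarise(example)
for label, count in classes.most_common():
    print(label, count)
```

Because the first matching rule wins, the order of the rules matters whenever a description line contains more than one keyword, and simple substring matching may misclassify some lines; both would need checking against real description lines.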
The ftp site:
This email is available as a web page:
For more information please contact: Bart Barrell
(barrell at sanger.ac.uk), Neil Hall (nh1 at sanger.ac.uk) or Matt Berriman
(mb4 at sanger.ac.uk).
Wellcome Trust Genome Campus
Cambs. CB10 1SA
Tel -44-1223- 49 48 51
Fax -44-1223- 49 49 19
Email: alicat at sanger.ac.uk