Kinetoplast New Year

Al Ivens alicat at
Tue Jan 23 17:41:42 BRST 2001

Dear Colleagues:

T. brucei Genome Survey Sequence Data

Just before Christmas 2000, the Sanger Centre Pathogen Sequencing Unit
submitted more than 43,000 GSS sequences from the 2-kb sheared genomic
DNA clones of the TIGR library.

As an aid to the community, all GSS sequences were subjected to a BLASTX
analysis against the Swissprot/TrEMBL databases. This has now been
completed, and the summary data are shown below:

Applying a probability cut-off of 1e-10 to the BLAST output: 

o 8196 had a hit (~21 percent):

of which, according to their description lines: 

o 1095 were probably INGI-related (ORF 1, 2):

o 441 were adenylate cyclases:

o 77 were described as "ESAG":

o 632 were VSGs:

o 112 were ribosomal proteins:

o 66 were helicases:

o 1454 showed similarity to hypothetical proteins:

o 4170 did not fall into the above "classes":

o 2025 had no hits at all. 

o species-by-species tally of top BLASTX hits (Note: T. brucei brucei
and T. brucei are treated as separate items)
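
For anyone wishing to repeat this kind of tally on their own BLASTX
output, the cut-off and description-line classification above can be
sketched roughly as follows (Python). This is only an illustration under
stated assumptions, not the pipeline actually used for the figures above:
the input layout (query id, E-value, description), the keyword rules and
the names classify() and summarise() are all hypothetical.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-10  # the cut-off quoted in the summary above

# Hypothetical keyword rules, checked in order; the real rules applied to
# the description lines are not given in this message.
CLASS_KEYWORDS = [
    ("INGI-related", ("ingi",)),
    ("adenylate cyclase", ("adenylate cyclase",)),
    ("ESAG", ("esag",)),
    ("VSG", ("vsg", "variant surface glycoprotein")),
    ("ribosomal protein", ("ribosomal protein",)),
    ("helicase", ("helicase",)),
    ("hypothetical protein", ("hypothetical",)),
]


def classify(description):
    """Return the first class whose keyword occurs in the description."""
    text = description.lower()
    for label, keywords in CLASS_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return label
    return "other"


def summarise(hits, cutoff=E_VALUE_CUTOFF):
    """Count classes of the best hit per query.

    hits: iterable of (query_id, e_value, description) tuples (assumed layout).
    Hits with an E-value above the cut-off are ignored.
    """
    best = {}
    for query_id, e_value, description in hits:
        if e_value > cutoff:
            continue
        if query_id not in best or e_value < best[query_id][0]:
            best[query_id] = (e_value, description)
    return Counter(classify(desc) for _, desc in best.values())


# Made-up example records, for illustration only.
example_hits = [
    ("gss1", 1e-30, "Variant surface glycoprotein precursor"),
    ("gss1", 1e-12, "Hypothetical protein"),
    ("gss2", 1e-5, "ATP-dependent RNA helicase"),
    ("gss3", 1e-15, "60S ribosomal protein L3"),
]
counts = summarise(example_hits)
```

Because the rules are checked in order, a description matching more than
one keyword is counted once, under the first matching class; a different
ordering would give slightly different totals.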

Additional output data analyses are available on request.

Each of these datasets is available, either by clicking on the above
links or from the GSS ftp site. The entire set of Sanger GSS is also
available as a FASTA database.

The ftp site:

This email is available as a web page:

For more information please contact: Bart Barrell (barrell at),
Neil Hall (nh1 at) or Matt Berriman (mb4 at)

Al Ivens
Sanger Centre
Wellcome Trust Genome Campus
Cambs. CB10 1SA

Tel -44-1223- 49 48 51
Fax -44-1223- 49 49 19
Email: alicat at

Research pages:
