Generation and comparative analysis of genomic sequences of Trypanosoma rangeli
The hemoflagellate protozoan parasite Trypanosoma (Herpetosoma) rangeli Tejera, 1920 (Kinetoplastida: Trypanosomatidae) shares several species of invertebrate and vertebrate hosts with T. cruzi, the etiological agent of Chagas' disease. Recently, the genomes of three trypanosomatid species of major importance to human health (Tri-Tryps) were described, but non-pathogenic species, among them T. rangeli, have not been well studied. Two distinct approaches have been used in the genomics of several species: GSS (Genome Sequence Survey), which aims to generate sequences from randomly generated genomic DNA clones, and EST (Expressed Sequence Tags), directed at generating sequences from cDNA libraries. In the present study, 1,720 genomic sequences from T. rangeli SC58 were generated by GSS. Furthermore, an integrated system for sequence analysis and annotation named GARSA (Genomic Analysis Resources for Sequence Annotation) was also developed. Through this system it is possible to run 21 bioinformatics programs, from simple sequence analysis and trimming to phylogenetic and protein domain analyses, in a user-friendly and intuitive manner. After analysis of the 1,720 sequences, a total of 915 were grouped into 375 non-redundant sequences (GSS-nr). The G+C content of the coding regions was 55%. Similarity searches based on BLAST and InterPro returned positive results for 68% of the sequences, with 53% corresponding to hypothetical proteins of organisms belonging to the same family, especially T. cruzi. Sequences related to the mRNA editing process (DEAD box helicase) were also found, as well as sequences from the parasite coat, such as trans-sialidases, metalloproteases and mucins. Functional annotation based on the Gene Ontology Consortium vocabulary was carried out; most assignments were related to molecular function, notably RNA helicases, serine peptidases and ligands. For 31% of the generated sequences it was not possible to infer function based on similarity searches. 
Thus, these sequences may represent unknown sequences, T. rangeli-specific sequences or even intergenic regions. To date there are no reports concerning the T. rangeli genome, indicating that the present work is the first to address a large-scale exploration of the parasite genome.
Advisor: Alberto Martin Rivera Dávila; Edmundo Carlos Grisard
School: Faculdades Oswaldo Cruz
Source Type: Master's Thesis
Keywords: Trypanosoma rangeli genome annotation bioinformatics.
Date of Publication: 05/05/2006