Computational Approaches to the Identification and Characterization of Non-Coding RNA Genes
Non-coding RNAs (ncRNAs) have emerged as diverse and powerful key players in the cell, with capabilities ranging from catalyzing processes essential to all living organisms, such as protein synthesis, to acting as highly specific regulators of gene expression. To fully understand the functional significance of ncRNAs, it is critical to identify and characterize the repertoire of ncRNAs in the cell. Practically every genome-wide screen for ncRNAs has revealed large numbers of expressed ncRNAs and has often identified species-specific ncRNA families of unknown function. Recent advances in high-throughput sequencing techniques necessitate efficient and reliable methods for the computational identification and annotation of genes. A major aim of the work underlying this thesis has been to develop and use computational tools for the identification and characterization of ncRNA genes.
We used computational approaches in combination with experimental methods to study the ncRNA repertoire of the model organism Dictyostelium discoideum. We report ncRNA genes belonging to well-characterized gene families as well as previously unknown and potentially species-specific ncRNA families. The complicated task of de novo ncRNA gene prediction was successfully addressed by developing a method for nucleotide composition-based gene prediction that uses maximal-scoring partial sums and considers overlapping dinucleotides.
We also report substantial heterogeneity among human spliceosomal snRNAs. Northern blot analysis and cDNA cloning, as well as bioinformatic analysis of publicly available microarray data, revealed a large number of expressed snRNAs. In particular, we identified U1 snRNA variants with several nucleotide substitutions that could potentially have dramatic effects on splice site recognition.
In conclusion, by combining computational approaches with experimental analysis, we have identified a rich and diverse ncRNA repertoire in the eukaryotes D. discoideum and Homo sapiens. The surprising diversity among the snRNAs in H. sapiens suggests a functional involvement in the recognition of non-canonical introns and the regulation of messenger RNA splicing.
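The abstract names composition-based prediction with maximal-scoring partial sums over overlapping dinucleotides but gives no details. The Python sketch below is only one plausible reading of that idea, not the thesis's actual method: each overlapping dinucleotide receives a log-odds score (foreground versus background frequency), and the highest-scoring contiguous stretch is found from running partial sums. The function names, the uniform foreground model, and the AT-rich background frequencies are all illustrative assumptions.

```python
import math

def dinucleotide_scores(seq, fg, bg):
    """Log-odds score (base 2) for every overlapping dinucleotide in seq."""
    return [math.log2(fg[seq[i:i + 2]] / bg[seq[i:i + 2]])
            for i in range(len(seq) - 1)]

def max_scoring_segment(scores):
    """Return (best_score, start, end) of the best contiguous run, end exclusive.

    A segment's score is the difference between two partial sums, so it is
    enough to remember the smallest partial sum seen so far.
    """
    best, best_start, best_end = 0.0, 0, 0
    total, min_total, min_idx = 0.0, 0.0, 0
    for i, s in enumerate(scores, start=1):
        total += s
        if total - min_total > best:
            best, best_start, best_end = total - min_total, min_idx, i
        if total < min_total:
            min_total, min_idx = total, i
    return best, best_start, best_end

# Hypothetical toy models (assumed values, not taken from the thesis):
# an AT-rich background built from independent base frequencies,
# and a uniform foreground over the 16 dinucleotides.
base_freq = {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1}
bg = {x + y: base_freq[x] * base_freq[y] for x in "ACGT" for y in "ACGT"}
fg = {x + y: 1 / 16 for x in "ACGT" for y in "ACGT"}

seq = "ATATGCGCGCATAT"
score, start, end = max_scoring_segment(dinucleotide_scores(seq, fg, bg))
# Dinucleotide i spans nucleotides i and i+1, so the segment covers seq[start:end + 1].
candidate = seq[start:end + 1]
print(candidate, round(score, 2))
```

In this toy setting the GC-rich core stands out against the AT-rich flanks. A realistic predictor would estimate both models from annotated data and would typically report every maximal-scoring segment rather than only the single best one.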
Source Type: Doctoral Dissertation
Keywords: NATURAL SCIENCES; Chemistry; Theoretical chemistry; Bioinformatics; ncRNA; snRNA; U1; splice site; alternative splicing; Dictyostelium; nucleotide composition; partial sums
Date of Publication: 01/01/2009