Structure-based multiple RNA sequence alignment and finding RNA motifs
Abstract (Summary)
Craig Zirbel, Advisor
With faster computers and the growing availability of RNA crystal structures, we can now
use more information to align homologous RNA sequences. We take a crystal structure and
construct a probabilistic model of an RNA molecule based on a stochastic context-free
grammar (SCFG). The model is built from objects called nodes, which modularize it into
small, more manageable pieces. Using this model, we take sequences that are similar to the
sequence in the 3D crystal structure and find the most probable way that the model could
have generated each sequence. This gives a detailed description of how each node of the
model could have generated the sequence, and we use that information to align the
sequences. Given a seed alignment, we give a procedure for quickly constructing a 3D
structural alignment. In addition, we show how the parameters of the model can be
estimated. We can also perform motif swaps using objects called alternative nodes.
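To make the idea of "the most probable way that the model could have generated the sequence" concrete, here is a minimal sketch using a toy grammar. It is not the node-based model of the thesis. The three rules (S -> x S y, S -> x S, S -> empty), every probability value, and the names `most_probable_parse`, `pair_prob`, `T_PAIR`, `T_SINGLE`, `T_END`, and `SINGLE_PROB` are assumptions chosen only for illustration. The code fills a CYK-style table of best log-probabilities over subsequences and traces back the chosen rules.

```python
import math

# Hypothetical rule probabilities for the toy grammar (they sum to 1).
T_PAIR, T_SINGLE, T_END = 0.6, 0.3, 0.1
SINGLE_PROB = 0.25  # uniform emission for an unpaired base


def pair_prob(x, y):
    """Hypothetical emission probability for a base pair (sums to 1 over 16 pairs)."""
    if x + y in ("AU", "UA", "GC", "CG"):
        return 0.2
    if x + y in ("GU", "UG"):
        return 0.05
    return 0.01


def most_probable_parse(seq):
    """Return (log-probability, rule list) of the best derivation of seq."""
    n = len(seq)
    # best[i][j]: highest log-probability of generating seq[i:j] from S.
    best = [[-math.inf] * (n + 1) for _ in range(n + 1)]
    choice = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        best[i][i] = math.log(T_END)
        choice[i][i] = "end"
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Rule S -> x S: emit seq[i] unpaired.
            best[i][j] = math.log(T_SINGLE) + math.log(SINGLE_PROB) + best[i + 1][j]
            choice[i][j] = "single"
            if length >= 2:
                # Rule S -> x S y: emit seq[i] and seq[j-1] as a pair.
                p = (math.log(T_PAIR)
                     + math.log(pair_prob(seq[i], seq[j - 1]))
                     + best[i + 1][j - 1])
                if p > best[i][j]:
                    best[i][j], choice[i][j] = p, "pair"
    # Trace back which rule generated each position.
    parse, i, j = [], 0, n
    while choice[i][j] != "end":
        if choice[i][j] == "pair":
            parse.append(("pair", i, j - 1))
            i, j = i + 1, j - 1
        else:
            parse.append(("single", i))
            i += 1
    return best[0][n], parse
```

The traceback assigns each sequence position to the rule that generated it. Two sequences parsed against the same model can then be aligned by matching positions assigned to the same rule, which is the role the nodes play in the approach described above.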
We have also developed an algorithm to search quickly through RNA 3D structures for
motifs. Given a query motif with m bases, we find the center of the heavy atoms of each
base and rotate the query onto candidate motifs that have the same number of bases. We
then measure how well each candidate fits the query using a discrepancy that we define,
which involves the distance between bases and their relative orientations. A simple
inequality lets us quickly identify candidates whose discrepancy with the query motif will
exceed a cutoff discrepancy, and we use it to screen out the vast majority of candidates
quickly.
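The screening step can be illustrated with a minimal sketch. The abstract does not state the discrepancy or the inequality, so the bound used here is an assumption and may differ from the one in the thesis. It relies on a general fact: for any rigid motion of a candidate onto a query with m points, the sum over pairs of squared differences in inter-center distances is at most m times the summed squared location error. Dividing that pairwise sum by m therefore gives a lower bound on the location error that needs no rotation at all. The names `center`, `distance_lower_bound`, and `screen` are hypothetical, and the sketch ignores base orientations and the rotation step.

```python
import math
from itertools import combinations


def center(atoms):
    """Centroid of one base's heavy-atom coordinates, given as (x, y, z) tuples."""
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in range(3))


def distance_lower_bound(query_centers, cand_centers):
    """Lower bound on the summed squared location error after any rigid fit.

    Assumed inequality: sum over pairs of (d_ij - d'_ij)^2 <= m * sum_i |e_i|^2,
    where d_ij and d'_ij are inter-center distances in the query and candidate
    and e_i is the residual at base i after superposition.
    """
    m = len(query_centers)
    total = 0.0
    for i, j in combinations(range(m), 2):
        dq = math.dist(query_centers[i], query_centers[j])
        dc = math.dist(cand_centers[i], cand_centers[j])
        total += (dq - dc) ** 2
    return total / m


def screen(query_centers, candidates, cutoff_sq):
    """Keep only candidates whose lower bound does not already exceed the cutoff."""
    return [c for c in candidates
            if distance_lower_bound(query_centers, c) <= cutoff_sq]
```

Candidates that survive `screen` would still need the full superposition and discrepancy calculation. The point of the bound is that it uses only pairwise distances, which are cheap to compute and can be tabulated once per structure.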
To my grandmother and grandfather.
Bibliographical Information:
Advisor: Craig Zirbel
School: Bowling Green State University
School Location: USA - Ohio
Source Type: Master's Thesis
Keywords: nucleotide sequence