A coarse-grain parallel implementation of the block tridiagonal divide and conquer algorithm for symmetric eigenproblems
Abstract (Summary)
Cuppen’s divide and conquer technique for symmetric tridiagonal eigenproblems, along
with Gu and Eisenstat’s modification for improvement of the eigenvector computation,
has yielded a stable, efficient, and widely-used algorithm. This algorithm has now been
extended to a larger class of matrices, namely symmetric block tridiagonal eigenproblems.
The Block Tridiagonal Divide and Conquer algorithm has shown several characteristics that
make it suitable for a number of applications, such as the Self-Consistent-Field procedure
in quantum chemistry.
This thesis discusses the steps taken to implement a coarse-grain parallel version of
the Block Tridiagonal Divide and Conquer algorithm, suitable for a parallel supercomputer
or a cluster of machines. The parallel version relies on components of the ScaLAPACK
parallel linear algebra library and follows the same model as the serial code, including the
implementation of full deflation.
A modest speedup (2 to 3) was achieved using a few processors (4 and 16). Increasing
the number of processors from 4 to 16 produced only slightly better speedup. This
implementation was not competitive with the standard ScaLAPACK symmetric eigenvalue
routine. Analysis shows that the distribution scheme chosen for the eigenvector storage requires
n O p2 function calls to the ScaLAPACK matrix multiplication routine, where n
is the matrix size and p is the number of blocks. The matrix multiplications are responsible
for the majority of the computational cost; therefore, the associated overhead needs to be
reduced in order to make this implementation more competitive.
iv
Bibliographical Information:
Advisor:
School:The University of Tennessee at Chattanooga
School Location:USA - Tennessee
Source Type:Master's Thesis
Keywords:
ISBN:
Date of Publication: