CRPC-TR97767
January 1997

Title: The Design and Implementation of the Parallel Out-of-core ScaLAPACK LU, QR and Cholesky Factorization Routines

Author: E.F. D'Azevedo and J.J. Dongarra

Submitted August 1998; Available as LAPACK Working Note 118 and UTK Technical Report CS-97-347

Abstract: This paper describes the design and implementation of three core factorization routines - LU, QR and Cholesky - included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. An image of the full matrix is maintained on disk, and the factorization routines transfer sub-matrices into memory. The 'left-looking' column-oriented variant of the factorization algorithm is implemented to reduce the disk I/O traffic. The routines are implemented using a portable I/O interface and use high-performance ScaLAPACK factorization routines as in-core computational kernels. We present the details of the implementation of the out-of-core ScaLAPACK factorization routines, as well as performance and scalability results on the Intel Paragon.

------------------------------------------------------------------------------

E.F. D'Azevedo
dazevedoef@ornl.gov
Mathematical Sciences Section
Oak Ridge National Laboratory

J.J. Dongarra
dongarra@cs.utk.edu
Department of Computer Science
University of Tennessee at Knoxville
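
The left-looking scheme summarized in the abstract can be illustrated with a small single-process sketch. This is not the ScaLAPACK out-of-core code: it assumes NumPy, uses `numpy.memmap` as a stand-in for the paper's portable I/O interface, uses `numpy.linalg.cholesky` as the in-core kernel, and covers only the Cholesky case. The function name `ooc_cholesky_left_looking` and its parameters `path`, `n` and `nb` are hypothetical names chosen for illustration.

```python
import numpy as np


def ooc_cholesky_left_looking(path, n, nb):
    """Left-looking blocked Cholesky on a matrix image stored on disk.

    Illustrative sketch only (hypothetical helper, not a ScaLAPACK routine).
    `path` holds an n-by-n symmetric positive definite matrix of float64
    values in row-major order. On return, the lower triangle of the file
    holds the Cholesky factor L; the strict upper triangle is left as is.
    Only one panel of nb columns, plus one previously factored panel,
    is held in memory at a time.
    """
    disk = np.memmap(path, dtype=np.float64, mode="r+", shape=(n, n))
    for j0 in range(0, n, nb):
        j1 = min(j0 + nb, n)
        w = j1 - j0
        # Read the current panel (rows j0.., columns j0..j1-1) into memory.
        panel = np.array(disk[j0:, j0:j1])
        # Left-looking step: apply the updates from every panel already
        # factored, reading each of them from disk. Nothing to the right
        # of the current panel is touched.
        for p0 in range(0, j0, nb):
            left = np.array(disk[j0:, p0:p0 + nb])
            panel -= left @ left[:w, :].T
        # In-core factorization of the updated panel.
        l11 = np.linalg.cholesky(panel[:w, :])
        panel[:w, :] = l11
        if j1 < n:
            # Solve L21 * L11^T = A21 for the sub-diagonal part.
            panel[w:, :] = np.linalg.solve(l11, panel[w:, :].T).T
        # Write the finished panel back to disk exactly once.
        disk[j0:, j0:j1] = panel
    disk.flush()
    del disk
```

In this sketch each panel is written to disk once and the earlier panels are only read, which is the property that makes the left-looking ordering attractive when writes are expensive. A right-looking ordering would instead update, and therefore rewrite, the whole trailing matrix after every panel.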