Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning

Title: Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning
Publication Type: Conference Proceedings
Year of Publication: 2017
Authors: Anzt, H., J. Dongarra, G. Flegar, E. S. Quintana-Ortí, and A. E. Tomás
Conference Name: International Conference on Computational Science (ICCS 2017)
Volume: 108
Pagination: 1783-1792
Date Published: 2017-06
Publisher: Procedia Computer Science
Conference Location: Zurich, Switzerland
Abstract: In this work we present new kernels for the generation and application of block-Jacobi preconditioners that accelerate the iterative solution of sparse linear systems on graphics processing units (GPUs). Our approach departs from the conventional LU factorization and decomposes the diagonal blocks of the matrix using the Gauss-Huard method. When enhanced with column pivoting, this method is as stable as LU with partial/row pivoting. Due to extensive use of GPU registers and integration of implicit pivoting, our variable size batched Gauss-Huard implementation outperforms the batched version of LU factorization. In addition, the application kernel combines the conventional two-stage triangular solve procedure, consisting of a backward solve followed by a forward solve, into a single stage that performs both operations simultaneously.
DOI: 10.1016/j.procs.2017.05.186
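Illustrative sketch: the abstract refers to the Gauss-Huard method with column pivoting. The following is a minimal, sequential, pure-Python illustration of that textbook algorithm applied to a single small dense block. It is not the authors' GPU kernel: it does no batching, makes no use of registers, and swaps columns explicitly rather than implicitly. The function name `gauss_huard_solve` and the choice to carry the right-hand side as an extra column are assumptions made for this example only.

```python
def gauss_huard_solve(a, b):
    """Solve a x = b for one small dense block using Gauss-Huard
    elimination with column pivoting (illustrative CPU sketch)."""
    n = len(a)
    # Augmented matrix [A | b]; the last column is never chosen as a pivot.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    perm = list(range(n))  # perm[k] = original index of the unknown in column k

    for k in range(n):
        # Row elimination: use rows 0..k-1 to eliminate the entries of
        # row k that lie to the left of the diagonal.
        for j in range(k, n + 1):
            m[k][j] -= sum(m[k][i] * m[i][j] for i in range(k))

        # Column pivoting: pick the largest remaining entry in row k.
        p = max(range(k, n), key=lambda j: abs(m[k][j]))
        if m[k][p] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            for row in m:
                row[k], row[p] = row[p], row[k]
            perm[k], perm[p] = perm[p], perm[k]

        # Diagonal scaling: normalize row k to the right of the pivot.
        pivot = m[k][k]
        for j in range(k + 1, n + 1):
            m[k][j] /= pivot

        # Column elimination: eliminate column k above the diagonal.
        for i in range(k):
            factor = m[i][k]
            for j in range(k + 1, n + 1):
                m[i][j] -= factor * m[k][j]

    # The last column now holds the solution in pivoted order; undo the
    # column permutation to recover the original ordering of the unknowns.
    x = [0.0] * n
    for k in range(n):
        x[perm[k]] = m[k][n]
    return x
```

Because each step eliminates entries both to the left of and above the diagonal, the solution is available as soon as the last row has been processed, with no separate triangular solves afterwards. In this sketch that happens because the right-hand side is updated alongside the matrix, which is a simplification of a separate application step that reuses stored factors.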