کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
524389 868643 2010 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
On iterative QR pre-processing in the parallel block-Jacobi SVD algorithm
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله
On iterative QR pre-processing in the parallel block-Jacobi SVD algorithm
چکیده انگلیسی

An efficient version of the parallel two-sided block-Jacobi algorithm for the singular value decomposition of an m×nm×n matrix A includes the pre-processing step, which consists of the QR factorization of A with column pivoting followed by the optional LQ factorization of the R-factor. Then the iterative two-sided block-Jacobi algorithm is applied in parallel to the R-factor (or L  -factor). For the efficient computation of the parallel QR (or LQ) factorization with (or without) column pivoting implemented in the ScaLAPACK, some matrix block cyclic distribution on a process grid r×cr×c with p=r×c,r,c⩾1, and block size nb×nbnb×nb is required so that all processors remain busy during the whole parallel QR (or LQ) factorization. Optimal values for parameters r, c   and nbnb are estimated experimentally using matrices of order n=4000n=4000 and 8000, and the number of processors p=8p=8 and 16, respectively. It turns out that the optimal values are about nb=100nb=100 and r⩽cr⩽c with both r, c   near to p. These parameters are then used in numerical experiments for six various distributions of singular values combined with well- (κ=101)(κ=101) and ill-conditioned matrices (κ=108)(κ=108). It is shown that using optimal parameters in the pre-processing step, the parallel two-sided block-Jacobi SVD algorithm performs better (or equally well) than the ScaLAPACK routine PDGESVD for matrices with a multiple minimal/maximal singular value regardless to the condition number. For other distributions of singular values, our algorithm is slower than the ScaLAPACK. The un-pivoted QRLQ pre-processing step is then re-formulated and extended to the QR iteration, and its connection to the QR algorithm applied to specific symmetric, positive definite matrices is shown. This connection helps to explain observations in another set of experiments with a variable number of QR iteration steps. In general, the best results for all six distributions of singular values are achieved by using about six QR iteration steps in pre-processing.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Parallel Computing - Volume 36, Issues 5–6, June 2010, Pages 297–307
نویسندگان
, , , ,