Article code: 497814
Journal code: 862945
Publication year: 2015
English full text: 17 pages, PDF, free download
English title of the ISI paper
Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines
Persian translation of the title
هزینه محاسباتی حل‌کننده‌های چندجبهه‌ای ایزوهندسی روی ماشین‌های حافظه توزیع‌شده موازی
Keywords
Multi-frontal solver; isogeometric analysis; distributed parallel memory; computational cost; communication cost
Related subjects
Engineering and Basic Sciences / Computer Engineering / Computer Science Software
English abstract


• We estimate the computational cost of an isogeometric solver on distributed memory parallel machines.
• We show O(p^2) scalability as we increase the global continuity, in 2D and 3D.
• We show O(N) and O(N^(4/3)) costs for 2D and 3D parallel direct solvers.
• We verify the costs experimentally on the STAMPEDE Linux cluster at TACC.
• We test MUMPS, PaStiX, and SuperLU through the PETIGA toolkit built on top of PETSc.

This paper derives theoretical estimates of the computational cost for an isogeometric multi-frontal direct solver executed on parallel distributed memory machines. We show theoretically that for the C^(p-1) global continuity of the isogeometric solution, both the computational cost and the communication cost of a direct solver are of order O(log(N) p^2) for the one dimensional (1D) case, O(N p^2) for the two dimensional (2D) case, and O(N^(4/3) p^2) for the three dimensional (3D) case, where N is the number of degrees of freedom and p is the polynomial order of the B-spline basis functions. The theoretical estimates are verified by numerical experiments performed with three parallel multi-frontal direct solvers: MUMPS, PaStiX and SuperLU, available through the PETIGA toolkit built on top of PETSc. Numerical results confirm these theoretical estimates both in terms of p and N. For a given problem size, the strong efficiency rapidly decreases as the number of processors increases, becoming about 20% for 256 processors for a 3D example with 128^3 unknowns and linear B-splines with C^0 global continuity, and 15% for a 3D example with 64^3 unknowns and quartic B-splines with C^3 global continuity. At the same time, one cannot arbitrarily increase the problem size, since the memory required by higher order continuity spaces is large, quickly consuming all the available memory resources even in the parallel distributed memory version. Numerical results also suggest that the use of distributed parallel machines is highly beneficial when solving higher order continuity spaces, although the number of processors that one can efficiently employ is somewhat limited.
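The asymptotic estimates quoted in the abstract can be sketched as simple scaling formulas. The snippet below is a minimal illustration (not code from the paper): it encodes only the leading-order terms O(log(N) p^2), O(N p^2), and O(N^(4/3) p^2), with constant factors dropped entirely, and checks how the 3D cost responds to doubling N or p.

```python
import math

def solver_cost(N, p, dim):
    """Leading-order cost of the multi-frontal direct solver, up to a
    constant factor, for C^(p-1)-continuous B-splines (per the abstract)."""
    if dim == 1:
        return math.log(N) * p**2
    if dim == 2:
        return N * p**2
    if dim == 3:
        return N**(4/3) * p**2
    raise ValueError("dim must be 1, 2, or 3")

# Doubling N in 3D multiplies the cost by 2^(4/3) (about 2.52x),
# while doubling p quadruples it, matching O(N^(4/3) p^2).
ratio_N = solver_cost(2_000_000, 3, 3) / solver_cost(1_000_000, 3, 3)
ratio_p = solver_cost(1_000_000, 6, 3) / solver_cost(1_000_000, 3, 3)
print(ratio_N, ratio_p)
```

This kind of back-of-the-envelope scaling check mirrors how the paper's experimental timings are compared against the theoretical orders in N and p.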

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Methods in Applied Mechanics and Engineering, Volume 284, 1 February 2015, Pages 971–987