Computational Technology Resources - CCP

Keywords: asynchronous iterations, convergence detection, global residual, parallel computing.

Summary

A major question when implementing asynchronous iterations is how to detect that convergence has been reached. In terms of efficiency, centralized detection protocols suffer from scaling limits, and more elaborate mechanisms may introduce termination delays. On the other hand, effective convergence is hardly guaranteed when resorting to assumption-based protocols. One thus has to work out which choice is the most appropriate for a given parallel configuration.

To be more precise, let a sequence of vectors be generated by asynchronous iterations to find the solution of a fixed-point problem. In such a context, this sequence of vectors is actually implicit, and one only explicitly handles parallel sequences of local subvectors. The asynchronous convergence detection problem therefore consists of determining, in a non-blocking way and as quickly as possible, the moment when a residual error evaluation function would nearly vanish if applied to a gathered potential solution. The main distributed approaches consist of: modifying the iterative procedure to ensure finite-time termination; explicitly evaluating residual errors from global state snapshots; approximating the number of iterations required to reach convergence; monitoring both the consistency and the persistence of local convergence; and evaluating the diameter of nested sets of solutions by means of “macro-iterations”.
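As an illustration of what "a residual error evaluation function applied to a gathered potential solution" can mean, here is a minimal sketch in Python. It assumes a linear fixed-point problem x = Bx + c and a 2-norm residual; neither is stated in the summary. The function names, the `partition` argument and the small test system are hypothetical and chosen only for the example.

```python
import math

def apply_rows(B, c, rows, x):
    """Evaluate the rows `rows` of the affine map f(x) = B x + c (assumed form)."""
    return [sum(B[i][j] * x[j] for j in range(len(x))) + c[i] for i in rows]

def local_residual_sq(B, c, rows, x):
    """Squared 2-norm of the residual block x_rows - f_rows(x) owned by one process."""
    fx = apply_rows(B, c, rows, x)
    return sum((x[i] - fi) ** 2 for i, fi in zip(rows, fx))

def gathered_residual(B, c, partition, subvectors):
    """Residual norm ||x - f(x)|| of the vector gathered from all local subvectors.

    partition[p] lists the global indices owned by process p, and
    subvectors[p] holds the corresponding local values.
    """
    # Gather the local subvectors into one potential solution.
    x = [0.0] * len(c)
    for rows, sub in zip(partition, subvectors):
        for i, v in zip(rows, sub):
            x[i] = v
    # Sum the per-process contributions, as a reduction would do.
    return math.sqrt(sum(local_residual_sq(B, c, rows, x) for rows in partition))

if __name__ == "__main__":
    B = [[0.0, 0.5], [0.25, 0.0]]
    c = [1.0, 1.0]
    partition = [[0], [1]]
    print(gathered_residual(B, c, partition, [[0.0], [0.0]]))
    print(gathered_residual(B, c, partition, [[12 / 7], [10 / 7]]))
```

In a real asynchronous run the gathered vector is never formed explicitly, which is precisely why detecting the moment this quantity becomes small is non-trivial.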

Modifying the iterative procedure is intrusive and even requires additional assumptions on the asynchronous iterative model. The use of nested sets has been investigated only from a mathematical standpoint, and suggests the need for intrusive piggybacking techniques. The monitoring-based and the prediction-based approaches can lead to untimely termination, which requires a post-detection final check. The snapshot method introduces computation data into snapshot messages, which leads to an O(n) communication overhead. In our earlier work, an O(1) snapshot message size is achieved, but at the cost of assuming a bound on communication delays. The analysis therein shows that what is evaluated is an approximate residual error, and that this approximation is explicitly bounded. Roughly speaking, it allows for a non-consistent snapshot. We therefore investigate, in this paper, to what extent such a snapshot can be non-consistent, which even allows us to consider no control at all, meaning that no prior snapshot protocol is performed. We performed several experiments on a supercomputer, with up to 504 processor cores, solving a convection-diffusion equation on a regular 3D grid geometry by means of an asynchronous iterative method based on a mixed Jacobi and Gauss-Seidel relaxation scheme.
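The idea of evaluating a residual with no snapshot protocol can be illustrated on a toy problem. The sketch below is not the authors' algorithm or experiment. It is a sequential Python simulation, under assumptions of our own, of Jacobi-type asynchronous iterations for x = Bx + c in which every component is updated from data that is stale by at most `max_delay` steps. At each step it records two max-norm quantities: an estimate obtained by reducing the local residuals exactly as each process sees them, and a reference residual computed on the consistently gathered vector. The function name, the delay pattern and the test system are hypothetical.

```python
def async_jacobi_trace(B, c, steps, max_delay):
    """Simulate asynchronous iterations with bounded staleness.

    Returns one (estimate, reference) pair per step, both in the max norm:
      estimate  = reduction of local residuals computed from stale views,
                  with no snapshot protocol at all;
      reference = residual of the consistently gathered vector x(t).
    """
    n = len(c)
    history = [[0.0] * n]  # history[t] is the implicit global vector x(t)
    trace = []
    for t in range(steps):
        x = history[t]
        new_x, local_res = [], []
        for i in range(n):
            acc = c[i]
            for j in range(n):
                # Hypothetical deterministic delay pattern, bounded by max_delay.
                delay = (i + 2 * j + t) % (max_delay + 1)
                acc += B[i][j] * history[max(0, t - delay)][j]
            new_x.append(acc)
            # Local residual as seen by the owner of component i.
            local_res.append(abs(x[i] - acc))
        estimate = max(local_res)
        reference = max(
            abs(x[i] - (sum(B[i][j] * x[j] for j in range(n)) + c[i]))
            for i in range(n)
        )
        trace.append((estimate, reference))
        history.append(new_x)
    return trace

if __name__ == "__main__":
    B = [[0.0, 0.25, 0.25], [0.25, 0.0, 0.25], [0.25, 0.25, 0.0]]
    c = [1.0, 1.0, 1.0]
    for step, (est, ref) in enumerate(async_jacobi_trace(B, c, 30, 2)):
        print(step, est, ref)
```

For this contracting toy system both quantities decay towards zero, and printing them side by side shows how far the protocol-free estimate can drift from the reference at a given step. Whether that gap is acceptable in practice is the question the paper examines experimentally.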


Civil-Comp Proceedings, ISSN 1759-3433
CCP: 112
PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, GPU AND CLOUD COMPUTING FOR ENGINEERING
Edited by: P. Iványi and B.H.V. Topping

Paper 17
Distributed asynchronous convergence detection without detection protocol
G. Gbikpi-Benissan¹ and F. Magoules²
¹RUDN University, Russia
²Centrale Supelec, Universite Paris-Saclay, France
doi:10.4203/ccp.112.17

Full Bibliographic Reference for this paper
G. Gbikpi-Benissan, F. Magoules, "Distributed asynchronous convergence detection without detection protocol", in P. Iványi, B.H.V. Topping, (Editors), "Proceedings of the Sixth International Conference on Parallel, Distributed, GPU and Cloud Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 17, 2019. doi:10.4203/ccp.112.17
©Civil-Comp Limited 2023