Computational & Technology Resources
an online resource for computational,
engineering & technology publications 

Computational Science, Engineering & Technology Series
ISSN 1759-3158 CSETS: 27
TRENDS IN PARALLEL, DISTRIBUTED, GRID AND CLOUD COMPUTING FOR ENGINEERING
Edited by: P. Iványi, B.H.V. Topping
Chapter 11
High Performance Strategies for Large Scale Complex Structural Analysis
J.Y. Cognard^{1} and P. Verpeaux^{2}
^{1}Laboratoire Brestois de Mécanique et des Systèmes, ENSIETA, Brest, France
J.Y. Cognard, P. Verpeaux, "High Performance Strategies for Large Scale Complex Structural Analysis", in P. Iványi, B.H.V. Topping, (Editors), "Trends in Parallel, Distributed, Grid and Cloud Computing for Engineering", Saxe-Coburg Publications, Stirlingshire, UK, Chapter 11, pp 243-268, 2011. doi:10.4203/csets.27.11
Keywords: nonlinear computations, parallel strategies, algorithms, large scale problems, load balancing, industrial environment.
Summary
Reducing the time and cost of mechanical design requires nonlinearities to be taken into account
when simulating the behaviour of structures. Unfortunately, the numerical cost of these simulations
is often too high for them to be widely used in industry. The joint use of powerful
algorithms and parallel computers is necessary to reduce the cost of these complex
simulations significantly. Accurate numerical predictions, especially with respect to the safety
constraints that are increasingly required in high-tech industries, call for realistic models.
In such analyses, the effects of aging, sometimes in severe environments,
can strongly influence the mechanical behaviour of the materials, which can lead to the solution of
coupled problems. Such studies must therefore take into account increasingly accurate
mechanical properties, which leads to detailed modelling of the various parts of the structures studied.
Unfortunately, the numerical simulation of these problems can be difficult, as they generally lead to the
solution of large scale, complex, time-dependent nonlinear problems.
Nonlinear problems are usually solved using so-called incremental methods, which split the studied time interval into a series of time increments. Using an estimate for the displacement leads to a time-independent nonlinear problem, which is solved by means of a Newton-type iterative method. This algorithm mainly leads to solving two types of subproblems, which can be time consuming for a large number of degrees of freedom and for strongly nonlinear constitutive laws, i.e. for industrial type problems. On the one hand, linear global problems defined over the whole structure have to be solved; on the other hand, the integration of the constitutive relationships leads to the solution of nonlinear equations that are local in space (at each integration point). There are therefore two main difficulties, with different mechanical properties. Moreover, the iterative resolution process couples these two difficulties, and this coupling has to be taken into account in the numerical implementation. A precise modelling of some structures (composites, assemblies, etc.) requires highly contrasting mechanical properties to be taken into account, and for some complex industrial assemblies the mesh can contain some flattened elements. These two properties can lead to very bad conditioning of the stiffness matrix under an elastic assumption for the different constituents. The resolution of such complex linear problems, taking into account the various boundary conditions, can be done accurately using direct solvers or parallel direct solvers. For three-dimensional applications, even with suitable ordering approaches that limit fill-in during the factorization of the matrix in order to reduce matrix storage, the computational wall-clock time increases very quickly with the number of degrees of freedom.
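The incremental Newton scheme described above can be illustrated on a toy problem. The following is only a minimal sketch, not the chapter's implementation: it assumes a clamped 1D bar with a tip load, a hypothetical saturating material law (`stress_and_tangent`), and a tridiagonal direct solve standing in for the global linear problem. All function names and parameter values are illustrative assumptions.

```python
import math

def stress_and_tangent(eps, E=1.0, eps0=0.01):
    """Local subproblem (assumed law): sigma = E*eps0*tanh(eps/eps0)."""
    t = math.tanh(eps / eps0)
    return E * eps0 * t, E * (1.0 - t * t)

def solve_tridiagonal(lower, diag, upper, rhs):
    """Global subproblem: direct solve of a tridiagonal system (Thomas algorithm).
    lower[i] is the entry left of diag[i], upper[i] the entry right of it."""
    n = len(diag)
    d, b = diag[:], rhs[:]
    for i in range(1, n):
        m = lower[i] / d[i - 1]
        d[i] -= m * upper[i - 1]
        b[i] -= m * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x

def incremental_newton(n_elem=8, length=1.0, area=1.0, f_max=0.008,
                       n_incr=4, tol=1e-12, max_iter=50):
    """Apply the tip load in n_incr increments; Newton iterations per increment."""
    h = length / n_elem
    u = [0.0] * (n_elem + 1)          # u[0] is the clamped node and stays zero
    for k in range(1, n_incr + 1):
        f_tip = f_max * k / n_incr    # external load at the end of increment k
        for _ in range(max_iter):
            # Local step: constitutive law at each element (one point each).
            law = [stress_and_tangent((u[e + 1] - u[e]) / h)
                   for e in range(n_elem)]
            # Assemble the residual and the tangent matrix on the free nodes.
            res = [0.0] * n_elem
            diag = [0.0] * n_elem
            lower = [0.0] * n_elem
            upper = [0.0] * n_elem
            for e, (sig, e_t) in enumerate(law):
                ke = e_t * area / h
                j = e                 # free-dof index of node e + 1
                res[j] -= sig * area
                diag[j] += ke
                if e > 0:
                    i = e - 1         # free-dof index of node e
                    res[i] += sig * area
                    diag[i] += ke
                    upper[i] -= ke
                    lower[j] -= ke
            res[-1] += f_tip
            if max(abs(r) for r in res) < tol:
                break
            # Global step: linear solve for the displacement correction.
            du = solve_tridiagonal(lower, diag, upper, res)
            for j in range(n_elem):
                u[j + 1] += du[j]
        else:
            raise RuntimeError("Newton iterations did not converge")
    return u
```

Calling `incremental_newton()` returns the nodal displacements at the final load level. In a real structural code the local step is a nonlinear solve at every integration point and the global step is a large sparse factorization, which is where the two costs discussed above arise.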
Another important limitation is the storage of the stiffness matrix, which also increases very quickly with the number of degrees of freedom (d.o.f.); the use of out-of-core storage strategies is thus necessary to solve large scale linear problems. It is important to understand, however, that intensive use of disk storage drastically increases the computational wall-clock time. For instance, for a powerful PC a practical limit is around 10,000,000 d.o.f. with around 100 Gb matrix storage. Various powerful iterative solvers exist, for instance parallel preconditioned conjugate gradient techniques, but the convergence of such approaches is often not assured when the stiffness matrix is very badly conditioned. Such problems can be encountered in industrial applications, in research analyses and in inverse identification procedures for material model parameters, which can be time-consuming simulations for complex three-dimensional models. Thus, robust solvers have to be developed in order to meet the quality criteria of industrial type software, i.e. the determination of the correct solution of the problem (if the data is correct) with a predictable wall-clock time and without the need for tuning. Therefore, the definition of high performance strategies for large scale complex time-dependent nonlinear structural analysis requires the joint use of powerful algorithms and efficient parallel strategies. The aim of this research project was to extend the possibilities of the finite element code CAST3M (developed at CEA, France), whose purpose is to facilitate the development of new algorithms. Moreover, it is important to take into account the experience gained over several decades of development of powerful numerical strategies for nonlinear simulations in industrial environments.
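For readers unfamiliar with the iterative alternative mentioned above, the following is a minimal sequential sketch of a conjugate gradient solver with a Jacobi (diagonal) preconditioner. It is a generic textbook form, not the solver used in CAST3M; the spring-chain test matrix (`chain_operator`) and all names are assumptions made for illustration. The function returns a convergence flag because, as noted above, convergence within a given iteration budget is not guaranteed for badly conditioned matrices.

```python
import math

def pcg_jacobi(matvec, diag, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite
    operator. Returns (x, iterations, converged)."""
    n = len(b)
    x = [0.0] * n
    b_norm = math.sqrt(sum(bi * bi for bi in b))
    if b_norm == 0.0:
        return x, 0, True
    r = b[:]                                   # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]   # Jacobi preconditioning
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for it in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(sum(ri * ri for ri in r)) <= tol * b_norm:
            return x, it, True
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter, False                  # iteration budget exhausted

def chain_operator(k):
    """Hypothetical test case: springs in series, clamped at the left end.
    Spring e (stiffness k[e]) joins dof e-1 (or the clamp) and dof e."""
    n = len(k)
    def matvec(v):
        out = [0.0] * n
        for e in range(n):
            left = v[e - 1] if e > 0 else 0.0
            f = k[e] * (v[e] - left)
            out[e] += f
            if e > 0:
                out[e - 1] -= f
        return out
    diag = [k[e] + (k[e + 1] if e + 1 < n else 0.0) for e in range(n)]
    return matvec, diag
```

A possible use is `matvec, diag = chain_operator([1.0, 100.0, 1.0, 100.0, 1.0])` followed by `pcg_jacobi(matvec, diag, [0.0, 0.0, 0.0, 0.0, 1.0])`. Increasing the stiffness contrast and the chain length is a simple way to observe how the iteration count reacts to conditioning.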
The challenge is to merge the possibilities of different parallel computers (in particular efficient and economical configurations of multicore 64-bit computers) with the traditional requirements of an industrial code: robustness and flexibility, ease of use, and predictability of computational resource employment. The proposed parallelisation strategy uses the mechanical properties of the two types of subproblems to be solved. On the one hand, domain decomposition techniques can be used to solve the linear global problems. On the other hand, the CPU time spent integrating the constitutive laws depends on several parameters: the material behaviour, the position of the integration point in the structure, the history of the loading path, etc. Therefore, for complex simulations, it is nearly impossible to predict how the numerical cost of this part evolves in space from one increment to the next. In order to have a well-balanced load during the integration of the constitutive laws, without communication, we propose the use of a type of second domain decomposition. An optimisation of the communications between the two domain decompositions is necessary to obtain good performance for the simulation of a wide class of nonlinear problems with quasi-static response. The starting point is to make use of the mechanical properties of the different types of equations to be solved in order to distribute computations over the different processors of a parallel computer. The approach is based on the use of two domain decompositions, where the goal is to balance the computational load while limiting the redistribution of the tasks. Good load balancing of the tasks, together with keeping communications as low as possible, is necessary to obtain an effective parallel algorithm. The implementation of this algorithm starts from an extension of the possibilities of GIBIANE, the user language of the code CAST3M.
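To make the load-balancing issue concrete, the sketch below compares a fixed geometric split of integration points with a cost-based assignment. It uses a generic greedy heuristic (largest cost first, onto the least-loaded processor) and is not the second domain decomposition proposed in the chapter; in particular it ignores locality and the cost of redistributing data, which the authors explicitly seek to limit. The per-point costs are assumed to be known, for example measured during a previous increment, and all names are hypothetical.

```python
import heapq

def balance_by_cost(costs, n_proc):
    """Assign each integration point to a processor, largest cost first,
    always choosing the currently least-loaded processor."""
    heap = [(0.0, p) for p in range(n_proc)]   # (current load, processor id)
    heapq.heapify(heap)
    owner = [None] * len(costs)
    for idx in sorted(range(len(costs)), key=lambda i: -costs[i]):
        load, p = heapq.heappop(heap)
        owner[idx] = p
        heapq.heappush(heap, (load + costs[idx], p))
    return owner

def processor_loads(costs, owner, n_proc):
    """Total cost carried by each processor for a given assignment."""
    loads = [0.0] * n_proc
    for c, p in zip(costs, owner):
        loads[p] += c
    return loads

# Assumed example: 12 cheap (elastic) points and 4 expensive (plastic) points
# clustered at one end of the structure.
costs = [1.0] * 12 + [10.0] * 4
geometric = [i // 4 for i in range(len(costs))]   # contiguous blocks of 4
balanced = balance_by_cost(costs, 4)
```

With these assumed costs, the contiguous split leaves one processor with all the expensive points, while the cost-based assignment spreads them evenly. `processor_loads(costs, geometric, 4)` and `processor_loads(costs, balanced, 4)` give the two load distributions.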
We have created a parallel language environment that eases the development of parallel algorithms at either the programming level or the user level. It is based on the development environment of the finite element code CAST3M. The parallel language developed, which is based on an object-based virtual shared memory system, offers the user the view of a single global address space over the individual memories. It ensures data coherence and hides data exchanges between processors, and much of the sequential code can be reused directly. The proposed system can be implemented on most parallel computers, as it is developed with machine-independent programming techniques, and it is worth noting that the different concepts can be used in other object-based parallel languages. The purpose of this chapter is to present strategies developed to solve efficiently nonlinear mechanical problems with highly contrasting mechanical properties, which can lead to very bad conditioning of the stiffness matrix. Numerical examples for large scale industrial type problems are presented to validate the proposed parallel approach.
