Computational Technology Resources - CCP

Keywords: reconfigurable computing, hardware acceleration, field programmable gate array, finite element method, preconditioned conjugate gradient.

Summary

Three-dimensional finite element analyses require a large amount of time and memory to solve a large but sparse system of linear equations. One approach to computing solutions in parallel is to use reconfigurable hardware, such as field programmable gate arrays (FPGAs), to create custom co-processors that accelerate the algorithm. However, to realize the full potential of such approaches, the underlying algorithms must be inherently parallelizable.

Recently, FPGAs have reached the speed and logic density required to implement highly complex systems. The latest Virtex 5 series produced by Xilinx Corporation has up to 330,000 logic cells (equivalent to 55 million basic 2-input logic gates) capable of operation at speeds exceeding 550 MHz [1]. FPGA co-processors have much lower cost and greater flexibility than ASIC hardware. For the right type of application, a reconfigurable hardware-software co-processor can rival expensive parallel computers in accelerating computationally expensive algorithms.

This paper presents results from the implementation of an element-by-element preconditioned conjugate gradient FEM solver using single-precision floating-point arithmetic on a reconfigurable computing platform. The platform consists of two Celoxica RC2000 PCI bus plug-in cards, one equipped with a single Xilinx Virtex 2V 6000 FPGA and the other with a single Xilinx Virtex 4VLX160 FPGA. The 4VLX160 FPGA contains 152,064 logic cells, each consisting of a look-up table and a flip-flop, plus 288 blocks of 18 Kb RAM and 96 XtremeDSP slices. A software host implements an element-by-element scheme in which the whole problem is divided into a series of sub-domains that can be downloaded to the reconfigurable computing boards. With this scheme, the large, sparse global stiffness matrix no longer needs to be assembled in the FPGA. Because the element matrix-vector calculations are completely independent, the computational resources of the reconfigurable hardware can be used efficiently and scalably, even as the matrix size increases.
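The element-by-element idea can be illustrated in software with a minimal sketch. It is not the authors' FPGA implementation. It assumes a Jacobi (diagonal) preconditioner, which the summary does not specify, and it uses Python double-precision floats rather than 32-bit arithmetic. The function names (`ebe_matvec`, `ebe_diagonal`, `ebe_pcg`) and the one-dimensional `spring_chain` test problem are hypothetical, chosen to keep the example short. A tetrahedral mesh would supply larger element matrices in the same `(dofs, ke)` form.

```python
def ebe_matvec(elements, x):
    """Compute y = K x without assembling K: gather, multiply by K_e, scatter-add."""
    y = [0.0] * len(x)
    for dofs, ke in elements:
        xe = [x[d] for d in dofs]  # gather the element's degrees of freedom
        for i, di in enumerate(dofs):
            y[di] += sum(ke[i][j] * xe[j] for j in range(len(dofs)))
    return y


def ebe_diagonal(elements, n):
    """Accumulate diag(K) element by element (assumed Jacobi preconditioner)."""
    diag = [0.0] * n
    for dofs, ke in elements:
        for i, di in enumerate(dofs):
            diag[di] += ke[i][i]
    return diag


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def ebe_pcg(elements, b, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for K x = b using only element matrices."""
    diag = ebe_diagonal(elements, len(b))
    x = [0.0] * len(b)
    r = list(b)                                   # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]      # apply the preconditioner
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        q = ebe_matvec(elements, p)               # the only use of the "matrix"
        alpha = rz / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def spring_chain(n):
    """Hypothetical test problem: n unit springs in series, left end fixed."""
    elements = [([0], [[1.0]])]                   # spring attached to the fixed support
    for i in range(n - 1):
        elements.append(([i, i + 1], [[1.0, -1.0], [-1.0, 1.0]]))
    return elements


if __name__ == "__main__":
    # Unit force on the free end of a five-spring chain.
    solution, iterations = ebe_pcg(spring_chain(5), [0.0, 0.0, 0.0, 0.0, 1.0])
    print(solution, iterations)
```

In this sketch each pass through the loop in `ebe_matvec` touches only one element's data, which is the independence that the summary says allows the element calculations to be distributed across hardware.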

The 32-bit floating-point hardware-software co-processor, which solves three-dimensional tetrahedral finite element problems using the preconditioned conjugate gradient method, achieves a speed-up of 40 on a single FPGA board (based on a 4VLX160 FPGA) compared with a software implementation of the same algorithm on a fast PC.

References

[1] www.xilinx.com/products/silicon_solutions/fpgas/virtex, accessed 27 March 2008.

Civil-Comp Proceedings, ISSN 1759-3433
CCP: 89
PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE ON ENGINEERING COMPUTATIONAL TECHNOLOGY
Edited by: M. Papadrakakis and B.H.V. Topping

Paper 2

Acceleration of an Element-by-Element Preconditioned Conjugate Gradient Solver for Three-Dimensional Tetrahedral Finite Elements using Field Programmable Gate Arrays

J. Hu¹, S.F. Quigley¹ and A.H.C. Chan²
¹Department of Electronic, Electrical and Computer Engineering, ²Department of Civil Engineering, University of Birmingham, United Kingdom

doi:10.4203/ccp.89.2

Full Bibliographic Reference for this paper

J. Hu, S.F. Quigley, A.H.C. Chan, "Acceleration of an Element-by-Element Preconditioned Conjugate Gradient Solver for Three-Dimensional Tetrahedral Finite Elements using Field Programmable Gate Arrays", in M. Papadrakakis, B.H.V. Topping, (Editors), "Proceedings of the Sixth International Conference on Engineering Computational Technology", Civil-Comp Press, Stirlingshire, UK, Paper 2, 2008. doi:10.4203/ccp.89.2