Civil-Comp Proceedings
ISSN 1759-3433
CCP: 90
Edited by:
Paper 24

Grid Workflow Acceleration using a Dataflow Paradigm

M. Rashid, F. Wang and S.N. Wu

Centre for Grid Computing, Cambridge-Cranfield HPCF, AMAC/SOE, Cranfield University, United Kingdom

Full Bibliographic Reference for this paper
M. Rashid, F. Wang, S.N. Wu, "Grid Workflow Acceleration using a Dataflow Paradigm", in , (Editors), "Proceedings of the First International Conference on Parallel, Distributed and Grid Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 24, 2009. doi:10.4203/ccp.90.24
Keywords: workflow, dataflow, parallel computing, scheduling, grid computing.

Since the introduction of the grid in the early 1990s, there has been a constant demand for greater computational resources. The dataflow model of computation offers several attractive properties for parallel processing. First, it is asynchronous: an instruction executes as soon as its operands are available, so the synchronization of parallel activities is implicit in the dataflow representation. Second, instructions in the dataflow representation impose no constraints on sequencing other than data dependencies. A dataflow diagram is therefore a representation of a program that exposes all forms of parallelism, eliminating the need to manage the parallel execution of the program explicitly [1]. For high-speed computation, the advantage of the dataflow approach over the control-flow method stems from the inherent parallelism embedded at the instruction level, which allows efficient exploitation of fine-grain parallelism in application programs [2]. A direct implementation of a computer-based dataflow system is nevertheless a major challenge. This paper describes the design and implementation of a real-life dataflow paradigm. Operations may be simple or complex, but they are always simplified at the computational end. Random jobs are divided into simple tasks, and the task size is assigned according to resource availability. MATLAB parallel computing tools are used for the simulation, and the observed performance is encouraging. The experiment assumed that many variables are constant; in practice they can vary with the design environment and physical constraints, and handling this variation remains a significant challenge for future work on this project.
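The firing rule described above (an instruction executes once its operands are available, with no ordering beyond data dependencies) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation; the function `run_dataflow`, the graph format and the example expression are assumptions chosen purely for illustration.

```python
import operator
from collections import deque


def run_dataflow(nodes, inputs):
    """Run a dataflow graph; a node fires once all of its operands exist.

    nodes:  {name: (function, [operand names])}  (hypothetical format)
    inputs: {name: value} giving the initial data tokens
    Returns (values, firing_order).
    """
    values = dict(inputs)
    consumers = {}  # operand name -> nodes waiting on it
    missing = {}    # node name -> number of operands not yet available
    for name, (_, deps) in nodes.items():
        unique = set(deps)
        missing[name] = sum(1 for d in unique if d not in values)
        for d in unique:
            consumers.setdefault(d, []).append(name)

    # Nodes whose operands are all present may fire immediately.
    ready = deque(sorted(n for n, m in missing.items() if m == 0))
    order = []
    while ready:
        name = ready.popleft()
        func, deps = nodes[name]
        values[name] = func(*(values[d] for d in deps))
        order.append(name)
        # The new result is a token that may enable downstream nodes.
        for consumer in consumers.get(name, []):
            missing[consumer] -= 1
            if missing[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(nodes):
        raise ValueError("some nodes never received all of their operands")
    return values, order


# Example graph for (a + b) * (a - b); "add" and "sub" are independent.
graph = {
    "add": (operator.add, ["a", "b"]),
    "sub": (operator.sub, ["a", "b"]),
    "mul": (operator.mul, ["add", "sub"]),
}
values, order = run_dataflow(graph, {"a": 5, "b": 3})
print(values["mul"], order)
```

Here `add` and `sub` are both enabled as soon as the inputs arrive and could run on separate workers; the sketch fires them one after another only to stay short. No explicit synchronization appears in the graph definition: `mul` simply waits for its two operands.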

The simulation platform chosen for the experiments was MATLAB R2008a, used as a four-lab cluster simulator [3]. The results are encouraging: for the same job load, the computational time of the parallel dataflow model differs from that of the sequential model by 38.46%.
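The abstract states that jobs are divided into tasks whose size depends on resource availability, but it does not give the rule used. One plausible reading is a proportional split across the four labs, sketched below in Python. The function `split_by_availability`, the proportional rule and the availability figures are all assumptions for illustration, not the authors' method.

```python
def split_by_availability(total_units, availability):
    """Divide total_units of work among resources in proportion to availability.

    availability: one non-negative weight per resource (hypothetical values).
    Returns integer task sizes that sum exactly to total_units.
    """
    if total_units < 0 or any(a < 0 for a in availability):
        raise ValueError("work and availability must be non-negative")
    total_available = sum(availability)
    if total_available <= 0:
        raise ValueError("at least one resource must be available")

    # Ideal (fractional) share per resource, then its integer part.
    quotas = [total_units * a / total_available for a in availability]
    sizes = [int(q) for q in quotas]

    # Hand out leftover units to the largest fractional remainders.
    leftover = total_units - sum(sizes)
    by_remainder = sorted(
        range(len(quotas)), key=lambda i: quotas[i] - sizes[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes


# Four labs with assumed relative availability 4 : 3 : 2 : 1.
task_sizes = split_by_availability(100, [4, 3, 2, 1])
print(task_sizes)
```

The largest-remainder step keeps the task sizes integral while ensuring that no unit of work is lost or duplicated when the shares do not divide evenly.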

[1] A.R. Hurson, B. Lee, "Issues in Dataflow Computing", The Pennsylvania State University, Department of Electrical and Computer Engineering, University Park, PA.
[2] "Parallel Computing Toolbox 4.0 - MATLAB", URL, 10th December 2008.
[3] "MATLAB Distributed Computing Server 4.0", URL, 7th November 2008.
