Computational Technology Resources - CCP

Keywords: data grids, replication strategies, decision tree, predictive model.

Summary

In grid environments, file replication strategies are critical to the overall performance of large-scale, data-intensive applications. However, because Grids are dynamic, file replication decisions are always made by monitoring changes in the popularity of a file. Although prompt replication can prevent a future increase in access latency, the replication load and the ongoing accesses to the files concerned may conflict with each other and thereby increase access latency. Ideally, replicating files in advance can smooth access latency, provided that changes in file popularity can be predicted. In this paper, we propose a predictive file replication strategy, based on forecasting the future popularity of files, to address this problem.

The simulations, driven by real system traces, were conducted in OptorSim, the European Data Grid simulation environment. Across the three replication strategies simulated, the proposed predictive replication strategy outperforms the LRU and Economic Model strategies under sequential and Zipf access patterns. In addition, the Queue Length scheduling algorithm strikes a balance between mean job time and resource usage. No strategy, however, delivers the best performance in every circumstance: choosing a good replication strategy requires considering trace skewness, storage capacity and the maximal computing power.
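To illustrate why trace skewness and storage capacity matter, the following is a minimal sketch that replays a sequential trace and a Zipf-like trace against a local store with LRU eviction. It is not OptorSim and does not reproduce the paper's experiments; the function names, the trace lengths, the skew value and the capacities are all assumptions chosen for illustration.

```python
import random
from collections import OrderedDict


def sequential_trace(num_files, length):
    """Cycle through file ids 0..num_files-1 in order."""
    return [i % num_files for i in range(length)]


def zipf_trace(num_files, length, skew, rng):
    """Draw file ids so that the file of rank r has weight 1 / r**skew."""
    weights = [1.0 / (rank ** skew) for rank in range(1, num_files + 1)]
    return rng.choices(range(num_files), weights=weights, k=length)


def lru_hit_rate(trace, capacity):
    """Fraction of accesses served from a local LRU store of `capacity` files."""
    store = OrderedDict()
    hits = 0
    for file_id in trace:
        if file_id in store:
            hits += 1
            store.move_to_end(file_id)      # mark as most recently used
        else:
            store[file_id] = True           # fetch a replica of the file
            if len(store) > capacity:
                store.popitem(last=False)   # evict the least recently used
    return hits / len(trace)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for capacity in (10, 50, 100):
        seq = lru_hit_rate(sequential_trace(100, 5000), capacity)
        zipf = lru_hit_rate(zipf_trace(100, 5000, 1.0, rng), capacity)
        print(f"capacity={capacity:3d}  sequential={seq:.2f}  zipf={zipf:.2f}")
```

A sequential trace that cycles over more files than the store can hold never hits under LRU, whereas a skewed trace keeps its few popular files resident. This is one way to see why no single strategy is best across all traces and storage sizes.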

As a policy generator, the decision-tree-based predictive model should link file characteristics to the future popularity of each file and pass the resulting rules to a replication manager, which decides when and where to create replicas. With files evaluated in advance, the access latency caused by geographically distributed resources can be smoothed.
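The idea of a policy generator handing rules to a replication manager can be sketched as follows. This is not the authors' model: it trains a one-level decision tree (a decision stump) by exhaustive search, and the feature names `recent_accesses` and `growth`, the toy history and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

FileRecord = Dict[str, float]
Rule = Tuple[str, float]  # (feature, threshold): "feature > threshold => popular"


def train_stump(records: List[FileRecord], labels: List[bool],
                features: List[str]) -> Rule:
    """Pick the single-feature threshold rule with the fewest training errors."""
    best = None  # (feature, threshold, errors)
    for feature in features:
        values = sorted({r[feature] for r in records})
        # Candidate thresholds lie midway between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            errors = sum((r[feature] > threshold) != label
                         for r, label in zip(records, labels))
            if best is None or errors < best[2]:
                best = (feature, threshold, errors)
    if best is None:
        raise ValueError("no feature has two distinct values to split on")
    return best[0], best[1]


def should_replicate(record: FileRecord, rule: Rule) -> bool:
    """Replication-manager side: apply the rule produced by the policy generator."""
    feature, threshold = rule
    return record[feature] > threshold


# Synthetic history (made up for illustration): characteristics observed in one
# period, and whether the file turned out to be popular in the next period.
history = [
    {"recent_accesses": 2, "growth": -1},
    {"recent_accesses": 3, "growth": 0},
    {"recent_accesses": 10, "growth": 1},
    {"recent_accesses": 12, "growth": 4},
    {"recent_accesses": 4, "growth": 5},
    {"recent_accesses": 11, "growth": -2},
]
became_popular = [False, False, False, True, True, False]

rule = train_stump(history, became_popular, ["recent_accesses", "growth"])

if __name__ == "__main__":
    print("learned rule:", rule)
    print(should_replicate({"recent_accesses": 1, "growth": 6}, rule))
```

A full decision tree would apply the same split search recursively to each branch. Deciding where to place a replica, as well as when, would need site-level features that this sketch leaves out.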


Civil-Comp Proceedings, ISSN 1759-3433
CCP: 90
PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING FOR ENGINEERING
Edited by: B.H.V. Topping and P. Iványi

Paper 21
A Predictive File Replication Strategy for Grid Computing
C.H. Liao¹, F.Z. Wang¹, S.N. Wu¹, M.M. Rashid¹ and N. Helian²
¹Department of Applied Mathematics and Computing, School of Engineering, Cranfield University, United Kingdom
²Department of Computer Science, University of Hertfordshire, United Kingdom

doi:10.4203/ccp.90.21

Full Bibliographic Reference for this paper
C.H. Liao, F.Z. Wang, S.N. Wu, M.M. Rashid, N. Helian, "A Predictive File Replication Strategy for Grid Computing", in B.H.V. Topping, P. Iványi, (Editors), "Proceedings of the First International Conference on Parallel, Distributed and Grid Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 21, 2009. doi:10.4203/ccp.90.21