Paper 206

A Service-Oriented Approach to Facilitate Big Data Analytics on the Web

A. Cheptsov and B. Koller

High Performance Computing Center, Stuttgart, Germany

Full Bibliographic Reference for this paper
A. Cheptsov, B. Koller, "A Service-Oriented Approach to Facilitate Big Data Analytics on the Web", in , (Editors), "Proceedings of the Fourteenth International Conference on Civil, Structural and Environmental Engineering Computing", Civil-Comp Press, Stirlingshire, UK, Paper 206, 2013. doi:10.4203/ccp.102.206
Keywords: semantic web, Java, parallelization, LarKC, JUNIPER.

The volume of data exposed on the Web is growing rapidly. Reasoning is a widespread knowledge discovery and information retrieval technique, used extensively in the development of Web applications. However, most reasoning algorithms face significant challenges when scaled up to the problem sizes of the modern Semantic Web, which have broken the barrier of billions of RDF statements (triples). Reasoning applications are generally not optimized for these emerging Internet-scale data sets, a situation known as the 'big data' problem. In this paper, we introduce a service-oriented approach that facilitates the development of reasoning applications able to scale to big data demands. The approach builds on the incomplete reasoning engine LarKC (the Large Knowledge Collider) and on parallelization techniques developed for big data applications within the JUNIPER EU project. We discuss the use of the service-oriented approach to develop two exemplary resource discovery applications, query expansion and subsetting, based on the random indexing technique.

