NoSQL: Moving from MapReduce Batch Jobs to Event-Driven Data Collection

Bibliographic Details
Main Author: Klingsbo, Lukas
Format: Others
Language: English
Published: Uppsala universitet, Institutionen för informationsteknologi 2015
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-260394
Description
Summary: Collecting and analysing data of analytical value is important for many service providers today. Many use NoSQL databases for their larger software systems, but it is less well known how to analyse and gather business intelligence from the data in these systems effectively. This paper suggests a method of separating the most valuable analytical data from the rest in real time while providing an effective traditional database for the analyser. We first analyse our given data sets to decide whether big data tools are required, and then compare traditional databases to see how well they fit the context. A technique that makes use of an asynchronous logging system inserts the data from the main system into the dedicated analytical database. The tests show that our technique can be used efficiently with a traditional database even on large data sets (more than 1,000,000 insertions per hour per database node) and still provide both historical data and aggregate functions for the analyser.
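
The summary does not say how the asynchronous logging system is built, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes SQLite as a stand-in for the traditional analytical database; the names `AsyncEventLogger`, `totals_by_kind`, the `events` table and the batch size are all hypothetical. The main system hands events to an in-memory queue and returns immediately, while a background thread inserts them in batches.

```python
import queue
import sqlite3
import threading

_STOP = object()  # sentinel telling the writer thread to finish


class AsyncEventLogger:
    """Hypothetical sketch: log() returns at once; a background thread
    inserts the queued events into a separate analytical database."""

    def __init__(self, db_path, batch_size=500):
        self._db_path = db_path
        self._batch_size = batch_size  # assumed value, not from the source
        self._events = queue.Queue()
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def log(self, kind, value):
        # Called by the main system; never touches the database itself.
        self._events.put((kind, value))

    def close(self):
        # Flush everything queued so far, then stop the writer thread.
        self._events.put(_STOP)
        self._writer.join()

    def _run(self):
        # The connection is created and used only inside the writer thread.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, value REAL NOT NULL, "
            "logged_at TEXT DEFAULT CURRENT_TIMESTAMP)"
        )
        conn.commit()
        running = True
        while running:
            batch = []
            item = self._events.get()  # block until an event arrives
            while True:
                if item is _STOP:
                    running = False
                    break
                batch.append(item)
                if len(batch) >= self._batch_size:
                    break
                try:
                    item = self._events.get_nowait()
                except queue.Empty:
                    break
            if batch:
                with conn:  # one transaction per batch
                    conn.executemany(
                        "INSERT INTO events (kind, value) VALUES (?, ?)", batch
                    )
        conn.close()


def totals_by_kind(db_path):
    """Aggregate query an analyser might run against the collected data."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT kind, COUNT(*), SUM(value) FROM events "
            "GROUP BY kind ORDER BY kind"
        ).fetchall()
    finally:
        conn.close()
```

For scale, 1,000,000 insertions per hour corresponds to roughly 278 insertions per second per node, which is why the sketch groups queued events into one transaction per batch rather than committing each row separately.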