  • Big Data


    Tim Kannegieter

    Introduction

    One definition of big data is a data collection large enough to yield insights that would be impossible with smaller collections. Another definition is data that cannot be dealt with by traditional data analytics technologies: if the questions being asked of the volume of data cannot be easily answered by traditional technologies, then it is big data.

    The primary purpose of big data is to create data-based products, whereas the primary purpose of traditional analytics is internal decision support. One way of looking at the difference between big data and traditional analytics is shown in the table below. In summary, big data is very large, unstructured and fast-moving compared with the data handled by traditional analytics, which calls for a different approach.

    [Table: big data compared with traditional analytics]

    To analyse information, present it in a meaningful way and visualise it, an enterprise needs to collect the data held in its legacy, CRM and ERP systems, together with data from third-party solutions and applications, and store it all in a data warehouse. A simplified diagram of a typical data warehouse is shown below (excluding ETL software, business intelligence, dashboards and advanced analytic tools).

    [Figure: simplified diagram of a typical data warehouse]

    Diagram courtesy of Arthur Baoustanos, aib Consulting Services

    Big data requires a different storage and aggregation approach. Information from emails, documents, weblogs, social media sources, images and videos is collected in one storage system, or platform. One commonly used open source platform is Hadoop, which stores data in the Hadoop Distributed File System (HDFS). A Hadoop environment is well suited to storing and retrieving very large amounts of data. If it is necessary to join databases or different datasets, other tools, such as in-memory computing tools, will be needed to provide the necessary computing power.
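To illustrate why joining datasets leans on memory and computing power, here is a minimal sketch of an in-memory hash join in plain Python. It is not how Hadoop or any particular in-memory product works internally; the `hash_join` helper, the field names and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of dicts on `key` using an in-memory index."""
    # Build an index of the right-hand dataset; this is the part that
    # must fit in memory for the join to be fast.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Probe the index once per left-hand row and merge matching rows.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical data: structured customer records and weblog entries.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
weblog = [
    {"customer_id": 1, "page": "/home"},
    {"customer_id": 1, "page": "/pricing"},
    {"customer_id": 3, "page": "/home"},
]

joined = hash_join(customers, weblog, "customer_id")
for row in joined:
    print(row)
```

Only customer 1 appears in both datasets, so the result contains that customer's two weblog entries. The index over one dataset has to be held in memory, which is one reason large joins call for more computing resources than simple storage and retrieval.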

    Other technologies used for storing and processing big data are shown in the diagram below.

    [Figure: other technologies used for storing and processing big data]

    Diagram courtesy of Arthur Baoustanos, aib Consulting Services

    These technologies are used to create a big data environment as shown in the following diagram.
    [Figure: a big data environment]

    Diagram courtesy of Arthur Baoustanos, aib Consulting Services

    Big data is stored by combining the traditional data warehouse with the big data environment as shown below.

    [Figure: traditional data warehouse combined with the big data environment]

    Diagram courtesy of Arthur Baoustanos, aib Consulting Services

    Aggregation vs correlation

    Much of the focus when analysing Big Data is on aggregation, that is, on reducing the volume of data to a manageable size.

    Data correlation, or relating seemingly unrelated data through other data, is challenging with Big Data in an unaggregated form. In a multi-dimensional data space with many attributes and a large volume of data, starting from the wrong hypothesis will lead to the wrong conclusion.

    User interaction with Big Data is through summaries or aggregations. For example, the data from a group of sensors can be characterised in terms of one-minute, daily, weekly or yearly summaries. In many IoT applications, users do not need to see the source data.
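The time-based summaries described above can be sketched in a few lines of Python. This is an illustrative example only: the `summarise` helper, the choice of statistics and the sample readings are assumptions, not part of any specific IoT platform.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def summarise(readings, bucket_format):
    """Group (timestamp, value) readings by a strftime key and summarise each group."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        buckets[timestamp.strftime(bucket_format)].append(value)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for key, values in sorted(buckets.items())
    }


# Hypothetical sensor readings: one value every 20 seconds for three minutes.
start = datetime(2024, 1, 1, 0, 0, 0)
readings = [(start + timedelta(seconds=20 * i), 20.0 + i) for i in range(9)]

per_minute = summarise(readings, "%Y-%m-%d %H:%M")  # one-minute summaries
per_day = summarise(readings, "%Y-%m-%d")           # daily summaries

for key, summary in per_minute.items():
    print(key, summary)
```

Changing the format string changes the granularity of the summary, so the same source data can serve one-minute and daily views without the user ever seeing the individual readings.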

    There are other ways of aggregating Big Data. For example, anomaly detection is an aggregation approach because it reduces a large volume of data to very few results. Some machine learning algorithms can also be thought of as aggregations, as they follow a similar approach.
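As a rough illustration of anomaly detection acting as an aggregation, the sketch below reduces a thousand readings to the few that lie far from the mean. The z-score rule, the threshold of 3 and the sample data are assumptions made for this example; real systems may use quite different detection methods.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=3.0):
    """Return (index, value) pairs more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical sensor values: 1000 readings close to 10.0 with a single spike.
values = [10.0 + 0.1 * ((i % 3) - 1) for i in range(1000)]
values[500] = 25.0

anomalies = find_anomalies(values)
print(len(values), "readings reduced to", len(anomalies), "anomaly:", anomalies)
```

The user sees a single flagged reading rather than the thousand source values, which is the sense in which anomaly detection behaves like an aggregation.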

    Solutions have been developed that analyse source data on insertion and instantaneously stream aggregations of Big Data to users as micro- and macro-summaries, which are useful for real-time monitoring and decision support systems.
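One way to picture aggregation on insertion is a pair of running summaries that are updated as each value arrives, so no raw data needs to be re-read. The sketch below is a simplified assumption of how such a scheme could look; the `RunningSummary` and `StreamAggregator` classes, and the use of fixed-size windows as "micro" summaries, are hypothetical and not taken from any specific product.

```python
class RunningSummary:
    """Incrementally maintained count, mean, min and max; raw values are not kept."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def add(self, value):
        self.count += 1
        # Incremental mean update: avoids storing or re-summing past values.
        self.mean += (value - self.mean) / self.count
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


class StreamAggregator:
    """Updates a macro-summary (all data) and micro-summaries (fixed-size windows) on insert."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.macro = RunningSummary()
        self.micro = RunningSummary()
        self.completed_micro = []

    def insert(self, value):
        self.macro.add(value)
        self.micro.add(value)
        if self.micro.count == self.window_size:
            # Window is full: publish it and start a new micro-summary.
            self.completed_micro.append(self.micro)
            self.micro = RunningSummary()


aggregator = StreamAggregator(window_size=3)
for value in [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 5.0]:
    aggregator.insert(value)

print([m.mean for m in aggregator.completed_micro])
print(aggregator.macro.count, aggregator.macro.maximum)
```

Because each insert does a constant amount of work, the summaries are always current, which is what makes this style of aggregation suitable for real-time monitoring.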

    Sources: The information on this page has been sourced primarily from the following:


