Multi-Objective Big Data View Materialization Using MOGA

Akshay Kumar, T. V. Vijay Kumar
Copyright: © 2022 | Pages: 28
DOI: 10.4018/IJAMC.292499

Abstract

The COVID-19 pandemic has resulted in the large-scale generation of Big data. This Big data is heterogeneous and includes data on people infected with the coronavirus, people who were in contact with an infected person, demographics of infected persons, data on coronavirus testing, a huge amount of GPS data on people's locations, and a large amount of unstructured data about the prevention and treatment of COVID-19. Thus, the pandemic has produced several zettabytes of structured, semi-structured and unstructured data. The challenge is to process this Big data, which is characterized by very large volume, a brisk rate of generation and modification, and large data redundancy, in a time-bound manner so that timely predictions and decisions can be made. Materialization of views for Big data is one of the ways to enhance the efficiency of processing the data. In this paper, the Big data view selection problem is addressed as a bi-objective optimization problem using a multi-objective genetic algorithm.

1. Introduction

Information is one of the important criteria for the survival of businesses in the present world. Big data applications are required to process large amounts of data, which is cleaned, integrated, and presented in different forms, for making optimal business decisions. Big data has four basic characteristics, defined as the 4 V's of Big data: volume, i.e., a large size; velocity, i.e., a high rate of data generation; variety, i.e., heterogeneity in data; and veracity, i.e., the trustworthiness of data (Jacobs, 2009; Zikopoulos et al., 2011; Gupta et al., 2012; Kumar et al., 2015). Big data is generated from a variety of data sources, which generally produce inconsistent data at different rates, leading to complex and challenging data cleaning and integration processes. In addition, Big data in its raw form is not suitable for business decisions; rather, it is processed to create useful information for the benefit of an organization. This is also referred to as the value of Big data. Visualization, validity, vulnerability and volatility are other important considerations in Big data processing (Khan et al., 2014; Gandomi et al., 2015; Firican, 2017).

Big data applications process data in real time to enable timely decisions that help the organization or society for which the application has been designed. Luo et al. (2016) suggested the development of Big data applications for health care systems, where large amounts of clinical and hospital data could be used to forecast the spread of infectious diseases. One such recent Big data application is concerned with the spread of the COVID-19 pandemic (Chenghu et al., 2020). This application models the spread of the disease and the future healthcare infrastructure requirements for COVID-19 patients. It uses Big data related to coronavirus-positive cases, which includes data on people who were infected, people who came in contact with them, the final outcome of the infected cases including recoveries, the number of tests conducted, the demography of infected people, isolation data of infected people, etc. This heterogeneous data is increasing at a massive rate with the global spread of the pandemic. Thus, tracking the spread of COVID-19 and assessing future healthcare infrastructure require processing zettabytes of geographical, semi-structured and unstructured data. Chenghu et al. (2020) identified the role of Geographical Information Systems (GIS) and Big data in mapping, tracking, and modeling the spread of coronavirus infections using data visualization. Big data, with its predictive power, can play a major role in various support processes that help in the efforts to control the pandemic. Chenghu et al. (2020) also identified the major challenges in implementing a GIS system for an application that monitors the spread of COVID-19. These challenges were concerned with integration of the redundant data generated from different data sources, the dynamic mapping of the sources of the epidemic and their contacts, analyzing the transmission of the disease to newer geographical areas, and assessing the risk of non-availability of resources in various geographical areas to deal with the pandemic. All these challenges required appropriate and efficient query processing on the Big data related to the pandemic.

Shneiderman (2020) lists the data visualization applications that model the spread of the pandemic and support policy decision making by the government. One such effort listed in Shneiderman (2020) is the COVID-19 dashboard by Lauren Gardner and her team at Johns Hopkins University. It highlights the importance of visualization for such worldwide disasters. Thus, Big data applications that impact social life have to deal with very high volumes of data, high speed of data generation, heterogeneity of data-generating sources, and large data redundancy. A Big data application for COVID-19 requires extensive data cleaning, integration and processing to generate accurate information, in different visual forms, with high information value. It should create reliable, dynamic and actionable knowledge in a timely manner, which can save many precious lives. View materialization is one of the techniques used for enhancing the speed of processing Big data, which can result in faster decision making. This paper addresses the bi-objective Big data view materialization problem using the Multi-Objective Genetic Algorithm (MOGA).
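To make the bi-objective framing concrete, the following is a minimal sketch of the Pareto-dominance ranking commonly associated with MOGA, in which a candidate's rank is one plus the number of candidates that dominate it. It is an illustration only, not the authors' implementation: this preview does not name the two objectives, so the two cost values per candidate, the candidate list, and the function names `dominates` and `moga_ranks` are all hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Tuple

# Hypothetical: each candidate set of views is scored by two costs,
# both assumed to be minimized. The preview does not name the objectives.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b on every objective
    and strictly better on at least one (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def moga_ranks(points: List[Objectives]) -> List[int]:
    """Dominance-count ranking: rank = 1 + number of points that
    dominate the point. Non-dominated points receive rank 1."""
    return [1 + sum(dominates(q, p) for q in points) for p in points]


# Made-up objective values for five candidate view sets.
candidates: List[Objectives] = [
    (10.0, 50.0),
    (20.0, 30.0),
    (30.0, 40.0),
    (40.0, 10.0),
    (45.0, 45.0),
]

ranks = moga_ranks(candidates)
pareto_front = [c for c, r in zip(candidates, ranks) if r == 1]
print(ranks)
print(pareto_front)
```

Candidates with rank 1 form the non-dominated set, which is why a bi-objective formulation yields a set of trade-off solutions rather than a single best set of views. A full genetic algorithm would add selection based on these ranks, crossover, and mutation, none of which are sketched here.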
