Data Validation for Big Live Data

Malcolm Crowe, Carolyn Begg, Fritz Laux, Martti Laiho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Data Integration of heterogeneous data sources relies either on periodically transferring large amounts of data to a physical Data Warehouse or on retrieving data from the sources only on request. The latter creates what is known as a virtual Data Warehouse, which is preferable when access to the latest data is paramount; the downside is that it adds network traffic and degrades query performance when the amount of data is high. In this paper, we propose the use of a readCheck validator to ensure the timeliness of the queried data and to reduce data traffic. We further show that the readCheck allows transactions to update data in the data sources while obeying the full Atomicity, Consistency, Isolation, and Durability (ACID) properties.
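
The abstract names the readCheck validator without detail here; the paper's ETags keyword suggests the familiar validator pattern it generalizes. As a minimal sketch of that pattern, not the paper's actual protocol, a virtual-warehouse client might cache each query result alongside a validator token and revalidate rather than re-fetch; the endpoint, cache shape, and function name below are invented for illustration:

    # Hypothetical sketch: revalidating a cached query result with an
    # ETag-style validator instead of re-downloading it. The endpoint
    # and cache layout are illustrative assumptions.
    import requests

    cache = {}  # url -> (validator, rows)

    def fetch_validated(url):
        """Reuse cached rows if the source confirms they are current."""
        headers = {}
        if url in cache:
            validator, _ = cache[url]
            headers["If-None-Match"] = validator  # "is my copy still valid?"
        resp = requests.get(url, headers=headers)
        if resp.status_code == 304:
            # Not modified: reuse the cached rows; no payload was re-sent.
            return cache[url][1]
        resp.raise_for_status()
        rows = resp.json()
        cache[url] = (resp.headers.get("ETag"), rows)
        return rows

When the source answers 304 Not Modified, only the small validator crosses the network, which is how revalidation can keep query results current while cutting data traffic.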
Original language: English
Title of host publication: DBKDA 2017
Subtitle of host publication: The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications
Editors: Andreas Schmidt, Fritz Laux, Dimitar Hristovski, Shin-ichi Ohnishi
Publisher: International Academy, Research, and Industry Association
Pages: 30-36
Number of pages: 7
ISBN (Electronic): 978-1-61208-558-6
Publication status: Published - 21 May 2017
Event: The Ninth International Conference on Advances in Databases, Knowledge and Data Applications - Barcelona, Spain
Duration: 21 May 2017 - 26 May 2017
Conference number: 2017
Internet address: http://iaria.org/conferences2017/DBKDA17.html

Publication series

Name: DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications
Publisher: International Academy, Research, and Industry Association
Volume: 7
ISSN (Print): 2308-4332

Conference

Conference: The Ninth International Conference on Advances in Databases, Knowledge and Data Applications
Abbreviated title: DBKDA
Country: Spain
City: Barcelona
Period: 21/05/17 - 26/05/17
Internet address: http://iaria.org/conferences2017/DBKDA17.html

Fingerprint

  • Data warehouses
  • Data integration
  • Durability
  • Degradation
  • Big data

Keywords

  • data validation
  • virtual data integration
  • ETags
  • rowversion validation
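
The rowversion validation keyword points at the complementary write-side check. As a minimal sketch of that general optimistic-update pattern (with an invented schema, not the paper's readCheck mechanism), a transaction can re-check a version column at update time and abort when another writer has intervened:

    # Minimal sketch of rowversion-style optimistic validation using a
    # plain version column in SQLite; table and column names are
    # invented for illustration.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts"
                 " (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")

    def withdraw(conn, account_id, amount):
        """Apply an update only if the row is unchanged since it was read."""
        balance, version = conn.execute(
            "SELECT balance, version FROM accounts WHERE id = ?",
            (account_id,)).fetchone()
        cur = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1"
            " WHERE id = ? AND version = ?",
            (balance - amount, account_id, version))
        if cur.rowcount == 0:
            conn.rollback()   # a concurrent writer won; validation failed
            return False
        conn.commit()
        return True

    print(withdraw(conn, 1, 30))  # True: no concurrent writer interfered

If the guarded UPDATE matches no row, another transaction changed it since the read; rolling back then preserves isolation without holding read locks.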

Cite this

Crowe, M., Begg, C., Laux, F., & Laiho, M. (2017). Data Validation for Big Live Data. In A. Schmidt, F. Laux, D. Hristovski, & S. Ohnishi (Eds.), DBKDA 2017: The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications (pp. 30-36). (DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications; Vol. 7). International Academy, Research, and Industry Association.
Crowe, Malcolm ; Begg, Carolyn ; Laux, Fritz ; Laiho, Martti. / Data Validation for Big Live Data. DBKDA 2017: The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications. editor / Andreas Schmidt ; Fritz Laux ; Dimitar Hristovski ; Shin-ichi Ohnishi. International Academy, Research, and Industry Association, 2017. pp. 30-36 (DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications).
@inproceedings{bb8012aec0434fe0bafdc2492058aeae,
title = "Data Validation for Big Live Data",
abstract = "Data Integration of heterogeneous data sources relies either on periodically transferring large amounts of data to a physical Data Warehouse or on retrieving data from the sources only on request. The latter creates what is known as a virtual Data Warehouse, which is preferable when access to the latest data is paramount; the downside is that it adds network traffic and degrades query performance when the amount of data is high. In this paper, we propose the use of a readCheck validator to ensure the timeliness of the queried data and to reduce data traffic. We further show that the readCheck allows transactions to update data in the data sources while obeying the full Atomicity, Consistency, Isolation, and Durability (ACID) properties.",
keywords = "data validation, virtual data integration, ETags, rowversion validation",
author = "Malcolm Crowe and Carolyn Begg and Fritz Laux and Martti Laiho",
year = "2017",
month = "5",
day = "21",
language = "English",
series = "DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications",
publisher = "International Academy, Research, and Industry Association",
pages = "30--36",
editor = "Andreas Schmidt and Fritz Laux and Dimitar Hristovski and Shin-ichi Ohnishi",
booktitle = "DBKDA 2017",
address = "United States",

}

Crowe, M, Begg, C, Laux, F & Laiho, M 2017, Data Validation for Big Live Data. in A Schmidt, F Laux, D Hristovski & S Ohnishi (eds), DBKDA 2017: The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications. DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications, vol. 7, International Academy, Research, and Industry Association, pp. 30-36, The Ninth International Conference on Advances in Databases, Knowledge and Data Applications, Barcelona, Spain, 21/05/17.

Data Validation for Big Live Data. / Crowe, Malcolm; Begg, Carolyn; Laux, Fritz; Laiho, Martti.

DBKDA 2017: The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications. ed. / Andreas Schmidt; Fritz Laux; Dimitar Hristovski; Shin-ichi Ohnishi. International Academy, Research, and Industry Association, 2017. p. 30-36 (DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications; Vol. 7).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - Data Validation for Big Live Data

AU - Crowe, Malcolm

AU - Begg, Carolyn

AU - Laux, Fritz

AU - Laiho, Martti

PY - 2017/5/21

Y1 - 2017/5/21

N2 - Data Integration of heterogeneous data sources relies either on periodically transferring large amounts of data to a physical Data Warehouse or on retrieving data from the sources only on request. The latter creates what is known as a virtual Data Warehouse, which is preferable when access to the latest data is paramount; the downside is that it adds network traffic and degrades query performance when the amount of data is high. In this paper, we propose the use of a readCheck validator to ensure the timeliness of the queried data and to reduce data traffic. We further show that the readCheck allows transactions to update data in the data sources while obeying the full Atomicity, Consistency, Isolation, and Durability (ACID) properties.

AB - Data Integration of heterogeneous data sources relies either on periodically transferring large amounts of data to a physical Data Warehouse or on retrieving data from the sources only on request. The latter creates what is known as a virtual Data Warehouse, which is preferable when access to the latest data is paramount; the downside is that it adds network traffic and degrades query performance when the amount of data is high. In this paper, we propose the use of a readCheck validator to ensure the timeliness of the queried data and to reduce data traffic. We further show that the readCheck allows transactions to update data in the data sources while obeying the full Atomicity, Consistency, Isolation, and Durability (ACID) properties.

KW - data validation

KW - virtual data integration

KW - ETags

KW - rowversion validation

M3 - Conference contribution

T3 - DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications

SP - 30

EP - 36

BT - DBKDA 2017

A2 - Schmidt, Andreas

A2 - Laux, Fritz

A2 - Hristovski, Dimitar

A2 - Ohnishi, Shin-ichi

PB - International Academy, Research, and Industry Association

ER -

Crowe M, Begg C, Laux F, Laiho M. Data Validation for Big Live Data. In Schmidt A, Laux F, Hristovski D, Ohnishi S, editors, DBKDA 2017: The Ninth International Conference on Advances in Databases, Knowledge, and Data Applications. International Academy, Research, and Industry Association. 2017. p. 30-36. (DBKDA, International Conference on Advances in Databases, Knowledge, and Data Applications).