K-VARP: K-anonymity for varied data streams via partitioning

Ankhbayar Otgonbayar, Zeeshan Pervez, Keshav Dahal, Steve Eager

Research output: Contribution to journal (Article)

5 Citations (Scopus)
136 Downloads (Pure)

Abstract

Internet-of-Things (IoT) devices produce and transmit enormous amounts of data. Extracting valuable information from this volume of data has become an important part of business and research. However, extracting information from this data without providing privacy protection puts individuals at risk. Data has to be sanitized before use, and anonymization provides a solution to this problem. Since IoT is a collection of numerous different devices, the data streams of these devices tend to vary over time, thus creating a varied data stream. However, traditional data stream anonymization approaches only provide privacy protection for data streams that have predefined and fixed attributes. Therefore, conventional methods cannot work directly on a varied data stream. In this work, we propose K-VARP (K-anonymity for VARied data stream via Partitioning) to publish varied data streams. K-VARP reads tuples and assigns them to partitions using their description, and all tuples must be anonymized before expiring. It tries to anonymize an expiring tuple within its partition if that partition is eligible to produce a k-anonymous cluster. Otherwise, partition merging is applied. In K-VARP we propose a new merging criterion called R-likeness to measure the similarity distance between a tuple and partitions. Moreover, flexible re-using and imputation-free publication are applied in K-VARP to achieve better anonymization quality and performance. Our experiment on a real dataset shows that K-VARP is efficient and effective compared to existing algorithms. K-VARP showed approximately twenty percent less information loss, while forming a similar number of clusters within comparable computation time.
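The partition-then-merge flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a tuple's "description" is the set of attribute names it carries, and it uses Jaccard similarity of attribute sets as a stand-in for R-likeness, which the abstract names but does not define. The function names `partition_by_description`, `jaccard`, and `candidates_for` are hypothetical.

```python
from collections import defaultdict


def partition_by_description(tuples):
    """Group tuples by their description.

    Assumption: the description is the set of attribute names in the tuple.
    """
    partitions = defaultdict(list)
    for t in tuples:
        partitions[frozenset(t)].append(t)
    return partitions


def jaccard(a, b):
    """Jaccard similarity of two attribute sets (a stand-in, NOT R-likeness)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def candidates_for(description, partitions, k):
    """Collect tuples that could form a cluster for an expiring tuple.

    If the tuple's own partition already holds at least k tuples, use it
    alone. Otherwise merge in the most similar partitions until k tuples
    are available. The result may hold fewer than k tuples when the whole
    stream window is too small, so callers should check its length.
    """
    members = list(partitions.get(description, []))
    if len(members) >= k:
        return members
    others = sorted(
        (d for d in partitions if d != description),
        key=lambda d: jaccard(description, d),
        reverse=True,
    )
    for d in others:
        members.extend(partitions[d])
        if len(members) >= k:
            break
    return members


# Hypothetical readings from devices that report different attributes.
stream = [
    {"temp": 21, "humidity": 40},
    {"temp": 22, "humidity": 45},
    {"temp": 19},
    {"heart_rate": 70},
]
partitions = partition_by_description(stream)
cluster = candidates_for(frozenset({"temp"}), partitions, k=2)
```

In this example the `{"temp"}` partition holds only one tuple, so it is merged with the `{"temp", "humidity"}` partition, which shares more attributes with it than the `{"heart_rate"}` partition does. Generalizing the merged tuples into a published k-anonymous cluster is left out of the sketch.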
Original language: English
Pages (from-to): 238-255
Number of pages: 18
Journal: Information Sciences
Volume: 467
Early online date: 3 Aug 2018
Publication status: Published - 31 Oct 2018

Keywords

  • Internet of Things (IoT)
  • Data privacy
  • Data streams
  • Anonymization
  • Missing values
