A network traffic dataset is a crucial part of anomaly-based intrusion detection systems (IDSs). These IDSs train themselves to distinguish normal from anomalous activity, and a properly labeled dataset is required for that training. For activity-based IDSs, a dataset labeled by network traffic activity is the first requirement; however, the lack of such datasets is a bottleneck in IDS research. In this experiment, a synthetic dataset, the "Panjab University - Intrusion Dataset (PU-IDS)", is created. The purpose of this study is to provide researchers with a reference dataset for evaluating the performance of IDSs based on network traffic activity. The University of New Brunswick Network Security Laboratory - Knowledge Discovery in Databases (NSL-KDD) dataset is a benchmark for anomaly detection, but it does not contain activity-based labeling. The basic characteristics of NSL-KDD are therefore used to generate the new synthetic dataset with various activity-based labels. The dataset is first categorized by protocol and service. Thereafter, activity profiles are synthetically generated according to the minimum and maximum values of the attributes. This paper also discusses various statistical characteristics of PU-IDS. In total, 198533 instances and 273 activity profiles are created. The dataset also contains 98 distinct protocol_service profiles.
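The two generation steps named in the abstract (grouping by protocol and service, then generating records within the minimum and maximum attribute values) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names `build_profiles` and `generate_synthetic`, the uniform integer sampling, and the toy input rows are all assumptions made for illustration, and the assignment of activity labels is not shown because the abstract does not describe it. The field names `protocol_type`, `service`, `duration`, and `src_bytes` follow the NSL-KDD attribute naming.

```python
import random
from collections import defaultdict


def build_profiles(records, numeric_attrs):
    """Group records by (protocol_type, service) and store (min, max) per attribute."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["protocol_type"], rec["service"])].append(rec)

    profiles = {}
    for key, recs in groups.items():
        profiles[key] = {
            attr: (min(r[attr] for r in recs), max(r[attr] for r in recs))
            for attr in numeric_attrs
        }
    return profiles


def generate_synthetic(profiles, per_profile, seed=0):
    """Draw per_profile rows for each profile, each attribute inside its [min, max].

    Assumption: attributes are integers and are sampled uniformly; the paper
    may use a different sampling scheme.
    """
    rng = random.Random(seed)
    rows = []
    for (protocol, service), ranges in profiles.items():
        for _ in range(per_profile):
            row = {"protocol_type": protocol, "service": service}
            for attr, (low, high) in ranges.items():
                row[attr] = rng.randint(low, high)  # inclusive on both ends
            rows.append(row)
    return rows


# Toy input rows (hypothetical values, not taken from NSL-KDD).
sample = [
    {"protocol_type": "tcp", "service": "http", "duration": 0, "src_bytes": 100},
    {"protocol_type": "tcp", "service": "http", "duration": 5, "src_bytes": 300},
    {"protocol_type": "udp", "service": "domain_u", "duration": 0, "src_bytes": 40},
]
profiles = build_profiles(sample, ["duration", "src_bytes"])
synthetic = generate_synthetic(profiles, per_profile=4)
```

Here each distinct `(protocol_type, service)` pair plays the role of one protocol_service profile, and every generated row stays inside the value ranges observed for its pair.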
|Journal||International Journal of Computers, Communications and Control|
|Publication status||Published - 28 Apr 2015|