Enhanced Classification Accuracy on Naive Bayes Data Mining Models

Md. Faisal Kabir, Chowdhury Mofizur Rahman, Alamgir Hossain, Keshav Dahal

Research output: Contribution to journal (Article)

Abstract

A classification paradigm is a data mining framework that contains the concepts extracted from a training dataset in order to distinguish one class from the other classes present in the data. The primary goal of a classification framework is to achieve high accuracy. In many cases, however, accuracy suffers, particularly for very large datasets and for datasets that contain several distinct groups of data. When a classification framework is trained on the whole dataset, the resulting model may become unusable because the dataset consists of several groups of data. An alternative that keeps classification usable is to identify groups of similar data within the training set and then train a separate model on each group. In this paper, we first split the training data using k-means clustering and then train each group with the Naive Bayes classification algorithm. We save each model for classifying sample, unknown, or test data. Unknown data are classified with the best-matching group/model, which attains a higher accuracy rate than the conventional Naive Bayes classifier.
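
The cluster-then-classify idea described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: it assumes numeric features, a Gaussian form of Naive Bayes, deterministic farthest-first seeding for k-means, and that the "best match group" is the cluster with the nearest centroid. The names `ClusteredNB`, `GaussianNB`, `kmeans`, and the toy data are hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iters=100):
    """Lloyd's k-means with farthest-first seeding (assumed here for determinism)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Each centroid becomes the mean of its group; an empty group keeps its centroid.
        updated = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


class GaussianNB:
    """Naive Bayes with one Gaussian per class and feature (an assumed variant)."""

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.stats = {}
        for label, rows in by_class.items():
            n = len(rows)
            means = [sum(col) / n for col in zip(*rows)]
            # A small constant keeps the variance strictly positive.
            variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                         for col, m in zip(zip(*rows), means)]
            self.stats[label] = (math.log(n / len(X)), means, variances)
        return self

    def predict(self, row):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                for x, m, v in zip(row, means, variances))
        return max(self.stats, key=log_posterior)


class ClusteredNB:
    """One Naive Bayes model per k-means cluster; routing by nearest centroid."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.centroids = kmeans(X, self.k)
        groups = defaultdict(lambda: ([], []))
        for row, label in zip(X, y):
            c = nearest(row, self.centroids)
            groups[c][0].append(row)
            groups[c][1].append(label)
        self.models = {c: GaussianNB().fit(gx, gy) for c, (gx, gy) in groups.items()}
        return self

    def predict(self, row):
        # Assumption: "best match" means the non-empty cluster with the closest centroid.
        best = min(self.models, key=lambda c: sq_dist(row, self.centroids[c]))
        return self.models[best].predict(row)


def accuracy(model, X, y):
    """Fraction of rows whose predicted label equals the true label."""
    return sum(model.predict(r) == t for r, t in zip(X, y)) / len(y)


# Hypothetical toy data: four tight blobs, with classes arranged in an XOR pattern.
blobs = {(0, 0): "A", (10, 10): "A", (0, 10): "B", (10, 0): "B"}
X, y = [], []
for (cx, cy), label in blobs.items():
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            X.append((cx + dx, cy + dy))
            y.append(label)

plain = GaussianNB().fit(X, y)
clustered = ClusteredNB(k=4).fit(X, y)
print("plain NB accuracy:    ", accuracy(plain, X, y))
print("clustered NB accuracy:", accuracy(clustered, X, y))
```

The toy data is built so that both classes share the same per-feature mean and variance, which a single Gaussian Naive Bayes model cannot separate, while each cluster on its own is easy to label. It illustrates the mechanism only and says nothing about the accuracy figures reported in the paper.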
Original language: English
Pages (from-to): 9-16
Journal: International Journal of Computer Applications
Volume: 28
Issue number: 3
DOI: 10.5120/3371-4657
Publication status: Published - 2011
Externally published: Yes

Fingerprint

Data mining
Classifiers

Cite this

Kabir, Md. Faisal ; Rahman, Chowdhury Mofizur ; Hossain, Alamgir ; Dahal, Keshav. / Enhanced Classification Accuracy on Naive Bayes Data Mining Models. In: International Journal of Computer Applications. 2011 ; Vol. 28, No. 3. pp. 9-16.
@article{c2fb98f16b444ee19b032e7e792801e0,
title = "Enhanced Classification Accuracy on Naive Bayes Data Mining Models",
author = "Kabir, {Md. Faisal} and Rahman, {Chowdhury Mofizur} and Alamgir Hossain and Keshav Dahal",
year = "2011",
doi = "10.5120/3371-4657",
language = "English",
volume = "28",
pages = "9--16",
journal = "International Journal of Computer Applications",
issn = "0975-8887",
publisher = "Foundation of Computer Science",
number = "3",

}

TY - JOUR

T1 - Enhanced Classification Accuracy on Naive Bayes Data Mining Models

AU - Kabir, Md. Faisal

AU - Rahman, Chowdhury Mofizur

AU - Hossain, Alamgir

AU - Dahal, Keshav

PY - 2011

Y1 - 2011

U2 - 10.5120/3371-4657

DO - 10.5120/3371-4657

M3 - Article

VL - 28

SP - 9

EP - 16

JO - International Journal of Computer Applications

JF - International Journal of Computer Applications

SN - 0975-8887

IS - 3

ER -