AMF-IDBSCAN: Incremental Density Based Clustering Algorithm using Adaptive Median Filtering Technique
Date
2019
Publisher
Slovene Society Informatika
Abstract
Density-based spatial clustering of applications with noise (DBSCAN) is a fundamental algorithm for density-based clustering. It can discover clusters of arbitrary shapes and sizes in large amounts of data containing noise and outliers. However, it scales poorly to large datasets, performs poorly when new objects are inserted into an existing database, does not remove noise points or outliers completely, and cannot handle local density variation within a cluster. A good clustering method should therefore tolerate significant density variation within a cluster and should handle dynamic and large databases. In this paper, we propose AMF-IDBSCAN, an enhancement of the DBSCAN algorithm based on incremental clustering, which incrementally builds clusters of different shapes and sizes in large datasets and eliminates noise and outliers. The proposed AMF-IDBSCAN algorithm uses a canopy clustering algorithm to pre-cluster the data sets and reduce the volume of data, applies an incremental DBSCAN to cluster the data points, and uses an Adaptive Median Filtering (AMF) technique as a post-clustering step that reduces the number of outliers by replacing noise points with chosen medians. Experiments with AMF-IDBSCAN are performed on data sets from the University of California Irvine (UCI) repository. The results show that our algorithm performs better than DBSCAN, IDBSCAN, and DMDBSCAN.
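The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the function names, the thresholds `t1` and `t2`, and the parameter `k` are hypothetical; the DBSCAN shown is the plain batch variant rather than the incremental one; and the last step is a simple stand-in that replaces each noise point with the coordinate-wise median of its `k` nearest clustered points, which may differ from the Adaptive Median Filtering used in the paper.

```python
import math
from statistics import median

def canopy(points, t1, t2):
    """Canopy pre-clustering with a loose threshold t1 and a tight threshold t2 (t1 > t2)."""
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining[0]
        # Every remaining point within t1 of the center joins this canopy.
        canopies.append([i for i in remaining
                         if math.dist(points[center], points[i]) <= t1])
        # Points within t2 of the center can no longer start or join a canopy.
        remaining = [i for i in remaining
                     if math.dist(points[center], points[i]) > t2]
    return canopies

def dbscan(points, eps, min_pts):
    """Plain batch DBSCAN; returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise; may become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
        cluster += 1
    return labels

def replace_noise_with_median(points, labels, k=3):
    """Assumed stand-in for the AMF step: move each noise point to the
    coordinate-wise median of its k nearest clustered points."""
    clustered = [i for i, label in enumerate(labels) if label != -1]
    cleaned = list(points)
    for i, label in enumerate(labels):
        if label == -1 and clustered:
            nearest = sorted(clustered,
                             key=lambda j: math.dist(points[i], points[j]))[:k]
            cleaned[i] = tuple(median(points[j][d] for j in nearest)
                               for d in range(len(points[i])))
    return cleaned

# Two dense squares plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11), (5, 5)]
canopies = canopy(points, t1=3.0, t2=2.0)
labels = dbscan(points, eps=1.5, min_pts=3)
cleaned = replace_noise_with_median(points, labels, k=3)
```

On this toy input the isolated point `(5, 5)` is labelled as noise and then moved to a median of nearby clustered points. How the paper combines the canopies with the incremental updates is not specified in the abstract, so the stages are shown independently here.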