Monday, February 18, 2013

Density-Based Clustering Algorithm: DBSCAN with Implementation in MATLAB

Density-based clustering locates regions of high density that are separated from one another by regions of low density. DBSCAN takes a center-based approach to density: the density at a particular point in the data set is estimated by counting the number of points within a specified radius, ɛ, of that point.
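As a minimal sketch of this density estimate, the script below counts, for each point, how many points lie within the radius. The data values and the variable names (X, epsilon, density) are made up for illustration, and counting a point as its own neighbor is a convention assumed here, not something the description above fixes.

```matlab
% Toy data: four points in a tight square plus one distant point (made-up values)
X = [0 0; 0 1; 1 0; 1 1; 5 5];
epsilon = 1.5;                 % neighborhood radius (illustrative value)

n = size(X, 1);
density = zeros(n, 1);
for i = 1:n
    % Euclidean distance from point i to every point in X
    d = sqrt(sum((X - repmat(X(i, :), n, 1)).^2, 2));
    % Number of points inside the radius; point i counts itself here
    density(i) = sum(d <= epsilon);
end
disp(density')
```

With these values the four points of the square each have a density of 4, and the distant point has a density of 1.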
The center-based approach to density allows us to classify each point as one of three types:

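   Core points: These points lie in the interior of a dense region; that is, the ɛ-neighborhood of a core point contains at least a user-specified minimum number of points (commonly called MinPts).
   Border points: These points are not core points, but they fall within the neighborhood of a core point.
   Noise points: A noise point is any point that is neither a core point nor a border point.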
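The three categories can be sketched in a few lines of MATLAB. This is an illustrative sketch, not a definitive implementation: the data values are made up, the names (inRange, isCore, isBorder, isNoise) are hypothetical, and it assumes the usual DBSCAN density threshold, written here as minPts, with each point counted in its own neighborhood.

```matlab
% Toy data: a dense square, one point on its edge, and one far-away point
X = [0 0; 0 1; 1 0; 1 1; 2.2 1; 6 6];
epsilon = 1.5;   % neighborhood radius (illustrative value)
minPts  = 4;     % assumed density threshold, counting the point itself

n = size(X, 1);
D = zeros(n);    % pairwise Euclidean distances
for i = 1:n
    D(i, :) = sqrt(sum((X - repmat(X(i, :), n, 1)).^2, 2))';
end

inRange  = D <= epsilon;                 % inRange(i,j): j is within epsilon of i
isCore   = sum(inRange, 2) >= minPts;    % enough neighbors inside the radius
% Border: not core, but at least one core point inside its neighborhood
isBorder = ~isCore & any(inRange(:, isCore), 2);
% Noise: everything that is neither core nor border
isNoise  = ~isCore & ~isBorder;
```

For this data the four points of the square come out as core points, the point at (2.2, 1) as a border point (it is within ɛ of the core point at (1, 1)), and the point at (6, 6) as noise.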
The DBSCAN algorithm can be outlined as follows:

  1. Eliminate noise points
  2. Perform clustering on the remaining points:

     current_cluster_label := 0
     for all core points do
         if the core point has no cluster_label then
             current_cluster_label := current_cluster_label + 1
             Assign the current core point the current_cluster_label
         end if
         for all points within the radius ɛ of the core point do
             if the point does not have a cluster_label then
                 Label the point with the current_cluster_label
             end if
         end for
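The labeling steps above can be sketched in MATLAB as follows. This is a hedged illustration under stated assumptions: the data are made up, minPts is the assumed density threshold, and all variable names are hypothetical. Note one deliberate variation: the sketch expands each cluster with a queue, so that a chain of neighboring core points always ends up with the same label regardless of the order in which the points are visited. Noise points are simply left with label 0.

```matlab
% Toy data: two dense squares, one border point, and one isolated point
X = [0 0; 0 1; 1 0; 1 1; 2.2 1; 6 6; 10 10; 10 11; 11 10; 11 11];
epsilon = 1.5;   % neighborhood radius (illustrative value)
minPts  = 4;     % assumed density threshold, counting the point itself

n = size(X, 1);
D = zeros(n);    % pairwise Euclidean distances
for i = 1:n
    D(i, :) = sqrt(sum((X - repmat(X(i, :), n, 1)).^2, 2))';
end
inRange = D <= epsilon;
isCore  = sum(inRange, 2) >= minPts;

labels = zeros(n, 1);        % 0 means "no cluster label" (noise keeps 0)
currentLabel = 0;
for i = find(isCore)'
    if labels(i) ~= 0
        continue;            % already reached from an earlier core point
    end
    currentLabel = currentLabel + 1;
    labels(i) = currentLabel;
    queue = i;               % core points whose neighborhoods still need a visit
    while ~isempty(queue)
        p = queue(1);
        queue(1) = [];
        for q = find(inRange(p, :))
            if labels(q) == 0
                labels(q) = currentLabel;
                if isCore(q)
                    queue(end + 1) = q;   % only core points keep expanding
                end
            end
        end
    end
end
disp(labels')
```

On this data the first square and its border point receive label 1, the second square receives label 2, and the isolated point at (6, 6) keeps label 0.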