
hit_kmeans

PURPOSE

HIT_KMEANS Weighted KMEANS algorithm.

SYNOPSIS

function [centers,cost,inl,class,etime,spec] = hit_kmeans(v,opt)

DESCRIPTION

HIT_KMEANS Weighted KMEANS algorithm.

 -------------------------------------------------------------------------
 DESCRIPTION
 -------------------------------------------------------------------------
 [centers,cost,inl,class,etime,spec] = hit_kmeans(v,opt)

 -------------------------------------------------------------------------
 INPUT
 -------------------------------------------------------------------------
 v(i,:): i-th point to be clustered. Points are ROW vectors
 opt: structure of specific parameters
   MANDATORY FIELDS
   opt.init_centers: 'COVARIANCES' or 'SCALARS'. Specifies how points are
   weighted when initializing the centers. 'SCALARS' is faster, but
   'COVARIANCES' is much more precise.
   opt.centers: 'COVARIANCES' or 'SCALARS'. Specifies how points are
   weighted when updating the centers. 'SCALARS' is faster, but
   'COVARIANCES' is much more precise.
   opt.repetitions: number of times Kmeans is run. Only the results of
   the best run are returned as outputs.
   opt.s: number of clusters. 
   DEPENDENT FIELDS
   opt.w(i): weight for the i-th point.
   opt.IR{i}: inverse of the covariance associated to the i-th point.
   OPTIONAL FIELDS
   opt.options: vector of options for k-means
   options(1) controls what is displayed. If options(1) is set to 1, error
   values are displayed. If options(1) is set to 0, only warning messages
   and the cost at each run are displayed. If options(1) is set to -1,
   nothing is displayed. Default= -1

   options(2) is a measure of the absolute precision required for the
   value of the centers at the solution.  If the absolute difference
   between the values of the centers at two successive steps is less than
   options(2), then this condition is satisfied. Default= 1e-4

   options(3) is a measure of the precision required of the error
   function at the solution.  If the absolute difference between the
   error functions between two successive steps is less than options(3),
   then this condition is satisfied. Both this and the previous
   condition must be satisfied for termination. Default= 1e-4

   options(4) is the maximum number of iterations in a single run of
   Kmeans; Default= 100.
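
   The stopping rule described by options(2), options(3) and options(4)
   can be sketched as follows. This is only an illustration under stated
   assumptions: the variable names are hypothetical, and reading "absolute
   difference between the values of centres" as the largest element-wise
   change is an assumption, not something confirmed by hit_kmeans itself.

```matlab
% Hypothetical sketch of the stopping rule described for options(2:4).
% Variable names and the max(abs(...)) measure of center change are
% assumptions; they are not taken from the hit_kmeans source.
options = [-1, 1e-4, 1e-4, 100];   % documented defaults

old_centers = [0 0; 1 1];
new_centers = [0 0.00005; 1 1];    % largest center change is 5e-5
old_cost = 2.50000;
new_cost = 2.49995;                % cost change is about 5e-5
iter = 7;                          % current iteration within one run

% Precision condition on the centers (options(2)).
centers_ok = max(abs(new_centers(:) - old_centers(:))) < options(2);
% Precision condition on the error function (options(3)).
cost_ok = abs(new_cost - old_cost) < options(3);

% Both precision conditions must hold for convergence; the iteration
% cap in options(4) ends the run regardless.
converged = centers_ok && cost_ok;
stop = converged || (iter >= options(4));
```

   With the values above both conditions hold, so the run would stop
   before reaching the iteration limit.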

 -------------------------------------------------------------------------
 OUTPUT
 -------------------------------------------------------------------------
 centers(i,:): center of the i-th cluster (is a ROW vector) found in
 the best run.

 cost: value of the clustering cost functional in the best run.

 inl: indexes of datapoints that are inliers after clustering (i.e.
 not discarded by the clustering algorithm). Since Kmeans does not perform
 outlier detection, all input points will be inliers.

 class(i): classification of the i-th inlier. class(i)=j means
 that the i-th inlier belongs to the j-th cluster.

 etime: elapsed time for the execution of ALL the Kmeans runs.

 spec: structure with special outputs
   spec.costs is the vector of length opt.repetitions storing the
   costs at each iteration of Kmeans in the best run of the algorithm.
   spec.cost_each_clust: vector of costs for each cluster found (in the
   best run).
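
 A call using the fields documented above might look like the sketch
 below. The data are synthetic, and two points are assumptions rather
 than documented behavior: that opt.w is the dependent field needed when
 'SCALARS' is selected, and that it can be passed as a column vector.
 The call is guarded so the snippet still runs when hit_kmeans is not on
 the path.

```matlab
% Synthetic data: two well-separated 2-D groups, points as ROW vectors.
v = [0.0 0.1; 0.1 0.0; -0.1 0.0; 5.0 5.1; 5.1 5.0; 4.9 5.0];
n = size(v, 1);

opt = struct();
opt.init_centers = 'SCALARS';
opt.centers      = 'SCALARS';
opt.repetitions  = 5;                     % number of Kmeans runs
opt.s            = 2;                     % number of clusters
opt.w            = ones(n, 1);            % assumption: uniform weights for 'SCALARS'
opt.options      = [0, 1e-4, 1e-4, 100];  % show warnings and the cost at each run

% Only call the function if it is available on the path.
if exist('hit_kmeans', 'file')
    [centers, cost, inl, class, etime, spec] = hit_kmeans(v, opt);
    for j = 1:opt.s
        % class(i) = j means the i-th inlier belongs to the j-th cluster.
        members = v(inl(class == j), :);
        fprintf('cluster %d: %d points\n', j, size(members, 1));
    end
end
```

 Indexing v through inl(class == j) follows the output description:
 class is defined over the inliers, so it is mapped back to rows of v
 via inl.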
 -------------------------------------------------------------------------
 ACKNOWLEDGMENTS
 -------------------------------------------------------------------------
 This function is based on the kmeans.m routine in the NetLab toolbox
 developed at Aston University (Birmingham).

Generated on Thu 01-Dec-2005 10:54:38 by m2html © 2003