Class KMeansPlusPlusClusterer<T extends Clusterable<T>>

  • Type Parameters:
    T - type of the points to cluster

    public class KMeansPlusPlusClusterer<T extends Clusterable<T>>
    extends java.lang.Object
    Clustering algorithm based on the k-means++ algorithm by David Arthur and Sergei Vassilvitskii.
    Since:
    2.0
    See Also:
    K-means++ (Wikipedia)
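The seeding step that gives k-means++ its name can be sketched as follows. This is a minimal illustration, not this class's actual implementation: the class name `KMeansPlusPlusSeedingSketch`, the method names, and the use of `double[]` points with squared Euclidean distance are all assumptions made for the example. The first center is drawn uniformly at random; each later center is drawn with probability proportional to its squared distance from the nearest center already chosen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sketch of k-means++ seeding; all names here are hypothetical. */
final class KMeansPlusPlusSeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Chooses k distinct input points as initial centers using D^2 weighting. */
    static List<double[]> chooseInitialCenters(List<double[]> points, int k, Random random) {
        int n = points.size();
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> centers = new ArrayList<>();
        boolean[] taken = new boolean[n];

        // First center: uniform random choice.
        int first = random.nextInt(n);
        centers.add(points.get(first));
        taken[first] = true;

        // Squared distance from each point to its nearest chosen center.
        double[] minDistSq = new double[n];
        for (int i = 0; i < n; i++) {
            minDistSq[i] = squaredDistance(points.get(i), points.get(first));
        }

        while (centers.size() < k) {
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                if (!taken[i]) {
                    total += minDistSq[i];
                }
            }
            // Walk the cumulative weights until the random threshold is passed.
            double threshold = random.nextDouble() * total;
            double cumulative = 0.0;
            int next = -1;
            for (int i = 0; i < n; i++) {
                if (taken[i]) {
                    continue;
                }
                next = i; // fallback: last untaken point if rounding prevents a hit
                cumulative += minDistSq[i];
                if (cumulative > threshold) {
                    break;
                }
            }
            centers.add(points.get(next));
            taken[next] = true;

            // Refresh nearest-center distances with the newly added center.
            for (int i = 0; i < n; i++) {
                double d = squaredDistance(points.get(i), points.get(next));
                if (d < minDistSq[i]) {
                    minDistSq[i] = d;
                }
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        points.add(new double[] {0.0, 0.0});
        points.add(new double[] {0.5, 0.2});
        points.add(new double[] {9.0, 9.0});
        points.add(new double[] {9.5, 8.7});
        List<double[]> centers = chooseInitialCenters(points, 2, new Random(42L));
        System.out.println("chose " + centers.size() + " centers");
    }
}
```

Because far-away points carry more weight, the initial centers tend to be spread across the data, which is the property that distinguishes this seeding from plain random initialization.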
    • Constructor Detail

      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(java.util.Random random)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        Parameters:
        random - random generator to use for choosing initial centers
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(java.util.Random random,
                                       KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
        Build a clusterer.
        Parameters:
        random - random generator to use for choosing initial centers
        emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
        Since:
        2.2
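One plausible reading of "split the cluster with the largest distance variance" is sketched below on one-dimensional data. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the variance is the population variance of point-to-center distances, and the point handed to the empty cluster is picked at random from the selected cluster. The class's actual strategy may differ in these details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Illustrative sketch of refilling an empty cluster; all names here are hypothetical. */
final class EmptyClusterSplitSketch {

    /** Population variance of the distances from each point to the cluster center. */
    static double distanceVariance(List<Double> points, double center) {
        if (points.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        double sumSq = 0.0;
        for (double p : points) {
            double d = Math.abs(p - center);
            sum += d;
            sumSq += d * d;
        }
        double mean = sum / points.size();
        return sumSq / points.size() - mean * mean;
    }

    /**
     * Removes one point from the non-empty cluster with the largest distance
     * variance and returns it, so the caller can use it as the new center of
     * an empty cluster. Picking the point at random is an assumption.
     */
    static double takePointFromLargestVarianceCluster(List<List<Double>> clusters,
                                                      double[] centers,
                                                      Random random) {
        int selected = -1;
        double maxVariance = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            if (clusters.get(i).isEmpty()) {
                continue; // an empty cluster has nothing to give
            }
            double variance = distanceVariance(clusters.get(i), centers[i]);
            if (variance > maxVariance) {
                maxVariance = variance;
                selected = i;
            }
        }
        if (selected < 0) {
            throw new IllegalStateException("all clusters are empty");
        }
        List<Double> donor = clusters.get(selected);
        return donor.remove(random.nextInt(donor.size()));
    }

    public static void main(String[] args) {
        List<List<Double>> clusters = new ArrayList<>();
        clusters.add(new ArrayList<>(Arrays.asList(1.0, 1.1, 0.9)));
        clusters.add(new ArrayList<>(Arrays.asList(10.0, 20.0, 30.0, 11.0)));
        clusters.add(new ArrayList<>()); // the empty cluster to refill
        double[] centers = {1.0, 17.75, 50.0};
        double newCenter = takePointFromLargestVarianceCluster(clusters, centers, new Random(1L));
        System.out.println("new center for the empty cluster: " + newCenter);
    }
}
```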
    • Method Detail

      • cluster

        public java.util.List<Cluster<T>> cluster(java.util.Collection<T> points,
                                                  int k,
                                                  int maxIterations)
        Runs the K-means++ clustering algorithm.
        Parameters:
        points - the points to cluster
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for; if negative, no maximum is applied
        Returns:
        a list of clusters containing the points
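The role of `maxIterations` can be illustrated with a small assign-and-update loop on one-dimensional data. This is a sketch under stated assumptions rather than this method's implementation: the class `LloydIterationSketch` and its methods are hypothetical, a negative limit is mapped to `Integer.MAX_VALUE` to express "no maximum", and the loop stops early once no point changes cluster, which is an assumed stopping rule.

```java
import java.util.Arrays;

/** Illustrative sketch of the k-means iteration cap; all names here are hypothetical. */
final class LloydIterationSketch {

    /** Index of the center closest to the point (first one wins on ties). */
    static int nearestCenter(double point, double[] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (Math.abs(point - centers[c]) < Math.abs(point - centers[best])) {
                best = c;
            }
        }
        return best;
    }

    /** Runs assign/update rounds and returns the final centers. */
    static double[] iterate(double[] points, double[] initialCenters, int maxIterations) {
        // A negative limit means "no maximum".
        final int max = (maxIterations < 0) ? Integer.MAX_VALUE : maxIterations;
        double[] centers = initialCenters.clone();
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iteration = 0; iteration < max; iteration++) {
            // Assignment step: attach each point to its nearest center.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int nearest = nearestCenter(points[i], centers);
                if (nearest != assignment[i]) {
                    assignment[i] = nearest;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point moved to another cluster
            }
            // Update step: move each center to the mean of its points.
            double[] sum = new double[centers.length];
            int[] count = new int[centers.length];
            for (int i = 0; i < points.length; i++) {
                sum[assignment[i]] += points[i];
                count[assignment[i]]++;
            }
            for (int c = 0; c < centers.length; c++) {
                if (count[c] > 0) {
                    centers[c] = sum[c] / count[c];
                }
                // An empty cluster simply keeps its old center in this sketch.
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 2.0, 3.0, 10.0, 11.0, 12.0};
        double[] centers = iterate(points, new double[] {1.0, 2.0}, -1);
        System.out.println(Arrays.toString(centers));
    }
}
```

Passing `0` returns the initial centers unchanged, while a negative value lets the loop run until the assignments stabilize.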