Clustering ensembles combine cluster analysis with ensemble learning. However, most clustering ensemble methods treat all samples equally, which degrades the final clustering result. To address this, we propose the sample’s representation, a measure that comprehensively evaluates the importance of each sample in clustering. It assesses a sample from two perspectives: the stability and the closeness of its relationship with its neighbor samples. According to the representation of each sample, we divide a dataset into a cluster core and a cluster halo. We then obtain a credible underlying structure from the cluster core samples. Finally, the cluster halo samples are gradually allocated to this structure to produce the final clustering result. The working steps of the algorithm are illustrated on two synthetic datasets, and experiments on nine real datasets demonstrate that the algorithm outperforms 11 other state-of-the-art clustering ensemble methods.
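The pipeline summarized above (score each sample, split the data into core and halo, cluster the core, then allocate the halo) can be sketched as follows. This is only an illustrative sketch, not the paper's actual formulation. The abstract does not define the stability or closeness measures, so the sketch assumes both are derived from a co-association matrix over each sample's `k` most co-associated neighbors. The function names, the parameters `k`, `core_threshold`, and `link`, and the product used to combine the two scores are all hypothetical.

```python
import numpy as np


def co_association(base_labels):
    """Fraction of base clusterings in which each pair of samples shares a cluster."""
    base = np.asarray(base_labels)                  # shape (m, n)
    same = base[:, :, None] == base[:, None, :]     # shape (m, n, n)
    return same.mean(axis=0)                        # shape (n, n)


def representation_scores(ca, k):
    """Assumed score: stability times closeness over the k most co-associated neighbors."""
    off = ca.copy()
    np.fill_diagonal(off, -1.0)                     # exclude each sample from its own neighbors
    nbrs = np.argsort(-off, axis=1)[:, :k]
    vals = np.take_along_axis(ca, nbrs, axis=1)
    stability = (2.0 * np.abs(vals - 0.5)).mean(axis=1)  # 1 when the base clusterings agree
    closeness = vals.mean(axis=1)                         # 1 when neighbors always co-occur
    return stability * closeness


def core_halo_ensemble(base_labels, k=3, core_threshold=0.5, link=0.5):
    """Cluster the core samples first, then allocate halo samples one by one."""
    ca = co_association(base_labels)
    score = representation_scores(ca, k)
    order = np.argsort(-score, kind="stable")       # most representative samples first
    is_core = score >= core_threshold
    is_core[order[0]] = True                        # guarantee at least one core sample

    labels = np.full(ca.shape[0], -1)
    core = [i for i in order if is_core[i]]
    n_clusters = 0
    # Core structure: connected components of the graph linking core samples
    # whose co-association exceeds `link` (an assumed rule).
    for seed in core:
        if labels[seed] != -1:
            continue
        labels[seed] = n_clusters
        stack = [seed]
        while stack:
            u = stack.pop()
            for v in core:
                if labels[v] == -1 and ca[u, v] > link:
                    labels[v] = n_clusters
                    stack.append(v)
        n_clusters += 1

    # Halo allocation: each halo sample joins the cluster it co-occurs with most,
    # taking into account the halo samples that were allocated before it.
    for s in order:
        if is_core[s]:
            continue
        affinity = [ca[s, labels == c].mean() for c in range(n_clusters)]
        labels[s] = int(np.argmax(affinity))
    return labels, is_core


# Toy ensemble: four base clusterings of eight samples; sample 3 is unstable.
base = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
labels, is_core = core_halo_ensemble(base)
print(labels, is_core)
```

In this toy ensemble, sample 3 changes groups in one base clustering, so it scores below the assumed threshold and is treated as a halo sample. It is allocated only after the two stable groups have formed the core structure.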