Scalable Multiple Kernel Clustering: Learning Clustering Structure from Expectation
ICML 2024(2024)
摘要
In this paper, we derive an upper bound of the difference between a kernel matrix and its expectation under a mild assumption. Specifically, we assume that the true distribution of the training data is an unknown isotropic Gaussian distribution. When the kernel function is a Gaussian kernel, and the mean of each cluster is sufficiently separated, we find that the expectation of a kernel matrix can be close to a rank-$k$ matrix, where $k$ is the cluster number. Moreover, we prove that the normalized kernel matrix of the training set deviates (w.r.t. Frobenius norm) from its expectation in the order of $\widetilde{\mathcal{O}}(1/\sqrt{d})$, where $d$ is the dimension of samples. Based on the above theoretical results, we propose a novel multiple kernel clustering framework which attempts to learn the information of the expectation kernel matrices. First, we aim to minimize the distance between each base kernel and a rank-$k$ matrix, which is a proxy of the expectation kernel. Then, we fuse these rank-$k$ matrices into a consensus rank-$k$ matrix to find the clustering structure. Using an anchor-based method, the proposed framework is flexible with the sizes of input kernel matrices and able to handle large-scale datasets. We also provide the approximation guarantee by deriving two non-asymptotic bounds for the consensus kernel and clustering indicator matrices. Finally, we conduct extensive experiments to verify the clustering performance of the proposed method and the correctness of the proposed theoretical results.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
数据免责声明
页面数据均来自互联网公开来源、合作出版商和通过AI技术自动分析结果,我们不对页面数据的有效性、准确性、正确性、可靠性、完整性和及时性做出任何承诺和保证。若有疑问,可以通过电子邮件方式联系我们:report@aminer.cn