In this study, we propose an iterative clustering algorithm that automatically clusters a set of voice signal data without a label into an optimal number of clusters and generates hmm model for each cluster. In the clustering process, the likelihood calculations of the clusters are performed using iterative hmm learning and testing while varying the number of clusters for given data, and the maximum likelihood estimation method is used to determine the optimal number of clusters.
We tested the effectiveness of this clustering algorithm on a small-vocabulary digit clustering task by mapping the unsupervised decoded output of the optimal cluster to the ground-truth transcription, we found out that they were highly correlated.
반복적 HMM 분류방법을 이용한 신호 클러스터의 최적수 결정방법