from a subset of the Filipino Speech Corpus developed by the DSP Laboratory of the University of the Philippines-Diliman, while the n-gram language model was based on a 22,000-article text corpus extracted from the newspaper ‘Balita’. The speech corpus was used both to train the system, by estimating the parameters of the phonetic HMM-based (Hidden Markov Model) acoustic models, and to test it. Experiments with different mixture weights were also included in the study.
Phoneme-level isolated word recognition with a 5-state HMM resulted in an average accuracy rate of 80.13% for a single-Gaussian mixture model, 81.13% after implementing phoneme alignment, and 87.19% for the model with increased Gaussian mixture weights. The highest accuracy rate, 88.70%, was obtained from a 5-state model with 6 Gaussian mixtures.
For sentence-level recognition with a 5-state HMM, the single-Gaussian mixture model achieved a recognition rate of 63.44% and a word accuracy rate of 73.66%. Increasing the mixture size to 6 Gaussian mixtures raised these figures by a small margin of around 1-2%, giving the highest sentence-level recognition rate of 64.75% and a word accuracy rate of 74.99%.
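The gap between the sentence-level recognition rate and the word accuracy rate follows from how the two are counted: a sentence scores only if every word in it is recognized, while word accuracy gives credit for each correct word. The sketch below illustrates one common way to compute both. It assumes word accuracy is defined as (N - S - D - I) / N over a minimum-edit alignment, which the text does not confirm; the function names and the toy transcripts are hypothetical and are not taken from the study.

```python
def edit_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for one minimum-edit
    alignment of a reference word list against a hypothesis word list."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # dp[i][j] = (cost, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[None] * cols for _ in range(rows)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        dp[i][0] = (i, 0, i, 0)          # delete every reference word
    for j in range(1, cols):
        dp[0][j] = (j, 0, 0, j)          # insert every hypothesis word
    for i in range(1, rows):
        for j in range(1, cols):
            c, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, s, d, n)              # match, no cost
            else:
                diag = (c + 1, s + 1, d, n)      # substitution
            c, s, d, n = dp[i - 1][j]
            delete = (c + 1, s, d + 1, n)
            c, s, d, n = dp[i][j - 1]
            insert = (c + 1, s, d, n + 1)
            dp[i][j] = min(diag, delete, insert, key=lambda t: t[0])
    _, subs, dels, ins = dp[-1][-1]
    return subs, dels, ins


def score(pairs):
    """Return (sentence recognition rate, word accuracy), both in percent.

    Assumed definition: word accuracy = (N - S - D - I) / N, where N is the
    total number of reference words.
    """
    total_words = errors = exact = 0
    for ref, hyp in pairs:
        r, h = ref.split(), hyp.split()
        subs, dels, ins = edit_counts(r, h)
        total_words += len(r)
        errors += subs + dels + ins
        exact += (r == h)
    sentence_rate = 100.0 * exact / len(pairs)
    word_accuracy = 100.0 * (total_words - errors) / total_words
    return sentence_rate, word_accuracy


# Toy (reference, hypothesis) transcripts, invented for illustration only.
pairs = [
    ("magandang umaga po", "magandang umaga po"),
    ("kumain na ako", "kumain ako"),
]
sentence_rate, word_accuracy = score(pairs)
print(f"sentence rate: {sentence_rate:.2f}%  word accuracy: {word_accuracy:.2f}%")
```

In this toy case a single dropped word makes the second sentence count as wrong, so the sentence rate falls to half while five of the six reference words still count toward word accuracy.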