Y.J. Kim, Dept. of Computer Engineering, Hanbat National University
International Journal
Attention-LSTM-Attention Model for Speech Emotion Recognition and Analysis of IEMOCAP Database
by Yeonguk Yu and Yoon-Joong Kim *
 

We propose a speech-emotion recognition (SER) model with an "attention-Long Short-Term Memory (LSTM)-attention" component that combines IS09, a feature set commonly used for SER, with the mel spectrogram, and we analyze the reliability problem of the interactive emotional dyadic motion capture (IEMOCAP) database. The attention mechanism of the model focuses on the emotion-related elements of the IS09 and mel-spectrogram features and on the emotion-related time frames within them. The model thus extracts emotion information from a given speech signal. The proposed model of the baseline study achieved a weighted accuracy (WA) of 68% on the improvised dataset of IEMOCAP. However, neither the proposed model of the main study nor its modified variants could exceed a WA of 68% on the improvised dataset. This is due to the reliability limit of the IEMOCAP dataset; a more reliable dataset is required for a more accurate evaluation of the model's performance. Therefore, in this study, we reconstructed a more reliable dataset based on the labeling results provided by IEMOCAP. On this more reliable dataset, the model achieved a WA of 73%.
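As a rough illustration of the two attention stages described above (not the authors' implementation; all weights, sizes, and the plain tanh recurrence standing in for the LSTM are hypothetical), the idea can be sketched in NumPy: a feature-wise attention weights the input dimensions, a recurrent layer summarizes time, and a temporal attention pools the sequence into one fixed-size emotion embedding.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, F, H = 100, 40, 64            # frames, feature dims, hidden size (assumed)

x = rng.standard_normal((T, F))  # e.g. mel-spectrogram frames

# (1) feature attention: weight each feature dimension per frame
w_feat = rng.standard_normal(F)
alpha = softmax(x * w_feat, axis=1)   # (T, F), rows sum to 1
x_att = x * alpha

# (2) recurrence over time (crude tanh RNN as a stand-in for the LSTM)
W_ih = 0.1 * rng.standard_normal((F, H))
W_hh = 0.1 * rng.standard_normal((H, H))
h, hs = np.zeros(H), []
for t in range(T):
    h = np.tanh(x_att[t] @ W_ih + h @ W_hh)
    hs.append(h)
H_seq = np.stack(hs)                  # (T, H)

# (3) temporal attention: pool frames into one utterance-level vector
u = rng.standard_normal(H)
beta = softmax(H_seq @ u)             # (T,), one weight per frame
utterance = beta @ H_seq              # (H,) fixed-size emotion embedding

print(utterance.shape)                # → (64,)
```

In the paper's setting, this utterance-level vector would feed a classifier over the IEMOCAP emotion categories; the point of the two attention stages is that both the informative feature dimensions and the informative time frames are weighted before pooling.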

 
Electronics (SCI) Vol. 9, No. 5, https://doi.org/10.3390/electronics9050713 (registering DOI), (Section: Artificial Intelligence, Special Issue: Deep Neural Networks and Their Applications)
2020-04-26/2021-10-06/Yoon-Joong Kim