www.wins.or.kr(IISPL)

RESEARCH

Y.J.KIM, Dept. of Computer Engineering, Hanbat National University

	Domestic Journal
	Analysis of Speech Emotion Database and Development of Speech Emotion Recognition System using Attention Mechanism Integrating Frame- and Utterance-level Features

	In this study, we propose a model consisting of a BLSTM (Bidirectional Long Short-Term Memory) layer, an attention mechanism layer, and a deep neural network that integrates frame-level and utterance-level features from speech signals and analyzes emotional information. We analyze the model's performance on the basis of a reliability analysis of the labels in the speech emotion database IEMOCAP (Interactive Emotional Dyadic Motion Capture). Using the label evaluation data provided with the IEMOCAP database, we constructed three data sets: a default data set, a data set with a balanced distribution of emotion classes, and a data set with improved reliability based on three or more judgments. We ran speaker-independent cross-validation experiments on each data set. Experiments on the improved and balanced data set achieved a maximum score of 67.23% (WA, Weighted Accuracy) and 56.70% (UA, Unweighted Accuracy), an improvement of 6.47% (WA) and 4.41% (UA) over the experiment on the default data set.
	김도경 (Hanbat National University), 김윤중 (Hanbat National University)
	Korean Institute of Information Scientists and Engineers (KIISE), Journal of KIISE, Vol. 47, No. 5, May 2020, pp. 479-487 (9 pages)
	DOI: 10.5626/JOK.2020.47.5.479
	https://www.dbpia.co.kr/Journal/articleDetail?nodeId=NODE09338276
	Journal of KIISE, Vol. 47, No. 5, pp. 479-487 (May 2020)
	2020-05-01/2021-10-06/김윤중
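
	The abstract above reports two scores, WA and UA. The sketch below shows how the two are commonly computed in speech emotion recognition work: WA as overall accuracy across all utterances, UA as the mean of per-class recalls. It assumes the paper follows this common convention, which the abstract does not spell out. The function names and the toy labels are hypothetical and are not taken from the paper or from IEMOCAP.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correctly classified utterances / all utterances.

    Larger classes contribute more, hence "weighted".
    """
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls: every emotion class counts equally."""
    total = defaultdict(int)  # utterances per true class
    hit = defaultdict(int)    # correctly classified utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hit[t] += int(t == p)
    recalls = [hit[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, deliberately imbalanced toy labels (not real IEMOCAP data).
y_true = ["neutral"] * 6 + ["angry"] * 2 + ["sad"] * 2
y_pred = ["neutral"] * 6 + ["angry", "neutral"] + ["neutral", "neutral"]

wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
print(f"WA = {wa:.2%}, UA = {ua:.2%}")
```

	On imbalanced data the two scores diverge: a classifier that favors the majority class keeps a high WA while its UA drops. This is one reason results on a class-balanced data set are reported alongside the default one.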