In this study, we developed a speech data transcription tool that integrates speech segmentation, speaker classification, speech transcription, and editing in order to shorten the time needed to transcribe audio data. The system converts speech data into standardized transcription data that serves as input to a spoken corpus construction system. The speech segmentation and speaker classification processes were developed using deep learning technologies, and the transcription process uses the Google API. In an experiment comparing the tool with the existing workflow based on ELAN and a notepad tool, it was confirmed to save half of the processing time.
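As an illustration of how the stages described above might fit together, the following is a minimal sketch and not the implementation used in this study. It assumes each stage is supplied as a callable; the names `Utterance`, `build_transcript`, `to_json`, and the three stub stages are hypothetical, and the JSON layout only stands in for the standardized transcription format, which is not specified here. A real system would replace the stubs with the deep learning models and a call to a speech recognition service.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Utterance:
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    speaker: str   # label from the speaker classification stage
    text: str      # draft transcript, to be corrected in the editing stage


def build_transcript(audio, segment, classify, recognize) -> List[Utterance]:
    """Run segmentation, then speaker classification and transcription per segment."""
    utterances = []
    for start, end in segment(audio):
        speaker = classify(audio, start, end)
        text = recognize(audio, start, end)
        utterances.append(Utterance(start, end, speaker, text))
    return utterances


def to_json(utterances: List[Utterance]) -> str:
    """Serialize utterances to JSON (a hypothetical stand-in for the standardized format)."""
    return json.dumps([asdict(u) for u in utterances], ensure_ascii=False, indent=2)


# Stub stages (assumptions for illustration only).
def fake_segment(audio):
    return [(0.0, 1.5), (1.5, 4.0)]


def fake_classify(audio, start, end):
    return "S1" if start < 1.5 else "S2"


def fake_recognize(audio, start, end):
    return f"<draft {start:.1f}-{end:.1f}>"


transcript = build_transcript(b"", fake_segment, fake_classify, fake_recognize)
print(to_json(transcript))
```

Keeping the stages as interchangeable callables means the draft text produced by recognition can be corrected in an editor without rerunning segmentation or speaker classification.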
https://ieeexplore.ieee.org/document/8539450