DLD: An Optimized Chinese Speech Recognition Model Based on Deep Learning

Complexity 2022:1-8 (2022)

Abstract

Speech recognition technology plays an indispensable role in realizing intelligent human-computer interaction. However, most current Chinese speech recognition systems are offered either online or as offline models with low accuracy and poor performance. To improve the performance of offline Chinese speech recognition, we propose a hybrid acoustic model that combines a deep convolutional neural network (DCNN), long short-term memory (LSTM), and a deep neural network (DNN). The model uses the DCNN to reduce frequency variation and adds a batch normalization layer after each convolutional layer to keep the data distribution stable; it then uses LSTM to address the vanishing gradient problem. Finally, the fully connected structure of the DNN maps the input features into a separable space, which helps classification. Combining the strengths of DCNN, LSTM, and DNN in a unified architecture can therefore effectively improve speech recognition performance. Our model was tested on THCHS-30, the open Chinese speech database released by the Center for Speech and Language Technology of Tsinghua University. The DLD model with 3 LSTM layers and 3 DNN layers performed best, reaching a word error rate of 13.49%.
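The word error rate quoted in the abstract is conventionally defined as the token-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference tokens. The following is a minimal sketch of that standard definition, not the authors' evaluation script; the function name `word_error_rate` and the choice to pass pre-tokenized lists are assumptions made for illustration, and the abstract does not say how THCHS-30 transcripts were segmented.

```python
def word_error_rate(reference, hypothesis):
    """Token-level edit distance divided by the reference length.

    `reference` and `hypothesis` are lists of tokens (words or characters).
    This helper is hypothetical; it only illustrates the usual WER definition.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the reference prefix handled
    # so far and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]  # aligning i reference tokens with an empty hypothesis
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from output
                curr[j - 1] + 1,     # insertion: extra token in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(reference)
```

Under this definition, a rate of 13.49% corresponds to roughly 13 or 14 token errors per 100 reference tokens. The value can exceed 1.0 when the output contains many inserted tokens.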

Analytics

Added to PP
2022-05-04
