DLD: An Optimized Chinese Speech Recognition Model Based on Deep Learning

Hong Lei; Yue Xiao; Yanchun Liang; Dalin Li; Heow Pueh Lee

Complexity 2022:1-8 (2022)

Abstract

Speech recognition technology plays an indispensable role in intelligent human-computer interaction. However, most current Chinese speech recognition systems, whether offered as online or offline models, suffer from low accuracy and poor performance. To improve the performance of offline Chinese speech recognition, we propose a hybrid acoustic model that combines a deep convolutional neural network (DCNN), long short-term memory (LSTM), and a deep neural network (DNN). The model uses the DCNN to reduce frequency variation, adds a batch normalization layer after its convolutional layer to keep the data distribution stable, and then uses the LSTM to address the vanishing gradient problem. Finally, the fully connected structure of the DNN maps the input features into a separable space, which helps data classification. Combining the strengths of DCNN, LSTM, and DNN in a unified architecture can therefore effectively improve speech recognition performance. Our model was tested on THCHS-30, the open Chinese speech database released by the Center for Speech and Language Technology of Tsinghua University, and the DLD model with 3 LSTM layers and 3 DNN layers performed best, reaching a word error rate of 13.49%.
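The abstract reports its result as a word error rate. As a minimal sketch of how that metric is commonly computed, the error rate is the token-level edit distance between the reference transcript and the recognizer output, divided by the reference length. The function names `edit_distance` and `word_error_rate` and the pre-tokenized input format are assumptions made for illustration; the abstract does not describe the authors' scoring tool or tokenization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (illustrative helper)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def word_error_rate(ref_tokens, hyp_tokens):
    """(substitutions + deletions + insertions) / number of reference tokens."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Hypothetical example: one of four reference tokens is dropped.
reference = ["今天", "天气", "很", "好"]
hypothesis = ["今天", "天气", "好"]
print(word_error_rate(reference, hypothesis))  # 0.25
```

Because the score is normalized by the reference length, it can exceed 1.0 when the hypothesis contains many insertions. How Chinese text is split into tokens (words or characters) changes the value, so scores are comparable only under the same tokenization.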

DOI

10.1155/2022/6927400