Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language

NLP (Natural Language Processing) is a technology that enables computers to understand human languages. Deep-level grammatical and semantic analysis usually uses words as the basic unit, and word segmentation is usually the primary task of NLP. In order to solve the practical problem of huge structu...

Bibliographic Details
Main Authors: Dongyang Wang, Junli Su, Hongbin Yu
Format: Article
Language: English
Published: IEEE 2020-01-01
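Series: IEEE Access
Subjects: Feature extraction; English word segmentation processing; long short term memory; gated recurrent unit
Online Access: https://ieeexplore.ieee.org/document/8999624/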
id doaj-7987d48be3e245bc879a37f2f3233a17
record_format Article
spelling doaj-7987d48be3e245bc879a37f2f3233a17
2021-03-30T02:51:10Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
vol. 8, pp. 46335-46345
10.1109/ACCESS.2020.2974101
8999624
Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language
Dongyang Wang, https://orcid.org/0000-0001-9468-173X
Junli Su, https://orcid.org/0000-0003-3266-4920
Hongbin Yu, https://orcid.org/0000-0003-2679-3654
College of Education, Arts and Science, Lyceum of the Philippines University, Batangas City, Philippines
Department of Elementary Education, Jiaozuo Teachers College, Jiaozuo, China
School of Digital Media, Jiangnan University, Wuxi, China
https://ieeexplore.ieee.org/document/8999624/
Feature extraction
English word segmentation processing
long short term memory
gated recurrent unit
collection DOAJ
language English
format Article
sources DOAJ
author Dongyang Wang
Junli Su
Hongbin Yu
title Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description NLP (Natural Language Processing) is a technology that enables computers to understand human languages. Deep grammatical and semantic analysis usually takes words as the basic unit, so word segmentation is usually the first task in NLP. In a multi-modal environment, the large structural differences between data modalities mean that traditional machine learning methods cannot be applied directly. To address this, the paper applies deep learning feature extraction to multi-modal data and proposes a multi-modal neural network: each modality has its own multilayer sub-network with an independent structure, which converts that modality's features into a shared (same-modal) feature representation. For word segmentation, existing methods struggle to capture long-distance dependencies in text semantics and are slow to train and predict, so the paper proposes a hybrid-network method for English word segmentation. The method applies a BI-GRU (Bidirectional Gated Recurrent Unit) to English word segmentation and uses a CRF (Conditional Random Field) model to label sentences as sequences, which addresses long-distance semantic dependencies and shortens training and prediction time. Experiments show that its segmentation quality is similar to that of the BI-LSTM-CRF (Bidirectional Long Short-Term Memory-Conditional Random Field) model, while its average prediction speed is 1.94 times that of BI-LSTM-CRF, improving the efficiency of word segmentation.
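The description names a CRF model that labels a sentence as a sequence on top of BI-GRU outputs. A minimal sketch of the decoding step such a layer typically performs (a Viterbi search over tag sequences) is given below. It is not the authors' implementation: the B/I/E/S character tag scheme, the toy emission and transition scores, and the helper names `viterbi` and `tags_to_words` are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumed tag scheme: Begin / Inside / End of a word, or Single-character word.
TAGS = ["B", "I", "E", "S"]


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]   : score of tag j at position t (stand-in for BI-GRU output)
    transitions[i][j] : score of moving from tag i to tag j (stand-in for CRF weights)
    Start and end constraints are not modelled in this sketch.
    """
    if not emissions:
        return []
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for ending in tag j at position t.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score = new_score
        backptrs.append(ptr)
    # Follow the back-pointers from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(backptrs):
        best = ptr[best]
        path.append(best)
    path.reverse()
    return path


def tags_to_words(chars: str, tag_ids: Sequence[int]) -> List[str]:
    """Cut the character string after every E or S tag."""
    words, current = [], ""
    for ch, tid in zip(chars, tag_ids):
        current += ch
        if TAGS[tid] in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # keep any trailing characters that never received E or S
        words.append(current)
    return words


# Toy scores (assumptions): legal transitions score 0, illegal ones are heavily penalised.
LEGAL = {("B", "I"), ("B", "E"), ("I", "I"), ("I", "E"),
         ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S")}
transitions = [[0.0 if (a, b) in LEGAL else -1e4 for b in TAGS] for a in TAGS]

# Hand-made emissions for the unsegmented string "thecat", favouring B I E B I E.
favoured = "BIEBIE"
emissions = [[1.0 if tag == f else 0.0 for tag in TAGS] for f in favoured]

path = viterbi(emissions, transitions)
words = tags_to_words("thecat", path)
print(words)  # ['the', 'cat']
```

In a trained model the emission scores would come from the recurrent layer and the transition scores would be learned jointly; here both are hand-written so that the search itself is easy to follow.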
topic Feature extraction
English word segmentation processing
long short term memory
gated recurrent unit
url https://ieeexplore.ieee.org/document/8999624/