Extracting Knowledge Entities from Sci-Tech Intelligence Resources Based on BiLSTM and Conditional Random Field

Weizhi LIAO; Mingtong HUANG; Pan MA; Yu WANG

doi:10.1587/transinf.2020BDP0007

IEICE TRANSACTIONS on Information

Summary:

There are many knowledge entities in sci-tech intelligence resources. Extracting these knowledge entities is of great importance for building knowledge networks, exploring the relationship between knowledge, and optimizing search engines. Many existing methods, which are mainly based on rules and traditional machine learning, require significant human involvement, but still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach for knowledge entity extraction based on BiLSTM and conditional random field (CRF). A BiLSTM neural network is used to obtain the context information of sentences, and a CRF is then employed to integrate global label information to achieve optimal labels. This approach does not require the manual construction of features, and outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the existing sci-tech literature are processed, of which 50,243 items are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and corresponding knowledge networks are established with a further elaboration on the correlation of two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.
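
To illustrate how a CRF layer can use global label information when choosing labels, the following is a minimal sketch of Viterbi decoding in plain Python. It is not the authors' implementation. The B/I/O tag set, the function name `viterbi_decode`, and all scores are hypothetical: the per-token emission scores stand in for what a BiLSTM might output, and the transition scores stand in for learned CRF parameters.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: Sequence[str],
) -> List[str]:
    """Return the label sequence with the highest total score.

    emissions[t][y]   : score of label y at position t (stand-in for BiLSTM output)
    transitions[p][y] : score of moving from label p to label y (stand-in for CRF weights)
    """
    if not emissions:
        return []

    # best[y] = best score of any path that ends in label y at the current position
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for emit in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that gives the best score when followed by y.
            prev = max(labels, key=lambda p: best[p] + transitions[p][y])
            new_best[y] = best[prev] + transitions[prev][y] + emit[y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)

    # Follow the back-pointers from the best final label to recover the path.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for a three-token sentence (illustrative values only).
labels = ["B", "I", "O"]
transitions = {
    "B": {"B": -1.0, "I": 1.0, "O": 0.0},
    "I": {"B": -1.0, "I": 0.5, "O": 0.0},
    "O": {"B": 0.0, "I": -10.0, "O": 0.5},  # "I" directly after "O" is heavily penalized
}
emissions = [
    {"B": 0.2, "I": 0.1, "O": 1.0},
    {"B": 0.9, "I": 1.0, "O": 0.1},  # taken alone, this token slightly prefers "I"
    {"B": 0.1, "I": 1.2, "O": 0.3},
]

greedy = [max(labels, key=lambda y: e[y]) for e in emissions]
decoded = viterbi_decode(emissions, transitions, labels)
print(greedy)   # ['O', 'I', 'I']
print(decoded)  # ['O', 'B', 'I']
```

With these made-up scores, choosing the best label for each token independently yields an "I" directly after an "O", which is not a valid entity span. Decoding over the whole sequence with transition scores starts the entity with "B" instead.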

Publication: IEICE TRANSACTIONS on Information Vol.E104-D No.8 pp.1214-1221

Publication Date: 2021/08/01

Publicized: 2021/04/22

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2020BDP0007

Type of Manuscript: Special Section PAPER (Special Section on Computational Intelligence and Big Data for Scientific and Technological Resources and Services)

Authors

Weizhi LIAO
  University of Electronic Science and Technology of China
Mingtong HUANG
  University of Electronic Science and Technology of China
Pan MA
  University of Electronic Science and Technology of China
Yu WANG
  University of Electronic Science and Technology of China

Keyword

sci-tech intelligence resources, knowledge entity, sequence labeling, BiLSTM-CRF

Cite this

Weizhi LIAO, Mingtong HUANG, Pan MA, Yu WANG, "Extracting Knowledge Entities from Sci-Tech Intelligence Resources Based on BiLSTM and Conditional Random Field" in IEICE TRANSACTIONS on Information, vol. E104-D, no. 8, pp. 1214-1221, August 2021, doi: 10.1587/transinf.2020BDP0007.
Abstract: There are many knowledge entities in sci-tech intelligence resources. Extracting these knowledge entities is of great importance for building knowledge networks, exploring the relationship between knowledge, and optimizing search engines. Many existing methods, which are mainly based on rules and traditional machine learning, require significant human involvement, but still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach for knowledge entity extraction based on BiLSTM and conditional random field (CRF). A BiLSTM neural network is used to obtain the context information of sentences, and a CRF is then employed to integrate global label information to achieve optimal labels. This approach does not require the manual construction of features, and outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the existing sci-tech literature are processed, of which 50,243 items are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and corresponding knowledge networks are established with a further elaboration on the correlation of two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020BDP0007/_p

@ARTICLE{e104-d_8_1214,
author={Weizhi LIAO and Mingtong HUANG and Pan MA and Yu WANG},
journal={IEICE TRANSACTIONS on Information},
title={Extracting Knowledge Entities from Sci-Tech Intelligence Resources Based on BiLSTM and Conditional Random Field},
year={2021},
volume={E104-D},
number={8},
pages={1214-1221},
abstract={There are many knowledge entities in sci-tech intelligence resources. Extracting these knowledge entities is of great importance for building knowledge networks, exploring the relationship between knowledge, and optimizing search engines. Many existing methods, which are mainly based on rules and traditional machine learning, require significant human involvement, but still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach for knowledge entity extraction based on BiLSTM and conditional random field (CRF). A BiLSTM neural network is used to obtain the context information of sentences, and a CRF is then employed to integrate global label information to achieve optimal labels. This approach does not require the manual construction of features, and outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the existing sci-tech literature are processed, of which 50,243 items are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and corresponding knowledge networks are established with a further elaboration on the correlation of two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.},
keywords={sci-tech intelligence resources, knowledge entity, sequence labeling, BiLSTM-CRF},
doi={10.1587/transinf.2020BDP0007},
ISSN={1745-1361},
month={August},
}

TY - JOUR
TI - Extracting Knowledge Entities from Sci-Tech Intelligence Resources Based on BiLSTM and Conditional Random Field
T2 - IEICE TRANSACTIONS on Information
SP - 1214
EP - 1221
AU - Weizhi LIAO
AU - Mingtong HUANG
AU - Pan MA
AU - Yu WANG
PY - 2021
DO - 10.1587/transinf.2020BDP0007
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2021
AB - There are many knowledge entities in sci-tech intelligence resources. Extracting these knowledge entities is of great importance for building knowledge networks, exploring the relationship between knowledge, and optimizing search engines. Many existing methods, which are mainly based on rules and traditional machine learning, require significant human involvement, but still suffer from unsatisfactory extraction accuracy. This paper proposes a novel approach for knowledge entity extraction based on BiLSTM and conditional random field (CRF). A BiLSTM neural network is used to obtain the context information of sentences, and a CRF is then employed to integrate global label information to achieve optimal labels. This approach does not require the manual construction of features, and outperforms conventional methods. In the experiments presented in this paper, the titles and abstracts of 20,000 items in the existing sci-tech literature are processed, of which 50,243 items are used to build benchmark datasets. Based on these datasets, comparative experiments are conducted to evaluate the effectiveness of the proposed approach. Knowledge entities are extracted and corresponding knowledge networks are established with a further elaboration on the correlation of two different types of knowledge entities. The proposed research has the potential to improve the quality of sci-tech information services.
ER -
