
Information Technology, 2021, Issue 6

Research on Named Entity Recognition Based on Multiple Words Used Information Methods

GUO Peng, LIU Junnan
(Innovem Technology (Tianjin) Co., Ltd., Tianjin 300384, China)

Abstract: This paper studies how fusing word information can enhance Chinese named entity recognition and proposes a neural network model that fuses word information for this task. First, the pre-trained language model BERT encodes each character to obtain character representations. Next, SoftLexicon, a statistics-based method, fuses the statistical semantic information of words into the character representations. Finally, the proposed GraphLexicon fuses the character and word representations with each other according to the graph structure of the interactions between characters and words in the text, achieving high named entity recognition accuracy.

Keywords: Chinese named entity recognition; graph neural network; fusion; word information; character-word interaction; graph structure

DOI: 10.19850/j.cnki.2096-4706.2021.06.007

CLC number: TP183    Document code: A    Article ID: 2096-4706(2021)06-0025-04
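
The SoftLexicon step named in the abstract can be illustrated with a minimal sketch. In the general SoftLexicon idea, each character collects the lexicon words that match the text at its position into four sets: B (the character begins the word), M (the character lies inside the word), E (the character ends the word), and S (the character alone is a word). The function name `soft_lexicon_sets`, the toy lexicon, and the `max_word_len` parameter are assumptions made for illustration only; this is not the authors' implementation, and the later step that pools word embeddings by frequency and appends them to the character representation is omitted.

```python
def soft_lexicon_sets(sentence, lexicon, max_word_len=4):
    """Collect, for every character, the matched lexicon words in B/M/E/S sets.

    sentence:     the input text as a string of characters
    lexicon:      a set of known words (hypothetical toy lexicon here)
    max_word_len: longest candidate word to look up (an assumed cutoff)
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Try every substring that begins at `start`, up to max_word_len long.
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)    # single-character word
                continue
            sets[start]["B"].add(word)        # first character of the word
            sets[end - 1]["E"].add(word)      # last character of the word
            for mid in range(start + 1, end - 1):
                sets[mid]["M"].add(word)      # interior characters
    return sets


# Toy example: the lexicon below is invented for illustration.
toy_lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥", "江"}
text = "南京市长江大桥"
for char, matched in zip(text, soft_lexicon_sets(text, toy_lexicon)):
    print(char, {tag: sorted(words) for tag, words in matched.items()})
```

Keeping all four sets per character preserves every lexicon match rather than committing to a single segmentation, which is what lets the word information be merged into the character representation without changing the sequence model.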



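About the authors: GUO Peng (born 1988), male, Han ethnicity, from Xinyang, Henan; chief engineer; master's degree; research interests: wireless communication and artificial intelligence. LIU Junnan (born 1990), male, Han ethnicity, from Tianjin; intermediate software engineer; bachelor's degree; research interests: speech recognition and natural language processing.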