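Abstract: As big data has been applied widely and deeply across industries with good results, many scholars in the archives field have studied and practiced the application of big data to archival information. Archival information is first preprocessed with artificial intelligence techniques: OCR of text archives using OpenCV, ASR (automatic speech recognition) of audio and video archives, and face recognition. The resulting digitized archival information is then structured with a hidden Markov model (HMM), and finally organized into big data application practices such as “one file for one person, one file for one thing”.
Keywords: OCR; speech recognition; face recognition; data structuring; one file for one person; one file for one thing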
DOI:10.19850/j.cnki.2096-4706.2021.23.036
CLC number: TP39  Document code: A  Article ID: 2096-4706(2021)23-0142-03
Preliminary Practice of Application of Big Data in Archival Information
ZHU Mengling
(Guangdong Yunxun Information Technology Co., Ltd., Huizhou 516000, China)
Abstract: As big data has been applied widely and deeply across industries with good results, many scholars in the archives field have studied and practiced the application of big data to archival information. Archival information is first preprocessed with artificial intelligence techniques: OCR of text archives using OpenCV, ASR (automatic speech recognition) of audio and video archives, and face recognition. The resulting digitized archival information is then structured with a hidden Markov model (HMM), and finally organized into big data application practices such as “one file for one person, one file for one thing”.
Keywords: OCR; speech recognition; face recognition; data structuring; one file for one person; one file for one thing
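The HMM-based structuring step mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the model, so everything below is an assumption made for illustration: the field labels (`NAME`, `DATE`, `OTHER`), the coarse observation symbols, and the hand-set probabilities are all hypothetical, and English tokens stand in for recognized archive text. The code applies standard Viterbi decoding to assign one field label to each token.

```python
import math

# Hypothetical field labels (hidden states); not taken from the paper.
STATES = ["NAME", "DATE", "OTHER"]

# Hand-set illustrative probabilities; a real system would estimate them from labeled data.
START = {"NAME": 0.4, "DATE": 0.2, "OTHER": 0.4}
TRANS = {
    "NAME": {"NAME": 0.2, "DATE": 0.3, "OTHER": 0.5},
    "DATE": {"NAME": 0.2, "DATE": 0.2, "OTHER": 0.6},
    "OTHER": {"NAME": 0.3, "DATE": 0.3, "OTHER": 0.4},
}
EMIT = {
    "NAME": {"has_digit": 0.05, "capitalized": 0.8, "plain": 0.15},
    "DATE": {"has_digit": 0.9, "capitalized": 0.05, "plain": 0.05},
    "OTHER": {"has_digit": 0.1, "capitalized": 0.2, "plain": 0.7},
}


def observe(token):
    """Map a token to one of three coarse observation symbols."""
    if any(ch.isdigit() for ch in token):
        return "has_digit"
    if token[:1].isupper():
        return "capitalized"
    return "plain"


def viterbi(tokens):
    """Return the most likely field label for each token (log-space Viterbi)."""
    obs = [observe(t) for t in tokens]
    if not obs:
        return []
    # Scores for the first token: start probability times emission probability.
    score = [{s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for o in obs[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


labels = viterbi(["Zhang", "joined", "2021-03"])
print(labels)  # ['NAME', 'OTHER', 'DATE']
```

With labels like these, tokens tagged `NAME` could serve as the key under which records are grouped into a per-person file. That grouping step is likewise only a plausible reading of the abstract, not a described implementation.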
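About the author: ZHU Mengling (b. 1997), female, Han ethnicity, from Huanggang, Hubei; bachelor's degree in engineering; research interest: archival big data.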