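Abstract: Accurate recognition of handwritten digits is a key technology underlying many artificial intelligence tasks. This paper proposes a method for handwritten digit recognition using KNN (K-Nearest Neighbors) and validates it on the MNIST dataset. The MNIST files are first preprocessed; the KNN algorithm is then implemented in Python on top of the sklearn library; finally, a KNN model is fitted on the training set. The model reaches a recognition accuracy of 0.9691 on the test set, which indicates that KNN is an effective approach to handwritten digit recognition.
Keywords: handwritten digit recognition; KNN; MNIST; Python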
DOI:10.19850/j.cnki.2096-4706.2021.04.024
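Funding: Lingnan Normal University Natural Science Talent Special Project (ZL2021015); General Project of the Guangdong Provincial Key Laboratory of Development and Education for Special Needs Children (TJ202011)
CLC number: TP391.4  Document code: A  Article ID: 2096-4706(2021)04-0097-03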
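The abstract mentions preprocessing the MNIST files without describing the procedure. As a minimal sketch, assuming the preprocessing starts from the raw IDX files in which MNIST is distributed, the following reads image and label buffers using only the Python standard library. The function names and the scaling of pixel values to [0, 1] are illustrative assumptions, not the authors' procedure.

```python
import struct

def parse_idx_images(data: bytes):
    """Parse an IDX image buffer into flat pixel rows scaled to [0, 1]."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:  # 0x00000803: unsigned bytes, 3 dimensions
        raise ValueError(f"unexpected image magic number: {magic}")
    size = rows * cols
    pixels = data[16:16 + count * size]
    if len(pixels) != count * size:
        raise ValueError("image buffer is truncated")
    # Assumption: scale 0..255 grey levels to 0..1 (not stated in the source).
    return [[b / 255.0 for b in pixels[i * size:(i + 1) * size]]
            for i in range(count)]

def parse_idx_labels(data: bytes):
    """Parse an IDX label buffer into a list of integer class labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:  # 0x00000801: unsigned bytes, 1 dimension
        raise ValueError(f"unexpected label magic number: {magic}")
    labels = data[8:8 + count]
    if len(labels) != count:
        raise ValueError("label buffer is truncated")
    return list(labels)

# Tiny synthetic buffers (two 2x2 "images") standing in for the real files.
demo_images = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(
    [0, 255, 0, 255, 255, 0, 255, 0])
demo_labels = struct.pack(">II", 2049, 2) + bytes([7, 3])

images = parse_idx_images(demo_images)
labels = parse_idx_labels(demo_labels)
print(len(images), len(images[0]), labels)
```

For the real dataset, the buffers would come from the downloaded files (typically gzip-compressed, so they can be read with gzip.open(path, "rb").read()), and each image row would hold 28 x 28 = 784 values.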
A Method for Realizing Handwritten Numeral Recognition Using KNN
LU Liqiong 1,2, WU Dong 2
(1. School of Information Engineering, Lingnan Normal University, Zhanjiang 524048, China; 2. Guangdong Provincial Key Laboratory of Development and Education for Special Needs Children, Lingnan Normal University, Zhanjiang 524048, China)
Abstract: Accurate recognition of handwritten numerals is a key technology for many artificial intelligence tasks. A method for handwritten numeral recognition using KNN (K-Nearest Neighbors) is proposed and verified on the MNIST dataset. First, the MNIST files are preprocessed; then the KNN algorithm is implemented in Python based on the sklearn library; finally, the KNN model is trained on the training set. The recognition accuracy of the model on the test set is 0.9691, which shows that KNN is an effective method for handwritten numeral recognition.
Keywords: handwritten numeral recognition; KNN; MNIST; Python
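The pipeline summarized in the abstract (fit a KNN classifier with sklearn on a training set, then measure accuracy on a held-out test set) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses the small 8x8 digits dataset bundled with sklearn as a stand-in for MNIST so that it runs without any download, and the value k = 3, the 80/20 split, and the random seed are arbitrary choices that do not come from the paper.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 1797 grey-scale 8x8 digit images (NOT the 28x28 MNIST set).
digits = load_digits()
X = digits.data / 16.0   # pixel values range from 0 to 16 in this dataset
y = digits.target

# Assumption: an 80/20 stratified split; MNIST ships with its own fixed split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Assumption: k = 3 with the default Euclidean (Minkowski, p=2) distance.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)   # KNN "training" only stores the samples

# score() returns the fraction of correctly classified test samples.
accuracy = knn.score(X_test, y_test)
print(f"test accuracy: {accuracy:.4f}")
```

Because the dataset, the split, and k all differ from the paper's setup, the accuracy printed here is not comparable to the reported 0.9691. Swapping in the MNIST training and test arrays for X_train, y_train, X_test, and y_test would give the corresponding experiment.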
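About the authors: LU Liqiong (born 1980), female, Han ethnicity, from Chongyang, Hubei; lecturer, Ph.D.; main research interest: text recognition. WU Dong (born 1981), male, Han ethnicity, from Hepu, Guangdong; associate professor, master's degree; main research interest: pattern recognition.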