Computer Technology, 2020, Issue 23

Comparative Study of Diabetes Risk Prediction Models Based on Physical Examination Data

MA Wenbin¹, WANG Ke², YU Bin³, FENG Chaonan³, JI Jun¹,³
(1. Qingdao University, Qingdao 266071, Shandong, China; 2. East Hospital of Qingdao Municipal Hospital, Qingdao 266071, Shandong, China; 3. Beijing Wanlingpangu Technology Co., Ltd., Beijing 100089, China)

Abstract: As the number of diabetes patients in China and the associated mortality continue to rise, effective detection and sound prediction of fasting blood glucose have become a focus of current research. This study uses data mining to build a model that predicts changes in fasting blood glucose from physical examination data: medical examination data from the previous three years are used to predict the change in fasting blood glucose in the fourth year. The experimental data were collected from a medical examination database. In the feature selection stage, principal component analysis (PCA) is used to select the best feature subset, and five machine learning algorithms are then used to build models and predict disease risk. The experimental results show that the random forest model performs best for diabetes risk prediction.

Keywords: fasting blood glucose; machine learning; PCA; physical examination data; diabetes prediction

CLC number: TP311.13; R587.1    Document code: A    Article ID: 2096-4706(2020)23-0072-04
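
The setup described in the abstract (three years of examination data as input, the fourth-year fasting blood glucose as target, PCA applied before modelling) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the labelling rule, the indicator names, or how many components were retained, so everything named below (`build_sample`, `standardize`, `first_component`, the `fbg` key, and the 7.0 mmol/L cut-off) is an assumption made for illustration. The five classifiers are not shown, and only the leading principal direction is computed.

```python
import math
from typing import Dict, List, Tuple

Exam = Dict[str, float]    # one yearly exam: indicator name -> value
History = Dict[int, Exam]  # year -> exam for a single person

FBG = "fbg"      # hypothetical key for fasting blood glucose (mmol/L)
HIGH_FBG = 7.0   # assumed labelling cut-off; the source does not state one


def build_sample(history: History, first_year: int,
                 indicators: List[str]) -> Tuple[List[float], int]:
    """Features from years 1-3, binary label from the year-4 fasting glucose."""
    years = [first_year + k for k in range(4)]
    features = [history[y][name] for y in years[:3] for name in indicators]
    label = 1 if history[years[3]][FBG] >= HIGH_FBG else 0
    return features, label


def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Scale each column to zero mean and unit variance (constant columns -> 0)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)] for row in rows]


def first_component(rows: List[List[float]],
                    iterations: int = 200) -> List[float]:
    """Leading principal direction of centred rows, found by power iteration."""
    n, d = len(rows), len(rows[0])
    # Covariance matrix of the (already centred) data.
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    # Asymmetric start vector, so it is unlikely to be orthogonal
    # to the leading direction.
    vec = [1.0 / (i + 1) for i in range(d)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # data has no variance along the current vector
        vec = [v / norm for v in nxt]
    return vec


def project(rows: List[List[float]], direction: List[float]) -> List[float]:
    """Score of each row along one principal direction."""
    return [sum(v * w for v, w in zip(row, direction)) for row in rows]
```

In use, `build_sample` would be called once per person to assemble a feature matrix, the matrix would be passed through `standardize`, and the scores from `project` (or from several components in a fuller PCA) would become the inputs to whichever classifiers are being compared.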


Funding: National Natural Science Foundation of China (61503208); Natural Science Foundation of Shandong Province, Cultivation Fund Project (ZR2015PF002)


About the author: MA Wenbin (b. 1994), male, Han ethnicity, from Heze, Shandong; master's student; research interest: medical big data.