Fault Diagnosis Method of Gear Assembly under Imbalanced Data Set
WANG Zhe 1,2, XU Xi 1,2, ZHANG Bisheng 3, HUANG Xiaowei 3, HU Wanli 4
(1. School of Computer Science, Hunan University of Technology, Zhuzhou 412007, China; 2. Key Laboratory of Intelligent Information Perception and Processing Technology of Hunan Province, Hunan University of Technology, Zhuzhou 412007, China; 3. Bosch Automotive Products (Changsha) Co., Ltd., Changsha 410100, China; 4. Changsha Robot Technology Co., Ltd., Changsha 410100, China)
DOI: 10.19850/j.cnki.2096-4706.2023.06.035
Funding: Scientific Research Fund of the Hunan Provincial Education Department (19K026); Hunan Provincial Key Laboratory Construction Project (2020KF02)
CLC number: TP391.4; TP181  Document code: A  Article ID: 2096-4706(2023)06-0139-05
Abstract: The gear assembly process for automotive parts is often accompanied by multiple types of faults, and identifying the fault type quickly and accurately is important for keeping the gear assembly station running stably. This paper therefore proposes SMOTE-RF, a fault diagnosis model that combines the SMOTE sampling method with the Random Forest (RF) classification method. First, because the fault data collected during actual gear assembly are imbalanced, the SMOTE algorithm is used to generate a balanced fault data set. Second, the balanced data are used as the input of the Random Forest algorithm to classify the faults. Finally, the performance of the model is evaluated. The experimental results show that the SMOTE-RF model classifies faults better than SVM and XGBoost.
Keywords: fault diagnosis; imbalanced data; SMOTE algorithm; Random Forest
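The balancing step described in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling, written with the Python standard library only. It is not the authors' implementation: the function names `smote` and `balance`, the toy data, the label `fault_a`, and the choice of `k` are all assumptions made for illustration. Each synthetic point is placed at a random position on the line segment between a randomly chosen minority sample and one of its k nearest minority neighbours. In practice, a library implementation such as `SMOTE` from imbalanced-learn, followed by `RandomForestClassifier` from scikit-learn, would likely be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(samples, labels, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(list(x))
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        extra = smote(rows, target - len(rows), k, seed)
        for row in rows + extra:
            out_x.append(row)
            out_y.append(y)
    return out_x, out_y


# Toy data (hypothetical): 6 "normal" samples and 3 samples of one fault class
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
     [2.0, 2.1], [2.2, 1.9], [2.4, 2.3]]
y = ["normal"] * 6 + ["fault_a"] * 3

X_bal, y_bal = balance(X, y, k=2)
print(len(X_bal), y_bal.count("normal"), y_bal.count("fault_a"))
```

After balancing, both classes contain the same number of samples, and `X_bal`, `y_bal` would then be passed to the classifier. Oversampling should be applied to the training split only, so that synthetic points do not leak into the data used for evaluation.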
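About the author: WANG Zhe (born 1997), male, Han ethnicity, from Changsha, Hunan; master's student; research interest: industrial Internet of Things.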