
Computer Technology, 2022, Issue 8




DOI:10.19850/j.cnki.2096-4706.2022.08.023


CLC number: TP18                                         Document code: A                                       Article ID: 2096-4706(2022)08-0082-04


A Horizontal Federated Learning Scheme Based on Cluster Analysis

ZHAO Junjie, ZHANG Guoxing, YANG Jie

(School of Computer Science, South-Central Minzu University, Wuhan 430074, China)

Abstract: Federated learning is a distributed learning method in which participants collaboratively train a model while keeping their data local, sending only model parameters to the server, thereby preserving data security. Research has found that during model training, some training data may be poisoned or maliciously tampered with, making it difficult for the trained model to achieve good prediction performance. This paper therefore proposes a participant evaluation algorithm based on cluster analysis, which jointly analyzes the participants' contributions and takes corresponding measures to defend against poisoning attacks. Experimental results demonstrate that the scheme is reasonable and effective, and that it successfully prevents poisoning attacks in horizontal federated learning.

Keywords: cluster analysis; federated learning; normal distribution
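The paper's algorithm is only summarized above, not reproduced here. As an illustration of the general idea — screening client updates on the server before aggregation — the following is a minimal sketch that flags updates far from the mean update under a normality (z-score) assumption, echoing the "cluster analysis" and "normal distribution" keywords. The function name, the z-score threshold, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def filter_poisoned_updates(updates, z_thresh=2.0):
    """Drop client updates whose distance from the mean update is an
    outlier under a normality assumption (hypothetical criterion;
    the paper's exact rule is not shown here)."""
    updates = np.asarray(updates, dtype=float)   # shape: (clients, params)
    center = updates.mean(axis=0)                # tentative aggregate
    dists = np.linalg.norm(updates - center, axis=1)
    z = (dists - dists.mean()) / (dists.std() + 1e-12)
    keep = z < z_thresh                          # discard far-out clients
    return updates[keep].mean(axis=0), keep

# Toy example: 5 honest clients near the true update, 1 poisoned client
# submitting a sign-flipped, scaled-up update.
rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.05, size=(5, 4))
poisoned = np.full((1, 4), -10.0)
agg, keep = filter_poisoned_updates(np.vstack([honest, poisoned]))
```

After filtering, `keep` marks the poisoned client as rejected and `agg` is close to the honest clients' true update; a real deployment would cluster full parameter vectors across rounds rather than apply a single one-shot distance test.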




Author biography: ZHAO Junjie (1996—), male, Han nationality, born in Guilin, Guangxi; master's degree candidate; research interest: information security.