DOI:10.19850/j.cnki.2096-4706.2021.15.028
Fund project: 2019 Scientific Research Project of the Hunan Provincial Department of Education (19C0976)
CLC number: TP181    Document code: A    Article ID: 2096-4706(2021)15-0109-04
Research on a First-order Stochastic Optimization Algorithm Embedded with the Stabilized Barzilai-Borwein Step Size
SHI Weijuan1,2, Adibah Shuib2, Zuraida Alwadood2
(1. School of Mathematics and Finance, Hunan University of Humanities, Science and Technology, Loudi 417000, China; 2. Faculty of Computer and Mathematical Sciences, MARA University of Technology, Shah Alam 40450, Malaysia)
Abstract: In recent years, machine learning has developed rapidly, achieving many theoretical breakthroughs while being applied widely across fields. It gives systems the ability to access data and to solve complex problems by learning and improving from past experience, enabling machines to perform cognitive functions. In this paper, the step size is computed automatically with an improved Stabilized Barzilai-Borwein (SBB) method and combined with SVRG to form a new algorithm, SVRG-SBB. We prove theoretically that the new algorithm converges and can effectively solve common problems in machine learning.
Keywords: stochastic optimization; SBB step size; machine learning
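The abstract describes SVRG-SBB only at a high level. As a rough illustration, the Python sketch below combines the standard SVRG inner loop with an epsilon-stabilized Barzilai-Borwein epoch step size. The exact stabilization rule, the constants eps and eta0, and the helper grad_i are assumptions made for illustration, not the paper's definitive formulation.

```python
import numpy as np

def svrg_sbb(grad_i, x0, n, m=None, epochs=20, eps=1e-3, eta0=0.01, seed=0):
    """Illustrative SVRG with an SBB-type (stabilized Barzilai-Borwein) step size.

    grad_i(x, i) -- gradient of the i-th component function at x
    n            -- number of component functions f_1, ..., f_n
    m            -- inner-loop length (defaults to n)
    eps, eta0    -- stabilization constant and first-epoch fallback step (assumed)
    """
    rng = np.random.default_rng(seed)
    m = m or n
    x_tilde = np.asarray(x0, dtype=float).copy()
    x_prev = g_prev = None
    eta = eta0  # no BB pair exists yet in the first epoch
    for _ in range(epochs):
        # Full gradient at the current snapshot (the standard SVRG anchor).
        g_full = sum(grad_i(x_tilde, i) for i in range(n)) / n
        if x_prev is not None:
            dx, dg = x_tilde - x_prev, g_full - g_prev
            sq = dx @ dx
            if sq > 0:
                # BB ratio ||dx||^2 / |<dx, dg>|, damped by eps*||dx||^2 so the
                # step stays bounded even when <dx, dg> is close to zero.
                eta = sq / (m * (abs(dx @ dg) + eps * sq))
        x_prev, g_prev = x_tilde.copy(), g_full.copy()
        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient of SVRG.
            v = grad_i(x, i) - grad_i(x_tilde, i) + g_full
            x -= eta * v
        x_tilde = x
    return x_tilde

# Example: least squares sum_i (a_i^T x - b_i)^2 / 2 over random data.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
    grad = lambda x, i: (A[i] @ x - b[i]) * A[i]
    x_star = svrg_sbb(grad, np.zeros(5), n=100)
    print(np.linalg.norm(A @ x_star - b))
```

Note that in this formulation the damping term bounds the step size above by 1/(m*eps) regardless of how small the curvature estimate <dx, dg> becomes, which is the intuition behind the stabilization and the convergence claim in the abstract.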
About the author: SHI Weijuan (1988—), female, Han ethnicity, born in Loudi, Hunan; lecturer with a master's degree; research interests: optimization algorithms and their applications.