Online BFGS (O-BFGS) Is Not Necessarily Convergent

From: 鈴木広大
Revision as of 04:15, 14 November 2025 (Fri) by 184.178.172.14 (talk) (page created)


Limited-memory BFGS (L-BFGS or LM-BFGS) is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden-Fletcher-Goldfarb-Shanno algorithm (BFGS) using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. Whereas BFGS stores a dense n × n approximation to the inverse Hessian (n being the number of variables in the problem), L-BFGS stores only a few vectors that represent the approximation implicitly. Because of its resulting linear memory requirement, the L-BFGS method is particularly well suited to optimization problems with many variables. The two-loop recursion formula is widely used by unconstrained optimizers because it multiplies by the inverse Hessian efficiently. However, it does not allow the explicit formation of either the direct or the inverse Hessian, and it is incompatible with non-box constraints. An alternative approach is the compact representation, which uses a low-rank representation of the direct and/or inverse Hessian: the Hessian is expressed as the sum of a diagonal matrix and a low-rank update. Such a representation enables the use of L-BFGS in constrained settings, for example as part of the SQP method.
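The two-loop recursion mentioned above can be sketched in a few lines. The following is a minimal illustration, not a reference implementation: the function name `two_loop_direction`, the list-based vectors, and the choice of a scaled identity as the initial inverse Hessian are assumptions made for this example. It takes the current gradient and the stored pairs s_i = x_{i+1} - x_i and y_i = g_{i+1} - g_i (oldest first) and returns the search direction without ever forming an n × n matrix.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def two_loop_direction(grad, s_list, y_list):
    """Return -H @ grad, where H is the implicit inverse-Hessian approximation.

    s_list[i] and y_list[i] are the stored step and gradient differences,
    ordered oldest first. Each pair is assumed to satisfy dot(s, y) > 0.
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_list, y_list)]

    # First loop: walk the history from the newest pair to the oldest.
    alphas = []
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]

    # Initial inverse Hessian: identity scaled by the most recent curvature
    # estimate (a common choice, assumed here), or the identity if no history.
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]

    # Second loop: walk the history from the oldest pair to the newest.
    for s, y, rho, alpha in zip(s_list, y_list, rhos, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]

    return [-ri for ri in r]


# Example: f(x) = 0.5 * (x0**2 + 4 * x1**2), so each y is the Hessian times s.
s_history = [[1.0, 0.0], [0.0, 1.0]]
y_history = [[1.0, 0.0], [0.0, 4.0]]
direction = two_loop_direction([3.0, 8.0], s_history, y_history)
```

For this small quadratic the two stored pairs span the whole space, so the returned direction coincides with the exact Newton step. Memory use is proportional to the number of stored pairs times n, which is the linear memory requirement described above.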



Since BFGS (and hence L-BFGS) is designed to minimize smooth functions without constraints, the L-BFGS algorithm must be modified to handle functions that include non-differentiable components or constraints. A popular class of modifications are called active-set methods, based on the concept of the active set. The idea is that when restricted to a small neighborhood of the current iterate, the function and constraints can be simplified. The L-BFGS-B algorithm extends L-BFGS to handle simple box constraints (also known as bound constraints) on variables; that is, constraints of the form li ≤ xi ≤ ui, where li and ui are per-variable constant lower and upper bounds, respectively (for each xi, either or both bounds may be omitted). The method works by identifying fixed and free variables at every step (using a simple gradient method), then applying the L-BFGS method to the free variables only to obtain higher accuracy, and then repeating the process. The orthant-wise limited-memory quasi-Newton variant (OWL-QN) of Andrew and Gao (2007), intended for L₁-regularized models, is likewise an active-set type method: at each iterate, it estimates the sign of each component of the variable and restricts the next step to have the same sign.
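The split into fixed and free variables under box constraints can be illustrated with a deliberately simplified check. This sketch is an assumption-laden illustration rather than the actual L-BFGS-B procedure, which determines the active set from a generalized Cauchy point along the projected gradient path. The helper names `project` and `free_variables` are hypothetical, and `None` stands for an omitted bound.

```python
def project(x, lower, upper):
    """Clamp each coordinate into [lower_i, upper_i]; None means no bound."""
    out = []
    for xi, lo, hi in zip(x, lower, upper):
        if lo is not None and xi < lo:
            xi = lo
        if hi is not None and xi > hi:
            xi = hi
        out.append(xi)
    return out


def free_variables(x, grad, lower, upper):
    """Return indices of variables that are not pinned at a bound.

    A variable is treated as fixed when it sits on a bound and the
    steepest-descent direction (-grad) would push it outside the box.
    """
    free = []
    for i, (xi, gi, lo, hi) in enumerate(zip(x, grad, lower, upper)):
        pinned_low = lo is not None and xi <= lo and gi > 0
        pinned_high = hi is not None and xi >= hi and gi < 0
        if not (pinned_low or pinned_high):
            free.append(i)
    return free


# Example: x0 >= 0, x1 unbounded, 0 <= x2 <= 2.
lower = [0.0, None, 0.0]
upper = [None, None, 2.0]
x = project([-1.0, 0.5, 3.0], lower, upper)          # clamps x0 and x2
free = free_variables(x, [1.0, -0.3, -2.0], lower, upper)
```

In this example only the unbounded middle variable remains free, so a quasi-Newton step would be computed in that one-dimensional subspace while the other two variables stay on their bounds.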



With the signs fixed, the non-differentiable term of the objective becomes smooth and can be handled by L-BFGS. After an L-BFGS step, the method allows some variables to change sign, and repeats the process. Schraudolph et al. present an online approximation to both BFGS and L-BFGS. Similar to stochastic gradient descent, this can be used to reduce the computational complexity by evaluating the error function and gradient on a randomly drawn subset of the overall dataset in each iteration. Online L-BFGS has been shown to converge almost surely (Mokhtari and Ribeiro 2015), whereas the online approximation of BFGS (O-BFGS) is not necessarily convergent. R's optim general-purpose optimizer routine uses the L-BFGS-B method. SciPy's optimization module's minimize method also includes an option to use L-BFGS-B. A reference implementation is available in Fortran 77 (with a Fortran 90 interface). This version, as well as older versions, has been converted to many other languages.

Liu, D. C.; Nocedal, J. (1989). "On the Limited Memory Method for Large Scale Optimization".
Malouf, Robert (2002). "A comparison of algorithms for maximum entropy parameter estimation". Proceedings of the Sixth Conference on Natural Language Learning (CoNLL-2002).



Andrew, Galen; Gao, Jianfeng (2007). "Scalable training of L₁-regularized log-linear models". Proceedings of the 24th International Conference on Machine Learning.
Matthies, H.; Strang, G. (1979). "The solution of non linear finite element equations". International Journal for Numerical Methods in Engineering. 14 (11): 1613-1626. Bibcode:1979IJNME..14.1613M.
Nocedal, J. (1980). "Updating Quasi-Newton Matrices with Limited Storage".
Byrd, R. H.; Nocedal, J.; Schnabel, R. B. (1994). "Representations of Quasi-Newton Matrices and their use in Limited Memory Methods". Mathematical Programming. 63 (4): 129-156. doi:10.1007/BF01582063.
Byrd, R. H.; Lu, P.; Nocedal, J.; Zhu, C. (1995). "A Limited Memory Algorithm for Bound Constrained Optimization". SIAM J. Sci. Comput.
Zhu, C.; Byrd, Richard H.; Lu, Peihuang; Nocedal, Jorge (1997). "L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization". ACM Transactions on Mathematical Software.
Schraudolph, N.; Yu, J.; Günter, S. (2007). A stochastic quasi-Newton method for online convex optimization.
Mokhtari, A.; Ribeiro, A. (2015). "Global convergence of online limited memory BFGS" (PDF). Journal of Machine Learning Research.
Mokhtari, A.; Ribeiro, A. (2014). "RES: Regularized Stochastic BFGS Algorithm". IEEE Transactions on Signal Processing. 62 (23): 6089-6104. arXiv:1401.7625.
Morales, J. L.; Nocedal, J. (2011). "Remark on "algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization"". ACM Transactions on Mathematical Software.
Haghighi, Aria (2 Dec 2014). "Numerical Optimization: Understanding L-BFGS".
Pytlak, Radoslaw (2009). "Limited Memory Quasi-Newton Algorithms". Conjugate Gradient Algorithms in Nonconvex Optimization.