Optimization

  • On the adaptivity of stochastic gradient-based optimization. L. Lei and M. I. Jordan. arxiv.org/abs/1904.04480, 2019.

  • Is there an analog of Nesterov acceleration for MCMC? Y.-A. Ma, X. Cheng, N. Flammarion, P. Bartlett, and M. I. Jordan. arxiv.org/abs/1902.00996, 2019.

  • Acceleration via symplectic discretization of high-resolution differential equations. B. Shi, S. Du, W. Su, and M. I. Jordan. arxiv.org/abs/1902.03694, 2019.

  • On efficient optimal transport: An analysis of greedy and accelerated mirror descent algorithms. T. Lin, N. Ho, and M. I. Jordan. arxiv.org/abs/1901.06482, 2019.

  • First-order methods almost always avoid strict saddle points. J. Lee, I. Panageas, G. Piliouras, M. Simchowitz, M. I. Jordan, and B. Recht. Mathematical Programming, Series B, to appear.

  • Dynamical, symplectic and stochastic perspectives on gradient-based optimization. M. I. Jordan. Proceedings of the International Congress of Mathematicians, 1, 523-550, 2018.

  • Understanding the acceleration phenomenon via high-resolution differential equations. B. Shi, S. Du, M. I. Jordan, and W. Su. arxiv.org/abs/1810.08907, 2018.

  • Rao-Blackwellized stochastic gradients for discrete distributions. R. Liu, J. Regier, N. Tripuraneni, M. I. Jordan, and J. McAuliffe. arxiv.org/abs/1810.04777, 2018.

  • On symplectic optimization. M. Betancourt, M. I. Jordan, and A. Wilson. arxiv.org/abs/1802.03653, 2018.

  • CoCoA: A general framework for communication-efficient distributed optimization. V. Smith, S. Forte, C. Ma, M. Takac, M. I. Jordan, and M. Jaggi. Journal of Machine Learning Research, 18, 1-49, 2018.

  • Minimizing nonconvex population risk from rough empirical risk. C. Jin, L. Liu, R. Ge, and M. I. Jordan. arxiv.org/abs/1803.09357, 2018.

  • Underdamped Langevin MCMC: A non-asymptotic analysis. X. Cheng, N. Chatterji, P. Bartlett, and M. I. Jordan. Proceedings of the Conference on Computational Learning Theory (COLT), Stockholm, Sweden, 2018.

  • Averaging stochastic gradient descent on Riemannian manifolds. N. Tripuraneni, N. Flammarion, F. Bach, and M. I. Jordan. Proceedings of the Conference on Computational Learning Theory (COLT), Stockholm, Sweden, 2018.

  • Accelerated gradient descent escapes saddle points faster than gradient descent. C. Jin, P. Netrapalli, and M. I. Jordan. Proceedings of the Conference on Computational Learning Theory (COLT), Stockholm, Sweden, 2018.

  • First-order methods almost always avoid saddle points. J. Lee, I. Panageas, G. Piliouras, M. Simchowitz, M. I. Jordan, and B. Recht. arxiv.org/abs/1710.07406, 2017.

  • Stochastic cubic regularization for fast nonconvex optimization. N. Tripuraneni, M. Stern, C. Jin, J. Regier, and M. I. Jordan. arxiv.org/abs/1711.02838, 2017.

  • Perturbed iterate analysis for asynchronous stochastic optimization. H. Mania, X. Pan, D. Papailiopoulos, B. Recht, K. Ramchandran, and M. I. Jordan. SIAM Journal on Optimization, 27, 2202-2229, 2017.

  • Saturating splines and feature selection. N. Boyd, T. Hastie, S. Boyd, B. Recht, and M. I. Jordan. Journal of Machine Learning Research, to appear.

  • Gradient descent can take exponential time to escape saddle points. S. Du, C. Jin, J. Lee, M. I. Jordan, B. Poczos, and A. Singh. In S. Bengio, R. Fergus, S. Vishwanathan, and H. Wallach (Eds.), Advances in Neural Information Processing Systems (NIPS) 31, 2018.

  • Nonconvex finite-sum optimization via SCSG methods. L. Lei, C. Ju, J. Chen, and M. I. Jordan. In S. Bengio, R. Fergus, S. Vishwanathan, and H. Wallach (Eds.), Advances in Neural Information Processing Systems (NIPS) 31, 2018.

  • Fast black-box variational inference through stochastic trust-region optimization. J. Regier, M. I. Jordan, and J. McAuliffe. In S. Bengio, R. Fergus, S. Vishwanathan, and H. Wallach (Eds.), Advances in Neural Information Processing Systems (NIPS) 31, 2018.

  • How to escape saddle points efficiently. C. Jin, R. Ge, P. Netrapalli, S. Kakade, and M. I. Jordan. In D. Precup and Y. W. Teh (Eds.), Proceedings of the 34th International Conference on Machine Learning (ICML), Sydney, Australia, 2017.

  • Less than a single pass: Stochastically controlled stochastic gradient. L. Lei and M. I. Jordan. In A. Singh and J. Zhu (Eds.), Proceedings of the Twentieth Conference on Artificial Intelligence and Statistics (AISTATS), 2017. [Supplementary information]

  • Distributed optimization with arbitrary local solvers. C. Ma, J. Konecny, M. Jaggi, V. Smith, M. I. Jordan, P. Richtarik, and M. Takac. Optimization Methods and Software, 32, 813-848, 2017.

  • A Lyapunov analysis of momentum methods in optimization. A. Wilson, B. Recht and M. I. Jordan. arXiv:1611.02635, 2016.

  • A variational perspective on accelerated methods in optimization. A. Wibisono, A. Wilson and M. I. Jordan. Proceedings of the National Academy of Sciences, 113, E7351-E7358, 2016. [ArXiv version]

  • Less than a single pass: Stochastically controlled stochastic gradient method. L. Lei and M. I. Jordan. arXiv:1609.03261, 2016.

  • Gradient descent converges to minimizers. J. Lee, M. Simchowitz, M. I. Jordan, and B. Recht. Proceedings of the Conference on Computational Learning Theory (COLT), New York, NY, 2016.

  • A linearly-convergent stochastic L-BFGS algorithm. P. Moritz, R. Nishihara, and M. I. Jordan. Proceedings of the Nineteenth Conference on Artificial Intelligence and Statistics (AISTATS), Cadiz, Spain, 2016.

  • Splash: User-friendly programming interface for parallelizing stochastic algorithms. Y. Zhang and M. I. Jordan. arXiv:1506.07552, 2015.

  • A general analysis of the convergence of ADMM. R. Nishihara, L. Lessard, B. Recht, A. Packard, and M. I. Jordan. arXiv:1502.02009, 2015.

  • Adding vs. averaging in distributed primal-dual optimization. C. Ma, V. Smith, M. Jaggi, M. I. Jordan, P. Richtarik, and M. Takac. arXiv:1502.03508, 2015.

  • On the convergence rate of decomposable submodular function minimization. R. Nishihara, S. Jegelka, and M. I. Jordan. In Z. Ghahramani, M. Welling, C. Cortes and N. Lawrence (Eds.), Advances in Neural Information Processing Systems (NIPS) 28, 2015.

  • Communication-efficient distributed dual coordinate ascent. M. Jaggi, V. Smith, M. Takac, J. Terhorst, T. Hofmann, and M. I. Jordan. In Z. Ghahramani, M. Welling, C. Cortes and N. Lawrence (Eds.), Advances in Neural Information Processing Systems (NIPS) 28, 2015.

  • Parallel double greedy submodular maximization. X. Pan, S. Jegelka, J. Gonzalez, J. Bradley, and M. I. Jordan. In Z. Ghahramani, M. Welling, C. Cortes and N. Lawrence (Eds.), Advances in Neural Information Processing Systems (NIPS) 28, 2015.

  • Optimal rates for zero-order optimization: The power of two function evaluations. J. Duchi, M. I. Jordan, M. Wainwright, and A. Wibisono. IEEE Transactions on Information Theory, 61, 2788-2806, 2015.

  • Computational and statistical tradeoffs via convex relaxation. V. Chandrasekaran and M. I. Jordan. Proceedings of the National Academy of Sciences, 110, E1181-E1190, 2013.

  • The asymptotics of ranking algorithms. J. Duchi, L. Mackey, and M. I. Jordan. Annals of Statistics, 41, 2292-2323, 2013.

  • MAD-Bayes: MAP-based asymptotic derivations from Bayes. T. Broderick, B. Kulis, and M. I. Jordan. In S. Dasgupta and D. McAllester (Eds.), Proceedings of the 30th International Conference on Machine Learning (ICML), Atlanta, GA, 2013. [Supplementary information].

  • Finite sample convergence rates of zero-order stochastic optimization methods. J. Duchi, M. I. Jordan, M. Wainwright, and A. Wibisono. In P. Bartlett, F. Pereira, L. Bottou and C. Burges (Eds.), Advances in Neural Information Processing Systems (NIPS) 26, 2013.

  • Small-variance asymptotics for exponential family Dirichlet process mixture models. K. Jiang, B. Kulis, and M. I. Jordan. In P. Bartlett, F. Pereira, L. Bottou and C. Burges (Eds.), Advances in Neural Information Processing Systems (NIPS) 26, 2013.

  • Ergodic mirror descent. J. C. Duchi, A. Agarwal, M. Johansson, and M. I. Jordan. SIAM Journal on Optimization, 22, 1549-1578, 2012.

  • Variational inference over combinatorial spaces. A. Bouchard-Côté and M. I. Jordan. In J. Shawe-Taylor, R. Zemel, J. Lafferty, and C. Williams (Eds.), Advances in Neural Information Processing Systems (NIPS) 24, 2011. [Supplementary information].

  • Random conic pursuit for semidefinite programming. A. Kleiner, A. Rahimi, and M. I. Jordan. In J. Shawe-Taylor, R. Zemel, J. Lafferty, and C. Williams (Eds.), Advances in Neural Information Processing Systems (NIPS) 24, 2011.

  • Estimating divergence functionals and the likelihood ratio by convex risk minimization. X. Nguyen, M. J. Wainwright and M. I. Jordan. IEEE Transactions on Information Theory, 56, 5847-5861, 2010.

  • Feature selection methods for improving protein structure prediction with Rosetta. B. Blum, M. I. Jordan, D. Kim, R. Das, P. Bradley, and D. Baker. In J. Platt, D. Koller, Y. Singer and A. McCallum (Eds.), Advances in Neural Information Processing Systems (NIPS) 21, 2008.

  • A direct formulation for sparse PCA using semidefinite programming. A. d'Aspremont, L. El Ghaoui, M. I. Jordan, and G. R. G. Lanckriet. SIAM Review, 49, 434-448, 2007. [Winner of the 2008 SIAM Activity Group on Optimization Prize]. [Software].

  • Log-determinant relaxation for approximate inference in discrete Markov random fields. M. J. Wainwright and M. I. Jordan. IEEE Transactions on Signal Processing, 54, 2099-2109, 2006.

  • Structured prediction, dual extragradient and Bregman projections. B. Taskar, S. Lacoste-Julien and M. I. Jordan. Journal of Machine Learning Research, 7, 1627-1653, 2006.

  • Convexity, classification, and risk bounds. P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe. Journal of the American Statistical Association, 101, 138-156, 2006.

  • Structured prediction via the extragradient method. B. Taskar, S. Lacoste-Julien and M. I. Jordan. Advances in Neural Information Processing Systems (NIPS) 19, 2006.

  • Treewidth-based conditions for exactness of the Sherali-Adams and Lasserre relaxations. M. J. Wainwright and M. I. Jordan. Technical Report 671, Department of Statistics, University of California, Berkeley, 2004.

  • Multiple kernel learning, conic duality, and the SMO algorithm. F. R. Bach, G. R. G. Lanckriet, and M. I. Jordan. Proceedings of the 21st International Conference on Machine Learning (ICML), 2004. [Long version]. [Software].

  • Semidefinite relaxations for approximate inference on graphs with cycles. M. J. Wainwright and M. I. Jordan. In S. Thrun, L. Saul, and B. Schoelkopf (Eds.), Advances in Neural Information Processing Systems (NIPS) 17, 2004. [Long version].

  • Learning the kernel matrix with semidefinite programming. G. R. G. Lanckriet, N. Cristianini, L. El Ghaoui, P. L. Bartlett, and M. I. Jordan. Journal of Machine Learning Research, 5, 27-72, 2004.

  • Robust sparse hyperplane classifiers: Application to uncertain molecular profiling data. C. Bhattacharyya, L. R. Grate, M. I. Jordan, L. El Ghaoui, and I. S. Mian. Journal of Computational Biology, 2004.

  • Graphical models, exponential families, and variational inference. M. J. Wainwright and M. I. Jordan. Technical Report 649, Department of Statistics, University of California, Berkeley, 2003.

  • Variational inference in graphical models: The view from the marginal polytope. M. J. Wainwright and M. I. Jordan. Forty-first Annual Allerton Conference on Communication, Control, and Computing, Urbana-Champaign, IL, 2003.

  • Distance metric learning, with application to clustering with side-information. E. P. Xing, A. Y. Ng, M. I. Jordan and S. Russell. In S. Becker, S. Thrun, and K. Obermayer (Eds.), Advances in Neural Information Processing Systems (NIPS) 16, 2003.

  • A robust minimax approach to classification. G. R. G. Lanckriet, L. El Ghaoui, C. Bhattacharyya, and M. I. Jordan. Journal of Machine Learning Research, 3, 555-582, 2002. [Matlab code]

  • Learning the kernel matrix with semidefinite programming. G. R. G. Lanckriet, P. L. Bartlett, N. Cristianini, L. El Ghaoui, and M. I. Jordan. Machine Learning: Proceedings of the Nineteenth International Conference (ICML), San Mateo, CA: Morgan Kaufmann, 2002.

  • Minimax probability machine. G. R. G. Lanckriet, L. El Ghaoui, C. Bhattacharyya, and M. I. Jordan. In T. Dietterich, S. Becker and Z. Ghahramani (Eds.), Advances in Neural Information Processing Systems (NIPS) 15, 2002.

  • PEGASUS: A policy search method for large MDPs and POMDPs. A. Y. Ng and M. I. Jordan. Uncertainty in Artificial Intelligence (UAI), Proceedings of the Sixteenth Conference, 2000.

  • A variational principle for model-based interpolation. L. K. Saul and M. I. Jordan. In M. C. Mozer, M. I. Jordan, and T. Petsche (Eds.), Advances in Neural Information Processing Systems (NIPS) 10, Cambridge, MA: MIT Press, 1997.