Scholarship list
Journal article
Simplifying Optimal Transport through Schatten-p Regularization
Published 02/24/2026
Transactions on Machine Learning Research
We propose a new general framework for recovering low-rank structure in optimal transport using Schatten-p norm regularization. Our approach extends existing methods that promote sparse and interpretable transport maps or plans, while providing a unified and principled family of convex programs that encourage low-dimensional structure. The convexity of our formulation enables direct theoretical analysis: we derive optimality conditions and prove recovery guarantees for low-rank couplings, barycentric displacements, and cross-covariances in simplified settings. To efficiently solve the proposed program, we develop a mirror descent algorithm with convergence guarantees in the convex setting. Experiments on synthetic and real data demonstrate the method's efficiency, scalability, and ability to recover low-rank transport structures. In particular, we show its utility on a machine-learning task: learning transport between high-dimensional cell perturbations in biological applications. All code is publicly available at https://github.com/twmaunu/schatten_ot.
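As a toy illustration (not the paper's algorithm), the Schatten-p norm that drives the regularization is the l_p norm of a matrix's singular values; for p = 1 it is the nuclear norm, a standard convex surrogate for rank:

```python
import numpy as np

def schatten_p_norm(M, p):
    """Schatten-p norm: the l_p norm of the singular values of M.
    For p = 1 this is the nuclear norm, whose use as a regularizer
    promotes low-rank solutions."""
    s = np.linalg.svd(M, compute_uv=False)
    return np.sum(s**p) ** (1.0 / p)

# A rank-1 projector has a single nonzero singular value, so its
# nuclear norm equals that value.
u = np.array([[1.0], [0.0]])
M = u @ u.T
print(schatten_p_norm(M, 1))  # nuclear norm of a rank-1 projector: 1.0
```

For p = 2 the Schatten norm coincides with the Frobenius norm, which is a quick sanity check on the implementation.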
Preprint
A Smoothing Newton Method for Rank-one Matrix Recovery
Published 07/30/2025
arXiv (Cornell University)
We consider the phase retrieval problem, which involves recovering a rank-one positive semidefinite matrix from rank-one measurements. A recently proposed algorithm based on Bures-Wasserstein gradient descent (BWGD) exhibits superlinear convergence, but it is unstable, and existing theory can only prove local linear convergence for higher-rank matrix recovery. We resolve this gap by revealing that BWGD implements Newton's method on a nonsmooth, nonconvex objective. We develop a smoothing framework that regularizes the objective, enabling a stable method with rigorous superlinear convergence guarantees. Experiments on synthetic data demonstrate the method's superior stability and fast convergence.
Preprint
Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery
Published 06/29/2025
arXiv (Cornell University)
Robust subspace estimation is fundamental to many machine learning and data analysis tasks. Iteratively Reweighted Least Squares (IRLS) is an elegant and empirically effective approach to this problem, yet its theoretical properties remain poorly understood. This paper establishes that, under deterministic conditions, a variant of IRLS with dynamic smoothing regularization converges linearly to the underlying subspace from any initialization. We extend these guarantees to affine subspace estimation, a setting that lacks prior recovery theory. Additionally, we illustrate the practical benefits of IRLS through an application to low-dimensional neural network training. Our results provide the first global convergence guarantees for IRLS in robust subspace recovery and, more broadly, for nonconvex IRLS on a Riemannian manifold.
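A minimal IRLS sketch under simplifying assumptions (a fixed smoothing floor delta rather than the dynamic smoothing analyzed in the paper; synthetic data with exact inliers): alternate a weighted PCA step with weights inversely proportional to each point's distance from the current subspace:

```python
import numpy as np

rng = np.random.default_rng(1)
# Inliers lying exactly on a 1-D subspace of R^3, plus outliers.
v = np.array([1.0, 0.0, 0.0])
inliers = np.outer(rng.standard_normal(80), v)
outliers = 3.0 * rng.standard_normal((20, 3))
X = np.vstack([inliers, outliers])

# IRLS: weighted PCA with weights ~ 1/(distance to current subspace),
# floored at delta to avoid division by zero.
delta = 1e-6
B = np.linalg.qr(rng.standard_normal((3, 1)))[0]  # random initial basis
for _ in range(50):
    resid = X - (X @ B) @ B.T
    w = 1.0 / np.maximum(np.linalg.norm(resid, axis=1), delta)
    C = (X * w[:, None]).T @ X            # weighted covariance sum_i w_i x_i x_i^T
    eigval, eigvec = np.linalg.eigh(C)
    B = eigvec[:, [-1]]                   # top eigenvector = new basis

# B should align with the true direction v (up to sign).
print(abs(B[:, 0] @ v))
```

Outliers far from the subspace receive small weights, so repeated reweighting progressively discounts them; the dynamic smoothing in the paper shrinks delta across iterations, which the fixed floor above only approximates.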
Preprint
Preconditioned Subspace Langevin Monte Carlo
Published 12/18/2024
We develop a new efficient method for high-dimensional sampling called Subspace Langevin Monte Carlo. Its primary application is the efficient implementation of Preconditioned Langevin Monte Carlo. To demonstrate the usefulness of the new method, we extend ideas from subspace descent methods in Euclidean space to solving a specific optimization problem over Wasserstein space. Our theoretical analysis demonstrates advantageous convergence regimes for the proposed method, which depend on relative conditioning assumptions common to mirror descent methods. We back up our theory with experimental evidence on sampling from an ill-conditioned Gaussian distribution.
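For context, a minimal sketch of the preconditioned (unadjusted) Langevin dynamics that subspace methods aim to implement efficiently; the full-preconditioner update below is a baseline, not the paper's subspace scheme. With preconditioner P, the update is x ← x + η P ∇log π(x) + sqrt(2η) P^{1/2} ξ:

```python
import numpy as np

rng = np.random.default_rng(0)
# Target: an ill-conditioned zero-mean Gaussian N(0, Sigma).
Sigma = np.diag([100.0, 1.0])
Sigma_inv = np.linalg.inv(Sigma)

def preconditioned_ula(precond, step, n_steps=20000):
    """Unadjusted Langevin with preconditioner P:
    x <- x + step * P grad log pi(x) + sqrt(2*step) * P^(1/2) xi."""
    P_half = np.linalg.cholesky(precond)  # valid square root for N(0, P) noise
    x = np.zeros(2)
    samples = np.empty((n_steps, 2))
    for t in range(n_steps):
        grad = -Sigma_inv @ x  # gradient of log density
        x = x + step * precond @ grad + np.sqrt(2 * step) * P_half @ rng.standard_normal(2)
        samples[t] = x
    return samples

# Preconditioning with Sigma itself removes the conditioning penalty,
# so a single step size works for both coordinates.
samples = preconditioned_ula(Sigma, step=0.1)
print(np.var(samples[5000:], axis=0))  # close to diag(Sigma) = [100, 1]
```

Without the preconditioner, a stable step size is dictated by the stiff direction (variance 1), making mixing along the variance-100 direction roughly 100 times slower.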
Preprint
Acceleration and Implicit Regularization in Gaussian Phase Retrieval
Published 11/20/2023
We study accelerated optimization methods for the Gaussian phase retrieval problem. In this setting, we prove that gradient methods with Polyak or Nesterov momentum have implicit regularization similar to that of gradient descent. This implicit regularization ensures that the algorithms remain in a benign region where the cost function is strongly convex and smooth, despite being nonconvex in general. As a result, these accelerated methods achieve provably faster rates of convergence than gradient descent. Experimental evidence confirms that the accelerated methods also converge faster than gradient descent in practice.
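A minimal sketch of heavy-ball (Polyak momentum) gradient descent on the standard quartic phase retrieval loss; the near-truth initialization stands in for spectral initialization, and the step size and momentum values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 60
x_true = rng.standard_normal(n)
A = rng.standard_normal((m, n))
y = (A @ x_true) ** 2  # Gaussian phase retrieval: y_i = (a_i^T x)^2

def grad(x):
    """Gradient of f(x) = (1/4m) * sum(((a_i^T x)^2 - y_i)^2)."""
    Ax = A @ x
    return (A.T @ ((Ax**2 - y) * Ax)) / m

# Heavy-ball iteration, started in the benign region near the truth
# where the loss is locally strongly convex and smooth.
x = x_true + 0.1 * rng.standard_normal(n)
v = np.zeros(n)
eta, beta = 0.02, 0.7
for _ in range(500):
    v = beta * v - eta * grad(x)
    x = x + v

# Recovery is only possible up to a global sign.
err = min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true))
print(err)
```

The momentum term reuses past gradients to accelerate progress along the well-conditioned directions, which is where the speedup over plain gradient descent comes from.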
Conference proceeding
First-Order Algorithms for Optimization over Graph Laplacians
Published 07/10/2023
2023 International Conference on Sampling Theory and Applications (SampTA), 1 - 11
When solving an optimization problem over the set of graph Laplacian matrices, one must deal with a large number of constraints as well as the large dimension of the objective variable. In this paper, we explore first-order methods for optimization over graph Laplacian matrices. These include two popular methods for constrained optimization: the mirror descent algorithm and the Frank-Wolfe (conditional gradient) algorithm. We derive efficiently implementable formulations of these algorithms over graph Laplacians and use existing theory to establish their iteration complexity in various regimes. Experiments demonstrate the efficiency of these methods over alternatives such as interior point methods.
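A minimal Frank-Wolfe sketch over a toy feasible set (Laplacians with a fixed trace budget, i.e., the convex hull of scaled single-edge Laplacians); the trace-budget set and Frobenius objective are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np
from itertools import combinations

# Feasible set: Laplacians with trace equal to a fixed budget. This is
# the convex hull of scaled single-edge Laplacians, so the linear
# minimization oracle (LMO) is a scan over edges -- no projection needed.
n, budget = 4, 2.0  # each single-edge Laplacian has trace 2

def edge_laplacian(i, j, n):
    L = np.zeros((n, n))
    L[i, i] = L[j, j] = 1.0
    L[i, j] = L[j, i] = -1.0
    return L

edges = [edge_laplacian(i, j, n) for i, j in combinations(range(n), 2)]

# Toy objective: Frobenius fit to a target Laplacian inside the set.
target = 0.6 * edges[0] + 0.4 * edges[-1]  # mix of edges (0,1) and (2,3)

L = np.zeros((n, n))
for t in range(300):
    G = L - target                          # gradient of 0.5*||L - target||_F^2
    k = int(np.argmin([np.sum(G * E) for E in edges]))
    S = (budget / 2.0) * edges[k]           # LMO solution: one scaled edge
    gamma = 2.0 / (t + 2)                   # standard open-loop step size
    L = (1 - gamma) * L + gamma * S

print(np.linalg.norm(L - target))
```

Each iterate is a convex combination of at most t edge Laplacians, so Frank-Wolfe keeps the variable sparse and never touches the full constraint set explicitly, which is the appeal over interior point methods at scale.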
Conference proceeding
Bures-Wasserstein Barycenters and Low-Rank Matrix Recovery
Date presented 04/25/2023
Proceedings of Machine Learning Research, 206
International Conference on Artificial Intelligence and Statistics (AISTATS), 04/25/2023–04/27/2023, Valencia, Spain
We revisit the problem of recovering a low-rank positive semidefinite matrix from rank-one projections using tools from optimal transport. More specifically, we show that a variational formulation of this problem is equivalent to computing a Wasserstein barycenter. In turn, this new perspective enables the development of new geometric first-order methods with strong convergence guarantees in Bures-Wasserstein distance. Experiments on simulated data demonstrate the advantages of our new methodology over existing methods.
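For context, the Bures-Wasserstein distance between PSD matrices A and B (equivalently, the W2 distance between centered Gaussians with those covariances) can be computed in closed form; a minimal implementation, not the paper's recovery algorithm:

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric square root of a PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bures_wasserstein(A, B):
    """d(A,B)^2 = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    Ah = psd_sqrt(A)
    cross = psd_sqrt(Ah @ B @ Ah)
    d2 = np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))

A = np.diag([4.0, 1.0])
B = np.diag([1.0, 1.0])
print(bures_wasserstein(A, B))  # commuting case: ||A^{1/2} - B^{1/2}||_F = 1.0
```

For commuting matrices the distance reduces to the Frobenius distance between the square roots, which makes the diagonal example above easy to verify by hand.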
Conference presentation
Bures-Wasserstein Barycenters and Low-Rank Matrix Recovery
Date presented 01/04/2023
Joint Mathematics Meeting, 01/04/2023–01/07/2023, Boston, MA
We revisit the problem of recovering a low-rank positive semidefinite matrix from rank-one projections using tools from optimal transport. More specifically, we show that a variational formulation of this problem is equivalent to computing a Wasserstein barycenter. In turn, this new perspective enables the development of new geometric first-order methods with strong convergence guarantees in Bures-Wasserstein distance. Experiments on simulated data demonstrate the advantages of our new methodology over existing methods.
Journal article
Depth Descent Synchronization in SO(D)
Published 01/03/2023
International Journal of Computer Vision, 131, 968 - 986
Preprint
Bures-Wasserstein Barycenters and Low-Rank Matrix Recovery
Published 10/26/2022
We revisit the problem of recovering a low-rank positive semidefinite matrix from rank-one projections using tools from optimal transport. More specifically, we show that a variational formulation of this problem is equivalent to computing a Wasserstein barycenter. In turn, this new perspective enables the development of new geometric first-order methods with strong convergence guarantees in Bures-Wasserstein distance. Experiments on simulated data demonstrate the advantages of our new methodology over existing methods.