This talk considers the recently introduced ordered weighted L1 (OWL) regularizer for sparse estimation problems with correlated variables. We begin by reviewing several convex analysis results concerning the OWL regularizer, namely: that it is indeed a norm, the form of its dual norm, and efficient methods to compute the corresponding proximity operator and the Euclidean projection onto an OWL ball. We also show how the OWL norm can be written explicitly as an atomic norm, opening the door to the use of the conditional gradient (Frank-Wolfe) algorithm. On the analysis front, we show that OWL regularization automatically clusters strongly correlated variables, in the sense that the coefficients associated with such variables have equal estimated values. Furthermore, we characterize the statistical performance of OWL regularization for generative models in which certain clusters of regression variables are strongly (even perfectly) correlated, while variables in different clusters are uncorrelated. We show that if the true p-dimensional signal generating the data involves only s of the clusters, then O(s log p) samples suffice to accurately estimate the signal, regardless of the number of coefficients within the clusters. This is the same number of measurements required to estimate an s-sparse signal with completely independent variables; in other words, with the OWL we pay no price, in terms of the number of measurements, for the presence of strongly correlated variables.
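For concreteness, here is a minimal Python sketch of two of the ingredients mentioned above, assuming the standard definition of the OWL norm, Omega_w(x) = sum_i w_i |x|_[i], where |x|_[1] >= ... >= |x|_[p] are the sorted magnitudes and the weights w are non-negative and non-increasing. The function names, the toy weights, and the use of scikit-learn's isotonic regression for the pool-adjacent-violators step are illustrative choices, not necessarily those of the talk.

```python
import numpy as np
from sklearn.isotonic import isotonic_regression

def owl_norm(x, w):
    """OWL norm: magnitudes of x sorted decreasingly, weighted by w (w >= 0, non-increasing)."""
    return np.sort(np.abs(x))[::-1] @ w

def prox_owl(v, w):
    """Proximity operator of the OWL norm: sort |v|, subtract the weights,
    apply decreasing isotonic regression (PAV), clip at zero, undo the sort and signs."""
    absv = np.abs(v)
    order = np.argsort(absv)[::-1]                    # indices sorting |v| decreasingly
    z = isotonic_regression(absv[order] - w, increasing=False)
    out = np.zeros_like(v, dtype=float)
    out[order] = np.maximum(z, 0.0)                   # clip at zero, restore original order
    return np.sign(v) * out

# Toy example: coefficients with nearly equal magnitudes are pulled toward equal values.
v = np.array([3.0, 2.9, -0.5, 0.1])
w = np.array([2.0, 1.5, 1.0, 0.5])                    # hypothetical linearly decreasing weights
print(owl_norm(v, w), prox_owl(v, w))
```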
This work was done in collaboration with Robert Nowak (University of Wisconsin-Madison, USA) and Xiangrong Zeng (IT and IST).