Princeton University

School of Engineering & Applied Science

Feature Screening for the Lasso

Yun Wang
Engineering Quadrangle F218
Tuesday, September 8, 2015 - 11:00am to 12:30pm

Recently, the sparse representation of data with respect to a dictionary of features has contributed to successful new methods in machine learning, pattern analysis, and signal/image processing. At the heart of many sparse representation methods is the least squares problem with l1 regularization, often called the lasso problem. Although the lasso has been studied extensively in the signal processing, computer vision, machine learning, and statistics literature, its applicability to large-scale problems has been hindered by its high computational cost.
This dissertation investigates feature screening for the lasso problem as a way to reduce this computational cost. For a given target vector, screening quickly identifies a subset of features that will receive zero weight in a solution of the lasso problem. These features can be removed from the dictionary before the lasso problem is solved, without affecting the optimality of the solution obtained. This has two potential advantages: it reduces the size of the dictionary, allowing the lasso problem to be solved with fewer resources, and it speeds up obtaining a solution.
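To make the idea of screening concrete, here is a minimal sketch of one basic sphere test, a well-known style of screening rule and not necessarily any of the tests developed in the dissertation. It assumes the lasso objective is written as 0.5*||y - sum_j w_j x_j||^2 + lam*||w||_1 with lam > 0; the function name sphere_screen and the helpers dot and norm are hypothetical names chosen for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(u, u))


def sphere_screen(features, y, lam):
    """Return indices of features that a basic sphere test can discard.

    features: list of feature vectors (the dictionary columns)
    y:        target vector
    lam:      regularization weight, assumed > 0

    The test is conservative: every returned feature is guaranteed zero
    weight in a lasso solution, but some zero-weight features may be kept.
    """
    corr = [abs(dot(x, y)) for x in features]
    lam_max = max(corr)
    if lam >= lam_max:
        # The all-zero weight vector is already optimal.
        return list(range(len(features)))
    # The dual optimum lies within this distance of y / lam, because
    # y / lam_max is a dual-feasible point.
    radius = norm(y) * (1.0 / lam - 1.0 / lam_max)
    return [
        j
        for j, x in enumerate(features)
        if corr[j] / lam + norm(x) * radius < 1.0
    ]


# Toy dictionary of three orthonormal features (an assumed example).
dictionary = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
target = [3, 1, 0.2]
print(sphere_screen(dictionary, target, 2.5))  # prints [1, 2]
```

In this toy case the lasso solution is obtained by soft-thresholding the correlations, so only the first feature receives nonzero weight at lam = 2.5, which agrees with the screened set. At a smaller value such as lam = 1.5 the same test discards nothing, which shows how conservative this simple rule is when lam is far below its maximum useful value.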