Princeton University

School of Engineering & Applied Science

Machine Learning on Graphs

David Eis
Engineering Quadrangle J401
Tuesday, May 20, 2014 - 2:00pm to 4:30pm

This dissertation covers two topics. The first topic is a probabilistic modeling problem, the Sticky Hierarchical Dirichlet Process Hidden Markov Model proposed by Fox et al. It is used to analyze and uncover structure in time series data, especially when there is an unknown number of hidden states. We propose a deterministic variational inference algorithm as an alternative to the randomized Markov Chain Monte Carlo (MCMC) algorithm proposed by Fox. This algorithm is an improvement over the MCMC algorithm as it allows for direct assessment of convergence and can run faster.
The second topic is the generalized lasso, a regularized regression problem which encourages sparsity in a transformation of the regression coefficients. This thesis completes the picture of equivalences between the generalized lasso, its dual problem, and the subspace constrained lasso (SCL) and its dual problem. The sparsity is directly expressed in the SCL. The structure of this problem allows for codeword screening, a method to reduce the size of the problem. Screening methods for the SCL are derived in this thesis. Screening in this case is not as effective as it is for the lasso, where it can reduce very large dictionaries to a fraction of the size. However, it is still an important tool which allows for increased speed of solving the SCL and hence the generalized lasso.