Princeton University

School of Engineering & Applied Science

Kernel Regression and Estimation: Learning Theory/Application and Errors-in-Variables Model Analysis

Pei-yuan Wu
Engineering Quadrangle B327
Wednesday, February 25, 2015 - 1:00pm to 2:30pm

Kernel methods play an important role in both supervised and unsupervised machine learning applications such as clustering, classification, regression, and feature selection. Instead of explicitly representing data samples as feature vectors, a kernel method requires only a user-defined kernel function that describes the similarity between pairs of data samples.
Applying kernel methods to big data analysis, which is usually characterized by the 5 Vs (Volume, Velocity, Variety, Veracity, and Value), poses several challenges. Regarding volume and velocity, the Gaussian radial basis function (RBF) kernel, one of the most popular and effective kernels, suffers from the so-called curse of dimensionality: its learning and classification complexities grow drastically with the size of the training data set. The first part of the talk is dedicated to the cost-effectiveness of kernel-based learning algorithms.
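As a rough illustration of why the size of the training set matters, the sketch below builds the Gaussian RBF kernel (Gram) matrix for a small synthetic data set. It is not taken from the talk: the function name rbf_kernel, the bandwidth parameter sigma, and the random data are assumptions made for the example. The point it shows is that n training samples yield an n-by-n matrix of pairwise similarities, so storage grows quadratically with n.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Gaussian RBF kernel: K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 * sigma^2)).

    Hypothetical helper for illustration; sigma is an assumed bandwidth.
    """
    # Pairwise squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x.y
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))

# Synthetic data (assumption): n = 200 samples with 5 features each
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

K = rbf_kernel(X, X)
print(K.shape)  # one similarity value per pair of samples
```

Doubling the number of samples quadruples the number of entries in K, which is one concrete sense in which learning cost grows with the training set.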
The second part of the talk is dedicated to the veracity issue, where some of the collected features may be erroneous. This motivates an errors-in-variables analysis for kernel-based estimation theory. We will discuss the impact of input noise on nonlinear regression functions through a spectral decomposition analysis, which decomposes a nonlinear function into spectral components, each with its own independent and heterogeneous “robustness” to input noise.