CSE 446: Machine Learning Assignment 1 2020

Machine learning (ML) is an umbrella term for solving problems for which developing algorithms by hand would be cost-prohibitive; instead, the problems are solved by helping machines ‘discover’ their ‘own’ algorithms,[1] without being explicitly told what to do by any human-developed algorithm.[2] When there is a vast number of potential answers, the correct ones must initially be labeled as valid by human labelers, so human supervision is needed.

Problem 1.

2 Linear Regression and kNN
Suppose we have a sample of $n$ pairs $\left(x_i, y_i\right)$ drawn i.i.d. from the following distribution:
$x_i \in X$, the set of instances
$y_i=f\left(x_i\right)+\epsilon_i$, where $f()$ is the regression function
$\epsilon_i \sim G\left(0, \sigma^2\right)$, a Gaussian with mean 0 and variance $\sigma^2$
We can construct an estimator for $f()$ that is linear in the $y_i$,
$$\hat{f}\left(x_0\right)=\sum_{i=1}^n l_i\left(x_0 ; X\right) y_i,$$
where the weights $l_i\left(x_0 ; X\right)$ do not depend on the $y_i$, but do depend on the entire training set $X$. Show that both linear regression and $k$-nearest neighbor regression are members of this class of estimators. Explicitly describe the weights $l_i\left(x_0 ; X\right)$ for each of these algorithms.
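As a numerical sanity check on the claim above, the following sketch computes candidate weight vectors for both methods with NumPy. It assumes real-valued feature vectors, an intercept term in the linear model, Euclidean distance for kNN, and synthetic data; the function names `linear_regression_weights` and `knn_weights` are illustrative and not part of the assignment.

```python
import numpy as np

def linear_regression_weights(X, x0):
    """Weights l_i(x0; X) for least squares with an intercept (assumed setup).

    Returns a length-n vector l such that the prediction at x0 is l @ y.
    """
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])             # design matrix with intercept column
    a0 = np.concatenate([[1.0], np.atleast_1d(x0)])  # query point with intercept entry
    # l = a0^T (A^T A)^{-1} A^T; the pseudoinverse equals (A^T A)^{-1} A^T
    # when A has full column rank.
    return a0 @ np.linalg.pinv(A)

def knn_weights(X, x0, k):
    """Weights l_i(x0; X) for k-nearest-neighbor regression (Euclidean distance).

    Each of the k training points closest to x0 gets weight 1/k, all others 0.
    Ties are broken by argsort order.
    """
    dists = np.linalg.norm(X - x0, axis=1)
    nearest = np.argsort(dists)[:k]
    w = np.zeros(X.shape[0])
    w[nearest] = 1.0 / k
    return w

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.5, -2.0]) + 0.5 + rng.normal(scale=0.1, size=50)
x0 = np.array([0.3, -0.1])

l_lin = linear_regression_weights(X, x0)
l_knn = knn_weights(X, x0, k=5)

# Both predictions are plain weighted sums of the y_i.
pred_lin = l_lin @ y
pred_knn = l_knn @ y
```

In both functions the weights are computed from `X` and `x0` only; `y` enters solely through the final dot product, which is what makes each estimator linear in the $y_i$.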


CSE 446: Machine Learning COURSE NOTES

Dimensionality reduction
why and when it’s important

  • Simple feature selection
  • Principal component analysis
    minimizing reconstruction error
    relationship to covariance matrix and eigenvectors
    using SVD
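The three PCA bullets above can be tied together in a short NumPy sketch. It is an illustration on synthetic data, not the course's reference implementation: the data, the choice of `k = 2`, and all variable names are assumptions. It computes principal components with the SVD of the centered data, then checks that the squared singular values match the eigenvalues of the sample covariance matrix and that the reconstruction error equals the sum of the discarded squared singular values.

```python
import numpy as np

# Synthetic data (assumed): 200 points in 5 dimensions, with most of the
# variance concentrated in a 2-dimensional subspace plus small noise.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))
X = Z + 0.05 * rng.normal(size=(200, 5))
n = X.shape[0]

# Center the data, then take the thin SVD: Xc = U @ diag(S) @ Vt.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Keep the top-k right singular vectors as principal directions.
k = 2
components = Vt[:k]               # shape (k, d)
scores = Xc @ components.T        # low-dimensional coordinates, shape (n, k)
X_hat = scores @ components       # reconstruction of the centered data

# Squared reconstruction error of the rank-k projection.
recon_error = np.sum((Xc - X_hat) ** 2)

# Relationship to the covariance matrix: its eigenvalues are S**2 / (n - 1),
# and its eigenvectors are the rows of Vt (up to sign).
cov = Xc.T @ Xc / (n - 1)
eigvals = np.linalg.eigvalsh(cov)[::-1]   # eigvalsh returns ascending order

print("reconstruction error:", recon_error)
print("sum of discarded squared singular values:", np.sum(S[k:] ** 2))
```

The two printed numbers should agree up to floating-point error, which is the sense in which the top-k principal directions minimize reconstruction error among rank-k linear projections.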