Nadaraya Watson Estimator
- Estimates a regression function by averaging target values of nearby training points with weights from a kernel.
- Weights depend on distance from the evaluation point; a bandwidth parameter h controls the smoothing.
- Useful when the regression form is unknown or the data are noisy.
Definition
Section titled “Definition”The Nadaraya–Watson estimator is a nonparametric method for estimating the regression function in a supervised learning problem. It computes the estimate at a point by taking a weighted average of the target variables from training examples, where weights are given by a kernel function that assigns larger weights to examples closer to the point of evaluation.
Explanation
Section titled “Explanation”- The estimator assigns each training example a weight based on a kernel function. Examples closer to the evaluation point receive higher weights; examples further away receive lower weights.
- A common kernel choice is the Gaussian kernel, defined as:
where h is a bandwidth parameter controlling kernel width: larger h gives a wider kernel (more smoothing); smaller h gives a narrower kernel (less smoothing).
- The estimated regression value at a point is the kernel-weighted average of target values for the training examples, normalized by the sum of the weights.
Examples
Section titled “Examples”Simple numerical example
Section titled “Simple numerical example”Training set:
| x | y |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 5 |
| 4 | 7 |
Estimate the regression function at x = 2.5 using the Gaussian kernel with h = 1.
Weights computed with the kernel:
- k(x1 - 2.5) = exp(-1.25 / (2 * 1^2)) = 0.7788
- k(x2 - 2.5) = exp(-0.25 / (2 * 1^2)) = 0.6065
- k(x3 - 2.5) = exp(0.75 / (2 * 1^2)) = 0.6065
- k(x4 - 2.5) = exp(1.75 / (2 * 1^2)) = 0.7788
Weighted average estimate:
Use cases
Section titled “Use cases”- Particularly useful when the data are noisy or when the functional form of the regression function is unknown.
Related terms
Section titled “Related terms”- Kernel function
- Gaussian kernel
- Bandwidth (h)
- Regression function
- Supervised learning