Asymptotic Normality of Nadaraya-Watson Estimator in Univariate Models

Answered using GPT-4o Mini by GPT4oMini.app

Question:

Assume the univariate random design model and some regularity conditions; discuss the asymptotic normality of the Nadaraya-Watson estimator (\hat{m}_h(x)).

Answer:

The Nadaraya-Watson estimator is a popular nonparametric method for estimating the conditional expectation of a random variable. In the context of a univariate random design model, we can denote the estimator as (\hat{m}_h(x)), which estimates the conditional mean (m(x) = E[Y | X = x]) based on a sample of observations ((X_i, Y_i)) for (i = 1, \ldots, n).

Nadaraya-Watson Estimator

The Nadaraya-Watson estimator is defined as:

[ \hat{m}_h(x) = \frac{\sum_{i=1}^n K_h(X_i - x)\, Y_i}{\sum_{i=1}^n K_h(X_i - x)} ]

where:

  • (K_h(u) = \frac{1}{h} K\left(\frac{u}{h}\right)) is a kernel function scaled by a bandwidth (h),
  • (K(u)) is a kernel function (e.g., Gaussian, Epanechnikov) that integrates to 1,
  • (h) is the bandwidth parameter that controls the smoothness of the estimator.
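The estimator can be sketched directly in a few lines of NumPy. This is an illustrative implementation, not part of the original text; the Gaussian kernel and the function names are assumptions made for the example:

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel: symmetric, integrates to 1, finite second moment."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nw_estimator(x, X, Y, h, kernel=gaussian_kernel):
    """Nadaraya-Watson estimate of m(x) = E[Y | X = x] at a single point x.

    X, Y are 1-D arrays of observations; h > 0 is the bandwidth.
    The common 1/h factor in K_h cancels between numerator and denominator,
    so it is omitted here.
    """
    weights = kernel((X - x) / h)
    return np.sum(weights * Y) / np.sum(weights)
```

Vectorizing over a grid of evaluation points is a one-line extension; the main numerical caveat is that the denominator can underflow to zero far from the data when the kernel has light tails.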

Asymptotic Normality

To discuss the asymptotic normality of the Nadaraya-Watson estimator, we need to consider the following regularity conditions:

  1. Kernel Conditions: The kernel function (K(u)) is symmetric, integrates to 1, and has a finite second moment. Bounded support is often assumed for convenience, though kernels with sufficiently fast tail decay (such as the Gaussian) are also admissible.

  2. Bandwidth Condition: The bandwidth (h) should satisfy (h \to 0) as (n \to \infty) and (nh \to \infty). This ensures that the estimator becomes more localized around the point (x) as the sample size increases.

  3. Smoothness of the True Function: The true conditional mean function (m(x)) should be sufficiently smooth (e.g., twice continuously differentiable) in a neighborhood of (x), and the design density (f(x)) should be continuous and positive at (x).

  4. Independent and Identically Distributed (i.i.d.) Samples: The observations ((X_i, Y_i)) should be i.i.d. draws from a common joint distribution.

Asymptotic Distribution

Under these conditions, together with the undersmoothing requirement (nh^5 \to 0) (which makes the smoothing bias asymptotically negligible relative to the stochastic error), the centered and scaled estimator converges in distribution to a normal limit as (n \to \infty):

[ \sqrt{nh} \left( \hat{m}_h(x) - m(x) \right) \xrightarrow{d} N(0, \sigma^2(x)) ]

where:

  • (\sigma^2(x)) is the asymptotic variance of the estimator, which can be expressed as:

[ \sigma^2(x) = \frac{\text{Var}(Y \mid X = x)}{f(x)} \int K^2(u) \, du ]

where (f(x)) denotes the density of the design points (X_i). This variance captures the variability of the estimator around the true conditional mean (m(x)): it grows with the conditional variance of (Y), shrinks where the design density (f(x)) is large (more data near (x)), and depends on the kernel through its roughness (\int K^2(u) \, du).
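The asymptotic variance can be turned into a pointwise confidence interval by a plug-in argument. A minimal sketch, assuming a Gaussian kernel: with NW weights (w_i = K_h(X_i - x) / \sum_j K_h(X_j - x)), the quantity (\hat{\sigma}^2(x) \sum_i w_i^2) is the finite-sample counterpart of (\text{Var}(Y \mid X = x) \int K^2 / (nh f(x))). The function name and the local variance estimator below are illustrative choices:

```python
import numpy as np

def nw_confidence_interval(x, X, Y, h, z=1.96):
    """Approximate pointwise 95% CI for m(x) from the asymptotic normal limit.

    The smoothing bias is ignored, which is justified only under
    undersmoothing (n h^5 -> 0).
    """
    u = (X - x) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)   # Gaussian kernel
    w = k / k.sum()                                  # NW weights, sum to 1
    m_hat = np.dot(w, Y)
    sigma2_hat = np.dot(w, (Y - m_hat) ** 2)         # local variance of Y near x
    se = np.sqrt(sigma2_hat * np.sum(w ** 2))        # plug-in standard error
    return m_hat - z * se, m_hat + z * se
```

Because the bias is not corrected, such intervals undercover when (h) is chosen to be MSE-optimal rather than undersmoothed.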

Conclusion

In summary, the Nadaraya-Watson estimator (\hat{m}_h(x)) is asymptotically normal under the regularity conditions above. The convergence to a normal distribution allows for the construction of pointwise confidence intervals and hypothesis tests regarding the conditional mean function (m(x)). The choice of bandwidth (h) and kernel (K) plays a crucial role in the performance of the estimator, influencing both its bias and its variance.
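The normal limit can be checked by simulation. The sketch below, under an assumed data-generating process (uniform design on ([-1, 1]), (m(x) = \sin(\pi x)), noise standard deviation 0.1), draws Monte Carlo replicates of (\sqrt{nh}(\hat{m}_h(x_0) - m(x_0))) and compares their spread to the theoretical value (\sqrt{\text{Var}(Y \mid X) \int K^2 / f(x_0)}):

```python
import numpy as np

def nw(x, X, Y, h):
    """Nadaraya-Watson estimate at a point (Gaussian kernel)."""
    k = np.exp(-0.5 * ((X - x) / h) ** 2)
    return np.sum(k * Y) / np.sum(k)

def scaled_errors(n=2000, h=0.03, x0=0.3, reps=300, seed=0):
    """Monte Carlo draws of sqrt(n*h) * (m_hat(x0) - m(x0))."""
    rng = np.random.default_rng(seed)
    m = lambda t: np.sin(np.pi * t)
    out = np.empty(reps)
    for r in range(reps):
        X = rng.uniform(-1, 1, n)            # design density f = 1/2 on [-1, 1]
        Y = m(X) + 0.1 * rng.standard_normal(n)
        out[r] = np.sqrt(n * h) * (nw(x0, X, Y, h) - m(x0))
    return out

# For this design, theory predicts a limiting standard deviation of
# sqrt(0.01 * (1 / (2 * sqrt(pi))) / 0.5) ~ 0.075, since the Gaussian
# kernel has int K^2 = 1 / (2 * sqrt(pi)).
```

A histogram of the returned draws should look approximately normal with roughly that spread (a small residual bias shifts the center when (nh^5) is not negligible).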