Assume the univariate random design model and some regularity conditions, and discuss the asymptotic normality of the Nadaraya-Watson estimator \(\hat{m}_h(x)\).
The Nadaraya-Watson estimator is a popular nonparametric method for estimating the conditional expectation of a random variable. In the context of a univariate random design model, we denote the estimator by \(\hat{m}_h(x)\); it estimates the conditional mean \(m(x) = E[Y \mid X = x]\) from a sample of observations \((X_i, Y_i)\) for \(i = 1, \ldots, n\).
The Nadaraya-Watson estimator is defined as:
\[ \hat{m}_h(x) = \frac{\sum_{i=1}^n K_h(X_i - x)\, Y_i}{\sum_{i=1}^n K_h(X_i - x)} \]
where \(K_h(u) = \frac{1}{h} K(u/h)\) is the rescaled kernel, \(K\) is a kernel function, and \(h > 0\) is the bandwidth.
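The estimator above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the Epanechnikov kernel and the simulated model \(m(x) = \sin(2\pi x)\) are assumptions chosen for the example.

```python
import numpy as np

def nw_estimate(x, X, Y, h):
    """Nadaraya-Watson estimate of m(x) = E[Y | X = x] using the
    Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], which is
    symmetric, integrates to 1, and has bounded support."""
    u = (X - x) / h
    w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    # The common 1/h factor of K_h cancels in the ratio, so it is omitted.
    s = w.sum()
    if s == 0:
        raise ValueError("no observations within bandwidth h of x")
    return np.dot(w, Y) / s

# Toy example: m(x) = sin(2*pi*x) observed with Gaussian noise
rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, n)
Y = np.sin(2 * np.pi * X) + rng.normal(scale=0.3, size=n)
h = 0.1  # of the order n**(-1/5), so h -> 0 while n*h -> infinity
est = nw_estimate(0.25, X, Y, h)  # true value is m(0.25) = 1
```

Note that the estimate is a weighted average of the \(Y_i\) whose \(X_i\) fall within one bandwidth of \(x\), which is why the bandwidth controls the degree of localization.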
To discuss the asymptotic normality of the Nadaraya-Watson estimator, we need to consider the following regularity conditions:
Kernel Conditions: The kernel function \(K(u)\) is symmetric, integrates to 1, and has bounded support. Additionally, \(K(u)\) should have a finite second moment.
Bandwidth Condition: The bandwidth \(h\) should satisfy \(h \to 0\) as \(n \to \infty\) and \(nh \to \infty\). This ensures that the estimator becomes more localized around the point \(x\) as the sample size increases, while the effective number of observations near \(x\) still grows.
Smoothness of the True Function: The true conditional mean function \(m(x)\) should be sufficiently smooth (e.g., twice continuously differentiable) in a neighborhood of \(x\), and the marginal density \(f\) of \(X\) should be continuous and strictly positive at \(x\).
Independent and Identically Distributed (i.i.d.) Samples: The observations \((X_i, Y_i)\) should be i.i.d. draws from a joint distribution, with the conditional variance \(\operatorname{Var}(Y \mid X = x)\) finite and continuous at \(x\).
Under these conditions, the asymptotic distribution of the Nadaraya-Watson estimator can be derived. If, in addition, \(nh^5 \to 0\) (undersmoothing), so that the \(O(h^2)\) bias term is asymptotically negligible, then as \(n \to \infty\) the estimator \(\hat{m}_h(x)\) converges in distribution to a normal limit:
\[ \sqrt{nh} \left( \hat{m}_h(x) - m(x) \right) \xrightarrow{d} N\left(0, \sigma^2(x)\right) \]
where the asymptotic variance is
\[ \sigma^2(x) = \frac{\operatorname{Var}(Y \mid X = x) \int K^2(u)\, du}{f(x)} \]
and \(f\) denotes the marginal density of \(X\).
This variance captures the variability of the estimator around the true conditional mean \(m(x)\): it grows with the conditional variance of \(Y\) and with the kernel's roughness \(\int K^2(u)\, du\), and it shrinks where the design density \(f(x)\) is large, i.e., where observations are plentiful near \(x\).
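The normal limit can be turned into an approximate pointwise confidence interval by plugging estimates into the standard asymptotic variance \(\int K^2(u)\,du \cdot \operatorname{Var}(Y \mid X = x) / (nh f(x))\). The sketch below is one possible construction, assuming negligible bias (undersmoothing); the helper name `nw_ci` and the simulated model are illustrative assumptions.

```python
import numpy as np

def nw_ci(x, X, Y, h, z=1.96):
    """Approximate 95% confidence interval for m(x) from the normal limit,
    with plug-in estimates: f(x) by a kernel density estimate and
    Var(Y | X = x) by a local average of squared residuals. Assumes the
    bias is negligible, so the interval is centered at the estimate."""
    u = (X - x) / h
    K = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)  # Epanechnikov
    f_hat = K.mean() / h                            # density estimate of f(x)
    m_hat = np.dot(K, Y) / K.sum()                  # Nadaraya-Watson estimate
    v_hat = np.dot(K, (Y - m_hat) ** 2) / K.sum()   # local Var(Y | X = x)
    RK = 0.6                                        # integral of K(u)^2 for Epanechnikov
    se = np.sqrt(RK * v_hat / (len(X) * h * f_hat))
    return m_hat - z * se, m_hat + z * se

# Toy model: m(x) = sin(2*pi*x) with noise, evaluated at x = 0.25
rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(0.0, 1.0, n)
Y = np.sin(2 * np.pi * X) + rng.normal(scale=0.3, size=n)
lo, hi = nw_ci(0.25, X, Y, h=0.1)
```

In practice the smoothing bias is not exactly zero, so such an interval is really centered at \(E[\hat{m}_h(x)]\) rather than \(m(x)\) unless the bandwidth is undersmoothed.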
In summary, the Nadaraya-Watson estimator \(\hat{m}_h(x)\) is asymptotically normal under the regularity conditions above. The convergence to a normal distribution allows for the construction of confidence intervals and hypothesis tests regarding the conditional mean function \(m(x)\). The choice of bandwidth \(h\) and kernel \(K\) plays a crucial role in the performance of the estimator, influencing both its bias and its variance: a larger \(h\) reduces variance but increases bias, and vice versa.
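The \(\sqrt{nh}\) scaling and the limit variance can be checked by simulation. The sketch below uses a deliberately simple toy model (an assumption for illustration: \(m(x) = x\), \(X \sim \mathrm{Uniform}(0,1)\), noise standard deviation 0.3), for which \(f(x) = 1\), \(\operatorname{Var}(Y \mid X = x) = 0.09\), and hence \(\sigma^2(x) = 0.6 \cdot 0.09 / 1 = 0.054\) with the Epanechnikov kernel.

```python
import numpy as np

# Monte Carlo check of the sqrt(n*h) normal limit at an interior point.
rng = np.random.default_rng(1)
n, h, x0, reps = 500, 0.15, 0.5, 400
est = np.empty(reps)
for r in range(reps):
    X = rng.uniform(0.0, 1.0, n)
    Y = X + rng.normal(scale=0.3, size=n)           # m(x) = x
    u = (X - x0) / h
    K = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    est[r] = np.dot(K, Y) / K.sum()                 # Nadaraya-Watson at x0
scaled = np.sqrt(n * h) * (est - x0)                # m(x0) = x0 here
target_sd = np.sqrt(0.6 * 0.09)                     # sqrt of asymptotic variance
```

The sample mean of `scaled` should be near 0, its standard deviation near `target_sd`, and roughly 95% of the draws should fall within \(\pm 1.96\) theoretical standard deviations, consistent with the normal limit.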