Wilks’ phenomenon, or lack thereof
I commonly find myself in a situation where I want to decide whether a certain mechanism is present in an applied problem. The “nicest” and best-understood example is when the mechanism can be modeled with a parameter $\theta \in (-\infty, \infty)$ and the mechanism is considered “not present” if $\theta = 0$. In this setting we can use a hypothesis testing framework to assess whether we can reject the null hypothesis that the mechanism does not exist. Here the most powerful test (in an appropriate sense) is the likelihood ratio test (LRT), and the asymptotic null distribution of the LRT statistic is chi-squared with an appropriate number of degrees of freedom. This is widely referred to as the Wilks phenomenon.
The key to the above is that the null hypothesis value $\theta_0$ lies in the interior of the alternative parameter space. There are many ways this can be violated, but the simplest is when the alternative range is $\theta > 0$. This happens, for example, when a mechanism is a chemical reaction (or change of state) that may or may not be present in a system. My understanding is that the LRT can still be the most powerful test, but the LRT statistic might no longer be asymptotically chi-squared. I want to understand all parts of that last statement in more detail. Most relevant for my work is the case where the two models cannot be expressed as “nested” in the sense that the simpler model is obtained by setting a parameter in the more complex model to zero.
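The simplest boundary case can be checked numerically. The sketch below (plain NumPy/SciPy; the sample size and replicate count are arbitrary choices) simulates the one-sided test $H_0: \theta = 0$ versus $\theta > 0$ for $X_i \sim N(\theta, 1)$. It illustrates the standard result that the null distribution of the LRT statistic is a 50:50 mixture of a point mass at 0 and $\chi^2_1$, not plain $\chi^2_1$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 200, 20_000

# Under H0: theta = 0, data X_i ~ N(0, 1).
# The MLE restricted to theta >= 0 is max(xbar, 0), so the LRT
# statistic is n * max(xbar, 0)^2.
xbar = rng.standard_normal((reps, n)).mean(axis=1)
lrt = n * np.maximum(xbar, 0.0) ** 2

# Roughly half the replicates sit exactly at 0 (those with xbar < 0),
# so the null distribution is a 50:50 mixture of a point mass at 0
# and chi-squared(1).
frac_zero = np.mean(lrt == 0.0)
print(frac_zero)  # ~0.5

# Consequence: a naive chi-squared(1) critical value is conservative.
# P(lrt > q) is about 0.5 * 0.05 = 0.025, not 0.05.
q = stats.chi2.ppf(0.95, df=1)
print(np.mean(lrt > q))  # ~0.025
```

The practical upshot is that using the usual $\chi^2_1$ critical value here halves the nominal size of the test.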
“Frequentist consistency of Bayesian procedures” a Bactra (Cosma Shalizi) notebook.
“On the problem of the most efficient tests of statistical hypotheses” Neyman and Pearson. Royal Society A Vol 231 pg 289-337 (1933).
“The large-sample distribution of the likelihood ratio for testing composite hypotheses.” SS Wilks. Ann Math Stat. Vol 9 No 1 pg 60-62 (1938).
Problems to work through
I have this in a notebook somewhere and I remember it not working out, but … Suppose $X \sim N(\mu, 1)$ where $\mu \sim N(0, \sigma^2)$. The null hypothesis is that $\sigma = 0$ and the alternative is that $\sigma > 0$. No matter what the data are, the inference will give a credible region for $\sigma$ that does not include 0. As the number of samples goes to infinity, what is the asymptotic posterior distribution of $\sigma$? In particular, is it asymptotically normal? Moreover, suppose I take a hypothesis testing perspective and use a likelihood ratio test. Is the LRT statistic asymptotically chi-squared? If I use Bayesian model selection methods (Bayes factors, posterior Bayes) or information criteria (AIC), how do they perform?
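The LRT part of this question can be probed directly. Marginally $X_i \sim N(0, 1 + \sigma^2)$, so writing $\tau = 1 + \sigma^2 \ge 1$ puts the null $\tau = 1$ on the boundary of the parameter space. A minimal simulation sketch (sample size and seed are arbitrary) of the null distribution of the LRT statistic:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 500, 20_000

# Marginal model: X_i ~ N(0, tau) with tau = 1 + sigma^2 >= 1.
# With s2 = mean(X_i^2), the MLE restricted to tau >= 1 is
# tau_hat = max(1, s2), and the LRT statistic works out to
#   2 * (l(tau_hat) - l(1)) = n * (s2 - 1 - log(s2))  if s2 > 1,
#                             0                        otherwise.
# Under H0, n * s2 ~ chi-squared(n), which we sample directly.
s2 = rng.chisquare(df=n, size=reps) / n
lrt = np.where(s2 > 1.0, n * (s2 - 1.0 - np.log(s2)), 0.0)

# As in the one-sided mean example, about half the mass is a point
# mass at zero, suggesting the same 50:50 mixture of 0 and chi2(1).
print(np.mean(lrt == 0.0))  # ~0.5
```

So at least for the LRT, this example appears to fall into the same boundary regime; the posterior and Bayes-factor questions still need to be worked out.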
For a Markov-chain version of the same question, let $N(t)$ be the number of particles in a system at time $t$ and suppose there are two types of exit (maybe one is degradation and one is use in some reaction). If the production rate $\lambda$ and one of the exit rates, call it $\kappa$, are known, what is the estimator for the unknown rate $\mu$? The same questions apply as above: if the true value of $\mu$ is zero, what is the asymptotic posterior distribution? What does the LRT statistic look like, and how does it compare to the Bayesian and AIC assessments?
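As a starting point, here is a Gillespie-style simulation sketch of this chain (the rates and observation window are placeholder values). When the full trajectory is observed, maximizing the continuous-time likelihood in $\mu$ gives $\hat\mu = (\text{number of } \mu\text{-exits}) / \int_0^T N(t)\,dt$, which the sketch computes:

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder rates: production lam, known per-particle exit rate kap,
# unknown per-particle exit rate mu (its true value is set here only
# to generate data).
lam, kap, mu_true = 5.0, 1.0, 0.5
T = 2000.0  # observation window

# Gillespie simulation, tracking the sufficient statistics for mu:
# the count of mu-exits and the integrated population int_0^T N(t) dt.
t, n = 0.0, 0
mu_exits, integral = 0, 0.0
while t < T:
    rate = lam + (kap + mu_true) * n
    dt = rng.exponential(1.0 / rate)
    if t + dt > T:
        integral += (T - t) * n
        break
    integral += dt * n
    t += dt
    u = rng.uniform(0.0, rate)
    if u < lam:
        n += 1            # production
    elif u < lam + kap * n:
        n -= 1            # known exit channel
    else:
        n -= 1            # unknown exit channel
        mu_exits += 1

# The likelihood in mu is proportional to
#   mu^(mu_exits) * exp(-mu * integral),
# so the MLE is mu_hat = mu_exits / integral.
mu_hat = mu_exits / integral
print(mu_hat)  # close to mu_true = 0.5
```

Note that $\hat\mu$ is a count divided by an exposure, so when the true $\mu$ is zero the estimate sits exactly on the boundary $\hat\mu = 0$, echoing the boundary issues above.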