Properties of the maximum likelihood estimator for Gaussian models

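Under the usual regularity conditions, the maximum likelihood estimator is asymptotically normal around the true parameter \(\theta_0\), with convergence in distribution:

\[\sqrt{n}\left( \hat\theta_{mle}-\theta_0\right)\stackrel{d}{\rightarrow} N\left(0, I\left(\theta_0\right)^{-1}\right)\]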

where

\[I\left(\theta\right)=-E\left[\frac{\partial ^2}{\partial \theta^2}\log {l(\theta\mid y)} \mid \theta\right]\]

In practice, \(\theta_0\) is unknown, so we use the observed information evaluated at the estimate \(\hat\theta\):

\[J\left(\hat \theta\right)=-\frac{\partial ^2}{\partial \theta^2}\log {l(\hat\theta\mid y)}\]
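
As a worked illustration, assume \(y_1,\dots,y_n\) are i.i.d. \(N(\mu,\sigma^2)\) with \(\sigma^2\) known (an assumption made only for this example), and let \(l(\mu\mid y)\) denote the likelihood of the full sample. Then

\[\log l(\mu\mid y)=-\frac{n}{2}\log\left(2\pi\sigma^2\right)-\frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i-\mu\right)^2\]

and, with \(\hat\mu=\bar y\),

\[J\left(\hat\mu\right)=-\frac{\partial ^2}{\partial \mu^2}\log l(\hat\mu\mid y)=\frac{n}{\sigma^2}\]

Here the second derivative depends on neither \(y\) nor \(\mu\), so the observed and expected information coincide. The information per observation is \(1/\sigma^2\), which agrees with the familiar result \(\sqrt{n}\left(\hat\mu-\mu_0\right)\rightarrow N\left(0,\sigma^2\right)\).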

The score vector is defined as

\[\frac{\partial \log{l(\theta\mid y)} }{\partial \theta}\]

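At the maximum likelihood estimate the score vanishes (the first-order condition for a maximum), provided the maximum lies in the interior of the parameter space: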

\[\frac{\partial \log{l(\theta\mid y)} }{\partial \theta}\biggr\rvert_{\hat\theta}=0\]
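
The quantities above can be checked numerically. The following is a minimal sketch, assuming a Gaussian sample with unknown mean and known standard deviation `sigma`; the function names (`log_lik`, `score`, `observed_information`), the simulated data, and the finite-difference step `h` are illustrative choices, not part of the original notes.

```python
import math
import random


def log_lik(mu, y, sigma):
    """Gaussian log-likelihood of the full sample y, with sigma assumed known."""
    n = len(y)
    return (-0.5 * n * math.log(2 * math.pi * sigma**2)
            - sum((yi - mu) ** 2 for yi in y) / (2 * sigma**2))


def score(mu, y, sigma):
    """Analytic first derivative of the log-likelihood with respect to mu."""
    return sum(yi - mu for yi in y) / sigma**2


def observed_information(mu, y, sigma, h=1e-3):
    """Minus the second derivative of the log-likelihood at mu,
    approximated by a central finite difference with step h."""
    second = (log_lik(mu + h, y, sigma)
              - 2 * log_lik(mu, y, sigma)
              + log_lik(mu - h, y, sigma)) / h**2
    return -second


# Simulated data (hypothetical values chosen for the illustration).
random.seed(0)
sigma = 2.0
n = 500
y = [random.gauss(1.5, sigma) for _ in range(n)]

mu_hat = sum(y) / n                      # MLE of the mean
J_hat = observed_information(mu_hat, y, sigma)
std_error = 1 / math.sqrt(J_hat)         # approximate standard error of mu_hat

print("score at MLE:", score(mu_hat, y, sigma))
print("observed information:", J_hat, "analytic n / sigma^2:", n / sigma**2)
print("standard error:", std_error)
```

The score at `mu_hat` should be zero up to floating-point rounding, and the finite-difference information should agree closely with the analytic value \(n/\sigma^2\), since the log-likelihood is quadratic in the mean.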