Properties of the maximum likelihood estimator in Gaussian models
\[\sqrt{n}\left( \hat\theta_{mle}-\theta_0\right)\xrightarrow{d} N\left(0, I\left(\theta_0\right)^{-1}\right)\]where
\[I\left(\theta\right)=-E\left[\frac{\partial ^2}{\partial \theta^2}\log {l(\theta\mid y)} \mid \theta\right]\]is the Fisher information, taken per observation so that it matches the root-n scaling above. In practice, we use the observed information
\[J\left(\hat \theta\right)=-\frac{\partial ^2}{\partial \theta^2}\log {l(\theta\mid y)}\biggr\rvert_{\hat\theta}\]The score vector is defined as
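\[\frac{\partial \log{l(\theta\mid y)} }{\partial \theta}\]At the maximum likelihood estimate the score vanishes, provided the maximum lies in the interior of the parameter space: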
\[\frac{\partial \log{l(\theta\mid y)} }{\partial \theta}\biggr\rvert_{\hat\theta}=0\]
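As a rough numerical illustration of these quantities, the sketch below works with the mean of a Gaussian sample whose variance is assumed known. The data are simulated, and the function names (`gaussian_loglik`, `score_mu`, `observed_info_mu`) are hypothetical helpers introduced only for this example. In this special case the score for the mean is the sum of residuals divided by the variance, the MLE is the sample mean, and the observed information for the full sample is n divided by the variance.

```python
import math
import random


def gaussian_loglik(mu, sigma2, y):
    """Log-likelihood of an i.i.d. N(mu, sigma2) sample y."""
    n = len(y)
    rss = sum((yi - mu) ** 2 for yi in y)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)


def score_mu(mu, sigma2, y):
    """First derivative of the log-likelihood with respect to mu."""
    return sum(yi - mu for yi in y) / sigma2


def observed_info_mu(sigma2, y):
    """Minus the second derivative with respect to mu (constant in mu here)."""
    return len(y) / sigma2


# Simulated data: the true mean 1.5 and variance 4.0 are arbitrary choices.
random.seed(0)
sigma2 = 4.0  # assumed known
y = [random.gauss(1.5, math.sqrt(sigma2)) for _ in range(500)]

# For this model the MLE of the mean is the sample mean.
mu_hat = sum(y) / len(y)

# The score should vanish at the MLE, up to floating-point rounding.
score_at_mle = score_mu(mu_hat, sigma2, y)

# Observed information and the standard error it implies for mu_hat.
J = observed_info_mu(sigma2, y)
std_err = 1.0 / math.sqrt(J)

# Cross-check J with a central finite difference of the log-likelihood.
h = 1e-2
num_J = -(
    gaussian_loglik(mu_hat + h, sigma2, y)
    - 2.0 * gaussian_loglik(mu_hat, sigma2, y)
    + gaussian_loglik(mu_hat - h, sigma2, y)
) / h**2

print("mu_hat       =", mu_hat)
print("score at MLE =", score_at_mle)
print("J (analytic) =", J)
print("J (numeric)  =", num_J)
print("std. error   =", std_err)
```

Because the log-likelihood is quadratic in the mean, the observed and expected information coincide in this example. In models where they differ, the same finite-difference check can be applied to whatever log-likelihood is in use.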