On Cross Validation
Setting
- We have models $\mathcal{M}_1, …, \mathcal{M}_k$.
- There are $2n$ data points.
- Split the data randomly into two halves: $D = (Y_1, …, Y_n)$ and $T = (Y^*_1, …, Y^*_n)$.
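The random split can be sketched as follows. This is a minimal illustration, assuming the $2n$ observations sit in a NumPy array; the function name `split_half`, the seed, and the toy data are hypothetical choices, not part of the setting above.

```python
import numpy as np

def split_half(y, seed=0):
    """Randomly split 2n observations into two halves D and T of size n each."""
    y = np.asarray(y)
    if len(y) % 2 != 0:
        raise ValueError("expected an even number of observations (2n)")
    n = len(y) // 2
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(y))  # random ordering of the indices 0..2n-1
    return y[perm[:n]], y[perm[n:]]

y = np.arange(10.0)   # toy data with 2n = 10 (assumption for the example)
D, T = split_half(y)
```

Permuting the indices once and cutting the permutation in half guarantees that $D$ and $T$ are disjoint and together cover all $2n$ points.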
Definition 1
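A random variable $X$ with mean $\mu = E[X]$ is sub-Gaussian if there is a positive number $\sigma$ such that, for all $\lambda \in \mathbb{R}$,
$$E[e^{\lambda (X - \mu)}] \le e^{\frac{\sigma^2 \lambda^2}{2}}.$$
For example, $X \sim N(\mu, \sigma^2)$ satisfies this bound with equality.
Step 1: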
- Find MLE $\hat{\theta}_j$ using $D$.
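As an illustration of this step, the sketch below assumes a hypothetical model $\mathcal{M}_j$ that is a Gaussian family with unknown mean and variance, for which the MLE has a closed form. The model choice, the simulated data, and the name `gaussian_mle` are assumptions made for the example only.

```python
import numpy as np

def gaussian_mle(d):
    """Closed-form MLE (mean, variance) for an i.i.d. Gaussian sample."""
    d = np.asarray(d, dtype=float)
    mu_hat = d.mean()
    sigma2_hat = d.var()  # default ddof=0: the MLE divides by n, not n - 1
    return mu_hat, sigma2_hat

rng = np.random.default_rng(0)
D = rng.normal(loc=1.0, scale=2.0, size=500)  # simulated training half (assumption)
mu_hat, sigma2_hat = gaussian_mle(D)
```

For models without a closed-form MLE, the same step would typically be carried out by numerically maximizing the log-likelihood over $D$.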
(WIP)
Reference
AIC, BIC, Cross-Validation:
Concentration inequalities: