Maximum Likelihood Estimation


Reminder

One has access to $n$ samples $x_1, \dots, x_n$ drawn from an unknown data distribution $p_{\mathrm{data}}$:

$$x_1, \dots, x_n \sim p_{\mathrm{data}}.$$

The goal is to generate new samples drawn from $p_{\mathrm{data}}$.

Maximum Log-Likelihood Estimation (MLE)

The main idea of maximum likelihood is the following:

  • one restricts the search for a model to a family of distributions $p(\cdot | \theta)$ parameterized by $\theta$;
  • for each parameter $\theta$, one evaluates the density $p(\cdot | \theta)$ on the data points $x_1, \dots, x_n$, which gives the likelihood $p(x_1, \dots, x_n | \theta)$;
  • one then chooses the parameter $\theta$ that maximizes the likelihood of the observed data $p(x_1, \dots, x_n | \theta)$.
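
Taking the logarithm, and assuming the samples are independent so that the joint density factorizes, the quantity to maximize is

$$\mathrm{ML}(\theta) = \log p(x_1, \dots, x_n | \theta) = \log \big( p(x_1 | \theta) \times \dots \times p(x_n | \theta) \big) = \sum_{i=1}^{n} \log p(x_i | \theta).$$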
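
The three steps above can be sketched numerically. The following is a minimal illustration, not a reference implementation: it assumes a Gaussian family with $\theta = (\mu, \sigma)$, uses synthetic data as a stand-in for samples from $p_{\mathrm{data}}$, and the name `gaussian_log_likelihood` is hypothetical. It relies on the standard closed-form maximizer for this family (sample mean and $1/n$ sample standard deviation) and checks that shifting the mean lowers the log-likelihood.

```python
import math
import random


def gaussian_log_likelihood(xs, mu, sigma):
    """Return sum_i log p(x_i | theta) for a Gaussian with theta = (mu, sigma)."""
    const = -0.5 * math.log(2 * math.pi * sigma**2)
    return sum(const - (x - mu) ** 2 / (2 * sigma**2) for x in xs)


# Synthetic stand-in for samples from p_data (an assumption for illustration).
rng = random.Random(0)
samples = [rng.gauss(2.0, 1.5) for _ in range(1000)]

# For the Gaussian family the maximizer has a closed form:
# the sample mean and the biased (1/n) sample standard deviation.
n = len(samples)
mu_ml = sum(samples) / n
sigma_ml = math.sqrt(sum((x - mu_ml) ** 2 for x in samples) / n)

# Compare the log-likelihood at the ML estimate with a nearby parameter.
ll_ml = gaussian_log_likelihood(samples, mu_ml, sigma_ml)
ll_shifted = gaussian_log_likelihood(samples, mu_ml + 0.1, sigma_ml)

print(f"mu_ml={mu_ml:.3f}, sigma_ml={sigma_ml:.3f}")
print(f"log-likelihood at ML estimate: {ll_ml:.1f}, at shifted mean: {ll_shifted:.1f}")
```

Working with the sum of log-densities rather than the product of densities also avoids numerical underflow, since a product of many small values quickly falls below floating-point range.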

Examples