Likelihood is shorthand for the following:
The probability (or probability density) of the current observation(s) if the constant population parameter is theta.
Likelihood = f(x|\theta)
where
f is the pdf of x given theta.
Maximum likelihood estimate for the unknown theta:
The value of the constant population parameter theta for which the current observation(s) would be most probable.
That is, among all models with different constant thetas, we find the one under which the probability of the observations is highest.
To find the maximum, we look for the theta at which the derivative of the likelihood f(x|theta) is zero:
\frac{\partial f(x|\theta)}{\partial \theta}=0
Because the logarithm is strictly increasing, this gives the same result as when we solve
\frac{\partial \log f(x|\theta)}{\partial \theta}=0
for theta.
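As a quick numerical check that maximizing the likelihood and maximizing its logarithm pick out the same theta, here is a minimal sketch. The exponential model with rate theta, the made-up observations and the grid of candidate thetas are all assumptions chosen for illustration; they are not part of the notes above.

```python
import math

def likelihood(theta, xs):
    """Joint density of xs under an exponential model with rate theta (assumed model)."""
    return math.prod(theta * math.exp(-theta * x) for x in xs)

def log_likelihood(theta, xs):
    """Logarithm of the same joint density."""
    return sum(math.log(theta) - theta * x for x in xs)

xs = [0.5, 1.2, 0.3, 2.0, 1.0]              # made-up observations
grid = [i / 1000 for i in range(1, 5001)]   # candidate thetas from 0.001 to 5.000

# Brute-force search: the theta on the grid that maximizes each criterion.
best_by_likelihood = max(grid, key=lambda t: likelihood(t, xs))
best_by_log = max(grid, key=lambda t: log_likelihood(t, xs))

print("argmax of likelihood:    ", best_by_likelihood)
print("argmax of log-likelihood:", best_by_log)
```

A grid search is used instead of solving the derivative equation so that the comparison does not depend on any calculus; in practice the log form is preferred because sums are easier to differentiate and do not underflow the way long products do.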
========================================
Example:
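For the normal distribution, which has probability density function
f(x|\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
the corresponding probability density function for a sample of n independent identically distributed normal random variables (the likelihood) is
f(x_1,\ldots,x_n|\mu,\sigma^2) = \prod_{i=1}^{n} f(x_i|\mu,\sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left(-\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}\right)
or more conveniently, writing \bar{x} for the sample mean:
f(x_1,\ldots,x_n|\mu,\sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\left(-\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2}{2\sigma^2}\right)
Taking the log and differentiating with respect to mu gives
\frac{\partial}{\partial \mu} \log f(x_1,\ldots,x_n|\mu,\sigma^2) = \frac{n(\bar{x}-\mu)}{\sigma^2}
which is zero when the mean of the population equals the mean of the sample. Therefore the probability of the observed xs is greatest when mu is xbar.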
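For a normal sample, the derivative of the log of the likelihood with respect to mu works out to n(xbar - mu)/sigma^2. The sketch below compares that closed form with a finite-difference derivative at a few values of mu. The sample, the fixed sigma and the step size h are made-up values for illustration only.

```python
import math

def log_likelihood(mu, sigma, xs):
    """Log of the joint normal density of the sample xs."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2))

def score_mu(mu, sigma, xs):
    """Closed-form derivative of the log-likelihood with respect to mu."""
    n = len(xs)
    xbar = sum(xs) / n
    return n * (xbar - mu) / sigma ** 2

xs = [4.1, 5.3, 4.8, 6.0, 5.2]   # made-up sample
sigma = 1.5                      # treated as known for this check
xbar = sum(xs) / len(xs)

h = 1e-6                         # step for the central difference
for mu in (3.0, xbar, 7.0):
    numeric = (log_likelihood(mu + h, sigma, xs)
               - log_likelihood(mu - h, sigma, xs)) / (2 * h)
    print(f"mu={mu:.3f}  numeric={numeric:+.6f}  closed_form={score_mu(mu, sigma, xs):+.6f}")
```

The derivative is positive for mu below xbar and negative above it, so the zero at mu = xbar is a maximum rather than a minimum.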
Fortunately, the expectation of this estimator xbar is equal to the parameter μ of the given distribution, so it is unbiased:
E[\bar{x}] = \mu
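Also, differentiating the log of the likelihood with respect to sigma gives
\frac{\partial}{\partial \sigma} \log f(x_1,\ldots,x_n|\mu,\sigma^2) = -\frac{n}{\sigma} + \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{\sigma^3}
This is zero when
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2
and with mu replaced by its estimate xbar this becomes
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2
Which means the maximum likelihood estimator for the population variance sigma^2, given our observations, is the variance of the sample (with divisor n).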
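To see that the sample variance with divisor n really is the maximizer, here is a small sketch that computes it for a made-up sample and compares the log of the likelihood there against slightly smaller and slightly larger variances. The sample values and the 10% perturbation are arbitrary choices for illustration.

```python
import math
import statistics

def log_likelihood(mu, var, xs):
    """Log of the joint normal density of xs with mean mu and variance var."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - sum((x - mu) ** 2 for x in xs) / (2 * var))

xs = [4.1, 5.3, 4.8, 6.0, 5.2]   # made-up sample
n = len(xs)

mu_hat = sum(xs) / n                                # maximum likelihood estimate of mu
var_hat = sum((x - mu_hat) ** 2 for x in xs) / n    # divisor n, not n - 1

# statistics.pvariance also divides by n, so the two should agree.
print("var_hat:            ", var_hat)
print("statistics.pvariance:", statistics.pvariance(xs))

# The log-likelihood at var_hat should beat nearby variances.
for var in (0.9 * var_hat, var_hat, 1.1 * var_hat):
    print(f"var={var:.4f}  log-likelihood={log_likelihood(mu_hat, var, xs):.6f}")
```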
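However,
E\left[\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right] = \frac{n-1}{n}\sigma^2
Which means that the variance of the sample (with divisor n) is slightly biased: on average it underestimates sigma^2 by a factor of (n-1)/n.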
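The bias can be checked by simulation. The sketch below draws many small normal samples, averages the divisor-n variance over them, and compares the average with (n-1)/n times the true variance. The sample size, true mean, true variance, number of trials and random seed are all arbitrary assumptions for illustration.

```python
import random

def mle_variance(sample):
    """Sample variance with divisor n (the maximum likelihood estimate)."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / n

random.seed(0)      # fixed seed so the run is repeatable
n = 5               # observations per sample
true_mu = 10.0
sigma2 = 4.0        # true population variance
trials = 20000

total = 0.0
for _ in range(trials):
    sample = [random.gauss(true_mu, sigma2 ** 0.5) for _ in range(n)]
    total += mle_variance(sample)

mean_mle = total / trials              # average of the divisor-n estimates
expected = (n - 1) / n * sigma2        # value predicted by the bias formula
corrected = mean_mle * n / (n - 1)     # rescaling removes the bias

print("average MLE variance:     ", mean_mle)
print("(n-1)/n * sigma^2:        ", expected)
print("after n/(n-1) correction: ", corrected)
```

The small n is deliberate: the factor (n-1)/n is far from 1 when n is small, so the bias is easy to see, and it fades as n grows.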
http://en.wikipedia.org/wiki/Maximum_likelihood