# Dictionary Definition

likelihood n : the probability of a specified
outcome [syn: likeliness, odds] [ant: unlikelihood, unlikeliness]

# User Contributed Dictionary

## English

### Noun

- The probability of a specified outcome; the chance of something happening; probability; the state of being probable.
- In all likelihood the meeting will be cancelled.
- The likelihood is that the inflation rate will continue to rise.

- Shorthand for likelihood function; the probability that a real world experiment would generate a specific datum, as a function of the parameters of a mathematical model.
- Likeness, resemblance.
- "There is no likelihood between pure light and black darkness, or between righteousness and reprobation." (Sir W. Raleigh)

- Appearance, show, sign, expression.
- "What of his heart perceive you in his face by any likelihood he showed to-day ?" (Shak)

#### Synonyms

#### Antonyms

#### Translations

probability

- Catalan: versemblança
- Dutch: waarschijnlijkheid
- Finnish: todennäköisyys
- French: vraisemblance
- German: Wahrscheinlichkeit
- Spanish: verosimilitud
- Swedish: sannolikhet

mathematical likelihood

- Catalan: versemblança
- Finnish: todennäköisyys
- Spanish: verosimilitud

resemblance

- Finnish: samankaltaisuus, yhdennäköisyys

appearance

# Extensive Definition

In statistics, the likelihood
function (often simply the likelihood) is a function of the
parameters of a
statistical
model that plays a key role in statistical
inference. In non-technical usage, "likelihood" is a synonym
for "probability",
but throughout this article only the technical definition is used.
Informally, if "probability" allows us to predict unknown outcomes
based on known parameters, then "likelihood" allows us to estimate
unknown parameters based on known outcomes.

In a sense, likelihood works backwards from
probability: given B, we use the conditional probability P(A|B) to
reason about A, and given A, we use the likelihood function L(B|A)
to reason about B. This mode of reasoning is formalized in Bayes'
theorem:

- P(B \mid A) = \frac{P(A \mid B)\;P(B)}{P(A)}.\!

In statistics, a likelihood
function is a conditional
probability function
considered as a function of its second argument with its first
argument held fixed, thus:

- b\mapsto P(A \mid B=b), \!

and also any other function proportional to such
a function. That is, the likelihood function for B is the equivalence
class of functions

- L(b \mid A) = \alpha \; P(A \mid B=b) \!

for any constant of proportionality \alpha >
0. The numerical value L(b | A) alone is immaterial; all that
matters are likelihood ratios of the form

- \frac{L(b_2 \mid A)}{L(b_1 \mid A)}, \!

which are invariant with respect to the
constant of proportionality.
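
The invariance of likelihood ratios can be checked numerically. The following is a minimal sketch, assuming a binomial model in which the fixed outcome A is "7 successes in 10 trials" and b is the success probability; the data, the function names and the value of alpha are illustrative choices, not part of the definition above.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k successes in n trials | success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def likelihood(p, k, n, alpha=1.0):
    """One member of the equivalence class: alpha * P(data | p)."""
    return alpha * binom_pmf(k, n, p)

def likelihood_ratio(p1, p2, k, n, alpha=1.0):
    """Ratio L(p2 | data) / L(p1 | data) for a given scaling alpha."""
    return likelihood(p2, k, n, alpha) / likelihood(p1, k, n, alpha)

# Hypothetical data: 7 successes in 10 trials.
r_a = likelihood_ratio(0.5, 0.7, k=7, n=10, alpha=1.0)
r_b = likelihood_ratio(0.5, 0.7, k=7, n=10, alpha=42.0)
print(r_a, r_b)  # the two ratios agree up to floating-point rounding
```

Rescaling by alpha changes every individual likelihood value but leaves the comparison between two parameter values unchanged.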

For more about making inferences via likelihood
functions, see also the method of maximum
likelihood, and likelihood-ratio
testing.

## Likelihood function of a parameterized model

Among many applications, we consider here one of broad theoretical and practical importance. Given a parameterized family of probability density functions (or probability mass functions in the case of discrete distributions)

- x\mapsto f(x\mid\theta), \!

where θ is the parameter, the
likelihood function is

- \theta\mapsto f(x\mid\theta), \!

- L(\theta \mid x)=f(x\mid\theta), \!

where x is the observed outcome of an experiment.
In other words, when f(x | θ) is viewed as a function of
x with θ fixed, it is a probability density function, and
when viewed as a function of θ with x fixed, it is a
likelihood function.
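
The two readings of f(x | θ) can be illustrated with a short sketch. It assumes an exponential density f(x | θ) = θ e^(−θx) as the parameterized family and uses a simple midpoint rule for the integrals; the family, the integration range and the fixed values θ = 2 and x = 1.5 are all illustrative assumptions.

```python
from math import exp

def f(x, theta):
    """Exponential density with rate theta (an assumed example family)."""
    return theta * exp(-theta * x) if x >= 0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# theta fixed, x varies: a probability density, so the area is about 1.
area_over_x = integrate(lambda x: f(x, 2.0), 0.0, 50.0)

# x fixed, theta varies: a likelihood function, whose area need not be 1.
area_over_theta = integrate(lambda t: f(1.5, t), 0.0, 50.0)

print(area_over_x, area_over_theta)
```

With x held at 1.5, the second integral comes out near 1/x² ≈ 0.444 rather than 1, which is typical of a likelihood function L(θ | x).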

Note: This is not the same as the probability
that those parameters are the right ones, given the observed
sample. Attempting to interpret the likelihood of a hypothesis
given observed evidence as the probability of the hypothesis is a
common error, with potentially disastrous real-world consequences
in medicine, engineering or jurisprudence. See prosecutor's
fallacy for an example of this.
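
The size of this error can be seen in a small numerical sketch. All numbers below are hypothetical: a pool of one million possible sources of a forensic sample, exactly one of whom is the true source, and a test that matches an unrelated person with probability 1 in 10,000. The prior of one in a million is an added assumption, needed to turn a likelihood into a probability through Bayes' theorem.

```python
def posterior_source(prior_source, p_match_if_source, p_match_if_not):
    """P(source | match) from Bayes' theorem."""
    numerator = p_match_if_source * prior_source
    evidence = numerator + p_match_if_not * (1.0 - prior_source)
    return numerator / evidence

# Hypothetical figures, chosen only for illustration.
prior = 1.0 / 1_000_000          # one true source among a million people
p_match_if_source = 1.0          # the true source always matches
p_match_if_not = 1.0 / 10_000    # chance match for an unrelated person

# Likelihood of the hypothesis "not the source" given a match.
print(p_match_if_not)
# Probability of "not the source" given a match: a very different number.
print(1.0 - posterior_source(prior, p_match_if_source, p_match_if_not))
```

Under these assumptions the likelihood of "not the source" is 0.0001, yet the probability of "not the source" given the match is about 0.99, because roughly a hundred unrelated people in the pool would also match.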

### Likelihoods for continuous distributions

The use of the probability
density instead of a probability in specifying the likelihood
function above may be justified in a simple way. Suppose that,
instead of an exact observation x, the observation is that
the value lies in a short interval (x_{j-1}, x_j) of length Δ_j, where
the subscripts refer to a predefined set of intervals. Then the
probability of getting this observation (of being in interval j) is
approximately

- L_{\text{approx}}(\theta \mid x \text{ in interval } j)=\Delta_j f(x_{*}\mid\theta), \!

where x* can be any point in interval j. Then,
recalling that the likelihood function is defined up to a
multiplicative constant, it is just as valid to say that the
likelihood function is approximately

- L_{\text{approx}}(\theta \mid x \text{ in interval } j)= f(x_{*}\mid\theta), \!

and then, on considering the lengths of the
intervals to decrease to zero,

- L(\theta \mid x )= f(x\mid\theta). \!
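
The limiting argument can be checked numerically. The sketch below assumes a normal model with unknown mean θ and unit variance, an observation near x = 1, and a comparison of θ = 1.2 against θ = 0; these choices and the function names are illustrative. It compares the likelihood ratio based on exact interval probabilities with the ratio of densities as the interval shrinks.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu):
    """Density of a normal distribution with mean mu and unit variance."""
    return exp(-0.5 * (x - mu) ** 2) / sqrt(2 * pi)

def normal_cdf(x, mu):
    """Distribution function of the same normal distribution."""
    return 0.5 * (1.0 + erf((x - mu) / sqrt(2)))

def interval_prob(lo, hi, mu):
    """Exact probability that the observation falls in (lo, hi)."""
    return normal_cdf(hi, mu) - normal_cdf(lo, mu)

def ratios(delta, x=1.0, theta1=0.0, theta2=1.2):
    """Likelihood ratio from interval probabilities, and from densities."""
    lo, hi = x - delta / 2, x + delta / 2
    exact = interval_prob(lo, hi, theta2) / interval_prob(lo, hi, theta1)
    density = normal_pdf(x, theta2) / normal_pdf(x, theta1)
    return exact, density

for delta in (1.0, 0.1, 0.001):
    print(delta, ratios(delta))
```

The interval length cancels from the ratio in the limit, so the density alone carries the information about θ.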

### Likelihoods for mixed continuous - discrete distributions

The above can be extended in a simple way to
allow consideration of distributions which contain both discrete
and continuous components. Suppose that the distribution consists
of a number of discrete probability masses p_k(θ) and a density
f(x|θ), where the sum of all the p_k added to the integral of f is
always one. Assuming that it is possible to distinguish an
observation corresponding to one of the discrete probability masses
from one which corresponds to the density component, the likelihood
function for an observation from the continuous component can be
dealt with as above by setting the interval length short enough to
exclude any of the discrete masses. For an observation from the
discrete component, the probability can either be written down
directly or treated within the above context by saying that the
probability of getting an observation in an interval that does
contain a discrete component (of being in interval j which contains
discrete component k) is approximately

- L_{\text{approx}}(\theta \mid x \text{ in interval } j \text{ containing discrete mass } k)=p_k(\theta)+\Delta_j f(x_{*}\mid\theta), \!

where x* can be any point in interval j. Then, on
considering the lengths of the intervals to decrease to zero, the
likelihood function for an observation from the discrete component
is

- L(\theta \mid x )= p_k(\theta), \!

where k is the index of the discrete probability
mass corresponding to observation x.

The fact that the likelihood function can be
defined in a way that includes contributions that are not
commensurate (the density and the probability mass) arises from the
way in which the likelihood function is defined up to a constant of
proportionality, where this "constant" can change with the
observation x, but not with the parameter θ.
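
As an illustration, the following sketch builds a likelihood from both kinds of contribution. It assumes a hypothetical model with a single probability mass pi0 at zero and, otherwise, an exponential density with rate lam scaled by (1 − pi0), so that the mass and the integral of the density sum to one. The data and the parameter names are invented for the example.

```python
from math import exp, log

def likelihood_term(x, pi0, lam):
    """Contribution of one observation: a mass at 0, a density elsewhere."""
    if x == 0:
        return pi0                              # discrete component
    return (1.0 - pi0) * lam * exp(-lam * x)    # continuous component

def loglik(data, pi0, lam):
    """Log-likelihood of independent observations under the mixed model."""
    return sum(log(likelihood_term(x, pi0, lam)) for x in data)

# Hypothetical sample: three exact zeros and four positive values.
data = [0, 0, 1.2, 0.4, 0, 2.5, 0.9]
positives = [x for x in data if x > 0]

# Maximizers of this particular log-likelihood, obtained by calculus.
pi0_hat = (len(data) - len(positives)) / len(data)
lam_hat = len(positives) / sum(positives)

print(pi0_hat, lam_hat, loglik(data, pi0_hat, lam_hat))
```

The zeros contribute probabilities and the positive values contribute densities, yet the product is a usable likelihood because the mismatch in units affects only factors that do not involve the parameters.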

## Example

For example, suppose a coin is tossed with probability pH of landing heads up ('H'). The probability of getting two heads in two independent tosses ('HH') is then pH². If pH = 0.5, the probability of seeing two heads is 0.25. In symbols:

- P(\mbox{HH} \mid p_H = 0.5) = 0.25

Another way of saying this is to reverse it and
say that "the likelihood of pH = 0.5, given the observation 'HH',
is 0.25", i.e.,

- L(p_H=0.5 \mid \mbox{HH}) = P(\mbox{HH}\mid p_H=0.5) =0.25.

But this is not the same as saying that the
probability of pH = 0.5, given the observation, is 0.25.

To take an extreme case, on this basis we can say
"the likelihood of pH = 1 given the observation 'HH' is 1". But it
is clearly not the case that the probability of pH = 1 given the
observation is 1: the event 'HH' can occur for any pH > 0 (and
often does, in reality, for pH roughly 0.5). If the probability of
pH = 1 given the observation were 1, then pH would have to
equal 1 for the event 'HH' to occur, which is obviously not
true.

The likelihood function is not a
probability density function: for example, the
integral of a likelihood function is not in general 1. In this
example, the integral of the likelihood function over the interval
[0, 1] in pH is 1/3, demonstrating again that the likelihood
function cannot be interpreted as a probability density
function for pH. On the other hand, given any particular value of
pH, e.g. pH = 0.5, the probabilities of all possible outcomes
(the probability function summed over the domain of the random
variable) add up to 1.
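
The figures in this example can be reproduced with a few lines of code. This is a minimal sketch in which the function names and the midpoint-rule integration are illustrative choices.

```python
def prob(outcome, p_h):
    """P(outcome | p_h) for a sequence of independent tosses, e.g. 'HH'."""
    result = 1.0
    for toss in outcome:
        result *= p_h if toss == "H" else 1.0 - p_h
    return result

def likelihood(p_h, outcome="HH"):
    """The same quantity, read as a function of p_h with the outcome fixed."""
    return prob(outcome, p_h)

print(likelihood(0.5))  # 0.25
print(likelihood(1.0))  # 1.0

# Integral of the likelihood over p_h in [0, 1], by the midpoint rule.
n = 100_000
area = sum(likelihood((i + 0.5) / n) for i in range(n)) / n
print(area)  # close to 1/3, not 1

# For fixed p_h, the probabilities of all two-toss outcomes sum to 1.
total = sum(prob(o, 0.5) for o in ("HH", "HT", "TH", "TT"))
print(total)
```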

## Likelihoods that eliminate nuisance parameters

In many cases, the likelihood is a function of more than one parameter but interest focuses on the estimation of only one or at most a few of them, with the others being considered as nuisance parameters. Several alternative ways have been developed to eliminate such nuisance parameters so that a likelihood can be written as a function of the parameter (or parameters) of interest only, the main ones being marginal, conditional and profile likelihoods. These are useful because standard likelihood
methods can become unreliable or fail entirely when there are many
nuisance parameters (or the nuisance parameter is
high-dimensional), particularly when the number of nuisance
parameters is a substantial fraction of the number of observations
and this fraction does not decrease when the sample size increases.
They can also be used to derive closed-form formulae for
statistical tests when direct use of maximum likelihood requires
iterative numerical methods, and find application in some
specialized topics such as sequential
analysis.

### Conditional likelihood

Sometimes it is possible to find a sufficient statistic for the nuisance parameters, and conditioning on this statistic results in a likelihood which does not depend on the nuisance parameters. One example occurs in 2×2 tables, where
conditioning on all four marginal totals leads to a conditional
likelihood based on the non-central hypergeometric
distribution. (This form of conditioning is also the basis for
Fisher's
exact test.)
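
The 2×2 case can be sketched as follows, assuming the standard form of Fisher's non-central hypergeometric distribution with the odds ratio psi as the parameter of interest. The table counts and the function name are hypothetical; x is the top-left cell, n1 and n2 are the row totals and m1 is the first column total.

```python
from math import comb

def conditional_likelihood(psi, x, n1, n2, m1):
    """P(top-left cell = x | margins; odds ratio psi).

    The baseline probabilities (nuisance parameters) do not appear.
    """
    lo, hi = max(0, m1 - n2), min(n1, m1)

    def weight(u):
        return comb(n1, u) * comb(n2, m1 - u) * psi**u

    return weight(x) / sum(weight(u) for u in range(lo, hi + 1))

# Hypothetical table [[7, 3], [2, 8]]: rows of 10 and 10, first column 9.
x, n1, n2, m1 = 7, 10, 10, 9

for psi in (1.0, 4.0, 8.0, 16.0):
    print(psi, conditional_likelihood(psi, x, n1, n2, m1))
```

At psi = 1 the expression reduces to the ordinary hypergeometric probability, which is the null distribution used by Fisher's exact test.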

### Marginal likelihood

Sometimes we can remove the nuisance parameters
by considering a likelihood based on only part of the information
in the data, for example by using the set of ranks rather than the
numerical values. Another example occurs in linear mixed models,
where considering a likelihood for the residuals only after fitting
the fixed effects leads to
residual maximum likelihood estimation of the variance
components. (Note that marginal likelihood has a different meaning
in Bayesian inference.)

### Profile likelihood

It is often possible to write some parameters as
functions of other parameters, thereby reducing the number of
independent parameters. (The function is the parameter value which
maximises the likelihood given the value of the other parameters.)
This procedure is called concentration of the parameters and
results in the concentrated likelihood function, also occasionally
known as the maximized likelihood function, but most often called
the profile likelihood function.

For example, consider a regression
analysis model with normally
distributed
errors. The most likely value of the error variance is the variance of the
residuals. The residuals depend on all other parameters. Hence
the variance parameter can be written as a function of the other
parameters.
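
The regression case can be sketched in code. The example assumes a simple linear model y = a + b·x with normally distributed errors and invented data; the maximizing variance is taken to be the mean squared residual RSS/n, which coincides with the variance of the residuals when they average to zero.

```python
from math import log, pi

# Hypothetical data for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

def rss(a, b):
    """Residual sum of squares for intercept a and slope b."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def full_loglik(a, b, sigma2):
    """Log-likelihood in all three parameters."""
    n = len(xs)
    return -0.5 * n * log(2 * pi * sigma2) - rss(a, b) / (2 * sigma2)

def sigma2_hat(a, b):
    """Variance that maximizes the likelihood for the given a and b."""
    return rss(a, b) / len(xs)

def profile_loglik(a, b):
    """Log-likelihood with the variance concentrated out."""
    n = len(xs)
    return -0.5 * n * (log(2 * pi * sigma2_hat(a, b)) + 1.0)

a, b = 1.0, 2.0
print(profile_loglik(a, b), full_loglik(a, b, sigma2_hat(a, b)))
```

The two printed values agree, and replacing sigma2_hat(a, b) with any other variance in full_loglik gives a smaller value.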

Unlike conditional and marginal likelihoods,
profile likelihood methods can always be used (even when the
profile likelihood cannot be written down explicitly). However, the
profile likelihood is not a true likelihood as it is not based
directly on a probability distribution and this leads to some less
satisfactory properties. (Attempts have been made to improve this,
resulting in modified profile likelihood.)

The idea of profile likelihood can also be used
to compute confidence
intervals that often have better small-sample properties than
those based on asymptotic standard
errors calculated from the full likelihood.
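
One common construction, sketched here under stated assumptions, keeps every parameter value whose profile log-likelihood lies within half a chi-square quantile of the maximum. The example assumes a normal sample with unknown mean and variance, an approximate 95% level (cutoff 3.841, the 0.95 quantile of the chi-square distribution with one degree of freedom) and invented data.

```python
from math import exp, log, sqrt

CHI2_95_DF1 = 3.841  # 0.95 quantile, chi-square with 1 degree of freedom

def profile_loglik(mu, xs):
    """Profile log-likelihood of the mean, up to an additive constant."""
    n = len(xs)
    sigma2_hat = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * log(sigma2_hat)

def profile_interval(xs, cutoff=CHI2_95_DF1):
    """Values of mu with 2 * (max profile loglik - profile loglik) <= cutoff.

    For this model the condition is n * log(1 + (mean - mu)^2 / s2) <= cutoff,
    which can be solved for mu in closed form.
    """
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / n
    half_width = sqrt(s2 * (exp(cutoff / n) - 1.0))
    return mean - half_width, mean + half_width

# Hypothetical small sample.
data = [4.8, 5.6, 5.1, 6.3, 4.4, 5.9]
print(profile_interval(data))
```

Other models rarely allow a closed form, in which case the endpoints are usually found by a numerical search on the profile log-likelihood.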

## Historical remarks

Some early thoughts on likelihood were made in a book by Thorvald N. Thiele published in 1889. The first paper where the full idea of the "likelihood" appears was written by R.A. Fisher in 1922: "On the mathematical foundations of theoretical statistics". In that paper, Fisher also uses the term "method of maximum likelihood". Fisher argues against inverse probability as a basis for statistical inferences, and instead proposes inferences based on likelihood functions.

## See also

## Notes

## References

- A. W. F. Edwards (1972). Likelihood: An account of the statistical concept of likelihood and its application to scientific inference, Cambridge University Press. Reprinted in 1992, expanded edition, Johns Hopkins University Press.

# Synonyms, Antonyms and Related Words

aptitude, aptness, bare possibility, best
bet, chance, conceivability, conceivableness,
contingency, even
chance, eventuality,
expectation,
expectations, fair
expectation, favorable prospect, good chance, good opportunity,
good possibility, hope,
hopes, liability, liableness, likeliness, main chance,
obligation, odds, odds-on, odds-on chance, off
chance, outlook, outside
chance, outside hope, possibility, possibleness, potential, potentiality, presumption, presumptive
evidence, probabilism, probability, proneness, prospect, prospects, reasonable ground,
reasonable hope, remote possibility, small hope, sporting chance,
sure bet, sure thing, tendency, the attainable, the
feasible, the possible, thinkability, thinkableness, verisimilitude, virtuality, weakness, well-grounded hope,
what is possible, what may be, what might be