In statistics, propagation of uncertainty (or propagation of error) is the effect of variables' uncertainties (or errors, more specifically random errors) on the uncertainty of a function based on them. When the variables are the values of experimental measurements they have uncertainties due to measurement limitations (e.g., instrument precision) which propagate due to the combination of variables in the function.
The uncertainty $u$ can be expressed in a number of ways. It may be defined by the absolute error $\Delta x$. Uncertainties can also be defined by the relative error $(\Delta x)/x$, which is usually written as a percentage. Most commonly, the uncertainty on a quantity is quantified in terms of the standard deviation, $\sigma$, which is the positive square root of the variance. The value of a quantity and its error are then expressed as an interval $x \pm u$. However, the most general way of characterizing uncertainty is by specifying its probability distribution. If the probability distribution of the variable is known or can be assumed, in theory it is possible to get any of its statistics. In particular, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, the 68% confidence limits for a one-dimensional variable belonging to a normal distribution are approximately ± one standard deviation $\sigma$ from the central value $x$, which means that the region $x \pm \sigma$ will cover the true value in roughly 68% of cases.
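The 68% figure can be checked directly from the normal cumulative distribution function. The following is a minimal sketch using only the Python standard library; the helper name `central_coverage` is chosen here for illustration and is not part of any library.

```python
from statistics import NormalDist


def central_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Coverage of the symmetric interval [-k, +k] in units of sigma.
    return std_normal.cdf(k) - std_normal.cdf(-k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"+/- {k} sigma covers {central_coverage(k):.4f} of the distribution")
```

For `k = 1` the coverage is roughly 0.68, which is the confidence level quoted above; the same function gives the coverage for wider intervals.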
If the uncertainties are correlated then covariance must be taken into account. Correlation can arise from two different sources. First, the measurement errors may be correlated. Second, when the underlying values are correlated across a population, the uncertainties in the group averages will be correlated.[1]
In a general context where a nonlinear function modifies the uncertain parameters (correlated or not), the standard tools to propagate uncertainty, and to infer the probability distribution or statistics of the resulting quantity, are sampling techniques from the Monte Carlo method family.[2] For very large datasets or complex functions, the calculation of the error propagation may be very expensive, so that a surrogate model[3] or a parallel computing strategy[4][5][6] may be necessary.
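A sampling approach of this kind can be sketched in a few lines. The example below is illustrative only: it assumes independent, normally distributed inputs (correlated inputs would need a joint sampler), and the names `monte_carlo_propagate` and `ratio` are hypothetical helpers, not taken from the cited references.

```python
import random
import statistics


def monte_carlo_propagate(f, means, sigmas, n_samples=100_000, seed=0):
    """Estimate the mean and standard deviation of f(x) by random sampling.

    Assumption: the inputs are independent and normally distributed.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Draw one realisation of every input and evaluate the function on it.
        sample = [rng.gauss(m, s) for m, s in zip(means, sigmas)]
        outputs.append(f(*sample))
    return statistics.fmean(outputs), statistics.stdev(outputs)


def ratio(v, i):
    """Example non-linear function: a simple quotient."""
    return v / i


if __name__ == "__main__":
    mean, std = monte_carlo_propagate(ratio, means=[12.0, 2.0], sigmas=[0.1, 0.02])
    print(f"mean = {mean:.4f}, standard deviation = {std:.4f}")
```

The cost grows linearly with the number of samples and with the cost of one evaluation of `f`, which is why surrogate models or parallel execution become attractive for expensive functions.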
In some particular cases, the uncertainty propagation calculation can be done through simple algebraic procedures. Some of these scenarios are described below.
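Let $\{f_k(x_1, x_2, \dots, x_n)\}$ be a set of $m$ functions, which are linear combinations of $n$ variables $x_1, x_2, \dots, x_n$ with combination coefficients $A_{k1}, A_{k2}, \dots, A_{kn}$, $(k = 1, \dots, m)$:
$$f_k = \sum_{i=1}^{n} A_{ki} x_i,$$
or in matrix notation, $\mathbf{f} = \mathbf{A}\mathbf{x}$.
Also let the variance–covariance matrix of $\mathbf{x} = (x_1, \dots, x_n)$ be denoted by $\boldsymbol\Sigma^x$ and let the mean value be denoted by $\boldsymbol{\mu}$:
$$\boldsymbol\Sigma^x = \operatorname{E}\left[(\mathbf{x} - \boldsymbol{\mu}) \otimes (\mathbf{x} - \boldsymbol{\mu})\right] = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \cdots \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} & \cdots \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix},$$
where $\otimes$ is the outer product.
Then, the variance–covariance matrix $\boldsymbol\Sigma^f$ of $\mathbf{f}$ is given by
$$\boldsymbol\Sigma^f = \operatorname{E}\left[(\mathbf{f} - \operatorname{E}[\mathbf{f}]) \otimes (\mathbf{f} - \operatorname{E}[\mathbf{f}])\right] = \mathbf{A}\,\operatorname{E}\left[(\mathbf{x} - \boldsymbol{\mu}) \otimes (\mathbf{x} - \boldsymbol{\mu})\right]\mathbf{A}^{\mathrm{T}} = \mathbf{A}\boldsymbol\Sigma^x\mathbf{A}^{\mathrm{T}}.$$
In component notation, the equation $\boldsymbol\Sigma^f = \mathbf{A}\boldsymbol\Sigma^x\mathbf{A}^{\mathrm{T}}$ reads
$$\Sigma^f_{ij} = \sum_{k}^{n} \sum_{l}^{n} A_{ik}\,\Sigma^x_{kl}\,A_{jl}.$$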
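This is the most general expression for the propagation of error from one set of variables onto another. When the errors on $\mathbf{x}$ are uncorrelated, the general expression simplifies to
$$\Sigma^f_{ij} = \sum_{k}^{n} A_{ik}\,\Sigma^x_{k}\,A_{jk},$$
where $\Sigma^x_k = \sigma^2_{x_k}$ is the variance of the $k$-th element of the $\mathbf{x}$ vector. Note that even though the errors on $\mathbf{x}$ may be uncorrelated, the errors on $\mathbf{f}$ are in general correlated; in other words, even if $\boldsymbol\Sigma^x$ is a diagonal matrix, $\boldsymbol\Sigma^f$ is in general a full matrix.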
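The propagation rule for linear combinations, in which the output covariance is the product of the coefficient matrix, the input covariance matrix and the transposed coefficient matrix, can be sketched in plain Python. The helper names (`matmul`, `transpose`, `propagate_linear`) and the numerical values are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def propagate_linear(A, cov_x):
    """Covariance of f = A x, computed as A cov_x A^T."""
    return matmul(matmul(A, cov_x), transpose(A))


# Two uncorrelated inputs (diagonal covariance) mapped to their sum and difference.
A = [[1.0, 1.0],
     [1.0, -1.0]]
cov_x = [[0.04, 0.0],
         [0.0, 0.09]]
cov_f = propagate_linear(A, cov_x)

if __name__ == "__main__":
    for row in cov_f:
        print(row)
```

Although `cov_x` is diagonal in this sketch, `cov_f` has non-zero off-diagonal entries: the sum and the difference of the two inputs are correlated with each other.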
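The general expressions for a scalar-valued function $f$ are a little simpler (here $\mathbf{a}$ is a row vector):
$$f = \sum_{i}^{n} a_i x_i = \mathbf{a}\mathbf{x},$$
$$\sigma_f^2 = \sum_{i}^{n} \sum_{j}^{n} a_i\,\Sigma^x_{ij}\,a_j = \mathbf{a}\boldsymbol\Sigma^x\mathbf{a}^{\mathrm{T}}.$$
Each covariance term $\sigma_{ij}$ can be expressed in terms of the correlation coefficient $\rho_{ij}$ by $\sigma_{ij} = \rho_{ij}\sigma_i\sigma_j$, so that an alternative expression for the variance of $f$ is
$$\sigma_f^2 = \sum_{i}^{n} a_i^2 \sigma_i^2 + \sum_{i}^{n} \sum_{j\,(j \neq i)}^{n} a_i a_j \rho_{ij} \sigma_i \sigma_j.$$
In the case that the variables in $\mathbf{x}$ are uncorrelated, this simplifies further to
$$\sigma_f^2 = \sum_{i}^{n} a_i^2 \sigma_i^2.$$
In the simple case of identical coefficients and variances, we find
$$\sigma_f = \sqrt{n}\,|a|\,\sigma.$$
For the arithmetic mean, $a = 1/n$, the result is the standard error of the mean:
$$\sigma_f = \frac{\sigma}{\sqrt{n}}.$$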
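As a quick numerical check of the uncorrelated scalar case, the sketch below evaluates the sum of squared coefficient-times-sigma terms for the arithmetic mean of `n` measurements and compares it with the familiar standard error of the mean. The function name and the numbers are illustrative.

```python
import math


def linear_combination_sigma(coeffs, sigmas):
    """Standard deviation of sum(a_i * x_i) for uncorrelated x_i."""
    return math.sqrt(sum((a * s) ** 2 for a, s in zip(coeffs, sigmas)))


n = 25          # number of measurements (assumed value)
sigma = 0.5     # common standard deviation of each measurement (assumed value)

# The arithmetic mean is the linear combination with every coefficient equal to 1/n.
sem = linear_combination_sigma([1.0 / n] * n, [sigma] * n)

if __name__ == "__main__":
    print(sem, sigma / math.sqrt(n))
```

Both printed values agree, since the general formula reduces to the standard deviation divided by the square root of `n` when all coefficients equal `1/n`.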
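See also: Taylor expansions for the moments of functions of random variables. When $\mathbf{f}$ is a set of non-linear combinations of the variables $\mathbf{x}$, an interval propagation could be performed in order to compute intervals which contain all consistent values for the variables. In a probabilistic approach, the function $\mathbf{f}$ must usually be linearised by approximation to a first-order Taylor series expansion, though in some cases exact formulae can be derived that do not depend on the expansion, as is the case for the exact variance of products.[7] The Taylor expansion would be:
$$f_k \approx f_k^0 + \sum_{i}^{n} \frac{\partial f_k}{\partial x_i} x_i,$$
where $\partial f_k/\partial x_i$ denotes the partial derivative of $f_k$ with respect to the $i$-th variable, evaluated at the mean value of all components of the vector $\mathbf{x}$. Or in matrix notation,
$$\mathbf{f} \approx \mathbf{f}^0 + \mathbf{J}\mathbf{x},$$
where $\mathbf{J}$ is the Jacobian matrix. Since $\mathbf{f}^0$ is a constant, it does not contribute to the error on $\mathbf{f}$. Therefore, the propagation of error follows the linear case, above, but replacing the linear coefficients $A_{ki}$ and $A_{kj}$ by the partial derivatives $\frac{\partial f_k}{\partial x_i}$ and $\frac{\partial f_k}{\partial x_j}$. In matrix notation,
$$\boldsymbol\Sigma^f = \mathbf{J}\boldsymbol\Sigma^x\mathbf{J}^{\mathrm{T}}.$$
That is, the Jacobian of the function is used to transform the rows and columns of the variance-covariance matrix of the argument. Note this is equivalent to the matrix expression for the linear case with $\mathbf{J} = \mathbf{A}$.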
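The first-order rule, in which the Jacobian takes the place of the coefficient matrix of the linear case, can be illustrated with a numerically estimated Jacobian. This is a sketch under stated assumptions: the Jacobian is approximated by central differences with a fixed step, and the Cartesian-to-polar conversion is an example chosen here, not one from the article.

```python
import math


def numerical_jacobian(f, x0, h=1e-6):
    """Central-difference Jacobian of a vector-valued function f at the point x0."""
    m = len(f(x0))
    jac = [[0.0] * len(x0) for _ in range(m)]
    for j in range(len(x0)):
        upper = list(x0)
        lower = list(x0)
        upper[j] += h
        lower[j] -= h
        f_upper, f_lower = f(upper), f(lower)
        for i in range(m):
            jac[i][j] = (f_upper[i] - f_lower[i]) / (2 * h)
    return jac


def propagate(f, x0, cov_x):
    """First-order covariance of f(x): J cov_x J^T with J evaluated at x0."""
    J = numerical_jacobian(f, x0)
    n, m = len(x0), len(J)
    return [[sum(J[i][k] * cov_x[k][l] * J[j][l]
                 for k in range(n) for l in range(n))
             for j in range(m)]
            for i in range(m)]


def polar(x):
    """Map Cartesian (x, y) to polar (r, theta)."""
    return [math.hypot(x[0], x[1]), math.atan2(x[1], x[0])]


cov_polar = propagate(polar, [3.0, 4.0], [[0.01, 0.0], [0.0, 0.01]])

if __name__ == "__main__":
    for row in cov_polar:
        print(row)
```

For a linear function the numerical Jacobian reproduces the coefficient matrix, so the same routine also covers the linear case.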
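Neglecting correlations or assuming independent variables yields a common formula among engineers and experimental scientists to calculate error propagation, the variance formula:[9]
$$s_f = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 s_x^2 + \left(\frac{\partial f}{\partial y}\right)^2 s_y^2 + \left(\frac{\partial f}{\partial z}\right)^2 s_z^2 + \cdots}$$
where $s_f$ represents the standard deviation of the function $f$, $s_x$ represents the standard deviation of $x$, $s_y$ represents the standard deviation of $y$, and so forth.
This formula is based on the linear characteristics of the gradient of $f$ and therefore it is a good estimation for the standard deviation of $f$ as long as $s_x, s_y, s_z, \ldots$ are small enough. Specifically, the linear approximation of $f$ has to be close to $f$ inside a neighbourhood of radius $s_x, s_y, s_z, \ldots$.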
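The variance formula for independent variables lends itself to a small generic helper. The sketch below estimates each partial derivative by a central difference, which is an implementation choice made here for illustration; the function names and the cylinder example are hypothetical.

```python
import math


def variance_formula(f, values, sigmas, rel_step=1e-6):
    """Standard deviation of f from the variance formula (independent inputs).

    Partial derivatives are approximated by central differences.
    """
    total = 0.0
    for i, (v, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(v), 1.0)
        upper = list(values)
        lower = list(values)
        upper[i] = v + h
        lower[i] = v - h
        dfdx = (f(*upper) - f(*lower)) / (2 * h)
        # Each input contributes (partial derivative * standard deviation)^2.
        total += (dfdx * s) ** 2
    return math.sqrt(total)


def cylinder_volume(radius, height):
    """Example function: volume of a cylinder."""
    return math.pi * radius ** 2 * height


s_volume = variance_formula(cylinder_volume, [2.0, 10.0], [0.01, 0.05])

if __name__ == "__main__":
    print(f"V = {cylinder_volume(2.0, 10.0):.3f} +/- {s_volume:.3f}")
```

Because the formula relies on a local linear approximation, the result is only trustworthy when the input standard deviations are small compared with the scale over which the function curves.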
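Any non-linear differentiable function, $f(a,b)$, of two variables, $a$ and $b$, can be expanded as
$$f \approx f^0 + \frac{\partial f}{\partial a} a + \frac{\partial f}{\partial b} b.$$
Taking the variance on both sides and using the formula for the variance of a linear combination of variables, $\operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab\operatorname{Cov}(X, Y)$, we obtain
$$\sigma_f^2 \approx \left|\frac{\partial f}{\partial a}\right|^2 \sigma_a^2 + \left|\frac{\partial f}{\partial b}\right|^2 \sigma_b^2 + 2\frac{\partial f}{\partial a}\frac{\partial f}{\partial b}\sigma_{ab},$$
where $\sigma_f$ is the standard deviation of the function $f$, $\sigma_a$ is the standard deviation of $a$, $\sigma_b$ is the standard deviation of $b$, and $\sigma_{ab} = \sigma_a\sigma_b\rho_{ab}$ is the covariance between $a$ and $b$.
In the particular case that $f = ab$, we have $\frac{\partial f}{\partial a} = b$ and $\frac{\partial f}{\partial b} = a$. Then
$$\sigma_f^2 \approx b^2\sigma_a^2 + a^2\sigma_b^2 + 2ab\,\sigma_{ab}$$
or
$$\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2 + 2\left(\frac{\sigma_a}{a}\right)\left(\frac{\sigma_b}{b}\right)\rho_{ab},$$
where $\rho_{ab}$ is the correlation between $a$ and $b$.
When the variables $a$ and $b$ are uncorrelated, $\rho_{ab} = 0$. Then
$$\left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_a}{a}\right)^2 + \left(\frac{\sigma_b}{b}\right)^2.$$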
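The effect of correlation on the relative uncertainty of a product can be made concrete with a short function. It is a sketch of the first-order result for a product of two variables; the function name and the sample values are assumptions made for illustration.

```python
import math


def product_relative_sigma(a, b, sigma_a, sigma_b, rho_ab=0.0):
    """First-order relative standard deviation of f = a * b."""
    rel_a = sigma_a / a
    rel_b = sigma_b / b
    # Squared relative errors add, plus a cross term scaled by the correlation.
    return math.sqrt(rel_a ** 2 + rel_b ** 2 + 2.0 * rel_a * rel_b * rho_ab)


if __name__ == "__main__":
    for rho in (-1.0, 0.0, 1.0):
        rel = product_relative_sigma(10.0, 5.0, 0.3, 0.2, rho_ab=rho)
        print(f"rho = {rho:+.0f}: relative sigma = {rel:.4f}")
```

With relative errors of 3% and 4%, the uncorrelated case gives 5%, while full positive correlation gives their plain sum and full negative correlation their difference.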
Error estimates for non-linear functions are biased on account of using a truncated series expansion. The extent of this bias depends on the nature of the function. For example, the bias on the error calculated for log(1+x) increases as x increases, since the expansion to x is a good approximation only when x is near zero.
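A minimal sketch of this effect, assuming that "the expansion to x" refers to the first-order approximation of log(1+x) by x about zero. The two helper functions are hypothetical names introduced here: one measures how far the truncated expansion is from the function, the other how much an error estimate based on the slope at zero overstates the one based on the local slope.

```python
import math


def truncation_error(x: float) -> float:
    """Absolute error of the first-order approximation log(1 + x) ~ x."""
    return abs(math.log1p(x) - x)


def slope_ratio(x: float) -> float:
    """Slope used by the expansion about zero (1) divided by the local slope 1 / (1 + x)."""
    return 1.0 + x


if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0):
        print(f"x = {x}: truncation error = {truncation_error(x):.6f}, "
              f"slope ratio = {slope_ratio(x):.2f}")
```

Both quantities grow with `x`, which is the sense in which the linearised error estimate becomes less reliable away from the expansion point.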
For highly non-linear functions, there exist five categories of probabilistic approaches for uncertainty propagation;[12] see Uncertainty quantification for details.
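In the special case of the inverse or reciprocal $1/B$, where $B = N(0,1)$ follows a standard normal distribution, the resulting distribution is a reciprocal standard normal distribution, and there is no definable variance.
However, in the slightly more general case of a shifted reciprocal function $1/(p - B)$ for $B = N(\mu, \sigma)$ following a general normal distribution, mean and variance statistics do exist in a principal value sense, if the difference between the pole $p$ and the mean $\mu$ is real-valued.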
Ratios are also problematic; normal approximations exist under certain conditions.
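This table shows the variances and standard deviations of simple functions of the real variables $A, B$ with standard deviations $\sigma_A, \sigma_B$, covariance $\sigma_{AB} = \rho_{AB}\sigma_A\sigma_B$, and correlation $\rho_{AB}$. The real-valued coefficients $a$ and $b$ are assumed exactly known (deterministic), i.e., $\sigma_a = \sigma_b = 0$.
In the right-hand columns of the table, $A$ and $B$ are expectation values, and $f$ is the value of the function calculated at those values.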
Function | Variance | Standard deviation
---|---|---
$f = aA$ | $\sigma_f^2 = a^2\sigma_A^2$ | $\sigma_f = \lvert a\rvert\sigma_A$
$f = A + B$ | $\sigma_f^2 = \sigma_A^2 + \sigma_B^2 + 2\sigma_{AB}$ | $\sigma_f = \sqrt{\sigma_A^2 + \sigma_B^2 + 2\sigma_{AB}}$
$f = A - B$ | $\sigma_f^2 = \sigma_A^2 + \sigma_B^2 - 2\sigma_{AB}$ | $\sigma_f = \sqrt{\sigma_A^2 + \sigma_B^2 - 2\sigma_{AB}}$
$f = aA + bB$ | $\sigma_f^2 = a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}$ | $\sigma_f = \sqrt{a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}}$
$f = aA - bB$ | $\sigma_f^2 = a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}$ | $\sigma_f = \sqrt{a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}}$
$f = AB$ | $\sigma_f^2 \approx f^2\left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB}\right]$ | $\sigma_f \approx \lvert f\rvert\sqrt{\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB}}$
$f = \frac{A}{B}$ | $\sigma_f^2 \approx f^2\left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB}\right]$ | $\sigma_f \approx \lvert f\rvert\sqrt{\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB}}$
$f = \frac{A}{A+B}$ | $\sigma_f^2 \approx \frac{f^2}{(A+B)^2}\left(\frac{B^2}{A^2}\sigma_A^2 + \sigma_B^2 - 2\frac{B}{A}\sigma_{AB}\right)$ | $\sigma_f \approx \left\lvert\frac{f}{A+B}\right\rvert\sqrt{\frac{B^2}{A^2}\sigma_A^2 + \sigma_B^2 - 2\frac{B}{A}\sigma_{AB}}$
$f = aA^b$ | $\sigma_f^2 \approx \left(abA^{b-1}\sigma_A\right)^2 = \left(\frac{fb\,\sigma_A}{A}\right)^2$ | $\sigma_f \approx \left\lvert abA^{b-1}\sigma_A\right\rvert = \left\lvert\frac{fb\,\sigma_A}{A}\right\rvert$
$f = a\ln(bA)$ | $\sigma_f^2 \approx \left(a\frac{\sigma_A}{A}\right)^2$ | $\sigma_f \approx \left\lvert a\frac{\sigma_A}{A}\right\rvert$
$f = a\log_{10}(bA)$ | $\sigma_f^2 \approx \left(a\frac{\sigma_A}{A\ln(10)}\right)^2$ | $\sigma_f \approx \left\lvert a\frac{\sigma_A}{A\ln(10)}\right\rvert$
$f = ae^{bA}$ | $\sigma_f^2 \approx f^2\left(b\sigma_A\right)^2$ | $\sigma_f \approx \lvert f\rvert\,\lvert b\sigma_A\rvert$
$f = a^{bA}$ | $\sigma_f^2 \approx f^2\left(b\ln(a)\sigma_A\right)^2$ | $\sigma_f \approx \lvert f\rvert\,\lvert b\ln(a)\sigma_A\rvert$
$f = a\sin(bA)$ | $\sigma_f^2 \approx \left[ab\cos(bA)\sigma_A\right]^2$ | $\sigma_f \approx \lvert ab\cos(bA)\sigma_A\rvert$
$f = a\cos(bA)$ | $\sigma_f^2 \approx \left[ab\sin(bA)\sigma_A\right]^2$ | $\sigma_f \approx \lvert ab\sin(bA)\sigma_A\rvert$
$f = a\tan(bA)$ | $\sigma_f^2 \approx \left[ab\sec^2(bA)\sigma_A\right]^2$ | $\sigma_f \approx \lvert ab\sec^2(bA)\sigma_A\rvert$
$f = A^B$ | $\sigma_f^2 \approx f^2\left[\left(\frac{B}{A}\sigma_A\right)^2 + \left(\ln(A)\sigma_B\right)^2 + 2\frac{B\ln(A)}{A}\sigma_{AB}\right]$ | $\sigma_f \approx \lvert f\rvert\sqrt{\left(\frac{B}{A}\sigma_A\right)^2 + \left(\ln(A)\sigma_B\right)^2 + 2\frac{B\ln(A)}{A}\sigma_{AB}}$
$f = \sqrt{aA^2 \pm bB^2}$ | $\sigma_f^2 \approx \left(\frac{A}{f}\right)^2 a^2\sigma_A^2 + \left(\frac{B}{f}\right)^2 b^2\sigma_B^2 \pm 2ab\frac{AB}{f^2}\sigma_{AB}$ | $\sigma_f \approx \sqrt{\left(\frac{A}{f}\right)^2 a^2\sigma_A^2 + \left(\frac{B}{f}\right)^2 b^2\sigma_B^2 \pm 2ab\frac{AB}{f^2}\sigma_{AB}}$
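Each approximate entry in the table follows from the first-order expansion with the appropriate partial derivatives. As a worked illustration (a derivation sketch, not part of the original table), the quotient of two variables gives:

```latex
\begin{aligned}
f &= \frac{A}{B}, \qquad
\frac{\partial f}{\partial A} = \frac{1}{B} = \frac{f}{A}, \qquad
\frac{\partial f}{\partial B} = -\frac{A}{B^{2}} = -\frac{f}{B}, \\[4pt]
\sigma_f^{2} &\approx
\left(\frac{\partial f}{\partial A}\right)^{2}\sigma_A^{2}
+ \left(\frac{\partial f}{\partial B}\right)^{2}\sigma_B^{2}
+ 2\,\frac{\partial f}{\partial A}\,\frac{\partial f}{\partial B}\,\sigma_{AB} \\[4pt]
&= f^{2}\left[\left(\frac{\sigma_A}{A}\right)^{2}
+ \left(\frac{\sigma_B}{B}\right)^{2}
- 2\,\frac{\sigma_{AB}}{AB}\right].
\end{aligned}
```

The sign of the covariance term is negative because the two partial derivatives have opposite signs; for the product of the two variables both derivatives share a sign and the term is positive.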
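For uncorrelated variables ($\rho_{AB} = 0$, $\sigma_{AB} = 0$), expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives
$$f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + \left(\frac{\sigma_C}{C}\right)^2.$$
For the case $f = AB$ with independent $A$ and $B$, an exact expression for the variance is also available:[7]
$$\sigma_f^2 = A^2\sigma_B^2 + B^2\sigma_A^2 + \sigma_A^2\sigma_B^2.$$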
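The difference between the first-order estimate and the exact variance of a product can be checked numerically. The sketch below assumes independent variables, for which the exact variance of the product contains one additional term, the product of the two variances; the function names and the numbers are illustrative.

```python
def product_variance_first_order(A, B, sigma_a, sigma_b):
    """First-order variance of f = A * B for uncorrelated A and B."""
    return (B * sigma_a) ** 2 + (A * sigma_b) ** 2


def product_variance_exact(A, B, sigma_a, sigma_b):
    """Exact variance of f = A * B, assuming A and B are independent."""
    # The extra term is second order in the uncertainties and is dropped by linearisation.
    return product_variance_first_order(A, B, sigma_a, sigma_b) + (sigma_a * sigma_b) ** 2


if __name__ == "__main__":
    args = (2.0, 3.0, 0.5, 0.4)
    print(product_variance_first_order(*args), product_variance_exact(*args))
```

The two results differ only by the product of the variances, so the linearised estimate is accurate whenever the relative uncertainties are small.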
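If $A$ and $B$ are uncorrelated, their difference $A - B$ will have more variance than either of them. An increasing positive correlation ($\rho_{AB} \to 1$) will decrease the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. On the other hand, a negative correlation ($\rho_{AB} \to -1$) will further increase the variance of the difference, compared to the uncorrelated case.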
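For example, the self-subtraction $f = A - A$ has zero variance $\sigma_f^2 = 0$ only if the variate is perfectly autocorrelated ($\rho_A = 1$). If $A$ is uncorrelated, $\rho_A = 0$, then the output variance is twice the input variance, $\sigma_f^2 = 2\sigma_A^2$. And if $A$ is perfectly anticorrelated, $\rho_A = -1$, then the input variance is quadrupled in the output, $\sigma_f^2 = 4\sigma_A^2$ (since $\sigma_f^2 = 2\sigma_A^2(1 - \rho_A)$ and $1 - \rho_A = 2$ in this case).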
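The dependence of the variance of a difference on the correlation can be tabulated with a few lines of code. This is an illustrative sketch; the function name and the chosen standard deviation are assumptions.

```python
def difference_variance(sigma_a, sigma_b, rho_ab):
    """Variance of f = A - B for given standard deviations and correlation."""
    return sigma_a ** 2 + sigma_b ** 2 - 2.0 * rho_ab * sigma_a * sigma_b


if __name__ == "__main__":
    sigma = 1.5
    for rho in (1.0, 0.5, 0.0, -0.5, -1.0):
        var = difference_variance(sigma, sigma, rho)
        print(f"rho = {rho:+.1f}: variance of difference = {var:.3f}")
```

With equal standard deviations the variance runs from zero at a correlation of +1, through twice the input variance at zero correlation, to four times the input variance at a correlation of −1.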
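We can calculate the uncertainty propagation for the inverse tangent function as an example of using partial derivatives to propagate error.
Define
$$f(x) = \arctan(x),$$
where $\Delta x$ is the absolute uncertainty on our measurement of $x$. The derivative of $f(x)$ with respect to $x$ is
$$\frac{df}{dx} = \frac{1}{1 + x^2}.$$
Therefore, our propagated uncertainty is
$$\Delta f \approx \frac{\Delta x}{1 + x^2},$$
where $\Delta f$ is the absolute propagated uncertainty.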
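The inverse tangent result can be compared against a direct evaluation of the function at the edges of the uncertainty interval. The following sketch uses hypothetical helper names and an arbitrary measurement value.

```python
import math


def arctan_uncertainty(x, delta_x):
    """First-order propagated uncertainty of f(x) = arctan(x)."""
    # The derivative of arctan(x) is 1 / (1 + x^2).
    return abs(delta_x) / (1.0 + x * x)


def arctan_half_spread(x, delta_x):
    """Half the change in arctan across the interval [x - delta_x, x + delta_x]."""
    return (math.atan(x + delta_x) - math.atan(x - delta_x)) / 2.0


if __name__ == "__main__":
    x, delta_x = 1.0, 0.02
    print(arctan_uncertainty(x, delta_x), arctan_half_spread(x, delta_x))
```

For a small uncertainty the two numbers agree closely; the agreement degrades as the uncertainty grows, because the derivative is no longer constant across the interval.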
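A practical application is an experiment in which one measures current, $I$, and voltage, $V$, on a resistor in order to determine the resistance, $R$, using Ohm's law, $R = V/I$.
Given the measured variables with uncertainties, $I \pm \sigma_I$ and $V \pm \sigma_V$, and neglecting their possible correlation, the uncertainty in the computed quantity, $\sigma_R$, is:
$$\sigma_R \approx \sqrt{\sigma_V^2\left(\frac{1}{I}\right)^2 + \sigma_I^2\left(\frac{-V}{I^2}\right)^2} = R\sqrt{\left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_I}{I}\right)^2}.$$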
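The resistance measurement can be turned into a short calculation. This sketch neglects any correlation between the voltage and current readings, as in the text; the function name and the measured values are made up for illustration.

```python
import math


def resistance_with_uncertainty(v, sigma_v, i, sigma_i):
    """Resistance R = V / I and its first-order uncertainty (uncorrelated V and I)."""
    r = v / i
    # Relative uncertainties of V and I add in quadrature for a quotient.
    sigma_r = abs(r) * math.sqrt((sigma_v / v) ** 2 + (sigma_i / i) ** 2)
    return r, sigma_r


if __name__ == "__main__":
    r, sigma_r = resistance_with_uncertainty(v=12.0, sigma_v=0.1, i=2.0, sigma_i=0.02)
    print(f"R = {r:.3f} +/- {sigma_r:.3f} ohm")
```

The relative form makes it easy to see which measurement dominates: in this made-up example the current reading contributes the larger share of the uncertainty.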