In computer vision and image processing, Otsu's method, named after, is used to perform automatic image thresholding.[1] In the simplest form, the algorithm returns a single intensity threshold that separate pixels into two classes, foreground and background. This threshold is determined by minimizing intra-class intensity variance, or equivalently, by maximizing inter-class variance.[2] Otsu's method is a one-dimensional discrete analogue of Fisher's discriminant analysis, is related to Jenks optimization method, and is equivalent to a globally optimal k-means[3] performed on the intensity histogram. The extension to multi-level thresholding was described in the original paper, and computationally efficient implementations have since been proposed.[4] [5]
The algorithm exhaustively searches for the threshold that minimizes the intra-class variance, defined as a weighted sum of variances of the two classes:
2 | |
\sigma | |
w(t)=\omega |
2 | |
0(t)+\omega |
2 | |
1(t) |
\omega0
\omega1
t
2 | |
\sigma | |
0 |
2 | |
\sigma | |
1 |
The class probability
\omega\{0,1\
L
\begin{align} \omega0(t)&
t-1 | |
=\sum | |
i=0 |
p(i)\\[4pt] \omega1(t)&
L-1 | |
=\sum | |
i=t |
p(i) \end{align}
For 2 classes, minimizing the intra-class variance is equivalent to maximizing inter-class variance:[2]
2 | |
\begin{align} \sigma | |
b(t) |
&=\sigma2-\sigma
2 | |
w(t)=\omega |
0(t)(\mu0-\mu
2+\omega | |
1(t)(\mu |
1-\mu
2 | |
T) |
\\ &=\omega0(t)\omega1(t)\left[\mu0(t)-\mu
2 \end{align} | |
1(t)\right] |
which is expressed in terms of class probabilities
\omega
\mu
\mu0(t)
\mu1(t)
\muT
\begin{align} \mu0(t)&=
| ||||||||||
\omega0(t) |
\\[4pt] \mu1(t)&=
| ||||||||||
\omega1(t) |
\\ \muT&=
L-1 | |
\sum | |
i=0 |
ip(i) \end{align}
The following relations can be easily verified:
\begin{align} \omega0\mu0+\omega1\mu1&=\muT\\ \omega0+\omega1&=1 \end{align}
The class probabilities and class means can be computed iteratively. This ideayields an effective algorithm.
\omegai(0)
\mui(0)
t=1,\ldots
\omegai
\mui
2 | |
\sigma | |
b(t) |
2 | |
\sigma | |
b(t) |
histogramCounts
is a 256-element histogram of a grayscale image different gray-levels (typical for 8-bit images). level
is the threshold for the image (double).
Matlab has built-in functions graythresh
and multithresh
in the Image Processing Toolbox which are implemented with Otsu's method and Multi Otsu's method, respectively.
This implementation requires the NumPy library.
def otsu_intraclass_variance(image, threshold): """ Otsu's intra-class variance. If all pixels are above or below the threshold, this will throw a warning that can safely be ignored. """ return np.nansum([np.mean(cls) * np.var(image, where=cls) # weight · intra-class variance for cls in [image >= threshold, image < threshold] ]) # NaNs only arise if the class is empty, in which case the contribution should be zero, which `nansum` accomplishes.
image = np.random.randint(2, 253, size=(50, 50))
otsu_threshold = min(range(np.min(image) + 1, np.max(image)), key=lambda th: otsu_intraclass_variance(image, th),)Python libraries dedicated to image processing such as OpenCV and Scikit-image propose built-in implementations of the algorithm.
Otsu's method performs well when the histogram has a bimodal distribution with a deep and sharp valley between the two peaks.[6]
Like all other global thresholding methods, Otsu's method performs badly in case of heavy noise, small objects size, inhomogeneous lighting and larger intra-class than inter-class variance.[7] In those cases, local adaptations of the Otsu method have been developed.
Moreover, the mathematical grounding of Otsu's method models the histogram of the image as a mixture of two Normal distributions with equal variance and equal size.[8] Otsu's thresholding may however yield satisfying results even when these assumptions are not met, in the same way statistical tests (to which Otsu's method is heavily connected[9]) can perform correctly even when the working assumptions are not fully satisfied.
Several variations of Otsu's methods have been proposed to account for more severe deviations from these assumptions, such as the Kittler-Illingworth method.[10]
A popular local adaptation is the two-dimensional Otsu's method, which performs better for the object segmentation task in noisy images. Here, the intensity value of a given pixel is compared with the average intensity of its immediate neighborhood to improve segmentation results.[11]
At each pixel, the average gray-level value of the neighborhood is calculated. Let the gray level of the given pixel be divided into
L
L
(i,j)
L x L
fij
(i,j)
N
Pij=
fij | |
N, |
L-1 | |
\sum | |
i=0 |
L-1 | |
\sum | |
j=0 |
Pij=1
And the 2-dimensional Otsu's method is developed based on the 2-dimensional histogram as follows.
The probabilities of two classes can be denoted as:
\begin{align} \omega0&=
s-1 | |
\sum | |
i=0 |
t-1 | |
\sum | |
j=0 |
Pij\\ \omega1&=
L-1 | |
\sum | |
i=s |
L-1 | |
\sum | |
j=t |
Pij\end{align}
The intensity mean value vectors of two classes and total mean vector can be expressed as follows:
\begin{align} \mu0&=[\mu0i,\mu0j]T=
s-1 | |
\left[\sum | |
i=0 |
t-1 | |
\sum | |
j=0 |
i
Pij | |
\omega0 |
,
s-1 | |
\sum | |
i=0 |
t-1 | |
\sum | |
j=0 |
j
Pij | |
\omega0 |
\right]T\\ \mu1&=[\mu1i,\mu1j]T=
L-1 | |
\left[\sum | |
i=s |
L-1 | |
\sum | |
j=t |
i
Pij | |
\omega1 |
,
L-1 | |
\sum | |
i=s |
L-1 | |
\sum | |
j=t |
j
Pij | |
\omega1 |
\right]T\\ \muT&=[\muTi,\muTj]T=
L-1 | |
\left[\sum | |
i=0 |
L-1 | |
\sum | |
j=0 |
iPij,
L-1 | |
\sum | |
i=0 |
L-1 | |
\sum | |
j=0 |
jPij\right]T \end{align}
In most cases the probability off-diagonal will be negligible, so it is easy to verify:
\omega0+\omega1\cong1
\omega0\mu0+\omega1\mu1\cong\muT
The inter-class discrete matrix is defined as
Sb=
1\omega | |
\sum | |
k[(\mu |
k-\muT)(\muk-\mu
T] | |
T) |
The trace of the discrete matrix can be expressed as
\begin{align} &\operatorname{tr}(Sb)\\[4pt] ={}&\omega0[(\mu0i-\muTi)2+(\mu0j-\muTj)2]+\omega1[(\mu1i-\muTi)2+(\mu1j-\muTj)2]\\[4pt] ={}&
| |||||||||||||||||||
\omega0(1-\omega0) |
\end{align}
where
\mui=
s-1 | |
\sum | |
i=0 |
t-1 | |
\sum | |
j=0 |
iPij
\muj=
s-1 | |
\sum | |
i=0 |
t-1 | |
\sum | |
j=0 |
jPij
Similar to one-dimensional Otsu's method, the optimal threshold
(s,t)
\operatorname{tr}(Sb)
The
s
t
s
t
\operatorname{tr}(Sb)
for ss: 0 to L-1 do for tt: 0 to L-1 do evaluate tr(S_b); if tr(S_b) > max max = tr(S,b); s = ss; t = tt; end if end forend for
return s,t;
Notice that for evaluating
\operatorname{tr}(Sb)
If summed area tables are used to build the 3 tables, sum over
Pij
i*Pij
j*Pij
See also: Summed-area table.
function inputs and output:
hists
is a
256 x 256
total
is the number of pairs in the given image.it is determined by the number of the bins of 2D-histogram at each direction.
threshold
is the threshold obtained.
1 p_0(1,1) = hists(1,1); else p_0(ii,1) = p_0(ii-1,1) + hists(ii,1); mu_i(ii,1) = mu_i(ii-1,1)+(ii-1)*hists(ii,1); mu_j(ii,1) = mu_j(ii-1,1); end else p_0(ii,jj) = p_0(ii,jj-1)+p_0(ii-1,jj)-p_0(ii-1,jj-1)+hists(ii,jj); % THERE IS A BUG HERE. INDICES IN MATLAB MUST BE HIGHER THAN 0. ii-1 is not valid mu_i(ii,jj) = mu_i(ii,jj-1)+mu_i(ii-1,jj)-mu_i(ii-1,jj-1)+(ii-1)*hists(ii,jj); mu_j(ii,jj) = mu_j(ii,jj-1)+mu_j(ii-1,jj)-mu_j(ii-1,jj-1)+(jj-1)*hists(ii,jj); end
if (p_0(ii,jj)
total) break; end tr = ((mu_i(ii,jj)-p_0(ii,jj)*mu_t0)^2 + (mu_j(ii,jj)-p_0(ii,jj)*mu_t1)^2)/(p_0(ii,jj)*(1-p_0(ii,jj)));
if (tr >= maximum) threshold = ii; maximum = tr; end endendend
When the levels of gray of the classes of the image can be considered as Normal distributions but with unequal size and/or unequal variances, assumptions for the Otsu algorithm are not met. The Kittler-Illingworth algorithm (also known as Minimum Error thresholding) is a variation of Otsu's method to handle such cases. There are several ways to mathematically describe this algorithm. One of them is to consider that for each threshold being tested, the parameters of the Normal distributions in the resulting binary image are estimated by Maximum likelihood estimation given the data.
While this algorithm could seem superior to Otsu's method, it introduces new parameters to be estimated, and this can result in the algorithm being over-parametrized and thus unstable. In many cases where the assumptions from Otsu's method seem at least partially valid, it may be preferable to favor Otsu's method over the Kittler-Illingworth algorithm, following Occam's razor.
One limitation of the Otsu’s method is that it cannot segment weak objects as the method searches for a single threshold to separate an image into two classes, namely, foreground and background, in one shot. Because the Otsu’s method looks to segment an image with one threshold, it tends to bias toward the class with the large variance.[14] Iterative triclass thresholding algorithm is a variation of the Otsu’s method to circumvent this limitation.[15] Given an image, at the first iteration, the triclass thresholding algorithm calculates a threshold
η1
η1
[1] | |
\mu | |
upper |
η1
[1] | |
\mu | |
lower |
η1
[1] | |
\mu | |
upper |
F
[1] | |
\mu | |
lower |
B
[1] | |
[\mu | |
lower |
,
[1] | |
\mu | |
upper |
]
η2
[2] | |
\mu | |
upper |
η2
[2] | |
\mu | |
lower |
η2
[2] | |
\mu | |
upper |
F
[2] | |
\mu | |
lower |
B
[2] | |
[\mu | |
lower |
,
[2] | |
\mu | |
upper |
]
ηn