Interaction information explained

In probability theory and information theory, the interaction information is a generalization of the mutual information for more than two variables.

There are many names for interaction information, including amount of information,[1] information correlation,[2] co-information,[3] and simply mutual information. Interaction information expresses the amount of information (redundancy or synergy) bound up in a set of variables, beyond that which is present in any subset of those variables. Unlike the mutual information, the interaction information can be either positive or negative. These functions, their negativity and minima have a direct interpretation in algebraic topology.[4]

Definition

The conditional mutual information can be used to inductively define the interaction information for any finite number of variables as follows:

I(X_1;\ldots;X_{n+1}) = I(X_1;\ldots;X_n) - I(X_1;\ldots;X_n \mid X_{n+1}),

where

I(X_1;\ldots;X_n \mid X_{n+1}) = \mathbb{E}_{X_{n+1}}\bigl(I(X_1;\ldots;X_n) \mid X_{n+1}\bigr).

Some authors[5] define the interaction information differently, by swapping the two terms being subtracted in the preceding equation. This has the effect of reversing the sign for an odd number of variables.

For three variables $\{X,Y,Z\}$, the interaction information $I(X;Y;Z)$ is given by

I(X;Y;Z) = I(X;Y) - I(X;Y \mid Z)

where $I(X;Y)$ is the mutual information between variables $X$ and $Y$, and $I(X;Y \mid Z)$ is the conditional mutual information between variables $X$ and $Y$ given $Z$. The interaction information is symmetric, so it does not matter which variable is conditioned on. This is easy to see when the interaction information is written in terms of entropy and joint entropy, as follows:

\begin{alignat}{3}
I(X;Y;Z) &= &&\bigl(H(X)+H(Y)+H(Z)\bigr) \\
 &&&- \bigl(H(X,Y)+H(X,Z)+H(Y,Z)\bigr) \\
 &&&+ H(X,Y,Z)
\end{alignat}
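As a brief check of this identity (a sketch that uses only the standard entropy expansions of mutual information and conditional mutual information), substituting into the three-variable definition gives:

```latex
\begin{align}
I(X;Y) &= H(X) + H(Y) - H(X,Y) \\
I(X;Y \mid Z) &= H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) \\
I(X;Y) - I(X;Y \mid Z) &= H(X) + H(Y) + H(Z) \\
 &\quad - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z)
\end{align}
```

The final expression is unchanged under any permutation of $X$, $Y$ and $Z$, which is why the choice of conditioning variable does not matter.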

In general, for the set of variables $\mathcal{V}=\{X_1,X_2,\ldots,X_n\}$, the interaction information can be written in the following form (compare with Kirkwood approximation):

I(\mathcal{V}) = \sum_{\mathcal{T}\subseteq\mathcal{V}} (-1)^{\left\vert\mathcal{T}\right\vert-1} H(\mathcal{T})
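The alternating sum over subsets can be evaluated directly for a small discrete joint distribution. The following is a minimal sketch, not a reference implementation: the function names `entropy_bits` and `interaction_information` are hypothetical, the joint distribution is assumed to be given as an n-dimensional NumPy array with one axis per variable, and the empty subset is skipped because its entropy is zero.

```python
from itertools import combinations

import numpy as np


def entropy_bits(p):
    """Shannon entropy in bits of a probability array (zero cells are skipped)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def interaction_information(joint):
    """Alternating sum of joint entropies over all non-empty subsets of variables.

    `joint` is an n-dimensional array; axis i indexes the values of variable i.
    """
    joint = np.asarray(joint, dtype=float)
    n = joint.ndim
    total = 0.0
    for size in range(1, n + 1):
        sign = (-1) ** (size - 1)
        for subset in combinations(range(n), size):
            # Marginalise out every variable that is not in the subset.
            dropped = tuple(ax for ax in range(n) if ax not in subset)
            marginal = joint.sum(axis=dropped) if dropped else joint
            total += sign * entropy_bits(marginal)
    return total


# Three identical copies of one fair bit: fully redundant.
copies = np.zeros((2, 2, 2))
copies[0, 0, 0] = copies[1, 1, 1] = 0.5

# Z = X XOR Y with X and Y independent fair bits: purely synergistic.
xor = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        xor[x, y, x ^ y] = 0.25

print(interaction_information(copies))  # approximately +1 bit
print(interaction_information(xor))     # approximately -1 bit
```

The number of subsets grows as 2^n, so this direct evaluation is only practical for a handful of variables.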

For three variables, the interaction information measures the influence of a variable $Z$ on the amount of information shared between $X$ and $Y$. Because the term $I(X;Y \mid Z)$ can be larger than $I(X;Y)$, the interaction information can be negative as well as positive. It is negative, for example, when $X$ and $Y$ are independent but not conditionally independent given $Z$. Positive interaction information indicates that variable $Z$ inhibits (i.e., accounts for or explains some of) the correlation between $X$ and $Y$, whereas negative interaction information indicates that variable $Z$ facilitates or enhances the correlation.

Properties

Interaction information is bounded. In the three-variable case, it is bounded by[6]

-\min\{I(X;Y \mid Z), I(Y;Z \mid X), I(X;Z \mid Y)\} \leq I(X;Y;Z) \leq \min\{I(X;Y), I(Y;Z), I(X;Z)\}
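These bounds can be checked numerically on an arbitrary joint distribution. The sketch below is illustrative only: the helper names `H`, `mi` and `cmi` are hypothetical, the distribution is drawn at random with a fixed seed, and a small tolerance absorbs floating-point rounding.

```python
import numpy as np


def H(p):
    """Shannon entropy in bits of a probability array (zero cells are skipped)."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def mi(p, a, b):
    """I(A;B) from a 3-D joint array; a and b are axis indices."""
    c = 3 - a - b  # the remaining axis
    return H(p.sum(axis=(b, c))) + H(p.sum(axis=(a, c))) - H(p.sum(axis=c))


def cmi(p, a, b):
    """I(A;B | C), where C is the remaining axis."""
    # H(A,C) + H(B,C) - H(A,B,C) - H(C)
    return H(p.sum(axis=b)) + H(p.sum(axis=a)) - H(p) - H(p.sum(axis=(a, b)))


# A random joint distribution over variables with 3, 4 and 2 values.
rng = np.random.default_rng(0)
p = rng.random((3, 4, 2))
p /= p.sum()

pairs = [(0, 1), (1, 2), (0, 2)]
interaction = mi(p, 0, 1) - cmi(p, 0, 1)
lower = -min(cmi(p, a, b) for a, b in pairs)
upper = min(mi(p, a, b) for a, b in pairs)

tol = 1e-12
bounds_hold = lower - tol <= interaction <= upper + tol
print(lower, interaction, upper, bounds_hold)
```

Both bounds follow from the symmetry of the definition: the interaction information equals $I(A;B) - I(A;B \mid C)$ for every choice of pair, and both terms are non-negative.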

If three variables form a Markov chain $X \to Y \to Z$, then $I(X;Z \mid Y)=0$, but $I(X;Z) \ge 0$. Therefore

I(X;Y;Z) = I(X;Z) - I(X;Z \mid Y) = I(X;Z) \ge 0.
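As an illustrative special case (the channel model and the crossover probabilities $a$ and $b$ are assumptions made for this sketch, not part of the source), let $X$ be a fair bit, let $Y$ be $X$ passed through a binary symmetric channel with crossover probability $a$, and let $Z$ be $Y$ passed through an independent binary symmetric channel with crossover probability $b$. Then $X \to Y \to Z$ is a Markov chain, the end-to-end channel is again binary symmetric, and:

```latex
\begin{align}
\Pr(Z \neq X) &= a(1-b) + (1-a)b \\
I(X;Y;Z) = I(X;Z) &= 1 - h_2\bigl(a(1-b) + (1-a)b\bigr) \ge 0,
\qquad h_2(p) = -p\log_2 p - (1-p)\log_2(1-p)
\end{align}
```

For $a = b = 0.1$, the combined crossover probability is $0.18$ and the interaction information is approximately $0.32$ bits.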

Examples

Positive interaction information

Positive interaction information seems much more natural than negative interaction information in the sense that such explanatory effects are typical of common-cause structures. For example, clouds cause rain and also block the sun; therefore, the correlation between rain and darkness is partly accounted for by the presence of clouds, $I(\text{rain};\text{dark} \mid \text{cloud}) < I(\text{rain};\text{dark})$. The result is positive interaction information $I(\text{rain};\text{dark};\text{cloud})$.

Negative interaction information

A car's engine can fail to start due to either a dead battery or a blocked fuel pump. Ordinarily, we assume that battery death and fuel pump blockage are independent events, $I(\text{blocked fuel};\text{dead battery})=0$. But knowing that the car fails to start, if an inspection shows the battery to be in good health, we can conclude that the fuel pump must be blocked. Therefore $I(\text{blocked fuel};\text{dead battery} \mid \text{engine fails})>0$, and the result is negative interaction information.
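The car example can be made concrete with a toy model. In the sketch below, the fault probabilities are hypothetical values chosen only for illustration, and the model assumes that the two faults are independent and that the engine fails exactly when at least one fault is present.

```python
from collections import defaultdict
from itertools import product
from math import log2

# Hypothetical fault probabilities, chosen only for illustration.
P_DEAD_BATTERY = 0.1
P_BLOCKED_FUEL = 0.1

# Joint distribution over (blocked_fuel, dead_battery, engine_fails).
# Assumption: the faults are independent, and the engine fails
# exactly when at least one of them is present.
joint = {}
for fuel, battery in product((0, 1), repeat=2):
    p_fuel = P_BLOCKED_FUEL if fuel else 1 - P_BLOCKED_FUEL
    p_battery = P_DEAD_BATTERY if battery else 1 - P_DEAD_BATTERY
    joint[(fuel, battery, int(fuel or battery))] = p_fuel * p_battery


def entropy(indices):
    """Entropy in bits of the marginal over the given coordinate indices."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in indices)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


FUEL, BATTERY, ENGINE = 0, 1, 2
mi = entropy([FUEL]) + entropy([BATTERY]) - entropy([FUEL, BATTERY])
cmi = (entropy([FUEL, ENGINE]) + entropy([BATTERY, ENGINE])
       - entropy([FUEL, BATTERY, ENGINE]) - entropy([ENGINE]))
interaction = mi - cmi

print(f"I(fuel; battery)          = {mi:.4f} bits")
print(f"I(fuel; battery | engine) = {cmi:.4f} bits")
print(f"I(fuel; battery; engine)  = {interaction:.4f} bits")
```

Under these assumptions the unconditional mutual information vanishes (up to rounding), the conditional mutual information is strictly positive, and the interaction information is therefore negative.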

Difficulty of interpretation

The possible negativity of interaction information can be the source of some confusion. Many authors have taken zero interaction information as a sign that three or more random variables do not interact, but this interpretation is wrong.[7]

To see how difficult interpretation can be, consider a set of eight independent, uniformly distributed binary variables $\{X_1,X_2,X_3,X_4,X_5,X_6,X_7,X_8\}$. Agglomerate these variables as follows:

\begin{align}
Y_1 &= \{X_1,X_2,X_3,X_4,X_5,X_6,X_7\} \\
Y_2 &= \{X_4,X_5,X_6,X_7\} \\
Y_3 &= \{X_5,X_6,X_7,X_8\}
\end{align}

Because the $Y_i$'s overlap each other (are redundant) on the three binary variables $\{X_5,X_6,X_7\}$, we would expect the interaction information $I(Y_1;Y_2;Y_3)$ to equal $3$ bits, which it does. However, consider now the agglomerated variables

\begin{align}
Y_1 &= \{X_1,X_2,X_3,X_4,X_5,X_6,X_7\} \\
Y_2 &= \{X_4,X_5,X_6,X_7\} \\
Y_3 &= \{X_5,X_6,X_7,X_8\} \\
Y_4 &= \{X_7,X_8\}
\end{align}

These are the same variables as before with the addition of $Y_4=\{X_7,X_8\}$. In this case, $I(Y_1;Y_2;Y_3;Y_4)$ is actually equal to $+1$ bit, indicating less redundancy. This is correct in the sense that

\begin{align}
I(Y_1;Y_2;Y_3;Y_4) &= I(Y_1;Y_2;Y_3) - I(Y_1;Y_2;Y_3 \mid Y_4) \\
 &= 3 - 2 \\
 &= 1
\end{align}

but it remains difficult to interpret.
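The values 3, 2 and 1 in this example can be reproduced with simple set arithmetic. The sketch below relies on one assumption, namely that the eight underlying bits are independent and uniformly distributed, so that the joint entropy of several groups equals the number of distinct bits in their union. The function name `co_information` is hypothetical.

```python
from itertools import combinations


def co_information(groups, given=frozenset()):
    """Interaction information (bits) among groups of independent fair bits.

    Each group is a set of bit labels. Because the bits are independent and
    uniform, the joint entropy of several groups is the number of distinct
    bits in their union; conditioning on `given` removes those bits.
    """
    total = 0
    for size in range(1, len(groups) + 1):
        sign = (-1) ** (size - 1)
        for subset in combinations(groups, size):
            union = set().union(*subset)
            total += sign * len(union - given)
    return total


Y1 = frozenset({1, 2, 3, 4, 5, 6, 7})
Y2 = frozenset({4, 5, 6, 7})
Y3 = frozenset({5, 6, 7, 8})
Y4 = frozenset({7, 8})

print(co_information([Y1, Y2, Y3]))            # 3
print(co_information([Y1, Y2, Y3], given=Y4))  # 2
print(co_information([Y1, Y2, Y3, Y4]))        # 1
```

In this special setting the interaction information is simply the number of bits common to all groups, which is why adding $Y_4$ reduces it from three bits to the single shared bit $X_7$.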

Notes and References

  1. Ting, Hu Kuo (January 1962). "On the Amount of Information". Theory of Probability & Its Applications. 7 (4): 439–447. doi:10.1137/1107041. ISSN 0040-585X.
  2. Wolf, David (May 1, 1996). The Generalization of Mutual Information as the Information between a Set of Variables: The Information Correlation Function Hierarchy and the Information Structure of Multi-Agent Systems. NASA Ames Research Center.
  3. Bell, Anthony (2003). "The co-information lattice". 4th Int. Symp. Independent Component Analysis and Blind Source Separation.
  4. Baudot, Pierre; Bennequin, Daniel (2015-05-13). "The Homological Nature of Entropy". Entropy. 17 (5): 3253–3318. doi:10.3390/e17053253. Bibcode:2015Entrp..17.3253B. ISSN 1099-4300.
  5. McGill, William J. (June 1954). "Multivariate information transmission". Psychometrika. 19 (2): 97–116. doi:10.1007/bf02289159. S2CID 126431489. ISSN 0033-3123.
  6. Yeung, R.W. (May 1991). "A new outlook on Shannon's information measures". IEEE Transactions on Information Theory. 37 (3): 466–474. doi:10.1109/18.79902. ISSN 0018-9448.
  7. Krippendorff, Klaus (August 2009). "Information of interactions in complex systems". International Journal of General Systems. 38 (6): 669–680. doi:10.1080/03081070902993160. S2CID 13923485. ISSN 0308-1079.
  8. Killian, Benjamin J.; Yundenfreund Kravitz, Joslyn; Gilson, Michael K. (2007-07-14). "Extraction of configurational entropy from molecular simulations via an expansion approximation". The Journal of Chemical Physics. 127 (2): 024107. doi:10.1063/1.2746329. Bibcode:2007JChPh.127b4107K. ISSN 0021-9606. PMC 2707031. PMID 17640119.
  9. LeVine, Michael V.; Perez-Aguilar, Jose Manuel; Weinstein, Harel (2014-06-18). "N-body Information Theory (NbIT) Analysis of Rigid-Body Dynamics in Intracellular Loop 2 of the 5-HT2A Receptor". arXiv:1406.4730 [q-bio.BM].
  10. "InfoTopo: Topological Information Data Analysis. Deep statistical unsupervised and supervised learning - File Exchange - Github". github.com/pierrebaudot/infotopopy/. Retrieved 26 September 2020.