Neural differential equation explained

In machine learning, a neural differential equation is a differential equation whose right-hand side is parametrized by the weights θ of a neural network.[1] In particular, a neural ordinary differential equation (neural ODE) is an ordinary differential equation of the form

\frac{d\mathbf{h}}{dt} = f_\theta(\mathbf{h}(t), t).

In classical neural networks, layers are arranged in a sequence indexed by natural numbers. In neural ODEs, however, layers form a continuous family indexed by positive real numbers. Specifically, the function

\mathbf{h} \colon \mathbb{R}_{\ge 0} \to \mathbb{R}^{n}

maps each positive index t to the state of the neural network at that layer.
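For concreteness, here is a minimal sketch of this setup, assuming PyTorch; the MLP architecture, the fixed-step Euler solver, and names such as ODEFunc and odeint_euler are illustrative choices rather than a prescribed implementation (in practice, adaptive solvers such as those in the torchdiffeq library accompanying [1] are typically preferred).

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Right-hand side f_theta(h(t), t) of the neural ODE; a small MLP
    is used here as an illustrative (not prescribed) architecture."""
    def __init__(self, dim: int = 2, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden),  # state h concatenated with time t
            nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, h: torch.Tensor, t: float) -> torch.Tensor:
        t_col = torch.full((h.shape[0], 1), t)  # time as an extra input feature
        return self.net(torch.cat([h, t_col], dim=1))

def odeint_euler(f, h0: torch.Tensor, T: float = 1.0, steps: int = 100):
    """Fixed-step explicit Euler integration of dh/dt = f(h, t) on [0, T]."""
    h, dt = h0, T / steps
    for k in range(steps):
        h = h + dt * f(h, k * dt)
    return h  # approximates h(T)

f_theta = ODEFunc()
h_in = torch.randn(8, 2)             # batch of eight 2-dimensional states
h_out = odeint_euler(f_theta, h_in)  # forward pass: solve the ODE to time T
```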

Neural ODEs can be understood as continuous-time control systems, where their ability to interpolate data can be interpreted in terms of controllability.[2]

Connection with residual neural networks

A neural ODE can be interpreted as a residual neural network with a continuum of layers rather than a discrete number of layers. Applying the Euler method with a unit time step to a neural ODE yields the forward propagation equation of a residual neural network:

\mathbf{h}_{\ell+1} = f_\theta(\mathbf{h}_\ell, \ell) + \mathbf{h}_\ell,

where ℓ indexes the layers of this residual neural network. While the forward propagation of a residual neural network is done by applying a sequence of transformations starting at the input layer, the forward propagation of a neural ODE is done by solving a differential equation. More precisely, the output \mathbf{h}_{\text{out}} associated to the input \mathbf{h}_{\text{in}} of the neural ODE is obtained by solving the initial value problem

\frac{d\mathbf{h}}{dt} = f_\theta(\mathbf{h}(t), t), \quad \mathbf{h}(0) = \mathbf{h}_{\text{in}},

and assigning the value \mathbf{h}(T) to \mathbf{h}_{\text{out}}.
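To see the correspondence concretely, the following sketch (using the same illustrative PyTorch setup as above, with the layer index fed to the network as an extra input feature) checks that one explicit Euler step with unit step size reproduces the residual-network update \mathbf{h}_{\ell+1} = f_\theta(\mathbf{h}_\ell, \ell) + \mathbf{h}_\ell exactly.

```python
import torch
import torch.nn as nn

dim = 2
# Residual branch f_theta(h, l); the architecture is an illustrative choice.
net = nn.Sequential(nn.Linear(dim + 1, 16), nn.Tanh(), nn.Linear(16, dim))

def f_theta(h: torch.Tensor, l: float) -> torch.Tensor:
    l_col = torch.full((h.shape[0], 1), l)  # layer index / time as a feature
    return net(torch.cat([h, l_col], dim=1))

h, l = torch.randn(8, dim), 3.0  # state at layer l

# Residual network forward rule: h_{l+1} = f_theta(h_l, l) + h_l
h_residual = f_theta(h, l) + h

# One explicit Euler step of dh/dt = f_theta(h, t) with unit step dt = 1,
# starting at time t = l:
h_euler = h + 1.0 * f_theta(h, l)

# The two updates coincide.
assert torch.allclose(h_residual, h_euler)
```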

Universal differential equations

In physics-informed contexts where additional information is known, a neural ODE can be combined with an existing first-principles model to build a physics-informed neural network model called a universal differential equation (UDE).[3] [4] [5] For instance, a UDE version of the Lotka-Volterra model can be written as[6]

\begin{align}
\frac{dx}{dt} &= \alpha x - \beta x y + f_\theta(x(t), y(t)), \\
\frac{dy}{dt} &= -\gamma y + \delta x y + g_\theta(x(t), y(t)),
\end{align}

where the terms f_\theta and g_\theta are correction terms parametrized by neural networks.
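As a sketch of how such a model might be set up (assuming PyTorch; the parameter values, network sizes, and the helper name ude_rhs are hypothetical), the mechanistic Lotka-Volterra terms are kept as-is and the neural networks supply only the additive corrections:

```python
import torch
import torch.nn as nn

# Known first-principles parameters of the Lotka-Volterra model
# (the numerical values are illustrative).
alpha, beta, gamma, delta = 1.1, 0.4, 0.4, 0.1

# Neural correction terms f_theta and g_theta; one small MLP per
# equation is an illustrative choice.
f_theta = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
g_theta = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))

def ude_rhs(state: torch.Tensor) -> torch.Tensor:
    """Right-hand side of the UDE: mechanistic Lotka-Volterra terms
    plus the learned corrections f_theta and g_theta."""
    x, y = state[..., 0:1], state[..., 1:2]
    dx = alpha * x - beta * x * y + f_theta(state)
    dy = -gamma * y + delta * x * y + g_theta(state)
    return torch.cat([dx, dy], dim=-1)

state = torch.tensor([[10.0, 5.0]])  # (x, y): prey and predator populations
print(ude_rhs(state))                # instantaneous rates (dx/dt, dy/dt)
```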


Notes and References

  1. Chen, Ricky T. Q.; Rubanova, Yulia; Bettencourt, Jesse; Duvenaud, David K. (2018). "Neural Ordinary Differential Equations". In Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K.; Cesa-Bianchi, N.; Garnett, R. (eds.). Advances in Neural Information Processing Systems. Vol. 31. Curran Associates, Inc. arXiv:1806.07366.
  2. Ruiz-Balet, Domènec; Zuazua, Enrique (2023). "Neural ODE Control for Classification, Approximation, and Transport". SIAM Review. 65 (3): 735–773. arXiv:2104.05278. doi:10.1137/21M1411433. ISSN 0036-1445.
  3. Rackauckas, Christopher; Ma, Yingbo; Martensen, Julius; Warner, Collin; Zubov, Kirill; Supekar, Rohit; Skinner, Dominic; Ramadhan, Ali; Edelman, Alan (2024). "Universal Differential Equations for Scientific Machine Learning". arXiv:2001.04385.
  4. Xiao, Tianbai; Frank, Martin (2023). "RelaxNet: A structure-preserving neural network to approximate the Boltzmann collision operator". Journal of Computational Physics. 490: 112317. arXiv:2211.08149. doi:10.1016/j.jcp.2023.112317.
  5. Plate, Christoph; Martensen, Carl Julius; Sager, Sebastian (2024). "Optimal Experimental Design for Universal Differential Equations". arXiv:2408.07143.
  6. Kidger, Patrick (2021). On Neural Differential Equations (PhD thesis). Oxford, United Kingdom: University of Oxford, Mathematical Institute.