Sparse identification of non-linear dynamics

Sparse identification of nonlinear dynamics (SINDy) is a data-driven algorithm for identifying the governing equations of a dynamical system from measurements.[1] Given a series of snapshots of a dynamical system and the corresponding time derivatives, SINDy performs a sparsity-promoting regression (such as LASSO or sparse Bayesian inference[2]) of the derivatives onto a library of nonlinear candidate functions of the snapshots to find the governing equations. This procedure relies on the assumption that most physical systems have only a few dominant terms that dictate the dynamics, given an appropriately selected coordinate system and high-quality training data.[3] It has been applied to identify the dynamics of fluids, based on proper orthogonal decomposition, as well as other complex dynamical systems, such as biological networks.[4]

Mathematical Overview

First, consider a dynamical system of the form

$$\dot{\mathbf{x}} = \frac{d}{dt}\mathbf{x}(t) = \mathbf{f}(\mathbf{x}(t)),$$

where $\mathbf{x}(t) \in \mathbb{R}^n$ is a state vector (snapshot) of the system at time $t$ and the function $\mathbf{f}(\mathbf{x}(t))$ defines the equations of motion and constraints of the system. The time derivative may be either prescribed or numerically approximated from the snapshots.

With $\mathbf{x}$ and $\dot{\mathbf{x}}$ sampled at $m$ equidistant points in time ($t_1, t_2, \cdots, t_m$), these can be arranged into matrices of the form

$$\mathbf{X} = \begin{bmatrix} \mathbf{x}^T(t_1) \\ \mathbf{x}^T(t_2) \\ \vdots \\ \mathbf{x}^T(t_m) \end{bmatrix} = \begin{bmatrix} x_1(t_1) & x_2(t_1) & \cdots & x_n(t_1) \\ x_1(t_2) & x_2(t_2) & \cdots & x_n(t_2) \\ \vdots & \vdots & \ddots & \vdots \\ x_1(t_m) & x_2(t_m) & \cdots & x_n(t_m) \end{bmatrix},$$

and similarly for $\dot{\mathbf{X}}$.
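As a minimal sketch of this step, the snippet below stacks equidistant snapshots into a matrix and approximates the derivative matrix with finite differences. The helper name `snapshot_matrices`, the toy trajectory, and the choice of NumPy's `np.gradient` (central differences in the interior, one-sided at the ends) are assumptions made for illustration; the article does not prescribe a differentiation scheme.

```python
import numpy as np

def snapshot_matrices(trajectory, dt):
    """Return (X, X_dot) for snapshots sampled every dt time units.

    Each row of `trajectory` is one snapshot x^T(t_i). The derivatives are
    approximated by finite differences along the time axis (an assumed,
    simple choice; noisy data usually needs a more robust scheme).
    """
    X = np.asarray(trajectory, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single state variable as one column
    X_dot = np.gradient(X, dt, axis=0)
    return X, X_dot

# Toy data (hypothetical): two state variables sampled at m = 200 points.
dt = 0.01
t = np.arange(200) * dt
states = np.column_stack([np.exp(-t), np.cos(t)])

X, X_dot = snapshot_matrices(states, dt)
print(X.shape, X_dot.shape)
```

Both matrices have one row per time sample and one column per state variable, matching the layout of the matrices above.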

Next, a library $\boldsymbol{\Theta}(\mathbf{X})$ of nonlinear candidate functions of the columns of $\mathbf{X}$ is constructed, which may include constant, polynomial, or more exotic functions (such as trigonometric and rational terms):

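$$\boldsymbol{\Theta}(\mathbf{X}) = \begin{bmatrix} | & | & | & | & & | & | & \\ \mathbf{1} & \mathbf{X} & \mathbf{X}^2 & \mathbf{X}^3 & \cdots & \sin(\mathbf{X}) & \cos(\mathbf{X}) & \cdots \\ | & | & | & | & & | & | & \end{bmatrix}$$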
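One possible way to assemble such a library is sketched below. The function `build_library`, its parameters, and the column naming are hypothetical. The sketch also assumes that the polynomial blocks contain every monomial of the given degree, including cross terms such as x1*x2, and that the trigonometric terms are applied to each state variable separately.

```python
import numpy as np
from itertools import combinations_with_replacement

def build_library(X, degree=2, include_trig=False):
    """Evaluate candidate functions on the columns of X (shape m x n).

    Returns the library matrix Theta (m x p) and a list of p column names.
    """
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    columns = [np.ones(m)]  # constant term
    names = ["1"]
    for d in range(1, degree + 1):
        # every monomial of total degree d, e.g. x1*x1, x1*x2, x2*x2
        for idx in combinations_with_replacement(range(n), d):
            columns.append(np.prod(X[:, list(idx)], axis=1))
            names.append("*".join(f"x{i + 1}" for i in idx))
    if include_trig:
        for i in range(n):
            columns.append(np.sin(X[:, i]))
            names.append(f"sin(x{i + 1})")
            columns.append(np.cos(X[:, i]))
            names.append(f"cos(x{i + 1})")
    return np.column_stack(columns), names

# Two snapshots of a two-variable state (toy numbers).
Theta, names = build_library(np.array([[1.0, 2.0], [3.0, 4.0]]), degree=2)
print(names)
print(Theta)
```

Even for this small case the library has six columns for two state variables. The column count grows quickly with the polynomial degree and the state dimension, which is why the regression step relies on sparsity.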

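The number of possible model structures from this library is combinatorially large. $\mathbf{f}(\mathbf{x}(t))$ is then replaced by the product of $\boldsymbol{\Theta}(\mathbf{X})$ and a matrix of coefficient vectors $\boldsymbol{\Xi} = [\boldsymbol{\xi}_1\ \boldsymbol{\xi}_2\ \cdots\ \boldsymbol{\xi}_n]$, where each column $\boldsymbol{\xi}_k$ determines the active terms in the $k$-th component of $\mathbf{f}(\mathbf{x}(t))$: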

$$\dot{\mathbf{X}} = \boldsymbol{\Theta}(\mathbf{X})\,\boldsymbol{\Xi}$$

Because only a few terms are expected to be active at each point in time, an assumption is made that $\mathbf{f}(\mathbf{x}(t))$ admits a sparse representation in $\boldsymbol{\Theta}(\mathbf{X})$. This then becomes an optimization problem of finding a sparse $\boldsymbol{\Xi}$ that best fits $\dot{\mathbf{X}}$. In other words, a parsimonious model is obtained by performing least squares regression on the system $\dot{\mathbf{X}} = \boldsymbol{\Theta}(\mathbf{X})\,\boldsymbol{\Xi}$ with sparsity-promoting ($L_1$) regularization

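$$\boldsymbol{\xi}_k = \underset{\boldsymbol{\xi}'_k}{\arg\min} \left\| \dot{\mathbf{X}}_k - \boldsymbol{\Theta}(\mathbf{X})\,\boldsymbol{\xi}'_k \right\|_2 + \lambda \left\| \boldsymbol{\xi}'_k \right\|_1,$$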

where $\dot{\mathbf{X}}_k$ is the $k$-th column of $\dot{\mathbf{X}}$ and $\lambda$ is a regularization parameter. Finally, the sparse set of $\boldsymbol{\xi}_k$ can be used to reconstruct the dynamical system:

$$\dot{x}_k = \boldsymbol{\Theta}(\mathbf{x})\,\boldsymbol{\xi}_k$$
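The regression and reconstruction steps can be sketched end to end on a toy problem. This is an illustration under stated assumptions, not a reference implementation. The solver `lasso_cd` is a plain coordinate-descent routine written for this example, and it minimizes the common LASSO objective with a squared residual, ½‖y − Θξ‖₂² + λ‖ξ‖₁, which differs slightly from the unsquared norm in the formula above. The four-term library, the harmonic-oscillator data with exact derivatives, and the value λ = 0.1 are likewise hypothetical choices.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(Theta, y, lam, n_iter=200):
    """Coordinate descent for 0.5*||y - Theta @ xi||_2^2 + lam*||xi||_1."""
    p = Theta.shape[1]
    xi = np.zeros(p)
    col_sq = np.sum(Theta ** 2, axis=0)
    residual = y - Theta @ xi
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            residual += Theta[:, j] * xi[j]  # remove column j's contribution
            rho = Theta[:, j] @ residual
            xi[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= Theta[:, j] * xi[j]  # add the updated contribution back
    return xi

def library(X):
    """Candidate functions [1, x1, x2, x1*x2], evaluated row-wise."""
    X = np.atleast_2d(X)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2])

# Snapshots of a harmonic oscillator (x1' = x2, x2' = -x1), two full periods.
t = np.linspace(0.0, 4.0 * np.pi, 400, endpoint=False)
X = np.column_stack([np.cos(t), -np.sin(t)])
X_dot = np.column_stack([-np.sin(t), -np.cos(t)])  # exact derivatives

# One sparse regression per state variable; column k of Xi is xi_k.
Theta = library(X)
Xi = np.column_stack(
    [lasso_cd(Theta, X_dot[:, k], lam=0.1) for k in range(X.shape[1])]
)

def identified_rhs(x):
    """Reconstructed dynamics: component k is Theta(x) @ xi_k."""
    return (library(x) @ Xi)[0]

print(np.round(Xi, 3))
print(identified_rhs(np.array([0.3, -0.7])))
```

In this noise-free setting each column of `Xi` keeps a single active term, with a magnitude slightly below 1 because the L1 penalty shrinks coefficients. On libraries with strongly correlated columns, or on noisy derivatives, the outcome depends much more on the choice of λ and of solver.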

References