Draft:Dual numbers for first order sensitivity analysis

Dual numbers, like complex numbers, are a subset of the hypercomplex numbers. A dual number consists of a real part and an imaginary part (also called the "dual" part), like a complex number, but follows different algebraic rules. In summary, whereas for complex numbers the imaginary unit i satisfies i^{2}=-1, for dual numbers the imaginary unit \epsilon satisfies \epsilon^{2}=0, with \epsilon\neq 0. For sensitivity analysis, however, the same numerical procedure is applied with dual numbers as with the complex Taylor series expansion (CTSE). Both methods give identical numerical results provided that a sufficiently small step size is used for CTSE, whereas any step size can be used with dual numbers.

Using dual numbers in place of complex numbers has advantages and disadvantages. The first advantage is that the dual-number method of computing sensitivities is independent of step size; a typical step size is therefore h=1. This is in contrast to CTSE, where a step size of h<10^{-10} is required. A second advantage is that dual numbers can also be used to obtain derivatives symbolically in languages that support symbolic computation, whereas CTSE can only compute numerical results unless a "Limit" operator is available and employed.

The first disadvantage is that dual numbers are not intrinsic to today's programming languages, so a support library must be provided. As a result, using dual numbers may be significantly slower than using complex numbers. A second disadvantage is that most engineers and scientists are not familiar with dual numbers. Therefore, a short introduction is provided.

Dual numbers are a subset of hypercomplex numbers, as shown in Figure x. They consist of a real part and an imaginary part and have the form a+b\epsilon, where a and b are real numbers and \epsilon denotes the imaginary unit. Here \epsilon is analogous to i for complex numbers, but it is more traditional to use \epsilon. However, contrary to complex numbers, \epsilon^{2}=0, with \epsilon\neq 0. The real and imaginary parts of a dual number can be extracted as Re(a+b\epsilon)=a and Im(a+b\epsilon)=b. Consider the Taylor series expansion of a function f of the dual number a+b\epsilon, expanded about a,

f(a+b\epsilon)=f(a)+f'(a)b\epsilon+\frac{f''(a)}{2}(b\epsilon)^{2}+\frac{f'''(a)}{6}(b\epsilon)^{3}+\cdots

where f' denotes the first derivative of f with respect to x, f'' the second derivative, and f^{(n)} the nth derivative. Utilizing the fact that \epsilon^{n}=0 for n\geq 2, the dual Taylor series is truncated as

f(a+b\epsilon)=f(a)+f'(a)b\epsilon

As we will see, we can use dual numbers to compute first order derivatives analogous to CTSE if we consider a perturbation by step size h along the imaginary axis. In this case, the Taylor series becomes

f(a+\epsilon h)=f(a)+f'(a)h\epsilon

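| | Complex | Dual |
|---|---|---|
| Format | a+bi | a+b\epsilon |
| Real part | a | a |
| Imaginary part | b | b |
| Imaginary unit | i | \epsilon |
| Imaginary unit squared | i^{2}=-1 | \epsilon^{2}=0 |
| f'(x) | \approx Im(f(x+ih))/h | = Im(f(x+h\epsilon))/h |
| Step size h | 10^{-308}<h<10^{-8} | Arbitrary, typically h=1 |
| Cauchy-Riemann matrix, general | \begin{pmatrix}a&-b\\b&a\end{pmatrix} | \begin{pmatrix}a&0\\b&a\end{pmatrix} |
| Cauchy-Riemann matrix, differentiation | \begin{pmatrix}a&-h\\h&a\end{pmatrix} | \begin{pmatrix}a&0\\h&a\end{pmatrix} |
| Type | Numerical | Symbolic or Numerical |

Table 1: Comparison of complex and dual numbers for differentiation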

Writing this result in terms of the generic evaluation point x, the first order derivative is obtained as

f'(x)=\frac{Im[f(x+\epsilon h)]}{h}

If we use a step size of h=1, the derivative is obtained as

f'(x)=Im[f(x+\epsilon)]

Notice that the derivative is given by an exact equality, not by an approximation as is the case for CTSE. As a result, the dual-step method is independent of step size; therefore, a step size of h=1 is often used.
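
The step-size independence can be illustrated with a short Python sketch. The `Dual` class, the test function `f`, and the helper names `dual_step` and `complex_step` are assumptions made for illustration only; they are not part of any standard library. Only addition and multiplication are implemented, which is enough for a polynomial.

```python
class Dual:
    """Minimal dual number re + im*eps with eps**2 == 0 (illustrative only)."""

    def __init__(self, re, im=0.0):
        self.re = re
        self.im = im

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.re + other.re, self.im + other.im)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps
        return Dual(self.re * other.re,
                    self.re * other.im + self.im * other.re)

    __rmul__ = __mul__


def f(x):
    # Test function f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2
    return x * x * x + 2 * x


def dual_step(func, x, h=1.0):
    # f'(x) = Im[f(x + eps*h)] / h, exact for any nonzero h
    return func(Dual(x, h)).im / h


def complex_step(func, x, h=1e-20):
    # CTSE: f'(x) ~ Im[f(x + i*h)] / h, accurate only for small h
    return func(complex(x, h)).imag / h


print(dual_step(f, 1.5, h=1.0))     # 8.75
print(dual_step(f, 1.5, h=1000.0))  # 8.75
print(complex_step(f, 1.5))         # approximately 8.75
print(complex_step(f, 1.5, h=1.0))  # 7.75, wrong because h is not small
```

For this cubic the complex-step error is exactly -h^{2}, which is why h=1 gives 7.75 instead of 8.75, while the dual-step result does not change with h.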

As shown in Section 4, dual numbers can also be defined in terms of a Cauchy-Riemann matrix of all real numbers. Hence, operations with dual numbers can be accomplished using matrices of all real numbers.
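
As a minimal sketch of the matrix form, the following Python code stores a+b\epsilon as the real 2x2 matrix with rows (a, 0) and (b, a) and multiplies two such matrices. The function names `to_matrix`, `matmul2`, and `from_matrix` are hypothetical helpers chosen for this example.

```python
def to_matrix(a, b):
    # Dual number a + b*eps as the real 2x2 matrix [[a, 0], [b, a]]
    return [[a, 0.0], [b, a]]


def matmul2(m, n):
    # Product of two 2x2 matrices stored as nested lists
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def from_matrix(m):
    # Recover (real part, dual part) from the first column
    return m[0][0], m[1][0]


# (2 + 3 eps)(5 + 7 eps) = 10 + (2*7 + 3*5) eps = 10 + 29 eps
print(from_matrix(matmul2(to_matrix(2, 3), to_matrix(5, 7))))

# eps is the matrix [[0, 0], [1, 0]]; its square is the zero matrix
print(matmul2(to_matrix(0, 1), to_matrix(0, 1)))
```

The matrix product reproduces the dual product, and the matrix for \epsilon squares to zero, so ordinary real matrix arithmetic carries the dual algebra.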

A comparison of using complex and dual numbers for differentiation is shown in Table 1. Note that both methods use the identical formula for differentiation with the sole difference being that the step size h must be small for complex numbers but it can be arbitrary for dual numbers.

1 Overview of dual numbers

The definitions of addition, subtraction, multiplication, and division of dual numbers are straightforward. Functions of dual numbers are also straightforward to compute using the Taylor series definition, see equation 1. Many of these properties are self-evident. Consider the following cases with the definitions D_{1}=a+b\epsilon and D_{2}=c+d\epsilon, where a, b, c, d, r, and n are real numbers.

1.1 Notation

The standard notation for dual numbers is (a+b\epsilon); however, it is also convenient to represent a dual number as dual(a;b). This longer form is especially useful in computer programs. Both formats will be used here.

1.2 Addition and subtraction

D_{1}+D_{2}=(a+b\epsilon)+(c+d\epsilon)=(a+c)+(b+d)\epsilon. Similarly, D_{1}-D_{2}=(a-c)+(b-d)\epsilon. In addition, the addition or subtraction of a real number with a dual number only affects the real portion of the dual number, e.g., r+D_{1}=(a+r)+b\epsilon.

1.3 Multiplication

D_{1}D_{2}=(a+b\epsilon)(c+d\epsilon)=ac+ad\epsilon+bc\epsilon+bd\epsilon^{2}. However, since \epsilon^{2}=0, D_{1}D_{2}=ac+(ad+bc)\epsilon. In addition, the multiplication of a real number with a dual number affects both the real and imaginary portions of the dual number, e.g., rD_{1}=(r+0\epsilon)(a+b\epsilon)=ra+rb\epsilon. That is, one can always consider a real number as a special case of a dual number with a zero imaginary part.

1.4 Negation of a dual number

Negation of a dual number is a special case of multiplication by a real number, -D_{1}=-1(a+b\epsilon)=(-a-b\epsilon).

1.5 Conjugate of a dual number

The conjugate of a dual number is analogous to that of a complex number, in particular, \bar{D}_{1}=a-b\epsilon.

1.6 Division of a dual number by a dual number

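The division of two dual numbers is facilitated through the use of the dual conjugate, \bar{D}. In this case D_{1}\bar{D}_{1}=(a+b\epsilon)(a-b\epsilon)=a^{2}+(ab-ab)\epsilon=a^{2}.

Division is defined as follows: D_{1}/D_{2}=\frac{a+b\epsilon}{c+d\epsilon}=\frac{a+b\epsilon}{c+d\epsilon}\cdot\frac{c-d\epsilon}{c-d\epsilon}=\frac{ac+(bc-ad)\epsilon}{c^{2}}=\left(\frac{a}{c}\right)+\left(\frac{bc-ad}{c^{2}}\right)\epsilon. Note that division by a dual number with zero real part, that is, of the form (0+d\epsilon), is not defined because c^{2}=0.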

1.7 Division of a dual number by a real number

The division of a dual number by a real number can be defined as a special case of a dual number divided by a dual number with d=0. Therefore, \frac{a+b\epsilon}{c}=\left(\frac{a}{c}\right)+\left(\frac{b}{c}\right)\epsilon.

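| Operation | Expression | Dual result |
|---|---|---|
| Addition | D_{1}\pm D_{2} | (a\pm c)+(b\pm d)\epsilon |
| Multiplication | D_{1}D_{2} | ac+(ad+bc)\epsilon |
| Division | D_{1}/D_{2} | \left(\frac{a}{c}\right)+\left(\frac{bc-ad}{c^{2}}\right)\epsilon |
| Reciprocal | 1/D_{1} | \left(\frac{1}{a}\right)+\left(\frac{-b}{a^{2}}\right)\epsilon |
| Dual to a dual power | D_{1}^{D_{2}} | a^{c}+a^{c-1}(ad\ln(a)+cb)\epsilon |
| Dual to a real power | D_{1}^{n} | a^{n}+nba^{n-1}\epsilon |
| Real to a dual power | a^{D_{2}} | a^{c}+a^{c}d\ln(a)\epsilon |

Table 2: Summary of dual operations with D_{1}=a+b\epsilon, D_{2}=c+d\epsilon, and a, b, c, d, and n as real numbers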
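
Because dual numbers are not built into most languages, a small support class is needed. The following Python sketch implements the arithmetic rows of Table 2 under stated assumptions: the class name `Dual`, the attribute names `a` and `b`, and the helper `_lift` are illustrative choices, and no attempt is made at performance or completeness.

```python
class Dual:
    """Sketch of a dual number a + b*eps; all names here are illustrative."""

    def __init__(self, a, b=0.0):
        self.a = a  # real part
        self.b = b  # dual (imaginary) part

    @staticmethod
    def _lift(x):
        # Treat a real number r as the dual number r + 0*eps
        return x if isinstance(x, Dual) else Dual(x, 0.0)

    def __add__(self, other):
        o = Dual._lift(other)
        return Dual(self.a + o.a, self.b + o.b)

    def __sub__(self, other):
        o = Dual._lift(other)
        return Dual(self.a - o.a, self.b - o.b)

    def __neg__(self):
        return Dual(-self.a, -self.b)

    def conjugate(self):
        return Dual(self.a, -self.b)

    def __mul__(self, other):
        o = Dual._lift(other)
        # ac + (ad + bc) eps
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)

    def __truediv__(self, other):
        o = Dual._lift(other)
        if o.a == 0:
            raise ZeroDivisionError("divisor has zero real part")
        # a/c + ((bc - ad)/c^2) eps
        return Dual(self.a / o.a, (self.b * o.a - self.a * o.b) / o.a ** 2)

    __radd__ = __add__
    __rmul__ = __mul__

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __rtruediv__(self, other):
        return Dual._lift(other) / self

    def __eq__(self, other):
        o = Dual._lift(other)
        return self.a == o.a and self.b == o.b

    def __repr__(self):
        return f"dual({self.a}; {self.b})"


D1 = Dual(2.0, 3.0)
D2 = Dual(4.0, 5.0)
print(D1 * D2)  # dual(8.0; 22.0)
print(D1 / D2)  # dual(0.5; 0.125)
print(1 / D1)   # dual(0.5; -0.75)
```

The reflected operators (`__radd__`, `__rmul__`, and so on) let real numbers appear on either side of an operation, mirroring the statement that a real number is a dual number with a zero imaginary part.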

2 Functions of dual numbers

Functions of dual numbers can be determined in a straightforward manner using the Taylor series definition of a dual number. This definition is f(a+b\epsilon)=f(a)+bf'(a)\epsilon. Notice that the function f and its derivative are only evaluated at the real number a. Several examples are shown in Table 3. Construction of functions of dual numbers is straightforward given the function and its derivative, both evaluated at the real variable. For example, to develop a dual log function, one needs the log and its derivative evaluated at the parameter a. This procedure gives the function \ln(a+b\epsilon)=\ln(a)+\frac{b}{a}\epsilon.

Note that it may be convenient to provide a numerical derivative instead of an analytical function. For example, one could use CTSE to compute the derivative of the gamma function, that is, \Gamma(a+b\epsilon)=\Gamma(a)+b\,\frac{Im(\Gamma(a+ih))}{h}\epsilon. In this way, a formal symbolic derivative is not required.

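| Function | Mathematical expression |
|---|---|
| Abs(a+b\epsilon) | Abs(a)+b\,sign(a)\epsilon |
| \sin(a+b\epsilon) | \sin(a)+b\cos(a)\epsilon |
| \cos(a+b\epsilon) | \cos(a)-b\sin(a)\epsilon |
| \sinh(a+b\epsilon) | \sinh(a)+b\cosh(a)\epsilon |
| \cosh(a+b\epsilon) | \cosh(a)+b\sinh(a)\epsilon |
| \sqrt{a+b\epsilon} | \sqrt{a}+\frac{b}{2\sqrt{a}}\epsilon |
| \ln(a+b\epsilon) | \ln(a)+\frac{b}{a}\epsilon |
| e^{a+b\epsilon} | e^{a}+be^{a}\epsilon |
| \sin^{-1}(a+b\epsilon) | \sin^{-1}(a)+\frac{b}{\sqrt{1-a^{2}}}\epsilon |

Table 3: Examples of dual functions evaluated at the dual number a+b\epsilon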
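
The construction f(a+b\epsilon)=f(a)+bf'(a)\epsilon can be sketched in Python as a helper that builds a dual-valued function from a real function and its derivative. The names `Dual`, `lift`, and the `d`-prefixed functions are hypothetical and exist only for this illustration; the real functions come from the standard `math` module.

```python
import math


class Dual:
    """Bare container for a + b*eps (illustrative only)."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b


def lift(f, df):
    """Build a dual-valued function from a real function f and its derivative df."""
    def dual_f(d):
        # f(a + b*eps) = f(a) + b*f'(a)*eps; f and df see only the real part a
        return Dual(f(d.a), d.b * df(d.a))
    return dual_f


# A few rows of Table 3
dsin = lift(math.sin, math.cos)
dcos = lift(math.cos, lambda a: -math.sin(a))
dexp = lift(math.exp, math.exp)
dlog = lift(math.log, lambda a: 1.0 / a)
dsqrt = lift(math.sqrt, lambda a: 0.5 / math.sqrt(a))
dasin = lift(math.asin, lambda a: 1.0 / math.sqrt(1.0 - a * a))

y = dlog(Dual(2.0, 3.0))
print(y.a, y.b)  # ln(2) and 3/2
```

Any new function can be added the same way, provided its derivative at the real part is available, whether analytically or numerically.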

2.1 Dual raised to a dual power

A dual number raised to a dual power can be determined using the exponential and logarithm functions of dual numbers. Consider D_{1}^{D_{2}}=(a+b\epsilon)^{(c+d\epsilon)}. This situation can be addressed using the formula analogous to that for real numbers, x^{y}=e^{y\ln(x)}. The end result is D_{1}^{D_{2}}=e^{D_{2}\ln(D_{1})}. Substituting \ln(D_{1})=\ln(a)+\frac{b}{a}\epsilon, we obtain D_{1}^{D_{2}}=a^{c}+a^{c}(d\ln(a)+cb/a)\epsilon=a^{c}+a^{c-1}(ad\ln(a)+cb)\epsilon.

2.2 Dual raised to a real power

A dual number raised to a real power can be determined as a special case of a dual number raised to a dual power. If we use the notation (a+b\epsilon)^{n}, then c=n and d=0. The result is (a+b\epsilon)^{n}=a^{n}+nba^{n-1}\epsilon. For example, (a+b\epsilon)^{2}=a^{2}+2ab\epsilon.

2.3 Real raised to a dual number

This case can be considered a special case of a dual raised to a dual power with b=0. The result is a^{c+d\epsilon}=a^{c}+a^{c}d\ln(a)\epsilon.
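
The three power rules can be checked with one short Python function. This is a sketch that assumes a>0 so that \ln(a) is real; the function name `dual_pow` and its tuple return value are choices made for this example.

```python
import math


def dual_pow(a, b, c, d):
    """(a + b*eps)**(c + d*eps) as (real part, dual part), assuming a > 0."""
    real = a ** c
    dual = a ** (c - 1) * (a * d * math.log(a) + c * b)
    return real, dual


# Dual to a real power (d = 0): (3 + eps)^2 = 9 + 6 eps
print(dual_pow(3.0, 1.0, 2.0, 0.0))

# Real to a dual power (b = 0): 2^(3 + eps) = 8 + 8 ln(2) eps
print(dual_pow(2.0, 0.0, 3.0, 1.0))

# Dual to a dual power with a = c = x and b = d = 1 differentiates x^x,
# whose derivative is x^x (ln(x) + 1)
x = 2.0
print(dual_pow(x, 1.0, x, 1.0)[1], x ** x * (math.log(x) + 1.0))
```

Setting d=0 or b=0 in the general formula recovers the two special cases, so a single routine covers all three subsections.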

3 Symbolic derivative examples using elementary functions

An attractive feature of dual numbers is that they can be used to compute both symbolic and numerical derivatives. While their primary application is within numerical algorithms and programs, symbolic derivatives are often useful for learning and exploratory purposes. This is especially true when using dual numbers with computer algebra systems. In addition, as shown in Section XXX, symbolic derivatives allow one to compute mixed and higher order derivatives using dual numbers. Of course, sophisticated computer algebra systems already exist for computing symbolic derivatives of arbitrary order.

The use of dual numbers to compute first order derivatives can be easily demonstrated using simple functions. For computing derivatives, the imaginary component of the dual argument is the step size h. In this case, unlike CTSE, we do not need the step size to approach zero, that is, we do not need h\rightarrow 0. In fact, we will see that it is convenient to use h=1. In the examples below, we use a=x and b=h=1. That is, to compute the derivative of the function f(x), we replace x with x+\epsilon and evaluate f'(x)=Im(f(x+\epsilon)).

Example: f(x)=x^{2}

Consider the function f(x)=x^{2}. Then f(x+\epsilon)=(x+\epsilon)^{2}=x^{2}+2x\epsilon, and f'(x)=Im(f(x+\epsilon))=2x.

Example: f(x)=x^{3}

Consider the function f(x)=x^{3}. Then f(x+\epsilon)=(x+\epsilon)^{3}=(x^{2}+2x\epsilon)(x+\epsilon)=x^{3}+(x^{2}+2x^{2})\epsilon+2x\epsilon^{2}=x^{3}+3x^{2}\epsilon. Then f'(x)=Im(f(x+\epsilon))=3x^{2}.

Example: f(x)=e^{x}

Consider the function f(x)=e^{x}. This is a special case of a dual raised to a dual power. In particular, we have (a+b\epsilon)^{c+d\epsilon}=a^{c}+a^{c-1}(ad\ln(a)+cb)\epsilon with a=e, b=0, c=x, and d=1. The result is

f(x+\epsilon)=e^{x+\epsilon}=e^{x}+e^{x}(1)\ln(e)\epsilon=e^{x}+e^{x}\epsilon. Then f'(x)=Im(f(x+\epsilon))=e^{x}.

Example: f(x)=\sin(x)

Consider the function f(x)=\sin(x). Using the definition shown in Table 3 with a=x and b=h=1, \sin(x+\epsilon)=\sin(x)+\cos(x)\epsilon, so f'(x)=\cos(x).

Example: f(x)=\cos(x)

Consider the function f(x)=\cos(x). Using the definition shown in Table 3 with a=x and b=h=1, \cos(x+\epsilon)=\cos(x)-\sin(x)\epsilon, so f'(x)=-\sin(x).

Example: f(x)=\tan(x)

Consider the function f(x)=\tan(x).

f(x+\epsilon) =\tan(x+\epsilon)

= \frac{\sin(x+\epsilon)}{\cos(x+\epsilon)}=\frac{\sin(x)+\cos(x)\epsilon}{\cos(x)-\sin(x)\epsilon}=\frac{\sin(x)+\cos(x)\epsilon}{\cos(x)-\sin(x)\epsilon}\cdot\frac{\cos(x)+\sin(x)\epsilon}{\cos(x)+\sin(x)\epsilon}

= \frac{\sin(x)\cos(x)+\sin^{2}(x)\epsilon+\cos^{2}(x)\epsilon+\sin(x)\cos(x)\epsilon^{2}}{\cos^{2}(x)}

= \frac{\sin(x)\cos(x)+(\sin^{2}(x)+\cos^{2}(x))\epsilon}{\cos^{2}(x)}=\frac{\sin(x)}{\cos(x)}+\frac{1}{\cos^{2}(x)}\epsilon

Hence, f'(x)=\frac{1}{\cos^{2}(x)}=\sec^{2}(x).
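
The same quotient can be evaluated numerically. The Python sketch below represents each dual number as a (real, dual) tuple and applies the division rule to \sin(x+\epsilon) and \cos(x+\epsilon); the function names `dual_div` and `dual_tan` are hypothetical.

```python
import math


def dual_div(n, m):
    # (a + b eps)/(c + d eps) = a/c + ((bc - ad)/c^2) eps
    (a, b), (c, d) = n, m
    return a / c, (b * c - a * d) / c ** 2


def dual_tan(x):
    s = (math.sin(x), math.cos(x))    # sin(x + eps)
    c = (math.cos(x), -math.sin(x))   # cos(x + eps)
    return dual_div(s, c)


value, derivative = dual_tan(0.7)
print(value, math.tan(0.7))                 # both are tan(0.7)
print(derivative, 1.0 / math.cos(0.7) ** 2) # both are 1/cos^2(0.7)
```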

Example: f(x)=\sinh(x)

Consider the function f(x)=\sinh(x)=\frac{1}{2}(e^{x}-e^{-x}).

f(x+\epsilon) =\sinh(x+\epsilon)=\frac{1}{2}(e^{x+\epsilon}-e^{-(x+\epsilon)})=\frac{1}{2}((e^{x}+e^{x}\epsilon)-(e^{-x}-e^{-x}\epsilon))

= \frac{1}{2}(e^{x}-e^{-x})+\frac{1}{2}(e^{x}+e^{-x})\epsilon

= \sinh(x)+\cosh(x)\epsilon

Hence, f'(x)=\cosh(x).

Example: f(x)=\cosh(x)

Consider the function f(x)=\cosh(x)=\frac{1}{2}(e^{x}+e^{-x}).

f(x+\epsilon) =\cosh(x+\epsilon)=\frac{1}{2}(e^{x+\epsilon}+e^{-(x+\epsilon)})=\frac{1}{2}((e^{x}+e^{x}\epsilon)+(e^{-x}-e^{-x}\epsilon))

= \frac{1}{2}(e^{x}+e^{-x})+\frac{1}{2}(e^{x}-e^{-x})\epsilon

= \cosh(x)+\sinh(x)\epsilon

Hence, f'(x)=\sinh(x).

Example: f(x)=xe^{x}

Consider the function f(x)=xe^{x}.

f(x+\epsilon) =(x+\epsilon)e^{x+\epsilon}=(x+\epsilon)(e^{x}+e^{x}\epsilon)

= xe^{x}+(xe^{x}+e^{x})\epsilon+e^{x}\epsilon^{2}

= xe^{x}+(xe^{x}+e^{x})\epsilon

Hence, f'(x)=(x+1)e^{x}.

Example: f(x)=\frac{1}{x}

Consider the function f(x)=\frac{1}{x}.

f(x+\epsilon) =\frac{1}{x+\epsilon}=\frac{1}{x+\epsilon}\cdot\frac{x-\epsilon}{x-\epsilon}

= \frac{x-\epsilon}{x^{2}}=\frac{1}{x}-\frac{1}{x^{2}}\epsilon

Hence, f'(x)=-\frac{1}{x^{2}}.

Example: f(x)=\sin(2x^{2})

Consider the function f(x)=\sin(2x^{2}).

f(x+\epsilon) =\sin(2(x+\epsilon)^{2})=\sin(2(x^{2}+2x\epsilon+\epsilon^{2}))=\sin(2x^{2}+4x\epsilon)

Using the fact that \sin(a+b\epsilon)=\sin(a)+b\cos(a)\epsilon, with a=2x^{2} and b=4x, this yields f(x+\epsilon)=\sin(2x^{2})+4x\cos(2x^{2})\epsilon. Hence, f'(x)=4x\cos(2x^{2}).
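
In a program, the chain rule in this example does not need to be coded explicitly: it falls out of composing dual multiplication with a dual sine. The following Python sketch shows this under the assumption of a minimal `Dual` class with only multiplication; the names `Dual`, `dsin`, and `derivative` are illustrative.

```python
import math


class Dual:
    """Minimal dual number a + b*eps supporting multiplication only."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __mul__(self, other):
        o = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)

    __rmul__ = __mul__


def dsin(d):
    # sin(a + b eps) = sin(a) + b cos(a) eps
    return Dual(math.sin(d.a), d.b * math.cos(d.a))


def f(x):
    # f(x) = sin(2 x^2); the inner product carries the dual part 4x
    return dsin(2 * x * x)


def derivative(func, x):
    # f'(x) = Im(f(x + eps))
    return func(Dual(x, 1.0)).b


x0 = 0.3
print(derivative(f, x0), 4 * x0 * math.cos(2 * x0 ** 2))
```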

Example: f(x)=\sin(x)/x

Consider the function f(x)=\sin(x)/x.

f(x+\epsilon) =\frac{\sin(x+\epsilon)}{x+\epsilon}=\frac{\sin(x+\epsilon)}{x+\epsilon}\cdot\frac{x-\epsilon}{x-\epsilon}

= \frac{(\sin(x)+\cos(x)\epsilon)(x-\epsilon)}{x^{2}}=\frac{1}{x^{2}}(x\sin(x)+(-\sin(x)+x\cos(x))\epsilon-\cos(x)\epsilon^{2})

= \frac{1}{x^{2}}(x\sin(x)+(-\sin(x)+x\cos(x))\epsilon)

= \frac{\sin(x)}{x}+\frac{-\sin(x)+x\cos(x)}{x^{2}}\epsilon=\frac{\sin(x)}{x}+\left(-\frac{\sin(x)}{x^{2}}+\frac{\cos(x)}{x}\right)\epsilon

Hence, f'(x)=-\frac{\sin(x)}{x^{2}}+\frac{\cos(x)}{x}.

Example: f(x)=\sqrt{x}

Consider the function f(x)=\sqrt{x}. This function is a special case of a dual number raised to a real power. We can use the relationship (a+b\epsilon)^{c}=a^{c}+cba^{c-1}\epsilon, with a=x, b=1, and c=1/2, to yield

f(x+\epsilon) =(x+\epsilon)^{1/2}=x^{1/2}+\frac{1}{2}x^{-1/2}\epsilon=\sqrt{x}+\frac{1}{2\sqrt{x}}\epsilon

Hence, f'(x)=\frac{1}{2\sqrt{x}}.