Polynomial evaluation

Revision as of 21:49, 27 September 2024 by imported>XDanielx (Matrix polynomials)


In mathematics and computer science, polynomial evaluation refers to computing the value of a polynomial when its indeterminates are replaced by given values. For example, evaluating the polynomial $P(x_1,x_2)=2x_1x_2+x_1^3+4$ at $x_1=2$, $x_2=3$ consists of computing $P(2,3)=2\cdot 2\cdot 3+2^3+4=24$.

To evaluate the univariate polynomial $a_nx^n+a_{n-1}x^{n-1}+\cdots+a_0$, the most naive method would use $n$ multiplications to compute $a_nx^n$, $n-1$ multiplications to compute $a_{n-1}x^{n-1}$, and so on, for a total of $\tfrac{n(n+1)}{2}$ multiplications and $n$ additions. Using better methods, such as Horner's rule, this can be reduced to $n$ multiplications and $n$ additions. If some preprocessing is allowed, even more savings are possible.
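The multiplication count of the naive method can be checked with a small sketch. The function name `naive_eval` and the coefficient layout (`coeffs[i]` holds $a_i$) are assumptions made for this illustration only.

```python
def naive_eval(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n term by term.

    coeffs[i] is a_i (illustrative layout). Returns the value together
    with the number of multiplications performed.
    """
    total = coeffs[0]
    mults = 0
    for i in range(1, len(coeffs)):
        term = coeffs[i]
        for _ in range(i):      # i multiplications to form a_i * x^i
            term *= x
            mults += 1
        total += term
    return total, mults


# Degree 3: expect 3*4/2 = 6 multiplications.
value, mults = naive_eval([4, 3, 2, 1], 2)
```

For the cubic $4+3x+2x^2+x^3$ at $x=2$ this returns the value 26 after 6 multiplications, matching $n(n+1)/2$ for $n=3$.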

Background

This problem arises frequently in practice. In computational geometry, polynomials are used to compute function approximations using Taylor polynomials. In cryptography and hash tables, polynomials are used to compute k-independent hashing.

In the former case, polynomials are evaluated using floating-point arithmetic, which is not exact. Thus different schemes for the evaluation will, in general, give slightly different answers. In the latter case, the polynomials are usually evaluated in a finite field, in which case the answers are always exact.

General methods

Horner's rule


Horner's method evaluates a polynomial using repeated bracketing: $a_0+a_1x+a_2x^2+a_3x^3+\cdots+a_nx^n=a_0+x(a_1+x(a_2+x(a_3+\cdots+x(a_{n-1}+xa_n)\cdots)))$. This method reduces the number of multiplications and additions to just $n$ each.

Horner's method is so common that a dedicated instruction, the "multiply–accumulate operation", has been added to many computer processors; it performs the multiplication and the addition in one combined step.
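As a minimal sketch of the bracketing described above, assuming the coefficients are stored lowest degree first (`coeffs[i]` holds $a_i$; the function name `horner` is chosen for this example):

```python
def horner(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n with n multiplications.

    coeffs[i] is a_i (illustrative layout); coeffs must be non-empty.
    """
    result = coeffs[-1]                  # start from a_n
    for a in reversed(coeffs[:-1]):      # a_{n-1}, ..., a_0
        result = result * x + a          # one multiply and one add per step
    return result


print(horner([4, 3, 2, 1], 2))  # 4 + 3*2 + 2*4 + 8 = 26
```

Each loop iteration is exactly one multiply followed by one add, which is the pattern a multiply-accumulate instruction fuses.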

Multivariate

If the polynomial is multivariate, Horner's rule can be applied recursively over some ordering of the variables. E.g.

$$P(x,y)=4+x+2xy+2x^2y+x^2y^2$$

can be written as

$$P(x,y)=4+x(1+y(2)+x(y(2+y)))\quad\text{or}\quad P(x,y)=4+x+y(x(2+x(2))+y(x^2)).$$

An efficient version of this approach was described by Carnicer and Gasca.[1]
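The recursive idea can be sketched as follows. This is a plain nested Horner evaluation, not the Carnicer and Gasca algorithm; the nested-list representation and the name `horner_nested` are assumptions made for illustration.

```python
def horner_nested(coeffs, point):
    """Recursive Horner evaluation of a multivariate polynomial.

    Illustrative representation: for variables (x, y, ...), coeffs[i] is
    the coefficient of x^i, itself a nested list in the remaining
    variables; with no variables left, coeffs is a plain number.
    """
    if not point:
        return coeffs
    x, rest = point[0], point[1:]
    result = 0
    for c in reversed(coeffs):
        # Horner step in x; each coefficient is evaluated recursively.
        result = result * x + horner_nested(c, rest)
    return result


# P(x, y) = 4 + x + 2xy + 2x^2 y + x^2 y^2, grouped by powers of x:
# x^0: 4,   x^1: 1 + 2y,   x^2: 2y + y^2
P = [[4], [1, 2], [0, 2, 1]]
print(horner_nested(P, (2, 3)))  # 4 + 2 + 12 + 24 + 36 = 78
```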

Estrin's scheme


While it is not possible to do less computation than Horner's rule (without preprocessing), on modern computers the order of evaluation can matter a great deal for computational efficiency. A method known as Estrin's scheme computes a univariate polynomial in a tree-like pattern:

$$P(x)=(a_0+a_1x)+(a_2+a_3x)x^2+((a_4+a_5x)+(a_6+a_7x)x^2)x^4.$$

Combined with exponentiation by squaring, this allows parallelizing the computation.
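A sequential sketch of the tree pattern is given below. It assumes a non-empty coefficient list stored lowest degree first; the name `estrin` is chosen for this example. The pairwise combinations inside each level do not depend on one another, which is where a parallel or vectorized implementation would gain.

```python
def estrin(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n by pairwise combination.

    coeffs[i] is a_i (illustrative layout); coeffs must be non-empty.
    """
    level = list(coeffs)
    power = x                       # holds x^(2^k) at tree level k
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)         # pad so every entry has a partner
        # Combine neighbours: (c_0 + c_1*power), (c_2 + c_3*power), ...
        level = [level[i] + level[i + 1] * power
                 for i in range(0, len(level), 2)]
        power = power * power       # squaring gives the next power
    return level[0]


print(estrin([4, 3, 2, 1], 2))  # 26, same as Horner's method
```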

Evaluation with preprocessing

Arbitrary polynomials can be evaluated with fewer operations than Horner's rule requires if we first "preprocess" the coefficients $a_n,\dots,a_0$.

An example was first given by Motzkin[2] who noted that

$$P(x)=x^4+a_3x^3+a_2x^2+a_1x+a_0$$

can be written as

$$y=(x+\beta_0)x+\beta_1,\qquad P(x)=(y+x+\beta_2)y+\beta_3,$$

where the values $\beta_0,\dots,\beta_3$ are computed in advance, based on $a_0,\dots,a_3$. Motzkin's method uses just 3 multiplications compared to Horner's 4.

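The values for each $\beta_i$ can be easily computed by expanding $P(x)$ and equating the coefficients:

$$\beta_0=\tfrac12(a_3-1),\quad z=a_2-\beta_0(\beta_0+1),\quad \beta_1=a_1-\beta_0z,\quad \beta_2=z-2\beta_1,\quad \beta_3=a_0-\beta_1(\beta_1+\beta_2).$$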

Example

To compute the Taylor expansion $\exp(x)\approx 1+x+x^2/2+x^3/6+x^4/24$, we can upscale by a factor of 24, apply the above steps, and scale back down. That gives us the three-multiplication computation

$$y=(x+1.5)x+11.625,\qquad P(x)=(y+x-15)y/24+2.63477.$$

This improves on the equivalent Horner form, $P(x)=1+x(1+x(1/2+x(1/6+x/24)))$, by one multiplication.
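The preprocessing formulas and the numbers in this example can be reproduced with a short sketch. The function names `motzkin_preprocess` and `motzkin_eval` are hypothetical, chosen for this illustration; the code handles only monic quartics, as in the formulas above.

```python
def motzkin_preprocess(a0, a1, a2, a3):
    """Compute (beta0, beta1, beta2, beta3) for x^4 + a3 x^3 + a2 x^2 + a1 x + a0."""
    b0 = (a3 - 1) / 2
    z = a2 - b0 * (b0 + 1)
    b1 = a1 - b0 * z
    b2 = z - 2 * b1
    b3 = a0 - b1 * (b1 + b2)
    return b0, b1, b2, b3


def motzkin_eval(betas, x):
    """Evaluate the preprocessed quartic with two multiplications."""
    b0, b1, b2, b3 = betas
    y = (x + b0) * x + b1
    return (y + x + b2) * y + b3


# 24 * (1 + x + x^2/2 + x^3/6 + x^4/24) = x^4 + 4x^3 + 12x^2 + 24x + 24
betas = motzkin_preprocess(24, 24, 12, 4)
print(betas)                          # (1.5, 11.625, -15.0, 63.234375)
print(motzkin_eval(betas, 1.0) / 24)  # close to 1 + 1 + 1/2 + 1/6 + 1/24
```

Dividing $\beta_3=63.234375$ by 24 gives the constant 2.63477 in the formula above; the division by 24 accounts for the third multiplication.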

Some general methods include the Knuth–Eve algorithm and the Rabin–Winograd algorithm.[3]

Multipoint evaluation

Evaluation of a degree-$n$ polynomial $P(x)$ at multiple points $x_1,\dots,x_m$ can be done with $mn$ multiplications by using Horner's method $m$ times. Using the above preprocessing approach, this can be reduced by a factor of two; that is, to $mn/2$ multiplications.

However, it is possible to do better and reduce the time requirement to just $O((n+m)\log^2(n+m))$.[4] Taking $m=n$ for simplicity, the idea is to define two polynomials that are zero on the first and second half of the points, respectively: $m_0(x)=(x-x_1)\cdots(x-x_{n/2})$ and $m_1(x)=(x-x_{n/2+1})\cdots(x-x_n)$. We then compute $R_0=P\bmod m_0$ and $R_1=P\bmod m_1$ using the polynomial remainder theorem, which can be done in $O(n\log n)$ time using a fast Fourier transform. This means $P(x)=Q_0(x)m_0(x)+R_0(x)$ and $P(x)=Q_1(x)m_1(x)+R_1(x)$ by construction, for some quotients $Q_0$ and $Q_1$, where $R_0$ and $R_1$ are polynomials of degree at most $n/2$. Because of how $m_0$ and $m_1$ were defined, we have

$$R_0(x_i)=P(x_i)\ \text{for } i\le n/2\quad\text{and}\quad R_1(x_i)=P(x_i)\ \text{for } i>n/2.$$

Thus to compute $P$ on all $n$ of the $x_i$, it suffices to compute the smaller polynomials $R_0$ and $R_1$ on each half of the points. This gives us a divide-and-conquer algorithm with $T(n)=2T(n/2)+n\log n$, which implies $T(n)=O(n(\log n)^2)$ by the master theorem.
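The recursion can be sketched as below. To stay short, this version uses schoolbook polynomial multiplication and division instead of FFT-based arithmetic, so it illustrates the structure of the algorithm but does not reach the stated running time. All function names are hypothetical, and polynomials are stored as coefficient lists, lowest degree first.

```python
def poly_mul(a, b):
    """Schoolbook product of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def vanishing(points):
    """Monic polynomial (x - x_1)...(x - x_k) that is zero on all points."""
    m = [1]
    for x in points:
        m = poly_mul(m, [-x, 1])
    return m


def poly_mod_monic(p, m):
    """Remainder of p modulo a monic polynomial m of degree >= 1."""
    r = list(p)
    d = len(m) - 1
    for i in range(len(r) - 1, d - 1, -1):
        c = r[i]
        if c:
            for j in range(d + 1):      # subtract c * x^(i-d) * m
                r[i - d + j] -= c * m[j]
    return r[:d]


def multipoint_eval(p, points):
    """Return [P(x) for x in points] by recursive remaindering."""
    if len(points) == 1:
        result = 0
        for a in reversed(p):           # Horner's rule at a single point
            result = result * points[0] + a
        return [result]
    half = len(points) // 2
    left, right = points[:half], points[half:]
    r0 = poly_mod_monic(p, vanishing(left))     # R_0 = P mod m_0
    r1 = poly_mod_monic(p, vanishing(right))    # R_1 = P mod m_1
    return multipoint_eval(r0, left) + multipoint_eval(r1, right)


print(multipoint_eval([4, 3, 2, 1], [0, 1, 2, 3]))  # [4, 10, 26, 58]
```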


In the case where the points in which we wish to evaluate the polynomials have some structure, simpler methods exist. For example, Knuth[5] (section 4.6.4) gives a method for tabulating polynomial values of the type

$$P(x_0+h),\ P(x_0+2h),\ \dots.$$
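One standard way to tabulate values on an equally spaced grid is a forward-difference table: after $n+1$ initial evaluations, every further value of a degree-$n$ polynomial costs only $n$ additions. The sketch below illustrates that idea under the assumption that this is the kind of method meant; it is not claimed to match Knuth's exact formulation, and the name `tabulate` is hypothetical. For convenience it starts the table at $x_0$ rather than $x_0+h$.

```python
def tabulate(coeffs, x0, h, count):
    """Return [P(x0), P(x0 + h), ..., P(x0 + (count-1)*h)].

    coeffs[i] is a_i (illustrative layout). Uses forward differences,
    so each value after the setup needs only additions.
    """
    n = len(coeffs) - 1

    def horner(x):
        result = 0
        for a in reversed(coeffs):
            result = result * x + a
        return result

    # Start from n + 1 directly computed values ...
    diffs = [horner(x0 + i * h) for i in range(n + 1)]
    # ... and turn them in place into differences of order 0..n at x0.
    for k in range(1, n + 1):
        for i in range(n, k - 1, -1):
            diffs[i] -= diffs[i - 1]

    values = []
    for _ in range(count):
        values.append(diffs[0])
        for k in range(n):              # advance the table by one step h
            diffs[k] += diffs[k + 1]
    return values


print(tabulate([0, 0, 1], 0, 1, 5))  # squares: [0, 1, 4, 9, 16]
```

With floating-point inputs the additions accumulate rounding error over long tables, so this sketch is best suited to exact arithmetic such as integers.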

Dynamic evaluation

In the case where $x_1,\dots,x_m$ are not known in advance, Kedlaya and Umans[6] gave a data structure for evaluating polynomials over a finite field $\mathbb{F}_q$ of size $q$ in time $(\log n)^{O(1)}(\log_2 q)^{1+o(1)}$ per evaluation after some initial preprocessing. This was shown by Larsen[7] to be essentially optimal.

The idea is to transform $P(x)$ of degree $n$ into a multivariate polynomial $f(x_1,x_2,\dots,x_m)$, such that $P(x)=f(x,x^d,x^{d^2},\dots,x^{d^m})$ and the individual degrees of $f$ are at most $d$. Since this is over $\bmod q$, the largest value $f$ can take (over $\mathbb{Z}$) is $M=d^m(q-1)^{dm}$. Using the Chinese remainder theorem, it suffices to evaluate $f$ modulo different primes $p_1,\dots,p_\ell$ with a product at least $M$. Each prime can be taken to be roughly $\log M=O(dm\log q)$, and the number of primes needed, $\ell$, is roughly the same. Doing this process recursively, we can get the primes as small as $\log\log q$. That means we can compute and store $f$ on all the possible values in $T=(\log\log q)^m$ time and space. If we take $d=\log q$, we get $m=\tfrac{\log n}{\log\log q}$, so the time/space requirement is just $n^{\frac{\log\log q}{\log\log\log q}}$.

Kedlaya and Umans further show how to combine this preprocessing with fast (FFT) multipoint evaluation. This allows optimal algorithms for many important algebraic problems, such as polynomial modular composition.

Specific polynomials

While general polynomials require $\Omega(n)$ operations to evaluate, some polynomials can be computed much faster. For example, the polynomial $P(x)=x^2+2x+1$ can be computed using just one multiplication and one addition since $P(x)=(x+1)^2$.

Evaluation of powers


A particularly interesting type of polynomial is a power like $x^n$. Such polynomials can always be computed in $O(\log n)$ operations. Suppose, for example, that we need to compute $x^{16}$; we could simply start with $x$ and multiply by $x$ to get $x^2$. We can then multiply that by itself to get $x^4$, and so on to get $x^8$ and $x^{16}$ in just four multiplications. Other powers like $x^5$ can similarly be computed efficiently by first computing $x^4$ with 2 multiplications and then multiplying by $x$.
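A minimal sketch of exponentiation by squaring, for a non-negative integer exponent, is shown below. The name `power` is chosen for this example. This simple binary version performs $O(\log n)$ multiplications, although it may use a few more than the hand-optimized counts quoted above because it also multiplies into an initial 1 and squares once more than strictly needed.

```python
def power(x, n):
    """Compute x**n for an integer n >= 0 by repeated squaring."""
    result = 1
    base = x                  # holds x^(2^k) after k squarings
    while n > 0:
        if n & 1:             # this binary digit of n is set
            result *= base
        base *= base
        n >>= 1
    return result


print(power(3, 16))  # 43046721
print(power(2, 5))   # 32
```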

The most efficient way to compute a given power $x^n$ is provided by addition-chain exponentiation. However, this requires designing a specific algorithm for each exponent, and the computation needed for designing these algorithms is difficult (NP-complete[8]), so exponentiation by squaring is generally preferred for effective computations.

Polynomial families

Often polynomials show up in a different form than the well-known $a_nx^n+\cdots+a_1x+a_0$. For polynomials in Chebyshev form we can use the Clenshaw algorithm. For polynomials in Bézier form we can use De Casteljau's algorithm, and for B-splines there is De Boor's algorithm.
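As one illustration, the Clenshaw recurrence for a Chebyshev series $\sum_k c_kT_k(x)$ of the first kind can be sketched as follows. The function name and the coefficient layout (`c[k]` multiplies $T_k$) are assumptions for this example.

```python
def clenshaw_chebyshev(c, x):
    """Evaluate sum_k c[k] * T_k(x), with T_k the Chebyshev polynomials
    of the first kind; c must be non-empty.

    Uses b_k = c_k + 2*x*b_{k+1} - b_{k+2}, then c_0 + x*b_1 - b_2.
    """
    b1 = b2 = 0
    for ck in reversed(c[1:]):          # c_n, ..., c_1
        b1, b2 = ck + 2 * x * b1 - b2, b1
    return c[0] + x * b1 - b2


# T_0(2) = 1, T_1(2) = 2, T_2(2) = 7, T_3(2) = 26
print(clenshaw_chebyshev([1, 2, 3, 4], 2))  # 1 + 4 + 21 + 104 = 130
```

The recurrence avoids converting the series to monomial coefficients, which mirrors how Horner's rule avoids forming the individual powers of $x$.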

Hard polynomials

The fact that some polynomials can be computed significantly faster than "general polynomials" suggests the question: Can we give an example of a simple polynomial that cannot be computed in time much smaller than its degree? Volker Strassen has shown[9] that the polynomial

$$P(x)=\sum_{k=0}^{n}2^{2^{kn^3}}x^k$$

cannot be evaluated with fewer than $\tfrac12n-2$ multiplications and $n-4$ additions. At least this bound holds if only operations of those types are allowed, giving rise to a so-called "polynomial chain of length $<n^2/\log n$".

The polynomial given by Strassen has very large coefficients, but by probabilistic methods, one can show there must exist even polynomials with coefficients just 0's and 1's such that the evaluation requires at least $\Omega(n/\log n)$ multiplications.[10]

For other simple polynomials, the complexity is unknown. The polynomial $(x+1)(x+2)\cdots(x+n)$ is conjectured to not be computable in time $(\log n)^c$ for any $c$. This is supported by the fact that, if it can be computed fast, then integer factorization can be computed in polynomial time, breaking the RSA cryptosystem.[11]

Matrix polynomials

Sometimes the computational cost of scalar multiplications (like $ax$) is less than the computational cost of "non-scalar" multiplications (like $x^2$). The typical example of this is matrices. If $M$ is an $m\times m$ matrix, a scalar multiplication $aM$ takes about $m^2$ arithmetic operations, while computing $M^2$ takes about $m^3$ (or $m^{2.3}$ using fast matrix multiplication).

Matrix polynomials are important, for example, for computing the matrix exponential.

Paterson and Stockmeyer[12] showed how to compute a degree-$n$ polynomial using only $O(\sqrt n)$ non-scalar multiplications and $O(n)$ scalar multiplications. Thus a matrix polynomial of degree $n$ can be evaluated in $O(m^{2.3}\sqrt n+m^2n)$ time. If $m=n$ this is $O(m^3)$, as fast as one matrix multiplication with the standard algorithm.

This method works as follows: For a polynomial

$$P(M)=a_{n-1}M^{n-1}+\cdots+a_1M+a_0I,$$

let $k$ be the least integer not smaller than $\sqrt n$. The powers $M,M^2,\dots,M^k$ are computed with $k$ matrix multiplications, and $M^{2k},M^{3k},\dots,M^{k^2-k}$ are then computed by repeated multiplication by $M^k$. Now,

$$\begin{aligned}P(M)={}&(a_0I+a_1M+\cdots+a_{k-1}M^{k-1})\\&+(a_kI+a_{k+1}M+\cdots+a_{2k-1}M^{k-1})M^k\\&+\cdots\\&+(a_{n-k}I+a_{n-k+1}M+\cdots+a_{n-1}M^{k-1})M^{k^2-k},\end{aligned}$$

where $a_i=0$ for $i\ge n$. This requires just $k$ more non-scalar multiplications.

We can write this succinctly using the Kronecker product:

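$$P(M)=\begin{bmatrix}I\\M\\\vdots\\M^{k-1}\end{bmatrix}^{T}\left(\begin{bmatrix}a_0&a_k&a_{2k}&\cdots\\a_1&a_{k+1}&\ddots\\a_2&\ddots\\\vdots\end{bmatrix}\otimes I\right)\begin{bmatrix}I\\M^k\\M^{2k}\\\vdots\end{bmatrix}.$$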

The direct application of this method uses $2\sqrt n$ non-scalar multiplications, but by combining it with evaluation with preprocessing, Paterson and Stockmeyer show that this can be reduced to $\sqrt{2n}$.
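The direct (unpreprocessed) method can be sketched with plain nested lists as matrices. This is an illustrative variant rather than a reference implementation: instead of storing $M^{2k},M^{3k},\dots$, it combines the blocks by Horner's rule in $M^k$, which uses the same kind of non-scalar products. All helper names are hypothetical, and a non-empty coefficient list is assumed.

```python
import math


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_scale(s, A):
    return [[s * a for a in row] for row in A]


def identity(m):
    return [[int(i == j) for j in range(m)] for i in range(m)]


def paterson_stockmeyer(coeffs, M):
    """Evaluate P(M) = sum_i coeffs[i] * M^i.

    Returns (P(M), number of non-scalar matrix multiplications used).
    """
    n, m = len(coeffs), len(M)
    k = math.isqrt(n - 1) + 1            # least integer >= sqrt(n)
    nonscalar = 0

    powers = [identity(m), M]            # powers[i] = M^i for i = 0..k
    while len(powers) <= k:
        powers.append(mat_mul(powers[-1], M))
        nonscalar += 1
    Mk = powers[k]

    def block(j):
        # a_{jk} I + a_{jk+1} M + ... : scalar multiplications only
        total = [[0] * m for _ in range(m)]
        for i in range(k):
            if j * k + i < n:
                total = mat_add(total, mat_scale(coeffs[j * k + i], powers[i]))
        return total

    blocks = (n + k - 1) // k
    result = block(blocks - 1)
    for j in range(blocks - 2, -1, -1):  # Horner's rule in M^k
        result = mat_add(mat_mul(result, Mk), block(j))
        nonscalar += 1
    return result, nonscalar


value, count = paterson_stockmeyer(list(range(1, 10)), [[1, 2], [3, 4]])
print(count)  # 4 non-scalar products for n = 9 coefficients
```

For comparison, applying Horner's rule directly to the same degree-8 matrix polynomial would need 8 matrix products.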

Methods based on matrix polynomial multiplications and additions have been proposed that save non-scalar matrix multiplications compared with the Paterson–Stockmeyer method.[13]
