A fast alternative to the standard C/C++ pow() function. With adjustable accuracy-space tradeoff. (1400 words)

- subject
- programming, optimization, C, C++, pow, floating-point, IEEE-754
- uri
- http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.html
- license
- Creative Commons BY-SA 3.0 License.
- download
- Source code in C and C++: http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.zip

This technique is formed of two ideas: manipulation of IEEE-754 floating-point bits, and precalculation of a lookup table.

First, a fast `pow`

:

‘A Fast, Compact Approximation of the Exponential Function’

Technical Report IDSIA-07-98;

Nicol N. Schraudolph;

IDSIA,

1998-06-24.

http://users.rsise.anu.edu.au/~nici/bib2html/b2hd-Schraudolph99.html

http://www.idsia.ch/idsiareport/IDSIA-07-98.ps.gz

— was re-written for `float`

implementation instead of `double`

(avoiding endian platform dependence of original). It works by inserting the submitted number into the exponent bits of the answer. The fractional-part makes a linear interpolation in the mantissa (simple and fast, but inaccurate). Or it can be used as follows.

Second, an adjustable fast `log`

:

‘Revisiting a basic function on current CPUs: A fast logarithm implementation

with adjustable accuracy’

Technical Report ICSI TR-07-002;

Oriol Vinyals, Gerald Friedland, Nikki Mirghafori;

ICSI,

2007-06-21.

http://www.icsi.berkeley.edu/cgi-bin/pubs/publication.pl?ID=002209

http://www.icsi.berkeley.edu/pubs/techreports/TR-07-002.pdf

— was adapted to fit the fast `pow`

core (and modified to be twice as accurate). This uses the mantissa bits to index a lookup table. The table value then replaces the mantissa.

Precision is adustable from 0 to 18. This is how many bits to use as index, and hence how large the table is: 4B to 1MB.

For precision 11 (8KB table), mean error is < 0.01%, and max error is < 0.02% (proportional, ie: *abs(true - approx) / true*). And, on one machine, it was over 9 times faster than standard `pow`

.

In C (89) (and assuming 32 bit integers), the core is:

```
const float _2p23 = 8388608.0f;
/**
* Initialize powFast lookup table.
*
* @pTable length must be 2 ^ precision
* @precision number of mantissa bits used, >= 0 and <= 18
*/
void powFastSetTable
(
unsigned int* const pTable,
const unsigned int precision
)
{
/* step along table elements and x-axis positions */
float zeroToOne = 1.0f / ((float)(1 << precision) * 2.0f); /* A */
int i; /* B */
for( i = 0; i < (1 << precision); ++i ) /* C */
{
/* make y-axis value for table element */
const float f = ((float)pow( 2.0f, zeroToOne ) - 1.0f) * _2p23;
pTable[i] = (unsigned int)( f < _2p23 ? f : (_2p23 - 1.0f) );
zeroToOne += 1.0f / (float)(1 << precision);
} /* D */
}
/**
* Get pow (fast!).
*
* @val power to raise radix to
* @ilog2 one over log, to required radix, of two
* @pTable length must be 2 ^ precision
* @precision number of mantissa bits used, >= 0 and <= 18
*/
float powFastLookup
(
const float val,
const float ilog2,
unsigned int* const pTable,
const unsigned int precision
)
{
/* build float bits */
const int i = (int)( (val * (_2p23 * ilog2)) + (127.0f * _2p23) );
/* replace mantissa with lookup */
const int it = (i & 0xFF800000) | pTable[(i & 0x7FFFFF) >> /* E */
(23 - precision)]; /* F */
/* convert bits to float */
return *(const float*)( &it );
}
```

Setting the `ilog2`

parameter:

- for
`pow( 2, val)`

,`ilog2 = 1 / log2(2) = 1`

- for
`pow(`

,*e*, val)`ilog2 = 1 / log(2) = 1.44269504088896`

- for
`pow(10, val)`

,`ilog2 = 1 / log10(2) = 3.32192809488736`

- for
`pow( r, val)`

,`ilog2 = log(r) / log(2) = log(r) * 1.44269504088896`

These two functions can easily be put in a small class/module that manages the storage and wraps calls for different radixes.

An IEEE-754 single-precision floating point number comprises a sign, exponent, and mantissa, with value: *(sign)mantissa * 2 ^{exponent}*. The exponent is -125 to +128, the mantissa is 1 to 2 - 2

sign | exponent | mantissa |
---|---|---|

S | EEEEEEEE | MMMMMMMMMMMMMMMMMMMMMMM |

31 | 23 | 0 |

Considering the parts as plain integer numbers, the value is: *(+/-)(1 + mantissa/2 ^{23}) * 2^{(exponent - 127)}*. So, for example, to make 2.5, set the exponent to

`128`

and the mantissa to `0x200000`

, shift them into position, and reinterpret as a float.If you set the exponent to an integer number, and the mantissa to zero, you effectively get `pow(2,number)`

. That leaves fractional numbers. An Engineering Mathematics book will show that *a ^{(b + c)} = a^{b} * a^{c}*, which in this case can be

The range of *2 ^{frac}* is 1 to <2 — the same as the usual mantissa. So if the fractional part of the power is simply placed in the mantissa bits, it gives an approximation by linear interpolation (

There are 23 bits of mantissa, but the maximum precision is 18. This is because 5 bits are used by the integer part of the power.

(For other radixes, multiply the power by *ln(radix) / ln(2)*.)

The essential approximation is a ‘staircase’ function across the fraction range between successive integer powers. It has full float precision *y* values, but at limited regular *x* intervals. Accuracy is proportional to number of table elements.

Those limited intervals imply rounding considerations for setting and looking-up the table values. The mantissa could be rounded before truncation for table look-up. This costs a few instructions in speed. But instead this can be treated when setting the table. Set the table values for the middles of the *x* intervals — so line A (in the code) starts the iteration half an increment across. This (practically) minimises error. (Better would be the mean of the *y* values across each interval. But the improvement would be small for these circumstances.)

This solution has a small weakness: for radix two it produces inexact results for integer powers when it need not. This may be undesirable, since they are ‘special’. One treatment is to round the mantissa, as mentioned before:

in `powFastSetTable`

, change lines A-C to:

```
float zeroToOne = 0.0f;
int i;
for( i = 0; i < ((1 << precision) + 1); ++i )
```

— in `powFastLookup`

, change lines E-F to:

```
/* (rounding mantissa-index before use) */
const int it = (i & 0xFF800000) | pTable[((i & 0x7FFFFF) + (0x400000 >>
precision)) >> (23 - precision)];
```

— and the table is one element longer for all precisions (so change the `@pTable`

comments too).

Instead, add a special case to set the first table element (the integer value) to zero. It avoids the rounding cost, but almost doubles max error in the first interval.

in `powFastSetTable`

, add after line D:

```
/* make integer power exact */
pTable[0] = 0;
```

It is possible to have maximum precision with medium precision table size (max error < 0.002%, 4KB instead of 1MB). The cost in speed is quite small — a few extra instructions.

The power equation above suggests the further reformulation. Just as the fractional calculation is separable from the integer calculation, the fractional one can itself be separated into parts: eg. *2 ^{(frachigh + fraclow)} = 2^{frachigh} * 2^{fraclow}*. Breaking it into two is probably the most useful.

Here is an excerpt from a C++ class:

```
PowFast2::PowFast2()
{
setTable( tableH_m, 9, 9, false );
setTable( tableL_m, 18, 9, true );
}
void PowFast2::setTable
(
float* const pTable,
const udword precision,
const udword extent,
const bool isRound
)
{
// step along table elements and x-axis positions
float zeroToOne = !isRound ?
0.0f : (1.0f / (static_cast<float>(1 << precision) * 2.0f));
for( int i = 0; i < (1 << extent); ++i )
{
// make y-axis value for table element
pTable[i] = ::powf( 2.0f, zeroToOne );
zeroToOne += 1.0f / static_cast<float>(1 << precision);
}
}
inline
float PowFast2::lookup
(
const float val,
const float ilog2
) const
{
const float _2p23 = 8388608.0f;
// build float bits
const int i = static_cast<int>( (val * (_2p23 * ilog2)) + (127.0f * _2p23) );
// replace mantissa with combined lookups
const float t = tableH_m[(i >> 14) & 0x1FF] * tableL_m[(i >> 5) & 0x1FF];
const int it = (i & 0xFF800000) |
(*reinterpret_cast<const int*>( &t ) & 0x7FFFFF);
// convert bits to float
return *reinterpret_cast<const float*>( &it );
}
```

The limit is separating each bit. This yields 18 two-element tables, combinable with 17 multiplies. Since the first elements are all 1, only 8.5 multiplies are needed on average, if branching is used.

Here are two source-code packages, including tests. In C++ as a simple constant class, and in C with an object interface:

http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.zip

The source code is available according to the (new) BSD license:

———

Copyright (c) 2007, Harrison Ainsworth / HXA7241.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.