
1: The following functions for the `long double' versions of the libm 2: function have to be written: 3: 4: e_acosl.c 5: e_asinl.c 6: e_atan2l.c 7: e_expl.c 8: e_fmodl.c 9: e_hypotl.c 10: e_j0l.c 11: e_j1l.c 12: e_jnl.c 13: e_lgammal_r.c 14: e_logl.c 15: e_log10l.c 16: e_powl.c 17: e_rem_pio2l.c 18: e_sinhl.c 19: e_sqrtl.c 20: 21: k_cosl.c 22: k_rem_pio2l.c 23: k_sinl.c 24: k_tanl.c 25: 26: s_atanl.c 27: s_erfl.c 28: s_expm1l.c 29: s_log1pl.c 30: 31: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 32: Methods 33: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 34: arcsin 35: ~~~~~~ 36: * Since asin(x) = x + x^3/6 + x^5*3/40 + x^7*15/336 + ... 37: * we approximate asin(x) on [0,0.5] by 38: * asin(x) = x + x*x^2*R(x^2) 39: * where 40: * R(x^2) is a rational approximation of (asin(x)-x)/x^3 41: * and its remez error is bounded by 42: * |(asin(x)-x)/x^3 - R(x^2)| < 2^(-58.75) 43: * 44: * For x in [0.5,1] 45: * asin(x) = pi/2-2*asin(sqrt((1-x)/2)) 46: * Let y = (1-x), z = y/2, s := sqrt(z), and pio2_hi+pio2_lo=pi/2; 47: * then for x>0.98 48: * asin(x) = pi/2 - 2*(s+s*z*R(z)) 49: * = pio2_hi - (2*(s+s*z*R(z)) - pio2_lo) 50: * For x<=0.98, let pio4_hi = pio2_hi/2, then 51: * f = hi part of s; 52: * c = sqrt(z) - f = (z-f*f)/(s+f) ...f+c=sqrt(z) 53: * and 54: * asin(x) = pi/2 - 2*(s+s*z*R(z)) 55: * = pio4_hi+(pio4-2s)-(2s*z*R(z)-pio2_lo) 56: * = pio4_hi+(pio4-2f)-(2s*z*R(z)-(pio2_lo+2c)) 57: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 58: arccos 59: ~~~~~~ 60: * Method : 61: * acos(x) = pi/2 - asin(x) 62: * acos(-x) = pi/2 + asin(x) 63: * For |x|<=0.5 64: * acos(x) = pi/2 - (x + x*x^2*R(x^2)) (see asin.c) 65: * For x>0.5 66: * acos(x) = pi/2 - (pi/2 - 2asin(sqrt((1-x)/2))) 67: * = 2asin(sqrt((1-x)/2)) 68: * = 2s + 2s*z*R(z) ...z=(1-x)/2, s=sqrt(z) 69: * = 2f + (2c + 2s*z*R(z)) 70: * where f=hi part of s, and c = (z-f*f)/(s+f) is the correction term 71: * for f so that f+c ~ sqrt(z). 72: * For x<-0.5 73: * acos(x) = pi - 2asin(sqrt((1-|x|)/2)) 74: * = pi - 0.5*(s+s*z*R(z)), where z=(1-|x|)/2,s=sqrt(z) 75: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 76: atan2 77: ~~~~~ 78: * Method : 79: * 1. Reduce y to positive by atan2(y,x)=-atan2(-y,x). 80: * 2. Reduce x to positive by (if x and y are unexceptional): 81: * ARG (x+iy) = arctan(y/x) ... if x > 0, 82: * ARG (x+iy) = pi - arctan[y/(-x)] ... if x < 0, 83: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 84: atan 85: ~~~~ 86: * Method 87: * 1. Reduce x to positive by atan(x) = -atan(-x). 88: * 2. According to the integer k=4t+0.25 chopped, t=x, the argument 89: * is further reduced to one of the following intervals and the 90: * arctangent of t is evaluated by the corresponding formula: 91: * 92: * [0,7/16] atan(x) = t-t^3*(a1+t^2*(a2+...(a10+t^2*a11)...) 93: * [7/16,11/16] atan(x) = atan(1/2) + atan( (t-0.5)/(1+t/2) ) 94: * [11/16.19/16] atan(x) = atan( 1 ) + atan( (t-1)/(1+t) ) 95: * [19/16,39/16] atan(x) = atan(3/2) + atan( (t-1.5)/(1+1.5t) ) 96: * [39/16,INF] atan(x) = atan(INF) + atan( -1/t ) 97: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 98: exp 99: ~~~ 100: * Method 101: * 1. Argument reduction: 102: * Reduce x to an r so that |r| <= 0.5*ln2 ~ 0.34658. 103: * Given x, find r and integer k such that 104: * 105: * x = k*ln2 + r, |r| <= 0.5*ln2. 106: * 107: * Here r will be represented as r = hi-lo for better 108: * accuracy. 109: * 110: * 2. Approximation of exp(r) by a special rational function on 111: * the interval [0,0.34658]: 112: * Write 113: * R(r**2) = r*(exp(r)+1)/(exp(r)-1) = 2 + r*r/6 - r**4/360 + ... 114: * We use a special Reme algorithm on [0,0.34658] to generate 115: * a polynomial of degree 5 to approximate R. The maximum error 116: * of this polynomial approximation is bounded by 2**-59. In 117: * other words, 118: * R(z) ~ 2.0 + P1*z + P2*z**2 + P3*z**3 + P4*z**4 + P5*z**5 119: * (where z=r*r, and the values of P1 to P5 are listed below) 120: * and 121: * | 5 | -59 122: * | 2.0+P1*z+...+P5*z - R(z) | <= 2 123: * | | 124: * The computation of exp(r) thus becomes 125: * 2*r 126: * exp(r) = 1 + ------- 127: * R - r 128: * r*R1(r) 129: * = 1 + r + ----------- (for better accuracy) 130: * 2 - R1(r) 131: * where 132: * 2 4 10 133: * R1(r) = r - (P1*r + P2*r + ... + P5*r ). 134: * 135: * 3. Scale back to obtain exp(x): 136: * From step 1, we have 137: * exp(x) = 2^k * exp(r) 138: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 139: hypot 140: ~~~~~ 141: * If (assume round-to-nearest) z=x*x+y*y 142: * has error less than sqrt(2)/2 ulp, than 143: * sqrt(z) has error less than 1 ulp (exercise). 144: * 145: * So, compute sqrt(x*x+y*y) with some care as 146: * follows to get the error below 1 ulp: 147: * 148: * Assume x>y>0; 149: * (if possible, set rounding to round-to-nearest) 150: * 1. if x > 2y use 151: * x1*x1+(y*y+(x2*(x+x1))) for x*x+y*y 152: * where x1 = x with lower 32 bits cleared, x2 = x-x1; else 153: * 2. if x <= 2y use 154: * t1*y1+((x-y)*(x-y)+(t1*y2+t2*y)) 155: * where t1 = 2x with lower 32 bits cleared, t2 = 2x-t1, 156: * y1= y with lower 32 bits chopped, y2 = y-y1. 157: * 158: * NOTE: scaling may be necessary if some argument is too 159: * large or too tiny 160: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 161: j0/y0 162: ~~~~~ 163: * Method -- j0(x): 164: * 1. For tiny x, we use j0(x) = 1 - x^2/4 + x^4/64 - ... 165: * 2. Reduce x to |x| since j0(x)=j0(-x), and 166: * for x in (0,2) 167: * j0(x) = 1-z/4+ z^2*R0/S0, where z = x*x; 168: * (precision: |j0-1+z/4-z^2R0/S0 |<2**-63.67 ) 169: * for x in (2,inf) 170: * j0(x) = sqrt(2/(pi*x))*(p0(x)*cos(x0)-q0(x)*sin(x0)) 171: * where x0 = x-pi/4. It is better to compute sin(x0),cos(x0) 172: * as follow: 173: * cos(x0) = cos(x)cos(pi/4)+sin(x)sin(pi/4) 174: * = 1/sqrt(2) * (cos(x) + sin(x)) 175: * sin(x0) = sin(x)cos(pi/4)-cos(x)sin(pi/4) 176: * = 1/sqrt(2) * (sin(x) - cos(x)) 177: * (To avoid cancellation, use 178: * sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x)) 179: * to compute the worse one.) 180: * 181: * Method -- y0(x): 182: * 1. For x<2. 183: * Since 184: * y0(x) = 2/pi*(j0(x)*(ln(x/2)+Euler) + x^2/4 - ...) 185: * therefore y0(x)-2/pi*j0(x)*ln(x) is an even function. 186: * We use the following function to approximate y0, 187: * y0(x) = U(z)/V(z) + (2/pi)*(j0(x)*ln(x)), z= x^2 188: * where 189: * U(z) = u00 + u01*z + ... + u06*z^6 190: * V(z) = 1 + v01*z + ... + v04*z^4 191: * with absolute approximation error bounded by 2**-72. 192: * Note: For tiny x, U/V = u0 and j0(x)~1, hence 193: * y0(tiny) = u0 + (2/pi)*ln(tiny), (choose tiny<2**-27) 194: * 2. For x>=2. 195: * y0(x) = sqrt(2/(pi*x))*(p0(x)*cos(x0)+q0(x)*sin(x0)) 196: * where x0 = x-pi/4. It is better to compute sin(x0),cos(x0) 197: * by the method mentioned above. 198: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 199: j1/y1 200: ~~~~~ 201: * Method -- j1(x): 202: * 1. For tiny x, we use j1(x) = x/2 - x^3/16 + x^5/384 - ... 203: * 2. Reduce x to |x| since j1(x)=-j1(-x), and 204: * for x in (0,2) 205: * j1(x) = x/2 + x*z*R0/S0, where z = x*x; 206: * (precision: |j1/x - 1/2 - R0/S0 |<2**-61.51 ) 207: * for x in (2,inf) 208: * j1(x) = sqrt(2/(pi*x))*(p1(x)*cos(x1)-q1(x)*sin(x1)) 209: * y1(x) = sqrt(2/(pi*x))*(p1(x)*sin(x1)+q1(x)*cos(x1)) 210: * where x1 = x-3*pi/4. It is better to compute sin(x1),cos(x1) 211: * as follow: 212: * cos(x1) = cos(x)cos(3pi/4)+sin(x)sin(3pi/4) 213: * = 1/sqrt(2) * (sin(x) - cos(x)) 214: * sin(x1) = sin(x)cos(3pi/4)-cos(x)sin(3pi/4) 215: * = -1/sqrt(2) * (sin(x) + cos(x)) 216: * (To avoid cancellation, use 217: * sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x)) 218: * to compute the worse one.) 219: * 220: * Method -- y1(x): 221: * 1. screen out x<=0 cases: y1(0)=-inf, y1(x<0)=NaN 222: * 2. For x<2. 223: * Since 224: * y1(x) = 2/pi*(j1(x)*(ln(x/2)+Euler)-1/x-x/2+5/64*x^3-...) 225: * therefore y1(x)-2/pi*j1(x)*ln(x)-1/x is an odd function. 226: * We use the following function to approximate y1, 227: * y1(x) = x*U(z)/V(z) + (2/pi)*(j1(x)*ln(x)-1/x), z= x^2 228: * where for x in [0,2] (abs err less than 2**-65.89) 229: * U(z) = U0[0] + U0[1]*z + ... + U0[4]*z^4 230: * V(z) = 1 + v0[0]*z + ... + v0[4]*z^5 231: * Note: For tiny x, 1/x dominate y1 and hence 232: * y1(tiny) = -2/pi/tiny, (choose tiny<2**-54) 233: * 3. For x>=2. 234: * y1(x) = sqrt(2/(pi*x))*(p1(x)*sin(x1)+q1(x)*cos(x1)) 235: * where x1 = x-3*pi/4. It is better to compute sin(x1),cos(x1) 236: * by method mentioned above. 237: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 238: jn/yn 239: ~~~~~ 240: * Note 2. About jn(n,x), yn(n,x) 241: * For n=0, j0(x) is called, 242: * for n=1, j1(x) is called, 243: * for n<x, forward recursion us used starting 244: * from values of j0(x) and j1(x). 245: * for n>x, a continued fraction approximation to 246: * j(n,x)/j(n-1,x) is evaluated and then backward 247: * recursion is used starting from a supposed value 248: * for j(n,x). The resulting value of j(0,x) is 249: * compared with the actual value to correct the 250: * supposed value of j(n,x). 251: * 252: * yn(n,x) is similar in all respects, except 253: * that forward recursion is used for all 254: * values of n>1. 255: 256: jn: 257: /* (x >> n**2) 258: * Jn(x) = cos(x-(2n+1)*pi/4)*sqrt(2/x*pi) 259: * Yn(x) = sin(x-(2n+1)*pi/4)*sqrt(2/x*pi) 260: * Let s=sin(x), c=cos(x), 261: * xn=x-(2n+1)*pi/4, sqt2 = sqrt(2),then 262: * 263: * n sin(xn)*sqt2 cos(xn)*sqt2 264: * ---------------------------------- 265: * 0 s-c c+s 266: * 1 -s-c -c+s 267: * 2 -s+c -c-s 268: * 3 s+c c-s 269: ... 270: /* x is tiny, return the first Taylor expansion of J(n,x) 271: * J(n,x) = 1/n!*(x/2)^n - ... 272: ... 273: /* use backward recurrence */ 274: /* x x^2 x^2 275: * J(n,x)/J(n-1,x) = ---- ------ ------ ..... 276: * 2n - 2(n+1) - 2(n+2) 277: * 278: * 1 1 1 279: * (for large x) = ---- ------ ------ ..... 280: * 2n 2(n+1) 2(n+2) 281: * -- - ------ - ------ - 282: * x x x 283: * 284: * Let w = 2n/x and h=2/x, then the above quotient 285: * is equal to the continued fraction: 286: * 1 287: * = ----------------------- 288: * 1 289: * w - ----------------- 290: * 1 291: * w+h - --------- 292: * w+2h - ... 293: * 294: * To determine how many terms needed, let 295: * Q(0) = w, Q(1) = w(w+h) - 1, 296: * Q(k) = (w+k*h)*Q(k-1) - Q(k-2), 297: * When Q(k) > 1e4 good for single 298: * When Q(k) > 1e9 good for double 299: * When Q(k) > 1e17 good for quadruple 300: 301: ... 302: /* estimate log((2/x)^n*n!) = n*log(2/x)+n*ln(n) 303: * Hence, if n*(log(2n/x)) > ... 304: * single 8.8722839355e+01 305: * double 7.09782712893383973096e+02 306: * long double 1.1356523406294143949491931077970765006170e+04 307: * then recurrent value may overflow and the result is 308: * likely underflow to zero 309: 310: yn: 311: /* (x >> n**2) 312: * Jn(x) = cos(x-(2n+1)*pi/4)*sqrt(2/x*pi) 313: * Yn(x) = sin(x-(2n+1)*pi/4)*sqrt(2/x*pi) 314: * Let s=sin(x), c=cos(x), 315: * xn=x-(2n+1)*pi/4, sqt2 = sqrt(2),then 316: * 317: * n sin(xn)*sqt2 cos(xn)*sqt2 318: * ---------------------------------- 319: * 0 s-c c+s 320: * 1 -s-c -c+s 321: * 2 -s+c -c-s 322: * 3 s+c c-s 323: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 324: lgamma 325: ~~~~~~ 326: * Method: 327: * 1. Argument Reduction for 0 < x <= 8 328: * Since gamma(1+s)=s*gamma(s), for x in [0,8], we may 329: * reduce x to a number in [1.5,2.5] by 330: * lgamma(1+s) = log(s) + lgamma(s) 331: * for example, 332: * lgamma(7.3) = log(6.3) + lgamma(6.3) 333: * = log(6.3*5.3) + lgamma(5.3) 334: * = log(6.3*5.3*4.3*3.3*2.3) + lgamma(2.3) 335: * 2. Polynomial approximation of lgamma around its 336: * minimun ymin=1.461632144968362245 to maintain monotonicity. 337: * On [ymin-0.23, ymin+0.27] (i.e., [1.23164,1.73163]), use 338: * Let z = x-ymin; 339: * lgamma(x) = -1.214862905358496078218 + z^2*poly(z) 340: * where 341: * poly(z) is a 14 degree polynomial. 342: * 2. Rational approximation in the primary interval [2,3] 343: * We use the following approximation: 344: * s = x-2.0; 345: * lgamma(x) = 0.5*s + s*P(s)/Q(s) 346: * with accuracy 347: * |P/Q - (lgamma(x)-0.5s)| < 2**-61.71 348: * Our algorithms are based on the following observation 349: * 350: * zeta(2)-1 2 zeta(3)-1 3 351: * lgamma(2+s) = s*(1-Euler) + --------- * s - --------- * s + ... 352: * 2 3 353: * 354: * where Euler = 0.5771... is the Euler constant, which is very 355: * close to 0.5. 356: * 357: * 3. For x>=8, we have 358: * lgamma(x)~(x-0.5)log(x)-x+0.5*log(2pi)+1/(12x)-1/(360x**3)+.... 359: * (better formula: 360: * lgamma(x)~(x-0.5)*(log(x)-1)-.5*(log(2pi)-1) + ...) 361: * Let z = 1/x, then we approximation 362: * f(z) = lgamma(x) - (x-0.5)(log(x)-1) 363: * by 364: * 3 5 11 365: * w = w0 + w1*z + w2*z + w3*z + ... + w6*z 366: * where 367: * |w - f(z)| < 2**-58.74 368: * 369: * 4. For negative x, since (G is gamma function) 370: * -x*G(-x)*G(x) = pi/sin(pi*x), 371: * we have 372: * G(x) = pi/(sin(pi*x)*(-x)*G(-x)) 373: * since G(-x) is positive, sign(G(x)) = sign(sin(pi*x)) for x<0 374: * Hence, for x<0, signgam = sign(sin(pi*x)) and 375: * lgamma(x) = log(|Gamma(x)|) 376: * = log(pi/(|x*sin(pi*x)|)) - lgamma(-x); 377: * Note: one should avoid compute pi*(-x) directly in the 378: * computation of sin(pi*(-x)). 379: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 380: log 381: ~~~ 382: * Method : 383: * 1. Argument Reduction: find k and f such that 384: * x = 2^k * (1+f), 385: * where sqrt(2)/2 < 1+f < sqrt(2) . 386: * 387: * 2. Approximation of log(1+f). 388: * Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s) 389: * = 2s + 2/3 s**3 + 2/5 s**5 + ....., 390: * = 2s + s*R 391: * We use a special Reme algorithm on [0,0.1716] to generate 392: * a polynomial of degree 14 to approximate R The maximum error 393: * of this polynomial approximation is bounded by 2**-58.45. In 394: * other words, 395: * 2 4 6 8 10 12 14 396: * R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s +Lg5*s +Lg6*s +Lg7*s 397: * (the values of Lg1 to Lg7 are listed in the program) 398: * and 399: * | 2 14 | -58.45 400: * | Lg1*s +...+Lg7*s - R(z) | <= 2 401: * | | 402: * Note that 2s = f - s*f = f - hfsq + s*hfsq, where hfsq = f*f/2. 403: * In order to guarantee error in log below 1ulp, we compute log 404: * by 405: * log(1+f) = f - s*(f - R) (if f is not too large) 406: * log(1+f) = f - (hfsq - s*(hfsq+R)). (better accuracy) 407: * 408: * 3. Finally, log(x) = k*ln2 + log(1+f). 409: * = k*ln2_hi+(f-(hfsq-(s*(hfsq+R)+k*ln2_lo))) 410: * Here ln2 is split into two floating point number: 411: * ln2_hi + ln2_lo, 412: * where n*ln2_hi is always exact for |n| < 2000. 413: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 414: log10 415: ~~~~~ 416: * Method : 417: * Let log10_2hi = leading 40 bits of log10(2) and 418: * log10_2lo = log10(2) - log10_2hi, 419: * ivln10 = 1/log(10) rounded. 420: * Then 421: * n = ilogb(x), 422: * if(n<0) n = n+1; 423: * x = scalbn(x,-n); 424: * log10(x) := n*log10_2hi + (n*log10_2lo + ivln10*log(x)) 425: * 426: * Note 1: 427: * To guarantee log10(10**n)=n, where 10**n is normal, the rounding 428: * mode must set to Round-to-Nearest. 429: * Note 2: 430: * [1/log(10)] rounded to 53 bits has error .198 ulps; 431: * log10 is monotonic at all binary break points. 432: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 433: pow 434: ~~~ 435: * Method: Let x = 2 * (1+f) 436: * 1. Compute and return log2(x) in two pieces: 437: * log2(x) = w1 + w2, 438: * where w1 has 53-24 = 29 bit trailing zeros. 439: * 2. Perform y*log2(x) = n+y' by simulating muti-precision 440: * arithmetic, where |y'|<=0.5. 441: * 3. Return x**y = 2**n*exp(y'*log2) 442: * 443: * Special cases: 444: * 1. (anything) ** 0 is 1 445: * 2. (anything) ** 1 is itself 446: * 3. (anything) ** NAN is NAN 447: * 4. NAN ** (anything except 0) is NAN 448: * 5. +-(|x| > 1) ** +INF is +INF 449: * 6. +-(|x| > 1) ** -INF is +0 450: * 7. +-(|x| < 1) ** +INF is +0 451: * 8. +-(|x| < 1) ** -INF is +INF 452: * 9. +-1 ** +-INF is NAN 453: * 10. +0 ** (+anything except 0, NAN) is +0 454: * 11. -0 ** (+anything except 0, NAN, odd integer) is +0 455: * 12. +0 ** (-anything except 0, NAN) is +INF 456: * 13. -0 ** (-anything except 0, NAN, odd integer) is +INF 457: * 14. -0 ** (odd integer) = -( +0 ** (odd integer) ) 458: * 15. +INF ** (+anything except 0,NAN) is +INF 459: * 16. +INF ** (-anything except 0,NAN) is +0 460: * 17. -INF ** (anything) = -0 ** (-anything) 461: * 18. (-anything) ** (integer) is (-1)**(integer)*(+anything**integer) 462: * 19. (-anything except 0 and inf) ** (non-integer) is NAN 463: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 464: rem_pio2 return the remainder of x rem pi/2 in y[0]+y[1] 465: ~~~~~~~~ 466: This is one of the basic functions which is written with highest accuracy 467: in mind. 468: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 469: sinh 470: ~~~~ 471: * Method : 472: * mathematically sinh(x) if defined to be (exp(x)-exp(-x))/2 473: * 1. Replace x by |x| (sinh(-x) = -sinh(x)). 474: * 2. 475: * E + E/(E+1) 476: * 0 <= x <= 22 : sinh(x) := --------------, E=expm1(x) 477: * 2 478: * 479: * 22 <= x <= lnovft : sinh(x) := exp(x)/2 480: * lnovft <= x <= ln2ovft: sinh(x) := exp(x/2)/2 * exp(x/2) 481: * ln2ovft < x : sinh(x) := x*shuge (overflow) 482: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 483: sqrt 484: ~~~~ 485: * Method: 486: * Bit by bit method using integer arithmetic. (Slow, but portable) 487: * 1. Normalization 488: * Scale x to y in [1,4) with even powers of 2: 489: * find an integer k such that 1 <= (y=x*2^(-2k)) < 4, then 490: * sqrt(x) = 2^k * sqrt(y) 491: * 2. Bit by bit computation 492: * Let q = sqrt(y) truncated to i bit after binary point (q = 1), 493: * i 0 494: * i+1 2 495: * s = 2*q , and y = 2 * ( y - q ). (1) 496: * i i i i 497: * 498: * To compute q from q , one checks whether 499: * i+1 i 500: * 501: * -(i+1) 2 502: * (q + 2 ) <= y. (2) 503: * i 504: * -(i+1) 505: * If (2) is false, then q = q ; otherwise q = q + 2 . 506: * i+1 i i+1 i 507: * 508: * With some algebric manipulation, it is not difficult to see 509: * that (2) is equivalent to 510: * -(i+1) 511: * s + 2 <= y (3) 512: * i i 513: * 514: * The advantage of (3) is that s and y can be computed by 515: * i i 516: * the following recurrence formula: 517: * if (3) is false 518: * 519: * s = s , y = y ; (4) 520: * i+1 i i+1 i 521: * 522: * otherwise, 523: * -i -(i+1) 524: * s = s + 2 , y = y - s - 2 (5) 525: * i+1 i i+1 i i 526: * 527: * One may easily use induction to prove (4) and (5). 528: * Note. Since the left hand side of (3) contain only i+2 bits, 529: * it does not necessary to do a full (53-bit) comparison 530: * in (3). 531: * 3. Final rounding 532: * After generating the 53 bits result, we compute one more bit. 533: * Together with the remainder, we can decide whether the 534: * result is exact, bigger than 1/2ulp, or less than 1/2ulp 535: * (it will never equal to 1/2ulp). 536: * The rounding mode can be detected by checking whether 537: * huge + tiny is equal to huge, and whether huge - tiny is 538: * equal to huge for some floating point number "huge" and "tiny". 539: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 540: cos 541: ~~~ 542: * kernel cos function on [-pi/4, pi/4], pi/4 ~ 0.785398164 543: * Input x is assumed to be bounded by ~pi/4 in magnitude. 544: * Input y is the tail of x. 545: * 546: * Algorithm 547: * 1. Since cos(-x) = cos(x), we need only to consider positive x. 548: * 2. if x < 2^-27 (hx<0x3e400000 0), return 1 with inexact if x!=0. 549: * 3. cos(x) is approximated by a polynomial of degree 14 on 550: * [0,pi/4] 551: * 4 14 552: * cos(x) ~ 1 - x*x/2 + C1*x + ... + C6*x 553: * where the remez error is 554: * 555: * | 2 4 6 8 10 12 14 | -58 556: * |cos(x)-(1-.5*x +C1*x +C2*x +C3*x +C4*x +C5*x +C6*x )| <= 2 557: * | | 558: * 559: * 4 6 8 10 12 14 560: * 4. let r = C1*x +C2*x +C3*x +C4*x +C5*x +C6*x , then 561: * cos(x) = 1 - x*x/2 + r 562: * since cos(x+y) ~ cos(x) - sin(x)*y 563: * ~ cos(x) - x*y, 564: * a correction term is necessary in cos(x) and hence 565: * cos(x+y) = 1 - (x*x/2 - (r - x*y)) 566: * For better accuracy when x > 0.3, let qx = |x|/4 with 567: * the last 32 bits mask off, and if x > 0.78125, let qx = 0.28125. 568: * Then 569: * cos(x+y) = (1-qx) - ((x*x/2-qx) - (r-x*y)). 570: * Note that 1-qx and (x*x/2-qx) is EXACT here, and the 571: * magnitude of the latter is at least a quarter of x*x/2, 572: * thus, reducing the rounding error in the subtraction. 573: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 574: sin 575: ~~~ 576: * kernel sin function on [-pi/4, pi/4], pi/4 ~ 0.7854 577: * Input x is assumed to be bounded by ~pi/4 in magnitude. 578: * Input y is the tail of x. 579: * Input iy indicates whether y is 0. (if iy=0, y assume to be 0). 580: * 581: * Algorithm 582: * 1. Since sin(-x) = -sin(x), we need only to consider positive x. 583: * 2. if x < 2^-27 (hx<0x3e400000 0), return x with inexact if x!=0. 584: * 3. sin(x) is approximated by a polynomial of degree 13 on 585: * [0,pi/4] 586: * 3 13 587: * sin(x) ~ x + S1*x + ... + S6*x 588: * where 589: * 590: * |sin(x) 2 4 6 8 10 12 | -58 591: * |----- - (1+S1*x +S2*x +S3*x +S4*x +S5*x +S6*x )| <= 2 592: * | x | 593: * 594: * 4. sin(x+y) = sin(x) + sin'(x')*y 595: * ~ sin(x) + (1-x*x/2)*y 596: * For better accuracy, let 597: * 3 2 2 2 2 598: * r = x *(S2+x *(S3+x *(S4+x *(S5+x *S6)))) 599: * then 3 2 600: * sin(x) = x + (S1*x + (x *(r-y/2)+y)) 601: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 602: tan 603: ~~~ 604: * kernel tan function on [-pi/4, pi/4], pi/4 ~ 0.7854 605: * Input x is assumed to be bounded by ~pi/4 in magnitude. 606: * Input y is the tail of x. 607: * Input k indicates whether tan (if k=1) or 608: * -1/tan (if k= -1) is returned. 609: * 610: * Algorithm 611: * 1. Since tan(-x) = -tan(x), we need only to consider positive x. 612: * 2. if x < 2^-28 (hx<0x3e300000 0), return x with inexact if x!=0. 613: * 3. tan(x) is approximated by a odd polynomial of degree 27 on 614: * [0,0.67434] 615: * 3 27 616: * tan(x) ~ x + T1*x + ... + T13*x 617: * where 618: * 619: * |tan(x) 2 4 26 | -59.2 620: * |----- - (1+T1*x +T2*x +.... +T13*x )| <= 2 621: * | x | 622: * 623: * Note: tan(x+y) = tan(x) + tan'(x)*y 624: * ~ tan(x) + (1+x*x)*y 625: * Therefore, for better accuracy in computing tan(x+y), let 626: * 3 2 2 2 2 627: * r = x *(T2+x *(T3+x *(...+x *(T12+x *T13)))) 628: * then 629: * 3 2 630: * tan(x+y) = x + (T1*x + (x *(r+y)+y)) 631: * 632: * 4. For x in [0.67434,pi/4], let y = pi/4 - x, then 633: * tan(x) = tan(pi/4-y) = (1-tan(y))/(1+tan(y)) 634: * = 1 - 2*(tan(y) - (tan(y)^2)/(1+tan(y))) 635: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 636: atan 637: ~~~~ 638: * Method 639: * 1. Reduce x to positive by atan(x) = -atan(-x). 640: * 2. According to the integer k=4t+0.25 chopped, t=x, the argument 641: * is further reduced to one of the following intervals and the 642: * arctangent of t is evaluated by the corresponding formula: 643: * 644: * [0,7/16] atan(x) = t-t^3*(a1+t^2*(a2+...(a10+t^2*a11)...) 645: * [7/16,11/16] atan(x) = atan(1/2) + atan( (t-0.5)/(1+t/2) ) 646: * [11/16.19/16] atan(x) = atan( 1 ) + atan( (t-1)/(1+t) ) 647: * [19/16,39/16] atan(x) = atan(3/2) + atan( (t-1.5)/(1+1.5t) ) 648: * [39/16,INF] atan(x) = atan(INF) + atan( -1/t ) 649: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 650: erf 651: ~~~ 652: * x 653: * 2 |\ 654: * erf(x) = --------- | exp(-t*t)dt 655: * sqrt(pi) \| 656: * 0 657: * 658: * erfc(x) = 1-erf(x) 659: * Note that 660: * erf(-x) = -erf(x) 661: * erfc(-x) = 2 - erfc(x) 662: * 663: * Method: 664: * 1. For |x| in [0, 0.84375] 665: * erf(x) = x + x*R(x^2) 666: * erfc(x) = 1 - erf(x) if x in [-.84375,0.25] 667: * = 0.5 + ((0.5-x)-x*R) if x in [0.25,0.84375] 668: * where R = P/Q where P is an odd poly of degree 8 and 669: * Q is an odd poly of degree 10. 670: * -57.90 671: * | R - (erf(x)-x)/x | <= 2 672: * 673: * 674: * Remark. The formula is derived by noting 675: * erf(x) = (2/sqrt(pi))*(x - x^3/3 + x^5/10 - x^7/42 + ....) 676: * and that 677: * 2/sqrt(pi) = 1.128379167095512573896158903121545171688 678: * is close to one. The interval is chosen because the fix 679: * point of erf(x) is near 0.6174 (i.e., erf(x)=x when x is 680: * near 0.6174), and by some experiment, 0.84375 is chosen to 681: * guarantee the error is less than one ulp for erf. 682: * 683: * 2. For |x| in [0.84375,1.25], let s = |x| - 1, and 684: * c = 0.84506291151 rounded to single (24 bits) 685: * erf(x) = sign(x) * (c + P1(s)/Q1(s)) 686: * erfc(x) = (1-c) - P1(s)/Q1(s) if x > 0 687: * 1+(c+P1(s)/Q1(s)) if x < 0 688: * |P1/Q1 - (erf(|x|)-c)| <= 2**-59.06 689: * Remark: here we use the taylor series expansion at x=1. 690: * erf(1+s) = erf(1) + s*Poly(s) 691: * = 0.845.. + P1(s)/Q1(s) 692: * That is, we use rational approximation to approximate 693: * erf(1+s) - (c = (single)0.84506291151) 694: * Note that |P1/Q1|< 0.078 for x in [0.84375,1.25] 695: * where 696: * P1(s) = degree 6 poly in s 697: * Q1(s) = degree 6 poly in s 698: * 699: * 3. For x in [1.25,1/0.35(~2.857143)], 700: * erfc(x) = (1/x)*exp(-x*x-0.5625+R1/S1) 701: * erf(x) = 1 - erfc(x) 702: * where 703: * R1(z) = degree 7 poly in z, (z=1/x^2) 704: * S1(z) = degree 8 poly in z 705: * 706: * 4. For x in [1/0.35,28] 707: * erfc(x) = (1/x)*exp(-x*x-0.5625+R2/S2) if x > 0 708: * = 2.0 - (1/x)*exp(-x*x-0.5625+R2/S2) if -6<x<0 709: * = 2.0 - tiny (if x <= -6) 710: * erf(x) = sign(x)*(1.0 - erfc(x)) if x < 6, else 711: * erf(x) = sign(x)*(1.0 - tiny) 712: * where 713: * R2(z) = degree 6 poly in z, (z=1/x^2) 714: * S2(z) = degree 7 poly in z 715: * 716: * Note1: 717: * To compute exp(-x*x-0.5625+R/S), let s be a single 718: * precision number and s := x; then 719: * -x*x = -s*s + (s-x)*(s+x) 720: * exp(-x*x-0.5626+R/S) = 721: * exp(-s*s-0.5625)*exp((s-x)*(s+x)+R/S); 722: * Note2: 723: * Here 4 and 5 make use of the asymptotic series 724: * exp(-x*x) 725: * erfc(x) ~ ---------- * ( 1 + Poly(1/x^2) ) 726: * x*sqrt(pi) 727: * We use rational approximation to approximate 728: * g(s)=f(1/x^2) = log(erfc(x)*x) - x*x + 0.5625 729: * Here is the error bound for R1/S1 and R2/S2 730: * |R1/S1 - f(x)| < 2**(-62.57) 731: * |R2/S2 - f(x)| < 2**(-61.52) 732: * 733: * 5. For inf > x >= 28 734: * erf(x) = sign(x) *(1 - tiny) (raise inexact) 735: * erfc(x) = tiny*tiny (raise underflow) if x > 0 736: * = 2 - tiny if x<0 737: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 738: expm1 Returns exp(x)-1, the exponential of x minus 1 739: ~~~~~ 740: * Method 741: * 1. Argument reduction: 742: * Given x, find r and integer k such that 743: * 744: * x = k*ln2 + r, |r| <= 0.5*ln2 ~ 0.34658 745: * 746: * Here a correction term c will be computed to compensate 747: * the error in r when rounded to a floating-point number. 748: * 749: * 2. Approximating expm1(r) by a special rational function on 750: * the interval [0,0.34658]: 751: * Since 752: * r*(exp(r)+1)/(exp(r)-1) = 2+ r^2/6 - r^4/360 + ... 753: * we define R1(r*r) by 754: * r*(exp(r)+1)/(exp(r)-1) = 2+ r^2/6 * R1(r*r) 755: * That is, 756: * R1(r**2) = 6/r *((exp(r)+1)/(exp(r)-1) - 2/r) 757: * = 6/r * ( 1 + 2.0*(1/(exp(r)-1) - 1/r)) 758: * = 1 - r^2/60 + r^4/2520 - r^6/100800 + ... 759: * We use a special Reme algorithm on [0,0.347] to generate 760: * a polynomial of degree 5 in r*r to approximate R1. The 761: * maximum error of this polynomial approximation is bounded 762: * by 2**-61. In other words, 763: * R1(z) ~ 1.0 + Q1*z + Q2*z**2 + Q3*z**3 + Q4*z**4 + Q5*z**5 764: * where Q1 = -1.6666666666666567384E-2, 765: * Q2 = 3.9682539681370365873E-4, 766: * Q3 = -9.9206344733435987357E-6, 767: * Q4 = 2.5051361420808517002E-7, 768: * Q5 = -6.2843505682382617102E-9; 769: * (where z=r*r, and the values of Q1 to Q5 are listed below) 770: * with error bounded by 771: * | 5