glibc/2.7/manual/arith.texi

    1: @node Arithmetic, Date and Time, Mathematics, Top
    2: @c %MENU% Low level arithmetic functions
    3: @chapter Arithmetic Functions
    4: 
    5: This chapter contains information about functions for doing basic
    6: arithmetic operations, such as splitting a float into its integer and
    7: fractional parts or retrieving the imaginary part of a complex value.
    8: These functions are declared in the header files @file{math.h} and
    9: @file{complex.h}.
   10: 
   11: @menu
   12: * Integers::                    Basic integer types and concepts
   13: * Integer Division::            Integer division with guaranteed rounding.
   14: * Floating Point Numbers::      Basic concepts.  IEEE 754.
   15: * Floating Point Classes::      The five kinds of floating-point number.
   16: * Floating Point Errors::       When something goes wrong in a calculation.
   17: * Rounding::                    Controlling how results are rounded.
   18: * Control Functions::           Saving and restoring the FPU's state.
   19: * Arithmetic Functions::        Fundamental operations provided by the library.
   20: * Complex Numbers::             The types.  Writing complex constants.
   21: * Operations on Complex::       Projection, conjugation, decomposition.
   22: * Parsing of Numbers::          Converting strings to numbers.
   23: * System V Number Conversion::  An archaic way to convert numbers to strings.
   24: @end menu
   25: 
   26: @node Integers
   27: @section Integers
   28: @cindex integer
   29: 
   30: The C language defines several integer data types: integer, short integer,
   31: long integer, and character, all in both signed and unsigned varieties.
   32: The GNU C compiler extends the language to contain long long integers
   33: as well.
   34: @cindex signedness
   35: 
   36: The C integer types were intended to allow code to be portable among
   37: machines with different inherent data sizes (word sizes), so each type
   38: may have different ranges on different machines.  The problem with
   39: this is that a program often needs to be written for a particular range
   40: of integers, and sometimes must be written for a particular size of
   41: storage, regardless of what machine the program runs on.
   42: 
   43: To address this problem, the GNU C library contains C type definitions
   44: you can use to declare integers that meet your exact needs.  Because the
   45: GNU C library header files are customized to a specific machine, your
   46: program source code doesn't have to be.
   47: 
   48: These @code{typedef}s are in @file{stdint.h}.
   49: @pindex stdint.h
   50: 
   51: If you require that an integer be represented in exactly N bits, use one
   52: of the following types, with the obvious mapping to bit size and signedness:
   53: 
   54: @itemize @bullet
   55: @item int8_t
   56: @item int16_t
   57: @item int32_t
   58: @item int64_t
   59: @item uint8_t
   60: @item uint16_t
   61: @item uint32_t
   62: @item uint64_t
   63: @end itemize
   64: 
   65: If your C compiler and target machine do not allow integers of a certain
   66: size, the corresponding type listed above does not exist.
   67: 
   68: If you don't need a specific storage size, but want the smallest data
   69: structure with @emph{at least} N bits, use one of these:
   70: 
   71: @itemize @bullet
   72: @item int_least8_t
   73: @item int_least16_t
   74: @item int_least32_t
   75: @item int_least64_t
   76: @item uint_least8_t
   77: @item uint_least16_t
   78: @item uint_least32_t
   79: @item uint_least64_t
   80: @end itemize
   81: 
   82: If you don't need a specific storage size, but want the data structure
   83: that allows the fastest access while having at least N bits (and
   84: among data structures with the same access speed, the smallest one), use
   85: one of these:
   86: 
   87: @itemize @bullet
   88: @item int_fast8_t
   89: @item int_fast16_t
   90: @item int_fast32_t
   91: @item int_fast64_t
   92: @item uint_fast8_t
   93: @item uint_fast16_t
   94: @item uint_fast32_t
   95: @item uint_fast64_t
   96: @end itemize
   97: 
   98: If you want an integer with the widest range possible on the platform on
   99: which it is being used, use one of the following.  If you use these,
  100: you should write code that takes into account the variable size and range
  101: of the integer.
  102: 
  103: @itemize @bullet
  104: @item intmax_t
  105: @item uintmax_t
  106: @end itemize
  107: 
  108: The GNU C library also provides macros that tell you the maximum and
  109: minimum possible values for each integer data type.  The macro names
  110: follow these examples: @code{INT32_MAX}, @code{UINT8_MAX},
  111: @code{INT_FAST32_MIN}, @code{INT_LEAST64_MIN}, @code{UINTMAX_MAX},
  112: @code{INTMAX_MAX}, @code{INTMAX_MIN}.  Note that there are no macros for
  113: unsigned integer minima.  These are always zero.
  114: @cindex maximum possible integer
  115: @cindex minimum possible integer
  116: 
  117: There are similar macros for use with C's built-in integer types, which
  118: should come with your C compiler.  These are described in @ref{Data Type
  119: Measurements}.
  120: 
  121: Don't forget you can use the C @code{sizeof} operator with any of these
  122: data types to get the number of bytes of storage each uses.
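
As a rough illustration of the types and limit macros described in this
section, the following sketch checks a few sizes with @code{sizeof} and
uses @code{UINT8_MAX} to clamp an addition.  The helper names
@code{bits_in}, @code{saturating_add_u8} and @code{check_integer_types}
are invented for this example, and the code assumes a target on which
@code{int32_t} and @code{int64_t} exist.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* the typedefs and limit macros */

/* Hypothetical helper: convert a byte count from sizeof into bits.  */
size_t
bits_in (size_t bytes)
{
  return bytes * CHAR_BIT;
}

/* Hypothetical helper: add two 8-bit values, clamping at UINT8_MAX
   instead of wrapping around.  */
uint8_t
saturating_add_u8 (uint8_t a, uint8_t b)
{
  unsigned int sum = (unsigned int) a + b;
  return (uint8_t) (sum > UINT8_MAX ? UINT8_MAX : sum);
}

void
check_integer_types (void)
{
  assert (bits_in (sizeof (int32_t)) == 32);        /* exactly 32 bits */
  assert (bits_in (sizeof (int_least16_t)) >= 16);  /* at least 16 bits */
  assert (bits_in (sizeof (int_fast16_t)) >= 16);   /* at least 16 bits */
  assert (sizeof (intmax_t) >= sizeof (int64_t));   /* widest type */
  assert (INT32_MAX == 2147483647);
  assert (saturating_add_u8 (200, 100) == UINT8_MAX);
}
```

Calling @code{check_integer_types} from a program runs the checks; only
the exact-width assertions are guaranteed to hold with equality, the
others state lower bounds.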
  123: 
  124: 
  125: @node Integer Division
  126: @section Integer Division
  127: @cindex integer division functions
  128: 
  129: This section describes functions for performing integer division.  These
  130: functions are redundant when GNU CC is used, because in GNU C the
  131: @samp{/} operator always rounds towards zero.  But in other C
  132: implementations, @samp{/} may round differently with negative arguments.
  133: @code{div} and @code{ldiv} are useful because they specify how to round
  134: the quotient: towards zero.  The remainder has the same sign as the
  135: numerator.
  136: 
  137: These functions are specified to return a result @var{r} such that the value
  138: @code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals
  139: @var{numerator}.
  140: 
  141: @pindex stdlib.h
  142: To use these facilities, you should include the header file
  143: @file{stdlib.h} in your program.
  144: 
  145: @comment stdlib.h
  146: @comment ISO
  147: @deftp {Data Type} div_t
  148: This is a structure type used to hold the result returned by the @code{div}
  149: function.  It has the following members:
  150: 
  151: @table @code
  152: @item int quot
  153: The quotient from the division.
  154: 
  155: @item int rem
  156: The remainder from the division.
  157: @end table
  158: @end deftp
  159: 
  160: @comment stdlib.h
  161: @comment ISO
  162: @deftypefun div_t div (int @var{numerator}, int @var{denominator})
  163: The @code{div} function computes the quotient and remainder from
  164: the division of @var{numerator} by @var{denominator}, returning the
  165: result in a structure of type @code{div_t}.
  166: 
  167: If the result cannot be represented (as in a division by zero), the
  168: behavior is undefined.
  169: 
  170: Here is an example, albeit not a very useful one.
  171: 
  172: @smallexample
  173: div_t result;
  174: result = div (20, -6);
  175: @end smallexample
  176: 
  177: @noindent
  178: Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}.
  179: @end deftypefun
  180: 
  181: @comment stdlib.h
  182: @comment ISO
  183: @deftp {Data Type} ldiv_t
  184: This is a structure type used to hold the result returned by the @code{ldiv}
  185: function.  It has the following members:
  186: 
  187: @table @code
  188: @item long int quot
  189: The quotient from the division.
  190: 
  191: @item long int rem
  192: The remainder from the division.
  193: @end table
  194: 
  195: (This is identical to @code{div_t} except that the components are of
  196: type @code{long int} rather than @code{int}.)
  197: @end deftp
  198: 
  199: @comment stdlib.h
  200: @comment ISO
  201: @deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator})
  202: The @code{ldiv} function is similar to @code{div}, except that the
  203: arguments are of type @code{long int} and the result is returned as a
  204: structure of type @code{ldiv_t}.
  205: @end deftypefun
  206: 
  207: @comment stdlib.h
  208: @comment ISO
  209: @deftp {Data Type} lldiv_t
  210: This is a structure type used to hold the result returned by the @code{lldiv}
  211: function.  It has the following members:
  212: 
  213: @table @code
  214: @item long long int quot
  215: The quotient from the division.
  216: 
  217: @item long long int rem
  218: The remainder from the division.
  219: @end table
  220: 
  221: (This is identical to @code{div_t} except that the components are of
  222: type @code{long long int} rather than @code{int}.)
  223: @end deftp
  224: 
  225: @comment stdlib.h
  226: @comment ISO
  227: @deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator})
  228: The @code{lldiv} function is like the @code{div} function, but the
  229: arguments are of type @code{long long int} and the result is returned as
  230: a structure of type @code{lldiv_t}.
  231: 
  232: The @code{lldiv} function was added in @w{ISO C99}.
  233: @end deftypefun
  234: 
  235: @comment inttypes.h
  236: @comment ISO
  237: @deftp {Data Type} imaxdiv_t
  238: This is a structure type used to hold the result returned by the @code{imaxdiv}
  239: function.  It has the following members:
  240: 
  241: @table @code
  242: @item intmax_t quot
  243: The quotient from the division.
  244: 
  245: @item intmax_t rem
  246: The remainder from the division.
  247: @end table
  248: 
  249: (This is identical to @code{div_t} except that the components are of
  250: type @code{intmax_t} rather than @code{int}.)
  251: 
  252: See @ref{Integers} for a description of the @code{intmax_t} type.
  253: 
  254: @end deftp
  255: 
  256: @comment inttypes.h
  257: @comment ISO
  258: @deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator})
  259: The @code{imaxdiv} function is like the @code{div} function, but the
  260: arguments are of type @code{intmax_t} and the result is returned as
  261: a structure of type @code{imaxdiv_t}.
  262: 
  263: See @ref{Integers} for a description of the @code{intmax_t} type.
  264: 
  265: The @code{imaxdiv} function was added in @w{ISO C99}.
  266: @end deftypefun
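
The rounding rule and the identity
@code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals
@var{numerator} can be checked with a short sketch such as the one
below.  The function names @code{div_identity_holds} and
@code{check_division} are hypothetical, and the operands are chosen so
that every quotient is representable.

```c
#include <assert.h>
#include <inttypes.h>  /* imaxdiv, imaxdiv_t */
#include <stdlib.h>    /* div, ldiv, lldiv */

/* Hypothetical helper: return nonzero if the result of div satisfies
   quot * denominator + rem == numerator.  The denominator must be
   nonzero and the quotient representable.  */
int
div_identity_holds (int numerator, int denominator)
{
  div_t r = div (numerator, denominator);
  return r.quot * denominator + r.rem == numerator;
}

/* Run the four division functions on the same operands.  */
void
check_division (void)
{
  div_t d = div (-20, 6);
  ldiv_t l = ldiv (-20L, 6L);
  lldiv_t ll = lldiv (-20LL, 6LL);
  imaxdiv_t m = imaxdiv (-20, 6);

  /* The quotient is rounded toward zero and the remainder takes the
     sign of the numerator.  */
  assert (d.quot == -3 && d.rem == -2);
  assert (l.quot == -3L && l.rem == -2L);
  assert (ll.quot == -3LL && ll.rem == -2LL);
  assert (m.quot == -3 && m.rem == -2);
}
```

The four result types differ only in the width of their members, so the
same assertions apply to each of them.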
  267: 
  268: 
  269: @node Floating Point Numbers
  270: @section Floating Point Numbers
  271: @cindex floating point
  272: @cindex IEEE 754
  273: @cindex IEEE floating point
  274: 
  275: Most computer hardware has support for two different kinds of numbers:
  276: integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and
  277: floating-point numbers.  Floating-point numbers have three parts: the
  278: @dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}.  The real
  279: number represented by a floating-point value is given by
  280: @tex
  281: $(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$
  282: @end tex
  283: @ifnottex
  284: @math{(s ? -1 : 1) @mul{} 2^e @mul{} M}
  285: @end ifnottex
  286: where @math{s} is the sign bit, @math{e} the exponent, and @math{M}
  287: the mantissa.  @xref{Floating Point Concepts}, for details.  (It is
  288: possible to have a different @dfn{base} for the exponent, but all modern
  289: hardware uses @math{2}.)
  290: 
  291: Floating-point numbers can represent a finite subset of the real
  292: numbers.  While this subset is large enough for most purposes, it is
  293: important to remember that the only reals that can be represented
  294: exactly are rational numbers that have a terminating binary expansion
  295: shorter than the width of the mantissa.  Even simple fractions such as
  296: @math{1/5} can only be approximated by floating point.
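
A small sketch can make the point about approximation concrete.  It
assumes that @code{double} is the @w{IEEE 754} 64-bit binary format; the
helper name @code{sum_equals} is invented for this example.  The values
0.5, 0.25 and 0.75 have short terminating binary expansions, whereas
0.1, 0.2 and 0.3 do not and are each rounded when stored.

```c
#include <assert.h>

/* Hypothetical helper: return nonzero if a + b compares equal to
   expected.  The operands go through volatile objects so that the sum
   is computed at run time and rounded to double.  */
int
sum_equals (double a, double b, double expected)
{
  volatile double x = a, y = b;
  volatile double sum = x + y;
  return sum == expected;
}

void
check_representation (void)
{
  /* Exactly representable operands give an exact sum.  */
  assert (sum_equals (0.5, 0.25, 0.75));
  /* Rounded operands give a sum that misses the nearest double to 0.3.  */
  assert (!sum_equals (0.1, 0.2, 0.3));
}
```

The second assertion is the reason comparisons of computed
floating-point values usually allow for a small tolerance.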
  297: 
  298: Mathematical operations and functions frequently need to produce values
  299: that are not representable.  Often these values can be approximated
  300: closely enough for practical purposes, but sometimes they can't.
  301: Historically there was no way to tell when the results of a calculation
  302: were inaccurate.  Modern computers implement the @w{IEEE 754} standard
  303: for numerical computations, which defines a framework for indicating to
  304: the program when the results of calculation are not trustworthy.  This
  305: framework consists of a set of @dfn{exceptions} that indicate why a
  306: result could not be represented, and the special values @dfn{infinity}
  307: and @dfn{not a number} (NaN).
  308: 
  309: @node Floating Point Classes
  310: @section Floating-Point Number Classification Functions
  311: @cindex floating-point classes
  312: @cindex classes, floating-point
  313: @pindex math.h
  314: 
  315: @w{ISO C99} defines macros that let you determine what sort of
  316: floating-point number a variable holds.
  317: 
  318: @comment math.h
  319: @comment ISO
  320: @deftypefn {Macro} int fpclassify (@emph{float-type} @var{x})
  321: This is a generic macro which works on all floating-point types and
  322: which returns a value of type @code{int}.  The possible values are:
  323: 
  324: @vtable @code
  325: @item FP_NAN
  326: The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity
  327: and NaN})
  328: @item FP_INFINITE
  329: The value of @var{x} is either plus or minus infinity (@pxref{Infinity
  330: and NaN})
  331: @item FP_ZERO
  332: The value of @var{x} is zero.  In floating-point formats like @w{IEEE
  333: 754}, where zero can be signed, this value is also returned if
  334: @var{x} is negative zero.
  335: @item FP_SUBNORMAL
  336: Numbers whose absolute value is too small to be represented in the
  337: normal format are represented in an alternate, @dfn{denormalized} format
  338: (@pxref{Floating Point Concepts}).  This format is less precise but can
  339: represent values closer to zero.  @code{fpclassify} returns this value
  340: for values of @var{x} in this alternate format.
  341: @item FP_NORMAL
  342: This value is returned for all other values of @var{x}.  It indicates
  343: that there is nothing special about the number.
  344: @end vtable
  345: 
  346: @end deftypefn
  347: 
  348: @code{fpclassify} is most useful if more than one property of a number
  349: must be tested.  There are more specific macros which only test one
  350: property at a time.  Generally these macros execute faster than
  351: @code{fpclassify}, since there is special hardware support for them.
  352: You should therefore use the specific macros whenever possible.
  353: 
  354: @comment math.h
  355: @comment ISO
  356: @deftypefn {Macro} int isfinite (@emph{float-type} @var{x})
  357: This macro returns a nonzero value if @var{x} is finite: not plus or
  358: minus infinity, and not NaN.  It is equivalent to
  359: 
  360: @smallexample
  361: (fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE)
  362: @end smallexample
  363: 
  364: @code{isfinite} is implemented as a macro which accepts any
  365: floating-point type.
  366: @end deftypefn
  367: 
  368: @comment math.h
  369: @comment ISO
  370: @deftypefn {Macro} int isnormal (@emph{float-type} @var{x})
  371: This macro returns a nonzero value if @var{x} is finite and normalized.
  372: It is equivalent to
  373: 
  374: @smallexample
  375: (fpclassify (x) == FP_NORMAL)
  376: @end smallexample
  377: @end deftypefn
  378: 
  379: @comment math.h
  380: @comment ISO
  381: @deftypefn {Macro} int isnan (@emph{float-type} @var{x})
  382: This macro returns a nonzero value if @var{x} is NaN.  It is equivalent
  383: to
  384: 
  385: @smallexample
  386: (fpclassify (x) == FP_NAN)
  387: @end smallexample
  388: @end deftypefn
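
The following sketch shows how the classification macros might be used
together.  The functions @code{fp_class_name} and
@code{check_classification} are hypothetical names, and the example
assumes @w{IEEE 754} arithmetic in which @code{DBL_MIN / 2} is a
denormalized number and the macro @code{NAN} is available.

```c
#include <assert.h>
#include <float.h>   /* DBL_MIN */
#include <math.h>    /* fpclassify, isfinite, isnormal, isnan */
#include <string.h>  /* strcmp */

/* Hypothetical helper: name the class that fpclassify reports.  */
const char *
fp_class_name (double x)
{
  switch (fpclassify (x))
    {
    case FP_NAN:       return "nan";
    case FP_INFINITE:  return "infinite";
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "subnormal";
    case FP_NORMAL:    return "normal";
    default:           return "unknown";
    }
}

void
check_classification (void)
{
  assert (strcmp (fp_class_name (-0.0), "zero") == 0);
  assert (strcmp (fp_class_name (DBL_MIN / 2), "subnormal") == 0);
  assert (strcmp (fp_class_name (-INFINITY), "infinite") == 0);
  /* A denormalized number is finite but not normal.  */
  assert (isfinite (DBL_MIN / 2) && !isnormal (DBL_MIN / 2));
  /* NaN is neither finite nor normal.  */
  assert (isnan (NAN) && !isfinite (NAN) && !isnormal (NAN));
}
```

When only one property matters, calling the specific macro directly, as
in the last two assertions, avoids the @code{switch}.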
  389: 
  390: Another set of floating-point classification functions was provided by
  391: BSD.  The GNU C library also supports these functions; however, we
  392: recommend that you use the ISO C99 macros in new code.  Those are standard
  393: and will be available more widely.  Also, since they are macros, you do
  394: not have to worry about the type of their argument.
  395: 
  396: @comment math.h
  397: @comment BSD
  398: @deftypefun int isinf (double @var{x})
  399: @comment math.h
  400: @comment BSD
  401: @deftypefunx int isinff (float @var{x})
  402: @comment math.h
  403: @comment BSD
  404: @deftypefunx int isinfl (long double @var{x})
  405: This function returns @code{-1} if @var{x} represents negative infinity,
  406: @code{1} if @var{x} represents positive infinity, and @code{0} otherwise.
  407: @end deftypefun
  408: 
  409: @comment math.h
  410: @comment BSD
  411: @deftypefun int isnan (double @var{x})
  412: @comment math.h
  413: @comment BSD
  414: @deftypefunx int isnanf (float @var{x})
  415: @comment math.h
  416: @comment BSD
  417: @deftypefunx int isnanl (long double @var{x})
  418: This function returns a nonzero value if @var{x} is a ``not a number''
  419: value, and zero otherwise.
  420: 
  421: @strong{Note:} The @code{isnan} macro defined by @w{ISO C99} overrides
  422: the BSD function.  This is normally not a problem, because the two
  423: routines behave identically.  However, if you really need to get the BSD
  424: function for some reason, you can write
  425: 
  426: @smallexample
  427: (isnan) (x)
  428: @end smallexample
  429: @end deftypefun
  430: 
  431: @comment math.h
  432: @comment BSD
  433: @deftypefun int finite (double @var{x})
  434: @comment math.h
  435: @comment BSD
  436: @deftypefunx int finitef (float @var{x})
  437: @comment math.h
  438: @comment BSD
  439: @deftypefunx int finitel (long double @var{x})
  440: This function returns a nonzero value if @var{x} is finite, that is,
  441: neither infinite nor a ``not a number'' value, and zero otherwise.
  442: @end deftypefun
  443: 
  444: @strong{Portability Note:} The @code{isinf}, @code{isnan}, and
  445: @code{finite} functions and their variants listed above are BSD extensions.
  446: 
  447: 
  448: @node Floating Point Errors
  449: @section Errors in Floating-Point Calculations
  450: 
  451: @menu
  452: * FP Exceptions::               IEEE 754 math exceptions and how to detect them.
  453: * Infinity and NaN::            Special values returned by calculations.
  454: * Status bit operations::       Checking for exceptions after the fact.
  455: * Math Error Reporting::        How the math functions report errors.
  456: @end menu
  457: 
  458: @node FP Exceptions
  459: @subsection FP Exceptions
  460: @cindex exception
  461: @cindex signal
  462: @cindex zero divide
  463: @cindex division by zero
  464: @cindex inexact exception
  465: @cindex invalid exception
  466: @cindex overflow exception
  467: @cindex underflow exception
  468: 
  469: The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur
  470: during a calculation.  Each corresponds to a particular sort of error,
  471: such as overflow.
  472: 
  473: When exceptions occur (when exceptions are @dfn{raised}, in the language
  474: of the standard), one of two things can happen.  By default the
  475: exception is simply noted in the floating-point @dfn{status word}, and
  476: the program continues as if nothing had happened.  The operation
  477: produces a default value, which depends on the exception (see the table
  478: below).  Your program can check the status word to find out which
  479: exceptions happened.
  480: 
  481: Alternatively, you can enable @dfn{traps} for exceptions.  In that case,
  482: when an exception is raised, your program will receive the @code{SIGFPE}
  483: signal.  The default action for this signal is to terminate the
  484: program.  @xref{Signal Handling}, for how you can change the effect of
  485: the signal.
  486: 
  487: @findex matherr
  488: In the System V math library, the user-defined function @code{matherr}
  489: is called when certain exceptions occur inside math library functions.
  490: However, the Unix98 standard deprecates this interface.  We support it
  491: for historical compatibility, but recommend that you do not use it in
  492: new programs.
  493: 
  494: @noindent
  495: The exceptions defined in @w{IEEE 754} are:
  496: 
  497: @table @samp
  498: @item Invalid Operation
  499: This exception is raised if the given operands are invalid for the
  500: operation to be performed.  Examples are
  501: (see @w{IEEE 754}, @w{section 7}):
  502: @enumerate
  503: @item
  504: Addition or subtraction: @math{@infinity{} - @infinity{}}.  (But
  505: @math{@infinity{} + @infinity{} = @infinity{}}).
  506: @item
  507: Multiplication: @math{0 @mul{} @infinity{}}.
  508: @item
  509: Division: @math{0/0} or @math{@infinity{}/@infinity{}}.
  510: @item
  511: Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is
  512: infinite.
  513: @item
  514: Square root if the operand is less than zero.  More generally, any
  515: mathematical function evaluated outside its domain produces this
  516: exception.
  517: @item
  518: Conversion of a floating-point number to an integer or decimal
  519: string, when the number cannot be represented in the target format (due
  520: to overflow, infinity, or NaN).
  521: @item
  522: Conversion of an unrecognizable input string.
  523: @item
  524: Comparison via predicates involving @math{<} or @math{>}, when one or
  525: other of the operands is NaN.  You can prevent this exception by using
  526: the unordered comparison functions instead; see @ref{FP Comparison Functions}.
  527: @end enumerate
  528: 
  529: If the exception does not trap, the result of the operation is NaN.
  530: 
  531: @item Division by Zero
  532: This exception is raised when a finite nonzero number is divided
  533: by zero.  If no trap occurs the result is either @math{+@infinity{}} or
  534: @math{-@infinity{}}, depending on the signs of the operands.
  535: 
  536: @item Overflow
  537: This exception is raised whenever the result cannot be represented
  538: as a finite value in the precision format of the destination.  If no trap
  539: occurs the result depends on the sign of the intermediate result and the
  540: current rounding mode (@w{IEEE 754}, @w{section 7.3}):
  541: @enumerate
  542: @item
  543: Round to nearest carries all overflows to @math{@infinity{}}
  544: with the sign of the intermediate result.
  545: @item
  546: Round toward @math{0} carries all overflows to the largest representable
  547: finite number with the sign of the intermediate result.
  548: @item
  549: Round toward @math{-@infinity{}} carries positive overflows to the
  550: largest representable finite number and negative overflows to
  551: @math{-@infinity{}}.
  552: 
  553: @item
  554: Round toward @math{@infinity{}} carries negative overflows to the
  555: most negative representable finite number and positive overflows
  556: to @math{@infinity{}}.
  557: @end enumerate
  558: 
  559: Whenever the overflow exception is raised, the inexact exception is also
  560: raised.
  561: 
  562: @item Underflow
  563: The underflow exception is raised when an intermediate result is too
  564: small to be calculated accurately, or if the operation's result rounded
  565: to the destination precision is too small to be normalized.
  566: 
  567: When no trap is installed for the underflow exception, underflow is
  568: signaled (via the underflow flag) only when both tininess and loss of
  569: accuracy have been detected.  If no trap handler is installed the
  570: operation continues with an imprecise small value, or zero if the
  571: destination precision cannot hold the small exact result.
  572: 
  573: @item Inexact
  574: This exception is signaled if a rounded result is not exact (such as
  575: when calculating the square root of two) or a result overflows without
  576: an overflow trap.
  577: @end table
  578: 
  579: @node Infinity and NaN
  580: @subsection Infinity and NaN
  581: @cindex infinity
  582: @cindex not a number
  583: @cindex NaN
  584: 
  585: @w{IEEE 754} floating point numbers can represent positive or negative
  586: infinity, and @dfn{NaN} (not a number).  These three values arise from
  587: calculations whose result is undefined or cannot be represented
  588: accurately.  You can also deliberately set a floating-point variable to
  589: any of them, which is sometimes useful.  Some examples of calculations
  590: that produce infinity or NaN:
  591: 
  592: @ifnottex
  593: @smallexample
  594: @math{1/0 = @infinity{}}
  595: @math{log (0) = -@infinity{}}
  596: @math{sqrt (-1) = NaN}
  597: @end smallexample
  598: @end ifnottex
  599: @tex
  600: $${1\over0} = \infty$$
  601: $$\log 0 = -\infty$$
  602: $$\sqrt{-1} = \hbox{NaN}$$
  603: @end tex
  604: 
  605: When a calculation produces any of these values, an exception also
  606: occurs; see @ref{FP Exceptions}.
  607: 
  608: The basic operations and math functions all accept infinity and NaN and
  609: produce sensible output.  Infinities propagate through calculations as
  610: one would expect: for example, @math{2 + @infinity{} = @infinity{}},
  611: @math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}.  NaN, on
  612: the other hand, infects any calculation that involves it.  Unless the
  613: calculation would produce the same result no matter what real value
  614: replaced NaN, the result is NaN.
  615: 
  616: In comparison operations, positive infinity is larger than all values
  617: except itself and NaN, and negative infinity is smaller than all values
  618: except itself and NaN.  NaN is @dfn{unordered}: it is not equal to,
  619: greater than, or less than anything, @emph{including itself}. @code{x ==
  620: x} is false if the value of @code{x} is NaN.  You can use this to test
  621: whether a value is NaN or not, but the recommended way to test for NaN
  622: is with the @code{isnan} function (@pxref{Floating Point Classes}).  In
  623: addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an
  624: exception when applied to NaNs.
  625: 
  626: @file{math.h} defines macros that allow you to explicitly set a variable
  627: to infinity or NaN.
  628: 
  629: @comment math.h
  630: @comment ISO
  631: @deftypevr Macro float INFINITY
  632: An expression representing positive infinity.  It is equal to the value
  633: produced by mathematical operations like @code{1.0 / 0.0}.
  634: @code{-INFINITY} represents negative infinity.
  635: 
  636: You can test whether a floating-point value is infinite by comparing it
  637: to this macro.  However, this is not recommended; you should use the
  638: @code{isfinite} macro instead.  @xref{Floating Point Classes}.
  639: 
  640: This macro was introduced in the @w{ISO C99} standard.
  641: @end deftypevr
  642: 
  643: @comment math.h
  644: @comment GNU
  645: @deftypevr Macro float NAN
  646: An expression representing a value which is ``not a number''.  This
  647: macro is a GNU extension, available only on machines that support the
  648: ``not a number'' value---that is to say, on all machines that support
  649: IEEE floating point.
  650: 
  651: You can use @samp{#ifdef NAN} to test whether the machine supports
  652: NaN.  (Of course, you must arrange for GNU extensions to be visible,
  653: such as by defining @code{_GNU_SOURCE}, and then you must include
  654: @file{math.h}.)
  655: @end deftypevr
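
The comparison and propagation rules for infinity and NaN can be
exercised with the two macros as sketched below.  The names
@code{is_nan_by_comparison} and @code{check_infinity_and_nan} are
hypothetical, and the example assumes @w{IEEE 754} arithmetic with
@code{NAN} defined.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* the text above asks for this to expose NAN */
#endif
#include <assert.h>
#include <float.h>   /* DBL_MAX */
#include <math.h>    /* INFINITY, NAN, isnan */

/* Hypothetical helper: detect NaN using only the rule that NaN is
   not equal to itself.  */
int
is_nan_by_comparison (double x)
{
  volatile double v = x;   /* force a run-time comparison */
  return v != v;
}

void
check_infinity_and_nan (void)
{
  double inf = INFINITY;
  double not_a_number = NAN;

  /* Infinities are ordered beyond every finite value.  */
  assert (inf > DBL_MAX && -inf < -DBL_MAX);
  assert (inf == inf);
  /* Infinities propagate as one would expect.  */
  assert (2.0 + inf == inf && 4.0 / inf == 0.0);
  /* NaN is unordered, even against itself, and infects arithmetic.  */
  assert (!(not_a_number == not_a_number));
  assert (isnan (not_a_number + 1.0));
  assert (is_nan_by_comparison (not_a_number));
  assert (!is_nan_by_comparison (inf));
}
```

The self-comparison helper is shown only to illustrate the rule;
@code{isnan} remains the recommended test.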
  656: 
  657: @w{IEEE 754} also allows for another unusual value: negative zero.  This
  658: value is produced when you divide a positive number by negative
  659: infinity, or when a negative result is smaller than the limits of
  660: representation.  Negative zero compares equal to zero and behaves like it
  661: in most calculations.  The difference shows when you test the sign bit with
  662: @code{signbit} or @code{copysign}, or when you divide by it (@math{1/-0 = -@infinity{}}).
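
A minimal sketch of how negative zero arises and how it can be told
apart from positive zero follows.  It assumes @w{IEEE 754} arithmetic;
the helper names @code{divide} and @code{check_negative_zero} are
invented for this example.

```c
#include <assert.h>
#include <math.h>  /* signbit, INFINITY */

/* Hypothetical helper: divide at run time so that the sign of a zero
   result comes from the hardware rather than from the compiler.  */
double
divide (double a, double b)
{
  volatile double x = a, y = b;
  return x / y;
}

void
check_negative_zero (void)
{
  double neg_zero = divide (1.0, -INFINITY);
  double pos_zero = divide (1.0, INFINITY);

  assert (neg_zero == 0.0);       /* compares equal to zero */
  assert (neg_zero == pos_zero);  /* and to positive zero */
  assert (signbit (neg_zero));    /* but the sign bit is set */
  assert (!signbit (pos_zero));
}
```

An ordinary comparison cannot distinguish the two zeros, which is why
the check relies on @code{signbit}.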
  663: 
  664: @node Status bit operations
  665: @subsection Examining the FPU status word
  666: 
  667: @w{ISO C99} defines functions to query and manipulate the
  668: floating-point status word.  You can use these functions to check for
  669: untrapped exceptions when it's convenient, rather than worrying about
  670: them in the middle of a calculation.
  671: 
  672: These constants represent the various @w{IEEE 754} exceptions.  Not all
  673: FPUs report all the different exceptions.  Each constant is defined if
  674: and only if the FPU you are compiling for supports that exception, so
  675: you can test for FPU support with @samp{#ifdef}.  They are defined in
  676: @file{fenv.h}.
  677: 
  678: @vtable @code
  679: @comment fenv.h
  680: @comment ISO
  681: @item FE_INEXACT
  682:  The inexact exception.
  683: @comment fenv.h
  684: @comment ISO
  685: @item FE_DIVBYZERO
  686:  The divide by zero exception.
  687: @comment fenv.h
  688: @comment ISO
  689: @item FE_UNDERFLOW
  690:  The underflow exception.
  691: @comment fenv.h
  692: @comment ISO
  693: @item FE_OVERFLOW
  694:  The overflow exception.
  695: @comment fenv.h
  696: @comment ISO
  697: @item FE_INVALID
  698:  The invalid exception.
  699: @end vtable
  700: 
  701: The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros
  702: which are supported by the FP implementation.
  703: 
  704: These functions allow you to clear exception flags, test for exceptions,
  705: and save and restore the set of exceptions flagged.
  706: 
  707: @comment fenv.h
  708: @comment ISO
  709: @deftypefun int feclearexcept (int @var{excepts})
  710: This function clears all of the supported exception flags indicated by
  711: @var{excepts}.
  712: 
  713: The function returns zero in case the operation was successful, a
  714: non-zero value otherwise.
  715: @end deftypefun
  716: 
  717: @comment fenv.h
  718: @comment ISO
  719: @deftypefun int feraiseexcept (int @var{excepts})
  720: This function raises the supported exceptions indicated by
  721: @var{excepts}.  If more than one exception bit in @var{excepts} is set
  722: the order in which the exceptions are raised is undefined except that
  723: overflow (@code{FE_OVERFLOW}) or underflow (@code{FE_UNDERFLOW}) are
  724: raised before inexact (@code{FE_INEXACT}).  Whether the inexact
  725: exception is also raised for overflow or underflow is also
  726: implementation dependent.
  727: 
  728: The function returns zero in case the operation was successful, a
  729: non-zero value otherwise.
  730: @end deftypefun
  731: 
  732: @comment fenv.h
  733: @comment ISO
  734: @deftypefun int fetestexcept (int @var{excepts})
  735: Test whether the exception flags indicated by the parameter @var{excepts}
  736: are currently set.  If any of them are, a nonzero value is returned
  737: which specifies which exceptions are set.  Otherwise the result is zero.
  738: @end deftypefun
  739: 
  740: To understand these functions, imagine that the status word is an
  741: integer variable named @var{status}.  @code{feclearexcept} is then
  742: equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is
  743: equivalent to @samp{(status & excepts)}.  The actual implementation may
  744: be very different, of course.
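
The mental model just described can be written down as a toy program.
Everything in it is an assumption made for illustration: the
@code{MODEL_} flag values, the variable @code{status}, and the functions
@code{model_clear} and @code{model_test} stand in for the real flags,
the FPU status word, @code{feclearexcept} and @code{fetestexcept}.  The
function @code{model_raise} is an extra stand-in for an operation that
sets flags.

```c
#include <assert.h>

/* Toy model only: the real status word lives in the FPU, and these
   flag values are invented for the sketch.  */
enum
{
  MODEL_INVALID   = 1 << 0,
  MODEL_DIVBYZERO = 1 << 1,
  MODEL_OVERFLOW  = 1 << 2,
  MODEL_UNDERFLOW = 1 << 3,
  MODEL_INEXACT   = 1 << 4
};

static int status;  /* the imagined status word */

/* Stand-in for an operation that raises exceptions.  */
void
model_raise (int excepts)
{
  status |= excepts;
}

/* Counterpart of feclearexcept in the model: status &= ~excepts.  */
void
model_clear (int excepts)
{
  status &= ~excepts;
}

/* Counterpart of fetestexcept in the model: status & excepts.  */
int
model_test (int excepts)
{
  return status & excepts;
}
```

In this model, raising overflow and inexact and then testing for
overflow or invalid returns only the overflow bit, and the inexact bit
stays set until it is cleared explicitly.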
  745: 
  746: Exception flags are only cleared when the program explicitly requests it,
  747: by calling @code{feclearexcept}.  If you want to check for exceptions
  748: from a set of calculations, you should clear all the flags first.  Here
  749: is a simple example of the way to use @code{fetestexcept}:
  750: 
  751: @smallexample
  752: @{
  753:   double f;
  754:   int raised;
  755:   feclearexcept (FE_ALL_EXCEPT);
  756:   f = compute ();
  757:   raised = fetestexcept (FE_OVERFLOW | FE_INVALID);
  758:   if (raised & FE_OVERFLOW) @{ /* @dots{} */ @}
  759:   if (raised & FE_INVALID) @{ /* @dots{} */ @}
  760:   /* @dots{} */
  761: @}
  762: @end smallexample
  763: 
  764: You cannot explicitly set bits in the status word.  You can, however,
  765: save the entire status word and restore it later.  This is done with the
  766: following functions:
  767: 
  768: @comment fenv.h
  769: @comment ISO
  770: @deftypefun int fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts})
  771: This function stores in the variable pointed to by @var{flagp} an
  772: implementation-defined value representing the current setting of the
  773: exception flags indicated by @var{excepts}.
  774: 
  775: The function returns zero in case the operation was successful, a
  776: non-zero value otherwise.
  777: @end deftypefun
  778: 
  779: