.\" Copyright (c) 1985 Regents of the University of California.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\" This product includes software developed by the University of
.\" California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" from: @(#)ieee.3 6.4 (Berkeley) 5/6/91
.\" $FreeBSD: src/lib/msun/man/ieee.3,v 1.22 2005/06/16 21:55:45 ru Exp $
.Sh NAME
.Nm ieee
.Nd IEEE standard 754 for floating-point arithmetic
.Sh DESCRIPTION
The IEEE Standard 754 for Binary Floating-Point Arithmetic
defines representations of floating-point numbers and abstract
properties of arithmetic operations relating to precision,
rounding, and exceptional cases, as described below.
.Ss IEEE STANDARD 754 Floating-Point Arithmetic
Overflow and underflow:
.Bd -ragged -offset indent -compact
Overflow goes by default to a signed \*(If.
Underflow is
.Em gradual .
.Ed
.Pp
Zero is represented ambiguously as +0 or \-0.
.Bd -ragged -offset indent -compact
Its sign transforms correctly through multiplication or
division, and is preserved by addition of zeros
with like signs; but x\-x yields +0 for every
finite x.
The only operations that reveal zero's
sign are division by zero and
.Fn copysign x \(+-0 .
In particular, comparison (x > y, x \(>= y, etc.)\&
cannot be affected by the sign of zero; but if
finite x = y then \*(If = 1/(x\-y) \(!= \-1/(y\-x) = \-\*(If.
.Ed
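.Pp
As a hedged illustration of the rules for zero above,
the following C sketch observes the sign of zero with the standard
.Fn signbit
macro and through division.
The helper names
.Fn is_negative_zero
and
.Fn reciprocal_of_difference
are hypothetical, introduced only for this example,
which assumes the default rounding mode and a compiler that
preserves IEEE semantics (no fast-math options).
.Bd -literal -offset indent
```c
#include <math.h>

/* Returns 1 if z is a zero with its sign bit set (that is, -0), else 0. */
int is_negative_zero(double z)
{
    return z == 0.0 && signbit(z) != 0;
}

/*
 * Computes 1/(x-y).  The volatile temporary keeps the subtraction and
 * the division at run time instead of letting the compiler fold them.
 */
double reciprocal_of_difference(double x, double y)
{
    volatile double d = x - y;
    return 1.0 / d;
}
```
.Ed
With these definitions, +0 and \-0 compare equal, yet
.Fn reciprocal_of_difference 2.0 2.0
is +\*(If because 2.0\-2.0 is +0, and negating it gives \-\*(If.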
.Pp
\*(If is signed.
.Bd -ragged -offset indent -compact
It persists when added to itself
or to any finite number.
Its sign transforms
correctly through multiplication and division, and
(finite)/\(+-\*(If\0=\0\(+-0,
(nonzero)/0 = \(+-\*(If.
But
\*(If\-\*(If, \*(If\(**0 and \*(If/\*(If
are, like 0/0 and sqrt(\-3),
invalid operations that produce \*(Na. ...
.Ed
.Pp
Reserved operands (\*(Nas):
.Bd -ragged -offset indent -compact
An \*(Na is
.Em ( N Ns ot Em a N Ns umber ) .
Some \*(Nas, called Signaling \*(Nas, trap any floating-point operation
performed upon them; they are used to mark missing
or uninitialized values, or nonexistent elements
of arrays.
The rest are Quiet \*(Nas; they are
the default results of Invalid Operations, and
propagate through subsequent arithmetic operations.
If x \(!= x then x is \*(Na; every other predicate
(x > y, x = y, x < y, ...) is FALSE if \*(Na is involved.
.Ed
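.Pp
The behavior of \*(If and \*(Na described above can be checked with a
short C sketch.
The function names below are hypothetical and chosen only for
illustration;
.Dv INFINITY
and
.Dv NAN
are the C99 macros from
.In math.h .
The sketch assumes the compiler is not run with fast-math options,
which may discard \*(Na semantics.
.Bd -literal -offset indent
```c
#include <math.h>

/* The self-comparison test from the text: true only for a NaN. */
int is_nan_by_self_compare(double x)
{
    volatile double v = x;  /* volatile: keep the comparison at run time */
    return v != v;
}

/* An invalid operation (infinity minus infinity) whose default result is NaN. */
double invalid_result(void)
{
    volatile double inf = INFINITY;
    return inf - inf;
}

/* Counts how many of the ordered predicates hold between a and b. */
int true_predicates(double a, double b)
{
    return (a > b) + (a == b) + (a < b) + (a >= b) + (a <= b);
}
```
.Ed
Here
.Fn true_predicates
returns 0 whenever either argument is \*(Na,
while x \(!= x is the one predicate that a \*(Na satisfies.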
.Pp
Rounding:
.Bd -ragged -offset indent -compact
Every algebraic operation (+, \-, \(**, /,
\(sr)
is rounded by default to within half an
.Em ulp ,
and when the rounding error is exactly half an
.Em ulp
then
the rounded value's least significant bit is zero.
(An
.Em ulp
is one
.Em U Ns nit
in the
.Em L Ns ast
.Em P Ns lace . )
This kind of rounding is usually the best kind,
sometimes provably so; for instance, for every
x = 1.0, 2.0, 3.0, 4.0, ..., 2.0**52, we find
(x/3.0)\(**3.0 == x and (x/10.0)\(**10.0 == x and ...
even though both the quotients and the products
have been rounded.
Only rounding like IEEE 754 can do that.
But no single kind of rounding can be
proved best for every circumstance, so IEEE 754
provides rounding towards zero or towards
+\*(If or towards \-\*(If
at the programmer's option.
.Ed
.Pp
Exceptions:
.Bd -ragged -offset indent -compact
IEEE 754 recognizes five kinds of floating-point exceptions,
listed below in declining order of probable importance.
.Bl -column -offset indent "Invalid Operation" "Gradual Underflow"
.It Em Exception Ta Em "Default Result"
.It Invalid Operation Ta \*(Na, or FALSE
.It Overflow Ta \(+-\*(If
.It Divide by Zero Ta \(+-\*(If
.It Underflow Ta Gradual Underflow
.It Inexact Ta Rounded value
.El
.Pp
NOTE: An Exception is not an Error unless handled
badly.
What makes a class of exceptions exceptional
is that no single default response can be satisfactory
in every instance.
On the other hand, if a default
response will serve most instances satisfactorily,
the unsatisfactory instances cannot justify aborting
computation every time the exception occurs.
.Ed
.Pp
Single-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt float
.Pp
Wordsize: 32 bits.
.Pp
Precision: 24 significant bits,
roughly like 7 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive single-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
5.9e\-08 < 0.5**24 < (x'\-x)/x \(<= 0.5**23 < 1.2e\-07.
.Ed
.Ed
.Pp
.Bl -column "Range:" -compact
.It Range: Ta Overflow threshold = 2.0**128 = 3.4e38
.It "" Ta Underflow threshold = 0.5**126 = 1.2e\-38
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**149 = 1.4e\-45.
.Ed
.Ed
.Pp
Double-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt double
.Bd -ragged -offset indent -compact
On some architectures,
.Vt long double
is the same as
.Vt double .
.Ed
.Pp
Wordsize: 64 bits.
.Pp
Precision: 53 significant bits,
roughly like 16 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive double-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
1.1e\-16 < 0.5**53 < (x'\-x)/x \(<= 0.5**52 < 2.3e\-16.
.Ed
.Ed
.Pp
.Bl -column "Range:" -compact
.It Range: Ta Overflow threshold = 2.0**1024 = 1.8e308
.It "" Ta Underflow threshold = 0.5**1022 = 2.2e\-308
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**1074 = 4.9e\-324.
.Ed
.Ed
.Pp
Extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Precision: 64 significant bits,
roughly like 19 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
5.4e\-20 < 0.5**64 < (x'\-x)/x \(<= 0.5**63 < 1.1e\-19.
.Ed
.Ed
.Pp
.Bl -column "Range:" -compact
.It Range: Ta Overflow threshold = 2.0**16384 = 1.2e4932
.It "" Ta Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**16445 = 3.6e\-4951.
.Ed
.Ed
.Pp
Quad-extended-precision:
.Bd -ragged -offset indent -compact
Type name:
.Vt long double
(when supported by the hardware)
.Pp
Wordsize: 128 bits.
.Pp
Precision: 113 significant bits,
roughly like 34 significant decimals.
.Bd -ragged -offset indent -compact
If x and x' are consecutive positive quad-extended-precision
numbers (they differ by 1
.Em ulp ) ,
then
.Bd -ragged -compact
9.6e\-35 < 0.5**113 < (x'\-x)/x \(<= 0.5**112 < 2.0e\-34.
.Ed
.Ed
.Pp
.Bl -column "Range:" -compact
.It Range: Ta Overflow threshold = 2.0**16384 = 1.2e4932
.It "" Ta Underflow threshold = 0.5**16382 = 3.4e\-4932
.El
.Bd -ragged -offset indent -compact
Underflowed results round to the nearest
integer multiple of 0.5**16494 = 6.5e\-4966.
.Ed
.Ed
.Ss Additional Information Regarding Exceptions
For each kind of floating-point exception, IEEE 754
provides a Flag that is raised each time its exception
is signaled, and stays raised until the program resets
it.
Programs may also test, save and restore a flag.
Thus, IEEE 754 provides three ways by which programs
may cope with exceptions for which the default result
might be unsatisfactory:
.Bl -tag -width 3n
.It 1)
Test for a condition that might cause an exception
later, and branch to avoid the exception.
.It 2)
Test a flag to see whether an exception has occurred
since the program last reset its flag.
.It 3)
Test a result to see whether it is a value that only
an exception could have produced.
.El
.Pp
CAUTION: The only reliable ways to discover
whether Underflow has occurred are to test whether
products or quotients lie closer to zero than the
underflow threshold, or to test the Underflow
flag.
(Sums and differences cannot underflow in
IEEE 754; if x \(!= y then x\-y is correct to
full precision and certainly nonzero regardless of
how tiny it may be.)
Products and quotients that
underflow gradually can lose accuracy gradually
without vanishing, so comparing them with zero
(as one might on a VAX) will not reveal the loss.
Fortunately, if a gradually underflowed value is
destined to be added to something bigger than the
underflow threshold, as is almost always the case,
digits lost to gradual underflow will not be missed
because they would have been rounded off anyway.
So gradual underflows are usually
.Em provably
ignorable.
The same cannot be said of underflows flushed to 0.
.Pp
At the option of an implementor conforming to IEEE 754,
other ways to cope with exceptions may be provided:
.Bl -tag -width 3n
.It 4)
ABORT.
This mechanism classifies an exception in
advance as an incident to be handled by means
traditionally associated with error-handling
statements like "ON ERROR GO TO ...".
Different
languages offer different forms of this statement,
but most share the following characteristics:
.Bl -dash
.It
No means is provided to substitute a value for
the offending operation's result and resume
computation from what may be the middle of an
expression.
An exceptional result is abandoned.
.It
In a subprogram that lacks an error-handling
statement, an exception causes the subprogram to
abort within whatever program called it, and so
on back up the chain of calling subprograms until
an error-handling statement is encountered or the
whole task is aborted and memory is dumped.
.El
.It 5)
STOP.
This mechanism, requiring an interactive
debugging environment, is more for the programmer
than the program.
It classifies an exception in
advance as a symptom of a programmer's error; the
exception suspends execution as near as it can to
the offending operation so that the programmer can
look around to see how it happened.
Quite often
the first several exceptions turn out to be quite
unexceptionable, so the programmer ought ideally
to be able to resume execution after each one as if
execution had not been stopped.
.It 6)
\&... Other ways lie beyond the scope of this document.
.El
.Pp
Ideally, each
elementary function should act as if it were indivisible, or
atomic, in the sense that ...
.Bl -tag -width "iii)"
.It i)
No exception should be signaled that is not deserved by
the data supplied to that function.
.It ii)
Any exception signaled should be identified with that
function rather than with one of its subroutines.
.It iii)
The internal behavior of an atomic function should not
be disrupted when a calling program changes from
one to another of the five or so ways of handling
exceptions listed above, although the definition
of the function may be correlated intentionally
with exception handling.
.El
.Pp
The functions in
.Nm libm
are only approximately atomic.
They signal no inappropriate exception except possibly ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Over/Underflow
.Xc
when a result, if properly computed, might have lain barely within range, and
.It Xo
Inexact in
.Fn cabs ,
.Fn cbrt ,
.Fn hypot ,
.Fn log10
and
.Fn pow
.Xc
when it happens to be exact, thanks to fortuitous cancellation of errors.
.El
Otherwise, ...
.Bl -tag -width indent -offset indent -compact
.It Xo
Invalid Operation is signaled only when
.Xc
any result but \*(Na would probably be misleading.
.It Xo
Overflow is signaled only when
.Xc
the exact result would be finite but beyond the overflow threshold.
.It Xo
Divide-by-Zero is signaled only when
.Xc
a function takes exactly infinite values at finite operands.
.It Xo
Underflow is signaled only when
.Xc
the exact result would be nonzero but tinier than the underflow threshold.
.It Xo
Inexact is signaled only when
.Xc
greater range or precision would be needed to represent the exact result.
.El
.Sh SEE ALSO
An explanation of IEEE 754 and its proposed extension p854
was published in the IEEE magazine MICRO in August 1984 under
the title "A Proposed Radix- and Word-length-independent
Standard for Floating-point Arithmetic" by
.An "W. J. Cody"
et al.
The manuals for Pascal, C and BASIC on the Apple Macintosh
document the features of IEEE 754 pretty well.
Articles in the IEEE magazine COMPUTER vol.\& 14 no.\& 3 (Mar.\&
1981), and in the ACM SIGNUM Newsletter Special Issue of
Oct.\& 1979, may be helpful although they pertain to
superseded drafts of the standard.