## digitalmars.D - bigfloat II

- Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> Apr 09 2009
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Apr 09 2009
- Don <nospam nospam.com> Apr 09 2009
- Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> Apr 09 2009
- Don <nospam nospam.com> Apr 09 2009
- "Joel C. Salomon" <joelcsalomon gmail.com> Apr 11 2009
- Paul D. Anderson <paul.d.removethis comcast.andthis.net> Apr 12 2009
- Don <nospam nospam.com> Apr 12 2009
- Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> Apr 13 2009
- Don <nospam nospam.com> Apr 13 2009
- Georg Wrede <georg.wrede iki.fi> Apr 21 2009
- Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> Apr 21 2009
- Don <nospam nospam.com> Apr 23 2009
- Walter Bright <newshound1 digitalmars.com> Apr 09 2009
- Walter Bright <newshound1 digitalmars.com> Apr 09 2009
- Chris Andrews <CodexArcanum gmail.com> Apr 09 2009
- Don <nospam nospam.com> Apr 09 2009
- Walter Bright <newshound1 digitalmars.com> Apr 09 2009

It looks like there's a lot of interest in getting bigfloat going. As I mentioned I have some half-finished stuff so I'll dust it off and get going. Most of the coding is straightforward. However -- an important implementation question is how to determine the "context" of an operation (i.e. things like the precision and rounding method). This isn't often an issue for normal floating point, but applications that need arbitrary precision frequently need control of the context.

There are several ways to approach this.

First, the context can be a global setting that affects all operations from the time it is set until it is changed. This is the simplest to implement and is adequate if all bigfloat calculations use the same context most of the time. If a different context is needed for some calculations, the user needs to save and reset the settings.

Second, the context can be tied to the operation (this is how Java's BigDecimal is implemented). There is a default context, but each operation has an optional context parameter. This is also easy to implement, but there is a problem for languages like D which allow operator overloading -- there's no simple way to add an additional parameter to an operation (opAdd, for instance). Of course non-overloaded operations [add(op1, op2, context)] can be used, but no one wants to go there.

Third, each BigFloat object can carry its own context. This makes the objects larger, but they tend to be large anyway. A bigger issue is how to carry out an operation on two operands with different contexts. Some sort of tie-breaker has to be decided on (lowest precision wins?).

A fourth option is to use subclasses and have a static context for each subclass. Calculations requiring a particular context can use a separate subclass. Implementation of this option would be more involved, especially if interoperation of the subclasses is allowed (the same tie-breaking would be needed as above).
------ I think the second and third options are non-starters. As noted, the first is the simplest (and that's enough to get started with). The fourth option may be where we want to end up, but there are an awful lot of details to be worked out. Does anyone have strong feelings for/against any of these options? Paul
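For a concrete picture of the first option, Python's decimal module (which follows the same General Decimal Arithmetic model that informed IEEE 754-2008's decimal formats) uses exactly this design: a thread-local current context governs every operation until it is changed, and the save-and-restore burden falls on the user. A minimal illustration -- Python here only because it has a stock implementation of the idea, not because it is the proposed D API:

```python
from decimal import Decimal, Context, getcontext, setcontext

# Option 1 in practice: one thread-local "current context" governs all
# operations until it is changed.
getcontext().prec = 5                  # results now rounded to 5 digits
third = Decimal(1) / Decimal(3)
print(third)                           # 0.33333

# Needing a different context for one calculation means saving and
# restoring the settings by hand -- the drawback noted above.
saved = getcontext()
setcontext(Context(prec=10))
wider = Decimal(1) / Decimal(3)
print(wider)                           # 0.3333333333
setcontext(saved)                      # back to the 5-digit context
```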

Apr 09 2009

Paul D. Anderson wrote: [...] Does anyone have strong feelings for/against any of these options? Paul

What does the context exactly consist of? Andrei

Apr 09 2009

Andrei Alexandrescu Wrote: Paul D. Anderson wrote: [...] Does anyone have strong feelings for/against any of these options? Paul

What does the context exactly consist of?

The rounding mode, the precision (which indicates when (whether) rounding is needed), and flags for determining which error conditions are ignored, reported or trapped.
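Paul's list maps one-to-one onto the context objects found in existing decimal libraries. As an illustrative example (Python's decimal.Context, shown only as prior art), a context bundles precision, a rounding mode, and per-condition trap settings, plus sticky flags recording which conditions occurred:

```python
from decimal import (Context, Decimal, ROUND_HALF_EVEN,
                     DivisionByZero, Inexact)

# A context object bundles exactly these pieces: precision, rounding
# mode, and which error conditions are trapped (raise) vs. merely flagged.
ctx = Context(prec=28, rounding=ROUND_HALF_EVEN, traps=[DivisionByZero])

ctx.clear_flags()
ctx.divide(Decimal(1), Decimal(7))     # inexact result: sets a flag, no trap
assert ctx.flags[Inexact]

try:
    ctx.divide(Decimal(1), Decimal(0)) # trapped condition: raises
except DivisionByZero:
    print("division by zero trapped")
```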

Apr 09 2009

Paul D. Anderson wrote: Andrei Alexandrescu Wrote: Paul D. Anderson wrote: [...] Does anyone have strong feelings for/against any of these options? Paul

The rounding mode, the precision (which indicates when (whether) rounding is needed), and flags for determining which error conditions are ignored, reported or trapped.

I see. Then another possibility may be to make them part of the type by making them policies. The tradeoffs are obvious.

    enum RoundingMode { ... }
    enum Precision { ... }
    enum Flags { ... }
    struct BigFloatT(RoundingMode rm = ..., Precision pr = ..., uint flags = ...) { ... }
    alias BigFloatT!(... common options ...) BigFloat;

Andrei

Apr 09 2009

Paul D. Anderson wrote:It looks like there's a lot of interest in getting bigfloat going. As I mentioned I have some half-finished stuff so I'll dust it off and get going. Most of the coding is straightforward. However -- An important implementation question is how to determine the "context" of an operation (i.e. things like the precision and rounding method). This isn't often an issue for normal floating point, but applications that need arbitrary precision frequently need control of the context.

There are several ways to approach this. First, the context can be a global setting that affects all operations from the time it is set until it is changed. This is the simplest to implement and is adequate if all bigfloat calculations use the same context most of the time. If a different context is needed for some calculations, the user needs to save and reset the settings. Second, the context can be tied to the operation (this is how Java's BigDecimal is implemented). There is a default context, but each operation has an optional context parameter. This is also easy to implement but there is a problem for languages like D which allow operator overloading -- there's no simple way to add an additional parameter to an operation (opAdd, for instance). Of course non-overloaded operations [add(op1, op2, context)] can be used, but no one wants to go there. Third, each BigFloat object can carry its own context. This makes the objects larger, but they tend to be large anyway. A bigger issue is how to carry out an operation on two operands with different contexts. Some sort of tie-breaker has to be decided on (lowest precision wins?). A fourth option is to use subclasses and have a static context for each subclass. Calculations requiring a particular context can use a separate subclass. Implementation of this option would be more involved, especially if interoperation of the subclasses is allowed (the same tie-breaking would be needed as above). ------ I think the second and third options are non-starters. As noted, the first is the simplest (and that's enough to get started with). The fourth option may be where we want to end up, but there are an awful lot of details to be worked out. Does anyone have strong feelings for/against any of these options? Paul

My opinion -- precision should be a per-object setting, since it is a property of the data. Rounding etc. is a property of the operators and functions and should be thread-local, set by creating a scope struct with a constructor that sets the mode to a new value and a destructor which restores it to the previous one. E.g.:

    BigFloat a = 5e50;
    BigFloat b = 1e-20;
    {
        BigFloatRounding(ROUND_DOWN); // applies until end of scope
        a *= sin(b);
    } // we're back in round-to-nearest.
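Python's decimal module settled on exactly this scope-based pattern with localcontext(), which enters a modified copy of the current context and restores the previous one automatically when the block ends -- worth noting as prior art for the mechanism Don sketches (Python used for illustration only):

```python
from decimal import Decimal, getcontext, localcontext, ROUND_DOWN, ROUND_HALF_EVEN

getcontext().rounding = ROUND_HALF_EVEN
with localcontext() as ctx:        # enter a copy of the current context
    ctx.rounding = ROUND_DOWN      # applies until the end of the block
    ctx.prec = 4
    root = Decimal(2).sqrt()
print(root)                        # 1.414 (4 digits, rounded down)
# here we're automatically back in round-half-even at the old precision
assert getcontext().rounding == ROUND_HALF_EVEN
```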

Apr 09 2009

Don Wrote: Paul D. Anderson wrote: Does anyone have strong feelings for/against any of these options? Paul

My opinion -- precision should be a per-object setting, since it is a property of the data. Rounding etc. is a property of the operators and functions and should be thread-local, set by creating a scope struct with a constructor that sets the mode to a new value and a destructor which restores it to the previous one. E.g.:

    BigFloat a = 5e50;
    BigFloat b = 1e-20;
    {
        BigFloatRounding(ROUND_DOWN); // applies until end of scope
        a *= sin(b);
    } // we're back in round-to-nearest.

I like that. I didn't consider scope as an option, but that's really what's wanted now that you've pointed it out. I agree that precision is part of the data, but it needs to be considered as part of the operation as well. Multiplying two floats produces a number whose potential precision is the sum of the operands' precision. We need a method to determine what the precision of the product should be. Not that it's difficult to come up with an answer -- but we have to agree on something. That's what the context provides. Thanks for your input. Paul

Apr 09 2009

Paul D. Anderson wrote: Don Wrote: Paul D. Anderson wrote: Does anyone have strong feelings for/against any of these options? Paul

property of the data. Rounding etc is a property of the operators and functions and should be thread-local, and set by creating a scope struct with a constructor that sets the mode to a new value, and a destructor which restores it to the previous one. eg BigFloat a = 5e50; BigFloat b = 1e-20; { BigFloatRounding(ROUND_DOWN); // applies until end of scope a*=sin(b); } // we're back in round-to-nearest.

I like that. I didn't consider scope as an option, but that's really what's wanted now that you've pointed it out. I agree that precision is part of the data, but it needs to be considered as part of the operation as well. Multiplying two floats produces a number whose potential precision is the sum of the operands' precision. We need a method to determine what the precision of the product should be. Not that it's difficult to come up with an answer -- but we have to agree on something. That's what the context provides.

Yes, I guess you'd need to specify a maximum precision in the scope, as well. By the way, when I get around to adding rounding and exception control for the FPU in std.math, I propose to do it in this way as well, by default. The idea from C that the rounding mode can legally be changed at any time is ridiculous.

Thanks for your input. Paul

Apr 09 2009

Paul D. Anderson wrote:Multiplying two floats produces a number whose potential precision is the sum of the operands' precision. We need a method to determine what the precision of the product should be. Not that it's difficult to come up with an answer -- but we have to agree on something.

Not more precision than the input data deserve. The decimal floating-point numbers of the new IEEE 754-2008 carry with them a notion of “how precise is this result?”; this might be a good starting point for discussion. —Joel Salomon
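The "how precise is this result?" notion Joel mentions is visible in Python's decimal module, which implements the same arithmetic model as IEEE 754-2008's decimal formats: each value carries its significance, trailing zeros included, so a result reflects the precision of its inputs (shown purely as an illustration of the concept):

```python
from decimal import Decimal

# Decimal values carry their significance: 1.30 is a different
# representation from 1.3, and results preserve that information.
print(Decimal("1.30") + Decimal("1.20"))   # 2.50   (not 2.5)
print(Decimal("1.3") * Decimal("1.2"))     # 1.56
print(Decimal("1.30") * Decimal("1.20"))   # 1.5600 (exponents add)
```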

Apr 11 2009

Joel C. Salomon Wrote:Paul D. Anderson wrote:Multiplying two floats produces a number whose potential precision is the sum of the operands' precision. We need a method to determine what the precision of the product should be. Not that it's difficult to come up with an answer -- but we have to agree on something.

Not more precision than the input data deserve. The decimal floating-point numbers of the new IEEE 754-2008 carry with them a notion of “how precise is this result?”; this might be a good starting point for discussion. —Joel Salomon

The implementation will comply with IEEE 754-2008. I just wanted to illustrate that precision can depend on the operation as well as the operands. Paul

Apr 12 2009

Paul D. Anderson wrote:Joel C. Salomon Wrote:Paul D. Anderson wrote:Multiplying two floats produces a number whose potential precision is the sum of the operands' precision. We need a method to determine what the precision of the product should be. Not that it's difficult to come up with an answer -- but we have to agree on something.

floating-point numbers of the new IEEE 754-2008 carry with them a notion of “how precise is this result?”; this might be a good starting point for discussion. —Joel Salomon

The implementation will comply with IEEE 754-2008. I just wanted to illustrate that precision can depend on the operation as well as the operands. Paul

I'm not sure why you think there needs to be a precision depending on the operation. IEEE 754-2008 has the notion of "Widento" precision, but AFAIK it's primarily to support x87 -- it's pretty clear that the normal mode of operation is to use the precision which is the maximum precision of the operands. Even when there is a Widento mode, it's only provided for the non-storage formats (i.e., 80-bit reals only). I think it should operate the way int and long do: typeof(x*y) is x if x.mant_dig >= y.mant_dig, else y. What you might perhaps do is have a global setting for adjusting the default size of a newly constructed variable. But it would only affect constructors, not temporaries inside expressions.

Apr 12 2009

Don Wrote:Paul D. Anderson wrote:The implementation will comply with IEEE 754-2008. I just wanted to illustrate that precision can depend on the operation as well as the operands. Paul

I'm not sure why you think there needs to be a precision depending on the operation. IEEE 754-2008 has the notion of "Widento" precision, but AFAIK it's primarily to support x87 -- it's pretty clear that the normal mode of operation is to use the precision which is the maximum precision of the operands. Even when there is a Widento mode, it's only provided for the non-storage formats (i.e., 80-bit reals only). I think it should operate the way int and long do: typeof(x*y) is x if x.mant_dig >= y.mant_dig, else y. What you might perhaps do is have a global setting for adjusting the default size of a newly constructed variable. But it would only affect constructors, not temporaries inside expressions.

I agree. I seem to have stated it badly, but I'm not suggesting there is a precision inherent in an operation. I'm just considering the possibility that the user may not want the full precision of the operands for the result of a particular operation. Java's BigDecimal class allows for this by including a context object as a parameter to the operation. For D we would allow for this by changing the context, where the precision in the context would govern the precision of the result.

So the precision of a new variable, in the absence of an explicit instruction, would be the context precision, and the precision of the result of an operation would be no greater than the context precision, but would otherwise be the larger of the precisions of the operands. So if a user wanted to perform arithmetic at a given precision he would set the context precision to that value. If he wanted the precision to grow to match the largest operand precision (i.e., no fixed precision) he would set the context precision to be larger than any expected value. (Or maybe we could explicitly provide for this.) In any case, the operation would be carried out at the operand precision(s) and only rounded, if necessary, at the point of return.

I hope that's a clearer statement of what I'm implementing. If I've still missed something important, let me know. Paul
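Paul's rule -- results rounded to the context precision on return, while operands keep their own digits -- is, for what it's worth, how Python's decimal behaves, which may serve as a reference point (an illustrative sketch of the semantics, not the proposed D API):

```python
from decimal import Decimal, localcontext

a = Decimal("1.234567890123")     # 13 digits, stored exactly
b = Decimal("2.5")
with localcontext() as ctx:
    ctx.prec = 6                  # context precision caps results
    product = a * b               # computed in full, rounded on return
print(product)                    # 3.08642
# the operand itself is never silently truncated to 6 digits:
assert a == Decimal("1.234567890123")
```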

Apr 13 2009

Paul D. Anderson wrote:Don Wrote:Paul D. Anderson wrote:The implementation will comply with IEEE 754-2008. I just wanted to illustrate that precision can depend on the operation as well as the operands. Paul

the operation. The IEEE 754-2008 has the notion of "Widento" precision, but AFAIK it's primarily to support x87 -- it's pretty clear that the normal mode of operation is to use the precision which is the maximum precision of the operands. Even when there is a Widento mode, it's only provided for the non-storage formats (Ie, 80-bit reals only). I think it should operate the way int and long does: typeof(x*y) is x if x.mant_dig >= y.mant_dig, else y. What you might perhaps do is have a global setting for adjusting the default size of a newly constructed variable. But it would only affect constructors, not temporaries inside expressions.

I agree. I seem to have stated it badly, but I'm not suggesting there is a precision inherent in an operation. I'm just considering the possibility that the user may not want the full precision of the operands for the result of a particular operation. Java's BigDecimal class allows for this by including a context object as a parameter to the operation. For D we would allow for this by changing the context, where the precision in the context would govern the precision of the result. So the precision of a new variable, in the absence of an explicit instruction, would be the context precision, and the precision of the result of an operation would be no greater than the context precision, but would otherwise be the larger of the precisions of the operands. So if a user wanted to perform arithmetic at a given precision he would set the context precision to that value. If he wanted the precision to grow to match the largest operand precision (i.e., no fixed precision) he would set the context precision to be larger than any expected value. (Or maybe we could explicitly provide for this.)

You could increase the precision of a calculation by changing the precision of one of the operands. For that to work, you would need a way to create a new variable which copies the precision from another one.

In any case, the operation would be carried out at the operand precision(s) and only rounded, if necessary, at the point of return. I hope that's a clearer statement of what I'm implementing. If I've still missed something important, let me know.

In any case, it seems to me that it doesn't affect the code very much. It's much easier to decide such things once you have working code and some real use cases.

Apr 13 2009

Don wrote:Paul D. Anderson wrote:Joel C. Salomon Wrote:Paul D. Anderson wrote:

floating-point numbers of the new IEEE 754-2008 carry with them a notion of “how precise is this result?”; this might be a good starting point for discussion. —Joel Salomon

The implementation will comply with IEEE 754-2008. I just wanted to illustrate that precision can depend on the operation as well as the operands. Paul

I'm not sure why you think there needs to be a precision depending on the operation. The IEEE 754-2008 has the notion of "Widento" precision, but AFAIK it's primarily to support x87 -- it's pretty clear that the normal mode of operation is to use the precision which is the maximum precision of the operands. Even when there is a Widento mode, it's only provided for the non-storage formats (Ie, 80-bit reals only). I think it should operate the way int and long does: typeof(x*y) is x if x.mant_dig >= y.mant_dig, else y. What you might perhaps do is have a global setting for adjusting the default size of a newly constructed variable. But it would only affect constructors, not temporaries inside expressions.

Well, if you're doing precise arithmetic, you need a different number of significant digits at different parts of a calculation. Say you've got an integer of n digits (which obviously needs n digits of precision in the first place). Square it, and all of a sudden you need n+n digits to display it precisely. Unless you truncate it to n digits. But then, taking the square root would yield something else than the original. To improve the speed, one could envision having the calculations always use a suitable precision.

This problem of course /doesn't/ show when doing integer arithmetic, even at "unlimited" precision, or with BCD, which usually is fixed point. But to do variable precision floating point (mostly in software because we're wider than the hardware) the precision really needs to vary. And also that's where "precision deserved by the input" comes from. For instance, 12300000000000 has only three digits of precision.

Combining all these things in a math library is an interesting task. But done right, it could mean faster calculations (especially with very big or small numbers), and even a more accurate result (when truncating decimal parts in the wrong situations is avoided).

---

Yesterday I watched a clip on YouTube about the accretion disk around a black hole. The caption said the black hole was 16 km in diameter. I could cry. The guys who originally made the simulation meant that it's less than a hundred miles, but more than one, and that's when a scientist says ten. Then the idiots who translate this to sane units look up the conversion factor and print the result. 16 has two digits of precision, where the original only had one. (Actually even less, but let's not get carried away...) Even using 15 km would have used only 1 1/2 digits of precision, which would have been preferable. Lucky they didn't write the caption as 16.09 km or 16 km 90 meters. That would really have been conjuring up precision out of thin air.

The right caption would have said 10 km. Only that would have retained the [non-]precision, and also given the right idea to the reader. (Currently the reader is left thinking, "Oh, I guess it then would look somehow different had it been 12 or 20 km instead, since the size was given so accurately.")

---

Interestingly, the Americans do the inverse of this. They say 18 months ago when they only mean "oh, more than a year, and IIRC, less than two years". Or 48 hours ago, when they mean the day before yesterday in general. I guess they're 99.9999999999998% wrong.
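Georg's squaring example is easy to demonstrate with any decimal bignum at a fixed working precision (Python's decimal used here purely for illustration): an n-digit integer needs up to 2n digits for its square to survive exactly, and at only n digits the square is silently rounded.

```python
from decimal import Decimal, localcontext

n = Decimal(123456789)                     # a 9-digit integer
with localcontext() as ctx:
    ctx.prec = 18                          # 2n digits: the square is exact
    exact_sq = n * n
with localcontext() as ctx:
    ctx.prec = 9                           # only n digits of precision
    rounded_sq = n * n                     # silently rounded
print(exact_sq)                            # 15241578750190521
print(rounded_sq)                          # 1.52415788E+16
```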

Apr 21 2009

Georg Wrede Wrote:Well, if you're doing precise arithmetic, you need a different number of significant digits at different parts of a calculation. Say, you got an integer of n digits (which obviously needs n digits of precision in the first place). Square it, and all of a sudden you need n+n digits to display it precisely. Unless you truncate it to n digits. But then, taking the square root would yield something else than the original. To improve the speed, one could envision having the calculations always use a suitable precision. This problem of course /doesn't/ show when doing integer arithmetic, even at "unlimited" precision, or with BCD, which usually is fixed point. But to do variable precision floating point (mostly in software because we're wider than the hardware) the precision really needs to vary. And also that's where "precision deserved by the input" comes from.

There are different, sometimes conflicting, purposes and goals associated with arbitrary precision arithmetic. I agree that balancing these is difficult. My intention is to provide an implementation that can be expanded or modified as needed to meet a user's particular goals; the default behavior may not be what every user needs. As an aside, I've seen an implementation that was centered around carrying the error associated with any calculation. The intent was to make "scientific" calculation precise, but the result was to rapidly degrade the precision to the point where the error exceeded the computed values. I'm hoping to have an alpha-level implementation available sometime next week. Paul

Apr 21 2009

Paul D. Anderson wrote:Georg Wrede Wrote:Well, if you're doing precise arithmetic, you need a different number of significant digits at different parts of a calculation. Say, you got an integer of n digits (which obviously needs n digits of precision in the first place). Square it, and all of a sudden you need n+n digits to display it precisely. Unless you truncate it to n digits. But then, taking the square root would yield something else than the original. To improve the speed, one could envision having the calculations always use a suitable precision. This problem of course /doesn't/ show when doing integer arithmetic, even at "unlimited" precision, or with BCD, which usually is fixed point. But to do variable precision floating point (mostly in software because we're wider than the hardware) the precision really needs to vary. And also that's where "precision deserved by the input" comes from.

There are different, sometimes conflicting, purposes and goals associated with arbitrary precision arithmetic. I agree that balancing these is difficult. My intention is to provide an implementation that can be expanded or modified as needed to meet their particular goals. The default behavior may not be what a user needs. As an aside, I've seen an implementation that was centered around carrying the error associated with any calculation. The intent was to make "scientific" calculation precise, but the result was to rapidly degrade the precision to the point where the error exceeded the computed values.

I think it's generally better to relegate that task to interval arithmetic, rather than BigFloat.

I'm hoping to have an alpha-level implementation available sometime next week.

Awesome! Paul

Apr 23 2009

The rounding mode should be determined by reading the FPU's current rounding mode.

1. Two different systems for handling rounding is confusing and excessively complicated.
2. The FPU can be used for handling the lower precision calculations, where it will use its rounding mode anyway.
3. If specialized template versions of BigFloat will be the native types, then they'll use the FPU rounding mode.
4. No new API is necessary to get/set the modes.
5. It'll respond appropriately when interfacing with code from other languages that manipulate the FPU rounding modes.

This goes ditto for the floating point exceptions.

Apr 09 2009

3. Implementation is unambiguous, straightforward, and easy to get right.

Apr 09 2009

Walter Bright Wrote: 3. Implementation is unambiguous, straightforward, and easy to get right.

If that was a vote for method 3, then I cast mine there as well. I think keeping the precision as a parameter of the object is most intuitive. I don't feel strongly about rounding, but either the scoping method or the FPU setting sounds good to me.

Apr 09 2009

Walter Bright wrote:The rounding mode should be determined by reading the FPU's current rounding mode. 1. Two different systems for handling rounding is confusing and excessively complicated. 2. The FPU can be used for handling the lower precision calculations, where it will use its rounding mode anyway. 3. If specialized template versions of BigFloat will be the native types, then they'll use the FPU rounding mode. 4. No new API is necessary to get/set the modes. 5. It'll respond appropriately when interfacing with code from other languages that manipulate the FPU rounding modes. This goes ditto for the floating point exceptions.

I'm not even sure how that should work. On x86, there are two completely independent FPU rounding modes and exception masks: one for x87, and one for SSE. Additionally, some of these may have rounding modes which aren't supported by the others (almost all CPUs other than x87 have a flush-subnormals-to-zero mode; and in the IEEE spec, decimal floating point has two extra rounding modes that binary doesn't have). What this means, I think, is that the D-supplied mechanism for controlling rounding mode and exceptions needs to be general enough to support them all.

Apr 09 2009

Don wrote: I'm not even sure how that should work. On x86, there are two completely independent FPU rounding modes and exception masks: one for x87, and one for SSE. Additionally, some of these may have rounding modes which aren't supported by the others (almost all CPUs other than x87 have a flush-subnormals-to-zero mode; and in the IEEE spec, decimal floating point has two extra rounding modes that binary doesn't have). What this means, I think, is that the D-supplied mechanism for controlling rounding mode and exceptions needs to be general enough to support them all.

The D functions to get/set the modes should set both of them to the same value. Exception flag readers should OR the two together. This is because the compiler/library will likely mix & match using the SSE and x87 as convenient. Unsupported modes should throw an exception on attempting to set them.

Apr 09 2009