digitalmars.D - Re: Detecting inadvertent use of integer division

Jason House <jason.james.house gmail.com> writes:
Don Wrote:

 Walter Bright wrote:
 Don wrote:
 Consider this notorious piece of code:

 assert(x>1);
 double y = 1 / x;

 This calculates y as the reciprocal of x, if x is a floating-point 
 number. But if x is an integer, an integer division is performed 
 instead of a floating-point one, and y will be 0.
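A minimal sketch of the trap described above, assuming x is an int. The two helper names are made up for illustration and do not come from the thread.

```d
// Hypothetical helpers for illustration only.
double reciprocalTruncating(int x)
{
    // int / int is an integer division; the truncated result is then
    // implicitly converted to double on return.
    return 1 / x;
}

double reciprocalFloating(int x)
{
    // 1.0 is a double, so x is promoted and the division is floating-point.
    return 1.0 / x;
}

void main()
{
    assert(reciprocalTruncating(4) == 0.0);  // the inadvertent result
    assert(reciprocalFloating(4) == 0.25);   // what was probably intended
}
```

Both functions compile without complaint in current D, which is the point of the proposal that follows.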

 It's a very common newbie trap, but I find it still catches me 
 occasionally, especially when dividing two variables or compile-time 
 constants.

 In the opPow thread there were a couple of mentions of inadvertent 
 integer division, and how Python is removing this error by making / 
 always mean floating-point division, and introducing a new operator 
 for integer division.

 We could largely eliminate this type of bug without doing anything so 
 drastic. Most of the problem just comes from C's cavalier attitude to 
 implicit casting. All we'd need to do is tighten the implicit 
 conversion rules for int->float, in the same way that the int->uint 
 rules have been tightened:

 "If an integer expression has an inexact result (i.e., involves an 
 inexact integer division), that expression cannot be implicitly cast to 
 a floating-point type."

But the compiler cannot reliably tell if it will produce an inexact result.

 (This means that double y = int_val / 1;  is OK, and also:
  double z = 90/3; would be OK. An alternative rule would be:
 "If an integer expression involves integer division, that expression 
 cannot be implicitly cast to a floating-point type").

This is kinda complicated if one has, say: double z = x/y + 3;

Integer expressions remain inexact until there's a cast. (It's very simple to implement, you just use the integer range code, adding an 'inexact' flag. Division sets the flag, casts clear the flag, everything else just propagates it if a unary operation, or ORs the two flags if a binary operation).
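The flag propagation described above could be modelled roughly as follows. This is only a sketch: IntExpr and every function name here are invented for illustration and are not taken from the D compiler, and it follows the simpler alternative rule (any integer division sets the flag) rather than tracking whether a particular division is exact.

```d
// Hypothetical model of the per-expression information.
struct IntExpr
{
    bool inexact; // set when the value may come from a truncating division
}

// A variable or literal starts out exact.
IntExpr leaf() { return IntExpr(false); }

// Division sets the flag.
IntExpr div(IntExpr a, IntExpr b) { return IntExpr(true); }

// An explicit cast clears the flag.
IntExpr explicitCast(IntExpr a) { return IntExpr(false); }

// A unary operation propagates the flag.
IntExpr unaryOp(IntExpr a) { return IntExpr(a.inexact); }

// A binary operation ORs the two flags.
IntExpr binaryOp(IntExpr a, IntExpr b) { return IntExpr(a.inexact || b.inexact); }

// The proposed rule: only exact integer expressions convert implicitly.
bool convertsImplicitlyToFloat(IntExpr e) { return !e.inexact; }

void main()
{
    auto x = leaf();
    auto y = leaf();
    auto three = leaf();

    // double z = x/y + 3;             would be rejected
    assert(!convertsImplicitlyToFloat(binaryOp(div(x, y), three)));

    // double z = cast(int)(x/y) + 3;  would be accepted
    assert(convertsImplicitlyToFloat(binaryOp(explicitCast(div(x, y)), three)));

    // double z = -(x/y);              would be rejected
    assert(!convertsImplicitlyToFloat(unaryOp(div(x, y))));

    // double z = x + y;               would be accepted
    assert(convertsImplicitlyToFloat(binaryOp(x, y)));
}
```

Under this model the x/y + 3 case needs no special handling: the flag set by the division simply survives the addition.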

What about function calls? double z = abs(x/y);

Regardless, your proposal is a simple incremental improvement, and I'd love to see it in D.

Also, one more thought: should similar rigor be used for implicit float -> double conversions?
Dec 14 2009
Don <nospam nospam.com> writes:
Jason House wrote:
 What about function calls? double z = abs(x/y);

Yeah, it won't catch cases where there are both integer and floating-point overloads of the same function. abs() and pow() are the only two I can think of -- and pow() will be covered by ^^. There's probably a few others.

 Regardless, your proposal is a simple incremental improvement, and I'd love to see it in D.

 Also, one more thought: should similar rigor be used for implicit float -> double conversions?

That would be much more complicated, I think. Fortunately you're much better protected in such conversions. For example, if a double is too large to fit inside a float, double -> float returns float.infinity. But perhaps you can think of specific bugs which could be caught?
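A small sketch of the overflow behaviour mentioned above, assuming a typical IEEE 754 target with the default rounding mode. The helper name narrow is made up for this example.

```d
// Hypothetical helper: narrows a double to float with an explicit cast.
float narrow(double d)
{
    return cast(float) d;
}

void main()
{
    // 1e300 is far beyond float.max (roughly 3.4e38), so the conversion
    // overflows to infinity instead of producing a wrong finite value.
    assert(narrow(1e300) == float.infinity);
    assert(narrow(-1e300) == -float.infinity);

    // A value that is exactly representable in float survives unchanged.
    assert(narrow(0.5) == 0.5f);
}
```

The overflow is at least visible in the result, unlike the integer-division case, where the truncated value looks like an ordinary number.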
Dec 15 2009
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Don wrote:
 Jason House wrote:
What about function calls? double z = abs(x/y);

Yeah, it won't catch cases where there are both integer and floating-point overloads of the same function. abs() and pow() are the only two I can think of -- and pow() will be covered by ^^. There's probably a few others.

I think the most subtle cases will be calls to max() and min(). If you do x = max(1.2, 3/2); and the 'inexact' flag doesn't survive beyond the function call, there will be a silent conversion to double inside max() and the function will return 1.2.

But it's probably not a very common problem.

-Lars
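The max() case can be reproduced with Phobos as it exists today. This is a minimal sketch using std.algorithm.comparison.max; the variable names are only for illustration.

```d
import std.algorithm.comparison : max;

void main()
{
    // 3 / 2 is evaluated in int arithmetic and truncates to 1 before
    // max() ever sees it, so 1.2 wins the comparison.
    auto truncated = max(1.2, 3 / 2);
    static assert(is(typeof(truncated) == double));
    assert(truncated == 1.2);

    // Making one operand floating-point gives the intended 1.5.
    auto intended = max(1.2, 3.0 / 2);
    assert(intended == 1.5);
}
```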
Dec 15 2009
Don <nospam nospam.com> writes:
Lars T. Kyllingstad wrote:
 Don wrote:
 Jason House wrote:
What about function calls? double z = abs(x/y);

Yeah, it won't catch cases where there are both integer and floating-point overloads of the same function. abs() and pow() are the only two I can think of -- and pow() will be covered by ^^. There's probably a few others.

I think the most subtle cases will be calls to max() and min(). If you do x = max(1.2, 3/2); and the 'inexact' flag doesn't survive beyond the function call, there will be a silent conversion to double inside max() and the function will return 1.2.

Note that that wouldn't happen if max had a signature like: max(double a, double b) or max(T)(T a, T b)
 
 But it's probably not a very common problem.
 
 -Lars
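To illustrate the point about signatures: with a plain max(double a, double b), the int-to-double conversion happens at the call site rather than inside the function, so a compiler implementing the proposed rule would still see the division there. The function maxOfDoubles below is a made-up stand-in, not the Phobos max.

```d
// Hypothetical non-template max taking two doubles.
double maxOfDoubles(double a, double b)
{
    return a > b ? a : b;
}

void main()
{
    // The int result of 3 / 2 is converted to double here, at the call.
    // Current D accepts this silently; under the proposed rule this is
    // the conversion that could be rejected.
    assert(maxOfDoubles(1.2, 3 / 2) == 1.2);

    // With a floating-point division there is nothing to flag.
    assert(maxOfDoubles(1.2, 3.0 / 2) == 1.5);
}
```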

Dec 15 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
Note that that wouldn't happen if max had a signature like: max(double a, double b) or max(T)(T a, T b)

max takes heterogeneous parameters to catch situations like max(a, 0).

Andrei
Dec 15 2009
Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
max takes heterogeneous parameters to catch situations like max(a, 0). Andrei

Yeah. It's a shame we can't use the ?: type rule for common parameters, since max(T, U) only makes sense when T and U have a common type. So we get code bloat, with unnecessary template instantiations. But it'd be too complicated otherwise, I think. C'est la vie.
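For reference, the ?: type rule mentioned here can be observed directly, and std.traits.CommonType exposes a similar computation to library code. A minimal sketch, checked entirely at compile time:

```d
import std.traits : CommonType;

// The conditional operator picks a common type for its two branches.
static assert(is(typeof(true ? 1 : 2.5) == double));

// CommonType performs a comparable computation for template code.
static assert(is(CommonType!(int, double) == double));
static assert(is(CommonType!(int, long) == long));

// When no common type exists, CommonType yields void, which is the
// situation in which a max(T, U) would make no sense.
static assert(is(CommonType!(int, string) == void));

void main() {}
```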
Dec 15 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 max takes heterogeneous parameters to catch situations like max(a, 0).

 Andrei

Yeah. It's a shame we can't use the ?: type rule for common parameters, since max(T, U) only makes sense when T and U have a common type. So we get code bloat, with unnecessary template instantiations. But it'd be too complicated otherwise, I think. C'est la vie.

La vie n'est pas si mal. Looking through the implementation of max you'll see that it carefully selects the compatible types, and also computes the correct type for the return value. I don't think there's much, if any, code bloat either. The operations generated are specialized for the respective types as if you wrote things by hand.

Andrei
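The return-type selection can be checked from user code. This sketch only inspects the types that std.algorithm.comparison.max produces for two mixed-type calls; it makes no claim about how max computes them internally.

```d
import std.algorithm.comparison : max;

void main()
{
    // max(a, 0) with a long and an int literal: the result is a long.
    long a = -5;
    auto clamped = max(a, 0);
    static assert(is(typeof(clamped) == long));
    assert(clamped == 0);

    // Mixing int and double yields a double.
    auto mixed = max(1, 2.5);
    static assert(is(typeof(mixed) == double));
    assert(mixed == 2.5);
}
```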
Dec 15 2009