digitalmars.D - core.checkedint

Andrei Alexandrescu (32/32) Jun 24 2016 There are a few simple enhancements to add to core.checkedint:

Dominikus Dittes Scherkl (100/132) Jun 24 2016 I have a save exponentiation function (+ two helper functions

Dominikus Dittes Scherkl (18/20) Jun 24 2016 Ah, and I use a better abs() function:

Walter Bright (22/27) Jun 24 2016 The reason for the different names is:

Andrei Alexandrescu (10/41) Jun 24 2016 These functions are not supposed to be used naïvely.

Walter Bright (17/18) Jun 24 2016 They are also meant to be inspectable. I want to look at it and triviall...

Andrei Alexandrescu (3/7) Jun 24 2016 Then the right API is add(x, y, overflow). I look at it and I trivially

Andrei Alexandrescu (15/15) Jun 24 2016 To bring a different angle, the best front-end for arithmetic operations...

Walter Bright (6/10) Jun 24 2016 I believe adding such behavior is beyond the charter of checkedint. I un...

Andrei Alexandrescu (2/3) Jun 24 2016 Which checkedint? The one in core? -- Andrei

Walter Bright (2/5) Jun 24 2016 Your Checked!int. Sorry for the confusion.

Andrei Alexandrescu (19/25) Jun 24 2016 Though you should be able to implement that via a hook, I think at this

Andrei Alexandrescu (3/8) Jun 24 2016 To clarify: if array.length is 0, then indeed that should be signaled as...

Observer (59/71) Jun 25 2016 I'm curious how one attempts to handle overflow detection in a

Walter Bright (3/9) Jun 24 2016 I'm curious if that produces an overflow error on the newer clang that
Walter Bright (8/14) Jun 24 2016 That's a seductive test case. But I worry that mixed signed/unsigned ari...

Andrei Alexandrescu (7/25) Jun 24 2016 Doesn't seem that way (with some simplifying rules, associativity is

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

There are a few simple enhancements to add to core.checkedint:

1. No need for different names (addu/adds) etc., overloading should take 
care of it (simplifies caller code). Right now the API is an odd mix of 
overloading and distinct names.

2. The overflow flag should be an integral, not a bool, so you get to 
count how many overflows there were. This is minor but has no extra cost 
on the happy case.

3. There should be an exponentiation function.

So I'm thinking of the signatures:

pure nothrow  nogc  safe
{
   int add(int x, int y, ref uint overflows);
   uint add(uint x, uint y, ref uint overflows);
   long add(long x, int y, ref uint overflows);
   ulong add(ulong x, uint y, ref uint overflows);

   int mul(int x, int y, ref uint overflows);
   uint mul(uint x, uint y, ref uint overflows);
   long mul(long x, int y, ref uint overflows);
   ulong mul(ulong x, uint y, ref uint overflows);

   int sub(int x, int y, ref uint overflows);
   uint sub(uint x, uint y, ref uint overflows);
   long sub(long x, int y, ref uint overflows);
   ulong sub(ulong x, uint y, ref uint overflows);

   int neg(int x, ref uint overflows);
   long neg(long x, ref uint overflows);

   int pow(int x, int y, ref uint overflows);
   uint pow(uint x, uint y, ref uint overflows);
   long pow(long x, int y, ref uint overflows);
   ulong pow(ulong x, uint y, ref uint overflows);
}

Thoughts?


Andrei

Jun 24 2016

Dominikus Dittes Scherkl <Dominikus.Scherkl continental-corporation.com> writes:

On Friday, 24 June 2016 at 16:56:36 UTC, Andrei Alexandrescu 
wrote:
 There are a few simple enhancements to add to core.checkedint:

 1. No need for different names (addu/adds) etc., overloading 
 should take care of it (simplifies caller code). Right now the 
 API is an odd mix of overloading and distinct names.

 2. The overflow flag should be an integral, not a bool, so you 
 get to count how many overflows there were. This is minor but 
 has no extra cost on the happy case.

 3. There should be an exponentiation function.

 So I'm thinking of the signatures:

 pure nothrow  nogc  safe
 {
   int add(int x, int y, ref uint overflows);
   uint add(uint x, uint y, ref uint overflows);
   long add(long x, int y, ref uint overflows);
   ulong add(ulong x, uint y, ref uint overflows);

   int mul(int x, int y, ref uint overflows);
   uint mul(uint x, uint y, ref uint overflows);
   long mul(long x, int y, ref uint overflows);
   ulong mul(ulong x, uint y, ref uint overflows);

   int sub(int x, int y, ref uint overflows);
   uint sub(uint x, uint y, ref uint overflows);
   long sub(long x, int y, ref uint overflows);
   ulong sub(ulong x, uint y, ref uint overflows);

   int neg(int x, ref uint overflows);
   long neg(long x, ref uint overflows);

   int pow(int x, int y, ref uint overflows);
   uint pow(uint x, uint y, ref uint overflows);
   long pow(long x, int y, ref uint overflows);
   ulong pow(ulong x, uint y, ref uint overflows);
 }

 Thoughts?


 Andrei

I have a save exponentiation function (+ two helper functions 
that may also be useful on their own):

/// add a property to numeric types that can be used as return 
value if a result is out of bounds
template invalid(T) if(isNumeric!T)
{
    static if(isFloatingPoint!T)
        property enum invalid = T.init;
    else static if(isSigned!T)
        property enum invalid = T.min; // 0x80..00
    else // unsigned
        property enum invalid = T.max; // 0xFF..FF
}

/// calculate the maximal power of the value that would fit in an 
ucent
/// for smaller types simply shift down the result
/// (e.g. divide by 4 to calc the maxpower that would fit in an 
uint)
ubyte maxpow(const ulong a) pure  safe  nogc nothrow
{
    assert(a>1); // no useful maxpower exists for 0 and 1
    static immutable ubyte[55] mp = [ 127, 80, 63, 55, 49, 45, 43, 
40,
       38, 37, 35, 34, 33, 32, 31, 31, 30, 30, 29, 29, 28, 28, 27, 
27,
       27, 26, 26, 26, 26, 25, 25, 25, 25, 24, 24, 24, 24, 24, 24, 
23,
       23, 23, 23, 23, 23, 23, 22, 22, 22, 22, 22, 22, 22, 22, 22 
];
    return (a<139) ? (a<57) ? mp[cast(ubyte)a-2]
                            : ((a<85) ?  ((a<69) ? 21 : 20)
                                      : ((a<107) ? 19 : 18))
    : ((a<7132) ? ((a<566) ? ((a<256) ? ((a<185) ? 17 : 16)
                                      : ((a<371) ? 15 : 14))
                          : ((a<1626) ? ((a<921) ? 13 : 12)
                                     : ((a<3184) ? 11 : 10)))
        : ((a<2642246) ? ((a<65536) ? ((a<19113) ?  9 : 8)
                                   : ((a<319558) ?  7 : 6))
               : ((a<4294967296) ? ((a<50859009) ?  5 : 4)
                            : ((a<6981463658332) ?  3 : 2))));
}

/// exponentiation without overflow
/// return invalid if overflow would occur.
T safePow(T)(const(T) base, const ubyte exp) pure  safe  nogc 
nothrow if(isIntegral!T)
{
    static if(isUnsigned!T)
    {
       if(!exp) return 1; // x^^0 is always 1
       if(exp == 1 || base < 2) return base; // x^^1 = x, 1^^n = 1 
and 0^^n = 0, so do nothing
       static if(T.sizeof > ulong.sizeof)
       {
          if(base > ulong.max) return invalid!T;
       }
       if(exp > (maxpow(base)>>(5-bitlen(T.sizeof)))) return 
invalid!T;
       return base ^^ exp; // calc only if no overflow for sure 
(very efficient)
    }
    else
    {
       auto r = safePow(abs(base), exp); // sometimes calc even if 
result won't fit
       if(r > T.max) return invalid!T; // but at least we can 
easily detect if it happened
       return (base < 0 && odd(exp)) ? -cast(T)r : cast(T)r;
    }
}

unittest
{
    void test(T)()
    {
       T r;
       foreach(base; T.min..T.max+1)
       {
          foreach(exp; 0..256)
          {
             r = safePow(base, exp);
             if(r == invalid!T)
             {
                assert(base^^exp > T.max, "max wrong: 
"+T.stringOf());
                break;
             }
             assert(r == base^^exp, "calc wrong");
          }
       }
    }
    test!byte;
    test!ubyte;
    test!short;
    test!ushort;
/+ test!int;
    test!uint;
    test!long;
    test!ulong; +/
}

Jun 24 2016

Dominikus Dittes Scherkl <Dominikus.Scherkl continental-corporation.com> writes:

On Friday, 24 June 2016 at 17:25:32 UTC, Dominikus Dittes Scherkl 
wrote:
 I have a save exponentiation function (+ two helper functions 
 that may also be useful on their own):

Ah, and I use a better abs() function:

/// get the absolute value of x as unsigned type. always 
succeeds, even for T.min
Unsigned!T abs(T)(const(T) x) pure  safe  nogc nothrow 
if(isIntegral!T)
{
    static if(isSigned!T) if(x < 0) return -x;
    return x;
}

unittest
{
    byte a = -128;
    auto b = abs(a);
    assert(is(typeof(b) == ubyte));
    assert(b == 128);
}

Jun 24 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 6/24/2016 9:56 AM, Andrei Alexandrescu wrote:
 1. No need for different names (addu/adds) etc., overloading should take care
of
 it (simplifies caller code). Right now the API is an odd mix of overloading and
 distinct names.

The reason for the different names is:

1. Integral promotion rules will affect which overload is selected, when the 
argument type is not an exact match. Few programmers have an exact
understanding 
of integral promotion rules.

2. Things become further obfuscated when type aliases are used.

3. Mixing signed and unsigned integer types calls which overload?

4. People who use core.checkedint want pedantically correct code. Being 
absolutely sure what operation is being done is part of that. Using different 
names makes it trivially inspectable.

5. It is a design mistake to make overloads with the same name have different 
implementations. Arguably, in this context, a signed overflow is different from 
an unsigned overflow, and so should properly be a different name.



 2. The overflow flag should be an integral, not a bool, so you get to count 

how many overflows there were. This is minor but has no extra cost on the happy 
case.

I see no value in this beyond amusement value, and a potential bug:

1. One operation in a sequence that overflows will make subsequent operations 
meaningless, and will likely even cause more overflows. Sticky error flags are 
used in floating point, and I've never heard any desire for nor use case for
counts.

2. The overflow count could conceivably itself overflow, thereby masking the 
failure in rare cases.



 3. There should be an exponentiation function.

Agreed.

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/24/2016 06:23 PM, Walter Bright wrote:
 On 6/24/2016 9:56 AM, Andrei Alexandrescu wrote:
 1. No need for different names (addu/adds) etc., overloading should
 take care of
 it (simplifies caller code). Right now the API is an odd mix of
 overloading and
 distinct names.

 The reason for the different names is:

 1. Integral promotion rules will affect which overload is selected, when
 the argument type is not an exact match. Few programmers have an exact
 understanding of integral promotion rules.

These functions are not supposed to be used naïvely.

 2. Things become further obfuscated when type aliases are used.

These functions are not supposed to be used naïvely.

 3. Mixing signed and unsigned integer types calls which overload?

These functions are not supposed to be used naïvely.

 4. People who use core.checkedint want pedantically correct code. Being
 absolutely sure what operation is being done is part of that. Using
 different names makes it trivially inspectable.

These functions are not supposed to be used naïvely.

 5. It is a design mistake to make overloads with the same name have
 different implementations. Arguably, in this context, a signed overflow
 is different from an unsigned overflow, and so should properly be a
 different name.

Wait, what? std.conv.to much?

  > 2. The overflow flag should be an integral, not a bool, so you get to
 count how many overflows there were. This is minor but has no extra cost
 on the happy case.

 I see no value in this beyond amusement value, and a potential bug:

 1. One operation in a sequence that overflows will make subsequent
 operations meaningless, and will likely even cause more overflows.
 Sticky error flags are used in floating point, and I've never heard any
 desire for nor use case for counts.

OK.

 2. The overflow count could conceivably itself overflow, thereby masking
 the failure in rare cases.

Touché.

  > 3. There should be an exponentiation function.

 Agreed.

Yay.

Andrei

Jun 24 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 6/24/2016 3:29 PM, Andrei Alexandrescu wrote:
 These functions are not supposed to be used naïvely.

They are also meant to be inspectable. I want to look at it and trivially know 
what I am getting. Deep knowledge of subtle cases in the language (and integral 
promotions satisfy that) should not be necessary *especially* for this set of 
functions. People get into enough trouble over misunderstanding whether they
are 
using signed or unsigned arithmetic as it is.

I've used core.checkedint here and there, and have been well satisfied with the 
existing interface.

   int add(int x, int y, ref uint overflows);
   uint add(uint x, uint y, ref uint overflows);
   long add(long x, int y, ref uint overflows);
   ulong add(ulong x, uint y, ref uint overflows);

   ubyte i;
   uint u;
   uint overflow;
   add(i, u, overflow);

Quickly, without trying it out, tell me which one it calls. No cheating!

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/24/2016 06:56 PM, Walter Bright wrote:
 On 6/24/2016 3:29 PM, Andrei Alexandrescu wrote:
 These functions are not supposed to be used naïvely.

 They are also meant to be inspectable. I want to look at it and
 trivially know what I am getting.

Then the right API is add(x, y, overflow). I look at it and I trivially 
know what I am getting. -- Andrei

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

To bring a different angle, the best front-end for arithmetic operations 
that may overflow is:

typeof(lhs + rhs) add(L, R)(const L lhs, const R rhs, ref bool overflow)
if (isIntegral!L && isIntegral!R);

The overflow bit is set if and only if the mathematical result is not 
the same as the concrete result. Interestingly, things like add(5u, -1) 
should succeed without overflow (returning 4u) even though the negative 
value is conceptually converted to a large positive number and the 
operation overflows. (I've implemented this behavior in the DbI checkedint.)

One point that escaped me initially is that sometimes there's no need 
for any checks, e.g. adding any two types smaller than int doesn't need 
to be checked.

I'm not sure whether this function belongs in core.checkedint, but 
that's the right front end design for the primitives in there.


Andrei

Jun 24 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 6/24/2016 3:55 PM, Andrei Alexandrescu wrote:
 Interestingly, things like add(5u, -1) should succeed
 without overflow (returning 4u) even though the negative value is conceptually
 converted to a large positive number and the operation overflows. (I've
 implemented this behavior in the DbI checkedint.)

I believe adding such behavior is beyond the charter of checkedint. I
understand 
the charter is to check for all overflow, undefined and implementation defined 
behaviors, and not go beyond that.

Pedantically, -1 is an int. But (unsigned op signed) converts the latter to 
unsigned, becoming (unsigned op unsigned), and so an overflow occurs.

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/24/2016 07:56 PM, Walter Bright wrote:
 I believe adding such behavior is beyond the charter of checkedint.

Which checkedint? The one in core? -- Andrei

Jun 24 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 6/24/2016 4:59 PM, Andrei Alexandrescu wrote:
 On 06/24/2016 07:56 PM, Walter Bright wrote:
 I believe adding such behavior is beyond the charter of checkedint.

 Which checkedint? The one in core? -- Andrei

Your Checked!int. Sorry for the confusion.

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/24/2016 08:11 PM, Walter Bright wrote:
 On 6/24/2016 4:59 PM, Andrei Alexandrescu wrote:
 On 06/24/2016 07:56 PM, Walter Bright wrote:
 I believe adding such behavior is beyond the charter of checkedint.

 Which checkedint? The one in core? -- Andrei

 Your Checked!int. Sorry for the confusion.

Though you should be able to implement that via a hook, I think at this 
point we need to agree to disagree. The relevant code is at 
https://gist.github.com/andralex/a0c0ad32704e6ba66e458ac48add4a99#file-checked-d-L88:

     static if (!isUnsigned!L)
     {
         if (lhs < 0)
         {
             return subu(Result(rhs), Result(negs(lhs, overflow)),
                 overflow);
         }
     }

With your suggestion, this would also be an overflow:

long x = -1;
auto y = array.length + x;

I would be hard pressed to acknowledge that as an overflow that needs to 
be dynamically signaled. And the beauty of two's complement is that 
indeed it just works.


Andrei

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/24/2016 09:42 PM, Andrei Alexandrescu wrote:
 long x = -1;
 auto y = array.length + x;

 I would be hard pressed to acknowledge that as an overflow that needs to
 be dynamically signaled. And the beauty of two's complement is that
 indeed it just works.

To clarify: if array.length is 0, then indeed that should be signaled as 
an error. But otherwise it should just go through, no overflow. -- Andrei

Jun 24 2016

Observer <here inter.net> writes:

On Saturday, 25 June 2016 at 01:49:22 UTC, Andrei Alexandrescu 
wrote:
 On 06/24/2016 09:42 PM, Andrei Alexandrescu wrote:
 long x = -1;
 auto y = array.length + x;

 I would be hard pressed to acknowledge that as an overflow 
 that needs to
 be dynamically signaled. And the beauty of two's complement is 
 that
 indeed it just works.

 To clarify: if array.length is 0, then indeed that should be 
 signaled as an error. But otherwise it should just go through, 
 no overflow. -- Andrei

I'm curious how one attempts to handle overflow detection in a
user-level implementation.  It seems to me that such calculations 
are
much better handled at the hardware level -- and that means, in 
the
compiler.  To wit, I'm looking at some old processor manuals I 
happen
to have sitting around, since they tend to be quite descriptive
about such details.  My pdp-11 processor handbook says about the V
(overflow) and C (carry) status bits in an ADD instruction:

V:  set if there was arithmetic overflow as a result of the
     operation; that is both operands were of the same sign and the
     result was of the opposite sign; cleared otherwise

C:  set if there was a carry from the most significant bit of the
     result; cleared otherwise

I guess this is just assuming both operands are signed.  It 
doesn't
talk about a separate ADDU (add unsigned) instruction, so I guess
you'd have to just use ADD and interpret the flags appropriately
for that situation.

The point is, there are two separate flags involved, and they need
to be thought about separately and in detail in such contexts.

In the MC68020 32-Bit Microprocessor User's Manual, Appendix A
("Condition Codes Computation") is really nice because it is
precise and goes right down to the bottom-most detail, listing
the logical equations used to set such flags.  For instance, for
the ADD instruction, the V and C flags are set according to these
logical equations:

     V = (Sm and Dm and (not Rm)) or ((not Sm) and (not Dm) and Rm)
     C = (Sm and Dm) or ((not Rm) and Dm) or (Sm and (not Rm))

where:

     Sm = source (first operand) most significant bit
     Dm = destination (second operand) most significant bit
     Rm = result most significant bit

(I've added parentheses to what is presented in the book, assuming
that logical AND is of higher precedence than logical OR.)
I'm assuming that once again, these flags are being set on the
assumption that both operands are signed, as I see no separate
ADDU instruction.

My points are:

* correct overflow detection involves an awful lot of bit-fiddling

* the V (overflow) bit is *not* the same as the C (carry) bit that
   everyone first thinks about when considering whether or not a
   simple add calculation overflowed

I had to wonder, how is this handled at the user level in a
checkedint implementation?  Do you perform a lot of data masking
and testing in user-level code?  Or do you depend on accessing the
machine's hardware flags register and hoping it hasn't changed in
between when you executed the compiled ADD instruction and when
you attempt to access it from high-level code?

I just checked the implementations of adds() and addu() here:

https://github.com/dlang/druntime/blob/master/src/core/checkedint.d

They don't seem to have quite the complexity of the logical 
equations
I just showed.  Not that they're necessarily incorrect as they 
stand;
I'm just curious.

Jun 25 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 6/24/2016 6:42 PM, Andrei Alexandrescu wrote:
 With your suggestion, this would also be an overflow:

 long x = -1;
 auto y = array.length + x;

 I would be hard pressed to acknowledge that as an overflow that needs to be
 dynamically signaled. And the beauty of two's complement is that indeed it just
 works.

I'm curious if that produces an overflow error on the newer clang that 
instruments such things.

Jun 24 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 6/24/2016 6:42 PM, Andrei Alexandrescu wrote:
 With your suggestion, this would also be an overflow:

 long x = -1;
 auto y = array.length + x;

 I would be hard pressed to acknowledge that as an overflow that needs to be
 dynamically signaled. And the beauty of two's complement is that indeed it just
 works.

That's a seductive test case. But I worry that mixed signed/unsigned arithmetic 
is not so simple.

What about:

     x + array.length
     commutativity in general
     associativity

Does this become a morass of special cases?

Jun 24 2016

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/24/2016 11:19 PM, Walter Bright wrote:
 On 6/24/2016 6:42 PM, Andrei Alexandrescu wrote:
 With your suggestion, this would also be an overflow:

 long x = -1;
 auto y = array.length + x;

 I would be hard pressed to acknowledge that as an overflow that needs
 to be
 dynamically signaled. And the beauty of two's complement is that
 indeed it just
 works.

 That's a seductive test case. But I worry that mixed signed/unsigned
 arithmetic is not so simple.

 What about:

      x + array.length
      commutativity in general
      associativity

 Does this become a morass of special cases?

Doesn't seem that way (with some simplifying rules, associativity is 
left to right so not necessarily optimal), but commutativity works 
nicely, please take a close look at 
https://gist.github.com/andralex/a0c0ad32704e6ba66e458ac48add4a99 and 
destroy what you find unfit. And indeed UBSAN is a good baseline to keep 
an eye on. -- Andrei

Jun 24 2016

D Programming

C/C++ Programming

Other

digitalmars.D - core.checkedint