www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - C++ bounded::integer library

reply "bearophile" <bearophileHUGS lycos.com> writes:
I am reading all the slides in this page (some are still missing 
and will be added later):
https://github.com/boostcon/cppnow_presentations_2014

Here are the slides of a nice talk, "Removing undefined behavior 
from integer operations: the bounded::integer library":

https://github.com/boostcon/cppnow_presentations_2014/blob/master/files/bounded_integer.pdf?raw=true

Some design decisions are not good (like not throwing at run-time 
on default, the long name, and more). Near the end of the slides 
pack it shows that a bit of language support can help improve 
both the syntax and the usage.

Bye,
bearophile
May 17 2014
next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 Here are the slides of a nice talk, "Removing undefined 
 behavior from integer operations: the bounded::integer library":

This may be a good start: https://github.com/nordlow/justd/blob/master/bound.d
May 18 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Nordlöw:

 This may be a good start:

 https://github.com/nordlow/justd/blob/master/bound.d

I presume some ways to improve it are to add to core.bitop some D standard intrinsics to detect overflows and carry, to increase run-time performance to sufficient levels. If they are not fast, people will be less willing to used them. Another way to improve it is to add to D something like the "enum preconditions" I've discussed in past. So this bug can be detected at compile-time: Bound!byte b = 200; A third possible improvement is to add a __traits(value_range, exp) trait that returns the value range of an expression. This could be useful to implement the bound integers well. Other possible more complex problems are shown in that talk about bounded::integer. Bye, bearophile
May 18 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 Another way to improve it is to add to D something like the 
 "enum preconditions" I've discussed in past. So this bug can be 
 detected at compile-time:

 Bound!byte b = 200;

This is already handled For example const b127 = saturated!byte(127); compiles but const b128 = saturated!byte(128); errors as bound.d(421,32): Error: saturated (inout(byte) x) is not callable using argument types (int)
May 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Nordlöw:

 Bound!byte b = 200;

This is already handled For example const b127 = saturated!byte(127); compiles but const b128 = saturated!byte(128); errors as bound.d(421,32): Error: saturated (inout(byte) x) is not callable using argument types (int)

Is this giving a compile-time error? saturated!byte b = 127; Bye, bearophile
May 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
 Is this giving a compile-time error?

 saturated!byte b = 127;

Sorry, I meant something different. I am not sure you can give compile-time errors in most of the significant cases with the current D language. Please correct me if I am wrong. Bye, bearophile
May 19 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Monday, 19 May 2014 at 19:54:07 UTC, bearophile wrote:
 Nordlöw:

 Bound!byte b = 200;

This is already handled For example const b127 = saturated!byte(127); compiles but const b128 = saturated!byte(128); errors as bound.d(421,32): Error: saturated (inout(byte) x) is not callable using argument types (int)

Is this giving a compile-time error? saturated!byte b = 127; Bye, bearophile

Yes saturated!byte bb = 127; fails as bound.d(425,20): Error: saturated!(byte, true) is used as a type because saturated is an instantiator function. Shouldn't it?
May 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Nordlöw:

 because saturated is an instantiator function. Shouldn't it?

I still don't understand how you can design a good enough ranged integer without something like "enum preconditions". Functions run at run-time, so currently in D you can't enforce the correctness of literals for not-enum values at compile time. So in general the mutable ranged integer can only give the error at run-time, and this is not acceptable. I am missing something? Bye, bearophile
May 19 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 the error at run-time, and this is not acceptable. I am missing 
 something?

My current solution only supports ranges (min max as template arguments) known at compile-time. It would be nice if we could add a feature to D that specializes return value of bound instantiator based upon if its normal value arguments are constants (enums) and if so infers range checking at compile-time otherwise at run-time. I have already asked for this at the forums, previously. Does this answer your questions? /Per
May 19 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 this at the forums, previously. Does this answer your questions?

To clarify: I personally think we need a completely new language feature in D with which to compile-time-introspect function call arguments to check whether they adhere to certain requirements or matches for example the range of other given types.
May 19 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Nordlöw:

 I personally think we need a completely new language feature in 
 D with which to compile-time-introspect function call arguments 
 to check whether they adhere to certain requirements or matches 
 for example the range of other given types.

I think "enum preconditions" are exactly that :-) But I don't know if they are good enough, if they are a good idea, of if there are better ways to do similar things. But I think something like that is an useful improvement for D, able to remove another limitation of user defined types compared to built-ins. Bye, bearophile
May 19 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
 I think "enum preconditions" are exactly that :-) But I don't

Great! :) This can also be used to specialize construction of linear algebra types such as vectors, matrices etc to use stack-allocated value (simd) types when dimensions are fixed and small.
May 19 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Sunday, 18 May 2014 at 21:58:54 UTC, bearophile wrote:
 I presume some ways to improve it are to add to core.bitop some 
 D standard intrinsics to detect overflows and carry, to 
 increase run-time performance to sufficient levels. If they are 
 not fast, people will be less willing to used them.

asymmetric min value of signed types as "NaN"), but one side-product is a somewhat useful function to check for overflow (it doesn't throw, instead the result will be T.max for unsigned arguments or T.min for signed arguments). It doesn't create much overhead and is very easy to do: T saveOp(string op, T)(T x, T y) pure save nogc if(isIntegral!T && (op=="+" || op=="-" || op=="<<" || op=="*")) { mixin("x "~op~"= y"); static if(isSigned!T) { static if(op == "*") { asm naked { jnc opok; } } else { asm naked { jno opok; } } x = T.min; } else // unsigned { asm naked { jnc opok; } x = T.max; } opok: return x; }
May 20 2014
prev sibling next sibling parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Tuesday, 20 May 2014 at 09:45:12 UTC, Dominikus Dittes Scherkl 
wrote:
 On Sunday, 18 May 2014 at 21:58:54 UTC, bearophile wrote:
 I presume some ways to improve it are to add to core.bitop 
 some D standard intrinsics to detect overflows and carry, to 
 increase run-time performance to sufficient levels. If they 
 are not fast, people will be less willing to used them.

asymmetric min value of signed types as "NaN"), but one side-product is a somewhat useful function to check for overflow (it doesn't throw, instead the result will be T.max for unsigned arguments or T.min for signed arguments). It doesn't create much overhead and is very easy to do: T saveOp(string op, T)(T x, T y) pure save nogc if(isIntegral!T && (op=="+" || op=="-" || op=="<<" || op=="*")) { mixin("x "~op~"= y"); static if(isSigned!T) { static if(op == "*") { asm naked { jnc opok; } } else { asm naked { jno opok; } } x = T.min; } else // unsigned { asm naked { jnc opok; } x = T.max; } opok: return x; }

Interesting. You have a typo: save instead of safe You should also guard the use of asm to x86 architectures only with version.
May 20 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Monday, 19 May 2014 at 23:02:01 UTC, bearophile wrote:
 I think "enum preconditions" are exactly that :-) But I don't 
 know if they are good enough, if they are a good idea, of if 
 there are better ways to do similar things. But I think 
 something like that is an useful improvement for D, able to 
 remove another limitation of user defined types compared to 
 built-ins.

 Bye,
 bearophile

What's the difference between your proposed "enum precondition" and: void fun(int n) in { static assert(...); } out { static assert(...); } body { //... }
May 20 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Meta:

 What's the difference between your proposed "enum precondition" 
 and:

 void fun(int n)
 in
 {
     static assert(...);
 }
 out
 {
     static assert(...);
 }
 body
 {
     //...
 }

Those static asserts are useless, because the argument n is not a "static" one. Bye, bearophile
May 20 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Tuesday, 20 May 2014 at 13:03:59 UTC, Nordlöw wrote:
 You have a typo:  save instead of  safe

 You should also guard the use of asm to x86 architectures only 
 with version.

I thought the D assembler should be generic? So if the target platform provides some similar mnemonic, I could use the same? But ok. It's just the do-it-yourself overflow protection is so awfully slow (especially for multiplication). I have also a protection for exponentiation, and that works without asm. On the other side it requires some helper functions: /// returns the absolute value of x in the fitting unsigned type /// (doesn't fail for T.min) Unsigned!T abs(T)(T x) pure safe nogc if(isIntegral!T) { return (x < 0) ? -x : x; } /// returns the number of the highest set bit +1 in the given value /// or 0 if no bit is set ubyte bitlen(ubyte c) pure safe nogc { return (c>15) ? (c>63) ? (c>127) ? 8 : 7 : (c>31) ? 6 : 5 : (c>3) ? (c>7) ? 4 : 3 : (c>1) ? 2 : c; // of course lookup table would be better but I'm to lazy to create one } /// calculate the maximal power of x that would fit in an ucent /// for smaller types simply shift down the result /// (e.g. divide by 4 to calc the maxpower that would fit in an uint) ubyte maxpow(ulong x) pure safe nogc { assert(x>1); // no useful maxpower exists for 0 and 1 static immutable ubyte[10] mp = [ 127, 80, 63, 55, 49, 45, 43, 40, 38, 37 ]; return (x<139) ? ((x<31) ? ((x<20) ? ((x<16) ? ((x<12) ? mp[x-2] : 47-x) : ((x<18) ? 31 : 30)) : ((x<24) ? ((x<22) ? 29 : 28) : ((x<27) ? 27 : 26))) : ((x<57) ? ((x<41) ? ((x<35) ? 25 : 24) : ((x<48) ? 23 : 22)) : ((x<85) ? ((x<69) ? 21 : 20) : ((x<107) ? 19 : 18)))) : ((x<7132) ? ((x<566) ? ((x<256) ? ((x<185) ? 17 : 16) : ((x<371) ? 15 : 14)) : ((x<1626) ? ((x<921) ? 13 : 12) : ((x<3184) ? 11 : 10))) : ((x<2642246) ? ((x<65536) ? ((x<19113) ? 9 : 8) : ((x<319558) ? 7 : 6)) : ((x<4294967296) ? ((x<50859009) ? 5 : 4) : ((x<6981463658332) ? 3 : 2)))); } /// full exponentiation without overflow /// return T.max(for unsigned) or T.min (for signed) if overflow would occur. T savePow(T)(T base, ubyte exp) pure safe nogc if(isIntegral!T) { static if(isUnsigned!T) { if(!exp) return 1; // x^^0 is always 1 if(exp == 1 || base < 2) return base; // x^^1 = x, so do nothing static if(T.sizeof > ulong.sizeof) { if(base > ulong.max) return T.max; } if(exp > (maxpow(base)>>(5-bitlen(T.sizeof)))) return T.max; return base ^^ exp; } else { auto r = savePow(abs!T(base), exp); if(r > T.max) return T.min; return ((base < 0) && (exp&1)) ? -cast(T)r : cast(T)r; } }
May 20 2014
prev sibling next sibling parent "david stone" <david doublewise.net> writes:
On Saturday, 17 May 2014 at 17:31:18 UTC, bearophile wrote:
 I am reading all the slides in this page (some are still 
 missing and will be added later):
 https://github.com/boostcon/cppnow_presentations_2014

 Here are the slides of a nice talk, "Removing undefined 
 behavior from integer operations: the bounded::integer library":

 https://github.com/boostcon/cppnow_presentations_2014/blob/master/files/bounded_integer.pdf?raw=true

 Some design decisions are not good (like not throwing at 
 run-time on default, the long name, and more). Near the end of 
 the slides pack it shows that a bit of language support can 
 help improve both the syntax and the usage.

 Bye,
 bearophile

Hello bearophile. I am the presenter / author of said library, so I feel I should comment on this. First, the default overflow policy. This is an issue I have struggled with a bit. The current default is to mirror built-in integers in terms of performance. If a user types bounded::integer<0, 10>, they get a type with no run-time checks. I did consider making bounded::throw_policy the default (right now, the typedef for that is bounded::checked_integer<0, 10>). My reasoning for this that I suspect this is the behavior that most people want. I could be convinced otherwise primarily by way of proving that compilers can optimize away the run-time bounds check for common cases: situations like the integer increment for a loop index variable. I've also considered a debug assert statement for the default. Second, the name. If you have suggestions for something better, I am listening. I definitely understand the importance of common types being short. The previous version of this library had the namespace as bounded_integer and the class type as bounded_integer, so instead of bounded::integer<0, 10>, you had to refer to it as bounded_integer::bounded_integer<0, 10>, which was even worse. My ideal would probably be something like int<0, 10>, but I'm not getting that for fairly obvious reasons. I do find myself not typing out the name of the type very often, though, preferring to let auto deduce the types for me. You say you have more criticisms of the library design, so I hope they are nothing too fundamentally flawed. I would like to hear them so that I can fix them (or perhaps I addressed the issue in my talk? The video is not yet posted). Thank you for the link :)
Jul 15 2014
prev sibling parent "david stone" <david doublewise.net> writes:
The video of this talk has been uploaded to 
https://www.youtube.com/watch?v=hgeErnYxAUw
Aug 15 2014