
digitalmars.D - symmetric signed types

reply "Dominikus Dittes Scherkl" writes:
There is one mistake in C that D perpetuates:

The T.min value of signed types.

e.g.

byte a = -128;
auto b = -a;

What type should b get? (of course "byte" but the value doesn't 
fit!)

Also, getting the absolute value of some signed variable
needs to return a different type, or it doesn't work correctly for all
input.
E.g. "ubyte abs(byte)" - a function like this can't even be written as a
template. Or does anybody have a good idea how to express
"unsigned T abs(T)(T x)"?

So I thought I could design a new type "sbyte" with a symmetric
range (-127..127) and an additional value NaN (yes, the old 0x80).
(And of course larger, similar types. By the way: why wasn't
"short" instead called "word"? Then my new type would be "sword" :-)

It all worked well until I found that the new operators !<> !<=
etc. can't be overloaded (they are only available for floating
types)! Why is this so?

D made all the floating-point stuff so much better than C, but the
integral types still suffer the same old flaws.
Jan 23 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Dominikus Dittes Scherkl:

 E.g. "ubyte abs(byte)" - this functions which can't even use a 
 template, or has anybody a good idea ho to express
 "unsigned T abs(T)(T x)"?

I think this is not hard to do in D.
 by the way: why wasn't "short" instead called "word"?

On most modern CPUs a word is longer than a short.
 It all worked well until I found that the new operators !<> !<=
 etc. can't be overloaded (they are only available for floating
 types)! Why is this so?
 D made all the floating point stuff so much better than C, but 
 the integral types still suffer the same old flaws.

Those FP operators are about to be deprecated. Apparently they add too much complexity for what they offer. Using std.math.isNaN should suffice. Bye, bearophile
Jan 23 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Thursday, 23 January 2014 at 14:40:58 UTC, bearophile wrote:
 Dominikus Dittes Scherkl:

 E.g. "ubyte abs(byte)" - this functions which can't even use a 
 template, or has anybody a good idea ho to express
 "unsigned T abs(T)(T x)"?

I think this is not hard to do in D.

How would you express "the corresponding type of the same size, but unsigned"? But anyway, this is not what I wanted to do.
 by the way: why wasn't "short" instead called "word"?

On most modern CPUs a word is longer than a short.

True. But even after 32-bit machines became common, everybody assumed int to be 32 bit (which made lots of code defective) and still used "word" to refer to 16-bit types. Anyway, it's only that I'd love to have a "sword" type in D :-D
 It all worked well until I found that the new operators !<>
 !<= etc. can't be overloaded (they are only available for
 floating types)! Why is this so?
 D made all the floating point stuff so much better than C, but 
 the integral types still suffer the same old flaws.

Those FP operators are about to be deprecated. Apparently they add too much complexity for what they offer. Using std.math.isNaN should suffice.

But for FP - aren't there more values which the special operators take care of? Infinity, negative zero, values near zero, etc.? Hmm - ok, std.math contains them all.

But apparently the abs() function implemented there makes exactly the above-mentioned error: abs(-128) = -128, because it returns the same type as the input has. I hate standard functions that produce bogus results. This is why I think we would be better off with symmetric signed types. They would also provide a perfect init value, just like FP: NaN.

Ok, it costs some performance, but for that I would always prefer unsigned types, where you can do all the bit-fiddling: shifts and overflow-with-carry-bit (oh, wait: this is also not available in D. Why?!? Pretty much every processor has it, and it's available in about every assembler I've ever seen. Why not in D?)

From signed types I expect reasonable performance but safe operations (e.g. everything that produces over- or underflow should deliver NaN and not arbitrary garbage).
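For what it's worth, a carry/overflow check for multiplication doesn't strictly need inline assembler; widening to the next larger type works portably. A rough sketch of the idea (my own illustration, not code from this thread; `mulOverflows` is a made-up name):

```d
// Detect signed 32-bit multiplication overflow by computing the
// exact product in 64 bits and checking it against int's range.
bool mulOverflows(int a, int b)
{
    long wide = cast(long)a * b;
    return wide < int.min || wide > int.max;
}

void main()
{
    assert(!mulOverflows(46_340, 46_340)); // 2_147_395_600 fits in int
    assert(mulOverflows(46_341, 46_341));  // 2_147_488_281 does not
}
```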
Jan 23 2014
prev sibling next sibling parent "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Thursday, 23 January 2014 at 12:09:24 UTC, Dominikus Dittes 
Scherkl wrote:

 Also getting the absolute value of some signed variable
 needs to return a different type, or it doesn't work correctly for all
 input.
 E.g. "ubyte abs(byte)" - a function like this can't even be written as
 a template.
 Or does anybody have a good idea how to express "unsigned T abs(T)(T
 x)"?

import std.traits : Unsigned;

Unsigned!T abs(T)(T x) { /+ magic... +/ }
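For completeness, one way the "magic" could be filled in (a sketch of mine, not Stanislav's actual code; `absU` is a made-up name): negate in the unsigned domain, so that T.min maps to the correct magnitude instead of overflowing.

```d
import std.traits : Unsigned, isSigned;

// Negating after casting to the unsigned type avoids the overflow
// at T.min: e.g. absU(byte(-128)) yields ubyte(128).
Unsigned!T absU(T)(T x) if (isSigned!T)
{
    return x < 0 ? cast(Unsigned!T)(0 - cast(Unsigned!T)x)
                 : cast(Unsigned!T)x;
}

void main()
{
    byte a = -128;
    assert(absU(a) == 128);      // no wraparound; result type is ubyte
    assert(absU(byte(-5)) == 5);
}
```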
Jan 23 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Thursday, 23 January 2014 at 16:52:14 UTC, Stanislav Blinov 
wrote:
 On Thursday, 23 January 2014 at 12:09:24 UTC, Dominikus Dittes 
 Scherkl wrote:

 Also getting the absolute value of some signed variable
 needs to return a different type, or it doesn't work correctly for
 all input.
 E.g. "ubyte abs(byte)" - a function like this can't even be written
 as a template.
 Or does anybody have a good idea how to express "unsigned T abs(T)(T
 x)"?

import std.traits : Unsigned;

Unsigned!T abs(T)(T x) { /+ magic... +/ }

Cool. So why is that not used in std.math.abs?
Jan 23 2014
prev sibling next sibling parent "Stanislav Blinov" <stanislav.blinov gmail.com> writes:
On Thursday, 23 January 2014 at 16:54:29 UTC, Dominikus Dittes 
Scherkl wrote:

 Cool. So why it that not used in std.math.abs?

http://d.puremagic.com/issues/show_bug.cgi?id=8666

Andrei's comment sums it up pretty much. AFAIK there may also be some portability considerations when performing casts.
Jan 23 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/23/14 4:09 AM, Dominikus Dittes Scherkl wrote:
 There is one mistake in C that D perpetuates:

 The T.min value of signed types.

 e.g.

 byte a = -128;
 auto b = -a;

 What type should b get? (of course "byte" but the value doesn't fit!)

The type will be int.
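That follows from the C-style integer promotions D inherits: operands narrower than int are promoted to int before unary minus is applied. A small sketch of what the rules specify (note that downthread a 2.064 compiler is reported to print byte here, apparently a bug):

```d
void main()
{
    byte a = -128;
    auto b = -a;   // 'a' is promoted to int first, then negated
    static assert(is(typeof(b) == int));
    assert(b == 128);   // the value fits, because b is an int
}
```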
 Also getting the absolute value of some signed variable
 needs to return a different type, or it doesn't work correctly for all input.
 E.g. "ubyte abs(byte)" - a function like this can't even be written as a template.
 Or does anybody have a good idea how to express "unsigned T abs(T)(T x)"?

http://dlang.org/phobos/std_conv.html#.unsigned
 So I thought I could design a new type "sbyte" with symmetric range
 (-127..127) and an additional value NaN (yes, the old 0x80).
 (and of course larger, similar types - by the way: why wasn't "short"
 instead called "word"? Then my new type would be "sword" :-)

 It all worked well until I found that the new operators !<> !<= etc.
 can't be overloaded (they are only available for floating types)!
 Why is this so?

We're deprecating the new operators :o).
 D made all the floating-point stuff so much better than C, but the
 integral types still suffer the same old flaws.

There are quite a few improvements for integrals, too, most importantly of the kind that don't exact a speed penalty. Andrei
Jan 23 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2014 12:35 PM, Andrei Alexandrescu wrote:
 On 1/23/14 4:09 AM, Dominikus Dittes Scherkl wrote:
 What type should b get? (of course "byte" but the value doesn't fit!)

The type will be int.

As an aside, the C integral arithmetic rules are often criticized. However, nobody has found another set of rules that didn't come with (sometimes severe) shortcomings of their own. The huge advantage of the C rules is they are very widely known, used, and understood.
Jan 23 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2014 4:50 PM, bearophile wrote:
 Walter Bright:

 The huge advantage of the C rules is they are very widely known, used, and
 understood.

And following them allows me to translate intricate C code to D with less headaches. Still, Go has adopted a different strategy...

I know.

1. Go determines the type of (e1 op e2) as the type of the first operand.

   http://golang.org/ref/spec#Arithmetic_operators

I consider this not only surprising (as we expect + to be commutative), but it can also lead to unexpected truncation, as in (byte = byte + int32). It'll also lead to unexpected signed/unsigned bugs when converting C code.

2. Go also requires casts in order to assign one integral type to another:

   http://golang.org/ref/spec#Assignability

I suspect that leads to a lot of casts, and such makes for hard-to-find bugs when code is refactored. If Go ever gets generics, this effect will get worse.

The Go rules are definitely simpler than the C rules, but I don't see otherwise an improvement in reducing unintended behavior. D, on the other hand, uses C rules with implicit casting as long as value range propagation says truncation will not result. The rules are more complex, but work out surprisingly naturally and I believe are a real advance in reducing unintended behavior. Note that even Dominikus' "mistake" is not one that resulted in unexpected truncation.
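Value range propagation in action, as a small sketch (the exact diagnostics are compiler-dependent):

```d
void main()
{
    ubyte a = 200;

    // OK: the compiler knows (a & 0x0F) can only be 0..15,
    // so the int-typed result implicitly narrows back to ubyte.
    ubyte b = a & 0x0F;

    // Error if uncommented: (a + 1) has range 0..256, which does
    // not fit in ubyte, so an explicit cast would be required.
    // ubyte c = a + 1;

    assert(b == 8);   // 200 == 0b1100_1000; low nibble is 8
}
```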
Jan 23 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2014 5:44 PM, bearophile wrote:
 So despite what the docs say, it seems the two types need to be the same for
the
 sum to work?

I was going by what the spec said.
 While this program compiles:


 package main
 import ("fmt")
 func main() {
      var x byte = 10;
      var y int = 1000;
      z1 := int(x) + y
      fmt.Println(z1)
      z2 := x + byte(y)
      fmt.Println(z2)
 }

Note the casts.
Jan 23 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2014 2:40 AM, Dominikus Dittes Scherkl wrote:
 Ah, ok. Of course the small types always become int.
 But the problem would be the same with

 long a = long.min;
 auto b = -a;

 does this return ulong (which could hold the correct result) or long (and a
 wrong result)?

The negation operator does not change the type, and no operation changes the type as the result of particular runtime operand values.

BTW, Python has what you want - runtime overflow automatically fails over to bignum. But Python is a slow language.
Jan 24 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2014 1:39 PM, Dominikus Dittes Scherkl wrote:
 On Friday, 24 January 2014 at 19:03:59 UTC, Walter Bright wrote:
 On 1/24/2014 2:40 AM, Dominikus Dittes Scherkl wrote:
 Ah, ok. Of course the small types always become int.
 But the problem would be the same with

 long a = long.min;
 auto b = -a;

 does this return ulong (which could hold the correct result) or long (and a
 wrong result)?

The negation operator does not change the type, and no operation changes the type as the result of particular runtime operand values.


No.

1. a was promoted to int before the negation operator was applied.

2. Types do not depend on particular runtime values (the whole notion of static typing would fall apart if it did).
Jan 24 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/24/2014 11:33 PM, Walter Bright wrote:
 ...
 2. types do not depend on particular runtime values (the whole notion of
 static typing would fall apart if it did)

http://en.wikipedia.org/wiki/Dependent_type
Jan 24 2014
parent Piotr Szturmaj <bncrbme jadamspam.pl> writes:
Timon Gehr wrote:
 On 01/24/2014 11:33 PM, Walter Bright wrote:
 ...
 2. types do not depend on particular runtime values (the whole notion of
 static typing would fall apart if it did)

http://en.wikipedia.org/wiki/Dependent_type

http://en.wikipedia.org/wiki/Refinement_type#Refinement_types
Jan 25 2014
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/24/14 2:40 AM, Dominikus Dittes Scherkl wrote:
 On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei Alexandrescu wrote:
 byte a = -128;
 auto b = -a;

 What type should b get? (of course "byte" but the value doesn't fit!)

The type will be int.

But the problem would be the same with

long a = long.min;
auto b = -a;

does this return ulong (which could hold the correct result) or long (and a wrong result)?

long
 integral types still suffer the same old flaws.

There are quite a few improvements for integrals, too, most importantly of the kind that don't exact a speed penalty.

For me the main benefit of signed types is their "ease of use" - like in the "hello world" program, it should be easy to do it right and work "out of the box". Errors like

int a = 2_000_000_000;
int b = a + a;

should not generate weird stuff like -294_967_296 (which it actually does) but rather produce NaN, to indicate that the result is not in the valid range of "int". For addition that may be not too complicated to handle, but for multiplication? There it would be very nice (and fast!!) to have an implementation that checks the carry and sets the result to NaN if the carry is not 0. At the moment doing so requires the use of inline assembler - not really a newbie thing to do...

There's no NaN for integrals.

I initially protested a number of things about the way D's integral expressions are handled. For example I found it ridiculous that unary "-" for uint returns uint. Walter talked me into accepting the C-style rules and improving them with value range propagation.

Andrei
Jan 24 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/25/14 9:15 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Friday, 24 January 2014 at 22:59:08 UTC, Andrei Alexandrescu wrote:
 integral expressions are handled. For example I found it ridiculous
 that unary "-" for uint returns uint. Walter talked

Fortunately most CPUs have ones-complement so the result is correct if you only use one unsigned type.

??? Andrei
Jan 25 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/25/14 9:35 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Saturday, 25 January 2014 at 17:15:56 UTC, Ola Fosheim Grøstad wrote:
 Fortunately most CPUs have ones-complement so the result is

Err... "Two's complement." Nngh! Anyway, the basic idea is to think of minus for unsigned integers as ~x + 1, not as an operation using negative integers. Then it makes sense.
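The identity is easy to check in D (a trivial sketch):

```d
void main()
{
    uint x = 1;
    // Unary minus on an unsigned type is two's-complement negation:
    assert(-x == ~x + 1);
    assert(-x == 0xFFFF_FFFF);   // i.e. uint.max when x == 1
}
```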

Of course it does. It was the point of my post. It's just that the type is surprising from a basic arithmetic viewpoint. Andrei
Jan 25 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/24/2014 3:25 PM, Brad Roberts wrote:
 None of this is new.  It comes up periodically and ends up in the typical place
 of many feature requests, unimplemented.  Create the type, share it, see who
 uses it and how much they gain from the benefits and if they're worth the
 costs.  Conjecture will only get you so far.

I'll chime in that yes, you can create your own integral type in D. You can in C++, too; there is a SafeInt integral type:

http://msdn.microsoft.com/en-us/library/dd570023.aspx
http://safeint.codeplex.com/

It's been around since at least 2004. I haven't noticed it getting much attention or traction. Maybe it would in D; you can always give it a try.
Jan 24 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/24/14 4:25 AM, Dominikus Dittes Scherkl wrote:
 On Friday, 24 January 2014 at 11:43:08 UTC, eles wrote:
 On Friday, 24 January 2014 at 10:40:46 UTC, Dominikus Dittes Scherkl
 wrote:
 On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei Alexandrescu wrote:

 int a = 2_000_000_000;
 int b = a + a;

 should not generate weird stuff like -294_967_296 (which it

Long discussion about signed/unsigned integer overflows...

But that is a HUGE source of errors, even in really carefully developed software in safety critical systems! I think it is well worth a thought to have a safe type in the language

s/language/standard library/
 --> If I write code fast, without thinking about subtleties (like e.g.
 the return type of main() in "hello world"), I expect the compiler to do
 something sensible (ok, I don't expect it from C, but we're talking
 about a better language, aren't we?) and I don't expect highest performance.

 So I would prefer to have safe signed types as default, and maybe new
 types "sbyte", "sshort", "sint" etc. if I need the last bit of
 performance, but without automatic conversion to those unsafe types.
 Using fast signed types with all the over/underflow and other unsafe
 stuff is like manual memory management and pointers instead of GC and
 slices - useful to have in case you really need them, but not the default.

The short answer is - not gonna happen. Proposals for new standard library types and artifacts are of course accepted and encouraged. Andrei
Jan 24 2014
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/25/2014 01:57 PM, Dominikus Dittes Scherkl wrote:
 ...
 Walter wrote:
 "There's no NaN for integrals."

 At least the carry-bit is already available in hardware. So the safe
 type doesn't incur much performance loss in operations. On the other
 hand comparison, assignment and casts become slower by a factor 2 or 3.
 And then comparison cannot be implemented fully correctly with the
 current operator overloading system of D.

Why not?

struct S{ auto opCmp(S r){ return float.nan; } }

void main(){ S s; assert(s!<>=s); }
Jan 25 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/27/2014 02:46 PM, Dominikus Dittes Scherkl wrote:
 On Saturday, 25 January 2014 at 13:43:25 UTC, Timon Gehr wrote:
On 01/25/2014 01:57 PM, Dominikus Dittes Scherkl wrote:
 And then comparison cannot be implemented fully correctly with the
 current operator overloading system of D.

 struct S{ auto opCmp(S r){ return float.nan; } }
 void main(){ S s; assert(s!<>=s); }

Yes, but only for floating-point types - you cannot overload the !<>= operator for integral types

I don't get what this is supposed to mean.
 and it will be deprecated anyway.

So? It was the most convenient way to illustrate that I have defined a not fully ordered type using opCmp.
 And you cannot overload opCmp in a way that the newly defined integer NaN
 will not compare in some way to the other integer values.

Of course you can. Just return float.nan from opCmp in the case that at least one of the arguments is your 'integer NaN'.

float opCmp(sint r){
    if(isNan()||r.isNan()) return float.nan;
    return value<r.value?-1:value>r.value?1:0;
}
 What would be needed is a minimal signed type (2bit with the values -1,
 0, 1 and NaN) and use that in opCmp.

That's not needed in order to get correct comparison behaviour.
Jan 27 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/27/14 6:13 AM, Timon Gehr wrote:
 float opCmp(sint r){
      if(isNan()||r.isNan()) return float.nan;
      return value<r.value?-1:value>r.value?1:0;
 }

Quite a nice trick. Andrei
Jan 27 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 The huge advantage of the C rules is they are very widely 
 known, used, and understood.

And following them allows me to translate intricate C code to D with less headaches. Still, Go has adopted a different strategy... Bye, bearophile
Jan 23 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 1. Go determines the type of (e1 op e2) as the type of the 
 first operand.

   http://golang.org/ref/spec#Arithmetic_operators

 I consider this not only surprising (as we expect + to be 
 commutative) but can lead to unexpected truncation, as in (byte 
 = byte + int32).

I don't know much about Go, but I have read some Go programs, and that comment looks suspicious. So I have created a little Go program:

package main
import ("fmt")
func main() {
    var x byte = 10;
    var y int = 1000;
    z := x + y
    fmt.Println(z)
}

If you don't have a Go compiler you can try some code here:
http://play.golang.org/

The result:
http://play.golang.org/p/iP20v0r566

It gives:

prog.go:6: invalid operation: x + y (mismatched types byte and int)

So despite what the docs say, it seems the two types need to be the same for the sum to work? While this program compiles:

package main
import ("fmt")
func main() {
    var x byte = 10;
    var y int = 1000;
    z1 := int(x) + y
    fmt.Println(z1)
    z2 := x + byte(y)
    fmt.Println(z2)
}

And prints:

1010
242

Bye,
bearophile
Jan 23 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei Alexandrescu 
wrote:
 byte a = -128;
 auto b = -a;

 What type should b get? (of course "byte" but the value 
 doesn't fit!)

The type will be int.

But the problem would be the same with

long a = long.min;
auto b = -a;

does this return ulong (which could hold the correct result) or long (and a wrong result)?
 integral types still suffer the same old flaws.

There are quite a few improvements for integrals, too, most importantly of the kind that don't exact a speed penalty.

For me the main benefit of signed types is their "ease of use" - like in the "hello world" program, it should be easy to do it right and work "out of the box". Errors like

int a = 2_000_000_000;
int b = a + a;

should not generate weird stuff like -294_967_296 (which it actually does) but rather produce NaN, to indicate that the result is not in the valid range of "int". For addition that may be not too complicated to handle, but for multiplication? There it would be very nice (and fast!!) to have an implementation that checks the carry and sets the result to NaN if the carry is not 0. At the moment doing so requires the use of inline assembler - not really a newbie thing to do...
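Detecting the overflow in a + a doesn't actually require inline assembler; a sign-comparison trick works in portable D (a sketch of mine, not from this thread; it relies on D defining signed overflow as two's-complement wraparound):

```d
// Overflow occurred iff both operands have the sign that the
// wrapped-around sum lacks.
bool addOverflows(int a, int b)
{
    int sum = a + b;   // wraps modulo 2^32 in D (defined behavior)
    return ((a ^ sum) & (b ^ sum)) < 0;
}

void main()
{
    int a = 2_000_000_000;
    assert(addOverflows(a, a));   // 4_000_000_000 does not fit in int
    assert(!addOverflows(1, 2));
}
```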
Jan 24 2014
prev sibling next sibling parent "eles" <eles eles.com> writes:
On Friday, 24 January 2014 at 10:40:46 UTC, Dominikus Dittes 
Scherkl wrote:
 On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei 
 Alexandrescu wrote:

 int a = 2_000_000_000;
 int b = a + a;

 should not generate weird stuff like -294_967_296 (which it

Long discussion about signed/unsigned integer overflows...
Jan 24 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Friday, 24 January 2014 at 11:43:08 UTC, eles wrote:
 On Friday, 24 January 2014 at 10:40:46 UTC, Dominikus Dittes 
 Scherkl wrote:
 On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei 
 Alexandrescu wrote:

 int a = 2_000_000_000;
 int b = a + a;

 should not generate weird stuff like -294_967_296 (which it

Long discussion about signed/unsigned integer overflows...

But that is a HUGE source of errors, even in really carefully developed software in safety-critical systems! I think it is well worth a thought to have a safe type in the language, even if we buy it with a small performance tradeoff. Especially for the "automatic" type, where the programmer has not spent much time carefully choosing the types to be used (e.g. the code above, even with "auto" instead of "int").

--> If I write code fast, without thinking about subtleties (like e.g. the return type of main() in "hello world"), I expect the compiler to do something sensible (ok, I don't expect it from C, but we're talking about a better language, aren't we?) and I don't expect highest performance.

So I would prefer to have safe signed types as default, and maybe new types "sbyte", "sshort", "sint" etc. if I need the last bit of performance, but without automatic conversion to those unsafe types. Using fast signed types with all the over/underflow and other unsafe stuff is like manual memory management and pointers instead of GC and slices - useful to have in case you really need them, but not the default.
Jan 24 2014
prev sibling next sibling parent "Meta" <jared771 gmail.com> writes:
On Friday, 24 January 2014 at 12:25:13 UTC, Dominikus Dittes 
Scherkl wrote:
 On Friday, 24 January 2014 at 11:43:08 UTC, eles wrote:
 On Friday, 24 January 2014 at 10:40:46 UTC, Dominikus Dittes 
 Scherkl wrote:
 On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei 
 Alexandrescu wrote:

 int a = 2_000_000_000;
 int b = a + a;

 should not generate weird stuff like -294_967_296 (which it

Long discussion about signed/unsigned integer overflows...

But that is a HUGE source of errors, even in really carefully developed software in safety-critical systems! I think it is well worth a thought to have a safe type in the language, even if we buy it with a small performance tradeoff. Especially for the "automatic" type, where the programmer has not spent much time carefully choosing the types to be used (e.g. the code above, even with "auto" instead of "int").

--> If I write code fast, without thinking about subtleties (like e.g. the return type of main() in "hello world"), I expect the compiler to do something sensible (ok, I don't expect it from C, but we're talking about a better language, aren't we?) and I don't expect highest performance.

So I would prefer to have safe signed types as default, and maybe new types "sbyte", "sshort", "sint" etc. if I need the last bit of performance, but without automatic conversion to those unsafe types. Using fast signed types with all the over/underflow and other unsafe stuff is like manual memory management and pointers instead of GC and slices - useful to have in case you really need them, but not the default.

On the Rust mailing list, there's recently been discussion about auto-promotion to BigInt in case of overflow. Maybe that's a discussion we should be having as well?
Jan 24 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Friday, 24 January 2014 at 13:30:06 UTC, Meta wrote:
 On the Rust mailing list, there's recently been discussion 
 about auto-promotion to BigInt in case of overflow. Maybe 
 that's a discussion we should be having as well?

Nice idea. But is any overflow known at compile-time? Also a really unexpected auto-type... I had something very simple in mind:

1) get rid of the asymmetric T.min value that always causes problems with abs()
2) instead use this special value as NaN
3) let NaN be the init-value of the signed types
4) let every over-/underflow result in NaN
5) let every operation involving NaN result in NaN
6) let any cast from other types to the safe signed types check the range and set NaN if the value doesn't fit

None of that should be too expensive, but with such a type you can simply execute the program, and if it results in NaN you know there has been some overflow (or an uninitialized variable). That makes analyzing easy, allows for simple contracts, is easy to catch, and makes it easy to decide what solution would be best (e.g. using the next bigger type or limiting the values).

And if performance is critical (which should be true only in some inner loop where one can be sure that no overflow is possible), as a next step the now fool-proof program can be changed to use unsafe types (because they use the same range plus one extra value that hopefully never occurs anyway).
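A minimal sketch of such a type (my own illustration of the points above, not the actual implementation under discussion; all names are made up):

```d
// "SByte": a byte with symmetric range -127..127, where 0x80
// (byte.min) is reserved as NaN - the init value and the result
// of any overflow.
struct SByte
{
    byte value = byte.min;   // init == NaN, as in point 3)

    bool isNaN() const { return value == byte.min; }

    SByte opBinary(string op : "+")(SByte r) const
    {
        if (isNaN || r.isNaN) return SByte.init;        // NaN propagates
        int sum = value + r.value;                      // promoted to int
        if (sum < -127 || sum > 127) return SByte.init; // overflow -> NaN
        return SByte(cast(byte)sum);
    }

    float opCmp(SByte r) const
    {
        if (isNaN || r.isNaN) return float.nan;         // unordered
        return value < r.value ? -1 : value > r.value ? 1 : 0;
    }
}

void main()
{
    SByte a = SByte(100), b = SByte(100);
    assert((a + b).isNaN);      // 200 is out of range -> NaN
    assert(SByte.init.isNaN);   // default value is NaN
}
```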
Jan 24 2014
prev sibling next sibling parent "eles" <eles eles.com> writes:
On Friday, 24 January 2014 at 12:25:13 UTC, Dominikus Dittes 
Scherkl wrote:
 On Friday, 24 January 2014 at 11:43:08 UTC, eles wrote:
 On Friday, 24 January 2014 at 10:40:46 UTC, Dominikus Dittes 
 Scherkl wrote:
 On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei 
 Alexandrescu wrote:



 But that is a HUGE source of errors, even in really carefully 
 developed software in safety critical systems!

I know, I know, I do exactly that for my bread. I am on your side. Walter isn't ;)
Jan 24 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Friday, 24 January 2014 at 19:03:59 UTC, Walter Bright wrote:
 On 1/24/2014 2:40 AM, Dominikus Dittes Scherkl wrote:
 Ah, ok. Of course the small types always become int.
 But the problem would be the same with

 long a = long.min;
 auto b = -a;

 does this return ulong (which could hold the correct result) 
 or long (and a
 wrong result)?

The negation operator does not change the type, and no operation changes the type as the result of particular runtime operand values.

But doesn't exactly that happen in the above example?

byte a = -128;
auto b = -a;

Or is changing byte to int no type change?!?
 BTW, Python has what you want - runtime overflow automatically 
 fails over to bignum. But Python is a slow language.

I don't ask for such a feature in D. I'm fine with the safe signed type I've developed. The only flaw in my struct for now is that I'm not able to overload opCmp in such a way that NaN compared to anything else would always be false (either < or >= is true, because they can't be overloaded separately) :-/

But checking for NaN before any calculation is always the better way, so that flaw doesn't hit too hard. I will do some performance checks to see how "slow" this will make some heavy calculations in reality.
Jan 24 2014
prev sibling next sibling parent Brad Roberts <braddr puremagic.com> writes:
On 1/24/14 6:08 AM, Dominikus Dittes Scherkl wrote:
 On Friday, 24 January 2014 at 13:30:06 UTC, Meta wrote:
 On the Rust mailing list, there's recently been discussion about
auto-promotion to BigInt in case
 of overflow. Maybe that's a discussion we should be having as well?

Nice idea. But is any overflow known at compile-time? Also a really unexpected auto-type... I had something very simple in mind:

1) get rid of the asymmetric T.min value that always causes problems with abs()
2) instead use this special value as NaN
3) let NaN be the init-value of the signed types
4) let every over-/underflow result in NaN
5) let every operation involving NaN result in NaN
6) let any cast from other types to the safe signed types check the range and set NaN if the value doesn't fit

None of that should be too expensive, but with such a type you can simply execute the program, and if it results in NaN you know there has been some overflow (or an uninitialized variable). That makes analyzing easy, allows for simple contracts, is easy to catch, and makes it easy to decide what solution would be best (e.g. using the next bigger type or limiting the values). And if performance is critical (which should be true only in some inner loop where one can be sure that no overflow is possible), as a next step the now fool-proof program can be changed to use unsafe types (because they use the same range plus one extra value that hopefully never occurs anyway).

The only reason that NaN works in the FP world is that it's done in hardware as part of existing operations. For it to work in the integer world it'd require adding operations to do the extra checking and handling, with the resulting performance costs of doing so. It's, pretty much by definition, too expensive for a set of people and the apps they care about. The only practical way is to introduce a new type that behaves like this and use that instead. If, over the course of years, it becomes popular enough, it's not impossible that some CPU makers could decide to enshrine it in new instructions. But I wouldn't hold your breath. :)

None of this is new. It comes up periodically and ends up in the typical place of many feature requests, unimplemented. Create the type, share it, see who uses it and how much they gain from the benefits and if they're worth the costs. Conjecture will only get you so far.

My 2 cents,
Brad
Jan 24 2014
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 23 January 2014 at 20:35:56 UTC, Andrei Alexandrescu 
wrote:
 On 1/23/14 4:09 AM, Dominikus Dittes Scherkl wrote:
 There is one mistake in C that D perpetuates:

 The T.min value of signed types.

 e.g.

 byte a = -128;
 auto b = -a;

 What type should b get? (of course "byte" but the value 
 doesn't fit!)

The type will be int.

import std.stdio;

void main() {
    byte a = -128;
    auto b = -a;
    writeln(typeof(b).stringof);
    writeln(b);
}

----
2.064 output:

byte
-128

Is this a bug or is it already fixed in 2.065?
Jan 24 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Friday, 24 January 2014 at 23:26:04 UTC, Brad Roberts wrote:
 On 1/24/14 6:08 AM, Dominikus Dittes Scherkl wrote:
 I had something very simple in mind:
 1) get rid of the asymmetric T.min value
    that always causes problems with abs()
 2) instead use this special value as NaN
 3) let NaN be the init-value of the signed types
 4) let every over-/underflow result in NaN
 5) let every operation involving NaN result in NaN
 6) let any cast from other types to the safe
    signed types check range and set NaN if the
    value doesn't fit

The only reason that NaN works in the FP world is that it's done in hardware as part of existing operations.

Walter wrote: "There's no NaN for integrals."

At least the carry bit is already available in hardware, so the safe type doesn't incur much performance loss in operations. On the other hand, comparison, assignment and casts become slower by a factor of 2 or 3. And then comparison cannot be implemented fully correctly with the current operator overloading system of D.
 It's, pretty much by definition, too expensive for a
 set of people and the apps they care about.

unsafe types - like they use manual memory management instead of GC.
 The only practical way is to introduce a new type that behaves 
 like this and use that instead.

Jan 25 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Friday, 24 January 2014 at 22:59:08 UTC, Andrei Alexandrescu 
wrote:
 integral expressions are handled. For example I found it 
 ridiculous that unary "-" for uint returns uint. Walter talked

Fortunately most CPUs have ones-complement so the result is correct if you only use one unsigned type.

-$01 == (~$01)+1
-$01 == $ff
$ff+$03 == ($102)&($ff)
$ff+$03 == $02

This is useful for interpolating oscillators (going from 1 to 0, or 0 to 1):

ulong phase, delta;
...
phase += delta;
sample1 = wavetable[phase>>16];
sample2 = wavetable[(phase>>16) + 1];
return interpolate16bit(sample1, sample2, phase&0xffff);

It is the automatic promoting of unsigned types that is a C-language bug IMHO.
Jan 25 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 25 January 2014 at 17:15:56 UTC, Ola Fosheim Gr√łstad 
wrote:
 Fortunately most CPUS have ones-complement so the result is

Err... "Two's complement." Nngh!

Anyway, the basic idea is to think of minus for unsigned integers as ~x + 1, not as an operation using negative integers. Then it makes sense.
Jan 25 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Saturday, 25 January 2014 at 18:53:29 UTC, Andrei Alexandrescu 
wrote:
 Of course it does. It was the point of my post. It's just that 
 the type is surprising from a basic arithmetic viewpoint.

Yes, I agree. I think it is messy to have implicit promotion of unsigned values. For signed ints you get sign extension, so they are less problematic. If the programmer does not care about bit patterns he probably just uses regular ints anyway, so why promote when it creates bugs that are hard to track down?
Jan 25 2014
prev sibling next sibling parent "Dominikus Dittes Scherkl" writes:
On Saturday, 25 January 2014 at 13:43:25 UTC, Timon Gehr wrote:
 Why not?

 struct S{
     auto opCmp(S r){ return float.nan; }
 }

 void main(){
     S s;
     assert(s!<>=s);
 }

Yes, but only for floating-point types - you cannot overload the !<>= operator for integral types, and it will be deprecated anyway. And you cannot overload opCmp in a way that the newly defined integer NaN will not compare in some way to the other integer values. What would be needed is a minimal signed type (2 bits, with the values -1, 0, 1 and NaN) to use in opCmp.
Jan 27 2014
prev sibling parent "Dominikus Dittes Scherkl" writes:
On Monday, 27 January 2014 at 14:13:36 UTC, Timon Gehr wrote:
 So? It was the most convenient way to illustrate that I have 
 defined a not fully ordered type using opCmp.

 And you cannot opverload opCmp in a way that the new defined 
 integer NaN
 will not compare in some way to the other integer values.

Of course you can. Just return float.nan from opCmp in the case that at least one of the arguments is your 'integer NaN'.

float opCmp(sint r) {
    if (isNan() || r.isNan())
        return float.nan;
    return value < r.value ? -1 : value > r.value ? 1 : 0;
}
 What would be needed is a minimal signed type (2bit with the 
 values -1,
 0, 1 and NaN) and use that in opCmp.

That's not needed in order to get correct comparison behaviour.

Ah, ok. Now I understand. Not inventing a type with NaN, but instead using one that provides it - really a nice trick. I never thought of a comparison operator returning a float. Cool. Thank you very much.
Jan 27 2014