
digitalmars.D - disabling unary "-" for unsigned types

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
ulong x0;
static assert(!__traits(compiles, -x0));
uint x1;
static assert(!__traits(compiles, -x1));
ushort x2;
static assert(!__traits(compiles, -x2));
ubyte x3;
static assert(!__traits(compiles, -x3));

Sounds good?

Andrei
Feb 14 2010
next sibling parent reply Justin Johansson <no spam.com> writes:
Andrei Alexandrescu wrote:
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));
 
 Sounds good?
 
 Andrei

Sounds excellent. Who would have thought of that? Cheers Justin Johansson
Feb 14 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Justin Johansson wrote:
 Andrei Alexandrescu wrote:
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));

 Sounds good?

 Andrei

Sounds excellent. Who would have thought of that? Cheers Justin Johansson

Actually Walter just talked me into forgetting about it. -x is conceptually rewritten into ~x + 1 for all types and typed accordingly. I'm dropping this in order to keep focused on more important changes. Andrei
Feb 14 2010
next sibling parent reply Justin Johansson <no spam.com> writes:
Andrei Alexandrescu wrote:
 Justin Johansson wrote:
 Andrei Alexandrescu wrote:
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));

 Sounds good?

 Andrei

Sounds excellent. Who would have thought of that? Cheers Justin Johansson

Actually Walter just talked me into forgetting about it. -x is conceptually rewritten into ~x + 1 for all types and typed accordingly.

Oh, okay. Who would have thought of that? :-)
 I'm dropping this in order to keep focused on more important changes.

This sounds good too. We are all anxiously awaiting the publication of TDPL. What's the latest ETA? Justin
 Andrei

Feb 14 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Justin Johansson wrote:
 Andrei Alexandrescu wrote:
 Justin Johansson wrote:
 Andrei Alexandrescu wrote:
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));

 Sounds good?

 Andrei

Sounds excellent. Who would have thought of that? Cheers Justin Johansson

Actually Walter just talked me into forgetting about it. -x is conceptually rewritten into ~x + 1 for all types and typed accordingly.

Oh, okay. Who would have thought of that? :-)
 I'm dropping this in order to keep focused on more important changes.

This sounds good too. We are all anxiously awaiting the publication of TDPL. What's the latest ETA?

I'm on schedule for late April. With the chapter on concurrency (over 40 pages alone), the size of the book has grown a fair amount. But hey, I even give two lock-free examples. Andrei
Feb 14 2010
prev sibling parent Janzert <janzert janzert.com> writes:
Andrei Alexandrescu wrote:
 Justin Johansson wrote:
 Andrei Alexandrescu wrote:
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));

 Sounds good?

 Andrei

Sounds excellent. Who would have thought of that? Cheers Justin Johansson

Actually Walter just talked me into forgetting about it. -x is conceptually rewritten into ~x + 1 for all types and typed accordingly. I'm dropping this in order to keep focused on more important changes. Andrei

I just want to add I'm glad to see this is staying since I use (x & -x) rather frequently on unsigned types to get one set bit. Obviously I could just expand it to the full ~x + 1 but the current idiom is nice, short and easily recognizable. Janzert
Feb 15 2010
prev sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));
 Sounds good?
 Andrei

The more you bring up features to give the axe to, the more I find it funny that, not being a very good language lawyer, I wasn't aware that half of these features existed in the first place. This is one of them. Yes, definitely get rid of it. It makes absolutely no sense. If you want to treat your number like a signed int, then it should require an explicit cast.
Feb 14 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
dsimcha wrote:
 The more you bring up features to give the axe to, the more I find it funny
that,
 not being a very good language lawyer, I wasn't aware that half of these
features
 existed in the first place.

D is a very ambitious language, and we are definitely shooting for the stars with it. That means that there are many features that missed or otherwise failed to find a target. So we're doing a little pruning for D2, and a big part of posting these possible prunes here is to make sure we aren't missing an important use case for them. One problem C/C++ has is there are several failed features in them that the C/C++ community has repeatedly tried to excise, but they just won't die. Trigraphs are an example.
Feb 14 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 dsimcha wrote:
 The more you bring up features to give the axe to, the more I find it 
 funny that,
 not being a very good language lawyer, I wasn't aware that half of 
 these features
 existed in the first place.

D is a very ambitious language, and we are definitely shooting for the stars with it. That means that there are many features that missed or otherwise failed to find a target. So we're doing a little pruning for D2, and a big part of posting these possible prunes here is to make sure we aren't missing an important use case for them. One problem C/C++ has is there are several failed features in them that the C/C++ community has repeatedly tried to excise, but they just won't die. Trigraphs are an example.

And custom operator new is another :o). Andrei
Feb 14 2010
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 D is a very ambitious language, and we are definitely shooting for the 
 stars with it. That means that there are many features that missed or 
 otherwise failed to find a target. So we're doing a little pruning for 
 D2, and a big part of posting these possible prunes here is to make sure 
 we aren't missing an important use case for them.

D2 is a mix of several different kinds of features:

- C features. Some of them are not perfect, but we know their ups and downs, their risks and their qualities.
- Some features are patches/fixes over C features that are known to be dangerous, not handy, bug prone, etc. Walter has enough experience with C that they are probably all good things.
- Some new features that come from years of D1 development/usage or from other languages. They are not perfect (for example you can by mistake put the underscore in a wrong place in a number literal, and this can produce a deceiving number. Ideally in base 10 underscores can be syntactically allowed only every 3 digits and in base 2 or 6 every 4,8,16,32,16 digits only), but they are generally known enough to be safe enough bets.
- And finally in D2 there are several new features that are sometimes only half-implemented, and generally no one has tried them in long programs; they seem to come from just the minds of a few (intelligent) people, and they don't seem battle-tested at all. Such new features are a dangerous bet; they can hide many traps and problems. Finalizing the D2 language before people have actually tried to use such features in some larger programs looks dangerous.

Recently I have understood that this is why Simon Peyton-Jones said "Avoid success at all costs" regarding Haskell, which he has slowly developed for about 15 years: to give the language the time to be tuned, to remove warts, to improve it before people start to use it for real and it needs to be frozen (today we are probably in a phase when Haskell has to be frozen, because there is enough software written in it that you can't lightly break backward compatibility). So I am a little worried about some of the latest features introduced in D2. I don't know if D3 can solve this problem (maybe not). Bye, bearophile
Feb 14 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
 Ideally in base 10 underscores can be syntactically allowed only every 3
digits and in base 2 or 6 every 4,8,16,32,16 digits only),

Sorry, I meant: Ideally in base 10 underscores can be syntactically allowed only every 3 digits (starting from the least significant) and in base 2, 8 or 16 every 4, 8, 16, 32, 64, etc. digits only. Bye, bearophile
Feb 14 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 - And finally in D2 there are several new features that are sometimes
 only half-implemented, and generally no one has tried them in long
 programs, they seem to come from just the mind of few (intelligent)
 people, they don't seem battle-tested at all. Such new features are a
 dangerous bet, they can hide many traps and problems. Finalizing the
 D2 language before people have actually tried to use such features in
 some larger programs looks dangerous. Recently I have understood that
 this is why Simon Peyton-Jones said "Avoid success at all costs"
 regarding Haskell, that he has slowly developed for about 15 years:
 to give the language the time to be tuned, to remove warts, to
 improve it before people start to use it for rear and it needs to be
 frozen (today we are probably in a phase when Haskell has to be
 frozen, because there is enough software written in it that you can't
 lightly break backward compatibility).

The response of the Haskell community seems to be "avoid avoiding success". Anyway, either slogan shouldn't be taken out of context, and I don't think the situations of the two languages are easily comparable. For example, a few years ago monads weren't around. At that point, a different I/O method was considered "it" for functional programs (I swear I knew which, but I forgot). Behind closed doors every functional language designer was scratching their head trying to find a better way. It's good Haskell didn't commit to the now obsolete I/O method - as good as D not committing to yesteryear's threading model.
 So I am a little worried for some of the last features introduced in
 D2. I don't know if D3 can solve this problem (maybe not).

I read the entire post waiting for the punchline. Which features do you have in mind? Andrei
Feb 14 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
retard wrote:
 Sun, 14 Feb 2010 17:36:59 -0600, Andrei Alexandrescu wrote:
 
 bearophile wrote:
 - And finally in D2 there are several new features that are sometimes
 only half-implemented, and generally no one has tried them in long
 programs, they seem to come from just the mind of few (intelligent)
 people, they don't seem battle-tested at all. Such new features are a
 dangerous bet, they can hide many traps and problems. Finalizing the D2
 language before people have actually tried to use such features in some
 larger programs looks dangerous. Recently I have understood that this
 is why Simon Peyton-Jones said "Avoid success at all costs" regarding
 Haskell, that he has slowly developed for about 15 years: to give the
 language the time to be tuned, to remove warts, to improve it before
 people start to use it for rear and it needs to be frozen (today we are
 probably in a phase when Haskell has to be frozen, because there is
 enough software written in it that you can't lightly break backward
 compatibility).

The response of the Haskell community seems to be "avoid avoiding success". Anyway, either slogan shouldn't be taken out of context, and I don't think the situations of the two languages are easily comparable. For example, a few years ago monads weren't around. At that point, a different I/O method was considered "it" for functional programs (I swear I knew which, but I forgot).

There's not much choice here. Probably explicit state passing with a state variable? People also invented other methods, but monads provided a useful abstraction for other kinds of use as well.

I'm telling you that pre-monads there was an I/O paradigm that everybody in FP swore by. I learned about it in 2001 in my first grad level course, and actually wrote programs using it. The professor stressed how it had all been a fad and that monads may also be one. Since you waste no opportunity to walk us through your vast library, you may as well remind us what it was. Andrei
Feb 14 2010
parent reply Jeff Nowakowski <jeff dilacero.org> writes:
Andrei Alexandrescu wrote:
 I'm telling you that pre-monads there was an I/O paradigm that everybody 
 in FP swore by.

I looked this up out of curiosity. The Haskell 1.2 report, circa 1992, talks about "streams of messages" and also "continuation-based I/O" as an alternative to streams. Monads were introduced as the standard for I/O in the 1.3 report, circa 1996. http://haskell.org/haskellwiki/Definition#Historic_development_of_Haskell -Jeff
Mar 01 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
retard:
  - a new llvm based backend which is a lot faster

I guess most people on Reddit have not read the original thesis. The LLVM Haskell back-end:
- Needs less code to be used by the front-end; this is quite positive for them.
- Compiles certain heavy numerical kernels better. So if you want to use Haskell for number crunching, LLVM is better, while most other Haskell code becomes a little slower.
Bye, bearophile
Mar 02 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
retard:
 Can you cite any sources? I'd like to know the cases where GCC is faster 
 than the don's new approach.

The thesis, page 52 (page 61 of the PDF): http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf Bye, bearophile
Mar 02 2010
parent bearophile <bearophileHUGS lycos.com> writes:
retard:
The full potential hasn't been realized yet.<

I'm sure there's space for improvements. But I have yet to see benchmarks that show improvements compared to those shown in the thesis.
Also the thesis doesn't mention Don's genetic algorithm stuff<

Genetic algorithms can be used with GCC too; this is more mature software than the LLVM-based one you talk about: http://www.coyotegulch.com/products/acovea/
or the stream fusion optimization.<

I don't know about this. I am not sure. Bye, bearophile
Mar 02 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 Basically D just shows how well Walter has read his CS books.

There's a lot more to writing compilers and designing languages than you'll find in CS books. For example, I discovered that the CS textbook optimization algorithms don't actually work. The authors had obviously never implemented them in a production compiler.
 This is not the case in languages designed by language experts such as 
 Guy Steele, Simon Peyton Jones, or John McCarthy.

I'm curious, why are you here, then? Why aren't you using Fortress, Haskell, or Lisp instead? If you're here to criticize me, that's cool, I don't mind. I'd much prefer it, though, if you were here to help and would step up and contribute code, documentation, articles, etc. D can really use your active help.
Feb 14 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
retard wrote:
 I'm really sorry I often sound so annoying. I guess I should stop posting 
 at all since there seems to be small value in criticizing things or 
 discussing various programming topics.

I did not intend to chase you away. You're welcome to stay here and criticize all you want. What I'm trying to say is that there's far more value in contributing to the D project than there is in simply criticizing it. You have the knowledge and ability to, so why not?
Feb 15 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));
 Sounds good?
 Andrei

The more you bring up features to give the axe to, the more I find it funny that, not being a very good language lawyer, I wasn't aware that half of these features existed in the first place. This is one of them. Yes, definitely get rid of it. It makes absolutely no sense. If you want to treat your number like a signed int, then it should require an explicit cast.

I said the same. Walter's counter-argument is that 2's complement arithmetic is an inescapable reality that all coders must be aware of in D and its kin. Negation is really taking the two's complement of the thing. The fact that the type was unsigned is not of much import. Andrei
Feb 14 2010
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Walter's counter-argument is that 2's complement
 arithmetic is an inescapable reality that all coders must be aware of in
 D and its kin. Negation is really taking the two's complement of the
 thing. The fact that the type was unsigned is not of much import.
 Andrei

Ok, but you could still do it just by using an explicit cast. When you're thinking of how things are represented at the bit level, this is basically type punning. It should be allowed in a systems language, but it should require an explicit cast. Using unary - to get the two's complement of a number is a way of getting around the type system and is probably a bug more often than an intentional idiom.
Feb 14 2010
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 are there any good cases besides this that Walter has?  And even if 
 there are, we are not talking about silently mis-interpreting it.  There 
 is precedent for making valid C code an error because it is error prone.

Here's where I'm coming from with this. The problem is that CPU integers are 2's complement and a fixed number of bits. We'd like to pretend they work just like the whole numbers we learned about in 2nd grade arithmetic. But they don't, and we can't fix it so they do. I think it's ultimately fruitless to try and make them behave other than what they are: 2's complement fixed arrays of bits.

So, we wind up with oddities like overflow, wrap-around, -int.min==int.min. Heck, we *rely* on these oddities (subtraction depends on wrap-around). Sometimes we pretend these bit values are signed, sometimes unsigned, and we mix those notions together in the same expression. There's no way to not mix up signed and unsigned arithmetic. Trying to build walls between signed and unsigned integer types is an exercise in utter futility. They are both 2's complement bits, and it's best to treat them that way rather than pretend they aren't.

As for -x in particular, - is not negation. It's complement and increment, and produces exactly the same bit result for signed and unsigned types. If it is disallowed for unsigned integers, then the user is faced with either:

    (~x + 1)

which not only looks weird in an arithmetic expression, but then a special case for it has to be wired into the optimizer to turn it back into a NEG instruction. Or:

    -cast(int)x

That blows when x happens to be a ulong. Whoops. It blows even worse if x turns out to be a struct with overloaded opNeg and opCast; suddenly the opCast gets selected. Oops. We could use a template:

    -MakeSignedVersionOf(x)

and have to specialize that template for every user-defined type, but, really, please no.
Feb 15 2010
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-02-15 18:33:11 -0500, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 I should clarify, using - on an unsigned value should work, it just should
 not be assignable to an unsigned type.  I guess I disagree with the
 original statement for this post (that it should be disabled all
 together), but I think that the compiler should avoid something that is
 99% of the time an error.
 
 i.e.
 
 uint a = -1; // error
 uint b = 5;
 uint c = -b; // error
 int d = -b; // ok
 auto e = -b; // e is type int

But should this work?

uint a = 0-1;
uint c = 0-b;
auto e = 0-b; // e is type int?

uint zero = 0;
uint a = zero-1;
uint c = zero-b;
auto e = zero-b; // e is type int?

This rule has good intentions, but it brings some strange inconsistencies. The current rules are much easier to predict since they always behave the same whether you have a variable, a literal or a constant expression. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 15 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers 

That's a valid argument if you're writing a spreadsheet program. But programmers should be familiar with computers, and most definitely should be familiar with 2's complement arithmetic. Similarly, if you do much with floating point, you should be familiar with "What Every Computer Scientist Should Know About Floating-Point Arithmetic" http://docs.sun.com/source/806-3568/ncg_goldberg.html
Feb 15 2010
next sibling parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Walter Bright wrote:

 Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers

That's a valid argument if you're writing a spreadsheet program. But programmers should be familiar with computers, and most definitely should be familiar with 2's complement arithmetic. Similarly, if you do much with floating point, you should be familiar with "What Every Computer Scientist Should Know About Floating-Point Arithmetic" http://docs.sun.com/source/806-3568/ncg_goldberg.html

It's a valid viewpoint, but it is a 'should'. I believe many programmers have only passing familiarity if at all with the semantics of unsigned types and floating point operations. At least when coding, they don't have these semantics in mind. Why do you think Java doesn't have unsigned types? As the language designer you can say that your target users must have this knowledge, that's fine. Paraphrasing Alexandrescu: this is one of those fundamental coordinates that put D on the landscape of programming languages. I'm quite sure though that when you go look at the empirical side of the story, 'should' does not equate with 'is'. However D does seem to target C#/Java and even python programmers. It is often suggested D's 'system programming' features are not actually *needed* and it offers enough high-level and safe features for programmers not comfortable with C / C++ to program effectively. This reasoning does not hold for unsigned integers and floating point vagaries.
Feb 16 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Lutger wrote:
 It's a valid viewpoint, but it is a 'should'. I believe many programmers 
 have only passing familiarity if at all with the semantics of unsigned types 
 and floating point operations. At least when coding, they don't have these 
 semantics in mind. Why do you think Java doesn't have unsigned types? 

Naive programmers have trouble with Java floating point as well: http://www.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf There's just no getting around it. Should Java just remove floating point types as well? Heck, I knew a degree'd mechanical engineer who could not understand why his calculator kept giving him answers off by a factor of 2 (he refused to understand roundoff error, no matter how many times I tried to explain it to him - he believed that calculators had mathematically perfect arithmetic). We could ban calculators, but misuse of slide rules is far worse.
 However D does seem to target C#/Java and even python programmers. It is 
 often suggested D's 'system programming' features are not actually *needed* 
 and it offers enough high-level and safe features for programmers not 
 comfortable with C / C++ to program effectively. This reasoning does not 
 hold for unsigned integers and floating point vagaries. 

Pointers are far more troublesome than negating an unsigned. In my experience with beginning programming courses, the very first thing they explained was 2's complement arithmetic. I do not think it unreasonable at all that someone using a powerful systems programming language ought to understand it.
Feb 16 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Lutger wrote:
 It's a valid viewpoint, but it is a 'should'. I believe many 
 programmers have only passing familiarity if at all with the semantics 
 of unsigned types and floating point operations. At least when coding, 
 they don't have these semantics in mind. Why do you think Java doesn't 
 have unsigned types? 

Naive programmers have trouble with Java floating point as well: http://www.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf There's just no getting around it. Should Java just remove floating point types as well? Heck, I knew a degree'd mechanical engineer who could not understand why his calculator kept giving him answers off by a factor of 2 (he refused to understand roundoff error, no matter how many times I tried to explain it to him - he believed that calculators had mathematically perfect arithmetic).

How could he refuse? One of my favorite games with calculators was to successively extract square root of 2 until I got 1. The better the calculator, the more steps it takes. That's kind of difficult to refuse to acknowledge :o). Andrei
Feb 16 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Lutger wrote:
 It's a valid viewpoint, but it is a 'should'. I believe many 
 programmers have only passing familiarity if at all with the 
 semantics of unsigned types and floating point operations. At least 
 when coding, they don't have these semantics in mind. Why do you 
 think Java doesn't have unsigned types? 

Naive programmers have trouble with Java floating point as well: http://www.eecs.berkeley.edu/~wkahan/JAVAhurt.pdf There's just no getting around it. Should Java just remove floating point types as well? Heck, I knew a degree'd mechanical engineer who could not understand why his calculator kept giving him answers off by a factor of 2 (he refused to understand roundoff error, no matter how many times I tried to explain it to him - he believed that calculators had mathematically perfect arithmetic).

How could he refuse?

Beats me. Naturally, I lost all respect for his engineering prowess.
 One of my favorite games with calculators was to 
 successively extract square root of 2 until I got 1. The better the 
 calculator, the more steps it takes. That's kind of difficult to refuse 
 to acknowledge :o).
 
 Andrei

Feb 16 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 What I meant by that statement is that the behavior goes against common 
 sense -- when it doesn't have to.  It only makes sense to advanced 
 programmers who understand the inner workings of the CPU and even in 
 those cases, advance programmers easily make mistakes.

Where you and I disagree is that I don't feel 2's complement arithmetic is in any way an advanced programming topic. Nor is it an inner working of a CPU - it's an externally visible behavior, well documented in the CPU manuals. (Inner behavior would be things like the microcode.) As I mentioned before, how it works was often the very first topic in an introductory book on programming. Without a grasp of the fundamentals of computer arithmetic, a programmer will run into failure after failure; no language can paper that over. There is no escaping it or pretending it isn't there.
 When the result of an operation is 99.999% of the time an error (in fact 
 the exact percentage is (T.max-1)/T.max  * 100), disallowing it is worth 
 making the rare valid uses of it illegal.

It conforms to the simple rules of 2s-complement arithmetic, so I disagree with calling it an error.
 The case I'm talking about is the equivalent to doing:
 
 x = x / 0;

Even mathematicians don't know what to do about divide by zero. But 2's complement arithmetic is well defined. So the situations are not comparable.
Feb 16 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 We're not working in Assembly here.  This is a high level language, 
 designed to hide the complexities of the underlying processor.  The 
 processor has no idea whether the data in its registers is signed or 
 unsigned.  The high level language does.  Please use that knowledge to 
 prevent stupid mistakes, or is that not one of the goals of the 
 compiler?  I can't believe this is such a hard point to get across.

It's not that I don't understand your point. I do, I just don't agree with it. At this point we are going in circles, so I don't think there's much value in me reiterating my opinions on it, except to say that Andrei and I once spent a great deal of time trying to separate signed from unsigned using the type system. The problem was that expressions tend to legitimately mix signed and unsigned types together. Trying to tease out the "correct" sign of the result and what the programmer might have intended turned out to be an inscrutable mess of complication that we finally concluded would never work. It's a seductive idea, it just doesn't work.

That's why C, etc. allows for easy implicit conversions between signed and unsigned, and why it has a set of (indeed, arbitrary) rules for combining them. Even though arbitrary, at least they are understandable and consistent.

Back when ANSI C was first finalized, there was a raging debate for years about whether C should use value-preserving or sign-preserving integral promotion rules. There were passionate arguments on both sides, both of which claimed the territory of intuitiveness and obviousness. The end result was both sides eventually realized there was no correct answer, and that an arbitrary decision was required. It was made (value preserving), half of the compiler vendors changed their compilers to match, and the rancor was forgotten.

For example, let's take two indices into an array, i and j:

    size_t i, j;

size_t is, by convention, unsigned. Now, to get the distance between two indices:

    auto delta = i - j;

By C convention, delta is unsigned. If i >= j, which may be an invariant of my algorithm, all is well. If i < j, suddenly delta is a very large value (but it still works, because of wrap-around). The point is, there is no correct rule for dealing with the types of i-j. This has consequences. If j happens instead to be a complicated loop-invariant expression (e) in a loop:

    loop
        auto delta = i - (e);

we may instead opt to hoist it out of the loop:

    auto j = -(e);
    loop
        auto delta = i + j;

and suddenly the compiler spits out error messages? Why can I subtract an unsigned, but not negate one? Such rules are complicated and will seem arbitrary to the user.
 The case I'm talking about is the equivalent to doing:
  x = x / 0;

Even mathematicians don't know what to do about divide by zero. But 2's complement arithmetic is well defined. So the situations are not comparable.

Sure they do, the result is infinity. It's well defined.

I'm not a mathematician, but I believe it is not well defined, which one finds out when doing branch cuts. Per IEEE 754 (and required by D), floating point arithmetic divide by 0 resolves to infinity, but not all FPU hardware conforms to this spec. There is no similar convention for integer divide by 0. This is why the C standard leaves this as "implementation defined" behavior.
Feb 16 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Walter Bright:
 For example, let's take two indices into an array, i and j:
      size_t i,j;
 size_t is, by convention, unsigned.

From what I've seen so far unsigned integers are useful when:

- You need a bit array, for example to implement a bloom filter, a bitvector, a bit set, when you want to do SWAR, when you need bit arrays to deal with hardware, etc.
- When you really need the full range of numbers 0 .. 2^n; this happens, but it's uncommon.

In most other situations using unsigned numbers is unsafe (because other rules of the language make them unsafe, mostly) and it's better to use signed values. So array indices are better signed, as almost everything else. If you mix signed and unsigned arithmetic to index an array or to measure its length you will often introduce bugs in the code (because the language seems unable to manage ranges of values in a tidy way). It seems integral numbers is one of the things CommonLisp gets right and C/D do wrong.

Bye,
bearophile
Feb 17 2010
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Steven Schveighoffer wrote:
 First, it would work under my rules.  j would be of type int.  Under my 
 rules, negating an unsigned value equates to a signed version of that 
 type.

I've tried to stick with the principle that C code compiled with D will either work the same, or will fail with a compiler error message. It's very important to avoid "compiles, but produces subtly different behavior" for integer numeric code. The reason for this is there's a lot of debugged, working C code that contains rather complicated integer expressions. How/why it works may be long since lost information, and having it fail when translated to D will cause frustration and distaste for D. Changing the signedness of a sub-expression will definitely fall into this category.
Feb 18 2010
prev sibling parent Clemens <eriatarka84 gmail.com> writes:
Steven Schveighoffer Wrote:

 Even mathematicians don't know what to do about divide by zero. But 2's  
 complement arithmetic is well defined. So the situations are not  
 comparable.

Sure they do, the result is infinity. It's well defined.

This is a common misconception. Of course it depends on the definition you're working with, but the usual arithmetic on real numbers does not define division by zero. The operation just doesn't exist. To get a bit more abstract, a so-called ring with unity (an algebraic abstraction of, among many other things, the reals) is a set of things, one of which is called "1", together with operations + and *. Division is defined only insofar as that some elements 'a' may have an inverse 'b' such that a*b=b*a=1. There is no requirement that all elements have an inverse (that would be a "group"), and 0 in the reals in particular doesn't have one. In fact, infinity is not a real number (it's not in the set of "things" we're considering), so it doesn't even make sense to say that the inverse of 0 is infinity. http://en.wikipedia.org/wiki/Ring_theory Sorry for off-topic, just riles me to see these half-truths repeated again and again.
Feb 17 2010
prev sibling next sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/15/2010 09:15 PM, Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers
 (and most programmers who have not run into this) would believe that

 b = 5;
 a = -b;

Tell any math major that fixnum arithmetic is really just arithmetic modulo 2^32 and they would believe you, even if they had never heard of computers
 would result in a being some large positive number. It's just totally
 unexpected, and totally avoidable.

 -Steve

Feb 16 2010
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Ellery Newcomer wrote:
 On 02/15/2010 09:15 PM, Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers
 (and most programmers who have not run into this) would believe that

 b = 5;
 a = -b;

Tell any math major that fixnum arithmetic is really just arithmetic modulo 2^32 and they would believe you, even if they had never heard of computers
 would result in a being some large positive number. It's just totally
 unexpected, and totally avoidable.

 -Steve


http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
 Fast forward to 2006. I was shocked to learn that the binary search
 program that Bentley proved correct and subsequently tested in Chapter
 5 of Programming Pearls contains a bug. Once I tell you what it is,
 you will understand why it escaped detection for two decades. Lest you
 think I'm picking on Bentley, let me tell you how I discovered the
 bug: The version of binary search that I wrote for the JDK contained
 the same bug. It was reported to Sun recently when it broke someone's
 program, after lying in wait for nine years or so.

 ...

 The bug is in this line:

 6:             int mid = (low + high) / 2;

 ...

 In Programming Pearls, Bentley says "While the first binary search was
 published in 1946, the first binary search that works correctly for
 all values of n did not appear until 1962." The truth is, very few
 correct versions have ever been published, at least in mainstream
 programming languages.

It's fun to note that one of the fixes the author proposes in the article was actually shown to itself be wrong... nearly two years later. Clearly, knowing that computers use two's complement fixed-width integer arithmetic is insufficient to write correct code. To believe otherwise is to believe that humans are infallible. In which case, I have literature on the Invisible Pink Unicorn [1] that might interest you... [1] http://en.wikipedia.org/wiki/Invisible_Pink_Unicorn
Feb 16 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Daniel Keep (daniel.keep.lists gmail.com)'s article
 Ellery Newcomer wrote:
 On 02/15/2010 09:15 PM, Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers
 (and most programmers who have not run into this) would believe that

 b = 5;
 a = -b;

Tell any math major that fixnum arithmetic is really just arithmetic modulo 2^32 and they would believe you, even if they had never heard of computers
 would result in a being some large positive number. It's just totally
 unexpected, and totally avoidable.

 -Steve


 Fast forward to 2006. I was shocked to learn that the binary search
 program that Bentley proved correct and subsequently tested in Chapter
 5 of Programming Pearls contains a bug. Once I tell you what it is,
 you will understand why it escaped detection for two decades. Lest you
 think I'm picking on Bentley, let me tell you how I discovered the
 bug: The version of binary search that I wrote for the JDK contained
 the same bug. It was reported to Sun recently when it broke someone's
 program, after lying in wait for nine years or so.

 ...

 The bug is in this line:

 6:             int mid = (low + high) / 2;

 ...

 In Programming Pearls, Bentley says "While the first binary search was
 published in 1946, the first binary search that works correctly for
 all values of n did not appear until 1962." The truth is, very few
 correct versions have ever been published, at least in mainstream
 programming languages.

It's fun to note that one of the fixes the author proposes in the article was actually shown to itself be wrong... nearly two years later. Clearly, knowing that computers use two's complement fixed-width integer arithmetic is insufficient to write correct code. To believe otherwise is to believe that humans are infallible. In which case, I have literature on the Invisible Pink Unicorn [1] that might interest you... [1] http://en.wikipedia.org/wiki/Invisible_Pink_Unicorn

I **HATE** this example because it's a classic example of extreme nitpicking. On most modern computers, (void*).sizeof == size_t.sizeof. Furthermore, usually half your address space is reserved for kernel use. Therefore, this bug would only show up when you're searching an array of bytes **and** very close to exhausting available address space (i.e. when you probably have bigger problems anyhow). I have intentionally written binary searches like this even though I'm aware of this bug because it's more readable and efficient than doing it "right" and would only fail in corner cases too extreme to be worth considering.
Feb 16 2010
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Robert Jacques (sandford jhu.edu)'s article
 Actually, this bug is more common than that; overflow can happen on arrays
 of length uint.max/2 and that's to say nothing of using 64-bit code. Also,
 the std.algorithm binary search routines use a different algorithm that
 appears to be safe to use. (Though they won't compile in 64-bit mode due
 to a minor bug)

Well, really you should be using size_t instead of int (which I do) to deal with the 64-bit issue. I guess there is a good point here in that sense. However, assuming size_t.sizeof == (void*).sizeof and half your address space is reserved for kernel use, the only way this bug could bite you is on absurdly large arrays of bytes, where binary search is almost always the wrong algorithm anyhow.
Feb 16 2010
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
dsimcha wrote:
 I **HATE** this example because it's a classic example of extreme nitpicking. 
On
 most modern computers, (void*).sizeof == size_t.sizeof.  Furthermore, usually
half
 your address space is reserved for kernel use.  Therefore, this bug would only
 show up when you're searching an array of bytes **and** very close to
exhausting
 available address space (i.e. when you probably have bigger problems anyhow). 
I
 have intentionally written binary searches like this even though I'm aware of
this
 bug because it's more readable and efficient than doing it "right" and would
only
 fail in corner cases too extreme to be worth considering.

I agree with you that this "bug" is not worth considering, and that if you have an array that consumes more than half your address space you have other problems that will prevent your program from running.
Feb 16 2010
prev sibling next sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/16/2010 09:36 AM, Daniel Keep wrote:
 Ellery Newcomer wrote:
 On 02/15/2010 09:15 PM, Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers
 (and most programmers who have not run into this) would believe that

 b = 5;
 a = -b;

Tell any math major that fixnum arithmetic is really just arithmetic modulo 2^32 and they would believe you, even if they had never heard of computers

It's fun to note that one of the fixes the author proposes in the article was actually shown to itself be wrong... nearly two years later. Clearly, knowing that computers use two's complement fixed-width integer arithmetic is insufficient to write correct code. To believe otherwise is to believe that humans are infallible.

In the same vein, having opposable thumbs is insufficient to peel bananas. To believe otherwise is a failure in logic. But I don't recall saying anywhere that writing correct code is possible. I will say that if I wanted my code to be reasonably correct, I probably wouldn't use fixnum arithmetic.

OT: I kinda wish BigInt could be dropped in as a replacement for int with less hassle.

OT: has anyone written a wrapper for int/long/etc that throws exceptions on overflow/underflow? Maybe such a thing should exist in the standard library?
Feb 16 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Ellery Newcomer:

 OT: has anyone written a wrapper for int/long/etc that throws exceptions 
 on overflow/underflow? Maybe such a thing should exist in the standard 
 library?

No, that idea is trash. Think about arrays: do you want to litter your code with a different type of array that tests the index bounds? Surely not. You want the built-in arrays to test the bounds (with, eventually, a compilation option to disable such tests), as in D. The same is true for integral numbers, as done in Delphi/C#.

-----------------
Adam D. Ruppe:
 	T opAdd(T a) {
 		T tmp = _payload + a;
 		asm { jo overflow; }
 		return tmp;
 		overflow:
 			throw new Exception("Overflow");
 	}

There are some problems with that. Currently a struct can't be used to fully replace a number, for example you can't do if(a) yet.

LDC has problems with gotos outside/inside the asm block:

 Gotos into inline assembly: For labels inside inline asm blocks, the D spec says "They can be the target of goto statements.", this is not supported at the moment. Basically, LLVM does not allow jumping in to or out of an asm block. We work around this for jumping out of asm by converting these branches to assignments to a temporary that is then used in a switch statement right after the inline asm block to jump to the final destination. This same workaround could be applied for jumping into inline assembly.

Another problem is that dmd will not inline those little methods (ldc can be forced to do it). I want some efficiency in such safer operations too, otherwise they become less useful.

Another problem with micro blocks of ASM in D is that the compiler is not always good at managing them, so it can use the stack and registers in a suboptimal way. In LDC they have invented asm Constraints to solve this problem:
http://www.dsource.org/projects/ldc/wiki/InlineAsmExpressions

Bye,
bearophile
Feb 16 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/16/2010 12:36 PM, bearophile wrote:
 Ellery Newcomer:

 OT: has anyone written a wrapper for int/long/etc that throws exceptions
 on overflow/underflow? Maybe such a thing should exist in the standard
 library?

No, that idea is trash. Think about arrays, do you want to litter your code with a different type of array that test the index bounds? Surely not. You want the built-in arrays to test the bounds (with eventually a compilation option to disable such tests), as in D. The same is true for integral numbers, as done in Delphi/C#.

Sure, I'd sooner have it built in to the compiler. Is it? A quick peek in your dlibs suggests it wasn't when you wrote powMod.
Feb 16 2010
prev sibling parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/16/2010 12:03 PM, Adam D. Ruppe wrote:
 On Tue, Feb 16, 2010 at 11:26:37AM -0600, Ellery Newcomer wrote:
 OT: has anyone written a wrapper for int/long/etc that throws exceptions
 on overflow/underflow? Maybe such a thing should exist in the standard
 library?

Something along these lines should work (pasted at bottom of message). I'd like it a lot more if it could just be opBinary!(string)() -- the struct would be tiny. Suckily, assigning an it to it when declaring it doesn't work. I think there's a way around this, but I don't know. The opPow's are commented out since my dmd is too old, so I couldn't test it. ========

Awesome. It doesn't work for opPow, though. And aha! Put a static opCall in conjunction with opAssign and you can assign when declaring it. This ought to be in the docs somewhere where I can find it.
Feb 16 2010
prev sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Since everyone seemed to miss the point I was trying to make, I'll be
more explicit.

My point was that it's all very well to say "you should know X" and you
can even be totally right about that, but it doesn't mean behaviour
based on X is necessarily intuitive or desirable.

Walter specifically said that "I don't feel that 2s-complement
arithmetic is in any way an advanced programming topic".  I agree.  I
was trying to point out that even very experienced people who would
surely know what's going on in the hardware can get it wrong.

Still, my fault for posting an example.
Feb 16 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Daniel Keep Wrote:

 
 Since everyone seemed to miss the point I was trying to make, I'll be
 more explicit.
 
 My point was that it's all very well to say "you should know X" and you
 can even be totally right about that, but it doesn't mean behaviour
 based on X is necessarily intuitive or desirable.
 
 Walter specifically said that "I don't feel that 2s-complement
 arithmetic is in any way an advanced programming topic".  I agree.  I
 was trying to point out that even very experienced people who would
 surely know what's going on in the hardware can get it wrong.

I agree with this. However, even though experienced programmers can write 2s complement math with bugs, in this particular case, there is *no* correct way to do it. So it's impossible to get it right :) Any opposing view would have to include what obtaining the unsigned negation of an unsigned value is useful for. And literals don't count because they're easily expressed otherwise :) -Steve
Feb 16 2010
parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 Daniel Keep Wrote:
 
 Since everyone seemed to miss the point I was trying to make, I'll be
 more explicit.

 My point was that it's all very well to say "you should know X" and you
 can even be totally right about that, but it doesn't mean behaviour
 based on X is necessarily intuitive or desirable.

 Walter specifically said that "I don't feel that 2s-complement
 arithmetic is in any way an advanced programming topic".  I agree.  I
 was trying to point out that even very experienced people who would
 surely know what's going on in the hardware can get it wrong.

I agree with this. However, even though experienced programmers can write 2s complement math with bugs, in this particular case, there is *no* correct way to do it. So it's impossible to get it right :) Any opposing view would have to include what obtaining the unsigned negation of an unsigned value is useful for. And literals don't count because they're easily expressed otherwise :)

x & -x

Nonzero if x has more than one bit set, ie is not a perfect power of 2. I use that often.

<rant>
My opinion: unsigned types get used *far* more often than they should. If this kind of behaviour confuses you, there's no way you should be using unsigned.
Really, the problem is that people abuse unsigned to mean 'this is a positive integer'. And of course, negation doesn't make sense in the context of the naturals. But that is NOT what unsigned is. Unsigned types are types with NO SIGN.
</rant>
Feb 17 2010
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Don Wrote:

 Steven Schveighoffer wrote:
 Any opposing view would have to include what obtaining the unsigned negation
of an unsigned value is useful for.  And literals don't count because they're
easily expressed otherwise :)

x & -x Nonzero if x has more than one bit set, ie is not a perfect power of 2. I use that often.

Fine, this would not be disallowed in my mind. The abuse is not simply applying negation to an unsigned, it is then using the result as unsigned. Incidentally, I always used the construct x & (x-1) to be zero if it's an exact power of 2. It's not much different.
 My opinion: unsigned types get used *far* more often than they should. 
 If this kind of behaviour confuses you, there's no way you should be 
 using unsigned.
 Really, the problem is that people abuse unsigned to mean 'this is a 
 positive integer'. And of course, negation doesn't make sense in the 
 context of the naturals. But that is NOT what unsigned is. Unsigned 
 types are types with NO SIGN.

If unsigned types get used far more often than they should, then they shouldn't be all over the place in the standard language (i.e. size_t is used for everything size related). You simply can't avoid using unsigned. It's also useful for when you don't think you should ever receive inputs that are negative. That being said, this one issue of applying negation and then using the result as an unsigned is not a very common usage, and is always an error. I don't see the harm in disallowing it. -Steve
Feb 17 2010
parent reply Don <nospam nospam.com> writes:
Steven Schveighoffer wrote:
 Don Wrote:
 
 Steven Schveighoffer wrote:
 Any opposing view would have to include what obtaining the unsigned negation
of an unsigned value is useful for.  And literals don't count because they're
easily expressed otherwise :)

Nonzero if x has more than one bit set, ie is not a perfect power of 2. I use that often.

Fine, this would not be disallowed in my mind. The abuse is not simply applying negation to an unsigned, it is then using the result as unsigned. Incidentally, I always used the construct x & (x-1) to be zero if it's an exact power of 2. It's not much different.
 My opinion: unsigned types get used *far* more often than they should. 
 If this kind of behaviour confuses you, there's no way you should be 
 using unsigned.
 Really, the problem is that people abuse unsigned to mean 'this is a 
 positive integer'. And of course, negation doesn't make sense in the 
 context of the naturals. But that is NOT what unsigned is. Unsigned 
 types are types with NO SIGN.

If unsigned types get used far more often than they should, then they shouldn't be all over the place in the standard language

Yes, that's exactly my opinion. I think size_t being unsigned is the primary problem.

 You simply can't avoid using unsigned. It's also useful for when you don't think you should ever receive inputs that are negative.

That's like using double for currency. You deserve what you get.
Feb 17 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Don:
 Steven Schveighoffer:
 If unsigned types get used far more often than they should, then they
shouldn't be all over the place in the standard language 

Yes, that's exactly my opinion. I think size_t being unsigned is the primary problem.

 You simply can't avoid using unsigned. It's also useful for when you don't think you should ever receive inputs that are negative.

That's like using double for currency. You deserve what you get.

It seems you and Steven agree with what I am saying for some months (this is just the last I have written): http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=106381 By the way, I think in C# array indexes are signed: http://msdn.microsoft.com/en-us/library/system.collections.arraylist.item.aspx Bye, bearophile
Feb 17 2010
prev sibling parent Brad Roberts <braddr bellevue.puremagic.com> writes:
On Tue, 16 Feb 2010, dsimcha wrote:

 I **HATE** this example because it's a classic example of extreme nitpicking. 
On
 most modern computers, (void*).sizeof == size_t.sizeof.  Furthermore, usually
half
 your address space is reserved for kernel use.  Therefore, this bug would only
 show up when you're searching an array of bytes **and** very close to
exhausting
 available address space (i.e. when you probably have bigger problems anyhow). 
I
 have intentionally written binary searches like this even though I'm aware of
this
 bug because it's more readable and efficient than doing it "right" and would
only
 fail in corner cases too extreme to be worth considering.

Actually, linux has used the 3:1 split for as long as I can recall. That leads to easily allowing this case to hit. I've seen it hit in real world apps. I agree that when you're playing in that neighborhood you should consider moving to 64 bit apps, but there's downsides there too. 64bit addressing isn't a silver bullet, unfortunately. Any app that's integer or pointer heavy in its data structures pays a big cost.

Later,
Brad
Feb 16 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-02-15 17:21:09 -0500, Walter Bright <newshound1 digitalmars.com> said:

 Or:
 
     -cast(int)x
 
 That blows when x happens to be a ulong. Whoops. It blows even worse if 
 x turns out to be a struct with overloaded opNeg and opCast, suddenly 
 the opCast gets selected. Oops.

That one is easy to fix. Add a "signed(x)" template function returning the signed counterpart of x. You could even insert a runtime range check if you really wanted to. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 15 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Michel Fortin wrote:
 That one is easy to fix. Add a "signed(x)" template function returning 
 the signed counterpart of x. You could even insert a runtime range check 
 if you really wanted to.

Yes, I mentioned that at the end of that post, and the reasons why it is not very appealing.
Feb 15 2010
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Some of the ways C uses fixnums, its undefined situations, are bad today. The
Ada language is not handy to use, but it shows that if you want to create
reliable software able to fly planes, you want a language with a more tidy
arithmetic than C.

And I am not talking about multiprecision numbers here, there are many
situations where fixnums are enough (even if I think in any D program there are
some or many places where using a fixnum is a premature optimization).

In a tidy language if you have an integral value represented with a fixed
number of bits (like a D uint), and you try to assign it a value (like a -1)
outside the range of the values it can represent, you have a bug. Or you want
modulo arithmetic, but you have to help the compiler tell such two situations
apart.

You can't think just about what DMD2 is/does today: once integral overflow
tests are added to a future D2 compiler, don't you want a runtime error if you
assign to a number a value outside the range of the possible values it can
represent (like a negative value to an unsigned value)?

I am not sure.

Bye,
bearophile
Feb 15 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 Some of the ways C uses fixnums, its undefined situations, are bad
 today. The Ada language is not handy to use, but it shows that if you
 want to create reliable software able to fly planes, you want a
 language with a more tidy arithmetic than C.

1. Ada seems to be a language people use only when they're forced to. D doesn't have the US government forcing its use. 2. Neither Ada nor its design decisions seem to have caught on with other languages, despite being around for 30 years. 3. Plenty of military and civil avionics use C++ anyway, despite attempts by the US government to force Ada. This is an old, old *old* issue. All the various solutions tried over the decades have failed to catch on. This leads me to believe that either these obvious solutions do not work, or they cause more problems than they fix. Heaven knows we are trying out a lot of new design ideas in D, but I'm not eager to embrace ideas that have failed over and over for 30 years.
Feb 15 2010
prev sibling next sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 i.e.

 uint a = -1; // error

I can't say I would appreciate having to write uint a = 0xFFFFFFFF; or the equivalent for ulong.
 uint b = 5;
 uint c = -b; // error
 int d = -b; // ok
 auto e = -b; // e is type int

 In the case of literals, I think allowing - on a literal should require
 that it be assigned to a signed type or involve a cast.

 -Steve

Feb 15 2010
next sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Ellery Newcomer wrote:
 On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 uint a = -1; // error

I can't say I would appreciate having to write uint a = 0xFFFFFFFF; or the equivalent for ulong.

uint a = ~0u; -- Rainer Deyke - rainerd eldwood.com
Feb 15 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/15/2010 09:17 PM, Steven Schveighoffer wrote:
 On Mon, 15 Feb 2010 21:32:21 -0500, Rainer Deyke <rainerd eldwood.com>
 wrote:

 Ellery Newcomer wrote:
 On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 uint a = -1; // error

I can't say I would appreciate having to write uint a = 0xFFFFFFFF; or the equivalent for ulong.

uint a = ~0u;

even ~0 works, no need for the u (although it makes things clearer). Ellery, you didn't read my original post thoroughly, I said this was the most common case of wanting to use unary negative on an unsigned value, and it's easily rewritten, with the same number of characters no less. -Steve

Ohhh! that post! You're right; I missed that part.

Alright, here's something I found myself writing just today or yesterday:

    //x,r are long, n is ulong
    if(x < 0){
        ulong ux = -x;
        ...
    }

I also have

    if(r < 0){
        return n - (-r) % n;
    }

emphasis on ensuring the dividend is positive before it gets promoted to ulong, etc etc, and I do guard that r is not remotely close to ulong.max/min. assuming that the return type is long (it isn't, but it might as well be, since n is always within [2,long.max]) or gets assigned to long or whatever.

-The bottom one obeys your rules.
-The top one doesn't.
-The bottom one is much less clear than the top.
-Whatever I was trying to prove, I think I just inadvertently strengthened your argument tenfold.

and no, I expect this doesn't fall within the 99% use case of unary -
Feb 15 2010
parent Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 02/15/2010 11:35 PM, Ellery Newcomer wrote:
 On 02/15/2010 09:17 PM, Steven Schveighoffer wrote:
 On Mon, 15 Feb 2010 21:32:21 -0500, Rainer Deyke <rainerd eldwood.com>
 wrote:

 Ellery Newcomer wrote:
 On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 uint a = -1; // error

I can't say I would appreciate having to write uint a = 0xFFFFFFFF; or the equivalent for ulong.

uint a = ~0u;

even ~0 works, no need for the u (although it makes things clearer). Ellery, you didn't read my original post thoroughly, I said this was the most common case of wanting to use unary negative on an unsigned value, and it's easily rewritten, with the same number of characters no less. -Steve

Ohhh! that post! You're right; I missed that part. Alright, here's something I found myself writing just today or yesterday: //x,r are long, n is ulong if(x < 0){ ulong ux = -x; ... } I also have if(r < 0){ return n - (-r) % n; } emphasis on ensuring dividend is positive before it gets promoted to ulong, etc etc, and I do guard that r is not remotely close to ulong.max/min. assuming that the return type is long (it isn't, but it might as well be, since n is always within [2,long.max]) or gets assigned to long or whatever. -The bottom one obeys your rules. -The top one doesn't. -The bottom one is much less clear than the top. -Whatever I was trying to prove, I think I just inadvertently strengthened your argument tenfold. and no, I expect this doesn't fall within the 99% use case of unary -

Oh, darn it. nvm
Feb 15 2010
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Ellery Newcomer (ellery-newcomer utulsa.edu)'s article
 On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 i.e.

 uint a = -1; // error

uint a = 0xFFFFFFFF; or the equivalent for ulong.

I always just use uint.max for such things. Yes, it's a little more typing, but it's clearer and saves me a few seconds of thinking.
Feb 15 2010
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Steven Schveighoffer wrote:
 On Mon, 15 Feb 2010 17:21:09 -0500, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 Steven Schveighoffer wrote:
 are there any good cases besides this that Walter has?  And even if 
 there are, we are not talking about silently mis-interpreting it.  
 There is precedent for making valid C code an error because it is 
 error prone.

Here's where I'm coming from with this. The problem is that CPU integers are 2's complement and a fixed number of bits. We'd like to pretend they work just like whole numbers we learned about in 2nd grade arithmetic. But they don't, and we can't fix it so they do. I think it's ultimately fruitless to try and make them behave other than what they are: 2's complement fixed arrays of bits. So, we wind up with oddities like overflow, wrap-around, -int.min==int.min. Heck, we *rely* on these oddities (subtraction depends on wrap-around). Sometimes, we pretend these bit values are signed, sometimes unsigned, and we mix together those notions in the same expression.

One further thing I'll say on this: signed computer math makes a lot of sense to people because the limits are so large. For instance, -2 billion to 2 billion. It seems logical that a computer can't just keep coming up with new bits to represent numbers, but it seems so far off that an integer will wrap at 2 billion. But unsigned math has a much more familiar boundary -- zero. Numbers are far more likely to be near zero than they are near 2 billion or -2 billion. Applying a negation operator to an unsigned value almost *guarantees* wrapping past that boundary.

99% of the time, when I apply a negative sign to a number, I want the negative equivalent of that number. I don't want some large bizarre value that has no relation to that number, despite what the computer thinks is sane. I'd look at applying a negation operator to an unsigned int with as much scrutiny as I'd look at multiplying an integer by 1_000_000_000. It's almost guaranteed to go out of bounds, why are you doing it?

Bringing up -int.min == int.min is exactly my point. For integers, there is one value that the negation operator doesn't work as expected. For unsigned integers, there is only one number that *does* work as expected -- zero. All others either don't work as expected, or rely on the computer behaving strangely (if it is indeed expected). To make such rare purposeful uses more explicit does not lower the quality of code or the ease of use.

Well said. -Lars
Feb 16 2010
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Feb 16, 2010 at 11:26:37AM -0600, Ellery Newcomer wrote:
 OT: has anyone written a wrapper for int/long/etc that throws exceptions 
 on overflow/underflow? Maybe such a thing should exist in the standard 
 library?

Something along these lines should work (pasted at bottom of message). I'd like it a lot more if it could just be opBinary!(string)() -- the struct would be tiny. Suckily, assigning to it when declaring it doesn't work. I think there's a way around this, but I don't know. The opPow's are commented out since my dmd is too old, so I couldn't test it.

========

import std.stdio;

struct NoOverflow(T) {
    T _payload;
    alias _payload this;

    T opAdd(T a) {
        T tmp = _payload + a;
        asm { jo overflow; }
        return tmp;
        overflow:
        throw new Exception("Overflow");
    }

    T opSub(T a) {
        T tmp = _payload - a;
        asm { jo overflow; }
        return tmp;
        overflow:
        throw new Exception("Overflow");
    }

    T opMul(T a) {
        T tmp = _payload * a;
        asm { jo overflow; }
        return tmp;
        overflow:
        throw new Exception("Overflow");
    }

    T opDiv(T a) {
        T tmp = _payload / a;
        asm { jo overflow; }
        return tmp;
        overflow:
        throw new Exception("Overflow");
    }

    /+
    T opPow(T a) {
        T tmp = _payload ^^ a;
        asm { jo overflow; }
        return tmp;
        overflow:
        throw new Exception("Overflow");
    }
    +/

    T opAddAssign(T a) {
        _payload += a;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }

    T opSubAssign(T a) {
        _payload -= a;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }

    T opMulAssign(T a) {
        _payload *= a;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }

    T opDivAssign(T a) {
        _payload /= a;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }

    T opPowAssign(T a) {
        //_payload ^^= a;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }

    T opPostInc() {
        _payload++;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }

    T opPostDec() {
        _payload--;
        asm { jo overflow; }
        return this;
        overflow:
        throw new Exception("Overflow");
    }
}

void main() {
    alias NoOverflow!(int) oint;
    oint a;
    a = int.max;
    a--;
    writefln("%d", a);
    a++;
    writefln("%d", a);
    a++; // should throw
    writefln("%d", a);
}

=======

-- Adam D. 
Ruppe http://arsdnet.net
Feb 16 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Sun, 14 Feb 2010 17:36:59 -0600, Andrei Alexandrescu wrote:

 bearophile wrote:
 - And finally in D2 there are several new features that are sometimes
 only half-implemented, and generally no one has tried them in long
 programs, they seem to come from just the mind of few (intelligent)
 people, they don't seem battle-tested at all. Such new features are a
 dangerous bet, they can hide many traps and problems. Finalizing the D2
 language before people have actually tried to use such features in some
 larger programs looks dangerous. Recently I have understood that this
 is why Simon Peyton-Jones said "Avoid success at all costs" regarding
 Haskell, that he has slowly developed for about 15 years: to give the
 language the time to be tuned, to remove warts, to improve it before
people start to use it for real and it needs to be frozen (today we are
 probably in a phase when Haskell has to be frozen, because there is
 enough software written in it that you can't lightly break backward
 compatibility).

The response of the Haskell community seems to be "avoid avoiding success". Anyway, either slogan shouldn't be taken out of context, and I don't think the situations of the two languages are easily comparable. For example, a few years ago monads weren't around. At that point, a different I/O method was considered "it" for functional programs (I swear I knew which, but I forgot).

There's not much choice here. Probably explicit state passing with a state variable? People also invented other methods, but monads provided a useful abstraction for other kinds of use as well.
Feb 14 2010
prev sibling next sibling parent Igor Lesik <curoles yahoo.com> writes:
The problem I see with D is that Walter has mostly experience with C/C++ [...] D.

Is it really the problem? I think the original idea was to have a better alternative to C/C++, and D 2.0 looks good in that respect; and D has a good chance to be welcomed by C/C++ folks since (almost) no learning is required.

The "ambitious" features of D sometimes do not look well polished (I could be wrong), and other than the talks at Google I could not find any material about advanced features, directions and design principles (will really appreciate if someone can give me pointers to such materials, if they exist).

I hope the TDPL book will cover D's design principles and (if it makes sense) maybe compare D's advanced/ambitious features to other languages/approaches. I like the concurrency chapter; it is great that the memory model is discussed and a comparison with other languages is presented.
Feb 14 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 14 Feb 2010 17:02:12 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s  
 article
 ulong x0;
 static assert(!__traits(compiles, -x0));
 uint x1;
 static assert(!__traits(compiles, -x1));
 ushort x2;
 static assert(!__traits(compiles, -x2));
 ubyte x3;
 static assert(!__traits(compiles, -x3));
 Sounds good?
 Andrei

funny that, not being a very good language lawyer, I wasn't aware that half of these features existed in the first place. This is one of them. Yes, definitely get rid of it. It makes absolutely no sense. If you want to treat your number like a signed int, then it should require an explicit cast.

I said the same. Walter's counter-argument is that 2's complement arithmetic is an inescapable reality that all coders must be aware of in D and its kin. Negation is really taking the two's complement of the thing. The fact that the type was unsigned is not of much import.

I'd say in 99% of the cases where a negative number is assigned to an unsigned type, it's the literal -1 (i.e. all bits set). It's just as easy to type ~0. In fact, in any case where you are using a literal besides that, it's usually more helpful to type the hex representation, or use ~(n-1) where you compute n-1 yourself (as in the -1 case).

Are there any good cases besides this that Walter has? And even if there are, we are not talking about silently mis-interpreting it. There is precedent for making valid C code an error because it is error prone.

If no good cases exist, I'd say drop it.

-Steve
Feb 15 2010
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, Feb 16, 2010 at 01:11:49PM -0600, Ellery Newcomer wrote:
 Awesome. It doesn't work for opPow, though.

Probably because I used assembly there, and opPow wouldn't be implemented as a simple instruction. Reimplementing it in that function and watching for the overflow ourselves would work, but would probably be really slow.
 
 And aha! put a static opCall in conjunction with opAssign and you can 
 assign when declaring it. This ought to be in the docs somewhere where I 
 can find it..

Oh yeah, that's right. I think it is in the struct page of the website, but this is one of those things that's easy to forget. I'd really like to see opCall change in structs so it is more intuitive. Problem is, I don't have a better proposal. -- Adam D. Ruppe http://arsdnet.net
Feb 16 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 15 Feb 2010 17:21:09 -0500, Walter Bright
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 are there any good cases besides this that Walter has?  And even if  
 there are, we are not talking about silently mis-interpreting it.   
 There is precedent for making valid C code an error because it is error  
 prone.

Here's where I'm coming from with this. The problem is that CPU integers are 2's complement and a fixed number of bits. We'd like to pretend they work just like whole numbers we learned about in 2nd grade arithmetic. But they don't, and we can't fix it so they do. I think it's ultimately fruitless to try and make them behave other than what they are: 2's complement fixed arrays of bits. So, we wind up with oddities like overflow, wrap-around, -int.min==int.min. Heck, we *rely* on these oddities (subtraction depends on wrap-around). Sometimes, we pretend these bit values are signed, sometimes unsigned, and we mix together those notions in the same expression.

The counter-point to your point is that a programming language is not fed to the CPU, it is fed to a compiler. The compiler must make the most it can of what it sees in source code, but it can help the user express himself to the CPU. The problem is, when you have ambiguous statements, the compiler can either choose an interpretation or throw an error. There's nothing wrong with throwing an error if the statement is ambiguous or nonsensical. The alternative (which is what we have today) is that the actual meaning is most of the time not what the user wants.

A more graphic example is something like this:

string x = 1;

What did the user mean? Did he mean, make a string out of 1 and assign it to x, or did he mistype the type of x? Throwing an error is perfectly acceptable here; I don't see why the same isn't true for:

uint x = -1;
 There's no way to not mix up signed and unsigned arithmetic.

 Trying to build walls between signed and unsigned integer types is an  
 exercise in utter futility. They are both 2-s complement bits, and it's  
 best to treat them that way rather than pretend they aren't.

 As for -x in particular, - is not negation. It's complement and  
 increment, and produces exactly the same bit result for signed and  
 unsigned types. If it is disallowed for unsigned integers, then the user  
 is faced with either:

     (~x + 1)

 which not only looks weird in an arithmetic expression, but then a  
 special case for it has to be wired into the optimizer to turn it back  
 into a NEG instruction.

I should clarify: using - on an unsigned value should work, it just should not be assignable to an unsigned type. I guess I disagree with the original statement for this post (that it should be disabled altogether), but I think that the compiler should avoid something that is 99% of the time an error. i.e.

uint a = -1; // error
uint b = 5;
uint c = -b; // error
int d = -b;  // ok
auto e = -b; // e is type int

In the case of literals, I think allowing - on a literal should require that it be assigned to a signed type or involve a cast.

-Steve
Feb 15 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 15 Feb 2010 19:29:27 -0500, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2010-02-15 18:33:11 -0500, "Steven Schveighoffer"  
 <schveiguy yahoo.com> said:

 I should clarify, using - on an unsigned value should work, it just  
 should
 not be assignable to an unsigned type.  I guess I disagree with the
 original statement for this post (that it should be disabled all
 together), but I think that the compiler should avoid something that is
 99% of the time an error.
  i.e.
  uint a = -1; // error
 uint b = 5;
 uint c = -b; // error
 int d = -b; // ok
 auto e = -b; // e is type int

But should this work?

uint a = 0-1;
uint c = 0-b;
auto e = 0-b; // e is type int?

Through integer promotion rules, these all work. This is essentially negation, but it is not a unary operation. These could also be disallowed, but only after optimization. Because optimizing cannot change the semantic meaning, they have to be allowed. That is, typeof(uint - uint) is uint, no matter how you do it. Unary negation is a different operator.
 uint zero = 0;
 uint a = zero-1;
 uint c = zero-b;
 auto e = zero-b; // e is type int?

No different than your first examples. e is of type uint, since uint - uint = uint.
 This rule has good intentions, but it brings some strange  
 inconsistencies. The current rules are much easier to predict since they  
 behave always the same whether you have a variable, a literal or a  
 constant expression.

There are plenty of strange inconsistencies in all aspects of computer math, but unary negation of an unsigned value to get another unsigned value is one of those inconsistencies that is 99% of the time not what the user expected, and easily flagged as an error.

For example, there is no possible way a person unfamiliar with computers (and most programmers who have not run into this) would believe that

b = 5;
a = -b;

would result in a being some large positive number. It's just totally unexpected, and totally avoidable.

-Steve
Feb 15 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 15 Feb 2010 21:32:21 -0500, Rainer Deyke <rainerd eldwood.com>  
wrote:

 Ellery Newcomer wrote:
 On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 uint a = -1; // error

I can't say I would appreciate having to write uint a = 0xFFFFFFFF; or the equivalent for ulong.

uint a = ~0u;

even ~0 works, no need for the u (although it makes things clearer). Ellery, you didn't read my original post thoroughly, I said this was the most common case of wanting to use unary negative on an unsigned value, and it's easily rewritten, with the same number of characters no less. -Steve
Feb 15 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 15 Feb 2010 17:21:09 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 are there any good cases besides this that Walter has?  And even if  
 there are, we are not talking about silently mis-interpreting it.   
 There is precedent for making valid C code an error because it is error  
 prone.

Here's where I'm coming from with this. The problem is that CPU integers are 2's complement and a fixed number of bits. We'd like to pretend they work just like whole numbers we learned about in 2nd grade arithmetic. But they don't, and we can't fix it so they do. I think it's ultimately fruitless to try and make them behave other than what they are: 2's complement fixed arrays of bits. So, we wind up with oddities like overflow, wrap-around, -int.min==int.min. Heck, we *rely* on these oddities (subtraction depends on wrap-around). Sometimes, we pretend these bit values are signed, sometimes unsigned, and we mix together those notions in the same expression.

One further thing I'll say on this: signed computer math makes a lot of sense to people because the limits are so large. For instance, -2 billion to 2 billion. It seems logical that a computer can't just keep coming up with new bits to represent numbers, but it seems so far off that an integer will wrap at 2 billion. But unsigned math has a much more familiar boundary -- zero. Numbers are far more likely to be near zero than they are near 2 billion or -2 billion. Applying a negation operator to an unsigned value almost *guarantees* wrapping past that boundary.

99% of the time, when I apply a negative sign to a number, I want the negative equivalent of that number. I don't want some large bizarre value that has no relation to that number, despite what the computer thinks is sane. I'd look at applying a negation operator to an unsigned int with as much scrutiny as I'd look at multiplying an integer by 1_000_000_000. It's almost guaranteed to go out of bounds, why are you doing it?

Bringing up -int.min == int.min is exactly my point. For integers, there is one value that the negation operator doesn't work as expected. For unsigned integers, there is only one number that *does* work as expected -- zero. All others either don't work as expected, or rely on the computer behaving strangely (if it is indeed expected). To make such rare purposeful uses more explicit does not lower the quality of code or the ease of use.

-Steve
Feb 15 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 16 Feb 2010 00:35:21 -0500, Ellery Newcomer  
<ellery-newcomer utulsa.edu> wrote:

 On 02/15/2010 09:17 PM, Steven Schveighoffer wrote:
 On Mon, 15 Feb 2010 21:32:21 -0500, Rainer Deyke <rainerd eldwood.com>
 wrote:

 Ellery Newcomer wrote:
 On 02/15/2010 05:33 PM, Steven Schveighoffer wrote:
 uint a = -1; // error

I can't say I would appreciate having to write uint a = 0xFFFFFFFF; or the equivalent for ulong.

uint a = ~0u;

even ~0 works, no need for the u (although it makes things clearer). Ellery, you didn't read my original post thoroughly, I said this was the most common case of wanting to use unary negative on an unsigned value, and it's easily rewritten, with the same number of characters no less. -Steve

Ohhh! that post! You're right; I missed that part.

Alright, here's something I found myself writing just today or yesterday:

//x,r are long, n is ulong
if(x < 0){
    ulong ux = -x;
    ...
}

I think that, while you can assign an int to a uint without issue, assigning an int that is the result of a negation operator on a uint to another uint is a problem. This means that the type of the expression -n, where n is unsigned, can't simply be the signed version of n's type. If it was, it would be impossible to tell the difference between assigning a negation of an unsigned value and assigning a simple int result (which has a 50% chance of being positive). I don't know how the internal workings of the compiler behave, but there has to be a way to single out the bad case. We already have range propagation; this could simply be part of it.
 I also have

 if(r < 0){
    return n - (-r) % n;
 }

This is fine, you are not assigning the negation of unsigned to an unsigned value. Binary subtraction on unsigned and signed numbers is not as error prone.
 emphasis on ensuring dividend is positive before it gets promoted to  
 ulong, etc etc, and I do guard that r is not remotely close to  
 ulong.max/min.

 assuming that the return type is long (it isn't, but it might as well  
 be, since n is always within [2,long.max]) or gets assigned to long or  
 whatever.

 -The bottom one obeys your rules.
 -The top one doesn't.

The top is a good example. It is not something I thought of, but I think in the end, it should be allowed. Coming up with all the nuances of the behavior is key to finding a sound rule to try and implement, and how to explain its behavior through example. -Steve
Feb 16 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 16 Feb 2010 01:10:33 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers

That's a valid argument if you're writing a spreadsheet program. But programmers should be familiar with computers, and most definitely should be familiar with 2's complement arithmetic.

What I meant by that statement is that the behavior goes against common sense -- when it doesn't have to. It only makes sense to advanced programmers who understand the inner workings of the CPU, and even in those cases, advanced programmers easily make mistakes. When the result of an operation is 99.999% of the time an error (in fact the exact percentage is (T.max-1)/T.max * 100), disallowing it is worth making the rare valid uses of it illegal.

This is no different in my mind from requiring comparison of an object to null to use !is instead of !=. If you remember, the compiler was dutifully doing exactly what the user wrote, but in almost all cases, the user really meant !is.

To re-iterate, I do *not* think unary - for unsigned types should be disabled. But I think the expression:

x = -(exp)

where x is an unsigned type and exp is an unsigned type (or a literal that can be interpreted as unsigned), should be an error. The only case where it works properly is when exp is 0. Note that you can allow this behavior, which makes it more obvious:

x = 0 - (exp)

Because this is not unary negation. It follows the rules of subtraction, which do not disallow wrapping past zero.
 Similarly, if you do much with floating point, you should be familiar  
 with "What Every Computer Scientist Should Know About Floating-Point  
 Arithmetic"

 http://docs.sun.com/source/806-3568/ncg_goldberg.html

Yes, but I'm not talking about normal math with unsigned types. I'm talking about a corner case where it is almost always an error. The case I'm talking about is the equivalent to doing:

x = x / 0;

for floating point. One could argue that this should be statically disallowed, because it's guaranteed to be an error. This doesn't mean that:

x = x / y;

should be disallowed because y *might* be zero.

-Steve
Feb 16 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 16 Feb 2010 09:17:32 -0500, Ellery Newcomer  
<ellery-newcomer utulsa.edu> wrote:

 On 02/15/2010 09:15 PM, Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers
 (and most programmers who have not run into this) would believe that

 b = 5;
 a = -b;

Tell any math major that fixnum arithmetic is really just arithmetic modulo 2^32 and they would believe you, even if they had never heard of computers

But you don't designate it as such. If it was required to designate modulo 2^32 in the expression, then it would be fine with me. I went through Calc V and never really had to worry about this. Advanced mathematicians are not a model for the everyday programmer :) Even if you explain it to people, they still forget! It's the same as != null was.
Feb 16 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 16 Feb 2010 10:36:04 -0500, Daniel Keep  
<daniel.keep.lists gmail.com> wrote:

 It's fun to note that one of the fixes the author proposes in the
 article was actually shown to itself be wrong... nearly two years later.

 Clearly, knowing that computers use two's complement fixed-width integer
 arithmetic is insufficient to write correct code.  To believe otherwise
 is to believe that humans are infallible.

Yes, but in this case, the solution was incorrect for a small number of inputs (arrays with length > 2^30). For negation of unsigned values, the code is incorrect for all inputs except zero. Accordingly, one will notice something is wrong much sooner than a decade later. I would postulate they should know instantaneously, because the compiler should reject it :)

-Steve
Feb 16 2010
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Tue, 16 Feb 2010 10:53:40 -0500, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Daniel Keep (daniel.keep.lists gmail.com)'s article
 Ellery Newcomer wrote:
 On 02/15/2010 09:15 PM, Steven Schveighoffer wrote:
 For example, there is no possible way a person unfamiliar with computers
 (and most programmers who have not run into this) would believe that

 b = 5;
 a = -b;

 would result in a being some large positive number. It's just totally
 unexpected, and totally avoidable.

 -Steve

Tell any math major that fixnum arithmetic is really just arithmetic modulo 2^32 and they would believe you, even if they had never heard of computers


 Fast forward to 2006. I was shocked to learn that the binary search
 program that Bentley proved correct and subsequently tested in Chapter
 5 of Programming Pearls contains a bug. Once I tell you what it is,
 you will understand why it escaped detection for two decades. Lest you
 think I'm picking on Bentley, let me tell you how I discovered the
 bug: The version of binary search that I wrote for the JDK contained
 the same bug. It was reported to Sun recently when it broke someone's
 program, after lying in wait for nine years or so.

 ...

 The bug is in this line:

 6:             int mid = (low + high) / 2;

 ...

 In Programming Pearls, Bentley says "While the first binary search was
 published in 1946, the first binary search that works correctly for
 all values of n did not appear until 1962." The truth is, very few
 correct versions have ever been published, at least in mainstream
 programming languages.

It's fun to note that one of the fixes the author proposes in the article was actually shown to itself be wrong... nearly two years later. Clearly, knowing that computers use two's complement fixed-width integer arithmetic is insufficient to write correct code. To believe otherwise is to believe that humans are infallible. In which case, I have literature on the Invisible Pink Unicorn [1] that might interest you...

[1] http://en.wikipedia.org/wiki/Invisible_Pink_Unicorn

I **HATE** this example because it's a classic example of extreme nitpicking. On most modern computers, (void*).sizeof == size_t.sizeof. Furthermore, usually half your address space is reserved for kernel use. Therefore, this bug would only show up when you're searching an array of bytes **and** very close to exhausting available address space (i.e. when you probably have bigger problems anyhow). I have intentionally written binary searches like this even though I'm aware of this bug because it's more readable and efficient than doing it "right" and would only fail in corner cases too extreme to be worth considering.

Actually, this bug is more common than that; overflow can happen on arrays of length uint.max/2 and that's to say nothing of using 64-bit code. Also, the std.algorithm binary search routines use a different algorithm that appears to be safe to use. (Though they won't compile in 64-bit mode due to a minor bug)
Feb 16 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 16 Feb 2010 13:38:06 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 What I meant by that statement is that the behavior goes against common  
 sense -- when it doesn't have to.  It only makes sense to advanced  
 programmers who understand the inner workings of the CPU and even in  
 those cases, advance programmers easily make mistakes.

Where you and I disagree is I don't feel that 2s-complement arithmetic is in any way an advanced programming topic. Nor is it an inner working of a CPU - it's an exteriorly visible behavior, well documented in the CPU manuals. (Inner behavior would be things like the microcode.) As I mentioned before, how that works was often the very first topic in an introductory book on programming.

I agree that 2's complement as a whole is not an advanced topic. What I disagree with is the interpretation of the result of this one operation. To interpret it as an unsigned value is lunacy. The result should be interpreted as a negative value and not be assignable to an unsigned value. There are much more sane and unambiguous alternatives to doing this. Even advanced programmers expect that when they negate something it most likely becomes negative. It's a surprise when it *always* flips back to positive.
 No comprehension of the fundamentals computer arithmetic will lead to  
 failure after failure as a programmer; no language can paper that over.  
 There is no escaping it or pretending it isn't there.

You assume that to understand 2s complement is to understand *and* mentally parse why negating a positive value in a computer for unsigned types *always* results in a positive value. I think you can understand 2s complement arithmetic and the limitations, and *still* make the mistake of assigning the result of negating an unsigned value to an unsigned value. There have been already very smart, computer literate, 2s complement knowledgeable people who have said on this very newsgroup they have been bitten by this error. This is not a fix to help just newbies.
 When the result of an operation is 99.999% of the time an error (in  
 fact the exact percentage is (T.max-1)/T.max  * 100), disallowing it is  
 worth making the rare valid uses of it illegal.

It conforms to the simple rules of 2s-complement arithmetic, so I disagree with calling it an error.

We're not working in Assembly here. This is a high level language, designed to hide the complexities of the underlying processor. The processor has no idea whether the data in its registers is signed or unsigned. The high level language does. Please use that knowledge to prevent stupid mistakes, or is that not one of the goals of the compiler? I can't believe this is such a hard point to get across. I'm not proposing to change anything about 2s complement math *except* in this one small situation: negation on an unsigned value *and* (this part is the most important) assigning it to (or passing it as) an unsigned value. Any time you see that, it is an error or a misguided attempt to be clever.
 The case I'm talking about is the equivalent to doing:
  x = x / 0;

Even mathematicians don't know what to do about divide by zero. But 2's complement arithmetic is well defined. So the situations are not comparable.

Sure they do, the result is infinity. It's well defined. In fact, I think some math-based programming languages allow divide by zero. Again, not arguing against 2s complement here, just the one particular situation which is always an error. -Steve
Feb 16 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 16 Feb 2010 19:33:11 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 We're not working in Assembly here.  This is a high level language,  
 designed to hide the complexities of the underlying processor.  The  
 processor has no idea whether the data in its registers is signed or  
 unsigned.  The high level language does.  Please use that knowledge to  
 prevent stupid mistakes, or is that not one of the goals of the  
 compiler?  I can't believe this is such a hard point to get across.

It's not that I don't understand your point. I do, I just don't agree with it. At this point, we are going in circles so I don't think there's much value in me reiterating my opinions on it, except to say that Andrei and I once spent a great deal of time trying to separate signed from unsigned using the type system. The problem was that expressions tend to legitimately mix up signed and unsigned types together. Trying to tease out the "correct" sign of the result and what the programmer might have intended turned out to be an inscrutable mess of complication that we finally concluded would never work. It's a seductive idea, it just doesn't work. That's why C, etc. allows for easy implicit conversions between signed and unsigned, and why it has a set of (indeed, arbitrary) rules for combining them. Even though arbitrary, at least they are understandable and consistent.

As long as you clarify that you understand I'm not talking about the entire field of 2s complement math, but only the small case of negating an unsigned and assigning it to an unsigned, I will drop the argument. Because it is not clear from your arguments that you understand that.
 Back when ANSI C was first finalized, there was a raging debate for  
 years about whether C should use value-preserving or sign-preserving  
 integral promotion rules. There were passionate arguments on both sides,  
 both of which claimed the territory of intuitiveness and obviousness.

 The end result was both sides eventually realized there was no correct  
 answer, and that an arbitrary decision was required. It was made (value  
 preserving), and half of the compiler vendors changed their compilers to  
 match, and the rancor was forgotten.

D has made huge strides in creating designs that remove whole classes of errors from equivalent C code, just by making certain constructs illegal. I don't think that citing the past failures of C is a good way to argue why D can't do it. I'm thankful every time the compiler fails to compile code like:

     if(x == 5);
         doThis();
 For example, let's take two indices into an array, i and j:

      size_t i,j;

 size_t is, by convention, unsigned.

 Now, to get the distance between two indices:

     auto delta = i - j;

 By C convention, delta is unsigned. If i is >= j, which may be an  
 invariant of my algorithm, all is well. If i < j, suddenly delta is a  
 very large value (but it still works, because of wrap around). The point  
 is, there is no correct rule for dealing with the types of i-j. This has  
 consequences:

 Now, if j happens instead to be a complicated loop invariant expression  
 (e) in a loop,

      loop
 	auto delta = i - (e);

 we may instead opt to hoist it out of a loop:

      auto j = -(e);
      loop
            auto delta = i + j;

 and suddenly the compiler spits out error messages? Why can I subtract  
 an unsigned, but not negate one? Such rules are complicated and will  
 seem arbitrary to the user.

First, it would work under my rules: j would be of type int. Under my rules, negating an unsigned value yields the signed version of that type. It is the only operator which does so, because it's more accurate 50% of the time than staying unsigned (the other 50% of the time it yields the same value, so either is just as accurate). In all other cases of operators on unsigned types, the result should be unsigned, because there's no more accurate answer, and most of the time you wish to remain in the same type. To re-iterate: negation of a positive value (as defined by the type) implies the user wants a negative value.

Second, I think it's way more clear to do:

     auto j = (e);
     loop
         auto delta = i - j;

Just looking at this one line throws up red flags for me:

     auto delta = i + j;   // huh? Isn't a delta a difference?

The only other rule I proposed is that assigning a negated unsigned to an unsigned (or passing it to a function that requires unsigned) would be illegal, to prevent obvious mistakes.
 The case I'm talking about is the equivalent to doing:
  x = x / 0;

Even mathematicians don't know what to do about divide by zero. But 2's complement arithmetic is well defined. So the situations are not comparable.


I'm not a mathematician, but I believe it is not well defined, which one finds out when doing branch cuts. Per IEEE 754 (and required by D), floating point arithmetic divide by 0 resolves to infinity, but not all FPU hardware conforms to this spec. There is no similar convention for integer divide by 0. This is why the C standard leaves this as "implementation defined" behavior.

I'm not a mathematician either, but I remember in school learning about working with infinity, and varying degrees of infinity. For example, if series 1 approaches infinity twice as fast as series 2, then you could say series 1 divided by series 2 is equal to 2. It was some interesting stuff, and I think dealing with infinity is mostly theoretical. Applying this to computer programming is somewhat specialized, but my point was just to compare against something else that is almost always an error. There are several other constructs I could have used; interestingly enough, most deal with one specific literal, whereas negation of an unsigned results in an error for over 2 billion values. -Steve
Feb 18 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 18 Feb 2010 14:22:06 -0500, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Steven Schveighoffer wrote:
 First, it would work under my rules.  j would be of type int.  Under my  
 rules, negating an unsigned value equates to a signed version of that  
 type.

I've tried to stick with the principle that C code compiled with D will either work the same, or will fail with a compiler error message. It's very important to avoid "compiles, but produces subtly different behavior" for integer numeric code. The reason for this is there's a lot of debugged, working C code that contains rather complicated integer expressions. How/why it works may be long since lost information, and having it fail when translated to D will cause frustration and distaste for D. Changing the signedness of a sub-expression will definitely fall into this category.

This is a much better argument than your previous one. However, there is at least one (and probably more common) case where this is not true: statically sized arrays as parameters to functions. But such changes need good justification for their inclusion. It's clear to me that it's justified by the fact that there is no reasonable use for having the negation of an unsigned be unsigned, but it's not my language :) I guess unary - for unsigned will only ever be a lint-worthy error. -Steve
Feb 18 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Mon, 01 Mar 2010 16:10:33 -0500, Jeff Nowakowski wrote:

 Andrei Alexandrescu wrote:
 I'm telling you that pre-monads there was an I/O paradigm that
 everybody FP swore by.

I looked this up out of curiosity. The Haskell 1.2 report, circa 1992, talks about "streams of messages" and also "continuation-based I/O" as an alternative to streams. Monads were introduced as the standard for I/O in the 1.3 report, circa 1996. http://haskell.org/haskellwiki/

 
 -Jeff

Btw, there has been some progress in the Haskell land lately:
- a new llvm based backend which is a lot faster
- a GA based optimization parameter finder for the new backend
- a new fusion technique for optimizing functional code
- a rather nice new benchmarking platform

http://www.haskell.org/pipermail/cvs-ghc/2010-February/052606.html
http://donsbot.wordpress.com/2010/02/21/smoking-fast-haskell-code-using-ghcs-new-llvm-codegen/
http://donsbot.wordpress.com/2010/03/01/evolving-faster-haskell-programs-now-with-llvm/
http://donsbot.wordpress.com/2010/02/26/fusion-makes-functional-programming-fun/
http://donsbot.wordpress.com/2010/02/23/modern-benchmarking-in-haskell/

Of course lots of other stuff is also happening, but I thought the language related stuff might interest you.
Mar 02 2010
prev sibling next sibling parent retard <re tard.com.invalid> writes:
Tue, 02 Mar 2010 14:55:05 -0500, bearophile wrote:

 retard:
  - a new llvm based backend which is a lot faster

I guess most people on Reddit have not read the original thesis. The LLVM Haskell back-end:
- Needs less code to be used by the front-end; this is quite positive for them.
- Compiles certain heavy numerical kernels better. So if you want to use Haskell for number crunching, LLVM is better, while most other Haskell code becomes a little slower.

Can you cite any sources? I'd like to know the cases where GCC is faster than the don's new approach.
Mar 02 2010
prev sibling parent retard <re tard.com.invalid> writes:
Tue, 02 Mar 2010 15:33:58 -0500, bearophile wrote:

 retard:
 Can you cite any sources? I'd like to know the cases where GCC is
 faster than the don's new approach.

The thesis, page 52 (page 61 of the PDF): http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf

Did you notice that he kept improving the backend interoperability after the thesis was done? The full potential hasn't been realized yet. Also the thesis doesn't mention Don's genetic algorithm stuff or the stream fusion optimization.
Mar 02 2010