digitalmars.D.learn - 0 < negative loop condition bug or misunderstanding on my part
- ixid (27/27) Mar 06 2012 I'm writing my first basic algorithms, this one is merge sort.
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (9/34) Mar 06 2012 We don't know what T is, but I assume a signed type like int.
- ixid (4/4) Mar 06 2012 Ah, thank you, so it's wrapping. That seems like a bad idea, what
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (5/8) Mar 06 2012 There are probably hundreds of discussions about that over the years on
- H. S. Teoh (6/11) Mar 06 2012 Not necessarily. For example, int.max+1 is a very large negative number,
- James Miller (7/12) Mar 07 2012 Because you don't describe things as -5 metres tall, so you don't
- Ellery Newcomer (2/6) Mar 07 2012 know any good ones off the top of your head?
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (17/26) Mar 07 2012 Really randomly:
- Timon Gehr (7/10) Mar 07 2012 The problem is not that length is unsigned. The issue is the implicit
- Timon Gehr (4/14) Mar 07 2012 Furthermore, bitwise boolean operators should still accept arguments of
- Andrej Mitrovic (6/8) Mar 07 2012 You bet. I've once had this hard to spot bug where I've used a call
- Jonathan M Davis (16/27) Mar 07 2012 Though that's one of those things that you're not going to convince Walt...
- Sean Cavanaugh (21/33) Mar 07 2012 After writing enough container libraries and whatnot in C++, I always
- Jonathan M Davis (7/49) Mar 07 2012 std.conv.to will check non-implicit, narrowing conversions to make sure ...
- Ary Manzana (5/22) Mar 07 2012 Wouldn't it make more sense to convert the entire expression to signed
- James Miller (10/35) Mar 07 2012 ive
- Jonathan M Davis (12/15) Mar 07 2012 More like it's to avoid code silently breaking when it's ported. In gene...
- Timon Gehr (3/25) Mar 08 2012 This is just as arbitrary as the rule that says the entire expression is...
- H. S. Teoh (10/13) Mar 06 2012 [...]
I'm writing my first basic algorithms, this one is merge sort. This version throws an exception when array.length - setSize is negative (which should be fine, the rest of my function would deal with it): template mergeSort(T) { void mergeSort(ref T[] array, const T setSize = 100) { T[][] merge; merge.length = array.length / setSize; T ii, jj; for(ii = 0, jj = 0;ii < array.length - setSize;ii += setSize, ++jj) merge[jj] = array[ii..ii + setSize]; ... If I make the seemingly pointless change to this: template mergeSort(T) { void mergeSort(ref T[] array, const T setSize = 100) { T[][] merge; merge.length = array.length / setSize; T ii, jj; T temp2 = array.length - setSize; for(ii = 0, jj = 0;ii < temp2;ii += setSize, ++jj) merge[jj] = array[ii..ii + setSize]; Where it's a temporary variable then it works perfectly well.
Mar 06 2012
On 03/06/2012 09:11 PM, ixid wrote:I'm writing my first basic algorithms, this one is merge sort. This version throws an exception when array.length - setSize is negative (which should be fine, the rest of my function would deal with it): template mergeSort(T) { void mergeSort(ref T[] array, const T setSize = 100) { T[][] merge; merge.length = array.length / setSize; T ii, jj; for(ii = 0, jj = 0;ii < array.length - setSize;ii += setSize, ++jj)We don't know what T is, but I assume a signed type like int. array.length is size_t, i.e. an unsigned type. Unsigned types have this nasty habit of converting the entire expression to unsigned (that is a rule since C). So array.length - setSize above is size_t. In other words, it is never negative.merge[jj] = array[ii..ii + setSize]; ... If I make the seemingly pointless change to this: template mergeSort(T) { void mergeSort(ref T[] array, const T setSize = 100) { T[][] merge; merge.length = array.length / setSize; T ii, jj; T temp2 = array.length - setSize;There, you are setting temp2's size to be T (e.g. int), so temp2 can be negative.for(ii = 0, jj = 0;ii < temp2;ii += setSize, ++jj) merge[jj] = array[ii..ii + setSize]; Where it's a temporary variable then it works perfectly well.Ali
Mar 06 2012
Ah, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed? This case would seem like one where allowing negatives is clearly better and more intuitive.
Mar 06 2012
On 03/06/2012 10:05 PM, ixid wrote:Ah, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed? This case would seem like one where allowing negatives is clearly better and more intuitive.There are probably hundreds of discussions about that over the years on many different language newsgroups and forums. :) There is no clear winner: Both sides of the arguments seem to have good points. Ali
Mar 06 2012
On 03/06/2012 10:05 PM, ixid wrote:Because it doesn't make sense to have something with a negative size?Ah, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed?Not necessarily. For example, int.max+1 is a very large negative number, and -int.min is still a very large negative number. :-) T -- Be in denial for long enough, and one day you'll deny yourself of things you wish you hadn't.This case would seem like one where allowing negatives is clearly better and more intuitive.
Mar 06 2012
On 7 March 2012 19:30, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:Because you don't describe things as -5 metres tall, so you don't describe things as -1024 bytes long. size_t makes sense unsigned because negatives make no sense for size. However, if you cast array.length to an int, it may work, haven't tested it. -- James MillerOn 03/06/2012 10:05 PM, ixid wrote:Ah, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed? This case would seem like one where allowing negatives is clearly better and more intuitive.
Mar 07 2012
On 03/07/2012 12:23 AM, Ali Çehreli wrote:There are probably hundreds of discussions about that over the years on many different language newsgroups and forums. :) There is no clear winner: Both sides of the arguments seem to have good points. Aliknow any good ones off the top of your head?
Mar 07 2012
On 03/07/2012 07:51 PM, Ellery Newcomer wrote:On 03/07/2012 12:23 AM, Ali Çehreli wrote:Really randomly: * Unsigned is better because some things are never negative: sizes, indexes, etc. * But we occasionally need to subtract one from the other. The following is not safe unless we are sure that the left-hand side is greater: size_t i = /* ... */; size_t j = /* ... */; foo(i - j); // Is this the difference or a very large number? Another bug because of an unsigned type: for (size_t i = /* ... */; i >= 0; --i) { (Although I think most compilers catch that one.) I searched for the following terms on comp.lang.c++.moderated and found many interesting discussions: size_t unsigned signed I don't have a strong opinion myself. AliThere are probably hundreds of discussions about that over the years on many different language newsgroups and forums. :) There is no clear winner: Both sides of the arguments seem to have good points. Aliknow any good ones off the top of your head?
Mar 07 2012
On 03/07/2012 07:05 AM, ixid wrote:Ah, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed? This case would seem like one where allowing negatives is clearly better and more intuitive.The problem is not that length is unsigned. The issue is the implicit conversion from signed to unsigned. The right thing would be to disallow signed -> unsigned and unsigned -> signed implicit conversion unless value range propagation can prove it safe, and to make comparison between signed and unsigned actually work by translating it to more than one machine instruction.
Mar 07 2012
On 03/07/2012 11:01 AM, Timon Gehr wrote:On 03/07/2012 07:05 AM, ixid wrote:Furthermore, bitwise boolean operators should still accept arguments of arbitrary signedness but the result should implicitly convert to both signed and unsigned.Ah, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed? This case would seem like one where allowing negatives is clearly better and more intuitive.The problem is not that length is unsigned. The issue is the implicit conversion from signed to unsigned. The right thing would be to disallow signed -> unsigned and unsigned -> signed implicit conversion unless value range propagation can prove it safe, and to make comparison between signed and unsigned actually work by translating it to more than one machine instruction.
Mar 07 2012
On 3/7/12, Timon Gehr <timon.gehr gmx.ch> wrote:The problem is not that length is unsigned. The issue is the implicit conversion from signed to unsigned.You bet. I've once had this hard to spot bug where I've used a call that was something like max(0, min(10, <expression>)), and this ended up returning a negative int because <expression> was combining an integer with a .length property of some array (I don't recall the exact call though). It was truly a WTF moment.
Mar 07 2012
On Wednesday, March 07, 2012 11:01:05 Timon Gehr wrote:On 03/07/2012 07:05 AM, ixid wrote:Though that's one of those things that you're not going to convince Walter of - primarily, I believe, because it would require a lot more casting. The interesting part about _that_ is that if it's bad enough, it will actually make code _worse_, because the cast forces things. To really deal with it cleanly, you'd proabably need something similar to the const_cast nonsense in C++ except that it just converts signedness. I suspect that the reality of the matter is that if we disallowed implicit conversions between signed and unsigned, a number of bugs would completely go away, but others would creep in as a result, and the overal situation wouldn't necessarily be any better, but I don't know. My initial reaction would be to agree with you, but there are definitely cases where such an approach would get annoying and bug-prone (due to the casting involved). But regardless, I really don't think that you're going to convince Walter on this one, given what he's said in the past. - Jonathan M DavisAh, thank you, so it's wrapping. That seems like a bad idea, what is the benefit to size being unsigned rather than signed? This case would seem like one where allowing negatives is clearly better and more intuitive.The problem is not that length is unsigned. The issue is the implicit conversion from signed to unsigned. The right thing would be to disallow signed -> unsigned and unsigned -> signed implicit conversion unless value range propagation can prove it safe, and to make comparison between signed and unsigned actually work by translating it to more than one machine instruction.
Mar 07 2012
On 3/7/2012 12:57 PM, Jonathan M Davis wrote:On Wednesday, March 07, 2012 11:01:05 Timon Gehr wrote:After writing enough container libraries and whatnot in C++, I always end up going bed while thinking life would be so much easier if implicit signed/unsigned conversions were not allowed. Then I go to sleep, wake up, and realize how much more horrific the code would be in other places if this was actually true. The best compromise would probably have been to make size_t signed when migrating to 64 bit memory addressing, and leave the unsigned/signed problems specificaly with size_t values back in the past of 32 bit and older systems. On a related note I would love to see something in std.somewhere (conv?) to provide a complete set of narrowing and signed/unsigned conversion functions: The matrix is basically the following methods: 1) Reduce size (64->32->16->8 bits) but maintain signedness with three types: 2) Methods to flip signedness of the value (but mainting bitwidth) multiplied against three types of operations: a) truncating (i.e. mask off the lower bits) b) saturating (values outside of range are clamped to min or max of the narrower range) c) throwing (values outside of narrow range throw a range exception)On 03/07/2012 07:05 AM, ixid wrote:I suspect that the reality of the matter is that if we disallowed implicit conversions between signed and unsigned, a number of bugs would completely go away, but others would creep in as a result, and the overal situation wouldn't necessarily be any better, but I don't know. My initial reaction would be to agree with you, but there are definitely cases where such an approach would get annoying and bug-prone (due to the casting involved). But regardless, I really don't think that you're going to convince Walter on this one, given what he's said in the past. - Jonathan M DavisAh, thank you, so it's wrapping. That seems like a bad idea, what is the
Mar 07 2012
On Wednesday, March 07, 2012 13:20:41 Sean Cavanaugh wrote:On 3/7/2012 12:57 PM, Jonathan M Davis wrote:std.conv.to will check non-implicit, narrowing conversions to make sure that the number will fit in the new type (throwing a ConvException if it won't), but if you're converting between types of the same size but which differ in signedness, then that's an implicit conversion, at it's the same as if you didn't use std.conv.to at all. - Jonathan M DavisOn Wednesday, March 07, 2012 11:01:05 Timon Gehr wrote:After writing enough container libraries and whatnot in C++, I always end up going bed while thinking life would be so much easier if implicit signed/unsigned conversions were not allowed. Then I go to sleep, wake up, and realize how much more horrific the code would be in other places if this was actually true. The best compromise would probably have been to make size_t signed when migrating to 64 bit memory addressing, and leave the unsigned/signed problems specificaly with size_t values back in the past of 32 bit and older systems. On a related note I would love to see something in std.somewhere (conv?) to provide a complete set of narrowing and signed/unsigned conversion functions: The matrix is basically the following methods: 1) Reduce size (64->32->16->8 bits) but maintain signedness with three types: 2) Methods to flip signedness of the value (but mainting bitwidth) multiplied against three types of operations: a) truncating (i.e. mask off the lower bits) b) saturating (values outside of range are clamped to min or max of the narrower range) c) throwing (values outside of narrow range throw a range exception)On 03/07/2012 07:05 AM, ixid wrote:I suspect that the reality of the matter is that if we disallowed implicit conversions between signed and unsigned, a number of bugs would completely go away, but others would creep in as a result, and the overal situation wouldn't necessarily be any better, but I don't know. My initial reaction would be to agree with you, but there are definitely cases where such an approach would get annoying and bug-prone (due to the casting involved). But regardless, I really don't think that you're going to convince Walter on this one, given what he's said in the past. - Jonathan M DavisAh, thank you, so it's wrapping. That seems like a bad idea, what is the
Mar 07 2012
On 3/7/12 2:28 AM, Ali Çehreli wrote:On 03/06/2012 09:11 PM, ixid wrote: > I'm writing my first basic algorithms, this one is merge sort. This > version throws an exception when array.length - setSize is negative > (which should be fine, the rest of my function would deal with it): > > template mergeSort(T) > { > void mergeSort(ref T[] array, const T setSize = 100) > { > T[][] merge; > merge.length = array.length / setSize; > T ii, jj; > for(ii = 0, jj = 0;ii < array.length - setSize;ii += setSize, ++jj) We don't know what T is, but I assume a signed type like int. array.length is size_t, i.e. an unsigned type. Unsigned types have this nasty habit of converting the entire expression to unsigned (that is a rule since C). So array.length - setSize above is size_t.Wouldn't it make more sense to convert the entire expression to signed if there's at list one signed integer? I really don't get it... It's like déjà-vu all the time in this newsgroup.
Mar 07 2012
On 8 March 2012 15:39, Ary Manzana <ary esperanto.org.ar> wrote:On 3/7/12 2:28 AM, Ali =C3=87ehreli wrote:hisOn 03/06/2012 09:11 PM, ixid wrote: =C2=A0> I'm writing my first basic algorithms, this one is merge sort. T=ive=C2=A0> version throws an exception when array.length - setSize is negat=t):=C2=A0> (which should be fine, the rest of my function would deal with i=ze, ++jj)=C2=A0> =C2=A0> template mergeSort(T) =C2=A0> { =C2=A0> void mergeSort(ref T[] array, const T setSize =3D 100) =C2=A0> { =C2=A0> T[][] merge; =C2=A0> merge.length =3D array.length / setSize; =C2=A0> T ii, jj; =C2=A0> for(ii =3D 0, jj =3D 0;ii < array.length - setSize;ii +=3D setSi=Its the semantics in C/C++ and D explicitly tries to have the same semantics as them. From what I remember its to aid people moving from those language to D. -- James MillerWe don't know what T is, but I assume a signed type like int. array.length is size_t, i.e. an unsigned type. Unsigned types have this nasty habit of converting the entire expression to unsigned (that is a rule since C). So array.length - setSize above is size_t.Wouldn't it make more sense to convert the entire expression to signed if there's at list one signed integer? I really don't get it... It's like d=C3=A9j=C3=A0-vu all the time in this newsgroup.
Mar 07 2012
On Thursday, March 08, 2012 16:15:08 James Miller wrote:Its the semantics in C/C++ and D explicitly tries to have the same semantics as them. From what I remember its to aid people moving from those language to D.More like it's to avoid code silently breaking when it's ported. In general, C/C++ code is either valid D code with identical semantics, or it won't compile as D code. There are a _few_ exceptions (e.g. static arrays are passed by value), but they're far and few between. Porting C/C++ code to D would be much nastier if the C/C++ compiled as D code but had different semantics. In addition to that, I believe that Walter's take on it is to generally leave stuff as it is in C/C++ unless there's a good reason to change it. Obviously, there's a lot which was worth changing, and there's tons of stuff in D which is different form C/C++. But there's a lot of little stuff that wasn't messed with, because there just wasn't a good reason to. - Jonathan M Davis
Mar 07 2012
On 03/08/2012 03:39 AM, Ary Manzana wrote:On 3/7/12 2:28 AM, Ali Çehreli wrote:This is just as arbitrary as the rule that says the entire expression is converted to unsigned.On 03/06/2012 09:11 PM, ixid wrote:Wouldn't it make more sense to convert the entire expression to signed if there's at list one signed integer?I'm writing my first basic algorithms, this one is merge sort. This version throws an exception when array.length - setSize is negative (which should be fine, the rest of my function would deal with it): template mergeSort(T) { void mergeSort(ref T[] array, const T setSize = 100) { T[][] merge; merge.length = array.length / setSize; T ii, jj; for(ii = 0, jj = 0;ii < array.length - setSize;ii += setSize, ++jj)We don't know what T is, but I assume a signed type like int. array.length is size_t, i.e. an unsigned type. Unsigned types have this nasty habit of converting the entire expression to unsigned (that is a rule since C). So array.length - setSize above is size_t.
Mar 08 2012
On Wed, Mar 07, 2012 at 06:11:18AM +0100, ixid wrote:I'm writing my first basic algorithms, this one is merge sort. This version throws an exception when array.length - setSize is negative (which should be fine, the rest of my function would deal with it):[...] array.length is of type size_t, which is an *unsigned* integer type. If setSize > array.length, then array.length - setSize will underflow and become a large positive number. T -- They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Mar 06 2012