digitalmars.D - Apparently unsigned types really are necessary

reply Walter Bright <newshound2 digitalmars.com> writes:
http://news.ycombinator.com/item?id=3495283

and getting rid of unsigned types is not the solution to signed/unsigned issues.
Jan 21 2012
next sibling parent reply bcs <bcs example.com> writes:
On 01/21/2012 10:05 PM, Walter Bright wrote:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to signed/unsigned
 issues.

A quote from that link: "There are many use cases for data types that behave like pure bit strings with no concept of sign." Why not recast the concept of unsigned integers as "bit vectors (that happen to implement arithmetic)"? I've seen several sources claim that uint (and friends) should never be used unless you are using it for low level bit tricks and the like. Rename them bits{8,16,32,64} and make the current names aliases.
Jan 21 2012
parent reply "Marco Leise" <Marco.Leise gmx.de> writes:
On 22.01.2012 at 08:23, bcs <bcs example.com> wrote:

 On 01/21/2012 10:05 PM, Walter Bright wrote:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to signed/unsigned
 issues.

A quote from that link: "There are many use cases for data types that behave like pure bit strings with no concept of sign." Why not recast the concept of unsigned integers as "bit vectors (that happen to implement arithmetic)"? I've seen several sources claim that uint (and friends) should never be used unless you are using it for low level bit tricks and the like.

Those are heretics.
 Rename them bits{8,16,32,64} and make the current names aliases.

So everyone uses int, and we get messages like: "This program currently uses -1404024 bytes of RAM". I have strong feelings against using signed types for variables that are ever going to only hold positive numbers, especially when it comes to sizes and lengths.
Jan 22 2012
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, January 22, 2012 10:31:17 Marco Leise wrote:
 On 22.01.2012 at 08:23, bcs <bcs example.com> wrote:
 Rename them bits{8,16,32,64} and make the current names aliases.

So everyone uses int, and we get messages like: "This program currently uses -1404024 bytes of RAM". I have strong feelings against using signed types for variables that are ever going to only hold positive numbers, especially when it comes to sizes and lengths.

Whereas others have string feelings about using unsigned types for much of anything which isn't intended for using bitshifts with. Lots of bugs are caused by the use of unsigned integral values. I know that Don wishes that size_t were signed and thinks that it's horrible that it isn't. I suspect that you will find more people who disagree with you than agree with you on this. Now, whether having bits8, bits16, etc. is a good idea or not, I don't know, but there are a lot of programmers who don't particularly like using unsigned types for normal arithmetic, regardless of what values the variable holds. - Jonathan M Davis
Jan 22 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/22/2012 1:44 AM, Jonathan M Davis wrote:
 Whereas others have string feelings

Sometimes I feel like a hashmap.
Jan 22 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, January 22, 2012 01:49:02 Walter Bright wrote:
 On 1/22/2012 1:44 AM, Jonathan M Davis wrote:
 Whereas others have string feelings

Sometimes I feel like a hashmap.

LOL. I keep forgetting to reread my posts before I post them. I really should fix that... - Jonathan M Davis
Jan 22 2012
prev sibling next sibling parent reply "Marco Leise" <Marco.Leise gmx.de> writes:
On 22.01.2012 at 10:44, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Sunday, January 22, 2012 10:31:17 Marco Leise wrote:
 On 22.01.2012 at 08:23, bcs <bcs example.com> wrote:
 Rename them bits{8,16,32,64} and make the current names aliases.

So everyone uses int, and we get messages like: "This program currently uses -1404024 bytes of RAM". I have strong feelings against using signed types for variables that are ever going to only hold positive numbers, especially when it comes to sizes and lengths.

Whereas others have string feelings about using unsigned types for much of anything which isn't intended for using bitshifts with. Lots of bugs are caused by the use of unsigned integral values. I know that Don wishes that size_t were signed and thinks that it's horrible that it isn't. I suspect that you will find more people who disagree with you than agree with you on this. Now, whether having bits8, bits16, etc. is a good idea or not, I don't know, but there are a lot of programmers who don't particularly like using unsigned types for normal arithmetic, regardless of what values the variable holds. - Jonathan M Davis

I heard that in the past, but in my own experience using unsigned data types, it did not cause any more bugs. OTOH, textual output is more correct and I find code easier to understand, if it is using the correct 'class' of integers. But this "a lot of programmers who don't particularly like using unsigned types" must come from somewhere. Except for existing bugs in the form of silent under-/overflows that do not appear alarming in a debugger due to their signedness, I've yet to see a convincing example of real world code, that I would write this way and is flawed due to the use of uint instead of int. Or is this like spaces vs. tabs? 'Cause I'm also a tab user.
Jan 22 2012
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, January 22, 2012 13:40:08 Marco Leise wrote:
 I heard that in the past, but in my own experience using unsigned data
 types, it did not cause any more bugs. OTOH, textual output is more
 correct and I find code easier to understand, if it is using the correct
 'class' of integers. But this "a lot of programmers who don't particularly
 like using unsigned types" must come from somewhere. Except for existing
 bugs in the form of silent under-/overflows that do not appear alarming in
 a debugger due to their signedness, I've yet to see a convincing example
 of real world code, that I would write this way and is flawed due to the
 use of uint instead of int. Or is this like spaces vs. tabs? 'Cause I'm
 also a tab user.

Down with tabs! ;)

One issue with unsigned integers right off the bat is for loops.

for(size_t i = a.length; i > 0; --i) {}

is not going to work.

Another potentially nasty situation is subtraction. It can do fun things when you subtract one unsigned type from another if you're not careful (since if the result is negative and is then assigned to an unsigned integer...). There are probably others, but that's what comes to mind immediately. In general, it comes down to issues with them rolling over and becoming incredibly large values when they go below 0.

Sure, unsigned types can be useful, and if you're careful with them, you can be fine, but there are definitely cases where they cause trouble. Hence, why many programmers argue for not using them unless you actually need them.

- Jonathan M Davis
Jan 22 2012
next sibling parent torhu <no spam.invalid> writes:
On 22.01.2012 13:49, Jonathan M Davis wrote:
 On Sunday, January 22, 2012 13:40:08 Marco Leise wrote:
  I heard that in the past, but in my own experience using unsigned data
  types, it did not cause any more bugs. OTOH, textual output is more
  correct and I find code easier to understand, if it is using the correct
  'class' of integers. But this "a lot of programmers who don't particularly
  like using unsigned types" must come from somewhere. Except for existing
  bugs in the form of silent under-/overflows that do not appear alarming in
  a debugger due to their signedness, I've yet to see a convincing example
  of real world code, that I would write this way and is flawed due to the
  use of uint instead of int. Or is this like spaces vs. tabs? 'Cause I'm
  also a tab user.

Down with tabs! ;) One issue with unsigned integers right off the bat is for loops. for(size_t i = a.length; i> 0; --i) {} is not going to work.

That'll work just fine, you probably meant '>=' ;)
Jan 22 2012
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
On 22.01.2012 at 13:49, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 Down with tabs! ;)

 One issue with unsigned integers right off the bat is for loops.

 for(size_t i = a.length; i > 0; --i) {}
 is not going to work.

That is C style. In D you would write: foreach_reverse(i; 0 .. a.length) {}, which is safe and corrects the two bugs in your code.

 Another potentially nasty situation is subtraction. It can do fun things
 when you subtract one unsigned type from another if you're not careful
 (since if the result is negative and is then assigned to an unsigned
 integer...). There are probably others, but that's what comes to mind
 immediately. In general, it comes down to issues with them rolling over and
 becoming incredibly large values when they go below 0.

I'm always careful when subtracting unsigned ints for the simple reason that the code working on them would be incorrect if results were negative. One example is subtracting two TickDurations. You always know which one is the lower. The same goes for offsets into files. When you copy the block between two locations you cannot exchange start and end. Imagine we had checked integers now, a proposal that doesn't seem far fetched. Would they scream in pain if I wrote "checked_ulong duration = start_time - end_time"? Yes.

 Sure, unsigned types can be useful, and if you're careful with them, you can
 be fine, but there are definitely cases where they cause trouble. Hence, why
 many programmers argue for not using them unless you actually need them.

 - Jonathan M Davis

I guess my mental model of integers has grown on the idea that an unsigned integer matches the addressable memory of my computer, and thus it is the natural choice there for array lengths and whatever is limited only by available RAM and comes in positive counts; whereas I 'waste' half the range and have only 'half' a match with signed types. I will put this under "tabs vs. spaces". :)
Jan 22 2012
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 01/22/2012 04:49 AM, Jonathan M Davis wrote:

 Another potentially nasty situation is subtraction. It
 can do fun things when you subtract one unsigned type from another if you're
 not careful (since if the result is negative and is then assigned to an
 unsigned integer...).

No need to assign the result explicitly either. Additionally, the subtraction is sometimes implicit.

When the expression has an unsigned in it, the temporary result is unsigned by the language rules since C:

import std.stdio;

int foo()
{
    return -2;
}

uint bar()
{
    return 1;
}

void main()
{
    writeln(foo() + bar());
}

The program above prints 4294967295.

It may make perfect sense for bar() to return an unsigned type (like arrays' .length property), but every time I decide on an unsigned type, I think about the potentially-unintended implicit conversion to unsigned that may bite the users of bar().

Ali
Jan 22 2012
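A minimal sketch of one way around the result above, reusing foo() and bar() from the post. The helper name sumAsLong is made up for this example; the idea is simply to widen one operand to long before the addition so that the uint operand is promoted to long instead of dragging the int operand to uint.

```d
int foo() { return -2; }
uint bar() { return 1; }

// Hypothetical helper: cast binds tighter than +, so foo() is widened
// first and bar() is then promoted to long as well.
long sumAsLong()
{
    return cast(long) foo() + bar();
}

unittest
{
    // int + uint is typed uint, so the mathematical -1 wraps around.
    static assert(is(typeof(foo() + bar()) == uint));
    assert(foo() + bar() == uint.max);

    // Widened to long, the sum is the expected -1.
    assert(sumAsLong() == -1);
}
```

This should build with something like `dmd -unittest -main -run file.d`. It only sidesteps the conversion at one call site; it does not change the promotion rules the thread is arguing about.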
parent "Marco Leise" <Marco.Leise gmx.de> writes:
On 22.01.2012 at 18:00, Ali Çehreli <acehreli yahoo.com> wrote:

 On 01/22/2012 04:49 AM, Jonathan M Davis wrote:

  > Another potentially nasty situation is subtraction. It
  > can do fun things when you subtract one unsigned type from another if you're
  > not careful (since if the result is negative and is then assigned to an
  > unsigned integer...).

 No need to assign the result explicitly either. Additionally, the
 subtraction is sometimes implicit.

 When the expression has an unsigned in it, the temporary result is
 unsigned by the language rules since C:

 import std.stdio;

 int foo()
 {
      return -2;
 }

 uint bar()
 {
      return 1;
 }

 void main()
 {
      writeln(foo() + bar());
 }

 The program above prints 4294967295.

 It may make perfect sense for bar() to return an unsigned type (like
 arrays' .length property), but every time I decide on an unsigned type,
 I think about the potentially-unintended implicit conversion to unsigned
 that may bite the users of bar().

 Ali

That's a valid point, if the order "foo() + bar()" makes sense and a negative value is expected. After all, foo() and bar() must be related somehow, otherwise you wouldn't add them. In this case, if the expected result is a signed int, I would make bar() return a signed int as well and not treat that case like array.length!
Jan 22 2012
prev sibling next sibling parent Mail Mantis <mail.mantis.88 gmail.com> writes:
2012/1/22 Jonathan M Davis <jmdavisProg gmx.com>:
 Another potentially nasty situation is subtraction. It
 can do fun things when you subtract one unsigned type from another if you're
 not careful...

It is unrelated to unsigned types in any way, isn't it?:

int a = 2_000_000_000;
assert( a + a == 4_000_000_000 ); // fail
assert( a + a == -294_967_296 ); // OK

Overflow bugs are often related to casting negative values to unsigned types, yes, but without these types we may still get unexpected behaviour from overflow, so removing them is hardly a proper solution.
Jan 22 2012
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
On Sun, 22 Jan 2012 13:49:37 +0100, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 One issue with unsigned integers right off the bat is for loops.
 for(size_t i = a.length; i > 0; --i) {}
 is not going to work.

What's not working with this? Besides, here is a neat idiom for reverse array indexing:

for (size_t i = a.length; i--; )
    writeln(a[i]);
Jan 22 2012
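The reverse loops mentioned in this subthread can be put side by side as a rough sketch. The function names are invented for the example; the three loop headers are the ones quoted in the posts. The `i > 0` form does terminate, but its index is off by one relative to the element, which is easy to overlook. The variant that really never terminates is `i >= 0`, because a size_t wraps around to size_t.max instead of going negative.

```d
// Indices run from a.length down to 1, so subtract one when indexing.
int[] reverseWithGreaterThan(const int[] a)
{
    int[] result;
    for (size_t i = a.length; i > 0; --i)
        result ~= a[i - 1];
    return result;
}

// Test-then-decrement: inside the body, i is already a valid index.
int[] reverseWithPostDecrement(const int[] a)
{
    int[] result;
    for (size_t i = a.length; i--; )
        result ~= a[i];
    return result;
}

// foreach_reverse over the index range 0 .. a.length.
int[] reverseWithForeachReverse(const int[] a)
{
    int[] result;
    foreach_reverse (i; 0 .. a.length)
        result ~= a[i];
    return result;
}

unittest
{
    const int[] a = [1, 2, 3];
    assert(reverseWithGreaterThan(a) == [3, 2, 1]);
    assert(reverseWithPostDecrement(a) == [3, 2, 1]);
    assert(reverseWithForeachReverse(a) == [3, 2, 1]);

    // Why `i >= 0` would loop forever: decrementing zero wraps around.
    size_t zero = 0;
    --zero;
    assert(zero == size_t.max);
}
```

All three variants also handle the empty array without entering the loop body.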
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/22/2012 4:40 AM, Marco Leise wrote:
 Or is
 this like spaces vs. tabs? 'Cause I'm also a tab user.

I struggled with that for years. Not with my own code, the tabs worked fine. The trouble was when collaborating with other people, who insisted on using tab stop settings that were the evil spawn of satan. Hence, collaborated code was always a mess. Like newklear combat toe to toe with the roosskies, the only way to win is to not play.
Jan 22 2012
next sibling parent reply bcs <bcs example.com> writes:
On 01/22/2012 10:09 AM, Walter Bright wrote:
 On 1/22/2012 4:40 AM, Marco Leise wrote:
 Or is
 this like spaces vs. tabs? 'Cause I'm also a tab user.

I struggled with that for years. Not with my own code, the tabs worked fine. The trouble was when collaborating with other people, who insisted on using tab stop settings that were the evil spawn of satan. Hence, collaborated code was always a mess. Like newklear combat toe to toe with the roosskies, the only way to win is to not play.

The only way to win the whitespace war is to change the rules. May I propose the following modifications to the D lexer:

'''
White space may consist of:
- A comment between any two tokens.
- A single space between tokens that, if adjoined, would be a single token.
All other white space (including \n \r \t \v, etc.) is forbidden and a lexical error.
'''

With these additions, all valid D code will be so hard to read that nobody will ever attempt to read it without first running a re-formatter over it, and once that is standard practice, everyone will see it in their own preferred style.
Jan 22 2012
parent "Nick Sabalausky" <a a.a> writes:
"bcs" <bcs example.com> wrote in message 
news:jfhqgv$13f7$1 digitalmars.com...
 On 01/22/2012 10:09 AM, Walter Bright wrote:
 On 1/22/2012 4:40 AM, Marco Leise wrote:
 Or is
 this like spaces vs. tabs? 'Cause I'm also a tab user.

I struggled with that for years. Not with my own code, the tabs worked fine. The trouble was when collaborating with other people, who insisted on using tab stop settings that were the evil spawn of satan. Hence, collaborated code was always a mess. Like newklear combat toe to toe with the roosskies, the only way to win is to not play.

The only way to win the whitespace war is to change the rules. May I propose the following modifications to the D lexer:

'''
White space may consist of:
- A comment between any two tokens.
- A single space between tokens that, if adjoined, would be a single token.
All other white space (including \n \r \t \v, etc.) is forbidden and a lexical error.
'''

With these additions, all valid D code will be so hard to read that nobody will ever attempt to read it without first running a re-formatter over it, and once that is standard practice, everyone will see it in their own preferred style.

Hah! I like it :)
Jan 23 2012
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:jfhjd6$lgf$1 digitalmars.com...
 On 1/22/2012 4:40 AM, Marco Leise wrote:
 Or is
 this like spaces vs. tabs? 'Cause I'm also a tab user.


The dirty rotten spacies can pry the tabs from my cold dead hands ;)
 I struggled with that for years. Not with my own code, the tabs worked 
 fine. The trouble was when collaborating with other people, who insisted 
 on using tab stop settings that were the evil spawn of satan. Hence, 
 collaborated code was always a mess.

How is that even *possible*? No matter what a user's tab size is, it's not going to affect anyone else unless they've snuck spaces in there, too.
 Like newklear combat toe to toe with the roosskies, the only way to win is 
 to not play.

In Soviet Russia, the tabstop adjusts YOU! (Nah, I don't know what the hell I mean by that either...)
Jan 23 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/22/12 3:44 AM, Jonathan M Davis wrote:
 Whereas others have string feelings about using unsigned types

My feelings in the matter are definitely integral. Andrei
Jan 22 2012
prev sibling next sibling parent reply bcs <bcs example.com> writes:
On 01/22/2012 01:31 AM, Marco Leise wrote:
 On 22.01.2012 at 08:23, bcs <bcs example.com> wrote:

 On 01/21/2012 10:05 PM, Walter Bright wrote:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to signed/unsigned
 issues.

A quote from that link: "There are many use cases for data types that behave like pure bit strings with no concept of sign." Why not recast the concept of unsigned integers as "bit vectors (that happen to implement arithmetic)"? I've seen several sources claim that uint (and friends) should never be used unless you are using it for low level bit tricks and the like.

Those are heretics.
 Rename them bits{8,16,32,64} and make the current names aliases.

So everyone uses int, and we get messages like: "This program currently uses -1404024 bytes of RAM". I have strong feelings against using signed types for variables that are ever going to only hold positive numbers, especially when it comes to sizes and lengths.

OK, I'll grant that there are an (*extremely* limited) number of cases where you actually need the full range of an unsigned integer type. I'm not suggesting that the actual semantics of the type be modified, and it would still be usable for exactly those cases. My suggestion is that the naming be modified to avoid suggesting that the *primary* use for the type is for non-negative numbers. To support that position: if you really expect to encounter, and thus need to correctly handle, numbers between 2^31 and 2^32 (or 2^63 and 2^64, etc.), then you already need to be doing careful analysis to avoid bugs from overflow. At that point, you are already considering low-level details, and using a "bit vector" type as a number is not much more complicated. The added bonus is that the mismatch between the name and what it's used for is a big red flag saying "be careful or this is likely to cause bugs". Getting people to think of it that way is likely to prevent more bugs than it causes.
Jan 22 2012
parent reply "foobar" <foo bar.com> writes:
On Sunday, 22 January 2012 at 20:01:52 UTC, bcs wrote:
 On 01/22/2012 01:31 AM, Marco Leise wrote:
 On 22.01.2012 at 08:23, bcs <bcs example.com> wrote:

 On 01/21/2012 10:05 PM, Walter Bright wrote:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to 
 signed/unsigned
 issues.

A quote from that link: "There are many use cases for data types that behave like pure bit strings with no concept of sign." Why not recast the concept of unsigned integers as "bit vectors (that happen to implement arithmetic)"? I've seen several sources claim that uint (and friends) should never be used unless you are using it for low level bit tricks and the like.

Those are heretics.
 Rename them bits{8,16,32,64} and make the current names 
 aliases.

So everyone uses int, and we get messages like: "This program currently uses -1404024 bytes of RAM". I have strong feelings against using signed types for variables that are ever going to only hold positive numbers, especially when it comes to sizes and lengths.

OK, I'll grant that there are an (*extremely* limited) number of cases where you actually need the full range of an unsigned integer type. I'm not suggesting that the actual semantics of the type be modified, and it would still be usable for exactly those cases. My suggestion is that the naming be modified to avoid suggesting that the *primary* use for the type is for non-negative numbers. To support that position: if you really expect to encounter, and thus need to correctly handle, numbers between 2^31 and 2^32 (or 2^63 and 2^64, etc.), then you already need to be doing careful analysis to avoid bugs from overflow. At that point, you are already considering low-level details, and using a "bit vector" type as a number is not much more complicated. The added bonus is that the mismatch between the name and what it's used for is a big red flag saying "be careful or this is likely to cause bugs". Getting people to think of it that way is likely to prevent more bugs than it causes.

I think that we're looking in the wrong corner for the culprit. While the unsigned types could have had better names (machine related: byte, word, etc..) IMO the real issue here is *not* with the types themselves but rather with the horrid implicit conversion rules inherited from C. mixed signed/unsigned expressions really should be compile errors and should be resolved explicitly by the programmer. foo() + bar() can be any of int/uint/long depending on what the programmer wants to achieve. my 2 cents.
Jan 22 2012
parent captaindet <2krnk gmx.net> writes:
On 2012-01-22 14:36, foobar wrote:
 On Sunday, 22 January 2012 at 20:01:52 UTC, bcs wrote:
 On 01/22/2012 01:31 AM, Marco Leise wrote:
 On 22.01.2012 at 08:23, bcs <bcs example.com> wrote:

 On 01/21/2012 10:05 PM, Walter Bright wrote:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to
 signed/unsigned issues.

A quote from that link: "There are many use cases for data types that behave like pure bit strings with no concept of sign." Why not recast the concept of unsigned integers as "bit vectors (that happen to implement arithmetic)"? I've seen several sources claim that uint (and friends) should never be used unless you are using it for low level bit tricks and the like.

Those are heretics.
 Rename them bits{8,16,32,64} and make the current names
 aliases.

So everyone uses int, and we get messages like: "This program currently uses -1404024 bytes of RAM". I have strong feelings against using signed types for variables that are ever going to only hold positive numbers, especially when it comes to sizes and lengths.

OK, I'll grant that there are an (*extremely* limited) number of cases where you actually need the full range of an unsigned integer type. I'm not suggesting that the actual semantics of the type be modified, and it would still be usable for exactly those cases. My suggestion is that the naming be modified to avoid suggesting that the *primary* use for the type is for non-negative numbers. To support that position: if you really expect to encounter, and thus need to correctly handle, numbers between 2^31 and 2^32 (or 2^63 and 2^64, etc.), then you already need to be doing careful analysis to avoid bugs from overflow. At that point, you are already considering low-level details, and using a "bit vector" type as a number is not much more complicated. The added bonus is that the mismatch between the name and what it's used for is a big red flag saying "be careful or this is likely to cause bugs". Getting people to think of it that way is likely to prevent more bugs than it causes.

I think that we're looking in the wrong corner for the culprit. While the unsigned types could have had better names (machine related: byte, word, etc..) IMO the real issue here is *not* with the types themselves but rather with the horrid implicit conversion rules inherited from C. mixed signed/unsigned expressions really should be compile errors and should be resolved explicitly by the programmer. foo() + bar() can be any of int/uint/long depending on what the programmer wants to achieve. my 2 cents.

+1
Jan 22 2012
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Sunday, 22 January 2012 at 09:31:15 UTC, Marco Leise wrote:
 So everyone uses int, and we get messages like: "This program 
 currently uses -1404024 bytes of RAM". I have strong feelings 
 against using signed types for variables that are ever going to 
 only hold positive numbers, especially when it comes to sizes 
 and lengths.

If you ignore type limits, you're asking for trouble. Imagine you have 2 gigs of ram and 3 gig pagefile on 32-bit OS. What is the total size of available memory?
Jan 22 2012
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Kagamin" <spam here.lot> wrote in message 
news:bhhmhjvgsmlxjvsuwzsb dfeed.kimsufi.thecybershadow.net...
 On Sunday, 22 January 2012 at 09:31:15 UTC, Marco Leise wrote:
 So everyone uses int, and we get messages like: "This program currently 
 uses -1404024 bytes of RAM". I have strong feelings against using signed 
 types for variables that are ever going to only hold positive numbers, 
 especially when it comes to sizes and lengths.

If you ignore type limits, you're asking for trouble. Imagine you have 2 gigs of ram and 3 gig pagefile on 32-bit OS. What is the total size of available memory?

One negative gig, obviously. (I think that means it's positronic...)
Jan 22 2012
prev sibling parent reply "Marco Leise" <Marco.Leise gmx.de> writes:
On 22.01.2012 at 21:44, Kagamin <spam here.lot> wrote:

 On Sunday, 22 January 2012 at 09:31:15 UTC, Marco Leise wrote:
 So everyone uses int, and we get messages like: "This program currently  
 uses -1404024 bytes of RAM". I have strong feelings against using  
 signed types for variables that are ever going to only hold positive  
 numbers, especially when it comes to sizes and lengths.

If you ignore type limits, you're asking for trouble. Imagine you have 2 gigs of ram and 3 gig pagefile on 32-bit OS. What is the total size of available memory?

I can use up to 4GB of that in the address space of my application - the value range of a uint, qed
Jan 22 2012
parent reply "Kagamin" <spam here.lot> writes:
On Sunday, 22 January 2012 at 22:17:10 UTC, Marco Leise wrote:
 If you ignore type limits, you're asking for trouble. Imagine 
 you have 2 gigs of ram and 3 gig pagefile on 32-bit OS. What 
 is the total size of available memory?

I can use up to 4GB of that in the address space of my application - the value range of a uint, qed

With PAE it's possible to access more than that. AFAIK some web-servers do it.
Jan 22 2012
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 22 Jan 2012 23:38:24 -0500, Kagamin <spam here.lot> wrote:

 On Sunday, 22 January 2012 at 22:17:10 UTC, Marco Leise wrote:
 If you ignore type limits, you're asking for trouble. Imagine you have  
 2 gigs of ram and 3 gig pagefile on 32-bit OS. What is the total size  
 of available memory?

I can use up to 4GB of that in the address space of my application - the value range of a uint, qed

With PAE it's possible to access more than that. AFAIK some web-servers do it.

The OS supports it, but not on a single process. You could achieve the feat with multiple processes. But then again, nobody cares about PAE anymore, just go 64-bit. -Steve
Jan 24 2012
prev sibling next sibling parent reply Mail Mantis <mail.mantis.88 gmail.com> writes:
2012/1/22 Walter Bright <newshound2 digitalmars.com>:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to signed/unsigned
 issues.

Would it be sane to add integer overflow/carry runtime checks in -debug builds? This could probably solve such issues, but we'd need some means to avoid these checks when necessary.
Jan 22 2012
parent bcs <bcs example.com> writes:
On 01/22/2012 01:42 AM, Mail Mantis wrote:
 2012/1/22 Walter Bright<newshound2 digitalmars.com>:
 http://news.ycombinator.com/item?id=3495283

 and getting rid of unsigned types is not the solution to signed/unsigned
 issues.

Would it be sane to add integer overflow/carry runtime checks in -debug builds? This could probably solve such issues, but we'd need some means to avoid these checks when necessary.

http://embed.cs.utah.edu/ioc/
Jan 22 2012
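As a rough illustration of what opt-in overflow checks can look like at the library level, here is a sketch that assumes a compiler recent enough to ship druntime's core.checkedint module. The wrappers tryAdd and trySub are hypothetical names for this example; adds and subu are the druntime functions, which set a caller-supplied flag when the result does not fit.

```d
import core.checkedint : adds, subu;

// Hypothetical wrapper: true if a + b fits in an int, with the sum in result.
bool tryAdd(int a, int b, out int result)
{
    bool overflow = false;
    result = adds(a, b, overflow);
    return !overflow;
}

// Hypothetical wrapper: true if a - b does not drop below zero.
bool trySub(uint a, uint b, out uint result)
{
    bool overflow = false;
    result = subu(a, b, overflow);
    return !overflow;
}

unittest
{
    int sum;
    assert(tryAdd(1, 2, sum) && sum == 3);
    // The signed example from earlier in the thread is reported as overflow.
    assert(!tryAdd(2_000_000_000, 2_000_000_000, sum));

    uint diff;
    assert(trySub(5, 3, diff) && diff == 2);
    // Subtracting a larger unsigned value is flagged instead of wrapping silently.
    assert(!trySub(3, 5, diff));
}
```

Code that relies on wraparound (hashes, PRNGs, checksums) simply keeps using the plain operators, which is one way to get the per-use opt-out asked for above.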
prev sibling next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
 Would it be sane to add integer overflow/carry runtime checks in
 -debug builds? This could probably solve such issues, but we'd need
 some means to avoid this checks when necessary.

I have asked before about getting some standard way to hold these values (the overflow and carry flags) after an arithmetic operation. It basically comes down to problems making it portable. Since this arithmetic is taken directly from C's view of how to handle it (which ignores the hardware's obvious view), we need to look at it twice.

First, there are normal computations, where we ask for a squared value or something else a project needs a good value for. These are situations where overflow and carry would screw with our results. These should have checks.

Second, there are algorithms such as PRNGs, encryption, and checksums, which rely on the behavior as it is.

We would need a way to specify which ints need to be checked; or, if you want to go the other direction, specify which ones specifically don't. I think having the checks in debug mode would be wonderful, for when you need it. It seems more likely that a new struct type would be made that does those checks for you and is replaced in release builds with the type it emulates (not too unlike the SafeInt Microsoft was making).
Jan 22 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Era Scarecrow:

  We would need a way to specify which ints needed to be checked; Or if you
want to go the other direction, specify which ones specifically don't. I think
having the checks in the debug mode would be wonderful, for when you need it. 

If D will have some success, and it will be used a bit in situations where Ada is today used, then surely a D compiler will have checked signed and unsigned integrals. But there's also a need for a syntax to locally disable the checks. Bye, bearophile
Jan 22 2012
prev sibling next sibling parent reply equinox atw.hu writes:
Hi,


I noticed I cannot use typedef any longer in D2.
Why did it go?


Regards


Marton Papp
Jan 22 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/22/2012 9:44 AM, equinox atw.hu wrote:
 I noticed I cannot use typedef any longer in D2.
 Why did it go?

typedef turned out to have many difficult issues about when it was a distinct type and when it wasn't.
Jan 22 2012
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Walter Bright" <newshound2 digitalmars.com> wrote in message 
news:jfhj4v$l2b$1 digitalmars.com...
 On 1/22/2012 9:44 AM, equinox atw.hu wrote:
 I noticed I cannot use typedef any longer in D2.
 Why did it go?

typedef turned out to have many difficult issues about when it was a distinct type and when it wasn't.

Fortunately, you should still be able to get the same effect of typedef with a struct and alias this.
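A rough sketch of that approach, with hypothetical names (Meters, twice, takesMeters) made up for illustration. The wrapper converts to its underlying int wherever an int is wanted, but a bare int is not accepted where the wrapper is required:

```d
import std.stdio : writeln;

// Hypothetical strong type: a distinct wrapper around int.
struct Meters
{
    int value;
    alias value this; // a Meters can be read wherever an int is expected
}

int twice(int x) { return x * 2; }

void takesMeters(Meters m) { writeln("meters: ", m.value); }

void main()
{
    auto m = Meters(5);
    writeln(twice(m)); // Meters -> int through alias this
    takesMeters(m);

    // The reverse direction is rejected: an int is not implicitly a Meters.
    static assert(!__traits(compiles, takesMeters(5)));
}
```

Phobos also provides std.typecons.Typedef, which packages a similar idea as a library template.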
Jan 22 2012
parent Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
Wouldn't it be easier to make the typedef a Phobos solution and end
this debate once and for all?
Sure, the definition won't look as pretty as typedef did, but still...

On Mon, Jan 23, 2012 at 1:22 AM, Nick Sabalausky <a a.a> wrote:
 "Walter Bright" <newshound2 digitalmars.com> wrote in message
 news:jfhj4v$l2b$1 digitalmars.com...
 On 1/22/2012 9:44 AM, equinox atw.hu wrote:
 I noticed I cannot use typedef any longer in D2.
 Why did it go?

typedef turned out to have many difficult issues about when it was a distinct type and when it wasn't.

Fortunately, you should still be able to get the same effect of typedef with a struct and alias this.

-- Bye, Gor Gyolchanyan.
Jan 23 2012
prev sibling parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 22 January 2012 18:05, Walter Bright <newshound2 digitalmars.com> wrote:
 On 1/22/2012 9:44 AM, equinox atw.hu wrote:
 I noticed I cannot use typedef any longer in D2.
 Why did it go?

typedef turned out to have many difficult issues about when it was a distinct type and when it wasn't.

+1

Walter, what are your views on emitting the names of aliased types to debug?

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Jan 23 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/23/2012 5:08 AM, Iain Buclaw wrote:
 Walter, what are your views on emitting the names of aliased types to debug?

I don't really have an opinion on it, except that generally when I'm debugging I'm interested in what a type really is, not the layer over it.
Jan 23 2012
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Walter:

 I don't really have an opinion on it, except that generally when I'm debugging 
 I'm interested in what a type really is, not the layer over it.

Both are useful for the programmer to know. See in this enhancement request of mine what both Clang and GCC do:
http://d.puremagic.com/issues/show_bug.cgi?id=5004

Bye,
bearophile
Jan 23 2012
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, January 23, 2012 10:08:50 Walter Bright wrote:
 On 1/23/2012 5:08 AM, Iain Buclaw wrote:
 Walter, what are your views on emitting the names of aliased types to
 debug?

I don't really have an opinion on it, except that generally when I'm debugging I'm interested in what a type really is, not the layer over it.

I usually want the aliased type. I can look up what the alias is, but I can't know that something is using an alias if the alias is gone.

- Jonathan M Davis
Jan 23 2012
parent "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.742.1327344456.16222.digitalmars-d puremagic.com...
 On Monday, January 23, 2012 10:08:50 Walter Bright wrote:
 On 1/23/2012 5:08 AM, Iain Buclaw wrote:
 Walter, what are your views on emitting the names of aliased types to
 debug?

I don't really have an opinion on it, except that generally when I'm debugging I'm interested in what a type really is, not the layer over it.


That's a pain in the ass for library users. For example, Goldie uses a lot of type sugar so that a nonterminal generated from the rule:

    <Foo> ::= <Bar> ';'

Is usually referred to as type:

    A: Token_myLang!("<Foo>", "<Bar>", ";")

Instead of:

    B: Token_myLang!(
        "<Foo>",
        Token_myLang!(SymbolType.NonTerminal, "<Bar>"),
        Token_myLang!(SymbolType.Terminal, ";")
    )

or:

    C: _Token_myLang!("<Foo>", 47); // Rule ID#

The "B" is a rarely-needed unsugared version (which is guaranteed to be unambiguous with other symbols in the grammar). And "C" (or something similar to it) is the true underlying type that's intended purely as an implementation detail - *NOT* for user consumption.

The policy of showing types de-aliased breaks proper encapsulation and just destroys all of the above whenever there's a type error (which is the whole point of having those types in the first place).

 I usually want the aliased type.

Same here. I usually find it far more relevant to the code the compiler's referring to. The current way is like having error messages show mangled names or talking in terms of the generated assembly instructions: It totally breaks out of the proper context and level-of-abstraction.

 I can look up what the alias is, but I can't
 know that something is using an alias if the alias is gone.

I hadn't thought of that, that's a good point, too.
Jan 23 2012
prev sibling parent "Jakob Bornecrantz" <wallbraker gmail.com> writes:
On Monday, 23 January 2012 at 18:08:51 UTC, Walter Bright wrote:
 On 1/23/2012 5:08 AM, Iain Buclaw wrote:
 Walter, what are your views on emitting the names of aliased 
 types to debug?

I don't really have an opinion on it, except that generally when I'm debugging I'm interested in what a type really is, not the layer over it.

While this is D1, I think it applies to D2 as well. Classes can be substituted for packages; the same behavior is still observed. And yes, I have code that does this [1].

    class A { class Foo {} }
    class B { class Foo {} }

    alias A.Foo AFoo;
    alias B.Foo BFoo;

    void bar(AFoo f)
    {
        BFoo var;
        bar(var);
    }

    src/foo.d(12): Error: function foo.bar (Foo) does not match parameter types (Foo)
    src/foo.d(12): Error: cannot implicitly convert expression (var) of type foo.B.Foo to foo.A.Foo

The first error line is a bit confusing; the second gets it correct. I guess it's mostly my fault here, but getting AFoo and BFoo in the first line and then the full printout in the second would be the best for my case at least.

Cheers, Jakob.

[1] https://github.com/Wallbraker/Charged-Miners/blob/master/src/charge/charge.d
Jan 23 2012
prev sibling next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
 My I propose the following modifications to the D lexer:
 
 '''
 White space may consist of:
 - A comment between any two tokens.
 - A single space between tokens that, if adjoined would be a
 single token.
 
 All other white space (including \n \r \t \v, etc.) is
 forbidden and a lexical error.
 '''
 
 With these additions, all valid D code will be so hard to read that 
 nobody will ever attempt to read it without first running a re-formatter 
 over it and once that is standard practice, everyone will see it in 
 there own preferred style.

'\n' would be an invalid white space? Wow, I see problems with that. Take the following debugging function of mine. It uses a combination of spaces, newlines and tabs. I think it's easy to read and understand.

    //(incomplete without the full class/struct, but you get the idea)
    void print() {
        writeln("\nP.empty = ", empty,
            "\nP.front = ", front,
            "\nP.position = ", position,
            "\nP.cap = ", cap,
            "\nP.map = ", map, "\n");
    }

That would instead become

    void print() { writeln("\nP.empty = ", empty, "\nP.front = ", front, "\nP.position = ", position, "\nP.cap = ", cap, "\nP.map = ", map, "\n"); }

Far more likely, the rules would have to be set for the editor to convert tabs into a specific number of spaces and save it as such (and convert them back if they want). Otherwise, in said projects, enforce certain formatting rules for the project and reject code until it is fixed.
Jan 22 2012
parent bcs <bcs example.com> writes:
On 01/22/2012 01:24 PM, Era Scarecrow wrote:
 My I propose the following modifications to the D lexer:

 '''
 White space may consist of:
 - A comment between any two tokens.
 - A single space between tokens that, if adjoined would be a
 single token.

 All other white space (including \n \r \t \v, etc.) is
 forbidden and a lexical error.
 '''

 With these additions, all valid D code will be so hard to read that
 nobody will ever attempt to read it without first running a re-formatter
 over it and once that is standard practice, everyone will see it in
 there own preferred style.

'\n' would be an invalid white space? Wow, I see problems with that. Take the following debugging function of mine. It uses a combination of spaces, newlines and tabs. I think it's easy to read and understand.

    //(incomplete without the full class/struct, but you get the idea)
    void print() {
        writeln("\nP.empty = ", empty,
            "\nP.front = ", front,
            "\nP.position = ", position,
            "\nP.cap = ", cap,
            "\nP.map = ", map, "\n");
    }

That would instead become

    void print() { writeln("\nP.empty = ", empty, "\nP.front = ", front, "\nP.position = ", position, "\nP.cap = ", cap, "\nP.map = ", map, "\n"); }

Far more likely, the rules would have to be set for the editor to convert tabs into a specific number of spaces and save it as such (and convert them back if they want). Otherwise, in said projects, enforce certain formatting rules for the project and reject code until it is fixed.

Points:
1) that 2nd formatting still includes whitespace that would be illegal (e.g. every place but between 'void' and 'print' and in the string literals).
2) The *point* is to turn code into an unreadable mash on a single line.
3) The entire proposal is satire.
Jan 22 2012
prev sibling next sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
 Points:
 1) that 2nd formatting still includes whitespace that would
 be illegal 
 (e.g. every place but between 'void' and 'print' and in the
 strings 
 literals).
 2) The *point* is to turn code into an unreadable mash on a
 single line.
 3) The entire proposal is satire.

Ahh, I had the impression from the list that all whitespace tokens were referring to a single statement line, not the source as a whole. I guess the only way to make spaces (1 or more) the only whitespace would be if we still used a fixed 80-character-wide screen for our editors; then leftover whitespace becomes formatting. But that seems sufficiently stupid to do, file size being the largest part.

I know C's syntax was designed to let you drop all whitespace (with minor exceptions) and put everything on a single line, which is why /**/ comments were used and C++'s // ones were added later. Although it's fun to do a whole lot on a single line, I don't know if I would want to.

/*C code following the follow proposal; tested and works. Not bad for 165 characters of pure code.*/ isprime(int n){int cnt=2;if(n<2)return 0;for(;cnt<n;cnt++)if((n%cnt)==0)return 0;return 1;} main(){int cnt=2;for(;cnt<100;cnt++)if (isprime(cnt))printf("%d ", cnt);}
Jan 22 2012
parent reply "Marco Leise" <Marco.Leise gmx.de> writes:
Am 23.01.2012, 00:22 Uhr, schrieb Era Scarecrow <rtcvb32 yahoo.com>:

 Points:
 1) that 2nd formatting still includes whitespace that would
 be illegal
 (e.g. every place but between 'void' and 'print' and in the
 strings
 literals).
 2) The *point* is to turn code into an unreadable mash on a
 single line.
 3) The entire proposal is satire.

Ahh, I had the impression from the list that all whitespace tokens were referring to a single statement line, not the source as a whole. I guess the only way to make spaces (1 or more) the only whitespace would be if we still used a fixed 80-character-wide screen for our editors; then leftover whitespace becomes formatting. But that seems sufficiently stupid to do, file size being the largest part.

I know C's syntax was designed to let you drop all whitespace (with minor exceptions) and put everything on a single line, which is why /**/ comments were used and C++'s // ones were added later. Although it's fun to do a whole lot on a single line, I don't know if I would want to.

/*C code following the follow proposal; tested and works. Not bad for 165 characters of pure code.*/ isprime(int n){int cnt=2;if(n<2)return 0;for(;cnt<n;cnt++)if((n%cnt)==0)return 0;return 1;} main(){int cnt=2;for(;cnt<100;cnt++)if (isprime(cnt))printf("%d ", cnt);}

Sorry, but you still have unnecessary spaces in 3 places ;)
Jan 23 2012
parent bcs <bcs example.com> writes:
On 01/23/2012 02:11 AM, Marco Leise wrote:
 Am 23.01.2012, 00:22 Uhr, schrieb Era Scarecrow <rtcvb32 yahoo.com>:

 Points:
 1) that 2nd formatting still includes whitespace that would
 be illegal
 (e.g. every place but between 'void' and 'print' and in the
 strings
 literals).
 2) The *point* is to turn code into an unreadable mash on a
 single line.
 3) The entire proposal is satire.

Ahh, I had the impression from the list that all whitespace tokens were referring to a single statement line, not the source as a whole. I guess the only way to make spaces (1 or more) the only whitespace would be if we still used a fixed 80-character-wide screen for our editors; then leftover whitespace becomes formatting. But that seems sufficiently stupid to do, file size being the largest part.

I know C's syntax was designed to let you drop all whitespace (with minor exceptions) and put everything on a single line, which is why /**/ comments were used and C++'s // ones were added later. Although it's fun to do a whole lot on a single line, I don't know if I would want to.

/*C code following the follow proposal; tested and works. Not bad for 165 characters of pure code.*/ isprime(int n){int cnt=2;if(n<2)return 0;for(;cnt<n;cnt++)if((n%cnt)==0)return 0;return 1;} main(){int cnt=2;for(;cnt<100;cnt++)if (isprime(cnt))printf("%d ", cnt);}

Sorry, but you still have unnecessary spaces in 3 places ;)

4, the space between the leading comment and 'isprime' must be removed as well.
Jan 23 2012
prev sibling parent reply Era Scarecrow <rtcvb32 yahoo.com> writes:
 From: "Nick Sabalausky" <a a.a>
 '''
 White space may consist of:
 - A comment between any two tokens.
 - A single space between tokens that, if adjoined would

be a single token.
 All other white space (including \n \r \t \v, etc.) is

forbidden and a
 lexical error.
 '''

 With these additions, all valid D code will be so hard

to read that nobody
 will ever attempt to read it without first running a

re-formatter over it
 and once that is standard practice, everyone will see

it in there own
 preferred style.

Hah! I like it :)

It does make a certain amount of sense... but if that were really an issue, then having your own formatter strip the unneeded spaces and then re-introduce them seems trivial, in which case an indentation tool would be more likely to be used (GNU indent anyone?).

It does however become an issue for debugging, if programs are compiled against sources compacted in that way. Everything would be on line 1, and other such messes. Of course, if you're compiling the sources yourself and running them through your formatter, you're fine. But if someone else has their own format, downloads the source and reformats it, line 117 may not point to anything, or may point to the wrong object, or worse yet, if an assert was thrown, to an unrelated passing assert; that is assuming you use a program compiled with debugging flags and you don't rebuild said executable.

And most importantly of all, I quote: "If it ain't broke, don't fix it".
Jan 23 2012
parent bcs <bcs example.com> writes:
On 01/23/2012 07:13 PM, Era Scarecrow wrote:
 From: "Nick Sabalausky"<a a.a>
 '''
 White space may consist of:
 - A comment between any two tokens.
 - A single space between tokens that, if adjoined would

be a single token.
 All other white space (including \n \r \t \v, etc.) is

forbidden and a
 lexical error.
 '''

 With these additions, all valid D code will be so hard

to read that nobody
 will ever attempt to read it without first running a

re-formatter over it
 and once that is standard practice, everyone will see

it in there own
 preferred style.

Hah! I like it :)

It does make a certain amount of sense... but if that were really an issue, then having your own formatter strip the unneeded spaces and then re-introduce them seems trivial, in which case an indentation tool would be more likely to be used (GNU indent anyone?).

It does however become an issue for debugging, if programs are compiled against sources compacted in that way. Everything would be on line 1, and other such messes. Of course, if you're compiling the sources yourself and running them through your formatter, you're fine. But if someone else has their own format, downloads the source and reformats it, line 117 may not point to anything, or may point to the wrong object, or worse yet, if an assert was thrown, to an unrelated passing assert; that is assuming you use a program compiled with debugging flags and you don't rebuild said executable.

And most importantly of all, I quote: "If it ain't broke, don't fix it".

That's just a tools problem... :b
Jan 23 2012