digitalmars.D.learn - Why are unsigned to signed conversions implicit and don't emit a warning?

Andrej Mitrovic <none none.none> writes:
I just had a little bug in my code. In the WindowsAPI, there's this alias:

alias ubyte BYTE;

Unfortunately I didn't check for this, and I erroneously assumed BYTE was a
signed value (blame it on my lack of coffee). So when I used code like this:

alias Tuple!(byte, "red", byte, "green", byte, "blue") RGBTuple;

RGBTuple GetRGB(COLORREF cref)
{
    RGBTuple rgb;
    rgb.red   = GetRValue(cref);
    rgb.green = GetGValue(cref);
    rgb.blue  = GetBValue(cref);
    
    return rgb;
}

The rgb fields would often end up being -1 (Yes, I know all about how signed vs
unsigned representation works). My fault, yes.

But what really surprises me is that these unsigned to signed conversions
happen implicitly. I didn't even get a warning, even though I have all warning
switches turned on. 

I'm pretty sure GCC would complain about this in C code. Visual Studio
certainly complains if I set the appropriate warnings, examples given:

warning C4365: '=' : conversion from 'unsigned int' to 'int', signed/unsigned mismatch
warning C4365: '=' : conversion from 'unsigned short' to 'short', signed/unsigned mismatch
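
For reference, a minimal D sketch of the behaviour described above. The helper lowByte is a hypothetical stand-in for the WindowsAPI GetRValue (it is not part of any binding); the only assumption is that it returns a ubyte, as BYTE does. The assignment to a byte compiles without any diagnostic and the value wraps around:

```d
import std.stdio : writeln;

// Hypothetical stand-in for GetRValue: returns the low 8 bits as a ubyte.
ubyte lowByte(uint cref)
{
    return cast(ubyte)(cref & 0xFF);
}

void main()
{
    ubyte u = lowByte(0x00FFFFFF); // low byte is 0xFF, i.e. 255
    byte  b = u;                   // compiles: ubyte converts to byte implicitly

    assert(u == 255);
    assert(b == -1);               // same bit pattern, read as a signed value

    writeln(u, " -> ", b);
}
```

Declaring the tuple fields as ubyte instead of byte would avoid the wrap-around in the code above.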
Apr 10 2011
Andrej Mitrovic <none none.none> writes:
And I just remembered Tuples can be constructed just like regular structs
(which they are underneath):
RGBTuple GetRGB(COLORREF cref)
{
    return RGBTuple(GetRValue(cref), GetGValue(cref), GetBValue(cref));
}
Apr 10 2011
bearophile <bearophileHUGS lycos.com> writes:
Andrej Mitrovic:

> I just had a little bug in my code. In the WindowsAPI, there's this alias:
>
> alias ubyte BYTE;
>
> Unfortunately I didn't check for this, and I erroneously assumed BYTE was a
> signed value (blame it on my lack of coffee).

I and Don have asked (in Bugzilla and elsewhere) to change the built-in names into sbyte and ubyte, to avoid the common confusions between signed and unsigned bytes in D, but Walter was deaf to this.

> But what really surprises me is that these unsigned to signed conversions
> happen implicitly. I didn't even get a warning, even though I have all warning
> switches turned on.

Add your vote here (I have voted this), a bug report from 07 2006, but Walter doesn't like this warning, and warnings in general too: http://d.puremagic.com/issues/show_bug.cgi?id=259

Bye,
bearophile
Apr 10 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:
> Andrej Mitrovic:
> > I just had a little bug in my code. In the WindowsAPI, there's this
> > alias:
> >
> > alias ubyte BYTE;
> >
> > Unfortunately I didn't check for this, and I erroneously assumed BYTE was
> > a signed value (blame it on my lack of coffee).
>
> I and Don have asked (in Bugzilla and elsewhere) to change the built-in names into sbyte and ubyte, to avoid the common confusions between signed and unsigned bytes in D, but Walter was deaf to this.
>
> > But what really surprises me is that these unsigned to signed conversions
> > happen implicitly. I didn't even get a warning, even though I have all
> > warning switches turned on.
>
> Add your vote here (I have voted this), a bug report from 07 2006, but Walter doesn't like this warning, and warnings in general too: http://d.puremagic.com/issues/show_bug.cgi?id=259

Personally, I see _zero_ value in renaming byte, int, etc. to sbyte, sint, etc. It's well-known that they're signed. I don't see how adding an extra s would make that any clearer. Their names are perfectly clear as they are.

However, I also would have thought that converting between signed and unsigned values would be just as much of an error as narrowing conversions are - such as assigning an int to a byte. And arguably, assigning either an unsigned value to a signed value or vice versa is _also_ a narrowing conversion. So, I would have thought that it would be an error. Apparently not though.

- Jonathan M Davis
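
The distinction drawn above can be checked at compile time. The following is only a sketch using D's is(From : To) expression, which tests implicit convertibility; the variable names are illustrative. A narrowing conversion is rejected without a cast, while a same-size signed/unsigned conversion is accepted silently:

```d
// Which integral conversions does D accept implicitly?
static assert( is(ubyte : byte));  // unsigned -> signed, same size: accepted
static assert( is(uint  : int));   // likewise for 32-bit integers
static assert( is(byte  : ubyte)); // signed -> unsigned, same size: accepted
static assert(!is(int   : byte));  // narrowing: rejected without a cast

void main()
{
    uint big = 4_000_000_000u;
    int  wrapped = big;            // no cast needed, no diagnostic
    assert(wrapped == -294_967_296);

    int wide = 300;
    // byte narrow = wide;         // would not compile: narrowing conversion
    byte narrow = cast(byte) wide; // explicit cast keeps only the low 8 bits
    assert(narrow == 44);
}
```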
Apr 10 2011
bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

> Personally, I see _zero_ value in renaming byte, int, etc. to sbyte, sint,
> etc. It's well-known that they're signed. I don't see how adding an extra s
> would make that any clearer. Their names are perfectly clear as they are.

Discussing this here is probably useless, but:
- Unfortunately what's "perfectly clear" for the computer is sometimes bug-prone anyway.
- Don, other people who have had bugs in D caused by this, and I all seem to think of "bytes" as unsigned things.
- C# uses sbytes, and ubytes. Enough said.

Bye,
bearophile
Apr 10 2011
Kagamin <spam here.lot> writes:
bearophile Wrote:

> - C# uses sbytes, and ubytes. Enough said.

There's no ubyte in C#. It has byte, and it's unsigned. http://msdn.microsoft.com/en-us/library/exx3b86w.aspx
Apr 11 2011
bearophile <bearophileHUGS lycos.com> writes:
Kagamin:

> bearophile Wrote:
> > - C# uses sbytes, and ubytes. Enough said.
>
> there's no ubyte in C# It has byte, and it's unsigned. http://msdn.microsoft.com/en-us/library/exx3b86w.aspx

I was partially wrong, thank you. If you take a look, it has int/uint, short/ushort, etc., but it doesn't have byte/ubyte; it has sbyte/byte. In my opinion the naming symmetry has been broken here because for most programmers bytes are unsigned. In D I have suggested sbyte/ubyte, but I accept the C# solution too.

Bye,
bearophile
Apr 11 2011
spir <denis.spir gmail.com> writes:
On 04/11/2011 02:42 AM, bearophile wrote:
> I and Don have asked (in Bugzilla and elsewhere) to change the built-in names
> into sbyte and ubyte, to avoid the common confusions between signed and
> unsigned bytes in D, but Walter was deaf to this.

I think a good naming scheme would be:
* signed   : int8 .. int64
* unsigned : nat8 .. nat64
(since "natural number" already more or less means "unsigned integer number"). What do you think?

Or, counting in octets:
* signed   : int1 .. int8
* unsigned : nat1 .. nat8
(I prefer the latter naming scheme in absolute terms, but it would be confusing because some languages, and LLVM I guess, count in bits.)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com
Apr 11 2011
SimonM <user example.net> writes:
On 2011/04/11 09:31 AM, spir wrote:
> On 04/11/2011 02:42 AM, bearophile wrote:
> > I and Don have asked (in Bugzilla and elsewhere) to change the
> > built-in names into sbyte and ubyte, to avoid the common confusions
> > between signed and unsigned bytes in D, but Walter was deaf to this.
>
> I think a good naming scheme would be:
> * signed   : int8 .. int64
> * unsigned : nat8 .. nat64
> (since "natural number" more or less means "unsigned integer number") already. What do you think?

short, int, long, cent) and replacing them with int8..int64 (I'd still prefer uint8..uint64 though). Then you could use just 'int' to specify using the current system's architecture (and hopefully replace the ugly size_t type). I also think it makes more sense to just use 'int' when you don't really care about the specific size of the value. Unfortunately it would break backwards compatibility, so it would never make it into D's current state.
Apr 11 2011
Andrew Wiley <debio264 gmail.com> writes:
On Sun, Apr 10, 2011 at 7:57 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:
> > Andrej Mitrovic:
> > > I just had a little bug in my code. In the WindowsAPI, there's this
> > > alias:
> > >
> > > alias ubyte BYTE;
> > >
> > > Unfortunately I didn't check for this, and I erroneously assumed BYTE was
> > > a signed value (blame it on my lack of coffee).
> >
> > I and Don have asked (in Bugzilla and elsewhere) to change the built-in names into sbyte and ubyte, to avoid the common confusions between signed and unsigned bytes in D, but Walter was deaf to this.
> >
> > > But what really surprises me is that these unsigned to signed conversions
> > > happen implicitly. I didn't even get a warning, even though I have all
> > > warning switches turned on.
> >
> > Add your vote here (I have voted this), a bug report from 07 2006, but Walter doesn't like this warning, and warnings in general too: http://d.puremagic.com/issues/show_bug.cgi?id=259
>
> Personally, I see _zero_ value in renaming byte, int, etc. to sbyte, sint, etc. It's well-known that they're signed. I don't see how adding an extra s would make that any clearer. Their names are perfectly clear as they are. However, I also would have thought that converting between signed and unsigned values would be just as much of an error as narrowing conversions are - such as assigning an int to a byte. And arguably, assigning either an unsigned value to a signed value or vice versa is _also_ a narrowing conversion. So, I would have thought that it would be an error. Apparently not though.

I agree completely. The names are fine, we just need to get the conversions right. Yes, they aren't "theoretically correct" but integers as we define them aren't even integers in the mathematical sense, so the whole system isn't "theoretically correct." D's scheme, while not "pure," is very natural to use.
Apr 11 2011
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
There are some aliases in std.stdint
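
A small sketch of what those aliases look like in use. It checks only the 8-bit names, int8_t and uint8_t, since those are the ones relevant to this thread; the variable name channel is illustrative:

```d
import std.stdint; // C99-style fixed-width names

// The 8-bit aliases name the same types as the built-in byte and ubyte.
static assert(is(int8_t  == byte));
static assert(is(uint8_t == ubyte));

void main()
{
    uint8_t channel = 255;         // the signedness is spelled out in the name
    assert(channel == ubyte.max);

    int8_t wrapped = channel;      // the implicit conversion still applies
    assert(wrapped == -1);
}
```

The aliases make the signedness explicit in the name, but they do not change the conversion rules.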
Apr 11 2011
spir <denis.spir gmail.com> writes:
On 04/11/2011 01:47 AM, Andrej Mitrovic wrote:
> alias Tuple!(byte, "red", byte, "green", byte, "blue") RGBTuple;
>
> RGBTuple GetRGB(COLORREF cref)
> {
>     RGBTuple rgb;
>     rgb.red   = GetRValue(cref);
>     rgb.green = GetGValue(cref);
>     rgb.blue  = GetBValue(cref);
>
>     return rgb;
> }

[O your T]
Hello, Andrej,

I'm trying to understand why people use tuples (outside multiple return values and variadic typetuples). Why do you prefer the above to:

struct RGBColor { byte red, green, blue; }

RGBColor GetRGB(COLORREF cref)
{
    RGBColor rgb;
    rgb.red   = GetRValue(cref);
    rgb.green = GetGValue(cref);
    rgb.blue  = GetBValue(cref);
    return rgb;
}

?
[/O your T]

Denis
Apr 11 2011
bearophile <bearophileHUGS lycos.com> writes:
spir:

> I'm trying to understand why people use tuples (outside multiple return values
> and variadic typetuples). Why do you prefer the above to:

Tuples are also sortable and printable by default, I think.

Bye,
bearophile
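
A short sketch of that point, assuming std.typecons.Tuple with ubyte fields (so the wrap-around from the start of the thread does not occur). The alias name RGB is illustrative. Tuples compare field by field, so an array of them can be passed to sort without a custom comparison, and writeln can print them directly:

```d
import std.algorithm : sort;
import std.stdio : writeln;
import std.typecons : Tuple;

alias RGB = Tuple!(ubyte, "red", ubyte, "green", ubyte, "blue");

void main()
{
    auto colors = [RGB(255, 0, 0), RGB(0, 128, 0), RGB(0, 0, 255)];

    sort(colors); // uses Tuple's built-in field-by-field ordering

    assert(colors[0] == RGB(0, 0, 255));
    assert(colors[1] == RGB(0, 128, 0));
    assert(colors[2] == RGB(255, 0, 0));
    assert(colors[0].blue == 255); // named field access

    writeln(colors[0]); // printable without writing a toString
}
```

A plain struct would need its own opCmp before it could be sorted this way.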
Apr 11 2011
spir <denis.spir gmail.com> writes:
On 04/11/2011 06:45 AM, bearophile wrote:
> - For me, and for Don and from other people that have had bugs in D caused by
> this, it seems they think of "bytes" as unsigned things.

True for me as well. I was very surprised to discover 'byte' is /not/ unsigned (this was actually the cause of my first bug in D coding ;-).

Denis
Apr 11 2011
spir <denis.spir gmail.com> writes:
On 04/11/2011 10:10 AM, SimonM wrote:
> On 2011/04/11 09:31 AM, spir wrote:
> > On 04/11/2011 02:42 AM, bearophile wrote:
> > > I and Don have asked (in Bugzilla and elsewhere) to change the
> > > built-in names into sbyte and ubyte, to avoid the common confusions
> > > between signed and unsigned bytes in D, but Walter was deaf to this.
> >
> > I think a good naming scheme would be:
> > * signed   : int8 .. int64
> > * unsigned : nat8 .. nat64
> > (since "natural number" more or less means "unsigned integer number") already. What do you think?
>
> int, long, cent) and replacing them with int8..int64 (I'd still prefer uint8..uint64 though). Then you could use just 'int' to specify using the current system's architecture (and hopefully replace the ugly size_t type). I also think it makes more sense to just use 'int' when you don't really care about the specific size of the value. Unfortunately it would break backwards compatibility, so it would never make it into D's current state.

Agreed. Same for uint or nat. And no implicit cast, please ;-)

Denis
Apr 11 2011
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
No reason. Sometimes I find a new feature in D and I like to try it
out in my code in various places, to see how it looks, how it works,
etc. In this case a simple struct would do. :)
Apr 11 2011