
digitalmars.D.learn - Can we disallow appending integer to string?

reply Nick Treleaven <nick geany.org> writes:
This bug is fixed as the code no longer segfaults but throws 
instead:
https://issues.dlang.org/show_bug.cgi?id=5995

void main(){
	string ret;
	int i = -1;
	ret ~= i;
}

Why is it legal to append an integer?
Apr 19
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Wednesday, 19 April 2017 at 14:36:13 UTC, Nick Treleaven wrote:
 This bug is fixed as the code no longer segfaults but throws 
 instead:
 https://issues.dlang.org/show_bug.cgi?id=5995

 void main(){
 	string ret;
 	int i = -1;
 	ret ~= i;
 }

 Why is it legal to append an integer?
Because integrals implicitly convert to characters of same width (byte -> char, short -> wchar, int -> dchar).
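For reference, a minimal sketch showing that all three of those conversions compile without casts (the values are illustrative):

```d
void main()
{
    byte  b = 65;
    short s = 65;
    int   i = 65;

    char  c = b; // byte  -> char,  accepted implicitly
    wchar w = s; // short -> wchar, accepted implicitly
    dchar d = i; // int   -> dchar, accepted implicitly

    assert(c == 'A' && w == 'A' && d == 'A');
}
```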
Apr 19
next sibling parent reply Jonathan M Davis via Digitalmars-d-learn writes:
On Wednesday, April 19, 2017 14:50:38 Stanislav Blinov via Digitalmars-d-learn wrote:
 On Wednesday, 19 April 2017 at 14:36:13 UTC, Nick Treleaven wrote:
 This bug is fixed as the code no longer segfaults but throws
 instead:
 https://issues.dlang.org/show_bug.cgi?id=5995

 void main(){

     string ret;
     int i = -1;
     ret ~= i;

 }

 Why is it legal to append an integer?
Because integrals implicitly convert to characters of same width (byte -> char, short -> wchar, int -> dchar).
Yeah, which reduces the number of casts required when doing arithmetic on characters and thus reduces bugs there, and I believe that that's the main reason the implicit conversion from an integral type to a character type exists. So, having the implicit conversion fixes one set of bugs, and disallowing it fixes another. We have similar problems with bool.

Personally, I think that we should have taken the stricter approach and not had integral types implicitly convert to character types, but from what I recall, Walter feels pretty strongly about the conversion rules being the way that they are.

- Jonathan M Davis
Apr 19
next sibling parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Wednesday, 19 April 2017 at 17:34:01 UTC, Jonathan M Davis 
wrote:

 Personally, I think that we should have taken the stricter 
 approach and not had integral types implicitly convert to 
 character types, but from what I recall, Walter feels pretty 
 strongly about the conversion rules being the way that they are.
Yep, me too. Generally, I don't think that an implicit conversion (T : U) should be allowed if T.init is not equivalent to U.init.
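Under that rule the char/integral conversions would indeed fall away, since the character types default to invalid code points while the integrals default to zero. A quick sketch checking the .init values:

```d
void main()
{
    // char/wchar/dchar deliberately default to invalid code points
    static assert(char.init  == 0xFF);
    static assert(wchar.init == 0xFFFF);
    static assert(dchar.init == 0xFFFF);

    // while the integral types all default to 0
    static assert(byte.init == 0 && short.init == 0 && int.init == 0);
}
```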
Apr 19
parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Wed, Apr 19, 2017 at 05:56:18PM +0000, Stanislav Blinov via
Digitalmars-d-learn wrote:
 On Wednesday, 19 April 2017 at 17:34:01 UTC, Jonathan M Davis wrote:
 Personally, I think that we should have taken the stricter approach
 and not had integral types implicitly convert to character types, but
 from what I recall, Walter feels pretty strongly about the
 conversion rules being the way that they are.
Yep, me too. Generally, I don't think that an implicit conversion (T : U) should be allowed if T.init is not equivalent to U.init.
Me three. Implicit conversion of int to char/dchar/etc. is a horrible, horrible idea that leads to hard-to-find bugs. The whole point of having a separate type for char vs. ubyte, unlike in C where char pretty much means byte/ubyte, is so that we can keep their distinctions straight, not to continue to confuse them in the typical C manner by having their distinction blurred by implicit casts.

Personally, I would rather have arithmetic on char (wchar, dchar) produce results of the same type, so that no implicit conversions are necessary. It seems to totally make sense to me to have to explicitly ask for a character's numerical value -- it documents code intent, which often also helps the programmer clear his head and avoid mistakes that would otherwise inevitably creep in.

A few extra keystrokes to type cast(int) or cast(char) ain't gonna kill nobody. In fact, it might even save a few people by preventing certain kinds of bugs.

T

--
Gone Chopin. Bach in a minuet.
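Note that arithmetic on char already promotes to int today, which is exactly what forces the explicit cast when storing the result back. A small sketch:

```d
void main()
{
    char c = 'a';

    // integer promotion: the result of char arithmetic is int, not char
    static assert(is(typeof(c + 1) == int));

    // char next = c + 1;  // would not compile: cannot narrow int to char
    char next = cast(char)(c + 1); // explicit cast documents the intent
    assert(next == 'b');
}
```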
Apr 19
parent Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Wednesday, 19 April 2017 at 18:40:23 UTC, H. S. Teoh wrote:

 A few extra keystrokes to type cast(int) or cast(char) ain't 
 gonna kill nobody. In fact, it might even save a few people by 
 preventing certain kinds of bugs.
Yup. Not to mention one could have

	@property auto numeric(Flag!"unsigned" unsigned = No.unsigned, C)(C c)
		if (isSomeChar!C)
	{
		return cast(IntOfSize!(C.sizeof, unsigned)) c;
	}

	auto v = 'a'.numeric;

...or even have an equivalent as a built-in property of character types...
Apr 19
prev sibling parent "Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 04/19/2017 01:34 PM, Jonathan M Davis via Digitalmars-d-learn wrote:
 Yeah, which reduces the number of casts required when doing arithmetic on
 characters and thus reduces bugs there,
Ugh, because apparently doing arithmetic on characters is such a common, important use-case in this modern unicode world... :/
Apr 20
prev sibling next sibling parent reply Nick Treleaven <nick geany.org> writes:
On Wednesday, 19 April 2017 at 14:50:38 UTC, Stanislav Blinov 
wrote:
 Because integrals implicitly convert to characters of same 
 width (byte -> char, short -> wchar, int -> dchar).
Note that char.min > byte.min yet char.max > byte.max, so neither type's range contains the other's. Anyway, appending an integer is inconsistent because concatenation is not allowed:

	int n;
	string s;
	s = "" ~ n; // error
	s ~= n; // ok!

This is the same for dchar instead of int.

This is also a problem for overloading:

	alias foo = (char c) => 1;
	alias foo = (int i) => 4;

	static assert(foo(7) == 4); // fails

I would like to see some small changes to mitigate things like this, even if we can't agree to prevent the conversion overall. It would be nice if we filed a bug as a rallying point for ideas on this issue.
Apr 20
next sibling parent reply "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Thu, Apr 20, 2017 at 11:05:00AM +0000, Nick Treleaven via
Digitalmars-d-learn wrote:
[...]
 This is also a problem for overloading:
 
 alias foo = (char c) => 1;
 alias foo = (int i) => 4;
 
 static assert(foo(7) == 4); // fails
 
 I would like to see some small changes to mitigate against things like
 this, even if we can't agree to prevent the conversion overall. It
 would be nice if we filed a bug as a rallying point for ideas on this
 issue.
+1. Recently I was bitten by the char/int overload ambiguity problem, and it was not pretty. Well, actually, it was worse in my case because I had overloaded char against ubyte. Basically the only way I can disambiguate is to wrap one of them in a struct to force the compiler to treat them differently.

Another pernicious thing I encountered recently, related to implicit conversions, is this:

	https://issues.dlang.org/show_bug.cgi?id=17336

It drew a very enunciated "WAT?!" from me.

T

--
Gone Chopin. Bach in a minuet.
Apr 20
parent reply Stanislav Blinov <stanislav.blinov gmail.com> writes:
On Thursday, 20 April 2017 at 19:20:28 UTC, H. S. Teoh wrote:


 Another pernicious thing I encountered recently, related to 
 implicit conversions, is this:

 	https://issues.dlang.org/show_bug.cgi?id=17336

 It drew a very enunciated "WAT?!" from me.
Yeah, that one is annoying. I've dealt with this before with:

	alias OpResult(string op, A, B) = typeof((){
		A* a;
		B* b;
		return mixin("*a"~op~"*b");
	}());
Apr 20
parent "H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:
On Thu, Apr 20, 2017 at 10:39:01PM +0000, Stanislav Blinov via
Digitalmars-d-learn wrote:
 On Thursday, 20 April 2017 at 19:20:28 UTC, H. S. Teoh wrote:
 
 
 Another pernicious thing I encountered recently, related to implicit
 conversions, is this:
 
 	https://issues.dlang.org/show_bug.cgi?id=17336
 
 It drew a very enunciated "WAT?!" from me.
 Yeah, that one is annoying. I've dealt with this before with:
 
 	alias OpResult(string op, A, B) = typeof((){
 		A* a;
 		B* b;
 		return mixin("*a"~op~"*b");
 	}());
Yeah, wow, that is very annoying.

T

--
Debian GNU/Linux: Cray on your desktop.
Apr 20
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Thursday, 20 April 2017 at 11:05:00 UTC, Nick Treleaven wrote:
 On Wednesday, 19 April 2017 at 14:50:38 UTC, Stanislav Blinov 
 wrote:
 Because integrals implicitly convert to characters of same 
 width (byte -> char, short -> wchar, int -> dchar).
 Note that char.min > byte.min yet char.max > byte.max, so neither type's range contains the other's.
The above mismatches would presumably be handled if we disallow implicit conversion between signed/unsigned (https://issues.dlang.org/show_bug.cgi?id=12919).
 I would like to see some small changes to mitigate against 
 things like this, even if we can't agree to prevent the 
 conversion overall.
It seems there's a strong case for preventing the int -> dchar conversion. wchar.max == ushort.max, char.max == ubyte.max, but dchar.max < uint.max. Converting from char types -> integers, whilst (arguably) bug prone, at least is numerically sound. So perhaps we could disallow uint -> dchar, which is unsound.
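The asymmetry is easy to check: dchar covers only the Unicode code-point range, so a general uint does not fit. A sketch of the current (unsound) behaviour:

```d
void main()
{
    static assert(dchar.max == 0x10FFFF);   // last Unicode code point
    static assert(uint.max  == 0xFFFFFFFF); // far larger

    uint  u = 0x110000; // one past dchar.max
    dchar d = u;        // currently accepted, yields an invalid dchar
    assert(d > dchar.max);
}
```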
Apr 21
parent Nick Treleaven <nick geany.org> writes:
On Friday, 21 April 2017 at 08:36:12 UTC, Nick Treleaven wrote:
 Converting from char types -> integers, whilst (arguably) bug 
 prone, at least is numerically sound. So perhaps we could 
 disallow uint -> dchar, which is unsound.
That is in the general case - when VRP can prove an integer fits within a dchar then, by the logic of this proposal, the conversion is still allowed.
Apr 21
prev sibling parent XavierAP <n3minis-git yahoo.es> writes:
On Wednesday, 19 April 2017 at 14:50:38 UTC, Stanislav Blinov 
wrote:
 On Wednesday, 19 April 2017 at 14:36:13 UTC, Nick Treleaven
 Why is it legal to append an integer?
Because integrals implicitly convert to characters of same width (byte -> char, short -> wchar, int -> dchar).
Huh... I hadn't used it, but I'd been assuming, probably biased from C#, that

	str ~ i

would be equivalent to

	str ~ to!string(i)

instead of

	str ~ cast(char) i

Now I see the problem too...
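The C#-style behaviour has to be spelled out explicitly in D, e.g. with std.conv.to. A minimal sketch:

```d
import std.conv : to;

void main()
{
    string s;
    int i = -1;

    s ~= i.to!string; // formats the number: s is now "-1"
    assert(s == "-1");

    // s ~= i;        // would instead append i as a (here invalid) code point
}
```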
Apr 21