digitalmars.D.learn - Human stupidity or is this a regression?

Lionello Lunesu <lionello lunesu.remove.com> writes:
Perhaps I should have written "and/or" in the subject line, since the
two are not mutually exclusive.

I was showing off D to friends the other day:

import std.stdio;
void main()
{
   foreach (d; "你好")
     writeln(d);
}


IIRC, this used to work fine, with the variable "d" getting deduced as 
"dchar" and correctly reassembling the UTF-8 bytes into Unicode codepoints.

But when I run this code on OS X with dmd v2.064, I get this:

$ dmd -run uni.d
�
�
�
�
�
�

It's clearly printing the bytes. When I print the typeof(d) I get 
"immutable(char)", so that confirms the type is not deduced as "dchar".

I could have sworn this used to work. Is my memory failing me, or was 
this a deliberate change at some point? Perhaps a regression?

L.
Dec 25 2013
"bearophile" <bearophileHUGS lycos.com> writes:
Lionello Lunesu:

> I could have sworn this used to work. Is my memory failing me,
> or was this a deliberate change at some point? Perhaps a
> regression?

It's not a regression, it's a locked-in design mistake. Write it like
this and try again:

foreach (dchar d; "你好")

Bye,
bearophile
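
A minimal sketch of the difference, for anyone following along. The helper names countInferred and countDecoded are made up for this illustration. With an inferred loop variable, foreach over a string walks the UTF-8 code units; with an explicit dchar, it decodes code points.

```d
import std.stdio;

// Hypothetical helper: counts iterations when the element type is inferred.
// For a string (immutable(char)[]) the inferred type is immutable(char),
// so this walks UTF-8 code units, not characters.
size_t countInferred(string s)
{
    size_t n;
    foreach (c; s)
    {
        static assert(is(typeof(c) == immutable(char)));
        ++n;
    }
    return n;
}

// Hypothetical helper: counts iterations when the loop variable is
// explicitly dchar, which makes foreach decode UTF-8 into code points.
size_t countDecoded(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    writeln(countInferred("你好")); // 6: each character is 3 UTF-8 code units
    writeln(countDecoded("你好"));  // 2: one iteration per code point

    foreach (dchar d; "你好")
        writeln(d);                 // prints each character on its own line
}
```

The six iterations of the inferred version match the six garbage lines in the original output.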
Dec 25 2013
Lionello Lunesu <lionello lunesu.remove.com> writes:
On 12/26/13, 11:58, bearophile wrote:
> It's not a regression, it's a locked-in design mistake. Write it like
> this and try again:
>
> foreach (dchar d; "你好")

Yeah, that's what I ended up doing. But D being D, the default should
be safe and correct.

I feel we could take this breaking change, since it would not silently
change the code to do something else. You'll get prompted, and we could
special-case the error message to give a meaningful hint.

L.
Dec 25 2013
"bearophile" <bearophileHUGS lycos.com> writes:
Lionello Lunesu:

> Yeah, that's what I ended up doing. But D being D, the default
> should be safe and correct.
>
> I feel we could take this breaking change since it would not
> silently change the code to do something else.

You have to explain such things in the main D newsgroup. The D.learn
newsgroup is not fit for such requests.

Bye,
bearophile
Dec 26 2013
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 12/26/13, bearophile <bearophileHUGS lycos.com> wrote:
> You have to explain such things in the main D newsgroup. D.learn
> newsgroup is not fit for such requests.

There have already been a million of these threads. It's worth doing a
search, as there are probably lots of answers there.
Dec 26 2013
"Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Thursday, 26 December 2013 at 05:39:26 UTC, Lionello Lunesu wrote:
> Yeah, that's what I ended up doing. But D being D, the default
> should be safe and correct.

It is impossible for it to be "correct", except under a very specific
definition of "correct" that makes sense for some languages/locales and
not others. As a challenge, try to define a "foreach" semantic that
works "correctly" with the OP's code for Unicode composite characters,
or Hebrew.
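
To make the composite-character point concrete, here is a small sketch. The helper name codePoints is hypothetical, and the accented "e" is just one convenient example. Even with an explicit dchar, foreach yields code points, so a base letter followed by a combining mark comes out as two separate items.

```d
import std.stdio;

// Hypothetical helper: counts code points by decoding with foreach/dchar.
size_t codePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string precomposed = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
    string combining = "e\u0301";  // 'e' followed by COMBINING ACUTE ACCENT

    // Both render as the same visible character, yet decode differently.
    writeln(codePoints(precomposed)); // 1
    writeln(codePoints(combining));   // 2
}
```

So a dchar default would fix the CJK example but still split what a reader sees as one character in other scripts.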
Dec 26 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Dec 26, 2013 at 09:38:02PM +0000, Vladimir Panteleev wrote:
> It is impossible for it to be "correct", except under a very specific
> definition of "correct" that makes sense for some languages/locales
> and not others. As a challenge, try to define a "foreach" semantic
> that works "correctly" with the OP's code for Unicode composite
> characters, or Hebrew.

To be truly "correct" in the intuitive sense, use std.uni.byGrapheme.
(Yes, it's slow, but that's the price you pay for intuitive
correctness.)

T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does
that mean that Windows is normally unsafe?
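
A rough sketch of the byGrapheme approach, assuming a Phobos version that ships std.uni.byGrapheme. The helper name graphemes is made up for this example. It counts user-perceived characters (grapheme clusters) instead of code units or code points.

```d
import std.range : walkLength;
import std.stdio;
import std.uni : byGrapheme;

// Hypothetical helper: counts grapheme clusters in a string.
size_t graphemes(string s)
{
    // byGrapheme yields one Grapheme per cluster; walkLength counts them.
    return s.byGrapheme.walkLength;
}

void main()
{
    writeln(graphemes("你好"));    // 2
    writeln(graphemes("e\u0301")); // 1: base letter plus combining accent
}
```

The cost is an extra segmentation pass over the decoded text, which is the slowness mentioned above.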
Dec 26 2013