digitalmars.D.dwt - Strings in DWT

reply Frank Benoit <keinfarbton googlemail.com> writes:
Strings in Java and D are different.
- In Java, null and a zero-length string are completely different things
- Java works in UTF-16; D's char[] is UTF-8

In Phobos, "string" is an alias for char[]. Would it make sense to use 
that in DWT?

With the helper functions in dwt.dwthelper.utils, char[] can be used 
like the Java String in many cases. An alias named String would remove 
many of the replacements, which would make merging/diffing easier.

1.) "char[]"
2.) alias char[] string;
3.) alias char[] String;

What do you think? What should be used in DWT?
Feb 20 2008
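Editor's sketch of the two differences listed above, written for a current D2 compiler with Phobos (the thread itself targets D1 and Tango, so treat this as an illustration, not as DWT code). The helper name utf16Length is hypothetical; std.utf.toUTF16 is the Phobos conversion function.

```d
import std.utf : toUTF16;

// Hypothetical helper: the length Java's String.length() would report,
// i.e. the number of UTF-16 code units.
size_t utf16Length(string s)
{
    return toUTF16(s).length;
}

void main()
{
    string empty = "";      // zero length, but a non-null pointer
    string nothing = null;  // null array

    // Content comparison cannot tell them apart: both have length 0.
    assert(empty == nothing);
    assert(nothing.length == 0);

    // Identity comparison can.
    assert(nothing is null);
    assert(empty !is null);

    // .length on a D string counts UTF-8 code units.
    string s = "gr\u00F6\u00DFe";   // five characters, two of them non-ASCII
    assert(s.length == 7);          // UTF-8 code units
    assert(utf16Length(s) == 5);    // UTF-16 code units
}
```

In Java, calling a method on a null String throws, while in D a null array simply behaves as an empty one for most operations, which is why a port has to decide case by case which meaning the Java code relied on.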
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Frank Benoit wrote:
 
 Strings in Java and D are different.
 - null vs zero length is in Java completely different
 - Java works in utf16, char[] is utf8
 
 In Phobos there is an alias char[] to "string". Would that make sense to 
 be used in DWT?
 
 With the helper functions in dwt.dwthelper.utils, the char[] can be used 
 like the Java String in many cases. An alias to String would remove many 
 replacements. And this would make merging/diffing easier.
 
 1.) "char[]"
 2.) alias char[] string;
 3.) alias char[] String;
 
 What do you think? What should be used in DWT?
Better wait to see what Tango does first. Some folks like myself believe that the alias should be added in Tango itself to facilitate compatibility with Phobos:
http://www.dsource.org/projects/tango/ticket/548

--bb
Feb 20 2008
parent Bjoern <nanali nospam-wanadoo.fr> writes:
Bill Baxter schrieb:
 Frank Benoit wrote:
 Strings in Java and D are different.
 - null vs zero length is in Java completely different
 - Java works in utf16, char[] is utf8

 In Phobos there is an alias char[] to "string". Would that make sense 
 to be used in DWT?

 With the helper functions in dwt.dwthelper.utils, the char[] can be 
 used like the Java String in many cases. An alias to String would 
 remove many replacements. And this would make merging/diffing easier.

 1.) "char[]"
 2.) alias char[] string;
 3.) alias char[] String;

 What do you think? What should be used in DWT?
 Better wait to see what Tango does first. Some folks like myself believe that the alias should be added in Tango itself to facilitate compatibility with Phobos:
 http://www.dsource.org/projects/tango/ticket/548
 --bb
And a horse is a horse, of course. ++Bill. DWT depends on Tango, ergo: no need to be in a hurry. From a Java-to-Dx point of view, wchar[] looks like a good candidate, but in fact it is not. My two cents.
Feb 20 2008
prev sibling parent reply torhu <no spam.invalid> writes:
Frank Benoit wrote:
 Strings in Java and D are different.
 - null vs zero length is in Java completely different
 - Java works in utf16, char[] is utf8
 
 In Phobos there is an alias char[] to "string". Would that make sense to 
 be used in DWT?
 
 With the helper functions in dwt.dwthelper.utils, the char[] can be used 
 like the Java String in many cases. An alias to String would remove many 
 replacements. And this would make merging/diffing easier.
 
 1.) "char[]"
 2.) alias char[] string;
 3.) alias char[] String;
 
 What do you think? What should be used in DWT?
If it helps with porting and updating DWT, 'String' could be added for internal use. Or maybe 'jstring', to give a unique name?

'string' is more problematic, since it would conflict if Tango adds it too, leading to some user confusion until it can be fixed. This can of course be avoided by using the trick[1] that Thomas Kuehne posted last summer. But if it's called 'string', it'll in reality become a part of DWT's public API, and hard to remove later. So I think it's good to stay away from that exact name unless Tango adds it.

[1]
    static if(!is(string)) {
        static if(!is(typeof((new Object()).toString()) string)) {
            alias char[] string;
        }
    }
Feb 20 2008
parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
torhu wrote:

 In Phobos there is an alias char[] to "string". Would that make
 sense to be used in DWT?

 With the helper functions in dwt.dwthelper.utils, the char[] can be 
 used like the Java String in many cases. An alias to String would 
 remove many replacements. And this would make merging/diffing easier.

 1.) "char[]"
 2.) alias char[] string;
 3.) alias char[] String;

 What do you think? What should be used in DWT?
 If it helps with porting and updating DWT, 'String' could be added for internal use. Or maybe 'jstring', to give a unique name? 'string' is more problematic, since it would conflict if tango adds it too, leading to some user confusion until it can be fixed. This can of course be avoided by using the trick[1] that Thomas Kuehne posted last summer.
And "string" is predefined as invariant in D2, which further complicates the matter and makes it a good idea to avoid redeclaring the "string" type. ("String" would be a class normally, with the upper case letter in it?)

wxD had a "string" type long before D did, so it's currently using some versioning in order to only declare the type when the stdlib doesn't... (the fully qualified name of the old alias type being wx.common.string)

    version (Tango)
    {
        const int version_major = 1;
        const int version_minor = 0;
    }
    else // Phobos
    {
        public import std.compiler; // version
    }

    static if (version_major < 1 ||
              (version_major == 1 && version_minor < 16))
        alias char[] string; // added in DMD 1.016 and DMD 2.000

I'd go with either of the now built-into-Phobos types "string" (char[]) and "wstring" (wchar[]), or use some adaptation of the Java String class, depending on whether you want it to be an array or a class. Not sure?

--anders
Feb 21 2008
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Anders F Björklund wrote:
 And "string" is predefined as invariant in D2, which further complicates
 the matter and makes it a good idea to avoid redeclaring "string" type.
 ("String" would be a class normally, with the upper case letter in it ?)
Though since they're porting from Java, and a Java String[1] is immutable, an invariant string is actually the correct translation...

(And with array pseudo-methods, AFAICT the only aspect of Java String use that can't be emulated exactly is the use of '+' for concatenation instead of '~'. It's been a while since I used Java though, so I may be overlooking something.)

[1] Mutable strings are implemented by StringBuffer, IIRC.
Feb 21 2008
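Editor's sketch of the "array pseudo-methods" idea mentioned above, assuming a current D2 compiler where string is immutable(char)[]. The helpers charAt and substring are hypothetical names chosen to mirror the Java methods; they are not taken from dwt.dwthelper.utils, and they index UTF-8 code units, so they only match Java's behaviour for ASCII text.

```d
// Hypothetical free functions; D lets them be called with method syntax
// on an array, which is what makes char[] feel like a Java String.
char charAt(string s, size_t index)
{
    return s[index];
}

string substring(string s, size_t begin, size_t end)
{
    return s[begin .. end]; // a slice of the same immutable data, no copy
}

void main()
{
    string greeting = "Hello";
    assert(greeting.charAt(1) == 'e');         // pseudo-method call
    assert(greeting.substring(1, 3) == "el");

    // Elements of an immutable string cannot be assigned to,
    // just as a Java String cannot be modified in place.
    static assert(!__traits(compiles, { string t = "x"; t[0] = 'y'; }));

    string both = greeting ~ ", DWT";          // '~' where Java uses '+'
    assert(both == "Hello, DWT");
}
```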
parent reply DBloke <DBloke nowhere.org> writes:
 Though since they're porting from Java, and a Java String[1] is 
 immutable, 
Correct :)
 an invariant string is actually the correct translation...
 (And with array pseudo-methods, AFAICT the only aspect of Java String 
 use that can't be emulated exactly is the use of '+' for concatenation 
 instead of '~'. 
True. I still don't get why using + as a concatenation operator is believed to be so confusing for a string. Coming from a C/C++ and Java background, I personally find ~ confusing; it somehow reminds me of the two's complement operator.
 It's been a while since I used Java though, so I may be
 overlooking something)
Nope, your memory serves you correctly :)
 
 [1] Mutable strings are implemented by StringBuffer, IIRC.
Yes and no: StringBuilder is now favoured over StringBuffer unless multiple threads need access to the same object(s) :)
Feb 23 2008
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
DBloke wrote:
 
 True, I still don't get why it is believed using + as a concatenation 
 operator is so confusing for a string?
It's a completely different operation that has different semantics. Addition is commutative while concatenation isn't; i.e. a + b == b + a for any numeric type, but a ~ b is not generally interchangeable with b ~ a.

This comes into play in D's operator overloading as well: if the compiler can't compile a+b as a.opAdd(b) or b.opAdd_r(a) it then tries a.opAdd_r(b) and b.opAdd(a) (i.e. it tries to compile as b+a instead). This transformation wouldn't be correct for concatenations, so that has a different operator that isn't commutative and will therefore skip the second step.

(see the section "Binary Operator Overloading" on <http://www.digitalmars.com/d/1.0/operatoroverloading.html>)
 coming from a C/C++ and Java 
 background I personally find ~ confusing it somehow reminds me of the 
 two's complement operator.
Does template instantiation also remind you of negation? Does multiplication remind you of pointer dereferencing? Does bitwise-and remind you of taking an address?

I don't see the problem with unary and binary use of an operator doing different things as long as the unary meaning doesn't make any sense for binary use and the binary meaning doesn't make any sense for unary use.

Though I must admit that the unary use of 'is' had me a bit confused for a while when I first tried to use it, but I don't think that was because of the binary use...
Feb 25 2008
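Editor's sketch of the commutativity argument above. The post describes D1's opAdd/opAdd_r lookup; the code below instead assumes a current D2 compiler, where user-defined operators are written as opBinary templates. Path is a hypothetical type invented for the example.

```d
// Hypothetical type that supports '~' but deliberately not '+'.
struct Path
{
    string[] parts;

    Path opBinary(string op)(Path rhs) if (op == "~")
    {
        return Path(parts ~ rhs.parts); // order of operands is preserved
    }
}

void main()
{
    // Addition: swapping the operands changes nothing.
    int a = 2, b = 3;
    assert(a + b == b + a);

    // Concatenation: swapping the operands changes the result.
    string x = "foo", y = "bar";
    assert(x ~ y == "foobar");
    assert(y ~ x == "barfoo");

    auto home = Path(["home"]);
    auto user = Path(["user"]);
    assert((home ~ user).parts == ["home", "user"]);
    assert((user ~ home).parts == ["user", "home"]);

    // No '+' overload exists, so this expression is rejected at compile time.
    static assert(!__traits(compiles, home + user));
}
```

Keeping concatenation on its own operator means a compiler is never entitled to reorder its operands, whatever rewriting rules apply to '+'.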
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Frits van Bommel wrote:

 Though I must admit that the unary use of 'is' had me a bit confused for 
 a while when I first tried to use it, but I don't think that was because 
 of the binary use...
There's a unary 'is' operator? --bb
Feb 25 2008
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Bill Baxter wrote:
 Frits van Bommel wrote:
 
 Though I must admit that the unary use of 'is' had me a bit confused 
 for a while when I first tried to use it, but I don't think that was 
 because of the binary use...
 There's a unary 'is' operator?
Well, there's an 'is()' :)
http://www.digitalmars.com/d/1.0/expression.html#IsExpression
Mar 01 2008
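Editor's sketch contrasting the two uses of 'is' discussed above, assuming a current D2 compiler. The names Widget, isCharArray and NoSuchType are hypothetical and exist only for this example.

```d
class Widget {}

// is() as a compile-time type test: true when T converts to const(char)[].
enum isCharArray(T) = is(T : const(char)[]);

void main()
{
    // The is() expression asks questions about types at compile time.
    static assert(is(string == immutable(char)[]));
    static assert(isCharArray!string);
    static assert(isCharArray!(char[]));
    static assert(!isCharArray!(wchar[]));

    // An undefined name inside is() yields false instead of an error,
    // which is what the "static if(!is(string))" trick relies on.
    static assert(!is(NoSuchType));

    // The binary 'is' operator compares identity at run time.
    auto a = new Widget;
    auto b = a;
    auto c = new Widget;
    assert(a is b);
    assert(a !is c);
}
```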
prev sibling parent DBloke <DBloke nowhere.org> writes:
 It's a completely different operation that has different semantics. 
 Addition is commutative while concatenation isn't; i.e. a + b == b + a 
 for any numeric type, but a ~ b is not generally interchangeable with b 
 ~ a.
 This comes into play in D's operator overloading as well: if the 
 compiler can't compile a+b as a.opAdd(b) or b.opAdd_r(a) it then tries 
 a.opAdd_r(b) and b.opAdd(a) (i.e. it tries to compile as b+a instead). 
 This transformation wouldn't be correct for concatenations, so that has 
 a different operator that isn't commutative and will therefore skip the 
 second step.
 (see the section "Binary Operator Overloading" on 
 <http://www.digitalmars.com/d/1.0/operatoroverloading.html>)
This makes sense, though less so if you are used to Java or C/C++. It's not a huge problem, I admit; it just takes a bit of getting used to. :)
 
 Does template instantiation also remind you of negation? 
Yep
 Does multiplication remind you of pointer dereferencing?
Nope ;) I always format my code to be readable and understandable now and for the future.
 Does bitwise-and remind you of taking an address?
Nope, as above :)
 
 I don't see the problem with unary and binary use of an operator doing 
 different things as long as the unary meaning doesn't make any sense for 
 binary use and the binary meaning doesn't make any for unary use.
True, but the context and formatting of the code should give more than a clue, unless you are one of the few people who believe that removing every bit of unnecessary white space will somehow make compilation faster ;) No, I kid you not: some people I have encountered (usually briefly) believe this of C/C++ code. Better still, we had some guy who believed that removing every other binary digit in assembly language would make the code smaller and lighter for portability ;)
Feb 25 2008