digitalmars.D.dwt - Strings in DWT
- Frank Benoit <keinfarbton googlemail.com> Feb 20 2008
- Bill Baxter <dnewsgroup billbaxter.com> Feb 20 2008
- Bjoern <nanali nospam-wanadoo.fr> Feb 20 2008
- torhu <no spam.invalid> Feb 20 2008
- Anders F Björklund <afb algonet.se> Feb 21 2008
- Frits van Bommel <fvbommel REMwOVExCAPSs.nl> Feb 21 2008
- DBloke <DBloke nowhere.org> Feb 23 2008
- Frits van Bommel <fvbommel REMwOVExCAPSs.nl> Feb 25 2008
- Bill Baxter <dnewsgroup billbaxter.com> Feb 25 2008
- Frits van Bommel <fvbommel REMwOVExCAPSs.nl> Mar 01 2008
- DBloke <DBloke nowhere.org> Feb 25 2008
Strings in Java and D are different.
- null vs. zero length: these are completely different things in Java
- Java works in UTF-16; char[] is UTF-8
In Phobos there is an alias from char[] to "string". Would it make sense to use that in DWT? With the helper functions in dwt.dwthelper.utils, char[] can be used like the Java String in many cases. An alias to String would remove many replacements, and this would make merging/diffing easier.
1.) "char[]"
2.) alias char[] string;
3.) alias char[] String;
What do you think? What should be used in DWT?
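The two differences listed above can be made concrete with a small sketch. It is written for a current D2 compiler with Phobos (an assumption; the thread predates it), uses the D2 spelling of option 3, and the helper name javaLength is hypothetical, not part of DWT.

```d
import std.stdio : writeln;
import std.utf : toUTF16;

// Hypothetical alias mirroring option 3 from the post (D2 spelling).
alias String = string;

// Number of UTF-16 code units, i.e. what Java's String.length() would report.
size_t javaLength(String s)
{
    return s.toUTF16.length;
}

void main()
{
    String nothing = null;
    String empty = "";

    // == compares contents, so a null string and "" compare equal in D.
    assert(nothing == empty);
    // Identity still tells them apart: only one has a null pointer.
    assert(nothing is null);
    assert(empty !is null);
    // Asking a null array for its length is safe and yields 0.
    assert(nothing.length == 0);

    String s = "Grüße";
    assert(s.length == 7);      // UTF-8 code units: ü and ß take two each
    assert(javaLength(s) == 5); // UTF-16 code units
    writeln("UTF-8 units: ", s.length, ", UTF-16 units: ", javaLength(s));
}
```

Ported code that checks for null with == or indexes by character position is where these differences would most likely show up.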
Feb 20 2008
Frank Benoit wrote:
Strings in Java and D are different.
- null vs zero length is in Java completely different
- Java works in utf16, char[] is utf8
In Phobos there is an alias char[] to "string". Would that make sense to be used in DWT? With the helper functions in dwt.dwthelper.utils, the char[] can be used like the Java String in many cases. An alias to String would remove many replacements. And this would make merging/diffing easier.
1.) "char[]"
2.) alias char[] string;
3.) alias char[] String;
What do you think? What should be used in DWT?
Better wait to see what Tango does first. Some folks like myself believe that the alias should be added in Tango itself to facilitate compatibility with Phobos: http://www.dsource.org/projects/tango/ticket/548 --bb
Feb 20 2008
Bill Baxter wrote:
Better wait to see what Tango does first. Some folks like myself believe that the alias should be added in Tango itself to facilitate compatibility with Phobos: http://www.dsource.org/projects/tango/ticket/548 --bb
Tango, ergo: no need to be in a hurry. From a Java-to-Dx point of view, wchar[] looks good, but in fact it is not. Given D2 strings, which means Tango for D2, #3 is most probably a good candidate. My two cents.
Feb 20 2008
Frank Benoit wrote:
Strings in Java and D are different.
- null vs zero length is in Java completely different
- Java works in utf16, char[] is utf8
In Phobos there is an alias char[] to "string". Would that make sense to be used in DWT? With the helper functions in dwt.dwthelper.utils, the char[] can be used like the Java String in many cases. An alias to String would remove many replacements. And this would make merging/diffing easier.
1.) "char[]"
2.) alias char[] string;
3.) alias char[] String;
What do you think? What should be used in DWT?
If it helps with porting and updating DWT, 'String' could be added for internal use. Or maybe 'jstring', to give it a unique name? 'string' is more problematic, since it would conflict if Tango adds it too, leading to some user confusion until it can be fixed. This can of course be avoided by using the trick[1] that Thomas Kuehne posted last summer. But if it's called 'string', it will in reality become part of DWT's public API, and be hard to remove later. So I think it's good to stay away from that exact name unless Tango adds it.

[1]
    static if(!is(string))
    {
        static if(!is(typeof((new Object()).toString()) string))
        {
            alias char[] string;
        }
    }
Feb 20 2008
torhu wrote:
If it helps with porting and updating DWT, 'String' could be added for internal use. Or maybe 'jstring', to give a unique name? 'string' is more problematic, since it would conflict if tango adds it too, leading to some user confusion until it can be fixed. This can of course be avoided by using the trick[1] that Thomas Kuehne posted last summer.
And "string" is predefined as invariant in D2, which further complicates the matter and makes it a good idea to avoid redeclaring the "string" type. ("String" would normally be a class, with the upper-case letter in it?)

wxD had a "string" type long before D did, so it currently uses some versioning in order to declare the type only when the stdlib doesn't (the fully qualified name of the old alias type being wx.common.string):

    version (Tango)
    {
        const int version_major = 1;
        const int version_minor = 0;
    }
    else // Phobos
    {
        public import std.compiler; // version
    }

    static if (version_major < 1 ||
               (version_major == 1 && version_minor < 16))
        alias char[] string; // added in DMD 1.016 and DMD 2.000

I'd go with either of the now built-into-Phobos types "string" (char[]) and "wstring" (wchar[]), or use some adaptation of the Java String class, depending on whether you want it to be an array or a class. Not sure?
--anders
Feb 21 2008
Anders F Björklund wrote:
And "string" is predefined as invariant in D2, which further complicates the matter and makes it a good idea to avoid redeclaring "string" type. ("String" would be a class normally, with the upper case letter in it ?)
Though since they're porting from Java, and a Java String[1] is immutable, an invariant string is actually the correct translation... (And with array pseudo-methods, AFAICT the only aspect of Java String use that can't be emulated exactly is the use of '+' for concatenation instead of '~'. It's been a while since I used Java though, so I may be overlooking something) [1] Mutable strings are implemented by StringBuffer, IIRC.
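A minimal sketch of that point, assuming a current D2 compiler with Phobos: string is immutable like a Java String, free functions can be called with method syntax, and '~' stands in for Java's '+'. The substring helper is hypothetical, written in the spirit of dwt.dwthelper.utils rather than copied from it.

```d
import std.algorithm.searching : startsWith;
import std.string : indexOf;

// Hypothetical Java-style helper; callable as s.substring(a, b).
string substring(string s, size_t begin, size_t end)
{
    return s[begin .. end];
}

void main()
{
    string greeting = "Hello";
    // greeting[0] = 'J'; // would not compile: the characters are immutable

    string full = greeting ~ ", DWT"; // '~' where Java would use '+'
    assert(full == "Hello, DWT");

    // Free functions used as pseudo-methods on the array.
    assert(full.substring(7, 10) == "DWT");
    assert(full.indexOf("DWT") == 7);
    assert(full.startsWith("Hello"));

    // A mutable copy plays roughly the role of Java's StringBuffer.
    char[] buf = full.dup;
    buf[0] = 'J';
    assert(buf == "Jello, DWT");
    assert(full == "Hello, DWT"); // the original is untouched
}
```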
Feb 21 2008
Frits van Bommel wrote:
Though since they're porting from Java, and a Java String[1] is immutable, an invariant string is actually the correct translation... (And with array pseudo-methods, AFAICT the only aspect of Java String use that can't be emulated exactly is the use of '+' for concatenation instead of '~'.

True, I still don't get why it is believed using + as a concatenation operator is so confusing for a string? Coming from a C/C++ and Java background, I personally find ~ confusing; it somehow reminds me of the two's complement operator.

Frits van Bommel wrote:
It's been a while since I used Java though, so I may be overlooking something)
[1] Mutable strings are implemented by StringBuffer, IIRC.

multiple threads need access to the StringBuilder object(s) :)
Feb 23 2008
DBloke wrote:
True, I still don't get why it is believed using + as a concatenation operator is so confusing for a string?

It's a completely different operation that has different semantics. Addition is commutative while concatenation isn't; i.e. a + b == b + a for any numeric type, but a ~ b is not generally interchangeable with b ~ a. This comes into play in D's operator overloading as well: if the compiler can't compile a+b as a.opAdd(b) or b.opAdd_r(a), it then tries a.opAdd_r(b) and b.opAdd(a) (i.e. it tries to compile it as b+a instead). This transformation wouldn't be correct for concatenation, so that has a different operator that isn't commutative and will therefore skip the second step. (See the section "Binary Operator Overloading" on <http://www.digitalmars.com/d/1.0/operatoroverloading.html>)

DBloke wrote:
coming from a C/C++ and Java background I personally find ~ confusing it somehow reminds me of the two's complement operator.

Does template instantiation also remind you of negation? Does multiplication remind you of pointer dereferencing? Does bitwise-and remind you of taking an address? I don't see the problem with unary and binary use of an operator doing different things, as long as the unary meaning doesn't make any sense for binary use and the binary meaning doesn't make any sense for unary use. Though I must admit that the unary use of 'is' had me a bit confused for a while when I first tried to use it, but I don't think that was because of the binary use...
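The commutativity argument can be checked directly. The sketch below assumes a current D2 compiler, where user-defined operators are spelled opBinary and opBinaryRight rather than the D1 names opAdd and opAdd_r quoted above; the Path type is hypothetical and exists only for illustration.

```d
// Hypothetical type that gives '~' an order-sensitive meaning.
struct Path
{
    string value;

    // Handles Path ~ string.
    Path opBinary(string op : "~")(string rhs) const
    {
        return Path(value ~ "/" ~ rhs);
    }

    // Handles string ~ Path.
    Path opBinaryRight(string op : "~")(string lhs) const
    {
        return Path(lhs ~ "/" ~ value);
    }
}

void main()
{
    // Addition: operand order does not matter.
    int a = 2, b = 3;
    assert(a + b == b + a);

    // Concatenation: operand order matters.
    string x = "ab", y = "cd";
    assert(x ~ y == "abcd");
    assert(y ~ x == "cdab");
    assert(x ~ y != y ~ x);

    // The same holds for the user-defined '~'.
    auto p = Path("usr");
    assert((p ~ "lib").value == "usr/lib");
    assert(("lib" ~ p).value == "lib/usr");
}
```

Swapping the operands of '~' silently, the way the D1 fallback does for '+', would turn "usr/lib" into "lib/usr", which is the reason given above for keeping the two operators separate.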
Feb 25 2008
Frits van Bommel wrote:
Though I must admit that the unary use of 'is' had me a bit confused for a while when I first tried to use it, but I don't think that was because of the binary use...
There's a unary 'is' operator? --bb
Feb 25 2008
Frits van Bommel wrote:
Though I must admit that the unary use of 'is' had me a bit confused for a while when I first tried to use it, but I don't think that was because of the binary use...
Bill Baxter wrote:
There's a unary 'is' operator?
Well, there's an 'is()' :) http://www.digitalmars.com/d/1.0/expression.html#IsExpression
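For readers who have not met it, here is a short sketch of the two uses of 'is', assuming a current D2 compiler. The describe function and the name NoSuchType are made up for illustration.

```d
// Hypothetical helper: classifies a type at compile time with is().
string describe(T)()
{
    static if (is(T == string))
        return "exactly string";
    else static if (is(T : const(char)[]))
        return "converts to const(char)[]";
    else
        return "something else";
}

void main()
{
    // is(Type) asks whether Type is a semantically valid type.
    static assert(is(int));
    // An undefined name simply makes is() false instead of raising an
    // error, which is what makes a guard like !is(string) possible.
    static assert(!is(NoSuchType));
    // In D2, string is an alias for immutable(char)[].
    static assert(is(string == immutable(char)[]));

    assert(describe!string() == "exactly string");
    assert(describe!(char[])() == "converts to const(char)[]");
    assert(describe!int() == "something else");

    // The binary 'is' operator is an identity comparison.
    int n;
    int* p = &n, q = &n;
    assert(p is q);
}
```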
Mar 01 2008
Frits van Bommel wrote:
It's a completely different operation that has different semantics. Addition is commutative while concatenation isn't; i.e. a + b == b + a for any numeric type, but a ~ b is not generally interchangeable with b ~ a. This comes into play in D's operator overloading as well: if the compiler can't compile a+b as a.opAdd(b) or b.opAdd_r(a) it then tries a.opAdd_r(b) and b.opAdd(a) (i.e. it tries to compile as b+a instead). This transformation wouldn't be correct for concatenations, so that has a different operator that isn't commutative and will therefore skip the second step. (see the section "Binary Operator Overloading" on <http://www.digitalmars.com/d/1.0/operatoroverloading.html>)

This makes sense, but not if you are used to Java or C/C++. It's not a huge problem, I admit; it just takes a bit of getting used to. :)
Frits van Bommel wrote:
Does template instantiation also remind you of negation?
Does multiplication remind you of pointer dereferencing?

and for the future.

Frits van Bommel wrote:
Does bitwise-and remind you of taking an address?
I don't see the problem with unary and binary use of an operator doing different things as long as the unary meaning doesn't make any sense for binary use and the binary meaning doesn't make any sense for unary use.
True, but the context and formatting of the code should give more than a clue, unless you are one of the few people who believe that removing any form of unnecessary white space will somehow make compilation faster ;) No, I kid you not: some people I have encountered (usually briefly) believe this of C/C++ code. Better still, we had some guy who believed that removing every other binary digit in assembly language would make the code smaller and lighter for portability ;)
Feb 25 2008








