digitalmars.D - .init property for char[] type
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- Jeremie Pelletier <jeremiep gmail.com> Sep 22 2009
- Jarrett Billingsley <jarrett.billingsley gmail.com> Sep 22 2009
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- Daniel Keep <daniel.keep.lists gmail.com> Sep 22 2009
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- Jeremie Pelletier <jeremiep gmail.com> Sep 22 2009
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- Jeremie Pelletier <jeremiep gmail.com> Sep 22 2009
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Sep 22 2009
- Jeremie Pelletier <jeremiep gmail.com> Sep 22 2009
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- Michel Fortin <michel.fortin michelf.com> Sep 23 2009
- Jeremie Pelletier <jeremiep gmail.com> Sep 23 2009
- Walter Bright <newshound1 digitalmars.com> Sep 24 2009
- Justin Johansson <procode adam-dott-com.au> Sep 22 2009
- "Steven Schveighoffer" <schveiguy yahoo.com> Sep 22 2009
In a templated class (D1.0) along lines ...
class Foo(T) {
//..
static T bar() { return T.init; }
//..
}
Foo!(int).bar() returns 0 and Foo!(char[]).bar() returns nil.
I'd much prefer (at least for my purposes) that (char[]).init returned an empty
string rather than effectively a null pointer. Is there a convenient solution
for this, e.g. by specializing just the bar method of class Foo when T is
char[], or by some other means?
Maybe this type of question best be asked on D.learn, but I do wonder if an
empty string is a more reasonable initializer for char[] .. well maybe not .. I
don't know .. I yield to your sensibilities.
Thanks to all.
Sep 22 2009
Justin Johansson wrote:
In a templated class (D1.0) along lines ...
class Foo(T) {
//..
static T bar() { return T.init; }
//..
}
Foo!(int).bar() returns 0 and Foo!(char[]).bar() returns nil.
I'd much prefer (at least for my purposes) that (char[]).init returned an empty string rather than effectively a null pointer. Is there a convenient solution for this, e.g. by specializing just the bar method of class Foo when T is char[], or by some other means?
Maybe this type of question best be asked on D.learn, but I do wonder if an empty string is a more reasonable initializer for char[] .. well maybe not .. I don't know .. I yield to your sensibilities.
Thanks to all.
You could use a custom type, which would solve your .init problem:
typedef string myString = "";
Or you could specialize your bar():
static T bar() {
    static if(isSomeString!T) return "";
    else return T.init;
}
I myself favor a null initializer, since char[] is a reference type, not a value type, it only makes sense to initialize it to a null reference.
Sep 22 2009
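A minimal sketch of the specialization suggested above, assuming a D2 compiler rather than the D1.0 of the original question. The test `is(T == string)` is swapped in for `isSomeString!T` here, because in D2 the literal `""` is an immutable string and does not implicitly convert to a mutable `char[]`; the class body is otherwise a hypothetical reconstruction of the `Foo(T)` from the question.

```d
import std.stdio : writeln;

// Hypothetical D2 rendering of the Foo(T) class from the question.
class Foo(T)
{
    static T bar()
    {
        // Special-case string so callers get a non-null empty string;
        // every other type keeps its default initializer.
        static if (is(T == string))
            return "";
        else
            return T.init;
    }
}

void main()
{
    assert(Foo!int.bar() == 0);
    assert(Foo!(char[]).bar() is null);    // default: null array
    assert(Foo!string.bar() !is null);     // specialized: non-null pointer
    assert(Foo!string.bar().length == 0);  // but still empty
    writeln("ok");
}
```

Because `static if` is resolved per instantiation, only the branch that matches `T` is compiled into each `bar`.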
On Tue, Sep 22, 2009 at 8:07 AM, Justin Johansson <procode adam-dott-com.au> wrote:
In a templated class (D1.0) along lines ...
class Foo(T) {
//..
static T bar() { return T.init; }
//..
}
Foo!(int).bar() returns 0 and Foo!(char[]).bar() returns nil.
I'd much prefer (at least for my purposes) that (char[]).init returned an empty string rather than effectively a null pointer. Is there a convenient solution for this, e.g. by specializing just the bar method of class Foo when T is char[], or by some other means?
Maybe this type of question best be asked on D.learn, but I do wonder if an empty string is a more reasonable initializer for char[] .. well maybe not .. I don't know .. I yield to your sensibilities.
Thanks to all.
There's no real difference between an empty string and a null reference. Both have 0 length.
Sep 22 2009
Jarrett Billingsley Wrote:
On Tue, Sep 22, 2009 at 8:07 AM, Justin Johansson wrote:
Maybe this type of question best be asked on D.learn, but I do wonder if an empty string is a more reasonable initializer for char[] .. well maybe not .. I don't know .. I yield to your sensibilities.
There's no real difference between an empty string and a null reference. Both have 0 length.
Big difference if you pass char[] variable .ptr to a C function. static if ( typeid(T) is typeid(char[])) { } else { init_sequence = new ExactlyOne!(T)( T.init); } Tks Jeremie got specialized method working with
Sep 22 2009
Justin Johansson Wrote:
Scratch that last garbled reply .. finger trouble. Was going to say that ...
There's no real difference between an empty string and a null reference. Both have 0 length.
Big difference if you pass char[] variable .ptr to a C function.
And thanks Jeremie, got specialized method working with
static if ( typeid(T) is typeid(char[])) {
    // ..
}
else {
    // ..
}
Cheers
Justin Johansson
Sep 22 2009
In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.
Justin Johansson wrote:
There's no real difference between an empty string and a null reference. Both have 0 length.
Big difference if you pass char[] variable .ptr to a C function.
Sep 22 2009
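To illustrate the toStringz advice, here is a small sketch assuming D2 with Phobos. It passes a slice of a larger buffer to C's `strlen`; the buffer contents and variable names are made up for the example. A slice carries no terminator of its own, which is why handing its `.ptr` straight to C is unsafe.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    // A mutable buffer and a slice of it (illustrative data only).
    char[] buf = "hello world".dup;
    char[] word = buf[0 .. 5];

    // word.ptr points into buf with no '\0' after the fifth character,
    // so C code given word.ptr would read past the end of the slice.
    // toStringz returns a pointer to a zero-terminated string instead.
    assert(strlen(toStringz(word)) == 5);

    // A default-initialized (null) array is also safe to pass this way.
    char[] nothing;
    assert(strlen(toStringz(nothing)) == 0);
}
```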
Daniel Keep Wrote:Big difference if you pass char[] variable .ptr to a C function.
In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.
Agreed .. fair enough. Actually I'm more interested in the semantics for default initialized char[]. Does it have exactly the same semantics as an empty string (in general D or runtime library, Phobos et al. context)?
Sep 22 2009
Justin Johansson wrote:Daniel Keep Wrote:Big difference if you pass char[] variable .ptr to a C function.
In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.
Agreed .. fair enough. Actually I'm more interested in the semantics for default initialized char[]. Does it have exactly the same semantics as an empty string (in general D or runtime library, Phobos et al. context)?
It isn't the same semantics: a null array is {0, null}, while an empty array is {0, &zero}, where zero is declared as 'char zero = 0;', since string literals are zero terminated.
Their usage is mostly the same: you can concatenate both of them, append to both of them, and so on, all giving the same results. Where it makes a difference is when you need to enforce an invariant that .ptr is not null. Calling toStringz on either will give the same C string: a pointer to a zero value.
You have to remember that arrays are reference types; they are perfectly valid without referenced data. Think of pointers or objects for example, which are also reference types.
Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Sep 22 2009
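The null-versus-empty distinction described above can be checked directly. This is a sketch assuming D2 with Phobos, not the D1.0 of the original question; the variable names are illustrative.

```d
import std.string : toStringz;

void main()
{
    char[] nullStr;        // default-initialized: length 0, ptr null
    string emptyStr = "";  // literal: length 0, ptr to a '\0' byte

    // Identity differs ...
    assert(nullStr is null);
    assert(emptyStr !is null);
    assert(nullStr.ptr is null);
    assert(emptyStr.ptr !is null);

    // ... but length and content comparison agree.
    assert(nullStr.length == 0);
    assert(emptyStr.length == 0);
    assert(nullStr == emptyStr);

    // Appending behaves the same for both.
    nullStr ~= "abc";
    emptyStr ~= "abc";
    assert(nullStr == "abc");
    assert(emptyStr == "abc");

    // toStringz yields a pointer to a zero byte in either case.
    char[] stillNull;
    assert(*toStringz(stillNull) == '\0');
    assert(*toStringz("") == '\0');
}
```

Code that only uses `==` and `.length` cannot tell the two apart; only `is null` or `.ptr` exposes the difference.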
Jeremie Pelletier Wrote:Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Consistency. Since when is that an argument? Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
short.init 0
int.init 0
bool.init false
byte.init 0
double.init double.nan
long.init 0L
Sep 22 2009
Justin Johansson wrote:Jeremie Pelletier Wrote:Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Consistency. Since when is that an argument? Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
short.init 0
int.init 0
bool.init false
byte.init 0
double.init double.nan
long.init 0L
Obviously the nan floating points, which have annoyed me quite a few times. Every other type in D inits to zeroed memory, with the exception of void initializers.
Sep 22 2009
Justin Johansson wrote:Jeremie Pelletier Wrote:Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Consistency. Since when is that an argument? Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
short.init 0
int.init 0
bool.init false
byte.init 0
double.init double.nan
long.init 0L
You forgot
char.init 0xFF
wchar.init 0xFFFF
dchar.init 0xFFFFFFFF
Andrei
Sep 22 2009
Andrei Alexandrescu wrote:Justin Johansson wrote:Jeremie Pelletier Wrote:Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Consistency. Since when is that an argument? Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
short.init 0
int.init 0
bool.init false
byte.init 0
double.init double.nan
long.init 0L
You forgot
char.init 0xFF
wchar.init 0xFFFF
dchar.init 0xFFFFFFFF
Andrei
Actually, dchar.init is "\U0000ffff". Jeremie
Sep 22 2009
Andrei Alexandrescu Wrote:Justin Johansson wrote:Jeremie Pelletier Wrote:Besides, if you initialize character arrays to "", what do you initialize other arrays to, and other reference types to? It just wouldn't be consistent.
Consistency. Since when is that an argument? Just to be a PITA, pick the inconsistent row in the table below (from spec_D1.00.pdf). The row ordering of the table has been shuffled just to make it a bit more difficult to spot :-)
short.init 0
int.init 0
bool.init false
byte.init 0
double.init double.nan
long.init 0L
You forgot
char.init 0xFF
wchar.init 0xFFFF
dchar.init 0xFFFFFFFF
Andrei
Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values. (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.) Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors. Hmm, guess one could argue the init issue for eons. -- Justin
Sep 22 2009
On 2009-09-22 18:08:24 -0400, Justin Johansson <procode adam-dott-com.au> said:
You forgot
char.init 0xFF
wchar.init 0xFFFF
dchar.init 0xFFFFFFFF
Andrei
Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values. (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.) Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors. Hmm, guess one could argue the init issue for eons.
Well, I see this as a problem because I've often relied on default initialization being zero in my algorithms. I was bitten once when my algorithm worked perfectly with char but not with wchar. Turns out that char.init == 0 (contrary to what Andrei wrote) and wchar.init == 0xFFFF.
--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Sep 23 2009
Michel Fortin wrote:
On 2009-09-22 18:08:24 -0400, Justin Johansson <procode adam-dott-com.au> said:
You forgot
char.init 0xFF
wchar.init 0xFFFF
dchar.init 0xFFFFFFFF
Andrei
Shhh; don't tell anybody; I left those out of the quiz to weigh in favour of zero bit pattern init values. (This trick, i.e. omitting information, is one I learned from the Ministries of Statistics and (un)Employment.) Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors. Hmm, guess one could argue the init issue for eons.
Well, I see this as a problem because I've often relied on default initialization being zero in my algorithms. I was bitten once when my algorithm worked perfectly with char but not with wchar. Turns out that char.init == 0 (contrary to what Andrei wrote) and wchar.init == 0xFFFF.
pragma(msg, char.init.stringof); outputs '\xff' in D2; wchar and dchar have the same initializer: '\U0000FFFF'.
If you rely on the char initializer being the null character, use char c = 0, or else your char gets initialized to an invalid character, just like floats get initialized to nan. Other types either use null as their invalid value, or have no invalid value and use 0.
Sep 23 2009
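The initializer values debated above can be verified with a few assertions. This sketch assumes a D2 compiler; the values are those given in the D2 language specification, and D1 may differ.

```d
import std.math : isNaN;

void main()
{
    // Integral and boolean types default to zero.
    static assert(short.init == 0);
    static assert(int.init == 0);
    static assert(long.init == 0L);
    static assert(bool.init == false);

    // Character types default to invalid code units, not to zero.
    static assert(char.init == 0xFF);
    static assert(wchar.init == 0xFFFF);
    static assert(dchar.init == 0xFFFF);

    // Floating-point types default to NaN.
    assert(isNaN(double.init));

    // Arrays, including char[], default to null.
    assert((char[]).init is null);

    // An explicit initializer is needed when a zero character is wanted.
    char c = 0;
    assert(c == '\0');
}
```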
Justin Johansson wrote:Seriously though, I imagine the D design choices to be influenced by the desire to propagate NaN and invalid UTF in their respective cases so as to detect uninitialized data errors.
That's exactly what drove the design choices. If there was a nan value for integers, D would use that. But there isn't, so 0 is the best we can do. Andrei and I were talking last night about the purity of software design principles and the reality, and how the reality forces compromise on the purity if you wanted to get anything done.
Sep 24 2009
Steven Schveighoffer Wrote:
A null string *is* an empty string, but an empty string may not be a null string. The subtle difference is that the pointer points to null versus some data. A non-null empty string:
- May be pointing to heap data, therefore keeping the data from being collected.
- May reallocate in place on appending (a null string always must allocate new data on append).
It's a difficult concept to get, but an array is really a hybrid type between a reference and a value type. The array is actually a value type struct with a pointer reference and a length value. If the length is zero, then the pointer value technically isn't needed, but in subtle cases, it makes a difference. When you copy the array, the length behaves like a value type (changing the length of one array doesn't affect the other), but the array data is referenced (changing an element of the array *does* affect the other).
I think plans are to make the array a full reference type, and leave slices as these structs (in D2). This probably will clear up a lot of confusion people have.
I hope this helps...
Oh, and BTW, you can pass string literals to C functions, but *not* char[] variables. Always pass them through toStringz. It generally does not take much time/resources to add the zero.
-Steve
Good write-up Steve; thanks. Being relatively new to D, but from a strong C++ and assembler background, I did the usual interrogation for interest:
writefln( "(char[]).sizeof=%d", (char[]).sizeof);
8 bytes.
So if you wanted to intern string data to conserve memory, and reference such data with a single 32-bit pointer, sounds like you would have to do this with either a char* or perhaps a pointer to a char[], rather than a full char[] field in your class or struct. There's less reason to want to intern string data if you still need 8 bytes to reference said data.
Justin
Sep 22 2009
On Tue, 22 Sep 2009 09:53:52 -0400, Justin Johansson <procode adam-dott-com.au> wrote:Daniel Keep Wrote:Big difference if you pass char[] variable .ptr to a C function.
In general, if you pass a string to a C function you should send it through toStringz first. If you don't, you're just begging for segfaults.
Agreed .. fair enough. Actually I'm more interested in the semantics for default initialized char[]. Does it have exactly the same semantics as an empty string (in general D or runtime library, Phobos et al. context)?
A null string *is* an empty string, but an empty string may not be a null string. The subtle difference is that the pointer points to null versus some data. A non-null empty string:
- May be pointing to heap data, therefore keeping the data from being collected.
- May reallocate in place on appending (a null string always must allocate new data on append).
It's a difficult concept to get, but an array is really a hybrid type between a reference and a value type. The array is actually a value type struct with a pointer reference and a length value. If the length is zero, then the pointer value technically isn't needed, but in subtle cases, it makes a difference. When you copy the array, the length behaves like a value type (changing the length of one array doesn't affect the other), but the array data is referenced (changing an element of the array *does* affect the other).
I think plans are to make the array a full reference type, and leave slices as these structs (in D2). This probably will clear up a lot of confusion people have.
I hope this helps...
Oh, and BTW, you can pass string literals to C functions, but *not* char[] variables. Always pass them through toStringz. It generally does not take much time/resources to add the zero.
-Steve
Sep 22 2009
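The "hybrid" behaviour described above, where the length is copied but the elements are shared, can be seen in a few lines. This is a sketch assuming D2; the array contents are arbitrary.

```d
void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;            // copies the (pointer, length) pair only

    // Element data is shared between the two variables.
    b[0] = 99;
    assert(a[0] == 99);

    // The length belongs to each variable separately.
    b.length = 2;
    assert(b.length == 2);
    assert(a.length == 3);

    // An array variable occupies a pointer plus a length:
    // 8 bytes on a 32-bit target, 16 bytes on a 64-bit one.
    static assert((char[]).sizeof == 2 * size_t.sizeof);
}
```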








