
digitalmars.D.learn - string literals

reply "Saaa" <empty needmail.com> writes:
Can somebody please explain to me why char arrays are a special case of 
arrays?

(when is this useful?) 
Jan 23 2008
parent reply downs <default_357-line yahoo.de> writes:
Saaa wrote:
 Can somebody please explain to me why char arrays are a special case of 
 arrays?
 
 (when is this useful?) 
 
 

I don't understand what you mean with "special case". Character arrays (a.k.a. strings) are arrays just like the rest of 'em :)

The only difference is that they have an additional literal constructor in the form of "foo".

 --downs
Jan 23 2008
next sibling parent reply "Saaa" <empty needmail.com> writes:
I meant why are they read only?


Jan 23 2008
parent reply downs <default_357-line yahoo.de> writes:
Saaa wrote:
 I meant why are they read only?
 
 I don't understand what you mean with "special case". Character arrays 
 (a.k.a. strings) are arrays just like the rest of 'em :)

 The only difference is that they have an additional literal constructor in 
 the form of "foo".


http://digitalmars.com/d/1.0/arrays.html#strings

 --downs
Jan 23 2008
parent reply "Saaa" <empty needmail.com> writes:
I'm sorry about the late reply..

I read it, but I still don't really know what you mean by "additional literal 
constructor".

But what I really meant was this:
String literals are immutable (read only).

Why is this? (I expect there to be a nice reason for it, I just don't know 
it :)



Jan 24 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Saaa:
>Why is this? (I expect there to be a nice reason for it, I just don't know it :)<

That's an interesting question. Now and then I too feel the need for mutable strings, to change their chars, etc., but here Josh Bloch and the famous Sedgewick explain why immutables are often good:
http://www.cs.princeton.edu/introcs/33design/

<<
An immutable data type is a data type such that the value of an object never changes once constructed. Examples: Complex and String. When you pass a String to a method, you don't have to worry about that method changing the sequence of characters in the String. On the other hand, when you pass an array to a method, the method is free to change the elements of the array.

Immutable data types have numerous advantages. They are easier to use, harder to misuse, easier to debug code that uses immutable types, easier to guarantee that the class variables remain in a consistent state (since they never change after construction), no need for copy constructor, are thread-safe, work well as keys in symbol table, don't need to be defensively copied when used as an instance variable in another class. Disadvantage: separate object for each value.

Josh Bloch, a Java API architect, advises that "Classes should be immutable unless there's a very good reason to make them mutable....If a class cannot be made immutable, you should still limit its mutability as much as possible."
>>

Something more:
- Immutable data works well with multiprocessing
- it's common in functional style programming, allowing more pure functions
- the GC can manage immutable strings/objects in an efficient enough way.

Bye,
bearophile
Jan 24 2008
parent reply "Saaa" <empty needmail.com> writes:
Thanks, so the common use of strings is such that it is best to make the 
default setting 'immutable'.
Then, how about adding the .access property to arrays? (or all types :)
With strings default .access as read only?



Jan 24 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Saaa:

Then, how about adding the .access property to arrays? (or all types :) With
strings default .access as read only?<

What's the .access property of arrays in the language you talk about?

If you want to try the D language, you may want to use the 1.x series instead of 2.x, which is alpha and has immutable strings.

Immutable things (like strings) are useful as AA keys too. In D 1.x (and Ruby too, maybe) you can use a string as a key, but if you later change that string its hash isn't automatically recomputed, and this leads to chaos. In Python you have both mutable and immutable arrays (called list and tuple); strings are immutable, but with the standard module "array" you can use mutable "strings" too. Python AAs, called dict, accept only the immutable versions as keys to avoid those bugs (and inside tuples used as keys you have to put immutables).

Bye,
bearophile
Jan 25 2008
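
The associative-array hazard described above can be sketched as follows. This is only an illustration, written for a current D 2 compiler rather than the D 1.x discussed in the thread; the helper name remember is made up for the example and is not part of any library. The idea is to store an immutable copy of the key (.idup), so that later edits to the caller's buffer cannot change a key already in the table.

```d
// Hypothetical helper: stores `key` in `table` as an immutable copy,
// so the stored key cannot change behind the table's back.
void remember(ref int[string] table, const(char)[] key, int value)
{
    table[key.idup] = value;    // .idup makes an immutable copy of the key
}

void main()
{
    int[string] table;
    char[] buffer = "abc".dup;  // a mutable buffer holding "abc"

    remember(table, buffer, 1);
    buffer[0] = 'x';            // mutate the buffer after it was used as a key

    assert("abc" in table);             // the stored key is unaffected
    assert(("xbc" in table) is null);   // and no key followed the edit
}
```

Had the table kept a reference to the mutable buffer instead, the entry would still sit in the slot computed from "abc" while its key now read "xbc", which is the kind of chaos mentioned above.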
next sibling parent Christopher Wright <dhasenan gmail.com> writes:
bearophile wrote:
 Saaa:
 
 Then, how about adding the .access property to arrays? (or all types :) With
strings default .access as read only?<

What's the .access property of arrays in the language you talk about?

Basically runtime const, as far as I can tell. You can fake it with a class easily enough.
Jan 25 2008
prev sibling parent reply "Saaa" <empty needmail.com> writes:
 What's the .access property of arrays in the language you talk about?

It was a suggestion :)
 If you want to try the D language, you may want to use the 1.x series 
 instead of the 2.x that is alpha and has immutable strings.

Didn't they both have immutable strings? I'm more in search of a way to make char[] not immutable anymore; for most things I do it's only a hassle.
 Immutable things (like strings) are useful as AA keys too. In D 1.x (and 
 Ruvy too, maybe) you can use a string as key, but if you later change that 
 string its hash function isn't automatically recomputed, this leads to 
 chaos. In Python you have both mutable and immutable arrays (called list 
 and tuple), strings are immutable, but with the standard module "array" 
 you can use mutable "strings" too. Python AAs called dict accept only 
 their immutable versions to avoid those bugs (and inside tuples you have 
 to put immutables).

Sounds sound to me. This was what I suggested, use a .access property and only accept the 'read-only' for things like that.
Jan 25 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Saaa wrote:
 Didn't they both have immutable strings?
 I'm more in search of a way to make char[] not immutable anymore, for most 
 things I do its only a hassle.

On UNIX systems, string literals are stored in the code segment. Thus, modifying them will cause a segfault. In D 1.0, the compiler will let you modify it, but the OS won't. In D 2.0, this rule is enforced by the compiler. Even on Windows, modifying string literals is probably a bad idea. To allow a string literal to be modified, copy it onto the heap with a .dup .
Jan 25 2008
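
A minimal sketch of the .dup advice above, assuming a current D 2 compiler (where a literal has type string, so the compiler rejects writes to it outright). The helper name withFirstChar is invented for this example; it is not from the thread or from the standard library.

```d
import std.stdio;

// Hypothetical helper: returns a mutable heap copy of `s` with its
// first character replaced. The string passed in is never written to.
char[] withFirstChar(string s, char c)
{
    char[] copy = s.dup;      // .dup allocates a new, mutable array
    if (copy.length > 0)
        copy[0] = c;
    return copy;
}

void main()
{
    string literal = "abc";   // the literal may live in read-only memory
    char[] edited = withFirstChar(literal, 'b');

    writeln(literal);         // the literal is unchanged
    writeln(edited);          // the copy carries the edit
    assert(literal == "abc");
    assert(edited == "bbc");
}
```

The copy costs one allocation, but the write then goes to memory the program owns, whatever the platform does with literals.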
parent reply "Saaa" <empty needmail.com> writes:
 On UNIX systems, string literals are stored in the code segment.

If you load a file into a char[][], will the file be stored in the code segment? Wouldn't you normally want to edit it after loading?

But shouldn't there be a way (per variable) to force the compiler to not store it like that?
 Thus, modifying them will cause a segfault. In D 1.0, the compiler will 
 let you modify it, but the OS won't. In D 2.0, this rule is enforced by 
 the compiler. Even on Windows, modifying string literals is probably a bad 
 idea.

 To allow a string literal to be modified, copy it onto the heap with a 
 .dup .

I know this, although it just says it creates a dynamic array. How do you know that the .dup dynamic array is not read-only?
Jan 25 2008
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Saaa:

 If you load a file into a char[][], will the file be stored in the code 
 segment?

Nope.
 But shouldn't there be an way (per variable) to force the compiler to not 
 store it like that?

In practice, there is. In D 1.x string literals are stored like that; dynamically allocated arrays aren't.
 I know this, although it just says it creates a dynamic array. How do you 
 know that the .dup dynamic array is not read-only.

In D 1.x dynamic arrays are mutable.

Bye,
bearophile
Jan 25 2008
parent reply "Saaa" <empty needmail.com> writes:
from the 1.0 documentation:

char[] str;
char[] str1 = "abc";
str[0] = 'b';        // error, "abc" is read only, may crash

Is this example correct?

Jan 25 2008
parent reply "Saaa" <empty needmail.com> writes:
from the 1.0 documentation:

char[] str;
char[] str1 = "abc";
str[0] = 'b';        // error, "abc" is read only, may crash

Is this example correct?
Jan 25 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Saaa wrote:
 from the 1.0 documentation:
 
 char[] str;
 char[] str1 = "abc";
 str[0] = 'b';        // error, "abc" is read only, may crash
 
 Is this example correct?
 
 

Yes, because the "abc" is a string literal, that is to say it's written in the code itself. If str1 was loaded from an outside source, such as a file, user input, etc., then you could modify it without issue. For string LITERALS (a string literal is one you write in the code itself, usually encased in double-quotes), modifying them without calling .dup on them is bad. For other strings, it's perfectly okay.
Jan 26 2008
next sibling parent reply "Saaa" <empty needmail.com> writes:
I finally see what a string literal means, but this code still bothers me.
What does changing a dynamic char[] (str) have to do with the string literal 
in str1


Jan 26 2008
parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
Saaa <empty needmail.com> wrote:

 I finally see what a string literal means, but this code still bothers
 me.
 What does changing a dynamic char[] (str) have to do with the string
 literal in str1

 from the 1.0 documentation:

 char[] str;
 char[] str1 = "abc";
 str[0] = 'b';        // error, "abc" is read only, may crash

 Is this example correct?

 Yes, because the "abc" is a string literal, that is to say it's written
 in the code itself. If str1 was loaded from an outside source, such as a
 file, user input, etc., then you could modify it without issue.

 For string LITERALS (a string literal is one you write in the code
 itself, usually encased in double-quotes), modifying them without
 calling .dup on them is bad. For other strings, it's perfectly okay.

You're right, of course. str[0] has nothing to do with "abc" or str1. The code will fail, but with an array bounds error (str.length == 0, so no element 0 is available).

Now, this code:

char[] str1 = "abc";
str1[0] = 'b';        // error, "abc" is read only, may crash

should fail for the above reasons.

Simen Kjaeraas
Jan 26 2008
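
The array bounds point above can be sketched like this, assuming a current D 2 compiler with bounds checking enabled. The helper setFirst is hypothetical and exists only to show that an empty dynamic array needs an element before index 0 can be written.

```d
// Hypothetical helper: writes `c` into element 0, giving an empty
// array one element first so that the index is in bounds.
char[] setFirst(char[] s, char c)
{
    if (s.length == 0)
        s.length = 1;             // allocate storage for one element
    s[0] = c;
    return s;
}

void main()
{
    char[] str;                   // declared but empty, as in the example
    assert(str.length == 0);      // so str[0] would be out of bounds

    str = setFirst(str, 'b');
    assert(str == "b");
}
```

So the failure in the documentation snippet comes from str being empty, and has nothing to do with the literal held by str1.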
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Robert Fraser:
 Saaa wrote:
 char[] str;
 char[] str1 = "abc";
 str[0] = 'b';        // error, "abc" is read only, may crash
 Is this example correct?

 Yes, because the "abc" is a string literal, that is to say it's written in the code itself. If str1 was loaded from an outside source, such as a file, user input, etc., then you could modify it without issue. For string LITERALS (a string literal is one you write in the code itself, usually encased in double-quotes), modifying them without calling .dup on them is bad. For other strings, it's perfectly okay.

I am not an expert in D yet, but I think in the following D 1.x code str is a dynamic array, so it can be changed safely:

void main() {
    char[] str = "abc";
    str[0] = 'b';
}

Bye,
bearophile
Jan 26 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Milke Wey:
 On linux you would get an nice Segfault when running that code.

Ah, thank you then. I thought that was true only for string literals assigned to static arrays:

void main() {
    char[3] str = "abc";
    str[0] = 'b';
}

Bye,
bearophile
Jan 26 2008
parent Robert Fraser <fraserofthenight gmail.com> writes:
Milke Wey wrote:
 On Sat, 2008-01-26 at 15:26 -0500, bearophile wrote:
 Milke Wey:
 On linux you would get an nice Segfault when running that code.

 void main() {
     char[3] str = "abc";
     str[0] = 'b';
 }

 Bye,
 bearophile

I don't know why but it actually works when using a static array.

On a UNIX system or a Windows system? It should work fine on Windows anyway. Also, the static array declaration may say to the compiler "allocate this on the stack"... not sure; I don't think that's speced.
Jan 27 2008
prev sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:

Saaa wrote:
 On UNIX systems, string literals are stored in the code segment.

If you load a file into a char[][], will the file be stored in the code segment? Wouldn't you normally want to edit it after loading?

 But shouldn't there be an way (per variable) to force the compiler to not 
 store it like that?
 

char[] foo = "abc";     // Store it in the code segment
char[] bar = "abc".dup; // Store it on the heap
 Thus, modifying them will cause a segfault. In D 1.0, the compiler will 
 let you modify it, but the OS won't. In D 2.0, this rule is enforced by 
 the compiler. Even on Windows, modifying string literals is probably a bad 
 idea.

 To allow a string literal to be modified, copy it onto the heap with a 
 .dup .

I know this, although it just says it creates a dynamic array. How do you know that the .dup dynamic array is not read-only.

Jerome
Jan 26 2008
parent "Saaa" <empty needmail.com> writes:
OK, now I know what I didn't get

Robert Fraser:
LITERALS (a string literal is one you write in the code
itself, usually encased in double-quotes),



Jan 26 2008
prev sibling next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
downs wrote:
 Saaa wrote:
 Can somebody please explain to me why char arrays are a special case of 
 arrays?

 (when is this useful?) 

I don't understand what you mean with "special case". Character arrays (a.k.a. strings) are arrays just like the rest of 'em :) The only difference is that they have an additional literal constructor in the form of "foo". --downs

That, plus there is special casing in the compiler to support

foreach(dchar x; some_string) {...}

--bb
Jan 23 2008
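
The foreach special case mentioned above can be illustrated with a short sketch, assuming a current D 2 compiler. Asking for a dchar loop variable makes the compiler decode the UTF-8 code units of the string into whole code points, so the number of iterations can be smaller than .length. The function name countCodePoints is made up for the example.

```d
// Hypothetical helper: counts code points by letting foreach decode
// the UTF-8 string one dchar at a time.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)    // the compiler decodes UTF-8 here
        n++;
    return n;
}

void main()
{
    string s = "h\u00E9llo";            // U+00E9 takes two UTF-8 code units
    assert(s.length == 6);              // .length counts code units
    assert(countCodePoints(s) == 5);    // foreach(dchar) counts code points
}
```

An ordinary array type gets no such decoding; this is one place where char arrays really are treated specially.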
prev sibling next sibling parent Milke Wey <no.spam example.com> writes:
On Sat, 2008-01-26 at 10:26 -0500, bearophile wrote:
 Robert Fraser:
 Saaa wrote:
 char[] str;
 char[] str1 = "abc";
 str[0] = 'b';        // error, "abc" is read only, may crash
 Is this example correct?

 Yes, because the "abc" is a string literal, that is to say it's written in the code itself. If str1 was loaded from an outside source, such as a file, user input, etc., then you could modify it without issue. For string LITERALS (a string literal is one you write in the code itself, usually encased in double-quotes), modifying them without calling .dup on them is bad. For other strings, it's perfectly okay.

 I am not an expert in D yet, but I think in the following D 1.x code str is a dynamic array, so it can be changed safely:

 void main() {
     char[] str = "abc";
     str[0] = 'b';
 }

 Bye,
 bearophile

On linux you would get an nice Segfault when running that code.
Jan 26 2008
prev sibling next sibling parent Milke Wey <no.spam example.com> writes:
On Sat, 2008-01-26 at 15:26 -0500, bearophile wrote:
 Milke Wey:
 On linux you would get an nice Segfault when running that code.

 Ah, thank you then. I thought that was true only for string literals assigned to static arrays:

 void main() {
     char[3] str = "abc";
     str[0] = 'b';
 }

 Bye,
 bearophile

I don't know why but it actually works when using a static array. -- Mike Wey
Jan 27 2008
prev sibling parent Milke Wey <no.spam example.com> writes:
On Sun, 2008-01-27 at 22:44 -0800, Robert Fraser wrote:
 Milke Wey wrote:
 On Sat, 2008-01-26 at 15:26 -0500, bearophile wrote:
 Milke Wey:
 On linux you would get an nice Segfault when running that code.

 void main() {
     char[3] str = "abc";
     str[0] = 'b';
 }

 Bye,
 bearophile

I don't know why but it actually works when using a static array.

On a UNIX system or a Windows system? It should work fine on Windows anyway. Also, the static array declaration may say to the compiler "allocate this on the stack"... not sure; I don't think that's speced.

On Linux with DMD 1.025. -- Mike Wey
Jan 28 2008
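
One likely explanation for the static array result, consistent with what Mike Wey observed: a fixed-size array is a value with its own storage, so initialising it from a literal copies the characters, and the later write touches the copy rather than the literal. The sketch below assumes a current D 2 compiler; the function name patched is invented for the example.

```d
// Hypothetical example function: builds a fixed-size array from a
// literal and edits it in place.
char[3] patched()
{
    char[3] str = "abc";    // copies 'a', 'b', 'c' into the array's own storage
    str[0] = 'b';           // writes to that storage, not to the literal
    return str;
}

void main()
{
    char[3] result = patched();
    assert(result[] == "bbc");
}
```

A dynamic char[] initialised from a literal in D 1.x instead refers to the literal's own memory, which is why that variant crashed on Linux.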