digitalmars.D.learn - peculiarities with char[] and std.string

Kyle K (38/38) Jun 19 2006 Greetings.

xs0 (22/64) Jun 19 2006 Well, you didn't touch the memory you didn't allocate :) If you had

Kyle K (5/14) Jun 19 2006 Ah ok, that makes sense. So using 'in' with arrays and aggregate types w...

BCS (28/47) Jun 19 2006 Actually "in" always gives you a copy of the actual "thing". Arrays are

Kyle K (2/34) Jun 19 2006 Got it, thanks a bunch. I knew it had to be something simple... :D

Kyle K <Kyle_member pathlink.com> writes:

Greetings.

I was poking around the std.string lib, and was wondering if someone could
answer a few questions about it. I'm relatively new to D, so I'm sure there are
pretty obvious answers.

I notice that most of the functions, like toStringz() and tolower(), implement
the copy-on-write convention... but since the default function parameter is 'in',
is there not already an implicit copy of the data being made? For example,

import std.stdio;
int main()
{
    char[] str, str2;
    str = "foo";
    str2 = bob(str);
    writefln("%s:%s", str, str2);  // should print "foo:keke"
    return 0;
}
char[] bob(in char[] str)
{
    str = "keke";
    return str;
}

Works fine with my copy of DMD. Should this behavior not be relied on, since you
should never touch memory you didn't allocate (according to the FAQ)?


Also, why is the following the case:

printf("%s", "hello\0"); // Fails with access violation
printf("%s", cast(char *)"hello\0"); // OK

Is the implicit casting from char[] to char * doing something I'm not aware of in
terms of the length of the string, like chopping off the \0?

My last question is which is the preferred method of making a copy of a string?
Suppose I want str2 to be a copy of str, then:

str2.length = str.length;
str2[] = str;
//      These two equivalent?
str2 = str.dup;

Sorry for all the questions and thanks for the help, let me know if this info is
somewhere obvious.. I wasn't able to find it in the spec.

Regards
Kyle K.

Jun 19 2006

xs0 <xs0 xs0.com> writes:

Kyle K wrote:
 Greetings.
 
 I was poking around the std.string lib, and was wondering if someone could
 answer a few questions about it. I'm relatively new to D, so I'm sure there are
 pretty obvious answers.
 
 I notice that most of the functions, like toStringz() and tolower(), implement
 the copy-on-write convention... but since the default function parameter is 'in',
 is there not already an implicit copy of the data being made? 

No, just a copy of the _reference_ is made, but both point to the same data.

 For example,
 
 import std.stdio;
 int main()
 {
 char []str, str2;
 str="foo";
 str2= bob(str);
 writefln("%s:%s", str, str2);  // should print "foo:keke"
 return 0;
 }
 char []bob(in char[] str)
 {
 str = "keke"; 
 return str;
 }
 
 Works fine with my copy of DMD. Should this behavior not be relied on, since you
 should never touch memory you didn't allocate (according to the FAQ)?

Well, you didn't touch the memory you didn't allocate :) If you had

char[] bob(in char[] str)
{
     str[0] = 'a';
     return str;
}

You'd get "aoo:aoo" as output (or a crash, as you can't write into 
constants on some platforms)
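
The difference between rebinding the slice and writing through it can be sketched as follows. This is only an illustration, assuming a current D2 compiler: there, string literals are immutable, so .dup is used to get a mutable char[], and the 'in' qualifier is dropped because it would make the parameter const. The names rebind and poke are made up for the example.

```d
import std.stdio;

// Assigning to the parameter only rebinds the local copy of the slice;
// the caller's slice still refers to its original data.
char[] rebind(char[] s)
{
    s = "keke".dup;
    return s;
}

// Writing through the slice modifies the data shared with the caller.
char[] poke(char[] s)
{
    s[0] = 'a';
    return s;
}

void main()
{
    char[] str = "foo".dup;      // mutable, heap-allocated copy of the literal

    char[] a = rebind(str);
    writefln("%s:%s", str, a);   // prints "foo:keke"

    char[] b = poke(str);
    writefln("%s:%s", str, b);   // prints "aoo:aoo"
    assert(b.ptr is str.ptr);    // both slices refer to the same memory
}
```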


 Also, why is the following the case:
 
 printf("%s", "hello\0"); // Fails with access violation
 printf("%s", cast(char *)"hello\0"); // OK
 
 Is the implicit casting from char[] to char * doing something I'm not aware of in
 terms of the length of the string, like chopping off the \0?

"hello\0" is a D char[] array, which is composed of length + char*. 
printf doesn't know about D arrays, so it takes the length to be the 
pointer to data, which fails for obvious reasons. When you cast it to 
char*, you lose the length, keep the pointer, and it works. I think you 
should use something like

printf("%.*s", "hello"); // no zero needed/wanted in this case..

Better yet, use writef/ln instead - it knows all about D's types..
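
For comparison, here is a rough sketch of the same three options on a current D2 compiler, which is an assumption on my part and not what the thread was written against. There, the length and pointer are passed to printf explicitly, and std.string.toStringz supplies the zero-terminated pointer.

```d
import core.stdc.stdio : printf;
import std.stdio : writefln;
import std.string : toStringz;

void main()
{
    char[] s = "hello".dup;

    // %.*s expects an int precision followed by a char pointer,
    // so the length and the pointer are passed separately.
    printf("%.*s\n", cast(int) s.length, s.ptr);

    // %s expects a zero-terminated C string; toStringz provides one.
    printf("%s\n", toStringz(s));

    // writefln understands D arrays directly, no terminator needed.
    writefln("%s", s);
}
```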

 My last question is which is the preferred method of making a copy of a string?
 Suppose I want str2 to be a copy of str, then:
 
 str2.length = str.length;
 str2[] = str;
 //      These two equivalent?
 str2 = str.dup;

Both forms give you an independent copy of the data. Generally, .dup should 
be at least as fast: it states outright that you want a copy, so there's no 
need to initialize the destination array on resizing, for example.
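
A small sketch comparing the two approaches, again assuming a current D2 compiler. The helper names copyBySlice and copyByDup are hypothetical and exist only for this example.

```d
import std.stdio;

// Resize first, then copy element by element.
char[] copyBySlice(const(char)[] src)
{
    char[] dst;
    dst.length = src.length;  // allocates and default-initializes each char
    dst[] = src[];            // element-wise copy; lengths must match
    return dst;
}

// Let the runtime allocate and copy in one step.
char[] copyByDup(const(char)[] src)
{
    return src.dup;
}

void main()
{
    char[] str = "foo".dup;
    char[] a = copyBySlice(str);
    char[] b = copyByDup(str);

    assert(a == b);                                   // same contents
    assert(a.ptr !is str.ptr && b.ptr !is str.ptr);   // separate memory

    a[0] = 'x';                                       // does not affect str or b
    writefln("%s %s %s", str, a, b);                  // prints "foo xoo foo"
}
```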

Hope that helped :)


xs0

Jun 19 2006

Kyle K <Kyle_member pathlink.com> writes:

In article <e76aq8$qsr$1 digitaldaemon.com>, xs0 says...
Well, you didn't touch the memory you didn't allocate :) If you had

char[] bob(in char[] str)
{
     str[0] = 'a';
     return str;
}

You'd get "aoo:aoo" as output (or a crash, as you can't write into 
constants on some platforms)

Ah ok, that makes sense. So using 'in' with arrays and aggregate types will
always still give you a reference? I assume with primitives the semantics remain
pass-by-value, such that foo(in int b) will never modify the caller's data?


Hope that helped :)

It did, thanks a lot! :D

Jun 19 2006

BCS <BCS pathlink.com> writes:

Kyle K wrote:
 In article <e76aq8$qsr$1 digitaldaemon.com>, xs0 says...
 
Well, you didn't touch the memory you didn't allocate :) If you had

char[] bob(in char[] str)
{
    str[0] = 'a';
    return str;
}

You'd get "aoo:aoo" as output (or a crash, as you can't write into 
constants on some platforms)

 
 
 Ah ok, that makes sense. So using 'in' with arrays and aggregate types will
 always still give you a reference? I assume with primitives the semantics remain
 pass-by-value, such that foo(in int b) will never modify the caller's data?
 
 

Actually "in" always gives you a copy of the actual "thing". Arrays are 
reference types so you get a copy of the reference. Same with objects, 
as they are also reference types. Structs, on the other hand, are not 
reference types and as such will get passed by value.


import std.stdio;

class fooC{int i;}
struct fooS{int i;}


void main()
{
	fooC c1= new fooC, c2;
	c1.i = 0;
	c2 = fn(c1);
	writef(c1.i, " ", c2.i, \n);	// prints "1 1"

	fooS s1, s2;
	s1.i = 0;
	s2 = fn(s1);
	writef(s1.i, " ", s2.i, \n);	// prints "0 1"

}

fooC fn(in fooC v)
{
	v.i=1;
	return v;
}

fooS fn(in fooS v)
{
	v.i=1;
	return v;
}

Jun 19 2006

Kyle K <Kyle_member pathlink.com> writes:

In article <e76jri$1ds7$1 digitaldaemon.com>, BCS says...

 Ah ok, that makes sense. So using 'in' with arrays and aggregate types will
 always still give you a reference? I assume with primitives the semantics remain
 pass-by-value, such that foo(in int b) will never modify the caller's data?
 
 

Actually "in" always gives you a copy of the actual "thing". Arrays are 
reference types so you get a copy of the reference. Same with objects, 
as they are also reference types. Structs, on the other hand, are not 
reference types and as such will get passed by value.


class fooC{int i;}
struct fooS{int i;}


void main()
{
	fooC c1= new fooC, c2;
	c1.i = 0;
	c2 = fn(c1);
	writef(c1.i, " ", c2.i, \n);	// prints "1 1"

	fooS s1, s2;
	s1.i = 0;
	s2 = fn(s1);
	writef(s1.i, " ", s2.i, \n);	// prints "0 1"

}

fooC fn(in fooC v)
{
	v.i=1;
	return v;
}

fooS fn(in fooS v)
{
	v.i=1;
	return v;
}

Got it, thanks a bunch. I knew it had to be something simple... :D

Jun 19 2006

