digitalmars.D - type of concatenated arrays

reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
In D 2, What should the return type be of concatenated const arrays?

For example, if I have a const(char)[] array a1, and I want to concatenate 
"\n", I could do:

auto a2 = a1 ~ "\n";

Now, a2 is declared as a const(char)[] array, but I think it should be just 
a char[] array.  Why?  because the concat operator should have made a copy 
of the array with length +1 and added the \n character.  If this is the 
case, I should be able to do whatever I want with the copy.  Why does it 
have to be const?

If I want a mutable copy I am forced to do:

auto a2 = (a1 ~ "\n").dup;

which makes a copy of a temporary copy, or if I want to be more efficient, I 
am forced to do something funky like:

char[] a2 = new char[a1.length + "\n".length];  // allocates and default-initializes
a2[0..a1.length] = a1[];                        // copy a1 into the front
a2[$ - 1] = '\n';                               // $ is the length of a2

But now, I think the new operation wastes time initializing the array before 
copying the slice.
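
A minimal sketch of one way to avoid both the extra copy and the default initialization, assuming Phobos's std.array.uninitializedArray is available. The helper name concatMutable is hypothetical and is not part of any library:

```d
import std.array : uninitializedArray;

/// Hypothetical helper: returns a fresh mutable array holding `head`
/// followed by `tail`, without default-initializing the new storage.
char[] concatMutable(const(char)[] head, const(char)[] tail)
{
    // Allocate head.length + tail.length chars; the contents are garbage
    // until the two slice copies below overwrite every element.
    auto result = uninitializedArray!(char[])(head.length + tail.length);
    result[0 .. head.length] = head[];  // copy the first operand
    result[head.length .. $] = tail[];  // copy the second operand
    return result;
}
```

With this, char[] a2 = concatMutable(a1, "\n"); yields a mutable array after a single allocation and one pass over each operand. Every element must be written before the array is read, which the two slice assignments guarantee here.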

thoughts?

-Steve 
Nov 07 2007
next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 11/7/07, Steven Schveighoffer <schveiguy yahoo.com> wrote:

 Now, a2 is declared as a const(char)[] array, but I think it should be just
 a char[] array.  Why?  because the concat operator should have made a copy
 of the array with length +1 and added the \n character.  If this is the
 case, I should be able to do whatever I want with the copy.  Why does it
 have to be const?

Because of copy-on-write. If b is the empty string, then (a ~ b) will evaluate to a. No copy need be made.
Nov 07 2007
parent reply BCS <BCS pathlink.com> writes:
Janice Caron wrote:
 On 11/7/07, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 
 
Now, a2 is declared as a const(char)[] array, but I think it should be just
a char[] array.  Why?  because the concat operator should have made a copy
of the array with length +1 and added the \n character.  If this is the
case, I should be able to do whatever I want with the copy.  Why does it
have to be const?

Because of copy-on-write. If b is the empty string, then (a ~ b) will evaluate to a. No copy need be made.

From the spec: "Concatenation always creates a copy of its operands, even if one of the operands is a 0 length array"

http://www.digitalmars.com/d/arrays.html (grep for "~" and look down about 8 lines)
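
A small sketch of how the quoted rule could be checked, assuming the implementation follows the spec wording. The function name concatCopies is hypothetical:

```d
/// Hypothetical check: reports whether `a ~ b` produced storage
/// that is distinct from the storage of `a`.
bool concatCopies(const(char)[] a, const(char)[] b)
{
    auto c = a ~ b;          // per the spec, this should always allocate a copy
    return c.ptr !is a.ptr;  // different pointers mean a new array was made
}
```

For a non-empty a, concatCopies(a, "") is expected to return true if the quoted sentence holds, which is what rules out the copy-on-write explanation.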
Nov 07 2007
parent reply Reiner Pope <some address.com> writes:
Janice Caron wrote:
 Whoops!
 
 Thanks for the correction. My apologies.
 
 Well, one other possible explanation I can think of is that, if
 concatenation were to produce a mutable result then (for example)
 
     tolower(a~b)
 
 wouldn't compile, because the argument to tolower() needs to be
 invariant. That's kind of a lame argument though.

This problem keeps occurring because we have no way within the language to express the property "this array is unique", which would mean it can be used as invariant or mutable as you desire. A number of people have suggested a unique type qualifier at different times, although I'm not sure anyone has defined semantics for unique types that guarantee they really are unique.

-- Reiner
Nov 07 2007
parent "David B. Held" <dheld codelogicconsulting.com> writes:
Reiner Pope wrote:
 [...]
 This problem keeps occurring, because we have no way within the language 
 to express the property, "this array is unique" -- which would mean it 
 can be used as invariant or mutable as you desire. A number of people 
 have suggested a unique type qualifier at different times, although I'm 
 not sure if anyone has defined a semantics for unique types, so that 
 they are guaranteed to be unique.

There are some rather thorough papers on 'unique', and it's been discussed by Walter's posse, but the general conclusion is that it infects the type system in a way that is rather intrusive for the amount of benefit gained. I happen to think it's an elegant concept, but I have to admit that it might be too elegant to be useful.

One problem with unique is that it's not really a closed system. It's highly entropic. What I mean by that is that there are lots of ways to go from a unique reference to a non-unique one, but no sound way to do the reverse (technically, collected memory could be soundly declared as 'unique', but manually allocated memory is problematic at best).

Dave
Nov 09 2007
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
Whoops!

Thanks for the correction. My apologies.

Well, one other possible explanation I can think of is that, if
concatenation were to produce a mutable result then (for example)

    tolower(a~b)

wouldn't compile, because the argument to tolower() needs to be
invariant. That's kind of a lame argument though.
Nov 07 2007
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Janice Caron" wrote
 Whoops!

 Thanks for the correction. My apologies.

 Well, one other possible explanation I can think of is that, if
 concatenation were to produce a mutable result then (for example)

    tolower(a~b)

 wouldn't compile, because the argument to tolower() needs to be
 invariant. That's kind of a lame argument though.

I don't really agree with tolower requiring an invariant argument, but it's not my problem because I don't use Phobos. I'm building a D 2.0 compatible version of Tango.

Second, why should library functions dictate how a builtin language feature works? It's supposed to be the other way around.

Third, my example did not involve two invariant arrays: one was const, the other was invariant. In fact, I believe the current result is a const array (not sure, though).

In any case, if there were another way, even an ugly one, I'd use it, but I can't see how to build a non-const array without either copying the array twice (dup) or initializing all the memory before I can copy into it (new char[x]).

-Steve
Nov 07 2007