digitalmars.D - Array concatenation, missing functionality

reply MIcroWizard <MIcroWizard_member pathlink.com> writes:
It would be nice if array concatenation could join not only two arrays
but also an array and a single element of the array's element type.

I mean:
------------------
char a;
char[] b;

a='X';
b="YZ";

writefln(a~b);  // <-- this is illegal now
------------------
XYZ
------------------
(For types more complex than "char" it can be very-very useful also.)

Is there any theoretical objection against this functionality or
it is only not implemented yet?

Tamas Nagy
May 09 2005
next sibling parent "Charlie" <charles jwavro.com> writes:
 Is there any theoretical objection against this functionality or
 it is only not implemented yet?

I'm guessing it's not implemented yet. opCatAssign already lets you do this with single elements:

char[] x = "xy";
x ~= 'z';
May 09 2005
prev sibling next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 Is there any theoretical objection against this functionality or
 it is only not implemented yet?

My best guess: an operation like

  char[] ~ char

is very inefficient, because it forces the creation of a brand-new array each time, which is pretty expensive for a single-element operation.

Consider using format() for chars. For other cases, buffered ~= is better.
May 09 2005
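The difference Andrew describes between ~ and ~= can be sketched as follows. This is a minimal illustration that assumes a current D2 compiler, where element ~ array is accepted; it was not legal in the compiler discussed in this thread.

```d
import std.stdio : writeln;

void main()
{
    char a = 'X';
    char[] b = "YZ".dup;      // mutable copy of the literal

    // Element ~ array: builds a brand-new array holding 'X', 'Y', 'Z'.
    char[] c = a ~ b;
    writeln(c);               // XYZ
    assert(c.ptr !is b.ptr);  // c has its own storage, b is untouched

    // Array ~= element: appends to the existing array, so the runtime
    // may reuse spare room behind the data instead of copying everything.
    char[] x = "xy".dup;
    x ~= 'z';
    writeln(x);               // xyz
}
```

Inside a loop, the first form allocates and copies on every iteration, while the second only reallocates when the block runs out of room.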
parent reply Sean Kelly <sean f4.ca> writes:
In article <d5ogih$1dgr$1 digitaldaemon.com>, Andrew Fedoniouk says...
 Is there any theoretical objection against this functionality or
 it is only not implemented yet?

My best guess: an operation like char[] ~ char is very inefficient, because it forces the creation of a brand-new array each time, which is pretty expensive for a single-element operation. Consider using format() for chars. For other cases, buffered ~= is better.

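It would be nice if arrays had a .capacity property, though this would increase their .sizeof. The simple alternative is something like this:

# char[] a;
# size_t pos=a.length;
# // getchar() returns an int, so c must be an int for the EOF test to work
# for(int c=getchar();c!=EOF;c=getchar()){
#     // start at 16 elements: doubling a length of 0 would never grow
#     if(pos==a.length)a.length=a.length?a.length*2:16;
#     a[pos++]=cast(char)c;
# }
# a.length=pos;

Sean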
May 09 2005
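For comparison with the manual bookkeeping in Sean's loop, here is a minimal sketch that assumes a current D2 compiler, where built-in arrays provide a capacity property and a reserve() function. Neither was available in the compiler discussed in this thread, and collect is a hypothetical helper used only for illustration.

```d
import std.stdio : writeln;

// Appends n characters to a fresh array after reserving room for them.
char[] collect(size_t n)
{
    char[] a;
    a.reserve(n);               // ask the runtime for room up front
    foreach (i; 0 .. n)
        a ~= cast(char)('a' + i % 26);
    return a;
}

void main()
{
    auto a = collect(1000);
    writeln(a.length);          // 1000
    writeln(a.capacity >= a.length);
}
```

With the room reserved up front, the appends in the loop should not need to move the data, which is the effect the doubling loop above tries to approximate by hand.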
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 It would be nice if arrays had a .capacity property, though this would increase their .sizeof. The simple alternative is something like this:

As far as I remember, there was a message from Walter saying that in-place array concatenation is implemented internally like this:

opCatAssign(....)
{
  ...
  if (new array length > arr._capacity)
    arr._capacity = arr._capacity * 2; // or something similar, reallocating the array buffer
}

So arr ~= something is a "buffered" operation by default.

Andrew.
May 09 2005
parent reply Sean Kelly <sean f4.ca> writes:
In article <d5os8h$11g$1 digitaldaemon.com>, Andrew Fedoniouk says...
 It would be nice if arrays had a .capacity property, though this would increase their .sizeof. The simple alternative is something like this:

As far as I remember, there was a message from Walter saying that in-place array concatenation is implemented internally like this:

opCatAssign(....)
{
  ...
  if (new array length > arr._capacity)
    arr._capacity = arr._capacity * 2; // or something similar, reallocating the array buffer
}

So arr ~= something is a "buffered" operation by default.

I thought so too, but the function _d_arraycat in internal/arraycat.d doesn't seem to be allocating extra space. Though I might just be looking at the wrong function...

Sean
May 09 2005
parent Burton Radons <burton-radons smocky.com> writes:
Sean Kelly wrote:

 In article <d5os8h$11g$1 digitaldaemon.com>, Andrew Fedoniouk says...
 
It would be nice if arrays had a .capacity property, though this would increase their .sizeof. The simple alternative is something like this:

As far as I remember, there was a message from Walter saying that in-place array concatenation is implemented internally like this:

opCatAssign(....)
{
  ...
  if (new array length > arr._capacity)
    arr._capacity = arr._capacity * 2; // or something similar, reallocating the array buffer
}

So arr ~= something is a "buffered" operation by default.

I thought so too, but the function _d_arraycat in internal/arraycat.d doesn't seem to be allocating extra space. Though I might just be looking at the wrong function...

Yup, that's for "a ~ b". "a ~= b" is in (internal/gc/gc.d) under (_d_arraysetlength).
May 09 2005
prev sibling next sibling parent "Uwe Salomon" <post uwesalomon.de> writes:
 My best guess: an operation like
   char[] ~ char
 is very inefficient, because it forces the creation of a brand-new array each time, which is pretty expensive for a single-element operation.

 Consider using format() for chars. For other cases, buffered ~= is better.

It would be nice if arrays had a .capacity property, though this would increase their .sizeof. The simple alternative is something like this:

All this and more is implemented in my "Vector" struct. It is very fast, and you don't have to care about allocations and the like. Of course, you can always "extract" the D array from the vector: for example, you could fill the vector and then carry on working with the plain D array once the overhead is no longer needed (there isn't much, though: just the 4 bytes for the capacity and a lot of fast convenience functions you don't need to call).

If you like, look at vector.d in:
http://www.uwesalomon.de/code/indigo/indigo.tar.gz

The docs (currently home-brewed, more professional in a few days) are at
http://www.uwesalomon.de/code/indigo/

Ciao
uwe
May 10 2005
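The general idea of such a wrapper (a backing array, a separate element count, and a way to get the plain D array back out) can be sketched like this. It is a hypothetical minimal version written for illustration, assuming a current D2 compiler; it is not the code from the Indigo library, and the names Vector, append, capacity and array are assumptions.

```d
struct Vector(T)
{
    private T[] store;      // backing D array; store.length is the capacity
    private size_t used;    // number of elements actually in use

    // Append one element, doubling the backing array when it is full.
    void append(T value)
    {
        if (used == store.length)
            store.length = store.length ? store.length * 2 : 16;
        store[used++] = value;
    }

    size_t length() const { return used; }
    size_t capacity() const { return store.length; }

    // "Extract" the plain D array: a slice over the used part, no copy.
    T[] array() { return store[0 .. used]; }
}

void main()
{
    Vector!int v;
    foreach (i; 0 .. 100)
        v.append(i);
    int[] plain = v.array();
    assert(plain.length == 100 && plain[99] == 99);
    assert(v.capacity == 128);  // 16 -> 32 -> 64 -> 128
}
```

The only per-vector overhead in this sketch is the extra size_t that tracks the used length, which mirrors the small overhead mentioned in the post above.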
prev sibling parent reply "B.G." <gbatyan gmx.net> writes:
Sean Kelly wrote:
 In article <d5ogih$1dgr$1 digitaldaemon.com>, Andrew Fedoniouk says...
 
Is there any theoretical objection against this functionality or
it is only not implemented yet?

My best guess: an operation like char[] ~ char is very inefficient, because it forces the creation of a brand-new array each time, which is pretty expensive for a single-element operation. Consider using format() for chars. For other cases, buffered ~= is better.

It would be nice if arrays had a .capacity property, though this would increase their .sizeof. The simple alternative is something like this:

Agreed! Almost every moderately complex application uses resizable arrays. If an array has a capacity property anyway, I think it's absolutely natural to make it public and thereby grant finer-grained control over the capacity behaviour (for instance, length *= 2 is not always the optimal solution).

Btw, what's the current behaviour if I set the length of an array to a smaller value? Does it cause reallocation?
Does .dup return a copy where length == capacity?
 
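 # char[] a;
 # size_t pos=a.length;
 # // getchar() returns an int, so c must be an int for the EOF test to work
 # for(int c=getchar();c!=EOF;c=getchar()){
 #     // start at 16 elements: doubling a length of 0 would never grow
 #     if(pos==a.length)a.length=a.length?a.length*2:16;
 #     a[pos++]=cast(char)c;
 # }
 # a.length=pos;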
 

May 10 2005
parent reply Sean Kelly <sean f4.ca> writes:
In article <d5rpf7$28pr$1 digitaldaemon.com>, B.G. says...
Btw, what's the current behaviour, if I set a length of an array to a 
smaller value? Does it cause reallocation?

I'm pretty sure it does not. So you can kind of fake the idea of a capacity property by setting length to something large and then reducing it again.

 Does .dup return a copy where length == capacity?

I believe so.

Sean
May 12 2005
parent "Uwe Salomon" <post uwesalomon.de> writes:
 Btw, what's the current behaviour, if I set a length of an array to a
 smaller value? Does it cause reallocation?

I'm pretty sure it does not. So you can kind of fake the idea of a capacity property by setting length to something large and then reducing it again.

No, that isn't a very good solution, because it is very slow.

 Does .dup return a copy where length == capacity?

I believe so.

Not in every case. I made some tests under Linux, and the GC seems to round up the size of the array to a multiple of 16 bytes. If you declare this:

  uint[] array;
  array.length = 5;

then you can increase the length up to 8 without reallocation: 5 uints take 20 bytes, which is rounded up to 32 bytes, enough for 8 uints. Of course this is not enough "preallocation" if you deal with a noteworthy number of elements. Even if you increase the length and set it back (MinTL does that with the reserve() function, I think), this variant is much slower than it could be. For comparison:

# array.length = 50000;
# for (int i = 0; i < 50000; ++i)
#   array[i] = i;

Takes 2.5 milliseconds.

# array.length = 50000;
# array.length = 1;
# for (int i = 0; i < 50000; ++i)
# {
#   array.length = array.length + 1;
#   array[i] = i;
# }

Takes 6 milliseconds.

# array.length = 16;
# for (int i = 0; i < 50000; ++i)
# {
#   if (i >= array.length)
#     array.length = array.length * 2;
#
#   array[i] = i;
# }
# array.length = 50000;

Takes 5 milliseconds. As you can see, a combination of the first and the third approach will easily outperform the second.

Ciao
uwe
May 12 2005
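The three fill strategies timed above can be packaged into a small self-contained benchmark. This is a sketch that assumes a current D2 compiler with Phobos (std.datetime.stopwatch); the function names are hypothetical, the second variant grows from length 1 without the initial length = 50000 step, and the timings it prints depend on the machine and runtime, so they are not expected to match the figures quoted in the post.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

enum uint N = 50_000;

// 1. Set the final length once, then fill.
uint[] fillPreallocated()
{
    uint[] array;
    array.length = N;
    foreach (uint i; 0 .. N)
        array[i] = i;
    return array;
}

// 2. Grow the length by one element whenever the array is full.
uint[] fillIncremental()
{
    uint[] array;
    array.length = 1;
    foreach (uint i; 0 .. N)
    {
        if (i >= array.length)
            array.length = array.length + 1;
        array[i] = i;
    }
    return array;
}

// 3. Double the length whenever the array is full, then trim.
uint[] fillDoubling()
{
    uint[] array;
    array.length = 16;
    foreach (uint i; 0 .. N)
    {
        if (i >= array.length)
            array.length = array.length * 2;
        array[i] = i;
    }
    array.length = N;
    return array;
}

void main()
{
    // Run each variant 100 times and report the average per run.
    auto times = benchmark!(fillPreallocated, fillIncremental, fillDoubling)(100);
    foreach (i, t; times)
        writefln("variant %s: %s us per run", i + 1, t.total!"usecs" / 100);
}
```

All three functions return the same contents, so any difference in the reported times comes from how often the array has to be resized.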
prev sibling parent reply "Walter" <newshound digitalmars.com> writes:
"MIcroWizard" <MIcroWizard_member pathlink.com> wrote in message
news:d5odll$1b1a$1 digitaldaemon.com...
 Is there any theoretical objection against this functionality or
 it is only not implemented yet?

It's just not implemented yet. There's no technical problem with it.
May 12 2005
parent MicroWizard <MicroWizard_member pathlink.com> writes:
Your answer makes me really happy :-)))

Tamas

May 12 2005