www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Array Appenders

reply dsimcha <dsimcha yahoo.com> writes:
Ever since D emerged from the primordial soup back in the Paleozoic Era,
there's been discussion around here about array appending and the need/want
for a capacity field.  To solve this problem, Andrei's pre-release Phobos
includes an array appender struct (see
http://www.erdani.dreamhosters.com/d/web/phobos/std_array.html).  While this
gets the job done, IMHO it feels kind of hackish and undermining of builtin
arrays, especially now that D supports alias this.

When alias this came out (yes, this was just last week but it feels like
forever ago with the pace of evolution lately), I started playing around with
the idea that an ArrayAppender!(T) should consist of an extremely thin wrapper
around a T[].  It should implicitly convert to T[] and be usable exactly like
a T[], except that it should have a capacity field and its ~= operator should
be overridden to use the capacity field for fast appends.  IMHO this is a much
more elegant solution than having an ArrayAppender that is a completely
different animal than builtin arrays and doesn't allow access to the builtin
array's operators without an explicit conversion.

Right now, my prototype of the implicitly-converting ArrayAppender doesn't
work well due to bugs 2777, 2778, and 2781, but if there's sufficient
interest, I'd be willing to release the prototype anyhow for comment after
it's cleaned up a little.  What do others think?  Should arrays w/ capacity
fields blend in with and implicitly convert to builtin arrays, or should there
be a high wall distinguishing the two?
Apr 06 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 Ever since D emerged from the primordial soup back in the Paleozoic Era,
 there's been discussion around here about array appending and the need/want
 for a capacity field.  To solve this problem, Andrei's pre-release Phobos
 includes an array appender struct (see
 http://www.erdani.dreamhosters.com/d/web/phobos/std_array.html).  While this
 gets the job done, IMHO it feels kind of hackish and undermining of builtin
 arrays, especially now that D supports alias this.
 
 When alias this came out (yes, this was just last week but it feels like
 forever ago with the pace of evolution lately), I started playing around with
 the idea that an ArrayAppender!(T) should consist of an extremely thin wrapper
 around a T[].  It should implicitly convert to T[] and be usable exactly like
 a T[], except that it should have a capacity field and its ~= operator should
 be overridden to use the capacity field for fast appends.  IMHO this is a much
 more elegant solution than having an ArrayAppender that is a completely
 different animal than builtin arrays and doesn't allow access to the builtin
 array's operators without an explicit conversion.
 
 Right now, my prototype of the implicitly-converting ArrayAppender doesn't
 work well due to bugs 2777, 2778, and 2781, but if there's sufficient
 interest, I'd be willing to release the prototype anyhow for comment after
 it's cleaned up a little.  What do others think?  Should arrays w/ capacity
 fields blend in with and implicitly convert to builtin arrays, or should there
 be a high wall distinguishing the two?

ArrayAppender is different from an array because it's supposed to disconnected from the array notion. It should only be an output range, and as such supports the put() method. Other output streams will mushroom for files, lists, sockets... so ArrayAppender is uniform with those. It's not supposed to be uniform with arrays. Andrei
Apr 06 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
dsimcha:
 When alias this came out (yes, this was just last week but it feels like
 forever ago with the pace of evolution lately), I started playing around with
 the idea that an ArrayAppender!(T) should consist of an extremely thin wrapper
 around a T[].  It should implicitly convert to T[] and be usable exactly like
 a T[], except that it should have a capacity field and its ~= operator should
 be overridden to use the capacity field for fast appends.

I mostly use D1, but this is a very nice idea. I may try to adapt your idea to modify my ArrayAppender for D2. Bye, bearophile
Apr 07 2009
parent bearophile <bearophileHUGS lycos.com> writes:
dsimcha:
IMHO this is a much more elegant solution than having an ArrayAppender that is
a completely different animal than builtin arrays and doesn't allow access to
the builtin array's operators without an explicit conversion.<

You also may need to disable array slicing on that. So maybe it's not a too much good idea... Bye, bearophile
Apr 07 2009
prev sibling next sibling parent Kagamin <spam here.lot> writes:
dsimcha Wrote:

 it's cleaned up a little.  What do others think?  Should arrays w/ capacity
 fields blend in with and implicitly convert to builtin arrays, or should there
 be a high wall distinguishing the two?

I think, there will be no problems, when sentinels get implemented, so that gc_query will return capacity and _d_arrayappend could work correctly.
Apr 07 2009
prev sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 07 Apr 2009 07:08:22 +0400, dsimcha <dsimcha yahoo.com> wrote:

 Ever since D emerged from the primordial soup back in the Paleozoic Era,
 there's been discussion around here about array appending and the  
 need/want
 for a capacity field.  To solve this problem, Andrei's pre-release Phobos
 includes an array appender struct (see
 http://www.erdani.dreamhosters.com/d/web/phobos/std_array.html).  While  
 this
 gets the job done, IMHO it feels kind of hackish and undermining of  
 builtin
 arrays, especially now that D supports alias this.

 When alias this came out (yes, this was just last week but it feels like
 forever ago with the pace of evolution lately), I started playing around  
 with
 the idea that an ArrayAppender!(T) should consist of an extremely thin  
 wrapper
 around a T[].  It should implicitly convert to T[] and be usable exactly  
 like
 a T[], except that it should have a capacity field and its ~= operator  
 should
 be overridden to use the capacity field for fast appends.  IMHO this is  
 a much
 more elegant solution than having an ArrayAppender that is a completely
 different animal than builtin arrays and doesn't allow access to the  
 builtin
 array's operators without an explicit conversion.

 Right now, my prototype of the implicitly-converting ArrayAppender  
 doesn't
 work well due to bugs 2777, 2778, and 2781, but if there's sufficient
 interest, I'd be willing to release the prototype anyhow for comment  
 after
 it's cleaned up a little.  What do others think?  Should arrays w/  
 capacity
 fields blend in with and implicitly convert to builtin arrays, or should  
 there
 be a high wall distinguishing the two?

Well, actually I think that having an Appender object is an overkill. I never use, although I wrote a few implementations. Instead, I found the following method to be extemely handy, very fast and cover all my cases: void append(T)(T[] array, ref size_t index, T value) { assert(array.length >= index); if (array.length == index) { array.length = array.length * 2; } array[index++] = value; } It fits very well with the following D pattern: whenever you want to avoid allocation, you provide an external buffer to write. Buffer may be too small to store the result, in which case it grows using heap. Here is an example: // converts an array from one type to another template toArray(Dst) { Dst[] toArray(Src)(Src[] items, Dst[] buffer = null) { size_t index = 0; foreach (i; items) { buffer.append(index, to!(Dst)(i)); } return buffer[0..index]; } } string[] s = ["123", "345", "567"]; int[] i = toArray!(int)(s); // no buffer provided int[3] tmp; int[] ii = toArray!(int)(s, tmp[]); // same but with a buffer (Not tested)
Apr 07 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 Well, actually I think that having an Appender object is an overkill. I 
 never use, although I wrote a few implementations. Instead, I found the 
 following method to be extemely handy, very fast and cover all my cases:
 
 void append(T)(T[] array, ref size_t index, T value)
 {
    assert(array.length >= index);
    if (array.length == index) {
        array.length = array.length * 2;
    }
     
    array[index++] = value;
 }

I'm pretty sure you meant to pass array by reference. Andrei
Apr 07 2009
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 Well, actually I think that having an Appender object is an overkill. 
 I never use, although I wrote a few implementations. Instead, I found 
 the following method to be extemely handy, very fast and cover all my 
 cases:

 void append(T)(T[] array, ref size_t index, T value)
 {
    assert(array.length >= index);
    if (array.length == index) {
        array.length = array.length * 2;
    }
        array[index++] = value;
 }

I'm pretty sure you meant to pass array by reference.

It also breaks when the array is empty.
Apr 08 2009
parent "Nick Sabalausky" <a a.a> writes:
"Denis Koroskin" <2korden gmail.com> wrote in message 
news:op.ur2f3rglo7cclz korden-pc...
 On Wed, 08 Apr 2009 15:04:29 +0400, Frits van Bommel 
 <fvbommel remwovexcapss.nl> wrote:

 Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 Well, actually I think that having an Appender object is an overkill. 
 I never use, although I wrote a few implementations. Instead, I found 
 the following method to be extemely handy, very fast and cover all my 
 cases:

 void append(T)(T[] array, ref size_t index, T value)
 {
    assert(array.length >= index);
    if (array.length == index) {
        array.length = array.length * 2;
    }
        array[index++] = value;
 }


It also breaks when the array is empty.

Yeah, I was writing from memory and could (and did!) introduce bugs. My intend was to show an easy way of appending to array without use of a special Appender struct. I use it /very/ often and believe it belongs to std.array.

A lot of code doesn't expect extra "unofficial" elements at the end of an array. Any such code would break if that append had been used on the array it's operating on. In my experience, such code occurs often enough to make that a dangerous strategy. You could be careful and make sure all your code is designed in a way that accomodates such extra "allocated-but-unused" elements, but for all the bother that would be, you might as well just use a real .length/.capacity solution.
Apr 10 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 07 Apr 2009 17:24:07 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 Denis Koroskin wrote:
 Well, actually I think that having an Appender object is an overkill. I  
 never use, although I wrote a few implementations. Instead, I found the  
 following method to be extemely handy, very fast and cover all my cases:
  void append(T)(T[] array, ref size_t index, T value)
 {
    assert(array.length >= index);
    if (array.length == index) {
        array.length = array.length * 2;
    }
        array[index++] = value;
 }

I'm pretty sure you meant to pass array by reference. Andrei

Yes, of course!
Apr 07 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 08 Apr 2009 15:04:29 +0400, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:

 Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 Well, actually I think that having an Appender object is an overkill.  
 I never use, although I wrote a few implementations. Instead, I found  
 the following method to be extemely handy, very fast and cover all my  
 cases:

 void append(T)(T[] array, ref size_t index, T value)
 {
    assert(array.length >= index);
    if (array.length == index) {
        array.length = array.length * 2;
    }
        array[index++] = value;
 }


It also breaks when the array is empty.

Yeah, I was writing from memory and could (and did!) introduce bugs. My intend was to show an easy way of appending to array without use of a special Appender struct. I use it /very/ often and believe it belongs to std.array.
Apr 08 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 10 Apr 2009 11:00:01 +0400, Nick Sabalausky <a a.a> wrote:

 "Denis Koroskin" <2korden gmail.com> wrote in message
 news:op.ur2f3rglo7cclz korden-pc...
 On Wed, 08 Apr 2009 15:04:29 +0400, Frits van Bommel
 <fvbommel remwovexcapss.nl> wrote:

 Andrei Alexandrescu wrote:
 Denis Koroskin wrote:
 Well, actually I think that having an Appender object is an overkill.
 I never use, although I wrote a few implementations. Instead, I found
 the following method to be extemely handy, very fast and cover all my
 cases:

 void append(T)(T[] array, ref size_t index, T value)
 {
    assert(array.length >= index);
    if (array.length == index) {
        array.length = array.length * 2;
    }
        array[index++] = value;
 }


It also breaks when the array is empty.

Yeah, I was writing from memory and could (and did!) introduce bugs. My intend was to show an easy way of appending to array without use of a special Appender struct. I use it /very/ often and believe it belongs to std.array.

A lot of code doesn't expect extra "unofficial" elements at the end of an array. Any such code would break if that append had been used on the array it's operating on. In my experience, such code occurs often enough to make that a dangerous strategy. You could be careful and make sure all your code is designed in a way that accomodates such extra "allocated-but-unused" elements, but for all the bother that would be, you might as well just use a real .length/.capacity solution.

Real .length and .capacity solution would be the best solution, but since slices currently lack one of these properties, I carry length outside and re-use buffer.length as capacity. In most cases, I use this pattern for appending to a temporary buffer (which is designed to be re-used, even if it stores something). And if it stores values, then it's either 1) full and you just set "index = buffer.length;" at the beginning, or 2) size is provided ("index = initialSize;"). Never experienced any problems with it.
Apr 10 2009