www.digitalmars.com         C & C++   DMDScript  

D - Array handling

reply Burton Radons <loth users.sourceforge.net> writes:
Here's what I think we need in arrays.

- Deleting append operator.  Where ~ leaves the previous copy around if 
it re-allocates, this would delete the previous copy.  Optionally, make 
~ delete the previous copy if it re-allocates itself.  I actually prefer 
this latter option, as the current mixin from the array object method 
doesn't fit very nicely.  This would allow conservative-use functions 
like fmt to be cleanly implemented.

- Default arguments for slicing.

- array.insert (index, item or array); allocate space for a new cell, 
move everything index and above up, and insert into index.

- item in array.  Same as with aarray.

- array.find (item or array).  Returns the index of the first equivalent 
item/subarray or -1 if it is not in the array.

- array.rfind (item or array).  Same as above, only searching from the end.

- array.set (array2).  Copy the second array into this one; similar to 
"array [] = array2;" but handles overlapping.  Optionally, inverse the 
logic; this method doesn't allow overlapping, but [] does.  I prefer the 
latter, as it's not that expensive or common, just an optimisation dead 
end.  But in any case not having a workaround now results in ugly code.

- array.sort (int delegate (inout type a, inout type b) sorter).  Obvious.

- array.remove (index).  Move everything in the array down a cell to 
cover up the index and then slice the array.  Implementations should 
retain ownership if they can.  Actually, that could be covered by the 
strong recommendation that the following not remove ownership:

     a = a [ .. y];

- array.remove (start, end).  Remove a slice of data from the array.

And other ideas, that I haven't had much need for but are out there:

- array.at (index): array [index < 0 ? index + array.length : index].

- array.atWrap (index): array [((index % array.length) + array.length) % 
array.length].

- array.atSaturate (index): array [index < 0 ? 0 : index >= array.length 
? array.length - 1 : index].

- array.atBorder (index, type outside): index >= 0 && index < 
array.length ? array [index] : outside.

- array.atReflect (index):

     while (1)
     {
         if (index < 0)
             index = -index - 1;
         else if (index >= array.length)
             index = (array.length - 2) - (index - array.length);
         else
             break;
     }

This is the full image processing set.  Not really useful to me outside 
of the first one - this code tends to be too specialised to exploit 
inspecific features - and even the "at" is a little eh-ish for me.  When 
I need that I just put a delta in a local.

- array.map (void delegate (inout type a) mapper).  Call mapper on each 
item of the array, in-place, and return the array.  This is a powerful 
set of functionality in Python that I've personally never exploited.

- array.map (void delegate (inout type a, inout type b) mapper, array). 
  Both arrays must be the same length.  Apply mapper to each item of the 
array, doing them in parallel, and return the array.

- array.reduce (void delegate (inout type a, inout type b, out type c) 
function).  Apply function to the items in the array, from left to 
right.  For example (using a couple non-existent features):

     ((int []) [1, 2, 3, 4, 5]).reduce ({ c = a + b; });

Calculates ((((1+2)+3)+4)+5).  I've never had any use for this one in 
Python either; however:

     array.reduce ({ c = a > b ? a : b; }) == array.max;
     array.reduce ({ c = a > b ? b : a; }) == array.min;

- array.endswith (item or array), array.startswith (item or array). 
Return whether the array ends or starts with this item or subarray. 
"foo.exe".endswith (".exe") is true.

- array.replace (old, new).  Replace every item or subarray in array 
that equals old with new.  For example, "acklavick".replace ("ck", 
"sta") returns "astalavista".

- array.min, array.max.  Compare every cell in the array to each other 
and return the minimum, maximum.

- array.repeat (int).  This returns an array that is this array appended 
int times, so "x".repeat (8) is "xxxxxxxx".  It's extremely convenient 
in Python (using "x" * 8) but in a very limited number of places.  Best 
I can say for it is that it's a small function in Phobos.

- array.count (item or array).  Return the number of times the 
item/subarray occurs in the array.  "dododododo".count ("dodo") would 
return 2 and not 4, methinks.  I've never used this in Python.

- array.zip (array...).  Create an array that consecutively attaches 
arrays together.  "abcd".zip ("efgh") returns "aebfgcdh".  Another one I 
haven't used from Python.

- generic apply (generic function, generic [] args...).  Create the 
proper function call on the stack and calls the function, returning the 
allocated return value.  Handles all supported calling conventions, 
functions and delegates both.  This would be very nice for scripting (it 
would eliminate the need for SWIG wrappers if the scripting language is 
built for D), but I don't know why it's here.
Oct 20 2002
next sibling parent reply "Sandor Hojtsy" <hojtsy index.hu> writes:
"Burton Radons" <loth users.sourceforge.net> wrote in message
news:aouvq2$227k$1 digitaldaemon.com...
 Here's what I think we need in arrays.

Hmm. Quite a fine collection of features.
 - Deleting append operator.  Where ~ leaves the previous copy around if
 it re-allocates, this would delete the previous copy.  Optionally, make
 ~ delete the previous copy if it re-allocates itself.  I actually prefer
 this latter option, as the current mixin from the array object method
 doesn't fit very nicely.  This would allow conservative-use functions
 like fmt to be cleanly implemented.

What about slices pointing inside the previous copy? I feel that slices are blocking the way of some particular optimizations.
 - Default arguments for slicing.

 - array.insert (index, item or array); allocate space for a new cell,
 move everything index and above up, and insert into index.

This may need reallocation. What about slices pointing inside the previous copy?
 - item in array.  Same as with aarray.

 - array.find (item or array).  Returns the index of the first equivalent
 item/subarray or -1 if it is not in the array.

 - array.rfind (item or array).  Same as above, only searching from the

Compiler magic. Conceptually these are template functions/classes, which cannot be reproduced in user/library code, because there are no such thing as template functions/classes.
 - array.set (array2).  Copy the second array into this one; similar to
 "array [] = array2;" but handles overlapping.  Optionally, inverse the
 logic; this method doesn't allow overlapping, but [] does.  I prefer the
 latter, as it's not that expensive or common, just an optimisation dead
 end.  But in any case not having a workaround now results in ugly code.

 - array.sort (int delegate (inout type a, inout type b) sorter).  Obvious.

Why are the delegate parameters "inout"? Do they need to be changed? Does the delegate swap them if necessary? If not, the "inout" keyword is used to enable the sometimes faster/sometimes slower pass-by-reference, and poses a danger of the delegate chaging the values.
 - array.remove (index).  Move everything in the array down a cell to
 cover up the index and then slice the array.  Implementations should
 retain ownership if they can.

Should do what?
  Actually, that could be covered by the
 strong recommendation that the following not remove ownership:

      a = a [ .. y];

 - array.remove (start, end).  Remove a slice of data from the array.

 And other ideas, that I haven't had much need for but are out there:

 - array.at (index): array [index < 0 ? index + array.length : index].

 - array.atWrap (index): array [((index % array.length) + array.length) %
 array.length].

 - array.atSaturate (index): array [index < 0 ? 0 : index >= array.length
 ? array.length - 1 : index].

 - array.atBorder (index, type outside): index >= 0 && index <
 array.length ? array [index] : outside.

 - array.atReflect (index):

      while (1)
      {
          if (index < 0)
              index = -index - 1;
          else if (index >= array.length)
              index = (array.length - 2) - (index - array.length);
          else
              break;
      }

IMHO we can have all. Why not.
 This is the full image processing set.  Not really useful to me outside
 of the first one - this code tends to be too specialised to exploit
 inspecific features - and even the "at" is a little eh-ish for me.  When
 I need that I just put a delta in a local.

 - array.map (void delegate (inout type a) mapper).  Call mapper on each
 item of the array, in-place, and return the array.  This is a powerful
 set of functionality in Python that I've personally never exploited.

 - array.map (void delegate (inout type a, inout type b) mapper, array).
   Both arrays must be the same length.  Apply mapper to each item of the
 array, doing them in parallel, and return the array.

 - array.reduce (void delegate (inout type a, inout type b, out type c)
 function).  Apply function to the items in the array, from left to
 right.  For example (using a couple non-existent features):

      ((int []) [1, 2, 3, 4, 5]).reduce ({ c = a + b; });

I like those non-existent features. Especially creating arrays on-the-fly.
 Calculates ((((1+2)+3)+4)+5).  I've never had any use for this one in
 Python either; however:

      array.reduce ({ c = a > b ? a : b; }) == array.max;
      array.reduce ({ c = a > b ? b : a; }) == array.min;

 - array.endswith (item or array), array.startswith (item or array).
 Return whether the array ends or starts with this item or subarray.
 "foo.exe".endswith (".exe") is true.

 - array.replace (old, new).  Replace every item or subarray in array
 that equals old with new.  For example, "acklavick".replace ("ck",
 "sta") returns "astalavista".

What does "apple".replace("a", "aa"); return?
 - array.min, array.max.  Compare every cell in the array to each other
 and return the minimum, maximum.

 - array.repeat (int).  This returns an array that is this array appended
 int times, so "x".repeat (8) is "xxxxxxxx".  It's extremely convenient
 in Python (using "x" * 8) but in a very limited number of places.  Best
 I can say for it is that it's a small function in Phobos.

 - array.count (item or array).  Return the number of times the
 item/subarray occurs in the array.  "dododododo".count ("dodo") would
 return 2 and not 4, methinks.  I've never used this in Python.

I find this very usefull.
 - array.zip (array...).  Create an array that consecutively attaches
 arrays together.  "abcd".zip ("efgh") returns "aebfgcdh".  Another one I
 haven't used from Python.

 - generic apply (generic function, generic [] args...).  Create the
 proper function call on the stack and calls the function, returning the
 allocated return value.  Handles all supported calling conventions,
 functions and delegates both.  This would be very nice for scripting (it
 would eliminate the need for SWIG wrappers if the scripting language is
 built for D), but I don't know why it's here.

Could you show an actual example, please? Sandor
Oct 21 2002
parent Burton Radons <loth users.sourceforge.net> writes:
Sandor Hojtsy wrote:
 "Burton Radons" <loth users.sourceforge.net> wrote in message
 news:aouvq2$227k$1 digitaldaemon.com...

 Here's what I think we need in arrays.

Hmm. Quite a fine collection of features.
 - Deleting append operator.  Where ~ leaves the previous copy
 around if it re-allocates, this would delete the previous copy.
 Optionally, make ~ delete the previous copy if it re-allocates
 itself.  I actually prefer this latter option, as the current mixin
 from the array object method doesn't fit very nicely.  This would
 allow conservative-use functions like fmt to be cleanly
 implemented.

What about slices pointing inside the previous copy? I feel that

 blocking the way of some particular optimizations.

They'd be pointing into invalid space. The alternative is that they'll point to data which is no longer being updated by the main owner; that's almost certainly a bug!
 - array.set (array2).  Copy the second array into this one; similar
 to "array [] = array2;" but handles overlapping.  Optionally,
 inverse the logic; this method doesn't allow overlapping, but []
 does.  I prefer the latter, as it's not that expensive or common,
 just an optimisation dead end.  But in any case not having a
 workaround now results in ugly code.

 - array.sort (int delegate (inout type a, inout type b) sorter).
 Obvious.

Why are the delegate parameters "inout"? Do they need to be changed? Does the delegate swap them if necessary? If not, the "inout" keyword is used to enable the sometimes faster/sometimes slower pass-by-reference, and

Change it if you want; if you change anything used in sorting, it won't sort properly and life continues. Most of my sorting is on a single field in a large struct, which this would account for, and when you are sorting ints, you don't need to use *a and *b.
 - array.remove (index).  Move everything in the array down a cell
 to cover up the index and then slice the array.  Implementations
 should retain ownership if they can.

Should do what?

If the array is overallocated, slicing it from the beginning to a certain point shouldn't remove ownership so that it still uses the additional data if it appends.
 - array.replace (old, new).  Replace every item or subarray in
 array that equals old with new.  For example, "acklavick".replace
 ("ck", "sta") returns "astalavista".

What does "apple".replace("a", "aa"); return?

"aapple". Any reason to do any other?
 - generic apply (generic function, generic [] args...).  Create the
  proper function call on the stack and calls the function,
 returning the allocated return value.  Handles all supported
 calling conventions, functions and delegates both.  This would be
 very nice for scripting (it would eliminate the need for SWIG
 wrappers if the scripting language is built for D), but I don't
 know why it's here.

Could you show an actual example, please?

Okay. pObject is a counted scripting object, pTuple is an immutable list. /* Calls an object with a given set of functions. */ pObject pCallObject (pObject object, pTuple args) { generic [] arglist; /* Argument list for apply */ generic result; /* What the function returns */ generic caller; /* Our function to call */ TypeInfo type; /* The result type */ pObject output; /* The object we return */ /* Get the delegate for this object, or null if it's pure script or non-callable */ caller = object.caller (); /* It's pure script, so we dispatch using script */ if (caller.data === null) return pCallScript (object, args); /* Construct the argument list */ arglist.length = args.length; try { for (int c; c < arglist.length; c ++) { /* Creates a non-allocated generic object */ arglist [c] = genericFrom (&args [c], args [c].typeinfo); } /* Call the function */ result = apply (caller, arglist...); } finally delete arglist; /* Now construct the proper script object to return */ type = result.type; if (cast (TypeInt) type) output = new pIntObject (result.toInt ()); else if (cast (TypeArray) type && cast (TypeChar) type.next) output = new pStringObject (*(char []*) result.data); ... /* Of course, this should be using a set of registered types that claim they can make various conversions */ result.del (); /* Free the returned cell and leave */ return output; } apply would handle casting. In this case it's handed a set of pObject references; so it asks each of them to convert to the proper argument. So under the hood, apply is a complex function; straightforward but very nonportable. But the code here is as portable as the language gets and would work everywhere. The other direction, calling script directly from D, is far more complex. I have a dim shape of how it could be done; it would allow overloading of virtual methods in the scripting language.
Oct 21 2002
prev sibling next sibling parent "Walter" <walter digitalmars.com> writes:
Lots of good ideas here.
Oct 22 2002
prev sibling next sibling parent Burton Radons <loth users.sourceforge.net> writes:
Burton Radons wrote:
 - array.map (void delegate (inout type a) mapper).  Call mapper on each 
 item of the array, in-place, and return the array.  This is a powerful 
 set of functionality in Python that I've personally never exploited.

Come to think of it, this doesn't have to be used for modifying the array; you're just cycling on it, so it can handle a good subset of iterator necessity. For example, saving all documents would be: Document [] documents; ... documents.map ({ a.save (); }); Coolness! The code generator can have a field day with the code, too, particularly if the method is heavily used, which would be a matter of education.
Oct 23 2002
prev sibling parent reply Daniel Yokomiso <Daniel_member pathlink.com> writes:
Hi,

I'm writing a template library with array utilities, and followed the ideas you
posted. But I'm a bit puzzled by this one:

- array.set (array2).  Copy the second array into this one; similar to 
"array [] = array2;" but handles overlapping.  Optionally, inverse the 
logic; this method doesn't allow overlapping, but [] does.  I prefer the 
latter, as it's not that expensive or common, just an optimisation dead 
end.  But in any case not having a workaround now results in ugly code.

What do you mean by overlapping? What is the purpose of this function? What test cases should it satisfy? I intend to release this library via opend.org, once I finish the sort functions. Best regards, Daniel Yokomiso. "But what do you expect? Truth is not an industrial product." - Wlad Turski, as quoted by Edsger W. Dijkstra (EWD714)
Nov 19 2002
next sibling parent Burton Radons <loth users.sourceforge.net> writes:
Daniel Yokomiso wrote:
 I'm writing a template library with array utilities, and followed the ideas you
 posted. But I'm a bit puzzled by this one:
 
 
- array.set (array2).  Copy the second array into this one; similar to 
"array [] = array2;" but handles overlapping.  Optionally, inverse the 
logic; this method doesn't allow overlapping, but [] does.  I prefer the 
latter, as it's not that expensive or common, just an optimisation dead 
end.  But in any case not having a workaround now results in ugly code.

What do you mean by overlapping? What is the purpose of this function? What test cases should it satisfy?

This is illegal code, for example: char [] foo = new char [400]; foo [50 .. 150] = foo [0 .. 100]; The compiler doesn't have to check for and reverse overlapping array copying in D. This method would be easily implemented as a single call to memmove. Test cases would ensure that overlapped copying results in a pure stream. For example: unittest { byte [] foo = new byte [25]; for (int c; c < foo.length; c ++) foo [c] = c; foo [0 .. 10] = foo [5 .. 15]; for (int c; c < 10; c ++) assert (foo [c] == c + 5); foo [20 .. 30] = foo [15 .. 25]; for (int c = 20; c < 30; c ++) assert (foo [c] == c - 5); }
Nov 19 2002
prev sibling parent "Walter" <walter digitalmars.com> writes:
"Daniel Yokomiso" <Daniel_member pathlink.com> wrote in message
news:ardpp8$u6k$1 digitaldaemon.com...
 I'm writing a template library with array utilities, and followed the

 posted.

Great!
Nov 19 2002