www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - std.array suggestion (version 2)

reply Oskar Linde <olREM OVEnada.kth.se> writes:
Hello,

I've been working a bit more on my std.array suggestion in
digitalmars.D/35008
and have made some changes to the specification. I would like to thank
everyone that commented. I have made the following changes:

- removed .sum() (belongs to std.math or similar)
- added optional ordering predicate to .max() and .min()
- find now returns size_t instead of ptrdiff_t
- added .count(predicate|element)
- in-place versions have void return type

A sample implementation is available at:
http://www.csc.kth.se/~ol/array.d
And a brief DDoc-output (hardly readable):
http://www.csc.kth.se/~ol/array.html
The implementation has been focused on correctness. No code is optimized.
(except for .doSort() where I couldn't resist). What is missing is manual
template argument checking with error reporting via pragma(msg,...).

I have some open issues:

1. max/min on empty arrays
 max/min are undefined on empty arrays. I see three options:

 a) return something, i.e T.min/max or T.init.
 b) use an in contract asserting the array is non-empty
 c) throw EmptyArrayException

 - a) doesn't feel very robust. b) seems to be the most D-ish.

2. Naming of in-place functions
 my current suggestion is doMap(), doSort(), etc. Other suggestions are
 mapInPlace() and inPlaceMap(), but I find them a bit to wordy.

3. Functions like left, right, skip, skipUntil, countUntil, etc.
 I find them either redundant or too specialized to belong in a std.array
 module. Maybe some of them fit in std.string.

4. Cut-off point between std.string and std.array.
 I have no stringent rule for this. As someone said: If it is simple enough
 and could be generically useful (as opposed to only applicable to strings)
 why not put it in std.array. That's why split and join are here. (Well join
 belongs together with split, and split is just a recursive find, and find
 definitely belongs in std.array. Atleast the single element and delegate
 version of find. But separating sub-array find from element-find, with
 both having the same function name will lead to problems.) But the current
 std.string non-argument version of split on char[] defaults to split on
 whitespace (iirc), a behavior that definitely is std.string matter.

Here is the full list of function definitions:

T fold(T[] arr, T init, T delegate|function combiner(T,T));
T max(T[] arr);
T max(T[] arr, delegate|function wrongOrder(T,T));
T min(T[] arr);
T min(T[] arr, delegate|function wrongOrder(T,T));
size_t find(T[] arr, T|delegate|function d);
size_t indexOf(T[] arr, T|delegate|function d);
size_t count(T[], T|delegate|function d);
T[][] split(T[] arr, T|T[]|delegate|function d);
T[] join(T[][] arr);
T[] join(T[][] arr, T|T[] separator);
U[] map(T[] arr, U delegate|function f(T));
T[] filter(T[] arr, delegate|function p(T));
T[] sort(T[]);
T[] sort(T[], delegate|function wrongOrder(T,T));
T[] stableSort(T[]);
T[] stableSort(T[], delegate|function wrongOrder(T,T));
T[] reverse(T[]);

In-place versions:
void doMap(T[], T delegate|function f(T));
void doSort(T[]);
void doSort(T[], delegate|function wrongOrder(T,T));
void doStableSort(T[]);
void doStableSort(T[], delegate|function wrongOrder(T,T));
void doReverse(T[]);

/Oskar
Mar 19 2006
next sibling parent "Ameer Armaly" <ameer_armaly hotmail.com> writes:
"Oskar Linde" <olREM OVEnada.kth.se> wrote in message 
news:dvklqj$4cl$1 digitaldaemon.com...
 Hello,

 I've been working a bit more on my std.array suggestion in
 digitalmars.D/35008
 and have made some changes to the specification. I would like to thank
 everyone that commented. I have made the following changes:

 - removed .sum() (belongs to std.math or similar)
 - added optional ordering predicate to .max() and .min()
 - find now returns size_t instead of ptrdiff_t
 - added .count(predicate|element)
 - in-place versions have void return type

 A sample implementation is available at:
 http://www.csc.kth.se/~ol/array.d
 And a brief DDoc-output (hardly readable):
 http://www.csc.kth.se/~ol/array.html
 The implementation has been focused on correctness. No code is optimized.
 (except for .doSort() where I couldn't resist). What is missing is manual
 template argument checking with error reporting via pragma(msg,...).

 I have some open issues:

 1. max/min on empty arrays
 max/min are undefined on empty arrays. I see three options:

 a) return something, i.e T.min/max or T.init.
 b) use an in contract asserting the array is non-empty

someething with nothing.
 c) throw EmptyArrayException

 - a) doesn't feel very robust. b) seems to be the most D-ish.

 2. Naming of in-place functions
 my current suggestion is doMap(), doSort(), etc. Other suggestions are
 mapInPlace() and inPlaceMap(), but I find them a bit to wordy.

 3. Functions like left, right, skip, skipUntil, countUntil, etc.
 I find them either redundant or too specialized to belong in a std.array
 module. Maybe some of them fit in std.string.

 4. Cut-off point between std.string and std.array.
 I have no stringent rule for this. As someone said: If it is simple enough
 and could be generically useful (as opposed to only applicable to strings)
 why not put it in std.array. That's why split and join are here. (Well 
 join
 belongs together with split, and split is just a recursive find, and find
 definitely belongs in std.array. Atleast the single element and delegate
 version of find. But separating sub-array find from element-find, with
 both having the same function name will lead to problems.) But the current
 std.string non-argument version of split on char[] defaults to split on
 whitespace (iirc), a behavior that definitely is std.string matter.

that if we want specific "string" functions, generic array functions could also be included. Perhaps the whitespace version of split could be put in std.string, calling the std.array version of split with the appropriate parameters.
 Here is the full list of function definitions:

 T fold(T[] arr, T init, T delegate|function combiner(T,T));
 T max(T[] arr);
 T max(T[] arr, delegate|function wrongOrder(T,T));
 T min(T[] arr);
 T min(T[] arr, delegate|function wrongOrder(T,T));
 size_t find(T[] arr, T|delegate|function d);
 size_t indexOf(T[] arr, T|delegate|function d);
 size_t count(T[], T|delegate|function d);
 T[][] split(T[] arr, T|T[]|delegate|function d);
 T[] join(T[][] arr);
 T[] join(T[][] arr, T|T[] separator);
 U[] map(T[] arr, U delegate|function f(T));
 T[] filter(T[] arr, delegate|function p(T));
 T[] sort(T[]);
 T[] sort(T[], delegate|function wrongOrder(T,T));
 T[] stableSort(T[]);
 T[] stableSort(T[], delegate|function wrongOrder(T,T));
 T[] reverse(T[]);

 In-place versions:
 void doMap(T[], T delegate|function f(T));
 void doSort(T[]);
 void doSort(T[], delegate|function wrongOrder(T,T));
 void doStableSort(T[]);
 void doStableSort(T[], delegate|function wrongOrder(T,T));
 void doReverse(T[]);

 /Oskar 

Mar 19 2006
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Sun, 19 Mar 2006 22:25:55 +0000, Oskar Linde wrote:

 Hello,
 
 I've been working a bit more on my std.array suggestion in
 digitalmars.D/35008
 and have made some changes to the specification. 

Thanks Oskar. This is a great idea.
 I have some open issues:
 
 1. max/min on empty arrays
  max/min are undefined on empty arrays. I see three options:
 
  a) return something, i.e T.min/max or T.init.
  b) use an in contract asserting the array is non-empty
  c) throw EmptyArrayException
 
  - a) doesn't feel very robust. b) seems to be the most D-ish.

Of course, 'in' contracts are stripped out in -release editions so one can never rely on them catching empty arrays.
 2. Naming of in-place functions
  my current suggestion is doMap(), doSort(), etc. Other suggestions are
  mapInPlace() and inPlaceMap(), but I find them a bit to wordy.

Any name will do if its documented well enough and consistent. But maybe MapIt(), and SortIt() ;-)
 Here is the full list of function definitions:

Is a 'merge' routine useful? I know it could be done with a join() then sort(), but a merge might be able to optimize the behaviour. -- Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 20/03/2006 10:59:52 AM
Mar 19 2006
parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Derek Parnell skrev:
 On Sun, 19 Mar 2006 22:25:55 +0000, Oskar Linde wrote:
 
 Hello,

 I've been working a bit more on my std.array suggestion in
 digitalmars.D/35008
 and have made some changes to the specification. 

Thanks Oskar. This is a great idea.

Thank you Derek.
 I have some open issues:

 1. max/min on empty arrays
  max/min are undefined on empty arrays. I see three options:

  a) return something, i.e T.min/max or T.init.
  b) use an in contract asserting the array is non-empty
  c) throw EmptyArrayException

  - a) doesn't feel very robust. b) seems to be the most D-ish.

Of course, 'in' contracts are stripped out in -release editions so one can never rely on them catching empty arrays.

Yes. This makes b) and c) differ quite a bit. With c), the library does the error checking for you, allowing you to recover by catching an exception. b) allows for higher performance code, but places the responsibility to make sure the array is non-empty on the user. b) would make it a programming error to pass an empty array to max/min, in the same league as out of bounds indexing or null pointer dereferencing. A release build would in the best case give you a segmentation fault and in the worst case return a bogus value. c) would make passing an empty array a well defined operation but would incur a runtime cost in release builds.
 2. Naming of in-place functions
  my current suggestion is doMap(), doSort(), etc. Other suggestions are
  mapInPlace() and inPlaceMap(), but I find them a bit to wordy.

Any name will do if its documented well enough and consistent. But maybe MapIt(), and SortIt() ;-)

arr.mapIt(function int(int x) { return x*x; }); arr.sortIt(); They would work for me.
 Here is the full list of function definitions:

Is a 'merge' routine useful? I know it could be done with a join() then sort(), but a merge might be able to optimize the behaviour.

Yes, it would most certainly be useful, but I am slightly hesitant to add it. I try to keep std.array contain only the most basic and generically useful functions. More specialized functions should go into more specialized modules. Of course, I judge usefulness by a very subjective measure -- mostly based on my own coding habits: I've almost never had the need for a merge() routine (I can remember a few cases, but they all needed specialized implementations) -- but by this reasoning, I should not have included reverse() or stableSort() either. If you have a different experience regarding merge(), I'm more than happy to add it, as I recognize it being a logical companion to sort() and join(). What should it's signature be? Similar to join: T[] merge(T[][] arr, opionalOrderingPredicate) or do you almost always merge only two arrays: T[] merge(T[] arr1, T[] arr2, optionalOrderingPredicate); (It would then be the only std.array function symmetric with regard to its first arguments.) /Oskar
Mar 20 2006