digitalmars.D - RangeExtra

dsimcha (41/41) Apr 23 2009 I'm loving the new Phobos, and have been playing around with ranges, ali...

Denis Koroskin (2/59) Apr 23 2009 Great! Personally, I'd love to use CapacityArray, others are pretty cool...
Andrei Alexandrescu (25/53) Apr 23 2009 Upon more thinking, I think we'll have to bit the bullet and define

Denis Koroskin (4/37) Apr 23 2009 Now that there is a consideration about putting built-in types into a li...

Andrei Alexandrescu (3/7) Apr 23 2009 Maybe that could be done by manipulating the implicit constructor of Ref...

dsimcha (28/47) Apr 23 2009 So it would take something like a lambda function as a template alias pa...
Robert Jacques (6/25) Apr 23 2009 No. The capacity field hack only works with the current free-list based ...

Andrei Alexandrescu (6/36) Apr 23 2009 1. STL does the same thing so it's likely a more general issue than one

Robert Jacques (8/42) Apr 23 2009 1) Malloc generally uses a free-list based allocation scheme, so no, its...

dsimcha <dsimcha yahoo.com> writes:

I'm loving the new Phobos, and have been playing around with ranges, alias
this, etc.  I've compiled a list of a few range types that I have at least
partial implementations of for personal use that were overlooked by Andrei, et
al. in the initial release of the new Phobos.  Please tell me which of these
are generally useful and should be cleaned up and released, which ones are too
niche to belong in Phobos, which ones std.range can already do by some
slightly non-obvious idiom, and which ones, if any, are already in the works.

CapacityArray:  A range that covers a builtin array and gives it a capacity
field.  All operations that don't have to do with appending are forwarded to
the underlying array via alias this (modulo a few compiler bugs), meaning that
a CapacityArray has an identical compile time interface to a builtin array,
and is implicitly convertible to one.

Reindex:  Takes a zero-indexed random access range and converts it to a range
with an arbitrary base index.  Useful for one-indexed arrays to fit
mathematical notation conventions, etc.

Comb:  Given a random-access range, iterate over all unordered pairs, triples,
etc. of elements in the underlying range.  front(), opIndex(), etc. would
return tuples or structs containing static arrays.  Note that to do this
efficiently, the number of elements (pair, triplet, etc.) would have to be
known at compile time.

Joint:  Kind of like std.range.Zip, but returns a tuple of values of the
elements in the underlying ranges instead of a proxy of pointers to the
elements of the underlying range.  Zip is great for what it's designed for,
but sometimes the pointer/reference semantics are problematic.

Rotated:  Gives a rotated view of a random access range without modifying the
underlying range.  Also has a put() method.  Given an underlying range of
length L, this provides an efficient way to remember the last L elements seen
by the range:

auto underlying = [1,2,3];
auto r = rotated(underlying, 1);

foreach(elem; r) {  // Prints "2  3  1".
    write(elem, "  ");
}

r.put(5);  // O(1).
r.put(4);
r.put(3);
r.put(2);
r.put(1);
foreach(elem; r) {  // Prints "3  2  1".
    write(elem, "  ");
}

Apr 23 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Thu, 23 Apr 2009 22:12:03 +0400, dsimcha <dsimcha yahoo.com> wrote:

 I'm loving the new Phobos, and have been playing around with ranges,  
 alias
 this, etc.  I've compiled a list of a few range types that I have at  
 least
 partial implementations of for personal use that were overlooked by  
 Andrei, et
 al. in the initial release of the new Phobos.  Please tell me which of  
 these
 are generally useful and should be cleaned up and released, which ones  
 are too
 niche to belong in Phobos, which ones std.range can already do by some
 slightly non-obvious idiom, and which ones, if any, are already in the  
 works.

 CapacityArray:  A range that covers a builtin array and gives it a  
 capacity
 field.  All operations that don't have to do with appending are  
 forwarded to
 the underlying array via alias this (modulo a few compiler bugs),  
 meaning that
 a CapacityArray has an identical compile time interface to a builtin  
 array,
 and is implicitly convertible to one.

 Reindex:  Takes a zero-indexed random access range and converts it to a  
 range
 with an arbitrary base index.  Useful for one-indexed arrays to fit
 mathematical notation conventions, etc.

 Comb:  Given a random-access range, iterate over all unordered pairs,  
 triples,
 etc. of elements in the underlying range.  front(), opIndex(), etc. would
 return tuples or structs containing static arrays.  Note that to do this
 efficiently, the number of elements (pair, triplet, etc.) would have to  
 be
 known at compile time.

 Joint:  Kind of like std.range.Zip, but returns a tuple of values of the
 elements in the underlying ranges instead of a proxy of pointers to the
 elements of the underlying range.  Zip is great for what it's designed  
 for,
 but sometimes the pointer/reference semantics are problematic.

 Rotated:  Gives a rotated view of a random access range without  
 modifying the
 underlying range.  Also has a put() method.  Given an underlying range of
 length L, this provides an efficient way to remember the last L elements  
 seen
 by the range:

 auto underlying = [1,2,3];
 auto r = rotated(underlying, 1);

 foreach(elem; r) {  // Prints "2  3  1".
     write(elem, "  ");
 }

 r.put(5);  // O(1).
 r.put(4);
 r.put(3);
 r.put(2);
 r.put(1);
 foreach(elem; r) {  // Prints "3  2  1".
     write(elem, "  ");
 }

Great! Personally, I'd love to use CapacityArray, others are pretty cool, too.

Apr 23 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

dsimcha wrote:
 I'm loving the new Phobos, and have been playing around with ranges, alias
 this, etc.  I've compiled a list of a few range types that I have at least
 partial implementations of for personal use that were overlooked by Andrei, et
 al. in the initial release of the new Phobos.  Please tell me which of these
 are generally useful and should be cleaned up and released, which ones are too
 niche to belong in Phobos, which ones std.range can already do by some
 slightly non-obvious idiom, and which ones, if any, are already in the works.

I think these are fantastic. Some comments within.

 CapacityArray:  A range that covers a builtin array and gives it a capacity
 field.  All operations that don't have to do with appending are forwarded to
 the underlying array via alias this (modulo a few compiler bugs), meaning that
 a CapacityArray has an identical compile time interface to a builtin array,
 and is implicitly convertible to one.

Upon more thinking, I think we'll have to bit the bullet and define 
Array!T and Slice!T. Then dmd rewrites:

[ a, b, c ] -> .Array!(typeof(a))(a, b, c)
T[new] -> .Array!(T)
new T[n] -> .Array!(T)(n)
T[] -> .Slice!(T)

It seems increasingly clear that slices alone can't quite cut the mustard.

So your CapacityArray will essentially be the basis of Array.

 Reindex:  Takes a zero-indexed random access range and converts it to a range
 with an arbitrary base index.  Useful for one-indexed arrays to fit
 mathematical notation conventions, etc.

Hm. I'd take the opportunity to actually reindex the range in an 
arbitrary way, e.g. by computing a separate index for it and using it 
transparently. I see little utility in just changing base (I tried such 
based_vector code for C++ and it created more confusion than it removed.)

 Comb:  Given a random-access range, iterate over all unordered pairs, triples,
 etc. of elements in the underlying range.  front(), opIndex(), etc. would
 return tuples or structs containing static arrays.  Note that to do this
 efficiently, the number of elements (pair, triplet, etc.) would have to be
 known at compile time.

Sounds cool.

 Joint:  Kind of like std.range.Zip, but returns a tuple of values of the
 elements in the underlying ranges instead of a proxy of pointers to the
 elements of the underlying range.  Zip is great for what it's designed for,
 but sometimes the pointer/reference semantics are problematic.

Joint is a risque name, but the idea is good. My preference would be to 
look into using alias this with Zip for a seamless experience though. 
After all, once you have a seamless Zip, creating copies out of it is 
the easy part. Besides, Joint can only make for an input range as people 
can't save references returned by front(). Zip can be as strong a range 
as possible.

 Rotated:  Gives a rotated view of a random access range without modifying the
 underlying range.  Also has a put() method.  Given an underlying range of
 length L, this provides an efficient way to remember the last L elements seen
 by the range:

I think that's very related to Cycle.

http://www.digitalmars.com/d/2.0/phobos/std_range.html#Cycle

Keep the great ideas coming!


Andrei

Apr 23 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Thu, 23 Apr 2009 22:37:37 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 dsimcha wrote:
 I'm loving the new Phobos, and have been playing around with ranges,  
 alias
 this, etc.  I've compiled a list of a few range types that I have at  
 least
 partial implementations of for personal use that were overlooked by  
 Andrei, et
 al. in the initial release of the new Phobos.  Please tell me which of  
 these
 are generally useful and should be cleaned up and released, which ones  
 are too
 niche to belong in Phobos, which ones std.range can already do by some
 slightly non-obvious idiom, and which ones, if any, are already in the  
 works.

 I think these are fantastic. Some comments within.

 CapacityArray:  A range that covers a builtin array and gives it a  
 capacity
 field.  All operations that don't have to do with appending are  
 forwarded to
 the underlying array via alias this (modulo a few compiler bugs),  
 meaning that
 a CapacityArray has an identical compile time interface to a builtin  
 array,
 and is implicitly convertible to one.

 Upon more thinking, I think we'll have to bit the bullet and define  
 Array!T and Slice!T. Then dmd rewrites:

 [ a, b, c ] -> .Array!(typeof(a))(a, b, c)
 T[new] -> .Array!(T)
 new T[n] -> .Array!(T)(n)
 T[] -> .Slice!(T)

 It seems increasingly clear that slices alone can't quite cut the  
 mustard.

 So your CapacityArray will essentially be the basis of Array.

Now that there is a consideration about putting built-in types into a library...

... how about introducing struct MaybeNull(T) { ... } among them (and
NonNull-by-default)?

Oh, and Object.d deserves a better name, IMHO.

Apr 23 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Denis Koroskin wrote:
 On Thu, 23 Apr 2009 22:37:37 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Now that there is a consideration about putting built-in types into a
library...
 
 ... how about introducing struct MaybeNull(T) { ... } among them (and
NonNull-by-default)?

Maybe that could be done by manipulating the implicit constructor of Ref.

Andrei

Apr 23 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Reindex:  Takes a zero-indexed random access range and converts it to a range
 with an arbitrary base index.  Useful for one-indexed arrays to fit
 mathematical notation conventions, etc.

 Hm. I'd take the opportunity to actually reindex the range in an
 arbitrary way, e.g. by computing a separate index for it and using it
 transparently. I see little utility in just changing base (I tried such
 based_vector code for C++ and it created more confusion than it removed.)

So it would take something like a lambda function as a template alias parameter
to
map an index in whatever space you want onto normal zero-indexing?  This sounds
like a very good idea:

Array with only even indices:

uint[] foo = [0,2,4,6,8,10];
auto bar = reindex!("assert(a & 1 == 0); a / 2")(foo);
assert(bar[2] == 2);
assert(bar[10] == 10);

One-indexing:

foo = [1,2,3,4,5];
auto baz = reindex!("a - 1")(foo);
assert(foo[1] == 1);
assert(foo[5] == 5);

Array indexed by powers of two:

foo = [1,2,4,8,16,32,64];
auto pow2 = reindex!((a) { return cast(size_t) log2(a); })(foo);
assert(foo[32] == 32);
assert(foo[4] == 4);

 Joint:  Kind of like std.range.Zip, but returns a tuple of values of the
 elements in the underlying ranges instead of a proxy of pointers to the
 elements of the underlying range.  Zip is great for what it's designed for,
 but sometimes the pointer/reference semantics are problematic.

 Joint is a risque name

This idea grew out of dstats.infotheory and joint probability distributions.

 Rotated:  Gives a rotated view of a random access range without modifying the
 underlying range.  Also has a put() method.  Given an underlying range of
 length L, this provides an efficient way to remember the last L elements seen
 by the range:

 I think that's very related to Cycle.
 http://www.digitalmars.com/d/2.0/phobos/std_range.html#Cycle
 Keep the great ideas coming!


True, I didn't see the index parameter in cycle when I initially looked at it.
This one could go either way.  On the one hand, avoiding bloat and having a few
very general primitives is usually a good thing.  On the other hand, if you use
cycle and take, you're implementing someting fairly simple on top of something
much more complex, and you can't have the put() function that makes the range
track the last L items written to it  in order with O(1) put() calls, only one
memory allocation over the object's lifetime and minimal space overhead, which
was
my original motivation for Rotate.

Apr 23 2009

"Robert Jacques" <sandford jhu.edu> writes:

On Thu, 23 Apr 2009 14:37:37 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:
 dsimcha wrote:
 CapacityArray:  A range that covers a builtin array and gives it a  
 capacity
 field.  All operations that don't have to do with appending are  
 forwarded to
 the underlying array via alias this (modulo a few compiler bugs),  
 meaning that
 a CapacityArray has an identical compile time interface to a builtin  
 array,
 and is implicitly convertible to one.

 Upon more thinking, I think we'll have to bit the bullet and define  
 Array!T and Slice!T. Then dmd rewrites:

 [ a, b, c ] -> .Array!(typeof(a))(a, b, c)
 T[new] -> .Array!(T)
 new T[n] -> .Array!(T)(n)
 T[] -> .Slice!(T)

 It seems increasingly clear that slices alone can't quite cut the  
 mustard.

 So your CapacityArray will essentially be the basis of Array.

No. The capacity field hack only works with the current free-list based  
mark-sweep GC. In every other type of GC, the capacity is dynamic and can  
not be cached in the array itself. Introducing a language feature that is  
specific to an implementation seems like a bad idea.

Apr 23 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Robert Jacques wrote:
 On Thu, 23 Apr 2009 14:37:37 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 dsimcha wrote:
 CapacityArray:  A range that covers a builtin array and gives it a 
 capacity
 field.  All operations that don't have to do with appending are 
 forwarded to
 the underlying array via alias this (modulo a few compiler bugs), 
 meaning that
 a CapacityArray has an identical compile time interface to a builtin 
 array,
 and is implicitly convertible to one.

 Upon more thinking, I think we'll have to bit the bullet and define 
 Array!T and Slice!T. Then dmd rewrites:

 [ a, b, c ] -> .Array!(typeof(a))(a, b, c)
 T[new] -> .Array!(T)
 new T[n] -> .Array!(T)(n)
 T[] -> .Slice!(T)

 It seems increasingly clear that slices alone can't quite cut the 
 mustard.

 So your CapacityArray will essentially be the basis of Array.

 
 No. The capacity field hack only works with the current free-list based 
 mark-sweep GC. In every other type of GC, the capacity is dynamic and 
 can not be cached in the array itself. Introducing a language feature 
 that is specific to an implementation seems like a bad idea.

1. STL does the same thing so it's likely a more general issue than one 
might think.

2. This is not a language feature. Different implementations can take 
different routes.


Andrei

Apr 23 2009

"Robert Jacques" <sandford jhu.edu> writes:

On Thu, 23 Apr 2009 19:00:10 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Robert Jacques wrote:
 On Thu, 23 Apr 2009 14:37:37 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:
 dsimcha wrote:
 CapacityArray:  A range that covers a builtin array and gives it a  
 capacity
 field.  All operations that don't have to do with appending are  
 forwarded to
 the underlying array via alias this (modulo a few compiler bugs),  
 meaning that
 a CapacityArray has an identical compile time interface to a builtin  
 array,
 and is implicitly convertible to one.

 Upon more thinking, I think we'll have to bit the bullet and define  
 Array!T and Slice!T. Then dmd rewrites:

 [ a, b, c ] -> .Array!(typeof(a))(a, b, c)
 T[new] -> .Array!(T)
 new T[n] -> .Array!(T)(n)
 T[] -> .Slice!(T)

 It seems increasingly clear that slices alone can't quite cut the  
 mustard.

 So your CapacityArray will essentially be the basis of Array.

  No. The capacity field hack only works with the current free-list  
 based mark-sweep GC. In every other type of GC, the capacity is dynamic  
 and can not be cached in the array itself. Introducing a language  
 feature that is specific to an implementation seems like a bad idea.

 1. STL does the same thing so it's likely a more general issue than one  
 might think.

1) Malloc generally uses a free-list based allocation scheme, so no, its  
not more general.
2) The STL is a library, not a language feature.

 2. This is not a language feature. Different implementations can take  
 different routes.

Creating seperate types for slices and arrays is a language feature.

P.S. I think having array builders in Phobos is a good idea, I just don't  
think that they warrant a language change.

Apr 23 2009

D Programming

C/C++ Programming

Other

digitalmars.D - RangeExtra