www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Multidimensional opSlice and opSliceAssign

reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Hi there,

while trying to answer to Derek's mail, the concept of vector expressions in
my head evolved further and I realized, that opSlice and opSliceAssign for
multiple dimensions would actually fit in nicely. Therefore here a complete
proposal:

---------------
1. If we want strided slicing sometime in the future, we have to make
provisions for it now. Once opSlice is overloadable with multiple
dimensions, inserting stride arguments would break compatibility.
Therefore: Allow strided slicing now, even if the native arrays don't
support it yet. This will also allow writing more powerful library-based
arrays until the native ones are there.

An expression
        A[start..end:stride]
should be read as
        A.opSlice(start,end,stride)

When no stride is given, 1 is taken as default:
        A[start..end]
should be read as
        A.opSlice(start,end,1)

When defining opSlice, the stride argument should not be given a default
value, since this would be ignored by the compiler anyway (except for the
rare case of calling opSlice explicitely as a function - and there you can
live with giving the 1 explicitely)
---------------
2. Slice assignments are done similar to index assignments:
        A[start..end:stride] = X
is read as
        A.opSliceAssign(X,start,end,stride)

Again stride is set to 1 if not given.
---------------
3. Full slicing is done without giving a range:

        A[]
is read as
        A.opSlice()

The full slice assignment
        A[] = X
is read as
        A.opSliceAssign(X)
---------------
4. Multidimensional slices are a straightforward extension similar to that
of the indexing:

        A[s0..e0:st0,s1..e1:st1]
is read as
        A.opSlice(s0,e0,st0,s1,e1,st1)
and similar for opSliceAssign
---------------
5. Partial indexing (This was my most recent idea and seems to solve the
problem):

An expression
        A[s0..e0:st0,i1,i2,s3..e3:st3]
is read as
A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)

For assignments, only the last opSlice is replaced by the corresponding
opSliceAssign. There is no need for a opIndexPartialAssign.
---------------
6. N-dimensional indexing: working with templates it is desired to have code
that works for arbitrary dimensions. With opIndex etc, this is not
possible. Therefore one additional feature:

If opIndex(int i0,int i1,int i2) is not found when it is called, the
arguments are packed up into a static array and
        opIndexN(int[3] i)
is called instead. opIndexAssignN works accordingly.

If opSlice(int s0,int e0,int st0,int s1,int e1,int st1) is not found when it
is called, again, everything is packed up and
        opSliceN(int[2] s,int[2] e,int[3] st)
is called.
---------------

Comments? Corrections? Ideas?

Now, probably this will not have highest priority for Walter, but I hope,
something like this should make it into 1.0 without problems, since it is
fairly unintrusive, clean and necessary for completeness.

Ciao,
Nobbi
Jul 07 2004
parent reply Sam McCall <tunah.d tunah.net> writes:
Erk, I didn't see this before I replied to the old thread. Please read 
my other post. Brief replies here:

Norbert Nemec wrote:

 Hi there,
 
 while trying to answer to Derek's mail, the concept of vector expressions in
 my head evolved further and I realized, that opSlice and opSliceAssign for
 multiple dimensions would actually fit in nicely. Therefore here a complete
 proposal:
 
 ---------------
 1. If we want strided slicing sometime in the future, we have to make
 provisions for it now. Once opSlice is overloadable with multiple
 dimensions, inserting stride arguments would break compatibility.
 Therefore: Allow strided slicing now, even if the native arrays don't
 support it yet. This will also allow writing more powerful library-based
 arrays until the native ones are there.

new syntax/overloading. We shouldn't allow overloading everything possible, we still have regular function calls! Do the benefits really outweigh the costs?
 When defining opSlice, the stride argument should not be given a default
 value, since this would be ignored by the compiler anyway (except for the
 rare case of calling opSlice explicitely as a function - and there you can
 live with giving the 1 explicitely)
 ---------------
 2. Slice assignments are done similar to index assignments:
         A[start..end:stride] = X
 is read as
         A.opSliceAssign(X,start,end,stride)
 
 Again stride is set to 1 if not given.

assert(stride==1). It turns a compile time error with (potentially) a clear error message into a runtime error.
 ---------------
         A[]
         A.opSlice()
 
         A[] = X
         A.opSliceAssign(X)

         A[s0..e0:st0,s1..e1:st1]
         A.opSlice(s0,e0,st0,s1,e1,st1)
 and similar for opSliceAssign

three dimensional case, assuming the "obvious" (to me) thing would result in opSlice(s1,e1,s2,e2,s3,e3) which is going to break horribly at runtime.
 ---------------
 5. Partial indexing (This was my most recent idea and seems to solve the
 problem):
 
 An expression
         A[s0..e0:st0,i1,i2,s3..e3:st3]
 is read as
 A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)
 
 For assignments, only the last opSlice is replaced by the corresponding
 opSliceAssign. There is no need for a opIndexPartialAssign.

looks like it would be hard to code for certain underlying data structures. An alternate idea I posted in the other thread: What if there were two allowed signatures for opSlice: T opSlice(start1,end1,...); T opSlice(start1,end1,...,bool[N] single) For a "full" slice (all indices are ranges), the first would be preferred, else fall back to the second with single all true. Partial slices would only be allowed to use the second one. In the second one, N would have to match the number of indices, and if single[i] was true then endI==startI (or startI+1, whatever), and the intent is that that dimension is being indexed, not sliced. opSliceAssign would be similar.
 ---------------
 6. N-dimensional indexing: working with templates it is desired to have code
 that works for arbitrary dimensions. With opIndex etc, this is not
 possible. Therefore one additional feature:
 
 If opIndex(int i0,int i1,int i2) is not found when it is called, the
 arguments are packed up into a static array and
         opIndexN(int[3] i)
 is called instead. opIndexAssignN works accordingly.

function, but I don't see an alternative. Maybe it should also (!) look for opIndexN(int[] i) - N-dimensional indexing is good, but templates would require N to be fixed at compile time :-)
 Now, probably this will not have highest priority for Walter, but I hope,
 something like this should make it into 1.0 without problems, since it is
 fairly unintrusive, clean and necessary for completeness.

I would like to see it in, but a line has to be drawn, I guess. There should really be a page on the wiki about proposed language enhancements, things like this are liable to get lost after a while. I'd create one and get stuck in, but for some reason all this performance talk has made me want to go write some Ruby ;-) Sam
 
 Ciao,
 Nobbi
 

Jul 07 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Sam McCall wrote:

 ---------------
 1. If we want strided slicing sometime in the future, we have to make
 provisions for it now. Once opSlice is overloadable with multiple
 dimensions, inserting stride arguments would break compatibility.
 Therefore: Allow strided slicing now, even if the native arrays don't
 support it yet. This will also allow writing more powerful library-based
 arrays until the native ones are there.

new syntax/overloading. We shouldn't allow overloading everything possible, we still have regular function calls! Do the benefits really outweigh the costs?

How about offering two versions: opSlice(s0,e0,s1,e1) opSdSlice(s0,e0,sd0,s1,e1,sd1) If at least one stride is given all others are implicitely assumed to be 1. If opSlice is not found opSdSlice is tried instead (rendering it unnecessary to implement both) This would raise the number of special functions even further, but it is just a large number with little complexity. As an overview: ---------------- opIndex opSlice opSdSlice opIndexAssign opSliceAssign opSdSliceAssign opIndexN opSliceN opSdSliceN opIndexAssignN opSliceAssignN opSdSliceAssignN opPartialIndex ---------------- Count: 2*2*(1+2)+1=13 But you don't have to learn 13 names. Merely five concepts: opIndex vs. opSlice opSlice vs. opSdSlice assignment vs. access int[N] indexing opPartialIndex
 ---------------
 5. Partial indexing (This was my most recent idea and seems to solve the
 problem):
 
 An expression
         A[s0..e0:st0,i1,i2,s3..e3:st3]
 is read as
 A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)
 
 For assignments, only the last opSlice is replaced by the corresponding
 opSliceAssign. There is no need for a opIndexPartialAssign.

looks like it would be hard to code for certain underlying data structures. An alternate idea I posted in the other thread: What if there were two allowed signatures for opSlice: T opSlice(start1,end1,...); T opSlice(start1,end1,...,bool[N] single) For a "full" slice (all indices are ranges), the first would be preferred, else fall back to the second with single all true. Partial slices would only be allowed to use the second one. In the second one, N would have to match the number of indices, and if single[i] was true then endI==startI (or startI+1, whatever), and the intent is that that dimension is being indexed, not sliced.

The problem with this is, that the dimension of T depends on the contents of single. The former should be known at compile time, the latter is a runtime value. I don't like the complexity of this solution either, but compared to all other ideas that I chewed over, it is clean and simple.
 Maybe it should also (!) look for opIndexN(int[] i) - N-dimensional
 indexing is good, but templates would require N to be fixed at compile
 time :-)

Of course, the compiler would take care of this. int[N] is implicitly casted to int[] if no routine for the former is specified.
 I don't think Walter's likely to accept any new features as this stage.

Of course. He stated that he is working on bugfixes instead of features right now, which really is a good idea. But I still hope there will be another phase of taking in simple features before 1.0 is finished up. This opSlice stuff really seems more like "rounding off" what was started by opIndex. I agree with Walter, that there should be no new cans of worms opened before 1.0, but this here would rather mean closing a can that was opened already... Ciao, Nobbi
Jul 07 2004
parent reply Sam McCall <tunah.d tunah.net> writes:
Norbert Nemec wrote:

 How about offering two versions:
         opSlice(s0,e0,s1,e1)
         opSdSlice(s0,e0,sd0,s1,e1,sd1)
 If at least one stride is given all others are implicitely assumed to be 1.
 If opSlice is not found opSdSlice is tried instead (rendering it
 unnecessary to implement both)

However, unless there's a compelling argument that this would/could/should be commonly used, I don't see a need for an operator for this.
5. Partial indexing (This was my most recent idea and seems to solve the
problem):

An expression
        A[s0..e0:st0,i1,i2,s3..e3:st3]
is read as
A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)

For assignments, only the last opSlice is replaced by the corresponding
opSliceAssign. There is no need for a opIndexPartialAssign.

I don't like this, it requires making a sequence of temporaries and looks like it would be hard to code for certain underlying data structures. An alternate idea I posted in the other thread: What if there were two allowed signatures for opSlice: T opSlice(start1,end1,...); T opSlice(start1,end1,...,bool[N] single) For a "full" slice (all indices are ranges), the first would be preferred, else fall back to the second with single all true. Partial slices would only be allowed to use the second one. In the second one, N would have to match the number of indices, and if single[i] was true then endI==startI (or startI+1, whatever), and the intent is that that dimension is being indexed, not sliced.

The problem with this is, that the dimension of T depends on the contents of single. The former should be known at compile time, the latter is a runtime value.

I'm still not happy about all the temporary objects, but it looks like the only way. (Where have all the don't-need-nullable-types types gone? ;-)
Maybe it should also (!) look for opIndexN(int[] i) - N-dimensional
indexing is good, but templates would require N to be fixed at compile
time :-)

Of course, the compiler would take care of this. int[N] is implicitly casted to int[] if no routine for the former is specified.

I don't think Walter's likely to accept any new features as this stage.

right now, which really is a good idea. But I still hope there will be another phase of taking in simple features before 1.0 is finished up.

with features for 1.0". While this seems conceptually simple, I'm sure making a language change like this takes quite a bit of work, and possibly involves risky bugs. Sam
Jul 08 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Sam McCall wrote:

 Norbert Nemec wrote:
 
 How about offering two versions:
         opSlice(s0,e0,s1,e1)
         opSdSlice(s0,e0,sd0,s1,e1,sd1)
 If at least one stride is given all others are implicitely assumed to be
 1. If opSlice is not found opSdSlice is tried instead (rendering it
 unnecessary to implement both)

This fixes all my syntactic objections :) However, unless there's a compelling argument that this would/could/should be commonly used, I don't see a need for an operator for this.

The compelling reason might be that this might allow a draft-implementation of the arrays in the library. Who knows how long it will take until we really have native N-dim arrays...
I don't think Walter's likely to accept any new features as this stage.

right now, which really is a good idea. But I still hope there will be another phase of taking in simple features before 1.0 is finished up.

with features for 1.0". While this seems conceptually simple, I'm sure making a language change like this takes quite a bit of work, and possibly involves risky bugs.

Well, unlike the recent opIndex change, this one won't break compatibility. On the other hand, one could argue that it is not really a new feature but the completion the opIndex change. I won't press the issue, though.
Jul 08 2004
parent Sam McCall <tunah.d tunah.net> writes:
Norbert Nemec wrote:

This fixes all my syntactic objections :)
However, unless there's a compelling argument that this
would/could/should be commonly used, I don't see a need for an operator
for this.

The compelling reason might be that this might allow a draft-implementation of the arrays in the library. Who knows how long it will take until we really have native N-dim arrays...

array-slice-striding using a method rather than an operator would block this.
 Well, unlike the recent opIndex change, this one won't break compatibility.
 On the other hand, one could argue that it is not really a new feature but
 the completion the opIndex change.

code and therefore bugs. But I could (would like to) be wrong.
 I won't press the issue, though.

Hmm, now I'm going to be completely hypocritical and say that the lack of a simple one dimensional opSliceAssign counterpart to the (currently) simple one dimensional opSlice is an obvious wart that should be fixed for 1.0 ;-) (I'm probably missing some subtle issue, but not having an N dimensional version straight away isn't a good enough reason) Sam
Jul 08 2004