digitalmars.D - Multidimensional opSlice and opSliceAssign

Norbert Nemec (74/74) Jul 07 2004 Hi there,

Sam McCall (39/100) Jul 07 2004 Erk, I didn't see this before I replied to the old thread. Please read

Norbert Nemec (47/86) Jul 07 2004 How about offering two versions:

Sam McCall (14/60) Jul 08 2004 This fixes all my syntactic objections :)

Norbert Nemec (8/29) Jul 08 2004 The compelling reason might be that this might allow a draft-implementat...

Sam McCall (14/26) Jul 08 2004 What do you mean by "the arrays"? I don't think that

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Hi there,

while trying to answer to Derek's mail, the concept of vector expressions in
my head evolved further and I realized, that opSlice and opSliceAssign for
multiple dimensions would actually fit in nicely. Therefore here a complete
proposal:

---------------
1. If we want strided slicing sometime in the future, we have to make
provisions for it now. Once opSlice is overloadable with multiple
dimensions, inserting stride arguments would break compatibility.
Therefore: Allow strided slicing now, even if the native arrays don't
support it yet. This will also allow writing more powerful library-based
arrays until the native ones are there.

An expression
        A[start..end:stride]
should be read as
        A.opSlice(start,end,stride)

When no stride is given, 1 is taken as default:
        A[start..end]
should be read as
        A.opSlice(start,end,1)

When defining opSlice, the stride argument should not be given a default
value, since this would be ignored by the compiler anyway (except for the
rare case of calling opSlice explicitely as a function - and there you can
live with giving the 1 explicitely)
---------------
2. Slice assignments are done similar to index assignments:
        A[start..end:stride] = X
is read as
        A.opSliceAssign(X,start,end,stride)

Again stride is set to 1 if not given.
---------------
3. Full slicing is done without giving a range:

        A[]
is read as
        A.opSlice()

The full slice assignment
        A[] = X
is read as
        A.opSliceAssign(X)
---------------
4. Multidimensional slices are a straightforward extension similar to that
of the indexing:

        A[s0..e0:st0,s1..e1:st1]
is read as
        A.opSlice(s0,e0,st0,s1,e1,st1)
and similar for opSliceAssign
---------------
5. Partial indexing (This was my most recent idea and seems to solve the
problem):

An expression
        A[s0..e0:st0,i1,i2,s3..e3:st3]
is read as
A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)

For assignments, only the last opSlice is replaced by the corresponding
opSliceAssign. There is no need for a opIndexPartialAssign.
---------------
6. N-dimensional indexing: working with templates it is desired to have code
that works for arbitrary dimensions. With opIndex etc, this is not
possible. Therefore one additional feature:

If opIndex(int i0,int i1,int i2) is not found when it is called, the
arguments are packed up into a static array and
        opIndexN(int[3] i)
is called instead. opIndexAssignN works accordingly.

If opSlice(int s0,int e0,int st0,int s1,int e1,int st1) is not found when it
is called, again, everything is packed up and
        opSliceN(int[2] s,int[2] e,int[3] st)
is called.
---------------

Comments? Corrections? Ideas?

Now, probably this will not have highest priority for Walter, but I hope,
something like this should make it into 1.0 without problems, since it is
fairly unintrusive, clean and necessary for completeness.

Ciao,
Nobbi

Jul 07 2004

Sam McCall <tunah.d tunah.net> writes:

Erk, I didn't see this before I replied to the old thread. Please read 
my other post. Brief replies here:

Norbert Nemec wrote:

 Hi there,
 
 while trying to answer to Derek's mail, the concept of vector expressions in
 my head evolved further and I realized, that opSlice and opSliceAssign for
 multiple dimensions would actually fit in nicely. Therefore here a complete
 proposal:
 
 ---------------
 1. If we want strided slicing sometime in the future, we have to make
 provisions for it now. Once opSlice is overloadable with multiple
 dimensions, inserting stride arguments would break compatibility.
 Therefore: Allow strided slicing now, even if the native arrays don't
 support it yet. This will also allow writing more powerful library-based
 arrays until the native ones are there.

My opinion would be that this case simply isn't common enough to warrant 
   new syntax/overloading. We shouldn't allow overloading everything 
possible, we still have regular function calls! Do the benefits really 
outweigh the costs?

 When defining opSlice, the stride argument should not be given a default
 value, since this would be ignored by the compiler anyway (except for the
 rare case of calling opSlice explicitely as a function - and there you can
 live with giving the 1 explicitely)
 ---------------
 2. Slice assignments are done similar to index assignments:
         A[start..end:stride] = X
 is read as
         A.opSliceAssign(X,start,end,stride)
 
 Again stride is set to 1 if not given.

This forces all slices to handle striding, even if the only handling is 
assert(stride==1). It turns a compile time error with (potentially) a 
clear error message into a runtime error.

 ---------------
         A[]
         A.opSlice()
 
         A[] = X
         A.opSliceAssign(X)

         A[s0..e0:st0,s1..e1:st1]
         A.opSlice(s0,e0,st0,s1,e1,st1)
 and similar for opSliceAssign

Agreed, except I don't think stride belongs here. In particular, for the 
three dimensional case, assuming the "obvious" (to me) thing would 
result in opSlice(s1,e1,s2,e2,s3,e3) which is going to break horribly at 
runtime.

 ---------------
 5. Partial indexing (This was my most recent idea and seems to solve the
 problem):
 
 An expression
         A[s0..e0:st0,i1,i2,s3..e3:st3]
 is read as
 A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)
 
 For assignments, only the last opSlice is replaced by the corresponding
 opSliceAssign. There is no need for a opIndexPartialAssign.

I don't like this, it requires making a sequence of temporaries and 
looks like it would be hard to code for certain underlying data structures.
An alternate idea I posted in the other thread:

   What if there were two allowed signatures for opSlice:
   T opSlice(start1,end1,...);
   T opSlice(start1,end1,...,bool[N] single)
   For a "full" slice (all indices are ranges), the first would be
   preferred, else fall back to the second with single all true.
   Partial slices would only be allowed to use the second one.
   In the second one, N would have to match the number of indices, and if
   single[i] was true then endI==startI (or startI+1, whatever), and the
   intent is that that dimension is being indexed, not sliced.

opSliceAssign would be similar.

 ---------------
 6. N-dimensional indexing: working with templates it is desired to have code
 that works for arbitrary dimensions. With opIndex etc, this is not
 possible. Therefore one additional feature:
 
 If opIndex(int i0,int i1,int i2) is not found when it is called, the
 arguments are packed up into a static array and
         opIndexN(int[3] i)
 is called instead. opIndexAssignN works accordingly.

I like this. I'm not to keen on having a different name for the 
function, but I don't see an alternative.
Maybe it should also (!) look for opIndexN(int[] i) - N-dimensional 
indexing is good, but templates would require N to be fixed at compile 
time :-)

 Now, probably this will not have highest priority for Walter, but I hope,
 something like this should make it into 1.0 without problems, since it is
 fairly unintrusive, clean and necessary for completeness.

I don't think Walter's likely to accept any new features as this stage. 
I would like to see it in, but a line has to be drawn, I guess. There 
should really be a page on the wiki about proposed language 
enhancements, things like this are liable to get lost after a while. I'd 
create one and get stuck in, but for some reason all this performance 
talk has made me want to go write some Ruby ;-)

Sam

 
 Ciao,
 Nobbi

Jul 07 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Sam McCall wrote:

 ---------------
 1. If we want strided slicing sometime in the future, we have to make
 provisions for it now. Once opSlice is overloadable with multiple
 dimensions, inserting stride arguments would break compatibility.
 Therefore: Allow strided slicing now, even if the native arrays don't
 support it yet. This will also allow writing more powerful library-based
 arrays until the native ones are there.

 My opinion would be that this case simply isn't common enough to warrant
    new syntax/overloading. We shouldn't allow overloading everything
 possible, we still have regular function calls! Do the benefits really
 outweigh the costs?

How about offering two versions:
        opSlice(s0,e0,s1,e1)
        opSdSlice(s0,e0,sd0,s1,e1,sd1)
If at least one stride is given all others are implicitely assumed to be 1.
If opSlice is not found opSdSlice is tried instead (rendering it
unnecessary to implement both)

This would raise the number of special functions even further, but it is
just a large number with little complexity. As an overview:

----------------
opIndex
opSlice
opSdSlice

opIndexAssign
opSliceAssign
opSdSliceAssign

opIndexN
opSliceN
opSdSliceN

opIndexAssignN
opSliceAssignN
opSdSliceAssignN

opPartialIndex
----------------
Count: 2*2*(1+2)+1=13
But you don't have to learn 13 names. Merely five concepts:

opIndex vs. opSlice
opSlice vs. opSdSlice
assignment vs. access
int[N] indexing
opPartialIndex




 ---------------
 5. Partial indexing (This was my most recent idea and seems to solve the
 problem):
 
 An expression
         A[s0..e0:st0,i1,i2,s3..e3:st3]
 is read as
 A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)
 
 For assignments, only the last opSlice is replaced by the corresponding
 opSliceAssign. There is no need for a opIndexPartialAssign.

 I don't like this, it requires making a sequence of temporaries and
 looks like it would be hard to code for certain underlying data
 structures. An alternate idea I posted in the other thread:
 
    What if there were two allowed signatures for opSlice:
    T opSlice(start1,end1,...);
    T opSlice(start1,end1,...,bool[N] single)
    For a "full" slice (all indices are ranges), the first would be
    preferred, else fall back to the second with single all true.
    Partial slices would only be allowed to use the second one.
    In the second one, N would have to match the number of indices, and if
    single[i] was true then endI==startI (or startI+1, whatever), and the
    intent is that that dimension is being indexed, not sliced.

The problem with this is, that the dimension of T depends on the contents of
single. The former should be known at compile time, the latter is a runtime
value.

I don't like the complexity of this solution either, but compared to all
other ideas that I chewed over, it is clean and simple.

 Maybe it should also (!) look for opIndexN(int[] i) - N-dimensional
 indexing is good, but templates would require N to be fixed at compile
 time :-)

Of course, the compiler would take care of this. int[N] is implicitly casted
to int[] if no routine for the former is specified.

 I don't think Walter's likely to accept any new features as this stage.

Of course. He stated that he is working on bugfixes instead of features
right now, which really is a good idea.

But I still hope there will be another phase of taking in simple features
before 1.0 is finished up. This opSlice stuff really seems more like
"rounding off" what was started by opIndex. I agree with Walter, that there
should be no new cans of worms opened before 1.0, but this here would
rather mean closing a can that was opened already...

Ciao,
Nobbi

Jul 07 2004

Sam McCall <tunah.d tunah.net> writes:

Norbert Nemec wrote:

 How about offering two versions:
         opSlice(s0,e0,s1,e1)
         opSdSlice(s0,e0,sd0,s1,e1,sd1)
 If at least one stride is given all others are implicitely assumed to be 1.
 If opSlice is not found opSdSlice is tried instead (rendering it
 unnecessary to implement both)

This fixes all my syntactic objections :)
However, unless there's a compelling argument that this 
would/could/should be commonly used, I don't see a need for an operator 
for this.

5. Partial indexing (This was my most recent idea and seems to solve the
problem):

An expression
        A[s0..e0:st0,i1,i2,s3..e3:st3]
is read as
A.opIndexPartial(2,i2).opIndexPartial(1,i1).opSlice(s0,e0,st0,s3,e3,st3)

For assignments, only the last opSlice is replaced by the corresponding
opSliceAssign. There is no need for a opIndexPartialAssign.

I don't like this, it requires making a sequence of temporaries and
looks like it would be hard to code for certain underlying data
structures. An alternate idea I posted in the other thread:

   What if there were two allowed signatures for opSlice:
   T opSlice(start1,end1,...);
   T opSlice(start1,end1,...,bool[N] single)
   For a "full" slice (all indices are ranges), the first would be
   preferred, else fall back to the second with single all true.
   Partial slices would only be allowed to use the second one.
   In the second one, N would have to match the number of indices, and if
   single[i] was true then endI==startI (or startI+1, whatever), and the
   intent is that that dimension is being indexed, not sliced.

 
 
 The problem with this is, that the dimension of T depends on the contents of
 single. The former should be known at compile time, the latter is a runtime
 value.

True (in some very important cases). :-(
I'm still not happy about all the temporary objects, but it looks like 
the only way. (Where have all the don't-need-nullable-types types gone? ;-)

Maybe it should also (!) look for opIndexN(int[] i) - N-dimensional
indexing is good, but templates would require N to be fixed at compile
time :-)

 
 Of course, the compiler would take care of this. int[N] is implicitly casted
 to int[] if no routine for the former is specified.

Of course.

I don't think Walter's likely to accept any new features as this stage.

 Of course. He stated that he is working on bugfixes instead of features
 right now, which really is a good idea.
 But I still hope there will be another phase of taking in simple features
 before 1.0 is finished up. 

I meant before 1.0. Walter's said several times "that's it, we're done 
with features for 1.0".
While this seems conceptually simple, I'm sure making a language change 
like this takes quite a bit of work, and possibly involves risky bugs.
Sam

Jul 08 2004

Norbert Nemec <Norbert.Nemec gmx.de> writes:

Sam McCall wrote:

 Norbert Nemec wrote:
 
 How about offering two versions:
         opSlice(s0,e0,s1,e1)
         opSdSlice(s0,e0,sd0,s1,e1,sd1)
 If at least one stride is given all others are implicitely assumed to be
 1. If opSlice is not found opSdSlice is tried instead (rendering it
 unnecessary to implement both)

 This fixes all my syntactic objections :)
 However, unless there's a compelling argument that this
 would/could/should be commonly used, I don't see a need for an operator
 for this.

The compelling reason might be that this might allow a draft-implementation
of the arrays in the library. Who knows how long it will take until we
really have native N-dim arrays...

I don't think Walter's likely to accept any new features as this stage.

 Of course. He stated that he is working on bugfixes instead of features
 right now, which really is a good idea.
 But I still hope there will be another phase of taking in simple features
 before 1.0 is finished up.

 I meant before 1.0. Walter's said several times "that's it, we're done
 with features for 1.0".
 While this seems conceptually simple, I'm sure making a language change
 like this takes quite a bit of work, and possibly involves risky bugs.

Well, unlike the recent opIndex change, this one won't break compatibility.
On the other hand, one could argue that it is not really a new feature but
the completion the opIndex change.

I won't press the issue, though.

Jul 08 2004

Sam McCall <tunah.d tunah.net> writes:

Norbert Nemec wrote:

This fixes all my syntactic objections :)
However, unless there's a compelling argument that this
would/could/should be commonly used, I don't see a need for an operator
for this.

 
 The compelling reason might be that this might allow a draft-implementation
 of the arrays in the library. Who knows how long it will take until we
 really have native N-dim arrays...

What do you mean by "the arrays"? I don't think that 
array-slice-striding using a method rather than an operator would block 
this.

 Well, unlike the recent opIndex change, this one won't break compatibility.
 On the other hand, one could argue that it is not really a new feature but
 the completion the opIndex change.

It just looks to my untrained eye like it will require a bunch of new 
code and therefore bugs. But I could (would like to) be wrong.

 I won't press the issue, though.

Then it definitely won't get in!

Hmm, now I'm going to be completely hypocritical and say that the lack 
of a simple one dimensional opSliceAssign counterpart to the (currently) 
simple one dimensional opSlice is an obvious wart that should be fixed 
for 1.0 ;-)
(I'm probably missing some subtle issue, but not having an N dimensional 
version straight away isn't a good enough reason)

Sam

Jul 08 2004

D Programming

C/C++ Programming

Other

digitalmars.D - Multidimensional opSlice and opSliceAssign