
digitalmars.D - Proposal: Make [][x] and [][a..b] illegal (reserve for multidimensional

reply Don Clugston <dac nospam.com.au> writes:
 Bill Baxter wrote:
 2) D lacks a clean syntax for slices of multiple dimensions.  opSlice
 calls must have one and only one ".."

Can you propose a syntax for that?


How about making [] more than just a shorthand for [0..$], but rather mean "take everything from this dimension and move on". This would, I think, make it more consistent with its usage in declarations, where it is a dimension marker, not a slice.

This would leave the way open for multidimensional array slicing. For example, access of 5 dimensions would be of the form

    auto c = m[][4][][][7];

where [] = take the whole contents of that dimension, or [x] takes only element x of it. Just as now, there is an implicit [][][] at the end for any unspecified dimensions.

A slice can be inserted in the expression before any dimension is completed; it takes a slice of that dimension. So, for example, with a 2-D matrix m,

    auto c = m[4..5][][1];

would mean [ m[4][1], m[5][1] ], and

    m[1][4..5] == m[1][4..5][];

would continue to mean [ m[1][4], m[1][5] ].

You can do this right now for user-defined types, but built-in types work differently because of a (IMHO) useless redundancy.

    int f[][2] = [[1,2], [3,4], [5,6], [7,8]];
    int [] e = f[][1];
    int [] g = f[1][];
    int [] h = f[1];
    int [] p = f[][][][][][][1][][][][][][][][][][][][]; // isn't this silly???

Currently this is legal. All those expressions are synonyms.

    e[0] == g[0] == h[0] == p[0] == f[1][0] == 3.

In my proposal e would mean:

    f[0][1], f[1][1], f[2][1], f[3][1]

which would become illegal since it is a strided slice (currently unsupported in the language), and g would be:

    f[1][0], f[1][1]

which is the same as h. In fact, right now, in an expression, any slicing or indexing operation on a built-in array after a [] would become illegal.

This would only break existing code which contained (IMHO) bad style and confusing redundancy. If for some reason the existing behaviour was desired (such as in metaprogramming or code generation ?), simply change [] to [0..$]. But [] would continue to provide syntactic sugar in the places it was originally intended.
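The redundancy described above can be checked with a small sketch. It assumes a present-day D compiler (the thread predates D2), so the declaration is spelled int[2][] rather than the C-style int f[][2]; nothing here is specific to the proposal itself.

```d
void main()
{
    // Four rows of two ints each.
    int[2][] f = [[1, 2], [3, 4], [5, 6], [7, 8]];

    int[] e = f[][1];   // slice everything, then take row 1
    int[] g = f[1][];   // take row 1, then slice it
    int[] h = f[1];     // take row 1 (implicitly sliced)

    // All three spellings name the same row of the same storage.
    assert(e == [3, 4] && g == [3, 4] && h == [3, 4]);
    assert(e.ptr is g.ptr && g.ptr is h.ptr);
}
```

Under the proposal, only g and h would keep this meaning; e would have to denote the strided column instead.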
Nov 21 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Don Clugston wrote:
  > Bill Baxter wrote:
  >>> 2) D lacks a clean syntax for slices of multiple dimensions.  opSlice
  >>> calls must have one and only one ".."
  >>
  >> Can you propose a syntax for that?
 
 How about making [] more than just a shorthand for [0..$], but rather 
 mean "take everything from this dimension and move on".
 This would, I think, make it more consistent with its usage in 
 declarations, where it is a dimension marker, not a slice.
 
 This would leave the way open for multidimensional array slicing.
 For example, access of 5 dimensions would be of the form
 auto c = m[][4][][][7];
 
 where [] = take the whole contents of that dimension, or [x] takes only 
 element x of it. Just as now, there is an implicit [][][] at the end for 
 any unspecified dimensions.
 
 A slice can be inserted in the expression before any dimension is 
 completed; it takes a slice of that dimension.
 So, for example, with a 2-D matrix m,
 auto c = m[4..5][][1];
 
 would mean [ m[4][1], m[5][1] ].
 and m[1][4..5] == m[1][4..5][];
 
 would continue to mean [ m[1][4], m[1][5] ].
 
 You can do this right now for user-defined types, but built-in types 
 work differently because of a (IMHO) useless redundancy.
 
     int f[][2] = [[1,2], [3,4], [5,6], [7,8]];
     int [] e = f[][1];
     int [] g = f[1][];
     int [] h = f[1];
     int [] p = f[][][][][][][1][][][][][][][][][][][][]; // isn't this 
 silly???
 
 Currently this is legal. All those expressions are synonyms.
 e[0] == g[0] == h[0] == p[0] == f[1][0] == 3.
 In my proposal e would mean:
 f[0][1], f[1][1], f[2][1], f[3][1]
 which would become illegal since it is a strided slice (currently 
 unsupported in the language)
 and g would be:
 f[1][0], f[1][1]
 which is the same as h.
 In fact, right now, in an expression, any slicing or indexing operation 
 on a built-in array after a [] would become illegal.
 
 This would only break existing code which contained (IMHO) bad style and 
 confusing redundancy.
 If for some reason the existing behaviour was desired (such as in 
 metaprogramming or code generation ?), simply change [] to [0..$].
 But [] would continue to provide syntactic sugar in the places it was 
 originally intended.

What would this give you?

    float[] x;
    auto y = x[];  // what's y's type?

Or would that be just like it is now, a float[]? If so then it seems like you lose some syntactical associativity. That is, (x[])[] would no longer be the same as x[][].

Another proposal would be to allow [..] as your slice and move on syntax, and keep the current meaning of [].

Anyway, f[i][j][k] syntax is much more difficult to read than f[i,j,k]. So I think the latter is what we should be shooting for.

Also one thing not clear to me with your proposal is whether f[1..2][2..3][3..4] generates 3 opSlice calls or just 1 when applied to a user class. I can see reasons for wanting both. If you have a fixed dimension class that only allows integer slicing, then probably 1 opSlice call with all 3 slices is the most useful. If you have something representing a N-dimensional thing where N is runtime, and you allow for slicing with objects (like other N-dimensional arrays), then 3 separate chained calls like f.opSlice(a).opSlice(b).opSlice(c) might be more useful to cut down on all the combinatorial explosion of possible combos of opSlice arguments.

--bb
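For the chained case, a rough sketch of what happens today with a user-defined type: each bracket pair is its own opSlice call on the result of the previous one. Chain is a hypothetical type made up for this illustration, and the code assumes a present-day D compiler, where the two-argument opSlice form is still accepted for one-dimensional slicing.

```d
// Hypothetical type: records every opSlice call it receives.
struct Chain
{
    int calls;      // how many times opSlice has run so far
    int[] bounds;   // flattened [lo, hi, lo, hi, ...] in call order

    Chain opSlice(int lo, int hi)
    {
        return Chain(calls + 1, bounds ~ [lo, hi]);
    }
}

void main()
{
    Chain f;
    // Equivalent to f.opSlice(1, 2).opSlice(2, 3).opSlice(3, 4).
    auto r = f[1 .. 2][2 .. 3][3 .. 4];
    assert(r.calls == 3);
    assert(r.bounds == [1, 2, 2, 3, 3, 4]);
}
```

A single call that sees all three slices at once would need a different surface syntax, which is the gap being discussed.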
Nov 21 2007
next sibling parent reply Dan <murpsoft hotmail.com> writes:
Bill Baxter Wrote:

 Don Clugston wrote:
 Another proposal would be to allow [..] as your slice and move on syntax, 
 and keep the current meaning of [].

Jeepers, we already have $ as a shorthand for length. Is 0..$ really too long? Even an 11 dimensional array is legible with that. 11 dimensions is barely comprehensible to even the best mind. Most folk can't even perceive 4.

x[0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$];

 Also one thing not clear to me with your proposal is whether 
 f[1..2][2..3][3..4] generates 3 opSlice calls or just 1 when applied to 
 a user class.  I can see reasons for wanting both.  If you have a fixed 
 dimension class that only allows integer slicing, then probably 1 
 opSlice call with all 3 slices is the most useful.  If you have 
 something representing a N-dimensional thing where N is runtime, and you 
 allow for slicing with objects (like other N-dimensional arrays), then 3 
 separate chained calls like f.opSlice(a).opSlice(b).opSlice(c) might be 
 more useful to cut down on all the combinatorial explosion of possible 
 combos of opSlice arguments.
 
 --bb

Yeah, I tend to agree with using 3 separate opSlice calls. But... we don't even need to create an opSlice method, do we? Arrays already work wonderfully? Doesn't:

[code]
float[42][12][14] x;
float[42][2][2] y;

y = x[0..$][1..2][3..4];
[/code]
That works, right? and you can loop over it just like this?
[code]
for(int i = 0; i < x.length; i++)
{
   for(int j = 0; j < x[i].length; j++)
   {
   }
}
[/code]

I don't see the problem?
Nov 21 2007
next sibling parent "Bruce Adams" <tortoise_74 yeah.who.co.uk> writes:
On Thu, 22 Nov 2007 06:50:28 -0000, Dan <murpsoft hotmail.com> wrote:

 Bill Baxter Wrote:

 Don Clugston wrote:
 Another proposal would be to allow [..] as your slice and move on syntax,
 and keep the current meaning of [].

 Jeepers, we already have $ as a shorthand for length.  Is 0..$ really
 too long?  Even an 11 dimensional array is legible with that.  11
 dimensions is barely comprehensible to even the best mind.  Most folk
 can't even perceive 4.

 x[0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$];

 Also one thing not clear to me with your proposal is whether
 f[1..2][2..3][3..4] generates 3 opSlice calls or just 1 when applied to
 a user class.  I can see reasons for wanting both.  If you have a fixed
 dimension class that only allows integer slicing, then probably 1
 opSlice call with all 3 slices is the most useful.  If you have
 something representing a N-dimensional thing where N is runtime, and you
 allow for slicing with objects (like other N-dimensional arrays), then 3
 separate chained calls like f.opSlice(a).opSlice(b).opSlice(c) might be
 more useful to cut down on all the combinatorial explosion of possible
 combos of opSlice arguments.

 --bb

 Yeah, I tend to agree with using 3 separate opSlice calls.  But... we
 don't even need to create an opSlice method, do we?  Arrays already work
 wonderfully?  Doesn't:

 [code]
 float[42][12][14] x;
 float[42][2][2] y;

 y = x[0..$][1..2][3..4];
 [/code]
 That works, right?  and you can loop over it just like this?
 [code]
 for(int i = 0; i < x.length; i++)
 {
    for(int j = 0; j < x[i].length; j++)
    {
    }
 }
 [/code]

 I don't see the problem?

Or even:

foreach(xslice; x[0..$])
{
   foreach(yslice; xslice[0..$])
   {
      foreach(element; yslice[0..$])
      {
      }
   }
}

The question is whether foreach is clever enough to allow:

foreach(element; x[0..$][0..$][0..$])
{
}

and make it efficient? And what about:

foreach(x, y, z, element; x[0..$][0..$][0..$])
{
}

Regards,

Bruce.
Nov 22 2007
prev sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Dan wrote:
 Bill Baxter Wrote:
 
 Don Clugston wrote:
 Another proposal would be to allow [..] as your slice and move on syntax, 
 and keep the current meaning of [].

 Jeepers, we already have $ as a shorthand for length. Is 0..$ really too long? Even an 11 dimensional array is legible with that. 11 dimensions is barely comprehensible to even the best mind. Most folk can't even perceive 4.

 x[0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$][0..$];

It looks very ugly to people used to working with the likes of Matlab and NumPy.
 Also one thing not clear to me with your proposal is whether 
 f[1..2][2..3][3..4] generates 3 opSlice calls or just 1 when applied to 
 a user class.  I can see reasons for wanting both.  If you have a fixed 
 dimension class that only allows integer slicing, then probably 1 
 opSlice call with all 3 slices is the most useful.  If you have 
 something representing a N-dimensional thing where N is runtime, and you 
 allow for slicing with objects (like other N-dimensional arrays), then 3 
 separate chained calls like f.opSlice(a).opSlice(b).opSlice(c) might be 
 more useful to cut down on all the combinatorial explosion of possible 
 combos of opSlice arguments.

 --bb

 Yeah, I tend to agree with using 3 separate opSlice calls. But... we don't even need to create an opSlice method, do we? Arrays already work wonderfully? Doesn't:

 [code]
 float[42][12][14] x;
 float[42][2][2] y;

 y = x[0..$][1..2][3..4];
 [/code]
 That works, right? and you can loop over it just like this?
 [code]
 for(int i = 0; i < x.length; i++)
 {
    for(int j = 0; j < x[i].length; j++)
    {
    }
 }
 [/code]

 I don't see the problem?

Aside from the unattractiveness of the syntax, there's a question of efficiency creating all those extra function calls and temp variables. If the compiler will make it all go away great, but I'm not counting on it, given that DMD can't even inline functions with loops. --bb
Nov 22 2007
prev sibling next sibling parent reply Don Clugston <dac nospam.com.au> writes:
Bill Baxter wrote:
What would this give you? float[] x; auto y = x[]; // what's y's type? Or would that be just like it is now, a float[]? If so then it seems like you lose some syntactical associativity. That is, (x[])[] would no longer be the same as x[][].

Yes, that's a problem. My original idea was that it would be remembered throughout the expression -- but I think that would mean it ends up being context-sensitive. I think that the only way it could work is to disallow [] in any situation where it is redundant; which would end up breaking almost all existing code that uses it.
 
 Another proposal would be to allow [..] as your slice and move on syntax, 
 and keep the current meaning of [].

That would be preferable, and more intuitive, but would break existing code.
 Anyway, f[i][j][k] syntax is much more difficult to read than f[i,j,k]. 
  So I think the latter is what we should be shooting for.

That's a reasonable argument. In which case it should be a high priority to make the comma operator illegal inside array declarations. It seems to me that so much nonsensical syntax is legal at present, that there's hardly any room to maneuver.

Curiously, float[3,4] a; is legal (and is equivalent to float [4] a;), but a[2,3] generates:

    Error: only one index allowed to index float[4u]

which is a pretty strong indicator that Walter intends to support [i,j,k] syntax at some point.

I therefore retract this proposal.

Basically I just want _some_ syntax to support in BLADE (doesn't matter what, I can pretty much do anything) and so I want to match what D will eventually allow in normal code. (For example, I can check for the existence of

    opBladeDollar(int dimension)  and
    opBladeSlice(uint DimensionReduction)(uint slices[2])

where slice[x][0] = start of slice, [x][1] = end of slice, and start==end if it's an index rather than a slice, and DimensionReduction is the number of indexing operations so that you can determine the return type.)
Nov 22 2007
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Don Clugston wrote:
 Bill Baxter wrote:
What would this give you? float[] x; auto y = x[]; // what's y's type? Or would that be just like it is now, a float[]? If so then it seems like you lose some syntactical associativity. That is, (x[])[] would no longer be the same as x[][].

Yes, that's a problem. My original idea was that it would be remembered throughout the expression -- but I think that would mean it ends up being context-sensitive. I think that the only way it could work is to disallow [] in any situation where it is redundant; which would end up breaking almost all existing code that uses it.
 Another proposal would be to allow [..] as your slice and move on 
 syntax, and keep the current meaning of [].

That would be preferable, and more intuitive, but would break existing code.
 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

That's a reasonable argument. In which case it should be a high priority to make the comma operator illegal inside array declarations. It seems to me that so much nonsensical syntax is legal at present, that there's hardly any room to maneuver. Curiously, float[3,4] a; is legal, (and is equivalent to float [4] a;) but a[2,3] generates: Error: only one index allowed to index float[4u] which is a pretty strong indicator that Walter intends to support [i,j,k] syntax at some point.

There's more than just intent. It works fine. Write yourself an opIndex(int,int,int) and try it out. opIndexAssign(Value v, int,int,int) type things work too. opIndex(MyStruct,Object,Object) even works.

The hole in functionality right now is just in the intersection of multiple indices and slices. But since you can have Object/struct indices you can create your slicing scheme using a custom Slice type for indices. You just can't have $ or .. syntax (at least .. in any other form than foo[A..B]).
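A minimal sketch of the "custom Slice type for indices" idea, assuming a present-day D compiler. Matrix and Slice are hypothetical names invented for this example, not types from any library mentioned in the thread.

```d
// Hypothetical index type standing in for a lo..hi range.
struct Slice { int lo, hi; }

// Hypothetical row-major 2-D matrix.
struct Matrix
{
    int rows, cols;
    int[] data;

    // m[i, j]: plain multi-index access.
    int opIndex(int i, int j)
    {
        return data[i * cols + j];
    }

    // m[Slice(lo, hi), j]: rows lo..hi of column j, copied out.
    int[] opIndex(Slice r, int j)
    {
        int[] col;
        foreach (i; r.lo .. r.hi)
            col ~= data[i * cols + j];
        return col;
    }
}

void main()
{
    // Same values as the f example earlier in the thread.
    auto m = Matrix(4, 2, [1, 2, 3, 4, 5, 6, 7, 8]);
    assert(m[1, 0] == 3);
    assert(m[Slice(0, 4), 1] == [2, 4, 6, 8]);
}
```

The second call is the strided column that built-in arrays cannot express; here it is simply copied, whereas a real implementation would more likely return a strided view.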
 I therefore retract this proposal.
 
 Basically I just want _some_ syntax which to support in BLADE (doesn't 
 matter what, I can pretty much do anything) and so I want to match what 
 D will eventually allow in normal code.
 (For example, I can check for the existence of
   opBladeDollar(int dimension)  and
   opBladeSlice(uint DimensionReduction)(uint slices[2])
 where slice[x][0]=start of slice, [x][1] = end of slice, and start==end 
 if it's an index rather than a slice, and DimensionReduction is number 
 of indexing operations so that you can determine the return type).

--bb
Nov 22 2007
parent Don Clugston <dac nospam.com.au> writes:
Bill Baxter wrote:
 Curiously,  float[3,4] a; is legal, (and is equivalent to float [4] a;)
 but a[2,3]
 generates:
  Error: only one index allowed to index float[4u]

 which is a pretty strong indicator that Walter intends to support 
 [i,j,k] syntax at some point.

There's more than just intent. It works fine. Write yourself an opIndex(int,int,int) and try it out. opIndexAssign(Value v, int,int,int) type things work too. opIndex(MyStruct,Object,Object) even works. The hole in functionality right now is just in the intersection of multiple indices and slices. But since you can have Object/struct indices you can create your slicing scheme using a custom Slice type for indices. You just can't have $ or .. syntax (at least .. in any other form than foo[A..B]).

Cool! I have no idea how I missed that. (My Blade syntax parser even supports it, and has unit tests for it(!) -- man I feel like an idiot <g>) So in theory, all we need is a standard slice/range container (for indices), and uint opDollar(uint dimension)? Presumably, opDollar and slices only make sense for integral indices? (Could be stretched to include reals, but the concept does require a total ordering, so couldn't even include complex types). Plus it should probably work for built-in arrays (recognising that at present, most slicing operations would be unsupported).
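For comparison, a sketch of how a per-dimension opDollar plus a small slice container can look. It assumes a present-day D compiler, where opDollar and opSlice may be templated on the dimension and opIndex receives the results; that is not the D available when this thread was written. Matrix is a hypothetical type, and size_t[2] is just one possible choice for the container.

```d
// Hypothetical row-major 2-D matrix.
struct Matrix
{
    size_t rows, cols;
    int[] data;

    // Value of $ in dimension `dim`.
    size_t opDollar(size_t dim)() const
    {
        static if (dim == 0) return rows;
        else return cols;
    }

    // lo .. hi in dimension `dim` becomes a [lo, hi] pair.
    size_t[2] opSlice(size_t dim)(size_t lo, size_t hi) const
    {
        return [lo, hi];
    }

    // m[i, j]
    int opIndex(size_t i, size_t j) const
    {
        return data[i * cols + j];
    }

    // m[lo .. hi, j]: a copy of part of column j.
    int[] opIndex(size_t[2] r, size_t j) const
    {
        int[] col;
        foreach (i; r[0] .. r[1])
            col ~= data[i * cols + j];
        return col;
    }
}

void main()
{
    auto m = Matrix(4, 2, [1, 2, 3, 4, 5, 6, 7, 8]);
    assert(m[$ - 1, $ - 1] == 8);          // $ is 4 in dim 0, 2 in dim 1
    assert(m[0 .. $, 1] == [2, 4, 6, 8]);
    assert(m[1 .. $, 0] == [3, 5, 7]);
}
```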
Nov 22 2007
prev sibling next sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Bill Baxter wrote:

 Anyway, f[i][j][k] syntax is much more difficult to read than f[i,j,k]. 
  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example:

    f[i, all, all, j, all, end-1, all, range(1, end)]

gets translated into quite optimal code. The only variables that get passed to the indexing function are (i,j,1,1) and no temporaries need to be created.

I guess that with some additional compiler supported sugaring, the above could become something like:

    f[i, .. , .. , j, .. , $-1, .. , 1..$]

-- Oskar
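One way such marker arguments can be handled at compile time is sketched below. This is only a guess at the general technique, not the implementation being described: All, Range, range, remainingDims and View are hypothetical names, and the sketch only computes the rank of the result, assuming a present-day D compiler.

```d
struct All {}                    // marker: keep the whole dimension
enum all = All.init;

struct Range { int lo, hi; }     // marker: keep lo .. hi of a dimension
Range range(int lo, int hi) { return Range(lo, hi); }

// Counts the arguments that are not plain integer indices,
// i.e. the dimensions that survive the indexing expression.
template remainingDims(Args...)
{
    static if (Args.length == 0)
        enum remainingDims = 0;
    else
        enum remainingDims =
            (is(Args[0] : int) ? 0 : 1) + remainingDims!(Args[1 .. $]);
}

struct View
{
    // Returns only the resulting rank; a real array type would build a view.
    size_t opIndex(Args...)(Args args)
    {
        return remainingDims!Args;
    }
}

void main()
{
    View f;
    int i = 0, j = 2, end = 5;
    // Three integer indices collapse; four `all` and one range remain.
    assert(f[i, all, all, j, all, end - 1, all, range(1, end)] == 5);
    assert(f[i, j] == 0);
}
```

Because the marker types are known statically, the result type can be chosen per call without any runtime dispatch.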
Nov 22 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Oskar Linde wrote:
 Bill Baxter wrote:
 
 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example:

    f[i, all, all, j, all, end-1, all, range(1, end)]

gets translated into quite optimal code. The only variables that get passed to the indexing function are (i,j,1,1) and no temporaries need to be created.

I guess that with some additional compiler supported sugaring, the above could become something like:

    f[i, .. , .. , j, .. , $-1, .. , 1..$]

Now if you would just put all that goodness on Dsource or somewhere where the rest of us can appreciate it... :-) --bb
Nov 22 2007
parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Bill Baxter wrote:
Now if you would just put all that goodness on Dsource or somewhere where the rest of us can appreciate it... :-)

OK. I extracted it from my sources, stripping it of most external dependencies. Operator overloading and most non-essentials are also stripped:

http://www.csc.kth.se/~ol/multiarray.zip

The result is basically a simple demonstration of a strided multidimensional array slice type. Indexing/slicing is done in such a way that:

    f[a,b,c]

is identical to

    f[a][b][c]
    f[a,b][c]
    f[a][b,c]

Don's global __dollar idea is implemented, but not int[2] slices (can't do them easily as things are now).

    f[i, all, all, j, all, $-1, all, range(1, $)]

All indexing expressions are inlined by GDC at least. Since they are inlined, there is probably no need to make "all" a distinct type rather than range(0,$), but that is how it is now.

Iteration is very sub-optimal since it relies on opApply, which neither GDC nor DMD can inline (as far as I know).

opIndex and opIndexAssign are implemented so they return the ElementType rather than a 0-dimensional array when all dimensions are collapsed.

Since I just rewrote parts of it right now without keeping my unit tests, the implementation is very unlikely to be bug free. :)

-- Oskar
Nov 22 2007
next sibling parent reply Don Clugston <dac nospam.com.au> writes:
Oskar Linde wrote:
OK. I extracted it from my sources, stripping it from most external dependencies. Operator overloading is also stripped and most non-essentials: http://www.csc.kth.se/~ol/multiarray.zip The result is basically a simple demonstration of a strided multidimensional array slice type. Indexing/slicing is done in such a way that: f[a,b,c] is identical to f[a][b][c] f[a,b][c] f[a][b,c] Don's global __dollar idea is implemented, but not int[2] slices (can't do them easily as things are now).

Would be nice to know if that __dollar behaviour is a half-finished/unannounced/forgotten feature by Walter, or just a bug. I suspect it's an Easter egg. I wonder how long it's been there. Walter, care to comment on this? Simply adding a line into std.object would make it a standard feature. We can road-test it first.
Nov 22 2007
next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 I had a mad thought yesterday. It probably won't fly, but...
 
 what if opSlice took tuple arguments, like this
 
     a[3,4 .. 5,6]

Actually, you can definitely already have a similar syntax if you want it: a[[3,4] .. [5,6]] The problem is that nobody does. :-) It's hard to see what's going on with the uppers and lowers grouped together, rather than each dim. Your eyes have to go back and forth trying to pair up this with that.
 translating to
 
     a.opSlice(Tuple!(3,4),Tuple!(5,6))
 
 implying the slice [3..5] in one dimension, and [4..6] in a second.
 That would kind of automatically give you a syntax for
 multidimensional arrays. Of course it would help if length() also took
 tuples...
 
     a.length = Tuple!(3,3); /* create a 3x3 array */
 
     Tuple!(int,int) dims = a.length; /* get the array dimensions */
 
 I suspect it might be possible to implement this right now in
 user-defined classes by suitably overloading opSlice(),
 opSliceAssign() and length().
 
 There's probably some reason why it wouldn't work that I haven't thought of.
:-)

--bb
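The a[[3,4] .. [5,6]] form mentioned above can be sketched as follows, assuming a present-day D compiler, where a two-argument opSlice may take arguments of any type. Grid and Box are hypothetical names used only for this illustration.

```d
// Hypothetical result type: lower and upper corner of a 2-D slice.
struct Box
{
    int[2] lo;
    int[2] hi;
}

// Hypothetical 2-D container; only the slicing hook is shown.
struct Grid
{
    Box opSlice(int[2] lo, int[2] hi)
    {
        return Box(lo, hi);
    }
}

void main()
{
    Grid a;
    auto b = a[[3, 4] .. [5, 6]];

    // Dimension 0 is 3..5 and dimension 1 is 4..6.
    assert(b.lo[0] == 3 && b.hi[0] == 5);
    assert(b.lo[1] == 4 && b.hi[1] == 6);
}
```

The asserts make the readability complaint concrete: the bounds of one dimension end up in two different brackets.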
Nov 22 2007
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Don Clugston wrote:

 Would be nice to know if that __dollar behaviour is a 
 half-finished/unannounced/forgotten feature by Walter, or just a bug. I 
 suspect it's an Easter egg. I wonder how long it's been there.

It has been there quite some time. Since the introduction of $ in DMD 0.116 in fact. I've been toying with __dollar before, it just never occurred to me that it could be used as a global identifier in this way.

I leave it to Walter to answer whether it is an intentional feature or not. The front end sources don't make that very clear.

-- Oskar
Nov 23 2007
parent "Stewart Gordon" <smjg_1998 yahoo.com> writes:
"Oskar Linde" <oskar.lindeREM OVEgmail.com> wrote in message 
news:fi6b93$1rsa$1 digitalmars.com...
<snip>
 It has been there quite some time. Since the introduction of $ in DMD 
 0.116 in fact. I've been toying with __dollar before, it just never 
 occurred to me that it could be used as a global identifier in this way.

 I leave to Walter to answer whether it is an intentional feature or not. 
 The front end sources doesn't make that very clear.

From what I can tell, no:
- it doesn't seem to be documented
- it begins with __, indicating that it's part of the implementation rather than the standard language

As such, I expect that it's a DMD-specific hack, and a consequence of the way $ is implemented rather than something that was meant to be exposed to the end programmer. But indeed, only Walter can confirm.

Stewart.

-- 
My e-mail address is valid but not my primary mailbox. Please keep replies on the 'group where everybody may benefit.
Nov 23 2007
prev sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Oskar Linde wrote:
OK. I extracted it from my sources, stripping it from most external dependencies. Operator overloading is also stripped and most non-essentials: http://www.csc.kth.se/~ol/multiarray.zip The result is basically a simple demonstration of a strided multidimensional array slice type. Indexing/slicing is done in such a way that: f[a,b,c] is identical to f[a][b][c] f[a,b][c] f[a][b,c] Don's global __dollar idea is implemented, but not int[2] slices (can't do them easily as things are now). f[i, all, all, j, all, $-1, all, range(1, $)] All indexing expressions are inlined by GDC at least. Since they are inlined, there is probably no need to make "all" a distinct type rather than range(0,$), but that is how it is now. Iteration is very sub-optimal since it relies on opApply, that neither GDC nor DMD can inline (as far as I know). opIndex and opIndexAssign are implemented so they return the ElementType rather than a 0-dimensional array when all dimensions are collapsed. Since I just rewrote parts of it right now without keeping my unit tests, the implementation is very unlikely to be bug free. :)

Wonderful! Would you mind if I checked that into my dsource multiarray project? I'd like to try to hook it up to lapack, and maybe try to unify it with my runtime N-dim array class. I use the Zlib license for the rest of the stuff there, is that ok? --bb
Nov 22 2007
parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Bill Baxter wrote:
 Oskar Linde wrote:

 http://www.csc.kth.se/~ol/multiarray.zip

 The result is basically a simple demonstration of a strided 
 multidimensional array slice type.


 Would you mind if I checked that into my dsource multiarray project? I'd 
 like to try to hook it up to lapack, and maybe try to unify it with my 
 runtime N-dim array class.

Not at all. I'm glad if it is of any use.
 I use the Zlib license for the rest of the stuff there, is that ok?

The Zlib licence is fine. -- Oskar
Nov 23 2007
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
Oskar Linde wrote:
 Bill Baxter wrote:
 
 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example: f[i, all, all, j, all, end-1, all, range(1, end)] gets translated into quite optimal code. The only variables that get passed to the indexing function is (i,j,1,1) and no temporaries need to be created. I guess that with some additional compiler supported sugaring, the above could become something like: f[i, .. , .. , j, .. , $-1, .. , 1..$]

Try putting this line at the top of your file. enum : uint { __dollar = int.max } Of course you have to add (length-int.max) to any indices you get, for any index > (int.max-length).

Gotta love these easter eggs....
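The index fix-up described above can be sketched as a small helper. This is only an illustration of the arithmetic: dollar and resolve are hypothetical names, and a named constant is used instead of $ so that the sketch does not depend on how any particular compiler version rewrites $.

```d
// Hypothetical stand-in for the __dollar enum from the post.
enum int dollar = int.max;

// Map a possibly end-relative index onto a concrete position.
// Any index above (int.max - length) is treated as "dollar minus something".
int resolve(int index, int length)
{
    if (index > int.max - length)
        return index + (length - int.max);
    return index;
}

void main()
{
    enum length = 10;
    assert(resolve(3, length) == 3);            // a plain index is unchanged
    assert(resolve(dollar - 1, length) == 9);   // "$-1" is the last element
    assert(resolve(dollar, length) == 10);      // "$" is one past the end
}
```

The obvious limitation is that genuine indices larger than int.max - length can no longer be told apart from end-relative ones.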
Nov 22 2007
next sibling parent reply Don Clugston <dac nospam.com.au> writes:
Don Clugston wrote:
 Oskar Linde wrote:
 Bill Baxter wrote:

 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example: f[i, all, all, j, all, end-1, all, range(1, end)] gets translated into quite optimal code. The only variables that get passed to the indexing function is (i,j,1,1) and no temporaries need to be created. I guess that with some additional compiler supported sugaring, the above could become something like: f[i, .. , .. , j, .. , $-1, .. , 1..$]

Try putting this line at the top of your file. enum : int { __dollar = int.max } Of course you have to add (length-int.max) to any indices you get, for any index > (int.max-length). Gotta love these easter eggs....

BTW this means you can use arrays as an approximation to a slice, and write: f[i, [0,$], [0,$], k, [0,$], $-1, [0,$], [1,$] ]. And every index element will be int or int[2], depending on whether it's a slice or an index. Of course you can define const all= [0,int.max]; if you prefer that.
Nov 22 2007
parent Don Clugston <dac nospam.com.au> writes:
Don Clugston wrote:
 Don Clugston wrote:
 Oskar Linde wrote:
 Bill Baxter wrote:

 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example: f[i, all, all, j, all, end-1, all, range(1, end)] gets translated into quite optimal code. The only variables that get passed to the indexing function is (i,j,1,1) and no temporaries need to be created. I guess that with some additional compiler supported sugaring, the above could become something like: f[i, .. , .. , j, .. , $-1, .. , 1..$]

Try putting this line at the top of your file. enum : int { __dollar = int.max } Of course you have to add (length-int.max) to any indices you get, for any index > (int.max-length). Gotta love these easter eggs....

BTW this means you can use arrays as an approximation to a slice, and write: f[i, [0,$], [0,$], k, [0,$], $-1, [0,$], [1,$] ]. And every index element will be int or int[2], depending on whether it's a slice or an index. Of course you can define const all= [0,int.max]; if you prefer that.

Final answer: null = full slice.

    struct F { opIndex(T...)(T indices) { ... } }
    F f;
    f[i, null, null, k, null, $-1, null, [1,$] ]

Much better than what I was asking for.
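A rough sketch of how such a variadic opIndex could tell the three kinds of argument apart, assuming a current D2 compiler. The body is invented for illustration (it just describes each argument as a string), and $ is left out because it would additionally need the end-of-dimension handling discussed earlier.

```d
import std.format : format;

struct F
{
    // Returns a description of each index argument instead of real data.
    string[] opIndex(T...)(T indices)
    {
        string[] parts;
        foreach (n, idx; indices)
        {
            static if (is(T[n] == typeof(null)))
                parts ~= "all";                              // null: whole dimension
            else static if (is(T[n] : long))
                parts ~= format("%s", idx);                  // integer: single element
            else static if (is(T[n] : const(int)[]))
                parts ~= format("%s..%s", idx[0], idx[1]);   // [lo, hi]: slice
            else
                static assert(0, "unsupported index type: " ~ T[n].stringof);
        }
        return parts;
    }
}

void main()
{
    F f;
    assert(f[1, null, [2, 5]] == ["1", "all", "2..5"]);
}
```

All of the dispatch happens at compile time through static if, so each call site only pays for the argument kinds it actually uses.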
Nov 22 2007
prev sibling next sibling parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Don Clugston wrote:
 Oskar Linde wrote:
 Bill Baxter wrote:

 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example: f[i, all, all, j, all, end-1, all, range(1, end)] gets translated into quite optimal code. The only variables that get passed to the indexing function is (i,j,1,1) and no temporaries need to be created. I guess that with some additional compiler supported sugaring, the above could become something like: f[i, .. , .. , j, .. , $-1, .. , 1..$]

Try putting this line at the top of your file. enum : uint { __dollar = int.max } Of course you have to add (length-int.max) to any indices you get, for any index > (int.max-length).

In my case, it turned out to be even easier: alias end __dollar; -- Oskar
Nov 22 2007
prev sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Don Clugston wrote:
 Oskar Linde wrote:
 Bill Baxter wrote:

 Anyway, f[i][j][k] syntax is much more difficult to read than 
 f[i,j,k].  So I think the latter is what we should be shooting for.

I agree, and for what it is worth it is possible to implement multidimensional indexing and slicing for UDTs today. The only thing lacking is the ability to use the .. and $ tokens. For example: f[i, all, all, j, all, end-1, all, range(1, end)] gets translated into quite optimal code. The only variables that get passed to the indexing function is (i,j,1,1) and no temporaries need to be created. I guess that with some additional compiler supported sugaring, the above could become something like: f[i, .. , .. , j, .. , $-1, .. , 1..$]

Try putting this line at the top of your file. enum : uint { __dollar = int.max } Of course you have to add (length-int.max) to any indices you get, for any index > (int.max-length). Gotta love these easter eggs....

Wow. You da man. --bb
Nov 22 2007
prev sibling parent "Janice Caron" <caron800 googlemail.com> writes:
I had a mad thought yesterday. It probably won't fly, but...

what if opSlice took tuple arguments, like this

    a[3,4 .. 5,6]

translating to

    a.opSlice(Tuple!(3,4),Tuple!(5,6))

implying the slice [3..5] in one dimension, and [4..6] in a second.
That would kind of automatically give you a syntax for
multidimensional arrays. Of course it would help if length() also took
tuples...

    a.length = Tuple!(3,3); /* create a 3x3 array */

    Tuple!(int,int) dims = a.length; /* get the array dimensions */

I suspect it might be possible to implement this right now in
user-defined classes by suitably overloading opSlice(),
opSliceAssign() and length().

There's probably some reason why it wouldn't work that I haven't thought of. :-)
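For what it's worth, the semantics (though not the a[3,4 .. 5,6] syntax, which does not exist) can be tried out with an ordinary method that takes two tuples. This is a sketch under assumptions: Grid, Pair and slice are made-up names, the storage is row-major, and the slice copies rather than aliasing.

```d
import std.typecons : Tuple, tuple;

alias Pair = Tuple!(int, int);

struct Grid
{
    int[] data;   // row-major storage
    Pair dims;    // (rows, columns)

    // Hypothetical target of a[r0,c0 .. r1,c1]:
    // rows lo[0]..hi[0] and columns lo[1]..hi[1], copied into a new Grid.
    Grid slice(Pair lo, Pair hi)
    {
        Grid g;
        g.dims = tuple(hi[0] - lo[0], hi[1] - lo[1]);
        foreach (r; lo[0] .. hi[0])
            g.data ~= data[r * dims[1] + lo[1] .. r * dims[1] + hi[1]];
        return g;
    }
}

void main()
{
    auto a = Grid([1, 2, 3, 4, 5, 6, 7, 8, 9], tuple(3, 3));

    // Stands in for the proposed a[1,0 .. 3,2]
    auto b = a.slice(tuple(1, 0), tuple(3, 2));
    assert(b.dims == tuple(2, 2));
    assert(b.data == [4, 5, 7, 8]);
}
```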
Nov 22 2007
prev sibling next sibling parent reply "Stewart Gordon" <smjg_1998 yahoo.com> writes:
"Don Clugston" <dac nospam.com.au> wrote in message 
news:fi1c45$28gh$1 digitalmars.com...
 Bill Baxter wrote:
 2) D lacks a clean syntax for slices of multiple dimensions.  opSlice
 calls must have one and only one ".."

Can you propose a syntax for that?


How about making [] more than just a shorthand for [0..$], but rather mean "take everything from this dimension and move on". This would, I think, make it more consistent with its usage in declarations, where it is a dimension marker, not a slice.

I'm not sure how this would work more generally, if you know what I mean. AIUI when/if we get real multidimensional arrays, the syntax would be

    array[1..5, 2]

rather than

    array[1..5][2]

Moreover, the semantics you propose can lead to some even more counter-intuitive notation than we have already. You'd get such things as

    array[1..5][][2][6..10][][][4]

and while it's true that you could determine the dimension each applies to by counting the []s without .. in them, probably even more people than already would slip up and try something like

    array[1..6][7..10]

expecting it to take a rectangular slice.

Another way it could be defined is that any slicing operation on a multidimensional array would cycle the dimensions, but I'm not sure if that's desirable either.

Stewart.

-- 
My e-mail address is valid but not my primary mailbox. Please keep replies on the 'group where everybody may benefit.
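The "rectangular slice" slip is easy to reproduce with built-in arrays of arrays as they behave today. A small sketch, assuming a current D2 compiler with Phobos (std.algorithm and std.array):

```d
import std.algorithm : map;
import std.array : array;

void main()
{
    int[][] a = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]];

    // Both slices act on the outer dimension: rows 1..3, then rows 0..2 of that.
    assert(a[1 .. 3][0 .. 2] == [[4, 5, 6], [7, 8, 9]]);

    // A rectangular block needs the inner slice applied to each row.
    auto block = a[1 .. 3].map!(row => row[0 .. 2]).array;
    assert(block == [[4, 5], [7, 8]]);
}
```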
Nov 22 2007
parent Don Clugston <dac nospam.com.au> writes:
Stewart Gordon wrote:
 "Don Clugston" <dac nospam.com.au> wrote in message 
 news:fi1c45$28gh$1 digitalmars.com...
 Bill Baxter wrote:
 2) D lacks a clean syntax for slices of multiple dimensions.  opSlice
 calls must have one and only one ".."

Can you propose a syntax for that?


How about making [] more than just a shorthand for [0..$], but rather mean "take everything from this dimension and move on". This would, I think, make it more consistent with its usage in declarations, where it is a dimension marker, not a slice.

I'm not sure how this would work more generally, if you know what I mean. AIUI when/if we get real multidimensional arrays, the syntax would be array[1..5, 2] rather than array[1..5][2] Moreover, the semantics you propose can lead to some even more counter-intuitive notation than we have already. You'd get such things as array[1..5][][2][6..10][][][4] and while it's true that you could determine the dimension each applies to by counting the []s without .. in them, probably even more people than already would slip up and try something like array[1..6][7..10] expecting it to take a rectangular slice. Another way it could be defined is that any slicing operation on a multidimensional array would cycle the dimensions, but I'm not sure if that's desirable either. Stewart.

The syntax array[ 5, [1,$-5], 2, $-2] is already possible. So we're 90% of the way there already.
Nov 22 2007
prev sibling parent BCS <ao pathlink.com> writes:
Reply to Don,

I dislike the idea. I often use this:

int[] i;

i[start..$][0..len];
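For readers unfamiliar with the idiom: the second slice applies to the result of the first, so it picks len elements beginning at start. A short check, with made-up values for start and len:

```d
void main()
{
    int[] i = [10, 20, 30, 40, 50, 60];
    size_t start = 2, len = 3;

    // Chained slices on the same dimension: skip `start`, then take `len`.
    assert(i[start .. $][0 .. len] == i[start .. start + len]);
    assert(i[start .. $][0 .. len] == [30, 40, 50]);
}
```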
Nov 23 2007