
digitalmars.D - Status of multidimensional slicing

reply "Jared Miller" <jared economicmodeling.com> writes:
I would like to revisit the topic of operator overloads for 
multidimensional slicing.

Bottom line: opSlice is currently limited to 1 dimension/axis 
only. The cleanest workaround right now is to pass your own 
"slice" structs to opIndex. It works but it's not too pretty.

----
// Suppose we have a user-defined type...
auto mat = Matrix(
     [ [0,1,2],
       [3,4,5],
       [6,7,8] ] );

// This type of indexing can be implemented:
auto cell = mat[1, $-1];

// But multidimensional slicing cannot:
// auto submatrix = mat[0..2, 1..$];

// "Cleanest" workaround with a slice struct S taken by opIndex
//  (no $ capability):
auto submatrix = mat[ S(0,2), S(1,3) ];

// With a bit more hacking, something like this could be done:
auto submatrix = mat[ S[0..2], S[1..$] ];
----
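For reference, here is a minimal sketch of the slice-struct workaround. The `Matrix` layout (row-major `int` storage) and the struct `S` are hypothetical, chosen only so that the indexing lines above compile; a real matrix type would likely return a view rather than a copy.

```d
// Hypothetical slice struct: half-open interval [lo, hi).
struct S
{
    size_t lo, hi;
}

// Hypothetical row-major matrix of ints (illustration only).
struct Matrix
{
    int[] data;
    size_t rows, cols;

    this(int[][] src)
    {
        rows = src.length;
        cols = rows ? src[0].length : 0;
        foreach (row; src)
        {
            assert(row.length == cols, "ragged input");
            data ~= row;
        }
    }

    // Element access: mat[i, j]
    int opIndex(size_t i, size_t j)
    {
        return data[i * cols + j];
    }

    // "Slicing" routed through opIndex: mat[S(0, 2), S(1, 3)]
    // Copies the selected block into a new Matrix.
    Matrix opIndex(S r, S c)
    {
        int[][] sub;
        foreach (i; r.lo .. r.hi)
            sub ~= data[i * cols + c.lo .. i * cols + c.hi].dup;
        return Matrix(sub);
    }

    // Makes $ usable per dimension inside the brackets.
    size_t opDollar(size_t dim)()
    {
        return dim == 0 ? rows : cols;
    }
}

void main()
{
    auto mat = Matrix([[0, 1, 2], [3, 4, 5], [6, 7, 8]]);
    assert(mat[1, $ - 1] == 5);

    auto submatrix = mat[S(0, 2), S(1, 3)];
    assert(submatrix.rows == 2 && submatrix.cols == 2);
    assert(submatrix[0, 0] == 1 && submatrix[1, 1] == 5);
}
```

The point of the sketch is only that both calls go through `opIndex`; nothing here needs `opSlice` at all.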

Problem with current state of affairs and rationale for a fix:

* A stated design goal of D is to "Cater to the needs of 
numerical analysis programmers", and presumably HPC / scientific 
computing that's heavy on linear algebra and n-dimensional 
arrays. Well, it seems like the multidimensional slice/stride 
syntax in Matlab, NumPy, and even Fortran has been pretty popular 
with these folks. Syntactic sugar here is a clear win. I don't 
think it's a niche feature.
* The limitation on slicing is inconsistent with the capabilities 
of opIndex and opDollar, and workarounds are ugly.

but it was never implemented (despite opDollar getting done).

Recap of discussions so far:

* 2009-10-10: DIP7 
(http://www.prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP7)
* 2010-03-08: "Proposal: Multidimensional opSlice solution" 
(http://forum.dlang.org/thread/hn2q9q$263e$1 digitalmars.com)

multidimensional indexing and slicing"
* 2012-06-01: "[Proposal] Additional operator overloadings for 
multidimentional indexing and slicing" 
(http://forum.dlang.org/thread/mailman.1202.1338515967.24740.digitalmars-d puremagic.com)
* 2012-11-19: "Multidimensional array operator overloading" 
(http://forum.dlang.org/thread/mailman.2065.1353348152.5162.digitalmars-d puremagic.com)
* 2012-12-19: "Multidimensional slice" 
(http://forum.dlang.org/thread/lglljlnzoathjxijomrn forum.dlang.org)
* 2013-04-06: "rationale for opSlice, opSliceAssign, vs a..b 
being syntax suger for a Slice struct?" 
(http://forum.dlang.org/thread/mailman.551.1365290408.4724.digitalmars-d-learn puremagic.com)
* 2013-05-12: Andrei asks for feedback on Kenji's 2011 pull 

* 2013-10-11: "std.linalg" 
(http://forum.dlang.org/thread/rmyaglfeimzuggoluxvd forum.dlang.org)

Steps forward:

So I basically want to resurrect the topic and gauge support for 
fixing slice overloads. Then, core committers could revisit 

solid near-term solution. Finally, perhaps a DIP for stride 
syntax/overloads?

Looking forward to discussion.
Mar 07 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Jared Miller:

 Looking forward to discussion.
D needs to offer a nice syntax for user defined multidimensional slicing.

Bye,
bearophile
Mar 07 2014
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Mar 07, 2014 at 09:23:30PM +0000, bearophile wrote:
 Jared Miller:
 
Looking forward to discussion.
D needs to offer a nice syntax for user defined multidimensional slicing.
[...]

+1. I fully support Kenji's pull to extend the language in that direction. I'm a bit sad that Walter is pushing for a large breaking change to D string handling, while Kenji's pull, which is a non-breaking enhancement that would lead to much better D support for many numerical computation applications, has been stagnating for at least a year (probably more).

T

-- 
Unix is my IDE. -- Justin Whear
Mar 07 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/7/14, 2:30 PM, H. S. Teoh wrote:
 On Fri, Mar 07, 2014 at 09:23:30PM +0000, bearophile wrote:
 Jared Miller:

 Looking forward to discussion.
D needs to offer a nice syntax for user defined multidimensional slicing.
[...] +1. I fully support Kenji's pull to extend the language in that direction. I'm a bit sad that Walter is pushing for a large breaking change to D string handling, while Kenji's pull, which is a non-breaking enhancement that would lead to much better D support for many numerical computation applications, has been stagnating for at least a year (probably more).
I agree and sympathize.

Andrei
Mar 07 2014
next sibling parent "Jared Miller" <jared economicmodeling.com> writes:
So are there any significant objections to Kenji's PR?

I think it's got a lot of things going for it, particularly in 

likely be a top priority for most people, but it's got a lot of 
bang for your buck: a great benefit to an important subset of 
users for relatively little effort.

I'd love to see it on the official agenda for release this year. 
Is this the right location: http://wiki.dlang.org/Agenda, and is 
anybody welcome to offer edits?

Jared

On Saturday, 8 March 2014 at 01:24:50 UTC, Andrei Alexandrescu 
wrote:

 I agree and sympathize.

 Andrei
Mar 10 2014
prev sibling parent reply Kenji Hara <k.hara.pg gmail.com> writes:
2014-03-08 10:24 GMT+09:00 Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org>:

 I agree and sympathize.
Kenji Hara
Mar 12 2014
parent reply "Mason McGill" <mmcgill caltech.edu> writes:
Hi all,

I think D has a lot to offer technical computing:
   - the speed and modeling power of C++
   - GC for clean API design
   - reflection for automatic bindings
And technical computing has a lot to offer D:
   - users
   - API writers
   - time in the minds of people who teach

Multidimensional array support is important for this exchange to 
happen, so as a D user and a computer vision researcher I'm glad 
to see it's being addressed! However, I'm interested in hearing 
more about the rationale for the design decisions made concerning 

ignoring some of the lessons the SciPy community has learned over 
the past 10+ years.  A bit of elaboration:

In Python, slicing and indexing were originally separate 
operations.  Custom containers would have to define both 
`__getitem__(self, key)` and `__getslice__(self, start, end)`.  
This is where D is now.  Python then deprecated `__getslice__` 
and decided `container[start:end]` should translate to 
`container[slice(start, end)]`: the slicing syntax just became 
sugar for creating a lightweight slice object (i.e. a "range 
literal"), but it only worked inside an index expression.  If I 
understand correctly, this is similar in spirit to the solution 
the D community seems to be converging upon.  This solution 
enables multidimensional slicing, but needlessly prohibits the 
construction of range literals outside of an index expression.
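
To make the slice-object idea concrete on the D side, here is a minimal sketch. It assumes a compiler that lowers `g[a..b, c..d]` to `g.opIndex(g.opSlice!0(a, b), g.opSlice!1(c, d))`; compilers without that lowering will reject the sliced index expression. `Interval` and `Grid` are hypothetical names made up for the illustration.

```d
// Hypothetical slice object: half-open interval [lo, hi).
struct Interval
{
    size_t lo, hi;
}

// Hypothetical row-major 2-D container (illustration only).
struct Grid
{
    int[] data;
    size_t rows, cols;

    // `a .. b` in dimension `dim` becomes a lightweight Interval.
    Interval opSlice(size_t dim)(size_t lo, size_t hi)
    {
        return Interval(lo, hi);
    }

    // `$` in dimension `dim`.
    size_t opDollar(size_t dim)()
    {
        return dim == 0 ? rows : cols;
    }

    // Plain element access.
    int opIndex(size_t i, size_t j)
    {
        return data[i * cols + j];
    }

    // Slicing is just another opIndex overload taking Intervals.
    Grid opIndex(Interval r, Interval c)
    {
        Grid g;
        g.rows = r.hi - r.lo;
        g.cols = c.hi - c.lo;
        foreach (i; r.lo .. r.hi)
            g.data ~= data[i * cols + c.lo .. i * cols + c.hi];
        return g;
    }
}

void main()
{
    auto g = Grid([0, 1, 2, 3, 4, 5, 6, 7, 8], 3, 3);
    auto sub = g[0 .. 2, 1 .. $];
    assert(sub.rows == 2 && sub.cols == 2);
    assert(sub.data == [1, 2, 4, 5]);
}
```

Note that `Interval` can only be produced by the `..` syntax inside the brackets here, which is exactly the restriction I'm questioning.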

So, why is this important?  One point of view is that 
multidimensional slicing is just one of many use cases for a 
concise representation of a range of numbers.  In more 
"specialized" scientific languages, like MATLAB/Octave and Julia, 
range literals are a critical component to readable, idiomatic 
code.  In order to partially make up for this, SciPy is forced to 
subvert Python's indexing syntax for calling functions that may 
operate on numeric ranges, obfuscating code (e.g. 
http://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html).

I point this out because it (fortunately) seems like D is in a 
position to have range literals while maintaining backwards 
compatibility and reducing language complexity (details are 
below).  I'd like to hear your thoughts about range literals as a 
solution for multidimensional indexing: whether it's been 
proposed, if so, why it was decided against, what its 
disadvantages might be, whether they're compatible with the work 
already done on this front, etc.

===================
Range Literals in D
===================

// Right now this works:
foreach (i; 0..10) doScience(i);

// And this works:
auto range = iota(0, 10);
foreach (i; range) doScience(i);

// So why shouldn't this work?
auto range = 0..10;
foreach (i; range) doScience(i);

// Or this?
auto range = 0..10;
myFavoriteArray[range] = fascinatingFindings();

// Or this?
auto range = 0..10;
myFavoriteMatrix[0..$, range] = fascinating2DFindings();

// `opSlice` would no longer be necessary...
myMap["key"];       // calls `opIndex(string);`
myVector[5];        // calls `opIndex(int);`
myMatrix[5, 0..10]; // calls `opIndex(int, NumericRange);`

// But old code that defines `opSlice` could still work (like in Python).
myVector[0..10]; // If `opIndex(NumericRange)` isn't defined,
                  // fall back to `opSlice`.

// `ForeachRangeStatement` would no longer need to exist as an odd
// special case.
// The following two statements are semantically equivalent, and with
// range literals, they'd be instances of the same looping syntax.
foreach (i; 0..10) doScience();
foreach (i; iota(0, 10)) doScience();

// Compilers would, of course, be free to special-case `foreach` loops
// over range literals, if it's helpful for performance.
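
Until something like a range literal exists, the closest approximation I know of is to let `iota` play that role. The sketch below is only that: `NumericRange` is a hypothetical alias for whatever type `iota(int, int)` returns, and `Vec` is a made-up container with an `opIndexAssign` overload that accepts it.

```d
import std.range : iota;

// Hypothetical stand-in for the type a literal like `0..10` might have.
alias NumericRange = typeof(iota(0, 10));

// Made-up container, for illustration only.
struct Vec
{
    int[] data;

    int opIndex(size_t i)
    {
        return data[i];
    }

    // Handles `vec[range] = values`: the assigned value comes first,
    // the index arguments follow.
    void opIndexAssign(int[] values, NumericRange r)
    {
        size_t k = 0;
        foreach (i; r)
            data[i] = values[k++];
    }
}

void main()
{
    // The range is an ordinary value: store it, loop over it, index with it.
    auto range = iota(0, 3);

    int total;
    foreach (i; range)
        total += i;
    assert(total == 3);

    auto v = Vec([9, 9, 9, 9]);
    v[range] = [1, 2, 3];
    assert(v.data == [1, 2, 3, 9]);
}
```

This works today, but every use site has to spell out `iota(...)`, which is the readability cost I'm pointing at.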

On Wednesday, 12 March 2014 at 13:55:05 UTC, Kenji Hara wrote:
 2014-03-08 10:24 GMT+09:00 Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org>:

 I agree and sympathize.
Kenji Hara
Mar 14 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Mason McGill:

 My concern is that this design may be ignoring some of the 
 lessons the SciPy community has learned over the past 10+ years.
Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it already has a better design and a more refined implementation in several things related to numerical computing.
 // So why shouldn't this work?
 auto range = 0..10;
 foreach (i; range) doScience(i);
People suggested this a long time ago, again and again. So I pass that question on to Walter.

Bye,
bearophile
Mar 14 2014
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Mar 14, 2014 at 12:29:34PM +0000, bearophile wrote:
 Mason McGill:
 
My concern is that this design may be ignoring some of the lessons
the SciPy community has learned over the past 10+ years.
Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it has already a better design and more refined implementation in several things related to numerical computing.
// So why shouldn't this work?
auto range = 0..10;
foreach (i; range) doScience(i);
People have suggested this lot of time ago, again and again. So I ask that question for Walter.
[...]

Replace the first line with:

    auto range = iota(0, 10);

and it will work. It's not *that* hard to learn, is it?

T

-- 
Klein bottle for rent ... inquire within. -- Stephen Mulraney
Mar 14 2014
parent "Mason McGill" <mmcgill caltech.edu> writes:
 // So why shouldn't this work?
 auto range = 0..10;
 foreach (i; range) doScience(i);
Replace the first line with: auto range = iota(0, 10); and it will work. It's not *that* hard to learn, is it?

True, but I think the issue at hand when discussing "sugary" syntax is clarity and expressiveness rather than completeness. In many domains, programmer working memory is at a premium, and code like this:

{
     auto samples = meshgrid(iota(0, 2), iota(0, 100), iota(0, 100));
     vector[StridedSlice(0, 10, 2)] = iota(1, 6);
     plot(iota(-10, 10), myFunction(iota(-10, 10)));
     foreach (i; square(iota(0, 10))) performSquareDance(i);
}

might not be as respectful of that resource as code like this:

{
     auto samples = meshgrid(0..2, 0..100, 0..100);
     vector[(0..10).by(2)] = 1..6;
     plot(-10..10, myFunction(-10..10));
     foreach (i; square(0..10)) performSquareDance(i);
}

Reference for `meshgrid`: 
http://www.mathworks.com/help/matlab/ref/meshgrid.html
Reference for strided indexing: 
http://docs.scipy.org/doc/numpy/user/basics.indexing.html

On Friday, 14 March 2014 at 14:36:29 UTC, H. S. Teoh wrote:
 On Fri, Mar 14, 2014 at 12:29:34PM +0000, bearophile wrote:
 Mason McGill:
 
My concern is that this design may be ignoring some of the 
lessons
the SciPy community has learned over the past 10+ years.
Thank you for your help. An injection of experience is quite important here. Julia is far newer than D, and yet it has already a better design and more refined implementation in several things related to numerical computing.
// So why shouldn't this work?
auto range = 0..10;
foreach (i; range) doScience(i);
People have suggested this lot of time ago, again and again. So I ask that question for Walter.
[...] Replace the first line with: auto range = iota(0, 10); and it will work. It's not *that* hard to learn, is it? T
Mar 14 2014
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
On 3/7/2014 2:30 PM, H. S. Teoh wrote:
 I'm a bit sad that Walter is pushing for a large breaking change to D
 string handling, while Kenji's pull, which is a non-breaking enhancement
 that would lead to much better D support for many numerical computation
 applications, has been stagnating for at least a year (probably more).
You expressed this as if there's an actual correlation or causation between the two, when it's highly unlikely any exists. He's doing exactly what many, many others do: expressing concern about a problem encountered during recent use of some aspect of the D ecosystem.

It's an unfortunate but true aspect of the rate of D development combined with the relatively small community: old pulls get lost in the noise. For pulls to get attention, the author or proponents of a pull need to keep it alive. The rate of application of pulls (regardless of age) isn't bad, but when combined with the influx rate of new pull requests it's just not high enough to get the backlog gone.
Mar 07 2014