
digitalmars.D - Array Operations: a[] + b[] etc.

reply "John Colvin" <john.loughran.colvin gmail.com> writes:
First things first: I'm not actually sure what the current spec 
for this is; http://dlang.org/arrays.html is not the clearest on 
the subject and seems to rule out a lot of things that I reckon 
should work.

For this post I'm going to use the latest dmd from github. 
Behaviour is sometimes quite different for different versions of 
dmd, let alone gdc or ldc.

e.g.

int[] a = [1,2,3,4];
int[] b = [6,7,8,9];
int[] c;
int[] d = [10];
int[] e = [0,0,0,0];

a[] += b[];       // result [7, 9, 11, 13], as expected.

c = a[] + b[];    // Error: Array operation a[] + b[] not 
implemented.

c[] = a[] + b[];  // result [], is a run-time assert on some 
compiler(s)/versions
d[] = a[] + b[];  // result [7], also a run-time assert for some 
compiler(s)/versions


My vision of how things could work:
c = a[] opBinary b[];
should be legal. It should create a new array that is then 
reference assigned to c.

d[] = a[] opBinary b[];
should be d[i] = a[i] + b[i] for all i in 0..length.
What should the length be? Do we silently truncate to the 
shortest array, or do we run-time assert (like ldc2 does, and as 
dmd did for a while between 2.060 and now)? Currently dmd (and 
gdc) does neither of these reliably, e.g.
d[] = a[] + b[] results in [7],
a[] = d[] + b[] results in [16, 32747, 38805832, 67108873]
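To make the proposed semantics concrete, here's a sketch in current D of the helper that `c = a[] + b[]` could lower to. (`addNew` is a hypothetical name for illustration, not an actual compiler or druntime function.)

```d
// Hypothetical lowering of `c = a[] + b[]`: allocate a fresh array,
// then reuse the existing non-allocating array op to fill it.
T[] addNew(T)(const(T)[] a, const(T)[] b)
{
    assert(a.length == b.length, "array operands must have equal lengths");
    auto c = new T[a.length];
    c[] = a[] + b[];   // existing in-place vector op, no allocation here
    return c;
}

unittest
{
    int[] a = [1, 2, 3, 4];
    int[] b = [6, 7, 8, 9];
    auto c = addNew(a, b);
    assert(c == [7, 9, 11, 13]);
}
```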

Another nice thing I miss from working in IDL, though I'm not 
sure how it would be possible in D:
given a multidimensional array I should be able to slice and 
index along any axis.
for example:
int[4][3] m = [[0,1,2,3],
                [4,5,6,7],
                [8,9,10,11]];
I can index vertically, i.e. m[1] == [4,5,6,7], but there's no 
syntactic sugar for indexing horizontally. Obviously m[][2] just 
gives me the 3rd row, so what could be a nice concise statement 
suddenly requires a manually written loop that the compiler has 
to work its way through, extracting the meaning (see Walter on 
this, here: http://www.drdobbs.com/loop-optimizations/229300270).

A possible approach, heavily tried and tested in numpy and IDL: 
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
http://www.atmos.umd.edu/~gcm/usefuldocs/IDL.html#operations

Use multiple indices within the brackets.
     m[1,2] would be identical to m[1][2], returning 6
     m[0..2,3] would return [3,7]
     m[,2] would give me [2,6,10]
     Alternative syntax could be m[*,2], m[:,2] or we could even 
require m[0..$,2], I don't know how much of a technical challenge 
each of these would be for parsing and lexing.
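As a rough illustration of what can already be emulated in library code today, here's a toy row-major wrapper using D's multi-argument opIndex. (`Matrix` and `column` are hypothetical names for this sketch, not part of the proposal or of any library.)

```d
// Toy 2-D wrapper: m[r, c] indexing via multi-argument opIndex,
// and a column() helper standing in for the proposed m[0..$, c].
struct Matrix(T)
{
    T[] data;            // row-major storage
    size_t rows, cols;

    // m[r, c]
    ref T opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }

    // extract column c as a new array (allocates; a real design
    // would likely return a strided view instead)
    T[] column(size_t c)
    {
        auto result = new T[rows];
        foreach (r; 0 .. rows)
            result[r] = data[r * cols + c];
        return result;
    }
}

unittest
{
    auto m = Matrix!int([0,1,2,3, 4,5,6,7, 8,9,10,11], 3, 4);
    assert(m[1, 2] == 6);             // same element as m[1][2]
    assert(m.column(2) == [2, 6, 10]); // the proposed m[,2]
}
```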

//An example, lets imagine a greyscale image, stored as an array 
of pixel rows:

double[][] img = read_bmp(fn,"grey");

//we want to crop it to some user defined co-ords (x1,y1),(x2,y2):

//Version A, current syntax

auto img_cropped = img[y1..y2].dup;
foreach(ref row; img_cropped) {
     row = row[x1..x2];
}
//3 lines of code for a very simple idea.

//Version B, new syntax

auto img_cropped = img[y1..y2, x1..x2];

//Very simple, easy to read code that is clear in its purpose.

I propose that Version B would be equivalent to A: An independent 
window on the data. Any reassignment of a row (i.e. pointing it 
to somewhere else, not copying new data in) will have no effect 
on the data. This scales naturally to higher dimensions and is in 
agreement with the normal slicing rules: the slice itself is 
independent of the original, but the data inside is shared.
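A small sketch of the shared-data / independent-window behaviour described above, using today's Version A code:

```d
// The window's row table is independent of the original, but the
// pixel data inside it is shared, matching normal slicing rules.
unittest
{
    double[][] img = [[0.0, 1.0, 2.0],
                      [3.0, 4.0, 5.0],
                      [6.0, 7.0, 8.0]];

    auto crop = img[0 .. 2].dup;   // independent array of row slices
    foreach (ref row; crop)
        row = row[1 .. 3];

    crop[0][0] = 42.0;             // element write goes through to img
    assert(img[0][1] == 42.0);

    crop[1] = [9.0, 9.0];          // rebinding a row: img untouched
    assert(img[1] == [3.0, 4.0, 5.0]);
}
```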

I believe this would be a significant improvement to D, 
particularly for image processing and scientific applications.

P.S.
As you can probably tell, I have no experience in compiler 
design! I may be missing something that makes all of this 
impossible/impractical. I also don't think this would have to 
cause any code breakage at all, but again, I could be wrong.

P.P.S.
I think there may be something quite wrong with how the frontend 
understands current array expression syntax... see here: 
http://dpaste.dzfl.pl/f4a931db
Nov 21 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2012 10:02 AM, John Colvin wrote:
 My vision of how things could work:
 c = a[] opBinary b[];
 should be legal. It should create a new array that is then reference assigned
to c.
This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.
Nov 21 2012
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 21 November 2012 at 18:15:51 UTC, Walter Bright 
wrote:
 On 11/21/2012 10:02 AM, John Colvin wrote:
 My vision of how things could work:
 c = a[] opBinary b[];
 should be legal. It should create a new array that is then 
 reference assigned to c.
This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.
We really need better error messages for this, though – Andrej? ;) David
Nov 21 2012
parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 11/21/12, David Nadlinger <see klickverbot.at> wrote:
 We really need better error messages for this, though – Andrej?
 ;)
Considering how slowly pulls are being merged... I don't think it's worth my time hacking on dmd. Anyway, I have other things I'm working on now.
Nov 21 2012
prev sibling next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 21 November 2012 at 18:15:51 UTC, Walter Bright 
wrote:
 On 11/21/2012 10:02 AM, John Colvin wrote:
 My vision of how things could work:
 c = a[] opBinary b[];
 should be legal. It should create a new array that is then 
 reference assigned to c.
This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.
Fair enough. When you say excessive pressure, is that performance pressure or design pressure?
Nov 21 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/21/12 1:20 PM, John Colvin wrote:
 On Wednesday, 21 November 2012 at 18:15:51 UTC, Walter Bright wrote:
 On 11/21/2012 10:02 AM, John Colvin wrote:
 My vision of how things could work:
 c = a[] opBinary b[];
 should be legal. It should create a new array that is then reference
 assigned to c.
This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.
Fair enough. When you say excessive pressure, is that performance pressure or design pressure?
Performance pressure - the design here is rather easy if efficiency is not a concern. Andrei
Nov 21 2012
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 21 November 2012 at 18:38:59 UTC, Andrei 
Alexandrescu wrote:
 On 11/21/12 1:20 PM, John Colvin wrote:
 On Wednesday, 21 November 2012 at 18:15:51 UTC, Walter Bright 
 wrote:
 On 11/21/2012 10:02 AM, John Colvin wrote:
 My vision of how things could work:
 c = a[] opBinary b[];
 should be legal. It should create a new array that is then 
 reference
 assigned to c.
This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.
Fair enough. When you say excessive pressure, is that performance pressure or design pressure?
Performance pressure - the design here is rather easy if efficiency is not a concern. Andrei
In what way does it become a performance problem? Apologies for the naive questions, I have nothing more than a passing understanding of how garbage collection works.
Nov 21 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2012 10:41 AM, John Colvin wrote:
 In what way does it become a performance problem?
Allocating memory is always much, much slower than not allocating memory. A design that forces allocating new memory and discarding the old as opposed to reusing existing already allocated memory is going to be far slower. In fact, the allocation/deallocation is going to dominate the performance timings, not the array operation itself. Generally, people who use array operations want them to be really fast.
Nov 21 2012
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 21 November 2012 at 20:16:59 UTC, Walter Bright 
wrote:
 On 11/21/2012 10:41 AM, John Colvin wrote:
 In what way does it become a performance problem?
Allocating memory is always much, much slower than not allocating memory. A design that forces allocating new memory and discarding the old as opposed to reusing existing already allocated memory is going to be far slower. In fact, the allocation/deallocation is going to dominate the performance timings, not the array operation itself. Generally, people who use array operations want them to be really fast.
Well yes, of course, I thought you meant something more esoteric. I'm not suggesting that we replace the current design, simply extend it. We'd have:

c[] = a[] + b[];

a fast, in-place array operation; the cost of allocation happens earlier in the code. But also:

c = a[] + b[];

a much slower, memory-allocating array operation, pretty much just shorthand for:

c = new T[a.length];
c[] = a[] + b[];

You could argue that hiding an allocation is bad, but I would think it's quite obvious to any programmer that if you add 2 arrays together, you're going to have to create somewhere to put them... Having the shorthand prevents any possible mistakes with the length of the new array and saves a line of code.

Anyway, this is a pretty trivial matter; I'd be more interested in seeing a definitive answer for what the correct behaviour of the statement a[] = b[] + c[] is when the arrays have different lengths.
Nov 22 2012
next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 22 November 2012 at 11:25:31 UTC, John Colvin wrote:
 Anyway, this is a pretty trivial matter, I'd be more interested 
 in seeing a definitive answer for what the correct behaviour 
 for the statement a[] = b[] + c[] is when the arrays have 
 different lengths.
I'd say the same as for "a[] += b[];": an assertion error. You have to compile druntime in non-release to see it though.
Nov 22 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2012 3:25 AM, John Colvin wrote:
 Anyway, this is a pretty trivial matter, I'd be more interested in seeing a
 definitive answer for what the correct behaviour for the statement a[] = b[] +
 c[] is when the arrays have different lengths.
An error.
Nov 22 2012
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 22 November 2012 at 20:58:25 UTC, Walter Bright 
wrote:
 On 11/22/2012 3:25 AM, John Colvin wrote:
 Anyway, this is a pretty trivial matter, I'd be more 
 interested in seeing a
 definitive answer for what the correct behaviour for the 
 statement a[] = b[] +
 c[] is when the arrays have different lengths.
An error.
Is monarch_dodra correct in saying that one would have to compile druntime in non-release to see this error? That would be a pity; couldn't this be implemented somehow so that it would depend on the user code being compiled non-release, not druntime?
Nov 22 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2012 6:11 PM, John Colvin wrote:
 An error.
Is monarch_dodra correct in saying that one would have to compile druntime in non-release to see this error? That would be a pity, couldn't this be implemented somehow so that it would depend on the user code being compiled non-release, not druntime?
I'd have to look at the specific code to see. In any case, it is an error. It takes a runtime check to do it, which can be turned on or off with the -noboundscheck switch.
Nov 22 2012
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 23 November 2012 at 06:41:06 UTC, Walter Bright wrote:
 On 11/22/2012 6:11 PM, John Colvin wrote:
 An error.
Is monarch_dodra correct in saying that one would have to compile druntime in non-release to see this error? That would be a pity, couldn't this be implemented somehow so that it would depend on the user code being compiled non-release, not druntime?
I'd have to look at the specific code to see. In any case, it is an error. It takes a runtime check to do it, which can be turned on or off with the -noboundscheck switch.
I originally opened this some time back, related to opSlice operations not error-ing:

http://d.puremagic.com/issues/show_bug.cgi?id=8650

I've since learned to build druntime as non-release, which "fixes" the problem. I don't know if you plan to change anything about this, but just wanted to point out there's a Bugzilla entry for it.
Nov 22 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2012 10:49 PM, monarch_dodra wrote:
 I originally opened this some time back, related to opSlice operations not
 error-ing:
 http://d.puremagic.com/issues/show_bug.cgi?id=8650

 I've since learned to build druntime as non-release, which "fixes" the problem.

 I don't know if you plan to change anything about this, but just wanted to
 point out there's a Bugzilla entry for it.
Thank you.
Nov 23 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2012 3:25 AM, John Colvin wrote:
 c[] = a[] + b[];
 fast, in place array operation, the cost of allocation happens earlier in the
code.

 but also
 c = a[] + b[];
 a much slower, memory assigning array operation, pretty much just shorthand for
 c = new T[a.length];
 c[] = a[] + b[];

 You could argue that hiding an allocation is bad, but I would think it's quite
 obvious to any programmer that if you add 2 arrays together, you're going to
 have to create somewhere to put them... Having the shorthand prevents any
 possible mistakes with the length of the new array and saves a line of code.
I'll be bold and predict what will happen if this proposal is implemented: "Array operations in D are cool but are incredibly slow. D sux." Few will notice that the hidden memory allocation can be easily removed, certainly not people casually looking at D to see if they should use it, and the damage will be done.
Nov 22 2012
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/23/2012 1:02 AM, Walter Bright writes:

 I'll be bold and predict what will happen if this proposal is implemented:

      "Array operations in D are cool but are incredibly slow. D sux."

 Few will notice that the hidden memory allocation can be easily removed,
 certainly not people casually looking at D to see if they should use it,
 and the damage will be done.
Expanding on it and adding more serious reasoning:

Array ops are supposed to be overhead-free loops transparently leveraging the SIMD parallelism of modern CPUs. No more and no less. It's like auto-vectorization, but it's guaranteed and obvious in form.

Now if array ops did the checking for matching lengths it would slow them down. And that's something you can't turn off when you know the lengths match, as it's a built-in. Ditto for checking whether the left side is already allocated and allocating if not (but that's even worse).

Basically, you can't build the fastest primitive out of something wrapped in safeguards. Doing it the other way around is easy, for example by defining a special wrapper type with a custom opSlice, opSliceAssign etc. that does the checks.

-- 
Dmitry Olshansky
Nov 22 2012
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky 
wrote:
 Array ops supposed to be overhead-free loops transparently 
 leveraging SIMD parallelism of modern CPUs. No more and no 
 less. It's like auto-vectorization but it's guaranteed and 
 obvious in the form.
I disagree that array ops are only for speed. I would argue that their primary significance lies in their ability to make code significantly more readable, and more importantly, writeable.

For example, the vector distance between 2 position vectors can be written as:

dv[] = v2[] - v1[]
or
dv = v2[] - v1[]

anyone with an understanding of mathematical vectors instantly understands the general intent of the code.

With documentation something vaguely like this:

"An array is a reference to a chunk of memory that contains a list of data, all of the same type. v[] means the set of elements in the array, while v on its own refers to just the reference. Operations on sets of elements, e.g. dv[] = v2[] - v1[], work element-wise along the arrays {insert mathematical notation and picture of 3 arrays as columns next to each other etc.}. Array operations can be very fast, as they are sometimes lowered directly to cpu vector instructions. However, be aware of situations where a new array has to be created implicitly, e.g. dv = v2[] - v1[]; Let's look at what this really means: we are asking for dv to be set to refer to the vector difference between v2 and v1. Note we said nothing about the current elements of dv; it might not even have any! This means we need to put the result of v2[] - v1[] in a new chunk of memory, which we then set dv to refer to. Allocating new memory takes time, potentially taking a lot longer than the array operation itself, so if you can, avoid it!",

anyone with the most basic programming and mathematical knowledge can write concise code operating on arrays, taking advantage of the potential speedups while being aware of the pitfalls.

In short: vector syntax/array ops is/are great. Concise code that's easy to read and write. They fulfill one of the guiding principles of D: the most obvious code is fast and safe (or if not 100% safe, at least not too error-prone).

More vector syntax capabilities please!
Nov 22 2012
next sibling parent "Robert Jacques" <rjacque2 live.johnshopkins.edu> writes:
On Thu, 22 Nov 2012 20:06:44 -0600, John Colvin  
<john.loughran.colvin gmail.com> wrote:
 On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote:
 Array ops supposed to be overhead-free loops transparently leveraging  
 SIMD parallelism of modern CPUs. No more and no less. It's like  
 auto-vectorization but it's guaranteed and obvious in the form.
I disagree that array ops are only for speed. I would argue that their primary significance lies in their ability to make code significantly more readable, and more importantly, writeable. For example, the vector distance between 2 position vectors can be written as: dv[] = v2[] - v1[] or dv = v2[] - v1[] anyone with an understanding of mathematical vectors instantly understands the general intent of the code. With documentation something vaguely like this: "An array is a reference to a chunk of memory that contains a list of data, all of the same type. v[] means the set of elements in the array, while v on it's own refers to just the reference. Operations on sets of elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays {insert mathematical notation and picture of 3 arrays as columns next to each other etc.}. Array operations can be very fast, as they are sometimes lowered directly to cpu vector instructions. However, be aware of situations where a new array has to be created implicitly, e.g. dv = v2[] - v1[]; Let's look at what this really means: we are asking for dv to be set to refer to the vector difference between v2 and v1. Note we said nothing about the current elements of dv, it might not even have any! This means we need to put the result of v2[] - v1] in a new chunk of memory, which we then set dv to refer to. Allocating new memory takes time, potentially taking a lot longer than the array operation itself, so if you can, avoid it!", anyone with the most basic programming and mathematical knowledge can write concise code operating on arrays, taking advantage of the potential speedups while being aware of the pitfalls. In short: Vector syntax/array ops is/are great. Concise code that's easy to read and write. They fulfill one of the guiding principles of D: the most obvious code is fast and safe (or if not 100% safe, at least not too error-prone). More vector syntax capabilities please!
While I think implicit allocation is a good idea in the case of variable initialization, i.e.:

auto dv = v2[] - v1[];

as a general statement, i.e. dv = v2[] - v1[];, it could just as easily be a typo and result in a silent and hard-to-find performance bug.

// An alternative syntax for variable initialization by an array operation expression:
auto dv[] = v2[] - v1[];
Nov 22 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/23/2012 6:06 AM, John Colvin writes:
 On Thursday, 22 November 2012 at 21:37:19 UTC, Dmitry Olshansky wrote:
 Array ops supposed to be overhead-free loops transparently leveraging
 SIMD parallelism of modern CPUs. No more and no less. It's like
 auto-vectorization but it's guaranteed and obvious in the form.
 I disagree that array ops are only for speed.
Well that and intuitive syntax.
 I would argue that their primary significance lies in their ability to
 make code significantly more readable, and more importantly, writeable.
 For example, the vector distance between 2 position vectors can be
 written as:
 dv[] = v2[] - v1[]
 or
 dv = v2[] - v1[]
 anyone with an understanding of mathematical vectors instantly
 understands the general intent of the code.
Mathematical sense doesn't take into account that arrays occupy memory, and more generally the cost of operations. Also:

dv = v2 - v1

is plenty obvious, thus structs + operator overloading cover the usability side of this problem. Operating on raw arrays directly as N-dimensional vectors is fine, but it hardly helps maintainability/readability as the program grows over time.
 With documentation something vaguely like this:
 "An array is a reference to a chunk of memory that contains a list of
 data, all of the same type. v[] means the set of elements in the array,
 while v on it's own refers to just the reference. Operations on sets of
 elements e.g. dv[] = v2[] - v1[] work element-wise along the arrays
 {insert mathematical notation and picture of 3 arrays as columns next to
 each other etc.}.
So far so good, but I'd rather not use 'list' to define an array, nor 'set' for its elements. Semantically v[] means a slice of the whole array - nothing more and nothing less.
 Array operations can be very fast, as they are sometimes lowered
 directly to cpu vector instructions. However, be aware of situations
 where a new array has to be created implicitly, e.g. dv = v2[] - v1[];
 Let's look at what this really means: we are asking for dv to be set to
 refer to the vector difference between v2 and v1. Note we said nothing
 about the current elements of dv, it might not even have any! This means
 we need to put the result of v2[] - v1] in a new chunk of memory, which
 we then set dv to refer to. Allocating new memory takes time,
 potentially taking a lot longer than the array operation itself, so if
 you can, avoid it!",
IMHO I'd shoot this kind of documentation on sight. "There is a fast tool, but here is our peculiar set of rules that makes certain constructs slow as a pig. So, watch out!" Isn't that convenient?
 anyone with the most basic programming and mathematical knowledge can
 write concise code operating on arrays, taking advantage of the
 potential speedups while being aware of the pitfalls.
People typically are not aware as long as it seems to work.
 In short:
 Vector syntax/array ops is/are great. Concise code that's easy to read
 and write. They fulfill one of the guiding principles of D: the most
 obvious code is fast and safe (or if not 100% safe, at least not too
 error-prone).
This change fits a scripting language more than a systems one. For me

a[] = b[] + c[];

implies:

a[0..$] = b[0..$] + c[0..$]

so it's obvious that the lengths had better match and that 'a' must be preallocated.
 More vector syntax capabilities please!
It would have been nice to be able to write things like:

a[] = min(b[], c[]);

where min is a regular function. But again I don't see the pressing need:
- if speed is the concern, then an 'arbitrary function' can't be sped up much by hardware;
- if flexibility, then range-style operations are far more flexible.

-- 
Dmitry Olshansky
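For the record, a range-style element-wise min is already expressible with std.algorithm and std.range; a sketch of that "far more flexible" alternative, not a proposed built-in:

```d
import std.algorithm : copy, map, min;
import std.range : zip;

unittest
{
    int[] a = new int[4];      // preallocated destination
    int[] b = [1, 7, 3, 9];
    int[] c = [4, 2, 8, 5];

    // lazily pair up b and c, take the min of each pair, copy into a
    zip(b, c).map!(t => min(t[0], t[1])).copy(a);
    assert(a == [1, 2, 3, 5]);
}
```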
Nov 23 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/23/2012 7:58 AM, Dmitry Olshansky wrote:
 anyone with the most basic programming and mathematical knowledge can
 write concise code operating on arrays, taking advantage of the
 potential speedups while being aware of the pitfalls.
People typically are not aware as long as it seems to work.
As an example, bearophile is an experienced programmer. He just posted two loops, one using pointers and another using arrays, and was mystified why the array version was slower. He even posted the assembler output, where it was pretty obvious (to me, anyway) that he had array bounds checking turned on in the array version, which will slow it down. So yes, it's a problem when subtle changes in code can result in significant slowdowns, and yes, even experienced programmers get caught by that.
Nov 23 2012
prev sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 21 November 2012 at 18:15:51 UTC, Walter Bright 
wrote:
 On 11/21/2012 10:02 AM, John Colvin wrote:
 My vision of how things could work:
 c = a[] opBinary b[];
 should be legal. It should create a new array that is then 
 reference assigned to c.
This is not done because it puts excessive pressure on the garbage collector. Array ops do not allocate memory by design.
But if they wanted it anyways, could implement it as a struct... Here's a rough build... Should be fairly obvious what's happening.

struct AllocatingVectorArray(T)
{
    T[] data;
    alias data this;
    alias AllocatingVectorArray AVA;

    // forces slice operations for vector format only
    static struct AVASlice
    {
        T[] data;
        alias data this;

        this(T[] rhs) { data = rhs; }

        AVA opBinary(string op)(const AVASlice rhs)
        {
            assert(rhs.length == data.length,
                   "Lengths don't match, cannot use vector operations");
            AVA var;
            var.data = data.dup;
            mixin("var[] " ~ op ~ "= rhs[];");
            return var;
        }
    }

    this(T[] rhs) { data = rhs; }

    ref AVA opAssign(T[] rhs)
    {
        data = rhs;
        return this;
    }

    AVASlice opSlice()
    {
        return AVASlice(this);
    }
}

unittest
{
    alias AllocatingVectorArray!int AVAint;
    AVAint a = [1,2,3,4];
    AVAint b = [5,6,7,8];
    AVAint c;

    // c = a + b;  // not allowed, 'not implemented error'
    // assert(c == [6,8,10,12]);

    c = a[] + b[];    // known vector syntax
    assert(c == [6,8,10,12]);

    c[] = a[] + b[];  // more obvious what's happening
    assert(c == [6,8,10,12]);
}
Nov 23 2012
prev sibling parent reply Mike Wey <mike-wey example.com> writes:
On 11/21/2012 07:02 PM, John Colvin wrote:
 //An example, lets imagine a greyscale image, stored as an array of
 pixel rows:

 double[][] img = read_bmp(fn,"grey");

 //we want to crop it to some user defined co-ords (x1,y1),(x2,y2):

 //Version A, current syntax

 auto img_cropped = img[y1..y2].dup;
 foreach(ref row; img_cropped) {
      row = row[x1..x2];
 }
 //3 lines of code for a very simple idea.

 //Version B, new syntax

 auto img_cropped = img[y1..y2, x1..x2];

 //Very simple, easy to read code that is clear in its purpose.

 I propose that Version B would be equivalent to A: An independent window
 on the data. Any reassignment of a row (i.e. pointing it to somewhere
 else, not copying new data in) will have no effect on the data. This
 scales naturally to higher dimensions and is in agreement with the
 normal slicing rules: the slice itself is independent of the original,
 but the data inside is shared.

 I believe this would be a significant improvement to D, particularly for
 image processing and scientific applications.
If you want to use this syntax with images, DMagick's ImageView might be interesting: http://dmagick.mikewey.eu/docs/ImageView.html -- Mike Wey
Nov 21 2012
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
 If you want to use this syntax with images, DMagick's ImageView 
 might be interesting:
 http://dmagick.mikewey.eu/docs/ImageView.html
I like it :) From what I can see it provides exactly what I'm talking about for 2D. I haven't looked at the implementation in detail, but do you think such an approach could be scaled up to arbitrary N-dimensional arrays?
Nov 22 2012
next sibling parent "Robert Jacques" <rjacque2 live.johnshopkins.edu> writes:
On Thu, 22 Nov 2012 06:10:04 -0600, John Colvin  
<john.loughran.colvin gmail.com> wrote:

 On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
 If you want to use this syntax with images, DMagick's ImageView might  
 be interesting:
 http://dmagick.mikewey.eu/docs/ImageView.html
I like it :) From what I can see it provides exactly what i'm talking about for 2D. I haven't looked at the implementation in detail, but do you think that such an approach could be scaled up to arbitrary N-dimensional arrays?
Yes and no. Basically, like an array, an ImageView is a thick pointer, and as the dimensions increase the pointer gets thicker by 1-2 words per dimension. And each indexing or slicing operation has to create a temporary with this framework, which leads to stack churn as the dimensions get large.

Another syntax that can be used until we get true multi-dimensional slicing is opIndex with int[2] arguments, i.e.:

view[[4,40],[5,50]] = new Color("red");
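The int[2]-argument trick can be sketched like this. (`View` here is a hypothetical type for illustration, not DMagick's actual ImageView.)

```d
// Rectangular assignment via opIndexAssign taking [start, end)
// pairs as static int[2] arguments: v[[y1,y2], [x1,x2]] = value;
struct View
{
    int[][] pixels;

    void opIndexAssign(int value, int[2] ys, int[2] xs)
    {
        foreach (y; ys[0] .. ys[1])
            pixels[y][xs[0] .. xs[1]] = value;  // built-in slice assign
    }
}

unittest
{
    auto v = View(new int[][](4, 4));
    v[[1, 3], [0, 2]] = 7;   // rows 1..3, columns 0..2
    assert(v.pixels[0] == [0, 0, 0, 0]);
    assert(v.pixels[1] == [7, 7, 0, 0]);
    assert(v.pixels[2] == [7, 7, 0, 0]);
}
```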
Nov 22 2012
prev sibling parent Mike Wey <mike-wey example.com> writes:
On 11/22/2012 01:10 PM, John Colvin wrote:
 On Wednesday, 21 November 2012 at 19:40:25 UTC, Mike Wey wrote:
 If you want to use this syntax with images, DMagick's ImageView might
 be interesting:
 http://dmagick.mikewey.eu/docs/ImageView.html
I like it :) From what I can see it provides exactly what i'm talking about for 2D. I haven't looked at the implementation in detail, but do you think that such an approach could be scaled up to arbitrary N-dimensional arrays?
Every dimension has its own type, so it won't scale well to a lot of dimensions. When slicing, every dimension would create a temporary.

-- 
Mike Wey
Nov 22 2012