digitalmars.D - Negative array indices?

reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?

* How about allowing to drop one of the bounds of a range to indicate
beginning or end of the array?

An example should make clear what I mean:

        char[] str = "0123456789";
        assert(str[2] == '2');
        assert(str[-3] == '7');
        assert(str[1..3] == "12");
        assert(str[..4] == "0123"); // just for completeness;
                                    // could be written as str[0..4] just as well
        str[-11]; // throws ArrayBoundsError

        assert(str[7..] == "789");
        assert(str[-4..] == "6789");
        assert(str[4..-2] == "4567");

One question I could not resolve for myself: should illegal ranges throw an
ArrayBoundsError or return an empty/truncated string? One way would be an
extremely tolerant behaviour like:

        assert(str[4..2] == "");
        assert(str[-2..3] == "");

        assert(str[7..22] == "789");
        assert(str[-7..8] == "34567");

Alternatively, one could argue that each case should throw an
ArrayBoundsError. What is the current behaviour?

Ciao,
Nobbi
May 06 2004
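
The semantics proposed in the post above can be sketched in present-day D as a small library helper. This is only an illustration, not built-in behaviour: `fromEnd` is a hypothetical name, and `string` is used instead of `char[]` because string literals are immutable in current compilers.

```d
// Hypothetical helper: map a possibly negative index onto an ordinary one.
// A negative value counts back from the end; anything still out of range
// (e.g. -11 for a 10-character string) wraps to a huge unsigned value and
// is left for the normal array bounds check to reject.
size_t fromEnd(size_t length, ptrdiff_t i)
{
    return i < 0 ? length - cast(size_t)(-i) : cast(size_t) i;
}

void main()
{
    string str = "0123456789";
    assert(str[fromEnd(str.length, 2)] == '2');
    assert(str[fromEnd(str.length, -3)] == '7');
    assert(str[fromEnd(str.length, -4) .. str.length] == "6789");
    assert(str[4 .. fromEnd(str.length, -2)] == "4567");
}
```

Note that the helper has to test the sign of every index at run time, which is the cost a built-in version would also pay whenever the index is not a compile-time constant.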
next sibling parent "Matthew" <matthew.hat stlsoft.dot.org> writes:
"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c7co59$11i4$1 digitaldaemon.com...
 Hi there,

 I wonder whether this has been discussed before:

 * How about allowing negative array indices to count backward from the end
 of the array?
That stinks
 * How about allowing to drop one of the bounds of a range to indicate
 beginning or end of the array?
That doesn't.
May 06 2004
prev sibling parent reply J Anderson <REMOVEanderson badmama.com.au> writes:
Norbert Nemec wrote:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?
  
Yes it has.

--
-Anderson: http://badmama.com.au/~anderson/
May 06 2004
parent reply "Matthew" <matthew.hat stlsoft.dot.org> writes:
To be less facetious. The reason it's a very bad idea is that array subscripting
in C and C++ and D can be done with signed integers because it is legal _and
meaningful_ to pass a -ve subscript to mean prior to the given base (pointer
and/or array).

Since D's support of C constructs most certainly encompasses this very
important, albeit dangerous, construct, it would be nonsensical to have
built-in D arrays use a back-relative offset and pointers use -ve offset. It
would be a total killer.

"J Anderson" <REMOVEanderson badmama.com.au> wrote in message
news:c7d5bc$1lsl$1 digitaldaemon.com...
 Norbert Nemec wrote:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?
Yes it has.

--
-Anderson: http://badmama.com.au/~anderson/
May 06 2004
next sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Matthew wrote:

 To be less facetious. The reason it's a very bad idea is that array
 subscripting in C and C++ and D can be done with signed integers because
 it is legal _and meaningful_ to pass a -ve subscript to mean prior to the
 given base (pointer and/or array).
 
 Since D's support of C constructs most certainly encompasses this very
 important, albeit dangerous, construct, it would be nonsensical to have
 built-in D arrays use a back-relative offset and pointers use -ve offset.
 It would be a total killer.
OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

The < as a prefix does not exist yet and would only be valid for indices or
range bounds, just like the .. infix operator. Alternative syntax proposals
are probably easy to find. In general, some way to indicate this "counting
back from the end" really improves the string handling qualities of a
language.

Ciao,
Nobbi
May 06 2004
next sibling parent J Anderson <REMOVEanderson badmama.com.au> writes:
Norbert Nemec wrote:

Matthew wrote:

  

To be less facetious. The reason it's a very bad idea is that array
subscripting in C and C++ and D can be done with signed integers because
it is legal _and meaningful_ to pass a -ve subscript to mean prior to the
given base (pointer and/or array).

Since D's support of C constructs most certainly encompasses this very
important, albeit dangerous, construct, it would be nonsensical to have
built-in D arrays use a back-relative offset and pointers use -ve offset.
It would be a total killer.
    
OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

The < as a prefix does not exist yet and would only be valid for indices or
range bounds, just like the .. infix operator. Alternative syntax proposals
are probably easy to find. In general, some way to indicate this "counting
back from the end" really improves the string handling qualities of a
language.

Ciao,
Nobbi

This has been discussed before, a few times. It's only syntax sugar but a
good idea.

--
-Anderson: http://badmama.com.au/~anderson/
May 06 2004
prev sibling parent reply Mark <Mark_member pathlink.com> writes:
OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");
How about defining a symbol for the str.length such as '$', then

        char[] str = "0123456789";
        assert(str[$-3] == '7');
        assert(str[$-4..$] == "6789");
        assert(str[4..$-2] == "4567");

and since '$' already has the meaning "end of line" in other contexts
readability is maintained without the clutter of 'str.length'.

Mark.
May 06 2004
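
For readers who want to try the idea: D compilers today accept `$` inside an index or slice expression as the length of the array being indexed, so the examples above can be written almost verbatim. A minimal sketch, assuming a current compiler (hence `string` rather than `char[]` for the literal):

```d
void main()
{
    string str = "0123456789";

    // Inside the brackets, $ stands for str.length.
    assert(str[$ - 3] == '7');
    assert(str[$ - 4 .. $] == "6789");
    assert(str[4 .. $ - 2] == "4567");

    // Because $ is an ordinary value, arbitrary arithmetic works too.
    assert(str[0 .. $ / 2] == "01234");
    assert(str[$ - 3] == str[str.length - 3]);
}
```

Unlike a negative index, `$ - 3` is plain arithmetic on the length, so no extra sign test is needed beyond the usual bounds check.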
next sibling parent "Matthew" <matthew.hat stlsoft.dot.org> writes:
"Mark" <Mark_member pathlink.com> wrote in message
news:c7dmj7$2lsu$1 digitaldaemon.com...
OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

How about defining a symbol for the str.length such as '$', then

        char[] str = "0123456789";
        assert(str[$-3] == '7');
        assert(str[$-4..$] == "6789");
        assert(str[4..$-2] == "4567");

and since '$' already has the meaning "end of line" in other contexts
readability is maintained without the clutter of 'str.length'.

Weird city. I was thinking just the same thing, but without the erudite
rationale.

Consider yourself "hear, hear"'d!
May 06 2004
prev sibling next sibling parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Mark wrote:
 How about defining a symbol for the str.length such as '$', then
 
 char[] str = "0123456789";
 assert(str[$-3] == '7');
 assert(str[$-4..$] == "6789");
 assert(str[4..$-2] == "4567");
 
 and since '$' already has the meaning "end of line" in other contexts
 readability is maintained without the clutter of 'str.length'.
I like that one! It would not only be more readable but also more flexible
than my idea. Consider for example things like:

        str[..$/2]

It would make even more sense for multidimensional arrays, since there
str.length would become str.range[i] with i indexing the different
dimensions. Just consider:

        mymatrix[..$-2,..$-2]

as syntactic sugar for

        mymatrix[0..mymatrix.range[0]-2,0..mymatrix.range[1]-2]
May 06 2004
prev sibling parent Derek <ddparnell bigpond.com> writes:
On Thu, 6 May 2004 15:45:43 +0000 (UTC), Mark wrote:

OK, I understand this. How about finding another syntax for describing a
range bound that is counted backwards from the end of an array? E.g.

        char[] str = "0123456789";
        assert(str[<3] == '7');
        assert(str[<4..] == "6789");
        assert(str[4..<2] == "4567");

How about defining a symbol for the str.length such as '$', then

        char[] str = "0123456789";
        assert(str[$-3] == '7');
        assert(str[$-4..$] == "6789");
        assert(str[4..$-2] == "4567");

and since '$' already has the meaning "end of line" in other contexts
readability is maintained without the clutter of 'str.length'.

Agreed. I have been advocating this exact same thing for the Euphoria
language for ages now.

--
Derek
Melbourne, Australia
May 07 2004
prev sibling parent reply "Harvey Stroud" <hstroud ntlworld.com> writes:
It seems to me that supporting the negative array index of C for the sake of
backward compatibility goes against the design philosophy for D, which as I
see it,  is the keeping of the general look and feel of C++ while discarding
dubious features of which -ve array indexing is surely one?  Wouldn't it
make sense to remove this dangerous behaviour from the language, or better
to replace it with an alternative meaning. Besides, how many people out
there actually use indexing in this way, although maybe for pointer
manipulation it could be useful, albeit error prone.

Introducing a special operator ($) to denote the length strikes me as
ungainly, making the code more Perl-like, but perhaps that's just my dislike
of non-C symbols.

Has anybody given any thought to an [optional] stride value:

int[] x = a[1..10 : 2];    // Gets every other element of the array

Cheers,
Harvey.


"Matthew" <matthew.hat stlsoft.dot.org> wrote in message
news:c7d5p2$1n2f$1 digitaldaemon.com...
 To be less facetious. The reason it's a very bad idea is that array
 subscripting in C and C++ and D can be done with signed integers because it
 is legal _and meaningful_ to pass a -ve subscript to mean prior to the given
 base (pointer and/or array).

 Since D's support of C constructs most certainly encompasses this very
 important, albeit dangerous, construct, it would be nonsensical to have
 built-in D arrays use a back-relative offset and pointers use -ve offset.
 It would be a total killer.

 "J Anderson" <REMOVEanderson badmama.com.au> wrote in message
 news:c7d5bc$1lsl$1 digitaldaemon.com...
 Norbert Nemec wrote:

Hi there,

I wonder whether this has been discussed before:

* How about allowing negative array indices to count backward from the end
of the array?
Yes it has.

--
-Anderson: http://badmama.com.au/~anderson/
May 09 2004
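
The `a[1..10 : 2]` syntax suggested above is not part of the language, but the same effect can be approximated with the standard library. The sketch below uses `std.range.stride` on an ordinary slice; the wrapper name `strided` is hypothetical and only there to mirror the three numbers in the proposal.

```d
import std.array : array;
import std.range : stride;

// Hypothetical wrapper mirroring a[lo .. hi : step] from the post above.
// stride() itself is lazy; .array copies the selected elements into a
// new int[] so the result can be used like any other array.
int[] strided(int[] a, size_t lo, size_t hi, size_t step)
{
    return a[lo .. hi].stride(step).array;
}

void main()
{
    int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11];
    int[] x = strided(a, 1, 10, 2);   // every other element of a[1 .. 10]
    assert(x == [1, 3, 5, 7, 9]);
}
```

A built-in strided slice would presumably avoid the copy by storing the stride in the array reference, which is what the multidimensional array proposal mentioned later in the thread aims at.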
next sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
Harvey Stroud wrote:
 It seems to me that supporting the negative array index of C for the sake of
 backward compatibility goes against the design philosophy for D, which as I
 see it,  is the keeping of the general look and feel of C++ while discarding
 dubious features of which -ve array indexing is surely one?
We already do discard this. It's called array bounds checking.
 Wouldn't it make sense to remove this dangerous behaviour from the language,
 or better to replace it with an alternative meaning.
There would be a performance hit if we had to check at runtime whether every
index is +ve or -ve, wherever it can't be determined at compile time.
 Besides, how many people out there actually use indexing in this way,
 although maybe for pointer manipulation it could be useful, albeit error
 prone.
Well, D&DP in general is almost bound to be error prone. But since preserving
D&DP support is part of D's design strategy, perhaps it should be kept.
 Introducing a special operator ($) to denote the length strikes me as
 ungainly, making the code more Perl-like, but perhaps that's just my dislike
 of non-C symbols.
Do you have an idea for a nicer symbol to use for this?
 Has anybody given any thought to an [optional] stride value:
 
 int[] x = a[1..10 : 2];    // Gets every other element of the array
<snip top of upside-down reply>

Not yet AFAIK.

Stewart.

--
My e-mail is valid but not my primary mailbox, aside from its being the
unfortunate victim of intensive mail-bombing at the moment. Please keep
replies on the 'group where everyone may benefit.
May 10 2004
prev sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Harvey Stroud wrote:

 It seems to me that supporting the negative array index of C for the sake
 of backward compatibility goes against the design philosophy for D...
For pointers, negative indices actually make sense. If you allow indexing of
raw pointers (which I think is a good idea) then prohibiting negative indices
would be strange. For arrays, negative indices are, of course, caught by the
range checking mechanism.

Raw pointers, of course, are error prone. Anyway, it's the philosophy of D to
give the developer all the tools to shoot himself in the foot, but make it
clear what the dangerous tools are, and encourage him to avoid these tools
completely.
 Introducing a special operator ($) to denote the length strikes me as
 ungainly, making the code more Perl-like, but perhaps that's just my
 dislike of non-C symbols.
That's just personal taste. $ has no meaning in D so far, and it is a plain
ASCII character. Why not put it to use?

B.t.w.: in the suggested meaning, $ would not be a normal operator at all,
but something special that does not exist in D so far: a "zero-ary operator"
or however you want to call it.
 Has anybody given any thought to an [optional] stride value:
 
 int[] x = a[1..10 : 2];    // Gets every other element of the array
See my multidimension array proposal at

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html

it contains strided slices and much more.
May 10 2004
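
The distinction drawn above between raw pointers and arrays can be seen in a few lines. This is a small sketch with made-up data; `elementBefore` is a hypothetical helper name.

```d
// Hypothetical helper: read the element just before the one p points at.
// For a raw pointer, p[-1] is simply *(p - 1); nothing checks that the
// address is valid, which is what makes it useful and dangerous.
int elementBefore(int* p)
{
    return p[-1];
}

void main()
{
    int[5] data = [10, 20, 30, 40, 50];
    int* p = &data[2];              // points at the middle element

    assert(p[0] == 30);
    assert(p[-1] == 20);            // meaningful: one element before p
    assert(p[-2] == 10);
    assert(elementBefore(p) == *(p - 1));

    // Indexing the array itself with a negative value is a different story:
    // it is out of bounds and is rejected by bounds checking when enabled.
}
```

This is why giving negative indices a "count from the end" meaning for arrays while leaving pointer indexing untouched would make the same expression mean two different things.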
next sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Norbert Nemec wrote:

<snip>
 See my multidimension array proposal at
 
 http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 
 it contains strided slices and much more.
"The former possibility to assign to the .size property of an array should be
dropped since it is obscuring what actually happens:"

Did you mean .size or .length? If .length, then I'm not sure I'd agree with
you.

"upsizing on the other hand, means allocating new memory and copying the
existing data. For this operation, a different syntax should be found that
makes clear what is happening."

Not necessarily. It could mean simply changing the range, filling up already
allocated growing space.

The point of D isn't to have the programmer concern him/herself with the
inner workings of everything. If they wanted that, they'd probably use
assembly. Or maybe compromise with plain old C.

The idea of D is to support syntax that makes sense to the human programmer,
while allowing the compiler to implement it efficiently.

        "M[a][b] = new mytype();

Be aware of the difference between the type declaration mytype[B][A] and the
dereferenciation mytype[a][b]."

You mean "the dereferenciation M[a][b]"?

"In its full generality, this internal representation would, of course, allow
all kinds of weird shapes and self-overlappings."

And word/dword-alignment of rows where the individual elements may be
byte-aligned, if there are benefits to that.

"The property .diag() sums up all strides and returns a one-dimensional array
reference corresponding to the total diagonal of the original array."

What if the array isn't square/cube/tesseract/general hypercube? Would it
just count the shortest dimension, i.e. as far as the diagonal remains inside
the array?

What conversions would be allowed between multidimensional arrays and
old-fashioned D linear arrays? Even something that can be understood by
third-party foreign code?

I'd also suggest running the proposal through a spellchecker at some point.

Stewart.
May 10 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Stewart Gordon wrote:

 Norbert Nemec wrote:
 
 <snip>
 See my multidimension array proposal at
 
http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 
 it contains strided slices and much more.
 
 "The former possibility to assign to the .size property of an array should
 be dropped since it is obscuring what actually happens:"
 
 Did you mean .size or .length?
I meant .length, of course. Sorry about that error.
 If .length, then I'm not sure I'd agree with you.
 
 "upsizing on the other hand, means allocating new memory and copying the
 existing data. For this operation, a different syntax should be found
 that makes clear what is happening."
 
 Not necessarily.  It could mean simply changing the range, filling up
 already allocated growing space.
Even worse! If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets completely
uncontrollable.

The current situation is: dynamic arrays actually are references to the heap.
Two arrays may reference the same portion of the heap, so changing one will
change the other. Anyhow, the language does everything to obscure this fact
and make it rather hard to predict when it happens. Unless you really know
the details, you will often call .dup without need, and, conversely, you will
have trouble if you trust that two arrays refer to the same space.
 The point of D isn't to have the programmer concern him/herself with the
 inner workings of everything.  If they wanted that, they'd probably use
 assembly.  Or maybe compromise with plain old C.
That is true, but I'm not talking about implementation details but about semantics. Dynamic arrays are references and behave like references. The language tries to hide this from the developer but does not do so completely, resulting in behaviour that is hard to predict and hard to understand.
 The idea of D is to support syntax that makes sense to the human
 programmer, while allowing the compiler to implement it efficiently.
True, but if you hide implementation details, this should be done completely. There is a semantic difference between reference and value types. Currently, dynamic arrays behave somewhere in between, making it very confusing to use them efficiently.
          "M[a][b] = new mytype();
 
 Be aware of the difference between the type declaration mytype[B][A] and
 the dereferenciation mytype[a][b]."
 
 You mean "the dereferenciation M[a][b]"?
True, another typo.
 "The property .diag() sums up all strides and returns a one-dimensional
 array reference corresponding to the total diagonal of the original
 array."
 
 What if the array isn't square/cube/tesseract/general hypercube?  Would
 it just count the shortest dimension, i.e. as far as the diagonal
 remains inside the array?
True, I thought about picking the smallest range. Should have said so, I guess.
 What conversions would be allowed between multidimensional arrays and
 old-fashioned D linear arrays?
Old-fashioned D arrays, or "lightweight array references" as they are called
in the proposal, are implicitly cast to mytype[[1]] (trivially setting the
stride to 1). In the other direction, a direct cast does not make sense,
since the stride might be != 1. I was thinking about having mytype[[1]].dup
return a mytype[] reference. This would avoid the need for another language
construct. Not sure about it yet.
 Even something that can be understood by third-party foreign code?
Conversion to Fortran arrays is already trivially possible. (A few convenience functions might make it even more comfortable.) Everything else should be easy to implement.
 I'd also suggest running the proposal through a spellchecker at some
 point.
Sorry - missed out on that, I guess... :-(
May 10 2004
next sibling parent reply J Anderson <REMOVEanderson badmama.com.au> writes:
Norbert Nemec wrote:

If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.
  
It won't get uncontrollable if you're careful not to create persistent
aliases of the same memory location. You would have to do that any way you
look at it. Just because C had malloc and realloc didn't change this problem
at all. Please give a good source example of where D arrays fail you.

--
-Anderson: http://badmama.com.au/~anderson/
May 10 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
J Anderson wrote:

 Norbert Nemec wrote:
 
If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.
  
It won't get uncontrollable if you're careful not to create persistent
aliases of the same memory location. You would have to do that any way you
look at it. Just because C had malloc and realloc didn't change this problem
at all. Please give a good source example of where D arrays fail you.
What does the following routine return:

---------------------------
char myrountine(char[] input, uint param) {
        char[] strB = input;
        strB.length = param;
        input[0] = 'x';
        return strB[0];
}
---------------------------

Admittedly, you will probably call strB a persistent alias and tell me to
avoid it, but how would I know? The language spec sounds like: If you want
to make sure your array is unique, call .dup - otherwise nothing is
guaranteed. This will result in people calling .dup unnecessarily, just
because they are frightened of the "nothing is guaranteed". Actually: if
you don't know the implementation details, you just have to build in .dups
that are probably unnecessary.

Furthermore, sometimes, you might actually be interested in having
definitely aliased arrays. The language spec, though, does not give you
much certainty that an implementation might not suddenly call .dup for some
reason.
May 10 2004
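
The aliasing question raised above can be made concrete with a short sketch. It assumes a current D compiler and only asserts the parts that are guaranteed: plain assignment aliases, `.dup` copies, and shrinking via `.length` does not reallocate. Whether a grown array still shares memory with the original is deliberately left unasserted, since that is exactly the case the post calls unpredictable.

```d
void main()
{
    char[] input = "hello".dup;       // mutable copy of the literal

    char[] aliased = input;           // same memory as input
    char[] copy = input.dup;          // guaranteed separate memory

    input[0] = 'x';
    assert(aliased[0] == 'x');        // the alias sees the change
    assert(copy[0] == 'h');           // the .dup copy does not

    // Shrinking only shortens the view; the data stays where it is.
    char[] shrunk = input;
    shrunk.length = 2;
    assert(shrunk.ptr is input.ptr);

    // Growing may extend in place or reallocate and copy. The old
    // contents are preserved either way, but whether grown and input
    // still alias afterwards depends on the runtime.
    char[] grown = input;
    grown.length = input.length + 100;
    assert(grown[0 .. input.length] == input);
}
```

So the return value of a routine like the one in the post depends on whether `param` shrinks or grows the array, and in the growing case on what the allocator happens to do.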
parent J Anderson <REMOVEanderson badmama.com.au> writes:
Norbert Nemec wrote:

J Anderson wrote:

  

Norbert Nemec wrote:

    

If, after upsizing, you don't even know whether you are
referencing new space or the same as before, the whole thing gets
completely uncontrollable.
 
      
It won't get uncontrollable if you're careful not to create persistent
aliases of the same memory location. You would have to do that any way you
look at it. Just because C had malloc and realloc didn't change this problem
at all. Please give a good source example of where D arrays fail you.

What does the following routine return:

---------------------------
char myrountine(char[] input, uint param) {
        char[] strB = input;
        strB.length = param;
        input[0] = 'x';
        return strB[0];
}
---------------------------

admittedly, you will probably call strB a persistent alias and tell me to
avoid it, but how would I know? 
You've just answered your own question.
The language spec sounds like: If you want
to make sure your array is unique, call .dup - otherwise nothing is
guaranteed. This will result in people calling .dup unnecessarily, just
because they are frightened of the "nothing is guaranteed". 
  
That is not true; it's quite easy to learn how D arrays behave. You only need
to use dup if you want to modify a copy of the array, that is, if you don't
want to modify the original array.
Actually: if
you don't know the implementation details, you just have to build in .dups
that are probably unnecessary.
You should know what the function you call does, otherwise why call it?
Functions that modify the size of an array are generally very easy to spot
and are rare (most array resizing should be encapsulated within its own
module). Changing the name does not help things one bit. It's performance
reasons that make arrays like this necessary. If you want guaranteed array
positions, wrap it in a class and create something like STL's slow vector
class.
Furthermore, sometimes, you might actually be interested in having
definitely aliased arrays. The language spec, though, does not give you
much certainty that an implementation might not suddenly call .dup for some
reason.
  
This is a design issue. Use pointers to the array, or wrap them in classes.

--
-Anderson: http://badmama.com.au/~anderson/
May 10 2004
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Norbert Nemec wrote:

 Stewart Gordon wrote:
<snip>
 The current situation is: dynamic arrays actually are references to 
 the heap. Two arrays may reference the same portion of the heap, so 
 changing one will change the other. Anyhow, the language does 
 everything to obscure this fact and make it rather hard to predict, 
 when it happens. Unless you really know the details, you will often 
 call .dup without need,
If you want to guarantee that it's a separate copy, of course you'd call dup.

Of course, a decent compiler would coalesce two statements

        int[] qwert = yuiop.dup;
        qwert.length = asdfg;

into a single allocation operation.
 and, in the other way, you will have trouble if you trust that two 
 arrays refer to the same space.
To which someone might say, "Don't do that then!"

At the moment I can see little use for wanting to access one array by what's
effectively another, longer array.

<snip>
 Conversion to Fortran arrays is already trivially possible. (A few 
 convenience functions might make it even more comfortable.) 
 Everything else should be easy to implement.
<snip>

True, if the strides remain those for a new array. But if you've been playing
with strided/block/diagonal slicing, then unless Fortran arrays support
striding on this level, you'd need to do some rearrangement.

Stewart.
May 11 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Stewart Gordon wrote:

 Norbert Nemec wrote:
 
 and, in the other way, you will have trouble if you trust that two
 arrays refer to the same space.
To which someone might say, "Don't do that then!"

At the moment I can see little use for wanting to access one array by what's
effectively another, longer array.
For strings, it might not be very useful. For arrays in general, though, there are many cases where it really is extremely useful. Imagine a 1GB array in memory, maybe representing a huge multidimensional matrix or whatever. You would really want to be able to handle multiple references to portions of that data in a comfortable way without the risk of suddenly getting a copy unintentionally.
 <snip>
 Conversion to Fortran arrays is already trivially possible. (A few
 convenience functions might make it even more comfortable.)
 Everything else should be easy to implement.
<snip>

True, if the strides remain those for a new array. But if you've been playing
with strided/block/diagonal slicing, then unless Fortran arrays support
striding on this level, you'd need to do some rearrangement.

Of course. If a given Fortran routine expects data aligned in memory in a
given way, you might need to copy the data to that alignment before passing a
reference to the Fortran routine.

Anyhow: if D is able to handle arrays in arbitrary alignment and striding,
you may often be able to handle the data in Fortran alignment for a long time
without necessary conversions.

Example: get an array from Fortran, use a D-library function on it, pass it
back to Fortran. No conversion necessary, because the D library can easily
handle the array no matter how it is aligned in memory, because the alignment
information is fully enclosed in the array reference with minimal (with good
optimization: negligible) overhead in terms of access time.

Furthermore: when writing a wrapper for a Fortran library, the wrapper can do
all necessary conversions automatically, without doing any unnecessary
conversions back and forth.
May 11 2004
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Norbert Nemec wrote:

<snip>
 For strings, it might not be very useful. For arrays in general, though,
 there are many cases where it really is extremely useful. Imagine a 1GB
 array in memory, maybe representing a huge multidimensional matrix or
 whatever. You would really want to be able to handle multiple references to
 portions of that data in a comfortable way without the risk of suddenly
 getting a copy unintentionally.
<snip>

You can, if you allocate the matrix first and then start creating windows of
it. Slice references don't unintentionally turn into copies. (Of course, I'm
not sure what happens if you increase the length of a slice reference, but if
that's an issue you'd avoid it anyway for this kind of work.) As long as the
matrix doesn't grow, you're safe.

If the matrix wants to be variable in size, you can still treat it as being
one size (a reasonable maximum, whatever that may be) for allocation
purposes. Of course, if no maximum is reasonable, or you bump into an
unreasonable circumstance, you'd need to deal with reallocation whether the
.length property is there and assignable or not.

Stewart.
May 11 2004
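
The "allocate first, then create windows" approach described above looks roughly like this in D. The sizes and names are made up for illustration; a small flat array stands in for the large matrix.

```d
void main()
{
    // Stand-in for a large matrix stored as one flat block (3 rows of 4).
    double[] matrix = new double[12];
    matrix[] = 0.0;

    // A slice is a window onto the same memory, not a copy.
    double[] row1 = matrix[4 .. 8];
    assert(row1.ptr is matrix.ptr + 4);

    row1[0] = 42.0;
    assert(matrix[4] == 42.0);        // the write is visible through matrix

    // A copy only appears when one is asked for explicitly.
    double[] detached = row1.dup;
    detached[0] = -1.0;
    assert(matrix[4] == 42.0);        // the original is unaffected
}
```

As long as neither the matrix nor the windows are resized, every window keeps referring to the original block.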
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Stewart Gordon wrote:

 Norbert Nemec wrote:
 
 <snip>
 For strings, it might not be very useful. For arrays in general, though,
 there are many cases where it really is extremely useful. Imagine a 1GB
 array in memory, maybe representing a huge multidimensional matrix or
 whatever. You would really want to be able to handle multiple references
 to portions of that data in a comfortable way without the risk of
 suddenly getting a copy unintentionally.
<snip>

You can, if you allocate the matrix first and then start creating windows of
it. Slice references don't unintentionally turn into copies. (Of course, I'm
not sure what happens if you increase the length of a slice reference, but if
that's an issue you'd avoid it anyway for this kind of work.) As long as the
matrix doesn't grow, you're safe.

I guess it is just a question of documenting clearly what happens. It should
just be absolutely clear which operations might copy data. By now, I have
even been convinced to cut the paragraph about making .length read-only.
Anyhow: it definitely has to be documented in which way it works, what
exactly .dup does, etc.
 If the matrix wants to be variable in size, you can still treat it as
 being one size (a reasonable maximum, whatever that may be) for
 allocation purposes.  Of course, if no maximum is reasonable, or you
 bump into an unreasonable circumstance, you'd need to deal with
 reallocation whether the .length property is there and assignable or not.
OK. I guess I'll just accept that the behaviour of upsizing by assigning to
.length is not predictable if you don't know where the array reference came
from.

B.t.w.: assigning to range[] in my multidimensional arrays is even more
tricky, since you have to consider the full shape to see whether upsizing in
place might be possible. I'm still not sure whether it might be best to allow
assignment to length in one-dimensional arrays, but leave the range property
read-only.
May 11 2004
prev sibling next sibling parent reply "Walter" <newshound digitalmars.com> writes:
"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c7nj4j$218d$1 digitaldaemon.com...
 See my multidimension array proposal at
http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 it contains strided slices and much more.
Could you put that into the D wiki?
May 10 2004
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Walter wrote:

 
 "Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
 news:c7nj4j$218d$1 digitaldaemon.com...
 See my multidimension array proposal at
http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 it contains strided slices and much more.
Could you put that into the D wiki?
I have a long list of changes and corrections I want to work in first. Stewart Gordon and Ben Hinkle gave me some very valuable input that should find its way into the proposal. Once I'm satisfied with it and the most controversial points are solved, it is, of course, up to you, Walter, to do with it what you like.
May 10 2004
prev sibling parent reply "Harvey Stroud" <hstroud ntlworld.com> writes:
----- Original Message ----- 
From: "Norbert Nemec" <Norbert.Nemec gmx.de>
Newsgroups: digitalmars.D
Sent: Monday, May 10, 2004 10:48 AM
Subject: Re: Negative array indices?


 Harvey Stroud wrote:

 It seems to me that supporting the negative array index of C for the
sake
 of backward compatibility goes against the design philosophy for D...
For pointers, negative indices actually make sense. If you allow indexing
of
 raw pointers (which I think is a good idea) then prohibiting negative
 indices would be strange. For arrays, negative indices are, of course,
 caught by the range checking mechanism.
I think I should have read the language spec more before posting, as I was assuming from the following that -ve indices were valid for arrays: "The reason it's a very bad idea is that array subscripting in C and C++ and D can be done with signed integers because it is legal _and meaningful_ to pass a -ve subscript to mean prior to the given base (pointer and/or array)." Of course, this isn't quite the case with arrays as runtime bounds checking won't allow this, although whether switching off this mechanism via a compiler switch would circumvent this I'm not sure. I can see why the introduction of -ve indices to have a different behaviour would impose (slight) overhead on the runtime, and while this overhead must be already present with bounds checking, at least the latter is optional and may be compiled out. With -ve indexing implying a different semantic the checking would always have to remain regardless, unless it was only allowed for (compile time detectable) literals, which is bad as it wouldn't be orthogonal.
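To illustrate the overhead in question, here is a rough sketch of what "count from the end" indexing would have to do on every access with a non-literal index. This is a hypothetical helper written in present-day D, not a language feature; the name `at` is made up:

```d
// Hypothetical emulation of the proposed semantics: a negative index
// counts backward from the end of the array.
char at(const(char)[] s, ptrdiff_t i)
{
    // This sign test is the extra cost: unlike an ordinary bounds check,
    // it changes the meaning of the index, so it cannot be compiled out.
    size_t idx = (i < 0) ? s.length - cast(size_t)(-i) : cast(size_t) i;
    return s[idx];   // the usual (optional) bounds check still applies here
}

void main()
{
    string str = "0123456789";
    assert(at(str, 2) == '2');
    assert(at(str, -3) == '7');
    assert(at(str, -10) == '0');
}
```

An index such as -11 wraps around to a huge unsigned value in this sketch, so it is only caught while bounds checking is enabled.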
 Raw pointers, of course, are error prone. Anyway it's the philosophy of D
to
 give the developer all the tools to shoot himself in the foot, but make it
 clear what the dangerous tools are, and encourage him to avoid these tools
 completely.
Yup, I completely agree. If the programmer still wants the raw power of pointers then let them have it. Btw, I wasn't suggesting that -ve indexing for such pointers should be prohibited - that would just be wacky.
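For reference, a minimal sketch of the raw-pointer case (present-day D, names made up for illustration). A negative subscript on a pointer simply means "before the base", with no bounds checking at all:

```d
// Hypothetical helper: read the element just before the one `p` points at.
// Nothing checks that such an element exists; that is the caller's problem.
int beforeTarget(int* p)
{
    return p[-1];
}

void main()
{
    int[] data = [10, 20, 30, 40];
    int* mid = &data[2];          // points at the 30

    assert(mid[0] == 30);
    assert(mid[-1] == 20);        // one element before the base
    assert(*(mid - 2) == 10);     // same thing written as pointer arithmetic
    assert(beforeTarget(mid) == 20);
}
```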
 Introducing a special operator ($) to denote the length strikes me as
 ungainly, making the code more perl-like, but perhaps that's just my
 dislike of non-C symbols.
That's just personal taste. $ has no meaning in D so far, and it is a
plain
 ASCII character. Why not put it to use?
Agreed, just my preference. I think what I don't like about it is that it's an arbitrary symbol denoting some magic value. To the uninitiated it looks odd. OK, it wouldn't take long to get used to, but still, it seems a step in the direction Perl has taken in using such arbitrary symbols, and look how unreadable that is. Probably a very minor point though.
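For comparison, this is how the $ notation reads in present-day D, where $ inside the brackets stands for the length of the array being indexed. The sketch reuses the example string from the start of the thread; `lastN` is a made-up helper:

```d
// Hypothetical helper: the last n characters of s.
string lastN(string s, size_t n)
{
    return s[$ - n .. $];
}

void main()
{
    string str = "0123456789";
    assert(str[$ - 3] == '7');            // what the thread wrote as str[-3]
    assert(str[$ - 4 .. $] == "6789");    // str[-4..]
    assert(str[4 .. $ - 2] == "4567");    // str[4..-2]
    assert(lastN(str, 4) == "6789");
}
```

Because $ is resolved against the array at hand, an ordinary index stays an ordinary index and no sign test is needed at run time.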
 B.t.w: in the suggested meaning, $ would not be a normal operator at all,
 but something special that does not exist in D so far: a "zero-ary
 operator" or however you want to call it.

 Has anybody given any thought to an [optional] stride value:

 int[] x = a[1..10 : 2];    // Gets every other element of the array
See my multidimension array proposal at
http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html
 it contains strided slices and much more.
Wow, there's a lot to chew over in that doc! I've only had chance to skim it so far but it looks like a lot of good thought's gone into it. I really like the notation of the indices being within the same set of brackets a[m,n] for rectangular arrays as this suggests a tighter coupling of the array elements than the a[n][m] notation for dynamic arrays; both notations are appropriate to reflect the underlying nature of the data types. I look forward to seeing your next draft. Cheers, Harvey.
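As a rough picture of what the strided slice asked about above (a[1..10 : 2]) would select, here is a hand-written sketch in present-day D. The function `strided` is made up for illustration and, unlike the slices in the proposal, it returns a copy rather than a view:

```d
// Hypothetical stand-in for a[lo..hi : step]: collects every step-th
// element of a[lo .. hi] into a new array.
int[] strided(const(int)[] a, size_t lo, size_t hi, size_t step)
{
    assert(step > 0 && lo <= hi && hi <= a.length);
    int[] result;
    for (size_t i = lo; i < hi; i += step)
        result ~= a[i];
    return result;
}

void main()
{
    int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    int[] x = strided(a, 1, 10, 2);   // every other element of a[1..10]
    assert(x == [1, 3, 5, 7, 9]);
}
```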
May 11 2004
parent Norbert Nemec <Norbert.Nemec gmx.de> writes:
Harvey Stroud wrote:

 ----- Original Message -----
 From: "Norbert Nemec" <Norbert.Nemec gmx.de>
 See my multidimension array proposal at

http://homepages.uni-regensburg.de/~nen10015/documents/D-multidimarray.html

 it contains strided slices and much more.
Wow, there's a lot to chew over in that doc! I've only had chance to skim it so far but it looks like a lot of good thought's gone into it. I really like the notation of the indices being within the same set of brackets a[m,n] for rectangular arrays as this suggests a tighter coupling of the array elements than the a[n][m] notation for dynamic arrays; both notations are appropriate to reflect the underlying nature of the data types. I look forward to seeing your next draft.
Thanks. The basic idea is still rather simple, but explaining it in detail really took more effort than I myself would have expected. I would suggest waiting for the next version of the proposal before reading it in detail. I have a number of changes to make already, and running it through a spellchecker might also improve readability...
May 11 2004