www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.announce - resizeable arrays: T[new]

reply Walter Bright <newshound1 digitalmars.com> writes:
Sean Kelly wrote:
 Jarrett Billingsley wrote:
 "Sean Kelly" <sean f4.ca> wrote in message 


 Most array algorithms would apply.  But I'm still not sure I see
 the point of having an immutable reference, because it's just passed
 by value anyway.  Who cares if the size of the array is modified
 within a function where it's not passed by reference?  The change is
 just local to the function anyway.

If that array is pointing into some meaningful area of memory (like in the example, a texture buffer), resizing the array could (probably would) move the array around, which I guess isn't illegal but then the function operating on the array wouldn't be accessing the correct place. Prevent them from changing the length, it prevents them from accessing anywhere but there.

Well yeah. I don't personally think this is a problem because it doesn't affect the callee in any way, but I can see how others might disagree. Doesn't 'final' do this now though?

There was a lot of talk about this problem today. Final doesn't quite do it, because it's not part of the type signature of the function. But making it a type signature adds a whole raft of complexity, and also inevitably completely screws up the notion of transitivity of const and invariant. (The problem is if I pass a slice to a function, and then that function reallocates the slice by changing the length or appending to it, it can affect other slices into the same data in very hard-to-explain ways. Essentially, we needed to find a way to disallow it. Options explored and found unacceptable were final (causes more problems), ref counting arrays (too expensive), always duping (too expensive), adding 'slice' bits to the fat pointers (doesn't work), adding ref counting to the memory chunks (too expensive), and data flow analysis hacks (unreliable, inconsistent). Note that other languages solve the problem with ref counting, but we considered that unacceptable for D because it adds 50% to the cost of transferring arrays around, and big runtime costs in manipulating the ref counts. And that doesn't even consider multithreading or exception safety requirements.) Now, it turns out that it is very rare for a function to legitimately want to resize a buffer passed to it. So we finally hit on the idea of making a resizeable array a different type, say: T[n] a; // static array T[] b; // dynamic array T[new] c; // resizeable array a.length = 6; // error, static array b.length = 6; // error, dynamic array is not resizeable c.length = 6; // ok b = c; // ok, implicit conversion of resizeable to dynamic c = b; // error c = b[0..n]; // error b = c[0..n]; // ok, but watch out if c is later resized b = c[0..n].dup; // ok, no worries c = cast(T[new])b; // ok, but this will be deliberate Note that .dup will return a T[new], as will operator new. Slicing will return a T[]. The runtime representation of T[new] will be identical to that of T[], just the type is different. So, if our function parameter is typed as T[], we know there isn't going to be any monkey business going on in that function with our other slices. If it's T[new], we know we need to take a hard look at it. I think this will entail only very rare changes to user code, but will be a big improvement in type safety. Note that it will *still* possible for array resizing to affect other slices into the same array, but it's far, far less likely to happen inadvertently. The attractiveness of this solution is it nearly completely solves the problem with zero runtime cost.
Jun 04 2007
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
I agree with your reasoning for this (honestly, the solution had never
occured to me before, although looking back I can think of a few places
where it would have been quite useful), although I can't say I'm a fan
of the syntax.  The "new" inside the brackets just looks strange; a new
what?

A few possible alternatives:

T[new]	// Proposed
T[$]	// Since $ already means "length" for arrays
T[*]	// Indicating the index could be anything
T[..]	// Bit of a stretch...
T[volatile] // Bit long
T[[]]	// Just ugly

Like I said; I'm erring on the side of liking the idea, just not the use
of "new" in that particular spot.

	-- Daniel
Jun 04 2007
next sibling parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Daniel Keep wrote:
 The "new" inside the brackets just looks strange; a new what?

I'm with you here ;)
 A few possible alternatives:
 
 T[new]	// Proposed

Looks awkward, yeah. And it would produce false hits for anyone doing new-hunting (happens when you care about allocations, but prototype using 'new').
 T[$]	// Since $ already means "length" for arrays

I don't like it... Looks like an out of bounds error to my parser ;)
 T[*]	// Indicating the index could be anything

++votes; -- Tomasz Stachowiak http://h3.team0xf.com/ h3/h3r3tic on #D freenode
Jun 04 2007
next sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Tom S wrote:
 Daniel Keep wrote:
 The "new" inside the brackets just looks strange; a new what?

I'm with you here ;)
 A few possible alternatives:

 T[new]    // Proposed

Looks awkward, yeah. And it would produce false hits for anyone doing new-hunting (happens when you care about allocations, but prototype using 'new').

Huh. To me, that sounds like an argument *for* using [new], since any array passed as [new] can cause an allocation in that function. I like the [new] syntax :P L.
Jun 04 2007
parent Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Lionello Lunesu wrote:
 Tom S wrote:
 Daniel Keep wrote:
 The "new" inside the brackets just looks strange; a new what?

I'm with you here ;)
 A few possible alternatives:

 T[new]    // Proposed

Looks awkward, yeah. And it would produce false hits for anyone doing new-hunting (happens when you care about allocations, but prototype using 'new').

Huh. To me, that sounds like an argument *for* using [new], since any array passed as [new] can cause an allocation in that function.

Not really, if you consider that it may be modified using realloc or some other wild schemes. And these are ok. I'd usually search for 'new', '~' and 'length' :P
 I like the [new] syntax :P

Nooooo! What have I done! *slaps himself with a trout* -- Tomasz Stachowiak http://h3.team0xf.com/ h3/h3r3tic on #D freenode
Jun 04 2007
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Tom S wrote:
 A few possible alternatives:

 T[new]    // Proposed

Looks awkward, yeah. And it would produce false hits for anyone doing new-hunting (happens when you care about allocations, but prototype using 'new').

I'm not necessarily sold on that syntax, but it's workable and unambiguous.
 
 T[$]    // Since $ already means "length" for arrays

I don't like it... Looks like an out of bounds error to my parser ;)

There are parsing ambiguities with that I wish to avoid.
 T[*]    // Indicating the index could be anything


That is used for VLA's in C99, which could mislead people.
Jun 04 2007
prev sibling next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Daniel Keep wrote:
 I agree with your reasoning for this (honestly, the solution had never
 occured to me before, although looking back I can think of a few places
 where it would have been quite useful), although I can't say I'm a fan
 of the syntax.  The "new" inside the brackets just looks strange; a new
 what?
 
 A few possible alternatives:
 
 T[new]	// Proposed

Like the idea. Not being able to tell a slice from a "real" array has always made me a bit uncomfortable, though it never occurred to me to make them separate types. I would have expected it would introduce more headaches than it solves, but sounds like maybe not. Cool. T[new], though, doesn't speak to me as a syntax.
 T[$]	// Since $ already means "length" for arrays

Reminding us that '$'/length can be assigned to ... not bad.
 T[..]	// Bit of a stretch...

I liked it at first till I realized it would mean the *opposite* of the what seems obvious to me -- i.e. I took it to mean that T is slice-like, meaning it can't be resized. But it's supposed to be the other case.
 T[*]	// Indicating the index could be anything

That's not bad.
 T[volatile] // Bit long

 T[[]]	// Just ugly

Here's another: T[~] // mnemonic: T can be concatenated onto Or maybe it should just be a different brace type? T{} = new T[10]; Probably breaks context freeness or something, but I like it if for no other reason than the braces are shift-[ shift-] on the US keyboard. It's also easy to type, and it's a nice mnemonic for array with augmented capabilities. All the T[punctuation] varieties are going to be pretty annoying to type, I think. --bb
Jun 04 2007
next sibling parent Downs <default_357-line yahoo.de> writes:
Bill Baxter wrote:
 T[~] // mnemonic: T can be concatenated onto

Seconded. I like it! :) -downs
Jun 04 2007
prev sibling next sibling parent Downs <default_357-line yahoo.de> writes:
Bill Baxter wrote:
 T{} = new T[10];
 
 Probably breaks context freeness or something, but I like it if for no 
 other reason than the braces are shift-[ shift-] on the US keyboard. 
 It's also easy to type, and it's a nice mnemonic for array with 
 augmented capabilities.

I missed that in my original reply. {} are slightly harder to type than [] on german keyboards, because all the braces require the use of the alt-gr key (just right of space). This means [] is .. doable, but quickly typing {} requires quite a bit of finger stretching. (I usually put the thumb on alt-gr and reach over it with the index digit, and .. well, { is the alt-gr usage of the 7 key. Acrobatics! :D ) -- downs
Jun 04 2007
prev sibling parent torhu <fake address.dude> writes:
Bill Baxter wrote:
 Daniel Keep wrote:
 T[*]	// Indicating the index could be anything

That's not bad.

It means something completely different in C99. Not that that matters much.
 T[~] // mnemonic: T can be concatenated onto
 

Somehow this appeals to me. :)
Jun 04 2007
prev sibling next sibling parent Lars Ivar Igesund <larsivar igesund.net> writes:
Daniel Keep wrote:

 
 I agree with your reasoning for this (honestly, the solution had never
 occured to me before, although looking back I can think of a few places
 where it would have been quite useful), although I can't say I'm a fan
 of the syntax.  The "new" inside the brackets just looks strange; a new
 what?

 T[*]  // Indicating the index could be anything

This one. -- Lars Ivar Igesund blog at http://larsivi.net DSource, #d.tango & #D: larsivi Dancing the Tango
Jun 04 2007
prev sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Daniel Keep wrote:
 ...
 A few possible alternatives:
 
 T[new]	// Proposed
 T[$]	// Since $ already means "length" for arrays
 T[*]	// Indicating the index could be anything
 T[..]	// Bit of a stretch...
 T[volatile] // Bit long
 T[[]]	// Just ugly

One thing that didn't occur to me until just now is that T[$], T[*], T[~] (and someone suggested T[?] on IRC) require you to alternate the use of the shift key. This means that I keep accidentally typing T[$} or T{$] or T[4}. I do think that T[*] is probably the nicest looking without giving the wrong initial impression (T[$] could be misinterpreted as a typo, as could T[..]), but is tricky to type quickly. T[..] is easy to type fast, as is T[new]. Another thing to consider is that T[*] requires your hands to move further than either T[$] or T[..]. A few other possible symbols that can be entered without using shift (at least on my keyboard): T[.] T[,] T[`] T[-] T[=] T[;] I'm not counting keys on the numeric keypad because it's too far away from the usual resting position of the hands. Just something to chew on :) -- Daniel
Jun 04 2007
parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Daniel Keep" <daniel.keep.lists gmail.com> wrote in message 
news:f412os$24fh$1 digitalmars.com...
 Another thing to consider is that T[*] requires your hands to move
 further than either T[$] or T[..].

Ho-hoh! I've got an asterisk key between my left ctrl and alt keys. No shift for me ;)
 T[.]

Odd. Doesn't say much.
 T[,]

Looks like a 2D rectangular array (like in C#).
 T[`]
 T[-]
 T[=]
 T[;]

All odd. T[=] doesn't look bad, T[-] is easier to type for me.
Jun 04 2007
prev sibling next sibling parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Walter Bright wrote

 The attractiveness of this 
 solution is it nearly completely solves the problem with zero
 runtime cost. 

Great thought. If incorporating this new type, please address also the complication of having two views on the storage allocated for resizable arrays: - the "potential" capacity, to which the `length' property currently can grow without the need of allocating more consecutive space - the "actual" capacity, which is currently set/get by the `length' property It is a pity, that the documentation, http://www.digitalmars.com/d/arrays.html#resize, illustrates a pattern that can be easily incorporated into the language as a default behaviour: - introduce new `length.pot' r/w-property - whenever the `length' property exceeds the `length.pot' property double the `length.pot' property and allocate appropriate space - issue an error when `length.pot' is decreased below the value of `length' -manfred
Jun 04 2007
next sibling parent Lionello Lunesu <lio lunesu.remove.com> writes:
Manfred Nowak wrote:
 Walter Bright wrote
 
 The attractiveness of this 
 solution is it nearly completely solves the problem with zero
 runtime cost. 

Great thought. If incorporating this new type, please address also the complication of having two views on the storage allocated for resizable arrays: - the "potential" capacity, to which the `length' property currently can grow without the need of allocating more consecutive space - the "actual" capacity, which is currently set/get by the `length' property It is a pity, that the documentation, http://www.digitalmars.com/d/arrays.html#resize, illustrates a pattern that can be easily incorporated into the language as a default behaviour: - introduce new `length.pot' r/w-property - whenever the `length' property exceeds the `length.pot' property double the `length.pot' property and allocate appropriate space - issue an error when `length.pot' is decreased below the value of `length'

I agree. Although I'd keep the syntax less freaky by adding a property directly to arrays, array.capacity makes the most sense IMHO. It can be write-only for all I care. Honestly, the array.length=12; array.length=0; -pattern (if you can call it that) is horrendous and should have been taken care of before v1.0 was released, in my not so humble opinion. L.
Jun 04 2007
prev sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Manfred Nowak wrote:
 ...
 If incorporating this new type, please address also the complication 
 of having two views on the storage allocated for resizable arrays:
 - the "potential" capacity, to which the `length' property currently 
 can grow without the need of allocating more consecutive space
 - the "actual" capacity, which is currently set/get by the `length' 
 property
 
 It is a pity, that the documentation, 
 http://www.digitalmars.com/d/arrays.html#resize,
 illustrates a pattern that can be easily incorporated into the 
 language as a default behaviour:
 - introduce new `length.pot' r/w-property
 - whenever the `length' property exceeds the `length.pot' property 
 double the `length.pot' property and allocate appropriate space
 - issue an error when `length.pot' is decreased below the value of 
 `length'
  
 -manfred

This can be pretty trivially implemented as a user type; if Walter added implicit casts and, say, opDollar, it'd be nigh indistinguishable from regular arrays. Just threw up a quick and dirty version here: http://www.prowiki.org/wiki4d/wiki.cgi?DanielKeep/snippets (bottom of the page). -- Daniel
Jun 04 2007
prev sibling next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Mon, 04 Jun 2007 02:33:58 -0700, Walter Bright wrote:

 ... we finally hit on the idea of 
 making a resizeable array a different type, say:
 
     T[n]   a;  // static array
     T[]    b;  // dynamic array
     T[new] c;  // resizeable array
 
     a.length = 6;   // error, static array
     b.length = 6;   // error, dynamic array is not resizeable
     c.length = 6;   // ok

Huh? 'static array' used to mean fixed-size array and 'dynamic array' used to mean 'variable-length array' ... so a dynamic array is not dynamic anymore? So a dynamic array is 'sizeable' but not 'resizable'. In other words, once its length has been set it can never be changed again? Is that what you are saying? And is that a compile-time or a run-time check? If this is what 'resizeable' means for dynamic arrays then nearly all my dynamic arrays will have to be changed to resizable ones, because I rely on that feature for lots of things. What is the point of having dynmaic arrays now? The difference between them an static arrays is rather small.
     b = c;          // ok, implicit conversion of resizeable to dynamic

But hang on, you said that b.length isn't allowed to change? Or did you mean that it can only change if b.ptr also being updated?
 I think this will entail only very rare changes to user code, but will 
 be a big improvement in type safety. 

Not so sure about that ... I'll do some analysis and report back. -- Derek Parnell Melbourne, Australia "Justice for David Hicks!" skype: derek.j.parnell
Jun 04 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Derek Parnell wrote:
 On Mon, 04 Jun 2007 02:33:58 -0700, Walter Bright wrote:
 
 ... we finally hit on the idea of 
 making a resizeable array a different type, say:

     T[n]   a;  // static array
     T[]    b;  // dynamic array
     T[new] c;  // resizeable array

     a.length = 6;   // error, static array
     b.length = 6;   // error, dynamic array is not resizeable
     c.length = 6;   // ok

Huh? 'static array' used to mean fixed-size array

It still does.
 and 'dynamic array' used
 to mean 'variable-length array' ... so a dynamic array is not dynamic
 anymore?

A dynamic array would now mean that its length is not known at compile time.
 So a dynamic array is 'sizeable' but not 'resizable'. In other words, once
 its length has been set it can never be changed again? Is that what you are
 saying?

Yes.
 And is that a compile-time or a run-time check?

Compile time.
 If this is what 'resizeable' means for dynamic arrays then nearly all my
 dynamic arrays will have to be changed to resizable ones, because I rely on
 that feature for lots of things. 

Without looking at your code, I suspect it may be less than you think. I thought I was modifying strings all the time, so when switching over to const strings, I expected lots and lots of work. Turns out, only rarely do I do this.
 What is the point of having dynmaic arrays now? The difference between them
 an static arrays is rather small.

Consider the following: alias char[] string; That alone justifies it! And the difference between static arrays and dynamic arrays is very large. For one thing, static arrays are passed by value, and dynamic arrays by reference.
     b = c;          // ok, implicit conversion of resizeable to dynamic

But hang on, you said that b.length isn't allowed to change? Or did you mean that it can only change if b.ptr also being updated?

I meant that the underlying array cannot be resized through b. b itself can have its ptr/length values changed through simple assignment, but there will be no realloc happening.
 I think this will entail only very rare changes to user code, but will 
 be a big improvement in type safety. 

Not so sure about that ... I'll do some analysis and report back.

Jun 04 2007
next sibling parent reply BCS <BCS pathlink.com> writes:
Walter Bright wrote:
 Derek Parnell wrote:
 On Mon, 04 Jun 2007 02:33:58 -0700, Walter Bright wrote:
     b = c;          // ok, implicit conversion of resizeable to dynamic

But hang on, you said that b.length isn't allowed to change? Or did you mean that it can only change if b.ptr also being updated?

I meant that the underlying array cannot be resized through b. b itself can have its ptr/length values changed through simple assignment, but there will be no realloc happening.

so with T[n], .length (all by it's self) is a compile time const, with T[] it is an r-value and with T[new] it is an l-value. Is that more or less the sum of it?
Jun 04 2007
parent Walter Bright <newshound1 digitalmars.com> writes:
BCS wrote:
 Walter Bright wrote:
 Derek Parnell wrote:
 On Mon, 04 Jun 2007 02:33:58 -0700, Walter Bright wrote:
     b = c;          // ok, implicit conversion of resizeable to dynamic

But hang on, you said that b.length isn't allowed to change? Or did you mean that it can only change if b.ptr also being updated?

I meant that the underlying array cannot be resized through b. b itself can have its ptr/length values changed through simple assignment, but there will be no realloc happening.

so with T[n], .length (all by it's self) is a compile time const, with T[] it is an r-value and with T[new] it is an l-value. Is that more or less the sum of it?

Yes.
Jun 04 2007
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Walter Bright skrev:

 [...] the difference between static arrays and 
 dynamic arrays is very large. For one thing, static arrays are passed by 
 value, and dynamic arrays by reference.

This sounds intriguing. Does this mean that static arrays will be passed by value in D 2.0? Will string literals become dynamic (invariant) arrays instead of static too? /Oskar
Jun 07 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Oskar Linde wrote:
 Walter Bright skrev:
 
 [...] the difference between static arrays and dynamic arrays is very 
 large. For one thing, static arrays are passed by value, and dynamic 
 arrays by reference.

This sounds intriguing. Does this mean that static arrays will be passed by value in D 2.0?

Yes.
 Will string literals become dynamic (invariant) 
 arrays instead of static too?

They'll be invariant static arrays.
Jun 08 2007
next sibling parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Walter Bright skrev:
 Oskar Linde wrote:
 Walter Bright skrev:

 [...] the difference between static arrays and dynamic arrays is very 
 large. For one thing, static arrays are passed by value, and dynamic 
 arrays by reference.

This sounds intriguing. Does this mean that static arrays will be passed by value in D 2.0?

Yes.

That is wonderful! Another wrinkle ironed out.
 Will string literals become dynamic (invariant) arrays instead of 
 static too?

They'll be invariant static arrays.

Yeah, I got that a few minutes after posting from another post. My only (minor) concern here was the potential performance penalty of passing string literals to simple template functions, like void foo(T)(T x) {...}, but there are plenty of ways to deal with that. For example, since they are invariant, I guess they could be passed by reference behind the scenes anyway. /Oskar
Jun 08 2007
prev sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message 
news:f4b70o$i3t$1 digitalmars.com...
 Oskar Linde wrote:

 They'll be invariant static arrays.

Why wouldn't they be invariant unresizable arrays? Or do you plan on fixing all the special casing stuff that has to be taken into account when using string literals in {templates, array literals, AA literals, etc.}?
Jun 08 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Jarrett Billingsley wrote:
 "Walter Bright" <newshound1 digitalmars.com> wrote in message 
 news:f4b70o$i3t$1 digitalmars.com...
 Oskar Linde wrote:

 They'll be invariant static arrays.

Why wouldn't they be invariant unresizable arrays?

Because static arrays can be implicitly cast to unresizable arrays, but not vice versa.
 Or do you plan on fixing 
 all the special casing stuff that has to be taken into account when using 
 string literals in {templates, array literals, AA literals, etc.}? 

I don't know what you mean.
Jun 08 2007
parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Walter Bright" <newshound1 digitalmars.com> wrote in message >
 Because static arrays can be implicitly cast to unresizable arrays, but 
 not vice versa.

Why would you need it to implicitly cast to a static array? So you can initialize things like "char[4] = "abcd";" ?
 I don't know what you mean.

// cannot implicitly convert expression (["hi","bye"]) of type char[][2] to char // auto a = ["hi", "bye"]; // works auto a = ["hi"[], "bye"]; int[char[]] aa; // cannot implicitly convert expression ("bye") of type char[3] to char[2] //aa = ["hi":1, "bye":2]; // works aa = ["hi"[]:1, "bye":2]; typeof(T[0]) foo(T...)() { return T[0]; } // Error: functions cannot return static array char[2] //foo!("hi")(); // works foo!("hi"[])(); Templates also have to deal with static array types in completely different ways, since they can't be returned and their .init is not the same type as them.
Jun 08 2007
prev sibling next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Walter Bright wrote:
 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea of 
 making a resizeable array a different type, say:
 
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array
 
    a.length = 6;   // error, static array
    b.length = 6;   // error, dynamic array is not resizeable
    c.length = 6;   // ok
    b = c;          // ok, implicit conversion of resizeable to dynamic
    c = b;          // error
    c = b[0..n];    // error
    b = c[0..n];    // ok, but watch out if c is later resized
    b = c[0..n].dup; // ok, no worries
    c = cast(T[new])b; // ok, but this will be deliberate

What about c = a; ? I have several places where I have functions that follow this pattern: --- T[] foo(<parameters>, T[] buffer = null) { buffer.length = <something>; <fill buffer> return buffer; } --- These are intended to be usable with "buffer" being either a stack-allocated static array or a heap-allocated dynamic array. If the passed buffer is big enough a slice is returned, otherwise it gets enlarged (and possibly reallocated on the heap). I guess with this proposed modification I'd want to accept and return a "T[new]", but will that allow static arrays (stack-allocated) to be passed in? I'd hate to have to write two different functions, one for static arrays and slices and one for resizable arrays... (and I'm not fond of the idea of extra casts to make it work with a single function) Also, I often write toString() members of objects (and other functions returning strings) like this: --- char[] toString() { auto ret = "prefix"; foreach (elt; children) { ret ~= elt.toString(); // perhaps append a separator here } // if separators were appended, remove the last one here ret ~= "postfix"; return ret; } --- Will these need to be changed? (probably a .dup after the "prefix" string) (Since string constants are also static arrays, this is the same basic issue as above)
 I think this will entail only very rare changes to user code, but will 
 be a big improvement in type safety. Note that it will *still* possible 
 for array resizing to affect other slices into the same array, but it's 
 far, far less likely to happen inadvertently. The attractiveness of this 
 solution is it nearly completely solves the problem with zero runtime cost.

I think I may need to make quite some changes actually, though the changes required to the latter code sample I gave will be minimal. Another question, will the following be allowed: --- class Base { abstract char[] foo(); } class Derived : Base { override char[new] foo() { assert(0, "Not implemented yet"); } } --- ?
Jun 04 2007
parent Walter Bright <newshound1 digitalmars.com> writes:
Frits van Bommel wrote:
 Walter Bright wrote:
 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea of 
 making a resizeable array a different type, say:

    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

    a.length = 6;   // error, static array
    b.length = 6;   // error, dynamic array is not resizeable
    c.length = 6;   // ok
    b = c;          // ok, implicit conversion of resizeable to dynamic
    c = b;          // error
    c = b[0..n];    // error
    b = c[0..n];    // ok, but watch out if c is later resized
    b = c[0..n].dup; // ok, no worries
    c = cast(T[new])b; // ok, but this will be deliberate

What about c = a; ?

Good question. That'll be allowed. I have a couple misgivings about it, though.
 I have several places where I have functions that follow this pattern:
 ---
 T[] foo(<parameters>, T[] buffer = null) {
     buffer.length = <something>;
     <fill buffer>
     return buffer;
 }
 ---
 These are intended to be usable with "buffer" being either a 
 stack-allocated static array or a heap-allocated dynamic array. If the 
 passed buffer is big enough a slice is returned, otherwise it gets 
 enlarged (and possibly reallocated on the heap).
 I guess with this proposed modification I'd want to accept and return a 
 "T[new]", but will that allow static arrays (stack-allocated) to be 
 passed in?

Yes.
 I'd hate to have to write two different functions, one for static arrays 
 and slices and one for resizable arrays... (and I'm not fond of the idea 
 of extra casts to make it work with a single function)
 
 Also, I often write toString() members of objects (and other functions 
 returning strings) like this:
 ---
 char[] toString() {
     auto ret = "prefix";
     foreach (elt; children) {
         ret ~= elt.toString();
         // perhaps append a separator here
     }
     // if separators were appended, remove the last one here
     ret ~= "postfix";
     return ret;
 }
 ---
 Will these need to be changed? (probably a .dup after the "prefix" string)
 (Since string constants are also static arrays, this is the same basic 
 issue as above)

You'd need to rewrite it as: --- string toString() { invariant(char)[new] ret = "prefix"; foreach (elt; children) { ret ~= elt.toString(); // perhaps append a separator here } // if separators were appended, remove the last one here ret ~= "postfix"; return ret; } --- since the type of "prefix" will be invariant(char)[6]; You'll need to use "string" as a return type, because you'll potentially be returning a reference to an immutable string literal, and that must be reflected in the type. If you chose to do "prefix".dup, it could be written as: --- string toString() { auto ret = "prefix".dup; foreach (elt; children) { ret ~= elt.toString(); // perhaps append a separator here } // if separators were appended, remove the last one here ret ~= "postfix"; return ret; } --- and the return type could be either string or char[]. The type of ret here will be char[new], as that is what .dup will return.
 I think this will entail only very rare changes to user code, but will 
 be a big improvement in type safety. Note that it will *still* 
 possible for array resizing to affect other slices into the same 
 array, but it's far, far less likely to happen inadvertently. The 
 attractiveness of this solution is it nearly completely solves the 
 problem with zero runtime cost.

I think I may need to make quite some changes actually, though the changes required to the latter code sample I gave will be minimal. Another question, will the following be allowed: --- class Base { abstract char[] foo(); } class Derived : Base { override char[new] foo() { assert(0, "Not implemented yet"); } } --- ?

Yes, since char[new] can be implicitly converted to char[]. The reverse will not work.
Jun 04 2007
prev sibling next sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Walter Bright wrote:

 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea of 
 making a resizeable array a different type, say:

An excellent suggestion. And I'm not only saying that because I have suggested this several times myself. :p The dual resizable array / slice semantics of the old dynamic array have several subtle problems, and also considering the fact that slices and resizable arrays are two very different beasts that seldom need to be mixed makes this a welcome change. I'm glad that introducing separate types seems to be relatively painless.
 
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

I'd propose a different nomenclature: T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array I also agree with others that there are better alternatives to "new". T[*] is my favorite.
 So, if our function parameter is typed as T[], we know there isn't going 
 to be any monkey business going on in that function with our other 
 slices. If it's T[new], we know we need to take a hard look at it.

It will still be a bit weird appending to or resizing a T[new] function argument that isn't passed by reference... Actually it will most likely be a bug. It would be *very* neat getting that fixed too. In 99.9 cases of 100 where one passes a T[new] as a function argument, one wants the argument to be passed by reference. The remaining 0.1 cases have obvious workarounds. So... Is there any way T[new] somehow could be made a reference type? Just like T[U] has become. E.g. it would be nice if the following just worked: void appendTo(T[new] arr, T val) { arr ~= val; } Without the problem of the user accidentally forgetting to pass arr by reference. Short version: You always want to pass ref T[new]. Forgetting ref is probably a bug, but is silently accepted by the compiler. Ergo, ref should be the default. :) /Oskar
Jun 04 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Oskar Linde wrote:
 Walter Bright wrote:
 
 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea of 
 making a resizeable array a different type, say:

An excellent suggestion. And I'm not only saying that because I have suggested this several times myself. :p

Can you point me to the posting(s)? You should get the credit for being first.
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

I'd propose a different nomenclature: T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array

I like "resizeable" array because it is pretty clear what it does.
 I also agree with others that there are better alternatives to "new". 
 T[*] is my favorite.

Looks like C99's VLA.
 Short version: You always want to pass ref T[new]. Forgetting ref is 
 probably a bug, but is silently accepted by the compiler. Ergo, ref 
 should be the default. :)

Frits has posted a realistic use case that argues it shouldn't be.
Jun 04 2007
next sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Walter Bright skrev:
 Oskar Linde wrote:
 Walter Bright wrote:

 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea 
 of making a resizeable array a different type, say:

An excellent suggestion. And I'm not only saying that because I have suggested this several times myself. :p

Can you point me to the posting(s)? You should get the credit for being first.

Looking back at some of my posts, I don't think I ever made a public newsgroup post actually suggesting D change in this way... For the reason that I never dared to believe such a change could ever happen. I found some rants on the topic of the separate semantics of slices and resizable arrays, but it is hardly posts I feel very proud of. :p http://www.digitalmars.com/d/archives/digitalmars/D/bugs/bug_or_feature_9420.html#N9425 http://www.digitalmars.com/d/archives/digitalmars/D/learn/3521.html#N3593 http://www.digitalmars.com/d/archives/digitalmars/D/learn/3354.html#N3357
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

I'd propose a different nomenclature: T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array

I like "resizeable" array because it is pretty clear what it does.

Ok, another suggestion: (Sneakily injecting the [*] too... :) T[n] a; // static array T[] b; // array view T[*] c; // dynamic array IMHO, "array view" is a better description than "dynamic array". But it depends on your point of view, of course.
 I also agree with others that there are better alternatives to "new". 
 T[*] is my favorite.

Looks like C99's VLA.

Afaik, it is only used for vla _prototypes_ in C99, and is that really a problem? It is not like the compiler won't warn you if you write: extern(C) void foo(int,int[*]); right? And I don't really see how using T[*] for resizable arrays in D is any different from using T[] for non-resizable ones -- T[] also means something slightly different in C99.
 Short version: You always want to pass ref T[new]. Forgetting ref is 
 probably a bug, but is silently accepted by the compiler. Ergo, ref 
 should be the default. :)

Frits has posted a realistic use case that argues it shouldn't be.

I presume you mean: T[] foo(<parameters>, T[] buffer = null) { buffer.length = <something>; <fill buffer> return buffer; } """These are intended to be usable with "buffer" being either a stack-allocated static array or a heap-allocated dynamic array. If the passed buffer is big enough a slice is returned, otherwise it gets enlarged (and possibly reallocated on the heap).""" The buffer.length = <something>; in the use case might be a neat trick, but is it really that essential? Consider the rewrite: (T[] means non-resizable dynamic array) T[] foo(<parameters>, T[] buffer = null) { T[] ret = buffer.length >= needed_capacity ? buffer[0..needed_capacity] : new T[needed_capacity]; <fill ret> return ret; } The only advantage of the original code is that it might be able to squeeze a few extra bytes out of a buffer that for some reason would not be at its full capacity. But is that really a feature? I would personally not have expected this assert to potentially fail: T[] buffer = new T[1000]; T[] ret = foo(buffer); assert(ret.length <= buffer.length || ret.ptr != buffer.ptr); So I don't really see what you gain from being able to pass a resizable array by value and then change its length. At least not compared to what you gain by not being able to do that. /Oskar
Jun 04 2007
parent Walter Bright <newshound1 digitalmars.com> writes:
Oskar Linde wrote:
 The buffer.length = <something>; in the use case might be a neat trick, 
 but is it really that essential? Consider the rewrite: (T[] means 
 non-resizable dynamic array)
 
 T[] foo(<parameters>, T[] buffer = null)
 {
     T[] ret = buffer.length >= needed_capacity
             ? buffer[0..needed_capacity]
             : new T[needed_capacity];
     <fill ret>
     return ret;
 }

You're probably right with that.
Jun 04 2007
prev sibling parent reply Derek Parnell <derek psych.ward> writes:
On Mon, 04 Jun 2007 10:45:09 -0700, Walter Bright wrote:

    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

I'd propose a different nomenclature: T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array

I like "resizeable" array because it is pretty clear what it does.

I think you are quite wrong Walter. The new interpretation of 'x[]' is almost identical to what we used to think of as a slice. And the interpreatation of 'x[new]' is exactly what used to be known as a dynamic array. Your change will effect all code immediately whereas reversing the (changed) meaning of '[]' and '[new]' to mean 'resizable' and 'slice' will allow coders to change their coding over time. Unless of course that's what you are trying to do ... -- Derek Parnell Melbourne, Australia "Justice for David Hicks!" skype: derek.j.parnell
Jun 04 2007
next sibling parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Derek Parnell wrote:
 On Mon, 04 Jun 2007 10:45:09 -0700, Walter Bright wrote:
 
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array


I think you are quite wrong Walter. The new interpretation of 'x[]' is almost identical to what we used to think of as a slice. And the interpreatation of 'x[new]' is exactly what used to be known as a dynamic array. Your change will effect all code immediately whereas reversing the (changed) meaning of '[]' and '[new]' to mean 'resizable' and 'slice' will allow coders to change their coding over time. Unless of course that's what you are trying to do ...

Except that the non-resizable arrays will be more common in use than the fully dynamic ones. Thus, the resizable are a special case and deserve special syntax. *Except* that it's similar to const-by default, while D will use mutable-by default. Long story short, IMHO, it makes more sense to do const-by-default everywhere, just as using non-resizable arrays by default. But what you're suggesting will probably be more consistent with the planned constness facilities. And as I understand it, their main selling point over const-by-default is the smaller number of changes required in existing code. -- Tomasz Stachowiak http://h3.team0xf.com/ h3/h3r3tic on #D freenode
Jun 04 2007
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Tom S wrote:
 Derek Parnell wrote:
 On Mon, 04 Jun 2007 10:45:09 -0700, Walter Bright wrote:

    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array


I think you are quite wrong Walter. The new interpretation of 'x[]' is almost identical to what we used to think of as a slice. And the interpreatation of 'x[new]' is exactly what used to be known as a dynamic array. Your change will effect all code immediately whereas reversing the (changed) meaning of '[]' and '[new]' to mean 'resizable' and 'slice' will allow coders to change their coding over time. Unless of course that's what you are trying to do ...

Except that the non-resizable arrays will be more common in use than the fully dynamic ones. Thus, the resizable are a special case and deserve special syntax. *Except* that it's similar to const-by default, while D will use mutable-by default.

I think it really depends. I can imagine for a lot of folks it will basically boil down to parameter vs. local variable. If it's a parameter you want slice-semantics, if it's a local variable you want resizeable semantics. A quick grep through my code gave me about 400 lines where I used '~='. That doesn't tell you all that much, but it does give something close to an upper bound. For each of those lines a T[] somewhere is going to need to be changed to a T[new] (or whatever). The number could be less if I'm using auto. But I also use T[] x; x.length = N; a lot. Those will need to be changed too. On the other hand "[]" appears about 1700 times. That says to me that arrays that I don't resize (whether in paramter or not) are probably significantly more common than those I do. But either way it's a significant amount of code to change. --bb
Jun 04 2007
prev sibling parent gareis <dhasenan gmail.com> writes:
Derek Parnell wrote:
 On Mon, 04 Jun 2007 10:45:09 -0700, Walter Bright wrote:
 
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

T[n] a; // static array T[] b; // (array) slice T[new] c; // dynamic array


I think you are quite wrong Walter. The new interpretation of 'x[]' is almost identical to what we used to think of as a slice. And the interpreatation of 'x[new]' is exactly what used to be known as a dynamic array. Your change will effect all code immediately whereas reversing the (changed) meaning of '[]' and '[new]' to mean 'resizable' and 'slice' will allow coders to change their coding over time. Unless of course that's what you are trying to do ...

There's another issue. Function parameters cover most of the ground that this does, if not more. There's already a set of keywords that can be used here with the same meaning. C++ has T const * as a single-allocate array and T* as the resizeable form (albeit only with realloc). It'd be a bit ugly, but D could do T const[] or T const([]) for the single-allocate array and T[] for the resizeable form. T const[] also addresses the above issue of converting code. I'm working on a MUD currently, and since it's a text-based game just about everything involved is string catenation. Either I'd suddenly grow to love the auto keyword (but specifying types manually is a comforting sanity check), or I'd have to convert almost every string to be mutable. Walter, you said already that you couldn't think of many cases where it would be useful to have an immutable array of mutable elements. But now it seems that not only are you implementing that, but you're making it the default. I'm confused. Why are you doing it? I agree it's useful to have these slice arrays, but it's an edge case. Please don't require extra keywords to get to the common case.
Jun 04 2007
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 (The problem is if I pass a slice to a function, and then that function 
 reallocates the slice by changing the length or appending to it, it can 
 affect other slices into the same data in very hard-to-explain ways.

I'm not sure I understand how a reallocation could affect other slices into the same data. If a reallocation occurs, that reallocation will be into entirely new memory, provided the slice doesn't start at the head of a memory block. Is it the condition I just mentioned that's a problem? If so, I suppose it is one way to trick the compiler into thinking the reference no longer refers to const data, when in fact it does. If not, could you point me in the right direction?
 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea of 
 making a resizeable array a different type, say:
 
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

Well, I think it is very rare for a function to want to resize an array that isn't passed by reference. My first reaction would be to disallow resizing of non-ref array parameters entirely. Since resizing may occur in place, this could be seen as a mutating operation on the underlying data. If the user wants to resize a non-ref array parameter for local use he use assign ops and such. I think this may be related to a question I've been avoiding until now: how will the new const features interact with struct and class member function use? Will it always be legal to call member functions of const objects? Will it never be legal? Or perhaps there will be a way to label some operations as non-modifying (which suggests logical const)? Sean
Jun 04 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Sean Kelly wrote:
 Walter Bright wrote:
 (The problem is if I pass a slice to a function, and then that 
 function reallocates the slice by changing the length or appending to 
 it, it can affect other slices into the same data in very 
 hard-to-explain ways.

I'm not sure I understand how a reallocation could affect other slices into the same data. If a reallocation occurs, that reallocation will be into entirely new memory, provided the slice doesn't start at the head of a memory block. Is it the condition I just mentioned that's a problem? If so, I suppose it is one way to trick the compiler into thinking the reference no longer refers to const data, when in fact it does. If not, could you point me in the right direction?

Given: int[] a = new int[7]; int[] b = a[1..6]; b[1] = 2; // writes through to a[2] b.length = 4; b[1] = 3; // writes through to a[2] b.length = 6; b[1] = 4; // does not write through to a[2]
 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea of 
 making a resizeable array a different type, say:

    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

Well, I think it is very rare for a function to want to resize an array that isn't passed by reference. My first reaction would be to disallow resizing of non-ref array parameters entirely. Since resizing may occur in place, this could be seen as a mutating operation on the underlying data. If the user wants to resize a non-ref array parameter for local use he use assign ops and such.

That was my first reaction too, but Frits posted a reasonable use case.
 I think this may be related to a question I've been avoiding until now: 
 how will the new const features interact with struct and class member 
 function use?  Will it always be legal to call member functions of const 
 objects?  Will it never be legal?  Or perhaps there will be a way to 
 label some operations as non-modifying (which suggests logical const)?

You'll only be able to call const member functions on const objects.
Jun 04 2007
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 (The problem is if I pass a slice to a function, and then that 
 function reallocates the slice by changing the length or appending to 
 it, it can affect other slices into the same data in very 
 hard-to-explain ways.

I'm not sure I understand how a reallocation could affect other slices into the same data. If a reallocation occurs, that reallocation will be into entirely new memory, provided the slice doesn't start at the head of a memory block. Is it the condition I just mentioned that's a problem? If so, I suppose it is one way to trick the compiler into thinking the reference no longer refers to const data, when in fact it does. If not, could you point me in the right direction?

Given: int[] a = new int[7]; int[] b = a[1..6]; b[1] = 2; // writes through to a[2] b.length = 4; b[1] = 3; // writes through to a[2] b.length = 6; b[1] = 4; // does not write through to a[2]

Okay, so it is related to non-reallocating changes. Thanks.
 Now, it turns out that it is very rare for a function to legitimately 
 want to resize a buffer passed to it. So we finally hit on the idea 
 of making a resizeable array a different type, say:

    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

Well, I think it is very rare for a function to want to resize an array that isn't passed by reference. My first reaction would be to disallow resizing of non-ref array parameters entirely. Since resizing may occur in place, this could be seen as a mutating operation on the underlying data. If the user wants to resize a non-ref array parameter for local use he use assign ops and such.

That was my first reaction too, but Frits posted a reasonable use case.

Do you mean: T[] foo(<parameters>, T[] buffer = null) { buffer.length = <something>; <fill buffer> return buffer; } Wouldn't it work if changed to this: T[] foo(<parameters>, T[] buffer = null) { if( buffer.length < newSize ) buffer = new T[newSize]; <fill buffer> return buffer; } I think it's arguable that the second form is also more meaningful given that the array is not passed by reference. Sean
Jun 04 2007
parent reply Sean Kelly <sean f4.ca> writes:
Sean Kelly wrote:
 Walter Bright wrote:
 Sean Kelly wrote:
 Well, I think it is very rare for a function to want to resize an 
 array that isn't passed by reference.  My first reaction would be to 
 disallow resizing of non-ref array parameters entirely.  Since 
 resizing may occur in place, this could be seen as a mutating 
 operation on the underlying data.  If the user wants to resize a 
 non-ref array parameter for local use he use assign ops and such.

That was my first reaction too, but Frits posted a reasonable use case.

Do you mean: T[] foo(<parameters>, T[] buffer = null) { buffer.length = <something>; <fill buffer> return buffer; } Wouldn't it work if changed to this: T[] foo(<parameters>, T[] buffer = null) { if( buffer.length < newSize ) buffer = new T[newSize]; <fill buffer> return buffer; } I think it's arguable that the second form is also more meaningful given that the array is not passed by reference.

Given the above, am I correct in assuming that it will only be illegal to resize an array via .length if the underlying data is const? If the data is not const then it seems completely legitimate to resize the supplied array, even if doing so means an in-place expansion. Sean
Jun 05 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Sean Kelly wrote:
 Given the above, am I correct in assuming that it will only be illegal 
 to resize an array via .length if the underlying data is const?  If the 
 data is not const then it seems completely legitimate to resize the 
 supplied array, even if doing so means an in-place expansion.

Making the referred to data const does not inhibit resizing, because resizing does not change the referred to data. The key to understanding dynamic arrays is understanding their representation. Dynamic arrays of const data are like a C++ struct defined as: struct DynamicArray { size_t length; const char *ptr; } Here, I can change .length and .ptr EVEN THOUGH the data being pointed to is const.
Jun 06 2007
parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 Sean Kelly wrote:
 Given the above, am I correct in assuming that it will only be illegal 
 to resize an array via .length if the underlying data is const?  If 
 the data is not const then it seems completely legitimate to resize 
 the supplied array, even if doing so means an in-place expansion.

Making the referred to data const does not inhibit resizing, because resizing does not change the referred to data. The key to understanding dynamic arrays is understanding their representation. Dynamic arrays of const data are like a C++ struct defined as: struct DynamicArray { size_t length; const char *ptr; } Here, I can change .length and .ptr EVEN THOUGH the data being pointed to is const.

Okay, that makes sense. So I'm not sure I entirely understand the problem with resizing even mutable array in place. Could this just be an instance where documentation is indeed sufficient? I can't think of a situation where I would actually pass a slice of a buffer to a routine that may grow that buffer when writing to it--it just doesn't make any sense from a use perspective. Is this something that we really need to add const features to check? Sean
Jun 06 2007
parent reply Reiner Pope <some address.com> writes:
Sean Kelly wrote:
  > Okay, that makes sense.  So I'm not sure I entirely understand the
 problem with resizing even mutable array in place.  Could this just be 
 an instance where documentation is indeed sufficient?  I can't think of 
 a situation where I would actually pass a slice of a buffer to a routine 
 that may grow that buffer when writing to it--it just doesn't make any 
 sense from a use perspective.  Is this something that we really need to 
 add const features to check?

I don't know if this helps, but currently the following code behaves unexpectedly with version=bad
 import std.stdio;
 
 void main()
 {
     char[] data = "1 50.0 60".dup;
 
     writefln( data[0..1].withDecimals() );
     writefln( data );                      // with version=bad prints "1.00.0
60" instead of "1 50.0 60"
     writefln( data[2..6].withDecimals() );
     writefln( data[7..$].withDecimals() );
 }
 
 char[] withDecimals(char[] num)
 {
     foreach (i, c; num)
         if (c == '.')
             return num;
 
 version(bad)
 {
     num ~= ".0";
     return num;
 }
 else
 {
     return (num ~ ".0");
 }
 
 }      

I'm not sure what the suggested rewrite of this would be. To avoid unnecessary dup'ing, perhaps:
 
 private char[new] withDecimals(char[new] num, bool needsDup)
 {
     foreach (i, c; num)
         if (c == '.')
             return num;
 
     if (needsDup)
         num = num.dup;
     num ~= ".0";
     return num;
 }
 
 char[new] withDecimals(char[new] num)
 {
     return withDecimals(num, false);
 }
 
 char[] withDecimals(char[] num)
 {
     return withDecimals(cast(char[new])num, true)
 }
  

-- Reiner
Jun 06 2007
parent Reiner Pope <some address.com> writes:
Reiner Pope wrote:
 I'm not sure what the suggested rewrite of this would be. 

 char[new=IsNew] withDecimals(bool IsNew)(char[new=IsNew] num)
 {
     foreach (i, c; num)
         if (c == '.')
             return num;
 
     char[new] newNum;
     static if (IsNew)
         newNum = num;
     else
         newNum = num.dup;
 
     newNum ~= ".0";
     return newNum;
 }

-- Reiner
Jun 07 2007
prev sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Walter Bright wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 (The problem is if I pass a slice to a function, and then that 
 function reallocates the slice by changing the length or appending to 
 it, it can affect other slices into the same data in very 
 hard-to-explain ways.

I'm not sure I understand how a reallocation could affect other slices into the same data. If a reallocation occurs, that reallocation will be into entirely new memory, provided the slice doesn't start at the head of a memory block. Is it the condition I just mentioned that's a problem? If so, I suppose it is one way to trick the compiler into thinking the reference no longer refers to const data, when in fact it does. If not, could you point me in the right direction?

Given: int[] a = new int[7]; int[] b = a[1..6]; b[1] = 2; // writes through to a[2] b.length = 4; b[1] = 3; // writes through to a[2] b.length = 6; b[1] = 4; // does not write through to a[2]

So, is that the problem we are trying to solve? Why not just simply make it so that setting .length *allways* creates a new array? Hum, now that I think about it, this may not only be a better solution, it may be the *only* real solution. Even, with the new resizable/unresizable array proposal you can have the same problem as above: int[new] a = new int[7]; int[new] b = a; b.length = 6; b[1] = 2; // writes through to a[1] b.length = 4; b[1] = 3; // writes through to a[1] b.length = 10; b[1] = 4; // does not write through to a[1] Why does this problem exist? Because, setting an array length may or may not create a new array, and there isn't a way to know which will happen. Well, in that code segment we can see when it will resize, but we can't know in the general case (like when we receive an array as a function parameter). This makes setting .length as very unsafe operation, if not plain *broken*. (Oskar, I would particularly like to hear your opinion on this) -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 09 2007
next sibling parent Deewiant <deewiant.doesnotlike.spam gmail.com> writes:
Bruno Medeiros wrote:
   int[new] a = new int[7];
   int[new] b = a;
   b.length = 6;
 
   b[1] = 2;   // writes through to a[1]
   b.length = 4;
   b[1] = 3;   // writes through to a[1]
   b.length = 10;
   b[1] = 4;   // does not write through to a[1]
 
 Why does this problem exist? Because, setting an array length may or may
 not create a new array, and there isn't a way to know which will happen.
 Well, in that code segment we can see when it will resize, but we can't
 know in the general case (like when we receive an array as a function
 parameter). This makes setting .length as very unsafe operation, if not
 plain *broken*.
 

There could perhaps be an internal flag in every array, initially set to false. It would be set in the b = a assignment, thus making it known that b is an alias to a. Then, when b.length is modified, a new array is created and the flag is unset. I'm not quite sure how robust this would be, but having .length changeable without creating a new array is just too handy to give up. Otherwise, we'd have to use realloc() to achieve the same effect. The loss here is 1 bit of .length or 1 byte in the array struct for the flag, and the overhead of having to check it whenever .length is modified. Of course, the simplest solution is simply to say that modification of .length for aliased arrays is undefined behaviour. -- Remove ".doesnotlike.spam" from the mail address.
Jun 09 2007
prev sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Bruno Medeiros skrev:
 Walter Bright wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 (The problem is if I pass a slice to a function, and then that 
 function reallocates the slice by changing the length or appending 
 to it, it can affect other slices into the same data in very 
 hard-to-explain ways.

I'm not sure I understand how a reallocation could affect other slices into the same data. If a reallocation occurs, that reallocation will be into entirely new memory, provided the slice doesn't start at the head of a memory block. Is it the condition I just mentioned that's a problem? If so, I suppose it is one way to trick the compiler into thinking the reference no longer refers to const data, when in fact it does. If not, could you point me in the right direction?

Given: int[] a = new int[7]; int[] b = a[1..6]; b[1] = 2; // writes through to a[2] b.length = 4; b[1] = 3; // writes through to a[2] b.length = 6; b[1] = 4; // does not write through to a[2]

So, is that the problem we are trying to solve? Why not just simply make it so that setting .length *allways* creates a new array?

Not really an option. That would make ~= O(n^2) instead of O(n).
 Hum, now that I think about it, this may not only be a better solution, 
 it may be the *only* real solution. Even, with the new 
 resizable/unresizable array proposal you can have the same problem as 
 above:
 
   int[new] a = new int[7];
   int[new] b = a;
   b.length = 6;
 
   b[1] = 2;   // writes through to a[1]
   b.length = 4;
   b[1] = 3;   // writes through to a[1]
   b.length = 10;
   b[1] = 4;   // does not write through to a[1]
 
 Why does this problem exist? Because, setting an array length may or may 
 not create a new array, and there isn't a way to know which will happen. 
  Well, in that code segment we can see when it will resize, but we can't 
 know in the general case (like when we receive an array as a function 
 parameter). This makes setting .length as very unsafe operation, if not 
 plain *broken*.

That's why my suggestion was making T[new] a reference type instead of a value type, just like T[U] has become. The associative array used to have a similar issue. Previously (before D 0.151): int[int] a = ...; int[int] b = a; b[0] = 0; could potentially corrupt a... And an analogue for todays dynamic array: int[] a = ...; int[] b = a; b.length = 1; b ~= ...; will corrupt a. The solution would be identical: Make the resizable array a reference type. Todays value type behavior for dynamic arrays are perfect for slices, just not for resizable arrays. Another option, if one wishes to resolve the above issues and have T[new] remain a value type, would be to insert an implicit .dup on assignment for T[new]. That way, there would never be two resizable arrays sharing the same ptr. /Oskar
Jun 09 2007
next sibling parent Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Oskar Linde skrev:

 Not really an option. That would make ~= O(n^2) instead of O(n).

Ehm, I mean O(n) instead of O(1) of course. /Oskar
Jun 09 2007
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Oskar Linde wrote:
 Bruno Medeiros skrev:
 Walter Bright wrote:
 Sean Kelly wrote:
 Walter Bright wrote:
 (The problem is if I pass a slice to a function, and then that 
 function reallocates the slice by changing the length or appending 
 to it, it can affect other slices into the same data in very 
 hard-to-explain ways.

I'm not sure I understand how a reallocation could affect other slices into the same data. If a reallocation occurs, that reallocation will be into entirely new memory, provided the slice doesn't start at the head of a memory block. Is it the condition I just mentioned that's a problem? If so, I suppose it is one way to trick the compiler into thinking the reference no longer refers to const data, when in fact it does. If not, could you point me in the right direction?

Given: int[] a = new int[7]; int[] b = a[1..6]; b[1] = 2; // writes through to a[2] b.length = 4; b[1] = 3; // writes through to a[2] b.length = 6; b[1] = 4; // does not write through to a[2]

So, is that the problem we are trying to solve? Why not just simply make it so that setting .length *allways* creates a new array?

Not really an option. That would make ~= O(n^2) instead of O(n).

Hum, didn't recall that. However, I still wouldn't say it's not an option. You're asking to choose between performance (significant only in some cases, like realloc'ing in loops) and correctness/"safeness". Safeness because length setting is not safe in the general case, but only if you know who and how the array was allocated, and also that no other references to it exist (in the case you are increasing the length). It seems to me that it should be left to the compiler, through code analysis, whether the in-place realloc optimization can be safely performed or not (i.e., without changing program semantics). However, since this would take quite some work to implement in the compiler, and also since some people may like to know for sure if their optimization is implemented or not, we could have a syntax (alternative to setting the length), that would do an inplace realloc. It could be another property (ar.inlength = 42), or simply a method (ar.realloc(42)), or whatever other syntax. This would mean that concatenation operators would always dup the array, so when you would want to resize inplace you couldn't use the concatenation operators. That would be, I think, the only difference versus the two array types solution. But given the rarity of inplace resizing, does not being able to use the concatenation operators justify adding a new array type to the language?
 Hum, now that I think about it, this may not only be a better 
 solution, it may be the *only* real solution. Even, with the new 
 resizable/unresizable array proposal you can have the same problem as 
 above:

   int[new] a = new int[7];
   int[new] b = a;
   b.length = 6;

   b[1] = 2;   // writes through to a[1]
   b.length = 4;
   b[1] = 3;   // writes through to a[1]
   b.length = 10;
   b[1] = 4;   // does not write through to a[1]

 Why does this problem exist? Because, setting an array length may or 
 may not create a new array, and there isn't a way to know which will 
 happen.  Well, in that code segment we can see when it will resize, 
 but we can't know in the general case (like when we receive an array 
 as a function parameter). This makes setting .length as very unsafe 
 operation, if not plain *broken*.

That's why my suggestion was making T[new] a reference type instead of a value type, just like T[U] has become. The associative array used to have a similar issue. Previously (before D 0.151): int[int] a = ...; int[int] b = a; b[0] = 0; could potentially corrupt a... And an analogue for todays dynamic array: int[] a = ...; int[] b = a; b.length = 1; b ~= ...; will corrupt a. The solution would be identical: Make the resizable array a reference type. Todays value type behavior for dynamic arrays are perfect for slices, just not for resizable arrays. Another option, if one wishes to resolve the above issues and have T[new] remain a value type, would be to insert an implicit .dup on assignment for T[new]. That way, there would never be two resizable arrays sharing the same ptr. /Oskar

Yeah, I guess those two would also be solutions, although I don't see right now if there could be some other implications with those usages. But in any case, even having two array kinds, and whatever the way the resizable array kind works, why do the non-resizable arrays need to be non-resizable at all? They could be resizable, but having the resize operation always allocate a new array (and consequently also allow concatenation). You gain language functionality and lose *nothing*. -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jun 10 2007
prev sibling next sibling parent reply Ary Manzana <ary esperanto.org.ar> writes:
Walter Bright escribió:
 
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array
 
    a.length = 6;   // error, static array
    b.length = 6;   // error, dynamic array is not resizeable
    c.length = 6;   // ok
    b = c;          // ok, implicit conversion of resizeable to dynamic
    c = b;          // error
    c = b[0..n];    // error
    b = c[0..n];    // ok, but watch out if c is later resized
    b = c[0..n].dup; // ok, no worries
    c = cast(T[new])b; // ok, but this will be deliberate

I don't understand how dynamic arrays work with this change. T[] b; // b is a dynamic array of size... ??? b.length = 6; // error, dynamic array is not resizeable ???????? So maybe this should be: T[] b = new T[6]; // ... of size 6, and 6 can // be replaced with a dynamic variable Is this the intention of it? I've only wrote one program in D, and I used dynamic arrays as lists (AKA ArrayList in Java). So I guess my program will break with this change, because I can't concatenate anymore. Maybe I've been using dynamic arrays in the wrong way?
Jun 04 2007
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Ary Manzana wrote:
 Walter Bright escribió:
    T[n]   a;  // static array
    T[]    b;  // dynamic array
    T[new] c;  // resizeable array

    a.length = 6;   // error, static array
    b.length = 6;   // error, dynamic array is not resizeable
    c.length = 6;   // ok
    b = c;          // ok, implicit conversion of resizeable to dynamic
    c = b;          // error
    c = b[0..n];    // error
    b = c[0..n];    // ok, but watch out if c is later resized
    b = c[0..n].dup; // ok, no worries
    c = cast(T[new])b; // ok, but this will be deliberate

I don't understand how dynamic arrays work with this change. T[] b; // b is a dynamic array of size... ??? b.length = 6; // error, dynamic array is not resizeable ???????? So maybe this should be: T[] b = new T[6]; // ... of size 6, and 6 can // be replaced with a dynamic variable Is this the intention of it?

Yep, I'm pretty sure that's what you'll have to do for a *non*-resizeable array. On the other hand if you want to be able to resize (using Walter's not-so-popular syntax): T[new] b; b.length = 6;
 I've only wrote one program in D, and I used dynamic arrays as lists 
 (AKA ArrayList in Java). So I guess my program will break with this 
 change, because I can't concatenate anymore. Maybe I've been using 
 dynamic arrays in the wrong way?  

Nope, the way you've been using them is fine. But to get safer slices, something's got to change. And since not resizing is less common than resizing, it makes more sense to change the syntax for the resizeable arrays. --bb
Jun 04 2007
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
It just occurred to me this morning that I may have missed something 
regarding the proposed const changes.  Given the following D 1.0 code, 
can someone tell me whether or how it will change in D 2.0?

     void alter( char[] str )
     {
         str[0] = 'x';
     }

     char[] buf = "abc".dup;

     alter( buf );
     alter( buf[1 .. $] );

     printf( "%.*s\n", buf ); // should print "xxc"



Sean
Jun 05 2007
parent reply Brad Roberts <braddr puremagic.com> writes:
I believe that code would not need to change to retain existing behavior.

The function args aren't const by default since it's not using 'in'.  So 
it'll be fine.  The syntax used is also to pass a slice. which would 
become non-resizeable, which you're not resizing, so that's fine.  Slices 
are to retain their write through behavior, so that's fine.  Your 
initialization of buf would have an implicit cast from dup's T[new] to 
buf's T[], which works.  Passing a T[] to a function taking a T[] is 
perfectly normal.. 

So, that code 'just works'.

Later,
Brad

(now, time for Walter to point out the flaw in my reasoning)

On Tue, 5 Jun 2007, Sean Kelly wrote:

 Date: Tue, 05 Jun 2007 10:16:53 -0700
 From: Sean Kelly <sean f4.ca>
 Reply-To: digitalmars.D.announce <digitalmars-d-announce puremagic.com>
 To: digitalmars-d-announce puremagic.com
 Newsgroups: digitalmars.D.announce
 Subject: Re: resizeable arrays: T[new]
 
 It just occurred to me this morning that I may have missed something regarding
 the proposed const changes.  Given the following D 1.0 code, can someone tell
 me whether or how it will change in D 2.0?
 
     void alter( char[] str )
     {
         str[0] = 'x';
     }
 
     char[] buf = "abc".dup;
 
     alter( buf );
     alter( buf[1 .. $] );
 
     printf( "%.*s\n", buf ); // should print "xxc"
 
 
 
 Sean
 

Jun 05 2007
parent Sean Kelly <sean f4.ca> writes:
Brad Roberts wrote:
 I believe that code would not need to change to retain existing behavior.
 
 The function args aren't const by default since it's not using 'in'.  So 
 it'll be fine.  The syntax used is also to pass a slice. which would 
 become non-resizeable, which you're not resizing, so that's fine.  Slices 
 are to retain their write through behavior, so that's fine.  Your 
 initialization of buf would have an implicit cast from dup's T[new] to 
 buf's T[], which works.  Passing a T[] to a function taking a T[] is 
 perfectly normal.. 
 
 So, that code 'just works'.

Thanks for clearing this up. After thinking about the talk yesterday I'd started to wonder if the "write through" feature of slices might change, etc. It's very reassuring to know that this isn't the case. Sean
Jun 05 2007