www.digitalmars.com         C & C++   DMDScript  

D - Array properties

reply Ziemowit Zglinski <Ziemowit_member pathlink.com> writes:
Reading the D documentation about Arrays and their properties, I wonder about
the
definition of dup, reverse and sort properties. As far as I can understand, the
behaviour
is of these properties is:

int[] a, b;
(1)  a  =  b;            a points to same array as b does
(2)  a  =  b.dup;      a points to a copy of b
(3)  a  =  b.reverse; a points to same array as b; b is reversed
(4)  a  =  b.sort;      a points to same array as b; b is sorted

While lines (1) and (2) are intuitively clear, I have a problem with the lines
(3) and (4).
I would expect the behaviour:

(3a) a  =  b.reverse;  a points to reversed copy of b; b is unchanged
(4a) a  =  b.sort;       a points to sorted copy of b; b is unchanged

When sorting an array in place, the properties are evolving to methods:

(5)  b.reverse;          b reversed in place
(6)  b.sort;               b sorted in place

To reverse/sort an array in place, the using of properties I would expect are:

(5a) b  =  b.reverse;  b reversed in place
(6a) b  =  b.sort;       b sorted in place

Am I missing something? 
Or are there strong reasons to define the first behaviour?

--  Ziemowit

    ziemek tera.com.pl
Mar 16 2003
next sibling parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Ziemowit Zglinski wrote:

 While lines (1) and (2) are intuitively clear, I have a problem with the lines
 (3) and (4).
 I would expect the behaviour:

 (3a) a  =  b.reverse;  a points to reversed copy of b; b is unchanged
 (4a) a  =  b.sort;       a points to sorted copy of b; b is unchanged

 When sorting an array in place, the properties are evolving to methods:

 (5)  b.reverse;          b reversed in place
 (6)  b.sort;               b sorted in place

 To reverse/sort an array in place, the using of properties I would expect are:

 (5a) b  =  b.reverse;  b reversed in place
 (6a) b  =  b.sort;       b sorted in place

 Am I missing something?
 Or are there strong reasons to define the first behaviour?

I agree that your interpretations are better, in theory. I think that Walter chose the order he did because he was looking for reasonably fast speed while still having simple compilers. The problem with your example (5a) is that there are two possible interpretations that the programmer might want: (5a) b = b.reverse; In this definition, "b.reverse" creates a new array, which contains the (reversed) contents of the old "b" array. Then you set "b" to point at that array. So you're not really reversed in place, you're creating a new array that is reversed from the old one. To reverse in place, you have to do this: (6a) b[] = b.reverse; In theory, you are declaring that the program do these steps: 1) Create a new array, T, on the heap 2) Copy the elements of b into T, reversed 3) Copy the elements of T into b 4) Later (during GC) clean up T A good optimizer might be smart enough to collapse that down to "reverse b in place." But still, you don't have a language element that explicitly exists for reversing in place. -- The Villagers are Online! http://villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Mar 16 2003
parent Ziemowit Zglinski <Ziemowit_member pathlink.com> writes:
Another idea to resolve the "reverse/sort in place problem" would it be
to define the METHODS reverse()/sort() to the array definition, as 
opposite to PROPERTIES reverse/sort. It would mean the syntax:

(5b)  b.reverse();     b reversed in place 
(6b)  b.sort();        b sorted in place

This also syntactically highlites the in place operation. Also the hidden 
function call (and side effect) inside the property is made clearly visible.
So the interpretation of lines (5a)/(6a) would be exactly as you indicated.

-- Ziemowit

ziemek tera.com.pl

In article <3E74EF65.C0AF0CA7 deming-os.org>, Russ Lewis says...
Ziemowit Zglinski wrote:

 While lines (1) and (2) are intuitively clear, I have a problem with the lines
 (3) and (4).
 I would expect the behaviour:

 (3a) a  =  b.reverse;  a points to reversed copy of b; b is unchanged
 (4a) a  =  b.sort;       a points to sorted copy of b; b is unchanged

 When sorting an array in place, the properties are evolving to methods:

 (5)  b.reverse;          b reversed in place
 (6)  b.sort;               b sorted in place

 To reverse/sort an array in place, the using of properties I would expect are:

 (5a) b  =  b.reverse;  b reversed in place
 (6a) b  =  b.sort;       b sorted in place

 Am I missing something?
 Or are there strong reasons to define the first behaviour?

I agree that your interpretations are better, in theory. I think that Walter chose the order he did because he was looking for reasonably fast speed while still having simple compilers. The problem with your example (5a) is that there are two possible interpretations that the programmer might want: (5a) b = b.reverse; In this definition, "b.reverse" creates a new array, which contains the (reversed) contents of the old "b" array. Then you set "b" to point at that array. So you're not really reversed in place, you're creating a new array that is reversed from the old one. To reverse in place, you have to do this: (6a) b[] = b.reverse; In theory, you are declaring that the program do these steps: 1) Create a new array, T, on the heap 2) Copy the elements of b into T, reversed 3) Copy the elements of T into b 4) Later (during GC) clean up T A good optimizer might be smart enough to collapse that down to "reverse b in place." But still, you don't have a language element that explicitly exists for reversing in place. -- The Villagers are Online! http://villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]

Mar 16 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
I've been thinking about this. I think you're probably right - .reverse and
.sort should return copies.

"Ziemowit Zglinski" <Ziemowit_member pathlink.com> wrote in message
news:b52p24$les$1 digitaldaemon.com...
 Reading the D documentation about Arrays and their properties, I wonder

 the
 definition of dup, reverse and sort properties. As far as I can

 behaviour
 is of these properties is:

 int[] a, b;
 (1)  a  =  b;            a points to same array as b does
 (2)  a  =  b.dup;      a points to a copy of b
 (3)  a  =  b.reverse; a points to same array as b; b is reversed
 (4)  a  =  b.sort;      a points to same array as b; b is sorted

 While lines (1) and (2) are intuitively clear, I have a problem with the

 (3) and (4).
 I would expect the behaviour:

 (3a) a  =  b.reverse;  a points to reversed copy of b; b is unchanged
 (4a) a  =  b.sort;       a points to sorted copy of b; b is unchanged

 When sorting an array in place, the properties are evolving to methods:

 (5)  b.reverse;          b reversed in place
 (6)  b.sort;               b sorted in place

 To reverse/sort an array in place, the using of properties I would expect

 (5a) b  =  b.reverse;  b reversed in place
 (6a) b  =  b.sort;       b sorted in place

 Am I missing something?
 Or are there strong reasons to define the first behaviour?

 --  Ziemowit

     ziemek tera.com.pl

Mar 28 2003
parent reply Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:
 I've been thinking about this. I think you're probably right - .reverse and
 .sort should return copies.

Please no! If one wants a copy, use .dup.sort. If .sort copied, it would be a time and memory hemmhorrhage that I would have no control over. I would have to write my own sort just to avoid this. From a simple matter of incidence, of 15 uses of .sort in some of my code, only 2 use .dup.sort. But even were that inverted, I can get the opposite behaviour when .sort is noncopying, but I'm impotent when .sort copies.
Mar 29 2003
next sibling parent "Walter" <walter digitalmars.com> writes:
"Burton Radons" <loth users.sourceforge.net> wrote in message
news:b64enj$2t8h$1 digitaldaemon.com...
 Walter wrote:
 I've been thinking about this. I think you're probably right - .reverse


 .sort should return copies.

Please no! If one wants a copy, use .dup.sort. If .sort copied, it would be a time and memory hemmhorrhage that I would have no control over. I would have to write my own sort just to avoid this. From a simple matter of incidence, of 15 uses of .sort in some of my code, only 2 use .dup.sort. But even were that inverted, I can get the opposite behaviour when .sort is noncopying, but I'm impotent when .sort copies.

I was afraid of that :-(
Mar 31 2003
prev sibling parent reply "Dario" <supdar yahoo.com> writes:
 Walter wrote:
 I've been thinking about this. I think you're probably right - .reverse


 .sort should return copies.


 Burton Radons wrote:
 Please no!  If one wants a copy, use .dup.sort.  If .sort copied, it
 would be a time and memory hemmhorrhage that I would have no control
 over.  I would have to write my own sort just to avoid this.

It wouldn't necessary happen if the compiler optimized a statement like: a[] = a.sort; That would literally mean (if sort made copies): 1) a new array is allocated; 2) the content of 'a' is copied in sorted; 3) the content of the sorted array is copied in 'a'; 4) (the copy of 'a' is lost and becomes garbage); But the compiler should be able to optimize it so that the unnecessary allocation won't be done. This can be done, can't it? "a[] = a.sort" is now "a.sort" and "a = a.sort" is now "a = a.dup.sort". -- Dario
May 01 2003
next sibling parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
I dislike having to rely on compiler optimizations.  I'd rather the language
have the expressiveness required to tell the compiler exactly what I want it
to have the program do.  If it does anything above and beyond the call of
duty, that's just gravy.

Sean

"Dario" <supdar yahoo.com> wrote in message
news:b8rf9e$2mmn$1 digitaldaemon.com...
 Walter wrote:
 I've been thinking about this. I think you're probably right -



 and
 .sort should return copies.


 Burton Radons wrote:
 Please no!  If one wants a copy, use .dup.sort.  If .sort copied, it
 would be a time and memory hemmhorrhage that I would have no control
 over.  I would have to write my own sort just to avoid this.

It wouldn't necessary happen if the compiler optimized a statement like: a[] = a.sort; That would literally mean (if sort made copies): 1) a new array is allocated; 2) the content of 'a' is copied in sorted; 3) the content of the sorted array is copied in 'a'; 4) (the copy of 'a' is lost and becomes garbage); But the compiler should be able to optimize it so that the unnecessary allocation won't be done. This can be done, can't it? "a[] = a.sort" is now "a.sort" and "a = a.sort" is now "a = a.dup.sort". -- Dario

May 01 2003
parent Bill Cox <bill viasic.com> writes:
Hi, Sean.

I also don't want sort and reverse to make copies.  I can easily do that 
manually before calling them.

As for the larger point of having to rely on compiler optimizations, I 
hate having to learn templates for code layout, and then use them in my 
programming.  For one thing, no two compilers will agree on the 
templates that generate efficient code, even if both conform to the 
standard.

However, I also would like the compiler to do more optimizations than is 
currently possible in C, because C gives up too much control.

One of my favorite examples is automatically generated unions.  I find 
that as much as 1/3 of all object fields in the EDA databases I've used 
in the past are never used at the same time.  For example, an instance 
can have both placement as an owning cell location, and an owning 
netlist partition, used during synthesis.  The compiler could in theory 
examine my program's flow, and determine that some fields don't have 
overlapping lifespans.  These could be put into unions.  If you do this 
manually, you not only confuse people, but you make the code fragile 
(Did I say synthesis and placement couldn't be done together?).

So, specifically what I'm saying is that memory layout of class objects 
should be abstract, and not determined by the user, other than to 
provide some guidance.  This is the case in D.   I'm looking forward to 
seeing some advanced optimizations that should further leave C in the dust.

Bill

Sean L. Palmer wrote:
 I dislike having to rely on compiler optimizations.  I'd rather the language
 have the expressiveness required to tell the compiler exactly what I want it
 to have the program do.  If it does anything above and beyond the call of
 duty, that's just gravy.
 
 Sean
 
 "Dario" <supdar yahoo.com> wrote in message
 news:b8rf9e$2mmn$1 digitaldaemon.com...
 
Walter wrote:

I've been thinking about this. I think you're probably right -



and

.sort should return copies.

Burton Radons wrote: Please no! If one wants a copy, use .dup.sort. If .sort copied, it would be a time and memory hemmhorrhage that I would have no control over. I would have to write my own sort just to avoid this.

It wouldn't necessary happen if the compiler optimized a statement like: a[] = a.sort; That would literally mean (if sort made copies): 1) a new array is allocated; 2) the content of 'a' is copied in sorted; 3) the content of the sorted array is copied in 'a'; 4) (the copy of 'a' is lost and becomes garbage); But the compiler should be able to optimize it so that the unnecessary allocation won't be done. This can be done, can't it? "a[] = a.sort" is now "a.sort" and "a = a.sort" is now "a = a.dup.sort". -- Dario


May 02 2003
prev sibling next sibling parent Burton Radons <loth users.sourceforge.net> writes:
Dario wrote:

Walter wrote:

I've been thinking about this. I think you're probably right - .reverse


and
.sort should return copies.


Burton Radons wrote:
Please no!  If one wants a copy, use .dup.sort.  If .sort copied, it
would be a time and memory hemmhorrhage that I would have no control
over.  I would have to write my own sort just to avoid this.

It wouldn't necessary happen if the compiler optimized a statement like: a[] = a.sort;

That would introduce an inconsistency: class Foo { int x; int cmp (Object other) { return a [0].x + x - other.x; } } Foo[] a; ... a[] = a.sort; So it would have to be explicit semantics.
May 02 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Dario" <supdar yahoo.com> wrote in message
news:b8rf9e$2mmn$1 digitaldaemon.com...
 Walter wrote:
 I've been thinking about this. I think you're probably right -



 and
 .sort should return copies.


 Burton Radons wrote:
 Please no!  If one wants a copy, use .dup.sort.  If .sort copied, it
 would be a time and memory hemmhorrhage that I would have no control
 over.  I would have to write my own sort just to avoid this.

It wouldn't necessary happen if the compiler optimized a statement like: a[] = a.sort; That would literally mean (if sort made copies): 1) a new array is allocated; 2) the content of 'a' is copied in sorted; 3) the content of the sorted array is copied in 'a'; 4) (the copy of 'a' is lost and becomes garbage); But the compiler should be able to optimize it so that the unnecessary allocation won't be done. This can be done, can't it? "a[] = a.sort" is now "a.sort" and "a = a.sort" is now "a = a.dup.sort".

It couldn't do that optimization, because a could point inside another array.
May 02 2003
parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
Which is why slices and arrays should be different beasts.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:b8u68i$28o8$1 digitaldaemon.com...
 "Dario" <supdar yahoo.com> wrote in message
 news:b8rf9e$2mmn$1 digitaldaemon.com...
 Walter wrote:
 I've been thinking about this. I think you're probably right -



 and
 .sort should return copies.


 Burton Radons wrote:
 Please no!  If one wants a copy, use .dup.sort.  If .sort copied, it
 would be a time and memory hemmhorrhage that I would have no control
 over.  I would have to write my own sort just to avoid this.

It wouldn't necessary happen if the compiler optimized a statement like: a[] = a.sort; That would literally mean (if sort made copies): 1) a new array is allocated; 2) the content of 'a' is copied in sorted; 3) the content of the sorted array is copied in 'a'; 4) (the copy of 'a' is lost and becomes garbage); But the compiler should be able to optimize it so that the unnecessary allocation won't be done. This can be done, can't it? "a[] = a.sort" is now "a.sort" and "a = a.sort" is now "a = a.dup.sort".

It couldn't do that optimization, because a could point inside another array.

May 02 2003
parent reply "Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:b8u7ht$29tg$1 digitaldaemon.com...
 Which is why slices and arrays should be different beasts.

It's not just slices, two handles can refer to the same array.
May 02 2003
parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
And the problem with sorting them then is...?  Maybe that's what the user
wants.  Maybe he wants a copy instead.  (S)he needs to specify.

The problem is that sometimes you want to treat normally reference types as
values, and sometimes you want to treat values as references.  We have to
use two different sets of syntaxes and thus the type system is not unified.
We can't really just say what we want, we have to remember what *type* the
thing is and say it differently.  This impedes generic programming.

This sort issue, the in/out/inout issue, most of this is symptoms of kludges
around this basic fact.  Trying to "make it easier" on people by introducing
two separate programming models with the same syntax.

I wish there was no semantic difference between an class and a struct, or in
fact between a class and a basic type such as int.  I should be able to
program them all the same way, with method calls and operators.  And if I
want a copy I should be able to say "make a copy" and if I want a reference
I should be able to say "make a reference to x", of course with much terser
syntax, but one that is explicit.  But it should be obvious what it is that
I'm doing;  hiding references is a bad idea. I want the references to stick
out to the naked eye, like & and * do in C.  I'd rather do something like
this though:

int& g = 10; // reference to int

void inc(int& i)  // we're reinventing ++ today  ;)
{
    i = i + 1; // dereferencing is automatic
}

{
    int foo;  // init to zero, since we don't specify anything else
    inc(&foo);  // require making explicit reference to foo
    inc(foo);  // or don't, I'm not that picky, but the & helps to be clear
about what's going on.
    print(foo); // please?  how hard should this be?  prints 2
    int& dog = foo;
    inc(dog);  // passing a reference in is no problem, no need for &, which
would be redundant, unnecessary,
    print(dog); // prints 3
    dog = g;  // store a 10 into foo thru dog reference
    dog = &g;  // this is still only assigning a 10 into foo, because
references are automatically dereferenced if necessary during implicit
conversion.
    // they could be automatically created if necessary too, but this may
not be desirable.  I kinda like references being transparent though.
    &dog = &g; // assign reference to g to reference named dog
    inc(&dog);  // would form a reference to reference to int which
automatically implicitly casts to reference to int
    print(dog); // prints 11
    print(foo); // prints 10
}

But I bet that D is no longer malleable enough to make such a large change.

The compiler is always of course free to optimize away any physical
reference if it can tell it is not needed.  The semantics have to be
honored, though.  It must give the correct results.

How to deal with references to references to references (double/triple
indirection in C with pointers)?  This would require some kind of way to
explicitly dereference, so you can get to the desired level.  Or you could
force the programmer to declare an intermediate reference of the right level
or require an explicit cast.  Extra references should still be automatically
dereferenced.

i.e.

int aa = 1, bb = 2;
int& a = &aa, b = &bb;
int&& selector = which ? &a : &b;
int& result = selector;
print(result);

I doubt a few &'s would clutter up your OO code that much.  And we could use
the same &'s for dealing with arrays by reference, or, occasionally, ints.
If you think that'll add too many &'s, invert the whole thing the other way
and do everything by reference *unless* the user explicitly requests a copy.
You could indicate a value type using #, and dereference a reference
(promote to actual value, or "copy" it) by using the unary # operator.  We
could have #= for by-value assignment (explicit copy) and #== for value
comparison.

One problem with using = for assignment and == for comparison is then how
you do make reference assignment and reference comparison?  D has gone the
way of === for reference comparison and no explicit reference assignment.
That leaves reference in the data realm, not operator realm.  I'd prefer to
state intent at time of assignment or comparison, in an unambiguous way,
using := for reference assignment and :== for reference comparison.

Am I crazy?  Things don't have to be this complicated.

Then maybe we can do fun things like build our own reference types and
overload referencing/dereferencing.

Sean

"Walter" <walter digitalmars.com> wrote in message
news:b8uigr$2lb4$1 digitaldaemon.com...
 "Sean L. Palmer" <palmer.sean verizon.net> wrote in message
 news:b8u7ht$29tg$1 digitaldaemon.com...
 Which is why slices and arrays should be different beasts.

It's not just slices, two handles can refer to the same array.

May 03 2003
parent "Walter" <walter digitalmars.com> writes:
You're right, this does boil down to the difference between reference and
value types. They do have fundamentally different semantics, while
frequently using the same syntax.

"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:b8vprv$o7u$1 digitaldaemon.com...
 And the problem with sorting them then is...?  Maybe that's what the user
 wants.  Maybe he wants a copy instead.  (S)he needs to specify.

 The problem is that sometimes you want to treat normally reference types

 values, and sometimes you want to treat values as references.  We have to
 use two different sets of syntaxes and thus the type system is not

 We can't really just say what we want, we have to remember what *type* the
 thing is and say it differently.  This impedes generic programming.

 This sort issue, the in/out/inout issue, most of this is symptoms of

 around this basic fact.  Trying to "make it easier" on people by

 two separate programming models with the same syntax.

 I wish there was no semantic difference between an class and a struct, or

 fact between a class and a basic type such as int.  I should be able to
 program them all the same way, with method calls and operators.  And if I
 want a copy I should be able to say "make a copy" and if I want a

 I should be able to say "make a reference to x", of course with much

 syntax, but one that is explicit.  But it should be obvious what it is

 I'm doing;  hiding references is a bad idea. I want the references to

 out to the naked eye, like & and * do in C.  I'd rather do something like
 this though:

 int& g = 10; // reference to int

 void inc(int& i)  // we're reinventing ++ today  ;)
 {
     i = i + 1; // dereferencing is automatic
 }

 {
     int foo;  // init to zero, since we don't specify anything else
     inc(&foo);  // require making explicit reference to foo
     inc(foo);  // or don't, I'm not that picky, but the & helps to be

 about what's going on.
     print(foo); // please?  how hard should this be?  prints 2
     int& dog = foo;
     inc(dog);  // passing a reference in is no problem, no need for &,

 would be redundant, unnecessary,
     print(dog); // prints 3
     dog = g;  // store a 10 into foo thru dog reference
     dog = &g;  // this is still only assigning a 10 into foo, because
 references are automatically dereferenced if necessary during implicit
 conversion.
     // they could be automatically created if necessary too, but this may
 not be desirable.  I kinda like references being transparent though.
     &dog = &g; // assign reference to g to reference named dog
     inc(&dog);  // would form a reference to reference to int which
 automatically implicitly casts to reference to int
     print(dog); // prints 11
     print(foo); // prints 10
 }

 But I bet that D is no longer malleable enough to make such a large

 The compiler is always of course free to optimize away any physical
 reference if it can tell it is not needed.  The semantics have to be
 honored, though.  It must give the correct results.

 How to deal with references to references to references (double/triple
 indirection in C with pointers)?  This would require some kind of way to
 explicitly dereference, so you can get to the desired level.  Or you could
 force the programmer to declare an intermediate reference of the right

 or require an explicit cast.  Extra references should still be

 dereferenced.

 i.e.

 int aa = 1, bb = 2;
 int& a = &aa, b = &bb;
 int&& selector = which ? &a : &b;
 int& result = selector;
 print(result);

 I doubt a few &'s would clutter up your OO code that much.  And we could

 the same &'s for dealing with arrays by reference, or, occasionally, ints.
 If you think that'll add too many &'s, invert the whole thing the other

 and do everything by reference *unless* the user explicitly requests a

 You could indicate a value type using #, and dereference a reference
 (promote to actual value, or "copy" it) by using the unary # operator.  We
 could have #= for by-value assignment (explicit copy) and #== for value
 comparison.

 One problem with using = for assignment and == for comparison is then how
 you do make reference assignment and reference comparison?  D has gone the
 way of === for reference comparison and no explicit reference assignment.
 That leaves reference in the data realm, not operator realm.  I'd prefer

 state intent at time of assignment or comparison, in an unambiguous way,
 using := for reference assignment and :== for reference comparison.

 Am I crazy?  Things don't have to be this complicated.

 Then maybe we can do fun things like build our own reference types and
 overload referencing/dereferencing.

 Sean

 "Walter" <walter digitalmars.com> wrote in message
 news:b8uigr$2lb4$1 digitaldaemon.com...
 "Sean L. Palmer" <palmer.sean verizon.net> wrote in message
 news:b8u7ht$29tg$1 digitaldaemon.com...
 Which is why slices and arrays should be different beasts.

It's not just slices, two handles can refer to the same array.


May 03 2003