digitalmars.D - Walter - Should we use arrays as Null?

AJG (36/36) Jul 28 2005 Hi Walter,

Derek Parnell (34/76) Jul 28 2005 I don't understand your concern. Both these are allowed and both work as

Regan Heath (4/8) Jul 29 2005 I hope not. Someone once mentioned it as a goal Walter had. I've not hea...
Regan Heath (40/45) Jul 29 2005 Why? I mean, I know what you mean: What's the point in having a

AJG (6/17) Jul 29 2005 Something _is_ wrong, IMHO.

Ben Hinkle (11/14) Jul 29 2005 I don't remember what Walter said but I hope he thinks about the options...
AJG (55/122) Jul 29 2005 Sure, they both "work" to a certain extent. But if you try to access a p...

Derek Parnell (59/167) Jul 29 2005 Yes ... but so what? Is that behavior hurting anyone?

AJG (71/146) Jul 29 2005 Well, if we look at it that way, then everything becomes a lot easier, d...

Niko Korhonen (9/14) Jul 29 2005 Do you want all operations on a null array, such as:

AJG (6/17) Jul 29 2005 For starters, yes. Why should objects be different than arrays when they...

Ben Hinkle (7/28) Jul 29 2005 I think you'll have a hard time getting lots of support for that. I much...

AJG (10/26) Jul 29 2005 I agree that there won't be much support for this. I don't suppose it wi...

Ben Hinkle (6/38) Jul 29 2005 no

AJG (14/14) Jul 29 2005 Hi Ben,

Ben Hinkle (24/40) Jul 29 2005 no - an array is two pieces of information: (1) a pointer to the data an...

Shammah Chancellor (14/57) Jul 29 2005 I've been following this, but have as of yet been unable to express my p...

AJG (23/35) Jul 29 2005 If this is so, it is unfortunate. I'm asking Walter to clarify this, tha...

AJG (20/43) Jul 29 2005 Hi,

Ben Hinkle (24/82) Jul 29 2005 Yes - they "have reference semantics" in the sense that they act on the ...

AJG (34/74) Jul 29 2005 Just to make sure I understand:

Ben Hinkle (26/81) Jul 29 2005 yes - aside from the fact that you should dup the "123" before trying to...

AJG (27/86) Jul 30 2005 So then .length is related to slicing? How does the semantics of .length...

Ben Hinkle (15/74) Jul 30 2005 I recommend you pursue some of your ideas where length is manipulated by

AJG (14/66) Jul 30 2005 Would an example do? I may not be an expert regarding slicing, but I cou...

Shammah Chancellor (34/38) Jul 30 2005 Because sometimes it needs to reallocate memory. Why don't you look at ...
Ben Hinkle (14/26) Jul 31 2005 Let me step through some choices that I was hoping you would do. Let's s...

AJG (19/31) Jul 31 2005 I don't think this change in the way arrays operate internally would be

Derek Parnell (20/36) Jul 30 2005 You are wrong here because 'B.someProperty' operates on B not A.

AJG (12/33) Jul 30 2005 Um... I said "except .length" for a reason. That's my very point. That ....

Shammah Chancellor (34/56) Jul 30 2005 No, All others do _NOT_ operate on A. They happen to operate on the sam...

AJG (37/57) Jul 30 2005 You are simply splitting hairs here. You are arguing language semantics....

Shammah Chancellor (80/137) Jul 31 2005 I am not splitting hairs. I gave you a very valid reason why a and b ar...

Carlos Santander (19/29) Jul 30 2005 First of all, I don't agree with AJG: I think D arrays are very well the...

Derek Parnell (30/50) Jul 29 2005 This is where I think we separate. I don't think that D arrays are

AJG (31/69) Jul 29 2005 Well, my "in theory" is actually pretty down-to-earth. I mean reference

Mike Parker (6/20) Jul 30 2005 Wasn't it you who posted elsewhere in this thread that change is good? ;...
Derek Parnell (7/26) Jul 30 2005 I think I have the solution. Rename them. Don't call them arrays. Call t...

Niko Korhonen (22/27) Jul 31 2005 Indeed. I think the array semantics where you can't access a property of...

Derek Parnell (16/33) Aug 01 2005 Agreed. The way I look at it is that a D array variable *contains* a

J Thomas (7/7) Jul 30 2005 so wait, you basically want an array to be a pointer to data containing

AJG (22/30) Jul 30 2005 No. I would like it to be that way, but I know there wouldn't be support...

Derek Parnell (9/18) Jul 30 2005 There might have been be an argument that .reverse and .sort should foll...

Ben Hinkle (6/22) Aug 01 2005 Besides those reasons writing "B.reverse" to me indicates you want to af...

Shammah Chancellor (12/37) Aug 01 2005 Utterly confusing! reserve(b) and B.reverse have nothing in their name ...

Ben Hinkle (4/53) Aug 01 2005 You've lost me. Are you proposing a change to any existing behavior or

Shammah Chancellor (18/54) Aug 01 2005 I wasn't proposing a change at all. I was disagreing with Derek. I thi...

AJG (10/21) Aug 01 2005 IMHO, and for consistency, it should never do COW. If a user wants to do...

Shammah Chancellor (16/36) Aug 01 2005 While I agree with you that it could be annoying, the problem is that ar...

Ben Hinkle (12/77) Aug 01 2005 I didn't read Derek's post as proposing reverse use COW. He was pointing...

Shammah Chancellor (18/102) Aug 01 2005 You're right, he didn't. I was contesting that tolower(b) and b.tolower...

Ben Hinkle (13/134) Aug 01 2005 That is what I'm implying - and that's what many std.string functions do...

Shammah Chancellor (20/159) Aug 01 2005 Exactly. Quite often when I want to replace one thing, I want to replac...

Ben Hinkle (24/56) Aug 01 2005 I don't know if you followed the recent COW/const/inplace performance

Shammah Chancellor (19/78) Aug 01 2005 I think this would be a bad choice. It might be wise with respect to

Ben Hinkle (2/16) Aug 02 2005 There's a link at the bottom of the phobos page for the wiki. I don't kn...

Derek Parnell (16/30) Aug 01 2005 Hi Shammah,

Shammah Chancellor (26/52) Aug 01 2005 No,no I understood that. I'm just being argumentative. I don't agree w...

AJG <AJG_member pathlink.com> writes:

Hi Walter,

This is something that's confused me quite a bit and I think you are the only
one that can settle it for good. The question is whether we should be using null
as a special array value. Maybe it can be broken down to pieces:

1) Why can objects be null but arrays can't (given that _both_ are by-ref)?





IMHO this is inconsistent. The former makes sense, the latter is weird. Another
way of looking at it is: why dumb-down arrays but not objects?

2) Is it a technical limitation (for now)?
3) Is support for "proper" null arrays planned?

I, for one, would _like_ to see support for both null arrays and continued
support for null objects. As Regan has argued (and now I'm a believer), the null
special value is very useful, and we should keep this distinction (vs. empty).
Perhaps you can clarify whether this is going to happen properly or not.

In my view, proper array nulls do _not_ exist. What we have right now is very
confusing because sometimes we can use the null value and sometimes we can't. It
is also fickle because the null value is tied to the pointer. Regan thinks that
you are planning on merging emptiness and existence into one (a bad thing).

Some of the problems (not technically "bugs"):
- array.length = 0 sets the pointer to null.
- static int[0] is not null, but new int[0] is.
- .dup of an empty string (static or not) also sets the pointer to null.
- static arrays can't have null pointers.

4) What exactly does [if (array)] mean (or theoretically should mean)?
- if (array.ptr)
- if (array.length)
- if (array == null)
- if (array is null)
- or some combination thereof?

==============

In short, I think it would be dangerous to use this feature if you are planning
on subtly phasing it out. Could you please shed some light on the situation?

Thanks!
--AJG.

Jul 28 2005

Derek Parnell <derek psych.ward> writes:

On Fri, 29 Jul 2005 05:30:52 +0000 (UTC), AJG wrote:

 Hi Walter,

I know I'm not the big W., but here's my take on this anyhow ;-)

 This is something that's confused me quite a bit and I think you are the only
 one that can settle it for good. The question is whether we should be using null
 as a special array value. Maybe it can be broken down to pieces:
 
 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
 




I don't understand your concern. Both these are allowed and both work as
expected; that is to say in both cases, after the assignment, the variable
does not reference anything. What more than this are you expecting?
 
 IMHO this is inconsistent. The former makes sense, the latter is weird. Another
 way of looking at it is: why dumb-down arrays but not objects?

Huh? I still can't see what you are worried about.

  Object obj = null;   // Makes sense.
  int[] array = null;  // Also makes sense (to me anyhow).

 2) Is it a technical limitation (for now)?

Is *what* a limitation?

 3) Is support for "proper" null arrays planned?
 
 I, for one, would _like_ to see support for both null arrays and continued
 support for null objects. 

Who says that this support is going away?

As Regan has argued (and now I'm a believer), the null
 special value is very useful, and we should keep this distinction (vs. empty).

Absolutely! Both are good and distinct concepts.

 Perhaps you can clarify whether this is going to happen properly or not.
 
 In my view, proper array nulls do _not_ exist. 

But they do. If array.ptr is null, then array is a null array. 

What we have right now is very
 confusing because sometimes we can use the null value and sometimes we can't.

Huh? When can't you use it?

 It is also fickle because the null value is tied to the pointer. 

Huh? Of course it is. What else could it be?

 Regan thinks that
 you are planning on merging emptiness and existence into one (a bad thing).

I don't think that Walter is planning on this.

 Some of the problems (not technically "bugs"):
 - array.length = 0 sets the pointer to null.

This is a bug and Walter has said so. He will fix this.

 - static int[0] is not null, but new int[0] is.

static arrays cannot be null (not reference anything) by their very nature.
A static array must always reference some RAM somewhere.

What do you think that 'new int[0]' should return? 

 - .dup of an empty string (static or not) also sets the pointer to null.

This is because of the bug.

 - static arrays can't have null pointers.

Of course not. The 'static' attribute means that they occupy RAM that is
allocated at compile time.
 
 4) What exactly does [if (array)] mean (or theoretically should mean)?
 - if (array.ptr)
 - if (array.length)
 - if (array == null)
 - if (array is null)
 - or some combination thereof?

It actually means ...

  if (array.ptr !is null || array.length != 0)

which is a bit redundant because we can never have the situation where the
ptr is null and the length is > 0.
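
To make the candidate tests in the list above concrete, here is a minimal sketch that reports each of them separately for three array values. It assumes a current D compiler, whose results may differ from the 2005 compiler discussed in this thread, and describe() is a hypothetical helper, not a Phobos function.

```d
import std.stdio;

// Hypothetical helper: report three separate tests for one array value.
string describe(int[] array)
{
    import std.format : format;
    return format("ptr null: %s, length zero: %s, is null: %s",
                  array.ptr is null, array.length == 0, array is null);
}

void main()
{
    int[] unset = null;            // never given any data
    int[] filled = [1, 2, 3];      // three elements
    int[] slice = filled[0 .. 0];  // zero elements, but still points into filled

    writeln(describe(unset));
    writeln(describe(filled));
    writeln(describe(slice));
}
```

The zero-length slice is the interesting case: it is empty without being null, which is the distinction the thread is arguing about.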

 ==============
 
 In short, I think it would be dangerous to use this feature if you are planning
 on subtly phasing it out. Could you please shed some light on the situation?

Once Walter fixes the bug in which setting the length to zero also clears
the ptr, I think we will have what you want. Hope I've helped.

-- 
Derek
Melbourne, Australia
29/07/2005 4:17:38 PM

Jul 28 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Fri, 29 Jul 2005 16:39:54 +1000, Derek Parnell <derek psych.ward> wrote:
 Regan thinks that
 you are planning on merging emptiness and existence into one (a bad  
 thing).

 I don't think that Walter is planning on this.

I hope not. Someone once mentioned it as a goal Walter had. I've not heard  
 from Big W himself.

Regan

Jul 29 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Fri, 29 Jul 2005 16:39:54 +1000, Derek Parnell <derek psych.ward> wrote:
 - static int[0] is not null, but new int[0] is.

 static arrays cannot be null (not reference anything) by their very  
 nature.
 A static array must always reference some RAM somewhere.

Why? I mean, I know what you mean: What's the point in having a  
non-existent static array. A static array always exists, therefore cannot  
be null. But doesn't that then make:

int[0] a;

illegal?

I thought about this for a sec and decided that no, to make it illegal  
would likely annoy the heck out of a template programmer some time in the  
future.

But, it can be null, can't it? I mean the data pointer, not the array  
'reference'. I'm not sure an 'array reference' even exists for static  
arrays? My impression is that a static array is simply implemented as a  
pointer, the length property which is static is 'macro replaced' at  
compile time.

In which case, the data pointer could be null, right? Statements like  
a.length would be fine, it's macro replaced after all. Statements like  
a[0] = 'a'; would crash, or give array bounds errors, just like any other  
array would.

Maybe I'm missing some secret of their implementation.


 What do you think that 'new int[0]' should return?

Well, at first glance 'null'. You're asking for (0 * int.sizeof) memory  
which is 0 bytes. But, have you tried it in C/C++?

The MSDN documentation states:
   "If size is 0, malloc allocates a zero-length item in the heap and  
returns a valid pointer to that item"

There is nothing in the docs for "new" but a quick experiment showed the  
same behaviour as malloc for the statement "new int[0]".

Ilya wondered immediately how it was possible to have a "zero-length item  
in the heap" so he tried DMC and Cygwin-GCC and found: "both returned at  
least 8 bytes."


If you step back and just look at it at a conceptual level you'd expect the  
statements:

int[0] a;
int[] a = new int[0];

to result in the same thing, surely? I mean 'a' is an instance of an  
'int[0]' in both cases (whatever that is decided to be).

Currently they don't and there appear to be 3 choices:
  - leave it as is, nothing is wrong.
  - make "int[0] a" null.
  - make "new int[0]" non-null.

Regan

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi,

If you step back and just look at it at a conceptual level you'd expect the  
statements:

int[0] a;
int[] a = new int[0];

to result in the same thing, surely? I mean 'a' is an instance of an  
'int[0]' in both cases (whatever that is decided to be).

Exactly.

Currently they don't and there appear to be 3 choices:
  - leave it as is, nothing is wrong.

Something _is_ wrong, IMHO. 

  - make "int[0] a" null.

Two wrongs don't a right make. ;)

  - make "new int[0]" non-null.

Bingo!

Regan

--AJG.

Jul 29 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

 Some of the problems (not technically "bugs"):
 - array.length = 0 sets the pointer to null.

 This is a bug and Walter has said so. He will fix this.

I don't remember what Walter said but I hope he thinks about the options. 
There are three factors involved (that I can see):
1) setting length to/from 0
2) slicing to a 0 length array and appending to a 0 length array
3) the +1 that gets added to every array allocation which makes powers-of-2 
allocations the most inefficient (takes 2x the memory of what you asked for)

Currently item 3 is added because one can slice off the end of an array and 
then ask to grow that. Should 2 behave like 1 or should 1 behave like 2? I 
could imagine a solution where appending to a zero-length array reallocs 
like setting the length from 0 reallocs. In any case there isn't a pain-free 
solution.

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi Derek,

I know I'm not the big W., but here's my take on this anyhow ;-)

 This is something that's confused me quite a bit and I think you are the only
 one that can settle it for good. The question is whether we should be using null
 as a special array value. Maybe it can be broken down to pieces:
 
 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
 




I don't understand your concern. Both these are allowed and both work as
expected; that is to say in both cases, after the assignment, the variable
does not reference anything. What more than this are you expecting?

Sure, they both "work" to a certain extent. But if you try to access a property
on a null object -that's illegal. On a null array, it's not. That's not a null
array. That's a pseudo-null reference that never goes away.

 IMHO this is inconsistent. The former makes sense, the latter is weird. Another
 way of looking at it is: why dumb-down arrays but not objects?

Huh? I still can't see what you are worried about.

Arrays are dumbed-down so that you can do things like:

foreach (char c; null) { /* do something */ }

NULL, in my opinion, is _not_ the same as empty. BUT, the above operation makes
it so. Instead, it should throw an exception at the very least, or even better,
it could be detected at compile-time.

 2) Is it a technical limitation (for now)?

Is *what* a limitation?

The distinction I made before between the nullness of objects (which is
complete), and that of arrays (which is incomplete). I asked this because
perhaps it was due to the way D properties worked (more like functions), or
somesuch, meaning, it was not _intended_ to be that way.

 3) Is support for "proper" null arrays planned?
 
 I, for one, would _like_ to see support for both null arrays and continued
 support for null objects. 

Who says that this support is going away?

If Walter decides emptiness and existence should be one. This already happens in
the language. Maybe it's due to bugs, maybe not. That's why I asked Walter for
his "vision," if you will, regarding arrays and nulls. If array null disappears,
it's likely object null will also disappear. Both of these worry me. But if
indeed they are just bugs, then why doesn't Walter say so?

As Regan has argued (and now I'm a believer), the null
 special value is very useful, and we should keep this distinction (vs. empty).

Absolutely! Both are good and distinct concepts.

In theory, yes. In D, not entirely. Please see below:

 Perhaps you can clarify whether this is going to happen properly or not.
 
 In my view, proper array nulls do _not_ exist. 

But they do. If array.ptr is null, then array is a null array. 

Why should we have to resort to array.ptr for nullness? Why can't the _array
itself_ be null? An object _can_ be null by itself, no need to check an
"object.ptr". In fact, on a null object .ptr would throw. You have to
acknowledge this is a significant difference.

What we have right now is very
 confusing because sometimes we can use the null value and sometimes we can't.

Huh? When can't you use it?

I can't use it when the "bugs" get in my way. And they just so happen to get in
the way a lot. I'm working with databases right now, and essentially there's no
way to have a string represent a DBNULL, because when I dup an empty string, it
_too_ becomes NULL.

 It is also fickle because the null value is tied to the pointer. 

Huh? Of course it is. What else could it be?

It could be, say, a simple boolean. Or, it could be, say, like objects. Objects
don't rely on object.ptr, why should arrays?
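
As an illustration of the "simple boolean" idea, here is a rough sketch of a wrapper that records database NULL in an explicit flag rather than in the array pointer. DbString and its members are hypothetical names invented for this example; nothing like it is proposed elsewhere in the thread.

```d
// Hypothetical wrapper: the flag, not the pointer, records "no value".
struct DbString
{
    char[] text;
    bool isNull = true;   // a default-constructed value stands for database NULL

    static DbString of(const(char)[] s)
    {
        DbString r;
        r.text = s.dup;    // whatever pointer dup yields for "", the flag decides
        r.isNull = false;
        return r;
    }
}

void main()
{
    DbString missing;               // database NULL
    auto empty = DbString.of("");   // present, but zero characters

    assert(missing.isNull);
    assert(!empty.isNull && empty.text.length == 0);
}
```

With the flag kept separately, duplicating an empty string can no longer turn an empty value into a NULL one, at the cost of carrying an extra field around.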

 Regan thinks that
 you are planning on merging emptiness and existence into one (a bad thing).

I don't think that Walter is planning on this.

I certainly hope not, but how can we be sure? This is why I asked.

 Some of the problems (not technically "bugs"):
 - array.length = 0 sets the pointer to null.

This is a bug and Walter has said so. He will fix this.

Just out of curiosity, is there a post that I could read regarding this? I'd
really like to see what he said.

 - static int[0] is not null, but new int[0] is.

static arrays cannot be null (not reference anything) by their very nature.

Their very nature says nothing of nullness. It just means allocate in a
different area of memory.

A static array must always reference some RAM somewhere.

Why? Why can't it reference null? Conceptually, I don't see a problem. But maybe
this is one of the "technical limitations" I was talking about.

What do you think that 'new int[0]' should return? 

It should return a NON-null empty array. In current terminology:

int[] arr = new int[0];
if (arr)        // this should be TRUE.
if (arr.length) // this should be FALSE.

 - .dup of an empty string (static or not) also sets the pointer to null.

This is because of the bug.

Well, this subtle bug renders DB-null impossible because, as it happens, .dups are
fairly common. This is what I said about it being "fickle."

char[] s = ""; // here you have it.
char[] p = s.dup; // now you don't. very fickle.

 - static arrays can't have null pointers.

Of course not. The 'static' attribute means that they occupy RAM that is
allocated at compile time.

This is a technical limitation. Once again, conceptually, it should be able to
point to static null just as well. Perhaps allocate 0-bytes? In other words,
this is a problem with the implementation. The language itself shouldn't be
limited because of this.

 
 4) What exactly does [if (array)] mean (or theoretically should mean)?
 - if (array.ptr)
 - if (array.length)
 - if (array == null)
 - if (array is null)
 - or some combination thereof?

It actually means ...

  if (array.ptr !is null || array.length != 0)

which is a bit redundant because we can never have the situation where the
ptr is null and the length is > 0.

IIRC this was deduced from a single disassembly, wasn't it?
Is it _always_ the same thing? (static/dynamic/associative)?

 ==============
 
 In short, I think it would be dangerous to use this feature if you are planning
 on subtly phasing it out. Could you please shed some light on the situation?

Once Walter fixes the bug in which setting the length to zero also clears
the ptr, I think we will have what you want. Hope I've helped.

And the very important duping "bug" (for DBs). And the inconsistency with static
arrays. And I'm sure I could find some more problems. But first I need to know
whether they are problems in the first place. Only Walter knows...

Cheers,
--AJG.

Jul 29 2005

Derek Parnell <derek psych.ward> writes:

On Fri, 29 Jul 2005 14:19:06 +0000 (UTC), AJG wrote:

 Hi Derek,
 
I know I'm not the big W., but here's my take on this anyhow ;-)

 This is something that's confused me quite a bit and I think you are the only
 one that can settle it for good. The question is whether we should be using null
 as a special array value. Maybe it can be broken down to pieces:
 
 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
 




I don't understand your concern. Both these are allowed and both work as
expected; that is to say in both cases, after the assignment, the variable
does not reference anything. What more than this are you expecting?

 
 Sure, they both "work" to a certain extent. But if you try to access a property
 on a null object -that's illegal. On a null array, it's not. That's not a null
 array. That's a pseudo-null reference that never goes away.

Yes ... but so what? Is that behavior hurting anyone?

Object properties are user-defined and all take 'this' as an automatic
argument, thus they require an instance. Array properties are built-in to
D. The 'array' is the instance. Don't confuse its elements as being
instances of the array. 

Thus 'int[] array;' creates an instance of the array even though it has no
data yet. And because the instance exists, you can use its properties.
There is no inconsistency here. I think you have merged object nullness and
array nullness into the same meaning. But they are not the same thing. A
null object is a placeholder into which you can later store a reference to
an object instance. A null array is an instance of an array that has no
data. 
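
The difference described here can be sketched as follows, assuming a current D compiler. Box and iterations() are hypothetical names used only for illustration.

```d
class Box
{
    int size() { return 1; }
}

// Count how many times foreach runs over an array.
size_t iterations(int[] array)
{
    size_t n = 0;
    foreach (x; array)
        ++n;
    return n;
}

void main()
{
    int[] array = null;
    assert(array.length == 0);        // property access on a null array is allowed
    assert(iterations(array) == 0);   // iterating over it runs zero times

    Box box = null;
    assert(box is null);
    // Calling box.size() here would go through a null reference,
    // so it is deliberately left out.
}
```

The array variable always carries its own length, so its properties can be read even when it refers to no data; the object reference has nothing behind it until an instance is assigned.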

 IMHO this is inconsistent. The former makes sense, the latter is weird. Another
 way of looking at it is: why dumb-down arrays but not objects?

Huh? I still can't see what you are worried about.

 
 Arrays are dumbed-down so that you can do things like:
 
 foreach (char c; null) { /* do something */ }
 
 NULL, in my opinion, is _not_ the same as empty. BUT, the above operation makes
 it so. Instead, it should throw an exception at the very least, or even better,
 it could be detected at compile-time.

You use the phrase 'dumbed-down' whereas I see this as a smart thing. And
just because the coder does some weirdo foreach() statement, doesn't mean
the language is wrong. And by the way, your code example does produce a
compiler error - " foreach: void* is not an aggregate type". To get it to
compile you have to use 'foreach(char c; cast(char[])null) {};' which shows
to me that somebody with a big stick needs to chat with the coder.

 2) Is it a technical limitation (for now)?

Is *what* a limitation?

 
 The distinction I made before between the nullness of objects (which is
 complete), and that of arrays (which is incomplete). I asked this because
 perhaps it was due to the way D properties worked (more like functions), or
 somesuch, meaning, it was not _intended_ to be that way.

But arrays are not classes or objects. So what if they are both reference
types. They are still not the same beasties.

 3) Is support for "proper" null arrays planned?
 
 I, for one, would _like_ to see support for both null arrays and continued
 support for null objects. 

Who says that this support is going away?

 
 If Walter decides emptiness and existence should be one. This already happens in
 the language. Maybe it's due to bugs, maybe not. That's why I asked Walter for
 his "vision," if you will, regarding arrays and nulls. If array null disappears,
 it's likely object null will also disappear. Both of these worry me. But if
 indeed they are just bugs, then why doesn't Walter say so?

That could be said about lots of things ;-)

As Regan has argued (and now I'm a believer), the null
 special value is very useful, and we should keep this distinction (vs. empty).

Absolutely! Both are good and distinct concepts.

 
 In theory, yes. In D, not entirely. Please see below:

I distinctly remember reading a note from Walter saying that he was
surprised that setting the length to zero also nulled the pointer. He has
code in Phobos that assumes that this is not the right behavior.
 
 Perhaps you can clarify whether this is going to happen properly or not.
 
 In my view, proper array nulls do _not_ exist. 

But they do. If array.ptr is null, then array is a null array. 

 
 Why should we have to resort to array.ptr for nullness? Why can't the _array
 itself_ be null? 

Because it's like saying, why can't an object instance be null. The array IS
an instance. A null array means something different to a null object.

An object _can_ be null by itself, no need to check an
 "object.ptr". In fact, on a null object .ptr would throw. You have to
 acknowledge this is a significant difference.

Yes it is a difference. So what? Learn it and move on. This is D, not
C/C++. 

What we have right now is very
 confusing because sometimes we can use the null value and sometimes we can't.

Huh? When can't you use it?

 
 I can't use it when the "bugs" get in my way. And they just so happen to get in
 the way a lot. I'm working with databases right now, and essentially there's no
 way to have a string represent a DBNULL, because when I dup an empty string, it
 _too_ becomes NULL.

Yep. Been there, done that. I just wish he'd fix this bug. It's very easy to
fix.

 It is also fickle because the null value is tied to the pointer. 

Huh? Of course it is. What else could it be?

 
 It could be, say, a simple boolean. Or, it could be, say, like objects. Objects
 don't rely on object.ptr, why should arrays?

Because they are arrays and not class instances.

 Regan thinks that
 you are planning on merging emptiness and existence into one (a bad thing).

I don't think that Walter is planning on this.

 
 I certainly hope not, but how can we be sure? This is why I asked.
 
 Some of the problems (not technically "bugs"):
 - array.length = 0 sets the pointer to null.

This is a bug and Walter has said so. He will fix this.

 
 Just out of curiosity, is there a post that I could read regarding this? I'd
 really like to see what he said.

Yes, but I don't know how to search for it.
 
 - static int[0] is not null, but new int[0] is.

static arrays cannot be null (not reference anything) by their very nature.

 
 Their very nature says nothing of nullness. It just means allocate in a
 different area of memory.

By 'static' do you mean non-dynamic arrays or single-instance arrays?
For example, which of these lines are static to you?

void func()
{
  int[] a;
  int[1] b;
  static int[1] c;
}

To me, I only call array 'c' a static array. The array 'a' is a
dynamic(-length) array and array 'b' is a fixed-length array. But array 'a'
and 'b' are not single-instance arrays. After checking with the usage in D
itself, it seems that D uses static ambiguously when it comes to arrays.

A static array must always reference some RAM somewhere.

 
 Why? Why can't it reference null? Conceptually, I don't see a problem. But maybe
 this is one of the "technical limitations" I was talking about.

Because static arrays are allocated RAM at compile time and they reference
themselves. Because they exist they can't be null. Given ...

  static int[1] x;

You will find that ...

  x.ptr == &x

And because &x will always return a non-null, then x.ptr is always
non-null.
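
A small sketch of that identity, assuming a current D compiler. Since x.ptr has type int* while &x has type int[1]*, the comparison needs a cast to compile; pointsAtItself() is a hypothetical helper name.

```d
// Hypothetical helper: true when a fixed-size array's .ptr is the array's own address.
bool pointsAtItself()
{
    static int[1] x;

    // Cast &x (an int[1]*) to int* so both sides have the same pointer type.
    return (x.ptr is cast(int*) &x)
        && (x.ptr is &x[0])
        && (x.ptr !is null);
}

void main()
{
    assert(pointsAtItself());
}
```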
 

-- 
Derek Parnell
Melbourne, Australia
30/07/2005 2:09:05 AM

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi Derek,

 Sure, they both "work" to a certain extent. But if you try to access a property
 on a null object -that's illegal. On a null array, it's not. That's not a null
 array. That's a pseudo-null reference that never goes away.

Yes ... but so what? Is that behavior hurting anyone?

Well, if we look at it that way, then everything becomes a lot easier, doesn't
it? Whether it hurts anyone or not is not the way to build a language. There are
things that are correct, and those that are not. Arrays, as it stands, _break_
reference semantics. I don't know whether this hurts anyone or not, but it is
certainly inconsistent, and it is my view simply incorrect.

Object properties are user-defined and all take 'this' as an automatic
argument, thus they require an instance. Array properties are built-in to
D. The 'array' is the instance.

Technically speaking, this is half-right:





Semantically speaking, I think this is wrong. 

int[] arr    // This is the reference.
= new int[0] // _This_ is the array.

Don't confuse its elements as being
instances of the array.

I never did. Elements are fine the way they are. But by your logic, perhaps we
should be able to do this:




Or wouldn't you say this is wrong? Does it "hurt" anyone? Nah. In fact, it will
help by preventing those annoying ArrayOutOfBounds thingamajigs. 

Thus 'int[] array;' creates an instance of the array even though it has no
data yet. And because the instance exists, you can use its properties.

This just doesn't make sense. int[] array creates a _reference_. That's the very
definition of a reference. That's why arrays are reference-types. It's
essentially a nicer version of a pointer.

There is no inconsistency here. I think you have merged object nullness and
array nullness into the same meaning. But they are not the same thing. A
null object is a placeholder into which you can later store a reference to
an object instance. A null array is an instance of an array that has no
data. 

If this is so, then arrays can't be called reference types. That's not what
references do. Frankly, I wouldn't know what the heck to call arrays if these
are the semantics we're supposed to follow.

You use the phrase 'dumbed-down' where as I see this as a smart thing. And
just because the coder does some weirdo foreach() statement, doesn't mean
the language is wrong. 

It's not whether it's a smart-thing or a dumb thing. I see it as being a dumbing
down, but that's not the point. The point is that it muddies the distinction
between _empty_ and _non-existent_. Conceptually, you can't iterate thru a
non-existent array. It doesn't exist. It should be a bug.

Conceptually, you _can_ iterate thru an empty array. It exists and has no
elements, thus no iteration would happen, but the construct is valid.

With this "smart" feature, the two are fused into one. Empty and Non-existent
can _both_ be iterated thru. They both produce the same result: 0 iterations.
This I think is incorrect.

0 iterations == 0 elements // correct
0 iterations == null       // incorrect

And by the way, your code example does produce a
compiler error - " foreach: void* is not an aggregate type". To get it to
compile you have to use 'foreach(char c; cast(char[])null) {};' which shows
to me that somebody with a big stick needs to chat with the coder.

Sorry, my bad. I don't have access to DMD. But you know what I meant:

char[] nullArray = null;
foreach (char c; nullArray) { /* do something */ } 

 2) Is it a technical limitation (for now)?

Is *what* a limitation?

 
 The distinction I made before between the nullness of objects (which is
 complete), and that of arrays (which is incomplete). I asked this because
 perhaps it was due to the way D properties worked (more like functions), or
 somesuch, meaning, it was not _intended_ to be that way.

But arrays are not classes or objects. So what if they are both reference
types. They are still not the same beasties.

Certainly they are not the same. But they both have the same "nature" as you put
it, -references. As it is, arrays are breaking reference behaviour too, as my
example above showed. Or do you not agree that arrays are reference types
either?

I distinctly remember reading a note from Walter saying that he was
surprised that setting the length to zero also nulled the pointer. He has
code in Phobos that assumes that this is not the right behavior.

This is all very circumstantial, but oh well...

 Perhaps you can clarify whether this is going to happen properly or not.
 
 In my view, proper array nulls do _not_ exist. 

But they do. If array.ptr is null, then array is a null array. 

 
 Why should we have to resort to array.ptr for nullness? Why can't the _array
 itself_ be null? 

Because it's like saying, why can't an object instance be null. The array IS
an instance. A null array means something different to a null object.

Once again, this view is incorrect. An array can be both a reference and an
instance. 

int[] arr    // This is the reference.
= new int[0] // _This_ is the array.

An object _can_ be null by itself, no need to check an
 "object.ptr". In fact, on a null object .ptr would throw. You have to
 acknowledge this is a significant difference.

Yes it is a difference. So what? Learn it and move on. This is D, not
C/C++. 

Why learn and move on from something that is clearly wrong? I'd rather fix it,
thank you very much ;)

Yep. Been there, done that. I just wish he'd fix this bug. It's very easy to
fix.

It's very frustrating. I would sue this bug if I could. :p

 It is also fickle because the null value is tied to the pointer. 

Huh? Of course it is. What else could it be?

 
 It could be, say, a simple boolean. Or, it could be, say, like objects. Objects
 don't rely on object.ptr, why should arrays?

Because they are arrays and not class instances.

Assuming the _wrong_ semantics stay in place, why couldn't we do something like:
array.isNull or array.exists as a simple boolean check, instead of the more
complicated array.ptr that is riddled with "bugs?"

That way we separate the implementation details (how the array.ptr is handled
internally), from the semantics (whether the array exists or not).
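A minimal sketch of what such a check might look like, written as free functions. The names `isNull` and `isEmpty` are hypothetical (they are not part of D or Phobos), and the sketch assumes a current D compiler. It only wraps the pointer test behind a name, so existence and emptiness can be asked about separately:

```d
// Hypothetical helpers; neither name exists in D or Phobos.
bool isNull(T)(T[] array)
{
    // Existence: the array has no backing pointer at all.
    return array.ptr is null;
}

bool isEmpty(T)(T[] array)
{
    // Emptiness: zero elements, whether or not a pointer is set.
    return array.length == 0;
}

void main()
{
    int[] missing = null;
    int[] present = new int[3];

    assert(isNull(missing) && isEmpty(missing));
    assert(!isNull(present) && !isEmpty(present));
}
```

Because the helpers still read `.ptr`, they inherit whatever the implementation does to that pointer; they separate the spelling from the implementation detail, not the behaviour.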

 Regan thinks that
 you are planning on merging emptiness and existence into one (a bad thing).

I don't think that Walter is planning on this.

 
 I certainly hope not, but how can we be sure? This is why I asked.
 
 Some of the problems (not technically "bugs"):
 - array.length = 0 sets the pointer to null.

This is a bug and Walter has said so. He will fix this.

 
 Just out of curiosity, is there a post that I could read regarding this? I'd
 really like to see what he said.

Yes, but I don't know how to search for it.

Well, perhaps he can clarify his position now.

----

Re:Static
<snip>
By 'static' do you mean non-dynamic arrays or single-instance arrays?

<snip>

It doesn't matter. I don't understand why you bring technical implementation
details to the discussion, when I am talking solely about the concept. 

Conceptually, whether an array is static or not has no effect on whether the
array can exist or not. Static changes allocation semantics, _not_ existence
semantics. That's my point.

Now, if you tell me: We can't have that because of a technical limitation, then
I would understand. However, the point in your memory argument can be fixed the
way Regan and others have mentioned. 

Cheers,
--AJG.

Jul 29 2005

Niko Korhonen <niktheblak hotmail.com> writes:

AJG wrote:
 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
 




Do you want all operations on a null array, such as:




to segfault (to throw a NullPointerException in managed environments 
parlance), like they do on a null object reference?

-- 
Niko Korhonen
SW Developer

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi,

In article <dcclvb$v0i$1 digitaldaemon.com>, Niko Korhonen says...
AJG wrote:
 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
 




Do you want all operations on a null array, such as:




to segfault (to throw a NullPointerException in managed environments 
parlance), like they do on a null object reference?

For starters, yes. Why should objects be different than arrays when they are
both reference types? This is inconsistent IMHO.

Thanks,
--AJG.

Jul 29 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"AJG" <AJG_member pathlink.com> wrote in message 
news:dcde0p$1kop$1 digitaldaemon.com...
 Hi,

 In article <dcclvb$v0i$1 digitaldaemon.com>, Niko Korhonen says...
AJG wrote:
 1) Why can objects be null but arrays can't (given that _both_ are 
 by-ref)?





Do you want all operations on a null array, such as:




to segfault (to throw a NullPointerException in managed environments
parlance), like they do on a null object reference?

 For starters, yes. Why should objects be different than arrays when they 
 are
 both reference types? This is inconsistent IMHO.

I think you'll have a hard time getting lots of support for that. I much 
prefer the current behavior and I bet there is lots of existing D code that 
assumes one can test the length of an array at any time. Since an array is 
not an object I see no problem with the "inconsistency" - an array is an 
array.

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi Ben,

Do you want all operations on a null array, such as:




to segfault (to throw a NullPointerException in managed environments
parlance), like they do on a null object reference?

 For starters, yes. Why should objects be different than arrays when they 
 are
 both reference types? This is inconsistent IMHO.

I think you'll have a hard time getting lots of support for that. I much 
prefer the current behavior and I bet there is lots of existing D code that 
assumes one can test the length of an array at any time. Since an array is 
not an object I see no problem with the "inconsistency" - an array is an 
array. 

I agree that there won't be much support for this. I don't suppose it will
change. But ideally that's what the behaviour should be. Say you had no D code
written at the moment, would you support the change?

On the other hand, would you support access to object properties that don't
require an instance from a null reference? It's the same thing, isn't it? Yet
aren't those illegal at the moment? (don't have DMD at hand).

Cheers,
--AJG.

"What is popular is not always right; what is right is not always popular."

Jul 29 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"AJG" <AJG_member pathlink.com> wrote in message 
news:dcdqig$1us6$1 digitaldaemon.com...
 Hi Ben,

Do you want all operations on a null array, such as:




to segfault (to throw a NullPointerException in managed environments
parlance), like they do on a null object reference?

 For starters, yes. Why should objects be different than arrays when they
 are
 both reference types? This is inconsistent IMHO.

I think you'll have a hard time getting lots of support for that. I much
prefer the current behavior and I bet there is lots of existing D code 
that
assumes one can test the length of an array at any time. Since an array is
not an object I see no problem with the "inconsistency" - an array is an
array.

 I agree that there won't be much support for this. I don't suppose it will
 change. But ideally that's what the behaviour should be. Say you had no D 
 code
 written at the moment, would you support the change?

no

 On the other hand, would you support access to object properties that 
 don't
 require an instance from a null reference?

no (assuming you aren't referring to static class properties)

 It's the same thing, isn't it?

no

Yet aren't those illegal at the moment? (don't have DMD at hand).

yes

 Cheers,
 --AJG.

 "What is popular is not always right; what is right is not always 
 popular."

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi Ben,

Ok, I don't think I said exactly what I meant before. Let's look at this piece
by piece:

1) Arrays are ("in theory") reference types.
2) Objects are reference types.
3) Arrays are not objects.
4) So, even though Arrays and Objects are different, they share (or should)
reference semantics.

I believe most of us can agree up to here.

My overall point is that D is not keeping its promise regarding Arrays obeying
reference semantics. Whether this is good or not is debatable, but at least it
should be noted. Do you agree that D's arrays break reference semantics?

Thanks,
--AJG.

Jul 29 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"AJG" <AJG_member pathlink.com> wrote in message 
news:dcdtq5$21q6$1 digitaldaemon.com...
 Hi Ben,

 Ok, I don't think I said exactly what I meant before. Let's look at this 
 piece
 by piece:

 1) Arrays are ("in theory") reference types.

no - an array is two pieces of information: (1) a pointer to the data and 
(2) a length. The pointer can be considered a reference but the length 
information is definitely not manipulated by reference. For example

  int[] a,b;
  a.length = 10;
  b = a;
  b.length = 100;
  assert( a.length == 10 );

If arrays had "pure" reference semantics in the same way objects do then one 
would expect a.length == 100. In casual conversations one often says arrays 
have reference semantics but the unspoken assumption is that one is talking 
about the data pointer. This can confuse people who aren't used to D array 
semantics.

 2) Objects are reference types.
 3) Arrays are not objects.

these I agree with.

 4) So, even though Arrays and Objects are different, they share (or 
 should)
 reference semantics.

naturally I disagree given 1).

 I believe most of us can agree up to here.

 My overall point is that D is not keeping its promise regarding Arrays 
 obeying
 reference semantics. Whether this is good or not is debatable, but at 
 least it
 should be noted. Do you agree that D's arrays break reference semantics?

The length information is not manipulated with reference semantics. I think 
this is a good design choice that shouldn't be changed. I agree it is 
different than object behavior but that's well worth the benefits of the 
current system. If there are statements in the D doc that say "arrays have 
reference semantics" I think they should be changed to be more accurate and 
say something like "the array data has reference semantics". It's common to 
ignore the length field when you are casually talking about arrays.

Jul 29 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

I've been following this, but have as of yet been unable to express my problem
with this whole issue. My feelings line up with yours, Ben.

Arrays are not pointers, nor are they reference types.  In C, pointers happen to
be able to be dereferenced with the array index operator, but that's a side
effect of implementation.  If something is a true array, I think there is a
reasonable expectation that the array always points to some data.

array != null; //should always be true; Especially in the case of D where the
array is really a structure with a ptr and a length.


However, if for some reason you need the reference semantics, those are not
denied to you.  You're free to do this:

int* array = new int[100];

My 2 cents
-Sha

In article <dcdur3$232d$1 digitaldaemon.com>, Ben Hinkle says...
"AJG" <AJG_member pathlink.com> wrote in message 
news:dcdtq5$21q6$1 digitaldaemon.com...
 Hi Ben,

 Ok, I don't think I said exactly what I meant before. Let's look at this 
 piece
 by piece:

 1) Arrays are ("in theory") reference types.

no - an array is two pieces of information: (1) a pointer to the data and 
(2) a length. The pointer can be considered a reference but the length 
information is definitely not manipulated by reference. For example

  int[] a,b;
  a.length = 10;
  b = a;
  b.length = 100;
  assert( a.length == 10 );

If arrays had "pure" reference semantics in the same way objects do then one 
would expect a.length == 100. In casual conversations one often says arrays 
have reference semantics but the unspoken assumption is that one is talking 
about the data pointer. This can confuse people who aren't used to D array 
semantics.

 2) Objects are reference types.
 3) Arrays are not objects.

these I agree with.

 4) So, even though Arrays and Objects are different, they share (or 
 should)
 reference semantics.

naturally I disagree given 1).

 I believe most of us can agree up to here.

 My overall point is that D is not keeping its promise regarding Arrays 
 obeying
 reference semantics. Whether this is good or not is debatable, but at 
 least it
 should be noted. Do you agree that D's arrays break reference semantics?

The length information is not manipulated with reference semantics. I think 
this is a good design choice that shouldn't be changed. I agree it is 
different than object behavior but that's well worth the benefits of the 
current system. If there are statements in the D doc that say "arrays have 
reference semantics" I think they should be changed to be more accurate and 
say something like "the array data has reference semantics". It's common to 
ignore the length field when you are casually talking about arrays.

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi,

I've been following this, but have as of yet been unable to express my problem
with this whole issue. My feelings line up with yours, Ben.

Arrays [...] are not reference types.

If this is so, it is unfortunate. I'm asking Walter to clarify this, that is
all.

In C, pointers happen to
be able to be dereferenced with the array index operator, but that's a side
effect of implementation.  If something is a true array, I think there is a
reasonable expectation that the array always points to some data.

This is not a reasonable expectation. We are talking about two things here:
a) Existence.
b) Emptiness.

Even in C, you can express both. I'm asking whether Walter thinks we should do
that in D or not.

Some examples (of the 3 possible cases):

char[] string = "hi"; // non-null non-empty array.
char[] empty  = "";   // non-null empty array.
char[] cnull  = null; // null array.

array != null; //should always be true; Especially in the case of D where the
array is really a structure with a ptr and a length.

This is not the case in D. array != null is sometimes false, because it's
comparing the pointer. This is the very thing that allows an array to be
non-existent (a true, NULL array). Thus, that was my original question to
Walter, whether we should rely on this behaviour or if he's planning on phasing
it out.
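The three cases can be told apart in code. The sketch below assumes a current D compiler, where string literals are immutable, `is` compares pointer and length, and `==` compares contents; the 2005 compiler discussed in this thread may have behaved differently for `!=`. The helper `iterationCount` is a hypothetical name used only for illustration:

```d
// Hypothetical helper: counts how many times foreach runs over s.
size_t iterationCount(const(char)[] s)
{
    size_t n = 0;
    foreach (char c; s)
        ++n;
    return n;
}

void main()
{
    string text  = "hi";   // non-null, non-empty
    string empty = "";     // non-null, empty
    string none  = null;   // null

    // `is` looks at pointer and length, so it separates null from empty.
    assert(text  !is null);
    assert(empty !is null);
    assert(none  is null);

    // .length and == look only at the contents, so empty and null agree.
    assert(empty.length == 0 && none.length == 0);
    assert(empty == none);

    // Iterating a null array is legal and simply runs zero times.
    assert(iterationCount(none) == 0);
    assert(iterationCount(text) == 2);
}
```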

However, if for some reason you need the reference semantics, those are not
denied to you.  You're free to do this:

int* array = new int[100];

Yes, there's nothing like regressing a couple of decades ;) I think one of D's
design goals was to make pointer use unnecessary. Using a pointer you lose
safety, lose info (.length, etc.) and lose functionality. This is not a valid
solution, IMHO.

Cheers,
--AJG.

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi,

Well, this is certainly an interesting development. So, to recap, arrays in D
are not reference types. I was always under the impression that they were. This
is very saddening to me.

Is this correct? Walter, could you clarify this?

 1) Arrays are ("in theory") reference types.

no - an array is two pieces of information: (1) a pointer to the data and 
(2) a length. The pointer can be considered a reference but the length 
information is definitely not manipulated by reference. For example

What about .dup, .sort, .reverse, .sizeof?
Do those have reference semantics or not? 

If arrays had "pure" reference semantics in the same way objects do then one 
would expect a.length == 100. In casual conversations one often says arrays 
have reference semantics but the unspoken assumption is that one is talking 
about the data pointer. This can confuse people who aren't used to D array 
semantics.

Yes, array semantics are definitely weird. I was hoping they were references
and that .length was simply buggy, but perhaps it's by design. In addition, IMO
this "unspoken assumption" is not mentioned anywhere in the docs.

 My overall point is that D is not keeping its promise regarding Arrays 
 obeying
 reference semantics. Whether this is good or not is debatable, but at 
 least it
 should be noted. Do you agree that D's arrays break reference semantics?

The length information is not manipulated with reference semantics. I think 
this is a good design choice that shouldn't be changed.

Why is it a good design choice? Forget about legacy for a second. Wouldn't it be
much simpler, more consistent and less confusing to make arrays pure reference
types? It would eliminate a lot of the various special cases that we have to
deal with given the current convoluted semantics. It would also align their
behaviour to that of objects, much like a struct's behaviour is aligned to that
of a primitive.

I agree it is 
different than object behavior but that's well worth the benefits of the 
current system.

Like what? Which benefits?

If there are statements in the D doc that say "arrays have 
reference semantics" I think they should be changed to be more accurate and 
say something like "the array data has reference semantics". It's common to 
ignore the length field when you are casually talking about arrays.

Or perhaps the arrays themselves could be changed to reference types? ;) 

Cheers,
--AJG

Jul 29 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"AJG" <AJG_member pathlink.com> wrote in message 
news:dce24h$272t$1 digitaldaemon.com...
 Hi,

 Well, this is certainly an interesting development. So, to recap, arrays 
 in D
 are not reference types. I was always under the impression that they were. 
 This
 is very saddening to me.

 Is this correct? Walter, could you clarify this?

 1) Arrays are ("in theory") reference types.

no - an array is two pieces of information: (1) a pointer to the data and
(2) a length. The pointer can be considered a reference but the length
information is definitely not manipulated by reference. For example

 What about .dup, .sort, .reverse, .sizeof?
 Do those have reference semantics or not?

Yes - they "have reference semantics" in the sense that they act on the data 
(though in the case of .dup and .sizeof the reference/value semantics is 
irrelevant).

If arrays had "pure" reference semantics in the same way objects do then 
one
would expect a.length == 100. In casual conversations one often says 
arrays
have reference semantics but the unspoken assumption is that one is 
talking
about the data pointer. This can confuse people who aren't used to D array
semantics.

 Yes, array semantics are definitely weird. I was hoping they were 
 references
 and that .length was simply buggy, but perhaps it's by design. In 
 addition, IMO
 this "unspoken assumption" is not mentioned anywhere in the docs.

The first sentence of http://www.digitalmars.com/d/arrays.html section 
Dynamic Arrays says "Dynamic arrays consist of a length and a pointer to the 
array data." I agree, though, that the doc needs to emphasize this more. I 
added some feedback to the Wiki about arrays asking for examples 
illustrating how array assignment works.

 My overall point is that D is not keeping its promise regarding Arrays
 obeying
 reference semantics. Whether this is good or not is debatable, but at
 least it
 should be noted. Do you agree that D's arrays break reference semantics?

The length information is not manipulated with reference semantics. I 
think
this is a good design choice that shouldn't be changed.

 Why is it a good design choice? Forget about legacy for a second. Wouldn't 
 it be
 much simpler, more consistent and less confusing to make arrays pure 
 reference
 types? It would eliminate a lot of the various special cases that we have 
 to
 deal with given the current convoluted semantics. It would also align 
 their
 behaviour to that of objects, much like a struct's behaviour is aligned to 
 that
 of a primitive.

It would be very annoying to have to check for null before asking if an 
array length is zero. Plus the whole design of slicing would need to be 
redone and probably would lose much of the efficiency it has today. I view 
an array as much closer to a struct than an object: an array is just like a 
struct with a pointer field and a length field. That's the simplest 
description of what an array is. Comparing them to objects is the wrong 
analogy.
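The struct analogy can be sketched directly. `IntSlice` and `fill` below are hypothetical names for a hand-written stand-in, not how the compiler actually declares dynamic arrays; the sketch only illustrates that copying such a struct copies the length by value while both copies keep pointing at the same data:

```d
// Hypothetical stand-in for a dynamic array: a pointer plus a length.
struct IntSlice
{
    int*   ptr;
    size_t length;

    // Writes through the pointer, so every copy of the struct sees the change.
    void fill(int value)
    {
        foreach (i; 0 .. length)
            ptr[i] = value;
    }
}

void main()
{
    int[] storage = new int[4];
    IntSlice a = IntSlice(storage.ptr, storage.length);
    IntSlice b = a;      // copies both fields by value

    b.length = 2;        // only b's length field changes
    b.fill(7);           // touches memory that a also points to

    assert(a.length == 4);
    assert(a.ptr is b.ptr);
    assert(storage == [7, 7, 0, 0]);
}
```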

I agree it is
different than object behavior but that's well worth the benefits of the
current system.

 Like what? Which benefits?

see above - checking all the time for null would be very annoying. Almost 
all the time with arrays one cares if the length is zero and making people 
check for null before asking that question is error-prone. See Java for 
examples of making people check for null before asking for the length.

If there are statements in the D doc that say "arrays have
reference semantics" I think they should be changed to be more accurate 
and
say something like "the array data has reference semantics". It's common 
to
ignore the length field when you are casually talking about arrays.

 Or perhaps the arrays themselves could be changed to reference types? ;)

Sure - one can change anything in D if the tradeoffs are worth it. I happen 
to believe D's dynamic array semantics are an excellent balance of 
tradeoffs.

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi,

 What about .dup, .sort, .reverse, .sizeof?
 Do those have reference semantics or not?

Yes - they "have reference semantics" in the sense that they act on the data 
(though in the case of .dup and .sizeof the reference/value semantics is 
irrelevant).

Just to make sure I understand:

char[] A = "123";
char[] B = A;
B.reverse;
// B will be 321
// A will be 321 also.
// correct?

BUT:

char[] A = "123";
char[] B = A;
B.length = 2;
// B will be 12
// A will remain 123.
// correct?


If this is true, then it seems rather arbitrary to me that .length should break
reference semantics. Why not keep it in line to how the rest work? (Especially
since it's not related to the benefits you talked about before).

The first sentence of http://www.digitalmars.com/d/arrays.html section 
Dynamic Arrays says "Dynamic arrays consist of a length and a pointer to the 
array data." I agree, though, that the doc needs to emphasize this more. I 
added some feedback to the Wiki about arrays asking for examples 
illustrating how array assignment works.

Ok. This would be an improvement.

 Why is it a good design choice?


It would be very annoying to have to check for null before asking if an 
array length is zero. Plus the whole design of slicing would need to be 
redone and probably would lose much of the efficiency it has today. 

Ok. This is a valid point. However, that's not to say the problem is
insurmountable. Solutions do exist. In fact, I have thought of a couple of
possible solutions, but I'm afraid it'll scare everybody so for now this would
be "something to think about." I just want to say that change, even if it breaks
things, can be very good. It shouldn't be automatically ruled out in fear.

I view 
an array as much closer to a struct than an object: an array is just like a 
struct with a pointer field and a length field. That's the simplest 
description of what an array is. Comparing them to objects is the wrong 
analogy.

Except that _all_ properties other than .length operate via reference semantics.
Structs wouldn't do that. Objects would.

I agree it is
different than object behavior but that's well worth the benefits of the
current system.

 Like what? Which benefits?

see above - checking all the time for null would be very annoying. Almost 
all the time with arrays one cares if the length is zero and making people 
check for null before asking that question is error-prone. 

It wouldn't be error prone. Perhaps you mean exceptions would be thrown, and
that's fine, but there wouldn't be unnoticed errors. But in general I agree with
you, slicing would lose its "magic" having to check for nulls.

See Java for 
examples of making people check for null before asking for the length.

You can also learn from their mistakes and avoid them.

If there are statements in the D doc that say "arrays have
reference semantics" I think they should be changed to be more accurate 
and
say something like "the array data has reference semantics". It's common 
to
ignore the length field when you are casually talking about arrays.

 Or perhaps the arrays themselves could be changed to reference types? ;)

Sure - one can change anything in D if the tradeoffs are worth it. I happen 
to believe D's dynamic array semantics are an excellent balance of 
tradeoffs. 

I think the semantics could use a little rethinking and especially a bit of
clarification.

Cheers,
--AJG.

Jul 29 2005

Ben Hinkle <Ben_member pathlink.com> writes:

In article <dce4up$2cbc$1 digitaldaemon.com>, AJG says...
Hi,

 What about .dup, .sort, .reverse, .sizeof?
 Do those have reference semantics or not?

Yes - they "have reference semantics" in the sense that they act on the data 
(though in the case of .dup and .sizeof the reference/value semantics is 
irrelevant).

Just to make sure I understand:

char[] A = "123";
char[] B = A;
B.reverse;
// B will be 321
// A will be 321 also.
// correct?

yes - aside from the fact that you should dup the "123" before trying to modify
it since "123" is put in read-only memory.
Reverse acts in-place because it is a method of the array type - like sorting is
in-place.

BUT:

char[] A = "123";
char[] B = A;
B.length = 2;
// B will be 12
// A will remain 123.
// correct?

yes
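Both cases can be checked side by side. This sketch assumes a current D compiler, so it differs from the snippets above in two ways: it uses `int[]` rather than a string literal (literals are immutable), and it calls `reverse` from `std.algorithm.mutation`, since the built-in `.reverse` property used in this thread is, as far as I know, no longer available:

```d
import std.algorithm.mutation : reverse;

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;          // b aliases a's elements

    reverse(b);           // in place, through the shared pointer
    assert(a == [3, 2, 1]);
    assert(b == [3, 2, 1]);

    b.length = 2;         // shrinks b's own length field only
    assert(b == [3, 2]);
    assert(a == [3, 2, 1]);
}
```

Shrinking `b` does not reallocate, so `b` still points at the start of `a`'s data; only its view of how many elements it has is different.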

If this is true, then it seems rather arbitrary to me that .length should break
reference semantics. Why not keep it in line to how the rest work? (Especially
since it's not related to the benefits you talked about before).

It is not arbitrary. There are advantages to the current design. I don't see why
you say it is not related since it would be silly to have length do something
different if there weren't benefits to making length special.

 Why is it a good design choice?


It would be very annoying to have to check for null before asking if an 
array length is zero. Plus the whole design of slicing would need to be 
redone and probably would lose much of the efficiency it has today. 

Ok. This is a valid point. However, that's not to say the problem is
insurmountable. Solutions do exist. In fact, I have thought of a couple of
possible solutions, but I'm afraid it'll scare everybody so for now this would
be "something to think about." I just want to say that change, even if it breaks
things, can be very good. It shouldn't be automatically ruled out in fear.

Where is the reaction in fear? I only see people trying to explain the current
design and its advantages. I said I doubt a solution exists that would have the
benefits of the current design while having reference semantics (if even
reference semantics for length would be desirable). If you want to present some
ideas that would be great - do whatever you want and enjoy (remember we're all
doing this for fun).

I view 
an array as much closer to a struct than an object: an array is just like a 
struct with a pointer field and a length field. That's the simplest 
description of what an array is. Comparing them to objects is the wrong 
analogy.

Except that _all_ properties other than .length operate via reference semantics.
Structs wouldn't do that. Objects would.

uhh - the struct has a pointer to the data. The pointer part has reference
semantics and the length part doesn't. A struct can easily have methods that
dereference the pointer and modify shared state. I do it all the time with the
MinTL containers and pretty much any struct that stores a pointer.

I agree it is
different than object behavior but that's well worth the benefits of the
current system.

 Like what? Which benefits?

see above - checking all the time for null would be very annoying. Almost 
all the time with arrays one cares if the length is zero and making people 
check for null before asking that question is error-prone. 

It wouldn't be error prone. Perhaps you mean exceptions would be thrown, and
that's fine, but there wouldn't be unnoticed errors. But in general I agree with
you, slicing would lose its "magic" having to check for nulls.

By error-prone I mean the programmer will introduce bugs into the code by
forgetting to check for null every time they want to know if an array has any
content (meaning non-zero length). 

See Java for 
examples of making people check for null before asking for the length.

You can also learn from their mistakes and avoid them.

That's what D has now - it is avoiding the mistakes of Java by not requiring all
those annoying null checks. Plus slicing is fast by not requiring memory
allocations. Note in Java the length of an array is read-only so the whole
question about length having value/reference semantics doesn't apply.
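A small sketch of the slicing point, assuming a current D compiler. It shows that a slice is just another pointer/length pair into the same memory block, which is why taking one needs no allocation, and that slicing a null array needs no null check:

```d
void main()
{
    int[] data  = [10, 20, 30, 40, 50];
    int[] slice = data[1 .. 4];          // new (pointer, length) pair, no copy

    assert(slice.length == 3);
    assert(slice.ptr is data.ptr + 1);   // points into the same block

    slice[0] = 99;                       // visible through data as well
    assert(data[1] == 99);

    int[] none = null;
    assert(none.length == 0);            // no null check needed
    assert(none[0 .. 0].length == 0);    // slicing a null array stays legal
}
```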

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi Ben,

If this is true, then it seems rather arbitrary to me that .length should break
reference semantics. Why not keep it in line to how the rest work? (Especially
since it's not related to the benefits you talked about before).

It is not arbitrary. There are advantages to the current design. I don't see why
you say it is not related since it would be silly to have length do something
different if there weren't benefits to making length special.

So then .length is related to slicing? How do the semantics of .length affect
slicing? Or perhaps you meant other benefits?

 Why is it a good design choice?


It would be very annoying to have to check for null before asking if an 
array length is zero. Plus the whole design of slicing would need to be 
redone and probably would lose much of the efficiency it has today. 

Ok. This is a valid point. However, that's not to say the problem is
insurmountable. Solutions do exist. In fact, I have thought of a couple of
possible solutions, but I'm afraid it'll scare everybody so for now this would
be "something to think about." I just want to say that change, even if it breaks
things, can be very good. It shouldn't be automatically ruled out in fear.

Where is the reaction in fear? I only see people trying to explain the current
design and its advantages. I said I doubt a solution exists that would have the
benefits of the current design while having reference semantics (if even
reference semantics for length would be desirable). If you want to present some
ideas that would be great - do whatever you want and enjoy (remember we're all
doing this for fun).

The general impression I get is that as soon as something creates the
possibility of breaking existing code, then there is backlash. This would be
fine for the embedded C language that runs medical heart devices. But for a
language that isn't even out the door, it's disheartening (haha, no pun intended
;). Just my 2 cents.

I view 
an array as much closer to a struct than an object: an array is just like a 
struct with a pointer field and a length field. That's the simplest 
description of what an array is. Comparing them to objects is the wrong 
analogy.

Except that _all_ properties other than .length operate via reference semantics.
Structs wouldn't do that. Objects would.

uhh - the struct has a pointer to the data. The pointer part has reference
semantics and the length part doesn't. A struct can easily have methods that
dereference the pointer and modify shared state. I do it all the time with the
MinTL containers and pretty much any struct that stores a pointer.

SomeObject A = new SomeObject;
SomeObject B = A;
B.SomeProperty; // Operates on A.

SomeStruct A;
SomeStruct B = A;
B.SomeProperty; // Operates on B.

int[] A = new int[5];
int[] B = A;
B.SomeProperty; // Operates on A; 
// _Except_ if it's .length.

This behaviour seems much more in line with Objects than with Structs, to me.
That's why I don't see how .length should break the current semantics.

I agree it is
different than object behavior but that's well worth the benefits of the
current system.

 Like what? Which benefits?

see above - checking all the time for null would be very annoying. Almost 
all the time with arrays one cares if the length is zero and making people 
check for null before asking that question is error-prone. 

It wouldn't be error prone. Perhaps you mean exceptions would be thrown, and
that's fine, but there wouldn't be unnoticed errors. But in general I agree with
you, slicing would lose its "magic" having to check for nulls.

By error-prone I mean the programmer will introduce bugs into the code by
forgetting to check for null every time they want to know if an array has any
content (meaning non-zero length). 

Ok.

See Java for 
examples of making people check for null before asking for the length.

You can also learn from their mistakes and avoid them.

That's what D has now - it is avoiding the mistakes of Java by not requiring all
those annoying null checks. Plus slicing is fast by not requiring memory
allocations. Note in Java the length of an array is read-only so the whole
question about length having value/reference semantics doesn't apply.

I'm not suggesting making .length read-only. I'm suggesting making it operate on
the same data it has a pointer to. Just like .sort or .reverse would. The way I
see it, if you explicitly want to make a copy of the data, that's why there is
dup. Why should .length secretly call .dup sometimes, and sometimes not?

Cheers,
--AJG.

Jul 30 2005

Ben Hinkle <Ben_member pathlink.com> writes:

In article <dcgc3q$13i9$1 digitaldaemon.com>, AJG says...
Hi Ben,

If this is true, then it seems rather arbitrary to me that .length should break
reference semantics. Why not keep it in line with how the rest work? (Especially
since it's not related to the benefits you talked about before).

It is not arbitrary. There are advantages to the current design. I don't see why
you say it is not related since it would be silly to have length do something
different if there weren't benefits to making length special.

So then .length is related to slicing? How does the semantics of .length affect
slicing? Or perhaps you meant other benefits?

I recommend you pursue some of your ideas where length is manipulated by
reference and follow the dependencies to see how different dynamic arrays (and,
yes, slicing) would be. In particular I recommend you learn more about slicing.
I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't
really gotten experience with D arrays as they exist now.

 Why is it a good design choice?


It would be very annoying to have to check for null before asking if an 
array length is zero. Plus the whole design of slicing would need to be 
redone and probably would lose much of the efficiency it has today. 

Ok. This is a valid point. However, that's not to say the problem is
insurmountable. Solutions do exist. In fact, I have thought of a couple of
possible solutions, but I'm afraid it'll scare everybody so for now this would
be "something to think about." I just want to say that change, even if it breaks
things, can be very good. It shouldn't be automatically ruled out in fear.

Where is the reaction in fear? I only see people trying to explain the current
design and its advantages. I said I doubt a solution exists that would have the
benefits of the current design while having reference semantics (if even
reference semantics for length would be desirable). If you want to present some
ideas that would be great - do whatever you want and enjoy (remember we're all
doing this for fun).

The general impression I get is that as soon as something creates the
possibility of breaking existing code, then there is backlash. This would be
fine for the embedded C language that runs medical heart devices. But for a
language that isn't even out the door, it's disheartening (haha, no pun intended
;). Just my 2 cents.

For my case when I said essentially "much code will break" it wasn't meant as a
backlash - just as a fact you would have to address. A proposed change that
breaks lots of code is harder to push through than one that doesn't as a simple
practical matter more than any emotional attachment to old code.

I view 
an array as much closer to a struct than an object: an array is just like a 
struct with a pointer field and a length field. That's the simplest 
description of what an array is. Comparing them to objects is the wrong 
analogy.

Except that _all_ properties other than .length operate via reference semantics.
Structs wouldn't do that. Objects would.

uhh - the struct has a pointer to the data. The pointer part has reference
semantics and the length part doesn't. A struct can easily have methods that
dereference the pointer and modify shared state. I do it all the time with the
MinTL containers and pretty much any struct that stores a pointer.

SomeObject A = new SomeObject;
SomeObject B = A;
B.SomeProperty; // Operates on A.

SomeStruct A;
SomeStruct B = A;
B.SomeProperty; // Operates on B.

int[] A = new int[5];
int[] B = A;
B.SomeProperty; // Operates on A; 
// _Except_ if it's .length.

This behaviour seems much more in line with Objects than with Structs, to me.
That's why I don't see how .length should break the current semantics.

Please think about structs that contain pointers.

[snip]
Why should .length secretly call .dup sometimes, and sometimes not?

Here I agree that the documentation should be more explicit in describing when
setting the length reallocates and when it doesn't. If it is compiler-dependent
the doc should say so.

Jul 30 2005

AJG <AJG_member pathlink.com> writes:

Hi Ben,

So then .length is related to slicing? How does the semantics of .length affect
slicing? Or perhaps you meant other benefits?

I recommend you pursue some of your ideas where length is manipulated by
reference and follow the dependencies to see how different dynamic arrays (and,
yes, slicing) would be. In particular I recommend you learn more about slicing.
I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't
really gotten experience with D arrays as they exist now.

Would an example do? I may not be an expert regarding slicing, but I could see a
discrete problem if you point it out.

The general impression I get is that as soon as something creates the
possibility of breaking existing code, then there is backlash. This would be
fine for the embedded C language that runs medical heart devices. But for a
language that isn't even out the door, it's disheartening (haha, no pun intended
;). Just my 2 cents.

For my case when I said essentially "much code will break" it wasn't meant as a
backlash - just as a fact you would have to address. A proposed change that
breaks lots of code is harder to push through than one that doesn't as a simple
practical matter more than any emotional attachment to old code.

This kind of thinking only works ceteris paribus. But if a solution that breaks
less code is not as good, then the language loses. I think at this point the
language can afford such changes before it becomes like C, where a header file
was needed to introduce mere booleans.

I view 
an array as much closer to a struct than an object: an array is just like a 
struct with a pointer field and a length field. That's the simplest 
description of what an array is. Comparing them to objects is the wrong 
analogy.

Except that _all_ properties other than .length operate via reference semantics.
Structs wouldn't do that. Objects would.

uhh - the struct has a pointer to the data. The pointer part has reference
semantics and the length part doesn't. A struct can easily have methods that
dereference the pointer and modify shared state. I do it all the time with the
MinTL containers and pretty much any struct that stores a pointer.

SomeObject A = new SomeObject;
SomeObject B = A;
B.SomeProperty; // Operates on A.

SomeStruct A;
SomeStruct B = A;
B.SomeProperty; // Operates on B.

int[] A = new int[5];
int[] B = A;
B.SomeProperty; // Operates on A; 
// _Except_ if it's .length.

This behaviour seems much more in line with Objects than with Structs, to me.
That's why I don't see how .length should break the current semantics.

Please think about structs that contain pointers.

Even if we see arrays as structs (which I don't, but for the sake of the
argument), it doesn't explain why .length should break the other properties'
semantics. If there's an obvious reason I'm blind to, could you point it out?
I'm a little dense sometimes.

[snip]
Why should .length secretly call .dup sometimes, and sometimes not?

Here I agree that the documentation should be more explicit in describing when
setting the length reallocates and when it doesn't. If it is compiler-dependent
the doc should say so.

Ok.

Cheers,
--AJG.

Jul 30 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dcgkt5$1b4i$1 digitaldaemon.com>, AJG says...

Even if we see arrays as structs (which I don't, but for the sake of the
argument), it doesn't explain why .length should break the other properties'
semantics. If there's an obvious reason I'm blind to, could you point it out?
I'm a little dense sometimes.


Because sometimes it needs to reallocate memory.  Why don't you look at `man
realloc`:

     The realloc() function tries to change the size of the allocation pointed
     to by ptr to size, and return ptr.  If there is not enough room to
     enlarge the memory allocation pointed to by ptr, realloc() creates a new
     allocation, copies as much of the old data pointed to by ptr as will fit
     to the new allocation, frees the old allocation, and returns a pointer to
     the allocated memory.  realloc() returns a NULL pointer if there is an
     error, and the allocation pointed to by ptr is still valid.

The difference is that D cannot let it free the original, because if it did then
other references to the data would break. So it dups the data if a realloc is
going to allocate memory in a different area. I'm not sure of the exact
implementation details in D, but that's my basic understanding.

So for recap:
If length increases, and there is not enough space available to grow the array,
it allocates another block of memory and copies the data. It leaves the original
pointer intact then and lets the garbage collector decide if anybody else still
has references to it.

This may seem confusing, but it's about array slicing being fast. If you don't
want these mixed semantics, then always dup your data.
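
A rough C sketch of the recap above, for illustration only. IntArray and
int_array_grow are made-up names, not anything from the D runtime, and the
sketch assumes growing always relocates (a real implementation may be able to
extend in place). The old block is deliberately left alone, standing in for a
garbage collector that reclaims it once nothing references it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a D dynamic array: a length and a pointer, by value. */
typedef struct {
    size_t length;
    int   *ptr;
} IntArray;

/* Grow *a to new_length by allocating a fresh zero-filled block and copying.
   The old block is neither freed nor modified, so any other IntArray that
   still points at it keeps working. Assumption: growth always relocates. */
static void int_array_grow(IntArray *a, size_t new_length) {
    assert(new_length >= a->length);
    if (new_length == a->length) {
        return;                       /* nothing to do */
    }
    int *fresh = calloc(new_length, sizeof(int));
    assert(fresh != NULL);
    if (a->length > 0) {
        memcpy(fresh, a->ptr, a->length * sizeof(int));
    }
    a->ptr = fresh;                   /* only this one struct is repointed */
    a->length = new_length;
}
```

After b = a followed by growing b, a still sees its original block at its
original length; writes through b no longer show up through a.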

(P.S. You mention C++ reference semantics when you're talking about these arrays.
But this isn't even legal in C++:
int foo[10];
foo = null;
You really can't compare the two languages in this aspect. I think D arrays are
a big step forward when compared to C arrays, which literally couldn't find
their ass with both hands.)

-Sha

Jul 30 2005

Ben Hinkle <Ben_member pathlink.com> writes:

In article <dcgkt5$1b4i$1 digitaldaemon.com>, AJG says...
Hi Ben,

So then .length is related to slicing? How does the semantics of .length affect
slicing? Or perhaps you meant other benefits?

I recommend you pursue some of your ideas where length is manipulated by
reference and follow the dependencies to see how different dynamic arrays (and,
yes, slicing) would be. In particular I recommend you learn more about slicing.
I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't
really gotten experience with D arrays as they exist now.

Would an example do? I may not be an expert regarding slicing, but I could see a
discrete problem if you point it out.

Let me step through some choices that I was hoping you would do. Let's start by
thinking about what an array with reference-based length would look like. It
would either be a pointer to today's dynamic array (a ptr and a length) or it
would be a pointer to one memory block with the length stored either at the
front or end of the array data. How would slicing work for those two
implementations? For the first slicing would have to allocate memory to store
the new ptr and new length. For the second slicing would have to be a different
type since it is impossible to store the length for the slice in the middle of
the original source array. So that's why I suggested you think through your
initial suggestion and work out the impact on slicing and arrays in general.
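
For contrast, here is a hedged C sketch of slicing with today's layout, where
the ptr and the length are held by value. IntArray and int_array_slice are
hypothetical names chosen for the illustration. The point is only that a slice
is a new (ptr, length) pair computed with pointer arithmetic, so nothing is
allocated and the slice can start in the middle of the source data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a D dynamic array: a length and a pointer, by value. */
typedef struct {
    size_t length;
    int   *ptr;
} IntArray;

/* Return a view of elements [lo, hi) of a. No memory is allocated: the
   result shares a's data and merely carries its own pointer and length. */
static IntArray int_array_slice(IntArray a, size_t lo, size_t hi) {
    assert(lo <= hi && hi <= a.length);
    IntArray s;
    s.length = hi - lo;
    s.ptr = a.ptr ? a.ptr + lo : NULL;  /* avoid arithmetic on a null pointer */
    return s;
}
```

With either of the two reference-based layouts described above, the slice's
own length would need somewhere to live, which is where the extra allocation
or the separate slice type comes from.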

But to be honest I would still prefer the current behavior where the length
information is always available without having to check for null first - even if
you could somehow make the rest of D remain the same as today.

Jul 31 2005

AJG <AJG_member pathlink.com> writes:

Hi Ben,

Let me step through some choices that I was hoping you would do. Let's start by
thinking about what an array with reference-based length would look like. It
would either be a pointer to today's dynamic array (a ptr and a length) or it
would be a pointer to one memory block with the length stored either at the
front or end of the array data. How would slicing work for those two
implementations? For the first slicing would have to allocate memory to store
the new ptr and new length. For the second slicing would have to be a different
type since it is impossible to store the length for the slice in the middle of
the original source array. So that's why I suggested you think through your
initial suggestion and work out the impact on slicing and arrays in general.

I don't think this change in the way arrays operate internally would be
necessary. What about simply using the current data pointer as it is to
implement reference semantics? A null pointer means the reference is null; and
vice-versa.

The problem I keep hearing comes when trying to re-size (specifically, enlarge)
an array by reference. So then what it all comes down to re: .length is the
inability of realloc() to guarantee that the pointer it returns is the same one
it receives. Is this correct?

But to be honest I would still prefer the current behavior where the length
information is always available without having to check for null first - even
if you could somehow make the rest of D remain the same as today.

I understand this concern, and it is a valid one. However, at this point D is
trying to have its cake and eat it too: It wants to have null arrays, but not
have to go thru null checks. The result is a bit confusing, IMHO. Moreover, it
is buggy. Worst of all, it is not well documented.

This combination of factors leads me to think something should be done.

Frankly, from the docs I can't make out what the semantics of arrays are
supposed to be. That was why I asked the original question: should we or
shouldn't we treat arrays as null? I guess maybe not even Walter knows ;) ?

Cheers,
--AJG.

Jul 31 2005

Derek Parnell <derek psych.ward> writes:

On Sat, 30 Jul 2005 17:07:06 +0000 (UTC), AJG wrote:


[snip]

 
 SomeObject A = new SomeObject;
 SomeObject B = A;
 B.SomeProperty; // Operates on A.
 
 SomeStruct A;
 SomeStruct B = A;
 B.SomeProperty; // Operates on B.
 
 int[] A = new int[5];
 int[] B = A;
 B.SomeProperty; // Operates on A; 
 // _Except_ if it's .length.
 
 This behaviour seems much more in line with Objects than with Structs, to me.
 That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. 
A simple proof is this ...

 int[] A = new int[5];
 int[] B = A;
 A.length = 4;
 writefln("%d", B.length);  // displays 5.
 
In your example, it *appears* to operate on A (the 8-byte array structure)
because B and A have the same values. That is A.ptr == B.ptr and A.length
== B.length. 
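
The same proof can be modelled in C, as a sketch only. IntArray,
int_array_new and int_array_shrink are hypothetical names, and the struct is
an assumed stand-in for the ptr/length pair, not D's actual runtime layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a D dynamic array: a length and a pointer, by value. */
typedef struct {
    size_t length;
    int   *ptr;
} IntArray;

/* Allocate a zero-filled array of n ints, roughly like `new int[n]`. */
static IntArray int_array_new(size_t n) {
    IntArray a;
    a.length = n;
    a.ptr = n ? calloc(n, sizeof(int)) : NULL;
    assert(n == 0 || a.ptr != NULL);
    return a;
}

/* Shrinking rewrites only the length field of the one struct passed in. */
static void int_array_shrink(IntArray *a, size_t new_length) {
    assert(new_length <= a->length);
    a->length = new_length;
}
```

Copying the struct with B = A and then shrinking A to 4 leaves B's length at
5, while both ptr fields still point at the same block.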

We just have to admit that arrays in D are not the classical array
definition and are really a different type of thing altogether. Then get to
learn the rules of D 'arrays'. If you want arrays to behave like objects,
then maybe you can write an array class.

-- 
Derek Parnell
Melbourne, Australia
31/07/2005 8:26:46 AM

Jul 30 2005

AJG <AJG_member pathlink.com> writes:

Hi Derek,

 int[] A = new int[5];
 int[] B = A;
 B.SomeProperty; // Operates on A; 
 // _Except_ if it's .length.
 
 This behaviour seems much more in line with Objects than with Structs, to me.
 That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. 
A simple proof is this ...

 int[] A = new int[5];
 int[] B = A;
 A.length = 4;
 writefln("%d", B.length);  // displays 5.
 
In your example, it *appears* to operate on A (the 8-byte array structure)
because B and A have the same values. That is A.ptr == B.ptr and A.length
== B.length.

Um... I said "except .length" for a reason. That's my very point. That .length
is the exception. All others operate on A.

We just have to admit that arrays in D are not the classical array
definition and are really a different type of thing altogether. Then get to
learn the rules of D 'arrays'. If you want arrays to behave like objects,
then maybe you can write an array class.

First of all, this would throw efficiency out the window. Second, let me quote
you a little of the D manifesto:

[Taken from "The D Programming Language" written by Walter Bright]
[Arrays Section]

"Arrays are enhanced from being little more than an alternative syntax for a
pointer into first class objects."

That's, ahem, "First Class Objects," for those that missed it.

Cheers,
--AJG.

Jul 30 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dch28c$1nrj$1 digitaldaemon.com>, AJG says...
Hi Derek,

 int[] A = new int[5];
 int[] B = A;
 B.SomeProperty; // Operates on A; 
 // _Except_ if it's .length.
 
 This behaviour seems much more in line with Objects than with Structs, to me.
 That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. 
A simple proof is this ...

 int[] A = new int[5];
 int[] B = A;
 A.length = 4;
 writefln("%d", B.length);  // displays 5.
 
In your example, it *appears* to operate on A (the 8-byte array structure)
because B and A have the same values. That is A.ptr == B.ptr and A.length
== B.length.

Um... I said "except .length" for a reason. That's my very point. That .length
is the exception. All others operate on A.

No, all others do _NOT_ operate on A. They happen to operate on the same data
that A points to. A is a struct with an int and a ptr; obviously changing B's
ptr or B's length does not affect A. You're thinking about D arrays all wrong.
That's what Derek was getting at. A and B are two separate objects which happen
to be able to have references to the same data. For efficiency's sake both the
length and the ptr are assigned by value. Think of it this way in C, if you
have this structure:

struct Array {  int length; void* ptr; } a, b; 

a.ptr = malloc(100);
b = a;

What does this do? These are the semantics of D arrays. A and B are distinct
structures, and if you allocate more memory for b then it's not going to change
A. As you can see this is not the same as reference semantics at all, otherwise
A's ptr would change as well. If you want reference semantics you are free to
use an array handle. But the way D arrays are handled is not mystical or
inconsistent. They're perfectly consistent with themselves, and if you
understand how they operate (which is not hard) then you won't make mistakes.
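
To show why element-level operations look like they "operate on A", here is a
small C sketch. IntArray and int_array_reverse are made-up names for the
illustration; the reverse is only a stand-in for an in-place property such as
.reverse, not D's implementation of it.

```c
#include <stddef.h>

/* Hypothetical model of a D dynamic array: a length and a pointer, by value. */
typedef struct {
    size_t length;
    int   *ptr;
} IntArray;

/* Reverse the elements in place. The struct is passed by value, but the
   swaps go through the pointer, so every struct sharing the data sees them. */
static void int_array_reverse(IntArray a) {
    if (a.length < 2) {
        return;
    }
    for (size_t i = 0, j = a.length - 1; i < j; ++i, --j) {
        int tmp = a.ptr[i];
        a.ptr[i] = a.ptr[j];
        a.ptr[j] = tmp;
    }
}
```

Reversing through b changes what a sees because both point at one block, yet
assigning to b.length afterwards touches b's struct only.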

As for your other issue, where array nullness and length == 0 are being
converged, I do not think this is an issue. length == 0 is the definition of a
null set (arrays in CS seem to be more in line with sets, dunno why they're
named as they are). But if you want to be consistent with terminology,
technically a null array is an array with all elements set to null. Can you
show me an example where it matters if length == 0 and arr.ptr == null do not
denote the same thing?

-Sha

Jul 30 2005

AJG <AJG_member pathlink.com> writes:

Hi,

Um... I said "except .length" for a reason. That's my very point. That .length
is the exception. All others operate on A.

No, All others do _NOT_ operate on A.  They happen to operate on the same data
that A points to.

You are simply splitting hairs here. You are arguing language semantics. The
fact of the matter is that for all practical purposes, EXCEPT for .length,
arrays in D are by reference. This means that for all practical purposes, EXCEPT
for .length, B operates on A. It doesn't matter if it's because of the pointer
(an implementation, system-dependent, gory detail) or because of any other
reason.

If assigning an array _immediately_ copied the data, then what you said would be true.
But it doesn't, because (a) that would be inefficient, and (b) that would remove
_all_ reference semantics.

Therefore, as it is, reference semantics are broken when it comes to .length.

<snip>
 RE: Arrays as structs.

This is where _you_ are wrong. Arrays are not structs. Arrays do not share the
semantics of structs. Arrays share _implementation details_ with structs, and
that's _it_.

Didn't you see the quote from the D language doc? It clearly says "First-Class
Objects." Not structs. Not primitives. Not pointers.

If you, however, equate that with structs, that's fine. But I certainly do not.

They're 
perfectly consistent with themselves,

This means absolutely nothing. A bug can be perfectly consistent with itself and
it is still a bug. To be meaningful, they would have to be consistent with the
rest of the language. Or perhaps, consistent with another part of the language,
like, say, Objects. 

and if you understand how they operate
(which is not hard) then 
you won't make mistakes.

It's not about making mistakes. Sure, I can just as well avoid a function in a
library that is buggy, and I'll avoid a mistake. That's not the point. If
something is broken, then it needs to be fixed. If Walter could perhaps clarify
the semantics of arrays, then we would get somewhere.

As for your other issue, where array nullness and length == 0 being converged, I
do not think this is an issue. length == 0 is the definition of a null set

So? What I would like to express is _No Set_.

(arrays in CS seem to be more in line with sets, dunno why they're named as
they are). But if you want to be consistent with terminology, technically a
null array is an array with all elements set to null. Can you show me an
example where it matters if length == 0 and arr.ptr == null do not denote the
same thing?

When you are returning fields from a database, for instance. If you've ever
dealt with a DB, you would know fields can be NULL, meaning no value. This is
different than "", which means explicitly the empty string. It is very difficult
to do this because of certain bugs which meld .length == 0 and .ptr == null.

They are not the same thing. Not semantically. Not technically, at the moment,
except for the "bugs." That's why I'm asking Walter whether he _plans_ on
merging the two into one. If that's his vision, which would be unfortunate, then
those things aren't "bugs" at all, but rather the intended design.


Cheers,
--AJG.

Jul 30 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dchgkl$23v5$1 digitaldaemon.com>, AJG says...
Hi,

Um... I said "except .length" for a reason. That's my very point. That .length
is the exception. All others operate on A.

No, All others do _NOT_ operate on A.  They happen to operate on the same data
that A points to.

You are simply splitting hairs here. You are arguing language semantics. The
fact of the matter is that for all practical purposes, EXCEPT for .length,
arrays in D are by reference. This means that for all practical purposes, EXCEPT
for .length, B operates on A. It doesn't matter if it's because of the pointer
(an implementation, system-dependent, gory detail) or because of any other
reason.

I am not splitting hairs. I gave you a very valid reason why a and b are not
references, not even theoretically. They happen to have a reference member that,
in some cases, will point to the same data. YOU are in full control over when
that happens. If that's not what you intended, then you should be using
references to the ARRAY, rather than using multiple arrays which have references
to the same data.

I might ask you this: What MAGIC would you like to happen with arrays? What
you want is not possible without some kind of magic. Try this example on for
size, from classic C:

int* a = malloc(100 * sizeof(int));
int* b = a;

b = realloc( b, 1000 * sizeof(int) );

Guess what, a is most likely now a bad reference. Is this what you would like D
to do? Probably not, you probably want 'a' to point to the new array of length
1000. Do you want the compiler to magically handle this for you?

Would you like length to be read only? Forcing us to call b = new int[], and
then manually code up the data copy to resize the array? Starting to sound like
C.... What a pain arrays were. And a still didn't change automatically to where
b is pointing now.

If assigning an array _immediately_ copied the data, then what you said would be true.
But it doesn't, because (a) that would be inefficient, and (b) that would remove
_all_ reference semantics.

Therefore, as it is, reference semantics are broken when it comes to .length.

There are no reference semantics when it comes to arrays. Maybe what you want
is D to automagically do a Copy-on-Write. Any time an array is set to a
reference of another array a flag could get turned on, and when you use it as
an lvalue and that flag is on, it could dup the array. But that's silly since

b = new int[100]; is perfectly legal in D, and would result in a double memory
access if you ever tried to assign to the array. Wonder what kind of magic would
have to be done to fix this case.

IMHO, better to let the programmer specify when he wants a and b to point at the
same data.

<snip>
 RE: Arrays as structs.

This is where _you_ are wrong. Arrays are not structs. Arrays do not share the
semantics of structs. Arrays share _implementation details_ with structs, and
that's _it_.

Didn't you see the quote from the D language doc? It clearly says "First-Class
Objects." Not structs. Not primitives. Not pointers.

If you, however, equate that with structs, that's fine. But I certainly do not.

You can't use a language to its full potential if you don't know
implementation details. There will always be ambiguities of when references are
by value, by ref, or whatever else. As the saying goes: the language is in the
details.

Here's a good example for you, from a VB.NET project I just inherited:

If arr.Length - arr.Replace(",", "").Length <> 17 Then
'error out

What's the big deal? It's only one line of code, must be just as good as
counting the number of commas in the array....

They're 
perfectly consistent with themselves,

This means absolutely nothing. A bug can be perfectly consistent with itself and
it is still a bug. To be meaningful, they would have to be consistent with the
rest of the language. Or perhaps, consistent with another part of the language,
like, say, Objects. 

and if you understand how they operate
(which is not hard) then 
you won't make mistakes.

It's not about making mistakes. Sure, I can just as well avoid a function in a
library that is buggy, and I'll avoid a mistake. That's not the point. If
something is broken, then it needs to be fixed. If Walter could perhaps clarify
the semantics of arrays, then we would get somewhere.

As for your other issue, where array nullness and length == 0 being converged, I
do not think this is an issue. length == 0 is the definition of a null set

So? What I would like to express is _No Set_.

Not Set?

(arrays in CS seem to be more in line with sets, dunno why they're named as
they are). But if you want to be consistent with terminology, technically a
null array is an array with all elements set to null. Can you show me an
example where it matters if length == 0 and arr.ptr == null do not denote the
same thing?

When you are returning fields from a database, for instance. If you've ever
dealt with a DB, you would know fields can be NULL, meaning no value. This is
different than "", which means explicitly the empty string. It is very difficult
to do this because of certain bugs which meld .length == 0 and .ptr == null.

I see your point, but any kind of attempt to do that would be abusing the array.
There are laws against array abuse in most countries these days. </sarcasm>

Almost every database API in existence deals with that by having special
objects.

so you have this:

static char[0] DBNull; in your database module;

then

char[] foo;
foo = dbCommand.executeScalar( );

if( foo is DBNull )
  // I'm not sure if the .ptr prop is needed here. Last I heard, if you just
  // use the array name it defaults to the ptr.
  ... // oh noes, the field was null!
else
  ... // oh good ..



They are not the same thing. Not semantically. Not technically, at the moment,
except for the "bugs." That's why I'm asking Walter whether he _plans_ on
merging the two into one.

They should never be the same thing. But there's a gotcha: if .ptr is null,
then length should always be 0. The other way around is not necessarily true.
Just because length == 0, the ptr isn't necessarily null. This should be the
case when the array was at one point allocated, and then length was reduced. It
should be that way for efficiency.

That however is not useful for your example of DBNulls. It would be silly to
allocate some space and then just not use it and say that's when somebody
entered something, and it was nothing.
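
A hedged C sketch of the two states being argued about. CharArray and the
helper functions are hypothetical names; the sketch only models "ptr is null"
versus "length is zero" and makes no claim about how any D compiler actually
treats the two.

```c
#include <stddef.h>

/* Hypothetical model of a D char[]: a length and a pointer, by value. */
typedef struct {
    size_t      length;
    const char *ptr;
} CharArray;

/* "No value at all", e.g. a NULL database field: no storage, length 0. */
static CharArray char_array_null(void) {
    CharArray a = { 0, NULL };
    return a;
}

/* An explicitly empty string: length 0, but ptr refers to real storage. */
static CharArray char_array_empty(void) {
    static const char storage[1] = { '\0' };
    CharArray a = { 0, storage };
    return a;
}

static int char_array_is_null(CharArray a)  { return a.ptr == NULL; }
static int char_array_is_empty(CharArray a) { return a.length == 0; }
```

In this model a null array is always empty, but an empty array is not
necessarily null, which matches the gotcha described above.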

 If that's his vision, which would be unfortunate, then
those things aren't "bugs" at all, but rather the intended design.

What 'things'?  Are you talking about the .ptr value being the same for two
arrays?

Jul 31 2005

Carlos Santander <csantander619 gmail.com> writes:

AJG wrote:
 
 I'm not suggesting making .length read-only. I'm suggesting making it operate on
 the same data it has a pointer to. Just like .sort or .reverse would. The way I
 see it, if you explicitly want to make a copy of the data, that's why there is
 dup. Why should .length secretly call .dup sometimes, and sometimes not?
 
 Cheers,
 --AJG.
 
 

First of all, I don't agree with AJG: I think D arrays are very good the 
way they are now.

There's something, though, and correct me if I'm wrong, but I think 
array.length doesn't go hand in hand with COW.

char [] a;
a.length = 3;
foo(a);
void foo(char [] b)
{
	b[0] = 'f';    // 1
	b.length = 5;  // 2
}

COW says that to do 1 you have to dup first, because you don't own the 
array, but when you do 2, b is automatically dupped. So, my point is 
that to be consistent, maybe resizing should also require dupping.
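
For what it's worth, a runnable version of the scenario, assuming current D semantics. callerView is a made-up helper that just reports what the caller's array looks like afterwards.

```d
// Mirrors the example above: writes through the parameter, then grows it.
void foo(char[] b)
{
    b[0] = 'f';   // 1: writes into the storage shared with the caller
    b.length = 5; // 2: only changes the local copy of (pointer, length)
}

// Runs the scenario and returns the caller's array afterwards.
char[] callerView()
{
    char[] a = "abc".dup; // three mutable chars
    foo(a);
    return a;
}
```

The caller sees the element write from step 1, but its own length stays at 3 whatever step 2 does to b.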

Am I right? Does it make sense?

-- 
Carlos Santander Bernal

Jul 30 2005

Derek Parnell <derek psych.ward> writes:

On Fri, 29 Jul 2005 18:50:45 +0000 (UTC), AJG wrote:

 Hi Ben,
 
 Ok, I don't think I said exactly what I meant before. Let's look at this piece
 by piece:
 
 1) Arrays are ("in theory") reference types.

This is where I think we separate. I don't think that D arrays are
reference types in the same manner as objects. I think they are value types
in that they always have two fields; a pointer and a length. D arrays are
more like a predefined struct. Your phrase "in theory", depends on whose
theory you are talking about. 

 2) Objects are reference types.

Okay.

 3) Arrays are not objects.

True.

 4) So, even though Arrays and Objects are different, they share (or should)
 reference semantics.

I assume at this point that you are talking about arrays as defined in some
computer science book rather than how they are implemented in D.
 
 I believe most of us can agree up to here.

Apparently not ;-)

 My overall point is that D is not keeping its promise regarding Arrays obeying
 reference semantics. 

"Promise"? Where is that written down?

Whether this is good or not is debatable, but at least it
 should be noted. Do you agree that D's arrays break reference semantics?

I suppose so. But it doesn't worry me because it is a pragmatic
implementation that makes coding clearer (IMO) and improves performance.
I'm not sorry that D doesn't have text-book arrays, in that case.

In your previous example ...




 
 Semantically speaking, I think this is wrong. 

I've adjusted my thinking when using D. To me, after the assignment 'b =
a', I see that 'a' and 'b' are distinct arrays that happen to share the
same data. This may be seen as twisting words or playing with semantics,
but it works for me.

And by the way, the 'b.length = 2' statement does not cause 'b' to become
another instance. It still shares the same data as 'a'. You only get a new
instance when the length increases.
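
A small sketch of the shrinking case, assuming current D semantics (shrinkKeepsSharing is an illustrative name):

```d
// Shrinks a second view of the same data and reports whether the
// first view is still sharing storage with it.
bool shrinkKeepsSharing()
{
    int[] a = [1, 2, 3, 4];
    int[] b = a;   // copies pointer and length, not the elements
    b.length = 2;  // shrinking: b still points at a's storage
    b[0] = 99;     // so this write is visible through a as well
    return b.ptr is a.ptr && a.length == 4 && a[0] == 99;
}
```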

If D has not implemented text-book arrays, what are we losing? I can't see
that we have lost anything, in fact we have gained.

-- 
Derek Parnell
Melbourne, Australia
30/07/2005 11:19:20 AM

Jul 29 2005

AJG <AJG_member pathlink.com> writes:

Hi Derek,

 Ok, I don't think I said exactly what I meant before. Let's look at this piece
 by piece:
 
 1) Arrays are ("in theory") reference types.

This is where I think we separate. I don't think that D arrays are
reference types in the same manner as objects. I think they are value types
in that they always have two fields; a pointer and a length. D arrays are
more like a predefined struct. Your phrase "in theory", depends on whose
theory you are talking about. 

Well, my "in theory" is actually pretty down-to-earth. I mean reference

references. This is not an ivory tower concept. It means essentially a nicer,
fancier version of a pointer. When using the languages I mentioned, if you
assign a reference, it will not become its own instance spontaneously in
certain cases.

 4) So, even though Arrays and Objects are different, they share (or should)
 reference semantics.

I assume at this point that you are talking about arrays as defined in some
computer science book rather than how they are implemented in D.

Guilty as charged re: being a computer scientist ;). However, once again, this
is not a high-brow idea. Reference semantics are very basic and are implemented

Javascript). D breaks reference semantics when it comes to arrays. This leads me
to believe arrays are _not_ reference types, which is not the impression I got
from their description. Walter has remained conspicuously silent about the
matter, and has not answered the question.

Are arrays reference types or not? If yes, then they are broken.

 I believe most of us can agree up to here.

Apparently not ;-)

Indeed. The final word can only come from the Big W., I'm afraid.

 My overall point is that D is not keeping its promise regarding Arrays obeying
 reference semantics. 

"Promise"? Where is that written down?

It was a figure of speech :p. The promise "would" be written down if D agrees to
implement array reference semantics and then doesn't. This is what I'm not sure
about.

Whether this is good or not is debatable, but at least it
 should be noted. Do you agree that D's arrays break reference semantics?

I suppose so. But it doesn't worry me because it is a pragmatic
implementation that makes coding clearer (IMO) and improves performance.
I'm not sorry that D doesn't have text-book arrays, in that case.

Once more, these "text-book" arrays are fairly common across modern languages,
and D's semantics are certainly a twisted variation. Also, I don't follow how
that improves performance. If anything, it _decreases_ performance by spawning
deep copies of array instances in certain special cases.

In your previous example ...




 
 Semantically speaking, I think this is wrong. 

I've adjusted my thinking when using D. To me, after the assignment 'b =
a', I see that 'a' and 'b' are distinct arrays that happen to share the
same data. This may be seen as twisting words or playing with semantics,
but it works for me.

Well, then that's not a reference. Sharing just the same data is some weird
variation of array that I hadn't encountered. This is not a reference.

And by the way, the 'b.length = 2' statement does not cause 'b' to become
another instance. It still shares the same data as 'a'. You only get a new
instance when the length increases.

Great, yet another exception. Thanks for pointing it out.

If D has not implemented text-book arrays, what are we losing? I can't see
that we have lost anything, in fact we have gained.

Well, so what if we lost object reference semantics? Would that also be another
"gain?" Less is more! Rations will be increased -33%. It's doubleplusgood!

;)

Cheers,
--AJG.

Jul 29 2005

Mike Parker <aldacron71 yahoo.com> writes:

AJG wrote:

 
 
 Well, then that's not a reference. Sharing just the same data is some weird
 variation of array that I hadn't encountered. This is not a reference.
 
 

 
 
 Great, yet another exception. Thanks for pointing it out.
 
 

 Well, so what if we lost object reference semantics? Would that also be another
 "gain?" Less is more! Rations will be increased -33%. It's doubleplusgood!
 

Wasn't it you who posted elsewhere in this thread that change is good? ;)

D has changed the way we think about arrays. From my perspective, it's a 
good change and your desire to revert to the 'array as a reference' 
paradigm is not. Maybe it would help if you think of the D array as a 
wrapper/facade to the actual reference?

Jul 30 2005

Derek Parnell <derek psych.ward> writes:

On Sat, 30 Jul 2005 02:30:17 +0000 (UTC), AJG wrote:

 Hi Derek,
 
 Ok, I don't think I said exactly what I meant before. Let's look at this piece
 by piece:
 
 1) Arrays are ("in theory") reference types.

This is where I think we separate. I don't think that D arrays are
reference types in the same manner as objects. I think they are value types
in that they always have two fields; a pointer and a length. D arrays are
more like a predefined struct. Your phrase "in theory", depends on whose
theory you are talking about. 

 
 Well, my "in theory" is actually pretty down-to-earth. I mean reference

 references. This is not an ivory tower concept. It means essentially a nicer,
 fancier version of a pointer. When using the languages I mentioned, if you
 assign a reference, it will not become its own instance spontaneously in
 certain cases.

I think I have the solution. Rename them. Don't call them arrays. Call them
something else. Then your problem goes away ;-)

-- 
Derek Parnell
Melbourne, Australia
30/07/2005 10:49:59 PM

Jul 30 2005

Niko Korhonen <niktheblak hotmail.com> writes:

Ben Hinkle wrote:
 I think you'll have a hard time getting lots of support for that. I much 
 prefer the current behavior and I bet there is lots of existing D code that 
 assumes one can test the length of an array at any time. Since an array is 
 not an object I see no problem with the "inconsistency" - an array is an 
 array. 

Indeed. I think the array semantics where you can't access a property of 
the array without the Fear of the NullPointerException is the most 
annoying thing in the world, or at least in the field of programming.

I will happily agree to this difference in semantics because the 
benefits far outweigh the slight inconsistency.

Besides, in a way there is no inconsistency. An array reference is a 
value type consisting of two 4-byte integers (in 32-bit environments). 
This is different from an object reference. The first integer is the 
length of the array and the second is a pointer to the first item of the 
array. Whenever an array reference is created a pointer to the data 
exists. The .length property is just a shortcut to access the length 
field of the array. The .sort property is a function called on the array 
reference. These always work even if the array reference points to an 
empty array. Trying to access the elements of an empty array will 
segfault in the usual way.
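
A quick way to check the "value type of two words" description, as a sketch. The names are illustrative, and the size is expressed in size_t units rather than assuming a 32-bit target.

```d
// A dynamic array variable is a (length, pointer) pair, so its size is
// two machine words regardless of how many elements it refers to.
enum sliceIsTwoWords = (int[]).sizeof == 2 * size_t.sizeof;

// Properties of a never-assigned array are safe to read.
size_t lengthOfNull()
{
    int[] none;         // default-initialised: null pointer, length 0
    return none.length; // reads the length field only, no dereference
}
```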

Object references stored in an array have the usual semantics. IMO 
nothing forces a language to treat arrays as templated instances of a 
class Array with regular object semantics. D's way is just better.

-- 
Niko Korhonen
SW Developer

Jul 31 2005

Derek Parnell <derek psych.ward> writes:

On Mon, 01 Aug 2005 09:56:57 +0300, Niko Korhonen wrote:

 Ben Hinkle wrote:
 I think you'll have a hard time getting lots of support for that. I much 
 prefer the current behavior and I bet there is lots of existing D code that 
 assumes one can test the length of an array at any time. Since an array is 
 not an object I see no problem with the "inconsistency" - an array is an 
 array. 

 
 Indeed. I think the array semantics where you can't access a property of 
 the array without the Fear of the NullPointerException is the most 
 annoying thing in the world, or at least in the field of programming.
 
 I will happily agree to this difference in semantics because the 
 benefits far outweigh the slight inconsistency.
 
 Besides, in a way there is no inconsistency. An array reference is a 
 value type consisting of two 4-byte integers (in 32-bit environments). 
 This is different from an object reference.

Agreed. The way I look at it is that a D array variable *contains* a
reference to the array elements but is, in itself, not the reference.

When it comes to implementation, dynamic-length arrays always have an
8-byte structure allocated to themselves, and may have more RAM allocated
if there are any elements in the array. The address of the array variable
is not the address of the first element; the length property is fetched at
runtime from the array variable. 

However, fixed-length arrays always have a minimum of 8 bytes allocated
regardless of the number of elements declared, and the address of the array
variable is also the address of its first element; the length property is
'hard-coded' by the compiler in any expressions that use it. 
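
The address relationship can be checked with a short sketch (function names are made up for illustration; sizes are given in terms of int.sizeof and size_t.sizeof rather than fixed byte counts):

```d
// For a fixed-length array the variable *is* the element storage.
bool staticStorageIsInline()
{
    int[4] fixed;
    return (cast(void*) &fixed) is (cast(void*) fixed.ptr)
        && fixed.sizeof == 4 * int.sizeof;
}

// For a dynamic array the variable only holds (length, pointer);
// the elements live elsewhere.
bool dynamicStorageIsSeparate()
{
    int[] dyn = new int[4];
    return (cast(void*) &dyn) !is (cast(void*) dyn.ptr)
        && dyn.sizeof == 2 * size_t.sizeof;
}
```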

-- 
Derek
Melbourne, Australia
1/08/2005 5:01:43 PM

Aug 01 2005

J Thomas <jtd514 ameritech.net> writes:

so wait, you basically want an array to be a pointer to data containing 
a length and a pointer? i have been following this thread somewhat but I 
can hardly find the benefit here. it seems to me you want to take 
something very straightforward and close to the metal and turn it into a 
referenced object, for some bizarre reason regarding reference 
semantics. why don't you just put your arrays in objects if you are 
having problems?

Jul 30 2005

AJG <AJG_member pathlink.com> writes:

Hi,

so wait, you basically want an array to be a pointer to data containing 
a length and a pointer? i have been following this thread somewhat but I 
can hardly find the benefit here.

No. I would like it to be that way, but I know there wouldn't be support for
this. What I'd like is for all array properties to follow reference semantics.

it seems to me you want to take 
something very straightforward and close to the metal and turn it into a 
referenced object, for some bizarre reason regarding reference 
semantics. 

What is bizarre is the current array semantics, be it due to "close to the
metal" requirements, or whatever. If you don't think arrays at the moment follow
at least _partial_ reference semantics, then why does:





Reverse _also_ the contents of A? Those are reference semantics. According to
Derek, the array reference itself is implemented on the stack in 8-byte chunks.
That's fine. I'm not talking about making the array itself a pointer.

Now, my point is that .length breaks reference semantics in special cases,
because:





A.length did not change. If it were consistent with .reverse and .sort, then A's
length too would have changed.
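
A sketch of the kind of snippet under discussion, written against current D, where the built-in .reverse property no longer exists and std.algorithm.mutation.reverse is used instead. aliasDemo is an illustrative name, not code from the original post.

```d
import std.algorithm.mutation : reverse;

// Reverses and then grows a second view of a; returns a as the
// first view sees it afterwards.
int[] aliasDemo()
{
    int[] a = [1, 2, 3];
    int[] b = a;   // b views the same three elements
    reverse(b);    // in place: a sees the new order too
    b.length = 5;  // changes b's length field only
    return a;
}
```

The result is [3, 2, 1] with length 3: the reversal shows through a, the resize does not.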

Cheers,
--AJG.







why don't you just put your arrays in objects if you are 
having problems?

Jul 30 2005

Derek Parnell <derek psych.ward> writes:

On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:


[snip]
 What is bizarre is the current array semantics, be it due to "close to the
 metal" requirements, or whatever. If you don't think arrays at the moment follow
 at least _partial_ reference semantics, then why does:
 



 
 Reverse _also_ the contents of A?

There might have been an argument that .reverse and .sort should follow
Walter's Copy-on-Write rules of engagement, but the current behavior is
documented and relied upon in current code.

-- 
Derek Parnell
Melbourne, Australia
31/07/2005 8:53:41 AM

Jul 30 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:


 [snip]
 What is bizarre is the current array semantics, be it due to "close to 
 the
 metal" requirements, or whatever. If you don't think arrays at the moment 
 follow
 at least _partial_ reference semantics, then why does:





 Reverse _also_ the contents of A?

 There might have been an argument that .reverse and .sort should follow
 Walter's Copy-on-Write rules of engagement, but the current behavior is
 documented and relied upon in current code.

Besides those reasons writing "B.reverse" to me indicates you want to affect 
B hence no COW while "reverse(B)" says you want a reversed B hence COW. 
That's one reason why I don't really like the current syntax hack of being 
able to write B.tolower() to mean tolower(B).

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message 
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:


 [snip]
 What is bizarre is the current array semantics, be it due to "close to 
 the
 metal" requirements, or whatever. If you don't think arrays at the moment 
 follow
 at least _partial_ reference semantics, then why does:





 Reverse _also_ the contents of A?

 There might have been an argument that .reverse and .sort should follow
 Walter's Copy-on-Write rules of engagement, but the current behavior is
 documented and relied upon in current code.

Besides those reasons writing "B.reverse" to me indicates you want to affect 
B hence no COW while "reverse(B)" says you want a reversed B hence COW. 
That's one reason why I don't really like the current syntax hack of being 
able to write B.tolower() to mean tolower(B). 

Utterly confusing!  reverse(b) and B.reverse have nothing in their name to imply
that either one copies the data.  By default COW should not happen.  Believe me,
look at .NET where everything is COW.  New memory allocations all over the
place.  IMHO .dup is there for a reason, and nothing is preventing you from
doing:

foo.dup.reverse

If somebody else comes along, they will know you are copying the array. It's
only 4 more characters of typing.  Plus no confusion as to what does cow and
what doesn't.  I can copy the thing first with .dup if I want.  This isn't C
where it's 5 lines of code every time you need to copy an array!
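
Spelled out as a sketch for current D, where the built-in .reverse property is gone and std.algorithm.mutation.reverse takes its place. reversedCopy is a hypothetical helper name.

```d
import std.algorithm.mutation : reverse;

// Returns a reversed copy; the argument's elements are left untouched.
int[] reversedCopy(const(int)[] src)
{
    int[] copy = src.dup; // explicit, visible allocation
    reverse(copy);        // in place, on the copy only
    return copy;
}
```

The allocation is visible at the call site's equivalent of .dup, which is the point being made above.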

-Sha

Aug 01 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message 
news:dcleqr$2ti5$1 digitaldaemon.com...
 In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote:


 [snip]
 What is bizarre is the current array semantics, be it due to "close to
 the
 metal" requirements, or whatever. If you don't think arrays at the 
 moment
 follow
 at least _partial_ reference semantics, then why does:





 Reverse _also_ the contents of A?

 There might have been an argument that .reverse and .sort should 
 follow
 Walter's Copy-on-Write rules of engagement, but the current behavior is
 documented and relied upon in current code.

Besides those reasons writing "B.reverse" to me indicates you want to 
affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of being
able to write B.tolower() to mean tolower(B).

 Utterly confusing!  reverse(b) and B.reverse have nothing in their name to 
 imply
 that either one copies the data.  By default COW should not happen. 
 Believe me,
 look at .NET where everything is COW.  New memory allocations all over the
 place.  IMHO .dup is there for a reason, and nothing is preventing you 
 from
 doing:

 foo.dup.reverse

 If somebody else comes along, they will know you are copying the array. 
 It's
 only 4 more characters of typing.  Plus no confusion as to what does cow 
 and
 what doesn't.  I can copy the thing first with .dup if I want.  This isn't 
 C
 where it's 5 lines of code every time you need to copy an array!

 -Sha

You've lost me. Are you proposing a change to any existing behavior or 
coding practice (ie COW)?

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dclfvs$2usj$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message 
news:dcleqr$2ti5$1 digitaldaemon.com...
 In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to 
affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of being
able to write B.tolower() to mean tolower(B).

 Utterly confusing!  reverse(b) and B.reverse have nothing in their name to 
 imply
 that either one copies the data.  By default COW should not happen. 
 Believe me,
 look at .NET where everything is COW.  New memory allocations all over the
 place.  IMHO .dup is there for a reason, and nothing is preventing you 
 from
 doing:

 foo.dup.reverse

 If somebody else comes along, they will know you are copying the array. 
 It's
 only 4 more characters of typing.  Plus no confusion as to what does cow 
 and
 what doesn't.  I can copy the thing first with .dup if I want.  This isn't 
 C
 where it's 5 lines of code every time you need to copy an array!

 -Sha

You've lost me. Are you proposing a change to any existing behavior or 
coding practice (ie COW)? 

I wasn't proposing a change at all.  I was disagreeing with Derek.  I think COW
is a bad thing for API functions to be doing mysteriously.  It leads to crap
like this:

foo = foo.Replace("Hello","");
dateFoo = dateFoo.AddDays(1);

If I want a duplicate something, in D, it's as easy as saying:

(Not that replace is a valid property for char[]s, but you get my gist)

This leads to effective memory use, and no confusion about:

reverse(b), or b.reverse 

Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
reasoning it might make sense that one does cow and one doesn't.  But certainly
not mine, from the information given.

Also, you might say for consistency, always use cow.  But cow is not always what
you want. Since there's no way to manually un-cowify it, it would make logical
sense to NEVER do cow, and let the programmer call dup first.

-Sha

Aug 01 2005

AJG <AJG_member pathlink.com> writes:

Hi,

If I want a duplicate something, in D, it's as easy as saying:

(Not that replace is a valid property for char[]s, but you get my gist)

Exactly.

This leads to effective memory use, and no confusion about:
reverse(b), or b.reverse 

Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
reasoning it might make sense that one does cow and one doesn't.  But certainly
not mine, from the information given.

IMHO, and for consistency, it should never do COW. If a user wants to do COW,
let the user do it. That's exactly what I mean by reference semantics, so it
seems we are in agreement here.

Also, you might say for consistency, always use cow.  But cow is not always what
you want. Since there's no way to manually un-cowify it,  It would make logical
sense to NEVER do cow, and let the programmer call dup first.

Interestingly enough (and one of my points), .length does COW about half of the
time, and there's no way to un-cowify it.

That's a great word, btw, un-cowify. It had me chuckling.

Cheers,
--AJG.

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dclls6$37o$1 digitaldaemon.com>, AJG says...
Hi,

If I want a duplicate something, in D, it's as easy as saying:

(Not that replace is a valid property for char[]s, but you get my gist)

Exactly.

This leads to effective memory use, and no confusion about:
reverse(b), or b.reverse 

Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
reasoning it might make sense that one does cow and one doesn't.  But certainly
not mine, from the information given.

IMHO, and for consistency, it should never do COW. If a user wants to do COW,
let the user do it. That's exactly what I mean by reference semantics, so it
seems we are in agreement here.

Also, you might say for consistency, always use cow.  But cow is not always what
you want. Since there's no way to manually un-cowify it,  It would make logical
sense to NEVER do cow, and let the programmer call dup first.

Interestingly enough (and one of my points), .length does COW about half of the
time, and there's no way to un-cowify it.

While I agree with you that it could be annoying, the problem is that arrays are
really stack variables which have a reference member. (As you well know by now.)

So, in order to un-cowify .length we would have to make all arrays true
references which contain references.  Also, that still doesn't fix array slices.
We would ALWAYS need to dup when an array slice is made.  :(

However, there's an easy way to handle the first problem already:

char[] a = "Hello".dup;  // dup: string literals are not mutable char[] data
char[]* b = &a;          // &a is the address of the array variable itself, not a.ptr
(*b).length = 10;        // resizes a through the pointer
writef("%d", a.length);  // prints 10

Although array slices won't be fixed without a special array slice type.  So
that it would know the start of the array and resize that.

That's a great word, btw, un-cowify. It had me chuckling.

Thanks :)

-Sha

Aug 01 2005

"Ben Hinkle" <bhinkle mathworks.com> writes:

"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message 
news:dclk4p$1o0$1 digitaldaemon.com...
 In article <dclfvs$2usj$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
news:dcleqr$2ti5$1 digitaldaemon.com...
 In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to
affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of 
being
able to write B.tolower() to mean tolower(B).

 Utterly confusing!  reverse(b) and B.reverse have nothing in their name 
 to
 imply
 that either one copies the data.  By default COW should not happen.
 Believe me,
 look at .NET where everything is COW.  New memory allocations all over 
 the
 place.  IMHO .dup is there for a reason, and nothing is preventing you
 from
 doing:

 foo.dup.reverse

 If somebody else comes along, they will know you are copying the array.
 It's
 only 4 more characters of typing.  Plus no confusion as to what does cow
 and
 what doesn't.  I can copy the thing first with .dup if I want.  This 
 isn't
 C
 where it's 5 lines of code every time you need to copy an array!

 -Sha

You've lost me. Are you proposing a change to any existing behavior or
coding practice (ie COW)?

 I wasn't proposing a change at all.  I was disagreeing with Derek.  I think 
 COW
 is a bad thing for API functions to be doing mysteriously.  It leads to 
 crap
 like this:

 foo = foo.Replace("Hello","");
 dateFoo = dateFoo.AddDays(1);

I didn't read Derek's post as proposing reverse use COW. He was pointing out 
that it doesn't. It's too bad you see COW as mysterious.

 If I want a duplicate something, in D, it's as easy as saying:

 (Not that replace is a valid property for char[]s, but you get my gist)

 This leads to effective memory use, and no confusion about:

 reverse(b), or b.reverse

 Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
 reasoning it might make sense that one does cow and one doesn't.  But 
 certainly
 not mine, from the information given.

The statement about effective memory use only is true when the operation is 
guaranteed to change the string. If foo in the example didn't contain any 
Hellos then the dup would be wasteful. Plus I'm surprised you don't see any 
difference between reverse(b) and b.reverse since it's common in OOP to 
interpret b.foo as acting on b while foo(b) is just some function of b.

 Also, you might say for consistency, always use cow.  But cow is not 
 always what
 you want. Since there's no way to manually un-cowify it,  It would make 
 logical
 sense to NEVER do cow, and let the programmer call dup first.

That would be a big change in D style since many times you do not know if a 
dup will be needed or not (eg most of the functions in std.string might just 
return the original string).
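
To illustrate the style being described, here is a hand-rolled sketch of a copy-on-write lowercase for ASCII text. lowerCow is a made-up name and this is not the Phobos implementation; it only shows the "return the original if nothing changed" pattern.

```d
// Hypothetical copy-on-write lowercase for ASCII text: allocates only
// if at least one character actually has to change.
string lowerCow(string s)
{
    foreach (i, c; s)
    {
        if (c >= 'A' && c <= 'Z')
        {
            char[] copy = s.dup;          // first change found: copy now
            foreach (ref ch; copy[i .. $])
                if (ch >= 'A' && ch <= 'Z')
                    ch += 'a' - 'A';      // shift into the lowercase range
            return cast(string) copy;     // copy is unique, safe to freeze
        }
    }
    return s; // nothing to change: hand back the original slice
}
```

A caller cannot tell in advance whether the result aliases the input, which is why an unconditional dup before the call would sometimes be wasted.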

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dclmn7$42s$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message 
news:dclk4p$1o0$1 digitaldaemon.com...
 In article <dclfvs$2usj$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
news:dcleqr$2ti5$1 digitaldaemon.com...
 In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to
affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of 
being
able to write B.tolower() to mean tolower(B).

 Utterly confusing!  reverse(b) and B.reverse have nothing in their name 
 to
 imply
 that either one copies the data.  By default COW should not happen.
 Believe me,
 look at .NET where everything is COW.  New memory allocations all over 
 the
 place.  IMHO .dup is there for a reason, and nothing is preventing you
 from
 doing:

 foo.dup.reverse

 If somebody else comes along, they will know you are copying the array.
 It's
 only 4 more characters of typing.  Plus no confusion as to what does cow
 and
 what doesn't.  I can copy the thing first with .dup if I want.  This 
 isn't
 C
 where it's 5 lines of code every time you need to copy an array!

 -Sha

You've lost me. Are you proposing a change to any existing behavior or
coding practice (ie COW)?

 I wasn't proposing a change at all.  I was disagreeing with Derek.  I think 
 COW
 is a bad thing for API functions to be doing mysteriously.  It leads to 
 crap
 like this:

 foo = foo.Replace("Hello","");
 dateFoo = dateFoo.AddDays(1);

I didn't read Derek's post as proposing reverse use COW. He was pointing out 
that it doesn't.

You're right, he didn't.  I was contesting that tolower(b) and b.tolower should
do different things.

 It's too bad you see COW as mysterious.

I don't find anything mysterious about it. It's just not useful most every time
I've had any dealing with COW functions.   If I want COW, I can dupe the object
first.

 If I want a duplicate something, in D, it's as easy as saying:

 (Not that replace is a valid property for char[]s, but you get my gist)

 This leads to effective memory use, and no confusion about:

 reverse(b), or b.reverse

 Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
 reasoning it might make sense that one does cow and one doesn't.  But 
 certainly
 not mine, from the information given.

The statement about effective memory use only is true when the operation is 
guaranteed to change the string. If foo in the example didn't contain any 
Hellos then the dup would be wasteful.

I hope you're not implying that replace should only return a new instance if
something was actually changed.  That is absurd.  I would then need to check to
see if it's given me back a reference to a new array before I could use it?

 Plus I'm surprised you don't see any 
difference between reverse(b) and b.reverse since it's common in OOP to 
interpret b.foo as acting on b while foo(b) is just some function of b.

Why don't you tell Microsoft that.  Many of the examples I listed were from
VB.NET, and do COW from member functions. Also, just because it is common
doesn't make it logical, consistent, or obvious to somebody not familiar with
these __unwritten__ agreements.

 Also, you might say for consistency, always use cow.  But cow is not 
 always what
 you want. Since there's no way to manually un-cowify it,  It would make 
 logical
 sense to NEVER do cow, and let the programmer call dup first.

That would be a big change in D style since many times you do not know if a 
dup will be needed or not (eg most of the functions in std.string might just 
return the original string).

If I'm understanding what you just said, let me say this:

As I said above, I think it's silly to have non-deterministic behavior from
those functions.  When I say deterministic, I mean that I should be able to
expect it to always return a duplicate string, or not.

-Sha

Aug 01 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message 
news:dcm4an$grn$1 digitaldaemon.com...
 In article <dclmn7$42s$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
news:dclk4p$1o0$1 digitaldaemon.com...
 In article <dclfvs$2usj$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
news:dcleqr$2ti5$1 digitaldaemon.com...
 In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to
affect
B hence no COW while "reverse(B)" says you want a reversed B hence 
COW.
That's one reason why I don't really like the current syntax hack of
being
able to write B.tolower() to mean tolower(B).

 Utterly confusing!  reverse(b) and B.reverse have nothing in their 
 name
 to
 imply
 that either one copies the data.  By default COW should not happen.
 Believe me,
 look at .NET where everything is COW.  New memory allocations all over
 the
 place.  IMHO .dup is there for a reason, and nothing is preventing you
 from
 doing:

 foo.dup.reverse

 If somebody else comes along, they will knows you are copying the 
 array.
 It's
 only 4 more characters of typing.  Plus no confusion as to what does 
 cow
 and
 what doesn't.  I can copy the thing first with .dup if I want.  This
 isn't
 C
 where it's 5 lines of code every time you need to copy an array!

 -Sha

You've lost me. Are you proposing a change to any existing behavior or
coding practice (ie COW)?

 I wasn't proposing a change at all.  I was disagreeing with Derek.  I think
 COW is a bad thing for API functions to be doing mysteriously.  It leads to
 crap like this:

 foo = foo.Replace("Hello","");
 dateFoo = dateFoo.AddDays(1);

I didn't read Derek's post as proposing reverse use COW. He was pointing out
that it doesn't.

 You're right, he didn't.  I was contesting that tolower(b) and b.tolower
 should do different things.
 If I want a duplicate of something, in D, it's as easy as saying:

 (Not that replace is a valid property for char[]s, but you get my gist)

 This leads to effective memory use, and no confusion about:

 reverse(b), or b.reverse

 Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
 reasoning it might make sense that one does cow and one doesn't.  But
 certainly not mine, from the information given.

The statement about effective memory use only is true when the operation is
guaranteed to change the string. If foo in the example didn't contain any
Hellos then the dup would be wasteful.

 I hope you're not implying that replace should only return a new instance
 if something was actually changed.

That is what I'm implying - and that's what many std.string functions do.

 That is absurd.  I would then need to check to see if it's given me back a
 reference to a new array before I could use it?

Why? The only time you would care is if you start modifying the array
in-place.

 Plus I'm surprised you don't see any
difference between reverse(b) and b.reverse since it's common in OOP to
interpret b.foo as acting on b while foo(b) is just some function of b.

 Why don't you tell Microsoft that.  Many of the examples I listed were from
 VB.NET, and do COW from member functions.

Strings in VB.NET are immutable so I'm not surprised that methods return new 
strings - that's the definition of immutable. Mutable objects would 
interpret b.reverse as acting on b.

 Also, just because it is common doesn't make it logical, consistent, or
 obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I
would like even more documentation about it since it appears people don't
know about it).

 Also, you might say for consistency, always use cow.  But cow is not always
 what you want. Since there's no way to manually un-cowify it, it would make
 logical sense to NEVER do cow, and let the programmer call dup first.

That would be a big change in D style since many times you do not know if a
dup will be needed or not (eg most of the functions in std.string might just
return the original string).

 If I'm understanding what you just said, let me say this:

 As I said above, I think it's silly to have non-deterministic behavior from
 those functions.  When I say deterministic, I mean that I should be able to
 expect it to always return a duplicate string, or not.

ok - everyone is entitled to their opinions. To me it's simpler to obey COW. 
Changing an array in-place is rare enough that special care is ok with me.
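
As an illustration of the convention described in this post, here is a minimal
sketch, assuming a current D2 compiler rather than the compiler of the time.
lowerCow is a hypothetical helper written for this example, not a Phobos
function: it hands back the caller's own slice when nothing needs to change and
copies only before the first write.

```d
import std.stdio;

// Hypothetical helper (not part of Phobos): lower-cases ASCII letters
// following the copy-on-write convention.
char[] lowerCow(char[] s)
{
    char[] result = s;          // start by sharing the caller's data
    bool copied = false;
    foreach (i, c; s)
    {
        if (c >= 'A' && c <= 'Z')
        {
            if (!copied)
            {
                result = s.dup; // first change: copy before writing
                copied = true;
            }
            result[i] = cast(char)(c + ('a' - 'A'));
        }
    }
    return result;
}

void main()
{
    char[] a = "hello".dup;
    char[] b = "Hello".dup;
    writeln(lowerCow(a) is a); // nothing changed, so the same slice comes back
    writeln(lowerCow(b) is b); // a change was needed, so a copy comes back
    writeln(b);                // the caller's data is untouched
}
```

The `is` comparison tests whether two slices refer to the same data. A caller
would only need that check before modifying the returned array in place.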

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dcmaak$l5m$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message 
news:dcm4an$grn$1 digitaldaemon.com...
 In article <dclmn7$42s$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
news:dclk4p$1o0$1 digitaldaemon.com...
 In article <dclfvs$2usj$1 digitaldaemon.com>, Ben Hinkle says...
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message
news:dcleqr$2ti5$1 digitaldaemon.com...
 In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of being
able to write B.tolower() to mean tolower(B).

 Utterly confusing!  reverse(b) and B.reverse have nothing in their name to
 imply that either one copies the data.  By default COW should not happen.
 Believe me, look at .NET where everything is COW.  New memory allocations
 all over the place.  IMHO .dup is there for a reason, and nothing is
 preventing you from doing:

 foo.dup.reverse

 If somebody else comes along, they will know you are copying the array.
 It's only 4 more characters of typing.  Plus no confusion as to what does
 cow and what doesn't.  I can copy the thing first with .dup if I want.
 This isn't C where it's 5 lines of code every time you need to copy an
 array!

 -Sha

You've lost me. Are you proposing a change to any existing behavior or
coding practice (ie COW)?

 I wasn't proposing a change at all.  I was disagreeing with Derek.  I think
 COW is a bad thing for API functions to be doing mysteriously.  It leads to
 crap like this:

 foo = foo.Replace("Hello","");
 dateFoo = dateFoo.AddDays(1);

I didn't read Derek's post as proposing reverse use COW. He was pointing out
that it doesn't.

 You're right, he didn't.  I was contesting that tolower(b) and b.tolower
 should do different things.
 If I want a duplicate of something, in D, it's as easy as saying:

 (Not that replace is a valid property for char[]s, but you get my gist)

 This leads to effective memory use, and no confusion about:

 reverse(b), or b.reverse

 Which one does c-o-w?  The name certainly doesn't say, maybe by somebody's
 reasoning it might make sense that one does cow and one doesn't.  But
 certainly not mine, from the information given.

The statement about effective memory use only is true when the operation is
guaranteed to change the string. If foo in the example didn't contain any
Hellos then the dup would be wasteful.

 I hope you're not implying that replace should only return a new instance
 if something was actually changed.

That is what I'm implying - and that's what many std.string functions do.

Bah

 That is absurd.  I would then need to check to see if it's given me back a
 reference to a new array before I could use it?

Why? The only time you would care is if you start modifying the array
in-place.

Exactly.  Quite often when I want to replace one thing, I want to replace A LOT
of things. (Or take any other example.)  If each replace allocates a new
string, that's inefficient. Maybe I only want to copy it once, and then modify
it in place.  When .dup is only 4 extra characters per instance of this, it
does not justify having two copies of every array function, one for cow and one
for in place.
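
The "copy once, then modify in place" pattern from the paragraph above might
look like the following minimal sketch, assuming a current D2 compiler.
replaceCharInPlace is a hypothetical helper invented for this example. It only
swaps single characters, so the length never changes and no call allocates.

```d
import std.stdio;

// Hypothetical in-place helper: overwrites every occurrence of one
// character with another. The length stays the same, so nothing is
// allocated here.
void replaceCharInPlace(char[] s, char from, char to)
{
    foreach (ref c; s)
    {
        if (c == from)
            c = to;
    }
}

void main()
{
    string original = "a-b_c d";
    char[] work = original.dup;     // the only allocation
    replaceCharInPlace(work, '-', '.');
    replaceCharInPlace(work, '_', '.');
    replaceCharInPlace(work, ' ', '.');
    writeln(original);              // unchanged
    writeln(work);                  // all three replacements applied
}
```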

 Plus I'm surprised you don't see any
difference between reverse(b) and b.reverse since it's common in OOP to
interpret b.foo as acting on b while foo(b) is just some function of b.

 Why don't you tell Microsoft that.  Many of the examples I listed were from
 VB.NET, and do COW from member functions.

Strings in VB.NET are immutable so I'm not surprised that methods return new
strings - that's the definition of immutable. Mutable objects would
interpret b.reverse as acting on b.

True. However, for mutable objects, would you like to duplicate every function
for COW and non-COW?  I find it less confusing to explicitly dup. (It also
clutters my namespace less!)

 Also, just because it is common doesn't make it logical, consistent, or
 obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I
would like even more documentation about it since it appears people don't
know about it).

There's barely any documentation for the API as it is.  And a footnote about
tolower( string ) on a man page is not enough for me.

 Also, you might say for consistency, always use cow.  But cow is not always
 what you want. Since there's no way to manually un-cowify it, it would make
 logical sense to NEVER do cow, and let the programmer call dup first.

That would be a big change in D style since many times you do not know if a
dup will be needed or not (eg most of the functions in std.string might just
return the original string).

 If I'm understanding what you just said, let me say this:

 As I said above, I think it's silly to have non-deterministic behavior from
 those functions.  When I say deterministic, I mean that I should be able to
 expect it to always return a duplicate string, or not.

ok - everyone is entitled to their opinions. To me it's simpler to obey COW.
Changing an array in-place is rare enough that special care is ok with me.

Rare in what?  Rare in what you're writing? I think you'll find that many
projects have use for it a lot.

-Sha

Aug 01 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

 That is absurd.  I would then need to check to see if it's given me back a
 reference to a new array before I could use it?

Why? The only time you would care is if you start modifying the array
in-place.

 Exactly.  Quite often when I want to replace one thing, I want to replace
 A LOT of things. (Or take any other example.)  If each replace allocates a
 new string, that's inefficient. Maybe I only want to copy it once, and then
 modify it in place.  When .dup is only 4 extra characters per instance of
 this, it does not justify having two copies of every array function, one
 for cow and one for in place.

I don't know if you followed the recent COW/const/inplace performance
discussion but my own $0.02 is that one should use COW as a general rule and,
after profiling the performance, target a (presumably) small set of routines
that need more careful memory management and possibly inplace manipulations.
In a "worst case" one can use one of the many other memory management 
techniques listed in the D docs. In any case you might want to look over 
those recent COW threads for more (and more and more) discussion on the 
topic.

On a side note, remember that operations like "replace" might increase the
length of the string (if the replacement is longer than the pattern) in
which case modifying it inplace becomes tricky. A general rule like COW can
take the place of lots of individual rules for each function. But you can 
code your app however you like or write a phobos lib that does everything 
inplace - there's nothing technically preventing that and it's perfectly ok 
if that's what you want to do.

 Also, just because it is common doesn't make it logical, consistent, or
 obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I
would like even more documentation about it since it appears people don't
know about it).

 There's barely any documentation for the API as it is.  And a footnote
 about tolower( string ) on a man page is not enough for me.

I'm not sure what man page you are referring to since D doesn't have man 
pages (or footnotes from what I can tell). Maybe you are speaking 
figuratively in which case I recommend that if you have concrete suggestions 
for improving the doc that you add comments to the doc wiki.

On a slightly OT note, I wonder if/how the doc wiki is being used. Are comments
removed as they are fixed in the doc? What's the process for using the wiki?
I add my comments where I think they should go but I notice there's stuff 
ranging all over the map and to be honest I have no clue if Walter ever 
looks at it, how often, and what happens when he does look at it.

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <dcmpne$10pp$1 digitaldaemon.com>, Ben Hinkle says...
 That is absurd.  I would then need to check to see if it's given me back a
 reference to a new array before I could use it?

Why? The only time you would care is if you start modifying the array
in-place.

 Exactly.  Quite often when I want to replace one thing, I want to replace
 A LOT of things. (Or take any other example.)  If each replace allocates a
 new string, that's inefficient. Maybe I only want to copy it once, and then
 modify it in place.  When .dup is only 4 extra characters per instance of
 this, it does not justify having two copies of every array function, one
 for cow and one for in place.

I don't know if you followed the recent COW/const/inplace performance
discussion but my own $0.02 is that one should use COW as a general rule and,
after profiling the performance, target a (presumably) small set of routines
that need more careful memory management and possibly inplace manipulations.
In a "worst case" one can use one of the many other memory management 
techniques listed in the D docs. In any case you might want to look over 
those recent COW threads for more (and more and more) discussion on the 
topic.

I think this would be a bad choice.  It might be wise with respect to
performance, but having different methods randomly be cow or not cow depending
on how much more time they take is a bit confusing to say the least.

On a side note, remember that operations like "replace" might increase the
length of the string (if the replacement is longer than the pattern) in
which case modifying it inplace becomes tricky.

Inplace may not be possible, but it could still follow the normal rule of
modifying the ptr of your array to point to the new value.  That way a dup only
happens when it is required, and the calling function does not care.  This
would be ideal IMHO.
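
A minimal sketch of that idea, assuming a current D2 compiler. replaceFirst is
a hypothetical helper, not a Phobos function. It takes the array by ref, writes
in place when the replacement has the same length as the pattern, and otherwise
allocates a new array and rebinds the caller's slice to it. The second slice in
main shows why the length-changing case is the tricky one: other views of the
old data do not follow the rebinding.

```d
import std.stdio;

// Hypothetical helper: replaces the first occurrence of `pattern` in `s`.
void replaceFirst(ref char[] s, const(char)[] pattern, const(char)[] repl)
{
    if (pattern.length == 0 || pattern.length > s.length)
        return;
    foreach (i; 0 .. s.length - pattern.length + 1)
    {
        if (s[i .. i + pattern.length] == pattern)
        {
            if (repl.length == pattern.length)
            {
                // Same length: overwrite the caller's data in place.
                s[i .. i + repl.length] = repl[];
            }
            else
            {
                // Different length: build a new array and rebind the
                // caller's slice to it.
                auto result =
                    new char[](s.length - pattern.length + repl.length);
                result[0 .. i] = s[0 .. i];
                result[i .. i + repl.length] = repl[];
                result[i + repl.length .. $] = s[i + pattern.length .. $];
                s = result;
            }
            return;
        }
    }
}

void main()
{
    char[] text = "one two".dup;
    char[] view = text;                  // second slice over the same data
    replaceFirst(text, "one", "1st");    // same length: done in place
    writeln(view);                       // the second slice sees the change
    replaceFirst(text, "two", "second"); // longer: text is rebound
    writeln(text);
    writeln(view);                       // still refers to the old data
}
```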

 A general rule like COW can 
take the place of lots of individual rules for each function. But you can 
code your app however you like or write a phobos lib that does everything 
inplace - there's nothing technically preventing that and it's perfectly ok 
if that's what you want to do.

That's true, but it would be nice not to be including my own runtime in every
little application I write.  I suppose I could force installation of a shared
library.  Ugh.

 Also, just because it is common doesn't make it logical, consistent, or
 obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I
would like even more documentation about it since it appears people don't
know about it).

 There's barely any documentation for the API as it is.  And a footnote
 about tolower( string ) on a man page is not enough for me.

I'm not sure what man page you are referring to since D doesn't have man 
pages (or footnotes from what I can tell). Maybe you are speaking 
figuratively in which case I recommend that if you have concrete suggestions 
for improving the doc that you add comments to the doc wiki.

Which I have been doing when I see them.  However, most of the doc that you can
post on the Wiki (I haven't looked at it a lot) seems to be for the language
specification.  Can the phobos docs be modified?

On a slightly OT note, I wonder if/how the doc wiki is being used. Are comments
removed as they are fixed in the doc? What's the process for using the wiki?
I add my comments where I think they should go but I notice there's stuff 
ranging all over the map and to be honest I have no clue if Walter ever 
looks at it, how often, and what happens when he does look at it.

Aug 01 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

 There's barely any documentation for the API as it is.   And a footnote
 about tolower( string ) on a man page is not enough for me.

I'm not sure what man page you are referring to since D doesn't have man
pages (or footnotes from what I can tell). Maybe you are speaking
figuratively in which case I recommend that if you have concrete suggestions
for improving the doc that you add comments to the doc wiki.

 Which I have been doing when I see them.  However, most of the doc that you
 can post on the Wiki (I haven't looked at it a lot) seems to be for the
 language specification.  Can the phobos docs be modified?

There's a link at the bottom of the phobos page for the wiki. I don't know
if the modules with stand-alone pages have the link, though.

Aug 02 2005

Derek Parnell <derek psych.ward> writes:

On Mon, 1 Aug 2005 16:54:49 +0000 (UTC), Shammah Chancellor wrote:


"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of being
able to write B.tolower() to mean tolower(B).

 I was disagreeing with Derek.  I think COW
 is a bad thing for API functions to be doing mysteriously.  It leads to crap
 like this:

 foo = foo.Replace("Hello","");
 dateFoo = dateFoo.AddDays(1);

Hi Shammah,
I wasn't actually saying that .reverse must use CoW. I was saying that it
didn't and that fact seems to go counter to Walter's general principle (as I
understand it) about when to use CoW or not. I thought that one should use
CoW if the code is actually changing the data *and* the data might be
accessible to the calling routine. Thus as the .reverse will change the
data for lengths > 1, and the data is probably accessible to the code using
.reverse, one could have expected it to CoW.

Of course, I might be misunderstanding that 'general principle' ;-)

As the current behaviour is documented, we can cope with this seeming
exception.
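
The difference between reversing in place and reversing a copy can be sketched
as follows. This assumes a current D2 compiler, where the in-place operation is
the library function std.algorithm.mutation.reverse; the built-in .reverse
property discussed in this thread belongs to the D of the time.

```d
import std.algorithm.mutation : reverse;
import std.stdio;

void main()
{
    int[] b = [1, 2, 3];
    int[] view = b;        // another slice over the same elements

    int[] copy = b.dup;    // explicit copy first...
    reverse(copy);         // ...then reverse only the copy
    writeln(b);            // original order is kept
    writeln(copy);         // reversed copy

    reverse(b);            // in place: every slice of this data sees it
    writeln(view);
}
```

Because view and b share the same elements, the in-place call is visible
through both, which is the situation the general principle above is meant to
guard against.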

-- 
Derek Parnell
Melbourne, Australia
2/08/2005 7:21:43 AM

Aug 01 2005

Shammah Chancellor <Shammah_member pathlink.com> writes:

In article <1as80g46qpg5w$.1dfr6mqon4u1t$.dlg 40tude.net>, Derek Parnell says...
On Mon, 1 Aug 2005 16:54:49 +0000 (UTC), Shammah Chancellor wrote:


"Derek Parnell" <derek psych.ward> wrote in message
news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
 [snip]
Besides those reasons writing "B.reverse" to me indicates you want to affect
B hence no COW while "reverse(B)" says you want a reversed B hence COW.
That's one reason why I don't really like the current syntax hack of being
able to write B.tolower() to mean tolower(B).

 I was disagreeing with Derek.  I think COW
 is a bad thing for API functions to be doing mysteriously.  It leads to crap
 like this:

 foo = foo.Replace("Hello","");
 dateFoo = dateFoo.AddDays(1);

Hi Shammah,
I wasn't actually saying that .reverse must use CoW. I was saying that it
didn't and that fact seems to go counter to Walter's general principle (as I
understand it) about when to use CoW or not. I thought that one should use
CoW if the code is actually changing the data *and* the data might be
accessible to the calling routine. Thus as the .reverse will change the
data for lengths > 1, and the data is probably accessible to the code using
.reverse, one could have expected it to CoW.

Of course, I might be misunderstanding that 'general principle' ;-)

As the current behaviour is documented, we can cope with this seeming
exception.

No, no, I understood that.  I'm just being argumentative.  I don't agree with
you that tolower(b) and b.tolower should do different things.  I don't agree
that tolower(b) should even exist in the face of b.tolower.  It clutters up my
namespace.  (Aside from the fact that user properties in D can't be added to a
special char[] namespace =/ )

It just happened my example from VB was using class methods.  For example in
.NET in order to round a date up from seconds to 5 minutes, you need to allocate
like 3 or 4 datetimes.  Of course you don't SEE this, but .AddDays, .AddSeconds
etc. all allocate a new datetime.

For example, in .NET in order to get tomorrow's date:

Dim tomorrow as String = DateTime.Now.Date.AddDays(1).ToLongDateString()

That required allocations of 3 dateTimes and a String.

I could completely be abusing the .NET Framework, but I searched far and wide
and couldn't find an alternative that worked on the original.  

This kind of crud is why I'm very opposed to COW. In class methods or global
functions. If .NET had D style dupes and I really wanted to operate on a new
object:

Dim tomorrow as String = DateTime.Now.Duplicate.Date.AddDays(1).ToLongDateString()

One less allocation since AddDays didn't need/get its own copy of the memory.

You might still cite tolower(b) instead of b.tolower as not being as ridiculous
as what .NET wants.  But I ask you this: If somebody doesn't know your COW
conventions, would they know the difference in what happens?  In any case
arr.dup.tolower would fit the same purpose just fine, and it's more explicit.


-Sha

Aug 01 2005
