digitalmars.D - in/inout/out for arrays needs clear defining

reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Using DMD 0.94, Windows 98SE.

I've long noticed a potential cause for confusion.  The passing of
arrays as (address, length) tuples, while sensible on the whole, leads
to counter-intuitive behaviour.  While the behaviour follows logically
from this form, it isn't what one might expect, and the spec doesn't
clearly define the semantics of in/inout/out on arrays.

In short: if an array is passed as in, someone might think that the 
contents of the array are in, when in reality only the (address, length) 
tuple is in; the actual array contents are inout.
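
A minimal sketch of the behaviour described above, written for a present-day D compiler rather than DMD 0.94. A plain parameter stands in for the old default `in`, because today's `in` implies `const` and would reject the element write. The function name `fill` is made up for illustration.

```d
import std.stdio;

// Plain (by-value) array parameter: the (address, length) pair is copied,
// but both copies refer to the same elements.
void fill(int[] a)
{
    a[0] = 99;      // writes through to the caller's storage
    a = a[0 .. 1];  // rebinds only the callee's copy of the pair
}

void main()
{
    int[] data = [1, 2, 3];
    fill(data);
    assert(data[0] == 99);     // contents behaved like inout
    assert(data.length == 3);  // the (address, length) pair behaved like in
    writeln(data);             // prints [99, 2, 3]
}
```

The element write is visible to the caller while the change to the slice itself is not, which is exactly the split between "tuple is in" and "contents are inout".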

Of course, it's probably too late to clean up the semantics without 
breaking plenty of existing code.  But I suppose the main point is that 
the semantics we have need to be clearly explained.


Still, a possible idea for the future would be to allow the contents and 
dimensions to have separate in/inout/out settings.  The idea I came up 
with a while back is a notation like

	void qwert(inout int[in] yuiop);

meaning that the length is in, and the contents are inout.  There would 
be six possibilities:


in int[in]
	"Give me an array to look at"
	Pass in (address, length); either disallow changes to data in the 
function body, or copy on entry if any changes can occur

inout int[in]
	"Give me the size and some starting data, and I'll play around with the 
data"
	Pass in (address, length); allow the contents to be modified in-place 
(current treatment of in int[], except that changing the length should 
be disallowed)

out int[in]
	"Give me the size, and I'll give you the data"
	Pass in (address, length); initialise the array elements on entry; 
allow the contents to be modified in-place

inout int[inout]
	"Give me an array, and I'll do what I like with it"
	Pass in reference to (address, length); anything goes (current 
treatment of inout int[])

out int[inout]
	"Give me a starting size, and I'll give you the data, changing the size 
if I want"
	Pass in reference to (address, length); initialise the array elements 
on entry; anything goes

out int[out]
	"I'll give you an array"
	Pass in reference to (address, length), which is initialised to null


Of course, the other combinations in[inout], in[out], inout[out] make 
little or no sense.
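
The proposed `int[in]` notation does not exist in any compiler, but several of the six cases have rough counterparts in present-day D. The sketch below is an approximate mapping under that assumption, not the proposal itself; all function names are hypothetical, and `ref` is today's spelling of the 2004 `inout` parameter class.

```d
import std.stdio;

// "in int[in]": look, don't touch. Elements are read-only in the callee.
int sum(const(int)[] a)
{
    int total = 0;
    foreach (x; a)
        total += x;
    return total;
}

// Roughly "inout int[in]": elements may change in place. Resizing is not
// forbidden, but it would only affect the callee's copy of the slice.
void doubleAll(int[] a)
{
    foreach (ref x; a)
        x *= 2;
}

// "inout int[inout]": the slice is passed by reference, so the caller
// also sees changes to the length.
void appendZero(ref int[] a)
{
    a ~= 0;
}

// "out int[out]": the argument is reset to null on entry, then filled in.
void makeThree(out int[] a)
{
    assert(a is null);
    a = [1, 2, 3];
}

void main()
{
    int[] v = [1, 2, 3];
    assert(sum(v) == 6);

    doubleAll(v);
    assert(v == [2, 4, 6]);

    appendZero(v);
    assert(v == [2, 4, 6, 0]);

    int[] w = [9];
    makeThree(w);
    assert(w == [1, 2, 3]);

    writeln(v, " ", w);
}
```

The two `out` cases that start from a caller-supplied size (`out int[in]` and `out int[inout]`) have no direct counterpart in this sketch.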

I guess the extension to arrays of arrays would be straightforward, but 
I'm not sure.

Combined with the existing system, the syntax of my idea doesn't clash
with it, but it would cause confusion.  And it would probably clash if
we tried to extend it to static array parameters.  Maybe there's a
better notation....

Stewart.

-- 
My e-mail is valid but not my primary mailbox, aside from its being the 
unfortunate victim of intensive mail-bombing at the moment.  Please keep 
replies on the 'group where everyone may benefit.
Jul 06 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
What you say is very true: we have a problem there that needs to be solved.
Anyhow, I fear, documenting the current behavior is all we can do:

Regulating what the routine is allowed to do with the referenced data would
be equal to going for const references, which you should not even mention
if you want to live in peace with Walter....

In any case, one should be aware that the behavior of static arrays is
defined even less. In some cases they have plain value semantics, in others
they mimic the behavior of dynamic arrays being handled by reference. My
preference would be to make a clear cut here and say: static arrays have
value semantics throughout, dynamic arrays have reference semantics (unless
you are using the magic of array/vector expressions, but that's a different
topic)

There is some fundamental asymmetry between the two kinds of arrays anyway,
so why not make it complete instead of trying to cover it up?
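
For comparison, present-day D draws the line much as suggested here: fixed-size arrays are copied when passed or assigned, while slices share their elements. The sketch assumes a current compiler (the 2004 behaviour under discussion may well have differed), and the function names are hypothetical.

```d
// Fixed-size array parameter: the callee works on its own copy.
void bumpStatic(int[3] a)
{
    a[0] = 99;
}

// Slice parameter: the callee shares the caller's elements.
void bumpDynamic(int[] a)
{
    a[0] = 99;
}

void main()
{
    int[3] s = [1, 2, 3];

    bumpStatic(s);
    assert(s[0] == 1);    // caller's array is untouched

    int[3] t = s;         // assignment copies all three elements
    t[1] = -1;
    assert(s[1] == 2);

    bumpDynamic(s[]);     // s[] is a slice over s's storage
    assert(s[0] == 99);
}
```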




Jul 06 2004
parent reply "Walter" <newshound digitalmars.com> writes:
"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:ccdu2u$2cjg$1 digitaldaemon.com...
 What you say is very true: we have a problem there that needs to be solved.
 Anyhow, I fear, documenting the current behavior is all we can do:

 Regulating what the routine is allowed to do with the referenced data would
 be equal to going for const references, which you should not even mention
 if you want to live in peace with Walter....

 In any case, one should be aware that the behavior of static arrays is
 defined even less. In some cases they have plain value semantics, in others
 they mimic the behavior of dynamic arrays being handled by reference. My
 preference would be to make a clear cut here and say: static arrays have
 value semantics throughout, dynamic arrays have reference semantics (unless
 you are using the magic of array/vector expressions, but that's a different
 topic)

 There is some fundamental asymmetry between the two kinds of arrays anyway,
 so why not make it complete instead of trying to cover it up?

All it comes down to is understanding the difference between a reference type and a value type. This isn't a problem unique to D, it happens in C++, Java, C#, Javascript, etc. Once this difference is understood, the semantics fall out in a straightforward manner and shouldn't cause confusion. Arrays and classes are reference types. Structs and scalars are value types.
Jul 07 2004
next sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Walter wrote:

 All it comes down to is understanding the difference between a reference
 type and a value type. This isn't a problem unique to D, it happens in
 C++, Java, C#, Javascript, etc. Once this difference is understood, the
 semantics fall out in a straightforward manner and shouldn't cause
 confusion.

True. The problem is not about understanding the two kinds of *types* but understanding what in/out/inout do with them. The text in the spec is just too vague about this:

First, the section "Function Parameters" should be renamed to "Function Arguments". Functions have arguments, templates have parameters. That is the convention in C++ and I see little reason to mix this up in D. It becomes especially important for function templates, which have both.

Then, what does "out and inout work like storage classes" mean?!? If that somehow refers to pass-by-reference then it should be said explicitly.

How are struct in-arguments passed? Structs are value types, so they have value semantics when passed as in-arguments. This should be clearly stated, I think.

"out is rare enough, and inout even rarer"??!! One common reason to use inout is to enforce call-by-reference on large structs. There is good reason for that in many cases. The compiler cannot make the decision automatically. This would make inout not so rare. "out", on the other hand, might actually be rather rare. Depends on the coding style, though, I guess.

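
The large-struct case can be sketched as follows, assuming a present-day D compiler in which `ref` is the spelling of the 2004 `inout` parameter class. The struct `Big` and both function names are invented for the example.

```d
// A struct large enough that copying it on every call is worth avoiding.
struct Big
{
    double[1024] samples;
}

// By-value parameter: the whole struct is copied; the caller's data is untouched.
void scaleCopy(Big b, double k)
{
    b.samples[] *= k;
}

// By-reference parameter: no copy is made and the caller's data is modified.
void scaleInPlace(ref Big b, double k)
{
    b.samples[] *= k;
}

void main()
{
    Big big;
    big.samples[] = 1.0;   // doubles default to NaN in D, so set them first

    scaleCopy(big, 2.0);
    assert(big.samples[0] == 1.0);

    scaleInPlace(big, 2.0);
    assert(big.samples[0] == 2.0);
}
```
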
 Arrays and classes are reference types. Structs and scalars are value
 types.

Wow - I just realized a fundamental misconception I always had about static arrays. And that after writing up a detailed proposal about arrays...
Jul 07 2004
next sibling parent reply "Walter" <newshound digitalmars.com> writes:
"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:cchipu$1oof$1 digitaldaemon.com...
 Walter wrote:

 All it comes down to is understanding the difference between a reference
 type and a value type. This isn't a problem unique to D, it happens in
 C++, Java, C#, Javascript, etc. Once this difference is understood, the
 semantics fall out in a straightforward manner and shouldn't cause
 confusion.

 True. The problem is not about understanding the two kinds of *types* but
 understanding what in/out/inout do with them. The text in the spec is just
 too vague about this:

Ok, I can fix that.

 First, the section "Function Parameters" should be renamed to "Function
 Arguments". Functions have arguments, templates have parameters. That is
 convention in C++ and I see little reason to mix this up in D. It becomes
 especially important for function templates which have both.

Not my understanding. A "parameter" is the declaration for the symbol, an
"argument" is the value passed to that symbol:

    int foo(int x);        // x is a parameter
    ...
    foo(3);                // 3 is the argument

This applies to both templates and functions.

Jul 07 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
Walter wrote:

 Not my understanding. A "parameter" is the declaration for the symbol, an
 "argument" is the value passed to that symbol:
     int foo(int x);        // x is a parameter
     ...
     foo(3);                // 3 is the argument
 This applies to both templates and functions.

Ouch - this goes contrary to everything I've believed so far. But well - arguing is pointless on such an issue. I'll probably have to adjust here. Do you have a precedent for that terminology? It would be interesting to see what the C/C++ gurus used.
Jul 08 2004
parent Charles Hixson <charleshixsn earthlink.net> writes:
Norbert Nemec wrote:
 Walter wrote:
 
 
Not my understanding. A "parameter" is the declaration for the symbol, an
"argument" is the value passed to that symbol:
    int foo(int x);        // x is a parameter
    ...
    foo(3);                // 3 is the argument
This applies to both templates and functions.

Ouch - this goes contrary to everything I've believed so far. But well - arguing is pointless on such an issue. I'll probably have to adjust here. Do you have a precedent for that terminology? It would be interesting to see what the C/C++ gurus used.

That's the way my instructors used the terms. I think that they were picking up the usage from the Algol people, but I was never sure.
Jul 08 2004
prev sibling parent Arcane Jill <Arcane_member pathlink.com> writes:
In article <cchipu$1oof$1 digitaldaemon.com>, Norbert Nemec says...

First, the section "Function Parameters" should be renamed to "Function
Arguments". Functions have arguments, templates have parameters. That is
convention in C++ and I see little reason to mix this up in D. It becomes
especially important for function templates which have both.

I got news for you, dude. I've been alive for longer than templates have existed. In fact, I've been alive for longer than C++, and even C, have existed. And I can tell you from personal memory that the word "parameter" existed before the advent of templates, C++, and C. And, according to my recollection, the definition of the word has not changed over time.

I've never had to put my finger on a precise definition before, but I'd say Walter's description agrees with my usage.

Arcane Jill
Jul 08 2004
prev sibling parent Regan Heath <regan netwin.co.nz> writes:
On Wed, 7 Jul 2004 10:24:08 -0700, Walter <newshound digitalmars.com> 
wrote:
 "Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
 news:ccdu2u$2cjg$1 digitaldaemon.com...
 What you say is very true: we have a problem there that needs to be solved.
 Anyhow, I fear, documenting the current behavior is all we can do:

 Regulating what the routine is allowed to do with the referenced data would
 be equal to going for const references, which you should not even mention
 if you want to live in peace with Walter....

 In any case, one should be aware that the behavior of static arrays is
 defined even less. In some cases they have plain value semantics, in others
 they mimic the behavior of dynamic arrays being handled by reference. My
 preference would be to make a clear cut here and say: static arrays have
 value semantics throughout, dynamic arrays have reference semantics (unless
 you are using the magic of array/vector expressions, but that's a different
 topic)

 There is some fundamental asymmetry between the two kinds of arrays anyway,
 so why not make it complete instead of trying to cover it up?

All it comes down to is understanding the difference between a reference type and a value type. This isn't a problem unique to D, it happens in C++, Java, C#, Javascript, etc. Once this difference is understood, the semantics fall out in a straightforward manner and shouldn't cause confusion. Arrays and classes are reference types. Structs and scalars are value types.

I realise this, but because they are different, the same template behaves differently depending on which kind of type you use it with, e.g.

    class A {
        int val;
    }

    struct B {
        int val;
    }

    template Foo(T) {
        void Foo(T a) {
            a.val = 1;
        }
    }

    void main() {
        A a = new A();
        B b;

        Foo!(A)(a);
        Foo!(B)(b);

        //at this point a.val == 1
        printf("%d\n",a.val);
        //at this point b.val == 0
        printf("%d\n",b.val);
    }

Of course you *can* change the template to

    template Foo(T) {
        void Foo(inout T a) {
            a.val = 1;
        }
    }

and now it works.

If 'in' enforced 'const', the compiler would have given an error as soon as I tried to use the original template with a 'value' type.

I think the basic problem is consistency, and what 'in' (and other type modifiers) apply to. In the case of a 'value' type, 'in' applies to the contents of that type, yet in the case of a 'reference' type, 'in' applies to the reference itself.

The same is true for 'const', e.g.

    const char[10] foo;

The 'reference' foo is const, the data is not. Furthermore, it seems to have no effect on a 'value' type, e.g.

    const B global;

    //at this point global.val == 0
    global.val = 5;
    //at this point global.val == 5

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
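
The template example above can be re-expressed for a present-day D compiler roughly as below. This is a sketch under that assumption, not the code from the post: it uses function-template shorthand, `ref` in place of the 2004 `inout`, and the hypothetical names `setVal` and `setValRef`.

```d
class A { int val; }
struct B { int val; }

// By-value parameter: a class argument is a reference to the caller's
// object, while a struct argument is a copy of the caller's data.
void setVal(T)(T a)
{
    a.val = 1;
}

// By-reference parameter: the caller's variable is updated for both kinds.
void setValRef(T)(ref T a)
{
    a.val = 1;
}

void main()
{
    auto a = new A;
    B b;

    setVal(a);
    setVal(b);
    assert(a.val == 1);   // the object behind the reference changed
    assert(b.val == 0);   // only the callee's copy changed

    setValRef(b);
    assert(b.val == 1);   // now the caller's struct changed too
}
```

Note that `const` in current D is transitive and does apply to value types, so an assignment like the `global.val = 5` line in the post is rejected at compile time there.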
Jul 07 2004