www.digitalmars.com         C & C++   DMDScript  

D - passing arrays as "in" parameter with surprising results

"Sandor Hojtsy" <hojtsy index.hu> writes:
Arrays are passed neither by value nor by reference.

Let's consider:

void fn(int[] a)
{
  a[0] = 1;
  a.length = 3;
  a[1] = 2;
  printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

int main()
{
  int[] t;
  t.length = 2;
  fn(t);
  printf("t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  return 0;
}

If arrays were passed by value (as an int) this would print:
a.length = 3, a[0] = 1, a[1] = 2
t.length = 2, t[0] = 0, t[1] = 0
(Original array unchanged)

If arrays were passed by reference (as an Object) this would print:
a.length = 3, a[0] = 1, a[1] = 2
t.length = 3, t[0] = 1, t[1] = 2
(Original array is the same)

But actually it prints either

a.length = 3, a[0] = 1, a[1] = 2
t.length = 2, t[0] = 1, t[1] = 0

if the array cannot be resized in place, *or*

a.length = 3, a[0] = 1, a[1] = 2
t.length = 2, t[0] = 1, t[1] = 2

if there is enough memory to resize the array in place. Which of the two you get is undefined behaviour.

So arrays are passed neither by value nor by reference.
Some changes to the array are incorporated into the original array, and some
are not.
They are "passed by array reference", and using that correctly requires a deep
understanding of the low-level implementation of arrays.
I understand that this is a result of the current (fast) implementation of
arrays, but the result is *unacceptable*.
One easy solution would be to disallow passing arrays as "in" parameters and
require "inout". (I still like the "ref" keyword better than "inout".)
But using "inout" parameters is slower, isn't it? It is "just another level
of indirection". So an effective solution would include redesigning the
low-level array implementation.

Yours,
Sandor
Oct 07 2002
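
A minimal sketch of why an "in" array behaves this way, assuming a current D compiler rather than DMD 0.44. The struct IntArrayRef is hypothetical: it only models an array reference as a (length, pointer) pair that is copied into the callee, and it shrinks the length instead of growing it because the model cannot reallocate.

```d
import std.stdio;

// Hypothetical model of a dynamic array reference: a (length, pointer) pair.
struct IntArrayRef
{
    size_t length;
    int* ptr;
}

// The pair is copied into the callee, but both copies point at the same elements.
void fn(IntArrayRef a)
{
    a.ptr[0] = 1;   // visible to the caller: the element storage is shared
    a.length = 1;   // not visible: only the callee's copy of the pair changes
}

void main()
{
    int[2] storage = [0, 0];
    auto t = IntArrayRef(storage.length, storage.ptr);
    fn(t);
    writeln(t.length, " ", t.ptr[0]);  // prints "2 1"
    assert(t.length == 2 && storage[0] == 1);
}
```

The element write reaches the caller while the length change does not, which is the same split the first post describes.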
Evan McClanahan <evan dontSPAMaltarinteractive.com> writes:
Sandor Hojtsy wrote:
 So arrays are not passed by value, nor by reference.
 Some changes to the array are incorporated into the original array, and some
 are not.
 They are using "passing by array reference", using which needs a deep
 understanding of the low-level implementation of the arrays.
 I understand that this is a result of the current (fast) implementation of
 arrays, but the result is *unacceptable*.
 One easy solution would be to disallow passing arrays as "in" parameters and
 require "inout". (I still like the "ref" keyword better than "inout")
 But using "inout" parameters is slower, isn't it? It is "just another level
 of indirection". So an effective solution would include redesigning the
 low-level array implementation.

Although I disagree with the 'ref' keyword, as I think that 'inout' fits the in/out/inout semantics better, I think that it's weird that you can resize an array that's passed as an 'in' parameter at all. Shouldn't changing the values or resizing be an error? I thought that this was the point of the 'in' keyword in the first place.

Evan
Oct 07 2002
"Sandor Hojtsy" <hojtsy index.hu> writes:
"Evan McClanahan" <evan dontSPAMaltarinteractive.com> wrote in message
news:anrq9c$hpc$1 digitaldaemon.com...
Although I disagree with the 'ref' keyword, as I think that 'inout' fits the in/out/inout semantics better,

You mean it rhymes? Is that important?
 I think that it's weird that you can
 resize an array that's passed as an 'in' parameter at all.  Shouldn't
 changing the values or resizing be an error?   I thought that this was
 the point of the 'in' keyword in the first place.

It was not. "In" means "pass a copy of the value to the function". For "primitive types" (Java jargon), this means copying the value; for objects, it means copying the reference. For arrays it means some mystic in-between.

void fn(in int a, in Object o)
{
  a = 3;
  o = null;
}

int main()
{
  int b = 2;
  Object o = new Object();
  fn(b, o);
  // b and o are unchanged here
  return 0;
}

Sandor
Oct 07 2002
"Mike Wynn" <mike.wynn l8night.co.uk> writes:
I would check your D compiler.
There is still an issue with arrays passed as 'in', because the length is
stored (AFAIK) within the "array type" variable, not within the array
"object". So the behaviour is similar to C or Java, where arrays are always
passed by reference.

I modified your code and used DMD 0.44:

void fn(int[] a){
  a[0] = 4;
  a[1] = 4;
  printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnex(int[] a){
  a[0] = 1;
  a.length = 3;
  a[1] = 2;
  printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnref(inout int[] a){
  a[0] = 4;
  a[1] = 4;
  printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnexref(inout int[] a){
  a[0] = 1;
  a.length = 3;
  a[1] = 2;
  printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

int main(){
  int[] t;
  t.length = 2;
  t[0] = 0;
  t[1] = 1;
  printf("initial t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fn(t);
  printf("after call fn(int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnex(t);
  printf("after call fnex(int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  t.length = 2;
  t[0] = 0;
  t[1] = 1;
  printf("reset t t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnref(t);
  printf("after call fn(inout int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnexref(t);
  printf("after call fnex(inout int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  return 0;
}

This gives the following output:

initial t.length = 2, t[0] = 0, t[1] = 1
a.length = 2, a[0] = 4, a[1] = 4
after call fn(int[]a) t.length = 2, t[0] = 4, t[1] = 4
a.length = 3, a[0] = 1, a[1] = 2
after call fnex(int[]a) t.length = 2, t[0] = 1, t[1] = 2
reset t t.length = 2, t[0] = 0, t[1] = 1
a.length = 2, a[0] = 4, a[1] = 4
after call fn(inout int[]a) t.length = 2, t[0] = 4, t[1] = 4
a.length = 3, a[0] = 1, a[1] = 2
after call fnex(inout int[]a) t.length = 3, t[0] = 1, t[1] = 2

As you can see, inout does work: it modifies both the array contents and the
length. It is just 'in' that behaves oddly: it can modify the array contents
but not the length.

IMHO an array passed as 'in' should do copy-on-write (decided at compile time),
so that read-only access stays fast; if there are any writes, the compiler
inserts `a = a.dup;` in the function prolog. That would give both speed and
pass-by-value semantics.

(off to try some more tests)..

Mike.
Oct 07 2002
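
The copy-on-write proposal above can be approximated by hand. This is only a sketch, assuming a current D compiler; fnCopy is a hypothetical name, and the compiler does not insert the `.dup` for you.

```d
import std.stdio;

// Hand-written version of the proposal: copy on entry because the body writes to a.
void fnCopy(int[] a)
{
    a = a.dup;        // private copy; the caller's elements are no longer shared
    a[0] = 1;
    a.length = 3;
    a[1] = 2;
    writeln(a);       // prints [1, 2, 0]
}

void main()
{
    int[] t = [0, 0];
    fnCopy(t);
    writeln(t);       // prints [0, 0]
    assert(t == [0, 0]);
}
```

With the copy made up front, neither the element writes nor the resize can reach the caller, at the cost of one allocation per call.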
"Sandor Hojtsy" <hojtsy index.hu> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:anrrdt$iup$1 digitaldaemon.com...
 I would check you D compiler ?

Do you mean version? 0.44
 there is still an issue with arrays passed as 'in' because the length is
 stored (AFAIK) within the "array type" variable not within the array
 "object". so the behaviour is similar to C or Java where arrays are passed
 by reference always.

 I modified you code .... and use DMD 0.44

....
 as you see inout does work, it modifies both the array and the length, it is
 just 'in' that has odd behaviour

Yes. "inout" works as "inout" should, but it is slower than "in". As for the "in" parameter, I would not say "odd" but rather "broken".
 it can modify the array contents but not the length.

Not really. If you resize the array, it may or may not be reallocated to another location in memory. So modifications made after the resize may or may not be applied to the original array. That is called Undefined Behaviour. And is the length not a full-fledged property of the array? Do you consider this behaviour logical? If you don't do anything to correct this, at least disallow passing arrays as "in".
 IMHO an array passed as 'in' should do copy-on-write (at compile time) so
 read only access is fast, if there are any writes then the compiler does `a
 = a.dup;` in the method prolog. which would give both speed and pass by
 value semantics.

Fine with me. But I already hear the arguments about dummy programmers passing all arrays by copy, and wasting CPU time.
Oct 07 2002
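
Whether a resize moves the array can be observed directly by comparing the data pointer before and after. A small sketch, assuming a current D compiler; growsInPlace is a hypothetical helper, and its result depends on the allocator, so no particular outcome should be relied on.

```d
import std.stdio;

// Returns true if growing the callee's slice left it pointing at the caller's storage.
bool growsInPlace(int[] a, size_t newLength)
{
    auto before = a.ptr;
    a.length = newLength;   // may extend in place, or reallocate and copy
    return a.ptr is before;
}

void main()
{
    auto t = new int[2];
    bool inPlace = growsInPlace(t, 3);
    writeln("resized in place: ", inPlace);
    assert(t.length == 2);  // the caller's length never changes either way
}
```

If the helper returns false, any writes made through `a` after the resize land in the new block and never reach `t`.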
"Mike Wynn" <mike.wynn l8night.co.uk> writes:
 IMHO an array passed as 'in' should do copy-on-write (at compile time) so
 read only access is fast, if there are any writes then the compiler does `a
 = a.dup;` in the method prolog. which would give both speed and pass by
 value semantics.

Fine with me. But I already hear the arguments about dummy programmers passing all arrays by copy, and wasting CPU time.

Having to consider the performance effects of an action is part of programming in any language; as long as the effects are known, I see no problem.

I agree that arrays passed as 'in' are not just odd but broken.

I find the following nasty too:
a dynamic array has a length that I can affect but a capacity that I cannot.
you can't have a zero-length array.
null dynamic arrays can be added to.
stack-allocated static arrays can be passed as mutable dynamic arrays; as a side effect, the memory they reference can end up being referenced from a heap item or an earlier stack frame.

Mike.
Oct 08 2002
"Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:anudkf$1oal$1 digitaldaemon.com...
 I find the following  nasty too
 a dynamic array has a length that I can effect but a capacity that I can
 not.

Correct.
 you can't have a zero length array,

Sure you can.
 null dynamic arrays can be added to.

A null dynamic array is just one with 0 length.
 stack allocated static arrays can be passed as mutable dynamic arrays as a
 side effect the memory they reference can end up being referenced from a
 heap item or a earlier stack frame.

Yes, arrays are passed by reference, like class objects are. Perhaps the solution is to simply disallow resizing of an array passed as 'in'.
Oct 09 2002
"Mike Wynn" <mike.wynn l8night.co.uk> writes:
 you can't have a zero length array,

Sure you can.
 null dynamic arrays can be added to.

A null dynamic array is just one with 0 length.

Sometimes I want to return 'null' to mean there is no list, and new item[0] to mean there is a list but it's empty.

I remember some heated debates about malloc and what it should return if you call it with 0, whilst I was working on a Java VM. The eventual outcome was that you should not be able to call malloc(0) (that's an error), but you should be able to create a 0-length collection.

Especially as D heap-allocated arrays are dynamic, and since I believe that you should be able to set the available length as well as the number of actual entries in the array, it makes sense (to me) that calling new type[0] should allocate space for future expansion and not return null (or something that is === null).

Mike.
Oct 09 2002
"Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:ao26ps$2kuu$1 digitaldaemon.com...
 you can't have a zero length array,

Sure you can.
 null dynamic arrays can be added to.

A null dynamic array is just one with 0 length.

 want to return 'null' no list and new item[0] there is a list but its
 empty.
 I remember some heated debates about malloc and what it should return if
 you call it with 0, whilst I was working on a Java VM. the eventual outcome
 was that you should not be able to call malloc(0) thats an error but you
 should be able to create a 0 length collection.

 especially as D heap allocated arrays are dynamic, and that I believe that
 you should be able to set the available length as well as the number of
 actual entries in the array
 it makes sense (to me) that calling new type[0]  should be allocating space
 for future expansion and not returning null. (or something that is ===
 null )

Interestingly, I think the need to worry about the difference between 0 length and null just goes away with D arrays, i.e. they are just an artifact of how C does things.
Oct 09 2002
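
A short sketch of the null versus zero-length point, assuming a current D compiler (DMD 0.44 may have differed in details). It only checks properties that follow from an array reference being a (length, pointer) pair.

```d
void main()
{
    int[] a;                      // default-initialized: a null array
    assert(a is null);
    assert(a.length == 0);        // a null array simply has length 0

    a ~= 42;                      // appending to a null array allocates storage
    assert(a.length == 1 && a[0] == 42);

    int[] empty = a[0 .. 0];      // zero-length slice of existing storage
    assert(empty.length == 0);
    assert(empty !is null);       // empty, but its pointer is not null
}
```

So a distinction between "null" and "empty but allocated" does still exist at the pointer level, even though code that only looks at `.length` never notices it.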
"Sean L. Palmer" <seanpalmer directvinternet.com> writes:
They affect the speed with which you can test if something is null or not.
You can't test the length for null first because to do so you must
dereference the pointer, which if null is an error.  So to be safe, first
you must test the pointer then the length for 0, and only then do you have
enough information to know if you should start the loop.

I think there's still plenty of difference between null and an empty array.
I'm sure they'd become apparent in the implementation at least.  It could
affect performance of iteration features.

I really think in parameters should be thoroughly immutable to the point of
not being able to be passed as mutable array parameters themselves (to do so
you must make a copy).  This feature limits references to array entries
explicitly requested and thus can be tracked and contained by the compiler.
Otherwise there's no way you can easily enforce a contract that clients not
misuse your data.  The contract has to be visible in the language somehow,
either implicit or explicit.  Then the callee has to guarantee not to write
to its parms in order to be able to take parms which are constant or thru
refs to read-only data.  Otherwise we'll just end up with an uncontrolled
mess.

If you want to ensure that your client does not abuse your data, typecast
your data to be const before passing it to them.  If it doesn't have the
contract that says "I won't abuse your data" then the caller has to
proactively make a copy of the constant and send that in just in case the
routine decides to mess around with the data you pass it.  No other way to
be safe.

The other way, (C++'s way) at least you can get a reasonable guarantee which
is usually enforced pretty well by the compiler, that something won't misuse
your data, without having to resort to expensive runtime data duplication.

Does that qualify as an optimization, or not?  It is, but only when the data
is in fact unwritable and you care enough about it to make that copy.  Which
isn't all that often in practice.  All nuts and bolts and hoses and adaptor
sockets.  Usually you hook things up right (but that damn 5% when you get it
wrong and have to figure out what went wrong!)  That's when a tool like
const becomes handy.  It lets you track down overwrite bugs.  Hell it lets
you prevent them.

Sean
Oct 09 2002
"Walter" <walter digitalmars.com> writes:
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message
news:ao37ep$ine$1 digitaldaemon.com...
 They affect the speed with which you can test if something is null or not.
 You can't test the length for null first because to do so you must
 dereference the pointer, which if null is an error.  So to be safe, first
 you must test the pointer then the length for 0, and only then do you have
 enough information to know if you should start the loop.

This is incorrect. A reference to an array is a pair, one is the length and the other is the pointer. The length can therefore be tested without dereferencing any pointer.
 I really think in parameters should be thoroughly immutable to the point of
 not being able to be passed as mutable array parameters themselves (to do so
 you must make a copy).  This feature limits references to array entries
 explicitly requested and thus can be tracked and contained by the compiler.
 Otherwise there's no way you can easily enforce a contract that clients not
 misuse your data.  The contract has to be visible in the language somehow,
 either implicit or explicit.  Then the callee has to guarantee not to write
 to its parms in order to be able to take parms which are constant or thru
 refs to read-only data.  Otherwise we'll just end up with an uncontrolled
 mess.

I think what you're arguing for is something like C++'s const& type modifier.
 If you want to ensure that your client does not abuse your data, typecast
 your data to be const before passing it to them.  If it doesn't have the
 contract that says "I won't abuse your data" then the caller has to
 proactively make a copy of the constant and send that in just in case the
 routine decides to mess around with the data you pass it.  No other way to
 be safe.

 The other way, (C++'s way) at least you can get a reasonable guarantee which
 is usually enforced pretty well by the compiler, that something won't misuse
 your data, without having to resort to expensive runtime data duplication.

 Does that qualify as an optimization, or not?  It is, but only when the data
 is in fact unwritable and you care enough about it to make that copy.  Which
 isn't all that often in practice.  All nuts and bolts and hoses and adaptor
 sockets.  Usually you hook things up right (but that damn 5% when you get it
 wrong and have to figure out what went wrong!)  That's when a tool like
 const becomes handy.  It lets you track down overwrite bugs.  Hell it lets
 you prevent them.

I understand why you want const parameters and I know you rely on it to find bugs, but I am less convinced of its payback/cost.
Oct 10 2002
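
The point that the length can be tested without dereferencing any pointer can be sketched as follows, assuming a current D compiler; sum is a hypothetical helper used only for illustration.

```d
import std.stdio;

int sum(const(int)[] a)
{
    int total = 0;
    // a.length is read from the (length, pointer) pair itself,
    // so this loop is safe even when a is a null array.
    for (size_t i = 0; i < a.length; ++i)
        total += a[i];
    return total;
}

void main()
{
    int[] empty;                  // null array: pointer is null, length is 0
    assert(empty.ptr is null);
    assert(sum(empty) == 0);      // no separate null test needed
    writeln(sum([1, 2, 3]));      // prints 6
}
```

A single length comparison therefore covers both the null case and the empty case before the loop starts.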
"Sean L. Palmer" <seanpalmer directvinternet.com> writes:
So how do you think you can pass a constant array to a function?  The
function could modify the constant and the language has no way of preventing
it or even warning you about it.

It might prevent you from calling any function with the constant as an inout
or out parameter.  But if the contents of 'in' arrays are mutable this still
leaves gaping safety holes.

Sean


"Walter" <walter digitalmars.com> wrote in message
news:ao47q4$1s5j$1 digitaldaemon.com...
 "Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message
 news:ao37ep$ine$1 digitaldaemon.com...
 They affect the speed with which you can test if something is null or not.
 You can't test the length for null first because to do so you must
 dereference the pointer, which if null is an error.  So to be safe, first
 you must test the pointer then the length for 0, and only then do you have
 enough information to know if you should start the loop.

This is incorrect. A reference to an array is a pair, one is the length and
the other is the pointer. The length can therefore be tested without
dereferencing any pointer.

You are right.
Oct 10 2002
Mark Evans <Mark_member pathlink.com> writes:
As far as I understand the conversation "Re: passing arrays as "in" parameter
with surprising results," Sean makes good observations.  To me 'in' means that
data is immutable by the callee.  If the usage is 'inout', then it should be
specified as such.

Comparisons with C++ const qualifiers are probably out of place.  All we need is
clarity on the meanings of 'in', 'out', and 'inout' as D contracts.  Whatever
'in' means, it should mean for *all* data types, including arrays.

Possibly another type of contract is not yet covered by D.  In numeric work one
designs classes that know how to circulate array pointers from hand to hand with
certain conventions for data ownership.  In D, these conventions should be
contracts.

So a function might create an array, then pass it to another function with the
expectation that the callee will assume ownership.  Implicit with ownership is
the ability to change or dispose of data.

In such cases I don't know that any type of D contract works.  The contract is
not 'inout' because the data never comes back to the caller, and it's not 'in'
because the data is mutable in the callee.  Maybe D needs a new contract
qualifier, 'transfer', to signal this type of contract.

Mark
Oct 10 2002
Mark Evans <Mark_member pathlink.com> writes:
Sandor wrote:
"In" means "pass a copy of the value to the function".
For "primitive types" (Java jargon), this means copy the value, for objects,
it means copy the reference.
For arrays it means some mystic in-between.

The "mystic in-between" shows that D contract semantics are not yet well-defined. Implementation details are unimportant until contract definitions are clear.

'In' should mean that the callee has access to the data, but cannot change it. Whether access comes from copied data, pointers, or references is irrelevant to the contract. Access mechanisms are mere implementation details.

Contracts are absent in C/C++ so they merit discussion. Attention to date has instead focused on mappings from implementations to contracts. This focus is misguided precisely because contracts do not exist in C/C++. What we'll end up with are D contracts isomorphic to C++ constructs, rendering D contracts redundant. D wants to offer something new that C++ never did.

One contract may employ many implementations to cover different data types. The commonality between them should be to yield the same contract semantics. It is OK that 'in' employs different access mechanisms for different data types; it is not OK that 'in' means different contracts for different data types.

Mark
Oct 10 2002
Mark Evans <Mark_member pathlink.com> writes:
An example of the 'transfer' contract at work might be ROME:

"ROME was designed to manage high speed data streams within a multimedia
environment....To ensure a high throughput with minimal overhead ROME provides a
zero copy architecture where pointer references to data are passed around
instead of data being copied. The goal of this approach is to maximize the
utilization of a given hardware configuration."

http://rome.sourceforge.net/
Oct 10 2002
"Sean L. Palmer" <seanpalmer directvinternet.com> writes:
donate or gift perhaps.  Or send.  Maybe send is the default.  And would it
send a copy, or the original?

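// 'gift' is a proposed parameter qualifier, not an existing D keyword.
void DestroyForever(gift object theobj)
{
    delete theobj;
}

class Keeper
{
    public void Keep(gift object theobj)
    {
        assert(myobj == null);  // must not already be holding an object
        myobj = theobj;
    }

    public gift object GiveBack()
    {
        assert(myobj);          // must have something to give back
        object temp = myobj;
        myobj = null;
        return temp;
    }

    ~this()
    {
        delete myobj;
    }

    private object myobj;
}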

But with GC you don't exactly need ownership to be controlled or even
monitored.  I think it would be more efficient to monitor it, myself.

Sean
Oct 11 2002
Mark Evans <Mark_member pathlink.com> writes:
Sean L. Palmer wrote:
 And would it send a copy, or the original?

The original. Copying violates the definition of the contract, which says that a blob (array/class instance/struct) is being transferred from one owner to another.

Think of an array as an input to some processing chain. Various objects get a hold of it and process it. Each hands it off to the next processing object in the chain. A transfer contract would be ideal for such applications.

Usually there are no dedicated give/keep functions but simply conventions on particular functions. "If you call me with an array, I get to keep it, and you no longer own it." This is exactly the kind of convention that D seeks to embody in its contracts. In C++ the best you can do is document the convention in the code comments and hope that users follow it properly.

Mark
Oct 11 2002
"Walter" <walter digitalmars.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:ao4mqc$2bgj$1 digitaldaemon.com...
 As far as I understand the conversation "Re: passing arrays as "in" parameter
 with surprising results," Sean makes good observations.  To me 'in' means that
 data is immutable by the callee.  If the usage is 'inout', then it should be
 specified as such.

Depends on how you think about it. In D, 'in' means the parameter gets a copy of the value, and so cannot change the original. Since arrays are passed by reference, indeed, the caller's *reference* cannot change, but what it refers to can. Think about it like passing a reference to a class object. You can't change the caller's reference, but you can certainly change the data in the class object.
 Comparisons with C++ const qualifiers are probably out of place.  All we

 clarity on the meanings of 'in', 'out', and 'inout' as D contracts.

 'in' means, it should mean for *all* data types, including arrays.

There's no way to avoid dealing with the different semantics between by reference and by value. Arrays and classes are always by reference. Structs and scalars are always by value. 'out' and 'inout' add another reference layer on top of that.
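
To make the distinction above concrete, here is a minimal sketch in present-day D syntax. It is an illustration, not code from the thread: the names viaIn and viaRef are made up, and `ref` is used where the 2002 compiler spelled the qualifier 'inout'. The callee allocates a fresh array with `new` instead of assigning to .length, so the outcome does not depend on whether a resize happens in place.

```d
import std.stdio;

// Plain array parameter: the callee receives its own copy of the
// (length, pointer) pair, but both copies refer to the same elements.
void viaIn(int[] a)
{
    a[0] = 1;        // visible to the caller: the element storage is shared
    a = new int[3];  // rebinds only the callee's copy of the reference
    a[1] = 2;        // written into the new block, invisible to the caller
}

// 'ref' parameter (hypothetical modern stand-in for the thread's 'inout'):
// the callee works on the caller's own array variable.
void viaRef(ref int[] a)
{
    a[0] = 1;        // written into the old block, which is then dropped
    a = new int[3];  // the caller's variable now refers to the new block
    a[1] = 2;
}

void main()
{
    int[] t = new int[2];
    viaIn(t);
    writeln(t);      // [1, 0]

    int[] u = new int[2];
    viaRef(u);
    writeln(u);      // [0, 2, 0]
}
```

The first call shows the behaviour described above: the data can change, the caller's reference cannot.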
 Possibly another type of contract is not yet covered by D.  In numeric

 designs classes that know how to circulate array pointers from hand to

 certain conventions for data ownership.  In D, these conventions should be
 contracts.

 So a function might create an array, then pass it to another function with

 expectation that the callee will assume ownership.  Implicit with

 the ability to change or dispose of data.

 In such cases I don't know that any type of D contract works.  The

 not 'inout' because the data never comes back to the caller, and it's not

 because the data is mutable in the callee.  Maybe D needs a new contract
 qualifier, 'transfer', to signal this type of contract.

You raise an excellent point. The quality "who owns this value" is a major programming problem in C++ with memory management, as it is needed to decide who must and who cannot free the memory. D avoids that problem by using garbage collection, which means the programmer does not have to keep track of ownership. The way to deal with other kinds of ownership of data is to follow the "copy on write" rule. For example, a char[] strupr(in char[] s) function would simply return s if the array is already upper cased. If the array is not upper cased, then a copy of s is made, upper cased, and returned. I'm not sure how to make contracts for that, as it is a programming technique not a syntactical feature.
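
A minimal sketch of the "copy on write" rule described above, written for a current D compiler. It is not Walter's implementation: it handles ASCII letters only, and it drops the 'in' qualifier from the signature because in present-day D 'in' implies const, which would prevent returning s as a mutable char[].

```d
import std.ascii : isLower, toUpper;

// Copy on write: return the argument untouched when nothing needs to
// change, and duplicate it only at the first character that must change.
char[] strupr(char[] s)
{
    foreach (i, c; s)
    {
        if (isLower(c))
        {
            char[] r = s.dup;                  // first write: copy now
            foreach (ref ch; r[i .. $])
                ch = cast(char) toUpper(ch);   // non-letters pass through
            return r;
        }
    }
    return s;                                  // already upper case: no copy
}

void main()
{
    char[] a = "HELLO".dup;
    char[] b = "Hello".dup;

    assert(strupr(a) is a);      // same array handed back, no allocation

    char[] c = strupr(b);
    assert(c == "HELLO");        // the copy is upper cased
    assert(b == "Hello");        // the caller's data is untouched
}
```

Nothing in the signature tells the caller which of the two cases happened, which is the difficulty with turning the rule into a contract.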
Oct 11 2002
next sibling parent reply "Mike Wynn" <mike.wynn l8night.co.uk> writes:
 Depends on how you think about it. In D, 'in' means the parameter gets a
 copy of the value, and so cannot change the original. Since arrays are
 passed by reference, indeed, the caller's *reference* cannot change, but
 what it refers to can. Think about it like passing a reference to a class
 object. You can't change the caller's reference, but you can certainly
 change the data in the class object.

What is in the conceptual 'array' object intended to be?  It would seem better
if the conceptual 'array' *object* were { uint length; T * data; } as opposed
to the current { T data[fixed]; }.

IMHO: it would be nice to have a language defined by its semantics and not its
implementation.

Mike.
Oct 12 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:ao8u15$raq$1 digitaldaemon.com...
 IMHO : it would be nice to have a language defined by its semantics and not
 its implementation.

Yes, but then a large risk is run of having some nitpick in the semantics putting huge burdens on the resulting code (such as Java's 64 bit floating point precision that just doesn't sit right with the Intel FPU).
Nov 05 2002
parent reply "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Walter" <walter digitalmars.com> wrote in message
news:aq915r$2iln$2 digitaldaemon.com...
 "Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
 news:ao8u15$raq$1 digitaldaemon.com...
 IMHO : it would be nice to have a language defined by its semantics and not
 its implementation.

Yes, but then a large risk is run of having some nitpick in the semantics putting huge burdens on the resulting code (such as Java's 64 bit floating point precision that just doesn't sit right with the Intel FPU).

convinced that it was the right thing to do, especially with Java, which is
intended as "write once run anywhere" (tm).  Once you allow float and double
operations to be performed at precisions greater than float or double, your
code may give different results on different architectures.

In C an int is defined as the most efficient size for the platform, a short is
no longer than an int, and a long no shorter than an int.  Many languages and
systems (Java, C#, JavaVM, intent, clr) have fixed the sizes to short 16 bits,
int 32 bits and long 64 bits.  intent, like Java, has IEEE 754 32 bit floats
and 64 bit doubles (and intent also has a 16:16 fixed); I can't find any info
on intermediate values.

I believe that systems such as intent (www.tao.co.uk) and Java have shown their
worth on embedded systems because they do have such strict semantics.  As a
developer you know not only that your code will run the same on a different
system, but that it will run on tomorrow's and not just today's architectures.

This is also true for desktop systems.  PS2s have Linux; will D ever run on
that, or on Cobalt Qubes, Netwinders, PowerMacs or Sun Sparc boxes?

I believe that the "cost" of imposing strict semantics is outweighed by the
benefits.

Will D support the AMD x86-64 architecture, which supports IEEE-754 floating
point numbers?
Nov 06 2002
next sibling parent Mac Reiter <Mac_member pathlink.com> writes:
Somehow, it appears that this thread has gotten away from "Array ownership", so
I renamed my post...

I believe that both answers are right.  Even more than that, I believe that if
you restrict yourself to either answer alone, it is wrong.  Sometimes it doesn't
matter how fast you can calculate the answer, because if you get different
answers on different machines then the answer is "wrong".  For this kind of
problem, you need rigidly defined types, that follow a specific behavior on all
platforms.  Sometimes it doesn't matter how correct your answer is, because if
you can't calculate it fast enough then the answer is "wrong" (games are a
strong contender here -- the answer only has to be "good enough", but it
definitely has to be "fast enough").  For these applications, you need native
support.

I think the intrinsic types should contain things like:

fast16 - the fastest type of native integral value that holds at least 16 bits
tight16 - the smallest type of native integral value that holds at least 16 bits
exact16 - an exactly 16 bit integer, even if it requires software munging
(useful for HW overlays, file structures, network protocols -- painfully slow,
but sometimes you don't care)

To ease compiler implementation, not all compilers would have to implement all
versions.  I suspect that it could rapidly become a differentiating feature,
once multiple compilers were available for a platform.  I think I would require
the 'fast' and 'tight' versions, since all they require is a choice at the time
the compiler is written -- no additional code is necessary.  'exact' would
require additional coding inside the compiler, so it might be an optional
feature.  (Having said all of that, I *HATE* optional features, because somehow
they never seem to get implemented...)

I would actually prefer:
fast<16>, tight<16>, and so on.  That way, I can smoothly progress from built in
types to bignum routines, with no change to my syntax except to pick a bigger
number of bits.  And if I have to use numbers that currently require bignums
(say, a fast<200>), but somebody eventually comes out with a 256bit CPU, maybe
my code speeds up by an additional order of magnitude (above the clock and other
architectural speedups), without a rewrite.
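
As a rough sketch of what a parameterized tight<N> could look like, here is a D template that picks the smallest built-in signed integer with at least the requested number of bits. The name Tight is hypothetical and only illustrates the idea; it stops at 64 bits where the proposal would fall through to bignum routines.

```d
// Hypothetical 'tight<N>': the smallest built-in signed integer type
// that holds at least `bits` bits.
template Tight(uint bits)
{
    static if (bits <= 8)       alias Tight = byte;
    else static if (bits <= 16) alias Tight = short;
    else static if (bits <= 32) alias Tight = int;
    else static if (bits <= 64) alias Tight = long;
    else static assert(0, "more than 64 bits would need a bignum type");
}

void main()
{
    Tight!16 counter = 1000;     // a short on every D platform
    assert(counter == 1000);

    static assert(is(Tight!16 == short));
    static assert(is(Tight!17 == int));
    static assert(is(Tight!40 == long));
}
```

Widening a variable is then a matter of changing the number, which is the property argued for above.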

For floats, I'm not sure what syntax would be better.  Maybe something like:
float<32,native>
float<64,ieee754> /* if that is even meaningful -- pardon my ignorance of the
754 spec */
complex<80,native> /* native if possible, simulated if necessary */

Again, this would allow me to specify any precision I wanted, potentially
incurring software routine penalties on some machines, but being able to run
natively on others.  If I want native performance more than specific answers (a
physics engine for games, versus a physics engine for engineering, for example)
then I can specify that I want "native" floating point support.

I'm not trying to pick a particular syntax (although I definitely prefer
parameterized types over any finite list of hardcoded type names).  I'm just
trying to say that both concerns are valid, and I would *REALLY* like to see a
language that could support both needs by giving the programmer final control
over how his or her program behaves, for every single variable.

And as for the syntax being readable by C programmers -- C99 already has
something similar to this.  They use the "finite list of hardcoded type names"
that irritates me, but they have realized that you need both portability and
performance, and that the programmer has to make the decision, not the compiler.
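
For reference, the C99 names line up with the three categories above roughly as fast16 = int_fast16_t, tight16 = int_least16_t and exact16 = int16_t. The sketch below reads them through D's core.stdc.stdint binding; the mapping to Mac's names is my reading of his list, and the printed widths depend on the platform, so no output is shown.

```d
import core.stdc.stdint;
import std.stdio;

void main()
{
    // "fast": the fastest native type with at least 16 bits
    writeln("int_fast16_t:  ", int_fast16_t.sizeof * 8, " bits");
    // "tight": the smallest native type with at least 16 bits
    writeln("int_least16_t: ", int_least16_t.sizeof * 8, " bits");
    // "exact": exactly 16 bits
    writeln("int16_t:       ", int16_t.sizeof * 8, " bits");

    // The guarantees behind the three names.
    static assert(int16_t.sizeof == 2);
    static assert(int_least16_t.sizeof >= 2);
    static assert(int_fast16_t.sizeof >= int_least16_t.sizeof);
}
```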


Let me clarify that last statement -- the compiler can tell which variables are
heavily used, and if it is told that the system should be optimized for speed,
then the compiler knows a *LOT* about how to tweak out the platform.  What the
compiler does *not* know, however, is the larger context of the programmer's
work.  It may be more important for the programmer to get certain precise
results -- maybe because the code is going in a homogeneous cluster, or maybe
because it is having to calculate some type of checksum that relies on oddities
of a particular size of integer, or maybe something else entirely.  But the
compiler has no way to know that, unless you provide a syntax for specifying the
difference between "fast behavior" and "exact behavior".  And once you've done
that, "tight behavior" is sort of a middle ground -- not as fast as "fast", and
not as exact as "exact", but still useful for systems that are both time *and*
space constrained.

To continue with the readability issue : every C/C++ programmer who has worked
on large, long-term projects, has ended up either making or using someone else's
giant list of #ifdef'd typedefs that constructs a list of INT16, UINT32, etc.
And while you *could* do it this way in D, why?  That just ends up with 20
different sets of definitions, which makes it a pain in the behind to bring
multiple libraries together.  Each library may define its own "magic" types, and
then the integrator is left with code that looks schizophrenic because it can't
make up its mind which set of types to use.  If the language gave a clear and
flexible typing system, this would not be an issue -- no one would make the
typedefs, because they wouldn't be needed.  And if the parameterized types were
the only way to get integral types, it would never even occur to anyone to think
about typedef'ing types.  fast<16> isn't that much harder to type than INT16.
If the <> is annoying, the actual syntax *could* just be fast16, fast32,
fast736, etc.  The compiler *could* work it out.  But the <> would make it a bit
more obvious that parameterization was taking place.

Just my 2 cents, which always seems to cost several dollars for me to say...
Mac

P.S. Since it's been so long since you saw my original point, let me reiterate
it for a bit of focus: "I believe that both answers (fast native types and
potentially slow exact semantic types) are right.  Even more than that, I
believe that if you restrict yourself to either answer alone, it is wrong."


Nov 06 2002
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:aqbaa0$1sq7$1 digitaldaemon.com...
 and Java's float/double semantics have changed because of that,

I didn't know that.
 I believe that systems such as intent (www.tao.co.uk) and Java have shown
 their worth on embedded systems because they do have such strict semantics.

I'm not convinced that Java's success on embedded systems is due to strict semantics. I don't have much good information on why it has succeeded in that environment - one possibility is the Java bytecode is very compact, and so one can cram a lot of functionality into add-in modules (given the existence of the VM on the base unit). Java's resistance to crashing also has a lot of appeal in embedded systems.
 As
 a developer you know not only will your code run the same on a different
 system, but will run on tomorrow's and not just today's architectures.

I know many Java developers who find the reverse is true - they cannot predict which VM will be running their code, and different VMs have different implementations, behavior, and bugs, and so it becomes an impossible task to write code that actually will run everywhere.
 this is also true for desktop systems, PS2 have Linux, will D ever run on
 that or Cobalt Qubes, Netwinders, PowerMacs or Sun Sparc boxes ?

D running on those systems is gated by there being a D compiler for them, just as Java is gated by there being a Java VM on them. The Java VM itself requires a lot of work to port. Once D is married to the GCC code generator, it should be far easier to support a new system for D than for Java.
 I believe that the "cost" of imposing strict semantics is outweighed by the
 benefits.

For many purposes, yes, but D will have bendable semantics so that it can be efficiently implemented on a wide variety of machines. But not quite as bendable as C is (no one's complement!).
 Will D support the AMD x86-64 architecture ? which supports IEEE-754
 floating point numbers

There is no barrier to D supporting it other than time & effort to retarget the compiler. That's why the GCC version of D is so important, it opens the door to all those other processors and systems.
Nov 06 2002
parent reply "Mike Wynn" <mike.wynn l8night.co.uk> writes:
"Walter" <walter digitalmars.com> wrote in message
news:aqbppn$2djv$1 digitaldaemon.com...
 "Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
 news:aqbaa0$1sq7$1 digitaldaemon.com...
 and Java's float/double semantics have changed because of that,

I didn't know that.

(basically allows a VM to perform intermediate float/double ops at better than original precision)
 I believe that systems such as intent (www.tao.co.uk) and Java have shown
 their worth on embedded systems because they do have such strict semantics.

 I'm not convinced that Java's success on embedded systems is due to strict
 semantics. I don't have much good information on why it has succeeded in
 that environment - one possibility is the Java bytecode is very compact, and
 so one can cram a lot of functionality into add-in modules (given the
 existence of the VM on the base unit). Java's resistance to crashing also
 has a lot of appeal in embedded systems.

interpreters, some just in time and dynamic compilers and some are ahead of
time compilers.  So compact bytecode is only part of the reason for use; class
loading and unloading may be another, and the ability to combine ahead of time
compiled code with dynamically loaded code another.  But all of this is only
possible because there are strict rules imposed on the systems.

Java may be a bad language to compare D with, because many people merge the
Java Language and JavaVM together.  They are in fact completely separate
entities: the VM has only been changed a little (subtle changes to
invoke-special) whilst the language has been enhanced to allow inner classes
and other features (all syntactic sugar).

My comments were more about Java the language, which imposes strict semantic
behaviour (much, it is true, is inherited from the VM's behaviour), such as
when class loading and static initialisers are run, the order in which
expressions are evaluated and the atomicity of operations (although some
actions, such as writing non volatile doubles, are VM specific).  But the Java
Language can be ahead of time compiled and still conform to Sun's Java Spec
(without going to Java Bytecode first).
 As
 a developer you know not only will your code run the same on a different
 system, but will run on tomorrow's and not just today's architectures.

I know many Java developers who find the reverse is true - they cannot predict which VM will be running their code, and different VMs have different implementations, behavior, and bugs, and so it becomes an impossible task to write code that actually will run everywhere.

but on desktop Java I've only ever run afoul of deployment problems and 1.1.8
to 1.2.x compatibility issues when using Sun or MS JDKs.

I would not consider bugs in a VM a valid argument against semantics over
implementation.  As for different behaviours, that is agreeing with my argument
for D to be defined by its semantics and not its implementation.

Isn't the fact that Java, Sun's flagship for cross-platform compatibility, is
not quite what it seems a good reason to make D defined by its semantics, and
show developers that they can have a language that will perform the same on any
supported platform (within the limits of the platform; I can't see an MP3
player running on a GBA working at full speed)?

I find it odd that different VMs caused problems.  For a Java VM to be allowed
to use the Java name it must pass Sun's TCK (a huge Java test suite):
http://developer.java.sun.com/developer/technicalArticles/JCPtools/
Not wanting Sun's lawyers to come knocking on my door, I'll just say it's not
perfect :)

I have heard many complaints that the JavaVM spec was written FROM the source
code and not the other way round.  The bytecode verifier spec reads exactly as
if it is a writeup of someone's code.
 this is also true for desktop systems, PS2 have Linux, will D ever run on
 that or Cobalt Qubes, Netwinders, PowerMacs or Sun Sparc boxes ?

D running on those systems is gated by there being a D compiler for them, just as Java is gated by there being a Java VM on them. The Java VM itself requires a lot of work to port. Once D is married to the GCC code generator, it should be far easier to support a new system for D than for Java.

 I believe that the "cost" of imposing strict semantics is outweighed by the
 benefits.

 For many purposes, yes, but D will have bendable semantics so that it can be
 efficiently implemented on a wide variety of machines. But not quite as
 bendable as C is (no one's complement!).

I find it interesting that you oppose D having strict, well defined semantics,
and yet on the D overview you say it is aimed at "Numerical programmers" and
implement lots of floating point NaN and infinity behaviours, but are unwilling
to fix the float and doubles to defined standards.  Are these behaviours part
of the "bendability"?

I've been reading and rereading the D overview to try to gain an understanding
of why you might oppose semantics over implementation, and apart from low down
and dirty programming, which is not that affected (you do need an integer that
can hold a pointer), I see nothing there; in fact much of what I read says to
me D will be defined by its semantics.  (I assume it's a bit out of date as it
also says all objects are in the heap, but I thought RAII put objects onto the
stack?)

I believe that if you asked the programmers who fit into the "who is D for"
list then 95% would prefer D to be the same D on every supported platform and
to be freed from the implementation, at the expense of some platforms being
harder to support than others.
Nov 06 2002
next sibling parent "Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:aqcmgq$8qe$1 digitaldaemon.com...
 I find it interesting that you oppose D having strict well defined semantics
 and yet on the D overview you say
 it is aimed at "Numerical programmers" and implement lots of floating point
 Nan and infinity behaviours, but are unwilling to fix the float and doubles
 to defined standards. are these behaviours part of the "bendability" ?

The floating point behavior for D will conform to IEEE 754 if the underlying hardware supports it. If the underlying hardware does not, then D's semantics on that platform will have to bend. This is not as big a problem as it sounds, as non-IEEE floating point is a thing of the past, not the future. Interestingly, IEEE floating point has been on the PC for 20 years now, and VERY few languages use its capability! This failure includes nearly all C/C++ compilers (except for DMC++). DMC++ and D allow the programmer to exploit the floating point hardware (such as using IEEE comparisons, and 80 bit extendeds).
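
A few of the IEEE 754 behaviours mentioned above can be checked directly. This is a small sketch for a current D compiler on IEEE hardware; the size and precision of `real` vary by platform, so they are printed rather than asserted.

```d
import std.math : isNaN;
import std.stdio;

void main()
{
    double nan = double.nan;

    // IEEE comparisons: NaN is unordered, so every ordered comparison
    // involving it is false, and it is not equal even to itself.
    assert(nan != nan);
    assert(!(nan < 1.0) && !(nan >= 1.0));
    assert(isNaN(nan));

    // Division by zero yields infinity rather than trapping.
    double zero = 0.0;
    assert(1.0 / zero == double.infinity);

    // 'real' is the widest hardware type (the 80 bit extended on x86).
    writeln("real: ", real.sizeof, " bytes, ", real.mant_dig, " mantissa bits");
}
```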
 I've been reading and rereading the D overview to try to gain an
 understanding for why you might oppose semantics over implementation, and
 apart from low down and dirty programming, which is not that affected (you
 do need an integer that can hold a pointer) I see nothing there, in fact much
 of what I read says to me D will be defined by its semantics.
 (I assume it's a bit out of date as it also says all objects are in the
 heap, but I thought RAII put objects onto the stack ?)

RAII does not put objects on the stack, it's still done by reference.
 I believe that if you asked the programmers who fit into the "who is D for"
 list then 95% would prefer D to be the same D on every supported platform
 and to be freed from the implementation, at the expense of some platforms
 being harder to support than others.

The most popular platforms should, of course, be the ones most in line with D semantics. D also explicitly abandons 16 bit architectures in its semantics.
Nov 06 2002
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Mike, right on.  However please improve your grammar and separate your
paragraphs.  Your comments are hard to read!

Under .NET it will be interesting to see many languages targeting the *same*
virtual machine.

Particulars of the Intel chips are important, but we should recognize that, like
Windows itself, those chips are hardly the best design.  It would help to step
back and contemplate future ports to other chips.  As Mike says, distinguish the
semantics from the implementation.

The Intel chip is a piece of work.  It has about a dozen different modes, gets
more encrusted with each generation (MMX, etc.), and really amounts to a crazy
Rube Goldberg machine whose major attraction is backward compatibility all the
way back to the ancient 8088 -- in other words, something that keeps making
money for Intel.  Just because the dang chip has some weird issue that affects
language design does not invalidate the language design principle at stake.  It
just means the language is targeting a crummy chip and needs to work around it.

I see similar problems affecting D in relation to C++, but have already spouted
my opinions on that.  Backward compatibility is a huge limitation when it drives
the design, as it did with the Intel chip line.


 I'm not convinced that Java's success on embedded systems is due to strict
 semantics.


Well, at least the Java success shows that strict semantics do not spell doom, as people seem to think they will for D.
 the embedded Java VM's ... are all different ...,
 but all of this
 is only possible because there are strict rules imposed

Right.
Java may be a bad language to compare D with, because many people merge the
Java Language and JavaVM together, they are in fact completely separate

Right. Think about .NET too.
I would not consider bugs in a VM a valid argument against semantics over
implementation.

Right.
as for different behaviours that is agreeing with my argument for D to be
defined by its semantics and not its implementation.

Yes, I think so.
 D will have bendable semantics so that it can be
 efficiently implemented on a wide variety of machines. But not quite as
 bendable as C is (no one's complement!).


Please provide three examples of bendable semantics.  I do not see a conflict
between consistent semantics and efficiency.  Are you saying that good
semantics will "hide the chip," whereas D wants to "permit full access"?

When I type "double" I mean an IEEE double-precision floating point number.  If
the chip doesn't support that data type, then it really doesn't support the
language I am using.

Conversely, if there is some bizarre feature unique to a particular chip, I
would expect a good language to treat it as a bizarre feature, such that it's
not a part of the language per se, but a secondary layer of capabilities
accessible from the main language on that one platform only.

If I sound confused, maybe I am.  Please offer some examples of what you mean
to help me think it through.  Give some examples of bendable semantics helping
efficiency on two different chips (even if hypothetical ones).
in fact much
of what I read says to me D will be defined by its semantics.

Let's hope so!
I believe that if you asked the programmers who fit into the "who is D for"
list then 95% would prefer D to be the same D on every supported platform

Yes and the more platforms, the better.
Nov 06 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:aqd4pp$me2$1 digitaldaemon.com...
 D will have bendable semantics so that it can be
 efficiently implemented on a wide variety of machines. But not quite as
 bendable as C is (no one's complement!).

 Please provide three examples of bendable semantics.  I do not see a conflict
 between consistent semantics and efficiency.

Bendable semantics would be things like the size of a pointer, the size of an extended float value, and byte ordering.
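
The three properties named above can all be queried from inside a D program. A minimal sketch for a current compiler follows; the output differs from platform to platform, which is the point, so none is shown.

```d
import std.stdio;

void main()
{
    // Size of a pointer: 4 bytes on 32 bit targets, 8 on 64 bit targets.
    writeln("pointer: ", (void*).sizeof, " bytes");

    // Size and precision of the extended float type.
    writeln("real:    ", real.sizeof, " bytes, ",
            real.mant_dig, " mantissa bits");

    // Byte ordering, exposed as a predefined version identifier.
    version (LittleEndian)
        writeln("order:   little-endian");
    else version (BigEndian)
        writeln("order:   big-endian");
}
```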
Nov 10 2002
parent reply Mark Evans <Mark_member pathlink.com> writes:
Walter says...
Bendable semantics would be things like the size of a pointer, the size of
an extended float value, and byte ordering.

That stuff is not semantics, just implementation detail.  So I would merely
repeat, "As Mike says, distinguish the semantics from the implementation."

No language, not even C or D, asks the programmer to state what size pointer he
wants, what size extended float, or what (native) byte ordering.  These details
are not language semantics because no expression changes them.  You can change
them all around by porting to another platform, but that program will retain
identical semantic content.  Meaning, you run the program, and identical inputs
produce identical outputs.  The underlying representation details of inputs and
outputs have nothing to do with semantics per se.

Semantics is about the meaning of expressions.  Obviously one has to draw a
line somewhere.  To carry this implementation confusion to its bitter end, we
must include, under the heading "semantics," all voltage levels and current
flows through every CPU transistor.  By this definition the same program,
running on different Pentiums, induces different semantics, because the
underlying transistor layouts are different.

The reality is that we draw the line ("semantic domain") where it stops helping
our understanding of a program's purpose.  Granted it can be fuzzy, but most
programmers think about questions like: "Does the loop in the program above
execute at all? If it executes, does it terminate? What is the value of the
variable after the while loop?"

http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsSemantics.htm
http://www.cs.du.edu/~ramki/courses/3351/2002Autumn/notes/Lectures4and5.pdf
http://www4.informatik.tu-muenchen.de/papers/RUM98a.ps.gz

Mark
Nov 10 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:aqmr8u$21dp$1 digitaldaemon.com...
 Walter says...
Bendable semantics would be things like the size of a pointer, the size of
an extended float value, and byte ordering.

 That stuff is not semantics, just implementation detail.  So I would merely
 repeat, "As Mike says, distinguish the semantics from the implementation."

Consider that Java specifies these things (at least the byte ordering <g>).
 No language, not even C or D, asks the programmer to state what size pointer
 he wants, what size extended float, or what (native) byte ordering.  These
 details are not language semantics because no expression changes them.  You
 can change them all around by porting to another platform, but that program
 will retain identical semantic content.  Meaning, you run the program, and
 identical inputs produce identical outputs.  The underlying representation
 details of inputs and outputs have nothing to do with semantics per se.

I have plenty of experience with multiple pointer sizes, programmer specified pointer sizes, and code breaking happening when pointer sizes are assumed. <g> Unless you carefully construct your code to be independent of pointer size, it can and will break when ported. Related things will break, too, such as what integral type can hold an offset to a pointer?
 Semantics is about the meaning of expressions.  Obviously one has to draw a
 line somewhere.  To carry this implementation confusion to its bitter end, we
 must include, under the heading "semantics," all voltage levels and current
 flows through every CPU transistor.  By this definition the same program,
 running on different Pentiums, induces different semantics, because the
 underlying transistor layouts are different.

Java is a language that attempts to have no bendable semantics. It succeeds in some areas, but fails in others (notably on timing issues between threads, when gc finalization happens, etc.) Having no bendable semantics is necessary to deliver on the "write once, run everywhere" paradigm. With C code, we accept the notion that a little bit of debugging and tweaking of source will be necessary for each port to deal with inadvertent reliance on bendable semantics.
 The reality is that we draw the line ("semantic domain") where it stops
 helping our understanding of a program's purpose.  Granted it can be fuzzy,
 but most programmers think about questions like: "Does the loop in the program
 above execute at all? If it executes, does it terminate? What is the value of
 the variable after the while loop?"

 http://www.sil.org/linguistics/GlossaryOfLinguisticTerms/WhatIsSemantics.htm
 http://www.cs.du.edu/~ramki/courses/3351/2002Autumn/notes/Lectures4and5.pdf
 http://www4.informatik.tu-muenchen.de/papers/RUM98a.ps.gz

In some languages, we draw the line on semantic specification when it would put an intolerable burden on the resulting code - for example, attempting to emulate a 32 bit flat pointer model in 16 bit C code is just not worth it (some people did try it). This means, for example, that a garbage collector written for a 32 bit flat memory model will have to be scrapped and completely redone for the "far" memory model on the PC.
Nov 11 2002
parent Mark Evans <Mark_member pathlink.com> writes:
Walter says...
 "...distinguish the semantics from the implementation."

Consider that Java specifies these things (at least the byte ordering <g>).

Java byte ordering doesn't change during the execution of a program!!! This is just a configuration flag for the implementation. Semantics is stuff that makes one program logically different from another. Whether I compute x=2+2 in LSB or MSB byte-ordering is irrelevant to the logical result. The output is still x=4 in whatever representation. That is the semantic meaning of the program.
I have plenty of experience with multiple pointer sizes, programmer
specified pointer sizes, and code breaking happening when pointer sizes are
assumed. <g>

You've spent years doing 16 and 32 bit compilers in a world dominated by Microsoftisms, so take a step back and think through that fog. <g> Programmers should not assume what languages don't support. That is what we call "bad code." Even C offers sizeof(); for example sizeof(void*) would be "good code" entailing no assumption. Get a better programmer! Win16 versus Win32 was a Microsoftism, not a language feature of C. Even at that, programs were either all 16-bit or all 32-bit (though there was a transition period) and did not care about pointer size. It is possible to write good C code without knowing the byte ordering, pointer size, or anything of the sort. That is why we have compilers to deal with such mundane nonsense -- instead of writing assembly.
Unless you carefully construct your code to be independent of
pointer size

Carefully? Almost all code is independent by default. Code should be careless about such things, because they are compiler implementation issues. That's why we have compilers.
it can and will break when ported. Related things will break,
too, such as what integral type can hold an offset to a pointer?

It will break when ported for reasons having little to do with pointer sizes, and much to do with OS issues (which are *not* language issues).
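The question of which integral type can hold a pointer offset has a portable answer in C++. A minimal sketch, assuming a platform that provides the optional std::uintptr_t; the helper names item_offset and round_trips are made up for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Distance between two items of the same array, counted in items.
// std::ptrdiff_t is the type the language itself uses for pointer differences,
// so no assumption about pointer width is needed.
std::ptrdiff_t item_offset(const int* base, const int* item) {
    return item - base;
}

// Round trip of a pointer through an integer. std::uintptr_t (where the
// implementation provides it) is wide enough to hold an object pointer.
bool round_trips(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits) == p;
}
```

Code written this way never names a concrete width such as 16 or 32 bits, which is the sense in which it stays independent of pointer size.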
Java is a language that attempts to have no bendable semantics.

On the contrary, by your definition Java semantics do backflips and change colors. Presumably you include the VM implementation in your definition of semantics because that is what determines byte ordering, pointer sizes, etc. Byte orderings change, pointer sizes change, and everything still works in Java. By my definition: Java has solid semantics, and the implementations do backflips. Same semantics, varying implementations. That's true cross-platform code.
In some languages, we draw the line on semantic specification when it would
put an intolerable burden on the resulting code - for example, attempting to
emulate a 32 bit flat pointer model in 16 bit C code is just not worth it
(some people did try it). This means, for example, that a garbage collector
written for a 32 bit flat memory model will have to be scrapped and
completely redone for the "far" memory model on the PC.

The stated problem is to emulate one implementation of a language using a different implementation, a rather artificial corner case having nothing to do with semantics, and everything to do with implementation details that most people don't care about. The C language was not designed for such trickery. You'll have to use assembly for that. Thanks Walter, Mark
Nov 15 2002
prev sibling parent reply Mark Evans <Mark_member pathlink.com> writes:
Walter wrote:

Depends on how you think about it. In D, 'in' means the parameter gets a
copy of the value, and so cannot change the original. Since arrays are
passed by reference, indeed, the caller's *reference* cannot change, but
what it refers to can. Think about it like passing a reference to a class
object. You can't change the caller's reference, but you can certainly
change the data in the class object.

I find this an exceptionally confusing way to define 'in' and not a very useful one. If this is how contracts work, then they are pretty useless, conceptually.
There's no way to avoid dealing with the different semantics between by
reference and by value. Arrays and classes are always by reference. Structs
and scalars are always by value. 'out' and 'inout' add another reference
layer on top of that.

I'm not proposing avoidance, just consistency. The compiler can use call-by-reference and still disallow data changes. You are saying that just by virtue of being a reference, the corresponding data is mutable, which I don't buy. (It may be the easiest thing to implement but that doesn't make it the best arrangement.)
You raise an excellent point. The quality "who owns this value" is a major
programming problem in C++ with memory management...D avoids that problem by 
using garbage collection, which means the programmer does not have to
keep track of ownership.

Garbage collection doesn't eliminate the problem. "Who owns the value" also determines who prevents the garbage collector from getting rid of it, who is allowed to process the data, as well as who is ultimately allowed to release it to the garbage collector.
The way to deal with other kinds of ownership of data is to follow the "copy
on write" rule.

There are many such conventions. The idea of D is to replace conventions with contracts. Or so I thought.

Mike wrote:
IMHO : it would be nice to have a language defined by its semantics and not
its implementation.

Exactly the point I keep trying to make, Mark
Oct 12 2002
parent reply "Sandor Hojtsy" <hojtsy index.hu> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:aoa2uk$1v2k$1 digitaldaemon.com...
 Walter wrote:

Depends on how you think about it. In D, 'in' means the parameter gets a
copy of the value, and so cannot change the original. Since arrays are
passed by reference, indeed, the caller's *reference* cannot change, but
what it refers to can. Think about it like passing a reference to a class
object. You can't change the caller's reference, but you can certainly
change the data in the class object.

I find this an exceptionally confusing way to define 'in' and not a very

 one.  If this is how contracts work, then they are pretty useless,

They are useful. Thousands of Java developers can live with them. But IMHO they can be made better, especially concerning passing arrays.
There's no way to avoid dealing with the different semantics between by
reference and by value. Arrays and classes are always by reference.


and scalars are always by value. 'out' and 'inout' add another reference
layer on top of that.

I'm not proposing avoidance, just consistency. The compiler can use call-by-reference and still disallow data changes. You are saying that

 virtue of being a reference, the corresponding data is mutable, which I

 buy.  (It may be the easiest thing to implement but that doesn't make it

 best arrangement.)


 Mike wrote:
IMHO : it would be nice to have a language defined by its semantics and


its implementation.

Exactly the point I keep trying to make,

I also agree.

I still think it would be useful to make a detailed comparison to C++
parameter passing semantics.

1 - Primitive types
------------------

1.1 - pass by value
in C++:  void fn(int a);
in D:    void fn(in int a);

1.2 - pass by value, immutable
in C++:  void fn(const int a);
in D:    ABSENT

1.3 - pass by reference/address
in C++:  void fn(int &a);
in D:    void fn(out int a); / void fn(inout int a);

1.4 - pass by reference/address, immutable
in C++:  void fn(const int &a);
in D:    ABSENT
This is not the same as 1.2 from the semantic point of view. I can provide
an example if needed.

1.5 - pass by copy/copyback
in C++:  ABSENT
in D:    ABSENT

2 - Objects
------------

2.1 - pass by value
in C++:  void fn(Object a);
in D:    ABSENT

2.2 - pass by value, immutable
in C++:  void fn(const Object a);
in D:    ABSENT

2.3 - pass by reference/address
in C++:  void fn(Object &a);
in D:    void fn(in Object a);

2.4 - pass by reference/address, immutable
in C++:  void fn(const Object &a);
in D:    ABSENT

2.5 - pass by copy/copyback
in C++:  ABSENT
in D:    ABSENT

2.6 - pass by reference/address of reference/address
in C++:  void fn(Object *&a);
in D:    void fn(out Object a); / void fn(inout Object a)

2.7 - pass by reference/address of reference/address, immutable
in C++:  void fn(const Object *&a);
in D:    ABSENT

3 - Arrays
-----------

3.1 - pass by value of the items (copy)
in C++:  void fn(vector<int> a);
in D:    ABSENT

3.2 - pass by value of the items (copy), immutable
in C++:  void fn(const vector<int> a);
in D:    ABSENT

3.3 - pass by reference/address to the items
in C++:  void fn(int *a, int len);
in D:    void fn(in int[] a);
Neither of them is elegant. You cannot resize the array, but you can modify
the items.

3.4 - pass by reference/address to the items, immutable items
in C++:  void fn(const int *a, int len);
in D:    ABSENT

3.5 - pass by copy/copyback
in C++:  ABSENT
in D:    ABSENT

3.6 - pass by reference/address of reference/address of the items
in C++:  void fn(vector<int> &a);
in D:    void fn(out int[] a) / void fn(inout int[] a)

3.7 - pass by reference/address of reference/address of the items, immutable
items
in C++:  void fn(vector<int> &a);
in D:    ABSENT

Conclusion:
- Neither language provides copy/copyback.
- D provides pass-by-value semantics only for primitive types.
- D does not provide immutable parameter semantics.
- D extends, and separates the concept of pass-by-reference to "out" and
"inout" parameters

Sandor
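Case 3.3 (items can be modified, the array cannot be resized for the caller) can be modelled in C++ as a small handle passed by value. This is only a sketch: the Slice struct and both function names are hypothetical, and the pointer-plus-length layout is an assumption about how the array handle behaves, not a statement about the D compiler's actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dynamic array handle: a pointer plus a length.
struct Slice {
    int*        ptr;
    std::size_t len;
};

// Analogue of "void fn(in int[] a)": the handle is copied, the items are shared.
void modify_by_handle_copy(Slice a) {
    a.ptr[0] = 1;  // visible to the caller: both handles point at the same items
    a.len = 1;     // invisible to the caller: only the local handle changes
}

// Analogue of "void fn(inout int[] a)": the caller's own handle is passed by reference.
void modify_by_handle_ref(Slice& a) {
    a.ptr[0] = 2;  // visible to the caller
    a.len = 1;     // also visible to the caller
}
```

With the handle copied, item writes leak out while length changes do not, which matches the mixed behaviour the thread started with; passing the handle by reference makes both kinds of change visible.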
Oct 14 2002
parent reply "Sean L. Palmer" <seanpalmer directvinternet.com> writes:
"Sandor Hojtsy" <hojtsy index.hu> wrote in message
news:aoe1tg$2pe5$1 digitaldaemon.com...
 "Mark Evans" <Mark_member pathlink.com> wrote in message
 news:aoa2uk$1v2k$1 digitaldaemon.com...
 Walter wrote:

IMHO : it would be nice to have a language defined by its semantics and



its implementation.

Exactly the point I keep trying to make,

I also agree.

Me too
 I still think it would be usefull to make a detailed comparsion to C++
 parameter passing semantics.

 1 - Primitive types
 ------------------

 1.1 - pass by value
 in C++:   void fn(int a);
 in D:       void fn(in int a);

 1.2 - pass by value, immutable
 in C++:   void fn(const int a);
 in D:       ABSENT

I think this is wrong, and I think that "in" value parameters should be equivalent to C++'s const value parameters

1.1 - pass by value
in C++:  void fn(int a);
in D:    ABSENT

1.2 - pass by value, immutable
in C++:  void fn(const int a);
in D:    void fn(in int a);
 1.3 - pass by reference/address
 in C++:   void fn(int &a);
 in D:       void fn(out int a); / void fn(inout int a);

 1.4 - pass by reference/address, immutable
 in C++:  void fn(const int &a);
 in D:       ABSENT
 This is not the same as 1.2, from the semantical view. I can provide

 if needed.

I think we need one of these. Although

1.4 - pass by reference/address, immutable
in C++:  void fn(const int &a);
in D:    void fn(in int a);

with a type more complex than int, the compiler would be free to pass by reference so long as in parameters are not mutable.
 1.5 - pass by copy/copyback
 in C++:   ABSENT
 in D:       ABSENT

 2 - Objects
 ------------

 2.1 - pass by value
 in C++:  void fn(Object a);
 in D:      ABSENT

 2.2 - pass by value, immutable
 in C++:  void fn(const Object a);
 in D:   ABSENT

 2.3 - pass by reference/address
 in C++: void fn(Object &a);
 in D:      void fn(in Object a);

 2.4 - pass by reference/addess, immutable
 int C++: void fn(const Object &a);
 in D:     ABSENT

 2.5 - pass by copy/copyback
 in C++:  ABSENT
 in D:      ABSENT

 2.6 - pass by reference/address of reference/address
 in C++:   void fn(Object *&a);
 in D:       void fn(out Object a); / void fn(inout Object a)

 2.7 - pass by reference/address of reference/addess, immutable
 in C++:   void fn(const Object *&a);
 in D:       ABSENT

 3 - Arrays
 -----------

 3.1 - pass by value of the items (copy)
 in C++:  void fn(vector<int> a);
 in D:       ABSENT

 3.2 - pass by value of the items (copy), immutable
 in C++:  void fn(const vector<int> a);
 in D:      ABSENT

 3.3 - pass by reference/addess to the items
 in C++:  void fn(int *a, int len);
 in D:       void fn(in int[] a);
 Neither of them is elegant. You cannot resize the array, but you can

 the items.

 3.4 - pass by reference/address to the items, immutable items
 in C++: void fn(const int *a, int len);
 in D:  ABSENT

 3.5 - pass by copy/copyback
 in C++:  ABSENT
 in D:      ABSENT

 3.6 - pass by reference/address of reference/address of the items
 in C++:  void fn(vector<int> &a);
 in D:      void fn(out int[] a) / void fn(inout int[] a)

 2.7 - pass by reference/address of reference/addess of the items,

 items
 in C++:  void fn(vector<int> &a);
 in D:      ABSENT


 Conclusion:
 - Neither language provides copy/copyback.
 - D provides pass-by-value semantics only for primitive types.
 - D does not provide immutable parameter semantics.
 - D extends, and separates the concept of pass-by-reference to "out" and
 "inout" parameters

 Sandor

I think immutable parameters are important (you can't pass const data to anything that's mutable). You need to be able to ensure that immutability covers the entire object, all its members, what they point to, etc. Otherwise it makes holes that bugs can sneak out of. Sean
Oct 14 2002
next sibling parent Mark Evans <Mark_member pathlink.com> writes:
Sean wrote
You need to be able to ensure that immutability covers the entire object,
all its members, what they point to, etc.

Yes I agree 100% -- that's a proper contract definition which will kill many bugs.

Sandor wrote
 I find this an exceptionally confusing way to define 'in'
 and not a very useful one

they can be made better. Especially concerning passing arrays.

Any "contract" whose specification is a set of implementation details is useless from a design-by-contract standpoint. Here's the only specification we currently have for the 'in' contract: "big objects use references while little objects don't, at the whim of the compiler design." That's not any kind of contract I care to use. (The Java analogy is also completely broken.)

We're losing sight of what contracts are meant to do. The idea of design-by-contract, as a theory, is to codify what non-DBC languages implement as "coding conventions." Instead of focusing on implementation details, we should first ask what conventions D wishes to embody. Otherwise we just end up with a new syntax for doing C++ instead of a contract-based language.

My proposal is that this problem can be analyzed along a small handful of orthogonal dimensions. Mutability, information flow direction, ownership, perhaps a few others.

Not all languages sporting keywords that hint at design-by-contract underpinnings actually have them. (I'm even a bit worried about D ever becoming a DBC language.) C++ has 'const' but that's all. Sandor mentioned Java. Walter has also mentioned IDL as an inspiration for in/out/inout. None of these languages is design-by-contract. Perhaps the only real DBC language is Eiffel. Here's some supporting data:

"... fundamental differences between the Java and Eiffel object models [include] .. design by contract vs. wishful thinking ..."
from URL: ftp://rtfm.mit.edu/pub/usenet/news.answers/eiffel-faq

"It is regrettable that this lesson [design-by-contract] has not been heeded by such recent designs as Java (which added insult to injury by removing the modest assert instruction of C!), IDL (the Interface Definition Language of CORBA, which is intended to foster large-scale reuse across networks, but fails to provide any semantic specification mechanism), Ada 95 and ActiveX. For reuse to be effective, Design by Contract is a requirement. Without a precise specification attached to each reusable component -- precondition, postcondition, invariant -- no one can trust a supposedly reusable component."
from URL: http://archive.eiffel.com/doc/manuals/technology/contract/ariane/page.html

"Interface Definition Languages as we know them today are doomed." [Should D then bother with interfaces at all, or use in/out/inout from IDL?]
from a father of design-by-contract at URL: http://archive.eiffel.com/doc/manuals/technology/bmarticles/sd/contracts.html

Java and IDL require add-on libraries to do proper design-by-contract, e.g.
http://www.javaworld.com/javaworld/jw-02-2001/jw-0216-cooltools.html
http://www.reliable-systems.com/tools/iContract/iContract.htm
http://www.javaworld.com/javaworld/jw-02-2002/jw-0215-dbcproxy.html
http://citeseer.nj.nec.com/40586.html
http://www.cse.iitb.ernet.in/~rkj/COMContracts.ps.gz

I'm no design-by-contract expert but I admire the whole concept and hope that D will indeed live up to it. I do not see C++, Java, or IDL offering anything truly substantial in this direction. Eiffel would be a better source of inspiration.

From the Eiffel FAQ: ftp://rtfm.mit.edu/pub/usenet/news.answers/eiffel-faq

"Eiffel is a pure, statically typed, object-oriented language. Its modularity is based on classes. Its most notable feature is probably design by contract. It brings design and programming closer together. It encourages maintainability and the re-use of software components.

"Eiffel offers classes, multiple inheritance, polymorphism, static typing and dynamic binding, genericity (constrained and unconstrained), a disciplined exception mechanism, systematic use of assertions to promote programming by contract.

"Eiffel has an elegant design and programming style, and is easy to learn.

"An overview is available at http://www.eiffel.com/doc/manuals/language/intro/"

Mark
Oct 14 2002
prev sibling parent reply "Sandor Hojtsy" <hojtsy index.hu> writes:
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message
news:aoeu4k$nc2$1 digitaldaemon.com...
 "Sandor Hojtsy" <hojtsy index.hu> wrote in message
 news:aoe1tg$2pe5$1 digitaldaemon.com...
 "Mark Evans" <Mark_member pathlink.com> wrote in message
 news:aoa2uk$1v2k$1 digitaldaemon.com...
 Walter wrote:

IMHO : it would be nice to have a language defined by its semantics




 not
its implementation.

Exactly the point I keep trying to make,

I also agree.

Me too
 I still think it would be usefull to make a detailed comparsion to C++
 parameter passing semantics.

 1 - Primitive types
 ------------------

 1.1 - pass by value
 in C++:   void fn(int a);
 in D:       void fn(in int a);

 1.2 - pass by value, immutable
 in C++:   void fn(const int a);
 in D:       ABSENT

I think this is wrong, and I think that "in" value parameters should be equivalent to C++'s const value parameters

For primitive types, what use would that have?
 1.1 - pass by value
 in C++:   void fn(int a);
 in D:       ABSENT

 1.2 - pass by value, immutable
 in C++:   void fn(const int a);
 in D:       void fn(in int a);

 1.3 - pass by reference/address
 in C++:   void fn(int &a);
 in D:       void fn(out int a); / void fn(inout int a);

 1.4 - pass by reference/address, immutable
 in C++:  void fn(const int &a);
 in D:       ABSENT
 This is not the same as 1.2, from the semantical view. I can provide

 if needed.

I think we need one of these. Although

A C++ example:

const int *p;
void fn(const int a) { p = &a; }
void fn2(const int &a) { p = &a; }

These functions have different results, so 1.4 is not the same as 1.2.
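The difference between 1.2 and 1.4 can also be observed without keeping the pointer around after the call. A minimal sketch; the four function names are invented for illustration:

```cpp
// 1.2: the parameter is a private copy, so its address differs from the caller's object.
bool value_param_aliases(const int a, const int* caller) { return &a == caller; }

// 1.4: the parameter is another name for the caller's object.
bool ref_param_aliases(const int& a, const int* caller) { return &a == caller; }

// Change the caller's variable through a second path, then read the parameter.
// By value: the copy still holds the old value.
int read_after_write_value(const int a, int& caller) { caller = 99; return a; }

// By const reference: the change is observed, although a itself is const.
int read_after_write_ref(const int& a, int& caller) { caller = 99; return a; }
```

Given int x = 1, the by-value reader returns 1 and the by-reference reader returns 99 when x is passed as both arguments. A const reference only promises that the callee will not write through that name; it does not promise that the value stays fixed.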
 1.4 - pass by reference/address, immutable
 in C++:  void fn(const int &a);
 in D:       void fn(in int a);

 with a type more complex than int, the compiler would be free to pass by
 reference so long as in parameters are not mutable.

But that would have different side effects.
 1.5 - pass by copy/copyback
 in C++:   ABSENT
 in D:       ABSENT

 2 - Objects
 ------------

 2.1 - pass by value
 in C++:  void fn(Object a);
 in D:      ABSENT

 2.2 - pass by value, immutable
 in C++:  void fn(const Object a);
 in D:   ABSENT

 2.3 - pass by reference/address
 in C++: void fn(Object &a);
 in D:      void fn(in Object a);

 2.4 - pass by reference/addess, immutable
 int C++: void fn(const Object &a);
 in D:     ABSENT

 2.5 - pass by copy/copyback
 in C++:  ABSENT
 in D:      ABSENT

 2.6 - pass by reference/address of reference/address
 in C++:   void fn(Object *&a);
 in D:       void fn(out Object a); / void fn(inout Object a)

 2.7 - pass by reference/address of reference/addess, immutable
 in C++:   void fn(const Object *&a);
 in D:       ABSENT

 3 - Arrays
 -----------

 3.1 - pass by value of the items (copy)
 in C++:  void fn(vector<int> a);
 in D:       ABSENT

 3.2 - pass by value of the items (copy), immutable
 in C++:  void fn(const vector<int> a);
 in D:      ABSENT

 3.3 - pass by reference/addess to the items
 in C++:  void fn(int *a, int len);
 in D:       void fn(in int[] a);
 Neither of them is elegant. You cannot resize the array, but you can

 the items.

 3.4 - pass by reference/address to the items, immutable items
 in C++: void fn(const int *a, int len);
 in D:  ABSENT

 3.5 - pass by copy/copyback
 in C++:  ABSENT
 in D:      ABSENT

 3.6 - pass by reference/address of reference/address of the items
 in C++:  void fn(vector<int> &a);
 in D:      void fn(out int[] a) / void fn(inout int[] a)

 2.7 - pass by reference/address of reference/addess of the items,

 items
 in C++:  void fn(vector<int> &a);
 in D:      ABSENT


 Conclusion:
 - Neither language provides copy/copyback.
 - D provides pass-by-value semantics only for primitive types.
 - D does not provide immutable parameter semantics.
 - D extends, and separates the concept of pass-by-reference to "out" and
 "inout" parameters

 Sandor

I think immutable parameters are important (you can't pass const data to anything that's mutable).

If a copy is passed it is not a problem. But with D's reference-only object passing concept, you can't always pass copies. So there is an even greater need for const parameters.
 You need to be able to ensure that immutability covers the entire object,
 all its members,

Yes.
 what they point to, etc.

No.
 Otherwise it makes holes that bugs can sneak out of.

Const parameters would be useful not only to ease bug-free coding, but to increase the expressive power, and help self-documentation. Sandor
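The disagreement over whether immutability should cover "what they point to" corresponds to how C++ const already behaves: it covers the members but stops at a pointer. A small sketch, with a made-up Node type and function name:

```cpp
struct Node {
    int  value;
    int* linked;  // points at data owned elsewhere
};

void through_const(const Node& n) {
    // n.value = 5;         // would not compile: members are const through n
    // n.linked = nullptr;  // would not compile: the pointer itself is const too
    *n.linked = 5;          // compiles: const does not extend to the pointee
}
```

Whether a const parameter in D should behave like this, or should also protect the pointed-to data, is exactly the choice being argued here.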
Oct 15 2002
parent reply Mark Evans <Mark_member pathlink.com> writes:
What contract semantics do you wish to implement?  If none, then what does
interest you?  Making every possible calling convention available in D?  Perhaps
C++ has too many calling permutations; have you thought of that?

Walter is right that 'const' was a C++ feature that never worked out.  Not
because contracts are bad, but because 'const' is a poor man's version of
design-by-contract.  Immutability is important, but to say that we want 'const,'
without tying it into contracts, is to mire D in the mistakes of C++.

I would rather define a contract for the call, and have the compiler figure out
which calling convention makes sense, under that contract, for that particular
data type.  The important thing is semantic consistency of the contract across
all data types.  We want a compiler that does more for us than a C++ compiler.
(Otherwise we'd just use C++.)

If we think at the lower level of charting C++ call permutations, then we are
not really designing by contract.  We're just doing C++ with new syntax,
"contract paint" if you will.

To make that remark explicit, I'd like someone to explain how any of these
comparisons shed light on my original idea about transfer contracts.  I think
we've lost the ball.

Mark


In article <aogjvn$2cm7$1 digitaldaemon.com>, Sandor Hojtsy says...
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message
news:aoeu4k$nc2$1 digitaldaemon.com...
 "Sandor Hojtsy" <hojtsy index.hu> wrote in message
 news:aoe1tg$2pe5$1 digitaldaemon.com...
 "Mark Evans" <Mark_member pathlink.com> wrote in message
 news:aoa2uk$1v2k$1 digitaldaemon.com...
 Walter wrote:

IMHO : it would be nice to have a language defined by its semantics




 not
its implementation.

Exactly the point I keep trying to make,

I also agree.

Me too
 I still think it would be usefull to make a detailed comparsion to C++
 parameter passing semantics.



Oct 15 2002
parent "Sandor Hojtsy" <hojtsy index.hu> writes:
"Mark Evans" <Mark_member pathlink.com> wrote in message
news:aohpat$hdb$1 digitaldaemon.com...
 What contract semantics do you wish to implement?

Pass by reference. Pass by reference to inout value. Pass by reference to out value. Pass by reference to immutable. You think these are implementation details, which could and therefore should be hidden from the user. But I think these are *semantic* concepts, and I don't care how they are implemented, as long as they provide the semantics. And these semantics could not be hidden from users after all.
 If none, then what does
 interest you?  Making every possible calling convention available in D?

No.
 Perhaps C++ has too many calling permutations; have you thought of that?

I don't think C++ has too many of them. But if you don't like a passing convention, you can avoid it, and use only a limited subset - a dumb C++.
 Walter is right that 'const' was a C++ feature that never worked out.  Not
 because contracts are bad, but because 'const' is a poor man's version of
 design-by-contract.  Immutability is important, but to say that we want

 without tying it into contracts, is to mire D in the mistakes of C++.

I used 'const' in C++ with success. Can you provide an example where 'const' is misused and/or part of a bad design? From the contract point of view: 'const' is a contract that the called function will not change the object. Whereas 'in' specifies that changes to the passed/original object will not be incorporated into the original/passed object. They specify distinct semantic details. I don't see the problem. Mark wrote "To me 'in' means that data is immutable by the callee". But that is what 'const' means. And you still need to specify *what* data: the reference or the referred?
 I would rather define a contract for the call, and have the compiler

 which calling convention makes sense, under that contract, for that

 data type.

In some functions you have to, and will, 1) store the address of, or reference to, the passed object; 2) make use of the *semantic* rule that changes to the object are (or are not) immediately incorporated into the original object, and vice versa. What would the compiler do in those situations? Undefined Behaviour?
 The important thing is semantic consistency of the contract across
 all data types.  We want a compiler that does more for us than a C++

 (Otherwise we'd just use C++.)

IMHO, in the current parameter passing conventions, it does less than a C++ compiler.
 If we think at the lower level of charting C++ call permutations, then we

 not really designing by contract.

C++ call convention syntax was born out of semantic need. I was trying to demonstrate the semantics that D has no syntax/contract for.
 We're just doing C++ with new syntax, "contract paint" if you will.

C++ syntax is not important, I can leave it behind. D already has semantics that C++ doesn't. But behind the C++ syntax there lies some more semantics that D lacks, and needs.
 To make that remark explicit, I'd like someone to explain how any of these
 comparisons shed light on my original idea about transfer contracts.  I

 we've lost the ball.

Hmm. I started this thread with subject: "Re: passing arrays as "in" parameter with suprising results". With the particular example of array passing as "in", I wanted to point out weak points in the current D parameter passing docs, semantics and syntax. Then you have started a sub-thread about ownership-transfer through parameter passing. I found some interesting general opinions about parameter passing (such as the semantic meaning of 'in', which is not specified in the docs), and replied with detailing my opinion on this subject. I think I still have my ball. Sandor
Oct 16 2002
prev sibling next sibling parent reply "Mike Wynn" <mike.wynn l8night.co.uk> writes:
further to my last post I tried .. (again DMD 0.44)

void fnst(int[] a){ int b[2]; b[0]=1;b[1]=2;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnstex(int[] a){ int b[3]; b[0]=10;b[1]=20;b[2]=30;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fn(int[] a){ int[]b = new int[2]; b[0]=1;b[1]=2;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnex(int[] a){ int[]b = new int[3]; b[0]=10;b[1]=20;b[2]=30;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnstref(inout int[] a){ int b[2]; b[0]=1;b[1]=2;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnstexref(inout int[] a){ int b[3]; b[0]=10;b[1]=20;b[2]=30;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnref(inout int[] a){ int[] b = new int[2]; b[0]=1;b[1]=2;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

void fnexref(inout int[] a){ int[] b = new int[3]; b[0]=10;b[1]=20;b[2]=30;
 a = b; printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
}

int main(){
  int[] t;
  t.length = 2;
  t[0] = 4;
  t[1] = 4;
  printf("initial t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fn(t);
  printf("after call fn(int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnex(t);
  printf("after call fnex(int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnst(t);
  printf("after call fnst(int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnstex(t);
  printf("after call fnstex(int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  t.length = 2;
  t[0] = 4;
  t[1] = 4;
  printf("reset t t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnref(t);
  printf("after call fnref(inout int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnexref(t);
  printf("after call fnexref(inout int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnstref(t);
  printf("after call fnstref(inout int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  fnstexref(t);
  printf("after call fnstexref(inout int[]a) t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
  return 0;
}

which gives a very expected .....

initial t.length = 2, t[0] = 4, t[1] = 4
a.length = 2, a[0] = 1, a[1] = 2
after call fn(int[]a) t.length = 2, t[0] = 4, t[1] = 4
a.length = 3, a[0] = 10, a[1] = 20
after call fnex(int[]a) t.length = 2, t[0] = 4, t[1] = 4
a.length = 2, a[0] = 1, a[1] = 2
after call fnst(int[]a) t.length = 2, t[0] = 4, t[1] = 4
a.length = 3, a[0] = 10, a[1] = 20
after call fnstex(int[]a) t.length = 2, t[0] = 4, t[1] = 4
reset t t.length = 2, t[0] = 4, t[1] = 4
a.length = 2, a[0] = 1, a[1] = 2
after call fn(inout int[]a) t.length = 2, t[0] = 1, t[1] = 2
a.length = 3, a[0] = 10, a[1] = 20
after call fnex(inout int[]a) t.length = 3, t[0] = 10, t[1] = 20
a.length = 2, a[0] = 1, a[1] = 2
after call fn(inout int[]a) t.length = 2, t[0] = 1, t[1] = 2
a.length = 3, a[0] = 10, a[1] = 20
after call fnex(inout int[]a) t.length = 3, t[0] = 10, t[1] = 20

BUT AFAIK the code in fnstref/fnstexref is very, very dangerous.

void fnstexref(inout int[] a){
    int b[3]; b[0]=10;b[1]=20;b[2]=30;
 a = b; // a is now exactly the same as b
}
 Having returned from this function, b, which was allocated on the stack, is
NOT promoted to a heap object,
 so a holds a reference to a section of stack that has just been released.
This will not show up as a problem UNTIL you call another function.
 The same is true if you put "automatic" arrays, passed as dynamic arrays,
into an object on the heap: you then have no way to know when they are
accessed (or by which thread).
The assignment a = b; must either promote b to a heap object or be disallowed,
forcing the programmer to write a = b.dup; which must return a heap copy of
the static/stack-allocated array.
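
A minimal sketch of the a = b.dup; approach described above, assuming a present-day D compiler (where the parameter is spelled "ref" rather than "inout"). The name fillFromLocal is hypothetical and not from the thread; the unsafe assignment is kept only as a comment.

```d
import std.stdio;

// Unsafe variant, shown only as a comment: the slice of a stack array
// escapes through the ref parameter and dangles after the return.
//   void escapes(ref int[] a) { int[3] b = [10, 20, 30]; a = b[]; }

// Safer variant: .dup allocates a GC-heap copy of the elements, so the
// caller's slice stays valid after this function returns.
void fillFromLocal(ref int[] a)
{
    int[3] b = [10, 20, 30]; // fixed-size array, lives on the stack
    a = b.dup;               // heap copy owned by the garbage collector
}

void main()
{
    int[] t = [4, 4];
    fillFromLocal(t);
    writeln(t.length, " ", t[0], " ", t[1], " ", t[2]);
}
```

The copy costs one allocation per call, which is the price of not having to reason about the lifetime of b.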


"Sandor Hojtsy" <hojtsy index.hu> wrote in message
news:anrkjd$boh$1 digitaldaemon.com...
 Arrays are not passed by value, nor by reference.

 lets consider:

 void fn(int[] a)
 {
   a[0] = 1;
   a.length = 3;
   a[1] = 2;
   printf("a.length = %d, a[0] = %d, a[1] = %d\n", a.length, a[0], a[1]);
 }

 int main()
 {
   int[] t;
   t.length = 2;
   fn(t);
   printf("t.length = %d, t[0] = %d, t[1] = %d\n", t.length, t[0], t[1]);
   return 0;
 }

 If arrays were passed by value (as an int) this would print:
 a.length = 3, a[0] = 1, a[1] = 2
 t.length = 2, t[0] = 0, t[0] = 0
 (Original array unchanged)

 If arrays were passed by reference (as an Object) this would print:
 a.length = 3, a[0] = 1, a[1] = 2
 t.length = 3, t[0] = 1, t[1] = 2
 (Original array is the same)

 But actually it prints:

 a.length = 3, a[0] = 1, a[1] = 2
 t.length = 2, t[0] = 1, t[1] = 0

 (if it cannot resize the array in place) *OR* (Undefined Behaviour)

 a.length = 3, a[0] = 1, a[1] = 2
 t.length = 2, t[0] = 1, t[1] = 2

 if there is enough memory to resize the array in place.

 So arrays are not passed by value, nor by reference.
 Some changes to the array are incorporated into the original array, and some
 are not.
 They are using "passing by array reference", using which needs a deep
 understanding of the low-level implementation of the arrays.
 I understand that this is a result of the current (fast) implementation of
 arrays, but the result is *unacceptable*.
 One easy solution would be to disallow passing arrays as "in" parameters and
 require "inout". (I still like the "ref" keyword better than "inout")
 But using "inout" parameters is slower, isn't it? It is "just another level
 of indirection". So an effective solution would include redesigning the
 low-level array implementation.

 Yours,
 Sandor

Oct 07 2002
parent "Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:anrt0b$kfh$1 digitaldaemon.com...
 void fnstexref(inout int[] a){
     int b[3]; b[0]=10;b[1]=20;b[2]=30;
  a = b; // a is now exactly the same as b
 }
  having returned from this function, b which was allocated on the stack is
 NOT promoted to a heap object
  so a which hold a reference to the same area of memory which will be
 refering to a section of stack which
  has just been removed, this will is not show as a problem UNTIL you call
 another function.

Yes, that case should generate an error message.
Oct 09 2002
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Sandor Hojtsy" <hojtsy index.hu> wrote in message
news:anrkjd$boh$1 digitaldaemon.com...
 They are using "passing by array reference", using which needs a deep
 understanding of the low-level implementation of the arrays.
 I understand that this is a result of the current (fast) implementation of
 arrays, but the result is *unacceptable*.
 One easy solution would be to disallow passing arrays as "in" parameters and
 require "inout". (I still like the "ref" keyword better than "inout")
 But using "inout" parameters is slower, isn't it? It is "just another level
 of indirection". So an effective solution would include redesigning the
 low-level array implementation.

Actually, I think simply disallowing resizing of 'in' arrays would do the trick.
Oct 09 2002
next sibling parent reply Burton Radons <loth users.sourceforge.net> writes:
Walter wrote:
 "Sandor Hojtsy" <hojtsy index.hu> wrote in message
 news:anrkjd$boh$1 digitaldaemon.com...
 
They are using "passing by array reference", using which needs a deep
understanding of the low-level implementation of the arrays.
I understand that this is a result of the current (fast) implementation of
arrays, but the result is *unacceptable*.
One easy solution would be to disallow passing arrays as "in" parameters and
require "inout". (I still like the "ref" keyword better than "inout")
But using "inout" parameters is slower, isn't it? It is "just another level
of indirection". So an effective solution would include redesigning the
low-level array implementation.

Actually, I think simply disallowing resizing of 'in' arrays would do the trick.

That's unmotivated. If one doesn't understand the nature of arrays, don't use the language. Every language in my complement uses arrays totally differently. In D's case, it's with the understanding that an array's references should be controlled until its dispersion point and that arrays which need to be modified later should have its throttle point where the appropriate references are updated. It's a trivial part of engineering, not worth thinking about.

Pulling in any expectations from a previous language will get you killed in all of them. Going into templated C++ with the expectation that arrays are passed by references as with Python is wrong; going into D with the expectation that arrays are distinct objects as with Python is wrong. No expectations are being violated by D's behaviour because there aren't any, and anyone who says there are is myopic.
Oct 09 2002
next sibling parent "Mike Wynn" <mike.wynn l8night.co.uk> writes:
It is not that arrays should be one way or another, but that their current
behaviour is inconsistent with D's aim of removing some of the pitfalls that
you have in C++.

Learning a new behaviour is not a problem, but the current array behaviour
is worse (IMHO) than C++ not defaulting destructors to be virtual.

If all the information related to the array is not stored in the array
"object", then 'in' arrays should either be immutable or copy-on-write
(at runtime or compile time) rather than partly mutable.
I think disallowing extension but allowing modification is just a kludge to
fit in with the current implementation.
What would happen if someone wanted to write a D compiler for the CLR? Would
they have to implement this behaviour? Or for an architecture with
hardware-accelerated GC?
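
As a rough illustration of the "immutable" option, here is a sketch assuming a present-day D compiler, where a const slice parameter rejects both element writes and resizing at compile time. The function name sum is hypothetical, and this is not how the compiler discussed in this thread behaved.

```d
// Both element writes and resizing fail to compile for a const slice
// parameter, so the array is never "partly mutable" inside the callee.
static assert(!__traits(compiles, (const int[] a) { a[0] = 1; }));
static assert(!__traits(compiles, (const int[] a) { a.length = 3; }));

// Reading is still allowed; the caller's array cannot change through a.
int sum(const int[] a)
{
    int s = 0;
    foreach (x; a)
        s += x;
    return s;
}

void main()
{
    int[] t = [4, 4];
    assert(sum(t) == 8);
    assert(t == [4, 4]); // caller's array is untouched
}
```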

Mike.
Oct 09 2002
prev sibling next sibling parent "Walter" <walter digitalmars.com> writes:
"Burton Radons" <loth users.sourceforge.net> wrote in message
news:ao1qf5$27t3$1 digitaldaemon.com...
 That's unmotivated.  If one doesn't understand the nature of arrays,
 don't use the language.  Every language in my complement uses arrays
 totally differently.  In D's case, it's with the understanding that an
 array's references should be controlled until its dispersion point and
 that arrays which need to be modified later should have its throttle
 point where the appropriate references are updated.  It's a trivial part
 of engineering, not worth thinking about.

 Pulling in any expectations from a previous language will get you killed
 in all of them.  Going into templated C++ with the expectation that
 arrays are passed by references as with Python is wrong; going into D
 with the expectation that arrays are distinct objects as with Python is
 wrong.  No expectations are being violated by D's behaviour because
 there aren't any, and anyone who says there are is myopic.

While I agree with your sentiment, I've always been uncomfortable with the side effects of resizing arrays. The semantics are the way they are in D because they allow well crafted array code to run like the wind. It's a tradeoff that makes sense.

I can see very few legitimate cases where one would like to resize an 'in' array, most of the time it's likely to be a bug. Disallowing it is workable because it has the effect of asking the programmer if he really means it. The workaround if it is desired to resize it is simple, just copy it:

void foo(in int[] a)
{
    a.length = 10;   // error, can't resize 'in' array
    int[] b = a;
    b.length = 10;   // ok
}
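
A sketch of a stricter variant of that workaround, assuming a present-day D compiler. It swaps in .dup, which copies the elements as well as the slice, so later writes cannot reach the caller's data; the workaround above copies only the slice itself. The name resizedCopy is hypothetical.

```d
import std.stdio;

// Resizes a private copy; the caller's slice and elements are untouched.
int[] resizedCopy(const int[] a, size_t n)
{
    int[] b = a.dup; // independent heap copy of the elements
    b.length = n;    // added elements are default-initialised to 0
    return b;
}

void main()
{
    int[] t = [4, 4];
    int[] r = resizedCopy(t, 3);
    r[0] = 1;        // modifies only the copy
    writeln(t, " ", r);
}
```

Whether the extra copy is acceptable depends on how hot the call is; for the rare case of resizing an 'in' array it is probably cheap enough.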
Oct 09 2002
prev sibling parent reply "Sean L. Palmer" <seanpalmer directvinternet.com> writes:
Dude why are you going off on Walter?  Just because his arrays aren't
const-safe.

Sean
Oct 09 2002
parent Burton Radons <loth users.sourceforge.net> writes:
Sean L. Palmer wrote:
 Dude why are you going off on Walter?  Just because his arrays aren't
 const-safe.

Huh? Perhaps you're mistaking who I was talking about in the second sentence: people learning the language who don't want to understand it (we can do such limited error handling here that those are the only people this can affect), and largely as a response to this whole thread set. I've never argued for const safety - where are you pulling that from?
 "Burton Radons" <loth users.sourceforge.net> wrote in message
 news:ao1qf5$27t3$1 digitaldaemon.com...
 
Walter wrote:

"Sandor Hojtsy" <hojtsy index.hu> wrote in message
news:anrkjd$boh$1 digitaldaemon.com...

They are using "passing by array reference", using which needs a deep
understanding of the low-level implementation of the arrays.
I understand that this is a result of the current (fast) implemetation



of
arrays, but the result is *unacceptable*.
One easy solution would be to disallow passing arrays as "in" parameters

and
require "inout". (I still like the "ref" keyword better than "inout")
But using "inout" parameters is slower, isn't it? It is "just another

level
of indirection". So an effective solution would include redesigning the
low-level array implementation.

Actually, I think simply disallowing resizing of 'in' arrays would do


the
trick.

That's unmotivated. If one doesn't understand the nature of arrays, don't use the language. Every language in my complement uses arrays totally differently. In D's case, it's with the understanding that an array's references should be controlled until its dispersion point and that arrays which need to be modified later should have its throttle point where the appropriate references are updated. It's a trivial part of engineering, not worth thinking about. Pulling in any expectations from a previous language will get you killed in all of them. Going into templated C++ with the expectation that arrays are passed by references as with Python is wrong; going into D with the expectation that arrays are distinct objects as with Python is wrong. No expectations are being violated by D's behaviour because there aren't any, and anyone who says there are is myopic.


Oct 10 2002
prev sibling parent "chris jones" <flak clara.co.uk> writes:
"Walter" <walter digitalmars.com> wrote in message
news:ao1pdi$26m5$3 digitaldaemon.com...
 "Sandor Hojtsy" <hojtsy index.hu> wrote in message
 news:anrkjd$boh$1 digitaldaemon.com...
 They are using "passing by array reference", using which needs a deep
 understanding of the low-level implementation of the arrays.
 I understand that this is a result of the current (fast) implementation of
 arrays, but the result is *unacceptable*.
 One easy solution would be to disallow passing arrays as "in" parameters and
 require "inout". (I still like the "ref" keyword better than "inout")
 But using "inout" parameters is slower, isn't it? It is "just another level
 of indirection". So an effective solution would include redesigning the
 low-level array implementation.

Actually, I think simply disallowing resizing of 'in' arrays would do the trick.

Delphi does something like that. I can't remember the terminology, but there are two ways to pass an array, both effectively by reference, and only one of them allows resizing.

chris
Oct 09 2002