
digitalmars.D - Value vs. reference semantics, and pointers

reply Scott L. Burson <Scott_member pathlink.com> writes:
Hi,

As a newcomer to D I want to say first that it looks like a very clean and
well-thought-out design, free of the endless frustrations of C++.  As a Lisp
fanatic, I am particularly glad to see that you have included local functions,
function literals, and closures, though the distinction between function
"pointers" and delegates strikes me as superfluous ... oh, I see you are already
considering eliminating it.  A couple of other tidbits while I'm on this: the
word "pointer" was always misleading in this context; for example, unlike other
pointers in C, they don't support pointer arithmetic.  Better, I would suggest,
to just refer to "function variables", or maybe just stick to "delegates" since
you've adopted that term.  Similarly, I would discourage the use of the
ampersand to reify a function; I think the ampersand should at least be optional
(as indeed it is in C) -- I don't see anything in the docs that says it's
required, but you seem to use it in all the examples.  (My own preference would
be not even to allow it, or failing that, to deprecate it.)  And thirdly, your
use of the phrase "dynamic closure" is surprising to me -- in Lisp parlance,
these are _lexical_ closures, specifically distinguished from dynamic closures
(in our sense of the term), which are considered an archaic kludge, not even
implemented in modern Lisps.  So I'm wondering how the word "dynamic" got in
there.  But this is a very minor quibble :)

I've been skimming the material on the D Web site and want to be sure I
understand some things.  I gather that structs and unions have value semantics,
while arrays and classes have reference semantics.  That is, assignment to a
variable of struct or union type copies the contents, while assignment to a
variable of array or class type copies a reference to the contents.  Is that
correct?  You might want to clarify this in the docs, as it's pretty
fundamental.

I also have a question about the treatment of pointers.  I understand that
out/inout parameters and reference semantics for classes and arrays will
eliminate the vast majority of occasions calling for explicit pointers, but
still, I'm curious.  In D, can you make a pointer to a single object as in C, or
does it have to be in an array?

Cheers,
Scott
Mar 21 2006
next sibling parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Hi,

I will just give you a few quick comments.

Scott L. Burson wrote:
  Similarly, I would discourage the use of the
 ampersand to reify a function; I think the ampersand should at least be
optional
 (as indeed it is in C) -- I don't see anything in the docs that says it's
 required, but you seem to use it in all the examples. 

I guess the reason is that D allows function calling (property like) without trailing parentheses (). Meaning func is identical to func() in most cases. The ampersand is needed to distinguish function calls from function references.
 I've been skimming the material on the D Web site and want to be sure I
 understand some things.  I gather that structs and unions have value semantics,
 while arrays and classes have reference semantics.  That is, assignment to a
 variable of struct or union type copies the contents, while assignment to a
 variable of array or class type copies a reference to the contents.  Is that
 correct?  You might want to clarify this in the docs, as it's pretty
 fundamental.

That is correct.
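
A minimal D sketch of the struct/class half of that answer, assuming a current D2 compiler. The names `PointS`, `PointC`, and the two helper functions are made up for illustration and do not come from the D docs:

```d
struct PointS { int x; }   // struct: value semantics
class  PointC { int x; }   // class: reference semantics

// True if assigning a struct produced an independent copy.
bool structCopyIsIndependent()
{
    PointS a;
    a.x = 1;
    PointS b = a;   // copies the contents
    b.x = 2;        // changes only b
    return a.x == 1;
}

// True if assigning a class variable made both names refer to one object.
bool classAssignAliases()
{
    PointC a = new PointC;
    a.x = 1;
    PointC b = a;   // copies only the reference
    b.x = 2;        // visible through a as well
    return a.x == 2 && a is b;
}

void main()
{
    assert(structCopyIsIndependent());
    assert(classAssignAliases());
}
```

Arrays are the less clear-cut case, as discussed further down the thread.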
 I also have a question about the treatment of pointers.  I understand that
 out/inout parameters and reference semantics for classes and arrays will
 eliminate the vast majority of occasions calling for explicit pointers, but
 still, I'm curious.  In D, can you make a pointer to a single object as in C,
or
 does it have to be in an array?

You can still make pointers to anything just like in C. /Oskar
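
For example, a pointer to a single local variable or struct might look like the sketch below. It assumes a current D2 compiler in its default (non-`@safe`) mode; `Pair` and the helper names are hypothetical:

```d
struct Pair { int a; int b; }

// Writes through a pointer to a single int and returns the int's new value.
int writeThroughIntPointer()
{
    int x = 5;
    int* p = &x;   // pointer to one object, no array involved
    *p = 7;
    return x;
}

// Writes a struct field through a pointer to the struct.
int writeThroughStructPointer()
{
    Pair s;
    Pair* ps = &s;
    ps.b = 3;      // D uses '.' through pointers; there is no '->'
    return s.b;
}

void main()
{
    assert(writeThroughIntPointer() == 7);
    assert(writeThroughStructPointer() == 3);
}
```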
Mar 21 2006
next sibling parent reply Scott L. Burson <Scott_member pathlink.com> writes:
In article <dvoeak$2gvi$1 digitaldaemon.com>, Oskar Linde says...
Scott L. Burson wrote:
  Similarly, I would discourage the use of the
 ampersand to reify a function; I think the ampersand should at least be
optional
 (as indeed it is in C) -- I don't see anything in the docs that says it's
 required, but you seem to use it in all the examples. 

I guess the reason is that D allows function calling (property like) without trailing parentheses (). Meaning func is identical to func() in most cases. The ampersand is needed to distinguish function calls from function references.

A bit of Pascal leaked in there, eh? Wow, I can find nothing about this in the docs. Experimentation with DMD, however, confirms what you say. Gotta say I think it's a bad idea. Consider:

int f() { return 7; }

void main()
{
    printf("%d\n", f * 2);      // parens optional (`f' or `f()')
    int function() ff = &f;     // `&' mandatory
    printf("%d\n", ff() * 2);   // parens mandatory
    int function() fff = ff;    // `&ff' illegal
    int function() g = function int() { return 8; };
    // wrong: int function() g = &function int() { return 8;};
    printf("%d\n", (function int() { return 9;})() * g());   // all parens mandatory
}

In short, there are two kinds of functions, and the syntax rules are different for referring to and invoking the two kinds. (I'm even having trouble coming up with good terms for the two kinds, though I suppose you could go with "named" and "anonymous".) I think this is confusing, and an unfortunate wart on a language that has otherwise done a great job at keeping the rules simple.

I urge you to consider making the rules for named and anonymous functions the same. I see three ways to do this:

(a) make parens always required for a function call, as they are in C, C++, Java, and most other languages; then you can drop the `&'.

(b) always require the `&' when referring to a function without calling it, and then allow empty parens to be elided for all 0-argument calls.

(c) drop the `&' but still allow the elision of empty parens; this requires the compiler to disambiguate based on context.

My favorite is A, but I'm guessing you feel committed to allowing the elision, so that will be a non-starter. B is certainly workable but more than a little ugly, as the ampersand would now be required in a number of places it is currently forbidden, including when assigning a function literal to a function variable.

That leaves C. Although it will complicate your front end a bit, it might not be too bad, because you already have to deal with disambiguation of references to overloaded functions:

void foo(int function() g) { ... }
int f() { ... }
int f(int x) { ... }
..
foo(&f) ...

It seems like a straightforward extension of this disambiguation mechanism to decide whether the intention was to call the function or pass it. On the other hand, it leaves you figuring out what to do with cases like:

void foo(int i);
void foo(function int() f);
int g() { return ...; }
..
foo(g) ...

You could require the parens (`foo(g())') if the intention is to call `g' rather than just to pass it, but that seems dangerous; if the first version of `foo' were the only one that existed when the call `foo(g)' were written, and then the second were added later, the meaning of `foo(g)' would change. Probably better just to outlaw such overloadings so the situation can't arise.

-- Scott
Mar 22 2006
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Scott L. Burson" <Scott_member pathlink.com> wrote in message 
news:dvss27$29p9$1 digitaldaemon.com...
 In article <dvoeak$2gvi$1 digitaldaemon.com>, Oskar Linde says...

Your topic made me laugh. It's probably not meant to be funny, but it just sounds so funny :)

I suppose a (d) would be to create an explicit property syntax, like in C#, which would eliminate this whole problem. In addition, I would imagine it'd make it one step closer to being able to do

obj.prop+= 5;

As if you were to define the property as

class A
{
    private int mValue;

    public property int prop
    {
        set(int value)
        {
            mValue = value;
        }

        get()
        {
            return mValue;
        }
    }
}

The compiler would know that the property is read and write, and would know exactly how to compile an += expression.
Mar 22 2006
parent Scott L. Burson <Scott_member pathlink.com> writes:
In article <dvstvm$2cr7$1 digitaldaemon.com>, Jarrett Billingsley says...
"Scott L. Burson" <Scott_member pathlink.com> wrote in message 
news:dvss27$29p9$1 digitaldaemon.com...
 In article <dvoeak$2gvi$1 digitaldaemon.com>, Oskar Linde says...

Your topic made me laugh. It's probably not meant to be funny, but it just sounds so funny :)

Remember: subduction leads to orogeny! :)
I suppose a (d) would be to create an explicit property syntax, like in C#, 
which would eliminate this whole problem.

Hmm. If I understand correctly, you're suggesting that the primary purpose of allowing paren elision in the first place is to unclutter property references. That's presumably not the only purpose, since the elision rule applies to all named function calls, not just member function calls. Still, it could be the main purpose, and as you say, there might be another way to satisfy that purpose.
In addition, I would imagine it'd make it one step closer to being able to 
do

obj.prop+= 5;

As if you were to define the property as

class A
{
    private int mValue;

    public property int prop
    {
        set(int value)
        {
            mValue = value;
        }

        get()
        {
            return mValue;
        }
    }
}

The compiler would know that the property is read and write, and would know 
exactly how to compile an += expression. 

Yes, I agree, this seems better than the way D currently defines properties -- though a little more verbose, it seems clearer. It also opens up the possibility of specifically overloading the behavior of `+=' etc. While this is not very interesting in the case of ordinary addition, it can be a hook for useful optimizations in cases where `+' means something more complex like set union. -- Scott
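
For comparison, here is a rough sketch of the existing convention being discussed: a property is just a getter/setter pair of ordinary member functions, and the compiler rewrites `a.prop = v` into a call. It reuses the names `A`, `mValue`, and `prop` from the quoted proposal; `bumpByFive` is a made-up helper, and the sketch assumes a current D2 compiler still accepts this style:

```d
class A
{
    private int mValue;

    int prop() { return mValue; }              // getter
    void prop(int value) { mValue = value; }   // setter
}

// Adds 5 to the property the long way: read, add, write back.
int bumpByFive()
{
    auto a = new A;
    a.prop = 5;            // rewritten to a.prop(5)
    a.prop = a.prop + 5;   // getter call, then setter call
    return a.prop;         // getter call without parens
}

void main()
{
    assert(bumpByFive() == 10);
}
```

With only such a pair, the compound form `obj.prop += 5` from the quoted post has no direct spelling, which is the gap the explicit property syntax is meant to close.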
Mar 24 2006
prev sibling parent reply Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Oskar Linde wrote:
 Hi,
 
 I will just give you a few quick comments.
 
 Scott L. Burson wrote:
  Similarly, I would discourage the use of the
 ampersand to reify a function; I think the ampersand should at least 
 be optional
 (as indeed it is in C) -- I don't see anything in the docs that says it's
 required, but you seem to use it in all the examples. 

I guess the reason is that D allows function calling (property like) without trailing parentheses (). Meaning func is identical to func() in most cases. The ampersand is needed to distinguish function calls from function references.
 I've been skimming the material on the D Web site and want to be sure I
 understand some things.  I gather that structs and unions have value 
 semantics,
 while arrays and classes have reference semantics.  That is, 
 assignment to a
 variable of struct or union type copies the contents, while assignment 
 to a
 variable of array or class type copies a reference to the contents.  
 Is that
 correct?  You might want to clarify this in the docs, as it's pretty
 fundamental.

That is correct.
 I also have a question about the treatment of pointers.  I understand 
 that
 out/inout parameters and reference semantics for classes and arrays will
 eliminate the vast majority of occasions calling for explicit 
 pointers, but
 still, I'm curious.  In D, can you make a pointer to a single object 
 as in C, or
 does it have to be in an array?

You can still make pointers to anything just like in C. /Oskar

A static array (and by static array we mean an array of fixed size, like C's arrays) is neither a proper value type nor a proper reference type. It is an odd mix of the two, and IMO a bad discrepancy. Perhaps this is something D could improve upon (I don't have a formed idea how, though).

Dynamic arrays (dynamic length arrays) are "a bit more" than a reference type, but they behave pretty much as a reference type, so one can consider them as such.

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
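
A small sketch of what "a bit more than a reference type" can mean for dynamic arrays: assignment shares the elements, but each variable keeps its own length. It assumes a current D2 compiler, where fixed-size arrays are plain value types; that may not match the 2006 compiler behavior described in this post. All function names are made up for illustration:

```d
// Dynamic array: assignment copies the (pointer, length) pair,
// so both variables see the same elements.
bool sliceSharesElements()
{
    int[] a = [1, 2, 3];
    int[] b = a;
    b[0] = 99;
    return a[0] == 99;
}

// ...but the length belongs to each variable separately.
bool lengthIsPerVariable()
{
    int[] a = [1, 2, 3];
    int[] b = a;
    b ~= 4;   // appending to b leaves a's length untouched
    return a.length == 3 && b.length == 4;
}

// Fixed-size array (current D2 assumption): assignment copies the elements.
bool staticArrayCopies()
{
    int[3] a = [1, 2, 3];
    int[3] b = a;
    b[0] = 99;
    return a[0] == 1;
}

void main()
{
    assert(sliceSharesElements());
    assert(lengthIsPerVariable());
    assert(staticArrayCopies());
}
```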
Mar 23 2006
parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Bruno Medeiros wrote:
 Oskar Linde wrote:
 Scott L. Burson wrote:
 I also have a question about the treatment of pointers.  I understand 
 that
 out/inout parameters and reference semantics for classes and arrays will
 eliminate the vast majority of occasions calling for explicit 
 pointers, but
 still, I'm curious.  In D, can you make a pointer to a single object 
 as in C, or
 does it have to be in an array?

You can still make pointers to anything just like in C.


I fail to see what's wrong with my statement. Maybe the word "still". But you are correct that this needs some clarification.
 A static array (and by static array we mean an array of fixed size, as 
 C's arrays) is neither a proper value or reference type. It is an odd 
 mix of the two, and IMO a bad discrepancy. Perhaps this is something D 
 could be improved upon. (don't a formed ideia how, though)
 Dynamic arrays (dynamic length arrays) are "a bit more" than a reference 
 type, but they behave pretty much as reference type, so one can consider 
 them as such.

Mar 24 2006
parent Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
Oskar Linde wrote:
 Bruno Medeiros wrote:
 That is not correct (and you know it, have you forgotten?).

I fail to see what's wrong with my statement. Maybe the word "still". But you are correct that this needs some clarification.

I replied to the wrong part of your post. My post was only meant to reply to this (right after you say "That is correct"):

Bruno Medeiros wrote:
 Oskar Linde wrote:
 Scott L. Burson wrote:
 I've been skimming the material on the D Web site and want to be sure I
 understand some things.  I gather that structs and unions have value
 semantics,
 while arrays and classes have reference semantics.  That is,
 assignment to a
 variable of struct or union type copies the contents, while
 assignment to a
 variable of array or class type copies a reference to the contents.
 Is that
 correct?  You might want to clarify this in the docs, as it's pretty
 fundamental.

That is correct.


Repost:

That is not correct (and you know it, have you forgotten?).

A static array (and by static array we mean an array of fixed size, like C's arrays) is neither a proper value type nor a proper reference type. It is an odd mix of the two, and IMO a bad discrepancy. Perhaps this is something D could improve upon (I don't have a formed idea how, though).

Dynamic arrays (dynamic length arrays) are "a bit more" than a reference type, but they behave pretty much as a reference type, so one can consider them as such.

--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Mar 24 2006
prev sibling parent reply S. Chancellor <dnewsgr mephit.kicks-ass.org> writes:
On 2006-03-21 00:14:15 -0800, Scott L. Burson <Scott_member pathlink.com> said:

 Hi,
 
 As a newcomer to D I want to say first that it looks like a very clean and
 well-thought-out design, free of the endless frustrations of C++.  As a Lisp
 fanatic, I am particularly glad to see that you have included local functions,
 function literals, and closures, though the distinction between function
 "pointers" and delegates strikes me as superfluous ... oh, I see you 
 are already
 considering eliminating

Delegates and function pointers are fundamentally different structures in memory. In order to maintain compatibility with C binaries there must be a distinction made between the two. C can't handle delegates.
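
To illustrate the difference in memory layout, here is a hedged sketch assuming a current D2 compiler, where a delegate carries a context pointer alongside the code pointer. The names `twice`, `addOffset`, and `callBoth` are hypothetical:

```d
int twice(int x) { return 2 * x; }

// A delegate is two pointers wide (context + code); a function pointer is one.
static assert((int delegate(int)).sizeof == 2 * (int function(int)).sizeof);

// Calls one plain function pointer and one delegate and sums the results.
int callBoth()
{
    int function(int) fp = &twice;        // code pointer only, C-compatible shape

    int offset = 10;
    int addOffset(int x) { return x + offset; }   // nested: needs the enclosing frame
    int delegate(int) dg = &addOffset;    // code pointer plus frame pointer

    return fp(3) + dg(3);   // 6 + 13
}

void main()
{
    assert(callBoth() == 19);
}
```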
Mar 22 2006
parent Don Clugston <dac nospam.com.au> writes:
S. Chancellor wrote:
 On 2006-03-21 00:14:15 -0800, Scott L. Burson 
 <Scott_member pathlink.com> said:
 
 Hi,

 As a newcomer to D I want to say first that it looks like a very clean 
 and
 well-thought-out design, free of the endless frustrations of C++.  As 
 a Lisp
 fanatic, I am particularly glad to see that you have included local 
 functions,
 function literals, and closures, though the distinction between function
 "pointers" and delegates strikes me as superfluous ... oh, I see you 
 are already
 considering eliminating

Delegates and function pointers are fundamentally different structures in memory. In order to maintain compatibility with C binaries there must be a distinction made between the two. C can't handle delegates.

Not necessarily. Currently a delegate is

struct {
    Object * the_this;
    function the_function;
}

and an invocation is:

    mov ebx, deleg.the_this;
    call deleg.the_function;

At creation of the delegate, you can create a thunk in memory, consisting of the lines

the_thunk:
    mov ebx, the_this;
    jmp the_function

and then the delegate is just a function pointer. Invocation is

    call the_thunk

just like an ordinary function pointer. That way, you can pass it to C and everything will work fine. (FWIW, this is exactly how ATL works, the thunk is created on the stack (!) ). AFAIK this will happen for D 2.0.
Mar 23 2006