
digitalmars.D - What if D would require * for reference types?

reply "Denis Koroskin" <2korden gmail.com> writes:
You know, I'm not usually a guy who proposes radical changes to the
language. I like the way D is, but there are a few proposals that keep
popping up once in a while without a good rationale, and this simple idea
that struck me solves many of the issues found in previous
proposals, so I decided to share it with you.

It's as simple as that: require '*' to denote a reference type. What does it
give us?

1) Consistency between classes and structs:

Struct* s = new Struct();
Class* c = new Class();

It allows easier transition between classes and structs (please note that  
I don't propose any changes to class semantics):
Foo* f = new Foo(); // works for both classes and structs

2) .sizeof consistency, get rid of __traits(classInstanceSize, Foo)  
(deprecate in favor of Foo.sizeof):

Foo* f = cast(Foo*)malloc(Foo.sizeof); // works for both classes and  
structs (and enums, unions, etc)
f.__ctor(args);

3) No more issues with tail-const, tail-shared, tail-immutable; deprecate  
Rebindable (this one was recently discussed):

shared(Foo)* foo; // local pointer to shared type, works for both classes  
and structs

Please note that we get these bonuses just by requiring '*' to denote a
reference type, which is not a huge change to the compiler IMO. It *will*
break existing code, but the fix is rather trivial.
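For comparison, here is a minimal sketch of how sizes behave in D as it stands today, which is the inconsistency point 2 refers to. The types Foo and Bar and the helper instanceSize are made up for illustration; only .sizeof and __traits(classInstanceSize, ...) come from the language itself.

```d
import std.stdio : writeln;

class Foo { int a; long b; }   // hypothetical example class
struct Bar { int a; long b; }  // hypothetical example struct

// Size of one instance: classes need the trait, everything else uses .sizeof.
size_t instanceSize(T)()
{
    static if (is(T == class))
        return __traits(classInstanceSize, T);
    else
        return T.sizeof;
}

void main()
{
    // For a class, .sizeof is the size of the reference (one pointer).
    static assert(Foo.sizeof == (void*).sizeof);

    // The instance also holds hidden fields, so it is larger than the reference.
    assert(instanceSize!Foo > Foo.sizeof);

    // For a struct, .sizeof already is the instance size.
    assert(instanceSize!Bar == Bar.sizeof);

    writeln("reference: ", Foo.sizeof, ", instance: ", instanceSize!Foo);
}
```

Generic allocation code currently has to branch the way instanceSize does; under the proposal Foo.sizeof alone would be enough.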

To be continued.
Jan 18 2010
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
(I moved this part to a separate message so that people don't concentrate
on it too much, since it is more controversial)

What else would it *optionally* provide (subject to discussion; it could be
implemented later, because it's additional functionality that D
currently lacks)?

I stress it one more time: it's *OPTIONAL*, but it keeps getting asked for  
and this proposal allows the feature quite nicely IMO.

1) Allow constructing classes on the stack and get rid of the "scope Foo"
hack (it doesn't even work well: "scope Foo foo = createFoo();" <-
heap-allocates, scope is a no-op here)
Consistent with structs

2) Allow class aggregation without additional allocation cost (no need for  
InSitu, which is not implementable in current D anyway):

class Foo
{
}

struct Bar
{
     Foo foo; // < analog of InSitu!(Foo)
}

Consistent with structs.

3) Returning class instances from functions via stack:

Foo createFoo()
{
    Foo foo; // default ctor is called
    // initialize
    return foo;
}

Foo foo = createFoo(); // created on stack

Consistent with structs.

4) Class array allocation:

Foo* foo = new Foo[42]; // allocates Foo.sizeof*42 bytes and calls default  
ctor on each object
(*Not* consistent with structs until struct default ctors are allowed)

5) Safe class instance assignment:

Foo foo1;
Foo foo2;
foo1 = foo2; // okay, consistent with classes

6) Slicing prevention rules:

class Bar : Foo {}

Bar bar;
foo1 = bar; // error, assignment from different type

Foo* fooPtr = new Bar();
foo1 = *fooPtr; // error, class pointer dereference not allowed

What do you think? I understand it is unlikely to make its way into D2  
(D3?), but is it sound? Do you think it's useless, or do you think that  
additional consistency (and functionality) is worthwhile?
Jan 18 2010
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Denis Koroskin wrote:
 What do you think? I understand it is unlikely to make its way into D2 
 (D3?), but is it sound? Do you think it's useless, or do you think that 
 additional consistency (and functionality) is worthwhile?

What this means is that classes will be usable as a value type, like in C++. In C++, this causes all sorts of trouble, as a value type and a reference type are fundamentally different things with different uses.
Jan 18 2010
parent BCS <none anon.com> writes:
Hello Walter,

 Denis Koroskin wrote:
 
 What do you think? I understand it is unlikely to make its way into
 D2 (D3?), but is it sound? Do you think it's useless, or do you think
 that additional consistency (and functionality) is worthwhile?
 

 What this means is that classes will be usable as a value type, like in C++. In C++, this causes all sorts of trouble, as a value type and a reference type are fundamentally different things with different uses.

There is a precedent in C for T* being legal as the type of a variable or intermediate value while T itself is not (for some T): function pointers. If you forbid the use of the bare Class type (possibly excepting its use as an alias, for metaprogramming reasons), your concern becomes moot.
Jan 18 2010
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Jan 18, 2010 at 4:25 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 Denis Koroskin wrote:
 What do you think? I understand it is unlikely to make its way into D2
 (D3?), but is it sound? Do you think it's useless, or do you think that
 additional consistency (and functionality) is worthwhile?

 What this means is that classes will be usable as a value type, like in C++. In C++, this causes all sorts of trouble, as a value type and a reference type are fundamentally different things with different uses.

No problem there, just don't allow it:

Foo f; // error - classes cannot be used as value types

I think it's mentioned somewhere in that thread. There are probably some good reasons to object to the proposal, but I don't think that's one of them.

--bb
Jan 18 2010
prev sibling next sibling parent reply Trass3r <un known.com> writes:
Using * would heavily confuse people coming from C.
I like the way it is now, class being reference and struct a value type.
Jan 18 2010
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-01-18 10:50:58 -0500, Trass3r <un known.com> said:

 Using * would heavily confuse people coming from C.
 I like the way it is now, class being reference and struct a value type.

I don't follow this reasoning. C has pointers. D has pointers and reference types (objects). If you remove reference types from D, D will have only pointers, with which C folks are already familiar. How is that confusing to people with a C background?

If your case was that it'd be confusing to people coming from Java, then I'd understand better, as Java has reference types and no pointers. Or people with a C# background, since C# has both pointers and reference types. But C?

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Jan 18 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-01-18 10:08:34 -0500, "Denis Koroskin" <2korden gmail.com> said:

 It's as simple as that: require '*' to denote reference type. What does 
 it  give?

It works like that in Objective-C, where object variables must always be pointers to objects (enforced by the compiler), and it's not so bad. It's undoubtedly cleaner to read without '*', but as you illustrate, it causes issues.

 3) No more issues with tail-const, tail-shared, tail-immutable; 
 deprecate  Rebindable (this one was recently discussed):

That's the issue that bothers me most about the current syntax. Rebindable is a nice trick, but it's a hackish solution thrown at a syntactic problem. And it's not so rare either when you're working with immutable objects. It'd be much better if the syntactic problem didn't exist in the first place. Compare this:

Rebindable!(immutable Object) object;

to this:

immutable(Object)* object;

The second is much easier to read. By getting rid of this template trickery playing with the type system and replacing implicit references with the almost identical concept of pointer, I think we would make the language easier to grasp.

Are we too late in D2 development to make this change? I'm in support of it.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
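To make the comparison concrete, here is a small sketch in today's D, assuming Phobos' std.typecons.Rebindable is available. The types C and S and the two helper functions are invented for the example. A pointer to an immutable struct can be rebound directly, while an immutable class reference needs the Rebindable wrapper.

```d
import std.typecons : Rebindable;

// Hypothetical example types.
class C
{
    int x;
    this(int x) immutable { this.x = x; }
}

struct S { int x; }

// Tail-immutable pointer: the pointee is immutable, the pointer is not.
int rebindStruct()
{
    immutable(S)* p = new immutable(S)(1);
    p = new immutable(S)(2);          // fine, only the pointer changes
    return p.x;
}

// With a class there is no way to spell "mutable reference to immutable
// object" in the type syntax, so the library wrapper is used instead.
int rebindClass()
{
    // immutable(C) c = new immutable C(1);
    // c = new immutable C(2);        // rejected: c itself is immutable
    Rebindable!(immutable C) r = new immutable C(1);
    r = new immutable C(2);           // fine, Rebindable allows rebinding
    return r.x;
}

void main()
{
    assert(rebindStruct() == 2);
    assert(rebindClass() == 2);
}
```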
Jan 18 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Jan 2010 11:17:53 -0500, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2010-01-18 10:08:34 -0500, "Denis Koroskin" <2korden gmail.com> said:

 It's as simple as that: require '*' to denote reference type. What does  
 it  give?

 It works like that in Objective-C, where object variables must always be pointers to objects (enforced by the compiler), and it's not so bad. It's undoubtedly cleaner to read without '*', but as you illustrate, it causes issues.

 3) No more issues with tail-const, tail-shared, tail-immutable;  
 deprecate  Rebindable (this one was recently discussed):

 That's the issue that bothers me most about the current syntax. Rebindable is a nice trick, but it's a hackish solution thrown at a syntactic problem. And it's not so rare either when you're working with immutable objects. It'd be much better if the syntactic problem didn't exist in the first place. Compare this:

 Rebindable!(immutable Object) object;

 to this:

 immutable(Object)* object;

 The second is much easier to read. By getting rid of this template trickery playing with the type system and replacing implicit references with the almost identical concept of pointer, I think we would make the language easier to grasp.

 Are we too late in D2 development to make this change? I'm in support of it.

One of the biggest problems with C++ class references is the stupid -> operator. I think D has avoided that in the best way possible. I think probably this change wouldn't be too bad, especially given how auto works:

auto c = new C(); // works for both explicit and implicit reference style

The one thing I would insist on is that classes cannot be allocated on the stack unless explicitly created via scope, and even then a variable can never be of class type without reference:

C c; // error, cannot declare a variable of type C.
scope C c; // error, cannot declare a variable of type C.
scope C* c = new C(); // ok, allocated on stack and c works just like a normal C reference
scope c = new C(); // equivalent to above line.

This would leave the slicing problem solved the same way it is now -- you can stack allocate but only when you specifically request it. And if you can never declare a variable of class type without denoting it is a reference, then the only place code is affected is declarations where auto is not involved.

I think we are too late for D2, the book is pretty much finished except for the concurrency chapter. It is a great idea though, I would have loved to see this happen before D2 was released. Maybe D3 can have this change.

-Steve
Jan 18 2010
parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Steven Schveighoffer wrote:
 On Mon, 18 Jan 2010 13:35:01 -0500, Bill Baxter <wbaxter gmail.com> wrote:
 
 I think one can argue it would be better for pointer arithmetic to
 require a more in-your-face notation, since pointer arithmetic isn't
 usually needed and won't even be allowed in SafeD.

 It isn't really standard arithmetic to begin with, since foo++
 (pointer meaning) is more like  foo += sizeof(Foo);
 Also adding pointers together is not allowed.  These two things, I
 think, form the basis of a decent justification for making pointer
 arithmetic syntactically distinct from regular arithmetic.

 So, for instance, you could require distinct operators like  *++, *+,
 *-  for pointer manipulations.

 Maybe we're on to another big mistake of C here -- conflation of
 pointers with numbers.  Or at least a corollary to the mistake of
 conflating of arrays with pointers.

 The question that then comes up is can you do overloading of those new
 operators?  If you want to make something that is "pointer-like", then
 it would be necessary to do so.  But probably  only value-types should
 be allowed to act in a pointer-like manner.  So you could only
 overload pointer arithmetic operators in a struct.  But I'm not sure
 there's a use case for even that.

 Pointers are essentially unrestricted arrays also. So you'd also have to disallow ptr[x]. Basically any operation that normally gets forwarded to a class reference but would be intercepted by the pointer has to be given a new syntax.

 I'm wondering, is there an issue with signifying rebindable references like C++'s references, i.e. C& ? (question mark separated for clarity)

 C& classref = new C();
 C& classref2 = classref; // does not create a new instance, but just copies the reference
 S& structref = new S(); // struct reference!
 S& structref2 = structref; // structref2 and structref bind to same data.
 S structref3 = *structref2; // must dereference to access value
 *classref2 = *classref; // Error! illegal to dereference class references.
 S& structref4 = &structref3; // pointer implicitly casts to reference.
 C classref3; // illegal
 const(C)& constref; // tail-const class reference!

 ref can still be used as a storage class, meaning you can use it as if it were a value type. But the & types will be explicitly a reference type, meaning:

 void foo(ref int x, int& y)
 {
     int z;
     assert(is(typeof(z) == typeof(x)));
     assert(!is(typeof(y) == typeof(x)) && !is(typeof(y) == typeof(z)));
     y = z; // illegal
     *y = z; // correct
     y = &z; // correct, but unsafe.
     y = x; // also illegal
     y = &x; // ok (possibly unsafe).
     y += 5; // compiler error or equivalent to *y += 5 ?
 }

 Basically, & is a pointer that only supports the operators = * and 'is'.

 I think conversion from a reference to a pointer should be available via a cast, but I'm not sure whether the compiler should allow class pointers. My gut feeling is no.

Isn't this already possible?

class Foo { }
Foo foo = new Foo;
Foo* ptr = cast(Foo*) foo;

// ptr is not a pointer to the reference.
assert (ptr != &foo);

 Are there any syntax ambiguities here?  & is also a binary op, but then 
 again, so is *.  Will there be an issue with && ? I mean, because the 
 references are rebindable, you should be able to have a reference to a 
 reference.
 
 Also, struct references in this way will be usable in safe D, enabling 
 heap struct data!
 
 The more I think about it, the more I like having an explicit reference 
 denotation for classes, with the compiler enforcing that you simply 
 can't use class data as a value type.  This basically makes all the 
 tail-X syntax just work, and still retains the benefits of classes being 
 reference-only types.  The tail shared problem is a really crappy issue, 
 worse than tail-const.

I like it too. :) But I'm not sure we should use the & symbol:

Foo& foo = new Foo; // & denotes a reference
Bar* bar = &barValue; // & returns a pointer

Yes, C++ does it, but it's still ugly.

-Lars
Jan 18 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Jan 2010 12:12:06 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 I think we are too late for D2, the book is pretty much finished except  
 for the concurrency chapter.  It is a great idea though, I would have  
 loved to see this happen before D2 was released.  Maybe D3 can have this  
 change.

I forgot one really really important requirement -- a class reference needs to not be a general pointer.

For example:

Foo *foo;
foo++; // must be equivalent to foo.opInc(), not add one to the pointer.

This may be the death knell of the idea. I don't see Walter accepting similar syntaxes to be drastically different, and distinguishing safe D as not using pointers is going to be a huge problem if you can't use classes.

-Steve
Jan 18 2010
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Jan 18, 2010 at 9:30 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Mon, 18 Jan 2010 12:12:06 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 I think we are too late for D2, the book is pretty much finished except
 for the concurrency chapter.  It is a great idea though, I would have loved
 to see this happen before D2 was released.  Maybe D3 can have this change.

 I forgot one really really important requirement -- a class reference needs
 to not be a general pointer.

 For example:

 Foo *foo;
 foo++; // must be equivalent to foo.opInc(), not add one to the pointer.

 This may be the death knell of the idea.  I don't see Walter accepting
 similar syntaxes to be drastically different, and distinguishing safe D as
 not using pointers is going to be a huge problem if you can't use classes.

I think one can argue it would be better for pointer arithmetic to require a more in-your-face notation, since pointer arithmetic isn't usually needed and won't even be allowed in SafeD.

It isn't really standard arithmetic to begin with, since foo++ (pointer meaning) is more like foo += sizeof(Foo); Also adding pointers together is not allowed. These two things, I think, form the basis of a decent justification for making pointer arithmetic syntactically distinct from regular arithmetic.

So, for instance, you could require distinct operators like *++, *+, *- for pointer manipulations.

Maybe we're on to another big mistake of C here -- conflation of pointers with numbers. Or at least a corollary to the mistake of conflating arrays with pointers.

The question that then comes up is can you do overloading of those new operators? If you want to make something that is "pointer-like", then it would be necessary to do so. But probably only value-types should be allowed to act in a pointer-like manner. So you could only overload pointer arithmetic operators in a struct. But I'm not sure there's a use case for even that.

--bb
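The "foo++ is more like foo += sizeof(Foo)" remark can be checked with a small sketch in current D. The struct Foo and the helper stepInBytes are invented for the example, and the code is @system since it does raw pointer arithmetic.

```d
struct Foo { int a; int b; }  // hypothetical example type

// How far, in bytes, does ++ move a Foo*?
size_t stepInBytes()
{
    Foo[2] arr;
    Foo* p = &arr[0];
    const before = cast(size_t) p;
    p++;                                // pointer increment, not integer increment
    return (cast(size_t) p) - before;
}

void main()
{
    // The address advances by one whole element, not by 1.
    assert(stepInBytes() == Foo.sizeof);
}
```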
Jan 18 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Jan 2010 13:35:01 -0500, Bill Baxter <wbaxter gmail.com> wrote:

 I think one can argue it would be better for pointer arithmetic to
 require a more in-your-face notation, since pointer arithmetic isn't
 usually needed and won't even be allowed in SafeD.

 It isn't really standard arithmetic to begin with, since foo++
 (pointer meaning) is more like  foo += sizeof(Foo);
 Also adding pointers together is not allowed.  These two things, I
 think, form the basis of a decent justification for making pointer
 arithmetic syntactically distinct from regular arithmetic.

 So, for instance, you could require distinct operators like  *++, *+,
 *-  for pointer manipulations.

 Maybe we're on to another big mistake of C here -- conflation of
 pointers with numbers.  Or at least a corollary to the mistake of
 conflating of arrays with pointers.

 The question that then comes up is can you do overloading of those new
 operators?  If you want to make something that is "pointer-like", then
 it would be necessary to do so.  But probably  only value-types should
 be allowed to act in a pointer-like manner.  So you could only
 overload pointer arithmetic operators in a struct.  But I'm not sure
 there's a use case for even that.

Pointers are essentially unrestricted arrays also. So you'd also have to disallow ptr[x]. Basically any operation that normally gets forwarded to a class reference but would be intercepted by the pointer has to be given a new syntax.

I'm wondering, is there an issue with signifying rebindable references like C++'s references, i.e. C& ? (question mark separated for clarity)

C& classref = new C();
C& classref2 = classref; // does not create a new instance, but just copies the reference
S& structref = new S(); // struct reference!
S& structref2 = structref; // structref2 and structref bind to same data.
S structref3 = *structref2; // must dereference to access value
*classref2 = *classref; // Error! illegal to dereference class references.
S& structref4 = &structref3; // pointer implicitly casts to reference.
C classref3; // illegal
const(C)& constref; // tail-const class reference!

ref can still be used as a storage class, meaning you can use it as if it were a value type. But the & types will be explicitly a reference type, meaning:

void foo(ref int x, int& y)
{
    int z;
    assert(is(typeof(z) == typeof(x)));
    assert(!is(typeof(y) == typeof(x)) && !is(typeof(y) == typeof(z)));
    y = z; // illegal
    *y = z; // correct
    y = &z; // correct, but unsafe.
    y = x; // also illegal
    y = &x; // ok (possibly unsafe).
    y += 5; // compiler error or equivalent to *y += 5 ?
}

Basically, & is a pointer that only supports the operators = * and 'is'.

I think conversion from a reference to a pointer should be available via a cast, but I'm not sure whether the compiler should allow class pointers. My gut feeling is no.

Are there any syntax ambiguities here? & is also a binary op, but then again, so is *. Will there be an issue with && ? I mean, because the references are rebindable, you should be able to have a reference to a reference.

Also, struct references in this way will be usable in safe D, enabling heap struct data!

The more I think about it, the more I like having an explicit reference denotation for classes, with the compiler enforcing that you simply can't use class data as a value type. This basically makes all the tail-X syntax just work, and still retains the benefits of classes being reference-only types. The tail shared problem is a really crappy issue, worse than tail-const.

-Steve
Jan 18 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Jan 2010 15:08:56 -0500, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 Steven Schveighoffer wrote:
 I think conversion from a reference to a pointer should be available  
 via a cast, but I'm not sure whether the compiler should allow class  
 pointers.  My gut feeling is no.

 Isn't this already possible?

 class Foo { }
 Foo foo = new Foo;
 Foo* ptr = cast(Foo*) foo;

 // ptr is not a pointer to the reference.
 assert (ptr != &foo);

Hm.. does ptr actually work though? My guess is it would not. i.e. I think this would not work:

class Foo
{
    int x;
    void bar() { writeln(x); }
}

Foo foo = new Foo;
foo.x = 42;
Foo* ptr = cast(Foo*) foo;
ptr.bar(); // I predict at least that it does not print 42, possibly it's a crash.

The reason is that Foo* means "a pointer to a reference to a Foo." This is the whole problem with builtin references: you can't get at the type without including the reference.

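A small sketch of what Foo* does mean in today's D, for a class Foo. The helper names are made up for the example. Taking the address of the variable gives a pointer to the reference, and writing through it rebinds the variable rather than touching the object.

```d
class Foo { int x; }  // hypothetical example class

// Reading through a Foo* first loads the reference, then the field.
int readThroughPointer()
{
    Foo foo = new Foo;
    foo.x = 42;
    Foo* ptr = &foo;          // points at the variable foo, not at the object
    return (*ptr).x;
}

// Assigning through a Foo* rebinds the variable it points to.
bool assignRebinds()
{
    Foo foo = new Foo;
    Foo other = new Foo;
    Foo* ptr = &foo;
    *ptr = other;
    return foo is other;
}

void main()
{
    assert(readThroughPointer() == 42);
    assert(assignRebinds());
}
```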
 Are there any syntax ambiguities here?  & is also a binary op, but then  
 again, so is *.  Will there be an issue with && ? I mean, because the  
 references are rebindable, you should be able to have a reference to a  
 reference.
  Also, struct references in this way will be usable in safe D, enabling  
 heap struct data!
  The more I think about it, the more I like having an explicit  
 reference denotation for classes, with the compiler enforcing that you  
 simply can't use class data as a value type.  This basically makes all  
 the tail-X syntax just work, and still retains the benefits of classes  
 being reference-only types.  The tail shared problem is a really crappy  
 issue, worse than tail-const.

 I like it too. :) But I'm not sure we should use the & symbol:

 Foo& foo = new Foo; // & denotes a reference
 Bar* bar = &barValue; // & returns a pointer

 Yes, C++ does it, but it's still ugly.

C++ is quite different: C++ uses & to denote the same thing as ref does in D, except you can declare them wherever, not just in parameter lists, i.e.:

Foo foo, other;
Foo& foo2 = foo; // one and only time you can bind foo2
foo2 = other; // equivalent to saying foo = other, does not rebind foo2.
Foo& foo3 = new Foo; // Error, new Foo returns a pointer, cannot assign to a reference

The second line you have there is a bit troubling. I agree it doesn't look very good. Note that we already live with * having three meanings -- denoting a pointer, dereferencing a pointer, and multiplication. In all 3 cases the syntax is unambiguous, but the meaning isn't so clear to a person reading it. It's roughly the same as & would be.

I wonder if there are better symbols?

Foo$
Foo%
Foo#

-Steve
Jan 18 2010
prev sibling next sibling parent Trass3r <un known.com> writes:
 How is that confusing to people with a C background?

Foo* foo = new Foo; would make people think foo is a pointer to an instance of Foo, not a reference. Furthermore it would be inconsistent with normal pointers (int*, ...)
Jan 18 2010
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Denis Koroskin wrote:
 You know, I'm not usually a guy who proposes radical changes to the 
 language. I like the way D is, but there are a few proposals that keep 
 popping up once in a while without a good rationale, and this simple 
 idea stroke my head solves many of the issues that were found in 
 previous proposals, so I decided to share it with you.
 
 It's as simple as that: require '*' to denote reference type. What does 
 it give?
 
 1) Consistency between classes and structs:
 
 Struct* s = new Struct();
 Class* c = new Class();
 
 It allows easier transition between classes and structs (please note 
 that I don't propose any changes to class semantics):
 Foo* f = new Foo(); // works for both classes and structs
 
 2) .sizeof consistency, get rid of __traits(classInstanceSize, Foo) 
 (deprecate in favor of Foo.sizeof):
 
 Foo* f = cast(Foo*)malloc(Foo.sizeof); // works for both classes and 
 structs (and enums, unions, etc)
 f.__ctor(args);
 
 3) No more issues with tail-const, tail-shared, tail-immutable; 
 deprecate Rebindable (this one was recently discussed):
 
 shared(Foo)* foo; // local pointer to shared type, works for both 
 classes and structs
 
 Please note that we get these bonuses by only enforcing '*' to denote 
 refence type, not a huge change to compiler IMO. It *will* break 
 existing code, but the fix is rather trivial.
 
 To be continued.

As others have mentioned, using the same syntax for pointers and references may not be such a good idea. That said, I've always wanted a general way to create references, for instance a "ref" type constructor:

// Compulsory for classes.
class Foo { }
ref(Foo) foo = new Foo;

// Symmetry between classes and structs.
struct Bar { }
ref(Bar) bar = new Bar;

// 'new' *always* returns a reference.
ref(int) i = new int;

-Lars
Jan 18 2010
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Mon, 18 Jan 2010 21:08:57 +0300, Trass3r <un known.com> wrote:

 How is that confusing to people with a C background?

Foo* foo = new Foo; would make people think foo is a pointer to an instance of Foo, not a reference.

What's the difference?
 Furthermore it would be inconsistent with normal pointers (int*, ...)

Why is it inconsistent?
Jan 18 2010