
digitalmars.D - Why can't we make reference variables?

reply "Tommi" <tommitissari hotmail.com> writes:
In the following example code there's a situation, where the data 
we're looking for already exists, the data has value semantics, 
finding the data takes quite a lot of time, we need to "use" the 
data on multiple occasions, and the size of the data is so large 
that we don't want to copy it.

In this situation, I think, the most convenient and sensible 
thing to do is to make a reference to the data, and use that 
reference multiple times. We could make a pointer, but then we'd 
be stuck with the nasty syntax of dereferencing:

struct Book // has value semantics
{
     // ... lots of data

     void read() const
     {
         // ...
     }
}

struct NationalLibrary
{
     immutable Book[] _books;

     ref immutable(Book) find(string nameOfBook)
     {
         auto idx = 0;
         // ... takes quite long to figure out the index
         return _books[idx];
     }
}

void main()
{
     NationalLibrary nl;

     // This is fine if we just want to read it once:
     nl.find("WarAndPeace").read();
	
     // ... but if we want to read it multiple times, we don't want
     // to each time go to the library and take the time to find it:

     immutable(Book)* ptrWarAndPeace = &nl.find("WarAndPeace");
     // And now we're stuck with this syntax:
     (*ptrWarAndPeace).read();
     (*ptrWarAndPeace).read();

     // I'd like to be able to do this:	
     // ref immutable(Book) refWarAndPeace = nl.find("WarAndPeace");
     // refWarAndPeace.read();
     // refWarAndPeace.read();
}

Foreach loops can make reference variables, and function calls 
can do it for the parameters passed in. So, my question is, 
wouldn't it be better if we could, in general, make reference 
variables?
Aug 28 2012
next sibling parent "cal" <callumenator gmail.com> writes:
On Wednesday, 29 August 2012 at 00:21:29 UTC, Tommi wrote:
 In this situation, I think, the most convenient and sensible 
 thing to do is to make a reference to the data, and use that 
 reference multiple times. We could make a pointer, but then 
 we'd be stuck with the nasty syntax of dereferencing:

This works currently:

struct Test
{
     void foo() const
     {
         writeln("FOO");
     }
}

void main()
{
     immutable(Test)* ptr = new immutable(Test);
     ptr.foo();
}
Aug 28 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 00:34:02 UTC, cal wrote:
 On Wednesday, 29 August 2012 at 00:21:29 UTC, Tommi wrote:
 In this situation, I think, the most convenient and sensible 
 thing to do is to make a reference to the data, and use that 
 reference multiple times. We could make a pointer, but then 
 we'd be stuck with the nasty syntax of dereferencing:

This works currently: struct Test { void foo() const { writeln("FOO"); } } void main() { immutable(Test)* ptr = new immutable(Test); ptr.foo(); }

Now, that's a surprise for someone coming from C++. But even though ptr looks like a reference variable in your example, it doesn't look like it at all in this example:

void main()
{
     int counter = 0;
     auto notQuiteRefCounter = &counter;

     // Increments the pointer, not counter value
     ++notQuiteRefCounter;

     // Can't do this
     // int counterBackup = notQuiteRefCounter;

     // Prints deref: 0 no-deref: 18FD34
     writefln("deref: %s no-deref: %s", *notQuiteRefCounter, notQuiteRefCounter);
}
Aug 28 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 29, 2012 02:21:28 Tommi wrote:
 Foreach loops can make reference variables, and function calls
 can do it for the parameters passed in. So, my question is,
 wouldn't it be better if we could, in general, make reference
 variables?

Not going to happen. Unfortunately though, I don't remember all of Walter's reasons for it, so I can't really say why (partly due to complications it causes in the language, I think, but I don't know).

Use a pointer, std.typecons.RefCounted, a class, or make your struct a reference type (which would probably mean having the data held in a separate struct with a pointer to it in the outer struct). It's really not hard to have a type which is a reference type if that's what you really want. You just can't declare a ref to a variable as a local variable.

And really, the only two differences between using a pointer and being able to directly declare a reference like you can in C++ are the fact that a pointer can be null and that operations which don't use . require that you dereference the pointer first (e.g. ==). So, while there may be cases where being able to do something like

ref var = otherVar;

would be nice, it really doesn't buy you all that much.

- Jonathan M Davis
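To make those two differences concrete, here is a minimal sketch. The Book type below is a hypothetical, trimmed-down stand-in for the one in the first post, not the original code.

```d
import std.stdio;

struct Book   // hypothetical stand-in, for illustration only
{
    string title;
    void read() const { writeln("reading ", title); }
}

void main()
{
    auto a = Book("WarAndPeace");
    auto b = Book("WarAndPeace");

    Book* p = &a;
    assert(p !is null);   // difference 1: a pointer can be null, so it may need checking

    p.read();             // member access needs no explicit dereference

    // difference 2: operators other than '.' act on the pointer itself
    assert(p != &b);      // compares addresses
    assert(*p == b);      // compares the Book values
}
```

So the explicit dereference only shows up for things like comparison, assignment and arithmetic, not for member calls.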
Aug 28 2012
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
Not exactly the same thing (what you propose would have different
IFTI behaviour), but works quite well:

import std.stdio;

struct Ref(T){
     private T* _payload;
     this(ref T i){_payload = &i; }
     @property ref T deref(){ return *_payload; }
     alias deref this;
}
auto ref_(T)(ref T arg){return Ref!T(arg);}

void main(){
     int i,j;
     auto r = i.ref_;
     r++;
     auto q = r;
     writeln(r," ",q);
     q++;
     writeln(r," ",q);
     q = j.ref_;
     q++;
     writeln(r," ",q.deref);
}
Aug 28 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 01:28:49 UTC, Jonathan M Davis 
wrote:
 Not going to happen. Unfortunately though, I don't remember all 
 of Walter's reasons for it, so I can't really say why (partly
 due to complications it causes in the language, I think, but I
 don't know).

I'd really like to hear about those complications (unless they're too complicated for me to understand), because for someone like me, not knowing the implementation details of the language, it looks like the language already *has* implemented reference variables. We just can't create them, apart from these few distinct ways:

foreach (ref actualRefVariable; array)
{
     ++actualRefVariable; // <-- that's a reference variable alright
}

void fun(ref int actualRefVariable)
{
     ++actualRefVariable; // <-- that's a reference variable alright
}
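Since a ref parameter already behaves like a reference variable, one workaround is to hand the result of a ref-returning lookup straight to a nested function. The following is only a sketch under assumptions: Book, find and readTwice are hypothetical names, and the linear search stands in for the slow lookup.

```d
import std.stdio;

struct Book   // hypothetical, trimmed-down Book
{
    string title;
    int timesRead;
}

// Stand-in for the slow lookup; returns by reference, so nothing is copied.
ref Book find(Book[] books, string title)
{
    foreach (ref b; books)
        if (b.title == title)
            return b;
    assert(0, "no such book");
}

void main()
{
    auto books = [Book("WarAndPeace"), Book("Ulysses")];

    // The ref parameter plays the role of the local reference variable.
    void readTwice(ref Book book)
    {
        ++book.timesRead;
        ++book.timesRead;
    }

    readTwice(books.find("WarAndPeace"));   // one lookup, two uses
    assert(books[0].timesRead == 2);        // the array element itself changed
    writeln(books[0].timesRead);
}
```

The reference lives exactly as long as the call, which is the same lifetime guarantee that foreach and ordinary ref parameters rely on.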
Aug 28 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 01:42:36 UTC, Timon Gehr wrote:
 Not exactly the same thing (what you propose would have 
 different
 IFTI behaviour), but works quite well:

 import std.stdio;

 struct Ref(T){
     private T* _payload;
     this(ref T i){_payload = &i; }
     @property ref T deref(){ return *_payload; }
     alias deref this;
 }
 auto ref_(T)(ref T arg){return Ref!T(arg);}

 void main(){
     int i,j;
     auto r = i.ref_;
     r++;
     auto q = r;
     writeln(r," ",q);
     q++;
     writeln(r," ",q);
     q = j.ref_;
     q++;
     writeln(r," ",q.deref);
 }

I did figure that that's possible. But, to me, having reference variables be implemented in a library instead of them being a core language feature, is like having pointers implemented as a library. I'd like to have a good answer when some newcomer asks me: "why, oh why is this so?".
Aug 28 2012
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 29 Aug 2012 03:16:20 +0200
"Tommi" <tommitissari hotmail.com> wrote:

 On Wednesday, 29 August 2012 at 00:34:02 UTC, cal wrote:
 On Wednesday, 29 August 2012 at 00:21:29 UTC, Tommi wrote:
 In this situation, I think, the most convenient and sensible 
 thing to do is to make a reference to the data, and use that 
 reference multiple times. We could make a pointer, but then 
 we'd be stuck with the nasty syntax of dereferencing:

This works currently: struct Test { void foo() const { writeln("FOO"); } } void main() { immutable(Test)* ptr = new immutable(Test); ptr.foo(); }

Now, that's a surprise for someone coming from C++. But even though ptr looks like a reference variable in your example, it doesn't look like it at all in this example:

I've been primarily a D guy for years, and even I'm surprised by that! O_O
Aug 28 2012
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Wed, 29 Aug 2012 03:44:38 +0200
"Tommi" <tommitissari hotmail.com> wrote:

 On Wednesday, 29 August 2012 at 01:28:49 UTC, Jonathan M Davis 
 wrote:
 Not going to happen. Unfortunately though, I don't remember all 
 of Walter's reasons for it, so I can't really say why (partly
 due to complications it causes in the language, I think, but I
 don't know).

I'd really like to hear about those complications (unless they're too complicated for me to understand),

I don't remember exactly either, but IIRC, it would somehow make it impossible to guarantee...something about references not escaping their proper scope...
Aug 28 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
I think there's an (undocumented?) Ref class in some file 
(object_.d?)
Aug 28 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 29 August 2012 at 02:28:09 UTC, Mehrdad wrote:
 I think there's an (undocumented?) Ref class in some file 
 (object_.d?)

er, struct
Aug 28 2012
prev sibling next sibling parent "anonymous" <anonymous example.com> writes:
On Wednesday, 29 August 2012 at 02:07:19 UTC, Nick Sabalausky 
wrote:
 On Wed, 29 Aug 2012 03:16:20 +0200
 "Tommi" <tommitissari hotmail.com> wrote:

 On Wednesday, 29 August 2012 at 00:34:02 UTC, cal wrote:
 On Wednesday, 29 August 2012 at 00:21:29 UTC, Tommi wrote:
 In this situation, I think, the most convenient and 
 sensible thing to do is to make a reference to the data, 
 and use that reference multiple times. We could make a 
 pointer, but then we'd be stuck with the nasty syntax of 
 dereferencing:

This works currently: struct Test { void foo() const { writeln("FOO"); } } void main() { immutable(Test)* ptr = new immutable(Test); ptr.foo(); }

Now, that's a surprise for someone coming from C++. But even though ptr looks like a reference variable in your example, it doesn't look like it at all in this example:

I've been primarily a D guy for years, and even I'm surprised by that! O_O

You didn't know that the dot operator does dereference? That's quite a big one to miss for years.
Aug 28 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 29 August 2012 at 01:28:49 UTC, Jonathan M Davis 
wrote:
 On Wednesday, August 29, 2012 02:21:28 Tommi wrote:
 Foreach loops can make reference variables, and function calls 
 can do it for the parameters passed in. So, my question is, 
 wouldn't it be better if we could, in general, make reference 
 variables?

Not going to happen. Unfortunately though, I don't remember all of Walter's reasons for it, so I can't really say why (partly due to complications it causes in the language, I think, but I don't know). Use a pointer, std.typecons.RefCounted, a class, or make your struct a reference type (which would probably mean having the data held in a separate struct with a pointer to it in the outer struct). It's really not hard to have a type which is a reference type if that's what you really want. You just can't declare a ref to a variable as a local variable.

You would need a flag added to EVERY variable and item to specify if it was stack allocated or not. Otherwise it would be quite annoying to deal with. The compiler has to work blindly, assuming everything is correct. You can ref what you've been given, but making more permanent references isn't possible without bypassing the safeguards.

Assuming 'ref' works:

  struct S {
    ref int r;
  }

  //ref local variable/stack, Ticking timebomb
  //compiler may refuse
  void useRef(ref S input, int r) {
    input.r = r;
  }

  //should be good, right?
  S useRef2(S input, ref int r) { //Can declare safe, right???
    input.r = r; //maybe, maybe not.
    return input;
  }

  //Why should indirect care if it's local/stack or heap?
  S indirect(ref int r) {
    return useRef2(S(), r);
  }

  //local variables completely okay to ref! Right?
  S indirect2() {
    int r;
    return useRef2(S(), r);
  }

  S someScope() {
    int* pointer = new int(31); //i think that's right
    int local = 127;
    S s;

    //reference to calling stack! (which may be destroyed now);
    //Or worse it may silently work for a while
    useRef(s, 99);
    assert(s.r == 99);
    return s;

    s = useRef2(s, pointer); //or is it *pointer?
    assert(s.r == 31); //good so far if it passes correctly
    return s; //good, heap allocated

    s = useRef2(s, local);
    assert(s.r == 127); //good so far (still local)
    return s; //Ticking timebomb!

    s = indirect(local);
    assert(s.r == 127); //good so far (still local)
    return s; //timebomb!

    s = indirect2();
    return s; //already destroyed! Unknown consequences!
  }

Assuming a flag is silently passed to ensure if it is stack or heap allocated, then useRef2 silently becomes...

  //safe! But not C callable!
  S useRef2(S input, ref int r, bool __r_isHeap) {
    assert(__r_isHeap, "Cannot use a stack allocated variable!");
    input.r = r; //now safe-able!
    return input;
  }

Or at the end of the scope it could check all structs if the ref's had a silent flag specifying if a referenced variable was set for local, so it would assert/throw an exception at the end of the call.

Am I wrong?
Aug 28 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 29 August 2012 at 02:57:27 UTC, Era Scarecrow wrote:
 You would need a flag added to EVERY variable and item to 
 specify if it was stack allocated or not. Otherwise it would be 
 quite annoying to deal with. The compiler has to work blindly, 
 assuming everything is correct. You can ref what you've been 
 given but making more permanent references aren't possible 
 without bypassing the safeguards.

To add on to this a little.. If you can only ref by calling, you ensure the variable is alive/valid when you are calling it regardless if it's stack or heap or global (But can't ensure it once the scope ends). Making a reference as a variable modifier won't work, but using a pointer you're already going lower level and it's all up to you the programmer to make sure it's right.

I think that's right; Otherwise ref wouldn't be allowed in @safe code (at all).
Aug 28 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 29, 2012 04:40:17 anonymous wrote:
 On Wednesday, 29 August 2012 at 02:07:19 UTC, Nick Sabalausky
 
 wrote:
 On Wed, 29 Aug 2012 03:16:20 +0200
 
 "Tommi" <tommitissari hotmail.com> wrote:
 On Wednesday, 29 August 2012 at 00:34:02 UTC, cal wrote:
 On Wednesday, 29 August 2012 at 00:21:29 UTC, Tommi wrote:
 In this situation, I think, the most convenient and
 sensible thing to do is to make a reference to the data,
 and use that reference multiple times. We could make a
 pointer, but then we'd be stuck with the nasty syntax of

 dereferencing:

struct Test { void foo() const { writeln("FOO"); } } void main() { immutable(Test)* ptr = new immutable(Test); ptr.foo(); }

Now, that's a surprise for someone coming from C++. But even though ptr looks like a reference variable in your example, it

 doesn't look like it at all in this example:

by that! O_O

You didn't know that the dot operator does dereference? That's quite a big one to miss for years.

Yeah. I'm a bit confused about what's so surprising about that code.

- Jonathan M Davis
Aug 28 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 03:21:04 UTC, Jonathan M Davis 
wrote:
 void main()
 {
 
     immutable(Test)* ptr = new immutable(Test);
     ptr.foo();
 
 }

Now, that's a surprise for someone coming from C++. But even though ptr looks like a reference variable in your example, it

 doesn't look like it at all in this example:

by that! O_O

You didn't know that the dot operator does dereference? That's quite a big one to miss for years.

Yeah. I'm a bit confused about what's so surprising about that code. - Jonathan M Davis

The weird thing is that you can use a member access operator with a pointer (without explicitly dereferencing the pointer first). At least I didn't know what to expect the following code to print:

struct MyStruct
{
     int _value = 0;

     void increment()
     {
         ++_value;
     }
}

void increment(ref MyStruct* ptr)
{
     ++ptr;
}

void main()
{
     MyStruct[2] twoStructs;
     twoStructs[1]._value = 42;
     MyStruct* ptrFirstStruct = &twoStructs[0];

     // Are we incrementing the pointer using UFCS or
     // are we calling the member function in MyStruct?
     ptrFirstStruct.increment();

     // This prints 1, so we called the actual method
     writeln((*ptrFirstStruct)._value);
}
Aug 28 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 03:17:39 UTC, Era Scarecrow wrote:
  To add on to this a little.. If you can only ref by calling, 
 you ensure the variable is alive/valid when you are calling it 
 regardless if it's stack or heap or global (But can't ensure it 
 once the scope ends). Making a reference as a variable modifier 
 won't work, but using a pointer you're already going lower 
 level and it's all up to you the programmer to make sure it's 
 right.

  I think that's right; Otherwise ref wouldn't be allowed in 
  @safe code (at all).

But couldn't the compiler disallow all unsafe ref use in @safe code, and allow all use of ref in @system and @trusted code?
Aug 28 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 29, 2012 06:46:25 Tommi wrote:
 The weird thing is that you can use a member access operator with
 a pointer (without explicitly dereferencing the pointer first).

Well, you clearly haven't done much with pointers to structs in D, or that wouldn't be surprising at all. The . operator always implicitly dereferences the pointer, which pretty much makes the -> operator completely unnecessary.

- Jonathan M Davis
Aug 28 2012
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Tue, 28 Aug 2012 22:08:11 -0700
Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Wednesday, August 29, 2012 06:46:25 Tommi wrote:
 The weird thing is that you can use a member access operator with
 a pointer (without explicitly dereferencing the pointer first).

Well, you clearly haven't done much pointers to structs in D,

I indeed haven't :) Usually "ref MyStruct" is good enough for my needs.
 or that
 wouldn't be surprising at all. . always implicitly dereferences the
 pointer, which pretty much makes the -> operator completely
 unnecessary.
 

I always figured it was just the reference semantics for classes (and the optional "ref" when passing structs) that eliminated the need for ->. Probably 99+% of the time I use structs it's either a plain-old-struct or "ref MyStruct", so I assumed C-style "(*foo).bar" was good enough for the rare uses of "MyStruct*", and it never bothered me. But it's definitely pretty cool that dot still works even for pointers. It's awesome that D is still pleasantly surprising me :) Out of curiosity, what about "MyStruct**" or "MyClass*"?
Aug 28 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 29, 2012 02:04:44 Nick Sabalausky wrote:
 Out of curiosity, what about "MyStruct**" or "MyClass*"?

I'd have to try it, but my guess would be that it only implicitly dereferences one level (though in the case of MyClass*, that might be enough, since there's no dereferencing needed on class references - though it's _very_ rare that having a pointer to a reference like that makes sense in the first place). - Jonathan M Davis
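
One way to settle the guess without relying on memory is to ask the compiler directly. This is a minimal sketch with a hypothetical MyStruct; it only prints what __traits(compiles) reports for the two-level case rather than asserting an answer.

```d
import std.stdio;

struct MyStruct   // hypothetical type, for illustration only
{
    int value = 7;
}

void main()
{
    MyStruct s;
    MyStruct* p = &s;
    MyStruct** pp = &p;

    // One level of indirection: the dot dereferences implicitly.
    assert(p.value == 7);

    // Two levels: let the compiler say whether the dot is enough.
    writeln("pp.value compiles: ", __traits(compiles, pp.value));

    // Explicitly dereferencing the outer pointer works either way.
    assert((*pp).value == 7);
}
```

The same __traits(compiles, ...) check can be pointed at a MyClass* to answer the other half of the question.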
Aug 28 2012
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 28 Aug 2012 22:08:11 -0700
schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 On Wednesday, August 29, 2012 06:46:25 Tommi wrote:
 The weird thing is that you can use a member access operator with
 a pointer (without explicitly dereferencing the pointer first).

Well, you clearly haven't done much pointers to structs in D, or that wouldn't be surprising at all. . always implicitly dereferences the pointer, which pretty much makes the -> operator completely unnecessary. - Jonathan M Davis

P.S.: Also not much Delphi, which did the same before when innovating over Pascal. (There never was a '->' though, just dereference then access fields.) -- Marco
Aug 29 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 29 August 2012 at 04:54:31 UTC, Tommi wrote:
 On Wednesday, 29 August 2012 at 03:17:39 UTC, Era Scarecrow
 I think that's right; Otherwise ref wouldn't be allowed in 
  @safe code (at all).

But couldn't the compiler disallow all unsafe ref use in @safe code, and allow all use of ref in @system and @trusted code?

So.... Ref can only then be used if it's heap allocated... That would be the only safe way, but even that is ridden with timebombs.

  int[10] xyz;

  foreach(i; xyz) {
    //all's good, they are copies!
  }

  foreach(ref i; xyz) {
    // cannot use ref, (not even const ref) as it may be unsafe
  }

  //a ref function
  void func(ref int i);

  int *z = new int;
  func(z); //or is it *z? 'Might' be good
  func(xyz[0]); //Compile error, may be unsafe.

  class C {
    int cc;
  }

  struct S {
    int sc;
  }

  C cSomething = new C();
  S sSomething = S();
  S* sSomething2 = new S();

  func(cSomething.cc);  //good, heap allocated enforced
  func(sSomething.sc);  //error, may be unsafe!
  func(sSomething2.sc); //pointer dereference; It bypasses protections since it
                        //probably is heap (but not guaranteed), so maybe this
                        //should fail too.

  sSomething2 = &sSomething;
  func(sSomething2.sc); //back to a timebomb situation but the compiler can't
                        //verify it! only something from a class could be
                        //considered safe (maybe)

Why is it unsafe? Let's assume we did this:

  ref int unsafe; //global ref. Hey it's legal if ref is allowed as you want it!

  void func(ref int i) {
    unsafe = i;
  }

Now I ask you. If any of the 'unsafe' ones were used and then the scope ended that held that information and we use unsafe now, it can only be a time bomb: All of them stem from them being on the stack.

  void someFunc() {
    int stacked = 42;
    func(stacked);
  }

  someFunc();
  writeln(unsafe); //what will this print? Will it crash?

  //some scope. FOR loop? IF statement? who cares?
  {
    int local = 42;
    func(&local); //assuming the pointer bypasses and gets dereferenced to get
                  //past the heap only compiler problems (pointers and classes
                  //only) limitations
    writeln(unsafe); //prints 42 as expected
  }

  writeln(unsafe); //????? Scope ended. Now what?
                   //can't ensure local exists anymore as we reffed a stack

  class Reffed {
    ref int i;
  }

  //some scope
  Reffed r = new Reffed();
  Reffed r2 = new Reffed();

  @trusted {
    int local;
    r.i = local; //may or may not work normally, @trusted should force it
  }

  //Why not copy a reference from a class?
  //it should be good (heap only), right?
  r2.i = r.i;

  writeln(r.i);  //?????
  writeln(r2.i); //?!?!

To make it 'safe' the only way to do it is ref can only be used on heap allocated data (or global fixed data). The current implementation as I see it anything referenced is safe because you can only reference something that currently exists (via function calls).

The other way to make it safe is to silently include a flag that specifies if it was stack allocated or not, but even that becomes more complicated and more of an issue.
Aug 29 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 02:57:27 UTC, Era Scarecrow wrote:
  Assuming 'ref' works:

  struct S {
    ref int r;
  }

  //ref local variable/stack, Ticking timebomb
  //compiler may refuse
  void useRef(ref S input, int r) {
    input.r = r;
  }

I think we might be talking about somewhat different things. What I mean by reference variable is what the term means in C++. From wikipedia: http://en.wikipedia.org/wiki/Reference_%28C%2B%2B%29

C++ references differ from pointers in several essential ways:

* It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.

* Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.

* References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid. Note that for this reason, containers of references are not allowed.

* References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor. For example:

  int& k; // compiler will complain: error: `k' declared as reference but not initialized
Aug 29 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 29 August 2012 at 12:41:26 UTC, Tommi wrote:
 I think we might be talking about somewhat different things. 
 What I mean by reference variable is what the term means in 
 C++. From wikipedia: 
 http://en.wikipedia.org/wiki/Reference_%28C%2B%2B%29

 C++ references differ from pointers in several essential ways:

 * It is not possible to refer directly to a reference object 
 after it is defined; any occurrence of its name refers directly 
 to the object it references.

 * Once a reference is created, it cannot be later made to 
 reference another object; it cannot be reseated. This is often 
 done with pointers.

 *References cannot be null, whereas pointers can; every 
 reference refers to some object, although it may or may not be 
 valid. Note that for this reason, containers of references are 
 not allowed.

 * References cannot be uninitialized. Because it is impossible 
 to reinitialize a reference, they must be initialized as soon 
 as they are created. In particular, local and global variables 
 must be initialized where they are defined, and references 
 which are data members of class instances must be initialized 
 in the initializer list of the class's constructor. For 
 example: int& k; // compiler will complain: error: `k' declared 
 as reference but not initialized

Then most of my examples change to going into the constructor rather than out, and the global one goes away. But it still doesn't help with the problem of anything stack allocated.

  struct S {
    //default constructor otherwise then fails to work because of ref? Since
    //x must be known at compile time so it would be null to start with, or no
    //default constructor
    ref int x;
    this(ref int r) { x = r;}
  }

  S func() {
    int local = 42;
    S s = S(local);
    return s;
  }

  S func(ref int reffed) {
    S s = S(reffed);
    return s;
  }

  S s = func();

  writeln(s.x); //mystery value still!

  {
    int local = 42;
    s = func(local);
    writeln(s.x); //42
  }

  writeln(s.x); //!?!?

Unless I'm understanding this wrong (And I'm tired right now maybe I missed something), once again the only safe 'reference variable' is from actively being called from a live accessible reference; And anything else is still a ticking timebomb and cannot ever be @safe. Stuff referencing within heaped/class related stuff may suffer fewer problems, but only as long as you rely on the GC and not do any magic involving malloc/free, or maybe a dynamic range.

Reminds me. The example for the foreach are missing parts.

  int[10] fixed;
  int[] dynamic;
  dynamic.length = 10;

  foreach(i; fixed) {
     //fine, i is a copy
  }

  foreach(ref i; fixed) {
      //compile-time error for @safe checking
      //cannot ensure this is safe otherwise
  }

  foreach(i; dynamic) {
     //fine, i is a copy
  }

  foreach(ref i; dynamic) {
     //Dynamic/heap allocated, so fine.
  }

  //from function/ref
  void func(int[] huh) {
    foreach(ref i; huh) {
      //????? safe?
    }
  }

  //both legal last I checked.
  func(fixed);
  func(dynamic);

My god, as I look at this more, reference variables would cripple D so badly C++ wouldn't look half bad afterwards.
Aug 29 2012
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
 *References cannot be null, whereas pointers can; every 
 reference refers to some object, although it may or may not be 
 valid. Note that for this reason, containers of references are 
 not allowed.

 * References cannot be uninitialized. Because it is impossible 
 to reinitialize a reference, they must be initialized as soon 
 as they are created. In particular, local and global variables 
 must be initialized where they are defined, and references 
 which are data members of class instances must be initialized 
 in the initializer list of the class's constructor. For example:
 int& k; // compiler will complain: error: `k' declared as 
 reference but not initialized

That would be a dream: not null references. I still think that D needs something like that. And I'm not talking about some struct constructs like NotNullable, which will be added in std.typecons later. I'm talking about built-in support.
Aug 29 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 29 August 2012 at 15:49:26 UTC, Namespace wrote:
 *References cannot be null, whereas pointers can; every 
 reference refers to some object, although it may or may not be 
 valid. Note that for this reason, containers of references are 
 not allowed.

 * References cannot be uninitialized. Because it is impossible 
 to reinitialize a reference, they must be initialized as soon 
 as they are created. In particular, local and global variables 
 must be initialized where they are defined, and references 
 which are data members of class instances must be initialized 
 in the initializer list of the class's constructor. For 
 example:
 int& k; // compiler will complain: error: `k' declared as 
 reference but not initialized

That would be a dream: not null references. I'm still think that D needs something like that. And I'm not talking about of some struct constructs like NotNullable, which will be added in std.typecons later. I'm talking about built-in support.

+1
Aug 29 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 14:17:21 UTC, Era Scarecrow wrote:
  Then most of my examples change to going into the constructor 
 rather than out, and the global one goes away. But it still 
 doesn't help with the problem of anything stack allocated.

  struct S {
    //default construtor otherwise then fails to work because of 
 ref? Since
    //x must be known at compile time so it would be null to 
 start with, or no
    //default onstructor
    ref int x;
    this(ref int r) { x = r;}
  }

  S func() {
    int local = 42;
    S s = S(local);
    return s;
  }

  S func(ref int reffed) {
    S s = S(reffed);
    return s;
  }

  S s = func();

  writeln(s.x); //mystery value still!

  {
    int local = 42;
    s = func(local);
    writeln(s.x); //42
  }

  writeln(s.x); //!?!?


  Unless I'm understanding this wrong (And I'm tired right now 
 maybe I missed something), once again the only safe 'reference 
 variable' is from actively being called from a live accessible 
 reference; And anything else is still a ticking timebomb and 
 cannot ever be @safe. Stuff referencing within heaped/class 
 related stuff may suffer fewer problems, but only as long as 
 you rely on the GC and not do any magic involving malloc/free, 
 or maybe a dynamic range.

  Reminds me. The example for the foreach are missing parts.

  int[10] fixed;
  int[] dynamic;
  dynamic.length = 10;

  foreach(i; fixed) {
     //fine, i is a copy
  }

  foreach(ref i; fixed) {
      //compile-time error for @safe checking
      //cannot ensure this is safe otherwise
  }

  foreach(i; dynamic) {
     //fine, i is a copy
  }

  foreach(ref i; dynamic) {
     //Dynamic/heap allocated, so fine.
  }

  //from function/ref
  void func(int[] huh) {
    foreach(ref i; huh) {
      //????? safe?
    }
  }

  //both legal last I checked.
  func(fixed);
  func(dynamic);



  My god, as I look at this more, reference variables would 
 cripple D so badly C++ wouldn't look half bad afterwards.

I honestly don't know anything about memory safety and what it entails. So, I can't comment about any of that stuff. But if you say general ref variables can't be implemented in @safe mode, then I can just take your word for it. But what I'm saying is that ref variables would be a nice feature to have in @system code.

My logic is very simple: since we can use pointers in @system code, and references are no more unsafe than pointers, then we should be able to use references in @system code. You might argue that references are little more than syntactic sugar (and a bit safer) compared to pointers, but you shouldn't underestimate the importance of syntactic sugar especially when you're trying to lure in users of C++.
Aug 29 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 29 August 2012 at 19:55:08 UTC, Tommi wrote:
 I honestly don't know anything about memory safety and what it 
 entails. So, I can't comment about any of that stuff. But if 
 you say general ref variables can't be implemented in @safe 
 mode, then I can just take your word for it. But what I'm 
 saying is that ref variables would be a nice feature to have in 
 @system code.

They can be used safely if they are heap allocated (or global).
 My logic is very simple: since we can use pointers in @system 
 code, and references are no more unsafe than pointers, then we 
 should be able to use references in @system code. You might 
 argue that references are little more than syntactic sugar (and 
 a bit safer) compared to pointers, but you shouldn't 
 underestimate the importance of syntactic sugar especially when 
 you're trying to lure in users of C++.

Maybe. But the scope of referencing variables should likely be left alone. I am not sure of all the details of C++ references either. Ref hides the fact it's a pointer/reference. This means that unless you really know what you're doing, it is not a good thing to use.

Let's assume we compare a pointer and a ref in a struct and copying.

  struct P {
    int* ptr;
  }

  struct R {
    ref int r;
    this(ref int i) {r = i;}
  }

  int value;
  P p1, p2;

  p1.ptr = value; //Compile error, not a pointer, need to use &value
  p2 = p1;
  p2.ptr = 100;   //compile error, need to use *p2.ptr = 100

  value = 42;
  R r1 = R(value);
  R r2 = r1;  //both reference value now
  r2.r = 100; //r must be a variable if this succeeds

  assert(r1.r == 42, "FAIL!");
  assert(r2.r == 100, "Would succeed if it got this far");

If I have this right, as a pointer you know it's a pointer/reference and don't make more obvious mistakes, as you're told by the compiler what's wrong. Also, since it hides the fact it's actually a pointer, hidden bugs are bound to crop up more often where reference variables were used.

If we now return r1 or r2, it silently fails since it can't guarantee that the reference wasn't a local stack variable, or of anything at all really. As a pointer you can blame the programmer (it's their fault after all), as a reference you blame the language designer.

What would you do for the postblit if you can't reset the reference? Or if you could, would you do...? It would be so easy to forget.

  this(this) {
    r = new int(r);
  }

*Prepares for flames from Walter and Andrei*
Aug 29 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Wednesday, 29 August 2012 at 21:37:33 UTC, Era Scarecrow wrote:
  struct R {
    ref int r;
    this(ref int i) {r = i;}
  }

I had totally forgotten what it says in "The book" about struct and class construction. It's basically that all fields are first initialized to either T.init or by using the field's initializer. That means the use of ref inside class or struct would be quite restricted:

int globalVal;

struct MyStruct
{
     // ref int defaultInitRef; // Illegal: reference variables
                                // can't be default initialized

     ref int explicitRef = globalVal; // Fine

     this(int val)
     {
         explicitRef = val; // Assigns val to globalVal (because
                            // explicitRef references globalVal)
     }
}
Aug 29 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
...but I guess you could do this:

int globalVal1;
int globalVal2;

struct MyStruct(alias valToRef)
{
     ref int refVal = valToRef;
}

void main()
{
     MyStruct!globalVal1 ms1;
     MyStruct!globalVal2 ms2;
}
Aug 29 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
...although, now I'm thinking that having reference variables as 
members of struct or class won't ever work. And that's because 
T.init is a compile-time variable, and references can't be known 
at compile-time (unlike pointers, which can be null). Perhaps the 
easiest way to implement reference variables would be to just 
think about them as syntactic sugar, and use lowering. Something 
like this user code...

void main()
{
     int           var = 123;
     immutable int imm = 42;

     ref int rVar = var;
     rVar = 1234;

     ref immutable int rImm = imm;
     auto mult = rImm * rImm;
}

// ...would get lowered to:

void main()
{
     int           var = 123;
     immutable int imm = 42;

     int* pVar = &var;
     (*pVar) = 1234;

     immutable(int)* pImm = &imm;
     auto mult = (*pImm) * (*pImm);
}

And this kind of initialization would be always illegal:
ref int rVar;
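
For comparison, here is a minimal sketch of how a similar effect might be approximated without any new syntax, assuming a nested function that returns by ref is an acceptable stand-in for the reference variable. The names refDemo and rVar are made up for this illustration, and the call parentheses are the price paid for not having a real "ref int rVar = var;":

```d
import std.stdio;

// Hypothetical helper (not from the thread): mutates `var` only through
// the ref-returning nested function and returns the final value of `var`.
int refDemo()
{
    int var = 123;

    // Every call yields `var` itself, not a copy.
    ref int rVar() { return var; }

    rVar() = 1234; // writes through to var
    rVar() += 1;   // var is now 1235

    return var;
}

void main()
{
    writeln(refDemo());
}
```

Like the pointer lowering, this gives no protection against the referenced variable going out of scope if the nested function is allowed to escape; it only hides the dereferencing.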
Aug 29 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
I think references should either take the C# approach (which is 
rock-solid), or the C++ approach (which is also pretty 
rock-solid), but not an in-between.
Aug 29 2012
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
 I had totally forgotten what it says in "The book" about struct 
 and class construction. It's basically that all fields are 
 first initialized to either T.init or by using the field's 
 initializer. That means the use of ref inside class or struct 
 would be quite restricted:

 int globalVal;

 struct MyStruct
 {
     // ref int defaultInitRef; // Illegal: reference variables
                                // can't be default initialized

But you can handle it like const members: you have to initialize these members in the ctor.
Aug 30 2012
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 30 August 2012 at 07:35:34 UTC, Namespace wrote:
 I had totally forgotten what it says in "The book" about 
 struct and class construction. It's basically that all fields 
 are first initialized to either T.init or by using the field's 
 initializer. That means the use of ref inside class or struct 
 would be quite restricted:

 int globalVal;

 struct MyStruct
 {
    // ref int defaultInitRef; // Illegal: reference variables
                               // can't be default initialized

But you can handle it like const members: you have to initialize these members in the ctor.

Furthermore I suggest that objects marked with "ref" _can't_ be null. So

ref Foo fr = null;

is equally forbidden as

[code]
Foo f; // same as Foo f = null;
ref Foo fr = f;
[/code]

Why? First: null isn't an lvalue, and even though "Foo f;" is one, if it were allowed you would have a useless reference, because it is null and you can't assign a valid object to it (because you can assign refs just once).

Furthermore it solves the problem which I often bring up: not-null parameters.

void do_something(ref Foo fr) { <- "fr" can't be null, you can trust it without any other validations.

I would love that. :)
Aug 30 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Thursday, 30 August 2012 at 07:35:34 UTC, Namespace wrote:
 struct MyStruct
 {
    // ref int defaultInitRef; // Illegal: reference variables
                               // can't be default initialized

But you can handle it like const members: you have to initialize these members in the ctor.

Yeah, maybe. I'm starting to think we should really know the implementation details of the language in order to even speculate about how ref might be implemented.

On Thursday, 30 August 2012 at 07:35:34 UTC, Namespace wrote:
 So ref Foo fr = null; is equally forbidden as
 [code]
 Foo f; // same as Foo f = null;
 ref Foo fr = f;

But you can always say:

Foo f = new Foo();
ref Foo fr = f;
f = null;

...after which fr references f, which references null, so effectively fr references null. I don't think this could be prevented from happening.
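
The same situation can be reproduced today with a pointer standing in for the proposed reference. This is only a sketch under that assumption; Foo is an empty placeholder class and refSeesNull is a name invented for the example:

```d
class Foo {}

// Hypothetical helper: returns true if the "reference" observes null
// after the variable it refers to has been reset.
bool refSeesNull()
{
    Foo f = new Foo();
    Foo* pf = &f; // stands in for "ref Foo fr = f;"

    f = null;     // the referenced variable now holds null

    return *pf is null;
}

void main()
{
    assert(refSeesNull());
}
```

The pointer itself is never null here; it is the class reference it points at that became null, which is why a "ref can't be null" rule checked only at initialization would not catch it.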
Aug 30 2012
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 30 August 2012 at 08:57:24 UTC, Tommi wrote:
 On Thursday, 30 August 2012 at 07:35:34 UTC, Namespace wrote:
 struct MyStruct
 {
   // ref int defaultInitRef; // Illegal: reference variables
                              // can't be default initialized

But you can handle it like const members: you have to initialize these members in the ctor.

Yeah, maybe. I'm starting to think we should really know the implementation details of the language in order to even speculate about how ref might be implemented.

 On Thursday, 30 August 2012 at 07:35:34 UTC, Namespace wrote:
 So ref Foo fr = null; is equally forbidden as
 [code]
 Foo f; // same as Foo f = null;
 ref Foo fr = f;

But you can always say:

Foo f = new Foo();
ref Foo fr = f;
f = null;

...after which fr references f, which references null, so effectively fr references null. I don't think this could be prevented from happening.

That's true... Maybe in "ref Foo fr = f;" fr shouldn't be a real pointer to f but instead an object which refers to the concrete object of f. Then it wouldn't matter if you write f = null; because "fr" would still refer to a valid object.
Aug 30 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Tommi:

     // This prints 1, so we called the actual method

I think in D methods have precedence over free functions.

Bye,
bearophile
Aug 30 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Thursday, 30 August 2012 at 11:53:14 UTC, bearophile wrote:
 Tommi:

    // This prints 1, so we called the actual method

I think in D methods have precedence over free functions.

Bye,
bearophile

Yes, but to me the ambiguity of that example is in whether or not implicit dereferencing of pointers has precedence over uniform function call syntax. Apparently it does, but it's not that obvious that it would.

struct MyStruct
{
     int _value = 0;

     void increment()
     {
         ++_value;
     }
}

void increment(ref MyStruct* ptr)
{
     ++ptr;
}

void main()
{
     MyStruct* ptrMyStruct = new MyStruct();

     // Are we incrementing the pointer using UFCS or
     // are we calling the member function in MyStruct?
     ptrMyStruct.increment();
}
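
A self-checking variant of the example, as a sketch: it assumes the behavior reported earlier in this thread (the member function is chosen through implicit dereferencing) and verifies it by comparing the pointer before and after the call. memberWins is a name made up for the illustration:

```d
struct MyStruct
{
    int _value = 0;

    void increment()
    {
        ++_value;
    }
}

// Free function with the same name, operating on the pointer itself.
void increment(ref MyStruct* ptr)
{
    ++ptr;
}

// Hypothetical helper: true if ptr.increment() called the member function.
bool memberWins()
{
    MyStruct* p = new MyStruct();
    MyStruct* before = p;

    p.increment();

    // The pointer is unchanged and the pointed-to value was incremented.
    return p is before && p._value == 1;
}

void main()
{
    assert(memberWins());
}
```

Calling increment(p) with ordinary function call syntax would still select the free function and advance the pointer, so the free function stays reachable.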
Aug 30 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, August 30, 2012 15:30:15 Tommi wrote:
 Yes, but to me the ambiguity of that example is in whether or not
 implicit dereferencing of pointers has precedence over uniform
 function call syntax. Apparently it does, but it's not that
 obvious that it would.

It _can't_ work any other way, because there's no way to tell the compiler to use the member function specifically. You can use the full import path for the free function, so you can tell the compiler to use the free function. There's no such syntax for member variables.
 struct MyStruct
 {
     int _value = 0;
 
     void increment()
     {
         ++_value;
     }
 }
 
 void increment(ref MyStruct* ptr)
 {
     ++ptr;
 }
 
 void main()
 {
     MyStruct* ptrMyStruct = new MyStruct();
 
     // Are we incrementing the pointer using UFCS or
     // are we calling the member function in MyStruct?
     ptrMyStruct.increment();
 }

For instance, if you do

increment(ptrMyStruct);

or

.increment(ptrMyStruct);

or if increment had a longer import path

path.to.increment(ptrMyStruct);

then you can tell the compiler to use the free function. But how would you do that with the member function? You can't. So, there's really no other choice but to choose the member function whenever there's a conflict. It also prevents function hijacking, so that something like var.increment() doesn't suddenly start using a free function instead of the member function when you add an import which has an increment free function.

- Jonathan M Davis
Aug 30 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Thursday, 30 August 2012 at 17:50:47 UTC, Jonathan M Davis 
wrote:
 On Thursday, August 30, 2012 15:30:15 Tommi wrote:
 Yes, but to me the ambiguity of that example is in whether or 
 not
 implicit dereferencing of pointers has precedence over uniform
 function call syntax. Apparently it does, but it's not that
 obvious that it would.

It _can't_ work any other way, because there's no way to tell the compiler to use the member function specifically. You can use the full import path for the free function, so you can tell the compiler to use the free function. There's no such syntax for member variables.
 struct MyStruct
 {
     int _value = 0;
 
     void increment()
     {
         ++_value;
     }
 }
 
 void increment(ref MyStruct* ptr)
 {
     ++ptr;
 }
 
 void main()
 {
     MyStruct* ptrMyStruct = new MyStruct();
 
     // Are we incrementing the pointer using UFCS or
     // are we calling the member function in MyStruct?
     ptrMyStruct.increment();
 }

For instance, if you do

increment(ptrMyStruct);

or

.increment(ptrMyStruct);

or if increment had a longer import path

path.to.increment(ptrMyStruct);

then you can tell the compiler to use the free function. But how would you do that with the member function? You can't. So, there's really no other choice but to choose the member function whenever there's a conflict. It also prevents function hijacking, so that something like var.increment() doesn't suddenly start using a free function instead of the member function when you add an import which has an increment free function.

- Jonathan M Davis

But this is not about member function vs. free function. This is about implicit pointer dereferencing vs. UFCS. The question is: "what does the member access operator do when it operates on a pointer?". There are two options, and I think they are both valid options the language could have chosen:

// S is a struct:
auto ptr = new S();
// What to do with this?
ptr.fun(); // or ptr.fun;

Option #1:
Rewrite the expression as (*ptr).fun()

Option #2:
If .fun(S*) exists: call the free function .fun(ptr)
Else: rewrite the expression as (*ptr).fun()

...But, actually it seems that the language is a bit broken, because it doesn't follow either one of those two options. What it actually does is this:

Option #3:
If S.init.fun() exists: call the member (*ptr).fun()
Else if .fun(S*) exists: call the free function .fun(ptr)
Else give a compile time error

Here are the details:

import std.stdio;

struct S1
{
     int _value = 42;

     void fun()
     {
         ++_value;
     }
}

void fun(ref S1 s1)
{
     s1._value += 1000;
}

void fun(ref S1* ptr1)
{
     ++ptr1;
}

/////////////////////////////

struct S2
{
     int _value = 42;
}

void fun(ref S2 s2)
{
     ++s2._value;
}

void fun(ref S2* ptr2)
{
     ++ptr2;
}

/////////////////////////////

struct S3
{
     int _value = 42;
}

void fun(ref S3 s3)
{
     ++s3._value;
}

void main()
{
     auto ptr1 = (new S1[2]).ptr;
     ptr1.fun(); // calls (*ptr1).fun()
     writeln(ptr1._value); // prints 43

     auto arr2 = new S2[2];
     auto ptr2 = arr2.ptr;
     arr2[1]._value = 12345;
     ptr2.fun(); // calls .fun(ptr2)
     writeln(ptr2._value); // prints 12345

     auto ptr3 = (new S3[2]).ptr;
     ptr3.fun(); // Error: function main.fun (ref S3 s3) is
                 // not callable using argument types (S3*)
}

I think, if we want to have implicit pointer dereferencing (I'm not even sure it's a good thing), then option #1 would be the best choice.
Aug 30 2012
prev sibling next sibling parent "Tommi" <tommitissari hotmail.com> writes:
On Friday, 31 August 2012 at 05:24:31 UTC, Tommi wrote:
 ...But, actually it seems that the language is a bit broken, 
 because it doesn't follow either one of those two options. 
 What it actually does is this:

Obviously I don't think that the language is broken, but the compiler is. I wrote that before I had seen that the third test case (S3) results in a compile-time error. At the time I wrote that, I assumed it would just call .fun(*ptr3).
Aug 30 2012
prev sibling parent "Tommi" <tommitissari hotmail.com> writes:
I filed a bug report:
http://d.puremagic.com/issues/show_bug.cgi?id=8603
Aug 31 2012