digitalmars.D - Consistency, Templates, Constructors, and D3

"F i L" <witte2008 gmail.com> writes:
DISCLAIMER: This isn't a feature request or anything like that. 
It's ONLY intended to stir _constructive_ conversation and 
criticism of D's existing features, and how to improve them _in 
the future_ (note the 'D3' in the title).

I've had a couple of ideas recently about the importance of 
consistency in a language design, and how a few languages I 
highly respect (D, C#, and Nimrod) approach these issues. This 
post is mostly me wanting to reach out to a community that enjoys 
discussing such issues, in an effort to correct any 
mis-conceptions I might hold, and to spread potentially good 
ideas to the community in hopes that my favorite language will 
benefit from our discussion.


---------------------------


First, let me assert that "Consistency" in a language is 
critically important for a few reasons:

1. One way of doing things means one way to _remember_ things. It 
keeps us sane, focused, and productive. The more we have to fight 
the language, the harder it is to master.

2. Fewer things to remember means it's easier to learn. First 
impressions are key for popularity.

3. Fewer discrepancies means fewer human errors, and thus, fewer 
"stupid" bugs.




######### CAST/TRAITS ##########

To start, let's look at: cast(T) vs to!T(t)

In D, we have one way to use a template function, and then we have 
a special keyword syntax which doesn't follow the same syntactical 
rules. Here, cast looks like the 'scope()' or 'debug' statement, 
which should be followed by a body of code, but it works like a 
function which takes in the following argument and returns the 
result. Setting aside the "func!()()" syntax for a moment, what 
cast should look like in D is:

     int i = cast!int(myLong);

It's a similar story with __traits(). What appears to be a 
function taking in a run-time parameter is actually a compile-time 
parameter which works by "magic". It should look like:

     bool b = traits!HasMember(Foo);
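
For reference, the three spellings being compared look like this in current D. This is only a minimal sketch: the struct `Foo`, its member `bar`, and the helper names `viaCast` and `viaTo` are placeholders invented for illustration.

```d
import std.conv : to;

struct Foo { int bar; }  // hypothetical type, only here to be inspected

/// Converts with the built-in cast expression (no range check).
int viaCast(long value) { return cast(int) value; }

/// Converts with the library template std.conv.to (throws on overflow).
int viaTo(long value) { return to!int(value); }

// __traits is a keyword form evaluated entirely at compile time.
static assert(__traits(hasMember, Foo, "bar"));
static assert(!__traits(hasMember, Foo, "baz"));

unittest
{
    long myLong = 42;
    assert(viaCast(myLong) == 42);
    assert(viaTo(myLong) == 42);
}
```

Note that only `to!int` follows the ordinary template-call rules; `cast` and `__traits` each have their own grammar, which is the inconsistency described above.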




######### FUNCTION PARAMETERS ##########

All that brings me to my next argument, which is that the 
"func!()()" syntax is inconsistent, or at the very least, hard to 
understand (when it doesn't have to be). We have one way of 
defining "optional" runtime parameters, and a different set of 
rules entirely for compile-time parameters. Granted, these things 
are very different to the compiler; to the programmer, however, 
they "appear" to just be things we're passing to a function.

I think Nimrod has a better (but not perfect) approach to this, 
in that there are different "kinds" of functions. One that takes 
in runtime params, and one that takes in compile-time ones; but 
at the call site, you use them the same:

     # Nimrod code

     template foo(x:int) # compile time
       when x == 0:
         doSomething()
       else:
         doSomethingElse()

     proc bar(x:int) # run time
       if x == 0:
         doSomething()
       else:
         doSomethingElse()

     block main:
       foo(0) # both have identical..
       bar(0) # ..call signatures.

In D, that looks like:

     void foo(int x)() {
       static if (x == 0) { doSomething(); }
       else { doSomethingElse(); }
     }

     void bar(int x) {
       if (x == 0) { doSomething(); }
       else { doSomethingElse(); }
     }

     void main() {
       foo!0();
       bar(0); // completely different signatures
     }

Ultimately foo is just more optimized in the case where an 'int' 
can be passed at compile time, but the way you use it in Nimrod 
is much more consistent than in D. In fact, Nimrod code is very 
clean because there are no special syntax oddities, and that makes 
it easy to follow (at least on that level), especially for people 
learning the language.
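
The D half of that comparison can be made compilable with a small change. The sketch below assumes two made-up function names, `describeCT` and `describeRT`, and returns strings in place of the undefined doSomething() calls.

```d
/// Compile-time parameter: the branch is chosen during compilation.
string describeCT(int x)()
{
    static if (x == 0) return "zero";
    else return "nonzero";
}

/// Run-time parameter: the branch is chosen when the call executes.
string describeRT(int x)
{
    if (x == 0) return "zero";
    else return "nonzero";
}

unittest
{
    assert(describeCT!0() == "zero");   // template argument after '!'
    assert(describeRT(0) == "zero");    // ordinary argument

    // A run-time style function can still be evaluated at compile time
    // (CTFE) when its result is required in a constant context.
    enum atCompileTime = describeRT(1);
    static assert(atCompileTime == "nonzero");
}
```

The last lines hint that the gap is partly a call-site matter: CTFE already lets an ordinary function run during compilation without any `!` syntax.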

But I think there's a much better way. One of the things people 
like about dynamically typed languages is that you can hack things 
together quickly. Given:

     function load(filename) { ... }

the name of the parameter is all that's required when throwing 
something together. You know what 'filename' is and how to use 
it. The biggest problem (beyond efficiency) is that later, when 
you're tightening things up, you have to make sure that 'filename' 
is a valid type, so we end up having to do the work manually where 
in a strongly typed language we can just define a type:

     function load(filename)
     {
       if (filename != String) {
         error("Must be string");
         return;
       }
       ...
     }

vs:

     void load(string filename) { ... }

but, of course, sometimes we want to take in a generic parameter, 
as D programmers are fully aware. In D, we have that option:

     void load(T)(T file)
     {
       static if (is(T : string))
         ...
       else static if (is(T : File))
         ...
     }

but it's wonky. Two parameter sets? Type deduction? These 
concepts aren't the easiest to pick up, and I remember having 
some amount of difficulty first learning what "func!(...)(...)" 
did in D.
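
To make the two parameter sets and the deduction step concrete, here is a minimal compilable sketch of the generic load() in current D. The string return values are an assumption added only so the chosen branch is observable; `File` is the real type from std.stdio.

```d
import std.stdio : File;

/// Generic load: T is the compile-time parameter, file the run-time one.
string load(T)(T file)
{
    static if (is(T : string))
        return "path";
    else static if (is(T : File))
        return "handle";
    else
        static assert(false, "unsupported type " ~ T.stringof);
}

unittest
{
    // Explicit instantiation: both parameter sets are written out.
    assert(load!string("a.txt") == "path");

    // Type deduction: T is inferred from the run-time argument.
    assert(load("a.txt") == "path");
    assert(load(File.init) == "handle");
}
```

With deduction, the call site already looks like a plain function call; the second parameter set only becomes visible when T has to be given explicitly.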

So why not have one set of parameters and allow "typeless" ones 
which are simply compile-time duck-typed?

     void load(file)
     {
       static if (is(typeof(file) : string))
         ...
       else static if (is(typeof(file) : File))
         ...
     }

this way, we have one set of rules for calling functions, and 
deducing/defaulting parameters, with the same power. Plus, we get 
the convenience of just hacking things together and going back 
later to tighten things up. We can have similar (to existing) 
rules for specialization and defaults:

     void foo(int x) // runtime
     void foo(x:int) // compiletime that must be 'int'
     void foo(x=int) // compiletime, defaults to 'int'
     void foo(x:int|string) // can be either int or string
     void foo(x=int|string) // defaults to int; can be string

as well as for deduction:

     void foo(int x, T y, T)            { ... }
     void bar(int x, T y, T = float)    { ... }
     void baz(int x, T y, T : int|long) { ... }

     void main()
     {
       foo(0, "bar");      // T is string
       foo(0, 1.0);        // T is double
       foo(0, 1.0, float); // T is float
       bar(0, 1.0);        // T is float (?)
       baz(0, 1.0);        // error: needs int, or long
     }


Revisiting the cast()/__traits() issue. Given our new function 
call syntax, they would look like:

     cast(Type, value);
     traits(CommandEnum, values...);

Now, I'm sure everyone is saying "What about Type template 
parameters? How do we separate them from constructor parameters?" 
Please keep reading.



######### CONSTRUCTORS ##########

We're all aware that overriding new/delete in D is a depreciated 
feature. I agree with this; however, I think we should go a step 
further and remove the new/delete syntax altogether... :D crazy 
I know, but hear me out.

We replace it with special factory functions. Example:

     class Person {
       string name;
       uint age;

       this new(string n, uint a) {
         name = n;
         age = a;
       }
     }

     void main() {
       auto philip = Person.new("Philip", 24);
     }

Notice 'new()' returns type 'this', which makes it static and 
implicitly calls allocation methods (which could be overridden) 
and has a 'this' reference.

This way, creating objects is consistent across struct/class and 
is always done through a named function. So things like 
converting can become arbitrarily consistent through a naming 
convention:

for example, we could use 'from()' in place of to()/cast():

     auto i = int.from(inputString);
     auto i = int.from(myLong);

and we don't have to fight for overloads, or have to remember 
special factories (for things like FreeLists) when we usually 
think to use the 'new T()' syntax:

     // Naming clarifies intention
     auto model = Model.new(...);
     auto model = Model.load("/path/to/file");

Destructors would be named as well, so we could force delete 
objects:

     struct Foo {
       this new() { ... }
       ~this delete() { ... }
     }

     void somefunc() {
         auto foo = Foo.new();
         scope(exit) foo.delete(); // forced
     }

This would also keep consistent syntax when using 
FreeLists/MemoryPools, because everything is done through 
factories in this case, and the implementation can be arbitrary:

     class Foo {
       private Foo _head, _next;

       this new()  noalloc {
         if (_head) {
           Foo result = _head;
           _head = _head._next;
           return result;
         }
         else {
           import core.memory;
           return cast(Foo, GC.alloc(this.sizeof));
         }
       }

       ~this delete() {
         ...
       }
     }
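
For comparison, the same free-list idea can be written in current D with ordinary static functions. This is a sketch under assumptions: the names `create` and `release` and the `value` field are invented for illustration, and `_head` is declared static, as the follow-up correction in this thread notes it should be.

```d
class Foo
{
    private static Foo _head;  // head of the free list (thread-local in D)
    private Foo _next;         // link to the next recycled instance
    int value;

    private this() {}          // force callers through create()

    /// Reuses a recycled instance when one exists, otherwise allocates.
    static Foo create(int value)
    {
        Foo result;
        if (_head !is null)
        {
            result = _head;
            _head = _head._next;
            result._next = null;
        }
        else
        {
            result = new Foo();
        }
        result.value = value;
        return result;
    }

    /// Puts an instance back on the free list instead of freeing it.
    static void release(Foo f)
    {
        f._next = _head;
        _head = f;
    }
}

unittest
{
    auto a = Foo.create(1);
    Foo.release(a);
    auto b = Foo.create(2);
    assert(a is b);        // the released object was handed out again
    assert(b.value == 2);
}
```

The call site is `Foo.create(...)` rather than `new Foo(...)`, which is the split between the two construction styles that the proposal tries to remove.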


More importantly, it keeps Type template parameters from 
conflicting with constructor params:

     struct Foo(T) {
         this new(T t) { ... }
     }

     alias Foo(float) Foof;

     void main() {
         auto foo = Foo(int).new(123);
         auto foof = Foof.new(1.0f);
     }



With these changes, the language is more consistent and there are 
no special syntax characters or hard-to-understand rules (IMO).

Thanks for your time. Please let me know if you have any thoughts 
or opinions on my ideas; that is, after all, why I'm posting them. :)
Aug 23 2012
"F i L" <witte2008 gmail.com> writes:
F i L wrote:
 This would also keep consistent syntax when using 
 FreeLists/MemeoryPools, because everything is done through 
 factories in this case, and the implementation can be arbitrary:

     class Foo {
       private Foo _head, _next;
 
       [ ... ]

Typo: '_head' and '_next' in this example should be 'static'
Aug 23 2012
"F i L" <witte2008 gmail.com> writes:
 Typo: '_head' and '_next' in this example should be 'static'

bleh.. I mean only '_head' should be static. Not '_next'
Aug 23 2012
Nick Treleaven <nospam example.net> writes:
On 24/08/2012 06:14, F i L wrote:
 DISCLAIMER: This isn't a feature request or anything like that. It's
 ONLY intended to stir _constructive_ conversation and criticism of D's
 existing features, and how to improve them _in the future_ (note the
 'D3' in the title).

 To start, let's look at: cast(T) vs to!T(t)

 In D, we have one way to use template function, and then we have special
 keyword syntax which doesn't follow the same syntactical rules. Here,
 cast looks like the 'scope()' or 'debug' statement, which should be
 followed by a body of code, but it works like a function which takes in
 the following argument and returns the result. Setting aside the
 "func!()()" syntax for a moment, what cast should look like in D is:

      int i = cast!int(myLong);

That syntax makes sense, but cast is a built-in language feature. I'm not sure making it look like a library function is really worth the change IMO. In some cases the syntax would be a bit noisier:

     cast(immutable int) v; // current
     cast!(immutable int)(v); // new

 It's a similar story with __traits(). What appears to be a function
 taking in a run-time parameter is actually compile-time parameter which
 works by "magic". It should look like:

      bool b = traits!HasMember(Foo);

Or:

     bool b = traits.hasMember!(Foo, "bar")();

Also:

     int i;
     bool b = __traits(isArithmetic, i); // current
     bool b = traits.isArithmetic(i); // new

'i' cannot be a compile-time parameter or a runtime parameter either (by normal rules). So I think __traits are special, they're not really like a template function.

      # Nimrod code

      template foo(x:int) # compile time
        when x == 0:
          doSomething()
        else:
          doSomethingElse()

      proc bar(x:int) # run time
        if x == 0:
          doSomething()
        else:
          doSomethingElse()

      block main:
        foo(0) # both have identical..
        bar(0) # ..call signatures.

 In D, that looks like:

      void foo(int x)() {
        static if (x == 0) { doSomething(); }
        else { doSomethingElse(); }
      }

      void bar(int x) {
        if (x == 0) { doSomething(); }
        else { doSomethingElse(); }
      }

      void main() {
        foo!0();
        bar(0); // completely difference signatures
      }

 Ultimately foo is just more optimized in the case where an 'int' can be
 passed at compile time, but the way you use it in Nimrod is much more
 consistent than in D. In fact, Nimrod code is very clean because there's
 no special syntax oddities, and that makes it easy to follow (at least
 on that level), especially for people learning the language.

Personally I think it's a benefit that D makes compile-time and runtime parameters look different in caller code. The two things are very different.
 But I think there's a much better way. One of the things people like
 about Dynamicly Typed languages is that you can hack things together
 quickly. Given:

      function load(filename) { ... }

 the name of the parameter is all that's required when throwing something
 together. You know what 'filename' is and how to use it. The biggest
 problem (beyond efficiency), is later when you're tightening things up
 you have to make sure that 'filename' is a valid type, so we end up
 having to do the work manually where in a Strong Typed language we can
 just define a type:

      function load(filename)
      {
        if (filename != String) {
          error("Must be string");
          return;
        }
        ...
      }

 vs:

      void load(string filename) { ... }

 but, of course, sometimes we want to take in a generic parameter, as D
 programmers are fully aware. In D, we have that option:

      void load(T)(T file)
      {
        static if (is(T : string))
          ...
        else if (is(T : File))
          ...
      }

Perhaps I'm being pedantic, but here it would probably be:

     void load(T:string)(T file){...}
     void load(T:File)(T file){...}

 but it's wonky. Two parameter sets? Type deduction? These concepts
 aren't the easiest to pick up, and I remember having some amount of
 difficulty first learn what the "func!(...)(...)" did in D.

I think it's straightforward if you already know func<...>(...) syntax from C++ and other languages.
 So why not have one set of parameters and allow "typeless" ones which
 are simply compile-time duck-typed?

      void load(file)
      {
        static if (is(typeof(file) : string))
          ...
        else if (is(typeof(file) : File))
          ...
      }

 this way, we have one set of rules for calling functions, and
 deducing/defaulting parameters, with the same power. Plus, we get the
 convenience of just hacking things together and going back later to
 tighten things up.

The above code doesn't look much easier to understand than the current D example. I think knowing about template function syntax is simpler than knowing how to do static if tests.

I think there was a discussion about this before, but using:

     void load(auto file)

Your analysis seems to go into more depth than that discussion AFAIR.

...

 Revisiting the cast()/__traits() issue. Given our new function call
 syntax, they would looks like:

      cast(Type, value);
      traits(CommandEnum, values...);

Those do now look nice and consistent.
 ######### CONSTRUCTORS ##########

 We're all aware that overriding new/delete in D is a depreciated
 feature.

I thought it was a deprecated feature (SCNR ;-))
 I agree with this, however I think we should go a step further
 and remove the new/delete syntax all together... :D crazy I know, but
 hear me out.

I have wondered whether 'new' will need to be a library function when allocators are introduced.
Aug 24 2012
Nick Treleaven <nospam example.net> writes:
On 24/08/2012 17:27, Nick Treleaven wrote:
 On 24/08/2012 06:14, F i L wrote:
 It's a similar story with __traits(). What appears to be a function
 taking in a run-time parameter is actually compile-time parameter which
 works by "magic". It should look like:

      bool b = traits!HasMember(Foo);


Correcting myself:

     bool b = traits.hasMember!(Foo, "bar");
     int i;
     bool b = traits.isArithmetic!i;

 'i' cannot be a compile-time parameter or a runtime parameter either (by
 normal rules).

I realized I was wrong here, 'i' can be a template alias parameter.
 So I think __traits are special, they're not really like
 a template function.

But most seem close in semantics to a template instantiation, so that syntax might be nice.
Aug 25 2012
"Nathan M. Swan" <nathanmswan gmail.com> writes:
On Friday, 24 August 2012 at 05:14:39 UTC, F i L wrote:
 We replace it with special factory functions. Example:

     class Person {
       string name;
       uint age;

       this new(string n, uint a) {
         name = n;
         age = a;
       }
     }

     void main() {
       auto philip = Person.new("Philip", 24);
     }

 Notice 'new()' returns type 'this', which makes it static and 
 implicitly calls allocation methods (which could be overridden) 
 and has a 'this' reference.

The constructor definition syntax doesn't seem to be an improvement: this new instead of the old this.

The constructor calling syntax is actually something I've thought of before. I think Class.new is better than new Class, simply because it's more succinct when chaining:

     Class.new().method() vs. (new Class()).method()

NMS
Aug 24 2012
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 8/24/12, Nathan M. Swan <nathanmswan gmail.com> wrote:
 Class.new().method() vs. (new Class()).method()

I prefer the latter because it's more explicit that you're throwing away an object after invocation (unless you do something funky in 'method' and store the 'this' reference globally).
Aug 24 2012
"David Nadlinger" <see klickverbot.at> writes:
On Friday, 24 August 2012 at 17:04:49 UTC, Andrej Mitrovic wrote:
 On 8/24/12, Nathan M. Swan <nathanmswan gmail.com> wrote:
 Class.new().method() vs. (new Class()).method()

I prefer the latter because it's more explicit that you're throwing away an object after invocation

I'm not sure if I follow you here – when chaining methods, you are _always_ throwing away the return value temporaries…

David
Aug 24 2012
"F i L" <witte2008 gmail.com> writes:
Nathan M. Swan wrote:
 The constructor definition syntax doesn't seem to be an 
 improvement: this new instead of the old this.

well the reason they're named is because then you can have multiple constructors under different names:

     class Model {
       string name;
       float x = 0, y = 0;

       this new(string n) {
         name = n;
       }

       this new(string n, float x, float y) {
         name = n;
         this.x = x;
         this.y = y;
       }

       this load(string fn) {
         auto file = File.load(fn);
         ...
       }
     }

Here we have two overloads of the constructor new() without conflict, but also two constructors that would have conflicted if they weren't separated by name: new(string) and load(string).

In this situation today, we would normally need to make a static factory function called 'load()' which created a 'new Model()' and returned it. We do this factory function thing all the time today, and it's required for things like Memory Pools and to resolve naming conflicts like above. Ideally, there should be a single, consistent way of creating objects which allows for arbitrary named separation, and I think this is the best solution.

Both Vala and Dart have a named-constructor syntax to address this issue, but neither feels as consistent (to me) as what I'm presenting above.
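
The static factory workaround mentioned in this reply looks roughly like the following in current D. It is a sketch only: the body of `load` is a placeholder that reuses the file name as the model name and picks fixed coordinates, since the real file parsing is elided in the post.

```d
class Model
{
    string name;
    float x = 0, y = 0;

    this(string n) { name = n; }

    this(string n, float x, float y)
    {
        name = n;
        this.x = x;
        this.y = y;
    }

    /// Static factory. It takes a single string just like this(string),
    /// so it could not be written as another constructor overload.
    static Model load(string fileName)
    {
        // Placeholder for real file parsing (an assumption for this sketch).
        return new Model(fileName, 1, 2);
    }
}

unittest
{
    auto a = new Model("cube");             // constructor syntax
    auto b = Model.load("/path/to/file");   // factory syntax
    assert(a.name == "cube" && a.x == 0);
    assert(b.name == "/path/to/file" && b.y == 2);
}
```

The two creation paths use different call syntax, which is the inconsistency that named constructors are meant to remove.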
Aug 24 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
 I've had a couple of ideas recently about the importance of 
 consistency in a language design, and how a few languages I 
 highly respect (D, C#, and Nimrod) approach these issues. This 
 post is mostly me wanting to reach out to a community that 
 enjoys discussing such issues, in an effort to correct any 
 mis-conceptions I might hold, and to spread potentially good 
 ideas to the community in hopes that my favorite language will 
 benefit from our discussion.

The points you raise are good and I generally like your ideas, although it feels a little early to talk about D3 when D2 is still far from a comprehensive solution. Amazing that bug 1528 is still open for example: http://stackoverflow.com/questions/10970143/wheres-the-conflict-here

Regarding your idea for merging compile-time and run-time arguments together, it sounds good at first but I wonder if it would be difficult to handle in the parser, because at the call site, the parser does not know whether a particular argument should be a type or an expression. Still, no insurmountable difficulties come to mind.

I certainly like the idea to introduce a more regular syntax for object construction (as I have proposed before, see http://d.puremagic.com/issues/show_bug.cgi?id=8381#c1) but you didn't say whether it would be allowed to declare a static method called "new". I'd be adamant that it should be allowed: the caller should not know whether they are calling a constructor or not. Also, I'm inclined to think that constructors should use "init", in keeping with tradition.

A couple of inconsistencies that come immediately to my mind about D2 are

1. Function calling is foo!(x, y)(z) but declaration is foo(x, y)(int z). And the compiler doesn't always offer a good error message. I'm seeing
   "function declaration without return type. (Note that constructors are always named 'this')"
   "no identifier for declarator myFunction!(Range)(Range r)"

2. Ref parameters are declared as (ref int x) but are not allowed to be called as (ref x) -- then again, maybe it's not a real inconsistency, but I'm annoyed. It prevents my code from self-documenting properly.

Obviously, D is easy compared to C++, but no language should be judged by such a low standard of learnability. So I am also bothered by various things about D that feel unintuitive:

1. Enums. Since most enums are just a single value, they are named incorrectly.

2. immutable int[] func()... does not return an immutable array of int[]?

3. 0..10 in a "foreach" loop is not a range. It took me a while to find the equivalent range function, whose name is quite baffling: "iota(10)"

4. Eponymous templates aren't distinct enough. Their syntax is the same as a normal template except that the outer and inner members just happen to have the same name. This confused me the other day when I was trying to understand some code by Nick, which called a method inside an eponymous template via another magic syntax, UFCS (I like UFCS, but I might be a little happier if free functions had to request participation in it.)

5. The meaning is non-obvious when using "static import" and advanced imports like "import a = b : c, d" or "import a : b = c, d = e" or "import a = b : c = d".

6. The syntax of is(...)! It looks like a function or operator with an expression inside, when in fact the whole thing is one big operator. It's especially not obvious that "is(typeof(foo + bar))" means "test whether foo+bar is a valid and meaningful expression".

Making matters worse, the language itself and most of its constructs are non-Googlable. For example if you don't remember how to declare the forwarding operator (alias this), what do you search for? If you see "alias _suchAndSuch this" and don't know what it means, what do you search for? (one might not think of removing the middle word and searching for that).

I even have trouble finding stuff in the TDPL e-book. The place where templates are discussed is odd: section 7.5 in chapter 7, "user-defined types", even though the template statement doesn't actually define a type. I know, I should just read the book again... say, where's the second edition? I got so disappointed when I reached the end of chapter 13 and it was followed by an index. No UFCS or traits or ranges mentioned in there anywhere... compile-time function evaluation is mentioned, but the actual acronym CTFE is not.

I also hope something will be changed about contracts. I am unlikely to ever use them if there's no option to keep SOME of them in release builds (I need them to work at all boundaries between different parties' code, e.g. official API boundaries, and it is preferable to keep them in all cases that they don't hurt performance; finally, we should consider that the class that contains the contracts may not know its own role in the program, so it may not know whether to assert or enforce is best). Plus, the syntax is too verbose. Instead of in { assert(x >= 0 && x < 100); }, I'd prefer just in(x >= 0 && x < 100). I'd rather not type "body" either.

Speaking of verbosity -- breaks in switches. Nuff said.

Interestingly, the discussion so far has been all about syntax, not any significant new features. I'm thinking ... coercion of a class to any compatible interface (as in Go)? pattern matching? tuple unpacking? an attribute system? unit inference? compiler plug-ins? further enhanced metaprogramming? safe navigation operator? user-defined operators? first-class void type? If I were a compiler writer (and I want to be) the possibilities would be MADDENINGLY endless :)
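
Two of the points above (the foreach-only `0 .. 10` syntax versus `iota`, and the `is(typeof(...))` idiom) can be shown in a few lines of current D. The names `canAdd` and `sumBelow` are made up for this sketch.

```d
import std.algorithm.comparison : equal;
import std.range : iota;

/// True when `a + b` is a valid expression for values of types A and B.
enum canAdd(A, B) = is(typeof(A.init + B.init));

static assert(canAdd!(int, double));   // int + double compiles
static assert(!canAdd!(int, string));  // int + string does not

/// Sums 0 up to (not including) n with the foreach-only `0 .. n` syntax.
int sumBelow(int n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += i;
    return total;
}

unittest
{
    assert(sumBelow(10) == 45);

    // iota(10) is an actual range value that can be passed to algorithms,
    // which the `0 .. 10` form in a foreach statement cannot be.
    assert(equal(iota(10), [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]));
}
```

Here `is(typeof(...))` never evaluates the addition; it only asks the compiler whether the expression has a type.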
Aug 24 2012
Nick Treleaven <nospam example.net> writes:
On 25/08/2012 07:45, David Piepgrass wrote:
 2. immutable int[] func()... does not return an immutable array of int[]?

Maybe this should be disallowed with an error message "prefix immutable is deprecated - either use suffix immutable or immutable(int[])": http://d.puremagic.com/issues/show_bug.cgi?id=4070 Unfortunately that is marked WONTFIX.
 Making matters worse, the language itself and most of its constructs are
 non-Googlable. For example if you don't remember how do declare the
 forwarding operator (alias this), what do you search for? If you see
 "alias _suchAndSuch this" and don't know what it means, what do you
 search for? (one might not think of removing the middle word and
 searching for that).

alias syntax is confusing and inconsistent with renamed imports. It should use assignment syntax:

     alias Bar = expression;
     alias this = foo;

http://d.puremagic.com/issues/show_bug.cgi?id=3011

 Interestingly, the discussion so far has been all about syntax, not any
 significant new features. I'm thinking ... coersion of a class to any
 compatible interface (as in Go)?

We already have:

     import std.range;
     auto range = ...;
     auto obj = inputRangeObject(range);
     alias ElementType!(typeof(range)) E;
     InputRange!E iface = obj;
     writeln(iface.front);

So maybe we can do:

     auto implementObject(Interface, T)(T t){...}
     auto obj = implementObject!(InputRange!E)(range);

Also, it might be nice to have 'canImplement' for template constraints:

     auto foo(T)(T v) if (canImplement!(T, SomeInterface)){...}

Nick
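
The "we already have" part can be checked with a concrete range. The sketch below substitutes `iota(3, 6)` for the elided `auto range = ...;` and adds a made-up non-template function, `firstElement`, to show the wrapper being used through the interface. The proposed `implementObject` and `canImplement` are not shown because they are hypothetical.

```d
import std.range : ElementType, InputRange, inputRangeObject, iota;

/// Non-template function that accepts the interface, not a concrete range.
int firstElement(InputRange!int r)
{
    return r.front;
}

unittest
{
    auto range = iota(3, 6);                // a struct range: 3, 4, 5
    auto obj = inputRangeObject(range);     // class object wrapping it
    alias E = ElementType!(typeof(range));  // int
    InputRange!E iface = obj;               // viewed through the interface
    assert(firstElement(iface) == 3);
}
```

This works because the library knows the input range protocol in advance; a general version for arbitrary interfaces would need to generate the wrapper from the interface's members.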
Aug 25 2012
Nick Treleaven <invalid example.net> writes:
On 25/08/2012 23:12, David Piepgrass wrote:
 So maybe we can do:

 auto implementObject(Interface, T)(T t){...}
 auto obj = implementObject!(InputRange!E)(range);

Well, my D-fu is too weak to tell whether it's doable. When it comes to ranges, the standard library already knows what it's looking for, so I expect the wrapping to be straightforward.

I was thinking things like __traits(allMembers, Interface) and other __traits might provide enough to do it.
 Even if a template-based
 solution could work at compile-time, run-time (when you want to cast
 some unknown object to a known interface) may be a different story.

My code above does do that - range is unknown, it just has to have compatible symbols with Interface's methods. obj is a true Object that implements Interface. Continuing the example, you could write:

     InputRange!E iface = obj;

Where InputRange!E is a real interface, not a model of one. I.e. you could pass it to a non-template function:

     void foo(InputRange!E irange){...}
     foo(obj);

I'm not sure what's missing (besides an implementation of course ;-)).
 Also, it might be nice to have 'canImplement' for template constraints:

 auto foo(T)(T v) if (canImplement!(T, SomeInterface)){...}

or 'couldImplement', assuming T doesn't officially declare that it implements the interface...

I was thinking 'can implement' vs. 'does implement'.
Aug 26 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
 I'm inclined to think that constructors should use "init", in 
 keeping with tradition.

Wow, what the hell am I saying. Scratch that sentence, I often wish I could edit stuff after posting.
Aug 25 2012
"David Piepgrass" <qwertie256 gmail.com> writes:
 Interestingly, the discussion so far has been all about 
 syntax, not any
 significant new features. I'm thinking ... coersion of a class 
 to any
 compatible interface (as in Go)?

 We already have:

      import std.range;
      auto range = ...;
      auto obj = inputRangeObject(range);
      alias ElementType!(typeof(range)) E;
      InputRange!E iface = obj;
      writeln(iface.front);

 So maybe we can do:

      auto implementObject(Interface, T)(T t){...}
      auto obj = implementObject!(InputRange!E)(range);

Well, my D-fu is too weak to tell whether it's doable. When it comes to ranges, the standard library already knows what it's looking for, so I expect the wrapping to be straightforward. Even if a template-based solution could work at compile-time, run-time (when you want to cast some unknown object to a known interface) may be a different story. I am sometimes amazed by the things the Boost people come up with, that support C++03, things that I was "sure" C++03 couldn't do, such as lambda/inner functions (see "Boost Lambda Library", "Boost.LocalFunction" and "Phoenix"), scope(exit) (BOOST_SCOPE_EXIT), and a "typeof" operator (Boost.Typeof). If there were 1/4 as many D programmers as C++ ones, I might be amazed on a regular basis.
 Also, it might be nice to have 'canImplement' for template 
 constraints:

 auto foo(T)(T v) if (canImplement!(T, SomeInterface)){...}

or 'couldImplement', assuming T doesn't officially declare that it implements the interface...
Aug 25 2012
"Tommi" <tommitissari hotmail.com> writes:
On Friday, 24 August 2012 at 05:14:39 UTC, F i L wrote:
 To start, let's look at: cast(T) vs to!T(t)

 In D, we have one way to use template function, and then we 
 have special keyword syntax which doesn't follow the same 
 syntactical rules. Here, cast looks like the 'scope()' or 
 'debug' statement, which should be followed by a body of code, 
 but it works like a function which takes in the following 
 argument and returns the result. Setting aside the "func!()()" 
 syntax for a moment, what cast should look like in D is:

     int i = cast!int(myLong);

I agree that cast!int(myLong) looks better and more consistent than cast(int)myLong. But I think your proposed syntax improvement fails when you're casting to const or immutable:

     cast(const)myArray --> cast!const(myArray)

Then it looks like we're passing a keyword as a template argument.
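
The qualifier-only form of cast that causes this problem looks like the following in current D. This is a small sketch; the helper name `narrow` is invented for illustration.

```d
/// Qualifier plus type inside the same parentheses.
int narrow(long value)
{
    return cast(immutable int) value;
}

unittest
{
    int x = 5;

    // cast(const) names no type at all: it only replaces the qualifier.
    static assert(is(typeof(cast(const) x) == const int));

    assert(narrow(7) == 7);
}
```

Because `cast(const)` carries a bare keyword rather than a type, a template-style spelling would need a special rule for it.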
Aug 25 2012
"Chris Nicholson-Sauls" <ibisbasenji gmail.com> writes:
Before we go proposing something like replacing 'new Foo( val )' 
with 'Foo.new( val )' ... which is just so Ruby-esque, but that's 
okay with me ... we need to consider that 'new' is not used only 
for classes.  Okay, so presumably structs would work the same 
way, but what of, say, arrays?  What would be the equivalent of 
'new int[][]( 5, 10 )' given such a change?

As it stands, 'new' behaves like an operator (behaves like, but 
is really a grammar artifact) and so is consistent with 
intuition.  How would we make something like 'int[][].new( 5, 10 
)' make sense *without* having to provide a function (presumably 
through UFCS) for each arity?  And, given the design of D arrays, 
what would such a function even look like?
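
For readers unfamiliar with the array form of 'new' referred to here, this is what it does in current D. A minimal sketch; the wrapper name `makeGrid` is an assumption added so the behaviour can be checked.

```d
/// Allocates `rows` inner arrays of `cols` elements each; every element
/// starts at int.init, which is 0.
int[][] makeGrid(size_t rows, size_t cols)
{
    return new int[][](rows, cols);
}

unittest
{
    auto grid = makeGrid(5, 10);
    assert(grid.length == 5);
    assert(grid[0].length == 10);
    assert(grid[4][9] == 0);
}
```

The number of arguments matches the number of array dimensions, which is why a function-based replacement would have to cope with each arity.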
Aug 27 2012
"F i L" <witte2008 gmail.com> writes:
On Monday, 27 August 2012 at 11:15:36 UTC, Chris Nicholson-Sauls 
wrote:
 Before we go proposing something like replacing 'new Foo( val 
 )' with 'Foo.new( val )' ... which is just so Ruby-esque, but 
 that's okay with me ... we need to consider that 'new' is not 
 used only for classes.  Okay, so presumably structs would work 
 the same way, but what of, say, arrays?  What would be the 
 equivalent of 'new int[][]( 5, 10 )' given such a change?

 As it stands, 'new' behaves like an operator (behaves like, but 
 is really a grammar artifact) and so is consistent with 
 intuition.  How would we make something like 'int[][].new( 5, 
 10 )' make sense *without* having to provide a function 
 (presumably through UFCS) for each arity?  And, given the 
 design of D arrays, what would such a function even look like?

idk, I think the int[][].new(5, 10) syntax looks good, and is consistent with how I described template parameters. Constructors would have completely arbitrary names. So, while most would follow a naming standard (like 'new'), arrays could always use something that was a bit more descriptive, like 'alloc':

    struct Point(T) {
        T x, y;
        this new(T x, T y) { ... }
    }

    void main() {
        // create Point
        auto p = Point(int).new(1, 2);

        // create dynamic array of 5 Points
        // and construct them all
        auto a = Point(int)[].alloc(5);
        a[].new(1, 2);

        // shorthand equivalent
        auto a = Point(int)[].alloc(5).new(1, 2);

        // same thing, but with static array
        auto a = Point(int)[5].new(1, 2);
    }

Personally, I kinda think using 'new' instead of 'alloc' makes more sense, for consistency reasons:

    auto a = Point(int)[].new(5).new(1, 2);

but then, I can see why someone would want the distinction so it's easier to understand what constructor belongs to the Array.

Either way, I don't see any conflicts with the syntax I proposed.
Aug 27 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 27 August 2012 at 12:51:48 UTC, F i L wrote:
     auto a = Point(int)[].new(5).new(1, 2);

 but then, I can see why someone would want the distinction so 
 it's easier to understand what constructor belongs to the Array.

 Either way, I don't see any conflicts with the syntax I 
 purposed.

Except being somewhat unfamiliar with it, I can only look it over and go 'huh? what's going on?'.

new being a keyword from C++ for allocate on the heap immediately makes this whole thing confusing. I would wonder myself: Is this on the stack or in the heap for the whole thing? Is Point a struct or a class? Then I wonder, assuming that you get 5 for the length, would all the elements be set the same or just the first one?

Would new still be a key word? If not (and it uses a regular function signature as you propose), then what if someone decided to use new for something other than allocation and returned something other than the type you were expecting from the view above? (Say, making a 2d matrix?)

Sorry, these are just coming forward for me as I sit here reading it.
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
F i L wrote:
 auto a = Point(int)[].new(5).new(1, 2);

On second thought, the shorthand might need to require the '[]' syntax:

    auto a = Point(int)[].new(5)
    a[].new(1, 2)

    // shorthand:
    auto a = Point(int)[].new(5)[].new(1, 2);
Aug 27 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Monday, 27 August 2012 at 13:59:53 UTC, F i L wrote:
 F i L wrote:
 auto a = Point(int)[].new(5).new(1, 2);

 On second thought, the shorthand might need to require the '[]' 
 syntax:

     auto a = Point(int)[].new(5)
     a[].new(1, 2)

     // shorthand:
     auto a = Point(int)[].new(5)[].new(1, 2);

I'd expect something like:

    // 1. create Array of 5 elements, default construction
    auto a = Array!(T).new(5);
    // 2. create Array of 7 elements
    // all elements init-ed with the same "this(params)" c-tor
    auto b = Array!(T).new(7, params);
    // 3. create Array of 3 elements, init-ed via function
    auto c = Array!(T).new(3, (length) {...});

Regarding syntax [sugar], the above would translate to:

    1. auto a = T[].new(5);
    2. auto b = T[].new(7, params);
    3. auto c = T[].new(3, (length) {...});

For multidimensional arrays the above can be generalized as follows:

    // first parameter is a tuple
    // should there be syntax sugar for this?
    auto d1 = Array!T.new((6,5,4), params); // 6x5x4 3d array
    auto d2 = Array!T.new((6,5,4), (dims) {...}); // 6x5x4 3d array

    // "arrays of arrays"
    auto d1 = T[][].new(6, T[].new(5, params)); // nest "new" calls
    auto d2 = T[][].new(6, (length) {...}); // init sub-arrays inside function

    // or treat multi "[]" as multidimensional - IMHO this is less optimal
    // compiler knows how many dimensions there are
    auto d1 = T[][][].new(6,5,4, params); // 6x5x4 3d array
    auto d2 = T[][][].new(6,5,4, (x,y,z) {...}); // 6x5x4 3d array

for static arrays:

    1. auto a = T[5].new();
    2. auto b = T[7].new(params);
    3. auto c = T[3].new((length) {...});

P.S
I saw in a different language (Nimrod?) the syntax for array!T is "[T]" which I like a bit more than D's "T[]" but this isn't that important for the discussion at hand.
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
Era Scarecrow wrote:
  Except being somewhat unfamiliar with it, I can only look it 
 over and go 'huh? what's going on?'.

  new being a keyword from C++ for allocate on the heap 
 immediately makes this whole thing confusing. I would wonder 
 myself: Is this on the stack or in the heap for the whole 
 thing? Is Point a struct or a class? Then I wonder assuming 
 that you get 5 for the length, then would all the elements be 
 set the same or just the first one?

in C#, you use 'new Type()' for both classes and structs, and it works fine. In fact, it has some benefit with generic programming. Plus, it's impossible to completely get away from having to understand the type, even in C++/D today, because we can always make factory functions:

    FooType newFoo() {
        return new FooType( ... );
    }

    void main() {
        auto f = newFoo(); // struct or class?
    }

However, I do agree it would be nice having some kind of distinction between stack/heap allocation and copy/ref semantics. Because constructor names are completely arbitrary in my proposal, I think you could easily just choose a different name (say, 'new' vs 'set').

    struct Foo {
        this set() { ... }
    }

    class Bar {
        this new() { ... }
    }

    void main() {
        auto f = Foo.set( ... ); // stack/copy
        auto b = Bar.new( ... ); // heap/ref
    }

Again, this is just an arbitrary distinction and wouldn't be enforced, so third-party libs could choose something completely different... but then, they always could (like above) so its benefit is debatable.

I've had ideas before about having two different '=' operators for assignment and copy, but honestly, I think just looking up and understanding the types you're working with might be the best solution. A task much easier with proper IDE tooltips and the like.
  Would new still be a key word?

No. You could just as easily name your constructor 'freshCopyYo()'. In fact, you often want multiple constructor names for different constructing procedures, like I explain in my original post. For instance if you're loading from a file, having a 'load()' constructor makes more sense than 'new()'. When converting from some other type, have a 'from()' constructor. The name implies the action, but they all "construct" the type:

    class Text
    {
        this new()         // blank
        this new(string s) // basic constructor

        this from(int i)   // convert from int
        this from(float f) // convert from float

        this load(string filename) // load from file
    }

All of these are constructors, because their return type is 'this'. They all implicitly allocate an object and have a 'this' reference. However, their names and implementations are completely arbitrary, which is a good thing because we need and use these arbitrary constructors all the time today, we just have to do it in a completely inconsistent way (static factory functions).
 If not (and uses a regular function signature as you propose), 
 then what if someone decided to use new for something other 
 than allocation and returned something other than the type you 
 were expecting from the view above? (Say, making a 2d matrix?)

You can ask the same question about any functions today (see the first code example of this post). It's possible to make a function 'add' that actually subtracts, but realistically no one is going to do it.

Free Lists (memory pooled objects) are a perfect example of when you want a constructor that doesn't [always] allocate memory, and I describe this example in my original post.

Also, there are times when you actually want to return a different (but often related) type. Dart has special "constructor factories", which override the 'new Type()' syntax and are built for just that (http://www.dartlang.org/articles/idiomatic-dart/#constructors scroll down a little to 'factory constructors').

With my proposal, a function which returns type 'this' would always return that type, but because they're invoked like any other static function, you can easily define a static factory function which "appears" to be a constructor (based on its name).
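To make "static factory functions" concrete, here is a small sketch of how the Text example tends to be written in D as it stands today. The class layout and the `data` field are assumptions for illustration only; `load` is left out to keep the sketch free of file I/O.

```d
import std.conv : to;

class Text
{
    string data;

    // The single language-level constructor; its name is fixed.
    this(string s = "") { data = s; }

    // Named static factories play the role of 'this from(...)'.
    static Text from(int i)    { return new Text(to!string(i)); }
    static Text from(double f) { return new Text(to!string(f)); }
}

unittest
{
    auto blank = new Text;      // ordinary construction
    auto num   = Text.from(42); // factory: reads nothing like 'new Text'
    assert(blank.data == "");
    assert(num.data == "42");
}
```

The two call sites construct the same type but look unrelated, which is the inconsistency being described.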
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
foobar wrote:
 I'd expect something like:

 [ ... ]

Ya, you could always have the Array constructor pass on construction Parameters as well, and I think your examples make sense.
 P.S
 I saw in a different language (Nimrod?) the syntax for array!T 
 is "[T]" which I like a bit more than D's "T[]" but this isn't 
 that important for the discussion at hand.

actually, Nimrod arrays look like:

    var a : array[0..5, int]
    var a = [0, 1, 2, 3, 4] # shorthand

and dynamic arrays (sequences):

    var s : seq[int]
    var s = [0, 1, 2, 3, 4] # shorthand

D has the best array syntax, IMO. Especially with AA shorthands:

    auto aa = ['one':1, 'two':2, ... ];

I think that's pretty much perfect the way it is.
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
F i L wrote:
 foobar wrote:
 I'd expect something like:

 [ ... ]

Ya, you could always have the Array constructor pass on construction Parameters as well, and I think your examples make sense.

On a side note, you could always invoke Arrays, Dynamic Arrays (Lists), and Associative Arrays (Maps) by name (like every other object), and use the '[...]' syntax for initializing the size:

    auto a = Array(Point(int))[5].new(1, 2); // static
    auto l = List(Point(int))[5].new(1, 2); // dynamic
    auto m = Map(Point(int), string)[
        'one', 'two', 'three', 'four', 'five'
    ].new(1, 2)

    // shorthands:
    auto a = [0, 1, 2]; // Array(int)[3]
    auto l = [][0, 1, 2]; // List(int)[3]
    auto m = ['z':0, 'o':1, 't':2]; // Map(int, string)['z', 'o', 't']

this would have the added benefit of being consistent with std.containers types, but only the three default ones would have shorthands:

    import std.containers;

    auto l = List(int)[5]; // object.d
    auto sl = SLinked(int)[5]; // std.containers.d
    auto dl = DLinked(int)[5]; // std.containers.d

IDK though, just a thought.
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
F i L wrote:
 On a side note, you could always invoke Arrays, Dynamic Arrays 
 (Lists), and Associative Arrays (Maps) by name (like every 
 other object), and use the '[...]' syntax for initializing the 
 size:

     [ ... ]

meh. On second thought, that would be really annoying in some places. It would be ugly cause you'd have to do things like:

    void main(List(string) args) { ... }

instead of:

    void main(string[] args) { ... }

I think the way D currently handles arrays is very elegant.
Aug 27 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 27 August 2012 at 14:53:57 UTC, F i L wrote:
 in C#, you use 'new Type()' for both classes and structs, and 
 it works fine. In fact, it has some benefit with generic 
 programming. Plus, it's impossible to completely get away from 
 having to understand the type, even in C++/D today, because we 
 can always make factory functions:

I'm sure in C# that all structs and classes are heap allocated (it very likely takes after C++); that's the simplest way to do it. You can do that in C++ as well, other than having to declare it a pointer first. In C++ they made structs 'classes that are public by default' by definition, I believe. Considering how C++ is set up, that makes perfect sense.
     FooType newFoo() {
         return new FooType( ... );
     }

     void main() {
         auto f = newFoo(); // struct or class?
     }

By looking at newFoo I'd say a class; But if like in C# I'm sure you can't tell the difference (But C++ with the pointer you can).

And for factory functions I'd put them inside the struct/class, unless I had a compelling reason not to.

  class Record {
    static Record newRecord(string options) {
      Record rec = new Record();
      // stuff building the complex record
      return rec;
    }
  }
 However, I do agree it would be nice having some kind of 
 distinction between stack/heap allocation and copy/ref 
 semantics. Because constructor names are completely arbitrary 
 in my proposal, I think you could easily just choose a 
 different name (say, 'new' vs 'set').

     struct Foo {
         this set() { ... }
     }

     class Bar {
         this new() { ... }
     }

     void main() {
         auto f = Foo.set( ... ); // stack/copy
         auto b = Bar.new( ... ); // heap/ref
     }

 Again, this is just an arbitrary distinction and wouldn't be 
 enforced, so third-party libs could choose something completely 
 different... but then, they always could (like above) so it's 
 benefit is debatable.

 I've had ideas before about having two different '=' operators 
 for assignment and copy, but honestly, I think just looking up 
 and understanding the types you're working with might be the 
 best solution. A task much easier with proper IDE tooltips and 
 the like.


 Would new still be a key word?

No. You could just as easily name your constructor 'freshCopyYo()'. In fact, you often want multiple constructor names for different constructing procedures, like I explain in my original post. For instance if you're loading from a file, having a 'load()' constructor makes more sense than 'new()'. When converting from some other type, have a 'from()' constructor. The name implies the action, but they all "construct" the type:

     class Text
     {
         this new()         // blank
         this new(string s) // basic constructor

         this from(int i)   // convert from int
         this from(float f) // convert from float

         this load(string filename) // load from file
     }

 All of these are constructors, because they're return type is 
 'this'. They all implicitly allocate an object and have a 
 'this' reference. However, their names and implementations are 
 completely arbitrary, which is a good thing because we need and 
 use these arbitrary constructors all the time today, we just 
 have to do it in a completely inconsistent way (static factory 
 functions).

And a postblit would end up being...? The extra 'this' makes it look like an obvious typo or a minor headache.

  this this(this){} //postblit?

Honestly I kinda like it how it is now. It's fairly clear and concise. Only if you start getting creative can it begin to get confusing; Then again in any language someone who decided to make it confusing would make it confusing regardless.

--
  enum JOJOJO = 100;

  class Jo {
    int jojo1;

    Jo jo(Jo JOO) {
      int jojo = 10;
      int joJo = JOJOJO;
      Jo JO = new Jo();
      // and so on and so forth.
    }
  }

  // Make a new Jo!
  Jo newJo(){
    return Jo.jo(null);
  }

  Jo something = newJo(); //now is this a class or a struct :P

Right back at you.
--

Seriously... Actually if you get it confusing enough you could submit it to The International Obfuscated C Code Contest. http://www.ioccc.org/

I would say if 'new' is part of the function name, it returns a class. If it's referenced (or a pointer), it could be either. But not having a universal constructor would make creating an array with defaults impossible without declaring/hinting which one to use. It's one reason structs have to have good contents for all their data members at compile-time, so you could create an array with defaulted information that works.

If we take your approach and suggestion, which one should the compiler assume?

  Something globalSomething;

  class Something {
    this defaultConstructor();
    this duplicate(); //or clone
    this copyGlobalSomething();
    this constructorWithDefault(int x = 100);
  }

By signature alone... Which one? They are all legal, they are uniquely named, and they are all equal candidates. Order of functions is irrelevant.
Aug 27 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Monday, 27 August 2012 at 20:22:47 UTC, Era Scarecrow wrote:
 On Monday, 27 August 2012 at 14:53:57 UTC, F i L wrote:
 in C#, you use 'new Type()' for both classes and structs, and 
 it works fine. In fact, it has some benefit with generic 
 programming. Plus, it's impossible to completely get away from 
 having to understand the type, even in C++/D today, because we 
 can always make factory functions:

 I'm sure in C# that all structs and classes are heap allocated 
 (it very likely takes after C++); that's the simplest way to do 
 it. You can do that in C++ as well, other than having to declare 
 it a pointer first. In C++ they made structs 'classes that are 
 public by default' by definition, I believe. Considering how C++ 
 is set up, that makes perfect sense.
    FooType newFoo() {
        return new FooType( ... );
    }

    void main() {
        auto f = newFoo(); // struct or class?
    }

 By looking at newFoo I'd say a class; But if like in C# I'm sure 
 you can't tell the difference (But C++ with the pointer you can).

 And for factory functions I'd put them inside the struct/class, 
 unless I had a compelling reason not to.

   class Record {
     static Record newRecord(string options) {
       Record rec = new Record();
       // stuff building the complex record
       return rec;
     }
   }
 However, I do agree it would be nice having some kind of 
 distinction between stack/heap allocation and copy/ref 
 semantics. Because constructor names are completely arbitrary 
 in my proposal, I think you could easily just choose a 
 different name (say, 'new' vs 'set').

    struct Foo {
        this set() { ... }
    }

    class Bar {
        this new() { ... }
    }

    void main() {
        auto f = Foo.set( ... ); // stack/copy
        auto b = Bar.new( ... ); // heap/ref
    }

 Again, this is just an arbitrary distinction and wouldn't be 
 enforced, so third-party libs could choose something 
 completely different... but then, they always could (like 
 above) so it's benefit is debatable.

 I've had ideas before about having two different '=' operators 
 for assignment and copy, but honestly, I think just looking up 
 and understanding the types you're working with might be the 
 best solution. A task much easier with proper IDE tooltips and 
 the like.


 Would new still be a key word?

No. You could just as easily name your constructor 'freshCopyYo()'. In fact, you often want multiple constructor names for different constructing procedures, like I explain in my original post. For instance if you're loading from a file, having a 'load()' constructor makes more sense than 'new()'. When converting from some other type, have a 'from()' constructor. The name implies the action, but they all "construct" the type:

    class Text
    {
        this new()         // blank
        this new(string s) // basic constructor

        this from(int i)   // convert from int
        this from(float f) // convert from float

        this load(string filename) // load from file
    }

 All of these are constructors, because they're return type is 
 'this'. They all implicitly allocate an object and have a 
 'this' reference. However, their names and implementations are 
 completely arbitrary, which is a good thing because we need 
 and use these arbitrary constructors all the time today, we 
 just have to do it in a completely inconsistent way (static 
 factory functions).

 And a postblit would end up being...? The extra 'this' makes it 
 look like an obvious typo or a minor headache.

   this this(this){} //postblit?

 Honestly I kinda like it how it is now. It's fairly clear and 
 concise. Only if you start getting creative can it begin to get 
 confusing; Then again in any language someone who decided to 
 make it confusing would make it confusing regardless.

 --
   enum JOJOJO = 100;

   class Jo {
     int jojo1;

     Jo jo(Jo JOO) {
       int jojo = 10;
       int joJo = JOJOJO;
       Jo JO = new Jo();
       // and so on and so forth.
     }
   }

   // Make a new Jo!
   Jo newJo(){
     return Jo.jo(null);
   }

   Jo something = newJo(); //now is this a class or a struct :P

 Right back at you.
 --

 Seriously... Actually if you get it confusing enough you could 
 submit it to The International Obfuscated C Code Contest. 
 http://www.ioccc.org/

 I would say if 'new' is part of the function name, it returns a 
 class. If it's referenced (or a pointer), it could be either. 
 But not having a universal constructor would make creating an 
 array with defaults impossible without declaring/hinting which 
 one to use. It's one reason structs have to have good contents 
 for all their data members at compile-time, so you could create 
 an array with defaulted information that works.

 If we take your approach and suggestion, which one should the 
 compiler assume?

   Something globalSomething;

   class Something {
     this defaultConstructor();
     this duplicate(); //or clone
     this copyGlobalSomething();
     this constructorWithDefault(int x = 100);
   }

 By signature alone... Which one? They are all legal, they are 
 uniquely named, and they are all equal candidates. Order of 
 functions is irrelevant.

FiL's scheme looks backwards to me. One of the main drawbacks of factories is the fact that they have non-standard names.
Given a class Foo, How would I know to call a factory newFoo?

More generally, given template:

T create(T)(...) {
  // .. how do I create an instance of T?
}

The correct scheme would be (as implemented in a few languages):
1. The constructor needs to be split into two separate stages - creation/allocation and initialization.
2. Creation is done via a regular (virtual) method with a standardized name ("new").
3. Initialization is done via a non-virtual function, e.g what we call a c-tor.

The compiler/programmer know that creation is done via "new" and that new must call a c-tor. If there is no c-tor defined one will be generated for you by the compiler (same as it is now). In addition, a default creation method will also be generated by the compiler if it is not defined by the programmer. All it will do is just forward to the init function (aka c-tor).
The main difference is that now the programmer has control over the first stage (creation) and it is no longer implicitly hardwired.

Example use case:

class LimitedAccount : Account {
// "regular allocation" - on GC heap
private Account new(Person p) {
  return GC.allocate!LimitedAccount(p);
}
// init
this(Person p) {...}
...more code...
}

class Bank {
Account new(Person p, AccountType t) {
  switch(t) {
  case AccountType.LIMITED: return LimitedAccount.new(p);
  ... more cases...
  }
}
}

// usage:
Account acc = Bank.new(PoorShmoe, AccountType.LIMITED);
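A rough approximation of the use case above in current D, as a sketch only. Since 'new' is a keyword, the creation stage is spelled `create` here, and it is a static function rather than a virtual method; the `owner` field and the two-member `AccountType` enum are assumptions made to keep the example self-contained.

```d
enum AccountType { regular, limited }

class Account
{
    string owner;

    // Stage 2, initialisation: an ordinary constructor.
    this(string owner) { this.owner = owner; }

    // Stage 1, creation: a standardised name chosen by convention.
    static Account create(string owner) { return new Account(owner); }
}

class LimitedAccount : Account
{
    this(string owner) { super(owner); }

    // The allocation strategy could be swapped here without
    // touching any caller.
    static Account create(string owner) { return new LimitedAccount(owner); }
}

struct Bank
{
    static Account create(string owner, AccountType t)
    {
        switch (t)
        {
            case AccountType.limited: return LimitedAccount.create(owner);
            default:                  return Account.create(owner);
        }
    }
}

unittest
{
    Account acc = Bank.create("PoorShmoe", AccountType.limited);
    assert(cast(LimitedAccount) acc !is null);
    assert(acc.owner == "PoorShmoe");
}
```

Nothing in the language enforces the `create` convention here, which is the gap the standardized "new" method is meant to close.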
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
Era Scarecrow wrote:
  I'm sure in C# that all structs and classes are heap allocated 
 (It takes after C++ very likely) that's the simplest way to do 
 it. You can do that in C++ as well, but other than having to 
 declare it a pointer first. In C++ they made structs 'classes 
 that are public by default' by it's definition I believe. 
 Considering how C++ is set up that makes perfect sense.

C# structs are allocated on the stack when they can be. In certain cases (they're class fields, they're boxed, etc..) they're heap allocated.
 By looking at newFoo I'd say a class; But if like in C# I'm 
 sure you can't tell the difference (But C++ with the pointer 
 you can).

the 'auto' keyword kind of negates the 'Type*' distinction. My point here is that you pretty much have to look up the type definition (or a tooltip) to understand what you're working with when factory functions are involved.
 And for factory functions I'd put them inside the struct/class, 
 unless I had a compelling reason not to.

   class Record {
     static Record newRecord(string options) {
       Record rec = new Record();
       // stuff building the complex record
       return rec;
     }
   }

The result is the same whether they're inside or outside a class, because, when used, all the coder sees is the function name to know about what it's returning.

In D there is a difference between structs and classes beyond what's in C++. The 'new' keyword helps us understand what kind of object is being created, and I enjoy that. However, my argument is in favor of consistency because right now factory-functions, which _are_ used a lot, completely hide that distinction on top of being inconsistent with "normal" type construction.
  And a postblits would end up being...? The extra 'this' makes 
 it look like an obvious typo or a minor headache.

  this this(this){} //postblitz?

I'm sure this case has an easy solution. How about:

    struct Foo {
        this new() { ... } // constructor
        this() { ... } // postblit
    }
  Only if you start getting creative can it begin to get 
 confusing;

And I have to completely disagree with you here. Memory Pools are used everywhere in performance-critical code which needs a dynamic array of objects. At least half of all the "allocation" in game engines is done through factory functions that recycle objects.

And then there's overload distinction (new vs load), which is an issue beyond Memory Pools and affects an even larger codebase. There needs to be a consistent way to distinguish (by name) a constructor that loads from a file, and one that creates the object "manually".
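For readers who haven't met the pattern, a minimal sketch of a recycling factory in current D. `Particle`, `acquire` and `release` are hypothetical names; real pools usually add bounds and preallocation, which are omitted here.

```d
class Particle
{
    float x = 0, y = 0;

    private Particle nextFree;        // intrusive free-list link
    private static Particle freeList; // thread-local by default in D

    private this() {}

    // Factory: reuses a released object when one is available,
    // and only allocates when the free list is empty.
    static Particle acquire(float x, float y)
    {
        Particle p = freeList;
        if (p !is null)
        {
            freeList = p.nextFree;
            p.nextFree = null;
        }
        else
        {
            p = new Particle;
        }
        p.x = x;
        p.y = y;
        return p;
    }

    // Hand an object back to the pool instead of leaving it to the GC.
    static void release(Particle p)
    {
        p.nextFree = freeList;
        freeList = p;
    }
}

unittest
{
    auto a = Particle.acquire(1, 2);
    Particle.release(a);
    auto b = Particle.acquire(3, 4);
    assert(a is b);   // recycled, no second allocation
    assert(b.x == 3); // but re-initialised
}
```

From the caller's side `acquire` constructs a Particle, yet it has to be written and discovered as a static function.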
  If we take your approach and suggestion, which one should the 
 compile assume?

  Something globalSomething;

  class Something {
    this defaultConstructor();
    this duplicate(); //or clone
    this copyGlobalSomething();
    this constructorWithDefault(int x = 100);
  }

 By signature alone... Which one? They are all legal, they are 
 uniquely named, and they are all equal candidates. Order of 
 functions are irrelevant.

It could work identically to how D functions today. A 'new()' constructor would be part of the root Object classes are derived from, and structs would have an implicit 'new()' constructor.

This could work in our favor, because instead of 'new' we could use 'alloc' (or something like that) while still encouraging 'new'. Which means structs could provide a parameter-less new() constructor:

    class Foo {
        float x;
        // no constructor
    }

    auto f = Foo.alloc();
    assert(f.x == float.nan);

    ---

    struct Point {
        float x, y;
        this new() { x = 0; y = 0; }
        this new(float x, float y) { ... }
    }

    auto a = Point.new();
    auto b = Point.new(1, 2);
    assert(a.x == 0.0f);
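For reference, a sketch of what current D does in the two cases above. Structs cannot declare a parameter-less `this()`, so a named static function (called `zero` here, a made-up name) is the usual stand-in. Note also that a comparison like `f.x == float.nan` is always false, because NaN never compares equal; `std.math.isNaN` is the working check.

```d
import std.math : isNaN;

class Foo
{
    float x; // no constructor
}

struct Point
{
    float x, y;

    // 'this()' is not allowed on a struct, so a named static
    // function stands in for the parameter-less constructor.
    static Point zero() { return Point(0, 0); }
}

unittest
{
    auto f = new Foo;
    assert(f.x.isNaN);        // float.init is NaN
    assert(f.x != float.nan); // NaN is unequal to everything, itself included

    Point p;                  // default initialisation, no user code runs
    assert(p.x.isNaN && p.y.isNaN);

    auto z = Point.zero();
    assert(z.x == 0 && z.y == 0);

    auto arr = new Point[](3); // every element starts as Point.init
    assert(arr[2].x.isNaN);
}
```

The array case shows why a compile-time `.init` exists: the runtime can fill any number of elements without having to pick a constructor.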
Aug 27 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
foobar wrote:
 FiL's scheme looks backwards to me. One of the main drawbacks 
 of factories is the fact that they have non-standard names.
 Given a class Foo, How would I know to call a factory newFoo?

Well you'd know to call constructor functions because their name implies they construct an object: new(), load(), from(T), etc.. I _hate_ global functions like in my newFoo() example. I was simply using it as an example of where factories hide information about what's being returned in today's code. Constructor functions in my proposal must always be attached to a type, and can't be global, but their names are arbitrary (which is good for overload distinction, like I've explained before).
 class LimitedAccount : Account {
 // "regular allocation" - on GC heap
 private Account new(Person p) {
   return GC.allocate!LimitedAccount(p);
 }
 // init
 this(Person p) {...}
 ...more code...
 }

 class Bank {
 Account new(Person p, AccountType t) {
   switch(t) {
   case AccountType.LIMITED: return LimitedAccount.new(p);
   ... more cases...
   }
 }
 }

 // usage:
 Account acc = Bank.new(PoorShmoe, AccountType.LIMITED);

This is pretty much exactly what I am advocating except I think the allocator and c-tor can be combined (with an attribute override), since the allocator is implicit half the time. I gave an example in my original post:

    class Foo {
        this new() {
            // Implicitly allocates Foo.
        }

        this new() noalloc {
            // Allocation must be explicit
            // but must return type Foo.
        }

        static auto new() {
            // Regular factory function.
            // Allocate and return at will.
        }
    }
Aug 27 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 27 August 2012 at 22:44:53 UTC, F i L wrote:
 Era Scarecrow wrote:

 C# structs are allocated on the stack when they can be. In 
 certain cases (they're class fields, they're boxed, etc..) 
 they're heap allocated.

So it will intentionally ignore 'new' and instead just call the constructor and decide if it should be heap or not? Sounds both helpful and harmful to me.
 By looking at newFoo I'd say a class; But if like in C# I'm 
 sure you can't tell the difference (But C++ with the pointer 
 you can).


 the 'auto' keyword kind negates the 'Type*' distinction. My 
 point here is that you pretty much have to look up the type 
 definition (or a tooltip) to understand what you're working 
 with when factory functions are involved.

I think that's why the naming should be something that sounds easy to follow or follows a particular style. The following should be easy to figure out even without seeing any documentation. Mind you, I'm throwing this out there without a specific class in mind.

  File inputFile = File.createHandle("some file");
  inputFile.load();
  inputFile.close();
 The result is the same weather they're inside or outside a 
 class, because, when used, all the coder sees is the function 
 name to know about what it's returning.

Wrong. If you have it within a class, you know it comes FROM that class. There may be multiple 'Records' depending on different projects or having compatible types. In my own factory function it polymorphs based on the input, so knowing the root class makes sense to me.
 In D there is a difference between structs and classes beyond 
 what's in C++. The 'new' keyword helps us understand what kind 
 of object is being created, and I enjoy that. However, my 
 argument is in favor of consistency because right now 
 factory-functions, which _are_ used a lot, completely hide that 
 distinction on top of being inconsistent with "normal" type 
 construction.

 And a postblits would end up being...? The extra 'this' makes 
 it look like an obvious typo or a minor headache.

 this this(this){} //postblitz?

 I'm sure this case has an easy solution. How about:

     struct Foo {
         this new() { ... } // constructor
         this() { ... } // postblit
     }

But now you're breaking consistency by not including a return type. Maybe 'this this()', but that looks like a mistake or typo. If you're willing to use something without a return type, why not leave it 'this(this)'? Or rename it altogether? 'this postblit()'.
 Only if you start getting creative can it begin to get 
 confusing;

 And I have to completely disagree with you here. Memory Pools 
 are used everywhere in performance-critical code which needs a 
 dynamic array of objects. At least half of all the "allocation" 
 in game engines is done through factory functions that recycle 
 objects.

Wasn't there already a way (or going to be one) to specify which allocator you wanted to use? I thought I remembered seeing an example in TDPL.
 And for overload distinction (new vs load), which is an issue 
 beyond Memory Pools and effects and even larger codebase. There 
 needs to be a consistent way to distinguish (by name) a 
 constructor that loads from a file, and one that creates the 
 object "manually".

Isn't that more an API issue?
 If we take your approach and suggestion, which one should the 
 compiler assume?

 Something globalSomething;

 class Something {
   this defaultConstructor();
   this duplicate(); //or clone
   this copyGlobalSomething();
   this constructorWithDefault(int x = 100);
 }

 By signature alone... Which one? They are all legal, they are 
 uniquely named, and they are all equal candidates. The order of 
 the functions is irrelevant.

It could work identically to how D works today. A 'new()' constructor would be part of the root Object that classes derive from, and structs would have an implicit 'new()' constructor.

But new wouldn't be a constructor then, would it? It would still be based on allocating memory that's optionally different. Construction and allocation are two different steps, and for one to flow seamlessly into the other there has to be a set default constructor. Let's assume...

    class Object {
        this new() {
            //allocate
            return defaultConstructor();
        }
        this defaultConstructor() {}
    }

Now in order to make a constructor (and then destructor) you can either:

A) overload or use 'defaultConstructor', which would be publicly known
B) overload new to do allocation the same way and call a different constructor, and specifically add a destructor to make sure it follows the same lines
C) overload new to call the default allocator and then call a different constructor

Now assuming you can make a different constructor by name, you then have to be able to specify a destructor the same way for consistency.

    class CustomType {
        this MyAwesomeConstuctor();
        void MyAwesomeDestructor();
    }

Same problem: how do you tell it ahead of time without completely rewriting the rules? Leaving it as 'this' and '~this' is simple to remember and work with, and factory functions should be used to do the bulk of the work when you don't want the basic/bare minimum.
 This could work in our favor, because instead of 'new' we could 
 use 'alloc' (or something like that) while still encouraging 
 'new'. Which means structs could provide a parameter-less new() 
 constructor:
     class Foo {
         float x;
         // no constructor
     }

     auto f = Foo.alloc();
     assert(f.x == float.nan);

     ---

     struct Point {
         float x, y;
         this new() { x = 0; y = 0; }
         this new(float x, float y) { ... }
     }

     auto a = Point.new();
     auto b = Point.new(1, 2);
     assert(a.x == 0.0f);

After reading a large chunk where Andrei spoke of the flaws of C++'s classes and inheritance, the potential problems of having stack/exact-size allocated classes compared to leaving them on the heap come to mind. This would undo all of that.

    class Bar : Foo {
        float y, z;
    }

    auto foo = Foo.alloc();
    int isSafe;
    foo = Bar.alloc(); //should implicitly convert normally. But stack?

    assert(isSafe == int.init); //or were the next variables overwritten??

In the cases where you don't overload anything the classes would be safe, in which case they don't polymorph, in which case you may as well use structs. Am I wrong on this?
Aug 27 2012
prev sibling next sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Aug 27, 2012, at 17:31, "Era Scarecrow" <rtcvb32 yahoo.com> wrote:

 On Monday, 27 August 2012 at 22:44:53 UTC, F i L wrote:
 Era Scarecrow wrote:

 C# structs are allocated on the stack when they can be. In certain  
 cases (they're class fields, they're boxed, etc..) they're heap  
 allocated.

So it will intentionally ignore 'new' and instead just call the constructor and decide if it should be heap or not? Sounds both helpful and harmful to me.

It works very similarly to D's structs. If declared in a function they are stack allocated. If declared in a class they are heap allocated. Not sure what they do for global/static declarations. Probably allocated in the data section.
 By looking at newFoo I'd say a class; But if like in C# I'm sure  
 you can't tell the difference (But C++ with the pointer you can).


 the 'auto' keyword kind of negates the 'Type*' distinction. My point  
 here is that you pretty much have to look up the type definition  
 (or a tooltip) to understand what you're working with when factory  
 functions are involved.

I think that's why the naming should be something that sounds easy to follow or follows a particular style. You should have no trouble figuring out the following even without seeing any documentation. Mind you, I'm throwing this out there without a specific class in mind.

    File inputFile = File.createHandle("some file");
    inputFile.load();
    inputFile.close();

 The result is the same whether they're inside or outside a class,  
 because, when used, all the coder sees is the function name to know  
 about what it's returning.

Wrong. If you have it within a class, you know it comes FROM that class. There may be multiple 'Records' depending on different projects or having compatible types. In my own factory function it polymorphs based on the input, so knowing the root class makes sense to me.
 In D there is a difference between structs and classes beyond  
 what's in C++. The 'new' keyword helps us understand what kind of  
 object is being created, and I enjoy that. However, my argument is  
 in favor of consistency because right now factory-functions, which  
 _are_ used a lot, completely hide that distinction on top of being  
 inconsistent with "normal" type construction.

 And a postblit would end up being...? The extra 'this' makes it  
 look like an obvious typo or a minor headache.

 this this(this){} //postblit?

I'm sure this case has an easy solution. How about:

    struct Foo {
        this new() { ... } // constructor
        this() { ... } // postblit
    }

But now you're breaking consistency by not including a return type. Maybe 'this this()', but that looks like a mistake or typo. If you're willing to use something without a return type, why not leave it as 'this(this)'? Or rename it altogether: 'this postblit()'.
 Only if you start getting creative can it begin to get confusing;

And I have to completely disagree with you here. Memory Pools are used everywhere in performance-critical code which needs a dynamic array of objects. At least half of all the "allocation" in game engines is done through factory functions that recycle objects.

Wasn't there already (or wasn't there going to be) a way to specify which allocator you want to use? I thought I remembered seeing an example in TDPL.
 And for overload distinction (new vs load), which is an issue  
 beyond Memory Pools and affects an even larger codebase: there  
 needs to be a consistent way to distinguish (by name) a constructor  
 that loads from a file, and one that creates the object "manually".

Isn't that more an API issue?
 If we take your approach and suggestion, which one should the  
 compiler assume?

 Something globalSomething;

 class Something {
  this defaultConstructor();
  this duplicate(); //or clone
  this copyGlobalSomething();
  this constructorWithDefault(int x = 100);
 }

 By signature alone... Which one? They are all legal, they are  
 uniquely named, and they are all equal candidates. The order of  
 the functions is irrelevant.

It could work identically to how D works today. A 'new()' constructor would be part of the root Object that classes derive from, and structs would have an implicit 'new()' constructor.

But new wouldn't be a constructor then, would it? It would still be based on allocating memory that's optionally different. Construction and allocation are two different steps, and for one to flow seamlessly into the other there has to be a set default constructor. Let's assume...

    class Object {
        this new() {
            //allocate
            return defaultConstructor();
        }
        this defaultConstructor() {}
    }

Now in order to make a constructor (and then destructor) you can either:

A) overload or use 'defaultConstructor', which would be publicly known
B) overload new to do allocation the same way and call a different constructor, and specifically add a destructor to make sure it follows the same lines
C) overload new to call the default allocator and then call a different constructor

Now assuming you can make a different constructor by name, you then have to be able to specify a destructor the same way for consistency.

    class CustomType {
        this MyAwesomeConstuctor();
        void MyAwesomeDestructor();
    }

Same problem: how do you tell it ahead of time without completely rewriting the rules? Leaving it as 'this' and '~this' is simple to remember and work with, and factory functions should be used to do the bulk of the work when you don't want the basic/bare minimum.
 This could work in our favor, because instead of 'new' we could use  
 'alloc' (or something like that) while still encouraging 'new'.  
 Which means structs could provide a parameter-less new() constructor:
    class Foo {
        float x;
        // no constructor
    }

    auto f = Foo.alloc();
    assert(f.x == float.nan);

    ---

    struct Point {
        float x, y;
        this new() { x = 0; y = 0; }
        this new(float x, float y) { ... }
    }

    auto a = Point.new();
    auto b = Point.new(1, 2);
    assert(a.x == 0.0f);

After reading a large chunk where Andrei spoke of the flaws of C++'s classes and inheritance, the potential problems of having stack/exact-size allocated classes compared to leaving them on the heap come to mind. This would undo all of that.

    class Bar : Foo {
        float y, z;
    }

    auto foo = Foo.alloc();
    int isSafe;
    foo = Bar.alloc(); //should implicitly convert normally. But stack?

    assert(isSafe == int.init); //or were the next variables overwritten??

In the cases where you don't overload anything the classes would be safe, in which case they don't polymorph, in which case you may as well use structs. Am I wrong on this?

Aug 27 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Monday, 27 August 2012 at 23:09:13 UTC, F i L wrote:
 foobar wrote:
 FiL's scheme looks backwards to me. One of the main drawbacks 
 of factories is the fact that they have non-standard names.
 Given a class Foo, How would I know to call a factory newFoo?

Well you'd know to call constructor functions because their name implies they construct an object: new(), load(), from(T), etc.. I _hate_ global functions like in my newFoo() example. I was simply using it as an example of where factories hide information about what's being returned in today's code. Constructor functions in my proposal must always be attached to a type, and can't be global, but their names are arbitrary (which is good for overload distinction, like I've explained before).
 class LimitedAccount : Account {
     // "regular allocation" - on GC heap
     private Account new(Person p) {
         return GC.allocate!LimitedAccount(p);
     }
     // init
     this(Person p) {...}
     ...more code...
 }

 class Bank {
     Account new(Person p, AccountType t) {
         switch(t) {
         case AccountType.LIMITED: return LimitedAccount.new(p);
         ... more cases...
         }
     }
 }

 // usage:
 Account acc = Bank.new(PoorShmoe, AccountType.LIMITED);

This is pretty much exactly what I am advocating, except I think the allocator and c-tor can be combined (with an attribute override), since the allocator is implicit half the time. I gave an example in my original post:

    class Foo {
        this new() {
            // Implicitly allocates Foo.
        }

        this new() noalloc {
            // Allocation must be explicit
            // but must return type Foo.
        }

        static auto new() {
            // Regular factory function.
            // Allocate and return at will.
        }
    }

Not at all. Your scheme requires new to return typeof(this) whereas in mine Bank.new() returns an Account instance. Also, allocation needs to be explicit in order to be useful. Most importantly, the naming scheme you suggest breaks generic code. I _DON'T_ want to choose between "new/from/load/etc.." when I create an object. I just want to create an object, period. There are whole frameworks of DI just to "fix" this problem of new and factories. Please google for "java new considered harmful" to read more about this.
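
For comparison, here is a minimal sketch of how the Bank example can be written in today's D with an ordinary static factory whose return type is the base class rather than typeof(this). The names (Bank.open, kind, the lowercase enum members) are made up for illustration and are not part of anyone's proposal.

```d
import std.stdio : writeln;

enum AccountType { regular, limited }

class Account {
    string owner;
    this(string owner) { this.owner = owner; }
    string kind() const { return "regular"; }
}

class LimitedAccount : Account {
    this(string owner) { super(owner); }
    override string kind() const { return "limited"; }
}

struct Bank {
    // Factory: callers ask for "an account" and never name the concrete class.
    static Account open(string owner, AccountType t) {
        final switch (t) {
            case AccountType.regular: return new Account(owner);
            case AccountType.limited: return new LimitedAccount(owner);
        }
    }
}

void main() {
    Account acc = Bank.open("PoorShmoe", AccountType.limited);
    writeln(acc.kind()); // the static type is Account, the dynamic type is LimitedAccount
}
```

The caller only ever sees Account, which is the property being argued for here; what the sketch cannot give you is a uniform name for "create an object" across all types.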
Aug 28 2012
prev sibling next sibling parent "David Piepgrass" <qwertie256 gmail.com> writes:
On Monday, 27 August 2012 at 20:22:47 UTC, Era Scarecrow wrote:
 On Monday, 27 August 2012 at 14:53:57 UTC, F i L wrote:
 in C#, you use 'new Type()' for both classes and structs, and 
 it works fine. In fact, it has some benefit with generic 
 programming. Plus, it's impossible to completely get away from 
 having to understand the type, even in C++/D today, because we 
 can always make factory functions:

I'm sure that in C# all structs and classes are heap allocated (it very likely takes after C++); that's the simplest way to do it. You can do that in C++ as well, other than having to declare it as a pointer first. In C++ they made structs 'classes that are public by default' by definition, I believe. Considering how C++ is set up, that makes perfect sense.

You're mistaken as FiL pointed out. "new" is simply not a heap allocation operator in C#, it is a creation operator. Structs in C# are allocated on the stack or embedded in another object (on the stack or on the heap). "new X()" creates a new value of type X, which could be a struct on the stack or a class on the heap. I like the way C# works in this regard because the way X is allocated is an implementation detail that is hidden from clients. If the type X is immutable, then I can freely change it from struct to class or vice versa without affecting clients that use X. (Mind you if X is mutable, the difference is visible to clients since x1 = x2 copies X itself, not a reference to X.) Plus as mentioned, generic code can use "new T()" without caring what kind of type T is.
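
The same visible difference exists in D. A small sketch (the type names and the create helper are invented for this example) showing that assignment copies a struct but only copies the reference to a class, and that generic code can still create either kind of T:

```d
struct ValuePoint { int x; }
class RefPoint { int x; }

// Generic creation: the caller does not care whether T is a struct or a class.
T create(T)() {
    static if (is(T == class))
        return new T();   // heap-allocated, returns a reference
    else
        return T.init;    // plain value, no allocation
}

void main() {
    auto v1 = create!ValuePoint();
    auto v2 = v1;      // copies the value
    v2.x = 5;
    assert(v1.x == 0); // the original is untouched

    auto r1 = create!RefPoint();
    auto r2 = r1;      // copies the reference only
    r2.x = 5;
    assert(r1.x == 5); // both names refer to one object
}
```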
Aug 28 2012
prev sibling next sibling parent "David Piepgrass" <qwertie256 gmail.com> writes:
 And a postblit would end up being...? The extra 'this' makes 
 it look like an obvious typo or a minor headache.

 this this(this){} //postblit?



This is not an appropriate syntax, not just because it looks silly, but because a postblit constructor is not really a constructor; it is a postprocessing function that is called after an already-constructed value is copied. So I don't think there's any fundamental need for postblit constructors to look like normal constructors.
 I'm sure this case has an easy solution. How about:

    struct Foo {
        this new() { ... } // constructor
        this() { ... } // postblit
    }

But now you're breaking consistency by not including a return type. maybe 'this this()' but that looks like a mistake or typo.

I don't see how "this this()" is any worse than "this(this)"; IMO neither name really expresses the idea "function that is called on a struct after its value is copied". But a postblit constructor doesn't work like normal constructors, so keeping the "this(this)" syntax makes sense to me even though it is not consistent with normal constructors. "this()" has the virtue of simplicity, but it's even less googlable than "this(this)".
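
To make "called after an already-constructed value is copied" concrete, here is a small sketch in current D. The Buffer struct is invented for the example; as far as I know, newer D releases also offer copy constructors as an alternative to the postblit.

```d
struct Buffer {
    int[] data;

    // Postblit: runs on the new copy after the fields were blitted over.
    this(this) {
        data = data.dup; // give the copy its own array
    }
}

void main() {
    auto a = Buffer([1, 2, 3]);
    auto b = a;        // bitwise copy first, then b's postblit runs
    b.data[0] = 99;
    assert(a.data[0] == 1);  // a kept its own array
    assert(b.data[0] == 99);
}
```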
 And for overload distinction (new vs load), which is an issue 
 beyond Memory Pools and affects an even larger codebase: 
 there needs to be a consistent way to distinguish (by name) a 
 constructor that loads from a file, and one that creates the 
 object "manually".

Isn't that more an API issue?

Sorry, I don't follow.
 If we take your approach and suggestion, which one should the 
 compiler assume?

 Something globalSomething;

 class Something {
  this defaultConstructor();
  this duplicate(); //or clone
  this copyGlobalSomething();
  this constructorWithDefault(int x = 100);
 }

 By signature alone... Which one? They are all legal, they are 
 uniquely named, and they are all equal candidates. The order of 
 the functions is irrelevant.

It could work identically to how D works today. A 'new()' constructor would be part of the root Object that classes derive from, and structs would have an implicit 'new()' constructor.

But new wouldn't be a constructor then, would it? It would still be based on allocating memory that's optionally different. Construction and allocation are two different steps, and for one to flow seamlessly into the other there has to be a set default constructor. Let's assume...

    class Object {
        this new() {
            //allocate
            return defaultConstructor();
        }
        this defaultConstructor() {}
    }

Now in order to make a constructor (and then destructor) you can either:

A) overload or use 'defaultConstructor', which would be publicly known
B) overload new to do allocation the same way and call a different constructor, and specifically add a destructor to make sure it follows the same lines
C) overload new to call the default allocator and then call a different constructor

Now assuming you can make a different constructor by name, you then have to be able to specify a destructor the same way for consistency.

    class CustomType {
        this MyAwesomeConstuctor();
        void MyAwesomeDestructor();
    }

Same problem: how do you tell it ahead of time without completely rewriting the rules? Leaving it as 'this' and '~this' is simple to remember and work with, and factory functions should be used to do the bulk of the work when you don't want the basic/bare minimum.

Sorry, I don't understand what you're getting at. I suspect that you're interpreting his proposal in a completely different way than I am, and then trying to expose the flaws in your interpretation of the proposal, and then I can't follow it because my interpretation doesn't have those flaws :)
Aug 28 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Tuesday, 28 August 2012 at 21:01:52 UTC, David Piepgrass wrote:
 this this(this){} //postblit?



This is not an appropriate syntax, not just because it looks silly, but because a postblit constructor is not really a constructor; it is a postprocessing function that is called after an already-constructed value is copied. So I don't think there's any fundamental need for postblit constructors to look like normal constructors.

 I'm sure this case has an easy solution. How about:

   struct Foo {
       this new() { ... } // constructor
       this() { ... } // postblit
   }

But now you're breaking consistency by not including a return type. maybe 'this this()' but that looks like a mistake or typo.


 I don't see how "this this()" is any worse than "this(this)"; 
 IMO neither name really expresses the idea "function that is 
 called on a struct after its value is copied". But a postblit 
 constructor doesn't work like normal constructors, so keeping 
 the "this(this)" syntax makes sense to me even though it is 
 not consistent with normal constructors. "this()" has the 
 virtue of simplicity, but it's even less googlable than 
 "this(this)".


As long as it's an established standard it won't matter. With his insistence that constructors have a return type, then the postblit should too. Although keep in mind most likely you won't 'return' anything in constructors and it's assumed that 'return this;' is entered at the very last line (although that becomes a special case, and that makes it okay right?).
 And for overload distinction (new vs load), which is an issue 
 beyond Memory Pools and affects an even larger codebase: 
 there needs to be a consistent way to distinguish (by name) a 
 constructor that loads from a file, and one that creates the 
 object "manually".

Isn't that more an API issue?

Sorry, I don't follow.

Having a named constructor compared to a default one. If there's only one type of constructor (although different ways to call it) then there shouldn't be an issue; not really. But the problem came down to the 'is this a struct or a class' problem. You could use Hungarian notation/prefixes; then you're forced to know each time you use it (unless you use auto everywhere).

    struct MyStruct

becomes

    struct Struct_MyStruct
    or
    struct S_MyStruct
 Same problem, how do you tell it ahead of time without 
 completely rewriting the rules? leaving it as 'this' and 
 '~this' are simple to remember and work with, and factory 
 functions should be used to do a bulk of work when you don't 
 want the basic/bare minimum.

Sorry, I don't understand what you're getting at. I suspect that you're interpreting his proposal in a completely different way than I am, and then trying to expose the flaws in your interpretation of the proposal, and then I can't follow it because my interpretation doesn't have those flaws :)

But our own opinions never have flaws :)

Being able to name the constructor whatever you want, although it may sound nice, is likely more of a hassle than it's worth. Like namespace issues.

I agree being able to override new could be useful for types of allocation, but the way he presented it suggested new was the constructor & allocator, meaning you'd have to have in-depth detail of both steps in order to override it later. Abstraction is then thrown away and the black box method becomes a white/clear box method; encapsulation goes out the window.

  Make everything as simple as possible, but not simpler.
    -- Albert Einstein
Aug 28 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
Era Scarecrow wrote:
 suggested new was the constructor & allocator, meaning you'd 
 have to have in depth detail of both steps in order to override 
 it later. Abstraction then is thrown away and the black box 
 method becomes a white/clear box method, encapsulation goes out 
 the window.

You raise a valid issue here. However, I think there's an easy solution (similar to what foobar was suggesting) which separates the allocator from the constructor but still allows for easy defaults and consistent syntax at object instantiation. Here's my idea:

    class A
    {
        private void* myAlloc() {
            return GC.alloc(A);
        }

        this new1() { ... } // automatic allocator
        this new2() alloc(myAlloc) { ... }
    }

    class B : A
    {
        override this new1() {
            super.new1();
            ...
        }

        override this new2() {
            // for 'new2' we could either:
            // 1. Default to automatic.
            //    That's probably bad because you don't know
            //    when you're potentially overriding a
            //    specialized allocator in the super class.
            // 2. Compiler error: Must define an allocator
            //    with alloc(...) for this constructor
            //    because super.new2 overrides the automatic one.
        }

        // So we'd probably want to require:
        private void* b_alloc() { ... }
        override this new2() alloc(b_alloc) { ... }
    }
Aug 29 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
Actually, now that I think about it, there's a potentially 
better way. Simply have static analysis do the work for us:

class A
{
   int a;
   this new() {
     // if 'this = ...' is found before 'this.whatever' then
     // the automatic allocation is overridden. So we have no need
     // for any kind of noalloc/alloc() distinction.

     // More importantly, because allocation is type specific, we
     // strip this out when calling it from a derived class (see B)

     this = GC.alloc(A); // this stripped when called from B.new()
     this.a = ...;
   }
}

class B : A
{
   int b;
   this new() {
     super.new(); // use A.new() except for allocation
     this.b = ...;
   }
}


Basically what's happening is two functions are built out for 
each class constructor which defines a 'this = ...': one with the 
allocation stuff, and one without. When a derived class calls the 
superclass's constructor, it's calling the one built without the 
allocation stuff.

There could also be some kind of cool tricks involved. For 
instance, if you use 'typeof(this)' with 'GC.alloc()' (instead of 
'A'), then the super.new() constructor could keep the allocation 
stuff and use its allocation logic, but still allocate the 
size appropriate for type 'B' when it's called:

class A
{
   this new() {
     if (condition) {
       this = GC.alloc(typeof(this));
     }
     else {
       this = malloc(typeof(this));
     }
     ...
   }
}

class B : A
{
   this new() {
     super.new(); // same allocation rules as A
     ...
   }
}

However, that last part's just a side thought, and I'm not sure 
if it would really work, or what the implementation costs would 
be.
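
For reference, the split between "allocate" and "construct" can already be approximated in library code today. Below is a minimal sketch, assuming malloc-backed storage and using std.conv.emplace to run the ordinary constructor chain on it. mallocNew and mallocDelete are hypothetical helper names, not Phobos functions, and the sketch ignores the case where the object holds GC-managed references (the malloc'd block would then need to be registered with the GC).

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

class A {
    int a;
    this(int a) { this.a = a; }
}

class B : A {
    int b;
    this(int a, int b) { super(a); this.b = b; } // reuses A's construction, not A's allocation
}

// Hypothetical helper: allocate with malloc, then run the ordinary constructor.
T mallocNew(T, Args...)(Args args) if (is(T == class)) {
    enum size = __traits(classInstanceSize, T); // size of the most-derived type
    void* mem = malloc(size);
    assert(mem !is null, "out of memory");
    return emplace!T(mem[0 .. size], args);
}

// Hypothetical helper: run the destructor, then release the malloc'd block.
void mallocDelete(T)(T obj) if (is(T == class)) {
    destroy(obj);
    free(cast(void*) obj);
}

void main() {
    B obj = mallocNew!B(1, 2);
    assert(obj.a == 1 && obj.b == 2);
    mallocDelete(obj);
}
```

Here the allocation size always comes from the type named at the call site, which is roughly the effect the 'typeof(this)' trick above is after.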
Aug 29 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Wednesday, 29 August 2012 at 15:07:42 UTC, F i L wrote:
 Actually, now that I think about it, there's a potentially 
 better way. Simply have static analysis do the work for us:

 class A
 {
   int a;
   this new() {
     // if 'this = ...' is found before 'this.whatever' then
     // the automatic allocation is overridden. So we have no need
     // for any kind of noalloc/alloc() distinction.

     // More importantly, because allocation is type specific, we
     // strip this out when calling it from a derived class (see B)

     this = GC.alloc(A); // this stripped when called from B.new()
     this.a = ...;
   }
 }

 class B : A
 {
   int b;
   this new() {
     super.new(); // use A.new() except for allocation
     this.b = ...;
   }
 }


 Basically what's happening is two functions are built out for 
 each class constructor which defines a 'this = ...': one with 
 the allocation stuff, and one without. When a derived class 
 calls the superclass's constructor, it's calling the one built 
 without the allocation stuff.

 There could also be some kind of cool tricks involved. For 
 instance, if you use 'typeof(this)' with 'GC.alloc()' (instead 
 of 'A'), then the super.new() constructor could keep the 
 allocation stuff and use its allocation logic, but still 
 allocate the size appropriate for type 'B' when it's called:

 class A
 {
   this new() {
     if (condition) {
       this = GC.alloc(typeof(this));
     }
     else {
       this = malloc(typeof(this));
     }
     ...
   }
 }

 class B : A
 {
   this new() {
     super.new(); // same allocation rules as A
     ...
   }
 }

 However, that last part's just a side thought, and I'm not sure 
 if it would really work, or what the implementation costs would 
 be.

This looks way too complicated and also quite limited. I still haven't got an answer regarding why typeof(this) should be imposed as the return type of new(). I even provided a real-world use case where this is undesired!

As I said before, please read: http://www.drdobbs.com/javas-new-considered-harmful/184405016#

This has been implemented in languages since the '80s, e.g. Smalltalk.
Aug 29 2012
prev sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 29 August 2012 at 15:07:42 UTC, F i L wrote:
 Actually, now that I think about it, there's a potentially 
 better way. Simply have static analysis do the work for us:

 class A
 {
   int a;
   this new() {
     // if 'this = ...' is found before 'this.whatever' then
     // the automatic allocation is overridden. So we have no need
     // for any kind of noalloc/alloc() distinction.

     // More importantly, because allocation is type specific, we
     // strip this out when calling it from a derived class (see B)

     this = GC.alloc(A); // this stripped when called from B.new()
     this.a = ...;
   }
 }

 class B : A
 {
   int b;
   this new() {
     super.new(); // use A.new() except for allocation
     this.b = ...;
   }
 }


 Basically what's happening is two functions are built out for 
 each class constructor which defines a 'this = ...': one with 
 the allocation stuff, and one without. When a derived class 
 calls the superclass's constructor, it's calling the one built 
 without the allocation stuff.

 There could also be some kind of cool tricks involved. For 
 instance, if you use 'typeof(this)' with 'GC.alloc()' (instead 
 of 'A'), then the super.new() constructor could keep the 
 allocation stuff and use its allocation logic, but still 
 allocate the size appropriate for type 'B' when it's called:

 class A
 {
   this new() {
     if (condition) {
       this = GC.alloc(typeof(this));
     }
     else {
       this = malloc(typeof(this));
     }
     ...
   }
 }

 class B : A
 {
   this new() {
     super.new(); // same allocation rules as A
     ...
   }
 }

 However, that last part's just a side thought, and I'm not sure 
 if it would really work, or what the implementation costs would 
 be.

By this form of definition that's all suggested, you'd be mixing up whether something was heap or non-heap allocated. I'm not talking about classes A & B, I'm talking about the things they contain. Assume class B was defined like this instead.

    class B : A {
        C something;

        this new() {
            super.new(); // same allocation rules as A
            ...
            something = new C(); //and however it's made
        }
    }

Now you have the following:

1) Sometimes A/B is heap allocated and sometimes not
2) Class C may or may not be heap allocated; we don't know (it's an implementation detail)

If A/B happens to be stack allocated, then when it leaves the frame (or is abandoned), no harm done (C is abandoned and will be picked up by the GC later safely).

Let's reverse it so C is the outer class, and let's assume A is defined to use something like... alloca (stack). Now you have:

    class C {
        B mysteryNew;

        this new() {
            mysteryNew = new B();
        }
    }

Oops! Now on leaving new (or the constructor, or whatever), mysteryNew is an invalid object! So if there's another new option you can decide 'maybe' which one you want to use.

    int currentCounter;
    B[10] globalReservedB;

    class C {
        B mysteryNew;

        this new1() alloc {
            if (currentCounter < globalReservedB.length) {
                //because it's faster? And we all know faster is better!
                globalReservedB[currentCounter] = new1 B();
                mysteryNew = globalReservedB[currentCounter++];
            } else
                assert(0);
        }
    }

Whew! Wait! No!!!! globalReservedB is still just an array of references and not preallocated space, meaning you'd have to do fancy low-level magic to actually store stuff there. mysteryNew now points to a global reference holder that holds an invalid pointer.

Say the authors of zlib make a D class that does compression at a low level because D is better than C. By default they used the standard new. LATER they decide that zlib compression should only happen on the stack because you only need to compress very, very small buffers (for something like chat text where you only have maybe 4k to worry about), so they override new with their own that uses alloca. They don't want to change it to a struct because it's already a class.

Now what?? There are no guarantees anymore at all! If you update a library you'd need to read every implementation detail to make sure the updates don't break current code, and maybe add checks everywhere. What a headache!

    interface IZlib {
        void compress(string input);
        string decompress(string input);
        ubyte[] flush();
        void empty();
    }

    //intended use.
    ubyte[] compressText(string text) {
        Zlib zText = new Zlib();
        zText.compress(text);
        return zText.flush();
    }

    //our fancy function wants to do multiple reads/writes.
    Zlib compressText(Zlib zText, string text) {
        if (zText is null)
            zText = new Zlib();
        zText.compress(text);
        return zText;
    }

Assuming it allocates on the heap, where we assumed it would always go, the function would probably work fine. If Zlib changed to alloca because it should never be used otherwise (and they even make an explicit note of it), it could break previously compiled code.

    class Zlib : IZlib { ... } //could contain mystery allocator

    IZlib ztext = compressText(null, "Hello world");

    //could crash, if it was pre-compiled code to call a library
    //this is now unavoidable and was shipped to millions of customers
    compressText(ztext, " Hello Hello!");

Now let's assume you aren't allocating on the stack (because they think the stack is a stupid place to store stuff) but we don't want to use the GC. Let's assume Zlib was used with its own new that uses malloc so it's compatible with their C interface(s).

    class Something {
        Zlib zText;
        void clear() { zText = null; }
    }

    Something s = new Something();
    s.zText = new Zlib();

So far so good. Later..

    //s isn't needed anymore
    //but we don't know if s was still used by something else
    //no destructor is called on zText, but we assume the GC
    //can pick up the abandoned object and cleanly handle it later.
    s.clear();
    s = null;

Now zText is abandoned, and its destructor will never be called since it wasn't registered with the GC. If it is registered, then the GC MIGHT call the destructor, or it may only search the area and manage to free the ubyte[] buffer that it contained, still leaving you with a floating leaked block of memory (no matter how small).

(May be ranting, feel free to ignore me from here on)

Let's assume you have space limitations and you have to use space efficiently. So we decide to do something Apple/Mac people did and use two-level management for memory (or sort of, pointerless pointers, I read about this so...)

    struct sizeBlock {
        void* ptr;
        int size;
    }

    void* rawMemory;
    sizeBlock[1_024] blockRef; //1k item limit! Maybe we have 64k only, embedded devices

    int allocate(int size); //suspiciously similar to malloc somehow

    //simple, but more aptly used for arrays like strings
    void deallocate(int index) {
        blockRef[index].size = 0;
    }

    Object isObject(int index) {
        return cast(Object) *blockRef[index].ptr;
    }

All allocators now need to register their pointers in blockRef. Should memory need to be shifted, only blockRef is updated and we can save as many bytes as we need.

    class Managed {
        int new() alloc {
            int block = allocate(Managed.sizeof); //or however the size is calculated
            this = isObject(block);
            //initialize
            return block;
        }
    }

Now if allocate cannot find a block of appropriate size, it can just shift everything over until the end has room, and all of the classes and references (as long as they honor blockRef) are safe. Even if rawMemory gets resized/moved later it should be able to handle it.

    int someObject = new Managed(); //so far so good right?
    Object o = isObject(someObject); //actively work on/with it

    //safe until we decide to allocate something.
    //Then we need to refresh our instance of o.
    int somethingElse = new Managed();
    o = isObject(someObject); //may have moved so we refresh o

    //destructor never called! and memory in jeopardy! Worse is any contents
    //are likely abandoned and a larger memory leak issue has appeared!
    deallocate(someObject);
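
For what it's worth, today's D already keeps the stack decision at the use site rather than inside the class, which sidesteps part of the problem above. A minimal sketch using std.typecons.scoped follows; the Zlib class here is only a stand-in named after the example, not a real binding, and the static counter exists just to observe the destructor.

```d
import std.typecons : scoped;

class Zlib {
    static int destroyed;   // counts destructor runs, for the demo only
    ubyte[] buffer;
    this() { buffer = new ubyte[](16); }
    ~this() { ++destroyed; }
}

void useOnStack() {
    auto z = scoped!Zlib();   // the object's storage lives in this stack frame
    z.buffer[0] = 1;
    // z's destructor runs here, deterministically, when the scope ends
}

void main() {
    useOnStack();
    assert(Zlib.destroyed == 1);
}
```

The caller who writes scoped!Zlib() is the one who promises not to let the reference escape the frame; a library update cannot silently change where existing callers' objects live.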
Aug 29 2012