digitalmars.D - Re: Do we really need const?

reply Robert Fraser <fraserofthenight gmail.com> writes:
Lionello Lunesu Wrote:

 void bar() throw;
 void foo() { bar(); }//ERROR, bar can throw
 void foo2() throw { bar(); } //OK
 void foo3() throw { bar() throw; } //OK ("cast")

In other words, all exceptions are checked exceptions...? This would especially be a problem since all allocations can throw exceptions unless caught. This sounds like it would introduce MORE problems than Java's checked exceptions. I think the "safe" case is typically the marked one (i.e. Walter's "nothrow" he mentioned at the conference, const, pure, etc.). The whole red code-green code idea isn't bad, but I think that should generally be relegated to documentation instead of making the compiler check it.

There seems to be a general push (among many computer scientists) to enforce stricter rules, yet some of the most successful languages in the past few years have been dynamically/duck typed. The price is that the onus of using the code correctly falls on the programmer and not the compiler/interpreter. The reward is much greater productivity, since the programmer doesn't need to be concerned about such things as const-correctness, checked exceptions or interface specification; she just works with the knowledge the compiler trusts her enough to do the right thing. I'm not suggesting D go dynamically-typed (doesn't work so well in a compiled language), but restrictions for the sake of restrictions should be looked upon with great care. If const or pure can help optimization and make parallel programming easier, I'm all for it, but if it'll just sit there and make my life harder then it's not worth it.

I'd like to pose a question to those who have used C++'s const: do you feel that it has saved more time by preventing bugs than it has taken by being forced to type it all the time, and the time spent when it has to be removed all throughout a hierarchy, as inevitably has to happen at least once? That is, const-correctness is a time investment, so do you feel that investment has paid off for you?

I can say that working in Java, I have _never_ felt that if I pass a class reference that was "constant" in nature to a method written by somebody else or even to entirely different subsystem, that the invariantness contract, specified only in the documentation, would be broken. Compiler checks in that case end up being as useless and annoying as checked exceptions.
Sep 17 2007
next sibling parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Robert Fraser" <fraserofthenight gmail.com> wrote in message 
news:fcmngv$qhq$1 digitalmars.com...

 I'm not suggesting D go dynamically-typed (doesn't work so well in a 
 compiled language),

Bit of a sidetrack here. Dynamic typing, in the sense that variables don't have types, certainly doesn't go well in a compiled language, but type inference, something that ends up looking very similar to dynamic typing, _does_, can, and has been implemented well in compiled languages. ML and Haskell are examples. Nemerle as well, because even though it compiles to a VM it's still a statically typed VM. It's almost like IFTI or variable declaration type inference (auto x = 5), but extended to virtually _everything_. One language I've seen that I really liked was Bla. It uses Haskell-style type inference, but it still allows you to explicitly type things if you want. In this way you can do away with typing variables/params in a vast majority of the cases, and in the instances when you _want_ something to be typed, or when the type inference system can't figure it out on its own, you can type it manually. Of course something like this would probably be far too much of a departure for a language like D.
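
For reference, the two forms of inference D already has, as mentioned above (declaration inference with auto, and IFTI), look roughly like this. A small sketch only; twice is a made-up function for illustration.

```d
import std.stdio;

// IFTI: T is deduced from the argument; the caller never spells it out.
T twice(T)(T v)
{
    return v + v;
}

void main()
{
    auto x = 5;           // declaration inference: x is an int
    auto y = twice(2.5);  // IFTI deduces T = double

    // The types are fixed at compile time, unlike in a dynamically typed language.
    static assert(is(typeof(x) == int));
    static assert(is(typeof(y) == double));

    writeln(x, " ", y);
}
```

Everything here is still statically typed: assigning a string to x afterwards would be a compile-time error.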
Sep 17 2007
parent Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:
Jarrett Billingsley wrote:

 One language I've seen that I really liked was Bla.  It uses Haskell-style
 type inference, but it still allows you to explicitly type things if you
 want.  In this way you can do away with typing variables/params in a vast
 majority of the cases, and in the instances when you _want_ something to
 be typed, or when the type inference system can't figure it out on its
 own, you can type it manually.
 
 Of course something like this would probably be far too much of a
 departure for a language like D.

Of course type systems aren't one-dimensional - there can be several kinds of implicit static typing too. I had the impression that the lack of type inference in some places is only temporary. I agree that some advanced type techniques involving a Turing-complete compile-time language are not exactly what the users are expecting. Heh, they might be in the future, but let's not tell anyone - D still has a reputation for being a "practical language" for "real world tasks".
Sep 17 2007
prev sibling next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Robert Fraser wrote:
 Lionello Lunesu Wrote:
 
 There seems to be a general push (among many computer scientists) to
 enforce stricter rules, yet some of the most successful languages in
 the past few years have been dynamically/duck typed. 

Note however that as these languages mature, people are gradually trying to put some notion of interface-checking back in. I only really know about Python, but there we have PyProtocols (http://peak.telecommunity.com/PyProtocols.html) and zope.interface (http://www.zope.org/Products/ZopeInterface) that both aim to put some non-duck type checking features back into the language. Because, surprise, when you're scaling up to huge systems it becomes difficult to figure out exactly what kind of duck you're supposed to be passing.
 I'd like to pose a question to those who have used C++'s const: do
 you feel that it has saved more time by preventing bugs than it has
 taken by being forced to type it all the time, and the time spent
 when it has to be removed all throughout a hierarchy, as inevitably
 has to happen at least once? That is, const-correctness is a time
 investment, so do you feel that investment has paid off for you?

That's kind of why I started this thread. I started using C++ in 1995 and it has been my main language since about 1998. I'm very used to C++'s const. But I do vaguely recall finding it terribly annoying in 1995 when I started moving over from C. The C++ people kept telling me that const correctness was like eating your peas. "You may not like it, but it's good for you." And now I like const, just like I like eating peas now too. It's not that I prefer the taste of peas to chocolate ice cream, but if I don't eat my peas I get this feeling like my health may fall apart at any minute. It's the same feeling I get from not using const. That said, I can program in Python without const and not flinch at all. Because const correctness is just not a part of Python. There are no peas in Python-land so I don't feel like I have to eat them. It's ice cream for every meal! Of course in Python-land "slow" has also been declared the new "fast", so I also don't flinch about making heap allocations willy-nilly.
 I can say that working in Java, I have _never_ felt that if I pass a
 class reference that was "constant" in nature to a method written by
 somebody else or even to entirely different subsystem, that the
 invariantness contract, specified only in the documentation, would be
 broken. Compiler checks in that case end up being as useless and
 annoying as checked exceptions.

Here's the one case that makes me want const: Efficient vector math. Say you're writing a routine to compute whether or not a point is inside the circumcircle defined by three others. Here's how I wrote it in D:

// Return if point d is in the circumcircle defined by points a,b,c.
// The points a,b,c should be given in counter-clockwise order.
bool in_circle(Point a, Point b, Point c, ref Point d)
{
    a -= d;
    b -= d;
    c -= d;
    assert(is_CCW(a,b,c), "input circle points are not in ccw order");
    Scalar a2 = a.sqrnorm();
    Scalar b2 = b.sqrnorm();
    Scalar c2 = c.sqrnorm();
    Scalar det = a2*(b.x*c.y - b.y*c.x)
               + b2*(a.y*c.x - a.x*c.y)
               + c2*(a.x*b.y - a.y*b.x);
    return det >= 0;
}

Really I would like to make 'ref Point d' there be const. I'm passing it by reference because it's slightly more efficient and in_circle can get called a *lot*. I go ahead and pass a,b,c by value because I'm going to have to push a copy on the stack to modify them anyway. If I weren't modifying them I'd pass them by reference, too. But those are implementation details that the caller of in_circle doesn't really care about. So it's odd they should be in the interface.

Ref/const ref is not a very direct solution for this kind of need. What I'd really like to be able to do is declare the function to be generically non-mutating (like 'in'), but at the same time a) allow the implementation to modify its arguments if it wants to and b) make the call using the most efficient mechanism possible. I don't really want to have to guess whether the argument is big enough to justify pass by reference or not. It probably depends a lot on the architecture and number of accesses to the variable actually made in the end.

Just using plain 'ref' would maybe be an acceptable solution, except you can't pass literals by ref. So if you make a min template like "T min(T)(ref T a, ref T b) {...};" min(0,x) won't compile. That's just too useful to disallow. And then if T is a class type then it's already a reference so there's no need to take a reference to the reference. You can work around these things with lots of static if(is(T==class)) type things, but it gets ugly fast.

It would be great if something like T min(T)(in T a, in T b) {...} "just worked". I.e. prevented visible modifications to a,b but didn't do unnecessary, inefficient copying of big structures, and didn't take unnecessary references of arguments that are already references, and allowed calling with literals. Give me a way to do that and I'll be happy. I'd even be willing to ditch the 'prevent modifications' bit as long as I can have efficiency and the ability to pass constants.

--bb
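
A minimal sketch of how close current D compilers can get to the min wish, assuming 'auto ref' template parameters are available. The names smaller, passedByRef and Big are made up for illustration; this is not the library's min, and it only covers types without mutable indirections (the result is copied out by value).

```d
struct Big
{
    int key;
    int[100] payload;

    // Order by key only. Marked const so it can be called on const arguments.
    int opCmp(ref const Big rhs) const
    {
        return key < rhs.key ? -1 : (key > rhs.key ? 1 : 0);
    }
}

// 'auto ref' takes lvalues by reference and rvalues (such as literals) by value.
// 'const' stops the body from modifying either argument.
T smaller(T)(auto ref const T a, auto ref const T b)
{
    return b < a ? b : a;
}

// Reports which passing mode was picked for a given call.
bool passedByRef(T)(auto ref const T a)
{
    return __traits(isRef, a);
}

void main()
{
    int x = 3;
    assert(smaller(0, x) == 0);   // mixing a literal and a variable compiles
    assert(passedByRef(x));       // lvalue: taken by reference
    assert(!passedByRef(0));      // literal: taken by value

    Big p, q;
    p.key = 2;
    q.key = 1;
    assert(smaller(p, q).key == 1);  // p and q go in by reference; the result is a copy
}
```

This does not address the class case: a class lvalue would still be taken as a reference to a reference.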
Sep 17 2007
prev sibling parent Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
Showing my true colours as a D noob here. I thought that with D, as in Java, any object argument is passed by reference by default, but that unlike Java the D compiler has the option of passing by value if it decides it would be more efficient. A const declaration here would help that. I believe it should be down to the compiler to decide when to do this because it makes no visible difference from the programmer's perspective. Constness is part of the contract, but perhaps allowing copy-on-write where you want to modify arguments is an option. It may be a dangerous one because it allows you to shoot yourself in the foot and modify a variable thinking you are modifying the real one when actually you are only modifying a local copy. The inverse of the normal problem.

Bruce.
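
To make the "local copy" hazard concrete: as far as the language rules go, D class objects are always handled through references and structs are copied unless ref is written. The compiler does not switch between the two on its own. A tiny sketch with made-up types PointS and PointC:

```d
struct PointS { int x; }   // value type
class  PointC { int x; }   // reference type

void bump(PointS p) { p.x += 1; }         // modifies a local copy only
void bump(PointC p) { p.x += 1; }         // modifies the caller's object
void bumpRef(ref PointS p) { p.x += 1; }  // explicit ref: modifies the caller's struct

void main()
{
    PointS s;
    auto c = new PointC;

    bump(s);
    assert(s.x == 0);   // the "shoot yourself in the foot" case: nothing changed

    bump(c);
    assert(c.x == 1);

    bumpRef(s);
    assert(s.x == 1);
}
```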
Sep 18 2007
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 I'd like to pose a question to those who have used C++'s const: do
 you feel that it has saved more time by preventing bugs than it has
 taken by being forced to type it all the time, and the time spent
 when it has to be removed all throughout a hierarchy, as inevitably
 has to happen at least once? That is, const-correctness is a time
 investment, so do you feel that investment has paid off for you?

At the upcoming http://www.astoriaseminar.com, I think I'll do some asking around on this issue. There'll be a lot of C++ diehards there.
 I can say that working in Java, I have _never_ felt that if I pass a
 class reference that was "constant" in nature to a method written by
 somebody else or even to entirely different subsystem, that the
 invariantness contract, specified only in the documentation, would be
 broken.

Some Java professionals have reported problems with not being able to specify const classes. These people work on Java programs with large teams.
 Compiler checks in that case end up being as useless and
 annoying as checked exceptions.

I don't think that's the same issue. The problem with checked exceptions is that suppose you have functions A, B, C, where A calls B, and B calls C. Now, you throw a new exception in C, and catch it in A. Arggh, you've got to change B.
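
The A, B, C situation can be sketched in D with the "marked safe case" approach (nothrow) instead of checked exceptions. This is only an illustration; ParseError and the function names a, b, c are invented for the example.

```d
class ParseError : Exception
{
    this(string msg) { super(msg); }
}

void c() { throw new ParseError("bad input"); }  // a new exception is introduced here
void b() { c(); }                                // untouched: there is no throws clause to update

// a() is the one that makes a promise, and the compiler checks it.
bool a() nothrow
{
    try
    {
        b();
        return true;
    }
    catch (Exception e)
    {
        return false;
    }
}

void main()
{
    assert(!a());

    // Calling b() from nothrow code without a try/catch is rejected at compile time.
    static assert(!__traits(compiles, () nothrow { b(); }));
}
```

With Java-style checked exceptions, b() would need a throws clause even though the exception means nothing to it.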
Sep 17 2007
next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Walter Bright Wrote:

 Robert Fraser wrote:
 I'd like to pose a question to those who have used C++'s const: do
 you feel that it has saved more time by preventing bugs than it has
 taken by being forced to type it all the time, and the time spent
 when it has to be removed all throughout a hierarchy, as inevitably
 has to happen at least once? That is, const-correctness is a time
 investment, so do you feel that investment has paid off for you?

At the upcoming http://www.astoriaseminar.com, I think I'll do some asking around on this issue. There'll be a lot of C++ diehards there.
 I can say that working in Java, I have _never_ felt that if I pass a
 class reference that was "constant" in nature to a method written by
 somebody else or even to entirely different subsystem, that the
 invariantness contract, specified only in the documentation, would be
 broken.

Some Java professionals have reported problems with not being able to specify const classes. These people work on Java programs with large teams.

I work (well, intern...) on a Java project with a team of ~35 developers. Can't say I've ever really wanted to constify something, although needing to synchronize everything when in doubt does get annoying (and as I keep telling my boss hurts performance dramatically, but they're big fans of the "throw more hardware at it" form of performance refactoring) so knowing something is INVARIANT might occasionally be helpful. But as a purely interface thing, I think docs serve better.
 
 Compiler checks in that case end up being as useless and
 annoying as checked exceptions.

I don't think that's the same issue. The problem with checked exceptions is that suppose you have functions A, B, C, where A calls B, and B calls C. Now, you throw a new exception in C, and catch it in A. Arggh, you've got to change B.

Same with const, but top-down. Make something const in A, you have to make it const in B and C. If you argue that B and C weren't changing it anyway, then what if C's implementation changes in such a way that it does make a change? Now, arrgh, got to go back and change it in B and A. It's not nearly as bad as checked exceptions, but const for const's sake is excessive and useless. Hopefully, though, IDEs will be able to do it all for you (Descent will get there someday!).
Sep 17 2007
parent Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 But as a purely interface thing, I
 think docs serve better.

The problem with docs is they are invariably wrong, out of date, ambiguous, or missing. Docs are also unreadable by analysis tools, and can't be relied upon by code auditors.
 Same with const, but top-down. Make something const in A, you have to
 make it const in B and C.

Only if B and C were done poorly to begin with, and didn't already say they didn't change the reference.
 If you argue that if B and C weren't
 changing it, then what if C's implementation changed in such a way as
 it does make a change. Now, arrgh, got to go back and change it in B
 and A.

But that is a *relevant* change. The exception passthrough is not relevant, as it means nothing to B.
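
A sketch of both sides of this exchange, assuming D2-style const parameters. Settings and the function names are hypothetical.

```d
struct Settings { int level; }

// A read-only chain: each function says up front that it does not modify s.
int readC(ref const Settings s) { return s.level; }
int readB(ref const Settings s) { return readC(s) + 1; }
int readA(ref const Settings s) { return readB(s) + 1; }

// Suppose C's implementation later needs to write. Its signature loses const...
void writeC(ref Settings s) { s.level = 0; }

// ...and a caller that still promises const can no longer forward to it.
// That is the change that ripples up to B and A.
int stillConstB(ref const Settings s)
{
    static assert(!__traits(compiles, writeC(s)));
    return s.level;
}

void main()
{
    Settings s = Settings(5);
    assert(readA(s) == 7);        // mutable data is accepted by const parameters

    const Settings frozen = Settings(1);
    assert(readA(frozen) == 3);
    static assert(!__traits(compiles, writeC(frozen)));
}
```

The ripple is real, but it reports something B's callers care about (B may now modify its argument), which is the distinction being drawn from an exception merely passing through.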
Sep 18 2007
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 9/18/07, Walter Bright <newshound1 digitalmars.com> wrote:
 Robert Fraser wrote:
 I'd like to pose a question to those who have used C++'s const: do
 you feel that it has saved more time by preventing bugs than it has
 taken by being forced to type it all the time, and the time spent
 when it has to be removed all throughout a hierarchy, as inevitably
 has to happen at least once? That is, const-correctness is a time
 investment, so do you feel that investment has paid off for you?

At the upcoming http://www.astoriaseminar.com, I think I'll do some asking around on this issue. There'll be a lot of C++ diehards there.

The issues in C++ aren't necessarily the same as the issues in D, however. Perhaps the most significant thing is that in C++, all classes are passed by value by default. That means that function parameters are essentially const by default, because the function gets its own copy of the object rather than referencing the original. To pass by pointer or reference, you must explicitly code it to do that, and only then does const start to creep in. This can lead to some very awkward coding practices, e.g.

void f(std::string const & s);

...which basically does exactly the same job as

void f(std::string s)

Obviously you don't need "const" in the latter case because you're passing by value, but as soon as you start passing by reference (for efficiency - it avoids unnecessary copy constructor and destructor calls) you suddenly start to need const.

In D, things are a bit different, because (as in Java) classes are passed by reference, and that means that the const keyword is going to be needed a lot more.

An alternative approach might be to have reference types passed to functions as const by default, requiring the function author to explicitly state (by means of the ref keyword) that the object is mutable. This would mean that classes would then have exactly the same semantics as structs. e.g.

struct S;
class C;

void f(S s);     /* f gets a copy of s, so cannot modify the original */
void f(ref S s); /* s passed by reference, so f can modify it */
void f(C c);     /* c passed by const reference, so f cannot modify it */
void f(ref C c); /* c passed by reference-to-mutable, so f can modify it */

Of course, this brings us back to the head/tail const distinction. In the above examples, we are only concerned with the constness of the object's members. In the fourth example, we arrive at a situation which /can never happen in C++/, because in C++, references are /always/ head-const. That is:

void f(C & c)
{
    c = new C(); /* ERROR - c is a reference */
}

will not compile, because even though c was not declared as const, it is nonetheless head-const /because it is a reference/.

This leads me to my second thought (I have more...), which is the notion that all function parameters should be head-const, not just by default, but absolutely. This would support all of the preceding argument, but it would also mean that, for example

void f(int n)
{
    ++n; /* Error */
}

would no longer compile. That's not the end of the world, because n is local anyway. The only change the programmer would need to make to their code is to make a local copy of n, like this:

void f(int n0)
{
    int n = n0; /* local copy - may modify */
    ++n; /* OK */
}

which brings me to my third and final observation, which is that this scheme needs one final "fix" before it becomes usable, because, as I've described it above, structs would be passed by value, and yet the function would not be able to modify them. And obviously that's bad.

So here's the final trick to make it all hunky dory. For value types, such as structs or ints, we (that is, the compiler), divide them into two categories: Category A consists of all primitive types, and all structs which are less than some threshold size (say, 16 bytes). Category B consists of all remaining structs. In summary, category = (T.sizeof < 16) ? "A" : "B". Category A objects are passed by value. Category B objects are passed by reference.

Some examples would help explain:

---------------------------
struct SmallStruct
{
    int x;
    int y;
}

SmallStruct s;
f(s);

void f(SmallStruct s) /* s is passed by value and is head-const */
{
    s.x = 3; /* Error - s is head-const */
    SmallStruct s2 = s;
    s2.x = 3; /* OK */
}

void g(ref SmallStruct s) /* s is passed by reference and is head-const */
{
    s.x = 3; /* OK */
}
---------------------------
struct BigStruct
{
    int x;
    int[100] y;
}

BigStruct s;
f(s);

void f(BigStruct s) /* s is passed by reference and is head-const and tail-const */
{
    s.x = 3; /* Error - s is tail-const */
    BigStruct s2 = s;
    s2.x = 3; /* OK */
}

void g(ref BigStruct s) /* s is passed by reference and is head-const */
{
    s.x = 3; /* OK */
}
---------------------------
class MyClass
{
    int x;
    int y;
}

MyClass s;
f(s);

void f(MyClass s) /* s is already a reference, which is passed by copy, but we consider it head-const and tail-const */
{
    s = new MyClass; /* Error - s is head-const */
    s.x = 3; /* Error - s is tail-const */
    MyClass s2 = s.dup;
    s2.x = 3; /* OK */
}

void g(ref MyClass s) /* s is already a reference, which is passed by copy, but we consider it head-const */
{
    s = new MyClass; /* Error - s is head-const */
    s.x = 3; /* OK */
}
---------------------------

The important thing to observe in these examples is that everything works the same - the semantics are identical at both the caller site and the callee site.

The second important thing to observe is that the word "const" is completely absent from these examples. If you have const-by-default, you don't need it. Instead, you work around head-constness by making a local copy, and you override tail-constness by using the "ref" keyword. Plus - you get built-in efficiency for passing large structs to functions.

I believe that this will help programmers to write code quickly without having to remember to write "const" all over the place. If they need to modify the original, the compiler will remind them to throw in a "ref" keyword to make it explicit.

Everyone wins. It's easy to write code, and the compiler gets to do its checking.
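
For comparison only, here is how the head/tail distinction shows up in the const syntax that current D compilers accept for arrays. This is not the scheme proposed above: D's const is transitive, so a "head-const only" reference (cannot be rebound, contents writable) has no direct spelling with const. A minimal sketch:

```d
void main()
{
    int[] data = [1, 2, 3];

    // Tail-const: the slice itself can be rebound, but its elements cannot be written.
    const(int)[] tail = data;
    tail = tail[1 .. $];
    assert(tail == [2, 3]);
    static assert(!__traits(compiles, tail[0] = 9));

    // Head- and tail-const: neither rebinding nor element writes are allowed.
    const(int[]) both = data;
    static assert(!__traits(compiles, both = both[1 .. $]));
    static assert(!__traits(compiles, both[0] = 9));

    // const is only a read-only view: the original mutable slice can still
    // change the shared elements, and the const views see the change.
    data[0] = 7;
    assert(both[0] == 7);
}
```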
Sep 18 2007
prev sibling next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
About passing structs by reference: It's one of those things, like
register or inline...

Once upon a time, programmers used the keyword "register" because they
thought they could do a better job than the compiler at figuring how
to use its registers.

Once upon a time, programmers used the keyword "inline" because they
thought they could do a better job than the compiler at figuring out
what to inline and what not.

Now people explicitly choose between f(s) and f(ref s) (where s is a
struct and the function does not modify it) because they think they
can do a better job than the compiler at figuring out how to pass
parameters to functions. I say give control back to the compiler. Let's
let "ref" mean "the function may modify the parameter", and the
absence of ref mean "the function may not modify the parameter", but
leave it to the compiler to decide what is the most efficient way to
pass the data to the function.

I think this is a neat idea, however it won't work unless the compiler
can also do const-checking. This is one more optimisation which having
const allows. (And const-by-default would make it obvious).

----

I have thought of a problem though. Under the scheme I outlined in a
previous post, it would no longer be possible for a function to modify
the original reference. That is:

class C;
C c;
f(c);

void f(ref C c)
{
    c = new C(); /* compile-time error */
}

would no longer compile, because under the new scheme, "ref" is
reinterpreted to mean that c is head-const (because C is a class - see
definitions in previous post). So instead, you'd have to do:

class C;
C c;
f(&c);

void f(ref C* c)
{
    *c = new C();
}

Whether or not that would be considered a prohibiting problem, I don't know.
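
For contrast, this is what ref on a class parameter means in D as it stands, i.e. the behaviour the scheme above would give up. A small sketch; the names mutate and replace are made up.

```d
class C { int x; }

void mutate(C c)
{
    c.x = 1;     // visible to the caller: same object
    c = new C;   // not visible: only the local reference is rebound
    c.x = 99;
}

void replace(ref C c)
{
    c = new C;   // visible: the caller's reference now points at a new object
    c.x = 2;
}

void main()
{
    auto c = new C;
    auto original = c;

    mutate(c);
    assert(c is original && c.x == 1);

    replace(c);
    assert(c !is original && c.x == 2);
    assert(original.x == 1);   // the old object is untouched by replace
}
```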
Sep 18 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...
 
 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.
 
 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.
 
 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function. So for instance "pass by ref unless something in the function modifies the value" is not going to work. But something like "pass by value unless the size of the parameter is greater than 8 bytes" could work.

I think this is a separate issue from explicitly passing things as 'ref'. The above seems more appropriate to me as an implementation of 'in' parameters. For compiling function bodies, if the compiler detects a modification of an 'in' parameter that it chose to implement using pass-by-reference then it would just generate code in the function body to copy the parameter value first thing. E.g.

void foo(in BigStruct x);

compiler actually generates code like:

void foo(ref BigStruct x) { ... }

unless the body of foo modifies x. Then the code gen changes to:

void foo(ref BigStruct x_)
{
    BigStruct x = x_;
    ...
}

There would still be a bit of a compiler compatibility issue though. For a given platform, compiler vendors A and B would need to agree on the rules for 'in' parameters or the libs they generate would be incompatible. If we want D to be a language with a unified ABI at least. That suggests that the rules would need to be part of the spec. Heck, it may be good enough to just pick a max parameter size and stick with it for all platforms. Just like D picks a size for 'float' and sticks with it for all platforms, whether it's the most efficient for that platform or not.

--bb
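
The size rule and the two lowerings described above can be written out by hand as a sketch. The names passesByRef, maxByValueSize, readOnly and modifies are hypothetical, the 8-byte threshold is just the figure from the post, and the reference is marked const here (the post uses plain ref) so that the "no visible modification" promise is checked.

```d
enum maxByValueSize = 8;   // hypothetical threshold, taken from the post

// Would the proposed rule pass T by reference? Class references are already
// pointer-sized handles, so only large value types qualify.
enum bool passesByRef(T) = !is(T == class) && T.sizeof > maxByValueSize;

struct SmallStruct { int x; int y; }
struct BigStruct   { int x; int[100] y; }
class  MyClass     { int x; }

static assert(!passesByRef!int);
static assert(!passesByRef!SmallStruct);
static assert( passesByRef!BigStruct);
static assert(!passesByRef!MyClass);

// Hand-written stand-in for "void foo(in BigStruct x)" when the body only reads x.
int readOnly(ref const BigStruct x)
{
    return x.x + x.y[0];
}

// Hand-written stand-in when the body modifies x: copy first thing,
// so the caller never observes the change.
int modifies(ref const BigStruct x_)
{
    BigStruct x = x_;
    x.x += 10;
    return x.x;
}

void main()
{
    BigStruct b;
    b.x = 1;
    assert(readOnly(b) == 1);
    assert(modifies(b) == 11);
    assert(b.x == 1);   // the caller's value is untouched
}
```

Both stand-ins have the same signature, which is the point: the caller's code does not depend on whether the body modifies its parameter. Later DMD releases added a -preview=in switch that, as I understand it, moves 'in' parameters in this direction; the compiler documentation has the exact rules.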
Sep 18 2007
next sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 This thought has occurred to me before too.  I think the issue becomes that
 the function signature may no longer be enough for a compiler to tell what
 kind of code it needs to generate to call the function.

That is also true for inline. I believe the solution for the pass-by-ref vs pass-by-value decision is that the compiler generates both versions of the function, and then may throw one of them away at link-time.

So the compiler generates 2^N versions for an N-parameter method? --bb
Sep 18 2007
next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 So the compiler generates 2^N versions for an N-parameter method?

Still only two ... I /think/. The first version is the version in which all information is known. This is the version in which optimal decisions are assumed at the callee site, and may be taken at the caller site. The second version is the version in which no information is known. In this version, everything gets passed by the default behaviour.

That makes sense and seems like it could work if you have const refs to use as the fallback. Otherwise the fallback is going to be pass-by-value, which would be terrible for any BigStruct parameters. Object files would unfortunately just about double in size overnight... --bb
Sep 18 2007
prev sibling parent Ingo Oeser <ioe-news rameria.de> writes:
Bill Baxter wrote:

 Janice Caron wrote:
 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 This thought has occurred to me before too.  I think the issue becomes that
 the function signature may no longer be enough for a compiler to tell what
 kind of code it needs to generate to call the function.

That is also true for inline. I believe the solution for the pass-by-ref vs pass-by-value decision is that the compiler generates both versions of the function, and then may throw one of them away at link-time.

So the compiler generates 2^N versions for an N-parameter method?

No, just for the cases where it really matters in code size and/or speed. Isn't that kind of optimisation called "function versioning" in compiler folk slang? Once you compile in project global scope, you can do a lot of fun stuff. The problem is: we usually don't compile code that way :-( Best Regards Ingo Oeser
Sep 18 2007
prev sibling next sibling parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Janice Caron wrote:
 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...

 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.

 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function.

Actually, I think it's possible my proposal may not have been understood. Under my suggestion, if the caller passes a struct... f(s) ...and the callee declares the function as... void f(S s) ...then all information is known. Both ends must surely know s.sizeof at compile time? And that's the /only/ thing the compiler needs to know to make the decision.
 So for instance
 "pass by ref unless something in the function modifies the value" is not
 going to work.

Agreed. But that's not what I suggested. You'd probably have to head back to see the earlier post to see the suggestion in full, because it's too long to repost, but basically it all works (...or at least, I think so...) providing you have parameters passed as const by default, which is an argument in favor of needing const. (If this discussion gets too complicated, we can take it to another thread)

Yeh, sorry my response was more along the lines of "that's an interesting idea that reminds me of this idea *I* had". Your idea requires const by default, but unfortunately const by default has been beaten to death around here. Nearly everyone who cares about D lobbied for giving it a try as Walter was designing const for D2.0, but in the end Walter said, no, it's too big a departure from C++. --bb
Sep 18 2007
prev sibling parent reply Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
Bill Baxter Wrote:

[snip]
 
 There would still be a bit of a compiler compatibility issue though. 
 For a given platform Compiler vendors A and B would need to agree on the 
 rules for 'in' parameters or the libs they generate would be 
 incompatible.  If we want D to be a language with a unified ABI at 
 least.  That suggests that the rules would need to be part of the spec. 
   Heck it may be good enough to just pick a max parameter size and stick 
 with it for all platforms.  Just like D picks a size for 'float' and 
 sticks with it for all platforms, whether it's the most efficient for 
 that platform or not.
 
 --bb

That's not the same at all. There are internationally approved standards for floating point types. These standards are also implemented in hardware more often than not. Bruce.
Sep 18 2007
parent Bill Baxter <dnewsgroup billbaxter.com> writes:
Bruce Adams wrote:
 Bill Baxter Wrote:
 
 [snip]
 There would still be a bit of a compiler compatibility issue though. 
 For a given platform Compiler vendors A and B would need to agree on the 
 rules for 'in' parameters or the libs they generate would be 
 incompatible.  If we want D to be a language with a unified ABI at 
 least.  That suggests that the rules would need to be part of the spec. 
   Heck it may be good enough to just pick a max parameter size and stick 
 with it for all platforms.  Just like D picks a size for 'float' and 
 sticks with it for all platforms, whether it's the most efficient for 
 that platform or not.

 --bb

That's not the same at all. There are internationally approved standards for floating point types. These standards are also implemented in hardware more often than not.

Any Walter-approved ABI standards for the D compiler are basically "internationally approved" too. But maybe a better example would have been "just like D picks 64 bits for a long and sticks with it whether it's efficient or not". Whatever. This is really not important. --bb
Sep 18 2007
prev sibling next sibling parent Jascha Wetzel <firstname mainia.de> writes:
Janice Caron wrote:
 Let's let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.
 
 I think this is a neat idea, however it won't work unless the compiler
 can also do const-checking. This is one more optimisation which having
 const allows. (And const-by-default would make it obvious).

we still need a solution to force pass-by-value and pass-by-reference for exported functions that are called from non-D code.
 ...
 would no longer compile, because under the new scheme, "ref" is
 reinterpreted to mean that c is head-const (because C is a class - see
 definitions in previous post). So instead, you'd have to do:
 
 class C;
 C c;
 f(&c)
 
 void f(ref C* c)
 {
     *c = new C();
 }
 
 Whether or not that would be considered a prohibiting problem, I don't know.

if you define ref to mean "be able to change the cell the symbol represents", that isn't needed.
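
For comparison, this is how `ref` on a class parameter behaves in D as it stands, which matches the "change the cell the symbol represents" reading. A minimal sketch; the class `C`'s `id` field and the function names `rebind` and `rebindLocalOnly` are invented for illustration.

```d
class C
{
    int id;
    this(int id) { this.id = id; }
}

// With ref, the parameter aliases the caller's variable, so assigning to it
// rebinds the caller's reference. No pointer-to-reference is needed.
void rebind(ref C c)
{
    c = new C(2);
}

// Without ref, only the callee's local copy of the reference is rebound.
void rebindLocalOnly(C c)
{
    c = new C(3);
}
```

After `rebindLocalOnly(c)` the caller still sees its original object; after `rebind(c)` it sees the new one.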
Sep 18 2007
prev sibling parent reply renoX <renosky free.fr> writes:
Janice Caron a écrit :
 About passing structs by reference: It's one of those things, like
 register or inline...
 
 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.
 
 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

Well, in some (very limited) cases, they still do: I'm thinking about the Linux kernel, where developers do care and know (at least in some parts of the kernel) what the compiler does; of course this is not the usual case. That said, I know that Sun's JVM has some 'escape analysis' optimisation (or it's planned to have this) where it's able to put some of the objects on the stack, but I don't know how efficient it is in comparison to the developer choosing heap or stack allocation. renoX
Sep 18 2007
parent Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
renoX Wrote:

 Janice Caron a écrit :
 About passing structs by reference: It's one of those things, like
 register or inline...
 
 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.
 
 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

Well, in some (very limited) cases, they still do: I'm thinking about the Linux kernel, where developers do care and know (at least in some parts of the kernel) what the compiler does; of course this is not the usual case.

The compiler doesn't know, and in general can't know, how your code is going to be used; you need the execution profile for that. Bruce.
Sep 18 2007
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 9/18/07, Jascha Wetzel <firstname mainia.de> wrote:
 if you define ref to mean "be able to change the cell the symbol
 represents", that isn't needed.

What I meant was defined in a previous post - hopefully fairly precisely. It was too big a definition to repeat here.
Sep 18 2007
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 This thought has occurred to me before too.  I think the issue becomes that
 the function signature may no longer be enough for a compiler to tell what
 kind of code it needs to generate to call the function.

That is also true for inline. I believe the solution for the pass-by-ref vs pass-by-value decision is that the compiler generates both versions of the function, and then may throw one of them away at link-time.
Sep 18 2007
prev sibling next sibling parent "Janice Caron" <caron800 googlemail.com> writes:
On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 So the compiler generates 2^N versions for an N-parameter method?

Still only two ... I /think/. The first version is the version in which all information is known. This is the version in which optimal decisions are assumed at the callee site, and may be taken at the caller site. The second version is the version in which no information is known. In this version, everything gets passed by the default behaviour. So, at the callee site, either all information about the function is known, in which case we call version A, or it isn't, in which case we call version B. I think that would work. It may not be /quite/ as efficient as the 2^N possibility you mentioned, but it would certainly be more efficient than the status quo (version B for everything).
Sep 18 2007
prev sibling next sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...

 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.

 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function.

Actually, I think it's possible my proposal may not have been understood. Under my suggestion, if the caller passes a struct... f(s) ...and the callee declares the function as... void f(S s) ...then all information is known. Both ends must surely know s.sizeof at compile time? And that's the /only/ thing the compiler needs to know to make the decision.
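
A size-only rule of this kind can be expressed at compile time in D. The sketch below is illustrative only: the 8-byte threshold is the figure floated earlier in the thread, and `maxByValueSize`, `wouldPassByRef`, `Small` and `Big` are hypothetical names, not anything in the language or its ABI.

```d
// Hypothetical cut-off, borrowed from the "8 bytes" figure mentioned earlier.
enum size_t maxByValueSize = 8;

// True when a read-only parameter of type T would be passed by reference
// under this made-up rule. Depends on nothing but T.sizeof.
enum bool wouldPassByRef(T) = T.sizeof > maxByValueSize;

struct Small { int a; int b; }     // 8 bytes
struct Big   { double[4] values; } // 32 bytes

// Caller and callee both see the same types, so both reach the same answer.
static assert(!wouldPassByRef!Small);
static assert(wouldPassByRef!Big);
```

Because the answer depends only on the type, any translation unit that can see the declaration of `S` would compute the same calling convention for `void f(S s)`.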
 So for instance
 "pass by ref unless something in the function modifies the value" is not
 going to work.

Agreed. But that's not what I suggested. You'd probably have to head back to see the earlier post to see the suggestion in full, because it's too long to repost, but basically it all works (...or at least, I think so...) providing you have parameters passed as const by default, which is an argument in favor of needing const. (If this discussion gets too complicated, we can take it to another thread)
Sep 18 2007
parent reply Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
Janice Caron Wrote:

 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...

 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.

 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function.

Actually, I think it's possible my proposal may not have been understood. Under my suggestion, if the caller passes a struct... f(s) ...and the callee declares the function as... void f(S s) ...then all information is known. Both ends must surely know s.sizeof at compile time? And that's the /only/ thing the compiler needs to know to make the decision.

S could be an opaque type so s.sizeof may still be undefined. Bruce.
Sep 18 2007
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Bruce Adams wrote:
 Janice Caron Wrote:
 
 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...

 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.

 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function.

Actually, I think it's possible my proposal may not have been understood. Under my suggestion, if the caller passes a struct... f(s) ...and the callee declares the function as... void f(S s) ...then all information is known. Both ends must surely know s.sizeof at compile time? And that's the /only/ thing the compiler needs to know to make the decision.


 S could be an opaque type so s.sizeof may still be undefined.

You can't declare a function that takes an argument of unknown size. You just can't. The compiler will complain that it doesn't know the type. So either your "opaque type" is not allowable, or it's actually a reference/pointer to "opaque type" in which case the size of the pointer *is* known, which is all that is needed, since that's all that will be passed to the function. --bb
Sep 18 2007
parent reply Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
Bill Baxter Wrote:

 Bruce Adams wrote:
 Janice Caron Wrote:
 
 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...

 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.

 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function.

Actually, I think it's possible my proposal may not have been understood. Under my suggestion, if the caller passes a struct... f(s) ...and the callee declares the function as... void f(S s) ...then all information is known. Both ends must surely know s.sizeof at compile time? And that's the /only/ thing the compiler needs to know to make the decision.


 S could be an opaque type so s.sizeof may still be undefined.

You can't declare a function that takes an argument of unknown size. You just can't. The compiler will complain that it doesn't know the type. So either your "opaque type" is not allowable, or it's actually a reference/pointer to "opaque type" in which case the size of the pointer *is* known, which is all that is needed, since that's all that will be passed to the function. --bb

See my reply to Janice on channel B. I did mean it's passed by reference. I don't quite get the "structs are passed by value by default" thing. I'm assuming everything is passed by reference by default, but that sometimes you want the compiler to pass by value because for small data types, be they classes or structs, it's more efficient. So my point is that for an opaque type you may not have access to sizeof. An opaque type is like a class or struct whose data members are all private, but done properly. You don't need to know about the implementation. C++ forces you to expose the implementation by declaring the members in the header even though they're private and client code can't do anything with them. The caveat is you do still need to know the size to reserve space for an opaque type before it's constructed. This is why a C++ compiler needs a full definition. In theory it could be sorted out at link time with support from the object file format, or even left until run-time. My understanding was that you can do opaque types properly in D, though I haven't tried myself. So you have sizeof at link time but not necessarily at compile time. Regards, Bruce.
Sep 18 2007
parent Reiner Pope <some address.com> writes:
Bruce Adams wrote:
 Bill Baxter Wrote:
 
 Bruce Adams wrote:
 Janice Caron Wrote:

 On 9/18/07, Bill Baxter <dnewsgroup billbaxter.com> wrote:
 Janice Caron wrote:
 About passing structs by reference: It's one of those things, like
 register or inline...

 Once upon a time, programmers used the keyword "register" because they
 thought they could do a better job than the compiler at figuring how
 to use its registers.

 Once upon a time, programmers used the keyword "inline" because they
 thought they could do a better job than the compiler at figuring out
 what to inline and what not.

 Now people explicitly choose between f(s) and f(ref s) (where s is a
 struct and the function does not modify it) because they think they
 can do a better job than the compiler at figuring out how to pass
 parameters to functions. I say give control back to the compiler. Let's
 let "ref" mean "the function may modify the parameter", and the
 absence of ref mean "the function may not modify the parameter", but
 leave it to the compiler to decide what is the most efficient way to
 pass the data to the function.

This thought has occurred to me before too. I think the issue becomes that the function signature may no longer be enough for a compiler to tell what kind of code it needs to generate to call the function.

Actually, I think it's possible my proposal may not have been understood. Under my suggestion, if the caller passes a struct... f(s) ...and the callee declares the function as... void f(S s) ...then all information is known. Both ends must surely know s.sizeof at compile time? And that's the /only/ thing the compiler needs to know to make the decision.


You can't declare a function that takes an argument of unknown size. You just can't. The compiler will complain that it doesn't know the type. So either your "opaque type" is not allowable, or it's actually a reference/pointer to "opaque type" in which case the size of the pointer *is* known, which is all that is needed, since that's all that will be passed to the function. --bb

See my reply to Janice on channel B. I did mean it's passed by reference. I don't quite get the "structs are passed by value by default" thing. I'm assuming everything is passed by reference by default, but that sometimes you want the compiler to pass by value because for small data types, be they classes or structs, it's more efficient. So my point is that for an opaque type you may not have access to sizeof. An opaque type is like a class or struct whose data members are all private, but done properly. You don't need to know about the implementation. C++ forces you to expose the implementation by declaring the members in the header even though they're private and client code can't do anything with them. The caveat is you do still need to know the size to reserve space for an opaque type before it's constructed. This is why a C++ compiler needs a full definition. In theory it could be sorted out at link time with support from the object file format, or even left until run-time. My understanding was that you can do opaque types properly in D, though I haven't tried myself. So you have sizeof at link time but not necessarily at compile time. Regards, Bruce.

These opaque types sound suspiciously like D classes/interfaces. void foo(Object o) { assert(o.sizeof == (void*).sizeof); } But C++ polymorphism also allows that, no? But, as other people have pointed out, doing this with structs is a completely different story. -- Reiner
Sep 18 2007
prev sibling parent reply "Janice Caron" <caron800 googlemail.com> writes:
On 9/18/07, Bruce Adams <tortoise_74 yeah.who.co.uk> wrote:
 Under my suggestion, if the caller passes a struct...

 f(s)

 ...and the callee declares the function as...

 void f(S s)

 ...then all information is known. Both ends must surely know s.sizeof
 at compile time? And that's the /only/ thing the compiler needs
 to know to make the decision.


Could you explain further? I don't understand what an "opaque type" is in D. Observe that the example I cited above explicitly states that s be a struct, not a class. Given that the struct would ordinarily have to be built on the stack at the caller site, and copied onto the stack at the callee site, I just don't see how s.sizeof can be unknown at the time the function is instantiated, /even if/ it was declared using a template. What am I missing?
Sep 18 2007
parent reply Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
Janice Caron Wrote:

 On 9/18/07, Bruce Adams <tortoise_74 yeah.who.co.uk> wrote:
 Under my suggestion, if the caller passes a struct...

 f(s)

 ...and the callee declares the function as...

 void f(S s)

 ...then all information is known. Both ends must surely know s.sizeof
 at compile time? And that's the /only/ thing the compiler needs
 to know to make the decision.


Could you explain further? I don't understand what an "opaque type" is in D. Observe that the example I cited above explicitly states that s be a struct, not a class. Given that the struct would ordinarily have to be built on the stack at the caller site, and copied onto the stack at the callee site, I just don't see how s.sizeof can be unknown at the time the function is instantiated, /even if/ it was declared using a template. What am I missing?

Right. That was additional information I lost somewhere. That's one thing I've heard bandied around here, that structs are passed by value and classes by reference. It seems too peculiar to be true. Anyway, my comment was based on the understanding that the parameter is passed by reference. A reference is just a pointer. You don't need to know anything about what it points to until you dereference it. E.g. foo(opaque* myvar) { if (this.doIt == true) bar(myvar); } can be compiled without ever knowing anything about the opaque type. Obviously if it's passed by value you know the size because your call pushed it on the stack. Bruce.
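
The pointer case can be sketched in D with an opaque struct declaration. This is an illustration under assumptions, not the code from the post: `Opaque`, `barCalled` and the `doIt` parameter (standing in for `this.doIt`) are made-up names.

```d
// Opaque declaration: the members, and therefore the size, are unknown here.
struct Opaque;

// Records whether bar ran, purely so the sketch has something observable.
bool barCalled = false;

void bar(Opaque* p)
{
    barCalled = true;
}

// Compiles without the definition of Opaque, because only a pointer is
// passed along and it is never dereferenced.
void foo(Opaque* myvar, bool doIt)
{
    if (doIt)
        bar(myvar);
}

// The pointer's size is known even though the pointee's size is not.
static assert((Opaque*).sizeof == (void*).sizeof);
```

Passing `Opaque` itself by value would need its size, which is exactly the information an opaque declaration withholds.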
Sep 18 2007
parent Nathan Reed <nathaniel.reed gmail.com> writes:
Bruce Adams wrote:
 That's one thing I've heard bandied around here, that structs are passed by
value and classes by reference. It seems too peculiar to be true. 

It's true. Why do you find this peculiar? Structs are aggregate data structures with value semantics, and classes are aggregate data structures with reference semantics. For some problems you want one, and for some you want the other. Thanks, Nathan Reed
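
The difference is easy to see in a small D sketch. The type and function names here (`PointS`, `PointC`, `bumpStruct`, `bumpStructRef`, `bumpClass`) are invented for illustration.

```d
struct PointS { int x; } // value semantics
class  PointC { int x; } // reference semantics

// Receives a copy of the struct; the caller's value is not affected.
void bumpStruct(PointS p) { p.x += 1; }

// Receives an alias to the caller's struct; the caller sees the change.
void bumpStructRef(ref PointS p) { p.x += 1; }

// Receives a copy of the reference, which still points at the caller's
// object, so the caller sees the change.
void bumpClass(PointC p) { p.x += 1; }
```

Calling `bumpStruct(s)` leaves `s.x` unchanged, while `bumpStructRef(s)` and `bumpClass(c)` both modify what the caller holds.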
Sep 18 2007