digitalmars.D - `in` parameters made useful

reply Mathias LANG <geod24 gmail.com> writes:
Hi everyone,
For a long time I've been pretty annoyed by the state of `in` 
parameters.
In case it needs any clarification, I'm talking about what's between 
the asterisks (*) here: `void foo (*in* char[] arg)`.

While they always seemed like a good idea, they never really 
added anything: `in` was supposed to be `const scope`, then, when 
the time came to make `scope` actually do something (read: 
DIP1000), `scope` was removed from `in`!

This was re-added last release (DMD 2.092) where the 
`-preview=in` switch was added 
(https://dlang.org/changelog/2.092.0.html#preview-in). So now, if 
you want `in` to mean what it's documented to be, you need to 
throw in both `-preview=dip1000` and `-preview=in`.

But then... That still feels incomplete. I deal with a lot of C++ 
interop code, and we can't use `in` without using `ref`, because 
otherwise we trigger copy constructors / destructors of 
aggregates we have no control over. We also have some value types 
which can get pretty big, so we don't want to pass those by 
value, either. So easy solution, add `ref` ? But then, we cannot 
pass rvalues. A real-world example of this is 
`doSomething(myData.getHash())` where `getHash` returns a 
`ubyte[64]`.
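
To make the rvalue problem concrete, here is a minimal sketch. `MyData`, `doSomething` and `callWithTemporary` are hypothetical stand-ins for the names in the paragraph above, not code from the actual project; it assumes a compiler invoked without `-preview=rvaluerefparam` or `-preview=in`.

```d
// Hypothetical stand-in for the `myData` object mentioned in the post.
struct MyData
{
    ubyte[64] hash;

    // Returns the hash by value, so the call expression is an rvalue.
    ubyte[64] getHash() const { return hash; }
}

// Hypothetical callee: takes the hash by `ref` to avoid a 64-byte copy.
uint doSomething(const ref ubyte[64] hash)
{
    uint sum = 0;
    foreach (b; hash)
        sum += b;
    return sum;
}

// Workaround needed with plain `ref`: name the temporary first.
uint callWithTemporary(const ref MyData myData)
{
    // doSomething(myData.getHash()); // rejected: an rvalue cannot bind to `ref`
    auto tmp = myData.getHash();
    return doSomething(tmp);
}
```

The extra named temporary is exactly the boilerplate that non-templated code ends up carrying at every such call site.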

Luckily we have a `-preview=rvaluerefparam` switch, which should 
do what I want, right ? Well, as I said multiple times on this 
forum, it's so utterly broken it's not even funny:
- https://issues.dlang.org/show_bug.cgi?id=20704
- https://issues.dlang.org/show_bug.cgi?id=20705 (I'm sorry, 
WHAT?)
- https://issues.dlang.org/show_bug.cgi?id=20706

Because of 20705, that switch is completely unusable for any real 
world application.
There are alternatives to this (which we are using), such as 
using `auto ref`. But that requires templates, which we 
cannot use with delegates or virtual methods.
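
As a rough illustration of the `auto ref` alternative, the sketch below shows a function template that accepts both lvalues and rvalues. `passedByRef` and `makeHash` are hypothetical names invented for this example.

```d
// `auto ref` is only available on templates; the empty template
// parameter list `()` is what turns this function into one.
bool passedByRef()(auto ref const(ubyte[64]) hash)
{
    // True when this instantiation received its argument by reference.
    return __traits(isRef, hash);
}

// Hypothetical helper producing an rvalue.
ubyte[64] makeHash()
{
    ubyte[64] h;
    h[0] = 42;
    return h;
}
```

Calling `passedByRef` with a named variable instantiates the by-reference version, while `passedByRef(makeHash())` instantiates the by-value one. Because each call resolves to a template instantiation, the same trick is not available for a delegate type or a virtual method, which need one fixed signature.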

Now I don't really like to rant without having a solution to 
offer. And it turns out, that's the whole motivation for this 
post. I have a PR that solves *all* those problems at once. All 
it needs is a bit of attention / review / feedback!

The PR in question is here: 
https://github.com/dlang/dmd/pull/11000

What does it do ?
A.0) It fixes `in` to be an actual storage class, not something 
that is lowered almost immediately.
    This was necessary for the implementation to work, but has two 
nice side effects:
      1) it fixes error messages (currently `void foo(in int)` 
will display as `void foo(const(int))` in error messages);
      2) it fixes header generation (`.di` files) so that `in` is 
kept instead of seeing `const` or `scope const`, depending on 
`-preview=in`;
    I think this change has value in itself, so I submitted it as 
a separate PR (https://github.com/dlang/dmd/pull/11474), which 
itself needs a tiny adjustment in Phobos 
(https://github.com/dlang/phobos/pull/7570).

A.1) It gives a mangling to `in`: This is necessary to avoid some 
ambiguity. The main two user-visible side effects will be that 
older debuggers won't be able to demangle `in`, and that, once we 
update druntime, stack traces will show the correct signature for 
functions using `in` (currently they suffer from the same bug as 
the error message / header generation). This is also part of the 
aforementioned PR.

B) It makes `in` take the effect of `ref` when it makes sense. It 
always passes something by `ref` if the type has elaborate 
construction / destruction (postblit, copy constructor, 
destructors). If the type doesn't have any of those it is only 
passed by `ref` if it cannot be passed in register. Some types 
(dynamic arrays, probably AA in the future) are not affected to 
allow for covariance (more on that later). The heuristics there 
still need some small improvements, e.g. w.r.t. floating points 
(currently the heuristic is based on size, and not asking the 
backend) and small struct slicing, but that should not affect 
correctness.

C) It implements covariance rules: if you have a `void 
toString(scope void delegate(in char[]) sink)` method, you can 
pass it `void writeToScreen(const scope char[])`. If you have 
`void output(scope void delegate(in ubyte[64]))` you can pass it 
`void saveHash(const scope ref ubyte[64])`. Simple stuff.

D) It allows passing rvalues to `in`. Because we know it's 
`scope`, so it cannot be escaped (allegedly), and it's `const`, 
so it cannot be modified, it's only logical that you can give it 
rvalues.
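
A small sketch of what the call site looks like with `in`. `sumHash` and `makeOnes` are hypothetical names. The code is written so that it compiles both with and without `-preview=in`: without the switch, `in` is a plain `const` by-value parameter; with it, the compiler is free to pass the array by reference, and the rvalue call is still accepted.

```d
// Hypothetical callee using `in` on a large value type.
uint sumHash(in ubyte[64] hash)
{
    uint sum = 0;
    foreach (b; hash)
        sum += b;
    return sum;
}

// Hypothetical helper producing an rvalue filled with ones.
ubyte[64] makeOnes()
{
    ubyte[64] h;
    h[] = 1;
    return h;
}
```

Both `sumHash(makeOnes())` and `sumHash(someLvalue)` are valid calls, with no named temporary and no template.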

Interestingly, benjones pointed out in the PR that this is 
similar to one of Herb Sutter's proposals for C++: 
https://youtu.be/qx22oxlQmKc?t=1258

I hope this will generate interest with people hitting the same 
problem. I tried this with my project (which depends on ~10 
libraries including Vibe.d and does C++ interop) and things just 
worked when changing `scope const auto ref` to `in`, and clearing 
up a few places where `in` parameters were escaped, or there was 
both an `in ref` and an `in` overload.

Last, but not least, if this gets accepted it would pave the way 
for another awesome change, having `checkaction=context` the 
default for D.
If you look at 
https://github.com/dlang/druntime/blob/104ac712331e4d3573fc277084334a528b5dadb1/src/cor
/internal/dassert.d you'll find that sweet `auto ref const scope` everywhere.
Jul 31 2020
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense.
i like it
 D) It allows to pass rvalues to `in`. Because we know it's 
 `scope`, so it cannot be escaped (allegedly), and it's `const`, 
 so it cannot be modified, it's only logical that you can give 
 it rvalues.
i like this too. I've argued before that the compiler should be allowed to optimize this in the `in` case anyway, so yeah, you have my support here.
Jul 31 2020
parent Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 31 July 2020 at 22:01:06 UTC, Adam D. Ruppe wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense.
i like it
 D) It allows to pass rvalues to `in`. Because we know it's 
 `scope`, so it cannot be escaped (allegedly), and it's 
 `const`, so it cannot be modified, it's only logical that you 
 can give it rvalues.
i like this too I've argued before the compiler should be allowed to optimize this in `in` case anyway so yeah you have my support here.
I agree. This is the D way. More simplicity via more inference. Note that this brings the meaning of the `in` parameter qualifier very close (if not equal) to its meaning in Ada.
Aug 01 2020
prev sibling next sibling parent Kagamin <spam here.lot> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register.
You mean if it fits in two registers, it's still passed by reference? 16 bytes is the size of uuid and is better passed by value.
Aug 01 2020
prev sibling next sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
I like most of your proposal, but

On 31/07/2020 23:49, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. It always
 pass something by `ref` if the type has elaborate construction /
 destruction (postblit, copy constructor, destructors). If the type
 doesn't have any of those it is only passed by `ref` if it cannot be
 passed in register. Some types (dynamic arrays, probably AA in the
 future) are not affected to allow for covariance (more on that later).
 The heuristics there still need some small improvements, e.g. w.r.t.
 floating points (currently the heuristic is based on size, and not
 asking the backend) and small struct slicing, but that should not affect
 correctness.
Please note that many C/C++-ABIs already define similar rules for passing function arguments by value (referencing a copy on the stack). It might not be the best idea to stack two similar, but maybe slightly conflicting rule sets. Maybe we can leverage that and define that if the ABI uses a reference for an `in`-value, the compiler may/must elide an extra copy. That avoids having to define our own rule set.
Aug 01 2020
parent reply Mathias LANG <geod24 gmail.com> writes:
On Saturday, 1 August 2020 at 07:48:10 UTC, Rainer Schuetze wrote:
 I like most of your proposal, but

 On 31/07/2020 23:49, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register. Some types 
 (dynamic arrays, probably AA in the future) are not affected 
 to allow for covariance (more on that later). The heuristics 
 there still need some small improvements, e.g. w.r.t. floating 
 points (currently the heuristic is based on size, and not 
 asking the backend) and small struct slicing, but that should 
 not affect correctness.
Please note that many C/C++-ABIs already define similar rules for passing function arguments by value (referencing a copy on the stack). It might not be the best idea to stack two similar, but maybe slightly conflicting rule sets. Maybe we can leverage that and define that if the ABI uses a reference for an `in`-value, the compiler may/must elide an extra copy. That avoids having to define our own rule set.
Do you have a link ? I did some research beforehand, but all I could find was about NRVO and throwing exception, nothing about actually promoting values to references. Itanium C++ ABI doesn't have anything: https://itanium-cxx-abi.github.io/cxx-abi/abi.html#value-parameter Nor does MS: https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#parameter-passing
Aug 04 2020
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 04/08/2020 11:35, Mathias LANG wrote:
 On Saturday, 1 August 2020 at 07:48:10 UTC, Rainer Schuetze wrote:
 Maybe we can leverage that and define that if the ABI uses a reference
 for an `in`-value, the compiler may/must elide an extra copy. That
 avoids having to define our own rule set.
Do you have a link ? I did some research beforehand, but all I could find was about NRVO and throwing exception, nothing about actually promoting values to references. Itanium C++ ABI doesn't have anything: https://itanium-cxx-abi.github.io/cxx-abi/abi.html#value-parameter
Well, this already says as much for non-POD data IIUC. The System V ABI for C that is used for PODs doesn't seem to use references, though.
 Nor does MS:
 https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#parameter-passing
 
This says for non-register-sized data: "Structs or unions of other sizes are passed as a pointer to memory allocated by the caller."
Aug 04 2020
prev sibling next sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
First off, this is a great change and I am excited to be able to 
use it in my own projects. Thanks for championing this.

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register. Some types 
 (dynamic arrays, probably AA in the future) are not affected to 
 allow for covariance (more on that later). The heuristics there 
 still need some small improvements, e.g. w.r.t. floating points 
 (currently the heuristic is based on size, and not asking the 
 backend) and small struct slicing, but that should not affect 
 correctness.
This optimization should be implemented by querying the backend, not calculated from scratch in the frontend, which is redundant and error-prone; the rules in the current PR are not accurate.

The correct rules are complex, platform-dependent, and depend on the types of each function's full parameter list as a whole, not just each parameter individually. I haven't found anywhere that documents them, but by experimentation with LDC I have discovered the following:

1) The size limit for most types to be passed in registers in x86_64 is twice as large as the PR's threshold, at least on LDC. (Array slices are passed by register because of their size and member types, and do not need to be special-cased. A custom slice-like struct with a pointer and a size member will be passed by register, too.)

2) I say *most* types because there are exceptions; __vector types, and *sometimes* structs that transitively contain only a single __vector member (like 3d homogenous coordinates in graphics programming) are also passed via register, although they may be 8 times the size of a general purpose register when using AVX2 in a 32-bit program, and probably even larger on some other platform.

3) There are limits to how many arguments can be passed via registers. I say "limits", plural, because different data types may consume different types of registers; for example on x86, `int` uses general purpose registers, whereas `float` uses SIMD registers. These limits are architecture-dependent.
Aug 04 2020
parent Mathias LANG <geod24 gmail.com> writes:
On Tuesday, 4 August 2020 at 23:18:56 UTC, tsbockman wrote:
 This optimization should be implemented by querying the 
 backend, not calculated from scratch in the frontend, which is 
 redundant and error-prone; the rules in the current PR are not 
 accurate.
Indeed. The current rules were put there as a way to get the ball rolling, so to say. My current focus is to get things to compile and pass the tests on Buildkite, then optimize the rules. The thing that is not going to change is that types that need elaborate copy or destruction, and types that are not copyable, will be passed by ref. Additionally, I want to keep covariance for array types, which requires them to be passed by value (although it can be done in registers). The rest, I don't mind changing.
 The correct rules are complex, platform-dependent, and depend 
 on the types of each function's full parameter list as a whole, 
 not just each parameter individually. I haven't found anywhere 
 that documents them, but by experimentation with LDC I have 
 discovered the following:

 [...]
Thanks for the feedback. I'll definitely incorporate it (and Rainer's) into the PR soon-ish, most likely via a call to a backend hook, as is currently done for NRVO.
Aug 04 2020
prev sibling next sibling parent reply Fynn Schröder <fynnos live.com> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 [...]
 C) It implements covariance rules
 [...]
 D) It allows to pass rvalues to `in`.
This sounds so great! Thank you for improving `in`!
 I hope this will generate interest with people hitting the same 
 problem.
I've literally yesterday written some new code with `const scope ref` in almost every function to pass large, complex structs. Occasionally, I had to store rvalues temporarily to pass as lvalues (non-templated code). I would rather simply put `in` on those parameters :-) It's a lot easier to grasp function signatures only using `in` and `out` on parameters (and their effect/purpose being immediately obvious to new D programmers!)
Aug 05 2020
parent James Blachly <james.blachly gmail.com> writes:
On 8/5/20 3:27 AM, Fynn Schröder wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 I hope this will generate interest with people hitting the same problem.
I've literally yesterday written some new code with `const scope ref` in almost every function to pass large, complex structs. Occasionally, I had to store rvalues temporarily to pass as lvalues (non-templated code). I would rather simply put `in` on those parameters :-) It's a lot easier to grasp function signatures only using `in` and `out` on parameters (and their effect/purpose being immediately obvious to new D programmers!)
Sorry to interject d.D.learn material in this thread, but where is "scope ref" documented? I found https://dlang.org/spec/function.html#scope-parameters which discusses the use of `scope` with ref type parameters, but the example given is pointer-based. Is it correct that `scope ref T` behaves the same as `scope T*`? DIP1000 shows as "superseded". I am glad D is iterating and improving on safety, but I have found that the documentation may not reflect recent changes in this area well.
Aug 23 2020
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 https://github.com/dlang/dmd/pull/11000
1. Deprecation of `in ref` makes no sense. Why? I assume it's due to a bug in the proposed change. How should the compiler know that the argument should be passed by ref? It doesn't necessarily know how to load the argument; it may have alignment and synchronization requirements. And more importantly, how can the programmer know whether the argument is passed by ref, now that it varies by platform?
2. Dependence on calling convention. AIUI ref semantics depends on parameter position?
3. Runtime hooks don't go through semantic checks. Is this a theoretical concern or did you introduce some new behavior that causes problems with this?
Aug 20 2020
parent reply Mathias LANG <geod24 gmail.com> writes:
On Thursday, 20 August 2020 at 15:59:24 UTC, Kagamin wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 https://github.com/dlang/dmd/pull/11000
1. Deprecation of `in ref` makes no sense. Why? I assume it's due to a bug in the proposed change.
I'm not in the business of deprecating something to accommodate for a broken implementation, no. The implementation originally allowed `in ref`, but after some tinkering and looking at people's usages, my opinion is that it would be better to just allow `in`.
 How the compiler should know that the argument should be passed 
 by ref?
 I doesn't necessarily know how to load the argument, it may 
 have alignment and synchronization requirements.
As explained in the PR, and in the changelog, the compiler knows by inspecting the type. If the type has elaborate copy or destruction, IOW, if copying it would have side effects, it will always pass it by ref to avoid those side effects. Otherwise, it asks the backend. The current rule in DMD is for types that are over twice the size of a register to be passed by ref (or real). I can't think of a situation where the compiler doesn't know how to load the argument. If you're talking about opaque types, those are rejected.
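
The rule described in this reply can be approximated as a compile-time predicate. This is only a sketch of the description, not the compiler's actual logic: `roughlyPassedByRef` and the test structs are hypothetical names, `size_t.sizeof` is assumed to stand in for the size of a register, dynamic arrays are excluded because the thread says they stay by value for covariance, and the special case for `real` is left out.

```d
import std.traits : hasElaborateCopyConstructor, hasElaborateDestructor,
    isCopyable, isDynamicArray;

// Approximation of "pass by ref when copying has side effects, is
// impossible, or the value is larger than two registers".
enum bool roughlyPassedByRef(T) =
    !isDynamicArray!T && (
        !isCopyable!T
        || hasElaborateCopyConstructor!T
        || hasElaborateDestructor!T
        || T.sizeof > 2 * size_t.sizeof);

// Hypothetical types used to exercise the predicate.
struct Small { int x; }
struct WithDtor { int x; ~this() {} }
struct NoCopy { int x; @disable this(this); }

static assert(!roughlyPassedByRef!Small);        // small and trivially copyable
static assert( roughlyPassedByRef!WithDtor);     // destructor has side effects
static assert( roughlyPassedByRef!NoCopy);       // cannot be copied at all
static assert( roughlyPassedByRef!(ubyte[64]));  // too large for registers
static assert(!roughlyPassedByRef!(char[]));     // slices stay by value
```

A predicate like this only looks at one type at a time, which is the limitation raised elsewhere in the thread: the real decision also depends on the platform ABI and on the rest of the parameter list.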
 And more importantly how the programmer can know whether the 
 argument is passed by ref, now that it varies by platform?
I don't understand why it would be "more important". The point of `in` parameter is that it does the right thing for parameters which are read-only and won't escape the scope of the function. It doesn't matter to the user whether your parameter is `ref` or not if it is `scope const`, because you can't modify it anyway. It only matters if passing it by value would be expensive (e.g. large static array) or have side effects (e.g. a destructor).
 2. Dependence on calling convention. AIU ref semantics depends 
 on parameter position?
Yes. Originally it didn't, but the main feedback I got was that it should be done at the function level instead of the parameter (type) level.
 3. Runtime hooks don't go through semantic checks. Is this a 
 theoretical concern or did you introduce some new behavior that 
 causes problem with this?
Just to be clear: When I said "runtime hook", I meant "the AST which the compiler generates to call C functions in druntime". It generates the equivalent of a prototype and calls that. It's not a big deal, and I found a way around it.
Aug 20 2020
parent reply Kagamin <spam here.lot> writes:
On Thursday, 20 August 2020 at 17:25:43 UTC, Mathias LANG wrote:
 On Thursday, 20 August 2020 at 15:59:24 UTC, Kagamin wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 https://github.com/dlang/dmd/pull/11000
1. Deprecation of `in ref` makes no sense. Why? I assume it's due to a bug in the proposed change.
I'm not in the business of deprecating something to accommodate for a broken implementation, no. The implementation originally allowed `in ref`, but after some tinkering and looking at people's usages, my opinion is that it would be better to just allow `in`.
It needlessly degrades language and breaks code and shouldn't be done. Didn't you write this pull because you believe `in ref` is useful?
 I can't think of a situation where the compiler doesn't know 
 how to load the argument. If you're talking about opaque types, 
 those are rejected.

 And more importantly how the programmer can know whether the 
 argument is passed by ref, now that it varies by platform?
I don't understand why it would be "more important". The point of `in` parameter is that it does the right thing for parameters which are read-only and won't escape the scope of the function. It doesn't matter to the user whether your parameter is `ref` or not if it is `scope const`, because you can't modify it anyway. It only matters if passing it by value would be expensive (e.g. large static array) or have side effects (e.g. a destructor).
I mean things like
int atomicLoad(in ref shared int n);
int loadAligned(in ref byte[4] n);
When the argument should be passed by ref by programmer's intent and should be communicated to the compiler, because the compiler isn't that smart.
 2. Dependence on calling convention. AIU ref semantics depends 
 on parameter position?
Yes. Originally didn't, but that was the main feedback I got, that it should be done at the function level instead of the parameter (type) level.
Doesn't this defeat your optimization when passing by value is expensive?
Aug 21 2020
parent reply Mathias LANG <geod24 gmail.com> writes:
On Friday, 21 August 2020 at 09:48:16 UTC, Kagamin wrote:
 I mean things like
 int atomicLoad(in ref shared int n);
 int loadAligned(in ref byte[4] n);
 When the argument should be passed by ref by programmer's 
 intent and should be communicated to the compiler, because the 
 compiler isn't that smart.
The first example is pretty good: We probably need to specify the interaction with `shared`. AFAICS it boils down to "when do we want to read a `shared` value ?". If we pass by value, it means the input parameter will only have a single value, while if we pass by ref, a function can "listen" to changes. I don't really have an answer for this at the moment, I would need to try out some options before I make up my mind. The second example is pretty simple: the backend will decide whether to pass it by ref or not. Since it's a small type, it might make more sense to pass it in registers. Whether or not it's ref does not matter to the programmer, because the programmer cannot change the input anyway, only read it.
 2. Dependence on calling convention. AIU ref semantics 
 depends on parameter position?
Yes. Originally didn't, but that was the main feedback I got, that it should be done at the function level instead of the parameter (type) level.
Doesn't this defeat your optimization when passing by value is expensive?
I don't see how ?
Aug 21 2020
parent reply Kagamin <spam here.lot> writes:
On Friday, 21 August 2020 at 18:23:08 UTC, Mathias LANG wrote:
 The second example is pretty simple: the backend will decide 
 whether to pass it by ref or not. Since it's a small type, it 
 might make more sense to pass it in registers. Whether or not 
 it's ref does not matter to the programmer, because the 
 programmer cannot change the input anyway, only read it.
The backend doesn't know how to load the data. It matters to the programmer, because it affects performance.

Another example:
void log(in ref int n)
{
  write(fd, &n, n.sizeof);
}
If the argument here is passed by value, it will need to fiddle with stack and the code will be less optimal.
 Doesn't this defeat your optimization when passing by value is 
 expensive?
I don't see how ?
If a non-POD object is passed by value, it will have extra calls to postblit and destructor.
Aug 25 2020
parent reply Mathias LANG <geod24 gmail.com> writes:
On Tuesday, 25 August 2020 at 12:09:45 UTC, Kagamin wrote:
 The backend doesn't know how to load the data. It matters to 
 the programmer, because it affects performance.

 Another example:
 void log(in ref int n)
 {
   write(fd, &n, n.sizeof);
 }
 If the argument here is passed by value, it will need to fiddle 
 with stack and the code will be less optimal.
I assume you imply that it's less optimal because it's taking the address of it ?
 If a non-POD object is passed by value, it will have extra 
 calls to postblit and destructor.
https://github.com/dlang/dmd/blob/master/changelog/preview-in.dd
 - If the type has an elaborate copy or destruction (postblit, 
 copy constructor, destructor),
 the type is always passed by reference.
Aug 25 2020
parent Kagamin <spam here.lot> writes:
On Tuesday, 25 August 2020 at 16:05:51 UTC, Mathias LANG wrote:
 On Tuesday, 25 August 2020 at 12:09:45 UTC, Kagamin wrote:
 The backend doesn't know how to load the data. It matters to 
 the programmer, because it affects performance.

 Another example:
 void log(in ref int n)
 {
   write(fd, &n, n.sizeof);
 }
 If the argument here is passed by value, it will need to 
 fiddle with stack and the code will be less optimal.
I assume you imply that it's less optimal because it's taking the address of it ?
Because it has an extra write to memory, pushing the parameter passed by value to stack to take its address. In general when the parameter is passed by reference somewhere, like the refactoring you did in phobos replacing `in ref` with `scope const ref` - the parameter can be passed to one of those functions that have precise ref semantics. I also wonder how they interact with auto ref.
 If a non-POD object is passed by value, it will have extra 
 calls to postblit and destructor.
https://github.com/dlang/dmd/blob/master/changelog/preview-in.dd
 - If the type has an elaborate copy or destruction (postblit, 
 copy constructor, destructor),
 the type is always passed by reference.
If ref semantics doesn't depend on parameter position, then it's fine, but you say it depends.
Aug 27 2020
prev sibling next sibling parent reply IGotD- <nise nise.com> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register. Some types 
 (dynamic arrays, probably AA in the future) are not affected to 
 allow for covariance (more on that later). The heuristics there 
 still need some small improvements, e.g. w.r.t. floating points 
 (currently the heuristic is based on size, and not asking the 
 backend) and small struct slicing, but that should not affect 
 correctness.
This is interesting on a general level as well and true for several programming languages. Let the compiler optimize the parameter passing unless the programmer explicitly asks for a certain way (copy object, pointer/reference etc.). This is very unusual and if you have a language that optimizes the parameter passing by default, please mention it because it would be interesting. I'm all for `in` being usable as an "optimized const parameter" where the compiler optimizes as it wants. However, the question is whether we want to override the default behaviour on a per-object basis. As mentioned, you might want to have another default behaviour for arrays, but that can be true for other storage structures as well, so this should really be user defined.
Aug 20 2020
next sibling parent reply Araq <rumpf_a web.de> writes:
On Thursday, 20 August 2020 at 17:31:17 UTC, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly ask for a 
 certain way (copy object, pointer/reference etc.). This is very 
 unusual and if you have a language that optimizes the parameter 
 passing by default, please mention it because it would be 
 interesting.
Nim does this and I took the feature from Ada. You can override the behavior with pragmas but I've only seen that done for C interop, not for optimizations as the compiler always seems to get it right.
Aug 20 2020
parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Thursday, 20 August 2020 at 20:29:31 UTC, Araq wrote:
 On Thursday, 20 August 2020 at 17:31:17 UTC, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly ask for a 
 certain way (copy object, pointer/reference etc.). This is 
 very unusual and if you have a language that optimizes the 
 parameter passing by default, please mention it because it 
 would be interesting.
Nim does this and I took the feature from Ada. You can override the behavior with pragmas but I've only seen that done for C interop, not for optimizations as the compiler always seems to get it right.
Interesting, how does it work? As mentioned previously, the issue with choosing by-val vs by-ref passing solely based on the type is that it doesn't take into consideration the ABI and how many registers are available. E.g. if you have more than X parameters some will need to be spilled on the stack anyway (even if they can fit in a GPR). Can the Nim compiler (I'm guessing `extccomp`) query the C/C++ compiler backend for e.g. the number of registers available for parameters? Or you have this logic built into the frontend (say `ccgtypes`)?
Aug 21 2020
parent Araq <rumpf_a web.de> writes:
On Friday, 21 August 2020 at 13:59:30 UTC, Petar Kirov 
[ZombineDev] wrote:
 Interesting, how does it work? As mentioned previously, the 
 issue with choosing by-val vs by-ref passing solely based on 
 the type is that it doesn't take into consideration the ABI and 
 how many registers are available. E.g. if you have more than X 
 parameters some will need to be spilled on the stack anyway 
 (even if they can fit in a GPR).

 Can the Nim compiler (I'm guessing `extccomp`) query the C/C++ 
 compiler backend for e.g. the number of registers available for 
 parameters? Or you have this logic built into the frontend (say 
 `ccgtypes`)?
It's built into cctypes indeed. The logic is mostly "pass by pointer if sizeof(T) > 3 machine words". ABI and registers are not relevant all that much, what you want to prevent are copies of large sizes.
Aug 21 2020
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for several 
 programming languages. Let the compiler optimize the parameter passing 
 unless the programmer explicitly ask for a certain way (copy object, 
 pointer/reference etc.).
This has been discussed a few times. If mutation is allowed, aliasing is a killer:

void fun(ref S a, const compiler_chooses S b) {
    ... mutate a, read b ...
}
S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the compiler chose pass by value vs. pass by reference.
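
The aliasing hazard can be made concrete with a runnable sketch. `compiler_chooses` is not real D syntax, so the two outcomes the compiler could pick are spelled out as two hypothetical functions, `funByValue` and `funByRef`, with otherwise identical bodies.

```d
struct S { int value; }

// Second parameter by value: `b` is a snapshot taken at the call.
int funByValue(ref S a, const S b)
{
    a.value += 1;   // mutate a
    return b.value; // read b: unaffected by the mutation
}

// Second parameter by reference: `b` may alias `a`.
int funByRef(ref S a, ref const S b)
{
    a.value += 1;   // mutate a
    return b.value; // read b: sees the mutation when b aliases a
}
```

Starting from `S(1)` and passing the same variable for both parameters, `funByValue` returns 1 while `funByRef` returns 2, so the observable result depends on the passing mode.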
Aug 20 2020
next sibling parent reply Araq <rumpf_a web.de> writes:
On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu 
wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly ask for a 
 certain way (copy object, pointer/reference etc.).
This has been discussed a few times. If mutation is allowed, aliasing is a killer:

void fun(ref S a, const compiler_chooses S b) {
    ... mutate a, read b ...
}

S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the compiler chose pass by value vs. pass by reference.
True but in practice it doesn't happen very often. The benefits far outweigh this minor downside. Plus there are known ways to prevent this form of aliasing at compile-time.
Aug 20 2020
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for several 
 programming languages. Let the compiler optimize the parameter 
 passing unless the programmer explicitly ask for a certain way (copy 
 object, pointer/reference etc.).
This has been discussed a few times. If mutation is allowed, aliasing is a killer:

void fun(ref S a, const compiler_chooses S b) {
    ... mutate a, read b ...
}

S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the compiler chose pass by value vs. pass by reference.
True but in practice it doesn't happen very often. The benefits far outweigh this minor downside.
That seems quite worrisome. A bug rare and subtle that can become devastating. Something the Hindenburg captain might have said.
 Plus there are known ways to prevent this 
 form of aliasing at compile-time.
Not if you have globals and/or separate compilation.
Aug 21 2020
next sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 21 August 2020 at 19:21:37 UTC, Andrei Alexandrescu 
wrote:
 On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei 
 Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on 
 whether the compiler chose pass by value vs. pass by 
 reference.
True but in practice it doesn't happen very often. The benefits far outweigh this minor downside.
That seems quite worrisome. A bug rare and subtle that can become devastating. Something the Hindenburg captain might have said.
 Plus there are known ways to prevent this form of aliasing at 
 compile-time.
Not if you have globals and/or separate compilation.
The risk of aliasing-related bugs comes from the availability of pointers/references though, which will still be present in D with or without this optimization. People who want to rule out aliasing problems can just explicitly specify pass-by-value using `scope const` instead of `in`.

Let's not cripple the future of the language out of fear of pitfalls that are already present, and cannot be entirely removed as long as D remains a systems programming language.
Aug 21 2020
parent IGotD- <nise nise.com> writes:
On Friday, 21 August 2020 at 23:16:53 UTC, tsbockman wrote:
 The risk of aliasing-related bugs comes from the availability 
 of pointers/references though, which will still be present in D 
 with or without this optimization. People who want to rule out 
 aliasing problems can just explicitly specify pass-by-value 
 using `scope const` instead of `in`.

 Let's not cripple the future of the language out of fear of 
 pitfalls that are already present, and cannot be entirely 
 removed as long as D remains a systems programming language.
Most parameters can actually be 'const', which is also the case for 'in' parameters under the current definition. With const parameters, aliasing is no problem unless you do something with pointers further into the parameters. 'in' should be const by default, as it is today; if you need something else, use inout or ref. With a language like D it is up to the programmer to ensure there is no aliasing, much like with the 'restrict' qualifier in C. I'm in favor of the compiler detecting aliasing, but this can be improved gradually. The new 'in' qualifier is interesting enough not to throw it away because of the aliasing problems.
Aug 22 2020
prev sibling parent reply Araq <rumpf_a web.de> writes:
On Friday, 21 August 2020 at 19:21:37 UTC, Andrei Alexandrescu 
wrote:
 On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei 
 Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 [...]
This has been discussed a few times. If mutation is allowed, aliasing is a killer:

void fun(ref S a, const compiler_chooses S b) {
    ... mutate a, read b ...
}

S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the compiler chose pass by value vs. pass by reference.
True but in practice it doesn't happen very often. The benefits far outweigh this minor downside.
That seems quite worrisome. A bug rare and subtle that can become devastating. Something the Hindenburg captain might have said.
 Plus there are known ways to prevent this form of aliasing at 
 compile-time.
Not if you have globals and/or separate compilation.
Wrong. You can simply compile it to 'fun(x, copy(x))' if an alias analysis cannot disambiguate the locations. Since the alias analysis is restricted to the call site, separate compilation continues to work.
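A rough sketch of the defensive-copy idea, written out by hand in D. Everything here is an assumption for illustration: the wrapper `callFun` plays the role of the compiler at the call site, `copy` is a hypothetical helper, and a runtime address comparison stands in for the static alias analysis the post describes.

```d
import std.stdio : writeln;

struct S { int value; }

// Callee compiled separately; it receives b by reference.
int fun(ref S a, ref const S b)
{
    a.value += 1;    // mutate a
    return b.value;  // read b
}

// Hypothetical stand-in for copy(x) in the post.
S copy(ref const S s) { return s; }

// Hypothetical call-site lowering: if the two arguments cannot be shown
// to be distinct, pass a copy of b. A real compiler would decide this
// statically; the address check below is only a stand-in.
int callFun(ref S a, ref const S b)
{
    const(S)* pa = &a;
    if (pa is &b)
    {
        const S tmp = copy(b);
        return fun(a, tmp);
    }
    return fun(a, b);
}

void main()
{
    S x = S(1);
    writeln(callFun(x, x)); // 1: b was copied before a was mutated
    writeln(x.value);       // 2

    S p = S(1), q = S(5);
    writeln(callFun(p, q)); // 5: no aliasing, no copy needed
}
```

Only the caller changes, so `fun` itself can live in another compilation unit.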
Aug 21 2020
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/22/20 2:07 AM, Araq wrote:
 On Friday, 21 August 2020 at 19:21:37 UTC, Andrei Alexandrescu wrote:
 On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 [...]
This has been discussed a few times. If mutation is allowed, aliasing is a killer:

void fun(ref S a, const compiler_chooses S b) {
    ... mutate a, read b ...
}

S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the compiler chose pass by value vs. pass by reference.
True but in practice it doesn't happen very often. The benefits far outweigh this minor downside.
That seems quite worrisome. A bug rare and subtle that can become devastating. Something the Hindenburg captain might have said.
 Plus there are known ways to prevent this form of aliasing at 
 compile-time.
Not if you have globals and/or separate compilation.
Wrong. You can simply compile it to 'fun(x, copy(x))' if an alias analysis cannot disambiguate the locations. Since the alias analysis is restricted to the call site, separate compilation continues to work.
Fair enough, thanks.
Aug 22 2020
prev sibling parent Mathias LANG <geod24 gmail.com> writes:
On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu 
wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly ask for a 
 certain way (copy object, pointer/reference etc.).
This has been discussed a few times. If mutation is allowed, aliasing is a killer:

void fun(ref S a, const compiler_chooses S b) {
    ... mutate a, read b ...
}

S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the compiler chose pass by value vs. pass by reference.
Isn't that what Walter's OB system is supposed to address ?
Aug 20 2020
prev sibling next sibling parent reply WebFreak001 <d.forum webfreak.org> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 [...]
How does the ABI work for this with extern(C) and others? Will it mean ref or not ref, i.e. will it translate to a pointer or not? For example, in Win32 COM IDL files I often see [in] used for input-only parameters. Can we annotate parameters in D like that, without changing calling semantics, to document that they are input-only parameters and potentially allow optimizations?
Aug 21 2020
parent Jacob Carlborg <doob me.com> writes:
On Friday, 21 August 2020 at 09:37:37 UTC, WebFreak001 wrote:

 For example in Win32 COM IDL files I often see [in] used for 
 input only parameters, can we annotate parameters in D like 
 that without changing calling semantics to document that they 
 are input only parameters and potentially allow optimizations?
You can attach a UDA. It can't be called "in", but perhaps "input".

--
/Jacob Carlborg
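A minimal sketch of the UDA suggestion, assuming a compiler recent enough to accept UDAs on function parameters. The marker type `input` and the function `countNonZero` are hypothetical names chosen for illustration. The attribute is pure documentation here: it is not expected to change how the parameter is passed.

```d
import std.stdio : writeln;

// Hypothetical marker UDA, named after the suggestion in the post.
struct input {}

// The pointer is still passed as a plain C pointer; @input only documents
// that the function reads through it and does not write.
extern (C) size_t countNonZero(@input const(ubyte)* data, size_t len)
{
    size_t n = 0;
    foreach (i; 0 .. len)
        if (data[i] != 0)
            ++n;
    return n;
}

void main()
{
    ubyte[4] buf = [1, 0, 2, 0];
    writeln(countNonZero(buf.ptr, buf.length)); // 2
}
```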
Aug 21 2020
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]
I think that having `in ref` is preferable to introducing new rules. Everything else makes sense to me, but it's possible I'm missing something.
Aug 25 2020
next sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]
I think that having `in ref` is preferable to introducing new rules. Everything else makes sense to me, but it's possible I'm missing something.
Why is it preferable ? Genuine question here. I originally had in mind to support `in ref` as being the user forcing `ref` on the param. It just seemed *obvious*. However, after a lot of playing around with it, it wasn't that obvious anymore.
Aug 25 2020
next sibling parent Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 26 August 2020 at 04:15:21 UTC, Mathias LANG wrote:
 On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]
I think that having `in ref` is preferable to introducing new rules. Everything else makes sense to me, but it's possible I'm missing something.
Why is it preferable ? Genuine question here. I originally had in mind to support `in ref` as being the user forcing `ref` on the param. It just seemed *obvious*. However, after a lot of playing around with it, it wasn't that obvious anymore.
For me, more programmer choice and less magic.
Aug 26 2020
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 26 August 2020 at 04:15:21 UTC, Mathias LANG wrote:
 On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]
I think that having `in ref` is preferable to introducing new rules. Everything else makes sense to me, but it's possible I'm missing something.
Why is it preferable ? Genuine question here. I originally had in mind to support `in ref` as being the user forcing `ref` on the param. It just seemed *obvious*. However, after a lot of playing around with it, it wasn't that obvious anymore.
I think it's preferable because it gives more control to the programmer and there's less magic. And also that enregistering and all that probably belongs in the backend.
Aug 26 2020
parent Jacob Carlborg <doob me.com> writes:
On Wednesday, 26 August 2020 at 09:10:45 UTC, Atila Neves wrote:

 I think it's preferable because it gives more control to the 
 programmer and there's less magic. And also that enregistering 
 and all that probably belongs in the backend.
I like the idea of Herb Sutter's talk [1], where the programmer specifies the semantic behavior of the parameters instead of the mechanics of how to pass them. The programmer specifies `in`, `out`, `inout`, `move` or `forward`. Then the compiler figures out how to achieve the semantics in the most efficient way.

There's nothing in Mathias' proposal that cannot be accomplished today with other attributes? So if you want more control, don't use `in`.

[1] https://youtu.be/qx22oxlQmKc?t=1258

--
/Jacob Carlborg
Aug 26 2020
prev sibling parent IGotD- <nise nise.com> writes:
On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 I think that having `in ref` is preferable to introducing new 
 rules. Everything else makes sense to me, but it's possible I'm 
 missing something.
I think the proposal from Mathias LANG means that 'in' shall not be accompanied by any other qualifier. How the parameter is passed is up to the compiler, whether by value, by reference or something else. If you explicitly want a reference, you use 'ref' just like today.
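As a small sketch of that division of labor: `in` marks a read-only input and leaves the passing mechanism open, while `ref` is spelled out when the callee must operate on the caller's object. The names `Big`, `countSet`, `clear` and `makeBig` are hypothetical. The code should compile with or without `-preview=in`; which mechanism the compiler picks for `countSet` is not observable from this example.

```d
import std.stdio : writeln;

struct Big { ubyte[64] hash; }

// Read-only input: the programmer states intent, not mechanics.
size_t countSet(in Big b)
{
    size_t n;
    foreach (x; b.hash)
        if (x)
            ++n;
    return n;
}

// Explicit ref: the callee modifies the caller's object.
void clear(ref Big b)
{
    b.hash[] = 0;
}

Big makeBig()
{
    Big b;
    b.hash[0] = 1;
    b.hash[63] = 7;
    return b;
}

void main()
{
    Big v = makeBig();
    writeln(countSet(v));         // lvalue argument
    writeln(countSet(makeBig())); // rvalue argument is accepted as well
    clear(v);
    writeln(countSet(v));
}
```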
Aug 26 2020