digitalmars.D - `in` parameters made useful

Mathias LANG (96/96) Jul 31 2020 Hi everyone,

Adam D. Ruppe (5/10) Jul 31 2020 i like this too

Per Nordlöw (4/14) Aug 01 2020 I agree. This is the D way. More simplicity via more inference.

Kagamin (4/9) Aug 01 2020 You mean if it fits in two registers, it's still passed by
Rainer Schuetze (9/19) Aug 01 2020 Please note that many C/C++-ABIs already define similar rules for

Mathias LANG (8/28) Aug 04 2020 Do you have a link ? I did some research beforehand, but all I

Rainer Schuetze (6/20) Aug 04 2020 Well, this already says as much for non-POD data IIUC.

tsbockman (29/40) Aug 04 2020 First off, this is a great change and I am excited to be able to

Mathias LANG (13/23) Aug 04 2020 Indeed. The current rules were put there as a way to get the ball

Fynn Schröder (9/16) Aug 05 2020 I've literally yesterday written some new code with `const scope

James Blachly (10/20) Aug 23 2020 Sorry to interject d.D.learn material in this thread, but where is

Kagamin (13/14) Aug 20 2020 1. Deprecation of `in ref` makes no sense. Why? I assume it's due

Mathias LANG (30/45) Aug 20 2020 I'm not in the business of deprecating something to accommodate

Kagamin (12/41) Aug 21 2020 It needlessly degrades language and breaks code and shouldn't be

Mathias LANG (15/29) Aug 21 2020 The first example is pretty good: We probably need to specify the

Kagamin (12/20) Aug 25 2020 The backend doesn't know how to load the data. It matters to the

Mathias LANG (4/18) Aug 25 2020 I assume you imply that it's less optimal because it's taking the

Kagamin (10/30) Aug 27 2020 Because it has an extra write to memory, pushing the parameter

IGotD- (14/25) Aug 20 2020 This is interesting on a general level as well and true for

Araq (5/12) Aug 20 2020 Nim does this and I took the feature from Ada. You can override

Petar Kirov [ZombineDev] (11/24) Aug 21 2020 Interesting, how does it work? As mentioned previously, the issue

Araq (6/16) Aug 21 2020 It's built into cctypes indeed. The logic is mostly "pass by

Andrei Alexandrescu (10/14) Aug 20 2020 This has been discussed a few times. If mutation is allowed, aliasing is...

Araq (5/19) Aug 20 2020 True but in practice it doesn't happen very often. The benefits

Andrei Alexandrescu (4/28) Aug 21 2020 That seems quite worrisome. A bug rare and subtle that can become

tsbockman (10/33) Aug 21 2020 The risk of aliasing-related bugs comes from the availability of

IGotD- (12/20) Aug 22 2020 Most of the parameters can actually be 'const', which also the

Araq (6/34) Aug 21 2020 Wrong. You can simply compile it to 'fun(x, copy(x))' if an alias

Andrei Alexandrescu (2/35) Aug 22 2020 Fair enough, thanks.

Mathias LANG (3/17) Aug 20 2020 Isn't that what Walter's OB system is supposed to address ?

WebFreak001 (7/8) Aug 21 2020 how does the ABI work for this for extern(C) and others? Will it

Jacob Carlborg (5/9) Aug 21 2020 You can attach a UDA. It can't be called "in", but perhaps

Atila Neves (4/10) Aug 25 2020 I think that having `in ref` is preferable than introducing new

Mathias LANG (6/17) Aug 25 2020 Why is it preferable ? Genuine question here.

Atila Neves (2/20) Aug 26 2020 For me, more programmer choice and less magic.
Atila Neves (4/22) Aug 26 2020 I think it's preferable because it gives more control to the

Jacob Carlborg (12/15) Aug 26 2020 I like the idea of Herb Sutter's talk [1], where the programmer

IGotD- (6/9) Aug 26 2020 I think that the proposal from Mathias LANG meant that 'in' shall

Mathias LANG <geod24 gmail.com> writes:

Hi everyone,
For a long time I've been pretty annoyed by the state of `in` 
parameters.
In case it needs any clarification, I'm talking about what's between 
the asterisks (*) here: `void foo (*in* char[] arg)`.

While they always seemed like a good idea, they never really 
added anything: `in` was supposed to be `const scope`, then, when 
the time came to make `scope` actually do something (read: 
DIP1000), `scope` was removed from `in`!

This was re-added last release (DMD 2.092) where the 
`-preview=in` switch was added 
(https://dlang.org/changelog/2.092.0.html#preview-in). So now, if 
you want `in` to mean what it's documented to be, you need to 
throw in both `-preview=dip1000` and `-preview=in`.

But then... That still feels incomplete. I deal with a lot of C++ 
interop code, and we can't use `in` without using `ref`, because 
otherwise we trigger copy constructors / destructors of 
aggregates we have no control over. We also have some value types 
which can get pretty big, so we don't want to pass those by 
value, either. So easy solution, add `ref` ? But then, we cannot 
pass rvalues. A real-world example of this is 
`doSomething(myData.getHash())` where `getHash` returns a 
`ubyte[64]`.
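
To make the `ref` limitation concrete, here is a minimal sketch. `Data`, the body of `getHash`, and the signature of `doSomething` are hypothetical stand-ins, since the post does not show the real code:

```d
// Hypothetical stand-in for the real type; only the names `getHash` and
// `doSomething` come from the post.
struct Data
{
    ubyte[64] getHash() const
    {
        ubyte[64] hash;
        hash[0] = 42;
        return hash;
    }
}

// Taking the hash by `ref` avoids copying 64 bytes on every call.
ubyte doSomething(scope const ref ubyte[64] hash)
{
    return hash[0];
}

unittest
{
    Data myData;
    // doSomething(myData.getHash()); // rejected: an rvalue cannot bind to `ref`
    auto hash = myData.getHash();      // workaround: name the temporary first
    assert(doSomething(hash) == 42);
}
```

The commented-out call is the one that fails without `-preview=rvaluerefparam`; naming the temporary is the workaround we would like to avoid.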

Luckily we have a `-preview=rvaluerefparam` switch, which should 
do what I want, right ? Well, as I said multiple times on this 
forum, it's so utterly broken it's not even funny:
- https://issues.dlang.org/show_bug.cgi?id=20704
- https://issues.dlang.org/show_bug.cgi?id=20705 (I'm sorry, 
WHAT?)
- https://issues.dlang.org/show_bug.cgi?id=20706

Because of 20705, that switch is completely unusable for any real 
world application.
There are alternatives to this (which we are using), such as 
using `auto ref`. But it requires using templates, which we 
cannot do with delegates, or virtual methods.
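
A small sketch of the `auto ref` alternative, with hypothetical names (`passedByRef`, `makeHash`). It only works because `passedByRef` is a template, which is exactly why it cannot be used for delegates or virtual methods:

```d
// Hypothetical helper that produces an rvalue hash.
ubyte[64] makeHash()
{
    ubyte[64] hash;
    return hash;
}

// The empty template parameter list `()` makes this a template, which
// `auto ref` requires. Each call site picks by-ref or by-value.
bool passedByRef()(auto ref scope const ubyte[64] hash)
{
    return __traits(isRef, hash);
}

unittest
{
    ubyte[64] lvalue;
    assert(passedByRef(lvalue));      // lvalues bind by reference
    assert(!passedByRef(makeHash())); // rvalues are passed by value
}
```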

Now I don't really like to rant without having a solution to 
offer. And it turns out, that's the whole motivation for this 
post. I have a PR that solves *all* those problems at once. All 
it needs is a bit of attention / review / feedback!

The PR in question is here: 
https://github.com/dlang/dmd/pull/11000

What does it do ?
A.0) It fixes `in` to be an actual storage class, not something 
that is lowered almost immediately.
    This was necessary for the implementation to work, but has two 
nice side effects:
      1) it fixes error messages (currently `void foo(in int)` 
will display as `void foo(const(int))` in error messages);
      2) it fixes header generation (`.di` files) so that `in` is 
kept instead of seeing `const` or `scope const`, depending on 
`-preview=in`;
    I think this change has value in itself, so I submitted it as 
a separate PR (https://github.com/dlang/dmd/pull/11474), which 
itself needs a tiny adjustment in Phobos 
(https://github.com/dlang/phobos/pull/7570).

A.1) It gives a mangling to `in`: This is necessary to avoid some 
ambiguity. The main two user-visible side effects will be that 
older debuggers won't be able to demangle `in`, and that, once we 
update druntime, stack traces will show the correct signature for 
functions using `in` (currently they suffer from the same bug as 
the error message / header generation). This is also part of the 
aforementioned PR.

B) It makes `in` take the effect of `ref` when it makes sense. It 
always passes something by `ref` if the type has elaborate 
construction / destruction (postblit, copy constructor, 
destructors). If the type doesn't have any of those it is only 
passed by `ref` if it cannot be passed in register. Some types 
(dynamic arrays, probably AA in the future) are not affected to 
allow for covariance (more on that later). The heuristics there 
still need some small improvements, e.g. w.r.t. floating points 
(currently the heuristic is based on size, and not asking the 
backend) and small struct slicing, but that should not affect 
correctness.
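
The rule in B) can be approximated in library code. This is only a sketch of the idea, not the code in the PR: `likelyPassedByRef`, `assumedRegisterBytes`, and `WithDestructor` are hypothetical names, and the size threshold is an assumption that a real implementation would replace by asking the backend.

```d
import std.traits : hasElaborateCopyConstructor, hasElaborateDestructor,
    isDynamicArray;

// Assumption: anything larger than one pointer-sized register goes by ref.
enum size_t assumedRegisterBytes = size_t.sizeof;

// Rough, size-based approximation of "would `in` behave like `ref` for T?".
enum bool likelyPassedByRef(T) =
    hasElaborateCopyConstructor!T      // postblit or copy constructor
    || hasElaborateDestructor!T        // destructor
    || (!isDynamicArray!T              // slices stay by value (covariance)
        && T.sizeof > assumedRegisterBytes);

// Hypothetical type with an elaborate destructor.
struct WithDestructor
{
    ~this() {}
}

static assert(!likelyPassedByRef!int);
static assert(!likelyPassedByRef!(char[]));
static assert(likelyPassedByRef!(ubyte[64]));
static assert(likelyPassedByRef!WithDestructor);
```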

C) It implements covariance rules: if you have a `void 
toString(scope void delegate(in char[]) sink)` method, you can 
pass it `void writeToScreen(const scope char[])`. If you have 
`void output(scope void delegate(in ubyte[64]))` you can pass it 
`void saveHash(const scope ref ubyte[64])`. Simple stuff.
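
A sketch of the first covariance case, assuming a hypothetical `Point` type with a sink-style `toString` and a hypothetical `Screen` type that collects the output. Only the parameter types mirror the example above:

```d
// Hypothetical type using the sink-style `toString` from the example.
struct Point
{
    int x, y;

    void toString(scope void delegate(in char[]) sink) const
    {
        sink("Point");
    }
}

// Hypothetical receiver whose method spells the parameter out as
// `const scope char[]` instead of `in char[]`.
struct Screen
{
    char[] contents;

    void writeToScreen(const scope char[] text)
    {
        contents ~= text; // copies the characters, so nothing escapes
    }
}

string render()
{
    Screen screen;
    Point(1, 2).toString(&screen.writeToScreen);
    return screen.contents.idup;
}

unittest
{
    assert(render() == "Point");
}
```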

D) It allows passing rvalues to `in`. Because we know it's 
`scope`, so it cannot be escaped (allegedly), and it's `const`, 
so it cannot be modified, it's only logical that you can give it 
rvalues.
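
As a sketch of what D) means at the call site (`makeHash` and `firstByte` are hypothetical names): with `-preview=in` the compiler is free to pass the array by reference, and without the switch `in` is a plain `const` by-value parameter, so the snippet should compile either way.

```d
// Hypothetical producer of an rvalue hash.
ubyte[64] makeHash()
{
    ubyte[64] hash;
    hash[0] = 42;
    return hash;
}

// `in` alone: read-only, and the caller never has to care about `ref`.
ubyte firstByte(in ubyte[64] hash)
{
    return hash[0];
}

unittest
{
    assert(firstByte(makeHash()) == 42); // rvalue, no named temporary needed
    auto named = makeHash();
    assert(firstByte(named) == 42);      // lvalues work the same way
}
```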

Interestingly, benjones pointed out in the PR that this is 
similar to one of Herb Sutter's proposals for C++: 
https://youtu.be/qx22oxlQmKc?t=1258

I hope this will generate interest with people hitting the same 
problem. I tried this with my project (which depends on ~10 
libraries including Vibe.d and does C++ interop) and things just 
worked when changing `scope const auto ref` to `in`, and clearing 
up a few places where `in` parameters were escaped, or there was 
both an `in ref` and an `in` overload.

Last, but not least, if this gets accepted it would pave the way 
for another awesome change, having `checkaction=context` the 
default for D.
If you look at 
https://github.com/dlang/druntime/blob/104ac712331e4d3573fc277084334a528b5dadb1/src/cor
/internal/dassert.d you'll find that sweet `auto ref const scope` everywhere.

Jul 31 2020

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense.

i like it

 D) It allows to pass rvalues to `in`. Because we know it's 
 `scope`, so it cannot be escaped (allegedly), and it's `const`, 
 so it cannot be modified, it's only logical that you can give 
 it rvalues.

i like this too


I've argued before the compiler should be allowed to optimize 
this in `in` case anyway so yeah you have my support here.

Jul 31 2020

Per Nordlöw <per.nordlow gmail.com> writes:

On Friday, 31 July 2020 at 22:01:06 UTC, Adam D. Ruppe wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense.

 i like it

 D) It allows to pass rvalues to `in`. Because we know it's 
 `scope`, so it cannot be escaped (allegedly), and it's 
 `const`, so it cannot be modified, it's only logical that you 
 can give it rvalues.

 i like this too


 I've argued before the compiler should be allowed to optimize 
 this in `in` case anyway so yeah you have my support here.

I agree. This is the D way. More simplicity via more inference.

Note that this brings the meaning of the `in` parameter qualifier 
very close (if not equal) to its meaning in Ada.

Aug 01 2020

Kagamin <spam here.lot> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register.

You mean if it fits in two registers, it's still passed by 
reference? 16 bytes is the size of uuid and is better passed by 
value.

Aug 01 2020

Rainer Schuetze <r.sagitario gmx.de> writes:

I like most of your proposal, but

On 31/07/2020 23:49, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. It always
 pass something by `ref` if the type has elaborate construction /
 destruction (postblit, copy constructor, destructors). If the type
 doesn't have any of those it is only passed by `ref` if it cannot be
 passed in register. Some types (dynamic arrays, probably AA in the
 future) are not affected to allow for covariance (more on that later).
 The heuristics there still need some small improvements, e.g. w.r.t.
 floating points (currently the heuristic is based on size, and not
 asking the backend) and small struct slicing, but that should not affect
 correctness.

Please note that many C/C++-ABIs already define similar rules for
passing function arguments by value (referencing a copy on the stack).
It might not be the best idea to stack two similar, but maybe slightly
conflicting rule sets.

Maybe we can leverage that and define that if the ABI uses a reference
for an `in`-value, the compiler may/must elide an extra copy. That
avoids having to define our own rule set.

Aug 01 2020

Mathias LANG <geod24 gmail.com> writes:

On Saturday, 1 August 2020 at 07:48:10 UTC, Rainer Schuetze wrote:
 I like most of your proposal, but

 On 31/07/2020 23:49, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register. Some types 
 (dynamic arrays, probably AA in the future) are not affected 
 to allow for covariance (more on that later). The heuristics 
 there still need some small improvements, e.g. w.r.t. floating 
 points (currently the heuristic is based on size, and not 
 asking the backend) and small struct slicing, but that should 
 not affect correctness.

 Please note that many C/C++-ABIs already define similar rules 
 for passing function arguments by value (referencing a copy on 
 the stack). It might not be the best idea to stack two similar, 
 but maybe slightly conflicting rule sets.

 Maybe we can leverage that and define that if the ABI uses a 
 reference for an `in`-value, the compiler may/must elide an 
 extra copy. That avoids having to define our own rule set.

Do you have a link ? I did some research beforehand, but all I 
could find was about NRVO and throwing exceptions, nothing about 
actually promoting values to references.

Itanium C++ ABI doesn't have anything: 
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#value-parameter
Nor does MS: 
https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#parameter-passing

Aug 04 2020

Rainer Schuetze <r.sagitario gmx.de> writes:

On 04/08/2020 11:35, Mathias LANG wrote:
 On Saturday, 1 August 2020 at 07:48:10 UTC, Rainer Schuetze wrote:
 Maybe we can leverage that and define that if the ABI uses a reference
 for an `in`-value, the compiler may/must elide an extra copy. That
 avoids having to define our own rule set.

 
 Do you have a link ? I did some research beforehand, but all I could
 find was about NRVO and throwing exception, nothing about actually
 promoting values to references.
 
 Itanium C++ ABI doesn't have anything:
 https://itanium-cxx-abi.github.io/cxx-abi/abi.html#value-parameter

Well, this already says as much for non-POD data IIUC.

The System V ABI for C that is used for PODs doesn't seem to use
references, though.


 Nor does MS:
 https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#parameter-passing
 

This says for non-register-sized data: "Structs or unions of other sizes
are passed as a pointer to memory allocated by the caller."

Aug 04 2020

tsbockman <thomas.bockman gmail.com> writes:

First off, this is a great change and I am excited to be able to 
use it in my own projects. Thanks for championing this.

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register. Some types 
 (dynamic arrays, probably AA in the future) are not affected to 
 allow for covariance (more on that later). The heuristics there 
 still need some small improvements, e.g. w.r.t. floating points 
 (currently the heuristic is based on size, and not asking the 
 backend) and small struct slicing, but that should not affect 
 correctness.

This optimization should be implemented by querying the backend, 
not calculated from scratch in the frontend, which is redundant 
and error-prone; the rules in the current PR are not accurate.

The correct rules are complex, platform-dependent, and depend on 
the types of each function's full parameter list as a whole, not 
just each parameter individually. I haven't found anywhere that 
documents them, but by experimentation with LDC I have discovered 
the following:

1) The size limit for most types to be passed in registers in 
x86_64 is twice as large as the PR's threshold, at least on LDC. 
(Array slices are passed by register because of their size and 
member types, and do not need to be special-cased. A custom 
slice-like struct with a pointer and a size member will be passed 
by register, too.)

2) I say *most* types because there are exceptions; __vector 
types, and *sometimes* structs that transitively contain only a 
single __vector member (like 3d homogenous coordinates in 
graphics programming) are also passed via register, although they 
may be 8 times the size of a general purpose register when using 
AVX2 in a 32-bit program, and probably even larger on some other 
platform.

3) There are limits to how many arguments can be passed via 
registers. I say "limits", plural, because different data types 
may consume different types of registers; for example on x86, 
`int` uses general purpose registers, whereas `float` uses SIMD 
registers. These limits are architecture-dependent.
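
To illustrate point 1 only: a custom slice-like struct has the same size as a built-in slice, which is why a size-based rule treats them alike. `Slice` is a hypothetical type, and the snippet says nothing about which registers a particular backend actually uses.

```d
// Hypothetical slice-like struct: a pointer plus a size member.
struct Slice(T)
{
    T* ptr;
    size_t length;
}

// Same footprint as a built-in slice: two pointer-sized words.
static assert(Slice!int.sizeof == (int[]).sizeof);
static assert(Slice!int.sizeof == 2 * size_t.sizeof);
```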

Aug 04 2020

Mathias LANG <geod24 gmail.com> writes:

On Tuesday, 4 August 2020 at 23:18:56 UTC, tsbockman wrote:
 This optimization should be implemented by querying the 
 backend, not calculated from scratch in the frontend, which is 
 redundant and error-prone; the rules in the current PR are not 
 accurate.

Indeed. The current rules were put there as a way to get the ball 
rolling, so to say. My current focus is to get things to compile 
and pass the tests on Buildkite, then optimize the rules.
The thing that is not going to change is that types that need 
elaborate copy or destruction, and types that are not copyable, 
will be passed by ref. Additionally, I want to keep covariance 
for array types, which requires them to be passed by value 
(although it can be done in registers). The rest, I don't mind 
changing it.

 The correct rules are complex, platform-dependent, and depend 
 on the types of each function's full parameter list as a whole, 
 not just each parameter individually. I haven't found anywhere 
 that documents them, but by experimentation with LDC I have 
 discovered the following:

 [...]

Thanks for the feedback. I'll definitely incorporate it (and 
Rainer's) into the PR soon-ish, most likely via a call to a 
backend hook, as is currently done for NRVO.

Aug 04 2020

Fynn Schröder <fynnos live.com> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 [...]
 C) It implements covariance rules
 [...]
 D) It allows to pass rvalues to `in`.

This sounds so great! Thank you for improving `in`!

 I hope this will generate interest with people hitting the same 
 problem.

I've literally yesterday written some new code with `const scope 
ref` in almost every function to pass large, complex structs. 
Occasionally, I had to store rvalues temporarily to pass as 
lvalues (non-templated code). I would rather simply put `in` on 
those parameters :-) It's a lot easier to grasp function 
signatures only using `in` and `out` on parameters (and their 
effect/purpose being immediately obvious to new D programmers!)

Aug 05 2020

James Blachly <james.blachly gmail.com> writes:

On 8/5/20 3:27 AM, Fynn Schröder wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 I hope this will generate interest with people hitting the same problem.

 
 I've literally yesterday written some new code with `const scope ref` in 
 almost every function to pass large, complex structs. Occasionally, I 
 had to store rvalues temporarily to pass as lvalues (non-templated 
 code). I would rather simply put `in` on those parameters :-) It's a lot 
 easier to grasp function signatures only using `in` and `out` on 
 parameters (and their effect/purpose being immediately obvious to new D 
 programmers!)

Sorry to interject d.D.learn material in this thread, but where is 
"scope ref" documented? I found 
https://dlang.org/spec/function.html#scope-parameters which discusses 
the use of `scope` with ref type parameters, but the example given is 
pointer-based. Is it correct that `scope ref T` behaves the same as 
`scope T*` ?

DIP1000 shows as "superseded"

I am glad D is iterating and improving on safety, but I have found that 
documentation may not reflect recent changes in this area.

Aug 23 2020

Kagamin <spam here.lot> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 https://github.com/dlang/dmd/pull/11000

1. Deprecation of `in ref` makes no sense. Why? I assume it's due 
to a bug in the proposed change. How should the compiler know 
that the argument should be passed by ref? It doesn't necessarily 
know how to load the argument, it may have alignment and 
synchronization requirements. And more importantly, how can the 
programmer know whether the argument is passed by ref, now 
that it varies by platform?
2. Dependence on calling convention. AIUI, ref semantics depend on 
parameter position?
3. Runtime hooks don't go through semantic checks. Is this a 
theoretical concern or did you introduce some new behavior that 
causes problem with this?

Aug 20 2020

Mathias LANG <geod24 gmail.com> writes:

On Thursday, 20 August 2020 at 15:59:24 UTC, Kagamin wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 https://github.com/dlang/dmd/pull/11000

 1. Deprecation of `in ref` makes no sense. Why? I assume it's 
 due to a bug in the proposed change.

I'm not in the business of deprecating something to accommodate 
a broken implementation, no. The implementation originally 
allowed `in ref`, but after some tinkering and looking at 
people's usages, my opinion is that it would be better to just 
allow `in`.

 How the compiler should know that the argument should be passed 
 by ref?
 I doesn't necessarily know how to load the argument, it may 
 have alignment and synchronization requirements.

As explained in the PR, and in the changelog, the compiler knows 
by inspecting the type. If the type has elaborate copy or 
destruction, IOW, if copying it would have side effects, it will 
always pass it by ref to avoid those side effects.
Otherwise, it asks the backend. The current rule in DMD is for 
types that are over twice the size of a register to be passed by 
ref (or real).

I can't think of a situation where the compiler doesn't know how 
to load the argument. If you're talking about opaque types, those 
are rejected.

 And more importantly how the programmer can know whether the 
 argument is passed by ref, now that it varies by platform?

I don't understand why it would be "more important". The point of 
`in` parameter is that it does the right thing for parameters 
which are read-only and won't escape the scope of the function. 
It doesn't matter to the user whether your parameter is `ref` or 
not if it is `scope const`, because you can't modify it anyway. 
It only matters if passing it by value would be expensive (e.g. 
large static array) or have side effects (e.g. a destructor).


 2. Dependence on calling convention. AIU ref semantics depends 
 on parameter position?

Yes. Originally it didn't, but that was the main feedback I got, 
that it should be done at the function level instead of the 
parameter (type) level.

 3. Runtime hooks don't go through semantic checks. Is this a 
 theoretical concern or did you introduce some new behavior that 
 causes problem with this?

Just to be clear: When I said "runtime hook", I meant "the AST 
which the compiler generates to call C functions in druntime". It 
generates the equivalent of a prototype and calls that. It's not a 
big deal, and I found a way around.

Aug 20 2020

Kagamin <spam here.lot> writes:

On Thursday, 20 August 2020 at 17:25:43 UTC, Mathias LANG wrote:
 On Thursday, 20 August 2020 at 15:59:24 UTC, Kagamin wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 https://github.com/dlang/dmd/pull/11000

 1. Deprecation of `in ref` makes no sense. Why? I assume it's 
 due to a bug in the proposed change.

 I'm not in the business of deprecating something to accommodate 
 for a broken implementation, no. The implementation originally 
 allowed `in ref`, but after some tinkering and looking at 
 people's usages, my opinion is that it would be better to just 
 allow `in`.

It needlessly degrades the language and breaks code and shouldn't be 
done. Didn't you write this pull because you believe `in ref` is 
useful?

 I can't think of a situation where the compiler doesn't know 
 how to load the argument. If you're talking about opaque types, 
 those are rejected.

 And more importantly how the programmer can know whether the 
 argument is passed by ref, now that it varies by platform?

 I don't understand why it would be "more important". The point 
 of `in` parameter is that it does the right thing for 
 parameters which are read-only and won't escape the scope of 
 the function. It doesn't matter to the user whether your 
 parameter is `ref` or not if it is `scope const`, because you 
 can't modify it anyway. It only matters if passing it by value 
 would be expensive (e.g. large static array) or have side 
 effects (e.g. a destructor).

I mean things like
int atomicLoad(in ref shared int n);
int loadAligned(in ref byte[4] n);
When the argument should be passed by ref by programmer's intent 
and should be communicated to the compiler, because the compiler 
isn't that smart.

 2. Dependence on calling convention. AIU ref semantics depends 
 on parameter position?

 Yes. Originally didn't, but that was the main feedback I got, 
 that it should be done at the function level instead of the 
 parameter (type) level.

Doesn't this defeat your optimization when passing by value is 
expensive?

Aug 21 2020

Mathias LANG <geod24 gmail.com> writes:

On Friday, 21 August 2020 at 09:48:16 UTC, Kagamin wrote:
 I mean things like
 int atomicLoad(in ref shared int n);
 int loadAligned(in ref byte[4] n);
 When the argument should be passed by ref by programmer's 
 intent and should be communicated to the compiler, because the 
 compiler isn't that smart.

The first example is pretty good: We probably need to specify the 
interaction with `shared`. AFAICS it boils down to "when do we 
want to read a `shared` value ?".
If we pass by value, it means the input parameter will only have 
a single value, while if we pass by ref, a function can "listen" 
to changes.
I don't really have an answer for this at the moment, I would 
need to try out some options before I make up my mind.
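
To show the difference I mean, here is a sketch without `shared` (all names are hypothetical): the by-ref version observes a change made while the function runs, the by-value version keeps its snapshot.

```d
// Reads `value` after the caller-supplied callback has run.
int readByRef(const ref int value, scope void delegate() mutate)
{
    mutate();
    return value; // sees whatever the callback wrote
}

int readByValue(const int value, scope void delegate() mutate)
{
    mutate();
    return value; // still the copy made at the call site
}

unittest
{
    int counter = 1;
    assert(readByRef(counter, () { counter = 2; }) == 2);

    counter = 1;
    assert(readByValue(counter, () { counter = 2; }) == 1);
}
```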

The second example is pretty simple: the backend will decide 
whether to pass it by ref or not. Since it's a small type, it 
might make more sense to pass it in registers. Whether or not 
it's ref does not matter to the programmer, because the 
programmer cannot change the input anyway, only read it.

 2. Dependence on calling convention. AIU ref semantics 
 depends on parameter position?

 Yes. Originally didn't, but that was the main feedback I got, 
 that it should be done at the function level instead of the 
 parameter (type) level.

 Doesn't this defeat your optimization when passing by value is 
 expensive?

I don't see how ?

Aug 21 2020

Kagamin <spam here.lot> writes:

On Friday, 21 August 2020 at 18:23:08 UTC, Mathias LANG wrote:
 The second example is pretty simple: the backend will decide 
 whether to pass it by ref or not. Since it's a small type, it 
 might make more sense to pass it in registers. Whether or not 
 it's ref does not matter to the programmer, because the 
 programmer cannot change the input anyway, only read it.

The backend doesn't know how to load the data. It matters to the 
programmer, because it affects performance.

Another example:
void log(in ref int n)
{
   write(fd, &n, n.sizeof);
}
If the argument here is passed by value, it will need to fiddle 
with stack and the code will be less optimal.
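
A sketch of the observable difference, using hypothetical helper names: with `ref` the callee sees the caller's address, while a by-value parameter has to live somewhere of its own before its address can be taken.

```d
// True when `n` refers to the very object the caller owns.
bool seesCallerAddress(const ref int n, const(int)* caller)
{
    return &n is caller;
}

// The by-value parameter is a separate object with its own address.
bool seesCallerAddressByValue(const int n, const(int)* caller)
{
    return &n is caller;
}

unittest
{
    int x = 7;
    assert(seesCallerAddress(x, &x));
    assert(!seesCallerAddressByValue(x, &x));
}
```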

 Doesn't this defeat your optimization when passing by value is 
 expensive?

 I don't see how ?

If a non-POD object is passed by value, it will have extra calls 
to postblit and destructor.

Aug 25 2020

Mathias LANG <geod24 gmail.com> writes:

On Tuesday, 25 August 2020 at 12:09:45 UTC, Kagamin wrote:
 The backend doesn't know how to load the data. It matters to 
 the programmer, because it affects performance.

 Another example:
 void log(in ref int n)
 {
   write(fd, &n, n.sizeof);
 }
 If the argument here is passed by value, it will need to fiddle 
 with stack and the code will be less optimal.

I assume you imply that it's less optimal because it's taking the 
address of it ?


 If a non-POD object is passed by value, it will have extra 
 calls to postblit and destructor.

https://github.com/dlang/dmd/blob/master/changelog/preview-in.dd

 - If the type has an elaborate copy or destruction (postblit, 
 copy constructor, destructor),
 the type is always passed by reference.
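
The cost this rule avoids can be sketched with a hypothetical `Tracked` struct that counts its postblit calls. The explicit `const ref` below stands in for what `in` would do for such a type under `-preview=in`:

```d
// Hypothetical non-POD type that counts how often it is copied.
struct Tracked
{
    static int copies;
    int payload;

    this(this) { ++copies; } // postblit runs on every copy
}

int byValue(Tracked t) { return t.payload; }
int byRef(const ref Tracked t) { return t.payload; }

unittest
{
    Tracked.copies = 0;
    Tracked t;

    byValue(t);
    assert(Tracked.copies == 1); // passing by value copied the struct

    byRef(t);
    assert(Tracked.copies == 1); // passing by reference did not
}
```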

Aug 25 2020

Kagamin <spam here.lot> writes:

On Tuesday, 25 August 2020 at 16:05:51 UTC, Mathias LANG wrote:
 On Tuesday, 25 August 2020 at 12:09:45 UTC, Kagamin wrote:
 The backend doesn't know how to load the data. It matters to 
 the programmer, because it affects performance.

 Another example:
 void log(in ref int n)
 {
   write(fd, &n, n.sizeof);
 }
 If the argument here is passed by value, it will need to 
 fiddle with stack and the code will be less optimal.

 I assume you imply that it's less optimal because it's taking 
 the address of it ?

Because it has an extra write to memory, pushing the parameter 
passed by value to stack to take its address. In general when the 
parameter is passed by reference somewhere, like the refactoring 
you did in phobos replacing `in ref` with `scope const ref` - the 
parameter can be passed to one of those functions that have 
precise ref semantics. I also wonder how they interact with auto 
ref.

 If a non-POD object is passed by value, it will have extra 
 calls to postblit and destructor.

 https://github.com/dlang/dmd/blob/master/changelog/preview-in.dd

 - If the type has an elaborate copy or destruction (postblit, 
 copy constructor, destructor),
 the type is always passed by reference.


If ref semantics doesn't depend on parameter position, then it's 
fine, but you say it depends.

Aug 27 2020

IGotD- <nise nise.com> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 B) It makes `in` take the effect of `ref` when it makes sense. 
 It always pass something by `ref` if the type has elaborate 
 construction / destruction (postblit, copy constructor, 
 destructors). If the type doesn't have any of those it is only 
 passed by `ref` if it cannot be passed in register. Some types 
 (dynamic arrays, probably AA in the future) are not affected to 
 allow for covariance (more on that later). The heuristics there 
 still need some small improvements, e.g. w.r.t. floating points 
 (currently the heuristic is based on size, and not asking the 
 backend) and small struct slicing, but that should not affect 
 correctness.

This is interesting on a general level as well and true for 
several programming languages. Let the compiler optimize the 
parameter passing unless the programmer explicitly asks for a 
certain way (copy object, pointer/reference etc.). This is very 
unusual and if you have a language that optimizes the parameter 
passing by default, please mention it because it would be 
interesting.

I'm all for 'in' being used as an "optimized const 
parameter" where the compiler optimizes as it wants. However, the 
question is whether we want to override the default behaviour on a 
per-object basis. As mentioned, you might want to have another default 
behaviour for arrays but that can be true for other storage 
structures as well so this should really be user defined.

Aug 20 2020

Araq <rumpf_a web.de> writes:

On Thursday, 20 August 2020 at 17:31:17 UTC, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly asks for a 
 certain way (copy object, pointer/reference etc.). This is very 
 unusual and if you have a language that optimizes the parameter 
 passing by default, please mention it because it would be 
 interesting.

Nim does this and I took the feature from Ada. You can override 
the behavior with pragmas but I've only seen that done for C 
interop, not for optimizations as the compiler always seems to 
get it right.

Aug 20 2020

Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:

On Thursday, 20 August 2020 at 20:29:31 UTC, Araq wrote:
 On Thursday, 20 August 2020 at 17:31:17 UTC, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly asks for a 
 certain way (copy object, pointer/reference etc.). This is 
 very unusual and if you have a language that optimizes the 
 parameter passing by default, please mention it because it 
 would be interesting.

 Nim does this and I took the feature from Ada. You can override 
 the behavior with pragmas but I've only seen that done for C 
 interop, not for optimizations as the compiler always seems to 
 get it right.

Interesting, how does it work? As mentioned previously, the issue 
with choosing by-val vs by-ref passing solely based on the type 
is that it doesn't take into consideration the ABI and how many 
registers are available. E.g. if you have more than X parameters 
some will need to be spilled on the stack anyway (even if they 
can fit in a GPR).

Can the Nim compiler (I'm guessing `extccomp`) query the C/C++ 
compiler backend for e.g. the number of registers available for 
parameters? Or do you have this logic built into the frontend 
(say `ccgtypes`)?

Aug 21 2020

Araq <rumpf_a web.de> writes:

On Friday, 21 August 2020 at 13:59:30 UTC, Petar Kirov 
[ZombineDev] wrote:
 Interesting, how does it work? As mentioned previously, the 
 issue with choosing by-val vs by-ref passing solely based on 
 the type is that it doesn't take into consideration the ABI and 
 how many registers are available. E.g. if you have more than X 
 parameters some will need to be spilled on the stack anyway 
 (even if they can fit in a GPR).

 Can the Nim compiler (I'm guessing `extccomp`) query the C/C++ 
 compiler backend for e.g. the number of registers available for 
 parameters? Or do you have this logic built into the frontend (say 
 `ccgtypes`)?

It's built into ccgtypes indeed. The logic is mostly "pass by 
pointer if sizeof(T) > 3 machine words". ABI and registers are 
not all that relevant; what you want to prevent is copying 
large objects.
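
As a rough illustration of the size rule quoted above, here is a minimal sketch in D. The template `passByPointer` and the structs `Small` and `Big` are hypothetical names made up for this example; this is not Nim's actual implementation, only the "more than 3 machine words" rule of thumb expressed as a compile-time predicate.

```d
// Hypothetical helper: true when T is larger than 3 machine words,
// i.e. when the rule of thumb would pass it by pointer.
enum bool passByPointer(T) = T.sizeof > 3 * size_t.sizeof;

struct Small { size_t a, b; }      // 2 machine words
struct Big   { size_t[8] data; }   // 8 machine words

static assert(!passByPointer!int);
static assert(!passByPointer!Small);
static assert( passByPointer!Big);
```

The checks are `static assert`s, so compiling the file (for example with `dmd -main`) is enough to verify them.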

Aug 21 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for several 
 programming languages. Let the compiler optimize the parameter passing 
 unless the programmer explicitly asks for a certain way (copy object, 
 pointer/reference etc.).

This has been discussed a few times. If mutation is allowed, aliasing is 
a killer:

void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
}

S x;
fun(x, x); // oops

The problem now is that the semantics of fun depends on whether the 
compiler chose pass by value vs. pass by reference.
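
The difference can be reproduced in plain D by writing both choices out by hand. This is only a sketch: `viaRef` and `viaValue` are hypothetical names standing in for the two decisions a compiler could make for the second parameter, and `S` is reduced to a single field.

```d
struct S { int value; }

// Second parameter passed by reference: `b` aliases `a`
// when the same variable is given for both arguments.
int viaRef(ref S a, ref const S b)
{
    a.value = 42;      // mutate a
    return b.value;    // read b: observes the write if a and b alias
}

// Second parameter passed by value: `b` is a copy made at the call.
int viaValue(ref S a, const S b)
{
    a.value = 42;      // mutate a
    return b.value;    // read b: still the value from before the call
}

unittest
{
    S x = S(1);
    assert(viaRef(x, x) == 42);   // aliasing is visible

    S y = S(1);
    assert(viaValue(y, y) == 1);  // the copy hides the mutation
}
```

Running it with `dmd -unittest -main -run` on the file exercises both calls; the same source-level call `fun(x, x)` gives 42 or 1 depending on the passing mode.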

Aug 20 2020

Araq <rumpf_a web.de> writes:

On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu 
wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly asks for a 
 certain way (copy object, pointer/reference etc.).

 This has been discussed a few times. If mutation is allowed, 
 aliasing is a killer:

 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on whether 
 the compiler chose pass by value vs. pass by reference.

True but in practice it doesn't happen very often. The benefits 
far outweigh this minor downside. Plus there are known ways to 
prevent this form of aliasing at compile-time.

Aug 20 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for several 
 programming languages. Let the compiler optimize the parameter 
 passing unless the programmer explicitly asks for a certain way (copy 
 object, pointer/reference etc.).

 This has been discussed a few times. If mutation is allowed, aliasing 
 is a killer:

 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on whether the 
 compiler chose pass by value vs. pass by reference.

 
 True but in practice it doesn't happen very often. The benefits far 
 outweigh this minor downside.

That seems quite worrisome. A bug rare and subtle that can become 
devastating. Something the Hindenburg captain might have said.

 Plus there are known ways to prevent this 
 form of aliasing at compile-time.

Not if you have globals and/or separate compilation.

Aug 21 2020

tsbockman <thomas.bockman gmail.com> writes:

On Friday, 21 August 2020 at 19:21:37 UTC, Andrei Alexandrescu 
wrote:
 On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei 
 Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on 
 whether the compiler chose pass by value vs. pass by 
 reference.

 
 True but in practice it doesn't happen very often. The 
 benefits far outweigh this minor downside.

 That seems quite worrisome. A bug rare and subtle that can 
 become devastating. Something the Hindenburg captain might have 
 said.

 Plus there are known ways to prevent this form of aliasing at 
 compile-time.

 Not if you have globals and/or separate compilation.

The risk of aliasing-related bugs comes from the availability of 
pointers/references though, which will still be present in D with 
or without this optimization. People who want to rule out 
aliasing problems can just explicitly specify pass-by-value using 
`scope const` instead of `in`.

Let's not cripple the future of the language out of fear of 
pitfalls that are already present, and cannot be entirely removed 
as long as D remains a systems programming language.

Aug 21 2020

IGotD- <nise nise.com> writes:

On Friday, 21 August 2020 at 23:16:53 UTC, tsbockman wrote:
 The risk of aliasing-related bugs comes from the availability 
 of pointers/references though, which will still be present in D 
 with or without this optimization. People who want to rule out 
 aliasing problems can just explicitly specify pass-by-value 
 using `scope const` instead of `in`.

 Let's not cripple the future of the language out of fear of 
 pitfalls that are already present, and cannot be entirely 
 removed as long as D remains a systems programming language.

Most of the parameters can actually be 'const', which is also 
the case for 'in' parameters with the current definition. With 
const parameters, aliasing is no problem unless you do something 
with pointers further into the parameters.

'in' should be const by default, as it is today; if you need 
something else, use inout or ref. With a language like D it is 
up to the programmer to ensure there is no aliasing, much like 
with the 'restrict' qualifier in C. I'm in favour of the 
compiler detecting aliasing, but this can be improved gradually.

The new 'in' qualifier is interesting enough not to throw it 
away because of the aliasing problems.

Aug 22 2020

Araq <rumpf_a web.de> writes:

On Friday, 21 August 2020 at 19:21:37 UTC, Andrei Alexandrescu 
wrote:
 On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei 
 Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 [...]

 This has been discussed a few times. If mutation is allowed, 
 aliasing is a killer:

 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on 
 whether the compiler chose pass by value vs. pass by 
 reference.

 
 True but in practice it doesn't happen very often. The 
 benefits far outweigh this minor downside.

 That seems quite worrisome. A bug rare and subtle that can 
 become devastating. Something the Hindenburg captain might have 
 said.

 Plus there are known ways to prevent this form of aliasing at 
 compile-time.

 Not if you have globals and/or separate compilation.

Wrong. You can simply compile it to 'fun(x, copy(x))' if an alias 
analysis cannot disambiguate the locations. Since the alias 
analysis is restricted to the callsite, separate compilation 
continues to work.
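
To make the rewrite concrete, here is a hand-written D sketch of what such a callsite transformation would amount to. In the scheme described above the compiler would insert the copy itself; here it is written manually, and `fun`, `S` and `tmp` are illustrative names only.

```d
struct S { int value; }

// The callee takes its second parameter by reference.
int fun(ref S a, ref const S b)
{
    a.value += 10;     // mutate a
    return b.value;    // read b
}

unittest
{
    S x = S(1);

    // Equivalent of 'fun(x, copy(x))': the caller cannot prove that
    // the two arguments are distinct, so it passes a temporary copy.
    const S tmp = x;
    assert(fun(x, tmp) == 1);  // b behaves as if passed by value
    assert(x.value == 11);     // the mutation of a still takes effect
}
```

Because the copy is made by the caller, `fun` itself can be compiled separately without knowing anything about its callers.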

Aug 21 2020

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/22/20 2:07 AM, Araq wrote:
 On Friday, 21 August 2020 at 19:21:37 UTC, Andrei Alexandrescu wrote:
 On 8/21/20 1:20 AM, Araq wrote:
 On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 [...]

 This has been discussed a few times. If mutation is allowed, 
 aliasing is a killer:

 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on whether the 
 compiler chose pass by value vs. pass by reference.

 True but in practice it doesn't happen very often. The benefits far 
 outweigh this minor downside.

 That seems quite worrisome. A bug rare and subtle that can become 
 devastating. Something the Hindenburg captain might have said.

 Plus there are known ways to prevent this form of aliasing at 
 compile-time.

 Not if you have globals and/or separate compilation.

 
 Wrong. You can simply compile it to 'fun(x, copy(x))' if an alias 
 analysis cannot disambiguate the locations and the alias analysis is 
 restricted to the callsite, separate compilation continues to work.

Fair enough, thanks.

Aug 22 2020

Mathias LANG <geod24 gmail.com> writes:

On Thursday, 20 August 2020 at 22:19:16 UTC, Andrei Alexandrescu 
wrote:
 On 8/20/20 1:31 PM, IGotD- wrote:
 This is interesting on a general level as well and true for 
 several programming languages. Let the compiler optimize the 
 parameter passing unless the programmer explicitly asks for a 
 certain way (copy object, pointer/reference etc.).

 This has been discussed a few times. If mutation is allowed, 
 aliasing is a killer:

 void fun(ref S a, const compiler_chooses S b) {
     ... mutate a, read b ...
 }

 S x;
 fun(x, x); // oops

 The problem now is that the semantics of fun depends on whether 
 the compiler chose pass by value vs. pass by reference.

Isn't that what Walter's OB (ownership/borrowing) system is 
supposed to address?

Aug 20 2020

WebFreak001 <d.forum webfreak.org> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 [...]

How does the ABI work for this with extern(C) and other linkages? 
Will it mean ref or not ref, i.e. will that translate to a 
pointer or not?

For example in Win32 COM IDL files I often see [in] used for 
input only parameters, can we annotate parameters in D like that 
without changing calling semantics to document that they are 
input only parameters and potentially allow optimizations?

Aug 21 2020

Jacob Carlborg <doob me.com> writes:

On Friday, 21 August 2020 at 09:37:37 UTC, WebFreak001 wrote:

 For example in Win32 COM IDL files I often see [in] used for 
 input only parameters, can we annotate parameters in D like 
 that without changing calling semantics to document that they 
 are input only parameters and potentially allow optimizations?

You can attach a UDA. It can't be called "in", but perhaps 
"input".
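
A minimal sketch of that idea, assuming a UDA named `input` (a made-up name, as is the function `countNonZero`). The attribute is purely documentation that compile-time reflection can see; it does not change how the parameter is passed.

```d
// Hypothetical UDA; `in` is a keyword, so the name `input` is used instead.
enum input = "input";

extern (C) size_t countNonZero(@input const(ubyte)* data, size_t len)
{
    // The UDA can be read back at compile time, but `data` is still
    // passed as a plain pointer, exactly as without the annotation.
    static assert([__traits(getAttributes, data)] == ["input"]);

    size_t n = 0;
    foreach (i; 0 .. len)
        if (data[i] != 0)
            ++n;
    return n;
}

unittest
{
    ubyte[3] buf = [1, 0, 2];
    assert(countNonZero(buf.ptr, buf.length) == 2);
}
```

Since the compiler ignores user-defined attributes for code generation, this covers the documentation part of the question but not the optimization part.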

--
/Jacob Carlborg

Aug 21 2020

Atila Neves <atila.neves gmail.com> writes:

On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]

I think that having `in ref` is preferable to introducing new 
rules. Everything else makes sense to me, but it's possible I'm 
missing something.

Aug 25 2020

Mathias LANG <geod24 gmail.com> writes:

On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]

 I think that having `in ref` is preferable to introducing new 
 rules. Everything else makes sense to me, but it's possible I'm 
 missing something.

Why is it preferable? Genuine question here.
I originally had in mind to support `in ref` as being the user 
forcing `ref` on the param. It just seemed *obvious*. However, 
after a lot of playing around with it, it wasn't that obvious 
anymore.

Aug 25 2020

Atila Neves <atila.neves gmail.com> writes:

On Wednesday, 26 August 2020 at 04:15:21 UTC, Mathias LANG wrote:
 On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]

 I think that having `in ref` is preferable to introducing 
 new rules. Everything else makes sense to me, but it's 
 possible I'm missing something.

 Why is it preferable ? Genuine question here.
 I originally had in mind to support `in ref` as being the user 
 forcing `ref` on the param. It just seemed *obvious*. However, 
 after a lot of playing around with it, it wasn't that obvious 
 anymore.

For me, more programmer choice and less magic.

Aug 26 2020

Atila Neves <atila.neves gmail.com> writes:

On Wednesday, 26 August 2020 at 04:15:21 UTC, Mathias LANG wrote:
 On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 On Friday, 31 July 2020 at 21:49:25 UTC, Mathias LANG wrote:
 Hi everyone,
 For a long time I've been pretty annoyed by the state of `in` 
 parameters.
 In case it needs any clarification, I'm talking at what's 
 between the asterisks (*) here: `void foo (*in* char[] arg)`).

 [...]

 I think that having `in ref` is preferable to introducing 
 new rules. Everything else makes sense to me, but it's 
 possible I'm missing something.

 Why is it preferable ? Genuine question here.
 I originally had in mind to support `in ref` as being the user 
 forcing `ref` on the param. It just seemed *obvious*. However, 
 after a lot of playing around with it, it wasn't that obvious 
 anymore.

I think it's preferable because it gives more control to the 
programmer and there's less magic. And also that enregistering 
and all that probably belongs in the backend.

Aug 26 2020

Jacob Carlborg <doob me.com> writes:

On Wednesday, 26 August 2020 at 09:10:45 UTC, Atila Neves wrote:

 I think it's preferable because it gives more control to the 
 programmer and there's less magic. And also that enregistering 
 and all that probably belongs in the backend.

I like the idea from Herb Sutter's talk [1], where the programmer 
specifies the semantic behavior of the parameters instead of the 
mechanics of how to pass them. The programmer specifies `in`, 
`out`, `inout`, `move` or `forward`. Then the compiler figures 
out how to achieve the semantics in the most efficient way.

As far as I can tell, there's nothing in Mathias' proposal that 
cannot be accomplished today with other attributes. So if you 
want more control, don't use `in`.

[1] https://youtu.be/qx22oxlQmKc?t=1258

--
/Jacob Carlborg

Aug 26 2020

IGotD- <nise nise.com> writes:

On Tuesday, 25 August 2020 at 14:43:45 UTC, Atila Neves wrote:
 I think that having `in ref` is preferable to introducing new 
 rules. Everything else makes sense to me, but it's possible I'm 
 missing something.

I think that the proposal from Mathias LANG meant that 'in' shall 
not be accompanied by any other qualifier. How the parameter is 
passed is up to the compiler, whether it is by value, by 
reference or something else.

If you explicitly want a reference, you use 'ref' just like today.

Aug 26 2020
