digitalmars.D - Idea: Ownership & lifetime escape analysis by variables in reference

Rikki Cattermole (35/35) May 28 2022 Walter suggested that I create a thread for an idea I've had for

deadalnix (7/43) May 28 2022 Yes, this is a good start. You have one major missing point: you

rikki cattermole (3/5) May 28 2022 Can you please write up an example similar to what I wrote, so that we
drug (3/12) May 29 2022 You mean `isolated`? Funny but we almost have this type qualifier - in

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (12/19) May 29 2022 Maybe look at the basic ideas for static analysis? Then you will

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (7/9) May 29 2022 Think of it like this:

rikki cattermole (6/16) May 29 2022 My goals for this was to be a very cheap set of checks at compile time,

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (5/7) May 29 2022 It cannot both be simple and low on false positives. Anyway, I

rikki cattermole (4/13) May 29 2022 That's how this originally came up on Discord.

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (4/6) May 29 2022 Yes, @trusted is ok when you can limit it to some specific

rikki cattermole (4/11) May 29 2022 Worse of all, you end up missing all of the things @safe can actually ca...

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (12/14) May 29 2022 Yes, and that is why the idea of a simple solution doesn't work.

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (20/22) May 29 2022 I hope I didn't sound too negative, as I am 100% supportive of

rikki cattermole (8/9) May 29 2022 In my snippets here, I've kinda ignored how it should be turned on.

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (24/26) May 29 2022 I guess what I am saying is something like this:

rikki cattermole (13/25) May 29 2022 I'm basically maxed out on this like a year ago. I've only been bringing...

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (2/4) May 29 2022 ARC could work with -betterC.

rikki cattermole (10/15) May 29 2022 I'm convinced for structs and classes, that's actually a DIP I would

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (3/5) May 29 2022 I mean, you could just start by optimizing C++ shared_ptr… Even

rikki cattermole (21/32) May 29 2022 ```d

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (10/18) May 29 2022 You will end up being too pessimistic. For instance: if the queue

rikki cattermole (8/9) May 29 2022 I've been thinking back on my past thought experiments, how I use to

Walter Bright (15/16) May 29 2022 dip1000 infers `scope` and `return` for:

rikki cattermole (34/42) May 29 2022 Indeed.

Walter Bright (3/4) May 29 2022 Thanks for the report, but I don't understand what it has to do with the...

rikki cattermole (5/10) May 30 2022 Its basically the only non-design related issue I've encountered when

Rikki Cattermole <alphaglosined gmail.com> writes:

Walter suggested that I create a thread for an idea I've had for 
memory ownership & lifetimes.

The basic explanation is that every variable (including 
temporaries and returns) are always in reference to other 
variables. So that the order of destruction is correct and 
nothing escapes its owner (they can be bundled unless scope is 
involved).

These examples come from a talk with Timon a couple of days ago 
on Discord. His existing dislike for this idea is that it 
conflates different levels of indirection.

My worked example:

```d
int[] array = [1, 2, 3]; // ownership: []
// arg ownership: [] (not ref/out)
// return ownership: [first argument] -> [array]
int* got = func(array); // ownership: [array]

int* func(int[] source) {
   int[] slice = source[1 .. 2]; // ownership: [source]
   int* ptr = &slice[0]; // ownership: [slice, source] -> [source]
   return ptr; // ownership: [ptr] -> [source]
}
```

A question Timon came up with that I answered:

```d
int[][] func(int[] a, int[] b){
     return [a,b];
}
```

(for return value): ``// ownership: [first argument, second 
argument]``

This ticks a lot of the boxes I'm using as preferable acceptance 
criteria:

1. Inferred for anything non-virtual
2. Nothing outlives its owner
3. Fairly straight forward to understand

May 28 2022

deadalnix <deadalnix gmail.com> writes:

On Sunday, 29 May 2022 at 00:22:03 UTC, Rikki Cattermole wrote:
 Walter suggested that I create a thread for an idea I've had 
 for memory ownership & lifetimes.

 The basic explanation is that every variable (including 
 temporaries and returns) are always in reference to other 
 variables. So that the order of destruction is correct and 
 nothing escapes its owner (they can be bundled unless scope is 
 involved).

 These examples come from a talk with Timon a couple of days ago 
 on Discord. His existing dislike for this idea is that it 
 conflates different levels of indirection.

 My worked example:

 ```d
 int[] array = [1, 2, 3]; // ownership: []
 // arg ownership: [] (not ref/out)
 // return ownership: [first argument] -> [array]
 int* got = func(array); // ownership: [array]

 int* func(int[] source) {
   int[] slice = source[1 .. 2]; // ownership: [source]
   int* ptr = &slice[0]; // ownership: [slice, source] -> 
 [source]
   return ptr; // ownership: [ptr] -> [source]
 }
 ```

 A question Timon came up with that I answered:

 ```d
 int[][] func(int[] a, int[] b){
     return [a,b];
 }
 ```

 (for return value): ``// ownership: [first argument, second 
 argument]``

 This ticks a lot of the boxes I'm using as preferable 
 acceptance criteria:

 1. Inferred for anything non-virtual
 2. Nothing outlives its owner
 3. Fairly straight forward to understand

Yes, this is a good start. You have one major missing point: you 
need something to transfers ownership or you end up very limited.

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/msr-tr-2012-79.pdf

All we need is one god damn type qualifier and we make DIP1000, 
 live, RC object, nogc exception and couple of other long 
standing problem a thing of the past.

May 28 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/05/2022 12:26 PM, deadalnix wrote:
 Yes, this is a good start. You have one major missing point: you need 
 something to transfers ownership or you end up very limited.

Can you please write up an example similar to what I wrote, so that we 
can see what would be added (my mind just go blank on that paper)?

May 28 2022

drug <drug2004 bk.ru> writes:

On 29.05.2022 03:26, deadalnix wrote:
 Yes, this is a good start. You have one major missing point: you need 
 something to transfers ownership or you end up very limited.
 
 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/0
/msr-tr-2012-79.pdf 
 
 
 All we need is one god damn type qualifier and we make DIP1000,  live, 
 RC object, nogc exception and couple of other long standing problem a 
 thing of the past.

You mean `isolated`? Funny but we almost have this type qualifier - in 
fact it is opposite/antonym to `shared`, I guess.

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 00:22:03 UTC, Rikki Cattermole wrote:
 Walter suggested that I create a thread for an idea I've had 
 for memory ownership & lifetimes.

 The basic explanation is that every variable (including 
 temporaries and returns) are always in reference to other 
 variables. So that the order of destruction is correct and 
 nothing escapes its owner (they can be bundled unless scope is 
 involved).

Maybe look at the basic ideas for static analysis? Then you will 
know the limitations.

How do you deal with conditionals?
How do you deal with mutable arrays of pointers?

Simple stuff like a queue of size two? How do you know what it 
contains?

With lifetimes you can be pessimitic: max lifetime of anything 
that could in theory be put into it unless I am able to prove 
otherwise. Then you dont have to try to prove something unless it 
is needed.

Better to just add ARC and do all these things as optimizations.

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 07:42:29 UTC, Ola Fosheim Grøstad wrote:
 Maybe look at the basic ideas for static analysis? Then you 
 will know the limitations.

Think of it like this:

If you try to be precise then parts of it has to happen at 
runtime.

If you want to do it at compile time then you have to 
overapproximate (be more pessimistic than the possible runtime 
situations).

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/05/2022 7:57 PM, Ola Fosheim Grøstad wrote:
 On Sunday, 29 May 2022 at 07:42:29 UTC, Ola Fosheim Grøstad wrote:
 Maybe look at the basic ideas for static analysis? Then you will know 
 the limitations.

 
 Think of it like this:
 
 If you try to be precise then parts of it has to happen at runtime.
 
 If you want to do it at compile time then you have to overapproximate 
 (be more pessimistic than the possible runtime situations).

My goals for this was to be a very cheap set of checks at compile time, 
that can catch most naive wrong life time issues, while being easily 
explained.

I'm not concerned too much about preciseness as long as the false 
positives are low (or at least lower than DIP1000).

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 08:14:18 UTC, rikki cattermole wrote:
 I'm not concerned too much about preciseness as long as the 
 false positives are low (or at least lower than DIP1000).

It cannot both be simple and low on false positives. Anyway, I 
dont really like the concept of DIP1000.

It basically means that any required change to a function can 
lead to a situation where one has to slap  trusted on it.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/05/2022 8:50 PM, Ola Fosheim Grøstad wrote:
 On Sunday, 29 May 2022 at 08:14:18 UTC, rikki cattermole wrote:
 I'm not concerned too much about preciseness as long as the false 
 positives are low (or at least lower than DIP1000).

 
 It cannot both be simple and low on false positives. Anyway, I dont 
 really like the concept of DIP1000.
 
 It basically means that any required change to a function can lead to a 
 situation where one has to slap  trusted on it.

That's how this originally came up on Discord.

I dislike DIP1000 too. We need to go back to the drawing board it isn't 
archiving what it should be.

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 08:52:55 UTC, rikki cattermole wrote:
 We need to go back to the drawing board it isn't archiving what 
 it should be.

Yes,  trusted is ok when you can limit it to some specific 
module/class, but if you end up sprinkling it around then there 
is no point to  safe anymore.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/05/2022 9:04 PM, Ola Fosheim Grøstad wrote:
 On Sunday, 29 May 2022 at 08:52:55 UTC, rikki cattermole wrote:
 We need to go back to the drawing board it isn't archiving what it 
 should be.

 
 Yes,  trusted is ok when you can limit it to some specific module/class, 
 but if you end up sprinkling it around then there is no point to  safe 
 anymore.

Worse of all, you end up missing all of the things  safe can actually catch.

Unless its a wrapper around actual logic, it always worries me that I 
haven't got that extra guard each time I have to add it.

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 09:21:16 UTC, rikki cattermole wrote:
 Worse of all, you end up missing all of the things  safe can 
 actually catch.

Yes, and that is why the idea of a simple solution doesn't work. 
I seldom make mistakes in functions without loops.

 safe is most valuable in functions that are "complicated" 
(containing loops or other intricacies).

Some overhead in  safe is ok, because most applications don't 
need to be faster than Java in 95% of the code. So if you make 
95%  safe then you only need to be careful with the remaining 5%.

Which is kinda how it works with C now, people write C libraries 
and use those with high level languages. But it is more 
convenient with one unified language (and that could make D 
attractive for Linux GUI applications).

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 08:52:55 UTC, rikki cattermole wrote:
 I dislike DIP1000 too. We need to go back to the drawing board 
 it isn't archiving what it should be.

I hope I didn't sound too negative, as I am 100% supportive of 
what you have said in other treads about ARC or adding 
constraints that enable a more advanced GC. (Or possibly my idea 
of having ARC for shared and GC for non-shared).

It is just not fruitful to try to make  system-type code  safe as 
that means Rust or beyond, and those that want that have already 
moved to Rust. And even if D could do the same as Rust then it 
would take 15 years to get there and then we compete with next 
gen static analysis for C++33, Rust 2.0 and Go 3.0… And if D is 
only going halfway to where Rust is then there will be endless 
complaints and people will just switch over to Rust to benefit 
from their ecosystem.

So this angle is a *big drain* on language evolution.

D could achieve so much more by providing some cool features for 
making it easier to write  trusted code that actually can be 
trusted.

All this work people talk about regarding static analysis and 
advanced type system features could be done as optimizations on a 
simpler stable language. With higher payoff.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 30/05/2022 12:01 AM, Ola Fosheim Grøstad wrote:
 It is just not fruitful to try to make  system-type code  safe

In my snippets here, I've kinda ignored how it should be turned on.

I did mention elsewhere that I previously used scope as a type qualifier 
to do this. Which should of course work in  system, but could always be 
cast away. Just like shared, const or immutable.

In general I am in agreement, these sort of safety features shouldn't be 
opted out of, instead opted into only. Otherwise you're breaking a ton 
of code without any benefit

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 12:11:01 UTC, rikki cattermole wrote:
 I did mention elsewhere that I previously used scope as a type 
 qualifier to do this.

I guess what I am saying is something like this:

I think you would get better yield for your time investment by 
pursuing how to add barriers and figuring out how one could write 
 trusted code in such a landscape than anything related to 
advanced escape analysis in the context of verification of *D 
code* (optimization is ok, because then you will choose a best 
effort strategy, verification is all-or-nothing).

To understand what is needed for solid escape analysis, let us 
take the two-element queue example. You could unroll the loop. 
Replace ```a[0]``` with ```a0```, ```a[1]``` with ```a1``` and 
all ```a[i]``` accesses with a conditional (or switch). Then 
analyse this with an algorithm/solver that can deal with 
branches. Or something along these lines. A lot of machinery even 
for the simplest datastructure! And then we have the slice 
scenarios with aliasing, where modifying the content of one slice 
could modify basically any other parameter.

It would be so much easier to just add  stack as an optimization 
hint where you want something to be put on the stack, and let the 
compiler generate a report of where it failed, why and how to 
improve the code.

(For DIP1000 to make sense it must be a able to deal with more 
complexity than most of your code, otherwise it will break down 
whenever you need to modify existing code.)

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 30/05/2022 12:40 AM, Ola Fosheim Grøstad wrote:
 On Sunday, 29 May 2022 at 12:11:01 UTC, rikki cattermole wrote:
 I did mention elsewhere that I previously used scope as a type 
 qualifier to do this.

 
 I guess what I am saying is something like this:
 
 I think you would get better yield for your time investment by pursuing 
 how to add barriers and figuring out how one could write  trusted code 
 in such a landscape than anything related to advanced escape analysis in 
 the context of verification of *D code* (optimization is ok, because 
 then you will choose a best effort strategy, verification is 
 all-or-nothing).

I'm basically maxed out on this like a year ago. I've only been bringing 
it up as an example of an alternative to DIP1000 and friends. To show 
that we could be far closer to a useful solution than what we have now 
which is certainly not hitting its marks.

Write barriers should be a mere glue code problem. That doesn't need a 
DIP or anything else, just someone who knows the code bases to do it!

It would be great to have alternative designs, but whatever is come up 
with, I think a key design decision is going to be the differentiation 
between dependency and complimentary of runtime mechanisms. I know 
you've been wanting more on the dependency side of decisions, whereas 
I'm more interested in complimentary since I have to stick to things 
like -betterC.

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 12:57:01 UTC, rikki cattermole wrote:
 side of decisions, whereas I'm more interested in complimentary 
 since I have to stick to things like -betterC.

ARC could work with -betterC.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 30/05/2022 1:04 AM, Ola Fosheim Grøstad wrote:
 On Sunday, 29 May 2022 at 12:57:01 UTC, rikki cattermole wrote:
 side of decisions, whereas I'm more interested in complimentary since 
 I have to stick to things like -betterC.

 
 ARC could work with -betterC.

I'm convinced for structs and classes, that's actually a DIP I would 
like to write. I have written a bit up about what it would be[0].

Without boxing into fat types, I don't see how its going to be possible. 
The boxing I'm really not happy about, as that is changing known type 
behaviors and properties which is concerning to me.

But hey, I'd love to be proven wrong (not that I've thought about this 
much)!

[0] 
https://gist.github.com/rikkimax/95994607813d6665b6adbfb36d180c42#file-opref-md

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 13:11:58 UTC, rikki cattermole wrote:
 But hey, I'd love to be proven wrong (not that I've thought 
 about this much)!

I mean, you could just start by optimizing C++ shared_ptr… Even 
that would be valuable in the context of -betterC.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/05/2022 7:42 PM, Ola Fosheim Grøstad wrote:
 Maybe look at the basic ideas for static analysis? Then you will know 
 the limitations.
 
 How do you deal with conditionals?

```d
Thing func(Thing a, Thing b, bool pickB) {
	return pickB ? a : b; // ownership: [a, b]
}
```

 How do you deal with mutable arrays of pointers?

```d

void func(Thing a, Thing b) {
	int[] arr; // ownership: []
	
	arr ~= a; // arr ownership: [a]
	arr ~= b; // arr ownership: [a, b]

	arr = arr[0 .. 1]; // arr ownership: [a, b]
	arr ~= b; // arr ownership: [a, b]
}
```

 Simple stuff like a queue of size two? How do you know what it contains?

See above

 With lifetimes you can be pessimitic: max lifetime of anything that 
 could in theory be put into it unless I am able to prove otherwise. Then 
 you dont have to try to prove something unless it is needed.
 
 Better to just add ARC and do all these things as optimizations.

I too would like to have proper reference counting on structs and 
classes with optimizations, but the compiler has to know what object to 
do the reference counting on! Right now things can escape reference 
counting.

May 29 2022

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Sunday, 29 May 2022 at 08:03:57 UTC, rikki cattermole wrote:
 On 29/05/2022 7:42 PM, Ola Fosheim Grøstad wrote:
 Simple stuff like a queue of size two? How do you know what it 
 contains?

 See above

You will end up being too pessimistic. For instance: if the queue 
is empty you will assume it is tainted by eveything that went 
through it...

To get to a good place you have to do a good overapproximation.

 I too would like to have proper reference counting on structs 
 and classes with optimizations, but the compiler has to know 
 what object to do the reference counting on! Right now things 
 can escape reference counting.

Impose more limits on  safe. Then give  trusted and  system 
wrappers and other constructs that allows them to give  safe code 
what it needs.

A fat pointer can track ownership when you address fields. Then 
get rid of fat pointers by optimization.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/05/2022 7:42 PM, Ola Fosheim Grøstad wrote:
 Simple stuff like a queue of size two? How do you know what it contains?

I've been thinking back on my past thought experiments, how I use to 
turn this on was with scope as a type qualifier.

For non-owning data structures like queues, you would never 
intentionally mark that as scope since its entries will always have 
separate lifetimes from that of the data structure itself.

So even if implemented, this may not be a general solution that we are 
looking for.

May 29 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 5/28/2022 5:22 PM, Rikki Cattermole wrote:
 1. Inferred for anything non-virtual

dip1000 infers `scope` and `return` for:

1. templates
2. lambdas
3. nested functions
4. generated functions

i.e. where the source to the function is there.

But this isn't so for:

   int* foo(int*);

where the declaration of foo() is separated (maybe in another file) from its 
definition.

Your proposal requires all the source to be available, and no "header" 
declarations. It also won't work for pointers to functions and delegates.

To make those work will require annotations - the same reason D has `scope` and 
`return` annotations.

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 30/05/2022 8:26 AM, Walter Bright wrote:
 where the declaration of foo() is separated (maybe in another file) from 
 its definition.
 
 Your proposal requires all the source to be available, and no "header" 
 declarations. It also won't work for pointers to functions and delegates.
 
 To make those work will require annotations - the same reason D has 
 `scope` and `return` annotations.

Indeed.

I don't think you can get around having annotations for lifetime stuff 
for function pointers or virtual methods. You simply are going to need them.

Here is a common bit of code I've got:

```d

struct Thing {
 safe nothrow  nogc:

     StringBuilder_ASCII toString(RCAllocator allocator = 
globalAllocator())  trusted scope {
         StringBuilder_ASCII ret = StringBuilder_ASCII(allocator);

         ...

         return ret;
     }
}
```

The fact that it is called toString is immaterial (just want to point 
that out).

There are two sets of function annotations, at the struct scope level 
and at the method level. Scope on the method should actually be at the 
struct level, its not due to[0]. So we can ignore that one as it is 
unrelated to the problem at hand.

But the  trusted on the other hand, that is DIP1000 related. Since 
without it, DIP1000 is seeing ret and even though the this pointer and 
anything in it, is not stored in anyway inside of ret, ret is inferred 
as scope and cannot be returned.

While I'm sure we can special case this to get it to work, this is a 
pretty good example of where things that should be working right now, 
are not.

I know in my original post I didn't talk about how to turn this sort of 
thing on, in the past I used scope as a type qualifier to do it in my 
thought experiments. Which would work in all D code regardless of 
function annotations.

[0] https://issues.dlang.org/show_bug.cgi?id=23142

May 29 2022

Walter Bright <newshound2 digitalmars.com> writes:

On 5/29/2022 4:56 PM, rikki cattermole wrote:
 [0] https://issues.dlang.org/show_bug.cgi?id=23142

Thanks for the report, but I don't understand what it has to do with the rest
of 
your post?

May 29 2022

rikki cattermole <rikki cattermole.co.nz> writes:

On 30/05/2022 2:00 PM, Walter Bright wrote:
 On 5/29/2022 4:56 PM, rikki cattermole wrote:
 [0] https://issues.dlang.org/show_bug.cgi?id=23142

 
 Thanks for the report, but I don't understand what it has to do with the 
 rest of your post?

Its basically the only non-design related issue I've encountered when 
working with compiler guarantees in the last year or so.

So its worth the mention since it only comes up because I'm using 
DIP1000 with all the attribute soup that goes along with it.

May 30 2022

D Programming

C/C++ Programming

Other

digitalmars.D - Idea: Ownership & lifetime escape analysis by variables in reference