digitalmars.D.learn - Actual lifetime of static array slices?

reply Elfstone <elfstone yeah.net> writes:
I failed to find any documentation, except dynamic array slices 
will be taken care of by GC, but I assume it's not the case with 
static arrays.

But the code below doesn't behave as I expected.

     int[] foo()
     {
     	int[1024] static_array;
     	// return static_array[]; // Error: returning `static_array[]`
     	// escapes a reference to local variable `static_array`
         return null;
     }

     class A
     {
     	this(int[] inData)
     	{
     		data = inData;
     	}

     	int[] data;
     }

     void main()
     {
     	int[] arr;
     	A a;
     	{
     		int[1024] static_array;
     		arr = aSlice; // OK
     		a = new A(aSlice); // OK
     		arr = foo();
     		//arr = foo();

     	}
     }

By assigning aSlice to arr or a, it seemingly escapes the scope, 
I thought there'd be errors, but the code compiles just fine.

Is it really safe though?
Nov 14 2022
next sibling parent reply Mike Parker <aldacron gmail.com> writes:
On Tuesday, 15 November 2022 at 02:26:41 UTC, Elfstone wrote:
 I failed to find any documentation, except dynamic array slices 
 will be taken care of by GC, but I assume it's not the case 
 with static arrays.
A slice is a view on the existing memory owned by the original array. No allocations are made for the slice. The GC will track all references to the memory allocated for a dynamic array, so as long as any slices remain alive, so will the original memory. Static arrays are allocated on the stack and become invalid when they leave a function scope. In turn, so would any slices or other pointers that reference that stack memory.
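To make the "view" point concrete, here is a minimal sketch. The function names are made up for illustration and are not from the original post. It only shows that a slice shares memory with the array it was taken from, whether that memory lives on the GC heap or on the stack:

```d
/// Writes through a slice and reports whether the original array sees the change.
bool writeThroughSliceIsVisible()
{
    int[] original = [1, 2, 3, 4];  // array literal: memory comes from the GC heap
    int[] view = original[1 .. 3];  // no allocation, just a pointer and a length
    view[0] = 42;
    return original[1] == 42 && view.ptr is original.ptr + 1;
}

/// The same holds for a static array, except that the memory is on the stack.
bool sliceOfStaticArrayPointsAtStack()
{
    int[4] fixed = [1, 2, 3, 4];
    int[] view = fixed[];           // still no copy: view.ptr is the stack address
    return view.ptr is fixed.ptr && view.length == fixed.length;
}

void main()
{
    assert(writeThroughSliceIsVisible());
    assert(sliceOfStaticArrayPointsAtStack());
}
```

In the second function the slice never leaves the function, so nothing dangles. The trouble only starts when such a slice outlives the static array it points at.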
 But the code below doesn't behave as I expected.

     int[] foo()
     {
     	int[1024] static_array;
     	// return static_array[]; // Error: returning `static_array[]`
     	// escapes a reference to local variable `static_array`
         return null;
     }

     class A
     {
     	this(int[] inData)
     	{
     		data = inData;
     	}

     	int[] data;
     }

     void main()
     {
     	int[] arr;
     	A a;
     	{
     		int[1024] static_array;
     		arr = aSlice; // OK
     		a = new A(aSlice); // OK
     		arr = foo();
     		//arr = foo();

     	}
     }

 By assigning aSlice to arr or a, it seemingly escapes the 
 scope, I thought there'd be errors, but the code compiles just 
 fine.

 Is it really safe though?
It's not the scope that matters here. It's the stack. Memory allocated in the inner scope uses the function stack, so it's all valid until the function exits.
Nov 14 2022
parent Mike Parker <aldacron gmail.com> writes:
On Tuesday, 15 November 2022 at 02:49:55 UTC, Mike Parker wrote:

 It's not the scope that matters here. It's the stack. Memory 
 allocated in the inner scope uses the function stack, so it's 
 all valid until the function exits.
And that was just so, so wrong. Of course destructors get called when scopes exit, etc.
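A small sketch of that correction, assuming a hypothetical `Tracker` struct and a module-level counter (neither is from the thread): the lifetime of a variable declared in an inner scope ends at the closing brace of that scope, not at the end of the function.

```d
int destroyed; // hypothetical counter, incremented by every destructor call

struct Tracker
{
    ~this() { ++destroyed; }
}

/// Returns how many destructors have run by the time the inner scope is left.
int destructorsRunInInnerScope()
{
    destroyed = 0;
    {
        Tracker t;
        // t is alive here, so nothing has been destroyed yet
    } // t's destructor runs at this closing brace
    return destroyed;
}

void main()
{
    assert(destructorsRunInInnerScope() == 1);
}
```

So a slice of a static array declared in an inner block should be treated as invalid once that block ends, even if the enclosing function is still running.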
Nov 14 2022
prev sibling parent reply Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Tuesday, 15 November 2022 at 02:26:41 UTC, Elfstone wrote:
 By assigning aSlice to arr or a, it seemingly escapes the 
 scope, I thought there'd be errors, but the code compiles just 
 fine.

 Is it really safe though?
No, it's not safe. You can add an `@safe:` line at the beginning of your program and it will fail to compile (after renaming static_array to aSlice):

     test.d(27): Error: address of variable `aSlice` assigned to `arr` with longer lifetime

By default everything is assumed to be @system and the compiler silently allows you to shoot yourself in the foot. See https://dlang.org/spec/memory-safe-d.html
Nov 14 2022
parent reply Elfstone <elfstone yeah.net> writes:
On Tuesday, 15 November 2022 at 02:50:44 UTC, Siarhei Siamashka 
wrote:
 On Tuesday, 15 November 2022 at 02:26:41 UTC, Elfstone wrote:
 By assigning aSlice to arr or a, it seemingly escapes the 
 scope, I thought there'd be errors, but the code compiles just 
 fine.

 Is it really safe though?
No, it's not safe. You can add an `@safe:` line at the beginning of your program and it will fail to compile (after renaming static_array to aSlice): test.d(27): Error: address of variable `aSlice` assigned to `arr` with longer lifetime. By default everything is assumed to be @system and the compiler silently allows you to shoot yourself in the foot. See https://dlang.org/spec/memory-safe-d.html
Thanks, @safe works for my first code, but the following code still compiles.

     import std.stdio : writeln;

     class A
     {
         @safe
         this(int[] inData)
         {
             data = inData;
         }

         int[] data;
     }

     @safe
     int[] foo()
     {
         int[1024] static_array;
         // return static_array[]; // Error: returning `static_array[]`
         // escapes a reference to local variable `static_array`
         return null;
     }

     @safe
     A bar()
     {
         int[1024] static_array;
         return new A(static_array[]);
     }

     @safe
     void main()
     {
         auto a = bar();
         writeln(a.data); // OK, but writes garbage
     }

So the compiler detects escaping in foo() but not in bar(), this doesn't look right.

Is there a way to tell whether a slice is from a dynamic array or a static array?
Nov 14 2022
next sibling parent reply Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Tuesday, 15 November 2022 at 03:05:30 UTC, Elfstone wrote:
 So the compiler detects escaping in foo() but not in bar(), 
 this doesn't look right.
The compiler can detect it with -dip1000 command line option.
 Is there a way to tell whether a slice is from a dynamic array 
 or a static array?
For debugging purposes? Maybe find the stack boundaries and check if the address is in stack?
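A related check that is easier to write than a stack-boundary test is to ask the GC whether it owns the memory. The sketch below uses `GC.addrOf` from druntime's `core.memory`, which returns `null` for pointers that are not inside a GC-allocated block. The helper name `pointsIntoGCHeap` is made up for this example. Note the limits: a `false` result only means "not GC memory" (stack, static data, or `malloc`), not specifically "stack", so this is a debugging aid rather than a safety check.

```d
import core.memory : GC;

/// Hypothetical debugging helper: true if the slice's memory lies inside
/// a block that was allocated by the GC.
bool pointsIntoGCHeap(const(void)[] slice)
{
    // GC.addrOf yields the base address of the owning GC block,
    // or null when the pointer is not GC-managed.
    return GC.addrOf(slice.ptr) !is null;
}

void main()
{
    int[4] onStack;               // static array, lives on the stack
    int[] onHeap = new int[4];    // dynamic array, allocated by the GC

    assert(!pointsIntoGCHeap(onStack[]));
    assert(pointsIntoGCHeap(onHeap));
    assert(pointsIntoGCHeap(onHeap[1 .. 3])); // interior slices count too
}
```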
Nov 14 2022
parent reply Elfstone <elfstone yeah.net> writes:
On Tuesday, 15 November 2022 at 03:18:17 UTC, Siarhei Siamashka 
wrote:
 On Tuesday, 15 November 2022 at 03:05:30 UTC, Elfstone wrote:
 So the compiler detects escaping in foo() but not in bar(), 
 this doesn't look right.
The compiler can detect it with -dip1000 command line option.
 Is there a way to tell whether a slice is from a dynamic array 
 or a static array?
For debugging purposes? Maybe find the stack boundaries and check if the address is in stack?
Great! This should be a builtin feature! Any idea what supersedes it?
Nov 14 2022
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 15/11/2022 5:10 PM, Elfstone wrote:

 Any idea what supersedes it?
The implementation.
Nov 14 2022
parent Elfstone <elfstone yeah.net> writes:
On Tuesday, 15 November 2022 at 04:10:37 UTC, rikki cattermole 
wrote:
 On 15/11/2022 5:10 PM, Elfstone wrote:

 Any idea what supersedes it?
The implementation.
Cool.
Nov 14 2022
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 11/14/22 19:05, Elfstone wrote:

      @safe
      int[] foo()
      {
          int[1024] static_array;
          // return static_array[]; // Error: returning `static_array[]`
          // escapes a reference to local variable `static_array`
That is trivial for the compiler to catch.
          return null;
      }

      @safe
      A bar()
      {
          int[1024] static_array;
          return new A(static_array[]);
That one requires the compiler to analyze the code to a deeper level. Yes, we are passing a slice to the A constructor but we don't know whether the constructor will store the slice or simply use it. Even the following can be safe but impossible to detect by the compiler:

     class A
     {
         @safe
         this(int[] inData)
         {
             data = someCondition() ? new int[42] : inData; // (1)
             // ...
             if (someOtherCondition())
             {
                 data = null; // (2)
             }
         }
         // ...
     }

(1) Whether we use inData depends on someCondition(). It may be so that bar() is never called depending on someCondition() in the program at all.

(2) data may never refer to 'static_array' after the constructor exits depending on someOtherCondition()

The code above may not be using good coding practices and humans may see the bugs if there are any but only a slow (infinitely?) compiler can see through all the code. (I think @live will help with these cases.)

Further, the support for "separate compilation" makes it impossible to see through function boundaries.

Additionally, we don't want the compiler to force us to copy all stack variables just in case.

In summary, you are right but the compiler cannot do anything about it in all cases and we wouldn't want it to spend infinite amount of time to try to determine everything.

Ali
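For completeness, one way to sidestep the problem in `bar()` when the object really does need to keep the data is to copy it explicitly. This is only a sketch of that option, not something proposed in the thread: `.dup` allocates a GC-owned copy, so the slice stored in `A` no longer refers to stack memory. The value `7` is just a marker to show that the data survives.

```d
class A
{
    @safe
    this(int[] inData)
    {
        data = inData; // stores whatever slice it is given
    }

    int[] data;
}

@safe
A bar()
{
    int[1024] static_array;
    static_array[0] = 7;
    // .dup copies the elements into a fresh GC allocation, so the
    // object outlives bar() without pointing into bar()'s stack frame.
    return new A(static_array[].dup);
}

void main()
{
    auto a = bar();
    assert(a.data.length == 1024);
    assert(a.data[0] == 7); // valid: this memory is owned by the GC
}
```

The cost is one allocation and a copy per call, which is exactly the kind of copy you would not want the compiler to insert silently.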
Nov 14 2022
parent reply Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Tuesday, 15 November 2022 at 06:44:16 UTC, Ali Çehreli wrote:
 In summary, you are right but the compiler cannot do anything 
 about it in all cases and we wouldn't want it to spend infinite 
 amount of time to try to determine everything.
Well, there's another way to look at it: https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html ('Unsafe Rust exists because, by nature, static analysis is conservative. When the compiler tries to determine whether or not code upholds the guarantees, it’s better for it to reject some valid programs than to accept some invalid programs. Although the code might be okay, **if the Rust compiler doesn’t have enough information to be confident, it will reject the code**. In these cases, you can use unsafe code to tell the compiler, “Trust me, I know what I’m doing.”') Are you saying that the D safety model is different? In the sense that if the D compiler doesn’t have enough information to be confident, it will accept the code?
Nov 15 2022
parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 15 November 2022 at 13:01:39 UTC, Siarhei Siamashka 
wrote:
 Well, there's another way to look at it: 
 https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html 
 ('Unsafe Rust exists because, by nature, static analysis is 
 conservative. When the compiler tries to determine whether or 
 not code upholds the guarantees, it’s better for it to reject 
 some valid programs than to accept some invalid programs. 
 Although the code might be okay, **if the Rust compiler doesn’t 
 have enough information to be confident, it will reject the 
 code**. In these cases, you can use unsafe code to tell the 
 compiler, “Trust me, I know what I’m doing.”')

 Are you saying that the D safety model is different? In the 
 sense that if the D compiler doesn’t have enough information to 
 be confident, it will accept the code?
D's safety model is the same. In `@safe` code, D will reject anything that the compiler cannot say for sure is memory safe. However, unlike in Rust, `@safe` is not the default in D, so you must mark your code as `@safe` manually if you want to benefit from these checks.
Nov 15 2022
parent reply Siarhei Siamashka <siarhei.siamashka gmail.com> writes:
On Tuesday, 15 November 2022 at 13:16:18 UTC, Paul Backus wrote:
 D's safety model is the same. In `@safe` code, D will reject 
 anything that the compiler cannot say for sure is memory safe. 
 However, unlike in Rust, `@safe` is not the default in D, so 
 you must mark your code as `@safe` manually if you want to 
 benefit from these checks.
I specifically asked for Ali's opinion. Because the context is that the compiler couldn't catch a memory safety bug in the code that was annotated as @safe (but without -dip1000) and Ali commented that "the compiler cannot do anything about it in all cases and we wouldn't want it to spend infinite amount of time to try to determine everything". This sounds like he justifies the compiler's failure and accepts this as something normal.

The https://dlang.org/spec/memory-safe-d.html page also provides a rather vague statement: "@safe functions have a number of restrictions on what they may do and are intended to disallow operations that may cause memory corruption". Which kinda means that it makes some effort to catch some memory safety bugs. This weasel language isn't very reassuring, compared to a very clear Rust documentation.
Nov 15 2022
next sibling parent =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 11/15/22 06:05, Siarhei Siamashka wrote:

 Ali commented that "the
 compiler cannot do anything about it in all cases and we wouldn't want
 it to spend infinite amount of time to try to determine everything".
Yes, that's my understanding.
 This sounds like he justifies the compiler's failure and accepts this as
 something normal.
Despite my lack of computer science education, I think the compiler's failure in analyzing source code to determine all bugs is "normal". I base my understanding on the "halting problem" and the "separate compilation" feature that D supports.
 The https://dlang.org/spec/memory-safe-d.html page also provides a
 rather vague statement: "@safe functions have a number of restrictions
 on what they may do and are intended to disallow operations that may
 cause memory corruption". Which kinda means that it makes some effort to
 catch some memory safety bugs.
Exactly. My understanding is that @safe attempts to remove memory corruptions. @live is being worked on to improve the situation by tracking liveness of data.
 This weasel language isn't very
 reassuring, compared to a very clear Rust documentation.
That's spot on. Ali
Nov 15 2022
prev sibling parent Paul Backus <snarwin gmail.com> writes:
On Tuesday, 15 November 2022 at 14:05:42 UTC, Siarhei Siamashka 
wrote:
 On Tuesday, 15 November 2022 at 13:16:18 UTC, Paul Backus wrote:
 D's safety model is the same. In `@safe` code, D will reject 
 anything that the compiler cannot say for sure is memory safe. 
 However, unlike in Rust, `@safe` is not the default in D, so 
 you must mark your code as `@safe` manually if you want to 
 benefit from these checks.
I specifically asked for Ali's opinion. Because the context is that the compiler couldn't catch a memory safety bug in the code that was annotated as @safe (but without -dip1000) and Ali commented that "the compiler cannot do anything about it in all cases and we wouldn't want it to spend infinite amount of time to try to determine everything". This sounds like he justifies the compiler's failure and accepts this as something normal. The https://dlang.org/spec/memory-safe-d.html page also provides a rather vague statement: "@safe functions have a number of restrictions on what they may do and are intended to disallow operations that may cause memory corruption". Which kinda means that it makes some effort to catch some memory safety bugs. This weasel language isn't very reassuring, compared to a very clear Rust documentation.
The goal of `@safe` is to ensure that memory corruption cannot possibly occur in `@safe` code, period--only in `@system` or `@trusted` code. If the documentation isn't clear about this, that's a failure of the documentation.

However, there are some known issues with `@safe` that require breaking changes to fix, and to make migration easier for existing code, those changes have been hidden behind the `-dip1000` flag. So in practice, if you are using `@safe` without `-dip1000`, you may run into compiler bugs that compromise memory safety.

That's what happened in your example. Slicing a stack-allocated static array *shouldn't* be allowed in `@safe` code without `-dip1000`, but the compiler allows it anyway, due to a bug, and the fix for that bug is enabled by the `-dip1000` switch.
Nov 16 2022