digitalmars.D - More evidence that memory safety is the future for programming

Walter Bright (4/4) Mar 28 2020 https://news.ycombinator.com/item?id=22711391

JN (4/8) Mar 28 2020 Over time we learned that opt-in safety doesn't work and @safe

Timon Gehr (7/17) Mar 28 2020 Why would this happen? The only thing that @safe and @live have in

rikki cattermole (11/17) Mar 28 2020 "This is why D requires and @live attribute be added to functions to

Arine (13/26) Mar 28 2020 Indeed it is interesting, on one hand, you have @safe being made

Timon Gehr (42/74) Mar 28 2020 @live will never be the default in any version of D that I am willing to...

Timon Gehr (2/4) Mar 28 2020 Actually, there is a true positive on this line.
Timon Gehr (4/10) Mar 28 2020 Another way to interpret it is to just take it as the statement that

Timon Gehr (6/7) Mar 28 2020 @live is not an Ownership/Borrowing system, even though it is true that
Dukc (21/24) Mar 30 2020 I have recently added a lot of unittests to my code. That

Atila Neves (4/12) Mar 30 2020 It's easier to use asan with ldc. I did write an allocator to do

Dukc (9/12) Mar 31 2020 Yeah, something like those is what I meant. Thanks - I have to

Atila Neves (5/7) Apr 01 2020 It is:

Dukc (2/9) Apr 01 2020 Great!

Johan (14/18) Mar 30 2020 Why is this news?

Walter Bright (4/5) Mar 30 2020 Do you mean RAII? RAII is only a partial solution. For example, it is qu...

Johan (2/8) Apr 01 2020 I meant static analysis.

Walter Bright (10/11) Apr 01 2020 I'm not familiar specifically with clang's static analysis, but my exper...

Jacob Carlborg (17/21) Apr 02 2020 This is detected by Clang without running the static analyzer:

Johan (19/32) Apr 02 2020 Come on, of course such a super simple example is detected by
Walter Bright (19/25) Apr 02 2020 Now try:

Walter Bright (25/25) Apr 02 2020 Some experimenting with clang shows it loses track of things when one le...

Sebastiaan Koppe (24/37) Mar 31 2020 I have to agree with you, most of the time I don't care about

Ola Fosheim Grøstad (5/10) Mar 31 2020 How do you do this? Do you do ref counting on the JS side?

Sebastiaan Koppe (17/29) Mar 31 2020 No, I keep all the JS objects in a JS array so the GC won't free

Walter Bright <newshound2 digitalmars.com> writes:

https://news.ycombinator.com/item?id=22711391

Fitting in with the push for @safe as the default, and the @live
Ownership/Borrowing system for D.

We can either get on the bus or get run over by the bus.

Mar 28 2020

JN <666total wp.pl> writes:

On Saturday, 28 March 2020 at 20:24:02 UTC, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391

 Fitting in with the push for @safe as the default, and the
 @live Ownership/Borrowing system for D.

 We can either get on the bus or get run over by the bus.

Over time we learned that opt-in safety doesn't work and @safe
should be the default. Do you think the same will happen with
@live?

Mar 28 2020

Timon Gehr <timon.gehr gmx.ch> writes:

On 28.03.20 22:03, JN wrote:
 On Saturday, 28 March 2020 at 20:24:02 UTC, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391

 Fitting in with the push for @safe as the default, and the @live
 Ownership/Borrowing system for D.

 We can either get on the bus or get run over by the bus.

 
 Over time we learned that opt-in safety doesn't work and @safe should be
 the default. Do you think the same will happen with @live?

Why would this happen? The only thing that @safe and @live have in
common is that they are function qualifiers that add additional type
checking. @safe is in service of an invariant (memory safety), while
@live is not. It's just a linting tool that (from the perspective of
memory safety) produces exclusively false positives in @safe code. Also,
there is no way to negate it.

Mar 28 2020

rikki cattermole <rikki cattermole.co.nz> writes:

On 29/03/2020 9:24 AM, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391
 
 Fitting in with the push for @safe as the default, and the @live
 Ownership/Borrowing system for D.
 
 We can either get on the bus or get run over by the bus.

"This is why D requires an @live attribute be added to functions to
enable the checking for just those functions, so it doesn't break every 
program out there."

Interesting quote there.

But what if we want to use these semantics isolated and not turn it on 
in every function in use?

The alternative I have been proposing is headconst tied to lifetimes, 
basically a borrowed pointer just with a bit of extra syntax that does 
propagate within a type.

https://gist.github.com/rikkimax/4cb2cc8ddcac33c1a9bb20de432f9dea

Mar 28 2020

Arine <arine123445128843 gmail.com> writes:

On Saturday, 28 March 2020 at 23:14:05 UTC, rikki cattermole 
wrote:
 On 29/03/2020 9:24 AM, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391
 
 Fitting in with the push for @safe as the default, and the
 @live Ownership/Borrowing system for D.
 
 We can either get on the bus or get run over by the bus.

 "This is why D requires an @live attribute be added to
 functions to enable the checking for just those functions, so
 it doesn't break every program out there."

 Interesting quote there.


Indeed it is interesting: on one hand, you have @safe being made
the default, breaking all code in existence. And then you have
@live being introduced so it doesn't break all code in existence.
It seems @live is making the same mistake as @safe. How long
until @live becomes the default?


I also find this claim to be quite bold:

 This is why D uses DFA to catch 100% of the positives with 0% 
 negatives.

Yet in the earlier review thread the sentiment was along the
lines of "patch the holes as they appear". So how do you go from
patching holes as they appear to 100% guaranteeing it catches
everything correctly, without thorough testing? Or are those just
empty marketing promises?

Mar 28 2020

Timon Gehr <timon.gehr gmx.ch> writes:

On 29.03.20 03:08, Arine wrote:
 On Saturday, 28 March 2020 at 23:14:05 UTC, rikki cattermole wrote:
 On 29/03/2020 9:24 AM, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391

 Fitting in with the push for @safe as the default, and the @live
 Ownership/Borrowing system for D.

 We can either get on the bus or get run over by the bus.

 "This is why D requires an @live attribute be added to functions to
 enable the checking for just those functions, so it doesn't break
 every program out there."

 Interesting quote there.

 
 
 Indeed it is interesting, on one hand, you have @safe being made the
 default, breaking all code in existence. And then you have @live being
 introduced so it doesn't break all code in existence. It seems @live is
 making the same mistake as @safe. How long until @live becomes the default?
 ...

@live will never be the default in any version of D that I am willing to
use. I really don't understand why people are inclined to think it is on
the same level as @safe. It just is not.

 
 I also find this claim to be quite bold:
 
 This is why D uses DFA to catch 100% of the positives with 0% negatives.

 
 When in the review thread previously the sentiment was along the lines 
 of, "patch the holes as they appear". So how do you go from patching 
 holes as they appear, to 100% guaranteeing it catches everything 
 correctly, without thorough testing. Or is that just empty marketing 
 promises?
 

It's not a statement that has a standard meaning. What positives and 
what negatives? Refer to the following table:

                        | problem exists | problem does not exist
----------------------------------------------------------------
flagged by linter      | true positive  | false positive
----------------------------------------------------------------
not flagged by linter  | false negative | true negative
----------------------------------------------------------------

@live leads to any of the four kinds of outcomes:

void foo(int *p) @live{
     free(p);
     free(p); // true positive
}

void bar(int *p) @live{
     auto q=p;
     *p=3;    // false positive
}

void baz() @live{
     auto p=new int;
     free(p); // false negative
}

void qux() @life{
     auto p=cast(int*)malloc(int.sizeof);
     free(p); // true negative
}

An amusing way to interpret 100% positives and 0% negatives would be to 
say the linter flags all positives (either true or false) and no 
negatives (neither true nor false). That basically means the linter 
flags every code as being problematic, which would make it completely 
useless.

In the thread on hackernews, there were a few people who interpreted the 
statement as saying that the linter has neither false positives nor 
false negatives, which is provably impossible. You can get rid of one of 
them at a time, but not both. @safe is an example of a feature that does
not have false negatives (unless the implementation is buggy), but of
course it has loads of false positives, for example you cannot manage
memory manually in @safe code without @trusted escapes, even if you do
it correctly.

Mar 28 2020

Timon Gehr <timon.gehr gmx.ch> writes:

On 29.03.20 04:28, Timon Gehr wrote:
 
 void qux() @life{

Actually, there is a true positive on this line.

Mar 28 2020

Timon Gehr <timon.gehr gmx.ch> writes:

On 29.03.20 04:28, Timon Gehr wrote:
 
 An amusing way to interpret 100% positives and 0% negatives would be to 
 say the linter flags all positives (either true or false) and no 
 negatives (neither true nor false). That basically means the linter 
 flags every code as being problematic, which would make it completely 
 useless.

Another way to interpret it is to just take it as the statement that 
everything that is flagged by the linter is a positive, and everything 
else is a negative. This is the definition of positive and negative.

Mar 28 2020

Timon Gehr <timon.gehr gmx.ch> writes:

On 28.03.20 21:24, Walter Bright wrote:
 @live Ownership/Borrowing system

@live is not an Ownership/Borrowing system, even though it is true that
it is based on concepts related to ownership and borrowing.

An Ownership/Borrowing system enforces ownership semantics in @safe
code, @live does not. It is a linter for @system and @trusted code with
no safety guarantees.

Mar 28 2020

Dukc <ajieskola gmail.com> writes:

On Saturday, 28 March 2020 at 20:24:02 UTC, Walter Bright wrote:
 Fitting in with the push for @safe as the default, and the
 @live Ownership/Borrowing system for D.

 We can either get on the bus or get run over by the bus.

I have recently added a lot of unittests to my code. That
confirmed to me that, as we all know, they are mandatory if an
even remotely bug-free program is desired :). And that's in a
program that already had no global state and used ranges, `final
switch`es and `assert`s quite a lot.

So I am thinking, perhaps a dead-easy and well known way to test 
correct `malloc`-`free` pairing in unit tests would work just as 
well, but be easier to implement and use? It'd be something like 
this to use:

```
unittest
{   auto tracer = MallocTracer();

     if (true)
     {   auto raiiObject = allocAnObject();
         raiiObject.doSomething();
         assert(tracer.numMallocs == tracer.numFrees + 1);
     }

     assert(tracer.numMallocs == tracer.numFrees);
}
```
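
A minimal sketch of how such a check could be approximated today, assuming allocations go through two hypothetical wrappers (`tracedMalloc` and `tracedFree`) rather than calling `malloc`/`free` directly. None of these names exist in druntime or Phobos, and `Buffer` is a made-up RAII type standing in for `allocAnObject()`:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical counters (thread-local module variables), not a library API.
size_t numMallocs;
size_t numFrees;

// Hypothetical wrapper: counts every successful allocation.
void* tracedMalloc(size_t size) @system nothrow @nogc
{
    auto p = malloc(size);
    if (p !is null) ++numMallocs;
    return p;
}

// Hypothetical wrapper: counts every release of a non-null pointer.
void tracedFree(void* p) @system nothrow @nogc
{
    if (p !is null) ++numFrees;
    free(p);
}

// Made-up RAII type that owns exactly one traced allocation.
struct Buffer
{
    private void* ptr;

    this(size_t size) { ptr = tracedMalloc(size); }
    @disable this(this);   // no copies, so the destructor frees only once
    ~this() { tracedFree(ptr); ptr = null; }
}

unittest
{
    immutable mallocsBefore = numMallocs;
    immutable freesBefore = numFrees;
    {
        auto buf = Buffer(64);
        // One allocation is still live inside this scope.
        assert(numMallocs - mallocsBefore == numFrees - freesBefore + 1);
    }
    // The destructor ran at scope exit, so the counts balance again.
    assert(numMallocs - mallocsBefore == numFrees - freesBefore);
}
```

The obvious limitation of this sketch is that it only sees allocations routed through the wrappers; anything calling `malloc` directly goes unnoticed.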

Mar 30 2020

Atila Neves <atila.neves gmail.com> writes:

On Monday, 30 March 2020 at 12:58:33 UTC, Dukc wrote:
 On Saturday, 28 March 2020 at 20:24:02 UTC, Walter Bright wrote:
 [...]

 I have recently added a lot of unittests to my code. That
 confirmed to me that, as we all know, they are mandatory if an
 even remotely bug-free program is desired :). And that's in a
 program that already had no global state and used ranges,
 `final switch`es and `assert`s quite a lot.

 [...]

It's easier to use asan with ldc. I did write an allocator to do 
this before asan was available though: 
https://github.com/atilaneves/test_allocator

Mar 30 2020

Dukc <ajieskola gmail.com> writes:

On Monday, 30 March 2020 at 13:20:08 UTC, Atila Neves wrote:
 It's easier to use asan with ldc. I did write an allocator to 
 do this before asan was available though: 
 https://github.com/atilaneves/test_allocator

Yeah, something like those is what I meant. Thanks - I have to 
remember those when next having problems with `malloc`s.

Lowering the bar to use a tool like these is IMO more effective 
than pushing Rust-like static analysis. Basically, to cut down 
memory problems a sanitizer should be as easy to use as the 
built-in `unittest`s. Sure static checks can be useful too, but 
to be worth it they need to be easier to use than the sanitizer, 
and in any case static checks can't completely replace sanitizers.

Mar 31 2020

Atila Neves <atila.neves gmail.com> writes:

On Tuesday, 31 March 2020 at 23:08:01 UTC, Dukc wrote:

 Basically, to cut down memory problems a sanitizer should be as 
 easy to use as the built-in `unittest`s.

It is:

dflags: "-fsanitize=address" platform="ldc"


If you're not using dub, then:

ldc2 -fsanitize=address --unittest $REST_OF_ARGS

Apr 01 2020

Dukc <ajieskola gmail.com> writes:

On Wednesday, 1 April 2020 at 14:30:20 UTC, Atila Neves wrote:
 On Tuesday, 31 March 2020 at 23:08:01 UTC, Dukc wrote:
 Basically, to cut down memory problems a sanitizer should be 
 as easy to use as the built-in `unittest`s.

 It is:

 dflags: "-fsanitize=address" platform="ldc"


 If you're not using dub, then:

 ldc2 -fsanitize=address --unittest $REST_OF_ARGS

Great!

Apr 01 2020

Johan <j j.nl> writes:

On Saturday, 28 March 2020 at 20:24:02 UTC, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391

 Fitting in with the push for @safe as the default, and the
 @live Ownership/Borrowing system for D.

 We can either get on the bus or get run over by the bus.

Why is this news?
Clang has had this for a decade, and it certainly wasn't the 
first. If there is a bus, it's left the station a very long time 
ago.

Isn't it more interesting to find a comprehensive resource
management solution, instead of working on a solution only for
the special case of memory [*]? A double file close is also bad,
for example.
Maybe RAII and move semantics isn't it, but at least it doesn't 
single out one type of resource.

-Johan

[*] Let alone the even more special case of functions with the 
exact symbol names "malloc" and "free" in this case...

Mar 30 2020

Walter Bright <newshound2 digitalmars.com> writes:

On 3/30/2020 12:25 PM, Johan wrote:
 Clang has had this for a decade

Do you mean RAII? RAII is only a partial solution. For example, it is quite easy
for an RAII object to leak a reference to its internals, and then the RAII
object gets deleted, but the reference is still there.
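
A minimal sketch of that failure mode, assuming a hypothetical `IntBox` wrapper (the type and the function `leakInternals` are invented for illustration). The destructor frees the payload correctly, yet the pointer handed out by `get()` outlives the box:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical RAII wrapper around a single heap-allocated int.
struct IntBox
{
    private int* payload;

    this(int value)
    {
        payload = cast(int*) malloc(int.sizeof);
        assert(payload !is null, "out of memory");
        *payload = value;
    }

    @disable this(this);   // non-copyable, so there is exactly one owner

    ~this()
    {
        free(payload);
        payload = null;
    }

    // Hands out a raw pointer to the internals. Nothing ties the
    // pointer's lifetime to the lifetime of the box.
    int* get() { return payload; }
}

int* leakInternals()
{
    auto box = IntBox(42);
    return box.get();   // box is destroyed on return, the pointer escapes
}
```

Every resource is released exactly once here, so RAII has done its job, but any caller that dereferences the result of `leakInternals()` reads freed memory.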

Mar 30 2020

Johan <j j.nl> writes:

On Tuesday, 31 March 2020 at 01:26:50 UTC, Walter Bright wrote:
 On 3/30/2020 12:25 PM, Johan wrote:
 Clang has had this for a decade

 Do you mean RAII? RAII is only a partial solution. For example, 
 it is quite easy for an RAII object to leak a reference to its 
 internals, and then the RAII object gets deleted, but the 
 reference is still there.

I meant static analysis.

Apr 01 2020

Walter Bright <newshound2 digitalmars.com> writes:

On 4/1/2020 12:03 PM, Johan wrote:
 I meant static analysis.

I'm not familiar specifically with clang's static analysis, but my experience 
with such is they detect a few obvious patterns, and miss the subtle ones that 
cause all the trouble. For example,

     int* foo(int i) { return &i; }

gets detected by static analysis. This one does not:

     int* bar(int* p) { return p; }

     int* foo(int i) { return bar(&i); }

The idea with D is to solve it in the general case, not by matching specific 
patterns. #Dip1000 is how D solves the second case in general.

Apr 01 2020

Jacob Carlborg <doob me.com> writes:

On Wednesday, 1 April 2020 at 21:44:27 UTC, Walter Bright wrote:

     int* foo(int i) { return &i; }

This is detected by Clang without running the static analyzer:

$ clang main.c
main.c:1:27: warning: address of stack memory associated with 
parameter 'i' returned [-Wreturn-stack-address]
int* foo(int i) { return &i; }
                           ^
1 warning generated.

 gets detected by static analysis. This one does not:

     int* bar(int* p) { return p; }

     int* foo(int i) { return bar(&i); }

The Clang static analyzer detects this:

$ clang --analyze main.c
main.c:2:19: warning: Address of stack memory associated with 
local variable 'i' returned to caller
int* foo(int i) { return bar(&i); }
          ~~~~~    ^~~~~~~~~~~~~~
1 warning generated.

--
/Jacob Carlborg

Apr 02 2020

Johan <j j.nl> writes:

On Thursday, 2 April 2020 at 10:04:24 UTC, Jacob Carlborg wrote:
 On Wednesday, 1 April 2020 at 21:44:27 UTC, Walter Bright wrote:

 gets detected by static analysis. This one does not:

     int* bar(int* p) { return p; }

     int* foo(int i) { return bar(&i); }

 The Clang static analyzer detects this:

 $ clang --analyze main.c
 main.c:2:19: warning: Address of stack memory associated with 
 local variable 'i' returned to caller
 int* foo(int i) { return bar(&i); }
          ~~~~~    ^~~~~~~~~~~~~~
 1 warning generated.

Come on, of course such a super simple example is detected by 
clang's static analyzer. From my Inkscape days, I remember a case 
where the static analyzer found a bug with 80+ steps across 
multiple cpp files and complex control flow. I imagine Clang's 
analyzer is doing proofs similar to those the D compiler does. My
guess is that the problem is similar to the halting problem. And 
memory issues like this are just not provable (I think): 
https://github.com/dlang/dmd/pull/7050

Running the program with a sanitizer (memory, threads, UB, ...) 
appears easier to me in the many cases where this is possible. 
Nice to see you are advocating it Atila! :)

My only point was to question the newness of the news in OP and 
the statement that we need to "get on the bus". C++ has had 
powerful static analysis engines for a long time, and I don't 
think it changed the landscape. C++ has very nice runtime 
sanitizers, but again I don't think it is changing the landscape 
as much as one may have expected.

-Johan

Apr 02 2020

Walter Bright <newshound2 digitalmars.com> writes:

On 4/2/2020 3:04 AM, Jacob Carlborg wrote:
 $ clang --analyze main.c
 main.c:2:19: warning: Address of stack memory associated with local variable 'i'
 returned to caller
 int* foo(int i) { return bar(&i); }
           ~~~~~    ^~~~~~~~~~~~~~
 1 warning generated.

Now try:

   int* bar(int* p);
   int* foo(int i) { return bar(&i); }

And then:

   struct S { int* p; };

   struct S foo(struct S* ps, int i)
   {
       ps->p = &i;
       return *ps;
   }

It falls apart. Now let's try D:

   struct S { int* p; }

   @safe S foo(S* ps, int i)
   {
       ps.p = &i; // Error: cannot take address of parameter i in @safe function foo
       return *ps;
   }

The point is to get them all, not a few simple patterns.

Apr 02 2020

Walter Bright <newshound2 digitalmars.com> writes:

Some experimenting with clang shows it loses track of things when one level of 
indirection is added:

   struct S { int i; };
   struct S* malloc();
   void free(struct S*);

   void nut(struct S* s, int* pi) { free(s); *pi = 4; }  // write after free

   void bolt()
   {
     struct S* s = malloc();
     struct S** ps = &s;     // <= add indirection
     nut(*ps, &(*ps)->i);
   }

or when extern functions are used (i.e. function bodies are not available).

Other things clang doesn't detect:

   int* malloc();
   void free(int*);

   int nut();

   void bolt(int i)
   {
     int* p = malloc();
     *p = 1;
   }

Doesn't find the memory leak. Also, if you write your own storage allocator, 
clang doesn't pick it up.

clang actually does a nice job with what it has to work with - it's the C and 
C++ languages that are not amenable to doing it 100%.

Apr 02 2020

Sebastiaan Koppe <mail skoppe.eu> writes:

On Monday, 30 March 2020 at 19:25:53 UTC, Johan wrote:
 On Saturday, 28 March 2020 at 20:24:02 UTC, Walter Bright wrote:
 https://news.ycombinator.com/item?id=22711391

 Fitting in with the push for @safe as the default, and the
 @live Ownership/Borrowing system for D.

 We can either get on the bus or get run over by the bus.

 Isn't it more interesting to find a comprehensive resource 
 management solution, instead of working on a solution only for 
 the special case of memory [*]. A double file close is also 
 bad, for example.
 Maybe RAII and move semantics isn't it, but at least it doesn't 
 single out one type of resource.

I have to agree with you, most of the time I don't care about
memory, but rather whatever I am modelling within that memory.
Yes, that 24 might look like a regular int, but to me it holds
significance beyond the fact that it is a mere int.

For my web library spasm I am dealing with JS objects that I have
to release at the right time. Things get hairy with delegates,
callbacks and long-lived references.

At first I tried reference counting, but found out there was 
significant bloat (I like to keep my web binaries small), 
eventually I settled for non-copyable objects so I get unique 
references, and release them on the JS side when the last and 
only reference goes out of scope.

Of course now I run into other issues. 'scope ref' helps a bit,
but I find that I have to write `move` a lot. The parts where I
am struggling a bit are where I get a handle to a JS object that
is conceptually a sumtype, an optional or a base object and need
to unwrap or upcast things. I have solved them, but with some
rather gnarly @system code.

Remember, I want unique references (and borrowing) for plain ints
here. I know it might look to the compiler like a regular int it
can simply copy (and it is), but to me it looks like a JS
mouseevent object that needs cleaning up after the last reference
is gone.
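
A rough sketch of the pattern described above, not spasm's actual code: `JsHandle`, `liveHandles`, `releaseOnJsSide` and `consume` are all hypothetical names, and the associative array merely stands in for the JS-side bookkeeping.

```d
import std.algorithm.mutation : move;

// Stand-in for the JS side: which handles are currently kept alive.
bool[uint] liveHandles;

// Hypothetical release call; in a real binding this would cross into JS.
void releaseOnJsSide(uint handle) { liveHandles.remove(handle); }

struct JsHandle
{
    private uint handle;   // just an int as far as the compiler can tell
    private bool owns;     // false for JsHandle.init and moved-from values

    this(uint handle)
    {
        this.handle = handle;
        owns = true;
        liveHandles[handle] = true;
    }

    @disable this(this);   // non-copyable: at most one owner at a time

    ~this()
    {
        if (owns) releaseOnJsSide(handle);
    }
}

// Takes ownership; the handle is released when the parameter dies.
void consume(JsHandle h) {}

unittest
{
    auto h = JsHandle(1);
    assert((1 in liveHandles) !is null);
    // auto copy = h;            // would not compile: copying is disabled
    consume(move(h));            // ownership has to be transferred explicitly
    assert(1 !in liveHandles);   // released once consume's parameter died
}
```

The explicit `move` at each ownership transfer is the cost of this design: the compiler will not copy the int silently, but it will not move it silently either.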

Mar 31 2020

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 31 March 2020 at 07:26:38 UTC, Sebastiaan Koppe wrote:
 At first I tried reference counting, but found out there was 
 significant bloat (I like to keep my web binaries small), 
 eventually I settled for non-copyable objects so I get unique 
 references, and release them on the JS side when the last and 
 only reference goes out of scope.

How do you do this? Do you do ref counting on the JS side?

I see that there is a proposal for weak references for javascript:
https://v8.dev/features/weak-references

I guess that could be useful.

Mar 31 2020

Sebastiaan Koppe <mail skoppe.eu> writes:

On Tuesday, 31 March 2020 at 11:38:40 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 31 March 2020 at 07:26:38 UTC, Sebastiaan Koppe 
 wrote:
 At first I tried reference counting, but found out there was 
 significant bloat (I like to keep my web binaries small), 
 eventually I settled for non-copyable objects so I get unique 
 references, and release them on the JS side when the last and 
 only reference goes out of scope.

 How do you do this? Do you do ref counting on the JS side?

No, I keep all the JS objects in a JS array so the GC won't free 
them. When the time comes I call a release function from D, which 
removes the object from the array.

In the off chance that the D code takes hold of the same JS object
twice (e.g. the same querySelector call twice, or similar), there would
be 2 entries in the JS array for the same object, each having
its own unique reference.

JS engines do objects in arrays pretty well. It works nicely 
combined with the unique reference semantics I have on the D 
side. And if you really need to you can wrap it in a refcount and 
get the best of both worlds.

 I see that there is a proposal for weak references for 
 javascript:
 https://v8.dev/features/weak-references

 I guess that could be useful.

That is pure JS though. There is a webassembly proposal to 
introduce anyref, whereby you can move js objects into wasm, and 
the js engine will track them. It might take a while before that 
is available.

Mar 31 2020
