
digitalmars.D - The non-intrusive ad-hoc borrowing scheme…

reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
Since Rust-style borrow checking leads to various complications, 
why not embrace testing as an alternative approach? Basically, 
provide something like valgrind for borrowing.

You can track shared ownership by letting pointers and slices use 
ref-counting. That is well-known.

In my own attempts to model borrowing in D, I added a small 
twist: borrowing pointers that will not free the object when the 
refcount goes to zero. Instead, they fail. You have to release 
the owner explicitly, and that can only be done when the 
refcount is zero.
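A minimal sketch of this twist, written in C++ for illustration (the names `Owned` and `Borrowed` are invented, not from the post): borrowed pointers only bump a debug counter and never free, while releasing the owner is refused as long as the counter is nonzero.

```cpp
#include <atomic>
#include <cassert>

// Sketch: debug-time borrow tracking. A Borrowed view never frees the
// object; only Owned::release() frees, and only when no borrows are live.
template <typename T>
struct Owned {
    T* ptr;
    std::atomic<int> borrows{0};

    explicit Owned(T* p) : ptr(p) {}

    struct Borrowed {
        Owned* owner;
        explicit Borrowed(Owned* o) : owner(o) { ++o->borrows; }
        ~Borrowed() { --owner->borrows; }  // going to zero does NOT free
        T& operator*() const { return *owner->ptr; }
    };

    // Hand out a borrowed view; bumps the debug counter.
    Borrowed borrow() { return Borrowed(this); }

    // Explicit release: refused (an error in a debug build) while any
    // borrow is outstanding. Returns whether the object was freed.
    bool release() {
        if (borrows.load() != 0) return false;  // a borrow is still live
        delete ptr;
        ptr = nullptr;
        return true;
    }
};
```

In release mode the counter updates could be compiled out, leaving plain pointers, which is the point of the scheme.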

That allows you to track borrowing during debugging/testing, and 
turn it off in release mode. You also avoid the inconveniences 
that Rust comes with.

The downside is that the resulting code is only as safe as the 
testing environment ensures…

Anyway, this is relatively easy to implement if the code base is 
well-behaved.
Nov 13 2019
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 13 November 2019 at 12:22:12 UTC, Ola Fosheim 
Grøstad wrote:
 That allows you to track borrowing during debugging/testing, 
 and turn it off in release mode.
You could also have a circular buffer that registers where borrowing/unborrowing happens, so that you can pinpoint which source-code line to look at. (And you could still use the same tooling for static analysis of pointers that C++ will use, plus valgrind.)
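The circular buffer could look roughly like this C++ sketch (`BorrowLog`, `LOG_BORROW`, etc. are invented names): a fixed-size ring that overwrites its oldest entries and records the file and line of each borrow/unborrow event.

```cpp
#include <cassert>
#include <cstddef>

// Ring buffer of borrow events, so a failed release can be traced
// back to the source lines where the live borrows were taken.
enum class Ev { Borrow, Unborrow };

struct BorrowLog {
    static constexpr std::size_t N = 64;
    struct Entry { const char* file; int line; Ev kind; };
    Entry ring[N] = {};
    std::size_t next = 0;
    std::size_t count = 0;

    void record(const char* file, int line, Ev kind) {
        ring[next] = {file, line, kind};
        next = (next + 1) % N;          // oldest entries are overwritten
        if (count < N) ++count;
    }
    const Entry& latest() const { return ring[(next + N - 1) % N]; }
};

// Capture the call site automatically.
#define LOG_BORROW(log)   (log).record(__FILE__, __LINE__, Ev::Borrow)
#define LOG_UNBORROW(log) (log).record(__FILE__, __LINE__, Ev::Unborrow)
```

On a failed release, dumping the ring gives the most recent borrow sites to inspect.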
Nov 13 2019
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 13 November 2019 at 12:22:12 UTC, Ola Fosheim 
Grøstad wrote:
 Since Rust style borrowing checks lead to various 
 complications, why not embrace testing as an alternative 
 approach. Basically providing something like valgrind for 
 borrowing.

 [...]
How would this differ from actually using valgrind or address sanitizer?
Nov 13 2019
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 13 November 2019 at 14:15:10 UTC, Atila Neves wrote:
 How would this differ from actually using valgrind or address 
 sanitizer?
Good question. I haven't used this concept for anything more than playing with it. Off the top of my head, the potential could be:

- It should work with custom allocators that reuse memory.

- It could track borrowed views on an array, such as array slices, so you could tell whether there are overlapping views being borrowed or whether a view is unique.

- It should work for concurrency scenarios where you want to know if you have an isolated reference that you can safely transfer to another context. I.e. Pony reference types, but here it would be done at runtime as dynamic types.

- You could also extend it to work with multithreading, so that you could tell whether a resource is only available to a single thread.

- It could hopefully give better debugging information by tracing the conceptual borrowing rather than raw memory accesses.

I think there are opportunities here for borrowed references that are stored on the heap in graph-like structures (which is a limiting factor for Rust).
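The second point, detecting overlapping borrowed views of one array, could be checked at runtime along these lines (a C++ sketch with invented names): each borrow registers its half-open index range, and a new borrow is refused if it intersects a live one.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Debug-time overlap check for borrowed slices of a single array.
struct SliceBorrows {
    std::vector<std::pair<std::size_t, std::size_t>> live;  // [begin, end)

    // Register a borrow of [begin, end); fail if it overlaps a live one.
    bool tryBorrow(std::size_t begin, std::size_t end) {
        for (auto& [b, e] : live)
            if (begin < e && b < end) return false;  // ranges intersect
        live.push_back({begin, end});
        return true;
    }

    // Unregister a previously registered borrow.
    void release(std::size_t begin, std::size_t end) {
        for (auto it = live.begin(); it != live.end(); ++it)
            if (it->first == begin && it->second == end) {
                live.erase(it);
                return;
            }
    }
};
```

A "unique view" test falls out for free: the view is unique when `live` holds exactly one range.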
Nov 13 2019
prev sibling next sibling parent reply IGotD- <nise nise.com> writes:
On Wednesday, 13 November 2019 at 12:22:12 UTC, Ola Fosheim 
Grøstad wrote:
 Since Rust style borrowing checks lead to various 
 complications, why not embrace testing as an alternative 
 approach. Basically providing something like valgrind for 
 borrowing.

 You can track shared ownership by letting pointers and slices 
 use ref-counting. That is well-known.

 In my own attempts to model borrowing in D, I did a tiny little 
 twist to it, creating borrowing pointers that will not free 
 when the refcount goes to zero. Instead they fail. So you have 
 to release the owner explicitly and that can only be done when 
 the refcount is zero.

 That allows you to track borrowing during debugging/testing, 
 and turn it off in release mode. You also avoid the 
 inconveniences that Rust comes with.

 The downside is that the resulting code is only as safe as the 
 testing environment ensures…

 Anyway, this is relatively easy to implement if the code base 
 is wellbehaved.
Is this something that is done at compile time, at run time, or both?

I was thinking that if you could track reference counting as far as possible with CTFE (or any other static analysis), you could actually remove refcount increments/decreases where you can prove it is safe to remove them, and optimize the code that way. Where the compiler cannot statically track the lifetime, it inserts the associated reference-counting code.

Another question: isn't a borrow just an increase of the reference counter? As long as someone is using the object, it should hold an acquired reference.
Nov 13 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 13 November 2019 at 20:13:47 UTC, IGotD- wrote:
 Is this something that is done during compile time or run time 
 or both?
At runtime, and possibly via analysis of a log after running: in the general case that would cover much more than static analysis can. BUT the syntax should be such that it could also be used as annotations for verification/static analysis. However, you might need some compile-time support to make it pleasant to use.
 I was thinking if you could track reference counting as far as 
 you can with CTFE (or any other static analysis) you could 
 actually remove increase/decrease of reference counting in the 
 code if you could prove it is safe to remove them and optimize 
 the code that way. When the compiler cannot statically track 
 the life time it inserts the associated reference counting code.
Yes, that is how Swift works. However, my assumption here is that we don't care about execution speed, we only care about getting as good information as possible so the programmer can get a good view of the weaknesses in his program.
 Another question is, isn't a borrow just an increase of the 
 reference counter? As long as someone is using the object, it 
 should have an associated acquired reference.
A generic reference-counting scheme implies shared ownership, so that would be slightly different from borrowing (in the sense that with borrowing you only "rent a view" on the object, but never do anything drastic with it like deleting it or handing it to another context/thread).

The point of counting borrows is to be able to test whether the program behaves according to the spec. For instance, you could even support casting "old" objects to immutable if nobody has borrowed the object at that time. However, this check would not be done in release mode, so problems could still occur; the quality would depend on how rigid the testing regime is.
Nov 13 2019
parent reply Araq <rumpf_a web.de> writes:
On Wednesday, 13 November 2019 at 20:33:31 UTC, Ola Fosheim 
Grøstad wrote:
 On Wednesday, 13 November 2019 at 20:13:47 UTC, IGotD- wrote:
 [...]
At runtime and possibly an analysis of a log after running, in the general case, as that would cover much more than can be done by static analysis. BUT the syntax should be such that it also could be used as annotations for verification/static analysis. [...]
I think you are about to re-invent https://researcher.watson.ibm.com/researcher/files/us-bacon/Dingle07Ownership.pdf

See https://github.com/nim-lang/RFCs/issues/144 for a discussion.
Nov 13 2019
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 13 November 2019 at 23:04:38 UTC, Araq wrote:

 I think you are about to re-invented 
 https://researcher.watson.ibm.com/researcher/files/us-bacon/Dingle07Ownership.pdf
Thanks for the reference, I'll read it later :-)
 See https://github.com/nim-lang/RFCs/issues/144 for a 
 discussion.
Yes, that concept for Nim seems to be similar in spirit, although what I am thinking about is only for testing, so the time and memory overhead is acceptable.

Actually, maybe refcounting isn't needed. Maybe one could just emit the owning/borrowing/unborrowing events to disk and analyze those patterns after running the tests.
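The offline-analysis idea could be sketched like this in C++ (the event shape and names are my own, not from the post): record a stream of ownership events during the test run, then replay it afterwards and check that every object's borrows return to zero before the object is released.

```cpp
#include <map>
#include <vector>

// One recorded ownership event, as it might be written to the log.
enum class Op { Borrow, Unborrow, Release };
struct Event { int object; Op op; };

// Post-run check: no object is released while borrows are live,
// and no unborrow happens without a matching borrow.
bool streamIsClean(const std::vector<Event>& events) {
    std::map<int, int> borrows;  // object id -> live borrow count
    for (const auto& ev : events) {
        switch (ev.op) {
            case Op::Borrow:   ++borrows[ev.object]; break;
            case Op::Unborrow: if (--borrows[ev.object] < 0) return false; break;
            case Op::Release:  if (borrows[ev.object] != 0) return false; break;
        }
    }
    return true;
}
```

Since the analysis runs after the fact, it can afford expensive checks (and richer reports) that a live refcount never could.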
Nov 14 2019
prev sibling parent reply Nick Treleaven <nick geany.org> writes:
On Wednesday, 13 November 2019 at 12:22:12 UTC, Ola Fosheim 
Grøstad wrote:
 That allows you to track borrowing during debugging/testing, 
 and turn it off in release mode. You also avoid the 
 inconveniences that Rust comes with.

 The downside is that the resulting code is only as safe as the 
 testing environment ensures…
Sounds similar to my runtime-checked RCSlice idea: https://forum.dlang.org/post/aeeffshzkfjbrejztksj forum.dlang.org The checks could be disabled when -boundscheck=off is used.
Nov 14 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 14 November 2019 at 12:49:00 UTC, Nick Treleaven 
wrote:
 Sounds similar to my runtime-checked RCSlice idea:
 https://forum.dlang.org/post/aeeffshzkfjbrejztksj forum.dlang.org
It is similar in spirit at least.
 The checks could be disabled when -boundscheck=off is used.
Yes, but I was thinking about something that tests more thoroughly than performance requirements would normally allow. But maybe it could be configured in a more granular manner than on/off.
Nov 19 2019
parent reply Nick Treleaven <nick geany.org> writes:
On Wednesday, 20 November 2019 at 06:40:50 UTC, Ola Fosheim 
Grøstad wrote:
 But maybe it could be configured in a more granular manner than 
 on/off.
Do you have any ideas on that?

One nice thing is that the struct doesn't need to expose an element by ref unless the user needs to pass it as a ref argument to some function. Instead, the struct can define opIndexAssign and opIndexOpAssign, which don't need to bump the reference count. I probably didn't get that far with my proof-of-concept RCSlice.
Nov 22 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 22 November 2019 at 15:32:14 UTC, Nick Treleaven wrote:
 On Wednesday, 20 November 2019 at 06:40:50 UTC, Ola Fosheim 
 Grøstad wrote:
 But maybe it could be configured in a more granular manner 
 than on/off.
Do you have any ideas on that?
There are so many options... but if we set aside "debug-only ref-counting", then this could be a possible "performance philosophy":

I thought about something yesterday regarding bounds checks. I think it would be nice if one could state that a code section should be high_performance and then get warnings everywhere a safety check has survived all the optimization passes. One would then have to mark those bounds checks as "necessary" to suppress the warning. That way you can basically write safe code that is performant and focus time and energy where it matters.

If one had ARC optimization like Swift, then maybe something similar could be done for reference counting. In a high_performance section one would get warnings if the refcount is changed and would have to write explicit "borrow statements" to suppress the warnings. Then one could later figure out a way to move those out of performance-sensitive code (like loops).
 One nice thing is that the struct doesn't need to expose an 
 element by ref unless the user needs to pass it as a ref 
 argument to some function.

 Instead, the struct can define opIndexAssign and 
 opIndexOpAssign, which don't need to bump the reference count. 
 I probably didn't get that far with my proof of concept RCSlice.
Another interesting thing about having RCSlices is that you might find a concurrency use case where you could put write locks on only certain parts of an array and get safe read/write access to different slices of the same array without error-prone manual locking... not sure exactly how that would work out in terms of performance, but it could at least be useful in debug mode.
Nov 22 2019
parent reply Paulo Pinto <pjmlp progtools.org> writes:
On Friday, 22 November 2019 at 16:09:51 UTC, Ola Fosheim Gr wrote:
 On Friday, 22 November 2019 at 15:32:14 UTC, Nick Treleaven 
 wrote:
 On Wednesday, 20 November 2019 at 06:40:50 UTC, Ola Fosheim 
 Grøstad wrote:
 But maybe it could be configured in a more granular manner 
 than on/off.
Do you have any ideas on that?
There are so many options... but if we set aside "debug-only ref-counting", then this could be a possible "performance philosophy":

I thought about something yesterday regarding bounds checks. I think it would be nice if one could state that a code section should be high_performance and then get warnings everywhere a safety check has survived all the optimization passes. One would then have to mark those bounds checks as "necessary" to suppress the warning. That way you can basically write safe code that is performant and focus time and energy where it matters.

If one had ARC optimization like Swift, then maybe something similar could be done for reference counting. In a high_performance section one would get warnings if the refcount is changed and would have to write explicit "borrow statements" to suppress the warnings. Then one could later figure out a way to move those out of performance-sensitive code (like loops).

…
You mean the optimization that gets slammed by state-of-the-art tracing GC implementations? https://github.com/ixy-languages/ixy-languages

If one cares about performance, there are only two paths: linear and affine type systems, with their complexity for typical everyday coding, or tracing GCs.

Reference-counting GCs are just the easy way to get automatic memory management, and when one adds enough machinery to make them compete with tracing GCs in performance, they end up being a tracing GC algorithm in disguise.
Nov 22 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 22 November 2019 at 17:54:53 UTC, Paulo Pinto wrote:
 You mean the optimization that gets slammed by state of the art 
 tracing GC implementations?
You can get the exact same performance from ARC as from Rust, because the concept is the same... but Swift is built on Objective-C's implementation. What is wrong here is to reference a specific implementation.
 If one cares about performance there are only two paths, linear 
 and affine type systems with their complexity for typical every 
 day coding, or tracing GCs.
Actually, no. That "benchmark" didn't really tell me anything interesting. It might be interesting for people wanting to implement network drivers in a managed language, which is worrisome for a large number of reasons... (I mean, why would you implement a tiny program of 1000 lines using a GC?)
 Reference counting GC are just the easy way to get automatic 
 memory management, and when one adds enough machinery to make 
 them compete with tracing GC in performance, they end up being 
 a tracing GC algorithm in disguise.
No, then you end up with C++ unique_ptr...
Nov 22 2019
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 22 November 2019 at 18:35:48 UTC, Ola Fosheim Grøstad 
wrote:
 No, then you end up with C++ unique_ptr...
The point is: if you sufficiently constrain what can be expressed in the language and do deep enough analysis, then you end up right there. It isn't really possible to make broad statements about memory management without considering the data structures you need to represent, maintain, and evolve...
Nov 22 2019