digitalmars.D - Nim's ORC - Vorsprung durch Algorithmen

zoujiaqing <zoujiaqing gmail.com> writes:
https://nim-lang.org/blog/2020/12/08/introducing-orc.html
Feb 01
IGotD- <nise nise.com> writes:
On Monday, 1 February 2021 at 09:46:48 UTC, zoujiaqing wrote:
 https://nim-lang.org/blog/2020/12/08/introducing-orc.html
Yes, what are we supposed to discuss?
Feb 01
zoujiaqing <zoujiaqing gmail.com> writes:
On Monday, 1 February 2021 at 10:33:43 UTC, IGotD- wrote:
 On Monday, 1 February 2021 at 09:46:48 UTC, zoujiaqing wrote:
 https://nim-lang.org/blog/2020/12/08/introducing-orc.html
Yes, what are we supposed to discuss?
Okay! The D language should also take this as a reference for its memory management, shouldn't it? What do the officials think? What does the community think?
Feb 01
Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Monday, 1 February 2021 at 13:41:24 UTC, zoujiaqing wrote:
 On Monday, 1 February 2021 at 10:33:43 UTC, IGotD- wrote:
 On Monday, 1 February 2021 at 09:46:48 UTC, zoujiaqing wrote:
 https://nim-lang.org/blog/2020/12/08/introducing-orc.html
Yes, what are we supposed to discuss?
Okay! The D language should also take this as a reference for its memory management, shouldn't it? What do the officials think? What does the community think?
I think it's a great option to have, and we should work on a similar feature for D. It has been discussed many times in the past in this newsgroup, but as far as I know no one has actually started working on it yet. Perhaps it could be a #saoc or #gsoc project?
On the library side, I think that exposing the GC building blocks (like std.experimental.allocator) would help. On the type system side, #dip1000 and something like #dip1021 certainly help, but they are only parts of the bigger story. IIRC Nim has a strict separation between "managed" and raw pointers, while in D they're the same type (class references also have unclear ownership semantics).
One of the optimizations mentioned in this article:
 The Nim compiler analyses the involved types and only if it is 
 potentially cyclic, code is produced that calls into the cycle 
 collector. This type analysis can be helped out by annotating a 
 type as acyclic.
This can easily be done in D, both by meta-programming and inside the compiler. Perhaps it can be added on top of our existing support for precise GCs (https://dlang.org/spec/traits.html#getPointerBitmap).
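To make the quoted optimization more concrete, here is a minimal sketch of that kind of type analysis, written in Rust rather than D or Nim. It is not Nim's actual implementation. `TypeGraph`, `is_potentially_cyclic` and `example_graph` are hypothetical names, and a type is modelled only by the list of types it holds managed references to. Under this simplified criterion, a type counts as potentially cyclic if it can reach itself by following those references.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical model: each type name maps to the types it holds managed references to.
type TypeGraph = HashMap<&'static str, Vec<&'static str>>;

/// Returns true if `start` can reach itself by following reference edges,
/// i.e. values of that type could take part in a reference cycle.
fn is_potentially_cyclic(graph: &TypeGraph, start: &str) -> bool {
    let mut visited: HashSet<&str> = HashSet::new();
    // Begin with the types directly referenced by `start`.
    let mut stack: Vec<&str> = graph.get(start).cloned().unwrap_or_default();
    while let Some(current) = stack.pop() {
        if current == start {
            return true;
        }
        // Only expand each type once.
        if visited.insert(current) {
            if let Some(next) = graph.get(current) {
                stack.extend(next.iter().copied());
            }
        }
    }
    false
}

/// Small example graph (assumed for illustration): `Node` refers to itself,
/// `A` and `B` refer to each other, `Payload` and `Buffer` are acyclic.
fn example_graph() -> TypeGraph {
    let mut g = TypeGraph::new();
    g.insert("Node", vec!["Node", "Payload"]);
    g.insert("Payload", vec!["Buffer"]);
    g.insert("Buffer", vec![]);
    g.insert("A", vec!["B"]);
    g.insert("B", vec!["A"]);
    g
}
```

Only types for which `is_potentially_cyclic` returns true would need code that calls into a cycle collector. An "acyclic" annotation would correspond to removing a type's outgoing edges before running the check.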
Feb 01
IGotD- <nise nise.com> writes:
On Monday, 1 February 2021 at 14:08:06 UTC, Petar Kirov 
[ZombineDev] wrote:
 Nim has a strict separation between "managed" and raw pointers, 
 while in D they're the same type (class references also have 
 unclear ownership semantics).
Yes, that is what makes the difference. Nim has all the degrees of freedom when it comes to changing GC algorithms and offers several GC algorithms that the programmer can choose from. D is painted into a corner because of the lack of managed pointers. Also, DIP1000 and DIP1021 are going nowhere. I haven't seen any plan for what they would produce in the end, so it is just an unplanned attempt and has nothing to do with improving the GC.
Feb 01
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Feb 01, 2021 at 05:30:30PM +0000, IGotD- via Digitalmars-d wrote:
[...]
 Also, DIP1000 and DIP1021 are going nowhere. I haven't seen any
 plan for what they would produce in the end, so it is just an
 unplanned attempt and has nothing to do with improving the GC.
DIP1000 has nothing to do with the (current) GC. It is to prepare the ground for implementing some kind of ARC scheme.
T
-- 
Once the bikeshed is up for painting, the rainbow won't suffice. -- Andrei Alexandrescu
Feb 01
IGotD- <nise nise.com> writes:
On Monday, 1 February 2021 at 17:43:25 UTC, H. S. Teoh wrote:
 DIP1000 has nothing to do with the (current) GC.  It is to 
 prepare the ground for implementing some kind of ARC scheme.
DIP1000 and DIP1021 are now over a year old, and I haven't seen any document describing this or any ARC implementation. It is also unclear how ARC is supposed to coexist with raw pointers as they work today.
Feb 01
Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Monday, 1 February 2021 at 17:43:25 UTC, H. S. Teoh wrote:
 On Mon, Feb 01, 2021 at 05:30:30PM +0000, IGotD- via 
 Digitalmars-d wrote: [...]
 Also, DIP1000 and DIP1021 are going nowhere. I haven't seen
 any plan for what they would produce in the end, so it is
 just an unplanned attempt and has nothing to do with
 improving the GC.
DIP1000 has nothing to do with the (current) GC. It is to prepare the ground for implementing some kind of ARC scheme. T
Yes, those DIPs have nothing to do with D's current GC, but ARC, ORC, ..., and tracing are just different forms of garbage collection (all with different trade-offs). As explained by the Nim article, their compiler takes advantage of move semantics, escape analysis and other forms of static analysis (basically what DIP1000 and DIP1021 are about) to optimize the load on the "GC". Go's compiler uses escape analysis to determine when heap allocations can be demoted to stack allocations (LDC also has a GC2Stack optimization pass, but it's a shame that it's not part of the front-end proper).
I wish we as a community would stop with this nonsensical split between the "GC" and the "no GC" crowds, us versus them, etc. They are just different sides of the same coin. The sooner we realize that using the GC is not a binary choice but a spectrum of options, each good for different use cases, the better.
#dip1021 (which doesn't even mention "@live" btw) is a step in the wrong direction IMO. It is a kind of cargo cult version of Rust's move and ownership semantics. Even if Rust was not the first language to implement this, it is certainly the language which immensely popularized affine types [1]. I think the main innovation of Rust's community is the realization that affine types are not just about memory management but are about adding a level of expressivity to the language that can't be easily emulated otherwise and that can be applied to different classes of problems [2] [3] [4].
#dip1021 doesn't support safe use of both owned and non-owned/GC memory (or "managed" and "unmanaged" pointers) in the same function (since in D there's no distinction). But what is worse is that #dip1021 doesn't support the typestate pattern in @safe code. IMO affine-like types should be distinct types available in @system and @safe code alike, not just in some temporal anomaly that is @live.
[1]: https://gankra.github.io/blah/linear-rust/#adding-proper-must-use-types-to-rust
[2]: http://cliffle.com/blog/rust-typestate/
[3]: https://rust-unofficial.github.io/patterns/intro.html
[4]: https://munksgaard.me/papers/laumann-munksgaard-larsen.pdf
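For readers unfamiliar with the typestate pattern referenced in [2], a minimal Rust sketch follows. The `Connection`, `Closed`, `Open` and `run_session` names are invented for illustration and are not taken from any of the linked articles. Because `open` and `close` take `self` by value, the old state is moved and cannot be used again, which is the affine behaviour the post argues should be expressible through distinct types.

```rust
use std::marker::PhantomData;

/// Marker types for the two states (hypothetical names).
struct Closed;
struct Open;

/// A connection whose state is tracked in its type parameter.
struct Connection<State> {
    log: Vec<String>,
    _state: PhantomData<State>,
}

impl Connection<Closed> {
    fn new() -> Self {
        Connection { log: Vec::new(), _state: PhantomData }
    }

    /// Consumes the closed connection and returns an open one.
    fn open(mut self) -> Connection<Open> {
        self.log.push("open".to_string());
        Connection { log: self.log, _state: PhantomData }
    }
}

impl Connection<Open> {
    /// Only available while the connection is open.
    fn send(&mut self, msg: &str) {
        self.log.push(format!("send:{}", msg));
    }

    /// Consumes the open connection and returns the log for inspection.
    fn close(mut self) -> Vec<String> {
        self.log.push("close".to_string());
        self.log
    }
}

/// Walks through the only order of calls that the types allow.
fn run_session() -> Vec<String> {
    let closed = Connection::<Closed>::new();
    let mut open = closed.open();
    // closed.open();                          // would not compile: `closed` was moved
    // Connection::<Closed>::new().send("x");  // would not compile: no `send` on a closed connection
    open.send("hello");
    open.close()
}
```

Invalid call orders are rejected at compile time rather than checked at run time, and nothing in the sketch is specific to memory management.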
Feb 01
Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 2 February 2021 at 07:25:21 UTC, Petar Kirov 
[ZombineDev] wrote:
 [...]
I would like to get a group going to hash out the future of these ideas in D. The connection with affine and linear type systems is definitely the lens to look through, rather than merely memory. 1021 is a step in the right direction practically, in that it catches bugs, but we have to dream bigger.
For a simple example of where advanced (substructural) type semantics buy you nice things other than memory: nodiscard is effectively a linear type in disguise.
Attacking from both sides is a good idea: we can make GC code faster and @nogc code safer using the same weaponry.
HOWEVER - all of these changes effectively mean bolting even more flow analysis onto the compiler, which isn't great given the current coding style in dmd (it's very "flat", i.e. not much abstraction, and there is a tendency to dump everything in one 10k line file). This is an easily solvable problem; we just need to be more forward thinking.
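The nodiscard remark can be illustrated with Rust's `#[must_use]` attribute, which plays a comparable role. This is only a sketch, and `Receipt`, `pay`, `file` and `settle` are hypothetical names. Note that `#[must_use]` only produces a lint warning when the value is discarded, so it approximates a "use at least once" rule rather than enforcing a true linear type; the "at most once" half comes from move semantics.

```rust
/// A value the caller is expected to consume (hypothetical example type).
#[must_use = "a receipt must be filed, not dropped"]
struct Receipt {
    amount_cents: u64,
}

/// Produces a receipt; ignoring the return value triggers the `unused_must_use` lint.
fn pay(amount_cents: u64) -> Receipt {
    Receipt { amount_cents }
}

/// Consumes the receipt by value, so it cannot be filed twice.
fn file(receipt: Receipt, ledger: &mut Vec<u64>) {
    ledger.push(receipt.amount_cents);
}

/// Uses each receipt exactly once and returns the resulting ledger.
fn settle() -> Vec<u64> {
    let mut ledger = Vec::new();
    let first = pay(250);
    file(first, &mut ledger);
    // pay(100);                 // compiles, but warns that the `Receipt` is unused
    // file(first, &mut ledger); // would not compile: `first` was already moved
    file(pay(100), &mut ledger);
    ledger
}
```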
Feb 02
Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 2 February 2021 at 08:11:03 UTC, Max Haughton wrote:
 HOWEVER - all of these changes effectively mean bolting even 
 more flow analysis to the compiler which isn't great given the 
 current coding styles in dmd (It's very "flat" i.e. not much 
 abstraction, and there is a tendency to dump everything in one 
 10k line file) - this is an easily solvable problem, we just 
 need to be more forward thinking.
There is a need for a new typed intermediate representation that is higher level than the LLVM IR. One problem is that D allows directly emitting low level constructs. One way to deal with this is to require all low level code to have a high level counterpart with a version selector.
Feb 02
Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 2 February 2021 at 11:14:24 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 2 February 2021 at 08:11:03 UTC, Max Haughton wrote:
 HOWEVER - all of these changes effectively mean bolting even 
 more flow analysis to the compiler which isn't great given the 
 current coding styles in dmd (It's very "flat" i.e. not much 
 abstraction, and there is a tendency to dump everything in one 
 10k line file) - this is an easily solvable problem, we just 
 need to be more forward thinking.
There is a need for a new typed intermediate representation that is higher level than the LLVM IR. One problem is that D allows directly emitting low level constructs. One way to deal with this is to require all low level code to have a high level counterpart with a version selector.
I'm not convinced that an IR is needed as much as a clearly defined pipeline for the AST as it goes through the compiler, e.g. trying to organise code into passes, at least locally.
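As a rough sketch of what "organising code into passes" over an AST could look like, the following Rust example runs a fixed list of passes over a tiny expression tree. It is purely illustrative and unrelated to dmd's actual source layout; `Expr`, `Pass`, `fold_constants`, `strip_add_zero`, `run_pipeline` and `default_passes` are invented names.

```rust
/// A tiny expression AST (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// A pass is a named AST-to-AST transformation.
struct Pass {
    name: &'static str,
    run: fn(Expr) -> Expr,
}

/// Folds `Int + Int` into a single literal, bottom-up.
fn fold_constants(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (fold_constants(*l), fold_constants(*r)) {
            (Expr::Int(a), Expr::Int(b)) => Expr::Int(a + b),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// Rewrites `x + 0` and `0 + x` to `x`, bottom-up.
fn strip_add_zero(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (strip_add_zero(*l), strip_add_zero(*r)) {
            (Expr::Int(0), x) | (x, Expr::Int(0)) => x,
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// Runs the passes in order and records which ones ran.
fn run_pipeline(mut ast: Expr, passes: &[Pass]) -> (Expr, Vec<&'static str>) {
    let mut trace = Vec::new();
    for pass in passes {
        ast = (pass.run)(ast);
        trace.push(pass.name);
    }
    (ast, trace)
}

/// The pipeline used in this sketch: the order is explicit and lives in one place.
fn default_passes() -> Vec<Pass> {
    vec![
        Pass { name: "fold_constants", run: fold_constants },
        Pass { name: "strip_add_zero", run: strip_add_zero },
    ]
}
```

Each pass has a single job and a name, so the order of transformations is visible in `default_passes` instead of being spread through one large file.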
Feb 02
Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 2 February 2021 at 11:32:53 UTC, Max Haughton wrote:
 I'm not convinced that an IR is needed as much as a clearly 
 defined pipeline for the AST as it goes through the compiler, 
 e.g. trying to organise code into passes, at least locally.
Doing this over the AST is just not a good idea if you care about making it work correctly.
Feb 02
Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Tuesday, 2 February 2021 at 11:32:53 UTC, Max Haughton wrote:
 On Tuesday, 2 February 2021 at 11:14:24 UTC, Ola Fosheim 
 Grøstad wrote:
 On Tuesday, 2 February 2021 at 08:11:03 UTC, Max Haughton 
 wrote:
 HOWEVER - all of these changes effectively mean bolting even 
 more flow analysis to the compiler which isn't great given 
 the current coding styles in dmd (It's very "flat" i.e. not 
 much abstraction, and there is a tendency to dump everything 
 in one 10k line file) - this is an easily solvable problem, 
 we just need to be more forward thinking.
There is a need for a new typed intermediate representation that is higher level than the LLVM IR. One problem is that D allows directly emitting low level constructs. One way to deal with this is to require all low level code to have a high level counterpart with a version selector.
I'm not convinced that an IR is needed as much as a clearly defined pipeline for the AST as it goes through the compiler, e.g. trying to organise code into passes, at least locally.
One could argue that dmd already has 2 IRs separate from the AST representation:
1. https://github.com/dlang/dmd/blob/master/src/dmd/e2ir.d, https://github.com/dlang/dmd/blob/master/src/dmd/s2ir.d
2. https://github.com/dlang/dmd/blob/7233643c5da2bb531dd0fdec5f823daa12d30217/src/dmd/ob.d#L84-L114
:P
Feb 03
Max Haughton <maxhaton gmail.com> writes:
On Wednesday, 3 February 2021 at 08:58:18 UTC, Petar Kirov 
[ZombineDev] wrote:
 On Tuesday, 2 February 2021 at 11:32:53 UTC, Max Haughton wrote:
 On Tuesday, 2 February 2021 at 11:14:24 UTC, Ola Fosheim 
 Grøstad wrote:
 [...]
I'm not convinced that an IR is needed as much as a clearly defined pipeline for the AST as it goes through the compiler, e.g. trying to organise code into passes, at least locally.
One could argue that dmd already has 2 IRs separate from the AST representation: 1. https://github.com/dlang/dmd/blob/master/src/dmd/e2ir.d, https://github.com/dlang/dmd/blob/master/src/dmd/s2ir.d 2. https://github.com/dlang/dmd/blob/7233643c5da2bb531dd0fdec5f823daa12d30217/src/dmd/ob.d#L84-L114 :P
You could but you'd be wrong. The "IR" ob.d uses doesn't actually track enough information to do proper error messages, and I've tried.
Feb 03
Guillaume Piolat <first.last gmail.com> writes:
On Monday, 1 February 2021 at 13:41:24 UTC, zoujiaqing wrote:
 What does the community think?
It's a great Nim article. We have been perfectly silent about the GC speed improvements seen since 2018, but they are actually quite significant. If you didn't go to DConf, you would never know about them. I think a bit of publicity and a pat on the back could go a long way.
Feb 01