digitalmars.D - Passing data and ownership to new thread

James Brister (12/12) Sep 26 2017 I'm pretty new to D, but from what I've seen there are two modes

Jonathan M Davis (37/49) Sep 26 2017 The problem is that the type system has no concept of thread ownership o...

jmh530 (5/9) Sep 26 2017 Reference capabilities in pony are also an interesting (albeit

Moritz Maxeiner (35/48) Sep 26 2017 If you're talking about Phobos (import std.{...}):
Moritz Maxeiner (36/49) Sep 26 2017 If you're talking about the language:

James Brister <brister pobox.com> writes:

I'm pretty new to D, but from what I've seen there are two modes 
of using data across threads: (a) immutable message passing and 
the new thread copies the data if it needs to be modified, (b) 
shared, assuming the data will be modified by both threads, and 
the limits that imposes. But why is there nothing for passing an 
object of some sort to another thread with the ownership moving to 
the new thread? I suppose this would be hard to enforce at the 
language level, but wouldn't you want this when trying to pass 
large-ish data structures from one thread to another (thinking of 
a network server, such as a DHCP server, that has a thread for 
handling the network interface and it reads the incoming requests 
and then passes them off to another thread to handle)?

Sep 26 2017

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Tuesday, September 26, 2017 09:10:41 James Brister via Digitalmars-d 
wrote:
 I'm pretty new to D, but from what I've seen there are two modes
 of using data across threads: (a) immutable message passing and
 the new thread copies the data if it needs to be modified, (b)
 shared, assuming the data will be modified by both threads, and
 the limits that imposes. But why is there nothing for passing an
 object of some sort to another thread with the ownership moving to
 the new thread? I suppose this would be hard to enforce at the
 language level, but wouldn't you want this when trying to pass
 large-ish data structures from one thread to another (thinking of
 a network server, such as a DHCP server, that has a thread for
 handling the network interface and it reads the incoming requests
 and then passes them off to another thread to handle)?

The problem is that the type system has no concept of thread ownership or
memory ownership in general (beyond knowing whether something is typed as
thread-local or shared, and even that doesn't say what the data was
originally, just what it's currently treated as), and the compiler has no
way of determining that there are no other references to the data that
you're passing between threads (at least not beyond very simple cases). The
programmer can cast an object to shared or immutable on one thread, pass it,
and then cast it to mutable on the other and essentially pass ownership of
the object in the process, but it's up to the programmer to verify that that
object is no longer referenced by anything on the thread that it came from,
and it's up to the programmer to make sure that casting doesn't violate the
type system. As it is, if the cast is mutable to immutable to mutable,
you're technically violating the type system, but in a way that will always
work. Casting to and from shared is therefore definitely better, but
std.concurrency does have some bugs with regard to shared where it won't
always allow a type to be passed when it should. Dealing with types that you
can copy rather than really needing to pass ownership is always cleaner but
not necessarily efficient.
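
A minimal sketch of the cast-based hand-off described above, assuming std.concurrency accepts a pointer to shared data for this type. `Request`, `worker`, and `handOff` are hypothetical names chosen for illustration. Nothing in the compiler checks that the sender really dropped its reference; that remains the programmer's job.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;

// Hypothetical payload type, used only for illustration.
struct Request
{
    ubyte[] payload;
}

void worker(Tid owner)
{
    // The pointer arrives typed as shared; cast shared away to treat the
    // object as thread-local. Sound only if the sender kept no reference.
    auto incoming = receiveOnly!(shared(Request)*)();
    Request* req = cast(Request*) incoming;

    size_t total = 0;
    foreach (ref b; req.payload)
    {
        b = 1; // the worker may mutate the data freely now
        total += b;
    }
    send(owner, total);
}

// Allocates a request of n bytes, hands it to a new thread, and returns
// the sum the worker computed after overwriting every byte with 1.
size_t handOff(size_t n)
{
    auto req = new Request(new ubyte[](n));
    auto tid = spawn(&worker, thisTid);

    // Cast to shared so the pointer may cross threads, then drop the
    // local reference to model the ownership transfer (by convention only).
    send(tid, cast(shared(Request)*) req);
    req = null;

    return receiveOnly!size_t();
}

void main()
{
    assert(handOff(1024) == 1024);
}
```

Only the pointer crosses the thread boundary here, so the payload is never copied regardless of its size.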

In principle, what we'd really like is the ability to safely take a mutable
object that has no other references to it, pass it to another thread, and
have all of that work with no casting, but D's type system simply does not
have the level of information in it that it would need to make that work.
It's my understanding that Rust tries to encode ownership like that into its
type system but that that makes its type system considerably more
complicated. D doesn't make the attempt. It just leaves it up to the
programmer to get it right - which isn't ideal, but there's only so much
that we can do without making the type system too restrictive or too
unwieldy.

Perhaps someone will come up with a solution that will work under at least some
set of circumstances without overcomplicating things, but thus far, no one
has. So, we're essentially forced to have the programmer either send data
that has no references or send data that can be left as immutable or copied
from immutable, or require the programmer to carefully use casts. It's not
ideal, but for the most part, it works if you're careful.
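
For comparison, the cast-free route of sending immutable data and copying it on the receiving side might look like the sketch below. The names `worker` and `roundTrip` are hypothetical; `assumeUnique` from std.exception is used on the assumption that the buffer really has no other references.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;
import std.exception : assumeUnique;

void worker(Tid owner)
{
    // immutable data is implicitly shareable, so no cast is needed here.
    auto packet = receiveOnly!(immutable(ubyte)[])();

    // Copy only because this worker wants to modify the data.
    ubyte[] scratch = packet.dup;
    scratch[0] = 0xFF;
    send(owner, scratch.idup);
}

// Builds a buffer of n ones, sends it as immutable, and returns the
// modified copy that the worker sends back.
immutable(ubyte)[] roundTrip(size_t n)
{
    auto buf = new ubyte[](n);
    buf[] = 1;

    // assumeUnique converts to immutable and sets `buf` to null; it is
    // only correct if no other mutable reference to the array exists.
    immutable(ubyte)[] packet = assumeUnique(buf);

    auto tid = spawn(&worker, thisTid);
    send(tid, packet);
    return receiveOnly!(immutable(ubyte)[])();
}

void main()
{
    auto result = roundTrip(4);
    assert(result.length == 4);
    assert(result[0] == 0xFF);
}
```

The `.dup` is the cost mentioned above: it is cheap for small messages but can matter for large structures.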

- Jonathan M Davis

Sep 26 2017

jmh530 <john.michael.hall gmail.com> writes:

On Tuesday, 26 September 2017 at 09:36:39 UTC, Jonathan M Davis 
wrote:
 [snip] It's my understanding that Rust tries to encode 
 ownership like that into its type system but that that makes 
 its type system considerably more complicated. D doesn't make 
 the attempt.

Reference capabilities in pony are also an interesting (albeit 
complicated) approach.
https://tutorial.ponylang.org/capabilities/reference-capabilities.html

Sep 26 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 26 September 2017 at 09:10:41 UTC, James Brister 
wrote:
 I'm pretty new to D, but from what I've seen there are two 
 modes of using data across threads: (a) immutable message 
 passing and the new thread copies the data if it needs to be 
 modified, (b) shared, assuming the data will be modified by 
 both threads, and the limits that imposes. But why is there 
 nothing for passing an object of some sort to another thread 
 with the ownership moving to the new thread?

If you're talking about Phobos (import std.{...}):
AFAICT because no one has had a strong enough need to implement such 
a thing and propose it for Phobos inclusion.
If you're talking about the language:
Because D doesn't have any builtin concept of ownership.

 I suppose this would be hard to enforce at the language level, 
 but wouldn't you want this when trying to pass large-ish data 
 structures from one thread to another (thinking of a network 
 server, such as a DHCP server, that has a thread for handling 
 the network interface and it reads the incoming requests and 
 then passes them off to another thread to handle)?

In such server code you're probably better off distributing the 
request reading (and potentially even the client socket 
accepting) to multiple workers, e.g. having multiple threads (or 
processes for that matter, as that can minimize downtime when 
combined with process supervision) listening on their own socket 
with the same address:port (see SO_REUSEPORT).

If you really want to do it, though, the way I'd start going 
about it would be with a classic work queue / thread pool system. 
Below is pseudocode showing how to do that for a one-shot request 
scenario.

[shared data]
work_queue (protect methods with mutex or use a lockfree queue)

main thread:
     loop:
         auto client_socket = accept(...);
         // Allocate request on the heap
         Request* request = client_socket.readRequest(...);
         // Send a pointer to the request to the work queue
         work_queue ~= tuple(client_socket, request);
         // Model "ownership" by forgetting about client_socket and request here

worker thread:
     loop:
         ...
         auto job = work_queue.pop();
         scope (exit) { close(job[0]); free(job[1]); }
         auto response = job[1].handle();
         job[0].writeResponse(response);

Sep 26 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 26 September 2017 at 09:10:41 UTC, James Brister 
wrote:
 I'm pretty new to D, but from what I've seen there are two 
 modes of using data across threads: (a) immutable message 
 passing and the new thread copies the data if it needs to be 
 modified, (b) shared, assuming the data will be modified by 
 both threads, and the limits that imposes. But why is there 
 nothing for passing an object of some sort to another thread 
 with the ownership moving to the new thread?

If you're talking about the language:
Because D doesn't have any builtin concept of ownership.
If you're talking about Phobos (import std.{...}):
Because a general solution is not a trivial problem (see 
Jonathan's answer for more detail).

 I suppose this would be hard to enforce at the language level, 
 but wouldn't you want this when trying to pass large-ish data 
 structures from one thread to another (thinking of a network 
 server, such as a DHCP server, that has a thread for handling 
 the network interface and it reads the incoming requests and 
 then passes them off to another thread to handle)?

In such server code you're probably better off distributing the 
request reading (and potentially even the client socket 
accepting) to multiple workers, e.g. having multiple threads (or 
processes for that matter, as that can minimize downtime when 
combined with process supervision) listening on their own socket 
with the same address:port (see SO_REUSEPORT).

If you really want to do it, though, the way I'd start going 
about it would be with a classic work queue / thread pool system. 
Below is pseudocode showing how to do that for a one-shot request 
scenario.

[shared data]
work_queue (synchronize methods e.g. with mutex or use a lockfree 
queue)

main thread:
     loop:
         auto client_socket = accept(...);
         // Allocate request on the heap
         Request* request = client_socket.readRequest(...);
         // Send a pointer to the request to the work queue
         work_queue ~= tuple(client_socket, request);
         // Poor man's ownership by forgetting about client_socket and request here

worker thread:
     loop:
         ...
         auto job = work_queue.pop();
         scope (exit) { close(job[0]); free(job[1]); }
         auto response = job[1].handle();
         job[0].writeResponse(response);
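
A runnable approximation of the pseudocode above, as a sketch only. It swaps the sockets for a plain in-memory `Job` struct and uses a mutex plus condition variable from druntime rather than a lock-free queue. `Job`, `WorkQueue`, and `processAll` are hypothetical names, and the queue is reached through a delegate closure, which sidesteps D's `shared` checks instead of satisfying them.

```d
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical job type; a real server would also carry the client socket.
struct Job
{
    int id;
    ubyte[] request;
}

// Minimal blocking queue: a mutex guards the array, a condition wakes workers.
final class WorkQueue
{
    private Job*[] jobs;
    private bool closed;
    private Mutex mutex;
    private Condition nonEmpty;

    this()
    {
        mutex = new Mutex;
        nonEmpty = new Condition(mutex);
    }

    void push(Job* job)
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        jobs ~= job;
        nonEmpty.notify();
    }

    // Signals that no more jobs will arrive.
    void close()
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        closed = true;
        nonEmpty.notifyAll();
    }

    // Blocks until a job is available; returns null once closed and drained.
    Job* pop()
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        while (jobs.length == 0 && !closed)
            nonEmpty.wait();
        if (jobs.length == 0)
            return null;
        auto job = jobs[0];
        jobs = jobs[1 .. $];
        return job;
    }
}

// Pushes jobCount jobs through workerCount threads; returns how many were handled.
int processAll(int jobCount, int workerCount)
{
    auto queue = new WorkQueue;
    auto handled = new int[](jobCount); // each slot is written by one worker only

    Thread[] workers;
    foreach (_; 0 .. workerCount)
    {
        workers ~= new Thread({
            for (auto job = queue.pop(); job !is null; job = queue.pop())
            {
                job.request[0] = 1; // the worker owns the job now
                handled[job.id] = 1;
            }
        }).start();
    }

    foreach (i; 0 .. jobCount)
    {
        auto job = new Job(i, new ubyte[](64));
        queue.push(job);
        job = null; // "forget" the job: ownership by convention only
    }
    queue.close();

    foreach (w; workers)
        w.join();

    int total = 0;
    foreach (h; handled)
        total += h;
    return total;
}

void main()
{
    assert(processAll(100, 4) == 100);
}
```

Here the garbage collector reclaims each job, so the explicit `free` from the pseudocode is not needed; with manually allocated requests the worker would have to release them itself.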

Sep 26 2017
