
digitalmars.D - Passing data and ownership to new thread

James Brister <brister pobox.com> writes:
I'm pretty new to D, but from what I've seen there are two modes 
of using data across threads: (a) immutable message passing and 
the new thread copies the data if it needs to be modified, (b) 
shared, assuming the data will be modified by both threads, and 
the limits that imposes. But why is there nothing for passing an object 
of some sort to another thread with the ownership moving to the 
new thread? I suppose this would be hard to enforce at the language 
level, but wouldn't you want this when trying to pass large-ish 
data structures from one thread to another (thinking of a network 
server, such as a DHCP server, that has a thread for handling the 
network interface and it reads the incoming requests and then 
passes them off to another thread to handle).
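For reference, mode (a) can be sketched roughly as follows. This is a minimal illustration assuming std.concurrency is used for the message passing; the names `worker` and `roundTrip` are made up for the example and do not come from the post.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;

// Hypothetical worker: receives an immutable slice, copies it,
// and modifies only its private copy.
void worker(Tid owner)
{
    auto data = receiveOnly!(immutable(int)[])();
    int[] mine = data.dup;          // private, mutable copy
    mine[0] = 99;
    send(owner, mine[0] + data[0]); // data[0] is unchanged
}

// Sends immutable data to a new thread and returns the worker's reply.
int roundTrip()
{
    immutable(int)[] data = [1, 2, 3];
    auto tid = spawn(&worker, thisTid);
    send(tid, data);                // no cast needed: immutable is safe to share
    return receiveOnly!int();
}

void main()
{
    assert(roundTrip() == 100);
}
```

The copy made by `.dup` is the cost the question is about: for large structures it is exactly what an ownership transfer would avoid.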
Sep 26 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, September 26, 2017 09:10:41 James Brister via Digitalmars-d 
wrote:
 I'm pretty new to D, but from what I've seen there are two modes
 of using data across threads: (a) immutable message passing and
 the new thread copies the data if it needs to be modified, (b)
 shared, assuming the data will be modified by both threads, and
 the limits that imposes. But why is there nothing for passing an object
 of some sort to another thread with the ownership moving to the
 new thread? I suppose this would be hard to enforce at the language
 level, but wouldn't you want this when trying to pass large-ish
 data structures from one thread to another (thinking of a network
 server, such as a DHCP server, that has a thread for handling the
 network interface and it reads the incoming requests and then
 passes them off to another thread to handle).
The problem is that the type system has no concept of thread ownership or memory ownership in general (beyond knowing whether something is typed as thread-local or shared, and even that doesn't say what the data was originally, just what it's currently treated as), and the compiler has no way of determining that there are no other references to the data that you're passing between threads (at least not beyond very simple cases).

The programmer can cast an object to shared or immutable on one thread, pass it, and then cast it to mutable on the other and essentially pass ownership of the object in the process, but it's up to the programmer to verify that that object is no longer referenced by anything on the thread that it came from, and it's up to the programmer to make sure that casting doesn't violate the type system. As it is, if the cast is mutable to immutable to mutable, you're technically violating the type system but in a way that will always work - so casting to and from shared is definitely better, but std.concurrency does have some bugs with regard to shared where it won't always allow a type to be passed when it should. Dealing with types that you can copy rather than really needing to pass ownership is always cleaner but not necessarily efficient.

In principle, what we'd really like is the ability to safely take a mutable object that has no other references to it, pass it to another thread, and have all of that work with no casting, but D's type system simply does not have the level of information in it that it would need to make that work. It's my understanding that Rust tries to encode ownership like that into its type system but that that makes its type system considerably more complicated. D doesn't make the attempt. It just leaves it up to the programmer to get it right - which isn't ideal, but there's only so much that we can do without making the type system too restrictive or too unwieldy.

Perhaps someone will come up with a solution that will work under at least some set of circumstances without overcomplicating things, but thus far, no one has. So, we're essentially forced to have the programmer either send data that has no references or send data that can be left as immutable or copied from immutable, or require the programmer to carefully use casts. It's not ideal, but for the most part, it works if you're careful.

- Jonathan M Davis
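The cast-to-shared hand-off described above could look roughly like this. It is only a sketch: `Request`, `worker` and `handOff` are hypothetical names, and nothing in the language checks that the sending thread really drops its reference; that part is purely a convention the programmer has to uphold.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;

// Hypothetical payload type standing in for a large request structure.
struct Request
{
    ubyte[] payload;
    int id;
}

void worker(Tid owner)
{
    // The pointer arrives typed as pointing to shared data.
    auto sharedReq = receiveOnly!(shared(Request)*)();
    // Cast shared away. This is only sound because the sender promises
    // to keep no other reference to the object.
    auto req = cast(Request*) sharedReq;
    req.payload[0] = 42;   // mutate freely: this thread now "owns" the data
    send(owner, req.id);
}

// Hands a heap-allocated Request to a new thread and returns its id
// as reported back by the worker.
int handOff()
{
    auto req = new Request(new ubyte[](1024), 7);
    auto tid = spawn(&worker, thisTid);
    send(tid, cast(shared(Request)*) req);
    req = null;            // drop the only local reference, by convention
    return receiveOnly!int();
}

void main()
{
    assert(handOff() == 7);
}
```

Only the pointer is sent, so the 1024-byte payload is never copied. If any other reference to the `Request` survived on the sending thread, the cast in `worker` would produce a data race that the compiler cannot detect.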
Sep 26 2017
jmh530 <john.michael.hall gmail.com> writes:
On Tuesday, 26 September 2017 at 09:36:39 UTC, Jonathan M Davis 
wrote:
 [snip] It's my understanding that Rust tries to encode 
 ownership like that into its type system but that that makes 
 its type system considerably more complicated. D doesn't make 
 the attempt.
Reference capabilities in Pony are also an interesting (albeit complicated) approach: https://tutorial.ponylang.org/capabilities/reference-capabilities.html
Sep 26 2017
Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 26 September 2017 at 09:10:41 UTC, James Brister 
wrote:
 I'm pretty new to D, but from what I've seen there are two 
 modes of using data across threads: (a) immutable message 
 passing and the new thread copies the data if it needs to be 
 modified, (b) shared, assuming the data will be modified by 
 both threads, and the limits that imposes. But why is there nothing for 
 passing an object of some sort to another thread with the 
 ownership moving to the new thread?
If you're talking about Phobos (import std.{...}): AFAICT because no one has had a strong enough need to implement such a thing and propose it for Phobos inclusion. If you're talking about the language: Because D doesn't have any builtin concept of ownership.
 I suppose this would be hard to enforce at the language level, but 
 wouldn't you want this when trying to pass large-ish data 
 structures from one thread to another (thinking of a network 
 server, such as a DHCP server, that has a thread for handling 
 the network interface and it reads the incoming requests and 
 then passes them off to another thread to handle).
In such server code you're probably better off distributing the request reading (and potentially even the client socket accepting) to multiple workers, e.g. having multiple threads (or processes for that matter, as that can minimize downtime when combined with process supervision) listening on their own socket with the same address:port (see SO_REUSEPORT).

If you really want to do it, though, the way I'd start going about it would be with a classic work queue / thread pool system. Below is pseudocode showing how to do that for a one-shot request scenario.

[shared data]
    work_queue (protect methods with mutex or use a lockfree queue)

main thread:
    loop:
        auto client_socket = accept(...);
        // Allocate request on the heap
        Request* request = client_socket.readRequest(...);
        // Send a pointer to the request to the work queue
        work_queue ~= tuple(client_socket, request);
        // Model "ownership" by forgetting about client_socket and request here

worker thread:
    loop:
        ...
        auto job = work_queue.pop();
        scope (exit) { close(job[0]); free(job[1]); }
        auto response = job[1].handle();
        // job[0] is the client socket; client_socket is not in scope here
        job[0].writeResponse(response);
Sep 26 2017
Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 26 September 2017 at 09:10:41 UTC, James Brister 
wrote:
 I'm pretty new to D, but from what I've seen there are two 
 modes of using data across threads: (a) immutable message 
 passing and the new thread copies the data if it needs to be 
 modified, (b) shared, assuming the data will be modified by 
 both threads, and the limits that imposes. But why is there nothing for 
 passing an object of some sort to another thread with the 
 ownership moving to the new thread?
If you're talking about the language: Because D doesn't have any builtin concept of ownership. If you're talking about Phobos (import std.{...}): Because a general solution is not a trivial problem (see Jonathan's answer for more detail).
 I suppose this would be hard to enforce at the language level, but 
 wouldn't you want this when trying to pass large-ish data 
 structures from one thread to another (thinking of a network 
 server, such as a DHCP server, that has a thread for handling 
 the network interface and it reads the incoming requests and 
 then passes them off to another thread to handle).
In such server code you're probably better off distributing the request reading (and potentially even the client socket accepting) to multiple workers, e.g. having multiple threads (or processes for that matter, as that can minimize downtime when combined with process supervision) listening on their own socket with the same address:port (see SO_REUSEPORT).

If you really want to do it, though, the way I'd start going about it would be with a classic work queue / thread pool system. Below is pseudocode showing how to do that for a one-shot request scenario.

[shared data]
    work_queue (synchronize methods e.g. with mutex or use a lockfree queue)

main thread:
    loop:
        auto client_socket = accept(...);
        // Allocate request on the heap
        Request* request = client_socket.readRequest(...);
        // Send a pointer to the request to the work queue
        work_queue ~= tuple(client_socket, request);
        // Poor man's ownership by forgetting about client_socket and request here

worker thread:
    loop:
        ...
        auto job = work_queue.pop();
        scope (exit) { close(job[0]); free(job[1]); }
        auto response = job[1].handle();
        job[0].writeResponse(response);
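A compilable approximation of the work queue idea might look like the following. It is a sketch under several assumptions: sockets are left out entirely, `Request`, `WorkQueue` and `runDemo` are hypothetical names, requests are GC-allocated rather than freed by hand, and a null pointer is used as an ad hoc shutdown signal.

```d
import core.atomic : atomicLoad, atomicOp;
import core.sync.condition : Condition;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical request type; a real server would fill it from the socket.
struct Request
{
    int clientId;
    string payload;
}

// Minimal mutex-protected FIFO queue of request pointers.
final class WorkQueue
{
    private Request*[] items;
    private Mutex mutex;
    private Condition nonEmpty;

    this()
    {
        mutex = new Mutex;
        nonEmpty = new Condition(mutex);
    }

    void push(Request* r)
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        items ~= r;
        nonEmpty.notify();
    }

    // Blocks until a job is available; a null job tells a worker to stop.
    Request* pop()
    {
        mutex.lock();
        scope (exit) mutex.unlock();
        while (items.length == 0)
            nonEmpty.wait();
        auto r = items[0];
        items = items[1 .. $];
        return r;
    }
}

// Pushes requestCount requests through workerCount workers and
// returns how many requests were handled.
int runDemo(int requestCount, int workerCount)
{
    auto queue = new WorkQueue;
    shared int handled;
    Thread[] workers;

    foreach (i; 0 .. workerCount)
    {
        auto t = new Thread({
            for (;;)
            {
                auto job = queue.pop();
                if (job is null)
                    break;                   // shutdown signal
                job.payload ~= " (handled)"; // worker is now the sole user
                atomicOp!"+="(handled, 1);
            }
        });
        t.start();
        workers ~= t;
    }

    foreach (i; 0 .. requestCount)
    {
        // Ownership by convention: the pointer is not kept after push.
        queue.push(new Request(i, "request"));
    }
    foreach (i; 0 .. workerCount)
        queue.push(null);                    // one stop signal per worker
    foreach (t; workers)
        t.join();

    return atomicLoad(handled);
}

void main()
{
    assert(runDemo(100, 4) == 100);
}
```

Note that the queue here is shared between threads through a captured local variable, which sidesteps D's `shared` checks rather than satisfying them. That matches the "poor man's ownership" described above: correctness rests on the main thread never touching a request after pushing it.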
Sep 26 2017