www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 10108] New: Thread local slice to array literal references the same data

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108

           Summary: Thread local slice to array literal references the
                    same data
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: code dawg.eu



cat > bug.d << CODE
import core.thread;

int[] arr = [1,2,3];
__gshared int* thrPtr;

void main()
{
    auto thr = new Thread({thrPtr = arr.ptr;});
    thr.start();
    thr.join();
    assert(arr.ptr !is thrPtr);
}
CODE

----

Only the `arr` variable is store in TLS, but `arr.ptr` references the literal
which is in the shared data segment.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Alex Rønne Petersen <alex lycus.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alex lycus.org



CEST ---
I'm not sure I understand why this is a bug. Can you elaborate?

As far as I know, being able to access TLS data from one thread in another
thread through pointers passed via globals is a feature (it's even one I rely
on).

But maybe I'm misunderstanding this.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




When a thread local variable is a reference type to modifiable data, we must
make sure that it is initialized uniquely.

This is what the current implementation does which results in hidden sharing.

__gshared int[] gArr = [1,2,3];
int[] arr = gArr;

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
This is expected because the global is __gshared and there's therefore no ty=
pe protection from doing this. If you want safe sharing, make the global sha=
red.=20

On May 17, 2013, at 9:38 AM, d-bugmail puremagic.com wrote:

 http://d.puremagic.com/issues/show_bug.cgi?id=3D10108
=20
=20
=20

-
 When a thread local variable is a reference type to modifiable data, we mu=
st
 make sure that it is initialized uniquely.
=20
 This is what the current implementation does which results in hidden shari=
ng.
=20
 __gshared int[] gArr =3D [1,2,3];
 int[] arr =3D gArr;
=20
 --=20
 Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=3Dema=
il
 ------- You are receiving this mail because: -------
May 18 2013
prev sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Sat, 18 May 2013 16:59:53 +0200, Sean Kelly <sean invisibleduck.org>  
wrote:

 This is expected because the global is __gshared and there's therefore  
 no type protection from doing this. If you want safe sharing, make the  
 global shared.
[snip]
 __gshared int[] gArr = [1,2,3];
 int[] arr = gArr;
Uhm, you are aware this is the Bugzilla newsgroup, right? Please reply in Bugzilla. Also, that's exactly what we're saying. This code has problems: import core.thread; int[] arr = [1,2,3]; void main() { new Thread({arr[0] = 3;}); assert(arr[0] == 1); } And they are caused by the data inside arr being shared. This is unsafe, unexpected (due to safety of other globals) and, we feel, goes against the D ethos. -- Simen
May 18 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com



10:47:13 PDT ---
Be careful with your code snippets, unless you are more explicit about the
error reporting.  Asserts are generally included to show what currently PASSES,
not what FAILS.  I was about to mark this as invalid, because the code SHOULD
pass, but you are correct in that it fails.

Your code should say:

assert(arr.ptr !is thrPtr); // FAILS!!!

or

assert(arr.ptr is thrPtr);

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




CEST ---

 When a thread local variable is a reference type to modifiable data, we must
 make sure that it is initialized uniquely.
 
 This is what the current implementation does which results in hidden sharing.
 
 __gshared int[] gArr = [1,2,3];
 int[] arr = gArr;
Ah, I understand. Carry on. :) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |INVALID



11:25:36 PDT ---
The __gshared storage class is an end run around the type system. That's why it
has the __ prefix. So, yes, you can use it to get multithreaded "local" access
to global data without a compiler error.

This is as designed - it's a bug in the code example.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |



11:29:07 PDT ---


 This is as designed - it's a bug in the code example.
No, it's not. It's a bug in the code generation. The __gshared variable just demonstrates the bug. What is happening is that each thread-local instance of arr is getting a pointer to the SAME data. The assert should have been written differently. as written, it fails, but this is not noted. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Simen Kjaeraas <simen.kjaras gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |simen.kjaras gmail.com



PDT ---
Simplified example without __gshared:

import core.thread;

int[] arr = [1,2,3];

void main( ) {
    int* p = arr.ptr;
    auto thr = new Thread({assert(arr.ptr == p);}); // Should have failed.
    thr.start();
    thr.join();
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Igor Stepanov <wazar.leollone yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wazar.leollone yahoo.com



13:57:50 PDT ---

 Simplified example without __gshared:
 
 import core.thread;
 
 int[] arr = [1,2,3];
 
 void main( ) {
     int* p = arr.ptr;
     auto thr = new Thread({assert(arr.ptr == p);}); // Should have failed.
     thr.start();
     thr.join();
 }
In other words implicit "thread local" modifier is not transitive. Yes, all threads have a local copy of "arr symbol" and &arr differs in different threads. But all of this "local" symbols points to single array object. The problem lies deeper than it seems. struct Foo { int[] arr; } Foo[] getFooArr() { Foo[] ret; foreach(i; 1 .. 10) { Foo cur; foreach(j; 1 .. 5) { cur.arr ~= j; } ret ~= cur; } return ret; } Foo[] arr = getFooArr(); Now foo points to the arr, each members of it points to array literal. And if we want to create Foo[] is true TLS object, we must to set, that 1. arr must point to tls object (arr.ptr must be thread_tls_start+arr_tls_offset) 2. for each i: arr[i].arr must point to tls object. In other words If compiler see ptr dereference expression (with * or []) it must know, is this ptr is TLS. If it is TLS compiler must add to it value thread_tls_start, otherwise - use it value as is. This functional can be provided, if we declare transitive threadlocal storage class (like shared) and implement special reference behaviour. (e.g. dereference, casting and other.) for example: theradlocal int[][] tls = [[1,2],[3,4],[5,6]]; int* getNthMthElemPointer(theradlocal int[][] a, int n, int m) { return &ptr[n][m]; //implicit cast to non-tls pointer. returned value points to elem it function caller thread } void main() { int* p1 = getNthMthElemPointer(tls, 1, 1); theradlocal int* p2 = &tls[1][1]; void threadFunc(int num)() { writeln(num, " shared ptr: ", p1); writeln(num, " thread local ptr: ", cast(int*)p2); } auto thr1 = new Thread(&threadFunc!1); thr1.start(); auto thr2 = new Thread(&threadFunc!2); thr2.start(); thr1.join(); thr2.join(); } therads will print same "shared ptr" value but different "thread local ptr" However this future is disharmonious with language design I think. Other way: disallow all of tls static initialized values. shared int[] a = [1,2,3]; //OK _gshared int[] b = [1,2,3]; //OK const int[] c = [1,2,3]; //OK int[] d = [1,2,3]; //Disallowed int[] e;//OK static this() { e = [1,2,3]; //If e value will be allocated in heap this code doesn't break type system. } The same applies to classes, pointers and associative arrays in future. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




PDT ---
I see. Once again, simplified:

import core.thread;

struct Foo {
    int[] arr;
}

Foo[] arr = [Foo([1,2,3])]; // Should have failed? (1)

void main( ) {
    int* p = arr[0].arr.ptr;
    auto thr = new Thread({assert(arr[0].arr.ptr == p);}); // Should have
failed. (2)
    thr.start();
    thr.join();
}

In this case, for the assert to fail, we'd have to deep-dup the array (COW
might make that unnecessary, but that's beside the point).

This is in a way related to the issue of array literals being mutable, in that
it is an example of the compiler erroneously assuming some state may be shared
when in fact it shouldn't.

I contend that (1) above should simply not compile. It should be required to be
placed in a module constructor instead. A case can be made that the compiler
should automagically place it in a module constructor for you, but I am not of
that opinion.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




16:25:34 PDT ---

 I see. Once again, simplified:
 
 import core.thread;
 
 struct Foo {
     int[] arr;
 }
 
 Foo[] arr = [Foo([1,2,3])]; // Should have failed? (1)
 
 void main( ) {
     int* p = arr[0].arr.ptr;
     auto thr = new Thread({assert(arr[0].arr.ptr == p);}); // Should have
 failed. (2)
     thr.start();
     thr.join();
 }
 
 In this case, for the assert to fail, we'd have to deep-dup the array (COW
 might make that unnecessary, but that's beside the point).
 
 This is in a way related to the issue of array literals being mutable, in that
 it is an example of the compiler erroneously assuming some state may be shared
 when in fact it shouldn't.
 
 I contend that (1) above should simply not compile. It should be required to be
 placed in a module constructor instead. 
Yep int[] x = [1,2,3]; should not be compiled, but shared int[] x = [1,2,3]; //OK const int[] x = [1,2,3]; //OK, because const is global scope == immutable immutable int[] x = [1,2,3]; //OK __gshared int[] x = [1,2,3]; //Same
A case can be made that the compiler
 should automagically place it in a module constructor for you, but I am not of
 that opinion.
I agree, this is not D way, I think. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




 Asserts are generally included to show what currently PASSES,
not what FAILS. OK, I always write unittests that should pass but I'll be more explicit.
 In other words implicit "thread local" modifier is not transitive.
It's not intended to be transitive, it is a storage class, not a type qualifier. Variables with thread local storage may reference any other data (__gshared, shared, stack, heap) and vice versa.
 int[] x = [1,2,3]; // should not be compiled
It would be trivial to fix. As the initializer for static data must be a compile time constant we'd just need to store this constant in TLS instead of the data segment. The problem is that ELF has no TLS relocations for data, i.e. we'd need a dynamic initalizer that sets arr.ptr to the TLS data. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




The simple fix is to only allow value types to have TLS initalizers and require
static this() for everything else.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




00:46:39 PDT ---
 
 int[] x = [1,2,3]; // should not be compiled
It would be trivial to fix. As the initializer for static data must be a compile time constant we'd just need to store this constant in TLS instead of the data segment. The problem is that ELF has no TLS relocations for data, i.e. we'd need a dynamic initalizer that sets arr.ptr to the TLS data.
I dont know anything about relocation magic. But, as I understand you, we cannot to use it;
The simple fix is to only allow value types to have TLS initalizers and require
static this() for everything else.
Yes. This is all we need I think. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 18 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108


Sean Kelly <sean invisibleduck.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sean invisibleduck.org



PDT ---
So I thought I understood this:

    import core.thread;
    int[] arr = [1,2,3].dup;

    void main() {
        auto t = new Thread({arr[0] = 3;});
        t.start();
        t.join();
        assert(arr[0] == 1);
    }

It looks like we have a thread-local reference "arr" to a __gshared array of
int, so I would expect the assert to fail.  Except:

    import core.thread;
    int[] arr = [1,2,3].dup;

    void main() {
        auto t = new Thread({arr[0] = 3;});
        t.start();
        t.join();
        assert(arr[0] == 1);
    }

The .dup should fix the problem, as now each thread gets its own copy of the
array, right?  But the assert still fails.  I suppose I should check the ASM,
but the codegen seems kind of broken here.  Is a __gshared label being inferred
for arr because it's statically slicing __gshared data?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 21 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10108




12:41:56 PDT ---

     import core.thread;
     int[] arr = [1,2,3].dup;
 
     void main() {
         auto t = new Thread({arr[0] = 3;});
         t.start();
         t.join();
         assert(arr[0] == 1);
     }
 
 The .dup should fix the problem, as now each thread gets its own copy of the
 array, right?
It's still being done at compile time, so no. It basically creates a copy at compile time, then stores that in the data segment instead of the original. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 21 2013