
digitalmars.D.bugs - [Issue 8774] New: 2.059 worked 2.060 does not: Unable to join thread

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774

           Summary: 2.059 worked 2.060 does not: Unable to join thread
           Product: D
           Version: D2
          Platform: x86_64
        OS/Version: Linux
            Status: NEW
          Severity: regression
          Priority: P2
         Component: druntime
        AssignedTo: nobody puremagic.com
        ReportedBy: russel winder.org.uk



PDT ---
The attached code compiles and runs fine under DMD 2.059 installed on Debian
Unstable via the distributed deb file. Under 2.060 it compiles but at runtime
gives:

core.thread.ThreadException src/core/thread.d(780): Unable to join thread
----------------
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(void
pi_d_threadsGlobalState_array_declarative.execute(immutable(int))+0x129)
[0x44dac5]
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(_Dmain+0x28)
[0x44db8c]
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(extern
(C) int rt.dmain2.main(int, char**).void runMain()+0x1c) [0x45b34c]
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(extern
(C) int rt.dmain2.main(int, char**).void tryExec(scope void delegate())+0x2a)
[0x45acc6]
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(extern
(C) int rt.dmain2.main(int, char**).void runAll()+0x3b) [0x45b393]
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(extern
(C) int rt.dmain2.main(int, char**).void tryExec(scope void delegate())+0x2a)
[0x45acc6]
/tmp/.rdmd-1000/rdmd-pi_d_threadsGlobalState_array_declarative.d-C963C499401209E276E9BB7F98EEB447/pi_d_threadsGlobalState_array_declarative(main+0xd1)
[0x45ac51]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7fe18bb3aead]

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Oct 07 2012
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PDT ---
Created an attachment (id=1146)
.

Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Russel Winder <russel winder.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------

        description|                            |ay_declarative.d


Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Alex Rønne Petersen <alex lycus.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alex lycus.org



CEST ---
pi_d_threadsGlobalState_array_declarative.d(13): Error: module output_d is in
file 'output_d.d' which cannot be read
import path[0] = .
import path[1] = /usr/include/dmd/phobos
import path[2] = /usr/include/dmd/druntime/import
Failed: 'dmd' '-v' '-o-' 'pi_d_threadsGlobalState_array_declarative.d' '-I.'

Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PDT ---
Created an attachment (id=1148)
output_d.d

Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




CEST ---
What's happening is that pthread_join() is giving us ESRCH because the thread
handle is apparently no longer valid.

Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




CEST ---
Something is very wrong here and I'm not sure whether to blame the compiler or
Phobos:

starting 0 7FCE6E0B3E00
joining 0 7FCE6E0B3D00

First value is the thread number, second is its address. Notice how the address
it attempts to join with is very wrong.

Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PDT ---
Perhaps worth noting that using ldc2 compiled from the Git repository gives
effectively the same result.

core.thread.ThreadException /home/Checkouts/Git/Git/LDC/runtime/druntime/src/core/thread.d(788):
Unable to join thread

Oct 07 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


luka8088 <luka8088 owave.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luka8088 owave.net



*** Issue 8852 has been marked as a duplicate of this issue. ***

Oct 19 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




Here is a simple test case:

-----

module program;

import std.stdio;
import core.thread;

void main () {

  Thread t1, t2;

  t1 = new Thread(delegate { t2.start(); });
  t2 = new Thread(delegate { Thread.sleep(dur!"seconds"(1)); });

  t1.start();
  t2.join();

}

-----

http://dpaste.dzfl.pl/0d24dd06

output:
  core.thread.ThreadException src/core/thread.d(780): Unable to join thread

if t2.join occurs after t2 already finished then exception is not thrown, hence
the sleep

Oct 19 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com



01:58:21 PST ---
This one was actually simple to decipher - map recomputes values across
multiple iterations. I instrumented the code to show
a) ref address of threads in start loop and join loop
b) print Mapped each time a functor that creates the thread is run.

I get 2 times and 2 different addresses. The only question left is why it ever
worked.

The right approach would be to call array on the result of the map and not
attempt a second lazy evaluation of it.

Modified source below:


import std.algorithm ;
import std.datetime ;
import std.range ;
import std.stdio;

import core.thread ;

import output_d ;

shared double sum ;
shared Object sumMutex ;

void partialSum ( immutable int id , immutable int sliceSize , immutable double
delta ) {
  immutable start = 1 + id * sliceSize ;
  immutable end = ( id + 1 ) * sliceSize ;
  auto localSum = 0.0 ;
  foreach ( i ; start .. end + 1 ) {
    immutable x = ( i - 0.5 ) * delta ;
    localSum += 1.0 / ( 1.0 + x * x ) ;
  }
  synchronized ( sumMutex ) { sum += localSum ; }
}

void execute ( immutable int numberOfThreads ) {
  immutable n = 1000000000 ;
  immutable delta = 1.0 / n ;
  StopWatch stopWatch ;
  stopWatch.start ( ) ;
  immutable sliceSize = n / numberOfThreads ;
  sum = 0.0 ;
  auto threads = map ! ( ( int i ) {
      auto closedPartialSum ( ) {
        immutable ii = i ;
        return delegate ( ) { partialSum ( ii , sliceSize , delta ) ; } ;
      }
      writeln("Mapped!");
      return new Thread ( closedPartialSum ) ;
      } ) ( iota ( numberOfThreads ) ) ;
  foreach ( thread ; threads ) { 
    writefln("%x", cast(void*)thread);
    thread.start ( ) ; 
  }
  foreach ( thread ; threads ) { 
    writefln("%x", cast(void*)thread);
    thread.join ( ) ; 
  }
  immutable pi = 4.0 * delta * sum ;
  stopWatch.stop ( ) ;
  immutable elapseTime = stopWatch.peek ( ).hnsecs * 100e-9 ;
  output ( __FILE__ , pi , n , elapseTime , numberOfThreads ) ;
}

int main ( immutable string[] args ) {
  sumMutex = new shared ( Object ) ;
  execute ( 1 ) ;
  execute ( 2 ) ;
  execute ( 8 ) ;
  execute ( 32 ) ;
  return 0 ;
}
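
The recomputation described above can be seen in isolation with a much smaller program. This is only a sketch, not the reporter's code: the helper names lazyMapRecomputes and arrayKeepsIdentity are made up for illustration, and plain Object instances stand in for the Thread objects. It relies on std.algorithm's map being lazy and on std.array's array storing the results once.

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical helper: iterate a lazy map twice and report whether the
// two passes produced different objects for the first element.
bool lazyMapRecomputes()
{
    auto objs = iota(3).map!((int i) => new Object);
    Object[] first, second;
    foreach (o; objs) first ~= o;   // first pass runs the lambda
    foreach (o; objs) second ~= o;  // second pass runs the lambda again
    return first[0] !is second[0];
}

// Hypothetical helper: materialise the range once with .array, then
// iterate the stored objects twice.
bool arrayKeepsIdentity()
{
    auto objs = iota(3).map!((int i) => new Object).array;
    Object[] first, second;
    foreach (o; objs) first ~= o;
    foreach (o; objs) second ~= o;
    return first[0] is second[0];
}

void main()
{
    assert(lazyMapRecomputes());   // start loop and join loop see different objects
    assert(arrayKeepsIdentity());  // with .array both loops see the same objects
}
```

With the lazy range the "start" loop and the "join" loop would each get freshly constructed objects, which matches the two different addresses printed by the instrumented source.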

Dec 23 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




02:32:18 PST ---

 Here is a simple test case:
 
 -----
 
 module program;
 
 import std.stdio;
 import core.thread;
 
 void main () {
 
   Thread t1, t2;
 
   t1 = new Thread(delegate { t2.start(); });
   t2 = new Thread(delegate { Thread.sleep(dur!"seconds"(1)); });
 
   t1.start();
   t2.join();
 
 }
 
 -----
 
 http://dpaste.dzfl.pl/0d24dd06
 
 output:
   core.thread.ThreadException src/core/thread.d(780): Unable to join thread
 
 if t2.join occurs after t2 already finished then exception is not thrown, hence
 the sleep
This one is a genuine race condition: t2.join could be called before t2 is
actually started by t1. And as far as I can tell this is the *most* *probable*
outcome. So it can't be seriously taken as test case without proper
synchronization between threads.

What it shows though is that you can't join a thread that isn't started and the
error is "Unable to join thread".

Dec 23 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla digitalmars.com



03:36:13 PST ---
Dmitry, does this mean this is not a bug in the compiler or library?

Dec 25 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




CET ---
Well, we probably should make sure that by the time Thread.start() returns, the
thread is joinable (waiting for the thread to start in Thread.join() would
probably be undesirable).

In any case, it probably isn't a bug, but a very reasonable enhancement request
- I know I'd expect it to behave like that.

Dec 25 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774






 Here is a simple test case:
 
 -----
 
 module program;
 
 import std.stdio;
 import core.thread;
 
 void main () {
 
   Thread t1, t2;
 
   t1 = new Thread(delegate { t2.start(); });
   t2 = new Thread(delegate { Thread.sleep(dur!"seconds"(1)); });
 
   t1.start();
   t2.join();
 
 }
 
 -----
 
 http://dpaste.dzfl.pl/0d24dd06
 
 output:
   core.thread.ThreadException src/core/thread.d(780): Unable to join thread
 
 if t2.join occurs after t2 already finished then exception is not thrown, hence
 the sleep
This one is a genuine race condition: t2.join could be called before t2 is actually started by t1. And as far as I can tell this is the *most* *probable* outcome. So it can't be seriously taken as test case without proper synchronization between threads. What it shows though is that you can't join a thread that isn't started and the error is "Unable to join thread".
Yes, you are correct, it does not make much sense, but there is another issue.
The following code throws an exception on 64-bit linux (32-bit linux and 32-bit
windows executes without throwing an exception). On 64-bit linux t2 is never
started.

-----

module program;

import std.stdio;
import core.thread;

void main () {

  Thread t1, t2;
  bool runned = false;

  t1 = new Thread(delegate { t2.start(); });
  t2 = new Thread(delegate { runned = true; });

  t1.start();
  Thread.sleep(dur!"seconds"(1));
  assert(runned);

}

-----

http://dpaste.dzfl.pl/5dc9733e

Should I file a new bug report ?

Dec 25 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




06:03:35 PST ---

 Dmitry, does this mean this is not a bug in the compiler or library?
From what I can gather the bug is not in the compiler but in the original code
making wrong assumptions about the behavior of map in Phobos (it doesn't cache
any value but computes them on demand). I suspect that this problem with the
code was hidden by the fact that join didn't always throw an exception on a
wrong join (e.g. on a thread that isn't started). In other words the code worked
by pure luck and it needs fixing.

However I'm still getting an access violation after fixing the original bug.
I'll investigate further as to what the cause is. It may be an unrelated bug
with a closure as there are 3 levels of these.

Dec 25 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




06:14:16 PST ---

 
 Yes, you are correct, it does not make much sense, but there is another issue.
 The following code throws an exception on 64-bit linux (32-bit linux and 32-bit
 windows executes without throwing an exception). On 64-bit linux t2 is never
 started.
 
How did you get to this conclusion?
 -----
 
 module program;
 
 import std.stdio;
 import core.thread;
 
 void main () {
 
   Thread t1, t2;
   bool runned = false;
 
   t1 = new Thread(delegate { t2.start(); });
   t2 = new Thread(delegate { runned = true; });
 
   t1.start();
   Thread.sleep(dur!"seconds"(1));
   assert(runned);
 
 }
 
 -----
 
 http://dpaste.dzfl.pl/5dc9733e
 
 
 Should I file a new bug report ?
And again there is no guarantee that:
a) access to runned is properly guarded and made visible in all threads, as it's
on the stack and not declared as shared, nor are there memory barriers.
Technically the compiler is free to do what the heck it wants to.
b) sleep is not a tool to coordinate threads; use locks and condition variables
(or spin on atomic variables), as there is *always* a *chance* to get to
assert(runned) before t2 is executed.

With all that being said it very well *may* be a bug. Yet the test is still
bogus. Again, until proper synchronization is enacted it's wrong to conclude
*anything* aside from the fact that it's ultimately nondeterministic.

So if you'd get this in a form where there are no probabilities involved then
sure it's a bug and the report is appreciated.

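
For what it's worth, one way to take the probabilities out of such a test is to join t1 before touching t2 and to use an atomic flag instead of a plain stack bool. The following is only a sketch under those assumptions, not a proposed regression test; the helper name startedAndJoined is invented here, and it uses atomicStore/atomicLoad from core.atomic.

```d
import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;

// Hypothetical helper: returns true if t2's body has run.
bool startedAndJoined()
{
    shared bool ran = false;   // shared flag, accessed only atomically
    Thread t1, t2;

    t2 = new Thread({ atomicStore(ran, true); });
    t1 = new Thread({ t2.start(); });

    t1.start();
    t1.join();  // once t1 has finished, t2.start() has returned
    t2.join();  // so t2 is a started thread and can be joined
    return atomicLoad(ran);
}

void main()
{
    assert(startedAndJoined());
}
```

Because the main thread never calls t2.join() before t2.start() has completed, and never reads the flag before t2 has been joined, there is no sleep and no window in which the assert could fire for timing reasons alone.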
Dec 25 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|2.059 worked 2.060 does     |2.059 worked 2.060 does
                   |not: Unable to join thread  |not: nested delegate memory
                   |                            |corruption



03:57:08 PST ---
Okay I drilled this down and the bug is not related to threading.

Short-list for Russel: 
- the bug in the original code is fixed by .array on result of map 
- to workaround a strange stackframe corruption bug pass 'i' instead of 'ii' in
partialSum and it works.


Now on to the real problem discovered, I guess Walter has some work to do
though as it doesn't look related to threading and run-time at all.

The minimized test case with threading removed,
shows that variable sliceSize gets corrupted:

import std.stdio;
import std.algorithm;
import std.range;

int sliceBefore;

void partialSum ( immutable int id , immutable int sliceSize , immutable double
delta ) {
  writeln("PSUM: ", id, "  SLICE: ", sliceSize, " DELTA:", delta); 
  assert(sliceSize == sliceBefore);
}

void corrupt( immutable int numberOfThreads ) {
    immutable n = 1000000000 ;
    immutable delta = 1.0 / n ;      
    immutable sliceSize = n / numberOfThreads ;
    sliceBefore = sliceSize;
    writeln("REAL SLICE SIZE: ", sliceSize);
    auto threads = map ! ( ( int i ) {
        auto closedPartialSum ( ) {
            immutable ii = i ;
            return delegate ( ) {     
                                //change ii to i and it works            
                partialSum (ii , sliceSize , delta ) ;
            };
        }
        return closedPartialSum;     
    }) ( iota ( numberOfThreads ) ).array;
    foreach(dg; threads)
        dg();
}

void main(){
  corrupt(1);
}


Yields the following (exact corrupted number in SLICE varies):

REAL SLICE SIZE: 1000000000
PSUM: 0  SLICE: 1588842464 DELTA:1e-09
core.exception.AssertError quadrature(15): Assertion failure
....

Can somebody who has 2.059 check whether this simplified test case passes there?

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




05:19:36 PST ---
Spent some more time & removed Phobos dependencies. 
The sample below runs fine with -version=OK and triggers assert without.
Curiously the only difference is a = a[1..$] vs the same via popFront (where
array is passed by ref).

void popFront(A)(ref A[] a)
{
    a = a[1 .. $];
}

private struct MapResult(alias fun, Range)
{
    alias Range R;
    R _input;

    this(R input)
    {
        _input = input;
    }

    @property bool empty()
    {
        return _input.length == 0;
    }

    void popFront()
    {
        version(OK)
            _input = _input[1 .. $];
        else
            _input.popFront();
        //
    }

    @property auto ref front()
    {
        return fun(_input[0]);
    }

    @property auto length()
    {
        return _input.length;
    }

    alias length opDollar;
}

auto map(alias fun, Range)(Range r)
{
    return MapResult!(fun, Range)(r);
}

int sliceBefore;

void corrupt( immutable int numberOfThreads ) {
    immutable n = 1000000000 ;
    immutable delta = 1.0 / n ;      
    immutable sliceSize = n / numberOfThreads ;
    sliceBefore = sliceSize;
    auto threads = map!( ( int i ) {
        auto closedPartialSum() {
            immutable ii = i ;
            return delegate ( ) {
                assert(sliceSize == sliceBefore);
            };
        }
        return closedPartialSum();     
    }) ( [0] );

    //create array of delegates and copy them using range-based foreach
    auto dgs = new void delegate()[threads.length];
    size_t i=0;
    foreach(dg; threads) //empty-front-popFront
        dgs[i++] = dg;

    foreach(dg; dgs)
        dg();
}

void main(){
  corrupt(1);
}

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Maxim Fomin <maxim maxim-fomin.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maxim maxim-fomin.ru



---
This looks like template lambda bug, but strictly speaking it is not. However,
symptoms are same - usage of map with delegates/lambdas corrupts values.

May be related: issue 8899 and 
- 8514 : delegate and map leads to segfault, importing standard module affects
behavior
- 8854 : as described above 
- 8832 : similar code which uses map and delegate has different and incorrect
behavior
- 5064 (???): program crashes when using map and delegate
- 7978 : map, take and iota functions with delegates

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




06:01:41 PST ---
Reduced a tiny bit further and there are 2 versions that work fine: OK and OK2. 
Both are seemingly unrelated to the test. 


void popFront(A)(ref A[] a)
{
    a = a[1 .. $];
}

private struct MapResult(alias fun, Range)
{
    alias Range R;
    R _input;

    this(R input)
    {
        _input = input;
    }

    @property bool empty()
    {
        return _input.length == 0;
    }

    void popFront()
    {
        version(OK)
            _input = _input[1 .. $];
        else
            _input.popFront();
        //
    }

    @property auto ref front()
    {
        return fun(_input[0]);
    }

    @property auto length()
    {
        return _input.length;
    }

    alias length opDollar;
}

int sliceBefore;

void corrupt( immutable int numberOfThreads ) {
    immutable n = 1000000000 ;  
    immutable sliceSize = n / numberOfThreads ;
    sliceBefore = sliceSize;
    auto threads = MapResult!( ( int i ) {
        auto closedPartialSum() {
            version(OK2){} //omit ii and it works
            else
            {
                immutable ii = i ;
            }
            return delegate ( ) {
                assert(sliceSize == sliceBefore);
            };
        }
        return closedPartialSum();     
    }, int[]) ( [0] );

    //create array of delegates and copy them using range-based foreach
    auto dgs = new void delegate()[threads.length];
    size_t i=0;
    foreach(dg; threads) //empty-front-popFront
        dgs[i++] = dg;

    foreach(dg; dgs)
        dg();
}

void main(){
  corrupt(1);
}

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




---

 Reduced a tiny bit further and there 2 versions that work fine: OK and OK2. 
 Both are seemingly unrelated to the test. 
 
 <skipped>
Note, valgrind still complains with -version=OK (use of uninitialized value in
dgliteral2). Try inserting writeln("") in MapResult.popFront and you will get an
assert error.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PST ---
OK I can accept that map delivers an iterable but not a sequence, so to create
a sequence application of array is needed. This would imply though that 2.059
→ 2.060 introduces an undocumented breaking change – albeit a good one
correcting a bug.

Having i work and ii not work implies the whole closure capture mechanism in
2.059 → 2.060 underwent breaking change. The question then is which closure
semantics are correct, indeed whether delegates are required to create
closures.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PST ---
Using LDC2, the ii → i change is not needed, just adding the .array to the
result of the map expression to turn the iterable into a data structure is
sufficient.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PST ---
ii → i not needed for DMD 2.060 either.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PST ---
Interesting, or not, pi_d_threadsGlobalState_array_iterative.d works fine with
LDC2 and causes a segmentation violation with DMD. Separate issue coming up
shortly.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




12:41:47 PST ---

 Well, we probably should make sure that by the time Thread.start() returns, the
 thread is joinable (waiting for the thread to start in Thread.join() would
 probably be undesirable).
 
 In any case, it probably isn't a bug, but a very reasonable enhancement request
 - I know I'd expect it to behave like that.
I see no reason to believe it's not the case right now. Basically after
t.start() was called join should always succeed (and wait on the thread to
terminate, potentially blocking indefinitely).

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




14:22:48 PST ---

 OK I can accept that map delivers an iterable but not a sequence, so to create
 a sequence application of array is needed.
It's not exactly an iterable or sequence, it's a lazy computation represented as a range. The key point is that it doesn't even attempt to store results somewhere nor avoid recomputation on demand.
 This would imply though that 2.059 → 2.060 introduces an undocumented
 breaking change – albeit a good one correcting a bug.
 
The only case I think it might have worked previously is the following chain of
"events":
1. map used to cache the first entry (and maybe the last) (AFAIK it was in 2.057
or earlier)
2. then 1 thread out of the pack is indeed the one that can be joined, the
others are newly created and not started. (if both front & back - then 2
threads)
3. the join somehow worked on not yet started threads.
4. the join then waited on only one thread (or maybe 2).
5. by sheer luck, by the moment of printing stuff all threads have completed
(since printing takes some time and is interlocked there is plenty of chance for
other threads to finish the work)

So... can you print the addresses of threads (in both start & join loops) in
2.059 where it used to work? It's intriguing to see if this guess is on the spot
and what forces are at work there.
 Having i work and ii not work implies the whole closure capture mechanism in
 2.059 → 2.060 underwent breaking change. The question then is which closure
 semantics are correct, indeed whether delegates are required to create
 closures.
No, I suspect you are confusing these 2 cases:
foreach(i; x..y)
{
   threads ~= delegate (){ 
       //use i here
   }
}

vs

threads = map!( (int i){ use i here ... })(...);

Note that in second case 'i' is a parameter passed to a delegate that is in
turn passed by alias to map. Thus 'i' is unique for every stack frame (or call)
of the said delegate. In the first case 'i' is tied to the same variable on the
stack frame for each delegate create.

Semantically there are 2 rules:
- context capture happens on delegate creation
- during context capture everything in the current function scope is captured
by ref (behind the scenes rather the whole is captured by a single ref to a
stack frame)

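
The two capture rules can be illustrated with a small self-contained sketch. The names sharedContext, capture and perCallContext are invented for this example, and the counter in the first helper is deliberately declared outside the loop (an assumption made here so that the sharing does not depend on how a particular compiler version treats foreach loop variables).

```d
// Hypothetical helper: delegates created within one function call all
// refer to the same variable in that call's context.
int[] sharedContext()
{
    int delegate()[] dgs;
    int i;                       // one variable, declared outside the loop
    for (i = 0; i < 3; ++i)
        dgs ~= () => i;          // every delegate refers to this same i

    int[] seen;
    foreach (dg; dgs) seen ~= dg();
    return seen;                 // i is 3 once the loop has ended
}

// Hypothetical helper: i is a parameter, so each call has its own context.
int delegate() capture(int i)
{
    return () => i;
}

int[] perCallContext()
{
    int delegate()[] dgs;
    foreach (k; 0 .. 3)
        dgs ~= capture(k);       // a fresh context per call

    int[] seen;
    foreach (dg; dgs) seen ~= dg();
    return seen;
}

void main()
{
    assert(sharedContext() == [3, 3, 3]);
    assert(perCallContext() == [0, 1, 2]);
}
```

The second helper corresponds to the map case: the lambda's parameter lives in the context of that particular call, so each delegate keeps its own value.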
Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




18:17:14 PST ---

 Reduced a tiny bit further and there 2 versions that work fine: OK and OK2. 
 Both are seemingly unrelated to the test. 
This example seg faults on 2.058, 59, and 60 as well as 61.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




21:30:55 PST ---
A further reduced test case:

-----------------
void popFront()
{
    int[20] abc;        // smash stack
}

struct MapResult(alias fun)
{
    void delegate() front()
    {
        return fun(1);
    }
}

void main() {

    int sliceSize = 100;

    void delegate() foo( int i )
    {
        void delegate() closedPartialSum()
        {
            int ii = i ;
            void bar()
            {   assert(sliceSize == 100);
            }
            return &bar;
        }
        return closedPartialSum();
    }

    auto threads = MapResult!foo();

    auto dg = threads.front();
    popFront();

    dg();
}
-----------------

Most of the oddities in the original were there to "smash" the stack, which we
do here with a simple array initialization. Apparently, there are out-of-scope
dangling references into the stack which cause the seg fault.

Dec 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




Commit pushed to staging at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/c1e04220dcd9d4e49d43a60d7619a19cc38e73e7
fix Issue 8774 - 2.059 worked 2.060 does not: nested delegate memory corruption

Dec 27 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




Commit pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/d1f6b562ecd64e2405be00cea7abf6b8d5f78a55
fix Issue 8774 - 2.059 worked 2.060 does not: nested delegate memory corruption

Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




00:24:27 PST ---
I did find a long time bug in the way closures are nested. This was not a
regression. The non-thread test cases here are now fixed, but I don't know
about the original problem with joining, so I won't mark it fixed just yet.

It's likely that this was causing problems in other bugzilla issues.

Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




PST ---

[…]
 It's not exactly  an iterable or sequence, it's a lazy computation represented
 as a range. The key point is that it doesn't even attempt to store results
 somewhere nor avoid recomputation on demand.
If I can iterate over it then it is iterable ;-) The real point is lazy
evaluation vs. strict evaluation. And the question is, as stated earlier: why
did this ever work? On the other hand, the code works now.

[…]
 So... can you print the addresses of threads (in both start & join loops) in
 2.059 where it used to work. It's intriguing to see if this guess is on spot
 and what forces are at work there.
I will try and look into this in the new year.

[…]
 No, I suspect you are confusing these 2 cases:
 foreach(i; x..y)
 {
    threads ~= delegate (){ 
        //use i here
    }
 }
 
 vs
 
 threads = map!( (int i){ use i here ... })(...);
I suspect you are 100% correct :-) Thanks for the "heads up", code duly
amended and still working.
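
The lazy vs. strict distinction discussed above can be sketched as follows.
This is an illustration under stated assumptions, not the code from the report:
Job is a hypothetical stand-in for core.thread.Thread, and makeJob and the
created counter exist only for the demonstration. A lazy map result runs its
function again on every traversal, so a "start" loop and a "join" loop see
different objects; .array evaluates once and keeps the results.

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

int created;                 // counts how many Job objects were built

// Hypothetical stand-in for core.thread.Thread.
class Job
{
    bool started;
}

Job makeJob(int i)
{
    ++created;
    return new Job;
}

void main()
{
    // Lazy: nothing is stored, each traversal calls makeJob again.
    auto lazyJobs = iota(3).map!makeJob;
    foreach (j; lazyJobs) j.started = true;    // "start" loop: 3 objects
    foreach (j; lazyJobs) assert(!j.started);  // "join" loop: 3 fresh objects
    assert(created == 6);

    // Strict: .array evaluates the range once and keeps the objects.
    auto jobs = iota(3).map!makeJob.array;
    foreach (j; jobs) j.started = true;
    foreach (j; jobs) assert(j.started);       // the same objects both times
}
```

With real threads the second lazy loop would be calling join on threads that
were never started, which is consistent with the "Unable to join thread" error
reported here.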
Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




03:37:35 PST ---

 I did find a long time bug in the way closures are nested. This was not a
 regression.
Terrific, thanks!
 The non-thread test cases here are now fixed, but I don't know
 about the original problem with joining, so I won't mark it fixed just yet.
 
The code shouldn't work, and after fixing a bug in the compiler and in the
code itself it works :)

For me on Win32 the breakdown across full versions (dmd+phobos+druntime) is as
follows:

2.061 with the latest patch:
    with .array: works
    w/o .array: fails as it should (Unable to join)
2.057-2.058 and 2.060:
    with .array: segfaults due to the closure bug with stack corruption; can
      be worked around in the same way as discussed here
    w/o .array: fails as it should (Unable to join)
2.059:
    with .array: fails to compile with a range of Thread (!)
    w/o .array: fails as it should (Unable to join)
2.053-2.056:
    - fails to compile (no inference for nested function); after fixing that
      it fails with a segfault (the same closure thing)
    - fails as it should (Unable to join)
2.052:
    fails to compile (cannot access frame of ...)
2.049-2.051:
    had no std.datetime & std.parallelism, but even with those ripped out it
    fails to compile as in 2.052.

Thus I conclude that the issue with threading is invalid and is not a
regression. The only mystery remaining is why (and when?) it *did* work
before. Probably it did work with LDC but then stopped. LDC is not affected by
the closure bug and uses a patched fork of the D runtime.

So I'd go ahead and close it as resolved fixed, as a long-standing critical bug
with stack corruption in nested delegates. Russel, are you OK with that?
 It's likely that this was causing problems in other bugzilla issues.
Time to go on a witch-hunt! Maxim presented an excellent list to check.
Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




03:52:06 PST ---
*** Issue 8505 has been marked as a duplicate of this issue. ***

Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




---

 I did find a long time bug in the way closures are nested. This was not a
 regression. The non-thread test cases here are now fixed, but I don't know
 about the original problem with joining, so I won't mark it fixed just yet.
 
 It's likely that this was causing problems in other bugzilla issues.
Walter, take a look at issue 8832. It is the same problem, but the code still
segfaults. It seems your commits do not entirely fix the bug, although they do
fix a couple of others, which are now closed.
Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




---

 I did find a long time bug in the way closures are nested. This was not a
 regression. The non-thread test cases here are now fixed, but I don't know
 about the original problem with joining, so I won't mark it fixed just yet.
 
 It's likely that this was causing problems in other bugzilla issues.
Walter, please also take a look at issue 7978. It still shows the problem.
Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




---


 It's likely that this was causing problems in other bugzilla issues.
Other issues fixed by these commits are: issue 8514, issue 8854, issue 5064,
issue 1350
Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774




10:15:02 PST ---


 
 It's likely that this was causing problems in other bugzilla issues.
 Other issues fixed by these commits are: issue 8514, issue 8854, issue 5064,
 issue 1350
Woo-hoo! It sure was a nasty one.
Dec 27 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8774


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


Dec 28 2012