digitalmars.D.learn - new(malloc) locks everything in multithreading

reply "tcak" <tcak gmail.com> writes:
This must be my special day that everything I try gets broken.


import core.thread;
import std.stdio;

class ThreadTest{
	private core.thread.Thread th;

	public this() shared{
		th = cast( shared )(
			new core.thread.Thread(
				cast( void delegate() )( &threadRun )
			)
		);
	}

	public void start() shared{
		(cast()th).start();
	}

	private void threadRun() shared{
		writeln("It works");
		readln();
	}
}


void main(){
	new shared ThreadTest().start();
	char[] abc = new char[4096];
}

=====

Run the program (Release or Debug doesn't matter).

- Mono Application Output -

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff75ed700 (LWP 18480)]
[Switching to Thread 0x7ffff75ed700 (LWP 18480)]


In the call stack, the program reaches "__lll_lock_wait_private () in 
/build/buildd/eglibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95", 
and this locks everything.

Remove the "char[] abc = ne...." line, and everything is fine.
Remove the "new shared Thre..." line and everything is fine.

I don't want to blame dmd directly because, as far as I can see 
from searching for "__lll_lock_wait_private", some C++ programs 
have the same problem with malloc as well. 
Still, could this be caused by the compiler?
Oct 23 2014
parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Friday, 24 October 2014 at 02:51:20 UTC, tcak wrote:
 I don't want to blame dmd directly because as far as I see from 
 the search I did with "__lll_lock_wait_private", some C++ 
 programs are having same problem with malloc operation as well. 
 But still, can this be because of compiler?
versions of the compiler. Which version are you using?

[1] https://issues.dlang.org/show_bug.cgi?id=11981
Oct 23 2014
parent reply "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 03:42:29 UTC, safety0ff wrote:
 On Friday, 24 October 2014 at 02:51:20 UTC, tcak wrote:
 I don't want to blame dmd directly because as far as I see 
 from the search I did with "__lll_lock_wait_private", some C++ 
 programs are having same problem with malloc operation as 
 well. But still, can this be because of compiler?
 versions of the compiler. Which version are you using?

 [1] https://issues.dlang.org/show_bug.cgi?id=11981
I am on DMD 2.066 64-bit Linux 3.13.0-37-generic
Oct 24 2014
parent reply "Kagamin" <spam here.lot> writes:
If it's deterministic, looks more like 
https://issues.dlang.org/show_bug.cgi?id=4890
(11981 is not deterministic)
Oct 24 2014
parent reply "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 08:47:55 UTC, Kagamin wrote:
 If it's deterministic, looks more like 
 https://issues.dlang.org/show_bug.cgi?id=4890
 (11981 is not deterministic)
Yes, it is deterministic. Run it as many times as you want, and it does the same thing. I just ran it again, with the same result.
Oct 24 2014
parent reply "Kagamin" <spam here.lot> writes:
Do you see recursive call to malloc in the stack trace?
Oct 24 2014
parent reply "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 08:55:17 UTC, Kagamin wrote:
 Do you see recursive call to malloc in the stack trace?
I further simplified the example:

import std.stdio;
import core.thread;

class ThreadTest{
	public this(){
		new core.thread.Thread( &threadRun ).start();
	}

	private void threadRun(){
		writeln("It works");
		readln();
	}
}

void main(){
	new ThreadTest();
	char[] abc = new char[4096];
}

This is what I see on screen:
http://imgur.com/Pv9Rulw

Same result.
Oct 24 2014
next sibling parent "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 09:12:57 UTC, tcak wrote:
 On Friday, 24 October 2014 at 08:55:17 UTC, Kagamin wrote:
 Do you see recursive call to malloc in the stack trace?
 I further simplified the example:

 import std.stdio;
 import core.thread;

 class ThreadTest{
 	public this(){
 		new core.thread.Thread( &threadRun ).start();
 	}

 	private void threadRun(){
 		writeln("It works");
 		readln();
 	}
 }

 void main(){
 	new ThreadTest();
 	char[] abc = new char[4096];
 }

 This is what I see on screen:
 http://imgur.com/Pv9Rulw

 Same result.
And this is the malloc thread:
http://imgur.com/e8ofRte

It seems to suspend all threads and never get out of there.
Oct 24 2014
prev sibling parent reply "Kagamin" <spam here.lot> writes:
Looks like your IDE filters too much. Can you configure it to 
filter less and show address locations?
Oct 24 2014
parent reply "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 10:29:10 UTC, Kagamin wrote:
 Looks like your IDE filters too much. Can you configure it to 
 filter less and show address locations?
This is what I have found:

Main Thread
http://i.imgur.com/6ElZ3Fm.png

Second Thread (TestThread)
http://i.imgur.com/w4y5gYB.png
Oct 24 2014
next sibling parent reply "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 10:46:57 UTC, tcak wrote:
 On Friday, 24 October 2014 at 10:29:10 UTC, Kagamin wrote:
 Looks like your IDE filters too much. Can you configure it to 
 filter less and show address locations?
 This is what I have found:

 Main Thread
 http://i.imgur.com/6ElZ3Fm.png

 Second Thread (TestThread)
 http://i.imgur.com/w4y5gYB.png
BTW, instead of using MonoDevelop with Mono-D, I put the same code in a text file, compiled it with "dmd test.d", and ran it with "./test". Everything works fine that way. When I run it with "gdb ./test", I see those errors again.
Oct 24 2014
parent reply "Kapps" <opantm2+spam gmail.com> writes:
On Friday, 24 October 2014 at 10:49:42 UTC, tcak wrote:
 On Friday, 24 October 2014 at 10:46:57 UTC, tcak wrote:
 On Friday, 24 October 2014 at 10:29:10 UTC, Kagamin wrote:
 Looks like your IDE filters too much. Can you configure it to 
 filter less and show address locations?
 This is what I have found:

 Main Thread
 http://i.imgur.com/6ElZ3Fm.png

 Second Thread (TestThread)
 http://i.imgur.com/w4y5gYB.png

 BTW, instead of using Monodevelop with Mono-D, I used same code on a text file, and compiled with "dmd test.d", then run with "./test", then everything works fine. When I run it with "gdb ./test", then I can see those errors again.
Not sure if this is the same issue, but by default gdb breaks on signals that the GC uses, which would explain why it's breaking in gdb but not normally.

What happens if you try:
handle SIGUSR1 noprint nostop
handle SIGUSR2 noprint nostop

In GDB before starting execution of the program?
Oct 24 2014
parent reply "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 16:51:02 UTC, Kapps wrote:
 On Friday, 24 October 2014 at 10:49:42 UTC, tcak wrote:

 Not sure if this is the same issue, but by default gdb breaks 
 on signals that the GC uses, which would explain why it's 
 breaking in gdb but not normally.

 What happens if you try:
 handle SIGUSR1 noprint nostop
 handle SIGUSR2 noprint nostop

 In GDB before starting execution of the program?
This is what I did on shell: (I put some spaces for readability)

tolga tolga:~/dev/d/bug$ dmd -gc -debug test.d

tolga tolga:~/dev/d/bug$ gdb ./test
GNU gdb (Ubuntu 7.7-0ubuntu3.1) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./test...done.

(gdb) run
Starting program: /home/tolga/dev/d/bug/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff75ed700 (LWP 4940)]

Program received signal SIGUSR1, User defined signal 1.
[Switching to Thread 0x7ffff75ed700 (LWP 4940)]
__lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
95 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.

(gdb) backtrace full
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
No locals.
/lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
pthread_create.c:301
	oldtype = 0
	pd = 0x7ffff75ed700
	now = <optimised out>
	unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737343575808, 9112392114346628883, 1, 0, 140737343576512, 140737343575808, -9112375355952097517, -9112375093712283885}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
	not_first_call = <optimised out>
	pagesize_m1 = <optimised out>
	sp = <optimised out>
	freesize = <optimised out>
	__PRETTY_FUNCTION__ = "start_thread"
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.
(gdb)
Oct 24 2014
parent reply "Kapps" <opantm2+spam gmail.com> writes:
On Friday, 24 October 2014 at 18:38:39 UTC, tcak wrote:
 On Friday, 24 October 2014 at 16:51:02 UTC, Kapps wrote:
 On Friday, 24 October 2014 at 10:49:42 UTC, tcak wrote:

 Not sure if this is the same issue, but by default gdb breaks 
 on signals that the GC uses, which would explain why it's 
 breaking in gdb but not normally.

 What happens if you try:
 handle SIGUSR1 noprint nostop
 handle SIGUSR2 noprint nostop

 In GDB before starting execution of the program?
 This is what I did on shell: (I put some spaces for readability)

 tolga tolga:~/dev/d/bug$ dmd -gc -debug test.d

 tolga tolga:~/dev/d/bug$ gdb ./test
Yes, GDB is stopping on SIGUSR1 / SIGUSR2 because that is its default setting. D's GC uses these signals for suspending / resuming threads during a collection. You need to type the two "handle" commands I gave above before typing 'run'.
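
A minimal sketch of one way to make sure the two "handle" settings are applied before 'run', by passing them on the gdb command line instead of typing them at the prompt. It relies on gdb's `-ex` and `--args` options; the function name `debug_d` is a hypothetical helper invented for this illustration, not part of gdb or dmd.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name, not a gdb or dmd feature).
# Starts the given program under gdb with SIGUSR1/SIGUSR2 passed through
# silently, so a GC collection does not stop the debugger.
debug_d() {
    # gdb executes -ex commands in order, so both "handle" lines
    # take effect before "run" starts the program.
    gdb -ex 'handle SIGUSR1 noprint nostop' \
        -ex 'handle SIGUSR2 noprint nostop' \
        -ex 'run' \
        --args "$@"
}

# Usage, assuming ./test was built with "dmd -gc -debug test.d":
#   debug_d ./test
```

Because `--args` treats everything after it as the program and its arguments, it has to come last on the command line.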
Oct 24 2014
next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 24 October 2014 at 21:02:05 UTC, Kapps wrote:
 Yes, GDB is stopping on SIGUSR1 / SIGUSR2 since that's the 
 default settings. D's GC uses these signals for suspending / 
 resuming threads during a collection. You need to type what I 
 said above, prior to typing 'run'.
I took a look at the Boehm GC earlier today and it appears they've made the signal set configurable, both to try and not use SIGUSR1/2 by default and to let the user specify another signal set if they need SIGUSR1/2 for some reason. It's probably worth doing this in our own code as well. The Boehm GC also does some magic with clearing signals at various points to make sure the right signal handlers will be called. Probably another enhancement request.
Oct 24 2014
prev sibling parent "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 21:02:05 UTC, Kapps wrote:
 On Friday, 24 October 2014 at 18:38:39 UTC, tcak wrote:
 On Friday, 24 October 2014 at 16:51:02 UTC, Kapps wrote:

 This is what I did on shell: (I put some spaces for 
 readability)

 tolga tolga:~/dev/d/bug$ dmd -gc -debug test.d

 tolga tolga:~/dev/d/bug$ gdb ./test
 Yes, GDB is stopping on SIGUSR1 / SIGUSR2 since that's the default settings. D's GC uses these signals for suspending / resuming threads during a collection. You need to type what I said above, prior to typing 'run'.
I was desperately looking for a way to do this in Mono-D instead of typing it in the shell. Then I found this link:

http://www.mono-project.com/docs/debug+profile/debug/

Under the title "Debugging with GDB", it says to create a ".gdbinit" file in the home folder and put those "handle SIGUSR1..." lines into it. Then I opened the preferences of MonoDevelop and moved "GDB with D language support" to the top of the preferred debugger list.

Tada! It works now. Thank you very much.

BUT, since Mono-D comes with "GDB with D language support" as well, this step should have been automated, in my opinion, given that the GC uses those signals.
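
For reference, a small sketch of the ".gdbinit" step described above. It appends the two "handle" lines to a gdb init file only if they are not already there, so it can be run more than once. The function name `add_gc_signal_handling` and its optional path argument are assumptions made for this example; the default target is the ".gdbinit" file in the home folder mentioned in the post.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name): adds the GC signal settings
# to a gdb init file, defaulting to ~/.gdbinit.
add_gc_signal_handling() {
    init_file="${1:-$HOME/.gdbinit}"
    for sig in SIGUSR1 SIGUSR2; do
        line="handle $sig noprint nostop"
        # -q quiet, -x match the whole line, -F treat the pattern as plain text.
        # Append only when the exact line is missing (or the file does not exist yet).
        grep -qxF "$line" "$init_file" 2>/dev/null || printf '%s\n' "$line" >> "$init_file"
    done
}

# Usage:
#   add_gc_signal_handling              # writes to ~/.gdbinit
#   add_gc_signal_handling ./my.gdbinit # writes to another file instead
```

The sketch assumes an existing init file ends with a newline; if it does not, the first appended line would be joined to the file's last line.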
Oct 24 2014
prev sibling parent reply "Kagamin" <spam here.lot> writes:
On Friday, 24 October 2014 at 10:46:57 UTC, tcak wrote:
 Second Thread (TestThread)
 http://i.imgur.com/w4y5gYB.png
Hmm... where is __lll_lock_wait_private now? And how can mmap hang at all?
Oct 24 2014
parent "tcak" <tcak gmail.com> writes:
On Friday, 24 October 2014 at 12:38:48 UTC, Kagamin wrote:
 On Friday, 24 October 2014 at 10:46:57 UTC, tcak wrote:
 Second Thread (TestThread)
 http://i.imgur.com/w4y5gYB.png
 Hmm... where is __lll_lock_wait_private now? And how mmap can hang at all?
Here it is.
http://i.imgur.com/5ZDuYRF.png

I didn't notice mmap before you mentioned it, but it seems like that happened as well.
Oct 24 2014