digitalmars.D - Porting GDC to QNX

Sheff <sheffmail mail.ru> writes:
Hi everyone, I'm porting the GDC compiler, version 0.23 with GCC 4.1.1, to QNX
Neutrino 6.3.0 SP1 (with GCC 3.3.5).
The port is almost done: GDC, GCC and Phobos compile and work fine,
but the same can't be said for the programs that GDC produces from D source
code. In my testing so far I have spotted only one problem: POSIX threads.
Consider the following example:

import std.stdio;
import std.thread;

char[] string;

int th_func(void* arg)
{
	const int num = 1_000_000;

	for (int i=0; i<num; ++i)
	{
		string ~= "*";
	}

	return 0;
}

int main(char[][] args)
{
	auto th = new Thread(&th_func, null);
	scope(exit)
		delete th;

	th.start();
	th.wait();

	writefln("end\n");

    return 0;
}

When you run this program, it crashes with signal SIGUSR1, which is generated by
pthread_join(). Here's the GDB output:

Program received signal SIGUSR1, User defined signal 1.
[Switching to process 66433068]
0xb032f092 in ThreadJoin_r () from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
(gdb) bt
#0  0xb032f092 in ThreadJoin_r ()
   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
#1  0xb031a801 in pthread_join ()
   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
Memory fault

But if you replace the line:

const int num = 1_000_000;

with:

const int num = 100_000;

then everything works fine: the "end" string gets printed and the program
exits successfully. Consider another, similar example:

import std.stdio;
import std.thread;

char[] string;

int th_func(void* arg)
{
	const int num = 1_000_000;

	for (int i=0; i<num; ++i)
	{
		string ~= "*";
	}

	return 0;
}

int main(char[][] args)
{
	th_func(null);

	writefln("end\n");

    return 0;
}

As you can see, we have

const int num = 1_000_000;

here, as in the first example, which crashed. But this one doesn't; it works
fine even if I write:

const int num = 50_000_000;

The conclusion I draw from all this is that there's something wrong with
threads, but I can't figure out what. Does anyone have any ideas?
Mar 28 2007
Brad Roberts <braddr puremagic.com> writes:
That's not a 'crash', that's receiving a signal. The garbage collector
uses SIGUSR1 and SIGUSR2 to stop and start the threads of the app around
garbage collection. You need to tell gdb to not stop when receiving them,
or you need to just continue the app when they do come in if you care to
see when collections are occurring.

(gdb) handle SIGUSR1 nostop
(gdb) handle SIGUSR2 nostop

(those commands are from memory, so check the docs if they're not right).

The reason you see it in the higher iterations and not the lower is simply
that the higher one ends up doing a gc collection.

Later,
Brad
Mar 28 2007
Sheff <sheffmail mail.ru> writes:
Brad Roberts Wrote:

 That's not a 'crash', that's receiving a signal.  The garbage collector 
 uses SIGUSR1 and SIGUSR2 to stop and start the threads of the app around 
 garbage collection.  You need to tell gdb to not stop when receiving them, 
 or you need to just continue the app when they do come in if you care to 
 see when collections are occurring.
 
 (gdb) handle SIGUSR1 nostop
 (gdb) handle SIGUSR2 nostop
 
 (those commands are from memory, so check the docs if they're not right).
 
 The reason you see it in the higher iterations and not the lower is simply 
 that the higher one ends up doing a gc collection.
 
 Later,
 Brad

No, it is a crash, because when I launch the program standalone (without GDB) it prints "Abort" and exits.
Mar 28 2007
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Sheff wrote:
 No, it is a crash, because when I launch the program standalone (without GDB) it prints "Abort" and exits.

That doesn't mean the signal causes the crash. It could still be that the crash is _after_ the gc run. You should try either ignoring the signals or telling gdb to continue when it stops on them, to see what happens afterwards.
Mar 28 2007
Sheff <sheffmail mail.ru> writes:
Frits van Bommel Wrote:

 That doesn't mean the signal causes the crash. It could still be that 
 the crash is _after_ the gc run. You should try either ignoring the 
 signals or telling gdb to continue when it stops on them, to see what 
 happens afterwards.

Hm, I think that just ignoring that signal is wrong because, for example, on Linux the same code works fine and doesn't send signals. When I continue past the signal in gdb, it prints "Memory fault":

Program received signal SIGUSR1, User defined signal 1.
[Switching to process 66433068]
0xb032f092 in ThreadJoin_r () from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
(gdb) bt
#0  0xb032f092 in ThreadJoin_r ()
   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
#1  0xb031a801 in pthread_join ()
   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
**Memory fault**
Mar 29 2007
Dan <murpsoft hotmail.com> writes:
: )  
Brad's awesome.  
He said what I was going to, but he actually sounds like he knows what he's
talking about.  I was just going to hypothesize the same.
Mar 28 2007
Charlie <charlie.fats gmail.com> writes:
What platform is QNX running on for this port, x86?

Charlie

Mar 28 2007
Sheff <sheffmail mail.ru> writes:
Charlie Wrote:

 What platform is QNX running on for this port, x86?

That's right.
Mar 29 2007