digitalmars.D - Porting GDC to QNX

reply Sheff <sheffmail mail.ru> writes:
Hi everyone, I'm porting the GDC compiler (version 0.23, on GCC 4.1.1) to QNX
Neutrino 6.3.0 SP1 (with GCC 3.3.5).
The port is almost done: GDC, GCC and Phobos compile and work fine, but the
same can't be said for the programs that GDC produces from D source code. In my
testing so far I have spotted only one problem: POSIX threads. Consider the
following example:

import std.stdio;
import std.thread;

char[] string;

int th_func(void* arg)
{
	const int num = 1_000_000;

	for (int i=0; i<num; ++i)
	{
		string ~= "*";
	}

	return 0;
}

int main(char[][] args)
{
	auto th = new Thread(&th_func, null);
	scope(exit)
		delete th;

	th.start();
	th.wait();

	writefln("end\n");

	return 0;
}

When you run this program, it crashes with signal SIGUSR1, which is generated by
pthread_join(). Here's the GDB output:

Program received signal SIGUSR1, User defined signal 1.
[Switching to process 66433068]
0xb032f092 in ThreadJoin_r () from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
(gdb) bt

   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2

   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
Memory fault

But if you replace the line:

const int num = 1_000_000;

with:

const int num = 100_000;

then everything works fine: the "end" string gets printed and the program exits
successfully. Consider another, similar example:

import std.stdio;
import std.thread;

char[] string;

int th_func(void* arg)
{
	const int num = 1_000_000;

	for (int i=0; i<num; ++i)
	{
		string ~= "*";
	}

	return 0;
}

int main(char[][] args)
{
	th_func(null);

	writefln("end\n");

	return 0;
}

As you can see, we have

const int num = 1_000_000;

here, just as in the first example, which crashed, but this one doesn't crash;
it works fine even if I write:

const int num = 50_000_000;

My conclusion from all this is that something is wrong with threads, but I
can't figure out what. Does anyone have any ideas?
Mar 28 2007
next sibling parent reply Brad Roberts <braddr puremagic.com> writes:
On Wed, 28 Mar 2007, Sheff wrote:

 Hi everyone, I'm porting GDC compiler, version 0.23 and GCC 4.1.1 to QNX 
 Neutrino 6.3.0 SP1 (with GCC 3.3.5). The porting is almost done, GDC, 
 GCC and Phobos are compiled and working fine, but you can't say the same 
 for the programs, which GDC produces from D source code. So far I 
 tested, I only spot one problem: posix threads. Consider the following 
 example:
[snip code]
 When you run this program, it crashes with signal SIGUSR1, which's generated
by pthread_join(), here's GDB output:
 
 Program received signal SIGUSR1, User defined signal 1.
 [Switching to process 66433068]
 0xb032f092 in ThreadJoin_r () from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
 (gdb) bt

    from /usr/qnx630/target/qnx6/x86/lib/libc.so.2

    from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
 Memory fault
 
 But, if you'll replace the line:
 
 const int num = 1_000_000;
 
 with:
 
 const int num = 100_000;
 
 Then everything works fine, the "end" string gets printed, program successfuly
exits. Consider another, similar example:
That's not a 'crash', that's receiving a signal. The garbage collector uses SIGUSR1 and SIGUSR2 to stop and start the threads of the app around garbage collection. You need to tell gdb to not stop when receiving them, or you need to just continue the app when they do come in if you care to see when collections are occurring.

(gdb) handle SIGUSR1 nostop
(gdb) handle SIGUSR2 nostop

(those commands are from memory, so check the docs if they're not right).

The reason you see it in the higher iterations and not the lower is simply that the higher one ends up doing a gc collection.

Later,
Brad
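
The point that a handled SIGUSR1 is harmless can be checked in isolation with a small C sketch. This is not the Phobos collector's code, only an illustration under simple assumptions: `raise_handled_sigusr1` and `on_sigusr1` are hypothetical names, and the signal is raised in the calling thread rather than sent to another thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counts handler invocations; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t usr1_seen = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    usr1_seen++;
}

/*
 * Hypothetical helper: installs a SIGUSR1 handler, raises the signal in the
 * calling thread, and returns how many times the handler ran (-1 on error).
 */
int raise_handled_sigusr1(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    usr1_seen = 0;
    if (raise(SIGUSR1) != 0)
        return -1;
    return (int)usr1_seen;
}
```

If the handler is installed correctly, the process survives the signal and the function returns normally. Without a handler, the default action for SIGUSR1 is to terminate the process, so a port where the handler setup fails could look like a crash on a signal.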
Mar 28 2007
next sibling parent reply Sheff <sheffmail mail.ru> writes:
Brad Roberts Wrote:

 That's not a 'crash', that's receiving a signal.  The garbage collector 
 uses SIGUSR1 and SIGUSR2 to stop and start the threads of the app around 
 garbage collection.  You need to tell gdb to not stop when receiving them, 
 or you need to just continue the app when they do come in if you care to 
 see when collections are occurring.
 
 (gdb) handle SIGUSR1 nostop
 (gdb) handle SIGUSR2 nostop
 
 (those commands are from memory, so check the docs if they're not right).
 
 The reason you see it in the higher iterations and not the lower is simply 
 that the higher one ends up doing a gc collection.
 
 Later,
 Brad
No, it is a crash, because when I launch the program standalone (without GDB) it prints "Abort" and exits.
Mar 28 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Sheff wrote:
 Brad Roberts Wrote:
 
 That's not a 'crash', that's receiving a signal.  The garbage collector 
 uses SIGUSR1 and SIGUSR2 to stop and start the threads of the app around 
 garbage collection.  You need to tell gdb to not stop when receiving them, 
 or you need to just continue the app when they do come in if you care to 
 see when collections are occurring.

 (gdb) handle SIGUSR1 nostop
 (gdb) handle SIGUSR2 nostop

 (those commands are from memory, so check the docs if they're not right).

 The reason you see it in the higher iterations and not the lower is simply 
 that the higher one ends up doing a gc collection.

 Later,
 Brad
No, it's the crash, cause when I launch a standalone program(without GDB) it writes "Abort" and exits.
That doesn't mean the signal causes the crash. It could still be that the crash is _after_ the gc run. You should try either ignoring the signals or telling gdb to continue when it stops on them, to see what happens afterwards.
Mar 28 2007
parent Sheff <sheffmail mail.ru> writes:
Frits van Bommel Wrote:

 That doesn't mean the signal causes the crash. It could still be that 
 the crash is _after_ the gc run. You should try either ignoring the 
 signals or telling gdb to continue when it stops on them, to see what 
 happens afterwards.
Hm, I think that just ignoring that signal is wrong because, for example, on Linux the same code works fine and doesn't send signals. When I continue past the signal in gdb it prints "Memory fault":

Program received signal SIGUSR1, User defined signal 1.
[Switching to process 66433068]
0xb032f092 in ThreadJoin_r () from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
(gdb) bt

   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2

   from /usr/qnx630/target/qnx6/x86/lib/libc.so.2
**Memory fault**
Mar 29 2007
prev sibling parent Dan <murpsoft hotmail.com> writes:
: )  
Brad's awesome.  
He said what I was going to, but actually sounds like he knows what he was
talking about.  I was just going to hypothesize the same.
Mar 28 2007
prev sibling next sibling parent reply Charlie <charlie.fats gmail.com> writes:
What platform is QNX running on for this port, x86 ?

Charlie

Mar 28 2007
parent Sheff <sheffmail mail.ru> writes:
Charlie Wrote:

 What platform is QNX running on for this port, x86 ?
That's right.
Mar 29 2007
prev sibling next sibling parent reply Sheff <sheffmail mail.ru> writes:
I think I've found the problem: the values of POSIX constants (e.g. O_CREAT,
MAP_ANON, etc.) differ between Linux and QNX, but Phobos defines them as on
Linux. For example,
in Phobos O_CREAT is defined as:
const int O_CREAT = 0100; // octal, 0x40
In the Linux fcntl.h header:
#define  O_CREAT 0100 // octal, 0x40
but in the QNX fcntl.h header it's defined like this:
#define  O_CREAT 0400 // octal, 0x100
0x40 != 0x100, so file creation always fails, as do other system calls.
What should I do about it? I don't want to manually redefine all the POSIX
constants...
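
A short C sketch makes the radix question concrete: a literal with a leading zero is octal, so 0100 is 64 in decimal and 0x40 in hex, while 0400 is 256 in decimal and 0x100 in hex. The helper name `format_flag` is hypothetical and only used for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes one flag value in octal, decimal and hex
 * into buf, so the same constant can be compared across notations.
 * Returns the number of characters snprintf would have written.
 */
int format_flag(char *buf, size_t n, const char *name, unsigned value)
{
    return snprintf(buf, n, "%s = %#o (octal) = %u (decimal) = %#x (hex)",
                    name, value, value, value);
}
```

Calling `format_flag(buf, sizeof buf, "O_CREAT", O_CREAT)` after including `<fcntl.h>` reports the value used by whatever platform the code is compiled on, which is one way to compare the two systems directly.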
Mar 30 2007
next sibling parent =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Sheff wrote:
 I think I understood what's the problem, the values of POSIX constants are
different in linux and QNX (i.e O_CREAT, MAP_ANON, etc ), but in phobos they're
defined like in linux, for example:
 In phobos O_CREAT defined like:
 const int O_CREAT = 0100; // octal, 0x40
 In linux fcntl.h header:
 #define  O_CREAT 0100 // octal, 0x40
 but in QNX fcntl.h header it's defined like this:
 #define  O_CREAT 0400 // octal, 0x100
 0x40 != 0x100, so file creation always fails, so do other system calls.
 What shall I do about it ? I don't want to manually redefine all POSIX
constants...
That only applies to DMD's (which only supports Linux anyway) Phobos, but not to GDC - it uses autoconf to generate those gPhobos constants. For instance, on Mac OS X we have:

#define O_CREAT 0x0200

Take a look at the programs in d/phobos/config, for all the details. You need to create the frag-gen, frag-math and frag-unix configs...

--anders
Mar 30 2007
prev sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Sheff wrote:
 I think I understood what's the problem, the values of POSIX constants are
different in linux and QNX (i.e O_CREAT, MAP_ANON, etc ), but in phobos they're
defined like in linux, for example:
 In phobos O_CREAT defined like:
 const int O_CREAT = 0100; // octal, 0x40
 In linux fcntl.h header:
 #define  O_CREAT 0100 // octal, 0x40
 but in QNX fcntl.h header it's defined like this:
 #define  O_CREAT 0400 // octal, 0x100
 0x40 != 0x100, so file creation always fails, so do other system calls.
 What shall I do about it ? I don't want to manually redefine all POSIX
constants...
There should be a file called gen_unix.c in the GDC tree (specifically, gcc/d/phobos/config/gen_unix.c) that generates data for std/c/unix/unix.d when compiled and run on the target platform. This should be automatic for a native build, but IIRC for a cross-build you need to do this manually (this is mentioned in gcc/d/INSTALL).

You'll want to do the same for gen_config1.c and gen_math.c, then put the output in a directory like gcc/d/phobos/config/qnx (with filenames frag-unix, frag-gen and frag-math) and pass --enable-phobos-config-dir=<dir> to GDC's ./configure command.

Note: I've never done this myself; this is just from what I remember reading in these newsgroups and from looking at the GDC source tree.
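
The general idea behind such a generator can be sketched in a few lines of C. This is not the actual gen_unix.c, and the names `emit_d_const` and `EMIT_D_CONST` are hypothetical: the program is compiled against the target's own headers, so each constant's value comes from the platform instead of being hard-coded.

```c
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats one D constant declaration into buf.
 * Returns the number of characters snprintf would have written.
 */
int emit_d_const(char *buf, size_t n, const char *name, long value)
{
    return snprintf(buf, n, "const int %s = 0x%lx;", name,
                    (unsigned long)value);
}

/*
 * Stringifies the symbol for the D-side name and evaluates it for the value,
 * which is taken from whatever <fcntl.h> the compiler picks up.
 */
#define EMIT_D_CONST(buf, n, sym) emit_d_const((buf), (n), #sym, (long)(sym))
```

Compiled with the QNX headers, `EMIT_D_CONST(buf, sizeof buf, O_CREAT)` would be expected to produce a declaration carrying the QNX value, and printing one such line per constant gives a fragment that can be pasted into a D module.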
Mar 30 2007
prev sibling parent Sheff <sheffmail mail.ru> writes:
I think I've found the problem: the values of POSIX constants (e.g. O_CREAT,
MAP_ANON, etc.) differ between Linux and QNX, but Phobos defines them as on
Linux. For example,
in Phobos O_CREAT is defined as:
const int O_CREAT = 0100; // octal, 0x40
In the Linux fcntl.h header:
#define  O_CREAT 0100 // octal, 0x40
but in the QNX fcntl.h header it's defined like this:
#define  O_CREAT 0400 // octal, 0x100
0x40 != 0x100, so file creation always fails, as do other system calls.
What should I do about it? I don't want to manually redefine all the POSIX
constants...
Mar 30 2007