digitalmars.D.learn - ctfe bug?

Johannes Pfau <spam example.com> writes:
Hi,
the following code is reduced from a parser generated with Ragel 
(http://www.complang.org/ragel/). That's also the reason why it's
using pointers instead of array access, but Ragel guarantees that there 
won't be any out-of-bound reads.

AFAIK pointers are supported in CTFE now as long as they're pointing to an 
array and there are no out-of-bounds reads. Still, the following code fails:

--------------------
import std.conv : parse;

ubyte[4] testCTFE()
{
    ubyte[4] data;
    string input = "8ab3060e2cba4f23b74cb52db3bdfb46";
    auto p = input.ptr;
    p++; p++;
    data[0] = parse!ubyte((p-2)[0 .. 2], 16);
    p++; p++;
    data[1] = parse!ubyte((p-2)[0 .. 2], 16);
    p++; p++;
    data[2] = parse!ubyte((p-2)[0 .. 2], 16);
    p++; p++;
    data[3] = parse!ubyte((p-2)[0 .. 2], 16);
    p++; p++;
    return data;
}
enum ctfe = testCTFE();

void main()
{
	import std.stdio;
	writeln(testCTFE()); //[138, 179, 6, 14]
	writeln(ctfe); //[138, 138, 138, 138]
}
--------------------

Has this bug already been filed? I could possibly circumvent it by making 
ragel use array indexing instead of pointers, but that'd be a performance 
issue for runtime code as well.
Dec 21 2011
Johannes Pfau <spam example.com> writes:
Johannes Pfau wrote:
 
 Has this bug already been filed? I could possibly circumvent it by making
 ragel use array indexing instead of pointers, but that'd be a performance
 issue for runtime code as well.

OK, I found a workaround: If I use
----------------
data[x] = parse!ubyte(input[p-input.ptr-2 .. p-input.ptr], 16);
----------------
instead, it works. So the issue is related to pointer slicing in CTFE.
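
For context, a minimal sketch of how that workaround might look as a complete function. The name `parseFirstFour` and the loop are assumptions made for illustration and are not part of the Ragel-generated code; the only idea taken from the post is slicing `input` by index instead of slicing the pointer. The named `pair` variable is there because `std.conv.parse` takes its source by `ref`.

```d
import std.conv : parse;

// Hypothetical helper (not from the original post): parses the first four
// hex byte pairs of `input`, advancing a pointer the way generated code
// would, but slicing the original array rather than the pointer.
ubyte[4] parseFirstFour(string input)
{
    ubyte[4] data;
    auto p = input.ptr;
    foreach (i; 0 .. 4)
    {
        p += 2;
        // Convert the pointer position back into an index into `input`.
        immutable end = cast(size_t)(p - input.ptr);
        // parse takes its source by ref, so the slice gets a named variable.
        auto pair = input[end - 2 .. end];
        data[i] = parse!ubyte(pair, 16);
    }
    return data;
}
```

If the finding above holds for your compiler version, the same function could also be evaluated at compile time through an `enum` initializer, as in the original test case.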
Dec 21 2011
Jacob Carlborg <doob me.com> writes:
On 2011-12-22 08:47, Johannes Pfau wrote:
 Has this bug already been filed? I could possibly circumvent it by making
 ragel use array indexing instead of pointers, but that'd be a performance
 issue for runtime code as well.

Why would arrays be slower than pointers? You do know that you can turn off array bounds checking?

--
/Jacob Carlborg
Dec 22 2011
Johannes Pfau <spam example.com> writes:
Jacob Carlborg wrote:

 On 2011-12-22 08:47, Johannes Pfau wrote:
 Has this bug already been filed? I could possibly circumvent it by making
 ragel use array indexing instead of pointers, but that'd be a performance
 issue for runtime code as well.

 Why would arrays be slower than pointers? You do know that you can turn off array bounds checking?

Don't know, but I remember some benchmarks showed that arrays were slower, even with bounds-checking off. (I think that was brought up in some discussion about the tango xml parser). Also the default for ragel is to use pointers, so I'd like to use that. Making it use arrays means extra work ;-) And turning off bounds-checking is not a perfect solution, as it applies to the complete module. As I said, ragel makes sure that the pointer access is safe, so there's really no issue in using pointers.
Dec 22 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 12/22/2011 10:28 AM, Jacob Carlborg wrote:
 On 2011-12-22 08:47, Johannes Pfau wrote:
 Has this bug already been filed? I could possibly circumvent it by making
 ragel use array indexing instead of pointers, but that'd be a performance
 issue for runtime code as well.

 Why would arrays be slower than pointers? You do know that you can turn off array bounds checking?

Yes but the length has to be stored and updated, therefore for example p++ is less machine instructions/memory accesses/register pressure than arr = arr[1..$].
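
To make the comparison concrete, here is a small sketch of the two traversal styles. The function names are hypothetical and no timing is claimed; it only shows that the slice version rewrites both the pointer and the length on every step, while the pointer version updates a single value. Whether an optimizer removes the difference is not measured here.

```d
// Pointer traversal: only the pointer itself changes per step.
uint sumViaPointer(const(ubyte)[] arr)
{
    uint sum = 0;
    auto p = arr.ptr;
    const end = p + arr.length; // one-past-the-end pointer, computed once
    while (p != end)
    {
        sum += *p;
        p++;
    }
    return sum;
}

// Slice traversal: each step stores a new ptr and a new length.
uint sumViaSlice(const(ubyte)[] arr)
{
    uint sum = 0;
    while (arr.length)
    {
        sum += arr[0];
        arr = arr[1 .. $];
    }
    return sum;
}
```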
Dec 22 2011
Jacob Carlborg <doob me.com> writes:
On 2011-12-22 14:39, Timon Gehr wrote:
 On 12/22/2011 10:28 AM, Jacob Carlborg wrote:
 On 2011-12-22 08:47, Johannes Pfau wrote:
 Has this bug already been filed? I could possibly circumvent it by
 making
 ragel use array indexing instead of pointers, but that'd be a
 performance
 issue for runtime code as well.

 Why would arrays be slower than pointers? You do know that you can turn off array bounds checking?

 Yes but the length has to be stored and updated, therefore for example p++ is less machine instructions/memory accesses/register pressure than arr = arr[1..$].

Ok, I see. Then this seems to be a very performance critical piece of code.

--
/Jacob Carlborg
Dec 22 2011
Johannes Pfau <spam example.com> writes:
Jacob Carlborg wrote:

 On 2011-12-22 14:39, Timon Gehr wrote:
 On 12/22/2011 10:28 AM, Jacob Carlborg wrote:
 On 2011-12-22 08:47, Johannes Pfau wrote:
 Has this bug already been filed? I could possibly circumvent it by
 making
 ragel use array indexing instead of pointers, but that'd be a
 performance
 issue for runtime code as well.

 Why would arrays be slower than pointers? You do know that you can turn off array bounds checking?

 Yes but the length has to be stored and updated, therefore for example p++ is less machine instructions/memory accesses/register pressure than arr = arr[1..$].

 Ok, I see. Then this seems to be a very performance critical piece of code.

Ragel is also used for HTTP parsers in web servers (lighttpd2), JSON parsers, etc., and its main advantage is speed.
Dec 22 2011