
digitalmars.D.bugs - [Issue 10763] New: (&x)[0 .. 1] doesn't work in CTFE

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763

           Summary: (&x)[0 .. 1] doesn't work in CTFE
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Keywords: CTFE
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: nilsbossung googlemail.com



---
static assert({
    int x;
    int[] a = (&x)[0 .. 1];

    return true;
}());
---

Error: pointer & x cannot be sliced at compile time (it does not point to an
array)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 05 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763




This restriction is intentional. It's a consequence of strictly enforcing C's
pointer arithmetic rules.
You can only slice a pointer that you can perform pointer arithmetic on.

Where x is a variable, C does not guarantee that &x + 1 is a legal address.
(For example, it might be 0, if x is at the end of the address space).

(Enforcing C's pointer arithmetic enormously simplifies the implementation.
Allowing this would create a huge number of special cases).
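
For reference, a minimal sketch of the array-backed form that satisfies the rule above. It assumes that CTFE accepts slicing a pointer to an element of a static array, as the wording of the error message suggests. The names sliceViaArray and storage are illustrative only and do not come from the report.

```d
// Hypothetical workaround sketch: back the value with a one-element array
// so that the pointer being sliced points into an array.
bool sliceViaArray()
{
    int[1] storage;          // stands in for the bare `int x` of the report
    int* p = &storage[0];    // pointer to an array element
    int[] a = p[0 .. 1];     // slicing a pointer that points into an array
    a[0] = 7;                // writes through the slice reach the array
    return storage[0] == 7;
}

// Forces the function to be evaluated at compile time.
static assert(sliceViaArray());

void main() {}
```

The only change from the reported code is the declaration: a one-element array replaces the scalar, so the slice bounds 0 and 1 both lie within, or one past the end of, an array.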

Aug 12 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763


timon.gehr gmx.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timon.gehr gmx.ch



What kind of special cases? (The above code works in my own implementation.)

Aug 12 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763




It's basically the same as issue 10266.
The corner cases arise if you still disallow &x + 1. My guess is that you're
allowing it in your implementation?

The problem with allowing it is that we're departing from C. And there's
annoying things like:

// global scope
int x;
int *p = &x + 1; // points to junk! - must not compile


Is there really a use case for this unsafe behaviour?

Aug 12 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763





> It's basically the same as issue 10266.

Issue 10266 additionally requests allowing reinterpret-casts between T* and T[1]* (my implementation currently rejects this, but allowing it would be easy).

> The corner cases arise if you still disallow &x + 1. My guess is that you're
> allowing it in your implementation?
> ...

Yes, but dereferencing it is an error. Subtracting one results in the address of x.

> The problem with allowing it is that we're departing from C.

Does C actually disallow adding 0 to a pointer to a local variable? That's what the example is doing. Furthermore, I don't see what the restriction buys in terms of implementation effort. Every program can be rewritten to only contain arrays.

> And there's annoying things like:
>
> // global scope
> int x;
> int *p = &x + 1; // points to junk! - must not compile

Agreed, but I think this is not closely related. DMD already allows creating invalid addresses in CTFE by other means.

> Is there really a use case for this unsafe behaviour?

Make more code CTFE-able.

Aug 12 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763






>> It's basically the same as issue 10266.

> Issue 10266 additionally requests allowing reinterpret-casts between T* and T[1]* (my implementation currently rejects this, but allowing it would be easy).

>> The corner cases arise if you still disallow &x + 1. My guess is that you're
>> allowing it in your implementation?
>> ...

> Yes, but dereferencing it is an error. Subtracting one results in the address of x.

That is not the issue. The problem is that in C, simply creating the pointer is undefined behaviour. No dereferencing is involved. Note that it is undefined behaviour, it's not even implementation-specific behaviour! Simply storing an invalid pointer into a pointer register may generate a hardware exception on some systems. In C, you are not permitted to do pointer arithmetic unless the pointer points to an array, or one-past-the-end-of-an-array.

>> The problem with allowing it is that we're departing from C.

> Does C actually disallow adding 0 to a pointer to a local variable? That's what the example is doing.

I'm not sure if that's legal or not. I suspect not, though I think it would always work in practice. But adding 1 to a pointer to a local variable is definitely illegal, and there are systems where it will not work. So the end of the slice is problematic.

>> Is there really a use case for this unsafe behaviour?

> Make more code CTFE-able.

But it's undefined behaviour.

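
A small sketch of the array-only pointer arithmetic described above, as it applies inside CTFE. It assumes the interpreter allows a pointer into an array to be moved anywhere from the first element to one past the last, and rejects a dereference of the one-past-the-end position. The function name withinArray is made up for this illustration.

```d
// Illustrative only: pointer arithmetic that stays within an array
// (or one past its end) is the form that the strict rules permit.
bool withinArray()
{
    int[3] a = [1, 2, 3];
    int* first = &a[0];
    int* end = first + 3;    // one past the end: may be formed, not dereferenced
    int* last = end - 1;     // back inside the array
    return *last == 3 && *first == 1;
}

// Evaluated at compile time.
static assert(withinArray());

void main() {}
```
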
Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763







>>> ...
>>> The corner cases arise if you still disallow &x + 1. My guess is that you're
>>> allowing it in your implementation?
>>> ...

>> Yes, but dereferencing it is an error. Subtracting one results in the address of x.

> That is not the issue. The problem is that in C, simply creating the pointer is undefined behaviour.

I guess I'll update my implementation eventually to disallow this. (Other related limitations are that it currently allows escaping addresses to locals and simply closes over them, array appends may cause non-determinism and pointers can be freely compared.)

>>> ...
>>> Is there really a use case for this unsafe behaviour?

>> Make more code CTFE-able.

> But it's undefined behaviour.

There is not really a reason why (&x)[0..1] should be UB. But I guess if you want to keep C behaviour and also keep the invariant that slices always point to arrays, this is indeed not fixable.

Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763


Iain Buclaw <ibuclaw ubuntu.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuclaw ubuntu.com




> It's basically the same as issue 10266.
> The corner cases arise if you still disallow &x + 1. My guess is that you're
> allowing it in your implementation?
>
> The problem with allowing it is that we're departing from C. And there's
> annoying things like:
>
> // global scope
> int x;
> int *p = &x + 1; // points to junk! - must not compile
>
> Is there really a use case for this unsafe behaviour?

Only one would be in std.math if we want to make the elementary functions CTFE-able (we've discussed this before).

But yes, I think that it is right to disallow it, as there is no clean way to slice up basic types into an array and guarantee ie: format or endian correctness at compile time (cross-compilers, for instance).

Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763






>> It's basically the same as issue 10266.
>> The corner cases arise if you still disallow &x + 1. My guess is that you're
>> allowing it in your implementation?
>>
>> The problem with allowing it is that we're departing from C. And there's
>> annoying things like:
>>
>> // global scope
>> int x;
>> int *p = &x + 1; // points to junk! - must not compile
>>
>> Is there really a use case for this unsafe behaviour?

> Only one would be in std.math if we want to make the elementary functions CTFE-able (we've discussed this before).

That's why my proposed solution for that is to allow only the complete expression, where the pointer is instantly dereferenced:

(cast(ulong *)cast(void *)&f)[0];

and it really only needs to be allowed for 80-bit reals, since casting float<->int and double<->long is already supported.

The minimal operations are:
- significand <-> ulong
- sign + exponent <-> ushort

That would give us four special-case hacks which are x87 specific. Effectively they are intrinsics with ugly syntax.

The existing code could be modified slightly to only use those four operations, with no performance penalty.
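
As an illustration of the float<->int and double<->long painting that is described above as already supported, here is a minimal sketch. It assumes the pointer is dereferenced in the same expression that casts it. The helper names floatBits and doubleBits are invented for this example.

```d
// Illustrative helpers: reinterpret the bits of a floating-point value as an
// integer of the same size, with the cast pointer dereferenced immediately.
int floatBits(float f)
{
    return *cast(int*)&f;
}

long doubleBits(double d)
{
    return *cast(long*)&d;
}

// IEEE 754 encodings of 1.0, checked at compile time.
static assert(floatBits(1.0f) == 0x3F80_0000);
static assert(doubleBits(1.0) == 0x3FF0_0000_0000_0000);

void main() {}
```
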

> But yes, I think that it is right to disallow it, as there is no clean way to
> slice up basic types into an array and guarantee ie: format or endian
> correctness at compile time (cross-compilers, for instance).

It's an ugly area.

Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763







>>> It's basically the same as issue 10266.
>>> The corner cases arise if you still disallow &x + 1. My guess is that you're
>>> allowing it in your implementation?
>>>
>>> The problem with allowing it is that we're departing from C. And there's
>>> annoying things like:
>>>
>>> // global scope
>>> int x;
>>> int *p = &x + 1; // points to junk! - must not compile
>>>
>>> Is there really a use case for this unsafe behaviour?

>> Only one would be in std.math if we want to make the elementary functions CTFE-able (we've discussed this before).

> That's why my proposed solution for that is to allow only the complete expression, where the pointer is instantly dereferenced: (cast(ulong *)cast(void *)&f)[0]; and it really only needs to be allowed for 80-bit reals, since casting float<->int and double<->long is already supported.

And (speaking as someone who stubbed out your implementation of float<->int and double<->long cast) the only reason why it's supported is because the backend I implement against can (thankfully) do re-interpreted native casts between basic types such as integer, float, complex and vectors.

You will need to support all reals that have support in std.math. This includes 64-bit, 80-bit, 96-bit (really just 80-bit), 128-bit (likewise), and 128-bit (quadruple). There are only three supported formats really... (double-double will have to keep with partial support for the time being, sorry PPC!)

> The minimal operations are:
> - significand  <-> ulong
> - sign  + exponent <-> ushort
>
> That would give us four special-case hacks which are x87 specific. Effectively
> they are intrinsics with ugly syntax.

I veto any new addition that is x87 specific - or, more accurately endian specific.

Remember it's:

version(BigEndian)
  short sign_exp = (cast(ushort*)&x)[0];
else
  short sign_exp = (cast(ushort*)&x)[5];

Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763




---

 
> I veto any new addition that is x87 specific - or, more accurately endian
> specific.
>
> Remember its:
>
> version(BigEndian)
>   short sign_exp = (cast(ushort*)&x)[0];
> else
>   short sign_exp = (cast(ushort*)&x)[5];

Wrote a quick toy routine to paint real->ushort[real.sizeof/2] (based on backend routine that interprets a value as a vector).

--- pseudo code ---
Expression* e = RealExp(42.0L);
size_t len = native_encode_expr(e, buffer);

(gdb) p buffer
"\000\000\000\000\000\000\000\250\004@\000\000\000\000\000\000`\365f\001\000\000\000\000\300\341\377\377\377\177\000\000\000\000\000\000\000\000\000\000\023\340Z\000\000\000\000\000`As\001\000\000\000\000\006\000\000\000\000\000\000"

tree cst = native_interpret_array (TypeSArray(ushort, 8), buffer, len);

(gdb) p debug_tree(cst)
{[0]=0, [1]=0, [2]=0, [3]=43008, [4]=16388, [5]=0, [6]=0, [7]=0}
---

OK, let's check this output against run-time results.

---
writeln(*cast(ushort[8]*)(&x));
=> [0, 0, 0, 43008, 16388, 0, 32672, 0]
---

At first glance it looks like the real->ushort[real.sizeof/2] conversion isn't correct... up until the point you realise that the '32672' value is just garbage in padding.

So... this might be very well doable, but will have to be *extremely* careful about it. Also, I'm assuming that CTFE is able to get values from constant static arrays?

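
The run-time check above can be sketched in a form that ignores the padding. This is a hedged example only: it assumes an x87 80-bit real on a little-endian target (guarded by real.mant_dig == 64 and version(LittleEndian)) and does nothing elsewhere. The helper name meaningfulWords is hypothetical.

```d
import std.stdio : writeln;

static if (real.mant_dig == 64)
{
    // Hypothetical helper: returns the five 16-bit words that carry data in
    // an x87 80-bit real; anything after them is padding and may hold garbage.
    ushort[5] meaningfulWords(real x)
    {
        enum words = real.sizeof / 2;
        ushort[words] raw = *cast(ushort[words]*)&x;
        ushort[5] result = raw[0 .. 5];
        return result;
    }
}

void main()
{
    version (LittleEndian)
    {
        static if (real.mant_dig == 64)
        {
            // Same leading values as the dumps above, with the padding dropped.
            ushort[5] expected = [0, 0, 0, 43008, 16388];
            assert(meaningfulWords(42.0L) == expected);
            writeln(meaningfulWords(42.0L));
        }
    }
}
```

Comparing only the first five words sidesteps the mismatch between the compile-time and run-time dumps, since the two differ only in the padding.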
Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763




---

 
> So... this might be very well doable, but will have to be *extremely* careful
> about it.  Also, I'm assuming that CTFE is able to get values from constant
> static arrays?

Adapted code so that it does the following:

real <-> ushort[8]: RealExp <-> VectorExp(ushort[8]) <-> ArrayLiteralExp(ushort[8])

Result?

---
ushort[8] foo(real x)
{
    return *cast(ushort[8]*)(&x);
}

real bar(ushort[8] x)
{
    return *cast(real*)(&x);
}

pragma(msg, foo(42.0L));
pragma(msg, bar(foo(42.0L)));
static assert(foo(42.0L) == [0,0,0,43008,16388,0,0,0]);
static assert(bar(foo(42.0L)) == 42.0L);
pragma(msg, "Success!");
---

$ gdc -c paint.d
[cast(ushort)0u, cast(ushort)0u, cast(ushort)0u, cast(ushort)43008u, cast(ushort)16388u, cast(ushort)0u, cast(ushort)0u, cast(ushort)0u]
4.2e+1
Success!

Only downside is that it is restricted to T[x].sizeof == real.sizeof. So real<->ulong[2] only works with 128bit reals on 64bit, but could look into getting around that later...

Don, I think I'm ready to test trial this in GDC if you are willing to implement this in DMD?

Regards
Iain.

Aug 19 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10763




---

 
> Don, I think I'm ready to test trial this in GDC if you are willing to
> implement this in DMD?

Added support in GDC (but no front-end support) in case you want to go down this route.

https://github.com/D-Programming-GDC/GDC/commit/262a5bd22754e0fa8176c1cef523bde33d1559df

Oct 10 2013