www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Pointer semantics in CTFE

reply Don Clugston <dac nospam.com> writes:
The current implementation of CTFE strictly enforces C pointer 
semantics. One of the restrictions is that you cannot perform ordering 
comparisons between unrelated pointers.
This is important for repeatability: if it was permitted, the results 
would be arbitrary and might vary unpredictably with subtle changes in 
the code, or change between compiler releases.

But, there's an interesting case from bug 7898: the 'inside' operation.
If p and q are pointers to the same array, then (r >= p && r <= q) is 
true if r points inside that array, and false if it does not.
This seems to be a perfectly reasonable operation: it is completely 
repeatable and safe, regardless of what r points to. But there doesn't 
seem to be any way to rewrite it to avoid the disallowed comparisons.

I could write code to allow this special case in CTFE. There's a bit of 
work to make sure that all the valid cases are detected, because there 
are quite a lot of ways to rewrite it, but it's not too terrible.

But I dunno, I don't like this sort of thing much. Feels a bit clunky.
OTOH it seems like necessary functionality, and I can't see any other 
way of doing it.

Opinions?
May 24 2012
next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 05/25/12 08:50, Don Clugston wrote:
 The current implementation of CTFE strictly enforces C pointer semantics. One
of the restrictions is that you cannot perform ordering comparisons between
unrelated pointers.
 This is important for repeatability: if it was permitted, the results would be
arbitrary and might vary unpredictably with subtle changes in the code, or
change between compiler releases.
 
 But, there's an interesting case from bug 7898: the 'inside' operation.
 If p and q are pointers to the same array, then (r >= p && r <= q) is true if
r points inside that array, and false if it does not.
 This seems to be a perfectly reasonable operation: it is completely repeatable
and safe, regardless of what r points to. But there doesn't seem to be any way
to rewrite it to avoid the disallowed comparisons.
 
 I could write code to allow this special case in CTFE. There's a bit of work
to make sure that all the valid cases are detected, because there are quite a
lot of ways to rewrite it, but it's not too terrible.
 
 But I dunno, I don't like this sort of thing much. Feels a bit clunky.
 OTOH it seems like necessary functionality, and I can't see any other way of
doing it.
 
 Opinions?
It _is_ necessary functionality. Comparing truly unrelated pointers results in undefined behavior, this is not different from normal non-ctfe code - i don't see a reason to even bother with implementing any restrictions. It's undefined, but not unsafe - so the programmer gets to deal with it. artur
May 25 2012
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 25.05.2012 10:50, Don Clugston wrote:
 The current implementation of CTFE strictly enforces C pointer
 semantics. One of the restrictions is that you cannot perform ordering
 comparisons between unrelated pointers.
 This is important for repeatability: if it was permitted, the results
 would be arbitrary and might vary unpredictably with subtle changes in
 the code, or change between compiler releases.

 But, there's an interesting case from bug 7898: the 'inside' operation.
 If p and q are pointers to the same array, then (r >= p && r <= q) is
 true if r points inside that array, and false if it does not.
Isn't indexes just enough for this use case? It's not like pointer iteration faster/better when you base, end pointer pair. -- Dmitry Olshansky
May 25 2012
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 25 May 2012 02:50:21 -0400, Don Clugston <dac nospam.com> wrote:

 The current implementation of CTFE strictly enforces C pointer  
 semantics. One of the restrictions is that you cannot perform ordering  
 comparisons between unrelated pointers.
 This is important for repeatability: if it was permitted, the results  
 would be arbitrary and might vary unpredictably with subtle changes in  
 the code, or change between compiler releases.

 But, there's an interesting case from bug 7898: the 'inside' operation.
 If p and q are pointers to the same array, then (r >= p && r <= q) is  
 true if r points inside that array, and false if it does not.
 This seems to be a perfectly reasonable operation: it is completely  
 repeatable and safe, regardless of what r points to. But there doesn't  
 seem to be any way to rewrite it to avoid the disallowed comparisons.

 I could write code to allow this special case in CTFE. There's a bit of  
 work to make sure that all the valid cases are detected, because there  
 are quite a lot of ways to rewrite it, but it's not too terrible.

 But I dunno, I don't like this sort of thing much. Feels a bit clunky.
 OTOH it seems like necessary functionality, and I can't see any other  
 way of doing it.

 Opinions?
Remove the restriction. The code is unpredictable, but not invalid. It just means you need to take more care when writing such code. -Steve
May 25 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Friday, 25 May 2012 at 14:06:55 UTC, Steven Schveighoffer 
wrote:
 Remove the restriction.  The code is unpredictable, but not 
 invalid.  It just means you need to take more care when writing 
 such code.
The question is whether we want to allow unpredictable behavior at compile time. David
May 25 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 25 May 2012 11:08:52 -0400, David Nadlinger <see klickverbot.at>  
wrote:

 On Friday, 25 May 2012 at 14:06:55 UTC, Steven Schveighoffer wrote:
 Remove the restriction.  The code is unpredictable, but not invalid.   
 It just means you need to take more care when writing such code.
The question is whether we want to allow unpredictable behavior at compile time.
Given the alternative, yes I think we do. -Steve
May 25 2012
parent deadalnix <deadalnix gmail.com> writes:
Le 25/05/2012 17:38, Steven Schveighoffer a écrit :
 On Fri, 25 May 2012 11:08:52 -0400, David Nadlinger <see klickverbot.at>
 wrote:

 On Friday, 25 May 2012 at 14:06:55 UTC, Steven Schveighoffer wrote:
 Remove the restriction. The code is unpredictable, but not invalid.
 It just means you need to take more care when writing such code.
The question is whether we want to allow unpredictable behavior at compile time.
Given the alternative, yes I think we do. -Steve
What ? We absolutely don't whant that, and teh alternative is pretty simple : using slices.
May 26 2012
prev sibling next sibling parent kenji hara <k.hara.pg gmail.com> writes:
I think repeatability of CTFE is much important.
If it is not guaranteed, debugging codes for CTFE is almost impossible.
I hate such special treatments.

And, with CTFEable copy operation, overlapping check is always same as
unrelated memory comparison.
For bug 7898, we should add __ctfe specific code instead of changing
CTFE interpreter.

Kenji Hara

2012/5/25 Don Clugston <dac nospam.com>:
 The current implementation of CTFE strictly enforces C pointer semantics.
 One of the restrictions is that you cannot perform ordering comparisons
 between unrelated pointers.
 This is important for repeatability: if it was permitted, the results would
 be arbitrary and might vary unpredictably with subtle changes in the code,
 or change between compiler releases.

 But, there's an interesting case from bug 7898: the 'inside' operation.
 If p and q are pointers to the same array, then (r >= p && r <= q) is true
 if r points inside that array, and false if it does not.
 This seems to be a perfectly reasonable operation: it is completely
 repeatable and safe, regardless of what r points to. But there doesn't seem
 to be any way to rewrite it to avoid the disallowed comparisons.

 I could write code to allow this special case in CTFE. There's a bit of work
 to make sure that all the valid cases are detected, because there are quite
 a lot of ways to rewrite it, but it's not too terrible.

 But I dunno, I don't like this sort of thing much. Feels a bit clunky.
 OTOH it seems like necessary functionality, and I can't see any other way of
 doing it.

 Opinions?
May 25 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/24/2012 11:50 PM, Don Clugston wrote:
 Opinions?
My experience with such special cases is that users lose the ability to reason about what code will work in CTFE and what will not. Its operation will become a magical, unknowable black box. Meanwhile, you'll be endlessly chasing more special cases and rainbows. I suggest not supporting it unless it can be done in the general case. One solution might be to attach to each CTFE pointer a reference to the object it is pointing into. Then, pointer comparisons within the same object can work, and comparing pointers from different objects can explicitly and predictably be not supported.
May 25 2012
parent reply Don <nospam nospam.com> writes:
On 26.05.2012 01:09, Walter Bright wrote:
 On 5/24/2012 11:50 PM, Don Clugston wrote:
 Opinions?
My experience with such special cases is that users lose the ability to reason about what code will work in CTFE and what will not. Its operation will become a magical, unknowable black box. Meanwhile, you'll be endlessly chasing more special cases and rainbows. I suggest not supporting it unless it can be done in the general case. One solution might be to attach to each CTFE pointer a reference to the object it is pointing into. Then, pointer comparisons within the same object can work, and comparing pointers from different objects can explicitly and predictably be not supported.
That is exactly how it works right now. The problem is, if pstart and pend point to the beginning and end of an array, then given another pointer q, there is AFAIK no defined way in C for checking if q is between pstart and pend, even though I'm sure everyone does it. (eg, I don't know how to implement memcpy otherwise). If q is part of that array, (q >= pstart && q <= pend) is well-defined, and returns true. But if it isn't part of that array, q >= pstart is undefined. In reality of course, if q is unrelated to the array then EITHER q < pstart, OR q > pend. So (q >= pstart && q <= pend) is ALWAYS false, and the result is completely predictable. ie, the full expression can be well-defined even though its two subexpressions are not. This can be implemented. Given an expression like q >= pstart, where p and q are unrelated, it's possible to make it have the value 'false_or_undefined'. If it is anded with another expression that is also false_or_undefined, and goes the other direction ( q <= pend) , then the whole thing is false. Otherwise, it generates an error. The reason it would have to be a special case is because of things like: bool b = false; q >= start && ( b = true, true) && q <= end which is undefined because the result of the first comparison has been stored. So it would only be valid for one-comparison-in-each-direction-anded-together (and the equivalent for ||).
May 25 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/25/2012 8:24 PM, Don wrote:
 The problem is, if pstart and pend point to the beginning and end of an array,
 then given another pointer q, there is AFAIK no defined way in C for checking
if
 q is between pstart and pend, even though I'm sure everyone does it. (eg, I
 don't know how to implement memcpy otherwise).
 If q is part of that array, (q >= pstart && q <= pend) is well-defined, and
 returns true.
 But if it isn't part of that array, q >= pstart is undefined.

 In reality of course, if q is unrelated to the array then EITHER q < pstart, OR
 q > pend. So (q >= pstart && q <= pend) is ALWAYS false, and the result is
 completely predictable.
If q is unrelated to the array, which you can tell because q, pstart, and pend must all refer to the same object that CTFE knows about, then you can have CTFE refuse to execute it.
 ie, the full expression can be well-defined even though its two subexpressions
 are not.

 This can be implemented.
 Given an expression like q >= pstart, where p and q are unrelated, it's
possible
 to make it have the value 'false_or_undefined'. If it is anded with another
 expression that is also false_or_undefined, and goes the other direction ( q <=
 pend) , then the whole thing is false. Otherwise, it generates an error.

 The reason it would have to be a special case is because of things like:
 bool b = false;
 q >= start && ( b = true, true) && q <= end
 which is undefined because the result of the first comparison has been stored.
 So it would only be valid for one-comparison-in-each-direction-anded-together
 (and the equivalent for ||).
I still don't think you need to make a special case for it. If upon initialization of q, pstart, and pend, CTFE makes a note of what memory object they are initialized from, then you can unambiguously tell if they still point within the same object or not.
May 25 2012
parent reply Don <nospam nospam.com> writes:
On 26.05.2012 05:35, Walter Bright wrote:
 On 5/25/2012 8:24 PM, Don wrote:
 The problem is, if pstart and pend point to the beginning and end of
 an array,
 then given another pointer q, there is AFAIK no defined way in C for
 checking if
 q is between pstart and pend, even though I'm sure everyone does it.
 (eg, I
 don't know how to implement memcpy otherwise).
 If q is part of that array, (q >= pstart && q <= pend) is
 well-defined, and
 returns true.
 But if it isn't part of that array, q >= pstart is undefined.

 In reality of course, if q is unrelated to the array then EITHER q <
 pstart, OR
 q > pend. So (q >= pstart && q <= pend) is ALWAYS false, and the
 result is
 completely predictable.
If q is unrelated to the array, which you can tell because q, pstart, and pend must all refer to the same object that CTFE knows about, then you can have CTFE refuse to execute it.
Yes, that's what happens now. But that doesn't help the programmer. If it is inside, no problem, the expression is true. But if it is not inside, the expression is not false -- it's a compile-time error. So you can't use it as a test for if it is inside the same object. I was confused about how memmove can work in C without relying on undefined behaviour. But I just read http://www.cplusplus.com/reference/clibrary/cstring/memcpy/ which defines it in terms of an intermediate buffer. So maybe, the current CTFE implementation is _exactly_ consistent with the C spec. If that's true, though, I find it pretty incredible that there is no way to find out if a pointers points a particular array, even if you have pointers to both the start and end of that array. (OK, I guess you can iterate from start to end, checking for equality, but .. bleah .. it's a terrible abstraction inversion).
 ie, the full expression can be well-defined even though its two
 subexpressions
 are not.

 This can be implemented.
 Given an expression like q >= pstart, where p and q are unrelated,
 it's possible
 to make it have the value 'false_or_undefined'. If it is anded with
 another
 expression that is also false_or_undefined, and goes the other
 direction ( q <=
 pend) , then the whole thing is false. Otherwise, it generates an error.

 The reason it would have to be a special case is because of things like:
 bool b = false;
 q >= start && ( b = true, true) && q <= end
 which is undefined because the result of the first comparison has been
 stored.
 So it would only be valid for
 one-comparison-in-each-direction-anded-together
 (and the equivalent for ||).
I still don't think you need to make a special case for it. If upon initialization of q, pstart, and pend, CTFE makes a note of what memory object they are initialized from, then you can unambiguously tell if they still point within the same object or not.
May 26 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/26/2012 3:59 AM, Don wrote:
 Yes, that's what happens now. But that doesn't help the programmer.

 If it is inside, no problem, the expression is true. But if it is not inside,
 the expression is not false -- it's a compile-time error.
Ok, I understand now what you meant.
 So you can't use it as a test for if it is inside the same object.

 I was confused about how memmove can work in C without relying on undefined
 behaviour. But I just read
 http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
 which defines it in terms of an intermediate buffer.

 So maybe, the current CTFE implementation is _exactly_ consistent with the C
 spec. If that's true, though, I find it pretty incredible that there is no way
 to find out if a pointers points a particular array, even if you have pointers
 to both the start and end of that array.

 (OK, I guess you can iterate from start to end, checking for equality, but ..
 bleah .. it's a terrible abstraction inversion).
You could implement it as simply comparing the addresses - you'd be no worse off than C is, and you would get the correct answer for pointers both in and out of the array without needing special cases.
May 26 2012
next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 05/27/12 02:45, Walter Bright wrote:
 On 5/26/2012 3:59 AM, Don wrote:
 Yes, that's what happens now. But that doesn't help the programmer.

 If it is inside, no problem, the expression is true. But if it is not inside,
 the expression is not false -- it's a compile-time error.
Ok, I understand now what you meant.
 So you can't use it as a test for if it is inside the same object.

 I was confused about how memmove can work in C without relying on undefined
 behaviour. But I just read
 http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
 which defines it in terms of an intermediate buffer.

 So maybe, the current CTFE implementation is _exactly_ consistent with the C
 spec. If that's true, though, I find it pretty incredible that there is no way
 to find out if a pointers points a particular array, even if you have pointers
 to both the start and end of that array.

 (OK, I guess you can iterate from start to end, checking for equality, but ..
 bleah .. it's a terrible abstraction inversion).
You could implement it as simply comparing the addresses - you'd be no worse off than C is, and you would get the correct answer for pointers both in and out of the array without needing special cases.
Note that if pointer comparison is allowed, subtraction should too. Ie if 'p1>=p2' works, then it is reasonable to expect 'p1-p2<=i' to also work. artur
May 27 2012
prev sibling parent reply Don Clugston <dac nospam.com> writes:
On 27/05/12 02:45, Walter Bright wrote:
 On 5/26/2012 3:59 AM, Don wrote:
 Yes, that's what happens now. But that doesn't help the programmer.

 If it is inside, no problem, the expression is true. But if it is not
 inside,
 the expression is not false -- it's a compile-time error.
Ok, I understand now what you meant.
 So you can't use it as a test for if it is inside the same object.

 I was confused about how memmove can work in C without relying on
 undefined
 behaviour. But I just read
 http://www.cplusplus.com/reference/clibrary/cstring/memcpy/
 which defines it in terms of an intermediate buffer.

 So maybe, the current CTFE implementation is _exactly_ consistent with
 the C
 spec. If that's true, though, I find it pretty incredible that there
 is no way
 to find out if a pointers points a particular array, even if you have
 pointers
 to both the start and end of that array.

 (OK, I guess you can iterate from start to end, checking for equality,
 but ..
 bleah .. it's a terrible abstraction inversion).
You could implement it as simply comparing the addresses - you'd be no worse off than C is, and you would get the correct answer for pointers both in and out of the array without needing special cases.
I think that's a no-go. Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down. And at runtime, you have a debugger. It's not hard to make it work in all cases of: one-sided-comparison && one-sided-comparison one-sided-comparison || one-sided-comparison one-sided-comparision: ptr_expression RELOP ptr_expression ! one-sided-comparision RELOP: > < >= <= where ptr_expression is any expression of type pointer. And by all cases, I really mean all (the code for the four pointer expressions need not having anything in common). It's only when you allow other expressions to be present, that things get hairy.
May 29 2012
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2012-05-29 13:29:35 +0000, Don Clugston <dac nospam.com> said:

 On 27/05/12 02:45, Walter Bright wrote:
 You could implement it as simply comparing the addresses - you'd be no
 worse off than C is, and you would get the correct answer for pointers
 both in and out of the array without needing special cases.
I think that's a no-go. Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down. And at runtime, you have a debugger.
Wouldn't it be possible to just catch the case where you compare two pointers not assigned from the same memory block and issue an error? Here's an idea: make each CTFE pointer some kind of struct with a pointer to the memory block and an offset. When comparing, if the memory block isn't the same, it's an error. If you add or subtract to a pointer, it'll still belong to the same memory block but with a different offset, thus it remains comparable. When dereferencing, you can make sure the pointer still points inside the block, assuming the block knows its length. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 29 2012
next sibling parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 05/29/12 16:20, Michel Fortin wrote:
 On 2012-05-29 13:29:35 +0000, Don Clugston <dac nospam.com> said:
 
 On 27/05/12 02:45, Walter Bright wrote:
 You could implement it as simply comparing the addresses - you'd be no
 worse off than C is, and you would get the correct answer for pointers
 both in and out of the array without needing special cases.
I think that's a no-go. Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down. And at runtime, you have a debugger.
Wouldn't it be possible to just catch the case where you compare two pointers not assigned from the same memory block and issue an error? Here's an idea: make each CTFE pointer some kind of struct with a pointer to the memory block and an offset. When comparing, if the memory block isn't the same, it's an error. If you add or subtract to a pointer, it'll still belong to the same memory block but with a different offset, thus it remains comparable. When dereferencing, you can make sure the pointer still points inside the block, assuming the block knows its length.
int a[1024]; int[] da = a[0..1024]; if (whatever) da = da[3..14]; if (something_else) da = [42] ~ da; // etc if (da_is_a_slice_of_a()) still_inside_a(); How do you implement da_is_a_slice_of_a()? Disallowing pointer comparisons means normal code needs __ctfe special casing in order to work at compile time at all, like it was done for the "bug" mentioned in this thread. artur
May 29 2012
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2012-05-29 15:09:00 +0000, Artur Skawina <art.08.09 gmail.com> said:

    int a[1024];
    int[] da = a[0..1024];
 
    if (whatever)
       da = da[3..14];
    if (something_else)
       da = [42] ~ da;
    // etc
 
    if (da_is_a_slice_of_a())
       still_inside_a();
 
 How do you implement da_is_a_slice_of_a()?
Indeed, for that to work you'd still need to handle this case specially. My bad for not catching that. Personally, I think it'd be much cleaner to go with some kind of magic function than trying to match the condition against a predefined pattern. Something like da.isSliceOf(a), which could do the usual pointer thing at runtime and call some sort of CTFE intrinsic at compile-time. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 29 2012
next sibling parent "Patrik Strand" <Patrik.Strand gmail.com> writes:
On Tuesday, 29 May 2012 at 17:35:12 UTC, Michel Fortin wrote:
 On 2012-05-29 15:09:00 +0000, Artur Skawina 
 <art.08.09 gmail.com> said:

   int a[1024];
   int[] da = a[0..1024];
 
   if (whatever)
      da = da[3..14];
   if (something_else)
      da = [42] ~ da;
   // etc
 
   if (da_is_a_slice_of_a())
      still_inside_a();
 
 How do you implement da_is_a_slice_of_a()?
Indeed, for that to work you'd still need to handle this case specially. My bad for not catching that. Personally, I think it'd be much cleaner to go with some kind of magic function than trying to match the condition against a predefined pattern. Something like da.isSliceOf(a), which could do the usual pointer thing at runtime and call some sort of CTFE intrinsic at compile-time.
it would be elegant to reuse 'in', but is it too error-prone? int[10] tmp; int* ptr = &tmp[4]; if(ptr in tmp) // O(1), i.e. ok. { }
May 29 2012
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 29 May 2012 13:35:12 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2012-05-29 15:09:00 +0000, Artur Skawina <art.08.09 gmail.com> said:

    int a[1024];
    int[] da = a[0..1024];
     if (whatever)
       da = da[3..14];
    if (something_else)
       da = [42] ~ da;
    // etc
     if (da_is_a_slice_of_a())
       still_inside_a();
  How do you implement da_is_a_slice_of_a()?
Indeed, for that to work you'd still need to handle this case specially. My bad for not catching that. Personally, I think it'd be much cleaner to go with some kind of magic function than trying to match the condition against a predefined pattern. Something like da.isSliceOf(a), which could do the usual pointer thing at runtime and call some sort of CTFE intrinsic at compile-time.
That doesn't help when most code does not use this today. I.e. one of the main benefits of ctfe is that you don't *have* to write special code. -Steve
May 30 2012
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2012-05-30 14:44:37 +0000, "Steven Schveighoffer" 
<schveiguy yahoo.com> said:

 On Tue, 29 May 2012 13:35:12 -0400, Michel Fortin  
 <michel.fortin michelf.com> wrote:
 
 Personally, I think it'd be much cleaner to go with some kind of magic  
 function than trying to match the condition against a predefined  
 pattern. Something like da.isSliceOf(a), which could do the usual  
 pointer thing at runtime and call some sort of CTFE intrinsic at  
 compile-time.
That doesn't help when most code does not use this today. I.e. one of the main benefits of ctfe is that you don't *have* to write special code.
But at the other end, there should be an easy to understand separation between what is supported by CTFE and what is not. Once you try to add special cases for some code patterns by adding heuristics, those heuristics now define the line and would have to become part of the language specification. More generally, the drawback of this kind of pattern recognition is that there's an infinite pool of equivalent patterns to recognize, and seemingly innocuous changes to a CTFEable function could break its CTFEablility if they happen to cross the invisible boundary. I'm just trying to point out the drawbacks of too clever heuristics. If such an heuristic is used, I think it should be limited in scope and well documented. Unfortunately, that means you'll still have to care about writing code in a way that fits the documented pattern. It'd be much easier to reason about CTFEability if the pattern had to be encapsulated in its own function. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 30 2012
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 30 May 2012 11:33:54 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2012-05-30 14:44:37 +0000, "Steven Schveighoffer"  
 <schveiguy yahoo.com> said:

 On Tue, 29 May 2012 13:35:12 -0400, Michel Fortin   
 <michel.fortin michelf.com> wrote:

 Personally, I think it'd be much cleaner to go with some kind of  
 magic  function than trying to match the condition against a  
 predefined  pattern. Something like da.isSliceOf(a), which could do  
 the usual  pointer thing at runtime and call some sort of CTFE  
 intrinsic at  compile-time.
That doesn't help when most code does not use this today. I.e. one of the main benefits of ctfe is that you don't *have* to write special code.
But at the other end, there should be an easy to understand separation between what is supported by CTFE and what is not. Once you try to add special cases for some code patterns by adding heuristics, those heuristics now define the line and would have to become part of the language specification. More generally, the drawback of this kind of pattern recognition is that there's an infinite pool of equivalent patterns to recognize, and seemingly innocuous changes to a CTFEable function could break its CTFEablility if they happen to cross the invisible boundary. I'm just trying to point out the drawbacks of too clever heuristics. If such an heuristic is used, I think it should be limited in scope and well documented. Unfortunately, that means you'll still have to care about writing code in a way that fits the documented pattern. It'd be much easier to reason about CTFEability if the pattern had to be encapsulated in its own function.
Unless of course, you simply always allow it, which is what I think we should do :) -Steve
May 30 2012
prev sibling parent Don Clugston <dac nospam.com> writes:
On 30/05/12 17:33, Michel Fortin wrote:
 On 2012-05-30 14:44:37 +0000, "Steven Schveighoffer"
 <schveiguy yahoo.com> said:

 On Tue, 29 May 2012 13:35:12 -0400, Michel Fortin
 <michel.fortin michelf.com> wrote:

 Personally, I think it'd be much cleaner to go with some kind of
 magic function than trying to match the condition against a
 predefined pattern. Something like da.isSliceOf(a), which could do
 the usual pointer thing at runtime and call some sort of CTFE
 intrinsic at compile-time.
That doesn't help when most code does not use this today. I.e. one of the main benefits of ctfe is that you don't *have* to write special code.
But at the other end, there should be an easy to understand separation between what is supported by CTFE and what is not. Once you try to add special cases for some code patterns by adding heuristics, those heuristics now define the line and would have to become part of the language specification. More generally, the drawback of this kind of pattern recognition is that there's an infinite pool of equivalent patterns to recognize, and seemingly innocuous changes to a CTFEable function could break its CTFEablility if they happen to cross the invisible boundary. I'm just trying to point out the drawbacks of too clever heuristics. If such an heuristic is used, I think it should be limited in scope and well documented. Unfortunately, that means you'll still have to care about writing code in a way that fits the documented pattern. It'd be much easier to reason about CTFEability if the pattern had to be encapsulated in its own function.
The heuristic is: one pointer comparison in each direction, connected by && or ||. No restriction on the pointer expressions, except that they must have no side-effects. That covers most legitimate cases. The ones it doesn't cover are: - doing two interleaved isInside() comparisons: p1 >= q1 && s1 >= r1 && p2 < q2 && s2 < r2 which needs to be rewritten as: (p1 >= q1 && p2 < q2) && (s1 >= r1 && s2 < r2) - saving two one-side compares, and then not using them for _anything_ except an && or || ---> this is a fundamental limitation - calling two functions which each return a one-side-compare: bool cmp(void *a, void *b) { return a<b); cmp(p1, q1) && cmp(q2, p2) ----> this one could actually be done, because it would only be legal to call such a function from an && or ||. Implementation would be complicated though.
May 31 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/29/2012 7:20 AM, Michel Fortin wrote:
 Wouldn't it be possible to just catch the case where you compare two pointers
 not assigned from the same memory block and issue an error?

 Here's an idea: make each CTFE pointer some kind of struct with a pointer to
the
 memory block and an offset. When comparing, if the memory block isn't the same,
 it's an error. If you add or subtract to a pointer, it'll still belong to the
 same memory block but with a different offset, thus it remains comparable. When
 dereferencing, you can make sure the pointer still points inside the block,
 assuming the block knows its length.
That was my suggestion a few posts back in the thread, and Don correctly demolished it.
May 29 2012
parent reply "Mehrdad" <wfunction hotmail.com> writes:
Just a general note: going the "make a special case for two 
comparisons" route won't work if, for example, someone decides to 
use a lambda for comparing pointers.
May 29 2012
parent Don Clugston <dac nospam.com> writes:
On 30/05/12 01:47, Mehrdad wrote:
 Just a general note: going the "make a special case for two comparisons"
 route won't work if, for example, someone decides to use a lambda for
 comparing pointers.
You mean effectively like: bool cmp(void *x, void *y) { return x < y: } assert ( cmp(x, y) && cmp(x, z) ); ? Yes, this is why it's a special case. I can imagine how that could be implemented in the current CTFE, I cannot see how it could be done reasonably with a JIT CTFE implementation. You'd have to have a special lamba for "pointer inside a range" rather than simple pointer comparisons. I'm suggesting that "is pointer inside a range" is a different primitive operation from "comparison of two pointers to the same memory block". Even though C code typically uses the latter to implement the former, it's relying on undefined behaviour.
May 30 2012
prev sibling parent Don Clugston <dac nospam.com> writes:
On 29/05/12 16:20, Michel Fortin wrote:
 On 2012-05-29 13:29:35 +0000, Don Clugston <dac nospam.com> said:

 On 27/05/12 02:45, Walter Bright wrote:
 You could implement it as simply comparing the addresses - you'd be no
 worse off than C is, and you would get the correct answer for pointers
 both in and out of the array without needing special cases.
I think that's a no-go. Implementation-specific behaviour at runtime is bad enough, but at compile time, it's truly horrible. Consider that any change to unrelated code can change the results. Something that makes it really terrible is that the same function can be called in CTFE once before inlining, and once after. Good luck tracking that down. And at runtime, you have a debugger.
Wouldn't it be possible to just catch the case where you compare two pointers not assigned from the same memory block and issue an error? Here's an idea: make each CTFE pointer some kind of struct with a pointer to the memory block and an offset. When comparing, if the memory block isn't the same, it's an error. If you add or subtract to a pointer, it'll still belong to the same memory block but with a different offset, thus it remains comparable. When dereferencing, you can make sure the pointer still points inside the block, assuming the block knows its length.
Thats _exactly_ what it does right now.
May 30 2012
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/29/2012 6:29 AM, Don Clugston wrote:
 I think that's a no-go.
 Implementation-specific behaviour at runtime is bad enough, but at compile
time,
 it's truly horrible. Consider that any change to unrelated code can change the
 results. Something that makes it really terrible is that the same function can
 be called in CTFE once before inlining, and once after. Good luck tracking that
 down.
 And at runtime, you have a debugger.
True, but when code relies on undefined behavior, behavior may also differ based on compiler settings, link order, compilation order, etc. Also, note that CTFE can compute different floating point results at compile time than at runtime (because CTFE uses 80 bits for everything, even floats). The solution is to eliminate all undefined behavior, but we can't do that and still have a systems programming language.
May 29 2012
prev sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Friday, 25 May 2012 at 06:50:22 UTC, Don Clugston wrote:
 This is important for repeatability: if it was permitted, the 
 results would be arbitrary and might vary unpredictably with 
 subtle changes in the code, or change between compiler releases.
Wait, what? Pointer semantics are well-defined... what's wrong with comparing two unrelated pointers?
May 25 2012
parent Don <nospam nospam.com> writes:
On 26.05.2012 03:39, Mehrdad wrote:
 On Friday, 25 May 2012 at 06:50:22 UTC, Don Clugston wrote:
 This is important for repeatability: if it was permitted, the results
 would be arbitrary and might vary unpredictably with subtle changes in
 the code, or change between compiler releases.
Wait, what? Pointer semantics are well-defined... what's wrong with comparing two unrelated pointers?
Actually it isn't well-defined in C. You can compare unrelated pointers for equality, but can't compare their ordering. (I think this is because of the segmented architecture in 16 bit DOS).
May 25 2012