
digitalmars.D.learn - What's wrong with just a runtime-checked const?

reply Reiner Pope <reiner.pope gmail.com> writes:
Disclaimer: my background is a small amount of Java and C#, so perhaps I 
  am completely confusing the issue because of practically no C++ 
experience. Please tell me if that's the case.

Walter seems to have two main objections to a C++-style const:
1. it can be subverted with a const_cast, removing any guarantees about it.
2. it is hard to implement, but this doesn't seem to concern him as much 
as the former.

People have expressed a desire for const especially to ensure the 
Copy-on-Write idiom, and to make libraries more robust.

Wouldn't a runtime const check be much more flexible than a compile-time 
check? Const-safeness is fundamentally a correctness-checking feature, 
just like unit tests, so why not make it operate exactly like unit 
tests? I'm thinking of something like array bounds checking:

 If an index is out of bounds, an ArrayBoundsError exception is raised if
detected at runtime, and an error if detected at compile time. A program may
not rely on array bounds checking happening

and also:
 Implementation Note: Compilers should attempt to detect array bounds errors at
compile time, for example:

...
 Insertion of array bounds checking code at runtime should be turned on and off
with a compile time switch.

This would work in const this way: everything with pointer semantics (classes, arrays, actual pointers) has a hidden (*) field called isConst. This field is stored with the pointer, not the data, so it is always passed by value, just like arrays. The compiler inserts the following code into the precondition of every function that could cause a mutation:

  assert(!isConst, "Mutation of a const variable detected");
  /* Of course, since isConst is hidden, only the compiler has access to it, but you get the idea */

This then behaves like a unit test, so it is removed for a release version, avoiding the slow-down and extra memory required. And since it is a field, not a type modifier, you don't need to go through your code and insert a whole lot of consts everywhere, because the compiler doesn't need to PROVE it is const-correct -- that's what unit-testing helps with.

Let's give some examples:

  class Foo {
    private char[] bar;
    const char[] getCoWBar() // The const tells the compiler to set the isConst field in the pointer to true
    {
      return bar;
    }
    char[] getNonCoWBar()
    {
      return bar;
    }
  }
  ...
  Foo foo;
  char[] a = foo.getCoWBar(); // OK, even though getCoWBar is const and a isn't
  a[0] = 'a'; // Error -- compile-time if the compiler is smart enough, otherwise runtime
  a = a.dup; // OK
  a[0] = 'a'; // OK, because we duplicate it.
  a = foo.getNonCoWBar(); // OK
  a[0] = 'a'; // OK -- it isn't const

In this, the only way to get a non-const pointer from a const pointer is by duplicating it. This makes data that can never change as simple as never having a non-const pointer to it:

  const char[] foo = "bar"; // OK
  char[] a = foo; // OK
  a[0] = 'a'; // Error

The even better thing about this is that most code doesn't need to have const-correctness in mind when writing it, and it shouldn't break existing code, because the only code that will break is code that is buggy anyway.

Am I completely missing the point?
Will it cause memory/speed issues (keeping in mind that it's only for debug builds)?

* The field is hidden because it can't be safely used by the code as it is left out in a release build.
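
The proposal above can be approximated in library code, which may make the intended behaviour easier to discuss. This is only a sketch in present-day D: the `Checked` wrapper and its visible `isConst` field are hypothetical stand-ins for the hidden, compiler-managed tag, and the check is an ordinary `assert`, so it disappears in a `-release` build just as the post intends.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

// Hypothetical stand-in for the hidden tag: the flag travels with the
// reference (by value), not with the data it refers to.
struct Checked
{
    private char[] data;
    private bool isConst;

    // Every mutating operation checks the tag first.
    void opIndexAssign(char c, size_t i)
    {
        assert(!isConst, "Mutation of a const variable detected");
        data[i] = c;
    }

    char opIndex(size_t i) const { return data[i]; }

    // Duplicating is the only way to get a writable reference from a const one.
    Checked dup() const { return Checked(data.dup, false); }
}

class Foo
{
    private char[] bar;
    this(string s) { bar = s.dup; }

    Checked getCoWBar()    { return Checked(bar, true);  } // tagged read-only
    Checked getNonCoWBar() { return Checked(bar, false); } // untagged, writable
}

void main()
{
    auto foo = new Foo("bar");

    auto a = foo.getCoWBar();
    assertThrown!AssertError(a[0] = 'a'); // tagged const: the check fires
    a = a.dup;
    a[0] = 'a';                           // fine: a now refers to a private copy

    auto b = foo.getNonCoWBar();
    b[0] = 'c';                           // fine: this reference is not tagged
    assert(foo.getCoWBar()[0] == 'c');    // the write reached foo's own data
}
```

Because the tag lives in the wrapper rather than in a real pointer, this sketch sidesteps the layout questions raised later in the thread; it only illustrates the checking rule.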
Jul 15 2006
next sibling parent "Andrei Khropov" <andkhropov nospam_mtu-net.ru> writes:
Reiner Pope wrote:

 Disclaimer: my background is a small amount of Java and C#, so perhaps I   am
 completely confusing the issue because of practically no C++ experience.
 Please tell me if that's the case.
 
 Walter seems to have two main objections to a C++-style const:
 1. it can be subverted with a const_cast, removing any guarantees about it.

Different C++ style casts are not supported in D so it's not an issue, I think.
 2. it is hard to implement, but this doesn't seem to concern him as much as
 the former.
 
 People have expressed a desire for const especially to ensure the
 Copy-on-Write idiom, and to make libraries more robust.
 
 Wouldn't a runtime const check be much more flexible than a compile-time
 check? Const-safeness is fundamentally a correctness-checking feature, just
 like unit tests, so why not make it operate exactly like unit tests?

I think things that can be checked at compile time should be checked at compile time, so that they don't slow down runtime code. Your proposed type of check also has a kind of dynamic flavor (Smalltalk/Python-like), which is not really the D way :-)

Unit tests are different. They are rather "post-compile" or "pre-runtime" in fact. I suppose you actually mean DbC, right?
 The even better thing about this is that most code doesn't need to have
 const-correctness in mind when writing it

I think that const-correct code is important: it expresses good design and also aids the compiler in optimization.

-------------------------------------------------------------------------------

Anyway, I think the const discussion should be postponed until we get all the things with imports right, which is the highest priority for now, IMHO.
Jul 15 2006
prev sibling parent reply xs0 <xs0 xs0.com> writes:
Reiner Pope wrote:
 
 Wouldn't a runtime const check be much more flexible than a compile-time 
 check? Const-safeness is fundamentally a correctness-checking feature, 
 just like unit tests, so why not make it operate exactly like unit 
 tests? I'm thinking of something like array bounds checking:
 [snip]
 The even better thing about this is that most code doesn't need to have 
 const-correctness in mind when writing it, and it shouldn't break 
 existing code, because the only code that will break is code that is 
 buggy code anyway.
 
 Am I completely missing the point?
 Will it cause memory/speed issues (keeping in mind that it's only for 
 debug builds)?

Well, I don't think you completely missed the point, but doing it would cause all sorts of issues:

 - where should the tag be placed? you can't put it inside the pointer, as there are no free bits; you also can't put it next to a pointer, as it would affect memory layout of structures (in particular, it would make debug-built and release-built code non-interoperable).
 - it can still be trivially subverted - just cast to int/long and back
 - you can't just check at the beginning of a function
 - you can get the pointer in the middle of it; you can also get the pointer in _another_ function (from a global or in a multi-threaded program); checking at every access would be too expensive, I think, even for a debug build

xs0
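
Two of the objections above, the layout change and the cast round trip, can be made concrete with a small sketch. The `TaggedPtr` and `PlainPtr` structs are hypothetical layouts invented for illustration; nothing here is how a compiler would actually store such a tag.

```d
// Hypothetical debug-build layout: the tag is stored next to the pointer...
struct TaggedPtr { char* ptr; bool isConst; }
// ...and the release-build layout, which omits it.
struct PlainPtr { char* ptr; }

// The two builds would disagree about the size of every such field, so
// structures could not be shared between debug and release object code.
static assert(TaggedPtr.sizeof > PlainPtr.sizeof);

void main()
{
    char[] buf = "bar".dup;
    auto t = TaggedPtr(buf.ptr, true);

    // Round-tripping through an integer copies only the address;
    // the tag is left behind.
    size_t raw = cast(size_t) t.ptr;
    char* untagged = cast(char*) raw;
    *untagged = 'c';             // no check can fire here
    assert(buf == "car");
}
```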
Jul 16 2006
next sibling parent reply Reiner Pope <reiner.pope gmail.com> writes:
xs0 wrote:
 Reiner Pope wrote:
 Wouldn't a runtime const check be much more flexible than a 
 compile-time check? Const-safeness is fundamentally a 
 correctness-checking feature, just like unit tests, so why not make it 
 operate exactly like unit tests? I'm thinking of something like array 
 bounds checking:
 [snip]
 The even better thing about this is that most code doesn't need to 
 have const-correctness in mind when writing it, and it shouldn't break 
 existing code, because the only code that will break is code that is 
 buggy code anyway.

 Am I completely missing the point?
 Will it cause memory/speed issues (keeping in mind that it's only for 
 debug builds)?

Well, I don't think you completely missed the point, but doing it would cause all sorts of issues:

 - where should the tag be placed? you can't put it inside the pointer, 
 as there are no free bits; you also can't put it next to a pointer, as 
 it would affect memory layout of structures (in particular, it would 
 make debug-built and release-built code non-interoperable).

[...] be the main issue. I did indeed mean that it would be stored next to the pointer, but it does seem like that could cause problems with interoperability. If all the info could be stored separately in a bit-array managed by, say, the GC, then perhaps that would avoid the issue of interoperability. But I haven't quite worked out how that would be implemented yet...
 - it can still be trivially subverted - just cast to int/long and back

[...] the bit was stored *in* the pointer. So that's a null issue, since it clearly isn't.
 - you can't just check at the beginning of a function

[...] down the details more specifically, I realised that the advantage of runtime const checking is a *huge* increase in flexibility, and having to say at compile-time whether a function is const or not removes that flexibility in my mind.
 - you can get the pointer in the middle of it; you can also get the
 pointer in _another_ function (from a global or in a multi-threaded
 program); checking at every access would be too expensive, I think,
 even for a debug build

I don't know how expensive it would be. Considering that all it is is assert(!isConst); that operation doesn't seem too hard. And of course, the compiler is free to optimize away duplicate versions of it, like this:

  _member = 4;
  _member = 3;

Obviously there's no need to check isConst before the second assignment. And on the issue of the pointer suddenly changing in the middle of the function: well, it's actually not an issue, because I worked out that it's best for the function to have a local copy of the isConst variable, which means no other resources can change it (other than a malicious hacker with pointers). This is also conceptually right, because it doesn't make sense to give a function a certain level of modification access and then change it while it is running.

However, I think now that checking at every access is the only flexible way to manage const.

In my view, we either need runtime-checked const or nothing. Consider this, and tell me how we can avoid excess dup calls if we use C++-style const-checking:

  class foo {
    private char[] _name;
    public const char[] getName()
    {
      return _name;
    }
    ...
  }
  ...
  import std.string;
  foo f;
  /*const*/ char[] name = f.getName();
  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure
  name_m = name_m.dup();

With runtime const checking, we could fix that, though, by also getting a bool variable back from tolower() which says whether it returned the original string or a copy. Quite simply, compile-time const won't let you do that, so you will actually produce *slower* code.

If you're interested, I'm trying to come up with a draft which has the actual details, just so it gives a more solid ground for discussion. My problem at the moment is just what we mentioned earlier: interoperability between debug and release builds. I suggested that each function should be passed, behind the scenes, the relevant isConst variable, but that's not compatible with release builds.

I then thought that you could have two copies of a function: one for const variables and one for non-const variables. The correct one would be chosen at runtime, and external packages would always choose the one for non-const variables (which is a copy of the one for const variables, but with checking turned off). This effectively doubles the amount of machine code generated by the compiler, but I am convinced that a solution like I propose is much better than a simple static checking scheme for the flexibility reason I mentioned above.

Reiner
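
The "bool back from tolower()" idea can be sketched without any language change. The names `Lowered` and `lowerWithFlag` are hypothetical, and the function is a hand-written ASCII-only stand-in rather than the library's tolower; the point is only that the caller can skip the defensive dup when told the result is already a fresh copy.

```d
import std.ascii : isUpper, toLower;

// Hypothetical result type: the text plus a flag saying whether it is a copy.
struct Lowered
{
    char[] text;
    bool copied;   // true if `text` is a fresh array the caller may mutate
}

// Copy-on-write lowercasing: copies only if some character must change.
Lowered lowerWithFlag(char[] input)
{
    foreach (i, c; input)
    {
        if (isUpper(c))
        {
            char[] copy = input.dup;
            foreach (ref ch; copy[i .. $])
                ch = toLower(ch);
            return Lowered(copy, true);
        }
    }
    return Lowered(input, false);   // nothing to change: same array returned
}

void main()
{
    char[] name = "reiner".dup;
    auto r = lowerWithFlag(name);

    // dup only when the function handed back the original array
    char[] name_m = r.copied ? r.text : r.text.dup;
    name_m[0] = 'R';
    assert(name == "reiner");       // the original is untouched
}
```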
Jul 16 2006
parent reply xs0 <xs0 xs0.com> writes:
 - it can still be trivially subverted - just cast to int/long and back

bit was stored *in* the pointer. So that's a null issue, since it clearly isn't.

Well, it's not really important where it's stored - if you cast to int/long, you lose the constness information in any case...
 - you can't just check at the beginning of a function

the details more specifically, I realised that the advantage of runtime const checking is a *huge* increase in flexibility, and having to say at compile-time whether a function is const or not removes that flexibility in my mind.

Agreed :) It's definitely more flexible, but the question is whether the cost is low enough for the feature to still be usable, and whether it's realistically implementable at all.. Note that even if the GC is modified, and the constness information is stored somewhere externally, it can take a lot of updating when passing pointers/references around...
 - you can get the pointer in the middle of it; you can also get the
 pointer in _another_ function (from a global or in a multi-threaded
 program); checking at every access would be too expensive, I think,
 even for a debug build

I don't know how expensive it would be. Considering that all it is is assert(!isConst); that operation doesn't seem too hard. And of course, the compiler is free to optimize away duplicate versions of it, like this: _member = 4; _member = 3;

Well, you assume the optimizer, but it's not usually used with debug builds, even if it supported this optimization. And there definitely are cases where each access would require a check; in a multi-threaded program you can hardly assume anything about non-local variables, especially that they don't change :)
 Obviously there's no need to check isConst before the second assignment. 
 And on the issue of the pointer suddenly changing in the middle of the 
 function: well, it's actually not an issue, because I worked out that 
 it's best for the function to have a local copy of the isConst variable, 
 which means no other resources can change it (other than a malicious 
 hacker with pointers). This is also conceptually right, because it 
 doesn't make sense to give a function a certain level of modification 
 access and then change it while it is running.

That is true only for correct programs ;) Considering how the purpose is to catch bugs, though, you can't make that assumption...
 In my view, we either need runtime-checked const or nothing. Consider 
 this, and tell me how we can avoid excess dup calls if we use C++-style 
 const-checking:
 
  class foo {
    private char[] _name;
    public const char[] getName()
    {
      return _name;
    }
    ...
  }
  ...
  import std.string;
  foo f;
  /*const*/ char[] name = f.getName();
  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure
  name_m = name_m.dup();

  if (name_m is name)

but, it's certainly not a pretty solution... and there's a related question - how can tolower know whether to .dup or not.. there are certainly cases where it isn't necessary.
 With runtime const checking, we could fix that, though, by also getting 
 a bool variable back from tolower() which says whether it returned the 
 original string or a copy. Quite simply, compile-time const won't let 
 you do that, so you will actually produce *slower* code.

Well, if you wanted, you could make tolower() return that already, it doesn't have much to do with constness..
 If you're interested, I'm trying to come up with a draft which has the 
 actual details, just so it gives a more solid ground for discussion. My 
 problem at the moment is just what we mentioned earlier: operability 
 between debug and release builds. I suggested that each function should 
 be passed, behind the scenes, the relevant isConst variable, but that's 
 not compatible with release builds. I then thought that you could have 
 two copies of a function: one for const variables and one for non-const 
 variables. The correct one would be chosen at runtime, and external 
 packages would always choose the one for non-const variables (which is a 
 copy of the one for const variables, but with checking turned off). This 
 effectively duplicates the amount of machine code generated by the 
 compiler, but I am convinced that a solution like I propose is much 
 better than a simple static checking scheme for the flexibility reason I 
 mentioned above.

Sure, I'm interested :) But the way I see it, it's a practically impossible problem to solve in a way that is time/space efficient, useful, and easy to use, even more so if the solution is not static.. xs0
Jul 17 2006
parent reply Reiner Pope <reiner.pope gmail.com> writes:
xs0 wrote:
 
 - it can still be trivially subverted - just cast to int/long and back

the bit was stored *in* the pointer. So that's a null issue, since it clearly isn't.

Well, it's not really important where it's stored - if you cast to int/long, you lose the constness information in any case...

If you stored it separately, it wouldn't be accessible easily to the user. This is obviously an implementation detail, and not the interesting part of the discussion, though. :)
 
 
 - you can't just check at the beginning of a function

down the details more specifically, I realised that the advantage of runtime const checking is a *huge* increase in flexibility, and having to say at compile-time whether a function is const or not removes that flexibility in my mind.

Agreed :) It's definitely more flexible, but the question is whether the cost is low enough for the feature to still be usable, and whether it's realistically implementable at all.. Note that even if the GC is modified, and the constness information is stored somewhere externally, it can take a lot of updating when passing pointers/references around...

Yes, and I don't know enough about function calling to know what exactly is possible, etc.
 
 
 - you can get the pointer in the middle of it; you can also get the
 pointer in _another_ function (from a global or in a multi-threaded
 program); checking at every access would be too expensive, I think,
 even for a debug build

I don't know how expensive it would be. Considering that all it is is assert(!isConst); that operation doesn't seem too hard. And of course, the compiler is free to optimize away duplicate versions of it, like this: _member = 4; _member = 3;

Well, you assume the optimizer, but it's not usually used with debug builds, even if it supported this optimization. And there definitely are cases where each access would require a check; in a multi-threaded program you can hardly assume anything about non-local variables, especially that they don't change :)

I don't see the problem: you pass isConst by value, so there is no way for anything external to alter the const-ness level that that particular function has. How can that cause any problems?
 
 Obviously there's no need to check isConst before the second 
 assignment. And on the issue of the pointer suddenly changing in the 
 middle of the function: well, it's actually not an issue, because I 
 worked out that it's best for the function to have a local copy of the 
 isConst variable, which means no other resources can change it (other 
 than a malicious hacker with pointers). This is also conceptually 
 right, because it doesn't make sense to give a function a certain 
 level of modification access and then change it while it is running.

That is true only for correct programs ;) Considering how the purpose is to catch bugs, though, you can't make that assumption...

Since the programmer shouldn't be able to access the isConst variable, there's no way to change it, so it shouldn't be a problem.
 
 
 In my view, we either need runtime-checked const or nothing. Consider 
 this, and tell me how we can avoid excess dup calls if we use 
 C++-style const-checking:

  class foo {
    private char[] _name;
    public const char[] getName()
    {
      return _name;
    }
    ...
  }
  ...
  import std.string;
  foo f;
  /*const*/ char[] name = f.getName();
  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure
  name_m = name_m.dup();

if (name_m is name)

Yes, but try getting that to work with a statically-checked const. It simply won't work unless you have a very smart checker. Of course, you could leave const out altogether, but then we would be making no progress. Let me demonstrate the static const-checking problem:

  char[] tolower(const char[] input) // the input must be const, because we agree with CoW, so we won't change it
  {
    // do some stuff
    if ( a write is necessary )
    {
      // copy it into another variable, since we can't change input (it's const)
    }
    return something; // This something could possibly be input, so it also needs to be declared const. So we go back and make the return value of the function also a const.
  }

  // Now, since the return value is const, we *must* dup it.

Can you suggest another solution other than avoiding const-checking entirely? By the way, a dedicated string class is actually an implementation of runtime const checking.
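
For readers who want to see the double copy concretely, here is a sketch written with the `const(char)[]` syntax of present-day D, which did not exist when this thread was written; that choice, the function name `lower`, and its ASCII-only body are all assumptions made for illustration.

```d
import std.ascii : isUpper, toLower;

// Statically checked version: the result may alias `input`, so the
// return type has to be const as well.
const(char)[] lower(const(char)[] input)
{
    foreach (i, c; input)
    {
        if (isUpper(c))
        {
            char[] copy = input.dup;        // a write is necessary: copy once
            foreach (ref ch; copy[i .. $])
                ch = toLower(ch);
            return copy;                    // loses its "fresh copy" status here
        }
    }
    return input;                           // nothing to change: no copy
}

void main()
{
    char[] name = "Reiner".dup;
    const(char)[] lowered = lower(name);
    // lowered[0] = 'x';                    // rejected at compile time
    char[] writable = lowered.dup;          // second copy, although lower() already made one
    writable[0] = 'x';
    assert(writable == "xeiner");
    assert(name == "Reiner");
}
```

The caller cannot tell from the type whether `lowered` is a private copy, which is the extra dup being complained about.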
 but, it's certainly not a pretty solution... and there's a related 
 question - how can tolower know whether to .dup or not.. there are 
 certainly cases where it isn't necessary.

[...] runtime checking can (provided we modify the libraries to support both a CoW and an in-place version).
 
 With runtime const checking, we could fix that, though, by also 
 getting a bool variable back from tolower() which says whether it 
 returned the original string or a copy. Quite simply, compile-time 
 const won't let you do that, so you will actually produce *slower* code.

Well, if you wanted, you could make tolower() return that already, it doesn't have much to do with constness..

I've said above how it has to do with constness. Constness is not required, but if you decide to use it statically, then you get all these problems which you can't solve.
 If you're interested, I'm trying to come up with a draft which has the 
 actual details, just so it gives a more solid ground for discussion. My 
 problem at the moment is just what we mentioned earlier: operability 
 between debug and release builds. I suggested that each function 
 should be passed, behind the scenes, the relevant isConst variable, 
 but that's not compatible with release builds. I then thought that you 
 could have two copies of a function: one for const variables and one 
 for non-const variables. The correct one would be chosen at runtime, 
 and external packages would always choose the one for non-const 
 variables (which is a copy of the one for const variables, but with 
 checking turned off). This effectively duplicates the amount of 
 machine code generated by the compiler, but I am convinced that a 
 solution like I propose is much better than a simple static checking 
 scheme for the flexibility reason I mentioned above.

Sure, I'm interested :) But the way I see it, it's a practically impossible problem to solve in a way that is time/space efficient, useful, and easy to use, even more so if the solution is not static..

How important is time/space in a debug/unittest build? Walter said somewhere that unit tests often cause slow-downs to half speed, so is speed really such an issue then? Similarly, unit test builds are 3x the size of release builds (on my current test, anyway). Should we really be worried about efficiency concerns in non-release builds given the losses in efficiency we already accept?
 xs0

Jul 18 2006
next sibling parent reply xs0 <xs0 xs0.com> writes:
 Agreed :) It's definitely more flexible, but the question is whether 
 the cost is low enough for the feature to still be usable, and whether 
 it's realistically implementable at all.. Note that even if the GC is 
 modified, and the constness information is stored somewhere 
 externally, it can take a lot of updating when passing 
 pointers/references around...

Yes, and I don't know enough about function calling to know what exactly is possible, etc.

Well, I don't know much about function calling either (well, you put some stuff on the stack and some in registers, and call the function :). But regardless of how it's done, you need to propagate constness information whenever you

 - assign a variable a new value
 - copy the variable (this includes passing parameters to functions and getting return values back)
 And there definitely are cases where each access would require a 
 check; in a multi-threaded program you can hardly assume anything 
 about non-local variables, especially that they don't change :)

I don't see the problem: you pass isConst by value, so there is no way for anything external to alter the const-ness level that that particular function has. How can that cause any problems?

Hmm, since when was constness a property of a function? It's the data that is const or not, and one way or another, you need to know the constness for each pointer/reference separately.
 Obviously there's no need to check isConst before the second 
 assignment. And on the issue of the pointer suddenly changing in the 
 middle of the function: well, it's actually not an issue, because I 
 worked out that it's best for the function to have a local copy of 
 the isConst variable, which means no other resources can change it 
 (other than a malicious hacker with pointers). This is also 
 conceptually right, because it doesn't make sense to give a function 
 a certain level of modification access and then change it while it is 
 running.

That is true only for correct programs ;) Considering how the purpose is to catch bugs, though, you can't make that assumption...

Since the programmer shouldn't be able to access the isConst variable, there's no way to change it, so it shouldn't be a problem.

Hmm, now I really think you'd like to achieve constness on a per-function-invocation basis.. Are you sure that is what you want and that it is useful? I think it would be a good time for you to restate your proposal :)
 How important is time/space in a debug/unittest build? Walter said 
 somewhere that unit tests often cause slow-downs to half speeds, so is 
 speed really such an issue then? Similarly, unit tests builds are 3x the 
 size of release builds (on my current test, anyway). Should we really be 
 worried about efficiency concerns in non-release builds given the losses 
 in efficiency we already accept?

Hmm, afaik, unit tests are only run once per execution and don't otherwise affect the speed of execution. Their size also depends on the size of the code, but keeping runtime constness information would have costs depending on the size of _data_, a much bigger problem I think...

And sure, you need to worry about efficiency. Something like a 10% speed loss is probably acceptable, but something that will slow your program by a factor of 20 is often not acceptable, even in debug builds...

xs0
Jul 18 2006
parent Reiner Pope <reiner.pope gmail.com> writes:
 Hmm, afaik, unit tests are only run once per execution and don't 
 otherwise affect the speed of execution. Their size also depends on the 
 size of the code, but keeping runtime constness information would have 
 costs depending on the size of _data_, a much bigger problem I think...

I also meant DbC preconditions and postconditions, which I believe are only compiled in during debug mode. These are run at every function invocation, which can mean a big slow-down (especially with long invariants).
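
As a reminder of what those DbC checks look like, here is a minimal sketch in present-day D syntax. The `Counter` class is hypothetical. The contracts and the invariant are ordinary asserts that run around each call to `add` in a normal build and are left out when compiling with dmd's `-release` switch, which is the cost model being compared against.

```d
class Counter
{
    private int count;

    // Checked around public member function calls in non-release builds.
    invariant()
    {
        assert(count >= 0);
    }

    void add(int n)
    in
    {
        assert(n > 0, "n must be positive");   // precondition
    }
    out
    {
        assert(count > 0);                     // postcondition
    }
    do
    {
        count += n;
    }

    int value() const { return count; }
}

void main()
{
    auto c = new Counter;
    c.add(2);
    c.add(3);
    assert(c.value == 5);
}
```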
 
 And sure, you need to worry about efficiency. Something like a 10% speed 
 loss is probably acceptable, but something that will slow your program 
 by a factor of 20 is often not acceptable, even in debug builds...

Since no implementation has been done, we can only wildly speculate about slow-downs. Since it's all getting very confused, I'll work out the details gradually and see if I come across any more problems, and then post my results here.

Reiner
 
 
 xs0

Jul 18 2006
prev sibling parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Reiner Pope wrote:
 In my view, we either need runtime-checked const or nothing. Consider 
 this, and tell me how we can avoid excess dup calls if we use 
 C++-style const-checking:

  class foo {
    private char[] _name;
    public const char[] getName()
    {
      return _name;
    }
    ...
  }
  ...
  import std.string;
  foo f;
  /*const*/ char[] name = f.getName();
  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure
  name_m = name_m.dup();

if (name_m is name)

Yes, but try getting that to work with a statically-checked const. It simply won't work unless you have a very smart checker. Of course, you could leave const out altogether, but then we would be making no progress. Let me demonstrate the static const-checking problem:

  char[] tolower(const char[] input) // the input must be const, because we agree with CoW, so we won't change it
  {
    // do some stuff
    if ( a write is necessary )
    {
      // copy it into another variable, since we can't change input (it's const)
    }
    return something; // This something could possibly be input, so it also needs to be declared const. So we go back and make the return value of the function also a const.
  }

  // Now, since the return value is const, we *must* dup it.

Can you suggest another solution other than avoiding const-checking entirely? By the way, a dedicated string class is actually an implementation of runtime const checking.
 but, it's certainly not a pretty solution... and there's a related 
 question - how can tolower know whether to .dup or not.. there are 
 certainly cases where it isn't necessary.

checking can (providing we have a modification of the libraries to support a CoW and a in-place version).

I'm not following you. That code example shows how static const checking should work, so what is the problem with that? And in:

  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure

You ask if we can modify name_m. The answer is yes if the return type of tolower is non-const, and no if it is const, so what's the issue here?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jul 18 2006
parent reply Reiner Pope <reiner.pope gmail.com> writes:
Bruno Medeiros wrote:
 Can you suggest another solution other than avoiding const-checking 
 entirely? By the way, a dedicated string class is actually an 
 implementation of runtime const checking.
 but, it's certainly not a pretty solution... and there's a related 
 question - how can tolower know whether to .dup or not.. there are 
 certainly cases where it isn't necessary.

runtime checking can (providing we have a modification of the libraries to support a CoW and a in-place version).

I'm not following you. That code example shows how static const checking should work, so what is the problem with that? And in:

  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure

You ask if we can modify name_m. The answer is yes if the return type of tolower is non-const, and no if it is const, so what's the issue here?

You seem to have ignored my more recent example. The problem is that static checking *forces* the return value to be const, which in turn forces you to dup it. You don't always need to do that:
 char[] tolower(const char[] input) /* the input must be const, because we agree with CoW, so we won't change it */
 {
   // do some stuff
   if ( a write is necessary )
   { /* copy it into another variable, since we can't change input (it's const) */
   }
   return something; /* This something could possibly be input, so it also needs to be declared const. So we go back and make the return value of the function also a const. */
 }

 // Now, since the return value is const, we *must* dup it.


Jul 18 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Reiner Pope wrote:
 Bruno Medeiros wrote:
 Can you suggest another solution other than avoiding const-checking 
 entirely? By the way, a dedicated string class is actually an 
 implementation of runtime const checking.
 but, it's certainly not a pretty solution... and there's a related 
 question - how can tolower know whether to .dup or not.. there are 
 certainly cases where it isn't necessary.

runtime checking can (providing we have a modification of the libraries to support a CoW and a in-place version).

I'm not following you. That code example shows how static const checking should work, so what is the problem with that? And in:

  char[] name_m = tolower(name);
  // Can we modify name_m? Better dup it, just to make sure

You ask if we can modify name_m. The answer is yes if the return type of tolower is non-const, and no if it is const, so what's the issue here?

You seem to have ignored my more recent example. The problem is that static checking *forces* the return value to be const, which in turn forces you to dup it. You don't always need to do that:
>> char[] tolower(const char[] input) /* the input must be const, because
>>     we agree with CoW, so we won't change it */
>> {
>>   // do some stuff
>>   if ( a write is necessary )
>>   { /* copy it into another variable, since we can't change input
>>        (it's const) */
>>   }
>>   return something; /* This something could possibly be input, so it
>>       also needs to be declared const. So we go back and make the
>>       return value of the function also a const. */
>> }
>>
>> // Now, since the return value is const, we *must* dup it.

Are you saying that in that function, what is returned sometimes needs to be const (readonly) and sometimes not? And because of that the function needs to return const every time?
But then why not have the function return non-const and do the duping when necessary?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jul 20 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Bruno Medeiros wrote:
 Reiner Pope wrote:
 Bruno Medeiros wrote:
 Can you suggest another solution other than avoiding const-checking 
 entirely? By the way, a dedicated string class is actually an 
 implementation of runtime const checking.
 but, it's certainly not a pretty solution... and there's a related 
 question - how can tolower know whether to .dup or not.. there are 
 certainly cases where it isn't necessary.

Runtime checking can (provided the libraries are modified to support both a CoW and an in-place version).

I'm not following you. That code example shows how static const checking should work, so what is the problem with that? And in:

  char[] name_m = tolower(name); // Can we modify name_m? Better dup it, just to make sure

you ask if we can modify name_m. The answer is yes if the return type of tolower is non-const, no if it is const, so what's the issue here?

You seem to have ignored my more recent example. The problem is that static checking *forces* the return value to be const, which in turn forces you to dup it. You don't always need to do that:
>> char[] tolower(const char[] input) /* the input must be const, because
>>     we agree with CoW, so we won't change it */
>> {
>>   // do some stuff
>>   if ( a write is necessary )
>>   { /* copy it into another variable, since we can't change input
>>        (it's const) */
>>   }
>>   return something; /* This something could possibly be input, so it
>>       also needs to be declared const. So we go back and make the
>>       return value of the function also a const. */
>> }
>>
>> // Now, since the return value is const, we *must* dup it.

 Are you saying that in that function, what is returned sometimes needs 
 to be const (readonly) and sometimes not?

Yes.

 And because of that the function needs to return const every time?
 But then why not have the function return non-const and do the duping 
 when necessary?

Doesn't help. Either the function has to dup every time (even if the input was const and it's passing it on unmodified), so that a user of the function can write without copying first,
OR
the user of the function has to dup every time (even if it was already duped).
In both cases, you have an unnecessary dup.
Jul 20 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
Don Clugston wrote:
 Bruno Medeiros wrote:
 Reiner Pope wrote:
 Bruno Medeiros wrote:
 Can you suggest another solution other than avoiding const-checking 
 entirely? By the way, a dedicated string class is actually an 
 implementation of runtime const checking.
 but, it's certainly not a pretty solution... and there's a related 
 question - how can tolower know whether to .dup or not.. there are 
 certainly cases where it isn't necessary.

Runtime checking can (provided the libraries are modified to support both a CoW and an in-place version).

I'm not following you. That code example shows how static const checking should work, so what is the problem with that? And in:

  char[] name_m = tolower(name); // Can we modify name_m? Better dup it, just to make sure

you ask if we can modify name_m. The answer is yes if the return type of tolower is non-const, no if it is const, so what's the issue here?

You seem to have ignored my more recent example. The problem is that static checking *forces* the return value to be const, which in turn forces you to dup it. You don't always need to do that:
>> char[] tolower(const char[] input) /* the input must be const, because
>>     we agree with CoW, so we won't change it */
>> {
>>   // do some stuff
>>   if ( a write is necessary )
>>   { /* copy it into another variable, since we can't change input
>>        (it's const) */
>>   }
>>   return something; /* This something could possibly be input, so it
>>       also needs to be declared const. So we go back and make the
>>       return value of the function also a const. */
>> }
>>
>> // Now, since the return value is const, we *must* dup it.

 Are you saying that in that function, what is returned sometimes needs 
 to be const (readonly) and sometimes not?

Yes.

 And because of that the function needs to return const every time?
 But then why not have the function return non-const and do the duping 
 when necessary?

Doesn't help. Either the function has to dup every time (even if the input was const and it's passing it on unmodified), so that a user of the function can write without copying first,
OR
the user of the function has to dup every time (even if it was already duped).
In both cases, you have an unnecessary dup.

Ok, I'm not following you guys. Can you explain from scratch what it is you were trying to achieve (the purpose of the function?), which when done with static const checking causes unnecessary dups?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jul 21 2006
parent reply xs0 <xs0 xs0.com> writes:
 Ok, I'm not following you guys. Can you explain from scratch what it is 
 you were trying to achieve (the purpose of the function?), which when 
 done with static const checking causes unnecessary dups?

It's really simple. Say you have a function that sometimes modifies its data and sometimes doesn't. The non-const version is simple, as it's allowed in-place modification:

byte[] filter(byte[] data)
{
    if (...)
        data[0] = 10;
    return data;
}

Now, whenever you have const data, you can only use it by duping:

const byte[] data = foo();
byte[] filtered = filter(data.dup);

That's bad, because you always have to .dup the data, even if filter() does nothing with it. So, you also write a const version:

const byte[] filter(const byte[] data)
{
    if (...) {
        byte[] result = data.dup;
        result[0] = 10;
        return result;
    }
    return data;
}

That's much better, because a copy is only made when needed. However, there's still a problem - the return type must be const, because you need to be able to return the original (const) parameter. Therefore, the information about whether a copy is made is lost. With a single function that's not even a problem, but say your filtering is configurable and you call 20 filters. If you use the const versions, 20 copies will be made, even though only one is necessary. If you use the non-const versions, you're forced to dup before the first filter, even when not necessary at all.

A similar problem occurs in this case:

const char[] handleDesc()
{
    if (_handle) {
        return format("Handle #%d", _handle.id());
    } else {
        return "No handle";
    }
}

Because you sometimes return a literal, you must declare the result const, even though you'll return a fresh writable char[] most of the time, which may or may not be significant. Alternatively, you can .dup the literal before returning it, but it will usually be a wasted .dup.

In both cases, the problem is the same - there is no way for the function to return information about whether a copy was made along with the data itself. It can't even neatly be solved by wrapping the thing in a struct or checking equality of references, because the type in either case is static and one way or another, you'll have to work around that (by casting or whatever).

xs0
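
To make the filter-chain argument concrete, here is a minimal sketch in present-day D of an array reference that carries a runtime readonly tag. It is only an illustration of the idea, not the code used in any test mentioned in this thread: TaggedBytes, writable(), the copies counter and the modify parameter are all hypothetical names invented for this example. The point it shows is that when the tag travels with the reference, a chain of 20 filters needs one dup at most.

```d
import std.stdio : writeln;

int copies; // counts how many times a buffer was duplicated

/// Hypothetical array reference carrying a runtime "readonly" tag.
struct TaggedBytes
{
    byte[] data;
    bool readonly; // true: shared data, must not be written in place

    /// Returns a reference that may be written to, copying only if needed.
    TaggedBytes writable()
    {
        if (!readonly)
            return this;
        ++copies;
        return TaggedBytes(data.dup, false);
    }
}

/// A filter that only sometimes modifies its data.
TaggedBytes filter(TaggedBytes input, bool modify)
{
    if (!modify)
        return input; // passed through untouched, tag preserved
    auto result = input.writable();
    result.data[0] = 10;
    return result;
}

void main()
{
    byte[] original = [1, 2, 3];
    auto current = TaggedBytes(original, true); // tagged as readonly

    // Twenty filters, half of which want to write.
    foreach (i; 0 .. 20)
        current = filter(current, i % 2 == 0);

    assert(original[0] == 1);      // the readonly source was never written
    assert(current.data[0] == 10);
    assert(!current.readonly);
    assert(copies == 1);           // one dup for the whole chain
    writeln("copies made: ", copies);
}
```

As the post says, a hand-written wrapper like this is a workaround: nothing stops code from writing through data directly, and it does not behave like a built-in array type.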
Jul 21 2006
parent reply Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
xs0 wrote:
 
 Ok, I'm not following you guys. Can you explain from scratch what it 
 is you were trying to achieve (the purpose of the function?), which 
 when done with static const checking causes unnecessary dups?

It's really simple. Say you have a function that sometimes modifies its data and sometimes doesn't. The non-const version is simple, as it's allowed in-place modification:

byte[] filter(byte[] data)
{
    if (...)
        data[0] = 10;
    return data;
}

Now, whenever you have const data, you can only use it by duping:

const byte[] data = foo();
byte[] filtered = filter(data.dup);

That's bad, because you always have to .dup the data, even if filter() does nothing with it. So, you also write a const version:

const byte[] filter(const byte[] data)
{
    if (...) {
        byte[] result = data.dup;
        result[0] = 10;
        return result;
    }
    return data;
}

That's much better, because a copy is only made when needed. However, there's still a problem - the return type must be const, because you need to be able to return the original (const) parameter. Therefore, the information about whether a copy is made is lost. With a single function that's not even a problem, but say your filtering is configurable and you call 20 filters. If you use the const versions, 20 copies will be made, even though only one is necessary. If you use the non-const versions, you're forced to dup before the first filter, even when not necessary at all.

A similar problem occurs in this case:

const char[] handleDesc()
{
    if (_handle) {
        return format("Handle #%d", _handle.id());
    } else {
        return "No handle";
    }
}

Because you sometimes return a literal, you must declare the result const, even though you'll return a fresh writable char[] most of the time, which may or may not be significant. Alternatively, you can .dup the literal before returning it, but it will usually be a wasted .dup.

In both cases, the problem is the same - there is no way for the function to return information about whether a copy was made along with the data itself. It can't even neatly be solved by wrapping the thing in a struct or checking equality of references, because the type in either case is static and one way or another, you'll have to work around that (by casting or whatever).

xs0

Ah, I understand the objective now.

But then, this problem you are trying to solve here is one of performance only, related to ownership management, but not necessarily the same. In particular:

* Whatever mechanism is made to deal with that problem cannot be disabled in release builds: it is not a contract, unlike const/ownership checking, which is a contract and, being so, can be checked at compile time, or at runtime in debug releases only.

* For the same reason, such a mechanism should not be a substitute for const/ownership checking, since the latter can be processed at compile time, which is naturally more efficient, and the former can't.

* Such a mechanism is likely impractical or too hard to implement in the general sense (that is, for any data type). So perhaps the mechanism should be implemented not by the language/compiler, but by the coder with existing language constructs (mixins, etc.)?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jul 26 2006
parent xs0 <xs0 xs0.com> writes:
 Ah, I understand the objective now.
 
 But then, this problem you are trying to solve here is one of 
 performance only, related to ownership management, but not necessarily 
 the same. In particular:

 * Whatever mechanism is made to deal with that problem, cannot be 
 disabled in release builds: it is not a contract, unlike const/ownership 
 checking which is a contract, and being so can be checked at compile 
 time, or at runtime in debug releases only.

Well, it is a contract (thou shalt not modify a readonly array), just not verifiable at compile-time. Similar to in{} and out{} blocks, I guess.. It has other (potentially significant) benefits, though..
 * For that same reasons, such mechanism should not be a substitute for 
 const/ownership checking, since this latter one can be processed at 
 compile time, which is naturally more efficient, and the former one can't.

I disagree about efficiency - it turns out there is negligible speed impact even if .readonly is not used at all, while it can be used for significant speed improvements with simpler code (in particular, writing a fast and safe COW function is easier). I posted test results and code in digitalmars.D, if you're interested (I think :)
 * Such mechanism is likely impractical or too hard to implement in the 
 general sense (that is, for any data type). So perhaps the mechanism 
 should be implement not by the language/compiler, but by the coder with 
 existing language constructs (mixins, etc.)?

Well, you could code your own array references (like I did for my test), but they lose on ease of use and are less flexible in general (because you can't code an inheritance tree which would mimic the built-in arrays'). On the other hand, if it was language-supported, nothing would be lost, and something would be gained..

xs0
Jul 27 2006
prev sibling parent Reiner Pope <reiner.pope gmail.com> writes:
xs0 wrote:

 Well, I don't think you completely missed the point, but doing it would 
 cause all sorts of issues:
 - where should the tag be placed? you can't put it inside the pointer, 
 as there are no free bits; you also can't put it next to a pointer, as 
 it would affect memory layout of structures (in particular, it would 
 make debug-built and release-built code non-interoperable).

implementation would involve modifying the signature of functions to accept extra variables, signalling isConst. I know, this damages release/debug interoperability, so can you tell me which functions that would cause problems for? All that I can think of is (a) exported functions in libraries, and (b) functions that inline assembler calls. Am I missing any?
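
As a rough illustration of that signature change, the sketch below writes the hidden flag out by hand in present-day D. It assumes the compiler would add one extra bool per reference parameter in debug builds and insert the assert itself. The function setFirst and the explicit isConst parameter are made up for this example, and no compiler is claimed to work this way.

```d
import core.exception : AssertError;
import std.exception : assertThrown;

/// What the programmer would write:  void setFirst(char[] s)
/// What a debug build would hypothetically turn it into:
void setFirst(char[] s, bool isConst)
{
    // Compiler-inserted precondition; asserts are dropped with -release.
    assert(!isConst, "Mutation of a const variable detected");
    s[0] = 'a';
}

void main()
{
    char[] buf = "bar".dup;

    setFirst(buf, false); // ordinary reference: the write goes through
    assert(buf == "aar");

    // A reference obtained through a const accessor would carry isConst = true.
    assertThrown!AssertError(setFirst(buf, true));
}
```

This only behaves as shown in a non-release build; with -release the assert disappears and the second call writes to the buffer. It also makes the interoperability concern visible: the debug and release versions of setFirst have different parameter lists, so a caller built one way cannot link against a callee built the other way.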
 - it can still be trivially subverted - just cast to int/long and back
 - you can't just check at the beginning of a function - you can get the 
 pointer in the middle of it; you can also get the pointer in _another_ 
 function (from a global or in a multi-threaded program); checking at 
 every access would be too expensive, I think, even for a debug build
 
 
 xs0

Jul 18 2006