digitalmars.D - Const, strings, and other things.

Jarrett Billingsley <kb3ctd2 yahoo.com> writes:
That topic name is _dangerously_ catchy.

Anyway some noobish thoughts were running through my constless brain 
today.  By constless I mean I'm in the "I've never used const and 
haven't really run into any cases where I've thought it 
necessary/useful" camp, although I can't deny that there are _some_ uses 
for it.  But the more I thought about it, the more it seemed to me 
that.. well, if we're not trying to be C++, and we're not trying to make 
it possible to have user types behave exactly like built-in types, then 
maybe a generic const system isn't really all that necessary.

What my thoughts boiled down to is that constness seems useful for 
strings and not much else.  I suppose it could be useful for other kinds 
of arrays, but overwhelmingly the use cases for const seem to be for 
strings, and constness helps to make some operations with strings more 
efficient.  Other than that, constness for other types seems more like a 
logical convenience.  So you want to return a const(Foo) where Foo is a 
class reference?  How often do you need to do that?  Passing const refs 
-- have YOU ever had a bug where you tried to modify a const ref param 
that was caught by the const?

The more I hear about const, and the more conversations I watch about 
it, the more complex and esoteric it gets.  I think that constness for 
strings only could cover a large majority of use cases for const without 
having to have a pervasive, complex addition to the type system.

(I just keep looking at D2 and seeing this awful const wart on it, and 
thinking I'd really like to try it out if it weren't for const.  Ugh.)
Nov 12 2007
"Janice Caron" <caron800 googlemail.com> writes:
I like const!

I like to know that when I pass /my/ data to /your/ function, then
your function is not going to mess with my data. const in the
declaration of your function is what gives me that guarantee.
Nov 12 2007
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.52.1194907553.2338.digitalmars-d puremagic.com...
 I like const!

 I like to know that when I pass /my/ data to /your/ function, then
 your function is not going to mess with my data. const in the
 declaration of your function is what gives me that guarantee.

Oh, _come on_.  Chances are if your code is passing data to my function, one
or more of the following is true:

1) We're on the same team, and we have some sort of convention set up for
this kind of thing.

2) You're using a library that I wrote, and it's probably well-documented,
or at least self-obvious, whether or not my function modifies your data.
(Keep in mind that even if there _is_ const available and my function is not
well-documented, I don't _have_ to use const, and so you don't have any
guarantee anyway.)

3) I'm not a complete moron and actually name my functions after what they
do.  Face it, most people don't write functions with one name and have them
do something completely different, and if that's happening, constness is not
really going to help you with that.

My big question is: how often do you pass some reference type into a
function and hope that it will/won't be modified?  Here are some possible
reference types:

- Const refs to structs.  Why do these exist?  ref to get the compiler to
pass the struct efficiently, and const to preserve value semantics.
Wouldn't this be better served by some other mechanism, such as the
optimizer?

- Pointers.  Come on, don't use pointers.  ;)

- Class instances.  I'll get back to this.

- Arrays of things other than characters.  OK, I can see a use for
declaring an array as read-only, although most of the time I'm modifying my
arrays left and right.

- Strings.  The single most common kind of array, and they even have
special read-only literals built into the language.  It makes sense to
provide read-only strings, if for nothing else efficiency.  Most of the time
you're not going to be modifying the strings after you've done some
machinations.

As for class instances, this is where it's a grey area.  So I did a bit of
research into my own code, and this is what I've come up with, as far as why
I'm passing class instances into functions:

- As some sort of object to be mutated, i.e. a context or state class (like
a function that adds some symbol to a given symbol table).

- In order to construct/set up an aggregate object, such as some kind of
complex IO class which takes an output stream, a layout instance, etc.  In
virtually all cases these instances are subsequently modified by the methods
of the owner object.

- As a sort of 'struct on steroids', that is, just a data-holding class
instance with some methods.  These I guess are candidates for constness, but
the incidence of these is so small and the functions which use them are so
short and what they do is so obvious that I have no idea what gains I will
get from using const.

I'm just absolutely curious as to what on earth, _other than strings_, that
you'll be passing to other code that you don't want it to be modified.
Nov 12 2007
Bill Baxter <dnewsgroup billbaxter.com> writes:
Jarrett Billingsley wrote:
 "Janice Caron" <caron800 googlemail.com> wrote in message 
 news:mailman.52.1194907553.2338.digitalmars-d puremagic.com...
 I like const!

 I like to know that when I pass /my/ data to /your/ function, then
 your function is not going to mess with my data. const in the
 declaration of your function is what gives me that guarantee.


Also important to point out that "you" in the above often means "myself two months ago".
 Oh, _come on_.  Chances are if your code is passing data to my function, one 
 or more of the following is true:
 
 1) We're on the same team, and we have some sort of convention set up for 
 this kind of thing.

My convention in D1 is this:

    foo(/*const*/ ref T a, /*const*/string b);

It works ok.  Puts the label right there in front of the parameter where
you can see it when looking at the function signature.  Sure would be nice
if it actually did some checking, though.

My other convention is to name ref parameters that are meant to be output
like "ref Foo something_out".  I also end up making a lot of out parameters
be pointers instead of references, because then looking at the signature I
know that it must be an out parameter, because there's no other reason to
pass it as a pointer.

 2) You're using a library that I wrote, and it's probably well-documented, 
 or at least self-obvious, whether or not my function modifies your data. 

Come on. Code is poorly documented. For 90% of open source projects out there "better documentation" is in the top 5 on the TODO list.
 (Keep in mind that even if there _is_ const available and my function is not 
 well-documented, I don't _have_ to use const, and so you don't have any 
 guarantee anyway.)

Yes, the library writer might not have used const. In C++ those are usually the ones I glance at quickly and conclude "this guy probably had no idea what he was doing ... tread with caution".
 3) I'm not a complete moron and actually name my functions after what they 
 do.  Face it, most people don't write functions with one name and have them 
 do something completely different, and if that's happening, constness is not 
 really going to help you with that.

Bad naming really has nothing to do with it.
 
 My big question is: how often do you pass some reference type into a 
 function and hope that it will/won't be modified?  Here are some possible 
 reference types:
 
 - Const refs to structs.  Why do these exist?  ref to get the compiler to 
 pass the struct efficiently, and const to preserve value semantics. 
 Wouldn't this be better served by some other mechanism, such as the 
 optimizer?

For me, I do this all the time. Passing 4 different 4-vectors of doubles to a function is a good formula for killing your performance. But I'm with you. I shouldn't *have* to worry about the performance. If I could pass structs around efficiently it would kill 90% of what I want const for. I think the idea of having the compiler quietly substitute in pass-by-reference for pass-by-value to improve efficiency has some promise.
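
The trade-off described here can be sketched in C++ terms, since that is the language the thread keeps comparing against. This is only a rough illustration: `Vec4`, `sumByValue`, and `sumByConstRef` are made-up names, and whether the by-value version is actually slower depends on the compiler and calling convention.

```cpp
// Hypothetical 4-vector of doubles (32 bytes on typical platforms).
struct Vec4 {
    double v[4];
};

// Helper: adds up the four components of one vector.
static double total(const Vec4& x) {
    return x.v[0] + x.v[1] + x.v[2] + x.v[3];
}

// By value: each call may copy all four arguments before the body runs.
double sumByValue(Vec4 a, Vec4 b, Vec4 c, Vec4 d) {
    return total(a) + total(b) + total(c) + total(d);
}

// By const reference: no copies are made, and the compiler rejects any
// attempt to write through a, b, c, or d inside the body.
double sumByConstRef(const Vec4& a, const Vec4& b,
                     const Vec4& c, const Vec4& d) {
    return total(a) + total(b) + total(c) + total(d);
}
```

Both functions compute the same result; the const reference version just keeps value-like guarantees for the caller without the copies.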
 I'm just absolutely curious as to what on earth, _other than strings_, that 
 you'll be passing to other code that you don't want it to be modified. 

I do hardly any text processing.  I'm mostly passing vectors and matrices
and meshes and images around.  And most of those things are big enough that
you really don't want to make a copy unless you absolutely have to.  So
whether you plan to modify the data or not you're going to pass it around by
some sort of reference/pointer/handle.

--bb
Nov 12 2007
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Bill Baxter" <dnewsgroup billbaxter.com> wrote in message 
news:fhav0e$2stp$1 digitalmars.com...

 My other convention is to name ref parameters that are meant to be output 
 like  "ref Foo something_out".  I also end up making a lot of out 
 parameters be pointers instead of references, because then looking at the
 signature I know that it must be an out parameter, because there's no 
 other reason to pass it as a pointer.

Which would be a good reason to require out at the call site, like C#. No more pointer nonsense.
 Come on.  Code is poorly documented.  For 90% of open source projects out 
 there "better documentation" is in the top 5 on the TODO list.

:D
 3) I'm not a complete moron and actually name my functions after what 
 they do.  Face it, most people don't write functions with one name and 
 have them do something completely different, and if that's happening, 
 constness is not really going to help you with that.

Bad naming really has nothing to do with it.

It has something to do with it.  If you have a function like so:

    void drawImage(Image i, int x, int y) ...

And Image is a class type, do you expect images that you pass in to be
modified?  No, because that would be stupid.  It _draws an image_.  Making
this method take a const(Image) instead is kind of beating a dead horse.
Yes, now the compiler will enforce it but why would anyone in their right
mind modify the passed-in image anyway?  I've never made the stupid mistake
of going "Oh!  Why don't I modify the image data in the draw method?!  Makes
perfect sense to me!"

And if someone really really wanted to modify the image in drawImage, at
that point they have to have the source code, so there's nothing preventing
them from changing that const(Image) parameter to an Image.  So I *really*
don't see what const gets us here.
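
For reference, the const variant under discussion looks roughly like this when written in C++ instead of D. It is a minimal sketch under stated assumptions: the `Image` fields and the returned pixel count are invented for illustration, and the function does no real drawing.

```cpp
#include <vector>

// Hypothetical image type; the fields are assumptions for illustration.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<unsigned char> pixels;
};

// The image is taken by const reference: reading it is fine, but
// uncommenting the assignment below turns into a compile-time error.
int drawImage(const Image& img, int x, int y) {
    // img.pixels[0] = 0;   // error: img is const in this function
    (void)x;
    (void)y;
    // Stand-in for real drawing: report how many pixels would be drawn.
    return img.width * img.height;
}
```

The only thing the const version adds is that the commented-out line is rejected by the compiler rather than by convention.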
Nov 12 2007
Paul Findlay <r.lph50+d gmail.com> writes:
 It has something to do with it.  If you have a function like so:
 
 void drawImage(Image i, int x, int y) ...
 
 And Image is a class type, do you expect images that you pass in to be
 modified?  No, because that would be stupid.

http://projects.jkraemer.net/acts_as_ferret/ticket/181 (happens easily
since Ruby strings are like references to classes).

Having optional const may not have prevented the error, but if Ruby had
something like a const attribute I imagine I could have simply added it and
tracked down the modification by waiting for a compiler/runtime error rather
than tracing the whole flow of the code manually.

I don't really know, but I wish the D compiler had support for ensuring
things that could be const are, and that functions that can return null
references get their return value checked by their users.  I don't think it
should substitute for programming habits, but a compiler should be able to
do so much more grunt work.

- Paul
Nov 12 2007
"David B. Held" <dheld codelogicconsulting.com> writes:
Jarrett Billingsley wrote:
 [...]
 1) We're on the same team, and we have some sort of convention
 set up for this kind of thing.

Hahahahaha!!! I see you've never worked for a company with more than 5 people. The idea that you can get even 5 people in a team in a large company to agree on a "convention" is rather amusing. If you work in a company with at least 5 *teams*, you have another class of problem altogether.
 2) You're using a library that I wrote, and it's probably well-
 documented, or at least self-obvious, whether or not my function
 modifies your data. 

Wow, maybe you write libraries that way, but you should take a look around. 95% of software engineers don't. Most of the code I see, I'd be happy if there were *comments*, let alone documentation about mutability and constness.
 (Keep in mind that even if there _is_ const available and my function is
 not well-documented, I don't _have_ to use const, and so you don't have
 any guarantee anyway.)

I have the guarantee that if you use const, you can't hack my data, and if you don't use const, I don't have to use your code. At the worst, I will copy my data beforehand and curse you for the performance hit. And I can show you code bases where you will do this for your own sanity; and yes, the code will make you cry.
 3) I'm not a complete moron and actually name my functions after what
 they do.  Face it, most people don't write functions with one name and
 have them do something completely different, and if that's happening,
 constness is not really going to help you with that.

So you're in the middle of an app that executes a bunch of business logic, and the function is a hundred lines long and is called processFoo(Foo foo, Bar bar, Baz baz). Which of those arguments do you expect to get modified and why? Does this sound like an unreasonable name? Well, unreasonable or not, I see names like this a hundred times a day. I don't have the luxury of going around and renaming them as processFooModifiesFooButNotBarOrBaz(Foo foo, Bar bar, Baz baz).
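
A signature can carry that information without renaming anything. The following C++ sketch assumes that only `foo` is meant to change; the types, their fields, and the body are hypothetical stand-ins for real business logic.

```cpp
#include <string>

// Hypothetical business types; names and fields are assumptions.
struct Foo { int total = 0; };
struct Bar { int amount = 0; };
struct Baz { std::string label; };

// Only foo is writable here; bar and baz are read-only inside the body,
// and the signature alone tells the caller so.
void processFoo(Foo& foo, const Bar& bar, const Baz& baz) {
    foo.total += bar.amount + static_cast<int>(baz.label.size());
    // bar.amount = 0;   // would not compile: bar is const
}
```

A caller reading just the declaration can see which argument may be modified, however long the body is.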
 My big question is: how often do you pass some reference type into a 
 function and hope that it will/won't be modified?

Just about every time I write some code.
 [...]
 - Const refs to structs.  Why do these exist?  ref to get the compiler
 to pass the struct efficiently, and const to preserve value semantics. 
 Wouldn't this be better served by some other mechanism, such as the 
 optimizer?

Sure. And for the optimizer to tell that it can replace a copy with a const ref, what does it need to do? It needs to prove that there are no mutable operations on the object. I.e., it needs to prove that the object is const! Granted, once you have const, you could have the optimizer do these replacements for you, but clearly, the const-proving mechanism itself is essential.
 - Pointers.  Come on, don't use pointers.  ;)

Easier said in D than C++.
 - Class instances.  I'll get back to this.

Yeah, since classes only cover maybe 2% of user types in D.
 - Arrays of things other than characters.  OK, I can see a use for
 declaring an array as read-only, although most of the time I'm modifying
 my arrays left and right.

This depends entirely on the nature of the code you're writing. For some kinds of apps, in-place modification makes perfect sense. For others, you want to transform data from one form to another, and the input and output types are different, so there is no opportunity for in-place modification. In this case, you may want the input to be immutable so you can reuse it in a subsequent call.
 - Strings.  The single most common kind of array, and they even have
 special read-only literals built into the language.  It makes sense to
 provide read-only strings, if for nothing else efficiency.  Most of the
 time you're not going to be modifying the strings after you've done some
 machinations.

But clearly, nobody needs read-only vectors, because scientific computing *always* entails modifying your matrices, tensors, etc. And if they do, they can just translate them to read-only strings instead, which are faster.
 [...]
 - As a sort of 'struct on steroids', that is, just a data-holding class 
 instance with some methods.  These I guess are candidates for constness,
 but the incidence of these is so small and the functions which use them
 are so short and what they do is so obvious that I have no idea what
 gains I will get from using const.

Have you ever heard of a little thing called "Service-Oriented
Architecture"?  The idea there is to decompose a large distributed app into
independent components so that you have loose coupling and well-defined
interface boundaries, as well as nice opportunities for parallelization for
scaling.

Well, in SOA, all the data gets passed between services as messages.  And in
a business with big data objects, those messages can get rather large.  Most
of the time, there is no reason to manipulate the message itself (any more
than you need to manipulate a packet you received over a network).  When a
function needs to dig into one of these, there's no reason for it to be
non-const, and you may want to pass the message to multiple functions for
processing.

Any time you read a business object off a database for display to a user,
you don't want to mutate the object.  This kind of thing happens a *lot* in
business apps.  Things that happen to be very dynamic and stateful will have
more mutation (like games, simulations, etc.).  The rest are going to
benefit a lot from const.
 I'm just absolutely curious as to what on earth, _other than strings_,
 that you'll be passing to other code that you don't want it to be
 modified.

How about "just about everything"?

Dave
Nov 12 2007
Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
David B. Held Wrote:

 Jarrett Billingsley wrote:
 [...]
 1) We're on the same team, and we have some sort of convention
 set up for this kind of thing.

 Hahahahaha!!!  I see you've never worked for a company with more than 5
 people.  The idea that you can get even 5 people in a team in a large
 company to agree on a "convention" is rather amusing.  If you work in a
 company with at least 5 *teams*, you have another class of problem
 altogether.

 2) You're using a library that I wrote, and it's probably well-
 documented, or at least self-obvious, whether or not my function
 modifies your data. 

 Wow, maybe you write libraries that way, but you should take a look
 around.  95% of software engineers don't.  Most of the code I see, I'd be
 happy if there were *comments*, let alone documentation about mutability
 and constness.

 (Keep in mind that even if there _is_ const available and my function is
 not well-documented, I don't _have_ to use const, and so you don't have
 any guarantee anyway.)

 I have the guarantee that if you use const, you can't hack my data, and if
 you don't use const, I don't have to use your code.  At the worst, I will
 copy my data beforehand and curse you for the performance hit.  And I can
 show you code bases where you will do this for your own sanity; and yes,
 the code will make you cry.

 3) I'm not a complete moron and actually name my functions after what
 they do.  Face it, most people don't write functions with one name and
 have them do something completely different, and if that's happening,
 constness is not really going to help you with that.

 So you're in the middle of an app that executes a bunch of business logic,
 and the function is a hundred lines long and is called processFoo(Foo foo,
 Bar bar, Baz baz).  Which of those arguments do you expect to get modified
 and why?  Does this sound like an unreasonable name?  Well, unreasonable or
 not, I see names like this a hundred times a day.  I don't have the luxury
 of going around and renaming them as
 processFooModifiesFooButNotBarOrBaz(Foo foo, Bar bar, Baz baz).

I haven't tried it (mainly because it's a sick idea) but presumably you can
get ersatz constness with a contract:

    pre  { signature sigX = md5sum(X); }
    foo(/* const */ X) { ... }
    post { assert(sigX == md5sum(X)); }

This is a bit sick because it's a run-time check.  Might be useful for unit
tests.
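
The checksum idea above can be tried out in C++ as a test helper. This is a sketch under stated assumptions, not the contract syntax from the post: `checksum` uses an FNV-1a hash as a stand-in for md5sum, and `leavesUnchanged` is a hypothetical name. Matching hashes only suggest the data was untouched; they do not prove it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// FNV-1a hash over the bytes of a vector<int>; stands in for md5sum.
std::uint64_t checksum(const std::vector<int>& data) {
    std::uint64_t h = 14695981039346656037ULL;
    for (int value : data) {
        const unsigned char* p =
            reinterpret_cast<const unsigned char*>(&value);
        for (std::size_t i = 0; i < sizeof(value); ++i) {
            h ^= p[i];
            h *= 1099511628211ULL;
        }
    }
    return h;
}

// Hashes data, runs f on it, and reports whether the hash is the same
// afterwards. This is a run-time check, so it only covers the calls
// that actually get executed.
template <typename F>
bool leavesUnchanged(std::vector<int>& data, F f) {
    const std::uint64_t before = checksum(data);
    f(data);
    return checksum(data) == before;
}
```

In a unit test, `leavesUnchanged(v, someFunction)` would be asserted for every function that is supposed to treat its argument as read-only.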
 
 But clearly, nobody needs read-only vectors, because scientific 
 computing *always* entails modifying your matrices, tensors, etc.  And 
 if they do, they can just translate them to read-only strings instead, 
 which are faster.

Another case is when you are slicing.  I have seen applications in C++ where
successive filters are applied to a vector-like container.  In fact each
filter creates a new slice of the data.  (The structure is actually a tree
so D array slices wouldn't work).  That would make a good use case for
iterators and opStar/opDeref/opSlice.

Regards,

Bruce.
Nov 13 2007
Robert Fraser <fraserofthenight gmail.com> writes:
Janice Caron Wrote:

 I like to know that when I pass /my/ data to /your/ function, then
 your function is not going to mess with my data. const in the
 declaration of your function is what gives me that guarantee.

That's a rather paranoid approach.  I never worry about that in Java, or D1,
or any other language without const.  That information belongs in the
documentation, rather than wasting the programmer's valuable time dealing
with obscure const bugs and typing it all over the place.
Nov 12 2007
Robert Fraser <fraserofthenight gmail.com> writes:
Jarrett Billingsley Wrote:

 What my thoughts boiled down to is that constness seems useful for 
 strings and not much else.  I suppose it could be useful for other kinds 
 of arrays, but overwhelmingly the use cases for const seem to be for 
 strings, and constness helps to make some operations with strings more 
 efficient.  Other than that, constness for other types seems more like a 
 logical convenience.  So you want to return a const(Foo) where Foo is a 
 class reference?  How often do you need to do that?  Passing const refs 
 -- have YOU ever had a bug where you tried to modify a const ref param 
 that was caught by the const?

Coming from a Java background, I agree with you quite a bit.  However, I
have run into one case where having const in Java would help a lot (although
I think if I can get my debugger to set a modification watchpoint, it'll be
a non-issue).  Somewhere in the Descent semantic analysis, some static
members that are not supposed to change are somehow being modified.  Since
the codebase is large, figuring out exactly where that's happening is
proving problematic.

That said, I agree -- having a const string would be good enough for
software engineering purposes for me.  But I think Walter wants to be able
to optimize based on invariantness.
Nov 12 2007
Walter Bright <newshound1 digitalmars.com> writes:
Robert Fraser wrote:
 That said, I agree -- having a const string would be good enough for
 software engineering purposes for me. But I think Walter wants to be
 able to optimize based on invariantness.

I also want to support functional programming, and without invariants, that's dead in the water.
Nov 12 2007
"Janice Caron" <caron800 googlemail.com> writes:
In C++ (and, I believe, D), it is possible for a statement like

    a = b;

to modify b. That's because the argument to operator=() or opAssign()
could be a non-const reference. You could argue that this would be a
silly thing to do, but auto_ptr<T> does that by design.

Likewise, it is possible in D for

    a[n] = b;

to modify b (and even n). You could argue that a function /shouldn't/
do stuff like that - but what if it does it by accident? As in,
because of a bug? const is your only guarantee that that won't happen.
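
A small C++ sketch of the first case: an assignment operator that takes a non-const reference and empties its right-hand side. `Handle` is a hypothetical class, loosely modelled on the ownership transfer that auto_ptr<T> performed; it is not the auto_ptr implementation.

```cpp
// Hypothetical owning handle that transfers ownership on assignment.
class Handle {
public:
    explicit Handle(int* p = nullptr) : ptr_(p) {}
    ~Handle() { delete ptr_; }
    Handle(const Handle&) = delete;

    // Non-const reference parameter: assignment steals from the source.
    Handle& operator=(Handle& other) {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;   // `a = b` has just modified b
        }
        return *this;
    }

    bool empty() const { return ptr_ == nullptr; }
    int value() const { return *ptr_; }   // caller must check empty() first

private:
    int* ptr_;
};
```

After `a = b;`, `b.empty()` is true. If `b` were declared `const Handle`, the same statement would fail to compile, because a const object cannot bind to the `Handle&` parameter.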
Nov 12 2007