## digitalmars.D - pure functions without invariant (a get-rid-of-const proposal)

• hasen (202/202) Mar 08 2009 The line of thinking for introducing const/invariant seems to go like th...
• Andrei Alexandrescu (17/46) Mar 08 2009 That's not the line of reasoning. Invariant data has other uses - it is
• BCS (31/109) Mar 08 2009 what about an "immutable int" are you proposing that it be a totally sep...
• Frits van Bommel (27/31) Mar 08 2009 I'm pretty much in agreement with Andrei and BCS here, so I'll just
hasen <hasan.aljudy gmail.com> writes:
```The line of thinking for introducing const/invariant seems to go like this:

- D needs to be suitable for parallelism
- let's introduce pure functions
- pure functions need to accept immutable data only
- let's introduce const!! (invariant)

OK, I can agree with most of this, except for the last part! It's not
necessary that invariant is needed to introduce pure functions into D.
There must be better ways.

First of all, one could argue, why? what's wrong with invariant?

Well, there are a number of things that are wrong with it:

First of all, immutable data is a concept, implementing it as
"invariant" seems kind of hackish. By definition, immutable data doesn't
change. Forcing invariant on top of a mutable type is like saying "This
data would usually change, but please, for this instance, make it not
un-changeable!"
This is a HACK. Immutable types should have immutability be a
fundamental part of the type, not some kind of a glue on top of it; a
flag that's set on or off!

For instance, it doesn't make sense to have an invariant pointer to
invariant pointer to invariant data. If the data is invariant, then why
access it through pointers at all?

When data is immutable, then all you need is the data! You shouldn't
need to know where is it stored.

Invariant pointers just don't make sense!

Another thing to consider. When you have an object (in the oop sense of
the term), and you mark it as immutable, what's the point? The whole
point of objects is to encapsulate state that could change. If the
object's state doesn't change at all, then it becomes much less
interesting as an object. All you need then is the data in that object,
and not the object itself.

Invariant (oop) objects don't make sense.

You're trying to glue completely different things together and hope that
they stick.

I see this as similar to C++ trying to be "Object Oriented" without
automatic memory management, and it just doesn't make sense! They put
high level concepts in a low level language with no consideration to how
people actually need to use OOP. (They can argue all day long about how
they have inheritance and polymorphism!)

Similarly for D2, are we considering how people actually need to use
immutable data in their code?

Wouldn't it make much more sense, to have immutable data-types (such as
python tuples, and python strings), and remove the "invariant" concept
all together?

Let's step back and look at what we're trying to achieve again, and
think of a different way to implement it.

We want to have pure functions, but those have some restrictions!
A pure function's result can only depend on its input parameters (so
cannot have side effects)
- doesn't perform I/O operations
- doesn't change input parameters (if they are passed by reference, that is)

It's clear that we need immutable types, so, just introduce immutable
types, but would they be like?
I suggest looking at python.

Consider for instance, python strings (let's call them python-style
strings from here on):
-immutable type
-no `a[b] = c` operation defined on string a
-`a + b` returns a completely new string
-not implemented in terms of an underlying `char` type

We can use it as an inspiration for the string type in D2 (excluding the
"no underlying char type" part)

We can also extend the concept and introduce immutable lists
-similar to python-style strings (see above)
-python-style string can be implemented as an immutable list
- while we're at it, string slice and index operations should
operate on characters, not bytes!
- I actually prefer that string is an actual type, not just
an alias
-the elements of the list must also be immutable.
- immutable list to mutable data doesn't make sense (wouldn't
be used or needed in practice)

Immutable lists don't need an `invariant` keyword, they can be defined
by i[ ] instead of [ ] (where `i[` is a proper token, like `!(` and
`r"`). This ties strongly to the idea that immutability is a fundamental
part of the type, not just some kind of annotation.

The other immutable type we need is a clustering of data, similar to
tuples in python.
I'm proposing here "immutable struct objects"
- based on struct
- internal state is initialized in constructor, never changed anywhere
- have reference semantics (no value semantics at all, similar to
python strings)
- all operations simply return a new object
- fields can only be of immutable types or native types with value
semantics (no pointers)

for a practical example, consider a 3d vector:

istruct vector //istruce is an immutable struct type
{
real x = 0; //ok, initialization behavior
real z = 0;
real y = 0;
this( real px, real py, real pz )
{
//constructor can initialize fields
x = px;
y = px;
z = pz;
}

real length() { return sqrt( x*x + y*y + z*z ); } //doesn't change
any state (doesn't need to, anyway)

vector normalized()
{
real len = length();
return vector( x/len, y/len, z/len ); //returns a new object
//no need for "new", as immutable structs have no value
semantics, only reference semantics
//so "new" is implicit
}

{
return vector( x + other.x, y + other.y, z + other.z );
//returns a new object
}

//..etc..
}
//usage example
auto a = vector(1,2,3);
auto b = vector(5,4,3);
auto c = a.cross(b); //neither a nor b is changed

Why should you consider this proposal?

Well, for one thing, this is how things work in functional programming
(AFAIK), so this approach sticks to the spirit of functional languages.
What I mean is, this how data types are like! (I can only speak about

Also, This approach is more natural.
Immutable types don't have a state to change by their nature, where as
if you annotate some type as invariant, you're saying that this object
could potentially change, but shouldn't for this instance.
This creates confusion, consider when a type that's mutable by
definition, suddenly needs to be passed to pure functions, and you're in
for trouble! Welcome to the invariant virus!
If you need this object to be immutable, then why base it on a mutable
type at all?

The suggested approach presented here removes much complication from the
type system!
I mean, just what the hell is `invariant(char*)** p`??? why would anyone
pass such a thing to a pure function?

There's an issue I want to brush on, that is:

Pure function cannot depend on global state, but probably one exception
can be allowed:
- if the global state is of a native or immutable type
- if it's a constant
- it can't be initialized by an impure function
Consider if the variable is initialized to `now()`? If a pure
then it's getting another input, which varies depending on when
the program was run.
- corollary: pure functions cannot use:
- global pointers
- global (normal/oop) objects
Essentially, this global data is not a state at all, it's simply an
alias for a value.
e.g. mathematical PI

Now, it's easier to define (and check) pure functions:

A pure function needs to satisfy the following:
- All parameters are either of native types passed by value, or of
immutable type (passed by reference)
- doesn't read global state unless const (as stated above)
- doesn't perform IO

There are things that pure functions can do (I think, but someone
correct me if I'm wrong please!)

Pure functions can create and modify local state (as long as it's local
to the function)
This entails that pure function can call impure functions if:
- the impure function doesn't read or change global state in an
impure-way (again, as stated above)
- the impure function doesn't perform I/O

Let's call such functions "semi-pure" (def: impure functions that are
callable from within pure functions)

This applies to all kinds of expressions, as they can all be reduced to
functions. For instance, newing objects can be reduced to calling their
constructors, which are functions that can be checked for
pureness/semi-pureness.

When the coder marks a function as "pure", the compiler must check that
it actually is pure.
I think the compiler can do this, in a way similar to how it can detect
which functions can have compile time function execution (CTFE)

Probably the procedure can be something like this:
- functions can be marked (by the compiler) as one of four: pure,
semi-pure, impure, unknown
- At first, all functions are marked unknown
- Three passes:
Pass1: detect impure functions:
- functions that read global state in an impure way
- functions that perform I/O
- functions that call other impure functions (I'm not sure how
simple or hard this can be)
Pass2: detect semi-pure functions:
- unknown functions that accept parameters of mutable types
- please note: only unknown functions are checked at this pass
Pass3: check pure functions:
- check functions marked by the programmer as "pure"
- are they already marked impure or semi-pure? issue an error
- do they call impure functions? issue an error
- if they call semi-pure functions, it's ok.
- if you're here, you passed the pureness test!

There are probably some quirks here and there about what constitutes a
proper semi-pure function,
but I'm sure it can essentially be worked out, without resorting to
`invariant`.
```
Mar 08 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```hasen wrote:
The line of thinking for introducing const/invariant seems to go like this:

- D needs to be suitable for parallelism
- let's introduce pure functions
- pure functions need to accept immutable data only
- let's introduce const!! (invariant)

That's not the line of reasoning. Invariant data has other uses - it is
data that can be shared between threads without prejudice. Pure
functions themselves are ok with any inputs, and put "const" in the
parameter types to clarify to the caller they won't change the arguments.

OK, I can agree with most of this, except for the last part! It's not
necessary that invariant is needed to introduce pure functions into D.
There must be better ways.

This is a non-problem. Invariant is not needed to introduce pure
functions into D at all.

First of all, one could argue, why? what's wrong with invariant?

Well, there are a number of things that are wrong with it:

First of all, immutable data is a concept, implementing it as
"invariant" seems kind of hackish. By definition, immutable data doesn't
change. Forcing invariant on top of a mutable type is like saying "This
data would usually change, but please, for this instance, make it not
un-changeable!"
This is a HACK.

On the contrary, it's a very principled approach. Const types are proper
supertypes of their mutable counterparts. Immutable types are proper
theory of type qualifiers" by Jeff Foster.

Immutable types should have immutability be a
fundamental part of the type, not some kind of a glue on top of it; a
flag that's set on or off!

That is quite what's happening - the flag is kept during compilation though.

For instance, it doesn't make sense to have an invariant pointer to
invariant pointer to invariant data. If the data is invariant, then why
access it through pointers at all?

All interesting data has indirection. You couldn't have an immutable
list without indirection.

When data is immutable, then all you need is the data! You shouldn't
need to know where is it stored.

Invariant pointers just don't make sense!

Of course they do. You may want to rethink your argument. It rests on
invalid premises and inevitably can't go interesting places.

Andrei
```
Mar 08 2009
BCS <none anon.com> writes:
```Hello hasen,

The line of thinking for introducing const/invariant seems to go like
this:

- D needs to be suitable for parallelism
- let's introduce pure functions
- pure functions need to accept immutable data only
- let's introduce const!! (invariant)
OK, I can agree with most of this, except for the last part! It's not
necessary that invariant is needed to introduce pure functions into D.
There must be better ways.

First of all, one could argue, why? what's wrong with invariant?

Well, there are a number of things that are wrong with it:

First of all, immutable data is a concept, implementing it as
"invariant" seems kind of hackish. By definition, immutable data
doesn't
change. Forcing invariant on top of a mutable type is like saying
"This
data would usually change, but please, for this instance, make it not
un-changeable!"

That is, in fact, exactly what I think people want.

This is a HACK. Immutable types should have immutability be a
fundamental part of the type, not some kind of a glue on top of it; a
flag that's set on or off!

what about an "immutable int" are you proposing that it be a totally separate
type?

For instance, it doesn't make sense to have an invariant pointer to
invariant pointer to invariant data. If the data is invariant, then
why access it through pointers at all?

Because arrays /are/ pointers because objects are pointers, because ref
parameters
are pointers, because passing large structs by value is to expensive and
the alternative is a pointer.

OK, I'll admit that all but the last case are not pointer at the code level
but I'm not seeing any way your argument would work work that doesn't also
apply to those cases.

When data is immutable, then all you need is the data! You shouldn't
need to know where is it stored.

pointers are a way to access data, not a way to tell you where it is.

Invariant pointers just don't make sense!

they make a lot of sence to me.

Another thing to consider. When you have an object (in the oop sense
of the term), and you mark it as immutable, what's the point? The
whole point of objects is to encapsulate state that could change. If
the object's state doesn't change at all, then it becomes much less
interesting as an object. All you need then is the data in that
object, and not the object itself.

IIRC, the point of objects is to encapsulate implementation that can change.

Invariant (oop) objects don't make sense.

they make a lot of sence to me.

You're trying to glue completely different things together and hope
that they stick.

I see this as similar to C++ trying to be "Object Oriented" without
automatic memory management, and it just doesn't make sense! They put
high level concepts in a low level language with no consideration to
how people actually need to use OOP. (They can argue all day long
about how they have inheritance and polymorphism!)

Similarly for D2, are we considering how people actually need to use
immutable data in their code?

I'm reasonably sure they have.

Wouldn't it make much more sense, to have immutable data-types (such
as python tuples, and python strings), and remove the "invariant"
concept all together?

There are to make cases where the data that will be wanted immutable is of
a type (that already exists) that is also wanted in a mutable form.

Let's step back and look at what we're trying to achieve again, and
think of a different way to implement it.

We want to have pure functions, but those have some restrictions!
A pure function's result can only depend on its input parameters (so
cannot have side effects)

wrong, it can access anything (including globals) that can be proven won't
change (that is they are immutable)

- doesn't perform I/O operations
- doesn't change input parameters (if they are passed by reference,
that is)
It's clear that we need immutable types, so, just introduce immutable
types, but would they be like?

Based on your comment about globals it is not clear that immutables are needed.
Based on the above, all that would be needed is const.

[...]

- while we're at it, string slice and index operations should
operate on characters, not bytes!

That's a side issue, but indexing char's is an O(n) operation.

- I actually prefer that string is an actual type, not
just
an alias
-the elements of the list must also be immutable.
- immutable list to mutable data doesn't make sense (wouldn't
be used or needed in practice)

You, me and walter aggree on that.

[...]

After reading this far and skimming the rest my only question is "WHY?".
I see exactly zero benefit to your proposal. You have defined a system that
has all the problems of the current one and more. If we were to switch to
it I expect that we would end up with a common idiom of defining two almost
identical types where one is immutable and the other is not.

*What does your system give me that the current system does not?*
```
Mar 08 2009
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
```I'm pretty much in agreement with Andrei and BCS here, so I'll just
comment on this bit:

hasen wrote:
When the coder marks a function as "pure", the compiler must check that
it actually is pure.
I think the compiler can do this, in a way similar to how it can detect
which functions can have compile time function execution (CTFE)

The DMD[1] doesn't actually check whether it can execute a function at
compile time, it just tries to do it and errors out if it comes across
something it can't handle.

For example, this compiles and prints '100':
-----
module test;

uint y = 100;
uint* x = &y;

return *x;
return 100;
}

// Doesn't compile if changed to foo(true):
// test.d(14): Error: cannot evaluate foo(true) at compile time
auto bar = foo(false);

void main() {
printf("%d\n", bar);
}
-----
But as the comment states, changing the parameter to true makes it error
out because it can't do CTFE on foo(true), even though it managed fine
with foo(false).

[1]: and LDC/GDC, which use the same frontend.
```
Mar 08 2009