digitalmars.D - Algorithms should be free from rich types


Ali Çehreli <acehreli yahoo.com> writes:

My mind is not fully clear on this topic yet but some related things 
have been brewing in me for years.

First, an aside: You may remember my minor complaint about 'private' 
during a DConf presentation years ago. Today, I feel even stronger that 
disallowing access to parts of software "just because" of good design is 
a mistake. I've seen multiple examples of this in professional life 
where a developer uses 'private' only because it is "of course" better 
to do so. (The Turkish word "işgüzar" and the German word 
"verschlimmbessern" describe the situation pretty well for me but the 
English language lacks such a word.)

To give an example from D's ecosystem, the D runtime's garbage collector 
statistics object used to be 'private'. (I think there is an interface 
for it now.) What an inconvenience it was to copy/paste that type's 
definition from the runtime to user code, get the compiled symbol of the 
object from the library, and pointer cast it to be able to access the 
members! A 'static assert' attempts to protect the project from changes 
to that type...

The idea of 'private' should be to just give the developer freedom to 
change the implementation in the future. It should not impede use cases 
that people come up with. That can be achieved practically with an 
underscore: Make everything 'public' and name your implementation 
details with an underscore. People who need them will surely know they 
are implementation details that can change in the future but they will 
be happy: They will get things done.
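
A minimal sketch of the underscore convention described above. `Counter` and its members are hypothetical names invented for illustration; nothing here comes from a real library.

```d
import std.stdio : writeln;

// Hypothetical type, for illustration only.
struct Counter
{
    // Supported API.
    void increment() { ++_count; }
    int value() const { return _count; }

    // Public on purpose: the leading underscore marks this as an
    // implementation detail that may change without notice.
    int _count;
}

void main()
{
    Counter c;
    c.increment();
    c._count += 10;      // allowed: the caller knowingly reaches inside
    writeln(c.value());  // 11
}
```

Nothing stops the caller from touching `_count`; the name alone signals that doing so is at their own risk.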

Ok, that rant is over.

The main topic here is about the harm caused by rich types surrounding 
algorithms. Let's say I am interested in using an open source algorithm 
that works with a memory area. (Not related to D.) We all know that a 
memory area can be described by a fat pointer like D's slices. So, that 
is what the algorithm should take.

Unfortunately, the poor little algorithm is not free to be used: It is 
written to work with a custom type of that library; let's call it 
MySlice, which is produced by MyMemoryMappedFile, which is produced by 
MyFile, which is initialized only by types like MyFilePath. (I may have 
gotten the relationships wrong there.)

But my data is already in a memory area that I own! How can I call that 
algorithm? Should I write it to a file first and then use those rich 
types to access the algorithm? That should not be necessary...

Of course I understand the benefits of all those types but the core 
algorithm should be as free as possible. So, this is simply wrong. I 
think we software developers have been on the wrong path. Our task 
should primarily be about getting things done first.

I could work with those types if they had virtual interfaces. But no. 
They are un-subtypable C++ 'class'es.

I think it could also work if the algorithm was templatized; but again, 
no...
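
A rough sketch of what "the algorithm takes a slice" could look like in D. The library in question is not a D library, so `checksum` and this `MySlice` are hypothetical stand-ins, not its real API.

```d
import std.stdio : writeln;

// The core algorithm depends only on a slice (pointer + length).
// `checksum` is a hypothetical stand-in for the real algorithm.
uint checksum(const(ubyte)[] data) pure nothrow @nogc @safe
{
    uint sum = 0;
    foreach (b; data)
        sum = sum * 31 + b;
    return sum;
}

// A hypothetical rich type can still add convenience on top,
// without the algorithm knowing anything about it.
struct MySlice
{
    const(ubyte)[] bytes;
    uint checksum() const { return .checksum(bytes); }
}

void main()
{
    ubyte[] mine = [1, 2, 3];           // memory the caller already owns
    writeln(checksum(mine));            // no file, no MySlice needed
    writeln(MySlice(mine).checksum());  // the rich path gives the same result
}
```

The rich types then become an optional layer on top of the free function instead of the only way in.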

Hey! Thank you! I feel better already. :)

Ali

Jun 27 2023

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Tue, Jun 27, 2023 at 02:53:59PM -0700, Ali Çehreli via Digitalmars-d wrote:
[...]
 First, an aside: You may remember my minor complaint about 'private'
 during a DConf presentation years ago. Today, I feel even stronger
 that disallowing access to parts of software "just because" of good
 design is a mistake. I've seen multiple examples of this in
 professional life where a developer uses 'private' only because it is
 "of course" better to do so. (The Turkish word "işgüzar" and the
 German word "verschlimmbessern" describe the situation pretty well for
 me but the English language lacks such a word.)

I can't resist me a Walter quote here:

	I've been around long enough to have seen an endless parade of
	magic new techniques du jour, most of which purport to remove
	the necessity of thought about your programming problem.  In the
	end they wind up contributing one or two pieces to the
	collective wisdom, and fade away in the rearview mirror. --
	Walter Bright

When you start doing something with the code because that's what
everybody else does, or because it's what everyone else says is "the
Right Thing(tm)", then it's just cargo-culting, which inevitably leads
to problems down the road.


 To give an example from D's ecosystem, the D runtime's garbage
 collector statistics object used to be 'private'. (I think there is an
 interface for it now.) What an inconvenience it was to copy/paste that
 type's definition from the runtime to user code, get the compiled
 symbol of the object from the library, and pointer cast it to be able
 to access the members! A 'static assert' attempts to protect the
 project from changes to that type...

Thing is, things like these usually come from temporary hacks in the
code that the original coder didn't want to set in stone, but that end
up staying put because of inertia and becoming de facto set in stone.


 The idea of 'private' should be to just give the developer freedom to
 change the implementation in the future. It should not impede use
 cases that people come up with. That can be achieved practically with
 an underscore: Make everything 'public' and name your implementation
 details with an underscore.  People who need them will surely know
 they are implementation details that can change in the future but they
 will be happy: They will get things done.

IOW, empower the user instead of straitjacketing them. My favorite
programming modus operandi. Along the same lines as my philosophy of
"everything should be a library, main() is just a convenient (thin)
interface to access the library API".


[...]
 The main topic here is about the harm caused by rich types surrounding
 algorithms. Let's say I am interested in using an open source
 algorithm that works with a memory area. (Not related to D.) We all
 know that a memory area can be described by a fat pointer like D's
 slices. So, that is what the algorithm should take.

 Unfortunately, the poor little algorithm is not free to be used: It is
 written to work with a custom type of that library; let's call it
 MySlice, which is produced by MyMemoryMappedFile, which is produced by
 MyFile, which is initialized only by types like MyFilePath. (I may
 have gotten the relationships wrong there.)

That's a sign of poorly-factored code. The logically-separate parts of
the code are not properly separated out, causing them to be dependent on
each other where they technically should not be.  Doing this right is
actually a lot harder than it looks; it often requires significant
amounts of refactoring after your initial implementation, because until
you write the thing out in code, it isn't always clear which parts are
actually dependent and which parts can be separated.

Idioms like pipeline programming with ranges help to identify
independent pieces of the logic, and abstractions like the range API
help you actually separate out the pieces in a clean way. Without a
unifying common API like ranges, it's pretty tough to write code in
composable pieces that can be freely mixed-and-matched with each other.

	https://wiki.dlang.org/Component_programming_with_ranges

Well, obviously you already know about this article, but one of my
motivations for writing that article was precisely what you describe
above.
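
As a small illustration of the idea (a sketch, not taken from the article): the function below is a hypothetical example whose stages only assume the range API, so the same pipeline runs over a lazily generated sequence or a plain array.

```d
import std.algorithm : filter, map, sum;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical pipeline: each stage knows only the range API,
// not the concrete type that feeds it.
auto sumOfEvenSquares(R)(R numbers)
{
    return numbers
        .filter!(n => n % 2 == 0)
        .map!(n => n * n)
        .sum;
}

void main()
{
    writeln(sumOfEvenSquares(iota(1, 11)));   // lazy 1..10: 220
    writeln(sumOfEvenSquares([1, 2, 3, 4]));  // plain array: 20
}
```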


 But my data is already in a memory area that I own! How can I call
 that algorithm? Should I write it to a file first and then use those
 rich types to access the algorithm? That should not be necessary...
 
 Of course I understand the benefits of all those types but the core
 algorithm should be as free as possible. So, this is simply wrong. I
 think we software developers have been on the wrong path. Our task
 should primarily be about getting things done first.

Over the years, I've been dreaming about the ideal situation where
there would be libraries of algorithms that are not tied to a specific
implementation (i.e., bound to concrete types and parameter values), but
are written in a form that encapsulates only its core logic.  You'd then
pull in the algorithm by specifying which concrete type(s) to bind its
various parts to, and it'd Just Work(tm).  That's the way things should
have been from the beginning.

But the situation today is far from that ideal: you have libraries that
solve some particular programming problem X, but to use the library's
solution you need to use also Y, Z, and W that the author of that
library happened to choose. For instance, the FreeType library
implements rasterization algorithms, but you can't access those
algorithms directly. You have to use the library API, which abstracts
away file handling, memory management, image type, etc. In order to
cater to different user needs, an entire complicated API is invented to
allow the user to specify certain parameters the authors deem tweakable,
while an elaborate scheme is designed to hide the rest of the
information away. You can't effectively use the rasterization algorithm
without also using all of these other peripheral types; and when you
need to interface FreeType with another library that uses other,
different concrete types, you end up having to write lots of shunt code
whose sole purpose is to bridge between incompatible types that actually
do equivalent things.


 I could work with those types if they had virtual interfaces. But no.
 They are un-subtypable C++ 'class'es.
 
 I think it could also work if the algorithm was templatized; but
 again, no...

[...]

In cases like this, I often get really tempted to copy-n-paste the code
and templatize it myself. :-D  Of course, in practice that's usually
impractical, so the next best thing is to use D's compile-time
introspection capabilities to autogenerate boilerplate shunt code to
work around API infelicities in the target library, and export a nicer
API on the D side. :-D  Not always possible, of course, like in your
case, where you'd have to either copy-n-paste code and do un-@safe
casts, or live with infelicities like writing stuff to a file and
opening it via the official API.
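
To make the "autogenerate boilerplate shunt code" idea concrete, here is a minimal sketch. `LibAPoint`, `LibBPoint` and `shunt` are hypothetical names; the assumption is that the two types hold equivalent data under the same field names.

```d
import std.stdio : writeln;
import std.traits : FieldNameTuple;

// Two hypothetical libraries, each with its own "point" type.
struct LibAPoint { int x; int y; }
struct LibBPoint { int y; int x; }

// Copies every field of To from the same-named field of From.
// The loop is unrolled at compile time, so no per-type glue is written by hand.
To shunt(To, From)(From from)
{
    To to;
    static foreach (name; FieldNameTuple!To)
    {
        static assert(__traits(hasMember, From, name),
            From.stringof ~ " has no field named " ~ name);
        __traits(getMember, to, name) = __traits(getMember, from, name);
    }
    return to;
}

void main()
{
    auto a = LibAPoint(3, 4);
    auto b = shunt!LibBPoint(a);
    writeln(b.x, " ", b.y);  // 3 4
}
```

This only covers the easy case of plain data; bridging types with invariants or hidden state still needs hand-written code.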

(I had to do something similar once in my day job, interfacing with a
grossly over-engineered C++ framework that nobody fully understood nor
wanted anything to do with if they could help it -- I ended up having to
write a hack where a single function call involved 7 layers of
abstraction, one of which involved writing a struct to a temporary file
on one side of an RPC call and having the other side (a daemon process)
read from the file and cast it back to the struct.  The result was the
stuff of nightmares that, to everyone's great relief, was phased out a
couple of releases later. We relished every moment of typing `\rm -rf`
on that entire old codebase after its replacement became fully
functional.)


T

-- 
2+2=4. 2*2=4. 2^2=4. Therefore, +, *, and ^ are the same operation.

Jun 27 2023

FeepingCreature <feepingcreature gmail.com> writes:

On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related 
 things have been brewing in me for years.

 First, an aside: You may remember my minor complaint about 
 'private' during a DConf presentation years ago. Today, I feel 
 even stronger that disallowing access to parts of software 
 "just because" of good design is a mistake. I've seen multiple 
 examples of this in professional life where a developer uses 
 'private' only because it is "of course" better to do so. (The 
 Turkish word "işgüzar" and the German word "verschlimmbessern" 
 describe the situation pretty well for me but the English 
 language lacks such a word.)

 To give an example from D's ecosystem, the D runtime's garbage 
 collector statistics object used to be 'private'. (I think 
 there is an interface for it now.) What an inconvenience it was 
 to copy/paste that type's definition from the runtime to user 
 code, get the compiled symbol of the object from the library, 
 and pointer cast it to be able to access the members! A 'static 
 assert' attempts to protect the project from changes to that 
 type...

 The idea of 'private' should be to just give the developer 
 freedom to change the implementation in the future. It should 
 not impede use cases that people come up with. That can be 
 achieved practically with an underscore: Make everything 
 'public' and name your implementation details with an 
 underscore. People who need them will surely know they are 
 implementation details that can change in the future but they 
 will be happy: They will get things done.

I like this approach:

```
class C {
     private int i;
}
...
void main() @system {
     auto c = new C;
     c.private.i = 5;
}
```

Jun 28 2023

Ali Çehreli <acehreli yahoo.com> writes:

On 6/28/23 01:00, FeepingCreature wrote:

      auto c = new C;
      c.private.i = 5;

I love it. And I actually tried but no, D does not have this yet. :D

Ali

Jun 28 2023

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

Oh how you dare me.

--- app.d
module app;
import foo;
void main()
{
     Foo foo = new Foo;
     foo.privateGet!"i" = 2;
     foo.say();
}

ref privateGet(string name, From)(ref From from) {
     static foreach(I; 0 .. from.tupleof.length) {
         {
             enum Name = __traits(identifier, from.tupleof[I]);

             static if (Name == name) {
                 return from.tupleof[I];
             }
         }
     }

     assert(0);
}

--- foo.d
module foo;
class Foo {
     void say() {
         import std.stdio;
         writeln(i);
     }

private:
     int i;
     bool b;
}

Jun 28 2023

Adam D Ruppe <destructionator gmail.com> writes:

On Wednesday, 28 June 2023 at 17:06:43 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Oh how you dare me.


just do

__traits(getMember, foo, "i") = 2;


reflection bypasses private
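
A single-file sketch of the syntax only. D's `private` is module-level, so inside one file this compiles either way; whether the same line is accepted from another module is the claim made above and may depend on the compiler version. `Foo` mirrors the class from the previous post, with a hypothetical `get` accessor added.

```d
import std.stdio : writeln;

class Foo
{
    private int i;
    int get() const { return i; }  // hypothetical accessor, for checking
}

void main()
{
    auto foo = new Foo;
    // The member is selected by a compile-time string instead of `foo.i`.
    __traits(getMember, foo, "i") = 2;
    writeln(foo.get());  // 2
}
```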

Jun 28 2023

bachmeier <no spam.net> writes:

On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature wrote:

 I like this approach:

 ```
 class C {
     private int i;
 }
 ...
 void main() @system {
     auto c = new C;
     c.private.i = 5;
 }
 ```

This would be a good change to the language.

Jun 28 2023

Cecil Ward <cecil cecilward.com> writes:

On Wednesday, 28 June 2023 at 17:40:25 UTC, bachmeier wrote:
 On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature 
 wrote:

 I like this approach:

 ```
 class C {
     private int i;
 }
 ...
 void main() @system {
     auto c = new C;
     c.private.i = 5;
 }
 ```

 This would be a good change to the language.

I’m not sure, but I’m thinking ‘yes’.

Jun 28 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 6/28/23 4:00 AM, FeepingCreature wrote:

 I like this approach:
 
 ```
 class C {
      private int i;
 }
 ...
 void main() @system {
      auto c = new C;
      c.private.i = 5;
 }
 ```
 

```d
auto usePrivate(T)(ref T thing) @system
{
    static struct GetMeThePrivateStuff
    {
        // shouldn't be copied about, meant to be a temporary access
        @disable this(this);
        private T* _thing; // "private" lol
        auto ref opDispatch(string s, Args...)(Args args)
        {
            static if(Args.length == 0)
                return __traits(getMember, *_thing, s);
            else
                return __traits(getMember, *_thing, s)(args);
        }
    }

    return GetMeThePrivateStuff(&thing);
}
```

Yeah, yeah, it needs work. But you get the idea. D is all-powerful.

-Steve

Jun 29 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 6/29/23 10:15 PM, Steven Schveighoffer wrote:

 
 ```d
 auto usePrivate(T)(ref T thing) @system
 {
     static struct GetMeThePrivateStuff
     {
         // shouldn't be copied about, meant to be a temporary access
         @disable this(this);
         private T* _thing; // "private" lol
         auto ref opDispatch(string s, Args...)(Args args)
         {
             static if(Args.length == 0)
                 return __traits(getMember, *_thing, s);
             else
                 return __traits(getMember, *_thing, s)(args);
         }
     }
 
     return GetMeThePrivateStuff(&thing);
 }
 ```

Oh wait, the `__traits(getMember)` trick doesn't work on member 
functions, interesting...

So maybe half-powerful ;)

-Steve

Jun 29 2023

Max Samukha <maxsamukha gmail.com> writes:

On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related 
 things have been brewing in me for years.

 Unfortunately, the poor little algorithm is not free to be 
 used: It is written to work with a custom type of that library; 
 let's call it MySlice, which is produced by MyMemoryMappedFile, 
 which is produced by MyFile, which is initialized only by types 
 like MyFilePath. (I may have gotten the relationships wrong 
 there.)

 But my data is already in a memory area that I own! How can I 
 call that algorithm? Should I write it to a file first and then 
 use those rich types to access the algorithm? That should not 
 be necessary...

That's some poorly designed library (Phobos?). A decently 
designed one would at least allow you to construct a MySlice 
instance from a (pointer, length) pair.

Jun 28 2023

Ali Çehreli <acehreli yahoo.com> writes:

On 6/28/23 02:25, Max Samukha wrote:

 That's some poorly designed library (Phobos?).

Not in the D world at all.

Ironically, I think the library's design is actually pretty good. And 
that's why I was motivated to write in the first place: Everything was 
done according to industry best practices but in the end all of that 
reduces the usability of the library.

Ali

Jun 28 2023

Hipreme <msnmancini hotmail.com> writes:

On Wednesday, 28 June 2023 at 17:00:44 UTC, Ali Çehreli wrote:
 On 6/28/23 02:25, Max Samukha wrote:

 That's some poorly designed library (Phobos?).

 Not in the D world at all.

 Ironically, I think the library's design is actually pretty 
 good. And that's why I was motivated to write in the first 
 place: Everything was done according to industry best practices 
 but in the end all of that reduces the usability of the library.

 Ali

I have had a grudge against `private` ever since I used the LibGDX
particle system. I wasn't able to extend it to add collision. Why?
Because the particles were `private`. Since then, I have never used
`private` without a very, very good reason; the only place I use it
right now is for the intermediate steps of a larger process. People
in industry know nothing about how to use `protected`. Protected IMO
should be the industry standard.

I have worked in a codebase which has been under refactoring for at
least 3 years, and there are so many changes where `private` is no
longer used after some time. Why is that? Because programmers should
not fear themselves most of the time.

Jun 28 2023

bachmeier <no spam.net> writes:

On Wednesday, 28 June 2023 at 17:12:17 UTC, Hipreme wrote:

 I have had a grudge against `private` ever since I used the LibGDX
 particle system. I wasn't able to extend it to add collision. Why?
 Because the particles were `private`. Since then, I have never used
 `private` without a very, very good reason; the only place I use it
 right now is for the intermediate steps of a larger process. People
 in industry know nothing about how to use `protected`. Protected IMO
 should be the industry standard.

 I have worked in a codebase which has been under refactoring for at
 least 3 years, and there are so many changes where `private` is no
 longer used after some time. Why is that? Because programmers should
 not fear themselves most of the time.

[Rich Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from-codequarterly.html):

 At some point though, someone is going to need to have access 
 to the data. And if you have a notion of “private”, you need 
 corresponding notions of privilege and trust. And that adds a 
 whole ton of complexity and little value, creates rigidity in a 
 system, and often forces things to live in places they 
 shouldn’t.

 If people don’t have the sensibilities to desire to program to 
 abstractions and to be wary of marrying implementation details, 
 then they are never going to be good programmers.

Jun 28 2023

Ali Çehreli <acehreli yahoo.com> writes:

On 6/28/23 10:38, bachmeier wrote:

 [Rich Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from-codequarterly.html):

Amen! I've just finished reading most of it (skipped some Clojure 
specific parts).

The following part is worth quoting as well:

   "When we drop down to the algorithm level, I think OO can
    seriously thwart reuse. In particular, the use of objects
    to represent simple informational data is almost criminal
    in its generation of per-piece-of-information
    micro-languages, i.e. the class methods, versus far more
    powerful, declarative, and generic methods like
    relational algebra. Inventing a class with its own
    interface to hold a piece of information is like
    inventing a new language to write every short story. This
    is anti-reuse, and, I think, results in an explosion of
    code in typical OO applications."

One more quote both to stay unkind to my ex-favorite language and to 
relate to our ever-present discussions on the GC's appropriateness in 
libraries:

   "The complexity [of C++] is stunning. It failed as the
    library language it purported to be, due to lack of GC,
    in my opinion, and static typing failed to keep large OO
    systems from becoming wretched balls of mud. Large
    mutable object graphs are the sore point, and const is
    inadequate to address it. Once C++’s performance
    advantage eroded or became less important, you had to
    wonder—why bother? I can’t imagine working in a language
    without GC today, except in very special circumstances."

Ali

Jun 28 2023

Atila Neves <atila.neves gmail.com> writes:

On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related 
 things have been brewing in me for years.

 [...]

I have lost count of how many times my life has been made 
difficult by the lack of `private`.

I have also lost count of how many times my life has been made 
easier by the fact that I ruthlessly declare everything `private` 
unless it has good reason not to be.

Ease of refactoring = good, ergo `private` = good and should be 
the default.

Jun 29 2023

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, June 29, 2023 8:44:05 AM MDT Atila Neves via Digitalmars-d wrote:
 On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related
 things have been brewing in me for years.

 [...]

 I have lost count of how many times my life has been made
 difficult by the lack of `private`.

 I have also lost count of how many times my life has been made
 easier by the fact that I ruthlessly declare everything `private`
 unless it has good reason not to be.

 Ease of refactoring = good, ergo `private` = good and should be
 the default.

Yeah. As with many things, I think that it primarily comes down to good API
design (which can be hard). private prevents implementation details from
being mucked with, which can be a lifesaver when refactoring and can be a
big help with testing and ensuring that things work as expected when other
folks use your code. On the other hand, if you fail to make it so that the
API provides what your users need, then it could easily be the case that
some stuff that should have been available is locked behind private, making
their lives harder (or even impossible, depending on what they're trying to
do).

Similarly, if you actually plan your API around generic types, then it's
much easier for folks to make it work with their own types, but it's not
always obvious when you should be doing that vs designing an API around more
specific types - and it's often the case that code goes from using more
specific types to being more flexible as it matures (though that's harder to
do in cases where you can't reasonably make sure that all user code gets
updated when you make changes, which can make fixing such issues in open
source code harder than in company code).
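
A hedged sketch of that progression from a specific type to a more flexible one. `mean` is a hypothetical function; the constraint uses `isInputRange` and `ElementType` from Phobos.

```d
import std.range : iota;
import std.range.primitives : ElementType, isInputRange;
import std.stdio : writeln;

// A first version might have been tied to one concrete type:
//     double mean(double[] values)
// A later version accepts any input range whose elements convert to double.
double mean(R)(R values)
if (isInputRange!R && is(ElementType!R : double))
{
    double total = 0;
    size_t count = 0;
    foreach (v; values)
    {
        total += v;
        ++count;
    }
    // An empty input has no mean; NaN is one possible convention.
    return count == 0 ? double.nan : total / count;
}

void main()
{
    writeln(mean([1.0, 2.0, 3.0]));  // existing array callers keep working
    writeln(mean(iota(1, 5)));       // a lazy range of ints now works too
}
```

Existing callers that pass `double[]` still compile, which is what makes this kind of loosening comparatively safe to do after the fact.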

So, I'm very much in favor of private being the default, but programmers
need to be aware of API issues that can come from being too specific with
APIs and locking away stuff that users may actually need. Experience can
help a lot with that, though it isn't always easy, and there are plenty of
folks out there who just put something together that "works" and leave folks
to deal with the mess when something better thought out would have been far
more useful. Actively trying to come up with good APIs instead of something
that just works can go a long way.

- Jonathan M Davis

Jun 29 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 6/29/23 10:44 AM, Atila Neves wrote:
 On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related things 
 have been brewing in me for years.

 [...]

 
 I have lost count of how many times my life has been made difficult by 
 the lack of `private`.
 
 I have also lost count of how many times my life has been made easier by 
 the fact that I ruthlessly declare everything `private` unless it has 
 good reason not to be.
 
 Ease of refactoring = good, ergo `private` = good and should be the 
 default.

private is good for the library writer.

arbitrary access to private is good for the user/hacker.

Honestly though, since private data is accessible through an escape 
hatch hack (i.e. `__traits(getMember)`), and the library writer can just 
say "whatevs, you broke it, you bought it", I think we are in a 
reasonable space.

-Steve

Jun 29 2023

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Thu, Jun 29, 2023 at 05:54:28PM -0600, Jonathan M Davis via Digitalmars-d wrote:
 On Thursday, June 29, 2023 8:44:05 AM MDT Atila Neves via Digitalmars-d wrote:

[...]
 I have lost count of how many times my life has been made
 difficult by the lack of `private`.

 I have also lost count of how many times my life has been made
 easier by the fact that I ruthlessly declare everything `private`
 unless it has good reason not to be.

 Ease of refactoring = good, ergo `private` = good and should be
 the default.

 
 Yeah. As with many things, I think that it primarily comes down to
 good API design (which can be hard).

[...]

True.  It comes down to good API design. Which, as you say, is very
hard, probably harder than most people realize.  It's easy to slap an ad
hoc API onto your library functions, but over time it will prove
inadequate for user needs and they will feel frustrated over why certain
things are locked behind private.

IME, it takes several iterations of actually using a particular API
before it becomes clear where the friction points are and what are
possible alternative designs that may work better for user code. (And
also, which parts of the API are perhaps needlessly complex and could
probably be simplified.) The problem is that if you have actual users
during this period of time, they will start writing code that depends on
the current API, which obligates you to support an inferior API even
after a better design emerges.


 Similarly, if you actually plan your API around generic types, then
 it's much easier for folks to make it work with their own types, but
 it's not always obvious when you should be doing that vs designing an
 API around more specific types

[...]

Yeah, there's definitely a danger of premature generalization. Before
you have experience designing a certain library, it's hard to predict
what's worth generalizing and what isn't.  But it's hard to gain
experience without people actually using your library, which then binds
you to the non-optimal initial API.  So it's a catch-22.

API design is hard.


T

-- 
What do you mean the Internet isn't filled with subliminal messages? What about
all those buttons marked "submit"??

Jun 29 2023

Atila Neves <atila.neves gmail.com> writes:

On Friday, 30 June 2023 at 02:21:42 UTC, H. S. Teoh wrote:
 On Thu, Jun 29, 2023 at 05:54:28PM -0600, Jonathan M Davis via 
 Digitalmars-d wrote:
 [...]

 [...]
 [...]

 [...]

 True.  It comes down to good API design. Which, as you say, is 
 very hard, probably harder than most people realize.  It's easy 
 to slap an ad hoc API onto your library functions, but over 
 time it will prove inadequate for user needs and they will feel 
 frustrated over why certain things are locked behind private.

 IME, it takes several iterations of actually using a particular 
 API before it becomes clear where the friction points are and 
 what are possible alternative designs that may work better for 
 user code. (And also, which parts of the API are perhaps 
 needlessly complex and could probably be simplified.) The 
 problem is that if you have actual users during this period of 
 time, they will start writing code that depends on the current 
 API, which obligates you to support an inferior API even after 
 a better design emerges.


 [...]

 [...]

 Yeah, there's definitely a danger of premature generalization. 
 Before you have experience designing a certain library, it's 
 hard to predict what's worth generalizing and what isn't.  But 
 it's hard to gain experience without people actually using your 
 library, which then binds you to the non-optimal initial API.  
 So it's a catch-22.

 API design is hard.


 T

API design is indeed hard. Which makes it all the more imperative 
to not accidentally design one with implementation details that 
users downstream start depending on. That is: API design needs to 
be a conscious opt-in decision and not "I guess I didn't think 
about the consequences of leaving the door to my flat open all 
the time and now there are people camping in my living room".

Jun 30 2023

bachmeier <no spam.net> writes:

On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:

 API design is indeed hard. Which makes it all the more 
 imperative to not accidentally design one with implementation 
 details that users downstream start depending on. That is: API 
 design needs to be a conscious opt-in decision and not "I guess 
 I didn't think about the consequences of leaving the door to my 
 flat open all the time and now there are people camping in my 
 living room".

Private is more like locking everyone else's doors for their own 
safety. In the cases that it keeps an intruder out, it was 
helpful to them. When grandma had to sleep on the sidewalk, not 
so much. Many times library authors have prevented me from doing 
my work because of arbitrarily preventing access to 
implementation details. I should have the option to override 
those decisions. If something blows up, or if my code gets broken 
in the future, it's my fault, because I was the one that made 
that decision.

Jun 30 2023

monkyyy <crazymonkyyy gmail.com> writes:

On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
 I didn't think about the consequences of leaving the door to 
 my flat open all the time

 Private is more like locking everyone else's doors for their 
 own safety.

Why do people make arguments about data ownership at all? 
Functions aren't people.

Jun 30 2023

Timon Gehr <timon.gehr gmx.ch> writes:

On 6/30/23 17:57, monkyyy wrote:
 On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
 I didn't think about the consequences of leaving the door to my flat 
 open all the time

 Private is more like locking everyone else's doors for their own safety.

 
 Why do people make arguments about data ownership at all? Functions 
 aren't people.
 

That's why functions are not making the arguments. API design is a 
social activity between programmers. Programmers are people. Simple.

Anyway, it's not like private actually prevents you from deliberately 
accessing things, it just makes clear that that's outside the supported API.

Jul 03 2023

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Fri, Jun 30, 2023 at 02:41:00PM +0000, bachmeier via Digitalmars-d wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
 
 API design is indeed hard. Which makes it all the more imperative to
 not accidentally design one with implementation details that users
 downstream start depending on. That is: API design needs to be a
 conscious opt-in decision and not "I guess I didn't think about the
 consequences of leaving the door to my flat open all the time and
 now there are people camping in my living room".

 
 Private is more like locking everyone else's doors for their own
 safety. In the cases that it keeps an intruder out, it was helpful to
 them. When grandma had to sleep on the sidewalk, not so much. Many
 times library authors have prevented me from doing my work because of
 arbitrarily preventing access to implementation details. I should have
 the option to override those decisions. If something blows up, or if
 my code gets broken in the future, it's my fault, because I was the
 one that made that decision.

The thing is, both of the above are true.

Private does have its uses: to hide implementation details from
unrelated parts of the code so that, especially in a large project with
many contributors, you don't end up with accidental dependencies between
parts of the code that really shouldn't depend on each other. Hairball
dependencies among unrelated modules are a major factor of
unmaintainability in large projects, and preventing this goes a long way
to reduce long-term maintenance costs.

The other side to this, however, is that deciding what should be private
and what shouldn't is a hard problem, and most people either can't
figure it out, or can't be bothered to put in the effort to get it
right, so they slap private on everything, making it hard to reuse their
code outside of the narrow confines of how they initially envisioned it.
So you end up with an API that covers the most common use cases but not
others, which causes a lot of frustration when downstream code wants to
do something but can't via the API, so they have to resort to copy-pasta
or breaking private. (See: API design is hard.)

Most people design APIs around how they envision the module would be (or
ought to be) used, at a relatively high level of abstraction, without
regard to the core algorithms that would be used to implement this. What
we may call a "use-centric API".  Contrary to popular belief, this is
actually a mistake.  It frequently leads to the situation where a useful
algorithm that might benefit other parts of the code gets locked behind
the private implementation of the module, because it doesn't directly
map to the external API. This in turn promotes code duplication: if my
module also needs some variant of the same algorithm, I have to
copy-n-paste it or re-implement it from scratch in my own module --
usually also behind `private`, so the next person that comes along will
need to do it again. It actually *reduces* code reuse. It also fosters
the desire to break private: I realize that the algorithm is already
implemented, so I wish I could break private in order to avoid rewriting
it myself.

A better approach is an algorithm-centric API design: in the course of
implementing a module (or library), identify the core algorithms that
solve the main problems that the module/library is trying to solve, and
design the API around exposing this algorithm to user code.  Then on top
of that, add some syntactic sugar that maps this to the high-level usage
of the algorithm (the use-centric API). There may still be private parts
(internal details of the algorithms that the user really doesn't need to
know), but these are confined to things that outside code truly doesn't
need to know, not a blanket default that may unintentionally exclude
certain unusual, but valid, use cases.

There is an important philosophical difference between these two
approaches. The first approach tends towards the philosophy of "you have
problem X, no problem, hand it over to us (the library), we'll perform
the magic to solve it, and we'll give you back the result Y". The method
of solution is opaque and hidden from user code. IOW, the hood is welded
shut; your only recourse in case of problems is to take it back to the
dealer (the library author). The second approach has the philosophy "you
have problem X, we (the library) will give you tools A, B, C, that you
can use to solve problem X. In addition, we provide you special combo D
(syntactic sugar functions) that will solve X the usual way without you
having to figure out how to combine A, B, and C in the right way." The
hood is open and you may fiddle with the things inside if you know what
you're doing. But most of the time you won't need to -- the syntactic
sugar functions handle the most common use cases for you.
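
As a rough illustration of the "tools A, B, C plus combo D" shape, here is a minimal sketch. The module contents and the names `splitWords`, `countWords`, `mostFrequent` and `topWord` are made up for this example and do not come from any real library; the only real library call is `std.array.split`.

```d
import std.array : split;

/// Tool A (hypothetical): break text into whitespace-separated words.
string[] splitWords(string text)
{
    return text.split; // with no separator, split breaks on whitespace
}

/// Tool B (hypothetical): count how often each word occurs.
size_t[string] countWords(string[] words)
{
    size_t[string] counts;
    foreach (w; words)
        counts[w] = counts.get(w, 0) + 1; // a missing key counts as 0
    return counts;
}

/// Tool C (hypothetical): pick the word with the highest count.
/// Ties go to the lexicographically smaller word so the result is stable.
string mostFrequent(size_t[string] counts)
{
    string best;
    size_t bestCount = 0;
    foreach (word, n; counts)
    {
        if (n > bestCount || (n == bestCount && word < best))
        {
            best = word;
            bestCount = n;
        }
    }
    return best;
}

/// Combo D (hypothetical): the use-centric sugar, built only from A, B and C.
string topWord(string text)
{
    return text.splitWords.countWords.mostFrequent;
}

unittest
{
    // The sugar covers the common case.
    assert(topWord("to be or not to be that is to") == "to");

    // The tools remain usable for something the sugar never anticipated,
    // such as asking for the count of one particular word.
    auto counts = "to be or not to be".splitWords.countWords;
    assert(counts["be"] == 2);
}
```

Nothing here needs `private` at all: the sugar function has no access that user code lacks, so a caller with an unusual need can recombine the same pieces.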

The first approach empowers the library writer, the second approach
empowers the user.  My argument is that the second approach is superior.
No abstraction is perfect (otherwise it wouldn't be an abstraction!);
there will always be cases where you need to go under the hood and do
something the library author didn't envision initially. Give him the
tools to do so without breaking encapsulation, instead of forcing him to
come back to you for help.


T

-- 
Claiming that your operating system is the best in the world because more
people use it is like saying McDonalds makes the best food in the world. --
Carl B. Constantine

Jun 30 2023

bachmeier <no spam.net> writes:

On Friday, 30 June 2023 at 16:33:31 UTC, H. S. Teoh wrote:

 Private does have its uses: to hide implementation details from 
 unrelated parts of the code so that, especially in a large 
 project with many contributors, you don't end up with 
 accidental dependencies between parts of the code that really 
 shouldn't depend on each other.

That can never happen if you have to explicitly override 
something that's been marked private - it's an intentional 
dependency.

 The other side to this, however, is that deciding what should 
 be private and what shouldn't is a hard problem, and most 
 people either can't figure it out, or can't be bothered to put 
 in the effort to get it right, so they slap private on 
 everything, making it hard to reuse their code outside of the 
 narrow confines of how they initially envisioned it.

It's worse than that. Saying something is private is used as a 
substitute for documenting or even commenting the code.

 So you end up with an API that covers the most common use cases 
 but not others, which causes a lot of frustration when 
 downstream code wants to do something but can't via the API, so 
 they have to resort to copy-pasta or breaking private. (See: 
 API design is hard.)

It's hard not because you don't know what others need, but 
because you're marking stuff private and there's no way for 
anyone else to override that decision.

One of the many examples related to the project I just released 
is the R shared library. The developers have not exported most of 
the functionality of the library. So when other developers 
created the Matrix package (now installed by default) for greatly 
expanded matrix types and operations, they had to resort to 
copying and pasting large amounts of C code for no obvious 
reason. Now there are two copies of all that code floating 
around, but they're probably out of sync. And as I noted above, 
private means the code is not documented or commented, so who 
knows if that hasn't resulted in bugs in hard-to-catch edge cases.

I agree with the existence of private. In some cases, strictly 
enforcing privacy is a good thing (though you can't prevent copy 
and paste). It's difficult to justify the absence of a simple 
override mechanism.

Where it gets really frustrating is when you've invested time 
getting to 95% of what you need. You're at the point where it 
almost works, but arbitrary decisions about private mean you'll 
never be able to achieve 100% of what you need.

Jun 30 2023

Meta <jared771 gmail.com> writes:

On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 Private is more like locking everyone else's doors for their 
 own safety. In the cases that it keeps an intruder out, it was 
 helpful to them. When grandma had to sleep on the sidewalk, not 
 so much. Many times library authors have prevented me from 
 doing my work because of arbitrarily preventing access to 
 implementation details. I should have the option to override 
 those decisions. If something blows up, or if my code gets 
 broken in the future, it's my fault, because I was the one that 
 made that decision.

IMO private is extremely important for maintaining the internal 
invariants of a unit of encapsulation.

Jun 30 2023

Dom DiSc <dominikus scherkl.de> writes:

On Friday, 30 June 2023 at 16:48:39 UTC, Meta wrote:
 IMO private is extremely important for maintaining the internal 
 invariants of a unit of encapsulation.

Yes. And this is pretty much the only reason to use private.
You have functions that don't keep the invariants for performance 
reasons, so you create public functions that call them in the 
correct order and with the correct parameters to keep the 
invariants.
So private is there to hide unsafe interfaces, to prevent the 
user of a library from messing things up.

If you want to be able to mess up things, any kind of API will 
never be good enough for you - you simply need the source code 
and modify it.
And then private won't hinder you - simply remove it.
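
A minimal sketch of that pattern, assuming a hypothetical `SortedInts` container whose invariant is "the items are in ascending order". The type and its member names are invented for illustration only.

```d
import std.algorithm.sorting : isSorted, sort;

struct SortedInts
{
    private int[] items;

    // Checked on entry to and exit from public member functions
    // (in builds that do not disable contracts), not around private ones.
    invariant()
    {
        assert(isSorted(items[]));
    }

    // Fast, but on its own it leaves `items` unsorted, so it stays private.
    private void appendUnchecked(int value)
    {
        items ~= value;
    }

    // Restores the invariant after a batch of unchecked appends.
    private void restoreOrder()
    {
        sort(items);
    }

    /// Public entry point: calls the unsafe helpers in the right order,
    /// so the invariant holds again by the time it returns.
    void insertAll(const int[] values)
    {
        foreach (v; values)
            appendUnchecked(v);
        restoreOrder();
    }

    /// Read-only view of the contents.
    const(int)[] data() const
    {
        return items;
    }
}

unittest
{
    SortedInts s;
    s.insertAll([3, 1, 2]);
    assert(s.data == [1, 2, 3]);
}
```

Keep in mind that `private` in D is module-level, so code in the same module could still call `appendUnchecked` directly; the protection is against other modules.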

Jul 01 2023

Dukc <ajieskola gmail.com> writes:

On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 I should have the option to override those decisions. If 
 something blows up, or if my code gets broken in the future, 
 it's my fault, because I was the one that made that decision.

You do have it. `__traits(getMember, /+...+/)` as others have 
mentioned, or some ugly casting trickery. Or just patching the 
library yourself to make the member you want public.

Jul 02 2023

Atila Neves <atila.neves gmail.com> writes:

On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:

 API design is indeed hard. Which makes it all the more 
 imperative to not accidentally design one with implementation 
 details that users downstream start depending on. That is: API 
 design needs to be a conscious opt-in decision and not "I 
 guess I didn't think about the consequences of leaving the 
 door to my flat open all the time and now there are people 
 camping in my living room".

 Private is more like locking everyone else's doors for their 
 own safety.

I don't see how - it only applies to your own code, adding 
private doesn't make someone else's code no longer accessible.

 In the cases that it keeps an intruder out, it was helpful to 
 them. When grandma had to sleep on the sidewalk, not so much.

This is where the analogy breaks down. The whole point of private 
is to make a conscious choice over what is an implementation 
detail and what is part of the API. If it's the default, the 
programmer is nudged towards thinking of a good API instead of it 
being ad-hoc.

 I should have the option to override those decisions.

As a library author, I don't think you should. It's on me to 
support usage of private functions that I'm nominally allowed to 
delete, but not really if someone is going to complain.

 If something blows up, or if my code gets broken in the future, 
 it's my fault, because I was the one that made that decision.

In theory, yes. In practice, yelling. We told people that `in` 
was in flux and because of that, to not use it. People (including 
me!) did it anyway. Some of them later complained when we decided 
what to do with it.

Jul 03 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/3/23 3:57 AM, Atila Neves wrote:
 On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 I should have the option to override those decisions.

 
 As a library author, I don't think you should. It's on me to support 
 usage of private functions that I'm nominally allowed to delete, but not 
 really if someone is going to complain.

That is the issue. For instance, if you do:

```d
libFunction(cast(int *)0xdeadbeef);
```

And then complain that `libFunction`'s author didn't handle that case, 
you can rightfully be told to RTFM.

Same thing with circumventing private. It should be *possible*, but 
absolutely unsupported.

 If something blows up, or if my code gets broken in the future, it's 
 my fault, because I was the one that made that decision.

 
 In theory, yes. In practice, yelling. We told people that `in` was in 
 flux and because of that, to not use it. People (including me!) did it 
 anyway. Some of them later complained when we decided what to do with it.

The definition of `private` shouldn't change at all. The ability to 
circumvent it still should remain for those wanting to muck with 
internal data, and I don't think there's any way to get around that 
(there's always reinterpret casting). The thing is, it's important to 
identify the *consequences* of changing private data -- it can *never* 
be within spec for a library to allow private data access.

So one can muck around with private data, and pay the cost of zero 
support (and rightfully so). Or one can petition the library author to 
provide access to that private data.

-Steve

Jul 03 2023

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Mon, Jul 03, 2023 at 12:32:43PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 The definition of `private` shouldn't change at all. The ability to
 circumvent it still should remain for those wanting to muck with
 internal data, and I don't think there's any way to get around that
 (there's always reinterpret casting). The thing is, it's important to
 identify the *consequences* of changing private data -- it can *never*
 be within spec for a library to allow private data access.
 
 So one can muck around with private data, and pay the cost of zero
 support (and rightfully so). Or one can petition the library author to
 provide access to that private data.

[...]

I think we all agree that the mechanics of this won't (and shouldn't)
change. But I think the OP was arguing at a higher level of abstraction.
It isn't so much about whether private should be overridable or not, or
even whether some piece of data in an object should be private or not;
the question IMO is whether the library could have been designed in such
a way that there's no *need* for private data in the first place. Or at
least, the need for such is minimized.

A library with tons of private state and only a rudimentary public API
is generally more likely to have situations where the user will be left
wishing that there were a couple more knobs to turn that can be used to
customize the library's behaviour.

A library with less private state, or just as much private state but
with a sophisticated API that lets you tweak more things, would be less
likely to leave the user out in the cold with unusual use cases.
However, it does risk having too many knobs to turn, causing the API to
be far more complex than it ought to be. Which in turn can lead to
unnecessary complexity: the combinatorial explosion of configurations
makes it hard for the author to test every combination, so there may be
lots of bugs hidden behind uncommon corner cases.

The ideal library is one where there's almost no private state because
there's no need for it: the code Just Works(tm) for any combination of
values one may assign to the public state.  The API is simple and
concise, yet easily composable and naturally extends to all kinds of use
cases, including unusual ones and ones the author himself never
envisioned -- yet it all just works together naturally.  This ideal may
or may not be attainable, but the closer a library gets to this ideal,
the better.


T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does that mean
that Windows is normally unsafe?

Jul 03 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/3/23 2:05 PM, H. S. Teoh wrote:
 On Mon, Jul 03, 2023 at 12:32:43PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 [...]
 The definition of `private` shouldn't change at all. The ability to
 circumvent it still should remain for those wanting to muck with
 internal data, and I don't think there's any way to get around that
 (there's always reinterpret casting). The thing is, it's important to
 identify the *consequences* of changing private data -- it can *never*
 be within spec for a library to allow private data access.

 So one can muck around with private data, and pay the cost of zero
 support (and rightfully so). Or one can petition the library author to
 provide access to that private data.

 [...]
 
 I think we all agree that the mechanics of this won't (and shouldn't)
 change. But I think the OP was arguing at a higher level of abstraction.
 It isn't so much about whether private should be overridable or not, or
 even whether some piece of data in an object should be private or not;
 the question IMO is whether the library could have been designed in such
 a way that there's no *need* for private data in the first place. Or at
 least, the need for such is minimized.
 
 A library with tons of private state and only a rudimentary public API
 is generally more likely to have situations where the user will be left
 wishing that there were a couple more knobs to turn that can be used to
 customize the library's behaviour.

But that's the thing, there are parts that *simply must be private*. No 
matter how you cut it, it has to have some level of privacy, because 
otherwise, you can't enforce semantic invariants with the type.

Should array length (not the property, but the actual data field) be 
public? What about the pointer? Of course not. Yet, you still might want 
to access those things for some reason. That doesn't mean it's worth a 
change to public just for that one reason.

 
 A library with less private state, or just as much private state but
 with a sophisticated API that lets you tweak more things, would be less
 likely to leave the user out in the cold with unusual use cases.
 However, it does risk having too many knobs to turn, causing the API to
 be far more complex than it ought to be. Which in turn can lead to
 unnecessary complexity: the combinatorial explosion of configurations
 makes it hard for the author to test every combination, so there may be
 lots of bugs hidden behind uncommon corner cases.

It's easy to talk about this in general terms, like "let you tweak more 
things", but when you start talking about non-abstract real cases, 
usually the reason for private data becomes obvious.

The thing is, if it does make sense that something should just be 
public, making it public is easy, just make a PR to do it, and the 
benefits/drawbacks can be discussed, planned for, and agreed upon. Going 
the other way is much much worse.

If you provide public access, it then becomes a supported API. I 
remember one case in the past, some type in phobos had undocumented 
members that were public due to laziness or carelessness.

When the code had to change to a different implementation, we had to 
deprecate that access for years before actually changing. It was horrid. 
There is a real cost to careless publicity.

-Steve

Jul 03 2023

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 7/3/23 2:05 PM, H. S. Teoh wrote:

[...]
 I think we all agree that the mechanics of this won't (and
 shouldn't) change. But I think the OP was arguing at a higher level
 of abstraction.  It isn't so much about whether private should be
 overridable or not, or even whether some piece of data in an object
 should be private or not; the question IMO is whether the library
 could have been designed in such a way that there's no *need* for
 private data in the first place. Or at least, the need for such is
 minimized.
 
 A library with tons of private state and only a rudimentary public
 API is generally more likely to have situations where the user will
 be left wishing that there were a couple more knobs to turn that can
 be used to customize the library's behaviour.

 
 But that's the thing, there are parts that *simply must be private*.
 No matter how you cut it, it has to have some level of privacy,
 because otherwise, you can't enforce semantic invariants with the
 type.
 
 Should array length (not the property, but the actual data field) be
 public?  What about the pointer? Of course not. Yet, you still might
 want to access those things for some reason. That doesn't mean it's
 worth a change to public just for that one reason.

We're actually agreeing with each other, y'know. :-D

As I said, the *ideal* is that you wouldn't have private state, or that
the private state would be minimal.  In practice, of course, certain
things *should* be private, and that's not a problem. The problems the
OP described arise when either private is used carelessly, causing
things to be private that really need not be, or the API is poorly
designed, so that parts of the library that ought to be reusable aren't
just because of some arbitrary decision made by the author.

I've never heard people complaining about how the array length data
field is private, for example.  That's because it being private does not
hinder the user from doing whatever he wants to do with the array (short
of breaking the implementation and doing something involving UB, of
course).  That's an example of proper usage of private.

An example of where private hinders what a user might wish to do is an
algorithm used internally by the library, that for whatever reason is
private and unusable outside of the library code, even though the
algorithm itself is general and can be applied outside of the scope of
the library.  Often in such cases there are immediate pragmatic reasons
for it -- the implementation of the algorithm is bound to internal
implementation details of other library code, for example. So you can't
actually make it public without also making lots of things public that
probably shouldn't be.  But at a higher level, one asks the question,
why is that algorithm implemented in that way in the first place?  It
could have been implemented generically, and the library could have used
just a specialized instance of it to solve whatever it is it needs to
solve, but the algorithm itself should be available for user code to
use.  *That's* the proper design.

But alas, all too often this is not done, and you end up with 5
different implementations of the same algorithm, each with different
quirks (and often, different subsets of bugs), and all of them are
locked up behind `private`, or require some tangential private structure
as argument that isn't constructible except via a long-winded circuitous
route that probably doesn't do what the user actually wants it to do,
even though the algorithm itself doesn't actually depend on this.

Ultimately these details are just the incidental symptoms. The
underlying root cause is a poor design that doesn't correctly decouple
orthogonal functionality into reusable pieces.


--T

Jul 03 2023

claptrap <clap trap.com> writes:

On Monday, 3 July 2023 at 19:27:45 UTC, H. S. Teoh wrote:
 On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer 
 via Digitalmars-d wrote:
 On 7/3/23 2:05 PM, H. S. Teoh wrote:

 [...]

 We're actually agreeing with each other, y'know. :-D

 As I said, the *ideal* is that you wouldn't have private state, 
 or that the private state would be minimal.

the correct usage of "ideal" is..

"Ideally we would do X but we don't because the world is full of 
idiots"

;)

Jul 03 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 7/3/23 3:27 PM, H. S. Teoh wrote:
 On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 7/3/23 2:05 PM, H. S. Teoh wrote:

 [...]
 I think we all agree that the mechanics of this won't (and
 shouldn't) change. But I think the OP was arguing at a higher level
 of abstraction.  It isn't so much about whether private should be
 overridable or not, or even whether some piece of data in an object
 should be private or not; the question IMO is whether the library
 could have been designed in such a way that there's no *need* for
 private data in the first place. Or at least, the need for such is
 minimized.

 A library with tons of private state and only a rudimentary public
 API is generally more likely to have situations where the user will
 be left wishing that there were a couple more knobs to turn that can
 be used to customize the library's behaviour.

 But that's the thing, there are parts that *simply must be private*.
 No matter how you cut it, it has to have some level of privacy,
 because otherwise, you can't enforce semantic invariants with the
 type.

 Should array length (not the property, but the actual data field) be
 public?  What about the pointer? Of course not. Yet, you still might
 want to access those things for some reason. That doesn't mean it's
 worth a change to public just for that one reason.

 
 We're actually agreeing with each other, y'know. :-D
 

Yeah kind of. It's just that there are 2 types of privacy labeling, 
careless and designed.

 As I said, the *ideal* is that you wouldn't have private state, or that
 the private state would be minimal.  In practice, of course, certain
 things *should* be private, and that's not a problem. The problems the
 OP described arise when either private is used carelessly, causing
 things to be private that really need not be, or the API is poorly
 designed, so that parts of the library that ought to be reusable aren't
 just because of some arbitrary decision made by the author.


If you carelessly label your fields as public, then realizing later they 
should have been private is costly, maybe impossible.

If you carelessly label your fields as private, while it might upset 
some people, making them public later is easy.

So if you are going to "not care" about public/private, technically the 
less risky choice is to make everything private, and worry about it 
later if it becomes an issue. So in that sense I disagree with the OP point.

That being said, I've done a lot of libs where I just don't care and 
leave everything public. It's mostly because I don't expect widespread 
usage, and I also don't mind breaking people's code (I don't think any of 
my projects that I started are past 1.0 yet). But something like Phobos 
shouldn't be so careless. We really should continue to make careless 
things private unless there is a good reason to make them public.

 
 I've never heard people complaining about how the array length data
 field is private, for example.  That's because it being private does not
 hinder the user from doing whatever he wants to do with the array (short
 of breaking the implementation and doing something involving UB, of
 course).  That's an example of proper usage of private.

It's an obvious example that we all can agree on. If we agree there are 
clearly cases where private is important, then we start working our way 
back to where the line should be drawn.

 An example of where private hinders what a user might wish to do is an
 algorithm used internally by the library, that for whatever reason is
 private and unusable outside of the library code, even though the
 algorithm itself is general and can be applied outside of the scope of
 the library.  Often in such cases there are immediate pragmatic reasons
 for it -- the implementation of the algorithm is bound to internal
 implementation details of other library code, for example. So you can't
 actually make it public without also making lots of things public that
 probably shouldn't be.  But at a higher level, one asks the question,
 why is that algorithm implemented in that way in the first place?  It
 could have been implemented generically, and the library could have used
 just a specialized instance of it to solve whatever it is it needs to
 solve, but the algorithm itself should be available for user code to
 use.  *That's* the proper design.

I agree that some things shouldn't be private. But what's the answer? 
When it should be public, just change it to public!

An actual example of this in Phobos is the absence of a binary search 
algorithm. It's there, in SortedRange. But that implementation is 
private basically for no good reason (it can be trivially extracted into 
its own function). And SortedRange in itself is a schizophrenic meld of 
overbearing restrictions and puzzling allowances.

The only reason I haven't made a PR for it is I just made a copy in my 
own code and have moved on. But it would probably be pretty trivial to 
expose.

-Steve

Jul 03 2023

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Mon, Jul 03, 2023 at 10:14:38PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 7/3/23 3:27 PM, H. S. Teoh wrote:

[...]
 As I said, the *ideal* is that you wouldn't have private state, or
 that the private state would be minimal.  In practice, of course,
 certain things *should* be private, and that's not a problem. The
 problems the OP described arise when either private is used
 carelessly, causing things to be private that really need not be, or
 the API is poorly designed, so that parts of the library that ought
 to be reusable aren't just because of some arbitrary decision made
 by the author.

 
 
 If you carelessly label your fields as public, then realizing later
 they should have been private is costly, maybe impossible.

Depends.  D is flexible enough that public fields can be replaced with
access functions, and almost all downstream code doesn't have to change
to adapt to it.  I've done it a lot in my own code, where some field,
say mydata, was previously public but now needs to be private. No
problem: just rename it to _mydata, and create access functions
mydata() and mydata(typeof(_mydata)) to maintain compatibility with old
code.  Unless downstream code does something like take an address of the
old field, this change will be transparent: a recompile will make it all
work as before without requiring further changes.
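
A minimal sketch of that migration, assuming a hypothetical struct
`Widget` whose field happens to be an `int` (both are illustrative,
not taken from any real library):

```d
struct Widget
{
    // Formerly `int mydata;` as a public field (hypothetical example).
    private int _mydata;

    // Getter: existing reads such as `w.mydata` keep compiling.
    @property typeof(_mydata) mydata() const { return _mydata; }

    // Setter: existing writes such as `w.mydata = x` keep compiling.
    @property void mydata(typeof(_mydata) value) { _mydata = value; }
}

unittest
{
    Widget w;
    w.mydata = 42;          // goes through the setter
    assert(w.mydata == 42); // goes through the getter
    // `&w.mydata` no longer yields an int*, so code that took the
    // address of the old field is the case that needs manual changes.
}
```

The accessors are also the natural place to add validation later
without touching downstream code again.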


 If you carelessly label your fields as private, while it might upset
 some people, making them public later is easy.

The point is that it then bottlenecks on the author. If the author is
not responsive for whatever reason (busy, abandoned the project, etc.)
downstream users are stuck up the creek without a paddle.


 So if you are going to "not care" about public/private, technically
 the less risky choice is to make everything private, and worry about
 it later if it becomes an issue. So in that sense I disagree with the
 OP point.

OK, I guess we differ on this point.  Given the choice between having to
wait for a potentially MIA author to fix an issue and having the ability
to go under the hood to manually work around the issue, I choose the
latter.


 That being said, I've done a lot of libs where I just don't care and
 leave everything public. It's mostly because I don't expect widespread
 usage, and I also don't mind breaking people's code (I don't think any
 of my projects that I started are past 1.0 yet). But something like
 Phobos shouldn't be so careless. We really should continue to make
 careless things private unless there is a good reason to make them
 public.

I guess this has to be judged on a case-by-case basis.


 I've never heard people complaining about how the array length data
 field is private, for example.  That's because it being private does
 not hinder the user from doing whatever he wants to do with the
 array (short of breaking the implementation and doing something
 involving UB, of course).  That's an example of proper usage of
 private.

 
 It's an obvious example that we all can agree on. If we agree there
 are clearly cases where private is important, then we start working
 our way back to where the line should be drawn.

My personal criterion is, if something can be designed without private
(and without opening up holes that may allow user code to break stuff),
prefer that design.  Barring that, prefer the design that has the least
amount of private possible for it to work without opening up loopholes
for breakage.

In general, I don't quite agree with e.g. Java's approach of making
everything private by default and having only member functions mediate
access to private state.  My approach is to prefer POD types that hold
public data that anybody can safely mutate, and public functions that
operate on said POD types, rather than the closed-box approach advocated
by OO.

There's a time and place for the closed-box approach, of course. But in
my book, that's the less preferred option that you'd fall back on only
if you couldn't do it another way.  And even when you can't avoid the
closed-box approach, my preference is to minimize the degree of
closedness as much as possible.
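
As a rough illustration of the POD-plus-free-functions style, here is
a sketch with a hypothetical `Vec2` type (the type and the function
names `norm` and `scaled` are made up for this example):

```d
import std.math : sqrt;

// Plain data: all fields are public, and every combination of values
// is valid, so there is no invariant for user code to break.
struct Vec2
{
    double x = 0;
    double y = 0;
}

// Public free functions that operate on the POD type. UFCS lets
// callers write v.norm and v.scaled(k) as if they were members.
double norm(Vec2 v) { return sqrt(v.x * v.x + v.y * v.y); }

Vec2 scaled(Vec2 v, double k) { return Vec2(v.x * k, v.y * k); }

unittest
{
    auto v = Vec2(6, 4);
    v.y = 8;                          // direct mutation, no setter needed
    assert(v.scaled(0.5).norm == 5);  // (6, 8) halved is (3, 4)
}
```

Anyone can add further functions over `Vec2` without needing access to
hidden state, which is the point of preferring this design.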


 An example of where private hinders what a user might wish to do is
 an algorithm used internally by the library, that for whatever
 reason is private and unusable outside of the library code, even
 though the algorithm itself is general and can be applied outside of
 the scope of the library.  Often in such cases there are immediate
 pragmatic reasons for it -- the implementation of the algorithm is
 bound to internal implementation details of other library code, for
 example. So you can't actually make it public without also making
 lots of things public that probably shouldn't be.  But at a higher
 level, one asks the question, why is that algorithm implemented in
 that way in the first place?  It could have been implemented
 generically, and the library could have used just a specialized
 instance of it to solve whatever it is it needs to solve, but the
 algorithm itself should be available for user code to use.  *That's*
 the proper design.

 
 I agree that some things shouldn't be private. But what's the answer?
 When it should be public, just change it to public!

It's not always so simple, though.  The algorithm might have been
implemented in a way that depends on private types and internal
assumptions that may break in unforeseen ways if you use it without
realizing what the assumptions are.  Forcibly changing it to public may
require you to make other stuff public that shouldn't be.  Or it may be
written in a way that's tightly coupled to other internal library code,
such that you can't call it separately.

This gets particularly frustrating when the core of the algorithm itself
does *not* depend on these things, but the upstream author wrote it that
way because "it's private, so nobody cares if this code is dirty and
badly designed". Being able to hide bad code behind private encourages
these kinds of one-off hacks that avoid having to think about proper
code decomposition.


 An actual example of this in Phobos is the absence of a binary search
 algorithm. It's there, in SortedRange. But that implementation is
 private basically for no good reason (it can be trivially extracted
 into its own function). And SortedRange in itself is a schizophrenic
 meld of overbearing restrictions and puzzling allowances.

Yeah, that binary search function really ought to be public.

I think by now, experience has more than proven that SortedRange was a
mistake.  It was an attempt to encode the sortedness of a range in the
type system such that Phobos would be able to take advantage of this to
provide performance improvements, but D's type system simply isn't
powerful enough to express what's needed for this without unnecessary
limitations and the weird quirks you see in the current implementation
of SortedRange.

It was an interesting and ambitious experiment, but I think it has run
its course and the conclusion is that it doesn't work in the current
language. Or at least isn't pulling its own weight given its current
limitations.  Perhaps it's time to send it to the scrap yard.


 The only reason I haven't made a PR for it is I just made a copy in my
 own code and have moved on. But it would probably be pretty trivial to
 expose.

[...]

IMO, we should just get rid of SortedRange and make the binary search
algo a public function.

Or even if we don't get rid of SortedRange (breakage of existing code
and all that), I don't see why the binary search function shouldn't be
publicly available. This is exactly the kind of abuse of `private` I was
talking about: the function is clearly there and ready to use, but the
author for various reasons decided that no, you're not allowed to just
call the function, you have to jump through this here set of hoops to
prove your worthiness first.
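
For illustration, a free-standing binary search over any sorted
random-access range might look like the sketch below. This is not the
code inside SortedRange; the name `lowerBoundIndex` and its signature
are assumptions made for this example:

```d
import std.range.primitives : isRandomAccessRange, hasLength;

// Returns the index of the first element that is not less than
// `value`, or r.length if every element is less. The range must
// already be sorted according to `less`; this is NOT checked.
size_t lowerBoundIndex(alias less = (a, b) => a < b, R, T)(R r, T value)
if (isRandomAccessRange!R && hasLength!R)
{
    size_t lo = 0, hi = r.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
        if (less(r[mid], value))
            lo = mid + 1; // answer lies strictly to the right of mid
        else
            hi = mid;     // mid itself may be the answer
    }
    return lo;
}

unittest
{
    int[] data = [1, 3, 3, 7, 9];
    assert(data.lowerBoundIndex(3) == 1);
    assert(data.lowerBoundIndex(4) == 3);
    assert(data.lowerBoundIndex(10) == data.length);
}
```

The caller takes responsibility for sortedness here, instead of having
to prove it through the type system first.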


T

-- 
My father told me I wasn't at all afraid of hard work. I could lie down right
next to it and go to sleep. -- Walter Bright

Jul 05 2023

Dukc <ajieskola gmail.com> writes:

On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 The main topic here is about the harm caused by rich types 
 surrounding algorithms. Let's say I am interested in using an 
 open source algorithm that works with a memory area. (Not 
 related to D.) We all know that a memory area can be described 
 by a fat pointer like D's slices. So, that is what the 
 algorithm should take.

 Unfortunately, the poor little algorithm is not free to be 
 used: It is written to work with a custom type of that library; 
 let's call it MySlice, which is produced by MyMemoryMappedFile, 
 which is produced by MyFile, which is initialized only by types 
 like MyFilePath. (I may have gotten the relationships wrong 
 there.)

 But my data is already in a memory area that I own! How can I 
 call that algorithm? Should I write it to a file first and then 
 use those rich types to access the algorithm? That should not 
 be necessary...

The language-agnostic answer is to patch the library yourself to 
do what you want.

Since D is a systems programming language, you also have another 
choice: bypass the type system, create `MySlice` by pointer 
casting it from the data representing a D slice.

Now, neither of these solutions is exactly inviting. But they 
cannot be: to create `MySlice` in a way the library doesn't 
support, you have to know its private implementation details. 
Even if the language didn't give the library author a way to 
protect those details, you'd be relying on undocumented 
version-specific details.

Not having `private` would be better in the sense that you'd be more 
likely to get compiler errors instead of memory corruption if the 
private details change. Maybe `__traits(getMember, /+...+/)`, or 
declaring a private function as an external `extern(C)` function 
with a CTFE-computed mangled D name, would be safer than the pointer 
cast I proposed.
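
A sketch of the pointer cast, to show how fragile it is. `MySlice`
here is a hypothetical stand-in whose fields are assumed to mirror the
layout of a D slice (length first, then pointer); in a real library
the type would live in another module and its layout would be
undocumented:

```d
// Hypothetical stand-in for the library's type. Inside the library's
// own module, `private` would stop outside code from constructing it.
struct MySlice
{
    private size_t len;
    private ubyte* ptr;

    size_t size() const { return len; }
    ubyte first() const { return *ptr; }
}

// The cast is only meaningful if the two layouts match exactly.
static assert(MySlice.sizeof == (ubyte[]).sizeof);

MySlice fromSlice(ubyte[] data) @system
{
    // Reinterpret the slice's (length, pointer) pair as a MySlice.
    // Nothing verifies the field order: if the library reorders or
    // adds fields, this still compiles and corrupts memory instead.
    return *cast(MySlice*) &data;
}

unittest
{
    ubyte[] mine = [10, 20, 30];
    auto s = fromSlice(mine);
    assert(s.size == 3);
    assert(s.first == 10);
}
```

The `static assert` only catches a size change, not reordered fields
with the same total size.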

Jul 02 2023
