digitalmars.D - How can we make it easier to experiment with the compiler?

Ola Fosheim Grøstad (32/32) May 22 2021 I think there are many that would like to experiment with the

Nicholas Wilson (49/82) May 23 2021 (that is the correct terminology). I suspect this is more of a

Walter Bright (55/56) May 23 2021 The #1 problem isn't directories, it's "every module imports every other...

Nicholas Wilson (44/70) May 23 2021 This is a _completely_ orthogonal problem.

Andrei Alexandrescu (5/7) May 23 2021 I think the best first step is to add `private` to the codebase. This is...

Ola Fosheim Grøstad (16/19) May 24 2021 Yes, but I'd like this thread to be more forward-looking and

Walter Bright (6/7) May 23 2021 It's the same problem.

Nicholas Wilson (8/20) May 24 2021 Did you read _literally nothing else_ that I wrote?

Walter Bright (2/3) May 24 2021 I read it, my response was to the entire posting.

Walter Bright (11/15) May 24 2021 To be a little clearer, if the files are all merely reshuffled into vari...

Tobias Pankrath (6/16) May 24 2021 Putting the files into directories would make those violations

Walter Bright (7/10) May 25 2021 When you've got a rusty car, it sure is tempting to just paint it. But i...

Nicholas Wilson (8/20) May 25 2021 That is correct for the analogy you used, however that is a false

Paul Backus (7/12) May 25 2021 For what it's worth, I've found that exploration and navigation

Ola Fosheim Grostad (4/18) May 25 2021 I don't know if it is funny or sad that a thread about enabling

Imperatorn (4/20) May 26 2021 😅

Ola Fosheim Grøstad (3/5) May 26 2021 Ok, we choose to laugh! 😅

rikki cattermole (7/10) May 26 2021 It also becomes significantly more manageable when you have things like

Ola Fosheim Grostad (6/8) May 26 2021 The experience is irrelevant in this context.

rikki cattermole (10/22) May 26 2021 Agreed.

Ola Fosheim Grostad (4/9) May 26 2021 Exactly. The core principle for anything that has to do with

zjh (2/2) May 26 2021 but W.B. say you are not writing d compiler thousand hours.

Greg Strong (3/5) May 26 2021 By that logic, what you say doesn't count either. Yet you post

rikki cattermole (18/28) May 26 2021 I actually have an article on code quality and how I measure it.

Ola Fosheim Grostad (15/25) May 26 2021 But, my main issues are not these, these are symptoms. My main

Ola Fosheim Grostad (8/12) May 26 2021 I guess another way of putting it is that it is ok that some
rikki cattermole (4/27) May 26 2021 Yeah, although I'll stay out of the whole IR thing as I'm no where near

Ola Fosheim Grostad (17/19) May 26 2021 Ok, one simple way is to just have a standard high level

Alexandru Ermicioi (5/9) May 26 2021 That's if you've got a starting point in the source code, then

rikki cattermole (4/14) May 26 2021 There is a file list (somewhere, I'm not looking for it) that tells you

Walter Bright (3/6) May 26 2021 Creating a FILES.md file, the content of which is each source file with ...

Walter Bright (4/11) May 26 2021 I see this has already been done:

Nicholas Wilson (13/26) May 26 2021 I know, I wrote the equivalent for the backend. And no that does

Ola Fosheim Grostad (11/18) May 26 2021 d) See no value in spending effort on designing an architecture.

Mathias LANG (43/47) May 27 2021 Don't forget about the many contributors that have invested

Ola Fosheim Grostad (35/52) May 27 2021 Yes, let us not forget that they wasted many unproductive hours
Walter Bright (7/24) May 27 2021 The isXXX() functions also make for safe casting. Your example would be:

Basile B. (6/22) May 27 2021 And this is actually the only way to dyncast cast nodes as DMD

Basile B. (6/29) May 27 2021 Other advantage of module scope isXXX functions is that the base
Walter Bright (2/4) May 27 2021 They're all `final` meaning not virtual. The intent is them being inline...

Ola Fosheim Grostad (6/8) May 27 2021 And let me add that reorganization and restructuring is not a

Patrick Schluter (5/26) May 26 2021 and adding to that by citing Walters message "It's a bit out of
Andrei Alexandrescu (13/16) May 27 2021 Razvan found in https://github.com/dlang/dmd/pull/12560 a number of

Nicholas Wilson (9/12) May 24 2021 Then I can only conclude you have absolutely no perspective for

Alexandru Ermicioi (10/17) May 24 2021 It will be a huge help if they are though. At minimum it will

Walter Bright (4/12) May 24 2021 It establishes a fake hierarchy that is *not* expressed in the code.

Nicholas Wilson (2/12) May 24 2021 It _is_ in the code. FFS, the AST is literally a hierarchy!

Andrei Alexandrescu (15/22) May 23 2021 One problem with that is code duplication. There are two types OutBuffer...

Walter Bright (12/13) May 24 2021 Sure. But in the outbuffer case, the duplication stems from backend bein...

zjh (4/4) May 24 2021 There should be a base package on DMD/Druntime and Phobos.

zjh (16/16) May 24 2021 We need `big changes`.

user1234 (18/34) May 24 2021 100 kb is let's say 2500 slocs (or rather 1500 from the D-Scanner

Iain Buclaw (3/17) May 24 2021 Actually, the visitors have been slowly getting converted into
Basile B. (38/44) Jun 06 2021 I've had the opportunity to do quch a refact yesterday in styx.

Andrei Alexandrescu (3/5) May 24 2021 That evokes the couple who had problems in their relationship, so they

Ola Fosheim Grostad (4/9) May 24 2021 If that meant that they encapsulated their problems and had a
Patrick Schluter (2/7) May 25 2021 You watched "Better Call Saul", didn't you? :-)

Iain Buclaw (3/7) May 24 2021 *ahem* https://github.com/dlang/dmd/pull/12574
Johan Engelen (43/59) May 24 2021 Outbuffer is a case of a data structure that is useful throughout

Ola Fosheim Grøstad (38/46) May 24 2021 Thank you for bringing us back on topic. Yes, or at least have a

sighoya (2/4) May 24 2021 I think you will end up with your own compiler :)

Ola Fosheim Grostad (7/12) May 24 2021 I think we need to learn from Apple and Microsoft, they are doing

Bruce Carneal (28/32) May 24 2021 Yes. It's easier to understand shallow trees with modest leaves
Andrei Alexandrescu (17/67) May 24 2021 Thanks. I set out to write pretty much exactly that. To add to it:
Walter Bright (4/11) May 24 2021 Good points, but part of the reason the front end code is what it is is ...

poffer (5/65) May 24 2021 A good enhancement to the language would be adding some sort of

Walter Bright (3/7) May 24 2021 Importing unused modules is a problem, but a minor one. The larger probl...

poffer (5/12) May 24 2021 No. What I mean is a declaration that for example, allows only

Walter Bright (4/18) May 24 2021 Snark isn't necessary.

Iain Buclaw (7/35) May 24 2021 To be fair, most of this is imported because a function needs the

Walter Bright (8/14) May 24 2021 It is not critical that we fix target.d. It's just that it would be bett...

Iain Buclaw (11/22) May 24 2021 obsolete.

Walter Bright (23/25) May 24 2021 Challenge accepted!

Iain Buclaw (30/50) May 24 2021 I still see a Type though. :-)
zjh (1/1) May 24 2021 We should add a `favor function` to the forum post.

Andrei Alexandrescu (5/43) May 24 2021 Yes, and these are good incremental steps that help a lot, are low cost,...

Dukc (4/7) May 24 2021 You may want to reconsider what you just said.

Walter Bright (6/17) May 24 2021 I knew someone would bring that up. :-) It's a good question.

Dukc (8/13) May 24 2021 Shouldn't the same reasoning apply to `import`ing `dmd.root` from

Dibyendu Majumdar (6/12) May 25 2021 Wow - that's pretty fundamental. How does such code get in? I

Walter Bright (2/4) May 27 2021 There are many people who have pull privileges.

Ola Fosheim Grøstad (72/102) May 24 2021 I think I should have used the term "boring" rather than
Iain Buclaw (16/19) May 24 2021 You can't complain unless you've had a go at making a change to

Nicholas Wilson (4/25) May 24 2021 it makes it difficult to navigate, especially so when you are

Alexandru Ermicioi (26/27) May 24 2021 8. Proper module naming, not abbreviations. Abbreviations need to

Walter Bright (12/13) May 24 2021 You're right, they are not. They're optimized for the people who spend t...

Alexandru Ermicioi (52/65) May 24 2021 Well, there is no dictionary for those abbreviations, and it is

Walter Bright (48/55) May 26 2021 grep -w aa *.d

zjh (7/11) May 26 2021 We have good articles, good posts.

Ola Fosheim Grostad (3/15) May 26 2021 No need, just do this:
zjh (14/15) May 26 2021 Stability is very important.

Ola Fosheim Grostad (2/4) May 26 2021 We are stuck in a 70s mainframe.

12345swordy (6/23) May 24 2021 I seriously question the "Optimized for people who spend

Andrei Alexandrescu (5/31) May 24 2021 Adding documentation would be another good investment with terrific

Max Haughton (4/17) May 24 2021 Where do you start? i.e. there's always work to be done but

Ola Fosheim Grostad (11/30) May 24 2021 The reason I put documentation low on my list is that it has a
Walter Bright (4/5) May 24 2021 At the first function you notice that has poor/missing/wrong documentati...

Ola Fosheim Grostad (11/17) May 24 2021 This is not helpful. Too much commenting makes the code even

Ola Fosheim Grøstad (5/9) May 25 2021 An improvement that would have made any comment on
Ola Fosheim Grøstad (10/14) May 25 2021 Of course, it is nice to know that that lookup(symbol) can return
sighoya (13/23) May 25 2021 You can't encode the full semantic into one function name with

Ola Fosheim Grøstad (21/25) May 25 2021 We can assume that the reader has read a book on compiler design

jmh530 (6/29) May 25 2021 I don't know...I mean it's a start...

Ola Fosheim Grøstad (12/17) May 25 2021 Many functions have two-liners documentation already, but it is

Ola Fosheim Grøstad (13/21) May 25 2021 Or let me explain it another way. Assume writing good useful

jmh530 (7/10) May 25 2021 Ultimately Walter needs to think about how he best spends his

Ola Fosheim Grøstad (13/16) May 25 2021 The best way to refactor is to partition and encapsulate then

Ola Fosheim Grøstad (19/23) May 25 2021 The key point here is designing new and better interfaces. You

Ola Fosheim Grøstad (6/13) May 25 2021 If you are unsure if the new IR is stable you can rearrange the
zjh (1/1) May 25 2021 +10086.

zjh (5/6) May 25 2021 Refactoring doesn't take much time.

Ola Fosheim Grostad (11/17) May 25 2021 It takes time, but it is a necessary part of the life cycle, and

Walter Bright (2/5) May 27 2021 Meanwhile, Iain and I are putting out PRs on ImportC.

Ola Fosheim Grostad (4/10) May 27 2021 Yes, 2 people run ahead while 10 equally capable people throw

Ola Fosheim Grostad (8/19) May 27 2021 2 years later they were all steamrolled by C++ and Swift because

zjh (6/7) May 27 2021 You're right.

jmh530 (2/8) May 27 2021 Of course, forum =/= dmd

Ola Fosheim Grostad (6/15) May 27 2021 The forums are drying up when it comes to people who are

sighoya (19/42) May 25 2021 For very general things, yes, this is possible, but there are

Ola Fosheim Grostad (6/8) May 25 2021 Nah, because that would be in the documentation, so you are

Alexandru Ermicioi (12/16) May 25 2021 In this case, it might be good to have a documentation comment,

sighoya (36/51) May 25 2021 It's a trade-off. Over modularization can also be a mispattern as

Basile B. (14/20) May 25 2021 Related, all the BUG:, TODO:, etc. comments should be moved to
Iain Buclaw (24/30) May 25 2021 Another place you can make a big impact with zero change in

Walter Bright (8/11) May 24 2021 "Use correct Ddoc function comment blocks."

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

I think there are many that would like to experiment with the 
compiler, but feel discouraged because they don't know how to 
approach it.

I think this comes down not only to documentation, but is also 
structural. In order to figure out what to improve, the best 
starting point is experienced challenges.

The number one challenge I see is keeping track of DMD as it is 
released with new improvements. Basically reapplying the changes 
made to the experimental branch to the main branch (aka 
"rebasing"?). I suspect that kills many efforts, meaning people 
create a fork, start making changes, but then a new version of 
DMD is released and the fork is left to dry in the sun as 
rebasing is not fun. And well, a hobby that isn't fun, is not a 
good hobby. :-D

Better internal compiler structure would help a lot with this. So 
a prioritized list for me would be:

1. Have a clean separation between frontend and backend, that is 
close to plug-and-play. That would allow people to inject a new 
high level IR between frontend and backend that could open for 
new interesting optimizations, and allow all the compilers to 
benefit from it.

2. Break down source files into smaller units, so that stable 
parts are separated from unstable parts.

3. More encapsulation and separation of responsibility.

4. Switch to a more syntactical AST, possibly enabling AST macros 
in the future without too much hassle, then use an IR for real 
work.

5. Use directories.

6. Improved documentation.

7. Tutorials.

What other items should be on the list?

Which items are feasible in the next 6 months?

May 22 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Sunday, 23 May 2021 at 06:12:30 UTC, Ola Fosheim Grøstad wrote:
 I think there are many that would like to experiment with the 
 compiler, but feel discouraged because they don't know how to 
 approach it.

 I think this comes down not only to documentation, but is also 
 structural. In order to figure out what to improve, the best 
 starting point is experienced challenges.

 The number one challenge I see is keeping track of DMD as it is 
 released with new improvements. Basically reapplying the 
 changes made to the experimental branch to the main branch (aka 
 "rebasing"?).

(that is the correct terminology). I suspect this is more of a 
problem for people that are less familiar with git, which might 
well also include people wanting to play around with DMD, e.g. 
GSoC/SAoC students.
I know this was the case for me while developing dcompute with 
the added difficulty of tracking LLVM on top of LDC (which was 
kept in sync with DMD).

 I suspect that kills many efforts, meaning people create a 
 fork, start making changes, but then a new version of DMD is 
 released and the fork is left to dry in the sun as rebasing is 
 not fun. And well, a hobby that isn't fun, is not a good hobby. 
 :-D

The solution to this is better git skills, not so much better 
compiler skills/knowledge of DMD, although a merge conflict in a 
critical piece of code is always a PiTA. We now have 
slack/discord for people to ask these kinds of questions, which 
I'm sure will get answered if they are trying to do something 
interesting or fix an annoying problem.

 Better internal compiler structure would help a lot with this. 
 So a prioritized list for me would be:

Oh god yes. The directory structure, or rather lack thereof, is a 
really dire repellent for newcomers. I cannot overstate this. 
173 files in dmd/src/dmd is _completely_ unacceptable, however 
Walter seems to like it this way and has struck down PRs trying 
to remediate this in the past (because it doesn't suit his editor 
configuration? or something like that).

We should have at least the following folders:
ast: ast_node, dsymbol, aggregate, et al
semantic: semantic2, semantic3, ob, nogc, safe et al
visitors: parsetimevisitor, permissivevisitor, visitor et al
glue (backend interfacing files): lib[.*], scan[.*], toir, s2ir, 
e2ir et al
lex: lexer, tokens, identifier, id, utf et al
headers: (alas still needed until dtoh works well enough and has 
been stable for enough releases for GDC to bootstrap)

 1. Have a clean separation between frontend and backend, that 
 is close to plug-and-play. That would allow people to inject a 
 new high level IR between frontend and backend that could open 
 for new interesting optimizations, and allow all the compilers 
 to benefit from it.

See also https://mlir.llvm.org. I had a GSoC student try to do 
something with this; I don't think it got to a usable state, but 
this is about as state of the art as it gets and a very 
interesting research direction. Rust and Swift use multiple 
levels of IRs.

Also, from what I understand, the pointer and liveness analysis as 
part of DIP 1000/1040/(other Walter DIPs?) does something like 
this, but in a hacked up, nonstandard manner.

 2. Break down source files into smaller units, so that stable 
 parts are separated from unstable parts.

Urgh. Dealing with 10000 line files and 1000 line functions is 
such a drain on trying to get stuff done (looking at you 
expressionsem.d). However this needs to be combined with 
directories/packages or it will not improve the situation.

 3. More encapsulation and separation of responsibility.

 4. Switch to a more syntactical AST, possibly enabling AST 
 macros in the future without too much hassle, then use an IR 
 for real work.

That is a noble goal, but would require _a lot_ of changes both 
in DMD and in downstream LDC and GDC, and in tools that consume 
the AST and expect it to be complete. Not to mention designing 
said IR and redoing semantic analysis/transformations to work with it.

 5. Use directories.

Yes!!! sooo much yes! see above.

 6. Improved documentation.

 7. Tutorials.

 What other items should be on the list?

Try to make sure we use standard terminology for things so that 
people can reliably search for them.

 Which items are feasible in the next 6 months?

Directories.

May 23 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/23/2021 7:25 PM, Nicholas Wilson wrote:
 Directories.


The #1 problem isn't directories, it's "every module imports every other module" 
that leaves one with nowhere to start.

We currently have:

   dmd
   dmd/root
   dmd/backend

I regularly fend off attempts to have dmd/root import files from dmd, and 
dmd/backend import files from dmd. I recently had to talk someone out of having 
dmd/backend import files from dmd/root.

In other words, a failure of encapsulation.

Let's look at one example, picked more or less because I've looked at it 
recently, dmd/target.d. The reason for its existence is to abstract target 
information. Its imports are:

   import dmd.argtypes_x86;
   import dmd.argtypes_sysv_x64;
   import core.stdc.string : strlen;
   import dmd.cond;
   import dmd.cppmangle;
   import dmd.cppmanglewin;
   import dmd.dclass;
   import dmd.declaration;
   import dmd.dscope;
   import dmd.dstruct;
   import dmd.dsymbol;
   import dmd.expression;
   import dmd.func;
   import dmd.globals;
   import dmd.id;
   import dmd.identifier;
   import dmd.mtype;
   import dmd.statement;
   import dmd.typesem;
   import dmd.tokens : TOK;
   import dmd.root.ctfloat;
   import dmd.root.outbuffer;
   import dmd.root.string : toDString;

If I want to understand the code, I have to understand half of the rest of the 
compiler. On a more abstract level, why on earth would a target abstraction need 
to know about AST nodes? At least half of these imports shouldn't be here, and 
if they are, the code needs to be redesigned.

Recently I needed some target information in the ImportC lexer, and it would 
have been so easy to just import dmd.target. But then that drags along all the 
imports that I've really tried to avoid importing into the lexer.

Iain came up with a clever solution to use a template parameter.
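
To make the idea concrete, here is a minimal sketch of passing target information in as a template parameter instead of importing a target module. It is not the actual ImportC code; `fitsInCLong`, `FakeTarget` and `longSize` are hypothetical names invented for illustration.

```d
/// Hypothetical lexer helper: true if `value` fits in the target's signed C `long`.
/// `Target` can be any type with a `longSize` member giving the size in bytes,
/// so this code needs no import of a target module (and none of its imports).
bool fitsInCLong(Target)(ulong value, in Target target)
{
    const bits = target.longSize * 8;
    // For a 64-bit long every value this sketch cares about fits;
    // otherwise compare against 2^(bits-1), the first value that overflows.
    return bits >= 64 || value < (1UL << (bits - 1));
}

/// Stand-in for real target information, used only for this example.
struct FakeTarget
{
    uint longSize; // size of C `long` in bytes
}

void main()
{
    assert(fitsInCLong(0x7FFF_FFFF, FakeTarget(4)));  // largest 32-bit signed value
    assert(!fitsInCLong(0x8000_0000, FakeTarget(4))); // one past it
    assert(fitsInCLong(0x8000_0000, FakeTarget(8)));  // fine with a 64-bit long
}
```

The caller decides which target type to pass, so the dependency points from the driver into the lexer rather than the other way around.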

Note that Phobos suffers terribly from this disease (everything ultimately 
imports everything else), which makes it very hard to understand and debug.

Fixing this is not easy, it requires a lot of hard thinking about what a module 
*really* needs to do. But each success at eliminating an import makes it more 
understandable.

Creating a false hierarchy (an implied relationship that is instantly defeated 
by the imports) of files won't fix it.

A good rule of thumb is:

     *** Never import a file from an uplevel directory ***

Import sideways and down, never up.
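
As a rough illustration, the rule can be checked mechanically from module names alone. The sketch below is an assumption about how one might do that, not an existing DMD tool; `packageOf`, `isInside` and `violatesRule` are made-up names, and modules outside the `dmd` tree are simply treated as external.

```d
import std.algorithm.searching : startsWith;
import std.string : lastIndexOf;

/// Package part of a fully qualified module name: "dmd.root.outbuffer" -> "dmd.root".
string packageOf(string moduleName)
{
    const dot = moduleName.lastIndexOf('.');
    return dot < 0 ? "" : moduleName[0 .. cast(size_t) dot];
}

/// True if `moduleName` lives directly in `pkg` or in one of its sub-packages.
bool isInside(string moduleName, string pkg)
{
    const p = packageOf(moduleName);
    return p == pkg || p.startsWith(pkg ~ ".");
}

/// True if `importer` importing `imported` goes "up" (or up, sideways, then down).
/// Modules outside `root` (druntime, for instance) are treated as external and allowed.
bool violatesRule(string importer, string imported, string root = "dmd")
{
    if (!isInside(imported, root))
        return false;
    return !isInside(imported, packageOf(importer));
}

void main()
{
    assert(!violatesRule("dmd.target", "dmd.mtype"));                // sideways
    assert(!violatesRule("dmd.target", "dmd.root.ctfloat"));         // down
    assert(!violatesRule("dmd.target", "core.stdc.string"));         // external
    assert(violatesRule("dmd.root.outbuffer", "dmd.globals"));       // up
    assert(violatesRule("dmd.backend.cgcs", "dmd.root.outbuffer"));  // up, then down
}
```

Feeding every `import` line of a source tree through a check like this would list the violations without moving any files.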

May 23 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:
 On 5/23/2021 7:25 PM, Nicholas Wilson wrote:
 Directories.


 The #1 problem isn't directories, it's "every module imports 
 every other module" that leaves one with nowhere to start.

 We currently have:

   dmd
   dmd/root
   dmd/backend

 I regularly fend off attempts to have dmd/root import files 
 from dmd, and dmd/backend import files from dmd. I recently had 
 to talk someone out of having dmd/backend import files from 
 dmd/root.

 In other words, a failure of encapsulation.

This is a _completely_ orthogonal problem.

The symptoms are completely orthogonal, although easily confused: 
failure of encapsulation makes _reasoning_ about the 
_interconnectedness_ of code difficult, failure to package makes 
_exploration_ and _enumeration_ of code (files, functions, 
classes, data structures) more difficult.
The solutions, however, are cross-enabling: we can implement and 
_enforce_ policies like say "AST node implementing modules should 
not import semantic analysis modules" with reasonable confidence 
iff we have all the AST modules in one place and all the semantic 
analysis modules in one place.

The symptoms of failure of encapsulation I'm going to assume you 
are well aware of.
The symptoms of failure to use packages are as follows:
* The sheer number of files in src/dmd makes it impossible to 
remember what each file is for. This problem is compounded by 
the fact that many files have names that do not describe well 
what they do, _especially_ to newcomers. The principal offender 
is `ob`. Compare with names like `filecache`.
* It is impossible to determine at a glance which files are 
related to each other:
Is `foreachvar.d` an AST node? What about `dcast.d`? (No and no.)
What's the difference between `glue.d` and `gluelayer.d`?
Are `visitor.d`, `transitivevisitor.d`, `strictvisitor.d`, 
`parsetimevisitor.d` and `permissivevisitor.d` a complete list of 
the public visitor modules? (No.)
Which of `cond.d` and `staticcond.d` is the AST node for a static 
condition? What does the other file do? (`cond.d`; semantic 
analysis.)
What files do semantic analysis? Which files declare AST nodes? 
Which files interface with the backend (and subsequently are not 
part of LDC or GDC)?
Where is DMD's entry point?


 snip example
 Fixing this is not easy, it requires a lot of hard thinking 
 about what a module *really* needs to do. But each success at 
 eliminating an import makes it more understandable.

Fixing the lack of directory issue requires only to think about 
what a module _is_ i.e. what package it belongs to: 
driver/frontend (mars, errors etc) , lexer group (lex, parse, 
tokens etc), ast, semantic analysis, backend interfacing, 
backend, root.

 Creating a false hierarchy (an implied relationship that is 
 instantly defeated by the imports)

You cannot seriously tell me with a straight face that e.g. AST, 
is not a hierarchy and should not be grouped together.

 of files won't fix [failure to encapsulate].

Indeed it fixes a different problem, but it makes fixing failure 
to encapsulate much easier.

 A good rule of thumb is:

     *** Never import a file from an uplevel directory ***

 Import sideways and down, never up.

Indeed. However, you can't do much of that with just

   dmd
   dmd/root
   dmd/backend

May 23 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/24/21 1:15 AM, Nicholas Wilson wrote:
 Indeed it fixes a different problem, but it makes fixing failure to 
 encapsulate much easier.

I think the best first step is to add `private` to the codebase. This is 
cheap to get into and informs any future refactoring. I find it 
confusing that people push for massive reorganization for years, but 
won't bother to create 50 line PRs that add `private` appropriately.
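
For readers less familiar with D visibility, a small sketch of what such a change looks like. The module contents are hypothetical and not taken from DMD; the point is only that `private` at module scope hides a symbol from every importing module.

```d
// Imagine this file is one module of a larger program (all names are hypothetical).
import std.string : strip;
import std.uni : toLower;

private int[string] table;              // storage: invisible to importers

private string canonical(string name)   // helper: also invisible to importers
{
    return name.strip.toLower;
}

/// Public surface of the module: importers can only call `define` and `lookup`.
void define(string name, int value)
{
    table[canonical(name)] = value;
}

/// Returns the stored value, or `fallback` if `name` was never defined.
int lookup(string name, int fallback = -1)
{
    if (auto p = canonical(name) in table)
        return *p;
    return fallback;
}

void main()
{
    define("  Answer ", 42);
    assert(lookup("answer") == 42);
    assert(lookup("missing") == -1);
}
```

Once `table` and `canonical` are marked `private`, the compiler rejects any use of them from another module, which shows exactly which symbols form the real interface before any files are moved.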

May 23 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Monday, 24 May 2021 at 06:00:06 UTC, Andrei Alexandrescu wrote:
 refactoring. I find it confusing that people push for massive 
 reorganization for years, but won't bother to create 50 line 
 PRs that add `private` appropriately.

Yes, but I'd like this thread to be more forward-looking and 
focus more on making compiler hacking a "fun hobby" rather than 
being one of should-have-in-the-past.

The key "sociological" point one could take away from this is:

1. Do boring chores together, because that makes them less unfun.

2. Then leave the smaller fun things to individuals that have 
sporadic activity (busy life or weak affiliation with the 
project).

3. Encourage a sense of autonomous ownership; experimental 
forks are a very good way to achieve that. Research on school 
children shows that a sense of autonomous ownership of the task 
is a strong motivating factor. (Not sufficient, but close to 
necessary.)

(To do unfun chores together we need a plan and some sort of 
model or map).

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/23/2021 10:15 PM, Nicholas Wilson wrote:
 This is a _completely_ orthogonal problem.

It's the same problem.

D's support for modules and packages is literally designed around matching the 
hierarchy of the source files.

Shuffling files around accomplishes nothing when every module imports every 
other module.

May 23 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Monday, 24 May 2021 at 06:58:48 UTC, Walter Bright wrote:
 On 5/23/2021 10:15 PM, Nicholas Wilson wrote:
 This is a _completely_ orthogonal problem.

 It's the same problem.

 Shuffling files around accomplishes nothing when every module 
 imports every other module.

Did you read _literally nothing else_ that I wrote?

Let me quote myself again so that you don't miss it:

 The symptoms are completely orthogonal, although easily 
 confused: failure of encapsulation makes _reasoning_ about the 
 _interconnectedness_ of code difficult, failure to package 
 makes _exploration_ and _enumeration_ of code (files, 
 functions, classes, data structures) more difficult.

Putting the modules into packages fixes EXACTLY the problem of 
horrible experience with exploration and enumeration. It 
explicitly does not fix failure of encapsulation because it is a 
_completely_ orthogonal set of symptoms.

 D's support for modules and packages is literally designed 
 around matching the hierarchy of the source files.

Yes, and?

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 1:35 AM, Nicholas Wilson wrote:
 Did you read _literally nothing else_ that I wrote?

I read it, my response was to the entire posting.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 2:39 AM, Walter Bright wrote:
 On 5/24/2021 1:35 AM, Nicholas Wilson wrote:
 Did you read _literally nothing else_ that I wrote?

 
 I read it, my response was to the entire posting.

To be a little clearer, if the files are all merely reshuffled into various 
packages, then they all violate the rule:

     *** Never import a file from an uplevel directory ***

and understanding is not increased at all. And it isn't even just an uplevel 
directory, it's up then sideways then down.

There *is*, however, documentation on the dmd source files:

   https://dlang.org/phobos/index.html

Click on "dmd" on the left. For anyone wishing to get a tour of the files and 
what they do, this is the place.

Adding better Ddoc comments to the source files will help with this, of course.

May 24 2021

Tobias Pankrath <tobias+dlang pankrath.net> writes:

On Monday, 24 May 2021 at 10:04:49 UTC, Walter Bright wrote:
 On 5/24/2021 2:39 AM, Walter Bright wrote:
 On 5/24/2021 1:35 AM, Nicholas Wilson wrote:
 Did you read _literally nothing else_ that I wrote?

 
 I read it, my response was to the entire posting.

 To be a little clearer, if the files are all merely reshuffled 
 into various packages, then they all violate the rule:

     *** Never import a file from an uplevel directory ***

 and understanding is not increased at all. And it isn't even 
 just an uplevel directory, it's up then sideways then down.

Putting the files into directories would make those violations 
obvious and serve as documentation on how the deps should be. 
Then all others can start work towards that goal.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 3:41 AM, Tobias Pankrath wrote:
 Putting the files into directories would make those violations obvious and
 serve as documentation on how the deps should be. Then all others can
 start work towards that goal.

When you've got a rusty car, it sure is tempting to just paint it. But it's all 
for naught if the real work, the hard work, the boring work -repairing the rust- 
is not done.

Just shooting color out of the sprayer is fun and looks great. But it avoids 
accomplishing anything worthwhile.

It's an Illusion of Progress.

May 25 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Wednesday, 26 May 2021 at 00:31:48 UTC, Walter Bright wrote:
 On 5/24/2021 3:41 AM, Tobias Pankrath wrote:
 Putting the files into directories would make those violations 
 obvious and serve as documentation on how the deps should be. 
 Then all others can start work towards that goal.

 When you've got a rusty car, it sure is tempting to just paint 
 it. But it's all for naught if the real work, the hard work, 
 the boring work -repairing the rust- is not done.

 Just shooting color out of the sprayer is fun and looks great. 
 But it avoids accomplishing anything worthwhile.

That is correct for the analogy you used, however that is a false 
analogy because...

 It's an Illusion of Progress.

...Illusions of Progress provide no actual utility, hence 
illusions.
Packaging DMD otoh, provides _lots_ of utility: exploration and 
navigation is greatly eased, more so for people who are less 
familiar with the codebase.

May 25 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 26 May 2021 at 01:25:23 UTC, Nicholas Wilson wrote:
 ...Illusions of Progress provide no actual utility, hence 
 illusions.
 Packaging DMD otoh, provides _lots_ of utility: exploration and 
 navigation is greatly eased, more so for people who are less 
 familiar with the codebase.

For what it's worth, I've found that exploration and navigation 
of DMD code becomes much more manageable with an editor that 
supports "goto definition"--ideally with history, so you can jump 
backwards too.

I use vim's built-in ctags support for this, but I imagine most 
popular code editors can be configured to do something similar.

May 25 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 01:38:59 UTC, Paul Backus wrote:
 On Wednesday, 26 May 2021 at 01:25:23 UTC, Nicholas Wilson 
 wrote:
 ...Illusions of Progress provide no actual utility, hence 
 illusions.
 Packaging DMD otoh, provides _lots_ of utility: exploration 
 and navigation is greatly eased, more so for people who are 
 less familiar with the codebase.

 For what it's worth, I've found that exploration and navigation 
 of DMD code becomes much more manageable with an editor that 
 supports "goto definition"--ideally with history, so you can 
 jump backwards too.

 I use vim's built-in ctags support for this, but I imagine most 
 popular code editors can be configured to do something similar.

I don't know if it is funny or sad that a thread about enabling 
more compiler experiments ends with a note on adding basic IDE 
features to arcane editors...

May 25 2021

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Wednesday, 26 May 2021 at 05:39:19 UTC, Ola Fosheim Grostad 
wrote:
 On Wednesday, 26 May 2021 at 01:38:59 UTC, Paul Backus wrote:
 On Wednesday, 26 May 2021 at 01:25:23 UTC, Nicholas Wilson 
 wrote:
 [...]

 For what it's worth, I've found that exploration and 
 navigation of DMD code becomes much more manageable with an 
 editor that supports "goto definition"--ideally with history, 
 so you can jump backwards too.

 I use vim's built-in ctags support for this, but I imagine 
 most popular code editors can be configured to do something 
 similar.

 I don't know if it is funny or sad that a thread about enabling 
 more compiler experiments ends with a note on adding basic IDE 
 features to arcane editors...

😅

You can use VS tho = bliss

May 26 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 08:35:04 UTC, Imperatorn wrote:
 😅

 You can use VS tho = bliss

Ok, we choose to laugh! 😅

(thanks, that helped on a rainy day :-)

May 26 2021

rikki cattermole <rikki cattermole.co.nz> writes:

On 26/05/2021 1:38 PM, Paul Backus wrote:
 For what it's worth, I've found that exploration and navigation of DMD 
 code becomes much more manageable with an editor that supports "goto 
 definition"--ideally with history, so you can jump backwards too.

It also becomes significantly more manageable when you have things like 
the parser, ast, semantic analysis and backend all grouped together in 
different projects of your solution! This way you can ignore significant 
chunks of the compiler which are irrelevant to what you are working on.

Imagine if there was a way to standardize this experience for everyone! 
If only...

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 12:37:58 UTC, rikki cattermole wrote:
 Imagine if there was a way to standardize this experience for 
 everyone! If only...

The experience is irrelevant in this context.

Partitioning is a necessary first step for decoupling.

Let us stop pretending this is a matter of taste.
It is not.

It is a matter of basic Software Engineering (the profession).

May 26 2021

rikki cattermole <rikki cattermole.co.nz> writes:

On 27/05/2021 1:04 AM, Ola Fosheim Grostad wrote:
 On Wednesday, 26 May 2021 at 12:37:58 UTC, rikki cattermole wrote:
 Imagine if there was a way to standardize this experience for 
 everyone! If only...

 
 The experience is irrelevant in this context.
 
 Partitioning is a necessary first step for decoupling.
 
 Let us stop pretending this is a matter of taste.
 It is not.
 
 It is a matter of basic Software Engineering (the profession).

Agreed.

When the directory structure does not match the concepts and complexity 
involved in a project, it is a symptom of much larger issues from my 
experience.

Fixing it, makes other issues much more visible to the point where they: 
HAVE TO BE FIXED RIGHT NOW.

Without the rearranging to match concepts and complexities in the file 
structure, it is a lot harder to properly scope modules to doing one and 
only one thing.

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 13:51:38 UTC, rikki cattermole wrote:
 Fixing it, makes other issues much more visible to the point 
 where they: HAVE TO BE FIXED RIGHT NOW.

 Without the rearranging to match concepts and complexities in 
 the file structure, it is a lot harder to properly scope 
 modules to doing one and only one thing.

Exactly. The core principle for anything that has to do with 
computers at basically any level is surprisingly simple: Divide 
and Conquer.

May 26 2021

zjh <fqbqrr 163.com> writes:

But W.B. says you have not spent thousands of hours writing the D compiler.
What you say doesn't count.

May 26 2021

Greg Strong <mageofmaple protonmail.com> writes:

On Wednesday, 26 May 2021 at 14:16:02 UTC, zjh wrote:
 But W.B. says you have not spent thousands of hours writing the D compiler.
 What you say doesn't count.

By that logic, what you say doesn't count either.  Yet you post 
anyway.

May 26 2021

rikki cattermole <rikki cattermole.co.nz> writes:

On 27/05/2021 1:58 AM, Ola Fosheim Grostad wrote:
 On Wednesday, 26 May 2021 at 13:51:38 UTC, rikki cattermole wrote:
 Fixing it, makes other issues much more visible to the point where 
 they: HAVE TO BE FIXED RIGHT NOW.

 Without the rearranging to match concepts and complexities in the file 
 structure, it is a lot harder to properly scope modules to doing one 
 and only one thing.

 
 Exactly. The core principle for anything that has to do with computers 
 at basically any level is surprisingly simple: Divide and Conquer.

I actually have an article on code quality and how I measure it.

https://cattermole.co.nz/article/code_qual

But the important list I use (for which dmd fails completely at):

1. Organized in a way that reflects the idea/concept.
2. Separate concepts, separate areas (files/areas of a file).
3. Grouping of resource usage
4. Depth from purpose
5. Naming

1, 2 and 4 are what this part of the thread is all about.

5 is stuff like: what is STC? Variable names, etc.

3. ok just look at the filename of this.
https://github.com/dlang/dmd/blob/master/src/dmd/libelf.d

or...

https://github.com/dlang/dmd/blob/master/src/dmd/libomf.d

I hope I don't need to say why these files fail that test when they are 
in the same directory as:

https://github.com/dlang/dmd/blob/master/src/dmd/doc.d

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 14:21:06 UTC, rikki cattermole wrote:
 I actually have an article on code quality and how I measure it.

 https://cattermole.co.nz/article/code_qual


I like your motto: Code is documentation!

 But the important list I use (for which dmd fails completely 
 at):

 1. Organized in a way that reflects the idea/concept.
 2. Separate concepts, separate areas (files/areas of a file).
 3. Grouping of resource usage
 4. Depth from purpose
 5. Naming

 1, 2 and 4 are what this part of the thread is all about.

But, my main issues are not these, these are symptoms. My main 
concerns are the consequences of the underlying cause for these 
symptoms. The real challenge is not having a clean way of 
introducing new components (like an IR between front and backend 
or a new solver related to the type system). What is missing is an 
analysis of where the compiler should allow extensions (compile 
time) with ease.

That prevents experimentation, and lowers interest in 
participation. LDC has achieved a lot and it is, I think, because 
they could specialize on THEIR piece, and take pride in 
maintaining it in a (I can only assume) busy life. They can also 
make their own decisions, so there is a sense of autonomous 
control, which is a high motivation factor (generally speaking).

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 19:13:42 UTC, Ola Fosheim Grostad 
wrote:
 But, my main issues are not these, these are symptoms. My main 
 concerns are the consequences of the underlying cause for these 
 symptoms. The real challenge is not having a clean way of 
 introducing new components ( like an IR between front and

I guess another way of putting it is that it is ok that some 
authors want to maintain and fix bugs in their own component, so 
that component can have little documentation and so on, if there 
is an architecture to support having components! (Which is a 
desirable quality because it allows a sense of autonomous 
ownership etc.)

May 26 2021

rikki cattermole <rikki cattermole.co.nz> writes:

On 27/05/2021 7:13 AM, Ola Fosheim Grostad wrote:
 On Wednesday, 26 May 2021 at 14:21:06 UTC, rikki cattermole wrote:
 I actually have an article on code quality and how I measure it.

 https://cattermole.co.nz/article/code_qual

 
 
 I like your motto: Code is documentation!

Thanks!

 But the important list I use (for which dmd fails completely at):

 1. Organized in a way that reflects the idea/concept.
 2. Separate concepts, separate areas (files/areas of a file).
 3. Grouping of resource usage
 4. Depth from purpose
 5. Naming

 1, 2 and 4 are what this part of the thread is all about.

 
 But, my main issues are not these, these are symptoms. My main concerns 
 are the consequences of the underlying cause for these symptoms. The 
 real challenge is not having a clean way of introducing new components 
 (like an IR between front and backend or a new solver related to the type 
 system). What is missing is an analysis of where the compiler should 
 allow extensions (compile time) with ease.

Yeah, although I'll stay out of the whole IR thing as I'm nowhere near 
thinking about something like that.

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 26 May 2021 at 20:03:28 UTC, rikki cattermole wrote:
 Yeah, although I'll stay out of the whole IR thing as I'm 
 nowhere near thinking about something like that.

Ok, one simple way is to just have a standard high level 
intermediary language like SIL for Swift. Then authors can build 
auxiliary IRs that point back to the language nodes if needed. 
Then run the algorithm on the auxiliary IR, then modify the 
language node graph accordingly. Then drop the auxiliary IR and 
move on to the next stage. In the end the backend receives 
whatever is left of the intermediate data structure.

Another option could be to have a mediating layer between front 
and backend. The default noop layer could be designed such that 
the optimizer will remove most of the overhead. Then people can 
replace the mediating layer with their own data structure that 
obtains what it needs from the frontend, does something with it, 
and passes everything the backend needs down to the backend.

Probably more options. I have no strong opinion of what is best.

Just settle for something that puts a clean separation layer 
between front and backend without losing sought information.
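
A minimal sketch of the mediating-layer option, just to make the shape concrete. Every name here (`FrontendResult`, `BackendInput`, `NoopLayer`, `DropUnusedLayer`, `compile`) is made up for illustration and does not exist in DMD; the "dead function" rule is a placeholder for whatever analysis an experiment would actually run.

```D
// Hypothetical stand-ins: none of these names exist in DMD.
struct FrontendResult { string[] functions; }
struct BackendInput   { string[] functions; }

// Default layer: forwards the frontend's output unchanged, so an
// optimizer has a fair chance of removing the indirection entirely.
struct NoopLayer
{
    static BackendInput lower(FrontendResult r)
    {
        return BackendInput(r.functions);
    }
}

// An experimental layer that does some work of its own before handing over.
struct DropUnusedLayer
{
    static BackendInput lower(FrontendResult r)
    {
        import std.algorithm : filter, startsWith;
        import std.array : array;

        // Assumption for the sketch: names starting with "unused_" are dead.
        return BackendInput(
            r.functions.filter!(f => !f.startsWith("unused_")).array);
    }
}

// The driver depends only on the layer's interface, chosen at compile time.
size_t compile(Layer = NoopLayer)(FrontendResult r)
{
    BackendInput input = Layer.lower(r);
    return input.functions.length; // stand-in for "emit code"
}
```

With this shape, `compile(r)` uses the noop layer and `compile!DropUnusedLayer(r)` swaps in the experiment without the driver or the backend stand-in changing.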

May 26 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Wednesday, 26 May 2021 at 01:38:59 UTC, Paul Backus wrote:
 For what it's worth, I've found that exploration and navigation 
 of DMD code becomes much more manageable with an editor that 
 supports "goto definition"--ideally with history, so you can 
 jump backwards too.

That's if you've got a starting point in the source code; then, 
yeah, it is a lot better (thx to all who did work on LSP related 
projects for D). However, it won't help when you try to find some 
functionality and don't know which module it is in.

May 26 2021

rikki cattermole <rikki cattermole.co.nz> writes:

On 27/05/2021 5:28 AM, Alexandru Ermicioi wrote:
 On Wednesday, 26 May 2021 at 01:38:59 UTC, Paul Backus wrote:
 For what it's worth, I've found that exploration and navigation of DMD 
 code becomes much more manageable with an editor that supports "goto 
 definition"--ideally with history, so you can jump backwards too.

 
 That's if you've got a starting point in the source code; then, yeah, it 
 is a lot better (thx to all who did work on LSP related projects for D). 
 However, it won't help when you try to find some functionality and 
 don't know which module it is in.

There is a file list (somewhere, I'm not looking for it) that tells you 
what is what. But a proper directory structure + good header comments 
can do this even better :D

May 26 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/25/2021 6:25 PM, Nicholas Wilson wrote:
 ...Illusions of Progress provide no actual utility, hence illusions.
 Packaging DMD otoh, provides _lots_ of utility: exploration and navigation is 
 greatly eased, moreso for people who are less familiar with the codebase.

Creating a FILES.md file, the content of which is each source file with a brief 
description, will accomplish the same thing.

May 26 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/26/2021 8:06 PM, Walter Bright wrote:
 On 5/25/2021 6:25 PM, Nicholas Wilson wrote:
 ...Illusions of Progress provide no actual utility, hence illusions.
 Packaging DMD otoh, provides _lots_ of utility: exploration and navigation is 
 greatly eased, moreso for people who are less familiar with the codebase.

 
 Creating a FILES.md file, the content of which is each source file with a brief 
 description, will accomplish the same thing.

I see this has already been done:

https://github.com/dlang/dmd/blob/master/src/dmd/README.md

It's a bit out of date, files like typesem.d are missing.

May 26 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Thursday, 27 May 2021 at 04:53:12 UTC, Walter Bright wrote:
 On 5/26/2021 8:06 PM, Walter Bright wrote:
 On 5/25/2021 6:25 PM, Nicholas Wilson wrote:
 ...Illusions of Progress provide no actual utility, hence 
 illusions.
 Packaging DMD otoh, provides _lots_ of utility: exploration 
 and navigation is greatly eased, moreso for people who are 
 less familiar with the codebase.

 
 Creating a FILES.md file, the content of which is each source 
 file with a brief description, will accomplish the same thing.

 I see this has already been done:

 https://github.com/dlang/dmd/blob/master/src/dmd/README.md

 It's a bit out of date, files like typesem.d are missing.

I know, I wrote the equivalent for the backend. And no that does 
not accomplish the _same_ thing, not even remotely close. The 
fact you only just found out about it shows that:
a) you have never used the README, and
b) know your way around well enough to not need it, which shows 
the implication that
c) you have no perspective from those who would have use for 
either a README or better structured files and know nothing about 
the relative benefits of either of them.

Yes, a README is strictly better than nothing. It does not 
substitute for having organised files. Neither do well 
organised files substitute for a lack of README.

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 05:36:55 UTC, Nicholas Wilson wrote:
 The fact you only just found out about it shows that:
 a) you have never used the README, and
 b) know your way around well enough to not need it, which shows 
 the implication that
 c) you have no perspective from those who would have use for 
 either a README or better structured files and know nothing 
 about the relative benefits of either of them.

d) see no value in spending effort on designing an architecture.

e) do not see the value of having others add new features.

So basically the project is following this structure:

One person runs ahead adding features faster than they are 
completed.
Helping out means walking around grepping for things to fix.

If that is it, then we might as well close the thread and 
conclude that dmd is a hobby project.

Which is fair enough. Just don't pretend it aspires to be more 
than that, because that takes reorganization and restructuring.

May 26 2021

Mathias LANG <geod24 gmail.com> writes:

On Thursday, 27 May 2021 at 06:08:53 UTC, Ola Fosheim Grostad 
wrote:
 If that is it, then we might as well close the thread and 
 conclude that dmd is a hobby project.

 Which is fair enough. Just don't pretend it aspires to be more 
 than that, because that takes reorganization and restructuring.

Don't forget about the many contributors that have invested 
thousands of hours to understand and improve the code base :)

While I disagree with many of Walter's arguments here, a 
refactoring has a lower barrier of entry than a bugfix, and is 
more prone to be subjective.

It isn't always subjective, as, just like a bug fix, a 
refactoring *can* come with a test case. For example, if someone 
writes a tool that uses DMD as a library, it will be a solid 
ground to push a change that could otherwise be perceived as 
subjective. We had work done on trying to make DMD work as a 
library before, and all we ended up with was a massive amount of 
duplication. But when such refactorings are driven by a use case 
(e.g. VisualD's usage of DMD), they are easy to justify and 
accept.

Now to talk about what can be done to improve the DMD codebase, 
it's fairly obvious: ELIMINATE ALL CASTS. But not by replacing 
`cast(XXX)e` with `e.isXXX()`, but by actually using proper 
encapsulation.

What I mean is that instead of switching on types, like this:
```D
if (auto tf = t.isTypeFunction())
     (cast(FunctionDeclaration)t.sym).something();
else if (auto td = t.isTypeDelegate())
     (cast(FunctionDeclaration)(cast(TypeFunction)t).sym).something();
else
     // Something else
```

We should switch on capabilities. We currently "suffer" from 
abstraction: almost everything is a `Type`, `Dsymbol`, 
`Expression`, etc... but then when we do semantic we have to 
`cast` or `isXXX` it all over the place. Functions that are only 
supposed to accept `CallExp` or `BinExp` end up accepting an 
`Expression` because somewhere, a field that should be `CallExp` 
or `BinExp` is stored as an `Expression`, or because we don't 
have the proper return type on a function, etc...

There are simple areas where one can start, for example making 
all aggregates have the most specialized type possible: 
`TypeDelegate.nextOf()` should return a `TypeFunction`, not a 
`Type`. `FunctionDeclaration.type()` should be a property that 
gives you a `TypeFunction`, not a `Type`, etc...
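
A minimal sketch of the "most specialized type" idea, using covariant return types. The classes below are heavily simplified stand-ins that only borrow DMD's names; the `parameterCount` field and the `arity` function are hypothetical, and the real AST classes are `extern(C++)` and far larger.

```D
// Simplified stand-ins for DMD's AST classes (assumption: plain D classes).
class Type
{
    // Generic accessor: callers only learn that they got "some Type".
    Type nextOf() { return null; }
}

class TypeFunction : Type
{
    int parameterCount; // hypothetical field, for illustration only

    this(int parameterCount) { this.parameterCount = parameterCount; }
}

class TypeDelegate : Type
{
    private TypeFunction fn;

    this(TypeFunction fn) { this.fn = fn; }

    // Covariant override: the static return type is narrowed to
    // TypeFunction, so code holding a TypeDelegate needs neither a
    // cast nor an isXXX() check to reach the function type.
    override TypeFunction nextOf() { return fn; }
}

// Accepts only what it can actually handle, instead of any Type.
int arity(TypeDelegate td)
{
    return td.nextOf().parameterCount;
}
```

The point of the sketch is that `arity` compiles without a single cast: the narrowing happens once, in the declaration, rather than at every call site.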

May 27 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 07:14:25 UTC, Mathias LANG wrote:
 On Thursday, 27 May 2021 at 06:08:53 UTC, Ola Fosheim Grostad 
 wrote:
 If that is it, then we might as well close the thread and 
 conclude that dmd is a hobby project.

 Which is fair enough. Just don't pretend it aspires to be more 
 than that, because that takes reorganization and restructuring.

 Don't forget about the many contributors that have invested 
 thousands of hours to understand and improve the code base :)

Yes, let us not forget that they wasted many unproductive hours 
on trying to understand... Let us put a number on that in 
dollars...

 While I disagree with many of Walter's arguments here, a 
 refactoring has a lower barrier of entry than a bugfix, and is 
 more prone to be subjective.

I don't want to enter that territory; neither bugfixes nor micro 
level refactoring have much of an impact on people wanting to 
experiment. To enable that we have to look at the macro level and 
establish stable well designed interfaces so that changes in 
compiler internals have small impact on experimental components.

Well designed interfaces can hook up to internals you want to 
change at a later stage, but now you do at least not get more 
dependencies tied to things you want to replace.

So in essence, if there is a mess, first step is not to clean up 
the mess (could be too costly), but to hide it so that people 
stop depending on it.

My impression is that Walter is arguing that everything should be 
cleaned up first. That is not realistic.

 Now to talk about what can be done to improve the DMD codebase, 
 it's fairly obvious: ELIMINATE ALL CASTS. But not by replacing 
 `cast(XXX)e` with `e.isXXX()`, but by actually using proper 
 encapsulation.

Does not enable experimentation. Only a good macro level 
architecture enables experimentation.

The internals can to some extent be a mess, with little impact, 
it does not matter unless you want to change templating or type 
system features.

Many interesting experiments can be done by combining parser 
mods, runtime mods and post frontend mods.

Other interesting improvements can be done if one identifies 
areas in the compiler that can be isolated from the whole and 
where new features could be enabled. I suspect this is needed to 
get a solver that provides proper type unification, but I haven't 
looked at this...

Some stuff being messy is not the big picture issue.

The big picture is to get clean points in the codebase where you 
can inject your own component. And to put those injection points 
where they have most potential for enabling experimentation.

An easy first step is to put a separation layer between frontend 
and backend.

May 27 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/27/2021 12:14 AM, Mathias LANG wrote:
 Now to talk about what can be done to improve the DMD codebase, it's fairly 
 obvious: ELIMINATE ALL CASTS. But not by replacing `cast(XXX)e` with 
 `e.isXXX()`, but by actually using proper encapsulation.
 
 What I mean is that instead of switching on types, like this:
 ```D
 if (auto tf = t.isTypeFunction())
      (cast(FunctionDeclaration)t.sym).something();
 else if (auto td = t.isTypeDelegate())
 (cast(FunctionDeclaration)(cast(TypeFunction)t).sym).something();
 else
      // Something else
 ```

The isXXX() functions also make for safe casting. Your example would be:

  if (auto tf = t.isTypeFunction())
       tf.sym.something();
  else if (auto td = t.isTypeDelegate())
       t.isTypeFunction().sym.isFunctionDeclaration().something();


 There are simple areas where one can start, for example making all aggregates 
 have the most specialized type possible: `TypeDelegate.nextOf()` should return a 
 `TypeFunction`, not a `Type`. `FunctionDeclaration.type()` should be a property 
 that gives you a `TypeFunction`, not a `Type`, etc...

FunctionDeclaration.type() can also give you a TypeError.

May 27 2021

Basile B. <b2.temp gmx.com> writes:

On Thursday, 27 May 2021 at 08:41:49 UTC, Walter Bright wrote:
 On 5/27/2021 12:14 AM, Mathias LANG wrote:
 Now to talk about what can be done to improve the DMD 
 codebase, it's fairly obvious: ELIMINATE ALL CASTS. But not by 
 replacing `cast(XXX)e` with `e.isXXX()`, but by actually using 
 proper encapsulation.
 
 What I mean is that instead of switching on types, like this:
 ```D
 if (auto tf = t.isTypeFunction())
      (cast(FunctionDeclaration)t.sym).something();
 else if (auto td = t.isTypeDelegate())
 (cast(FunctionDeclaration)(cast(TypeFunction)t).sym).something();
 else
      // Something else
 ```

 The isXXX() functions also make for safe casting.

And this is actually the only way to dynamically cast nodes, as the DMD 
AST is extern(C++)... But TBH I think that all the isXXX family 
of functions should be free functions, not member funcs. All 
these isXXX calls are virtual but they don't need to be (although 
often devirtualized).

May 27 2021

Basile B. <b2.temp gmx.com> writes:

On Thursday, 27 May 2021 at 08:50:32 UTC, Basile B. wrote:
 On Thursday, 27 May 2021 at 08:41:49 UTC, Walter Bright wrote:
 On 5/27/2021 12:14 AM, Mathias LANG wrote:
 Now to talk about what can be done to improve the DMD 
 codebase, it's fairly obvious: ELIMINATE ALL CASTS. But not 
 by replacing `cast(XXX)e` with `e.isXXX()`, but by actually 
 using proper encapsulation.
 
 What I mean is that instead of switching on types, like this:
 ```D
 if (auto tf = t.isTypeFunction())
      (cast(FunctionDeclaration)t.sym).something();
 else if (auto td = t.isTypeDelegate())
 (cast(FunctionDeclaration)(cast(TypeFunction)t).sym).something();
 else
      // Something else
 ```

 The isXXX() functions also make for safe casting.

 And this is actually the only way to dynamically cast nodes, as the DMD 
 AST is extern(C++)... But TBH I think that all the isXXX family 
 of functions should be free functions, not member funcs. All 
 these isXXX calls are virtual but they don't need to be (although 
 often devirtualized).

Another advantage of module scope isXXX functions is that the base 
Expression node would not need to know about all the derived classes. We 
would have a real astbase module with just Type, Statement, 
Dsymbol, Expression. The isXXXX would be in the module that 
declares the XXXX class.
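
A rough sketch of what a module-scope isXXX could look like. This is not DMD's code: the `Kind` tag is a made-up stand-in for the role the `op` field plays, the classes are plain D classes rather than `extern(C++)`, and everything sits in one file only to keep the example self-contained. In the layout described above, `isCallExp` would live in the module that declares `CallExp`.

```D
// Hypothetical tag, standing in for the node-kind field on the base class.
enum Kind { call, binary }

class Expression
{
    Kind kind;

    this(Kind kind) { this.kind = kind; }
}

class CallExp : Expression
{
    string callee; // hypothetical payload, for illustration only

    this(string callee)
    {
        super(Kind.call);
        this.callee = callee;
    }
}

class BinExp : Expression
{
    this() { super(Kind.binary); }
}

// Free function declared next to CallExp: the base class never has to
// mention the derived class. Returns null when `e` is not a CallExp, so
// it still works in `if (auto ce = e.isCallExp())` thanks to UFCS.
CallExp isCallExp(Expression e)
{
    return (e !is null && e.kind == Kind.call) ? cast(CallExp) e : null;
}
```

Call sites keep the familiar `e.isCallExp()` spelling, so moving from members to free functions would not have to touch existing callers. Note that the tag enum still enumerates the node kinds, so this sketch only removes the class dependency from the base, not the tag.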

May 27 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/27/2021 1:50 AM, Basile B. wrote:
 All these isXXX calls are virtual but they 
 don't need to be (although often devirtualized).

They're all `final` meaning not virtual. The intent is them being inlined.

May 27 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 06:08:53 UTC, Ola Fosheim Grostad 
wrote:
 Which is fair enough. Just don't pretend it aspires to be more 
 than that, because that takes reorganization and restructuring.

And let me add that reorganization and restructuring is not a 
sign of failure. It is a sign of professionalism.

Well run projects have this built into the process so that these 
important activities are not put aside or put on hold.

May 27 2021

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Thursday, 27 May 2021 at 05:36:55 UTC, Nicholas Wilson wrote:
 On Thursday, 27 May 2021 at 04:53:12 UTC, Walter Bright wrote:
 On 5/26/2021 8:06 PM, Walter Bright wrote:
 [...]

 I see this has already been done:

 https://github.com/dlang/dmd/blob/master/src/dmd/README.md

 It's a bit out of date, files like typesem.d are missing.

 I know, I wrote the equivalent for the backend. And no that 
 does not accomplish the _same_ thing, not even remotely close. 
 The fact you only just found out about it shows that:
 a) you have never used the README, and
 b) know your way around well enough to not need it, which shows 
 the implication that
 c) you have no perspective from those who would have use for 
 either a README or better structured files and know nothing 
 about the relative benefits of either of them.

 Yes, a README is strictly better than nothing. It does not 
 substitute for having organised files. Neither do well 
 organised files substitute for a lack of README.

And adding to that, citing Walter's message "It's a bit out of 
date, files like typesem.d are missing." shows the inherent 
problem of a separate README file. Organized files are 
self-documenting; READMEs are not.

May 26 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/27/21 1:36 AM, Nicholas Wilson wrote:
 Yes, a README is strictly better than nothing. It does not substitute 
 for having organised files. Neither do well organised files substitute 
 for a lack of README.

Razvan found in https://github.com/dlang/dmd/pull/12560 a number of 
imports of backend modules that shouldn't be there.

I wonder if this convention could be enforced by using package-level 
protection in backend (and elsewhere) in such a way that would have made 
it impossible for those imports to work.

That would be a good way forward because as it goes (and went in the 
past) the discussion remains sterile. Once there is a demonstrable 
improvement brought about by packages and (self-evidently) you can't get 
package-level protection without packages, the case will be much easier 
to make.

The overarching point is that better modularization should predate, 
inform, and motivate division in packages, not follow it.

May 27 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Monday, 24 May 2021 at 09:39:42 UTC, Walter Bright wrote:
 On 5/24/2021 1:35 AM, Nicholas Wilson wrote:
 Did you read _literally nothing else_ that I wrote?

 I read it, my response was to the entire posting.

Then I can only conclude you have absolutely no perspective for 
someone who has little or no experience with the DMD codebase.

I have had multiple GSoC/SAoC students and I have spoken to 
perhaps two dozen people at various DConfs who I consider to be 
well versed in D, and all of them have complained that the lack of 
organisation of the files in DMD is a significant hindrance 
to contribution, to the point where many simply do not bother. 
Many of these people are regular committers to Phobos and druntime.

May 24 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Monday, 24 May 2021 at 06:58:48 UTC, Walter Bright wrote:
 On 5/23/2021 10:15 PM, Nicholas Wilson wrote:
 This is a _completely_ orthogonal problem.

 It's the same problem.

 D's support for modules and packages is literally designed 
 around matching the hierarchy of the source files.

 Shuffling files around accomplishes nothing when every module 
 imports every other module.

It will be a huge help if they are though. At minimum it will 
organize the things into packages that have one purpose, which 
will help in understanding the structure of dmd, and also make 
navigation and search of desired functionality easier, compared 
to one flat package. This can actually be the first step at 
unwinding all the mess with imports you're mentioning, since 
packages are not just folders, but logical organization of a set 
of modules that are somewhat related to the purpose the package 
has.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 2:55 AM, Alexandru Ermicioi wrote:
 It will be a huge help if they are though. At minimum it will organize the 
 things into packages that have one purpose, which will help in understanding the 
 structure of dmd, and also make navigation and search of desired functionality 
 easier, compared to one flat package.

It establishes a fake hierarchy that is *not* expressed in the code.

Poor encapsulation is the problem, and this does nothing to solve it.

 This can actually be the first step at 
 unwinding all the mess with imports you're mentioning, since packages are not 
 just folders, but logical organization of a set of modules that are somewhat 
 related to the purpose the package has.

It's backwards. Fix the rust on the car, then repaint it.

May 24 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Monday, 24 May 2021 at 10:15:53 UTC, Walter Bright wrote:
 On 5/24/2021 2:55 AM, Alexandru Ermicioi wrote:
 It will be a huge help if they are though. At minimum it will 
 organize the things into packages that have one purpose, which 
 will help in understanding the structure of dmd, and also make 
 navigation and search of desired functionality easier, 
 compared to one flat package.

 It establishes a fake hierarchy that is *not* expressed in the 
 code.

 Poor encapsulation is the problem, and this does nothing to 
 solve it.

It _is_ in the code. FFS, the AST is literally a hierarchy!

May 24 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/23/21 10:56 PM, Walter Bright wrote:
 I recently had to talk someone out of having dmd/backend import files 
 from dmd/root.

One problem with that is code duplication. There are two types OutBuffer 
in frontend and Outbuffer in backend that are 95% identical, yet 
duplicated. Recent improvements (two distinct) will need to be 
duplicated to the other, which is clearly not a good way to go.

How to address this problem?

I think all of us looking to improve dmd's architecture would be well 
served by reading this book:

https://amazon.com/gp/product/0135974445/

Really close, cover to cover. A lot of the principles in that book are 
either applied with good results (sadly not as often as one would hope), 
or not, with the expected poor outcome, in dmd's codebase. For example, 
this:

 A good rule of thumb is:
 
     *** Never import a file from an uplevel directory ***
 
 Import sideways and down, never up. 

is an approximate formulation of a subset of Dependency Inversion Principle:

https://en.wikipedia.org/wiki/Dependency_inversion_principle

May 23 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/23/2021 10:55 PM, Andrei Alexandrescu wrote:
 One problem with that is code duplication.

Sure. But in the outbuffer case, the duplication stems from backend being used 
in multiple projects.

It's hard to have perfection, and if getting to perfection means driving off a 
cliff (!) it's better to just live with a bit of imperfection here and there.

I don't like having two outbuffers, but the cure is worse for that particular case.

There's even another implementation of outbuffer in Phobos (because I thought 
outbuffer was generally very useful):

https://dlang.org/phobos/std_outbuffer.html

But here we run into our rule that dmd shouldn't rely on Phobos. Compromise is 
inevitable. Outbuffer isn't a spike we need to impale ourselves on. There are 
plenty of other encapsulation problems that could be improved, like target.d.

May 24 2021

zjh <fqbqrr 163.com> writes:

There should be a base package on DMD/Druntime and Phobos.
Split each large file into small files in one directory.
One big file <=> one directory.
We need big changes.

May 24 2021

zjh <fqbqrr 163.com> writes:

We need `big changes`.
We need a `todolist` (ordered by importance).
We need to split big files into directories.
Small refactoring is useless.
`Big changes` are necessary.
We separate the `stable part` from the `unstable part` of each big 
file, and divide it into `small files`.
Following the dependencies, start changing from `the most dependent` parts.
Interfaces and function names need not change.
It's just that the `organization` has to be changed.
Nobody reads functions that are `thousands of lines` long.
No one reads `>100kb` source files because they are too large.
We just `split up` large files, not modify the function 
implementations.
Because modifying a function implementation is what is `most likely` to 
introduce mistakes.

May 24 2021

user1234 <user1234 12.de> writes:

On Monday, 24 May 2021 at 12:38:59 UTC, zjh wrote:
 We need `big changes`.
 We need a `todolist` (ordered by importance).
 We need to split big files into directories.
 Small refactoring is useless.
 `Big changes` are necessary.
 We separate the `stable part` from the `unstable part` of each 
 big file, and divide it into `small files`.
 Following the dependencies, start changing from `the most dependent` parts.
 Interfaces and function names need not change.
 It's just that the `organization` has to be changed.
 Nobody reads functions that are `thousands of lines` long.
 No one reads `>100kb` source files because they are too large.
 We just `split up` large files, not modify the function 
 implementations.
 Because modifying a function implementation is what is `most likely` 
 to introduce mistakes.

100 kb is, let's say, 2500 SLOC (or rather 1500 from the D-Scanner 
point of view); that's not too crazy. Many DMD source files are big because 
they contain a visitor.
Visitors can't be split across several files. Often you are only actually 
interested in a single method of a visitor, so the overall 
size of a source file does not matter.

Eventually, what could be done for the biggest methods of visitors 
is to extract parts of the content into several **non-nested** free 
functions, so that low-level implementation details, like 
control loops, are no longer visible and instead you just see do_this; 
do_that; with just a few, necessarily unavoidable, flow 
statements.

The problem is that extracting and splitting the content would be 
tedious because of the decade of more or less well organized 
patchwork added to fix the bugs.

PS: backticks are for inline code; for emphasis, surround with pairs of stars 
or pairs of underscores.

May 24 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Monday, 24 May 2021 at 19:42:00 UTC, user1234 wrote:
 On Monday, 24 May 2021 at 12:38:59 UTC, zjh wrote:
 [...]

 100 kb is, let's say, 2500 SLOC (or rather 1500 from the 
 D-Scanner point of view); that's not too crazy. Many DMD source files are 
 big because they contain a visitor.
 Visitors can't be split across several files. Often you are only 
 actually interested in a single method of a visitor, so the 
 overall size of a source file does not matter.

 Eventually, what could be done for the biggest methods of 
 visitors is to extract parts of the content into several 
 **non-nested** free functions, so that low-level 
 implementation details, like control loops, are no longer visible and 
 instead you just see do_this; do_that; with just a few, 
 necessarily unavoidable, flow statements.

Actually, the visitors have been slowly getting converted into 
nested functions and a switch table.

May 24 2021

Basile B. <b2.temp gmx.com> writes:

On Monday, 24 May 2021 at 19:42:00 UTC, user1234 wrote:
 Eventually, what could be done for the biggest methods of 
 visitors is to extract parts of the content into several 
 **non-nested** free functions, so that low-level 
 implementation details, like control loops, are no longer visible and 
 instead you just see do_this; do_that; with just a few, 
 necessarily unavoidable, flow statements.

I had the opportunity to do such a refactoring yesterday in styx. 
It makes things very clear. For example, the expression semantics 
for binary assign exps:

```d
override void visit(BinAssExpressionAstNode node)
{
     processBinaryOperands(node);
     if (tryRewritingToOperatorOverload(node, [node.left, node.right]))
         return;
     if (tryToSetLengthExp(node))
         return;
     if (checkIfInvalidEnumSetOp(node))
         return;
     if (checkPtrArithmeticOp(node))
         return;
     ensureAssignedParamIsLvalue(node);
     tryOneWayAssImplicitConv(node);
     checkIfAssignable(node.left);
}
```

or for binary exps that are not assign and not cmp:

```d
override void visit(BinaryExpressionAstNode node)
{
     processBinaryOperands(node);
     if (tryRewritingToOperatorOverload(node, [node.left, node.right]))
         return;
     if (checkIfInvalidEnumSetOp(node))
         return;
     if (checkPtrArithmeticOp(node))
         return;
     tryTwoWaysBinaryImplicitConv(node);
}
```

It was like 200 lines before.

Jun 06 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/24/21 8:38 AM, zjh wrote:
 Small refactoring is useless.
 `Big changes` are necessary.

That evokes the couple who had problems in their relationship, so they 
decided to solve them by getting married.

May 24 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Monday, 24 May 2021 at 20:45:11 UTC, Andrei Alexandrescu wrote:
 On 5/24/21 8:38 AM, zjh wrote:
 Small refactoring is useless.
 `Big changes` are necessary.

 That evokes the couple who had problems in their relationship, 
 so they decided to solve them by getting married.

If that meant that they encapsulated their problems and had a 
united front towards the rest of the world, then that is the 
right approach for dmd. Arranged marriages are underappreciated...

May 24 2021

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Monday, 24 May 2021 at 20:45:11 UTC, Andrei Alexandrescu wrote:
 On 5/24/21 8:38 AM, zjh wrote:
 Small refactoring is useless.
 `Big changes` are necessary.

 That evokes the couple who had problems in their relationship, 
 so they decided to solve them by getting married.

You watched "Better Call Saul", didn't you? :-)

May 25 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Monday, 24 May 2021 at 09:53:42 UTC, Walter Bright wrote:
 But here we run into our rule that dmd shouldn't rely on 
 Phobos. Compromise is inevitable. Outbuffer isn't a spike we 
 need to impale ourselves on. There are plenty of other 
 encapsulation problems that could be improved, like target.d.

*ahem* https://github.com/dlang/dmd/pull/12574

It's a start at least.

May 24 2021

Johan Engelen <j j.nl> writes:

On Monday, 24 May 2021 at 09:53:42 UTC, Walter Bright wrote:
 On 5/23/2021 10:55 PM, Andrei Alexandrescu wrote:
 One problem with that is code duplication.

 Sure. But in the outbuffer case, the duplication stems from 
 backend being used in multiple projects.

 It's hard to have perfection, and if getting to perfection 
 means driving off a cliff (!) it's better to just live with a 
 bit of imperfection here and there.

 I don't like having two outbuffers, but the cure is worse for 
 that particular case.

 There's even another implementation of outbuffer in Phobos 
 (because I thought outbuffer was generally very useful):

 https://dlang.org/phobos/std_outbuffer.html

 But here we run into our rule that dmd shouldn't rely on 
 Phobos. Compromise is inevitable. Outbuffer isn't a spike we 
 need to impale ourselves on. There are plenty of other 
 encapsulation problems that could be improved, like target.d.

Outbuffer is a case of a data structure that is useful throughout 
the compiler. So it is put in a package of the compiler that is 
OK to be imported from other packages (and should avoid importing 
other packages). I think the `dmd.root` package is exactly like 
such a package  (compare with `ADT` in LLVM). From that 
standpoint, I don't see why the `dmd.backend` package cannot 
import `dmd.root`. If `dmd.backend` is to be used in different 
projects, then those should also use `dmd.root` and that's where 
the dependency chain stops.
Better: if it is in Phobos, great, use that!
If you need a certain data structure you know where to look: 
Phobos or `dmd.root`. Is it not in there? Don't create a new 
structure elsewhere, add it to `dmd.root` and import it.

In LDC, we use the C++ stdlib and we use Phobos, because why not? 
We are programming in D after all, and it is the standard library 
that is available in every required bootstrapping compiler 
version. We do take care not to rely on the _latest_ Phobos, but on 
the Phobos of every supported bootstrapping D compiler version, from the 
oldest up to the latest. Same for the C++ stdlib (C++20 is not 
ok, but C++14 is much encouraged).

For example: LDC uses MD5 hashing for its machine codegen cache. 
`import std.digest.md; auto md5hash = md5Of(slice);`  Done.
LLVM does the same, e.g. the project removed its own unique_ptr 
implementation (`OwningPtr`), and now uses `std::unique_ptr`.
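
To make the one-liner above concrete, here is a minimal sketch of a Phobos-based cache key. The `cacheKey` helper and the input bytes are assumptions made up for illustration; this is not LDC's actual caching code. Only `md5Of` and `toHexString` are real Phobos calls.

```d
import std.ascii : LetterCase;
import std.digest : toHexString;
import std.digest.md : md5Of;
import std.stdio : writeln;

/// Hypothetical helper: derive a cache key from the bytes to be cached.
string cacheKey(const(ubyte)[] slice)
{
    // md5Of returns a 16-byte static array.
    ubyte[16] hash = md5Of(slice);
    // toHexString on a static array yields a static char array; copy it out.
    auto hex = toHexString!(LetterCase.lower)(hash);
    return hex[].idup;
}

void main()
{
    // "abc" is just sample input standing in for generated machine code.
    writeln(cacheKey(cast(const(ubyte)[]) "abc"));
}
```

The point is that no hashing code has to live in the compiler tree at all.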


My standpoint on the original topic of "make it easier to 
experiment with the compiler": I disagree with making the code 
more stable. If anything, we should be refactoring much more 
aggressively, changing function names etc. Nicholas mentions that 
it is a pain to keep up with LLVM, where even function names are 
renamed from "SortSomeName" to "SortSomeNames" (made-up example), 
because plural is correct. The pain would be _much_ greater if all the 
small unfixed incorrectness/clumsiness/etc. accumulated over time 
and you ended up with a convoluted codebase...
The main stumbling block is already mentioned: ownership. If you 
want contributors, you have to give up some ownership and be 
willing to make compromises between your own and the new 
contributors' viewpoints. The lack of willingness to give up 
ownership is what keeps me out (and I suspect, indirectly, others 
too). The frontend source code is not nice, but I'm not drawn to 
fix it at all (even if paid for) because I am not ashamed of it 
as I would be if I had some shared 'ownership' of it.

-Johan

May 24 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Monday, 24 May 2021 at 10:47:16 UTC, Johan Engelen wrote:
 My standpoint on the original topic of "make it easier to 
 experiment with the compiler": I disagree with making the code 
 more stable. If anything, we should be refactoring much more 
 aggressively, changing function names etc.

Thank you for bringing us back on topic. Yes, or at least have a 
map of what is considered stable and well encapsulated and what 
is considered unstable and likely to change.

I don't believe this is a matter for git rebasing 
tooling/understanding. I just don't want to build _directly_ on 
top of something that looks like it is likely to change (from a 
software engineering point of view).

I consider every hour spent on rebasing, dealing with regressions 
etc to be losses,  or more importantly "not fun". I only want to 
do "not fun" things if I can learn something from them.

D has to rely on hobbyists, so getting "not fun"/"no learning 
potential" out of the way is important.

 others too). The frontend source code is not nice, but I'm not 
 drawn to fix it at all (even if paid for) because I am not 
 ashamed by it as I would be if I would have some shared 
 'ownership' of it.

That is a bit harsh, of course all code bases that have evolved 
over a long time are not nice, parts of LDC too.

Anyway, my main wish is just to be able to inject my own IR 
between the frontend and backend.

My feeling right now is that to do that I have to choose LDC and 
then heavily modify it. I sense that in the end I basically will 
end up with my own backend, something I don't want to maintain...

Think of it like LEGOs. The front end is a green brick and the 
back end a red brick. I want to insert a white brick between 
them. I don't want to modify the bricks more than "cleaning" the 
studs.

Another analogy, if the frontend is an engine, my IR is the 
transmission and the backend is the wheels, then I don't mind 
that the current engine is oily and greasy, I leave that to other 
mechanics to clean up. Same with the wheels. I just want to be an 
expert on the transmission and evolve it from a manual 
transmission into a nice automatic transmission. Right now the 
engine is coupled directly to the wheels... which basically means 
being forced to drive in the same gear all the time.

I am less interested in getting my fingers greasy and am happy to 
leave that to others as long as I can focus on polishing the 
chrome on my transmission line...

(I believe many things could be done with an intermediary high 
level IR, such as ARC, stackless coroutines, heap 
optimizations... LLVM is too low level. AST is too cumbersome.)

May 24 2021

sighoya <sighoya gmail.com> writes:

On Monday, 24 May 2021 at 14:37:45 UTC, Ola Fosheim Grøstad wrote:
I sense that in the end I basically will end up with my own 
backend, something I don't want to maintain...

I think you will end up with your own compiler :)

May 24 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Monday, 24 May 2021 at 15:16:34 UTC, sighoya wrote:
 On Monday, 24 May 2021 at 14:37:45 UTC, Ola Fosheim Grøstad 
 wrote:
I sense that in the end I basically will end up with my own 
backend, something I don't want to maintain...

 I think you will end up with your own compiler :)

I think we need to learn from Apple and Microsoft, they are doing 
well, not only because of resources, but because they let people 
be specialists on certain aspects of the compiler. D has people 
who have specialized in the GC and LLVM, but it isn't a deliberate 
strategy... Yet.

Building a racing car is not a one man project...

May 24 2021

Bruce Carneal <bcarneal gmail.com> writes:

On Monday, 24 May 2021 at 10:47:16 UTC, Johan Engelen wrote:

[...]
 My standpoint on the original topic of "make it easier to 
 experiment with the compiler": I disagree with making the code 
 more stable. If anything, we should be refactoring much more 
 aggressively, changing function names etc.

Yes. It's easier to understand shallow trees with modest leaves 
than arbitrary graphs with 1000+ LOC "leaves".  Getting there 
will take some work.  Fortunately, it looks like much of that 
work can be done "bottom up" i.e. incrementally.

When simplifying code, readability is a commonly applied metric. 
How long does it take for an intelligent but "outside" developer 
to understand the code?  Another useful metric is the degree of 
dynamic dependence: could this code run in parallel?  If not, why 
not?

Examining the ability to run in parallel can also be done "bottom 
up", and is at least as valuable for simplification/correctness 
as it is for parallel speedup potential.  That said, a 
taskification that followed our, sometimes extreme, code 
expansion contours could yield speedups that coarser approaches 
to multi-threading do not.  It could also bring vibe style sanity 
in place of manually managed asynchrony where the dependencies 
are carried in your head.

When looking to foster task independence, building around 
dependency graphs which are immutable/committed in the interior 
and expanding/mutating/synchronizing at the frontier, is one way 
to go. (__traits compiles is interesting in this context...)  The 
SDC people will have other ideas/experience to share if 
taskification becomes a thing.

Finally, my thanks again to the current front end crew and the 
LDC/dcompute crew.  The tool chain may not be perfect but, boy, 
it's way better than falling back to C++/CUDA.

May 24 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/24/21 6:47 AM, Johan Engelen wrote:
 On Monday, 24 May 2021 at 09:53:42 UTC, Walter Bright wrote:
 On 5/23/2021 10:55 PM, Andrei Alexandrescu wrote:
 One problem with that is code duplication.

 Sure. But in the outbuffer case, the duplication stems from backend 
 being used in multiple projects.

 It's hard to have perfection, and if getting to perfection means 
 driving off a cliff (!) it's better to just live with a bit of 
 imperfection here and there.

 I don't like having two outbuffers, but the cure is worse for that 
 particular case.

 There's even another implementation of outbuffer in Phobos (because I 
 thought outbuffer was generally very useful):

 https://dlang.org/phobos/std_outbuffer.html

 But here we run into our rule that dmd shouldn't rely on Phobos. 
 Compromise is inevitable. Outbuffer isn't a spike we need to impale 
 ourselves on. There are plenty of other encapsulation problems that 
 could be improved, like target.d.

 
 Outbuffer is a case of a data structure that is useful throughout the 
 compiler. So it is put in a package of the compiler that is OK to be 
 imported from other packages (and should avoid importing other 
 packages). I think the `dmd.root` package is exactly like such a 
 package  (compare with `ADT` in LLVM). From that standpoint, I don't see 
 why the `dmd.backend` package cannot import `dmd.root`. If `dmd.backend` 
 is to be used in different projects, then those should also use 
 `dmd.root` and that's where the dependency chain stops.

Thanks. I set out to write pretty much exactly that. To add to it:

A. High-level modules should not depend on low-level modules. Both 
should depend on abstractions (e.g., interfaces).
B. Abstractions should not depend on details. Details (concrete 
implementations) should depend on abstractions.

(Source: https://en.wikipedia.org/wiki/Dependency_inversion_principle)

Applied here:

A. The back-end should not depend on the front end. Both should depend 
on abstractions (e.g., interfaces) such as OutBuffer.
B. OutBuffer should not depend on memory-mapped minutia. Memory-mapped 
work should be done to serve OutBuffer.
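
A minimal sketch of point A, under loud assumptions: `ByteSink`, `MemorySink`, `emitHeader` and `emitObjectCode` are hypothetical names invented here, not anything in dmd. The "front-end" and "back-end" functions know only the abstraction, and the concrete buffer is chosen by whoever wires them together.

```d
import std.stdio : writeln;

/// Hypothetical shared abstraction, living in a low-level package.
interface ByteSink
{
    void write(const(ubyte)[] data);
}

/// One concrete implementation: a growable in-memory buffer.
final class MemorySink : ByteSink
{
    ubyte[] bytes;

    void write(const(ubyte)[] data)
    {
        bytes ~= data;
    }
}

/// "Front-end" side: depends only on ByteSink.
void emitHeader(ByteSink sink)
{
    sink.write(cast(const(ubyte)[]) "HDR");
}

/// "Back-end" side: also depends only on ByteSink.
void emitObjectCode(ByteSink sink)
{
    immutable ubyte[2] code = [0x90, 0xC3];
    sink.write(code[]);
}

void main()
{
    auto sink = new MemorySink;
    emitHeader(sink);
    emitObjectCode(sink);
    writeln(sink.bytes.length, " bytes written");
}
```

Neither side imports the other; both import only the module that defines the sink.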

 Better: if it is in Phobos, great, use that!
 If you need a certain data structure you know where to look: Phobos or 
 `dmd.root`. Is it not in there? Don't create a new structure elsewhere, 
 add it to `dmd.root` and import it.

I am sympathetic to the cause of not adding a large number of moving 
pieces to the compiler codebase. But yes the point stands.

Hopefully good versioning could help a lot with all that.

 In LDC, we use the C++ stdlib and we use Phobos, because why not? We are 
 programming in D after all, and it is the standard library that is 
 available in all bootstrapping compiler required versions. We do take 
 care not to rely on _latest_ Phobos, but on Phobos from the oldest 
 bootstrapping D compiler version supported up to the latest version. 
 Same for the C++ stdlib (C++20 is not ok, but C++14 is much encouraged).
 
 For example: LDC uses MD5 hashing for its machine codegen cache. `import 
 std.digest.md; auto md5hash = md5Of(slice);`  Done.
 LLVM does the same, e.g. the project removed its own unique_ptr 
 implementation (`OwningPtr`), and now uses `std::unique_ptr`.

Cool stuff!

 My standpoint on the original topic of "make it easier to experiment 
 with the compiler": I disagree with making the code more stable. If 
 anything, we should be refactoring much more aggressively, changing 
 function names etc.

Doesn't aggressive refactoring require massive unittests?

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 3:47 AM, Johan Engelen wrote:
 The main stumbling block is already mentioned: ownership. If you want 
 contributors, you have to give up some ownership and be willing to make 
 compromises between your own and the new contributors' viewpoints. The lack of 
 willingness to give up ownership is what keeps me out (and I suspect, 
 indirectly, others too). The frontend source code is not nice, but I'm not drawn 
 to fix it at all (even if paid for) because I am not ashamed by it as I would be 
 if I would have some shared 'ownership' of it.

Good points, but part of the reason the front end code is what it is is because 
of many contributors with diverse viewpoints on what good code should look like. 
How should we reconcile that?

May 24 2021

poffer <poffer poffer.net> writes:

On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:
 On 5/23/2021 7:25 PM, Nicholas Wilson wrote:
 Directories.


 The #1 problem isn't directories, it's "every module imports 
 every other module" that leaves one with nowhere to start.

 We currently have:

   dmd
   dmd/root
   dmd/backend

 I regularly fend off attempts to have dmd/root import files 
 from dmd, and dmd/backend import files from dmd. I recently had 
 to talk someone out of having dmd/backend import files from 
 dmd/root.

 In other words, a failure of encapsulation.

 Let's look at one example, picked more or less because I've 
 looked at it recently, dmd/target.d. The reason for its 
 existence is to abstract target information. Its imports are:

   import dmd.argtypes_x86;
   import dmd.argtypes_sysv_x64;
   import core.stdc.string : strlen;
   import dmd.cond;
   import dmd.cppmangle;
   import dmd.cppmanglewin;
   import dmd.dclass;
   import dmd.declaration;
   import dmd.dscope;
   import dmd.dstruct;
   import dmd.dsymbol;
   import dmd.expression;
   import dmd.func;
   import dmd.globals;
   import dmd.id;
   import dmd.identifier;
   import dmd.mtype;
   import dmd.statement;
   import dmd.typesem;
   import dmd.tokens : TOK;
   import dmd.root.ctfloat;
   import dmd.root.outbuffer;
   import dmd.root.string : toDString;

 If I want to understand the code, I have to understand half of 
 the rest of the compiler. On a more abstract level, why on 
 earth would a target abstraction need to know about AST nodes? 
 At least half of these imports shouldn't be here, and if they 
 are, the code needs to be redesigned.

 Recently I needed some target information in the ImportC lexer, 
 and it would have been so easy to just import dmd.target. But 
 then that drags along all the imports that I've really tried to 
 avoid importing into the lexer.

 Iain came up with a clever solution to use a template parameter.

 Note that Phobos suffers terribly from this disease (everything 
 ultimately imports everything else), which makes it very hard 
 to understand and debug.

 Fixing this is not easy, it requires a lot of hard thinking 
 about what a module *really* needs to do. But each success at 
 eliminating an import makes it more understandable.

 Creating a false hierarchy (an implied relationship that is 
 instantly defeated by the imports) of files won't fix it.

 A good rule of thumb is:

     *** Never import a file from an uplevel directory ***

 Import sideways and down, never up.

A good enhancement to the language would be adding some sort of 
module declaration that just states the admitted import packages 
or modules. I know that could be done by an external tool, but I 
feel that this one is a common problem.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 1:40 AM, poffer wrote:
 A good enhancement to the language would be adding some sort of module 
 declaration that just states the admitted import packages or modules. I know 
 that could be done by an external tool, but I feel that this one is a common 
 problem.

Importing unused modules is a problem, but a minor one. The larger problem is 
needing those modules.

May 24 2021

poffer <poffer poffer.net> writes:

On Monday, 24 May 2021 at 09:41:12 UTC, Walter Bright wrote:
 On 5/24/2021 1:40 AM, poffer wrote:
 A good enhancement to the language would be adding some sort 
 of module declaration that just states the admitted import 
 packages or modules. I know that could be done by an external 
 tool, but I feel that this one is a common problem.

 Importing unused modules is a problem, but a minor one. The 
 larger problem is needing those modules.

No. What I mean is a declaration that, for example, allows only 
imports from dmd in dmd/backend, or declares that imports from 
dmd/root are forbidden. Aren't you the guy pushing for 
declarations over conventions? Conventions do not scale.
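
Until something like that exists in the language, the external-tool variant can be sketched roughly as below. Everything here is an assumption made up for illustration: the `mayImport` function, the `allowed` table and the rules in it are hypothetical, and a real tool would also have to extract the imports from the source files.

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

/// True if module `mod` is the package `pkg` itself or lives below it.
bool inPackage(string mod, string pkg)
{
    return mod == pkg || mod.startsWith(pkg ~ ".");
}

/// Hypothetical check: `allowed` maps a package to the package prefixes
/// it is permitted to import. Packages without an entry are unrestricted.
bool mayImport(const string[][string] allowed, string fromPkg, string imported)
{
    auto prefixes = fromPkg in allowed;
    if (prefixes is null)
        return true;
    foreach (prefix; *prefixes)
    {
        if (inPackage(imported, prefix))
            return true;
    }
    return false;
}

void main()
{
    // Illustrative rules only, not an agreed policy.
    string[][string] allowed = [
        "dmd.backend": ["dmd.backend", "core"],
        "dmd.root": ["dmd.root", "core"],
    ];
    writeln(mayImport(allowed, "dmd.backend", "core.stdc.string"));
    writeln(mayImport(allowed, "dmd.backend", "dmd.root.outbuffer"));
}
```

A declaration in the module itself would move the same table into the source, where the compiler could enforce it.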

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 3:18 AM, poffer wrote:
 On Monday, 24 May 2021 at 09:41:12 UTC, Walter Bright wrote:
 On 5/24/2021 1:40 AM, poffer wrote:
 A good enhancement to the language would be adding some sort of module 
 declaration that just states the admitted import packages or modules. I know 
 that could be done by an external tool, but I feel that this one is a common 
 problem.

 Importing unused modules is a problem, but a minor one. The larger problem is 
 needing those modules.

 
 No. What I mean is a declaration that, for example, allows only imports from dmd 
 in dmd/backend, or declares that imports from dmd/root are forbidden.

Ok, now I understand what you meant.


 Aren't you the guy pushing for declarations over conventions?

Snark isn't necessary.


 Conventions do not scale.

Please propose a DIP for your idea.

May 24 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:
   import dmd.argtypes_x86;
   import dmd.argtypes_sysv_x64;
   import core.stdc.string : strlen;
   import dmd.cond;
   import dmd.cppmangle;
   import dmd.cppmanglewin;
   import dmd.dclass;
   import dmd.declaration;
   import dmd.dscope;
   import dmd.dstruct;
   import dmd.dsymbol;
   import dmd.expression;
   import dmd.func;
   import dmd.globals;
   import dmd.id;
   import dmd.identifier;
   import dmd.mtype;
   import dmd.statement;
   import dmd.typesem;
   import dmd.tokens : TOK;
   import dmd.root.ctfloat;
   import dmd.root.outbuffer;
   import dmd.root.string : toDString;

 If I want to understand the code, I have to understand half of 
 the rest of the compiler. On a more abstract level, why on 
 earth would a target abstraction need to know about AST nodes? 
 At least half of these imports shouldn't be here, and if they 
 are, the code needs to be redesigned.

To be fair, most of this is imported because a function needs the 
definition of one or more symbols.  This can be made better by 
either:

1. Making these selective imports, or...
2. Moving type definitions of AST nodes into modules that _only_ 
contain definitions.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 2:46 AM, Iain Buclaw wrote:
 To be fair, most of this is imported because a function needs the definition of 
 one or more symbols.  This can be made better by either:
 
 1. Making these selective imports, or...

That doesn't really help, the dependencies are still there.

 2. Moving type definitions of AST nodes into modules that _only_ contain 
 definitions.

It is not critical that we fix target.d. It's just that it would be better if 
its API was not AST nodes, but just values. Let the caller construct the AST 
node from the information provided.

Like what we did for the C parser. I was happy to have it not indirectly import 
everything in dmd when all it needed was a couple values.

I'm not saying any of this is easy.

May 24 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Monday, 24 May 2021 at 10:21:44 UTC, Walter Bright wrote:
 That doesn't really help, the dependencies are still there.

It makes it clear what they are for, which makes this statement:

 If I want to understand the code, I have to understand half of 
 the rest of the compiler.

obsolete.

 It is not critical that we fix target.d. It's just that it 
 would be better if its API was not AST nodes, but just values. 
 Let the caller construct the AST node from the information 
 provided.

The majority of the API are values, but it still needs to be fed 
AST information in order to make informative decisions.

For instance, how else would we be able to infer 
`isReturnOnStack` without a `TypeFunction`?  Even GDC needs the 
completed `TypeFunction`, as I generate a `tree` on-the-fly and 
pass that to GCC's back-end API to get said information.

 Like what we did for the C parser. I was happy to have it not 
 indirectly import everything in dmd when all it needed was a 
 couple values.

 I'm not saying any of this is easy.

Target's first goal of removing all `global.params.isXXX` fields 
was never going to be easy either. :-)

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 3:51 AM, Iain Buclaw wrote:
 For instance, how else would we be able to infer `isReturnOnStack` without a 
 `TypeFunction`?

Challenge accepted!

Let's see. The only things the TypeFunction is needed for are:

1. the return type
2. the function linkage
3. if the function returns a ref

Pass those in instead, and the need for TypeFunction goes away.

https://github.com/dlang/dmd/blob/master/src/dmd/target.d#L762

(1) can be further broken down into "is it a POD", etc.

Breaking all this info out of a TypeFunction takes some code, but this 
"decomposition" can be done by a wrapper function in another module.

The end result is target.d can be completely independent of the compiler's 
internal AST structures.

But wait! There's more!

Notice how isReturnOnStack depends on several random global variables like os, 
is64bit, and isPOSIX. They can be passed in as arguments, too (or passed in as a 
const ref to a struct containing those as members).

isReturnOnStack() then becomes a pure function! (and safe, nogc, nothrow, all 
that good stuff)

Those initialize() functions go away, too.

The beauty now becomes that we can (at last!) easily and correctly write 
unittests for it. target.d now becomes INDEPENDENT of the rest of the compiler.

How sweet it will be!
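
A minimal sketch of that shape, with loud assumptions: the `TargetInfo` and `ReturnInfo` structs are hypothetical, and the size rule inside the function is a made-up placeholder, not dmd's actual ABI logic. It only shows that a values-in, bool-out function can carry all those attributes and be tested without any AST.

```d
import std.stdio : writeln;

/// Hypothetical value-only description of the target.
struct TargetInfo
{
    bool is64bit;
}

/// Hypothetical value-only description of what a function returns.
struct ReturnInfo
{
    size_t size; // size of the return type in bytes
    bool isPOD;  // is it a plain-old-data type?
    bool isRef;  // does the function return by ref?
}

/// Placeholder rule for illustration only, NOT the real ABI decision.
bool isReturnOnStack(const ref TargetInfo target, const ref ReturnInfo ret)
    pure nothrow @nogc @safe
{
    if (ret.isRef)
        return false; // a ref is returned like a pointer
    if (!ret.isPOD)
        return true;  // assume non-POD goes through hidden memory
    const size_t limit = target.is64bit ? 16 : 8;
    return ret.size > limit;
}

void main()
{
    const target = TargetInfo(true);
    const small = ReturnInfo(8, true, false);
    const big = ReturnInfo(64, true, false);
    writeln(isReturnOnStack(target, small), " ", isReturnOnStack(target, big));
}
```

The decomposition of a TypeFunction into a `ReturnInfo` would live in a wrapper in another module.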

May 24 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Monday, 24 May 2021 at 22:18:35 UTC, Walter Bright wrote:
 On 5/24/2021 3:51 AM, Iain Buclaw wrote:
 For instance, how else would we be able to infer 
 `isReturnOnStack` without a `TypeFunction`?

 Challenge accepted!

 Let's see. The only things the TypeFunction is needed for are:

 1. the return type
 2. the function linkage
 3. if the function returns a ref

 Pass those in instead, and the need for TypeFunction goes away.

I still see a Type though. :-)

On my side, in pseudo-code this would become:

```
   tree type = build_gcc_type (return_type);
   if (isref)
     type = build_reference_type (type);
   return targetm.calls.return_in_memory (type);
```

Or alternatively, I could just abandon all accuracy and go with:

```
   return (return_type.ty == Tstruct || return_type.ty == Tsarray)
          && !isref;
```

Because I know that this function doesn't affect the code 
generator, though users won't be able to reliably do 
introspection.

 Notice how isReturnOnStack depends on several random global 
 variables like os, is64bit, and isPOSIX. They can be passed in 
 as arguments, too (or passed in as a const ref to a struct 
 containing those as members).

They have been moved to the internal state of Target, so they are no 
longer random globals.

Information such as the target OS or CPU should not be floating 
around the front-end.  It should all be constrained to either the 
dmd.target interface or dmd.backend, leaving the front-end to 
only handle matters relating to language semantics.

 isReturnOnStack() then becomes a pure function! (and safe, 
 nogc, nothrow, all that good stuff)

 Those initialize() functions go away, too.

 The beauty now becomes that we can (at last!) easily and 
 correctly write unittests for it. target.d now becomes 
 INDEPENDENT of the rest of the compiler.

 How sweet it will be!

I think target.d could instead benefit from breaking out into 
per-backend modules though, such as target_linux.d, 
target_freebsd.d, target_x86.d, target_x86_64.d, ... to separate 
out concerns of the OS with concerns of the CPU.  It would be 
something completely dmd-specific though, as I don't use/re-use 
any part of what's present in dmd's source tree around this 
module.

May 24 2021

zjh <fqbqrr 163.com> writes:

We should add a `favor function` to the forum post.

May 24 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/24/21 5:46 AM, Iain Buclaw wrote:
 On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:
   import dmd.argtypes_x86;
   import dmd.argtypes_sysv_x64;
   import core.stdc.string : strlen;
   import dmd.cond;
   import dmd.cppmangle;
   import dmd.cppmanglewin;
   import dmd.dclass;
   import dmd.declaration;
   import dmd.dscope;
   import dmd.dstruct;
   import dmd.dsymbol;
   import dmd.expression;
   import dmd.func;
   import dmd.globals;
   import dmd.id;
   import dmd.identifier;
   import dmd.mtype;
   import dmd.statement;
   import dmd.typesem;
   import dmd.tokens : TOK;
   import dmd.root.ctfloat;
   import dmd.root.outbuffer;
   import dmd.root.string : toDString;

 If I want to understand the code, I have to understand half of the 
 rest of the compiler. On a more abstract level, why on earth would a 
 target abstraction need to know about AST nodes? At least half of 
 these imports shouldn't be here, and if they are, the code needs to be 
 redesigned.

 
 To be fair, most of this is imported because a function needs the 
 definition of one or more symbols.  This can be made better by either:
 
 1. Making these selective imports, or...
 2. Moving type definitions of AST nodes into modules that _only_ contain 
 definitions.

Yes, and these are good incremental steps that help a lot, are low cost, 
and inform larger refactorings. There should be active work on pushing 
imports down to where they're used.

My dream: top-level imports will become an antipattern in large D code.
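
As an illustration of pushing an import down to where it is used (a sketch only; `cStringLength` is a hypothetical helper, not a function in dmd), a selective import can live inside the one function that needs it:

```d
/// Hypothetical helper: returns the length of a NUL-terminated C string.
size_t cStringLength(const(char)* s) nothrow @nogc
{
    // Scoped, selective import: only `strlen` becomes visible, and only
    // inside this function body. The module's top level stays clean.
    import core.stdc.string : strlen;
    return strlen(s);
}
```

A reader of the module header then sees no dependency on `core.stdc.string` at all; the dependency is visible exactly where it matters.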

May 24 2021

Dukc <ajieskola gmail.com> writes:

On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:
 A good rule of thumb is:

     *** Never import a file from an uplevel directory ***

 Import sideways and down, never up.

You may want to reconsider what you just said.

Do you really insist that `std.stdio` should copy-paste the CLib 
headers instead of importing `core.stdc.stdio`?

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 1:35 PM, Dukc wrote:
 On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:
 A good rule of thumb is:

     *** Never import a file from an uplevel directory ***

 Import sideways and down, never up.

 
 You may want to reconsider what you just said.
 
 Do you really insist that `std.stdio` should copy-paste the CLib headers
 instead of importing `core.stdc.stdio`?

I knew someone would bring that up. :-) It's a good question.

`core` is a separate library in its own, independent hierarchy, it is not in
the `std` hierarchy. It is not "up sideways and down". So it's good.

Now, if std.stdio imported core.stdc.stdio, and core.stdc.stdio imported 
std.stdio, then you've got a really bad design.

May 24 2021

Dukc <ajieskola gmail.com> writes:

On Monday, 24 May 2021 at 22:21:46 UTC, Walter Bright wrote:
 `core` is a separate library in its own, independent hierarchy, 
 it is not in the `std` hierarchy. It is not "up sideways and 
 down". So it's good.

Shouldn't the same reasoning apply to `import`ing `dmd.root` from 
`dmd.backend`? If I understood right, `dmd.root` is designed to 
act like an external utility library. It should be no problem to 
`import` it, as long as `dmd.root` does not try to import the rest of 
DMD.

 Now, if std.stdio imported core.stdc.stdio, and core.stdc.stdio 
 imported std.stdio, then you've got a really bad design.

It sounds like your real issue is circular imports, not parent 
package imports. That sounds more reasonable to me.
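
To make the distinction concrete, here is a small sketch of a check for circular imports over a module graph. Everything in it is an assumption for illustration: `hasImportCycle` and the graph representation (module name mapped to the list of modules it imports) are hypothetical and not part of dmd or any existing tool.

```d
enum Mark { unvisited, inProgress, done }

/// Returns true if the import graph contains a cycle.
/// `imports` maps a module name to the modules it imports (hypothetical format).
bool hasImportCycle(const string[][string] imports)
{
    Mark[string] marks;

    // Depth-first search; reaching a module that is still "in progress"
    // means we followed imports back to where we started.
    bool visit(string mod)
    {
        final switch (marks.get(mod, Mark.unvisited))
        {
            case Mark.inProgress: return true;
            case Mark.done:       return false;
            case Mark.unvisited:  break;
        }
        marks[mod] = Mark.inProgress;
        if (auto deps = mod in imports)
            foreach (dep; *deps)
                if (visit(dep))
                    return true;
        marks[mod] = Mark.done;
        return false;
    }

    foreach (mod, deps; imports)
        if (visit(mod))
            return true;
    return false;
}
```

With this view, `std.stdio` importing `core.stdc.stdio` is fine, while the mutual import described above is exactly the case the check reports.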

May 24 2021

Dibyendu Majumdar <mobile majumdar.org.uk> writes:

On Monday, 24 May 2021 at 02:56:14 UTC, Walter Bright wrote:

 every other module" that leaves one with nowhere to start.

 I regularly fend off attempts to have dmd/root import files 
 from dmd, and dmd/backend import files from dmd. I recently had 
 to talk someone out of having dmd/backend import files from 
 dmd/root.

Wow - that's pretty fundamental. How does such code get in? I 
assume you own the DMD code - all changes should be vetted and 
approved by you?

Btw, one lesson from Linux maintenance is that the owners should 
spend all their time reviewing code - not writing code!

May 25 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/25/2021 3:16 AM, Dibyendu Majumdar wrote:
 Wow - that's pretty fundamental. How does such code get in? I assume you own
 the DMD code - all changes should be vetted and approved by you?

There are many people who have pull privileges.

May 27 2021

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Monday, 24 May 2021 at 02:25:33 UTC, Nicholas Wilson wrote:
 On Sunday, 23 May 2021 at 06:12:30 UTC, Ola Fosheim Grøstad 
 wrote:
 The number one challenge I see is keeping track of DMD as it 
 is released with new improvements. Basically reapplying the 
 changes made to the experimental branch to the main branch 
 (aka "rebasing"?).

 (that is the correct terminology). I suspect this is more of a 
 problem for people that are less familiar with git, which might 
 well also include people wanting to play around with DMD, e.g. 
 GSoC/SAoC students.
 I know this was the case for me while developing dcompute with 
 the added difficulty of tracking LLVM on top of LDC (which was 
 kept in sync with DMD).

 I suspect that kills many efforts, meaning people create a 
 fork, start making changes, but then a new version of DMD is 
 released and the fork is left to dry in the sun as rebasing is 
 not fun. And well, a hobby that isn't fun, is not a good 
 hobby. :-D

 The solution to this is better git skills not so much better 
 compiler skills/knowledge of DMD although a merge conflict in a 
 critical piece of code is always a PiTA. We now have 
 slack/discord for people to ask these kinds of questions, which 
 I'm sure they will get answered if they are trying to do 
 something interesting or fix an annoying problem.

I think I should have used the term "boring" rather than 
"challenging".

I doubt that git skills would solve it as I think it is more 
related to what a hobby is to people who are older and have a 
very long spare time todo-list. Any "unproductive" and "unfun" 
chore will go to the bottom of the todo-list. My 
I-really-ought-todo-list is so long that it could fill up the 
rest of my life...

So it is basically easier to just stay on an outdated dmd-branch 
for a couple of years, rather than keeping track of it... which 
is not a good strategy.

Think of it like this: I have 2-5 hours a week for completely 
unnecessary, but fun things like hacking a new IR + optimization 
in between DMD and LLVM. So, what should I do: do my taxes, rebase 
my fork, watch Eurovision with family? Rebasing is down there 
with taxes, except I have to do the taxes eventually, just not 
this Saturday... (Ok, so we watch Eurovision then just to find 
out how bad it is? :-)

I think it would not be too difficult to get to a situation where 
you have well-defined entry points, hooks, and layers that make it 
more of a plugin experience.

Examples of potential plug-and-play:

1. Add new experimental syntax: The parser is quite close. It 
would not take a lot of work to encapsulate a manager of 
(file-extension, Parser) pairs that have no overhead (compile 
time). Ok, so if you want to extend the language as an experiment, 
just duplicate the parser, modify it and plug it in. This is 
low-hanging fruit.

2. Add new semantics: add a new file with functions with custom 
intrinsics that are somehow added to the runtime, use your custom 
parser to lower your custom syntax to these custom runtime 
functions. Inject yourself between the front-end and backend 
(assuming a high level IR), pick up the custom intrinsics and do 
the analysis/transforms you want.

3. Add new high level optimization, like ARC: same as 2, except 
you only add new passes in a new file and possibly some new 
fields to the high level IR. Then edit a config file that makes 
the pass available and executed at the right time (with respect 
to other passes).

So, the basic idea is, that instead of _modifying_ the compiler, 
you add new files to it and bring them into the compiler by 
hooks, configuration files etc.

Then you can also merge and combine contributions from many 
different extension authors much more easily, and easily replace 
one extension with a better one.
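
A rough sketch of the "(file-extension, Parser) manager" idea, to show how little glue it might need. All names here (`ParserRegistry`, `ParseFn`, `parseD`, `parseExperimental`, the `.dx` extension) are hypothetical, and the "parsers" just return strings; dmd has no such registry today.

```d
import std.path : extension;

/// Hypothetical parser entry point: takes source text, returns some result.
alias ParseFn = string function(string source);

/// Hypothetical registry mapping file extensions to parser entry points.
struct ParserRegistry
{
    private ParseFn[string] parsers;

    /// Plug in a parser for files ending in `ext` (including the dot).
    void add(string ext, ParseFn parse)
    {
        parsers[ext] = parse;
    }

    /// Returns the parser registered for this file, or null if none is.
    ParseFn parserFor(string filename)
    {
        if (auto p = extension(filename) in parsers)
            return *p;
        return null;
    }
}

// Stand-ins for the stock parser and a duplicated, modified copy of it.
string parseD(string source) { return "D AST for: " ~ source; }
string parseExperimental(string source) { return "experimental AST for: " ~ source; }
```

An experiment would then only add its own file plus one `add(".dx", &parseExperimental)` call, leaving the stock parser untouched, which should keep rebasing conflicts small.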

 Urgh. Dealing with 10000 line files and 1000 line functions is 
 such a drain on trying to get stuff done (looking at you 
 expressionsem.d). However this needs to be combined with 
 directories/packages or it will not improve the situation.

Yes, but one can create virtual directories though. E.g. in some 
editors you can group files from different directories so it 
looks like they are in one directory. You can do something 
similar with "ln -s", but it isn't optimal...


 Which items are feasible in the next 6 months?

 Directories.

Sounds like a good start. I still think the high level IR is the 
most pressing one, as not having that abstraction makes adding 
new experimental semantics too time consuming for hobbyists.

I had the idea that I could do ARC by adding intrinsics to LLVM, 
but Apple engineers strongly advised against it and strongly 
suggested working on a high level IR instead.

ARC is something well suited for a hobbyist, as you can implement 
it in a gradual manner if you have a high level IR (one tweak 
here, one tweak there).

Anyway, I think more experimentation is needed. Say, if 1 out of 
10 experiments made it into the main dmd, then there could be 
more interesting options that would make dmd stand out in the 
crowd.

IMHO The key challenge is to make experimentation fun for people 
who have limited time (which happens as you get older).

Imagine if D could get some of the people that were active with D 
10-15 years ago, but currently have very limited time, to create 
their own experiments? I am sure that many of those have grown to 
capable programmers since then, so that could be something to 
think about.

It has to be a fun experience throughout for people to spend those 
3-4 spare hours a week on compiler hacking.

May 24 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Monday, 24 May 2021 at 02:25:33 UTC, Nicholas Wilson wrote:
 On Sunday, 23 May 2021 at 06:12:30 UTC, Ola Fosheim Grøstad
 5. Use directories.

 Yes!!! sooo much yes! see above.

You can't complain unless you've had a go at making a change to 
Ada.

[gcc] (master) $ ls gcc/c | wc -l
19
[gcc] (master) $ ls gcc/cp | wc -l
87
[gcc] (master) $ ls gcc/fortran | wc -l
99
[gcc] (master) $ ls gcc/d/dmd | wc -l
114
[gcc] (master) $ ls gcc/ada | wc -l
565


I do tend to agree though that we should try to respect Dunbar's 
number when it comes to these things.  But the individual file 
count does not map to reality.

May 24 2021

Nicholas Wilson <iamthewilsonator hotmail.com> writes:

On Monday, 24 May 2021 at 09:41:08 UTC, Iain Buclaw wrote:
 On Monday, 24 May 2021 at 02:25:33 UTC, Nicholas Wilson wrote:
 On Sunday, 23 May 2021 at 06:12:30 UTC, Ola Fosheim Grøstad
 5. Use directories.

 Yes!!! sooo much yes! see above.

 You can't complain unless you've had a go at making a change to 
 Ada.

 [gcc] (master) $ ls gcc/c | wc -l
 19
 [gcc] (master) $ ls gcc/cp | wc -l
 87
 [gcc] (master) $ ls gcc/fortran | wc -l
 99
 [gcc] (master) $ ls gcc/d/dmd | wc -l
 114
 [gcc] (master) $ ls gcc/ada | wc -l
 565

Eee gads!

 I do tend to agree though that we should try to respect 
 Dunbar's number when it comes to these things.  But the 
 individual file count does not map to reality.

it makes it difficult to navigate, especially so when you are 
unfamiliar with the code base.

May 24 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Sunday, 23 May 2021 at 06:12:30 UTC, Ola Fosheim Grøstad wrote:
 7. Tutorials.

8. Proper module naming, not abbreviations. Abbreviations need to 
be remembered, and that is additional mental workload for a new 
volunteer.
9. Proper variable naming, not abbreviations. I really tried to 
understand some code, but got discouraged once I met all those 
abbreviated variable names. I literally had to stuff my 
memory with what those abbreviations meant instead of trying to 
keep the thread of the logic that the code is implementing.
10. Split up humongous methods and objects; they are rude to new 
volunteers and discourage any code improvement.
11. Perhaps some tutorial on how to find one's way around the dmd 
internals, with a nice abstract class diagram explaining the key 
elements of dmd objects and how they interact. This would give at 
least some kind of overview of what does what in dmd.

I really was interested in doing some dmd bug fixes, but 8, 9, and 
10 make the code take too much time and willpower just to 
understand. It was and is a huge barrier for me to try and 
fix/improve dmd.

P.S. And no, e,exp,aa and other kind of abbreviations except for 
loop indexes, are not always obvious, and do take mental power 
and memory, while trying to understand existing code. They are 
not simple for new volunteers to dmd.

Best regards,
Alexandru.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 2:44 AM, Alexandru Ermicioi wrote:
 They are not simple for new volunteers to dmd.

You're right, they are not. They're optimized for the people who spend
thousands of hours working on it.

This inevitably happens with every profession, every discipline, and every 
project. A jargon specific to it grows up around it, for the convenience of the 
people who work on it every day. If the jargon is consistent and reasonably 
logical, it can be a great aid to understanding once one gets familiar with it.

Unfortunately, I have failed at my original design goal of making DMD a simple 
compiler. Reshuffling files around and renaming things will not help. What will 
help is better encapsulation - unfortunately, that is hard to do.

There are some reasonably well-encapsulated parts. The lexer, the parser, and 
the files in the root package. To understand the compiler, I'd start there.

May 24 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Monday, 24 May 2021 at 10:34:35 UTC, Walter Bright wrote:
 On 5/24/2021 2:44 AM, Alexandru Ermicioi wrote:
 They are not simple for new volunteers to dmd.

 You're right, they are not. They're optimized for the people 
 who spend thousands of hours working on it.

 This inevitably happens with every profession, every 
 discipline, and every project. A jargon specific to it grows up 
 around it, for the convenience of the people who work on it 
 every day. If the jargon is consistent and reasonably logical, 
 it can be a great aid to understanding once one gets familiar 
 with it.

Well, there is no dictionary for those abbreviations, and it is 
hard to decipher them when looking into kilometer-long code. 
That is my experience with dmd code:
1. I stumble on a compiler bug.
2. File a bug report.
3. No one fixes it in a couple of days, and I think perhaps I can 
fix this bug, since it's not complicated and should be a couple of 
lines.
4. I download dmd and try to compile it somehow, because the dub 
build either froze or failed, but I somehow manage by 
using the older build system dmd had.
5. Then finally I can start changing code?
6. No, first find which module and class is responsible for that 
code, in an ocean of modules named from an ocean of abbreviations, 
or with misleading names.
7. Oh well, after wasting an hour or two of the 3 to 4 that you 
have, you find it.
8. Then you look into the kilometer-long function. You seem to 
find the piece of code that might be the cause of the bug, and 
try to understand it better.
9. You read that code, trying to keep in mind the entire code flow 
you've read up to this point, and suddenly there is an 'aa'.
10. You try to figure out what 'aa' means, but fail to do so, 
therefore you need to look at its declaration to learn the type 
and figure it out from its type.
11. You find the type of the variable, and rejoice at deciphering it 
as 'associative array', yay.
12. Okay, let's go back to the line with 'aa'.
13. First find that line again, if for some reason your IDE didn't 
retain it.
14. Once there, you continue reading, but wait, what was before 
the line with 'aa'?
15. Damn, I forgot. Sigh, I have to read all the code again.


That is my experience with all the abbreviations in dmd, which are 
like an ocean.
It is ok to have a couple of well-defined and documented 
abbreviations, not an ocean of them without any documentation. It 
is not my job to fix dmd; I wanted to do something when I had a 
couple of hours to invest. It is not rewarding when those couple 
of hours are wasted on deciphering abbreviations, without even 
getting to understand the flow of the code itself.

Please limit the use of abbreviations to a minimum, and those that 
are used should be documented.

 There are some reasonably well-encapsulated parts. The lexer, 
 the parser, and the files in the root package. To understand 
 the compiler, I'd start there.

Yet there is no official guidance on where to start. Also, please 
note that not all volunteers prefer reading source code, and 
invest hours at understanding the architecture and inner 
workings, starting from lexer or parser, some of them just want 
to fix a small bug, and be done with it. It is extremely hard to 
do that now.

Best regards,
Alexandru.

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 4:24 AM, Alexandru Ermicioi wrote:
 Please limit use of abbreviations to minimum, and those that are used, should
 be documented.

   grep -w aa *.d

yields:

argtypes_aarch64.d: * https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst.
clone.d:             * int[S] aa;   // Currently AA key uses bitwise comparison
dinterpret.d:                 *     aa[i][j] op= newval;
dinterpret.d:                 *     aa = [i:[j:T.init]];
dinterpret.d:                 *     aa[j] op= newval;
dinterpret.d:            // Create a CTFE pointer &aa[index]
dinterpret.d:private Expression interpret_aaApply(UnionExp* pue, InterState* istate, Expression aa, Expression deleg)
dinterpret.d:    aa = interpret(aa, istate);
dinterpret.d:    if (exceptionOrCantInterpret(aa))
dinterpret.d:        return aa;
dinterpret.d:    if (aa.op != TOK.assocArrayLiteral)
dinterpret.d:    AssocArrayLiteralExp ae = cast(AssocArrayLiteralExp)aa;
dmangle.d:    private extern(D) bool backrefImpl(T)(ref AssocArray!(T, size_t) aa, T key)
dmangle.d:        auto p = aa.getLvalue(key);
dsymbol.d:        AliasAssign aa = new AliasAssign(loc, ident,
dsymbol.d:        return aa;
e2ir.d:            elem *aa = toElem(ie.e2, irs);
e2ir.d:            // aaInX(aa, keyti, key);
e2ir.d:            elem *ep = el_params(key, keyti, aa, null);
e2ir.d:                //      *aaGetY(aa, aati, valuesize, &key);
e2ir.d:                //      *aaGetRvalueX(aa, keyti, valuesize, &key);
expression.d:         *     aa[k1][k2][k3] op= val;
expression.d:         *     auto ref __aatmp = aa;
expressionsem.d:                 *  aa.remove(arg) into delete aa[arg]
expressionsem.d:                    ce.error("expected key as argument to `aa.remove()`");
expressionsem.d:                     *      aa[key] = e2;
expressionsem.d:                     *      ref __aatmp = aa;
sideeffect.d:             *  S[int] aa;
sideeffect.d:             *  aa[1] = 0;
sideeffect.d:             *  1 in aa ? aa[1].value = 0 : (aa[1] = 0, aa[1].this(0)).value;
sideeffect.d:             *  int value = (aa[1] = 0);    // value = aa[1].value

which is not perfect, but very helpful. It usually means "Associative Array", 
but sometimes "AliasAssign". grep is very, very handy at this sort of thing. I 
use it constantly.


 Yet there is no official guidance on where to start. Also, please note that
 not all volunteers prefer reading source code, and invest hours at
 understanding the architecture and inner workings, starting from lexer or
 parser, some of them just want to fix a small bug, and be done with it. It is
 extremely hard to do that now.

Start here:

   https://github.com/dlang/dmd/blob/master/src/dmd/README.md

Each source file has handy links at the start. For example, dsymbol.d:

   https://github.com/dlang/dmd/blob/master/src/dmd/dsymbol.d

has a link to its documentation generated from Ddoc:

   https://dlang.org/phobos/dmd_dsymbol.html

May 26 2021

zjh <fqbqrr 163.com> writes:

On Thursday, 27 May 2021 at 05:08:41 UTC, Walter Bright wrote:
 On 5/24/2021 4:24 AM, Alexandru Ermicioi wrote:
 Please limit use of abbreviations to minimum, and those that 
 are used, should be documented.

   grep -w aa *.d

We have good articles and good posts, 
but no systematic arrangement of them. So you explain things again 
and again, but others still don't know.
We need to open another section to organize the good 
posts/information.
We need good organization of man/info/code/....

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 05:22:26 UTC, zjh wrote:
 On Thursday, 27 May 2021 at 05:08:41 UTC, Walter Bright wrote:
 On 5/24/2021 4:24 AM, Alexandru Ermicioi wrote:
 Please limit use of abbreviations to minimum, and those that 
 are used, should be documented.

   grep -w aa *.d

 We have good articles and good posts, 
 but no systematic arrangement of them. So you explain things again 
 and again, but others still don't know.
 We need to open another section to organize the good 
 posts/information.
 We need good organization of man/info/code/....

No need, just do this:

May 26 2021

zjh <fqbqrr 163.com> writes:

On Thursday, 27 May 2021 at 05:22:26 UTC, zjh wrote:
 On Thursday,

Stability is very important.
And a `good architecture/organization` can fix problems as 
quickly as possible.
D cannot rely on only `1/2` people, or just keep adding features 
and fixing bugs.
What D needs is good organization of `code/people/information`.
D needs people to participate. Good organization is very 
important. `Good organization` gets twice the result with half the 
effort.
When things are well organized, people are naturally willing to 
participate, and errors can be fixed quickly.
Changing the organization doesn't mean changing the implementation, 
so it should not introduce many errors.

May 26 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 05:08:41 UTC, Walter Bright wrote:
 grep is very,  very handy at this sort of thing. I use it 
 constantly.

We are stuck in a 70s mainframe.

May 26 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Monday, 24 May 2021 at 10:34:35 UTC, Walter Bright wrote:
 On 5/24/2021 2:44 AM, Alexandru Ermicioi wrote:
 They are not simple for new volunteers to dmd.

 You're right, they are not. They're optimized for the people 
 who spend thousands of hours working on it.

 This inevitably happens with every profession, every 
 discipline, and every project. A jargon specific to it grows up 
 around it, for the convenience of the people who work on it 
 every day. If the jargon is consistent and reasonably logical, 
 it can be a great aid to understanding once one gets familiar 
 with it.

 Unfortunately, I have failed at my original design goal of 
 making DMD a simple compiler. Reshuffling files around and 
 renaming things will not help. What will help is better 
 encapsulation - unfortunately, that is hard to do.

 There are some reasonably well-encapsulated parts. The lexer, 
 the parser, and the files in the root package. To understand 
 the compiler, I'd start there.

I seriously question the "Optimized for people who spend 
thousands of hours working on it" line, as I had a very 
intelligent person post on Slack asking what a certain 
function does, as there are no comments for said function.

-Alex

May 24 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/24/21 9:53 AM, 12345swordy wrote:
 On Monday, 24 May 2021 at 10:34:35 UTC, Walter Bright wrote:
 On 5/24/2021 2:44 AM, Alexandru Ermicioi wrote:
 They are not simple for new volunteers to dmd.

 You're right, they are not. They're optimized for the people who spend 
 thousands of hours working on it.

 This inevitably happens with every profession, every discipline, and 
 every project. A jargon specific to it grows up around it, for the 
 convenience of the people who work on it every day. If the jargon is 
 consistent and reasonably logical, it can be a great aid to 
 understanding once one gets familiar with it.

 Unfortunately, I have failed at my original design goal of making DMD 
 a simple compiler. Reshuffling files around and renaming things will 
 not help. What will help is better encapsulation - unfortunately, that 
 is hard to do.

 There are some reasonably well-encapsulated parts. The lexer, the 
 parser, and the files in the root package. To understand the compiler, 
 I'd start there.

 
 I seriously question the "Optimized for people who spend thousands of 
 hours working on it" line, as I had a very intelligent person post on 
 Slack asking what a certain function does, as there are no comments for 
 said function.

Adding documentation would be another good investment with terrific 
dividends. Again it minds my boggle that people talk about big changes 
(and no doubt would be willing to try them) but can't be bothered to 
make small changes with disproportionately good impact.

May 24 2021

Max Haughton <maxhaton gmail.com> writes:

On Monday, 24 May 2021 at 20:47:39 UTC, Andrei Alexandrescu wrote:
 On 5/24/21 9:53 AM, 12345swordy wrote:
 On Monday, 24 May 2021 at 10:34:35 UTC, Walter Bright wrote:
 [...]

 
 I seriously question the "Optimized for people who spend 
 thousands of hours working on it" line, as I had a very 
 intelligent person post on Slack asking what a certain 
 function does, as there are no comments for said function.

 Adding documentation would be another good investment with 
 terrific dividends. Again it minds my boggle that people talk 
 about big changes (and no doubt would be willing to try them) 
 but can't be bothered to make small changes with 
 disproportionately good impact.

Where do you start? i.e. there's always work to be done but 
unless you enforce change from the top you're blocking a river at 
the mouth (to play devil's advocate)

May 24 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Monday, 24 May 2021 at 21:01:05 UTC, Max Haughton wrote:
 On Monday, 24 May 2021 at 20:47:39 UTC, Andrei Alexandrescu 
 wrote:
 On 5/24/21 9:53 AM, 12345swordy wrote:
 On Monday, 24 May 2021 at 10:34:35 UTC, Walter Bright wrote:
 [...]

 
 I seriously question the "Optimized for people who spend 
 thousands of hours working on it" line, as I had a very 
 intelligent person post on Slack asking what a certain 
 function does, as there are no comments for said function.

 Adding documentation would be another good investment with 
 terrific dividends. Again it minds my boggle that people talk 
 about big changes (and no doubt would be willing to try them) 
 but can't be bothered to make small changes with 
 disproportionately good impact.

 Where do you start? i.e. there's always work to be done but 
 unless you enforce change from the top you're blocking a river 
 at the mouth (to play devil's advocate)

The reason I put documentation low on my list is that it has a 
high maintenance cost if you are going to redesign.

Also, it has not been a hindrance for experimentation for me. 
Probably a hindrance for fixing bugs, but that is not the topic..

In general, let us try to focus on macro issues; there is no 
need for dmd to be perfect in order to better support 
experimentation. Partitioning and interfacing are more important 
than statement and block level issues. Micro issues such as 
imports and the number of OutBuffer implementations are low impact 
issues, those are more aesthetic in nature...

May 24 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 2:01 PM, Max Haughton wrote:
 Where do you start?

At the first function you notice that has poor/missing/wrong documentation.

Like this one I just did:

https://github.com/dlang/dmd/pull/12570

May 24 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 00:03:56 UTC, Walter Bright wrote:
 On 5/24/2021 2:01 PM, Max Haughton wrote:
 Where do you start?

 At the first function you notice that has poor/missing/wrong 
 documentation.

 Like this one I just did:

 https://github.com/dlang/dmd/pull/12570

This is not helpful. Too much commenting makes the code even 
harder to read and drowns out important comments.  This corporate 
illness (which assumes that programmers are idiots) is why 
editors now ship with hide-all-comments functionality... Good 
code with good naming needs only few comments and those are on an 
_algorithmic_ level.

Nobody who has read an introductory book on compilers needs a 
comment explaining a function that is looking up a symbol from a 
symbol table. If that is a problem, improve the name, use a longer 
name.

May 24 2021

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 04:21:08 UTC, Ola Fosheim Grostad 
wrote:
 Nobody who has read an introductory book on compilers needs a 
 comment explaining a function that is looking up a symbol from 
 a symbol table. If that is a problem, improve the name, use a 
 longer name.

An improvement that would have made any comment on 
"lookup(symbol)" superfluous is to have a signature that 
indicates whether a returned pointer can be null or not.

May 25 2021

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 04:21:08 UTC, Ola Fosheim Grostad 
wrote:
 Nobody who has read an introductory book on compilers needs a 
 comment explaining a function that is looking up a symbol from 
 a symbol table. If that is a problem, improve the name, use a 
 longer name.

Of course, it is nice to know that lookup(symbol) can return 
null, but that should be visible in the signature.

That can be covered by some "nullable/notnullable" or "optional" 
wrapper. Just follow some established coding guidelines for 
signatures.
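
A minimal sketch of what such a signature could look like, using Phobos' `std.typecons.Nullable`. `Symbol` and `SymbolTable` here are simplified, hypothetical stand-ins and not dmd's actual classes:

```d
import std.typecons : Nullable, nullable;

/// Hypothetical, simplified symbol.
struct Symbol
{
    string name;
}

/// Hypothetical, simplified symbol table.
struct SymbolTable
{
    private Symbol[string] symbols;

    void insert(Symbol s)
    {
        symbols[s.name] = s;
    }

    /// The return type itself says "may be absent"; no comment is needed
    /// to warn the caller that the lookup can fail.
    Nullable!Symbol lookup(string name)
    {
        if (auto p = name in symbols)
            return nullable(*p);
        return Nullable!Symbol.init;
    }
}
```

The caller has to go through `isNull` or `get` to reach the symbol, so the possibility of a failed lookup is stated in the signature and not in prose.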

I guess my point is: there is a big difference between 
documentation for a public API where the user is not supposed to 
read the code and internal relations in an application.

May 25 2021

sighoya <sighoya gmail.com> writes:

On Tuesday, 25 May 2021 at 04:21:08 UTC, Ola Fosheim Grostad 
wrote:

 This is not helpful. Too much commenting makes the code even 
 harder to read and drowns out important comments.  This 
 corporate illness (which assumes that programmers are idiots) 
 is why editors now ship with hide-all-comments functionality... 
 Good code with good naming needs only few comments and those 
 are on an _algorithmic_ level.

You can't encode the full semantics into one function name with 
parameter names without bloating those names.
Though I concur with you on better naming, or at least no 
abbreviations.
I even find the code to be more structured with comment blocks; 
in my eyes they help to visualize the code structure better.


 Nobody who has read an introductory book on compilers needs a 
 comment explaining a function that is looking up a symbol from 
 a symbol table. If that is a problem, improve the name, use a 
 longer name.

+1 for `Symbol lookUpSymbol(string symbolName)`

However, small comments inside the function would also be 
beneficial.
A good example of comments is the ast module of nim:

https://github.com/nim-lang/Nim/blob/devel/compiler/ast.nim

May 25 2021

Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
 You can't encode the full semantics into one function name with 
 parameter names without bloating those names.

We can assume that the reader has read a book on compiler design 
and is familiar with the terminology and the most common 
algorithms. Provide a reference to wikipedia if unsure if the 
reader is with you...

Functions that are only called from a few places can have long 
descriptive names, that is not a negative.

 However, small comments inside the function would also be 
 beneficial.

Yes, obviously. But adding 6 lines of comments for every trivial 
function is not helpful. It is a useless policy. It is a policy 
for the sake of having a policy.

If time is invested in documenting things that should be 
changed... then change becomes less likely: "look, the 
documentation is over there, change not needed".

Anyway, documentation is the wrong solution to structural issues. 
It does not enable anything.

It is kinda like saying a city does not need road signs because 
there is a good map available. Or that a city that is a labyrinth 
of one-way streets is easy to navigate with the right kind of 
map. Driving while looking at a map is not a good experience. And 
when things change, can you then trust the map?

*shrug*

May 25 2021

jmh530 <john.michael.hall gmail.com> writes:

On Tuesday, 25 May 2021 at 09:05:26 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
 [...]

 We can assume that the reader has read a book on compiler 
 design and is familiar with the terminology and the most common 
 algorithms. Provide a reference to wikipedia if unsure if the 
 reader is with you...

 Functions that are only called from a few places can have long 
 descriptive names, that is not a negative.

 [...]

 Yes, obviously. But adding 6 lines of comments for every 
 trivial function is not helpful. It is a useless policy. It is 
 a policy for the sake of having a policy.

 If time is invested in documenting things that should be 
 changed... then change becomes less likely: "look, the 
 documentation is over there, change not needed".

 Anyway, documentation is the wrong solution to structural 
 issues. It does not enable anything.

 It is kinda like saying a city does not need road signs because 
 there is a good map available. Or that a city that is a 
 labyrinth of one-way streets is easy to navigate with the 
 right kind of map. Driving while looking at a map is not a good 
 experience. And when things change, can you then trust the map?

 *shrug*

I don't know...I mean it's a start...

I feel like this forum has ADHD sometimes. A week ago it was all 
up in arms about ImportC, now it's focused on this, two weeks from 
now this will be forgotten and on to something else...

May 25 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 11:22:24 UTC, jmh530 wrote:
 I don't know...I mean it's a start...

Many functions have two-line documentation already, but it is 
kinda like a forest. You see lots of individual trees, but the 
shape of the forest is hard to grasp. More documentation on 
individual functions won't enable anything. Just like stapling 
"this is spruce", "this is birch" to individual trees does not 
help much.

 I feel like this forum has ADHD sometimes. A week ago it was 
 all up in arms about ImportC, now it's focused on this, two 
 weeks from now this will be forgotten and on to something 
 else...

You have to build consensus somehow.

Most of the new cool features other languages get are better done 
using a dedicated high level IR. There is currently no easy way 
to experiment with that for D (short of writing your own backend).

I think that is a road block.

May 25 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 11:40:42 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 25 May 2021 at 11:22:24 UTC, jmh530 wrote:
 I don't know...I mean it's a start...

 Many functions have two-line documentation already, but it is 
 kinda like a forest. You see lots of individual trees, but the 
 shape of the forest is hard to grasp. More documentation on 
 individual functions won't enable anything. Just like stapling 
 "this is spruce", "this is birch" to individual trees does not 
 help much.

Or let me explain it another way. Assume writing good useful 
documentation takes 10-20% of your coding time.

What should you formally document?

1. High level structure.
2. Stuff that is stable and well encapsulated and needs to be 
explained.
3. FAQs.
4. Functions that deviate from the norm (behave in surprising 
ways).

Stuff you want to replace, not so much I think. Given that those 
10-20% would be better spent refactoring.

May 25 2021

jmh530 <john.michael.hall gmail.com> writes:

On Tuesday, 25 May 2021 at 12:21:23 UTC, Ola Fosheim Grøstad 
wrote:
 [snip]

 Stuff you want to replace, not so much I think. Given that 
 those 10-20% would be better spent refactoring.

Ultimately Walter needs to think about how he best spends his 
time.

Refactoring won't happen overnight, and the more people who 
understand the compiler, the more can assist with that and other 
things in the meantime.

May 25 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 12:38:56 UTC, jmh530 wrote:
 Refactoring won't happen overnight, and the more people who 
 understand the compiler, the more can assist with that and other 
 things in the meantime.

The best way to refactor is to partition and encapsulate; then 
you can replace one item at a time. The only people who can do 
this are people who are willing to dig deep into the codebase.

But this thread is about experimentation. Experimentation on top 
of parts that are considered unstable is futile. You don't need 
to understand every single piece of the compiler to have fun 
extending it. And you should focus your efforts on the stable 
parts.

Parts that are considered unstable need to be encapsulated and 
provide interfaces so that people can build on those interfaces 
instead of making change hard by tying more stuff to the code 
that you want to replace.

May 25 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 12:47:53 UTC, Ola Fosheim Grøstad 
wrote:
 Parts that are considered unstable need to be encapsulated and 
 provide interfaces so that people can build on those interfaces 
 instead of making change hard by tying more stuff to the code 
 that you want to replace.

The key point here is designing new and better interfaces. You 
cannot refactor yourself into heaven with no redesign.

But it does not have to be disruptive. As an example, let's 
pretend we want a new AST and a new IR. Here is a non-disruptive 
sequence:

1. Write new AST and translation to old AST. Not disruptive.

2. Write translation from old AST to new IR, and encourage 
backends to transition. Not disruptive.

3. Transition to new IR, by making old AST private. Backends are 
ready. Not disruptive.

4. Move passes one by one to new IR. Not disruptive.

5. Write translation from new AST to new IR. Done.

You don't want to document your old interface, because you don't 
want people to depend on it. You want to document your new 
interface and encourage people to transition. Then you eventually 
can make the old interface private and can in peace replace the 
old parts.
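
The sequence could be sketched like this. All types here (`OldAstNode`, `NewAstNode`, `NewIrInstruction`) are made-up stand-ins, not dmd data structures; the point is only that each translation layer lets old and new code coexist:

```d
// Run with: dmd -unittest -main -run sketch.d
// Hypothetical stand-ins; none of these are dmd types.
struct OldAstNode { string kind; string text; }
struct NewIrInstruction { string op; string operand; }

struct NewAstNode
{
    enum Kind { identifier, integerLiteral }
    Kind kind;
    string text;
}

/// Step 1: new AST -> old AST, so existing consumers keep working.
OldAstNode toOldAst(NewAstNode node)
{
    final switch (node.kind)
    {
        case NewAstNode.Kind.identifier:
            return OldAstNode("Identifier", node.text);
        case NewAstNode.Kind.integerLiteral:
            return OldAstNode("IntegerExp", node.text);
    }
}

/// Step 2: old AST -> new IR, so backends can move to the new IR early.
NewIrInstruction toNewIr(OldAstNode node)
{
    return NewIrInstruction(node.kind == "IntegerExp" ? "const" : "load",
                            node.text);
}

/// Step 5: direct new AST -> new IR, which makes the old AST unnecessary.
NewIrInstruction toNewIrDirect(NewAstNode node)
{
    final switch (node.kind)
    {
        case NewAstNode.Kind.identifier:
            return NewIrInstruction("load", node.text);
        case NewAstNode.Kind.integerLiteral:
            return NewIrInstruction("const", node.text);
    }
}

unittest
{
    // Both routes must agree before the old route is retired.
    foreach (node; [NewAstNode(NewAstNode.Kind.identifier, "x"),
                    NewAstNode(NewAstNode.Kind.integerLiteral, "42")])
        assert(toNewIr(toOldAst(node)) == toNewIrDirect(node));
}
```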

May 25 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 13:11:17 UTC, Ola Fosheim Grøstad 
wrote:
 1. Write new AST and translation to old AST. Not disruptive.

 2. Write translation from old AST to new IR, and encourage 
 backends to transition. Not disruptive.

 3. Transition to new IR, by making old AST private. Backends 
 are ready. Not disruptive.

 4. Move passes one by one to new IR. Not disruptive.

 5. Write translation from new AST to new IR. Done.

If you are unsure if the new IR is stable you can rearrange the 
sequence like this instead:

new AST -> new IR -> old AST

Perhaps better as it gives more time for backends to transition.

May 25 2021

zjh <fqbqrr 163.com> writes:

+10086.

May 25 2021

zjh <fqbqrr 163.com> writes:

On Tuesday, 25 May 2021 at 13:28:02 UTC, zjh wrote:
 +10086.

Refactoring doesn't take much time, because the functionality 
has already been implemented. Refactoring has great benefits: 
a clear hierarchy, clear dependencies and clear interfaces.

May 25 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 13:33:25 UTC, zjh wrote:
 On Tuesday, 25 May 2021 at 13:28:02 UTC, zjh wrote:
 +10086.

 Refactoring doesn't take much time, because the functionality 
 has already been implemented. Refactoring has great benefits: 
 a clear hierarchy, clear dependencies and clear interfaces.

It takes time, but it is a necessary part of the life cycle, and 
does not have to be disruptive. It can happen bit by bit as long 
as you have a new design that is clean.

The nice thing is that one can easily detect regressions by 
comparing a dump from the old compiler with a dump from the new 
compiler. Do this for all D programs on github and you can feel 
confident that the new compiler has not introduced new errors.

So: you compare new IR translated to old AST from the new 
compiler with the old AST from the old compiler for a D program. 
If they are equal, then the new compiler passed the test.
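
A rough sketch of such a comparison, assuming both compilers can write a textual dump of the (old) AST for the same program. How the dumps are produced is left open here; `firstDifferingLine` is a hypothetical helper:

```d
// Run with: dmd -unittest -main -run sketch.d
import std.algorithm.comparison : min;
import std.string : splitLines;

/// Returns 0 when both dumps match line for line, otherwise the
/// 1-based number of the first line that differs.
size_t firstDifferingLine(string oldDump, string newDump)
{
    const oldLines = oldDump.splitLines;
    const newLines = newDump.splitLines;
    const common = min(oldLines.length, newLines.length);

    foreach (i; 0 .. common)
        if (oldLines[i] != newLines[i])
            return i + 1;

    // One dump is a prefix of the other: the first extra line differs.
    return oldLines.length == newLines.length ? 0 : common + 1;
}

unittest
{
    assert(firstDifferingLine("a\nb", "a\nb") == 0); // test passed
    assert(firstDifferingLine("a\nb", "a\nc") == 2); // regression on line 2
    assert(firstDifferingLine("a", "a\nb") == 2);    // new dump has extra output
}
```

Reporting the first differing line rather than a plain yes/no makes it easier to see which part of the translation introduced the regression.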

May 25 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/25/2021 4:22 AM, jmh530 wrote:
 I feel like this forum has ADHD sometimes. A week ago it was all up in arms 
 about ImportC, now it's focused on this, two weeks from now this will be 
 forgotten and on to something else...

Meanwhile, Iain and I are putting out PRs on ImportC.

May 27 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 09:53:19 UTC, Walter Bright wrote:
 On 5/25/2021 4:22 AM, jmh530 wrote:
 I feel like this forum has ADHD sometimes. A week ago it was 
 all up in arms about ImportC, now it's focused on this, two 
 weeks from now this will be forgotten and on to something 
 else...

 Meanwhile, Iain and I are putting out PRs on ImportC.

Yes, 2 people run ahead while 10 equally capable people throw 
their hands up in the air, then walk off and start writing their 
own compilers.

May 27 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 09:58:31 UTC, Ola Fosheim Grostad 
wrote:
 On Thursday, 27 May 2021 at 09:53:19 UTC, Walter Bright wrote:
 On 5/25/2021 4:22 AM, jmh530 wrote:
 I feel like this forum has ADHD sometimes. A week ago it was 
 all up in arms about ImportC, now it's focused on this, two 
 weeks from now this will be forgotten and on to something 
 else...

 Meanwhile, Iain and I are putting out PRs on ImportC.

 Yes, 2 people run ahead while 10 equally capable people throw 
 their hands up in the air, then walk off and start writing 
 their own compilers.

2 years later they were all steamrolled by C++ and Swift because 
they failed to organize and coordinate between themselves.

The sadness of Open Source is that, unlike businesses, projects 
don't see productivity losses. They can just ignore them and 
pretend they don't exist. Thus they fail to benefit from the 
synergies that are available to them.

May 27 2021

zjh <fqbqrr 163.com> writes:

On Thursday, 27 May 2021 at 11:27:00 UTC, Ola Fosheim Grostad 
wrote:
 On Thursday, 27 May 2021 at 09:58:31 UTC, Ola Fosheim Grostad

You're right.

We have few people, and if we don't organize ourselves, we cannot 
compete with other languages. They all have organized.

Wake up, Walter (repeat 3 times).

May 27 2021

jmh530 <john.michael.hall gmail.com> writes:

On Thursday, 27 May 2021 at 09:53:19 UTC, Walter Bright wrote:
 On 5/25/2021 4:22 AM, jmh530 wrote:
 I feel like this forum has ADHD sometimes. A week ago it was 
 all up in arms about ImportC, now it's focused on this, two 
 weeks from now this will be forgotten and on to something 
 else...

 Meanwhile, Iain and I are putting out PRs on ImportC.

Of course, forum =/= dmd

May 27 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Thursday, 27 May 2021 at 12:53:25 UTC, jmh530 wrote:
 On Thursday, 27 May 2021 at 09:53:19 UTC, Walter Bright wrote:
 On 5/25/2021 4:22 AM, jmh530 wrote:
 I feel like this forum has ADHD sometimes. A week ago it was 
 all up in arms about ImportC, now it's focused on this, two 
 weeks from now this will be forgotten and on to something 
 else...

 Meanwhile, Iain and I are putting out PRs on ImportC.

 Of course, forum =/= dmd

The forums are drying up when it comes to people who are 
interested in compilers.

You better do something to recruit the ones that are still there 
unless you want to struggle with DIP1000 all by yourselves 
forever.

May 27 2021

sighoya <sighoya gmail.com> writes:

On Tuesday, 25 May 2021 at 09:05:26 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
 You can't encode the full semantics into a function name and 
 parameter names without bloating those names.

 We can assume that the reader has read a book on compiler 
 design and is familiar with the terminology and the most common 
 algorithms. Provide a reference to wikipedia if unsure if the 
 reader is with you...

For very general things, yes, this is possible, but there are 
structures and algorithms out there that don't resemble what 
you've learned, or for which nobody has invented a simple name.
Everyone has a different intuition about how to solve a problem, 
and the result can be pretty hard to follow without comments when 
all you read is index operations, shifts and type names that are 
as specific as the cosmos.

 Functions that are only called from a few places can have long 
 descriptive names, that is not a negative.

Trade off, but I appreciate this in tests for instance.



 Yes, obviously. But adding 6 lines of comments for every 
 trivial function is not helpful. It is a useless policy. It is 
 a policy for the sake of having a policy.

Yes, I agree with this, and six lines is usually too much. Look 
at the example I linked before: there it was mostly a one-line 
comment.


 If time is invested in documenting things that should be 
 changed... then change becomes less likely: "look, the 
 documentation is over there, change not needed".

Okay, that may be true, but it also makes it easier to dive in 
and have fun changing things.

 Anyway, documentation is the wrong solution to structural 
 issues. It does not enable anything.

Agree.

 It is kinda like saying a city does not need road signs because 
 there is a good map available. Or that a city that is a 
 labyrinth of one-way streets is easy to navigate with the 
 right kind of map. Driving while looking at a map is not a good 
 experience. And when things change, can you then trust the map?

 *shrug*

I think the metaphor speaks against you as the map is the wiki 
article you mentioned :)

May 25 2021

Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 25 May 2021 at 19:08:18 UTC, sighoya wrote:
 I think the metaphor speaks against you as the map is the wiki 
 article you mentioned :)

Nah, because that would be in the documentation, so you are 
already looking at the map.

But I think for a compiler you should assume basic terminology to 
be known. If people have an interest in this, they will pick up 
a book on compiler design and implementation.

May 25 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
 You can't encode the full semantics into a function name and 
 parameter names without bloating those names.

In this case, it might be good to have a documentation comment, 
otherwise behavior should be known from the function name and 
args.

 However, small comments inside the function would also be 
 beneficial.

Having such comments inside a function body means you've failed 
to make the code easy to read and understand. Instead of such 
inline comments, consider extracting that piece into a function 
with the right name. Adding such comments should be the last 
option in your decision on what to do with that piece of code. 
Note that the next dev who changes that piece of code will most 
probably forget to update the comment, meaning that it will tell 
a lie instead of the truth.
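
As a small illustration of that refactoring (the functions are invented for the example, not taken from dmd):

```d
// Run with: dmd -unittest -main -run sketch.d
import std.algorithm.iteration : filter;
import std.array : array;

// Before: the intent lives only in a comment that can drift from the code.
int[] process(int[] values)
{
    int[] result;
    // keep only the even values
    foreach (v; values)
        if (v % 2 == 0)
            result ~= v;
    return result;
}

// After: the names carry the intent, so the comment is no longer needed.
bool isEven(int value)
{
    return value % 2 == 0;
}

int[] keepEvenValues(int[] values)
{
    return values.filter!isEven.array;
}

unittest
{
    assert(process([1, 2, 3, 4]) == [2, 4]);
    assert(keepEvenValues([1, 2, 3, 4]) == [2, 4]);
}
```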

May 25 2021

sighoya <sighoya gmail.com> writes:

On Tuesday, 25 May 2021 at 16:00:32 UTC, Alexandru Ermicioi wrote:
 On Tuesday, 25 May 2021 at 08:32:46 UTC, sighoya wrote:
 You can't encode the full semantics into a function name and 
 parameter names without bloating those names.

 In this case, it might be good to have a documentation comment, 
 otherwise behavior should be known from the function name and 
 args.

Agree.


 Having such comments inside a function body means you've failed 
 to make the code easy to read and understand. Instead of such 
 inline comments, consider extracting that piece into a function 
 with the right name.

It's a trade-off. Over-modularization can also be an anti-pattern, 
as it significantly reduces locality.
The other point is how to deal with dynamic context, which may be 
solved with templates, what a hack.
Anyhow, you don't always code at a very high level; sometimes it 
is a bit more low-level or indirect, and then it's good to have 
some thread to follow.

 Adding such comments should be the last option in your decision 
 on what to do with that piece of code.

Naming is more important, definitely. But succinct comments for 
small sections aren't that bad and are sometimes better than 
modularizing with a function:

```D
void firstAddToThenUpdateStructureThenFinalize...
```

Giving this function a shorter and more non-functional name would 
be ok, but such a name is sometimes too general to understand in 
your context.
Splitting the function into smaller parts may allow shorter names 
for these operations, but the point is the context: it's not 
always clear even with semantically correct naming, which is 
mostly not possible without being too general.

It's like commit messages, I like to commit first with the 
technical detail:

```
Update ClassA:
```

Then in the next lines I add some points describing newly added 
semantics, which are too much to compress into a single line.

If you could add context some other way, this would be pretty good; 
for instance, Swift parameter labels are a first step in the 
right direction:

send(message:"Hello World",from:"Earth",to:"Mars")


 Note that the next dev who changes that piece of code will most 
 probably forget to update the comment, meaning that it will tell 
 a lie instead of the truth.

Yes, but I can argue the same for modularization: if someone 
changes the body without renaming the function, it fails the 
same way.

May 25 2021

Basile B. <b2.temp gmx.com> writes:

On Tuesday, 25 May 2021 at 00:03:56 UTC, Walter Bright wrote:
 On 5/24/2021 2:01 PM, Max Haughton wrote:
 Where do you start?

 At the first function you notice that has poor/missing/wrong 
 documentation.

 Like this one I just did:

 https://github.com/dlang/dmd/pull/12570

Related, all the BUG:, TODO:, etc. comments should be moved to 
bugzilla.
For example 
[here](https://github.com/dlang/dmd/blob/7fafcd213ac82c58e7b8fb8143c837c7595c4e8f/src/dmd/expressionsem.d#L10140)

```d
/* BUG: Should handle things like:
  *      char c;
  *      c ~ ' '
  *      ' ' ~ c;
  */
```

This is a request to have `char() ~ char()` produce `char[]`.
It has no place in the code.

May 25 2021

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Tuesday, 25 May 2021 at 00:03:56 UTC, Walter Bright wrote:
 On 5/24/2021 2:01 PM, Max Haughton wrote:
 Where do you start?

 At the first function you notice that has poor/missing/wrong 
 documentation.

 Like this one I just did:

 https://github.com/dlang/dmd/pull/12570

Another place you can make a big impact with zero change in 
language behavior is this:

```
Error: none of the overloads of size are callable using a const 
object, candidates are:
        dmd.mtype.Type.size()
        dmd.mtype.Type.size(ref const(Loc) loc)
Error: mutable method dmd.mtype.TypeVector.elementType is not 
callable using a const object
        Consider adding const or inout here
Error: mutable method dmd.mtype.TypeVector.isscalar is not 
callable using a const object
        Consider adding const or inout here
Error: mutable method dmd.mtype.TypeVector.isintegral is not 
callable using a const object
        Consider adding const or inout here
Error: mutable method dmd.mtype.TypeVector.isfloating is not 
callable using a const object
        Consider adding const or inout here
```

Which is but a small portion of the monster error that occurs 
when you use `const` instead of `auto` for any AST `Type` or 
`Dsymbol`.
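
The underlying rule can be shown on a toy class. `TypeSketch` is a hypothetical stand-in, not dmd's `Type`; it only illustrates why query methods that lack `const` produce errors like the ones above:

```d
// Run with: dmd -unittest -main -run sketch.d
// Hypothetical miniature of an AST type; not dmd's actual `Type` class.
class TypeSketch
{
    private uint sizeInBits;

    this(uint sizeInBits)
    {
        this.sizeInBits = sizeInBits;
    }

    // Mutable method: calling it through a const reference is rejected,
    // even though it does not modify anything.
    bool isIntegralMutable()
    {
        return sizeInBits != 0;
    }

    // `const` promises not to modify the object, so this is callable
    // through mutable, const and immutable references alike.
    bool isIntegral() const
    {
        return sizeInBits != 0;
    }
}

unittest
{
    const t = new TypeSketch(32);
    assert(t.isIntegral());
    // The mutable variant does not compile on a const object.
    static assert(!__traits(compiles, t.isIntegralMutable()));
    static assert(__traits(compiles, t.isIntegral()));
}
```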

May 25 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/24/2021 6:53 AM, 12345swordy wrote:
 I seriously question the "Optimized for people who spend thousands of hours 
 working on it" line, as I had a very intelligent person post on Slack asking 
 what a function does, as there are no comments for said function.

"Use correct Ddoc function comment blocks."

https://github.com/dlang/dmd/blob/master/CONTRIBUTING.md

It's up to contributors to read and follow the guidelines, and up to those with 
pull privileges to require conformance.

It's also up to you and me and all of us to go and fix documentation problems we run 
across, like this:

https://github.com/dlang/dmd/pull/12570
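
For illustration, a function with a Ddoc comment block might look like the following sketch. The function itself is hypothetical; `Params:` and `Returns:` are standard Ddoc sections:

```d
// Run with: dmd -unittest -main -run sketch.d
/**
 * Counts the leading space characters of a source line.
 *
 * Params:
 *      line = the line to inspect; may be empty
 * Returns:
 *      the number of consecutive `' '` characters at the start of `line`
 */
size_t countLeadingSpaces(const(char)[] line) pure nothrow @nogc @safe
{
    size_t count;
    while (count < line.length && line[count] == ' ')
        ++count;
    return count;
}

unittest
{
    assert(countLeadingSpaces("  x") == 2);
    assert(countLeadingSpaces("x  ") == 0);
    assert(countLeadingSpaces("") == 0);
}
```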

May 24 2021
