digitalmars.D - [SAoC] "Improving DMD as a Library" project thread

Mihaela Chirea <chireamihaela99 gmail.com> writes:

Hello!

My name is Mihaela Chirea and I am a 4th year Computer 
Engineering student at Politehnica University of Bucharest.

My interest in programming languages led me to attend a D 
workshop at Ideas and Projects Workshop in 2019 and D Summer 
School this year, both held by Eduard Staniloiu and Razvan Nitu. 
Topics like meta-programming and design by introspection made me 
curious about how these concepts were implemented, thus 
increasing my interest in compilers.

For this year's edition of SAoC I will be working on improving 
dmd as a library, mainly by cleaning up the AST nodes by moving 
the semantic elements in more suitable places and creating new 
visitors when needed.
After studying the current state of dmd and identifying the parts 
I will be working on, I have decided on following this plan:

- Getting used to the structure of the compiler by working on the 
nodes that don't contain that much semantic information:

Milestone 1:
     - aliasthis.d
     - attrib.d
     - statement.d
     - aggregate.d
     - cond.d
     - staticcond.d
     - nspace.d

- Work on the files where semantic elements either appear often, 
or the functions in which they appear are used in many other 
places and therefore more files would need changes

Milestone 2:
     - mtype.d
     - dstruct.d
     - dclass.d
     - denum.d
     - dimport.d

Milestone 3:
     - dsymbol.d
     - expression.d
     - dmodule.d

Milestone 4:
     - declaration.d
     - func.d
     - dtemplate.d

However, small changes to this plan may be necessary since other 
changes to the compiler may raise unexpected issues for this 
project.

As far as time allows, and even after the end of this event, 
I would also like to work on creating a nice compiler interface, 
which would become much easier after this refactoring step.
I will be posting weekly updates regarding my progress on this 
project.

Thanks!
Mihaela

Sep 14 2020

Mathias LANG <geod24 gmail.com> writes:

On Monday, 14 September 2020 at 12:47:42 UTC, Mihaela Chirea 
wrote:
 Hello!

 My name is Mihaela Chirea and I am a 4th year Computer 
 Engineering student at Politehnica University of Bucharest.

 [...]

 Thanks!
 Mihaela

Good luck! It's a much needed improvement. I see you already 
joined the dlang slack, if you have any questions, #dmd is the 
place to go.

Sep 15 2020

Mihaela Chirea <chireamihaela99 gmail.com> writes:

Hello!

During the first week of working on this project I received 
multiple suggestions regarding other possible tasks that could 
better benefit the community. I started working on them from the 
second week but never clearly changed the milestones.

So, based mostly on Jacob Carlborg's suggestions[1], here are the 
new plans:

Milestone 2:
- Add the start location to the AST nodes that lack this 
information
- Bring all the dmd as a library features already existing in the 
compiler under DMDLIB
- Add the token size
- Add the end location to all nodes

Some of the issues I would like to tackle during the next 
milestones are:
- Add the possibility of analyzing source code that is only in 
memory
- Reduce the global state
- Don't generate TypeInfo when not needed (as suggested here[2])

So far, I didn't get the chance to study these last topics in 
detail and I would appreciate any advice or opinions on how to 
start working on these tasks.

[1] https://github.com/dlang/dmd/pull/11788#issuecomment-698186023
[2] https://forum.dlang.org/post/iopxhnudlrgiqwjxzihe@forum.dlang.org

Oct 28 2020

Jacob Carlborg <doob me.com> writes:

On Wednesday, 28 October 2020 at 19:08:01 UTC, Mihaela Chirea 
wrote:

 - Add the possibility of analyzing source code that is only in 
 memory

I've started on this [1] (very rough work in progress), if you 
need any pointers.

[1] https://github.com/jacob-carlborg/ddc/commit/cee56ce3750701d593dd619b27d28f18e4929e72

--
/Jacob Carlborg

Oct 29 2020

RazvanN <razvan.nitu1305 gmail.com> writes:

On Thursday, 29 October 2020 at 08:50:46 UTC, Jacob Carlborg 
wrote:
 On Wednesday, 28 October 2020 at 19:08:01 UTC, Mihaela Chirea 
 wrote:

 - Add the possibility of analyzing source code that is only in 
 memory

 I've started on this [1] (very rough work in progress), if 
 you need any pointers.

 [1] https://github.com/jacob-carlborg/ddc/commit/cee56ce3750701d593dd619b27d28f18e4929e72

 --
 /Jacob Carlborg

So right now, when the compiler is given a .d/.di file, it opens 
it, reads the contents and immediately lexes+parses the string, 
after which the string is discarded. If the contents of the file 
need to be changed or reanalyzed, then the whole process needs to 
be started from scratch. What you are proposing, Jacob, is that 
the contents of the file are stored somewhere for ease of reuse. 
Is that right?

Cheers,
RazvanN

Oct 29 2020

Jacob Carlborg <doob me.com> writes:

On Friday, 30 October 2020 at 06:03:40 UTC, RazvanN wrote:

 So right now, when the compiler is given a .d/.di file, it opens 
 it, reads the contents and immediately lexes+parses the string, 
 after which the string is discarded. If the contents of the 
 file need to be changed or reanalyzed, then the whole process 
 needs to be started from scratch. What you are proposing, Jacob, 
 is that the contents of the file are stored somewhere for ease 
 of reuse. Is that right?

Kind of, or at least that's one of the reasons. The main idea is 
to separate the reading of a file from lexing and parsing it. We 
introduce a file manager (like a cache). The compiler will first 
check the file manager to see if the file content is available, 
and otherwise read it from disk. The important part here is that 
it needs to be possible to pre-populate (and also update) the 
file manager with a file and its content. This would allow a 
full compilation from memory, without touching the disk.
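
As a rough illustration of the idea above, here is a minimal sketch in D. It is not the code in dmd or in the linked ddc commit; the names FileManager, add and lookup are hypothetical and chosen only for this example. It shows the three properties described: content can be pre-populated, it can be updated, and the disk is read only when nothing is stored for a path.

```d
import std.file : readText;

/// Hypothetical in-memory file manager; an illustration only, not dmd's API.
struct FileManager
{
    private string[string] contents; // path -> file content

    /// Pre-populate or update the content stored for `path`.
    void add(string path, string content)
    {
        contents[path] = content;
    }

    /// Return the stored content if present; otherwise read the file from
    /// disk and remember it, so later lookups see the exact same text.
    string lookup(string path)
    {
        if (auto cached = path in contents)
            return *cached;

        auto text = readText(path); // throws FileException if the file is missing
        contents[path] = text;
        return text;
    }
}

void main()
{
    FileManager fm;

    // An editor buffer or a test string that never touched the disk.
    fm.add("app.d", "void main() {}");
    assert(fm.lookup("app.d") == "void main() {}");

    // Updating replaces the stored content for the same path.
    fm.add("app.d", "int x;");
    assert(fm.lookup("app.d") == "int x;");
}
```

Because lookup stores whatever it reads from disk, a later re-lex of the same path would get the same text even if the file on disk has changed in the meantime. A real implementation would probably also need to handle path normalization and binary content.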

The main reason for this is to be able to have the compiler 
receive file content data from other sources than disk. Two use 
cases for that would be:

* An LSP server (or similar tool) receiving the data from the 
network from an editor with unsaved files

* The data is already in memory, think a string literal. This is 
useful when writing tests

The other idea is, as you mentioned, to read from memory if the 
file has already been read from disk when reanalyzing. For 
example, if you want to get the tokens of an AST node, as the 
compiler looks now, you probably need to re-lex the file to get 
the tokens. But you don't want to re-read the file from disk, 
because it might have been updated. For this use case, it's 
really important that the compiler reads the exact same file 
content as it did when it originally created the AST.

Note, there's already a file cache [1], but that will not fit. 
It's not possible to pre-populate or update it. It also splits up 
the file into lines. The existing file cache [1] could perhaps 
take advantage of the new file manager.

Keep in mind that this new file manager needs to be used, not 
only when reading D files, but also when reading files through 
import expressions.

[1] https://github.com/dlang/dmd/blob/master/src/dmd/filecache.d

--
/Jacob Carlborg

Oct 30 2020
