digitalmars.D - AST files instead of DI interface files for faster compilation and
- timotheecour (67/67) Jun 12 2012 There's a current pull request to improve di file generation
- Tobias Pankrath (2/2) Jun 12 2012 Currently .di-files are compiler independent. If this should hold
- Alex Rønne Petersen (7/9) Jun 12 2012 Which is a Good Thing (TM). It would /require/ formalization of the
- Timon Gehr (2/8) Jun 12 2012 I do not see how this conclusion could be reached.
- deadalnix (7/9) Jun 12 2012 We need it anyway at some point. AST macro is another example.
- Timon Gehr (10/21) Jun 12 2012 AST macros may refer to AST structures by their representations as D cod...
- Don Clugston (14/24) Jun 12 2012 Is that actually true? My recollection is that the original motivation
- foobar (17/47) Jun 12 2012 I absolutely agree with the above and would also add that goal
- Dmitry Olshansky (13/60) Jun 12 2012 Absolutely. DDoc being built-in didn't sound right to me at first, BUT
- Adam Wilson (28/90) Jun 12 2012 I completely agree with this. The interactions between the D module syst...
- Dmitry Olshansky (7/91) Jun 12 2012 I/O is not. (De)Compression on the fly is a more and more interesting
- Paulo Pinto (14/79) Jun 13 2012 Back in the 90's I only moved 100% away from Turbo Pascal into C
- Jacob Carlborg (4/9) Jun 12 2012 Can't the same be done with OMF? I'm not saying I want to keep OMF.
- Adam Wilson (10/17) Jun 12 2012 OMF doesn't support Custom Sections and I think a custom section is the ...
- deadalnix (3/7) Jun 12 2012 LLVM is definitely something I look at more and more. It is a great
- Walter Bright (4/12) Jun 12 2012 (4) was not a goal.
- Don Clugston (8/25) Jun 13 2012 I don't understand (1) actually.
- Iain Buclaw (21/48) Jun 13 2012 Lexing and Parsing are miniscule tasks in comparison to the three
- Dmitry Olshansky (5/53) Jun 13 2012 Is time spent on I/O accounted for in the parse step? And where is the
- Iain Buclaw (12/80) Jun 13 2012 It would be, the counter starts before the files are even touched,
- Dmitry Olshansky (6/81) Jun 13 2012 Ok, then parsing is indistinguishable from I/O and together are only
- deadalnix (4/53) Jun 13 2012 Nice numbers! They also show that the slowest part is the backend.
- Kagamin (10/10) Jun 13 2012 The measurements should be done for modules being imported, not
- Kagamin (2/12) Jun 13 2012 Oh and let it import .d files, not .di
- Iain Buclaw (8/22) Jun 13 2012 std.datetime is one reason for me to run it again. I can imagine that
- Kagamin (3/10) Jun 13 2012 Probably. Also test with -fsyntax-only, if it works, and whether it
- Jacob Carlborg (7/11) Jun 13 2012 You should try the Objective-C/D bridge, that took quite a while to
- Iain Buclaw (25/46) Jun 16 2012 Rebuilt a compile log with latest gdc as of writing on the 2.059
- deadalnix (2/47) Jun 19 2012 Thank you very much for your work.
- Iain Buclaw (30/75) Jun 16 2012 tl;dr
- Guillaume Chatelet (5/7) Jun 16 2012 So maybe my post about "keeping import clean" wasn't as irrelevant as I
- Iain Buclaw (11/18) Jun 19 2012 I think its relevance is only geared towards projects that are
- Walter Bright (4/31) Jun 13 2012 Yes, it is designed so you could just import a symbol table. It is done ...
- Don Clugston (6/42) Jun 14 2012 Iain's data indicates that it's only a few % of the time taken on
- Jonathan M Davis (7/44) Jun 14 2012 If this is the case, is there any value at all to using .di files in dru...
- Kagamin (4/8) Jun 14 2012 Oh, right, the module can use mixins and CTFE, so it should be
- Don Clugston (15/59) Jun 14 2012 I don't think Phobos should use .di files at all. I don't think there
- Jonathan M Davis (9/27) Jun 15 2012 On several occasions, Walter has expressed the desire to make Phobos use...
- Walter Bright (7/11) Jun 16 2012 The language is carefully designed, so that at least in theory all the p...
- deadalnix (3/17) Jun 19 2012 The key point is project size here. I wouldn't expect file size to
- Walter Bright (3/9) Jun 16 2012 I don't think they're nasty or are side effects.
- Don Clugston (7/18) Jun 18 2012 But you argued in your blog that C++ parsing is inherently slow, and
- Walter Bright (4/16) Jun 18 2012 Yeah, but I can't escape that lingering feeling that lexing is slow.
- Daniel (4/24) Jun 18 2012 Same here, I wish there were a standardized pre-lexed-token
- Chris Cain (9/12) Jun 18 2012 If I were to make my own language, I'd forego a human-readable
- Timon Gehr (6/17) Jun 18 2012 This could be done even if the language's source code storage format is
- Kagamin (2/4) Jun 19 2012 Yep, pegged runs at compile time.
- Kagamin (5/9) Jun 19 2012 I don't even understand all this rage about asynchronicity, if
- dennis luehring (8/18) Jun 19 2012 the lexing and parsing process can be asynchronous - it will be faster on
- dennis luehring (4/21) Jun 19 2012 so you start your lexing and parsing in separate threads for each file -
- deadalnix (2/20) Jun 19 2012 It is kind of religious. We need data.
- Steven Schveighoffer (8/26) Jun 25 2012 I have found that my project, which has a huge number of symbols (And
- Martin Nowak (23/41) Jun 25 2012 Lexing is definitely taking a big part of debug compilation time.
There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals:

1) speed up compilation by avoiding having to reparse large files over and over
2) hide implementation details for proprietary reasons
3) still maintain source code in some form to allow inlining and CTFE
4) be human readable

- Goals 2) and 3) are clearly contradictory, so that calls for a command line switch (eg -hidesource), off by default, which when set will indeed remove any implementation details (where possible, ie for non-template and non-auto-return functions) but, as a counterpart, also prevent any chance of inlining/CTFE for the corresponding exported API. That choice is left to the user.

- Regarding point 1), it won't be untypical for a D interface file to be almost as large (and as slow to parse) as the original source file, even with the upcoming di file improvements (dmd/pull/945), since D encourages the use of templates and auto-return throughout (a large part of phobos would be left quasi-unchanged). In fact, the fast compile time of D _does_ suffer under heavy use of templates, or when scaling up. So to make interface files really useful in terms of speeding up compilation, why not directly store the AST (it could be text-based like JSON, but preferably a portable binary format for speed; call it a ".dib" file), possibly with some amount of analysis already done (eg version(windows) could be pre-handled). This would be analogous to precompiled header files (http://en.wikipedia.org/wiki/Precompiled_header), which don't exist in D AFAIK. It could be done by extending the currently incomplete json file generation in dmd to include the AST of the implementation of each function we want to export (such as templates or stuff to inline). During compilation of a module, "import myfun;" would look for 1) myfun.dib (binary or json precompiled interface file), 2) myfun.di (if still needed), 3) myfun.d.

We could even go a step further, borrowing some ideas from the "framework" feature found in OSX to distribute components: a single D framework would combine the AST (~ precompiled .dib headers) of a set of D modules with a set of libraries. The user would then use a framework as follows: dmd -L-framework mylib -L-Lpath/to/mylib main.d, or simply dmd main.d if main.d contains pragma(framework,"mylib") and framework mylib is in the search path. As in OSX's frameworks, framework mylib is used both during compilation (resolving import statements in main.d) and during linking. Upon encountering an "import myfun;" declaration, the compiler would search the linked-in frameworks for a symbol or file representing the corresponding AST of module myfun and, if not found, use the default import mechanism. That would both speed up compilation times and make distribution of libraries and versioning a breeze: a single framework to download and to link against (this is different from what rdmd does). On OSX, frameworks appear as a single file in Finder but are actually directories; here we could have either a single file or a directory as well.

Finally, regarding point 4), a simple command line switch (eg dmd --pretty-print myfun.dib) would pretty-print the AST to stdout, omitting the implementation of templates and auto functions for brevity so that it reads like a plain di file (other options could filter out AST nodes for IDE use, etc).

Thanks for your comments!
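To make the tension between goals 2) and 3) concrete, here is a minimal sketch (module and function names invented for illustration) of what interface extraction along the lines of dmd -H can and cannot hide: a plain function body can be reduced to a bare declaration, but a template body has to stay behind, or importers could no longer instantiate it.
---
// mylib.d -- the implementation
module mylib;

int secret(int x) { return 41 * x + 1; } // body can be stripped from the .di
T twice(T)(T x) { return x + x; }        // template: body must stay
---
The corresponding interface file can therefore only hide the first of the two:
---
// mylib.di -- the best that interface generation can do
module mylib;

int secret(int x);                // hidden, but no inlining/CTFE for callers
T twice(T)(T x) { return x + x; } // full source retained, nothing hidden
---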
Jun 12 2012
Currently .di-files are compiler independent. If this should hold for dib-files, too, we'll need a standard ast structure, won't we?
Jun 12 2012
On 12-06-2012 12:23, Tobias Pankrath wrote:Currently .di-files are compiler independent. If this should hold for dib-files, too, we'll need a standard ast structure, won't we?Which is a Good Thing (TM). It would /require/ formalization of the language once and for all. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Jun 12 2012
On 06/12/2012 12:47 PM, Alex Rønne Petersen wrote:On 12-06-2012 12:23, Tobias Pankrath wrote:I do not see how this conclusion could be reached.Currently .di-files are compiler independent. If this should hold for dib-files, too, we'll need a standard ast structure, won't we?Which is a Good Thing (TM). It would /require/ formalization of the language once and for all.
Jun 12 2012
On 12/06/2012 12:23, Tobias Pankrath wrote:Currently .di-files are compiler independent. If this should hold for dib-files, too, we'll need a standard ast structure, won't we?We need it anyway at some point. AST macro is another example. It would also greatly simplify compiler writing if the D interpreter could be provided as a lib (and so run on top of a dib file). I want to mention that LLVM IR + metadata can do a really good job here. In addition, LLVM people are working on a JIT backend, if you know what I mean ;)
Jun 12 2012
On 06/12/2012 03:54 PM, deadalnix wrote:Le 12/06/2012 12:23, Tobias Pankrath a écrit :Plain D code is already a perfectly fine standard AST structure.Currently .di-files are compiler independent. If this should hold for dib-files, too, we'll need a standard ast structure, won't we?We need it anyway at some point.AST macro is another example.AST macros may refer to AST structures by their representations as D code.It would also greatly simplify compiler writing if the D interpreter could be provided as lib (and so run on top of dib file).I don't think so. Writing the interpreter is a rather straightforward part of the compiler implementation. Why would you want to run it on top of a '.dib' file anyway? Serializing/deserializing the AST is too much overhead.I want to mention that LLVM IR + metadata can do a really good job here. In addition, LLVM people are working on a JIT backend, if you know what I mean ;)Interpreting manually is not harder than CTFE-compatible LLVM IR code generation, but the LLVM JIT could certainly be leveraged to improve compilation speeds.
Jun 12 2012
On 12/06/12 11:07, timotheecour wrote:There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readableIs that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005). Here's the original post where it was implemented: http://www.digitalmars.com/d/archives/digitalmars/D/29883.html and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD0.142 Personally I believe that.di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D. IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending its in any way related to goal (2).
Jun 12 2012
On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:On 12/06/12 11:07, timotheecour wrote:I absolutely agree with the above and would also add that goal (4) is an anti-feature. In order to get a human readable version of the API the programmer should use *documentation*. D claims that one of its goals is to make it a breeze to provide documentation by bundling a standard tool - DDoc. There's no need to duplicate this just to provide another format when DDoc itself supposed to be format agnostic. This is a solved problem since the 80's (E.g. Pascal units). Per Adam's post, the issue is tied to DMD's use of OMF/optlink which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves? Another related question - AFAIK the LLVM folks did/are doing work to make their implementation less platform-depended. Could we leverage this in ldc to store LLVM bit code as D libs which still retain enough info for the compiler to replace header files?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inliningand CTFE4) be human readableIs that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005). Here's the original post where it was implemented: http://www.digitalmars.com/d/archives/digitalmars/D/29883.html and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD0.142 Personally I believe that.di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D. IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending its in any way related to goal (2).
Jun 12 2012
On 12.06.2012 16:09, foobar wrote:On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:Absolutely. DDoc being built-in didn't sound right to me at first, BUT it allows us to essentially being able to say that APIs are covered in the DDoc generated files. Not header files etc.On 12/06/12 11:07, timotheecour wrote:I absolutely agree with the above and would also add that goal (4) is an anti-feature. In order to get a human readable version of the API the programmer should use *documentation*. D claims that one of its goals is to make it a breeze to provide documentation by bundling a standard tool - DDoc. There's no need to duplicate this just to provide another format when DDoc itself supposed to be format agnostic.There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inliningand CTFE4) be human readableIs that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005). Here's the original post where it was implemented: http://www.digitalmars.com/d/archives/digitalmars/D/29883.html and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD0.142 Personally I believe that.di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D. IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending its in any way related to goal (2).This is a solved problem since the 80's (E.g. Pascal units).Right, seeing yet another newbie hit it everyday is a clear indication of a simple fact: people would like to think & work in modules rather then seeing guts of old and crappy OBJ file technology. Linking with C != using C tools everywhere.Per Adam's post, the issue is tied to DMD's use of OMF/optlink which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves?Seconded. At least lexed form could be very compact, I recall early compressors tried doing the Huffman thing on source code tokens with a certain success.Another related question - AFAIK the LLVM folks did/are doing work to make their implementation less platform-depended. Could we leverage this in ldc to store LLVM bit code as D libs which still retain enough info for the compiler to replace header files?-- Dmitry Olshansky
Jun 12 2012
On Tue, 12 Jun 2012 05:23:16 -0700, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:On 12.06.2012 16:09, foobar wrote:On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:On 12/06/12 11:07, timotheecour wrote:There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readableIs that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005). Here's the original post where it was implemented: http://www.digitalmars.com/d/archives/digitalmars/D/29883.html and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD0.142 Personally I believe that.di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D. IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending its in any way related to goal (2).I absolutely agree with the above and would also add that goal (4) is an anti-feature. In order to get a human readable version of the API the programmer should use *documentation*. D claims that one of its goals is to make it a breeze to provide documentation by bundling a standard tool - DDoc. There's no need to duplicate this just to provide another format when DDoc itself supposed to be format agnostic.Absolutely. DDoc being built-in didn't sound right to me at first, BUT it allows us to essentially being able to say that APIs are covered in the DDoc generated files. Not header files etc.This is a solved problem since the 80's (E.g. Pascal units).Right, seeing yet another newbie hit it everyday is a clear indication of a simple fact: people would like to think & work in modules rather then seeing guts of old and crappy OBJ file technology. Linking with C != using C tools everywhere.I completely agree with this. The interactions between the D module system and D toolchain are utterly confusing to newcomers, especially those from other C-like languages. There are better ways, see .NET Assemblies and Pascal Units. These problems were solved decades ago. Why are we still using 40-year-old paradigms?Per Adam's post, the issue is tied to DMD's use of OMF/optlink which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves?Seconded. At least lexed form could be very compact, I recall early compressors tried doing the Huffman thing on source code tokens with a certain success.I don't see the value of compression. Lexing would already reduce the size significantly and compression would only add to processing times. Disk is cheap. Beyond that though, this is absolutely the direction D must head in. In my mind the DI generation patch was mostly just a stop-gap to bring DI-gen up-to-date with the current system thereby giving us enough time to tackle the (admittedly huge) task of building COFF into the backend, emitting the lexed source into a special section and then giving the compiler *AND* linker the ability to read out the source. For example, giving the linker the ability to read out source code essentially requires a brand-new linker. Although, it is my personal opinion that the linker should be integrated with the compiler and done as one step; this way the linker could have intimate knowledge of the source and would enable some spectacular LTO options. If only DMD were written in D, then we could really open the compile speed throttles with an MT build model...Another related question - AFAIK the LLVM folks did/are doing work to make their implementation less platform-depended. Could we leverage this in ldc to store LLVM bit code as D libs which still retain enough info for the compiler to replace header files?-- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
Jun 12 2012
On 12.06.2012 22:47, Adam Wilson wrote:On Tue, 12 Jun 2012 05:23:16 -0700, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:On 12.06.2012 16:09, foobar wrote:On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:On 12/06/12 11:07, timotheecour wrote:There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readableIs that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005). Here's the original post where it was implemented: http://www.digitalmars.com/d/archives/digitalmars/D/29883.html and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD0.142 Personally I believe that.di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D. IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending its in any way related to goal (2).I absolutely agree with the above and would also add that goal (4) is an anti-feature. In order to get a human readable version of the API the programmer should use *documentation*. D claims that one of its goals is to make it a breeze to provide documentation by bundling a standard tool - DDoc. There's no need to duplicate this just to provide another format when DDoc itself supposed to be format agnostic.Absolutely. DDoc being built-in didn't sound right to me at first, BUT it allows us to essentially being able to say that APIs are covered in the DDoc generated files. Not header files etc.This is a solved problem since the 80's (E.g. Pascal units).Right, seeing yet another newbie hit it everyday is a clear indication of a simple fact: people would like to think & work in modules rather then seeing guts of old and crappy OBJ file technology. Linking with C != using C tools everywhere.I completely agree with this. The interactions between the D module system and D toolchain are utterly confusing to newcomers, especially those from other C-like languages. There are better ways, see .NET Assemblies and Pascal Units. These problems were solved decades ago. Why are we still using 40-year-old paradigms?Per Adam's post, the issue is tied to DMD's use of OMF/optlink which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves?Seconded. At least lexed form could be very compact, I recall early compressors tried doing the Huffman thing on source code tokens with a certain success.I don't see the value of compression. Lexing would already reduce the size significantly and compression would only add to processing times. Disk is cheap.I/O is not. (De)Compression on the fly is a more and more interesting direction these days. The less you read/write the faster you get. Knowing beforehand the relative frequency distribution of keywords is a boon. Yet I agree that it's premature at the moment.Beyond that though, this is absolutely the direction D must head in. In my mind the DI generation patch was mostly just a stop-gap to bring DI-gen up-to-date with the current system thereby giving us enough time to tackle the (admittedly huge) task of building COFF into the backend, emitting the lexed source into a special section and then giving the compiler *AND* linker the ability to read out the source. For example, giving the linker the ability to read out source code essentially requires a brand-new linker. Although, it is my personal opinion that the linker should be integrated with the compiler and done as one step; this way the linker could have intimate knowledge of the source and would enable some spectacular LTO options. If only DMD were written in D, then we could really open the compile speed throttles with an MT build model...-- Dmitry Olshansky
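As a rough illustration of the on-the-fly idea (a sketch, not a proposed format; file names are placeholders), Phobos' std.zlib is already enough to round-trip a raw or pre-lexed buffer; a real design would presumably use a token-aware coder that exploits the keyword frequencies mentioned above:
---
import std.file : read, write;
import std.zlib : compress, uncompress;

// deflate a source (or token) buffer at maximum level before storing it
void store(string srcPath, string dstPath)
{
    write(dstPath, compress(read(srcPath), 9));
}

// inflate it back to the original bytes when an importer needs it
ubyte[] load(string path)
{
    return cast(ubyte[]) uncompress(read(path));
}
---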
Jun 12 2012
On Tuesday, 12 June 2012 at 12:23:21 UTC, Dmitry Olshansky wrote:On 12.06.2012 16:09, foobar wrote:On Tuesday, 12 June 2012 at 11:09:04 UTC, Don Clugston wrote:On 12/06/12 11:07, timotheecour wrote:I absolutely agree with the above and would also add that goal (4) is an anti-feature. In order to get a human readable version of the API the programmer should use *documentation*. D claims that one of its goals is to make it a breeze to provide documentation by bundling a standard tool - DDoc. There's no need to duplicate this just to provide another format when DDoc itself supposed to be format agnostic.Absolutely. DDoc being built-in didn't sound right to me at first, BUT it allows us to essentially being able to say that APIs are covered in the DDoc generated files. Not header files etc.There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readableIs that actually true? My recollection is that the original motivation was only goal (2), but I was fairly new to D at the time (2005). Here's the original post where it was implemented: http://www.digitalmars.com/d/archives/digitalmars/D/29883.html and it got partially merged into DMD 0.141 (Dec 4 2005), first usable in DMD0.142 Personally I believe that.di files are *totally* the wrong approach for goal (1). I don't think goal (1) and (2) have anything in common at all with each other, except that C tried to achieve both of them using header files. It's an OK solution for (1) in C, it's a failure in C++, and a complete failure in D. IMHO: If we want goal (1), we should try to achieve goal (1), and stop pretending its in any way related to goal (2).This is a solved problem since the 80's (E.g. Pascal units).Right, seeing yet another newbie hit it everyday is a clear indication of a simple fact: people would like to think & work in modules rather then seeing guts of old and crappy OBJ file technology. Linking with C != using C tools everywhere.Back in the 90's I only moved 100% away from Turbo Pascal into C land when I started using Linux at the University and eventually spent some time doing C++ as well. It still baffles me that in 2012 we still need to rely on crappy C linker tooling, when in the 80's we already had languages with proper modules. Now we have many mainstream languages with proper modules, but many of them live in VM land. Oberon, Go and Delphi/Free Pascal seem to be the only languages with native code generation compilers that offer the binary only modules solution, while many rely on some form of .di files.
Jun 13 2012
On 2012-06-12 14:09, foobar wrote:This is a solved problem since the 80's (E.g. Pascal units). Per Adam's post, the issue is tied to DMD's use of OMF/optlink which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves?Can't the same be done with OMF? I'm not saying I want to keep OMF. -- /Jacob Carlborg
Jun 12 2012
On Tue, 12 Jun 2012 06:46:44 -0700, Jacob Carlborg <doob me.com> wrote:On 2012-06-12 14:09, foobar wrote:This is a solved problem since the 80's (E.g. Pascal units). Per Adam's post, the issue is tied to DMD's use of OMF/optlink which we all would like to get rid of anyway. Once we're in proper COFF land, couldn't we just store the required metadata (binary AST?) in special sections in the object files themselves?Can't the same be done with OMF? I'm not saying I want to keep OMF.OMF doesn't support Custom Sections and I think a custom section is the right way to handle this. I found the Borland OMF docs a while back to verify this. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
Jun 12 2012
On 12/06/2012 14:39, foobar wrote:Another related question - AFAIK the LLVM folks did/are doing work to make their implementation less platform-depended. Could we leverage this in ldc to store LLVM bit code as D libs which still retain enough info for the compiler to replace header files?LLVM is definitely something I look at more and more. It is a great weapon for D IMO.
Jun 12 2012
On 6/12/2012 2:07 AM, timotheecour wrote:There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.
Jun 12 2012
On 12/06/12 18:46, Walter Bright wrote:On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling? (b) Wasn't one of the goals of D's module system supposed to be that you could import a symbol table? Why not just implement that? Seems like that would be much faster than .di files can ever be.There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.
Jun 13 2012
On 13 June 2012 09:07, Don Clugston <dac nospam.com> wrote:On 12/06/12 18:46, Walter Bright wrote:Lexing and Parsing are miniscule tasks in comparison to the three semantic runs done on the code. I added speed counters into the glue code of GDC some time ago. http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/ And here is the relavent report to go with it. http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf Example: std/xml.d Module::parse : 0.01 ( 0%) Module::semantic : 0.50 ( 9%) Module::semantic2 : 0.02 ( 0%) Module::semantic3 : 0.04 ( 1%) Module::genobjfile : 0.10 ( 2%) For the entire time it took to compile the one file (5.22 seconds) - it spent almost 10% of it's time running the first semantic analysis. But that was the D2 frontend / phobos as of September 2010. I should re-run a report on updated times and draw some comparisons. :~) Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.
Jun 13 2012
On 13.06.2012 13:37, Iain Buclaw wrote:On 13 June 2012 09:07, Don Clugston<dac nospam.com> wrote:Is time spent on I/O accounted for in the parse step? And where is the rest spent :) -- Dmitry OlshanskyOn 12/06/12 18:46, Walter Bright wrote:Lexing and Parsing are miniscule tasks in comparison to the three semantic runs done on the code. I added speed counters into the glue code of GDC some time ago. http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/ And here is the relavent report to go with it. http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf Example: std/xml.d Module::parse : 0.01 ( 0%) Module::semantic : 0.50 ( 9%) Module::semantic2 : 0.02 ( 0%) Module::semantic3 : 0.04 ( 1%) Module::genobjfile : 0.10 ( 2%) For the entire time it took to compile the one file (5.22 seconds) - it spent almost 10% of it's time running the first semantic analysis. But that was the D2 frontend / phobos as of September 2010. I should re-run a report on updated times and draw some comparisons. :~)On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.
Jun 13 2012
On 13 June 2012 10:45, Dmitry Olshansky <dmitry.olsh gmail.com> wrote:On 13.06.2012 13:37, Iain Buclaw wrote:On 13 June 2012 09:07, Don Clugston<dac nospam.com> wrote:On 12/06/12 18:46, Walter Bright wrote:On 6/12/2012 2:07 AM, timotheecour wrote:There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?Lexing and Parsing are miniscule tasks in comparison to the three semantic runs done on the code. I added speed counters into the glue code of GDC some time ago. http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/ And here is the relavent report to go with it. http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf Example: std/xml.d Module::parse : 0.01 ( 0%) Module::semantic : 0.50 ( 9%) Module::semantic2 : 0.02 ( 0%) Module::semantic3 : 0.04 ( 1%) Module::genobjfile : 0.10 ( 2%) For the entire time it took to compile the one file (5.22 seconds) - it spent almost 10% of it's time running the first semantic analysis. But that was the D2 frontend / phobos as of September 2010. I should re-run a report on updated times and draw some comparisons. :~)Is time spent on I/O accounted for in the parse step? And where is the rest spent :)It would be, the counter starts before the files are even touched, and ends after they are closed. The rest of the time spent is in the GCC backend, going through the some 60+ code passes and outputting the assembly to file. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jun 13 2012
On 13.06.2012 14:16, Iain Buclaw wrote:On 13 June 2012 10:45, Dmitry Olshansky<dmitry.olsh gmail.com> wrote:Ok, then parsing is indistinguishable from I/O and together are only tiny fraction of the whole. Great info, thanks.On 13.06.2012 13:37, Iain Buclaw wrote:It would be, the counter starts before the files are even touched, and ends after they are closed.On 13 June 2012 09:07, Don Clugston<dac nospam.com> wrote:Is time spent on I/O accounted for in the parse step? And where is the rest spent :)On 12/06/12 18:46, Walter Bright wrote:Lexing and Parsing are miniscule tasks in comparison to the three semantic runs done on the code. I added speed counters into the glue code of GDC some time ago. http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/ And here is the relavent report to go with it. http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf Example: std/xml.d Module::parse : 0.01 ( 0%) Module::semantic : 0.50 ( 9%) Module::semantic2 : 0.02 ( 0%) Module::semantic3 : 0.04 ( 1%) Module::genobjfile : 0.10 ( 2%) For the entire time it took to compile the one file (5.22 seconds) - it spent almost 10% of it's time running the first semantic analysis. But that was the D2 frontend / phobos as of September 2010. I should re-run a report on updated times and draw some comparisons. :~)On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.The rest of the time spent is in the GCC backend, going through the some 60+ code passes and outputting the assembly to file.Damn, I like DMD :) -- Dmitry Olshansky
Jun 13 2012
On 13/06/2012 11:37, Iain Buclaw wrote:On 13 June 2012 09:07, Don Clugston<dac nospam.com> wrote:On 12/06/12 18:46, Walter Bright wrote:On 6/12/2012 2:07 AM, timotheecour wrote:There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?Lexing and Parsing are miniscule tasks in comparison to the three semantic runs done on the code. I added speed counters into the glue code of GDC some time ago. http://iainbuclaw.wordpress.com/2010/09/18/implementing-speed-counters-in-gdc/ And here is the relavent report to go with it. http://iainbuclaw.files.wordpress.com/2010/09/d2-time-report2.pdf Example: std/xml.d Module::parse : 0.01 ( 0%) Module::semantic : 0.50 ( 9%) Module::semantic2 : 0.02 ( 0%) Module::semantic3 : 0.04 ( 1%) Module::genobjfile : 0.10 ( 2%) For the entire time it took to compile the one file (5.22 seconds) - it spent almost 10% of it's time running the first semantic analysis. But that was the D2 frontend / phobos as of September 2010. I should re-run a report on updated times and draw some comparisons. :~) RegardsNice numbers! They also show that the slowest part is the backend. Can you get some numbers on a recent version of D? And on some different kinds of D code (template-intensive or not, for instance) for comparison.
Jun 13 2012
The measurements should be done for modules being imported, not the module being compiled. Something like this. --- import std.algorithm; import std.stdio; import std.typecons; import std.datetime; int ok; ---
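One way to put numbers on exactly that without touching the compiler (a sketch assuming a current dmd on PATH and today's Phobos; imports_test.d is a stand-in name for the module above): with -o- the compiler emits no object file, so the wall time is dominated by lexing, parsing and semantic analysis of the imports.
---
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.process : execute;
import std.stdio : writefln;

void main()
{
    auto sw = StopWatch(AutoStart.yes);
    // -o- : run the full frontend but generate no object file
    auto r = execute(["dmd", "-o-", "imports_test.d"]);
    sw.stop();
    writefln("dmd exited %s after %s ms", r.status, sw.peek.total!"msecs");
}
---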
Jun 13 2012
On Wednesday, 13 June 2012 at 11:29:45 UTC, Kagamin wrote:The measurements should be done for modules being imported, not the module being compiled. Something like this. --- import std.algorithm; import std.stdio; import std.typecons; import std.datetime; int ok; ---Oh and let it import .d files, not .di
Jun 13 2012
On 13 June 2012 12:33, Kagamin <spam here.lot> wrote:On Wednesday, 13 June 2012 at 11:29:45 UTC, Kagamin wrote:The measurements should be done for modules being imported, not the module being compiled. Something like this. --- import std.algorithm; import std.stdio; import std.typecons; import std.datetime; int ok; ---Oh and let it import .d files, not .distd.datetime is one reason for me to run it again. I can imagine that *that* module will have an impact on parse times. But I still maintain that the majority of the compile time in the frontend is done in the first semantic pass, and not the read/parser stage. :~) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jun 13 2012
On Wednesday, 13 June 2012 at 11:47:31 UTC, Iain Buclaw wrote:std.datetime is one reason for me to run it again. I can imagine that *that* module will have an impact on parse times. But I still maintain that the majority of the compile time in the frontend is done in the first semantic pass, and not the read/parser stage. :~)Probably. Also test with -fsyntax-only, if it works, and whether it runs the semantic passes.
Jun 13 2012
On 2012-06-13 13:47, Iain Buclaw wrote:std.datetime is one reason for me to run it again. I can imagine that *that* module will have an impact on parse times. But I still maintain that the majority of the compile time in the frontend is done in the first semantic pass, and not the read/parser stage. :~)You should try the Objective-C/D bridge, that took quite a while to compile. Although it will probably not compile any more, it hasn't been updated. I think it was only for D1 as well. I think that was mostly templates, so I guess that would mean some of the semantic passes. -- /Jacob Carlborg
Jun 13 2012
On 13 June 2012 12:47, Iain Buclaw <ibuclaw ubuntu.com> wrote:On 13 June 2012 12:33, Kagamin <spam here.lot> wrote:On Wednesday, 13 June 2012 at 11:29:45 UTC, Kagamin wrote:The measurements should be done for modules being imported, not the module being compiled. Something like this. --- import std.algorithm; import std.stdio; import std.typecons; import std.datetime; int ok; ---Oh and let it import .d files, not .distd.datetime is one reason for me to run it again. I can imagine that *that* module will have an impact on parse times. But I still maintain that the majority of the compile time in the frontend is done in the first semantic pass, and not the read/parser stage. :~)Rebuilt a compile log with latest gdc as of writing on the 2.059 frontend / library. http://iainbuclaw.files.wordpress.com/2012/06/d2time_report32_2059.pdf http://iainbuclaw.files.wordpress.com/2012/06/d2time_report64_2059.pdf Notes about it: - GCC has 4 new time counters - phase setup (time spent loading the compile time environment) - phase parsing (time spent in the frontend) - phase generate (time spent in the backend) - phase finalize (time spent cleaning up and exiting) - Of the phase parsing stage, it is broken down into 5 components - Module::parse - Module::semantic - Module::semantic2 - Module::semantic3 - Module::genobjfile - Module::read, Module::parse and Module::importAll in the one I did 2 years ago are now counted as part of just the one parsing stage, rather than separate just to make it a little bit more balanced. :-) I'll post a tl;dr later on it. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Jun 16 2012
On 16/06/2012 11:18, Iain Buclaw wrote:On 13 June 2012 12:47, Iain Buclaw<ibuclaw ubuntu.com> wrote:On 13 June 2012 12:33, Kagamin<spam here.lot> wrote:On Wednesday, 13 June 2012 at 11:29:45 UTC, Kagamin wrote:The measurements should be done for modules being imported, not the module being compiled. Something like this. --- import std.algorithm; import std.stdio; import std.typecons; import std.datetime; int ok; ---Oh and let it import .d files, not .distd.datetime is one reason for me to run it again. I can imagine that *that* module will have an impact on parse times. But I still maintain that the majority of the compile time in the frontend is done in the first semantic pass, and not the read/parser stage. :~)Rebuilt a compile log with latest gdc as of writing on the 2.059 frontend / library. http://iainbuclaw.files.wordpress.com/2012/06/d2time_report32_2059.pdf http://iainbuclaw.files.wordpress.com/2012/06/d2time_report64_2059.pdf Notes about it: - GCC has 4 new time counters - phase setup (time spent loading the compile time environment) - phase parsing (time spent in the frontend) - phase generate (time spent in the backend) - phase finalize (time spent cleaning up and exiting) - Of the phase parsing stage, it is broken down into 5 components - Module::parse - Module::semantic - Module::semantic2 - Module::semantic3 - Module::genobjfile - Module::read, Module::parse and Module::importAll in the one I did 2 years ago are now counted as part of just the one parsing stage, rather than separate just to make it a little bit more balanced. :-) I'll post a tl;dr later on it.Thank you very much for your work.
Jun 19 2012
On 16 June 2012 10:18, Iain Buclaw <ibuclaw ubuntu.com> wrote:On 13 June 2012 12:47, Iain Buclaw <ibuclaw ubuntu.com> wrote:On 13 June 2012 12:33, Kagamin <spam here.lot> wrote:On Wednesday, 13 June 2012 at 11:29:45 UTC, Kagamin wrote:The measurements should be done for modules being imported, not the module being compiled. Something like this. --- import std.algorithm; import std.stdio; import std.typecons; import std.datetime; int ok; ---Oh and let it import .d files, not .diRebuilt a compile log with latest gdc as of writing on the 2.059 frontend / library. http://iainbuclaw.files.wordpress.com/2012/06/d2time_report32_2059.pdf http://iainbuclaw.files.wordpress.com/2012/06/d2time_report64_2059.pdf Notes about it: - GCC has 4 new time counters - phase setup (time spent loading the compile time environment) - phase parsing (time spent in the frontend) - phase generate (time spent in the backend) - phase finalize (time spent cleaning up and exiting) - Of the phase parsing stage, it is broken down into 5 components - Module::parse - Module::semantic - Module::semantic2 - Module::semantic3 - Module::genobjfile - Module::read, Module::parse and Module::importAll in the one I did 2 years ago are now counted as part of just the one parsing stage, rather than separate just to make it a little bit more balanced. :-) I'll post a tl;dr later on it.tl;dr Total number of source files compiled: 207 Total time to build druntime and phobos: 78.08 seconds Time spent parsing: 17.15 seconds Average time spent parsing: 0.08 seconds Time spent running semantic passes: 10.04 seconds Time spent generating backend AST: 2.15 seconds Time spent in backend: 48.62 seconds So parsing time has taken quite a hit since I last did any reports on compilation speed of building phobos. I suspect most of that comes from the loading of symbols from all imports, and that there have been some large additions to phobos recently which provide a constant bottleneck when compiling one source at a time; the apparently large amount of time spent parsing does not show when compiling all at once. Module::parse: 0.58 seconds (1%) Module::semantic: 0.24 seconds (1%) Module::semantic2: 0.01 seconds (0%) Module::semantic3: 2.85 seconds (6%) Module::genobjfile: 1.24 seconds (3%) TOTAL: 47.06 seconds Considering that the entire phobos library is some 165K lines of code, I don't see why people aren't laughing about just how quick the frontend is at parsing. :~) Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
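For scale: 17.15 s out of the 78.08 s file-at-a-time build is roughly 22% spent in the parsing phase, versus about 1% (0.58 s of 47.06 s) when everything is compiled in one invocation, which suggests most of the apparent parsing cost is the same imports being re-parsed once per file.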
Jun 16 2012
So parsing time has taken quite a hit since I last did any reports on compilation speed of building phobos.So maybe my post about "keeping import clean" wasn't as irrelevant as I thought. http://www.digitalmars.com/d/archives/digitalmars/D/Keeping_imports_clean_162890.html#N162890 -- Guillaume
Jun 16 2012
On 16 June 2012 22:17, Guillaume Chatelet <chatelet.guillaume gmail.com> wrote:I think its relevance is only geared towards projects that are compiling one file at a time - ie: I'd expect all gdc users to be compiling in this way, as whole program compilation using gdc still needs some rigorous testing first. If there is a particularly large module, or a set of large modules, that is persistently being imported, then you will see a notable constant slowdown on compilation of each file. -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';So maybe my post about "keeping import clean" wasn't as irrelevant as I thought. http://www.digitalmars.com/d/archives/digitalmars/D/Keeping_imports_clean_162890.html#N162890 -- Guillaume
Jun 19 2012
On 6/13/2012 1:07 AM, Don Clugston wrote:On 12/06/12 18:46, Walter Bright wrote:It is for debug builds.On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.(b) Wasn't one of the goals of D's module system supposed to be that you could import a symbol table? Why not just implement that? Seems like that would be much faster than .di files can ever be.Yes, it is designed so you could just import a symbol table. It is done as source code, however, because it's trivial to implement.
Jun 13 2012
On 13/06/12 16:29, Walter Bright wrote:On 6/13/2012 1:07 AM, Don Clugston wrote:Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise? It seems to me, that slow parsing is a C++ problem which D already solved.On 12/06/12 18:46, Walter Bright wrote:It is for debug builds.On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.It has those nasty side-effects listed under (3) though.(b) Wasn't one of the goals of D's module system supposed to be that you could import a symbol table? Why not just implement that? Seems like that would be much faster than .di files can ever be.Yes, it is designed so you could just import a symbol table. It is done as source code, however, because it's trivial to implement.
Jun 14 2012
On Thursday, June 14, 2012 10:03:05 Don Clugston wrote:On 13/06/12 16:29, Walter Bright wrote:If this is the case, is there any value at all to using .di files in druntime or Phobos other than in cases where we're specifically trying to hide implementation (e.g. with the GC)? Or do we still end up paying the semantic cost for importing the .d files such that using .di files would still help with compilation times? - Jonathan M DavisOn 6/13/2012 1:07 AM, Don Clugston wrote:Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise? It seems to me, that slow parsing is a C++ problem which D already solved.On 12/06/12 18:46, Walter Bright wrote:It is for debug builds.On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.
Jun 14 2012
On Thursday, 14 June 2012 at 08:11:02 UTC, Jonathan M Davis wrote:Or do we still end up paying the semantic cost for importing the .d files such that using .di files would still help with compilation times?Oh, right, the module can use mixins and CTFE, so it should be semantically checked, but the semantic check may be minimal just like in the case of a .di file.
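To make that concrete: a module's public interface can itself be the product of CTFE and string mixins, so an importer cannot skip semantic analysis of a .d file entirely. A minimal sketch, with hypothetical module and member names:

    // shapes.d -- hypothetical module; names invented for illustration
    module shapes;

    // Runs at compile time (CTFE) to build declarations as a string.
    string makeFields(string[] names)
    {
        string code;
        foreach (n; names)
            code ~= "int " ~ n ~ ";";
        return code;
    }

    struct Point
    {
        // An importer must run CTFE and expand this mixin just to learn
        // that Point has members x and y -- a .di file could contain the
        // already-expanded declarations instead.
        mixin(makeFields(["x", "y"]));
    }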
Jun 14 2012
On 14/06/12 10:10, Jonathan M Davis wrote:On Thursday, June 14, 2012 10:03:05 Don Clugston wrote:I don't think Phobos should use .di files at all. I don't think there are any cases where we want to conceal code. The performance benefit you would get is completely negligible. It doesn't even reduce the number of files that need to be loaded, just the length of each one. I think that, for example, improving the way that array literals are dealt with would have at least as much impact on compilation time. For the DMD backend, fixing up the treatment of comma expressions would have a much bigger impact than getting lexing and parsing time to zero. And we're well set up for parallel compilation. There's no shortage of things we can do to improve compilation time. Using di files for speed seems a bit like jettisoning the cargo to keep the ship afloat. It works but you only do it when you've got no other options.On 13/06/12 16:29, Walter Bright wrote:If this is the case, is there any value at all to using .di files in druntime or Phobos other than in cases where we're specifically trying to hide implementation (e.g. with the GC)? Or do we still end up paying the semantic cost for importing the .d files such that using .di files would still help with compilation times? - Jonathan M DavisOn 6/13/2012 1:07 AM, Don Clugston wrote:Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise? It seems to me, that slow parsing is a C++ problem which D already solved.On 12/06/12 18:46, Walter Bright wrote:It is for debug builds.On 6/12/2012 2:07 AM, timotheecour wrote:I don't understand (1) actually. For two reasons: (a) Is lexing + parsing really a significant part of the compilation time? Has anyone done some solid profiling?There's a current pull request to improve di file generation (https://github.com/D-Programming-Language/dmd/pull/945); I'd like to suggest further ideas. As far as I understand, di interface files try to achieve these conflicting goals: 1) speed up compilation by avoiding having to reparse large files over and over. 2) hide implementation details for proprietary reasons 3) still maintain source code in some form to allow inlining and CTFE 4) be human readable(4) was not a goal. A .di file could very well be a binary file, but making it look like D source enabled them to be loaded with no additional implementation work in the compiler.
Jun 14 2012
On Friday, June 15, 2012 08:58:55 Don Clugston wrote:I don't think Phobos should use .di files at all. I don't think there are any cases where we want to conceal code. The performance benefit you would get is completely negligible. It doesn't even reduce the number of files that need to be loaded, just the length of each one. I think that, for example, improving the way that array literals are dealt with would have at least as much impact on compilation time. For the DMD backend, fixing up the treatment of comma expressions would have a much bigger impact than getting lexing and parsing time to zero. And we're well set up for parallel compilation. There's no shortage of things we can do to improve compilation time. Using di files for speed seems a bit like jettisoning the cargo to keep the ship afloat. It works but you only do it when you've got no other options.On several occasions, Walter has expressed the desire to make Phobos use .di files like druntime does, otherwise I probably would never have considered it. Personally, I don't want to bother with it unless there's a large benefit from it, so if we're sure that the gain is minimal, then I say that we should just leave it all as .d files. Most of Phobos would have to have its implementation left in any .di files anyway so that inlining and CTFE could work. - Jonathan M Davis
Jun 15 2012
On 6/14/2012 11:58 PM, Don Clugston wrote:And we're well set up for parallel compilation. There's no shortage of things we can do to improve compilation time.The language is carefully designed, so that at least in theory all the passes could be done in parallel. I've got the file reads in parallel, but I'd love to have the lexing, parsing, semantic, optimization, and code gen all done in parallel. Wouldn't that be awesome!Using di files for speed seems a bit like jettisoning the cargo to keep the ship afloat. It works but you only do it when you've got no other options..di files don't make a whole lotta sense for small files, but the bigger they get, the more they are useful. D needs to be scalable to enormous project sizes.
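Lexing and parsing, at least, are embarrassingly parallel per file. A sketch of what the driver loop could look like with std.parallelism - Module and lexAndParse are hypothetical stand-ins for compiler internals:

    // Sketch of a per-file parallel front end using std.parallelism.
    import std.parallelism : parallel;

    struct Module { string name; /* AST, symbol table, ... */ }

    Module lexAndParse(string fileName)
    {
        // read + lex + parse one file; no shared mutable state required
        return Module(fileName);
    }

    Module[] frontEnd(string[] files)
    {
        auto modules = new Module[files.length];
        foreach (i, fileName; parallel(files))   // one task per file
            modules[i] = lexAndParse(fileName);
        return modules;   // cross-module semantic passes run after this barrier
    }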
Jun 16 2012
On 17/06/2012 00:41, Walter Bright wrote:On 6/14/2012 11:58 PM, Don Clugston wrote:The key point here is project size. I wouldn't expect individual file sizes to grow significantly.And we're well set up for parallel compilation. There's no shortage of things we can do to improve compilation time.The language is carefully designed, so that at least in theory all the passes could be done in parallel. I've got the file reads in parallel, but I'd love to have the lexing, parsing, semantic, optimization, and code gen all done in parallel. Wouldn't that be awesome!Using di files for speed seems a bit like jettisoning the cargo to keep the ship afloat. It works but you only do it when you've got no other options..di files don't make a whole lotta sense for small files, but the bigger they get, the more they are useful. D needs to be scalable to enormous project sizes.
Jun 19 2012
On 6/14/2012 1:03 AM, Don Clugston wrote:Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?I don't think they're nasty or are side effects.Yes, it is designed so you could just import a symbol table. It is done as source code, however, because it's trivial to implement.It has those nasty side-effects listed under (3) though.
Jun 16 2012
On 17/06/12 00:37, Walter Bright wrote:On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?They are new problems which people ask for solutions for. And they are far more difficult to solve than the original problem.I don't think they're nasty or are side effects.Yes, it is designed so you could just import a symbol table. It is done as source code, however, because it's trivial to implement.It has those nasty side-effects listed under (3) though.
Jun 18 2012
On 6/18/2012 6:07 AM, Don Clugston wrote:On 17/06/12 00:37, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?
Jun 18 2012
On Monday, 18 June 2012 at 17:54:40 UTC, Walter Bright wrote:On 6/18/2012 6:07 AM, Don Clugston wrote:Same here. I wish there were a standardized pre-lexed-token "binary" file format; it would benefit all text editors too, since they need to lex the source anyway to perform color syntax highlighting.On 17/06/12 00:37, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?
Jun 18 2012
On Monday, 18 June 2012 at 18:05:59 UTC, Daniel wrote:Same here, I wish there were a standardized pre-lexed-token "binary" file-format, would benefit all text editors also, as they need to lex it anyway to perform color syntax highlighting.If I were to make my own language, I'd forego a human-readable format and just have the "language" be defined as a big machine-readable AST. You'd have to have an IDE, but it could display the code in just about any way the person wants (syntax, style, etc). Syntax highlighting would be instantaneous and there would be fewer errors made by programmers (maybe ...). Plus it'd be unbelievably easy to implement things like auto-completion.
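As a toy illustration of the idea - storing code as a tree of tagged nodes rather than text (entirely invented, no relation to any real compiler's AST):

    // Toy machine-readable program representation.
    enum NodeKind : ubyte { Module, FuncDecl, Param, Ident, IntLiteral }

    struct Node
    {
        NodeKind kind;
        string   text;       // identifier or literal spelling, if any
        Node*[]  children;
    }

    // The signature "int add(int a, int b)" stored as a tree; an IDE
    // would render it back in whatever surface style the user prefers,
    // and a highlighter gets the token kinds for free.
    Node* addDecl()
    {
        auto a = new Node(NodeKind.Param, "a");
        auto b = new Node(NodeKind.Param, "b");
        return new Node(NodeKind.FuncDecl, "add", [a, b]);
    }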
Jun 18 2012
On 06/19/2012 02:47 AM, Chris Cain wrote:On Monday, 18 June 2012 at 18:05:59 UTC, Daniel wrote:http://de.wikipedia.org/wiki/Lisp ?Same here, I wish there were a standardized pre-lexed-token "binary" file-format, would benefit all text editors also, as they need to lex it anyway to perform color syntax highlighting.If I were to make my own language, I'd forego a human-readable format and just have the "language" be defined as a big machine-readable AST.You'd have to have an IDE, but it could display the code in just about any way the person wants (syntax, style, etc).This could be done even if the language's source code storage format is human-readable.Syntax highlighting would be instantaneous and there would be fewer errors made by programmers (maybe ...). Plus it'd be unbelievably easy to implement things like auto-completion.Parsing is not a huge issue. Depending on how powerful the language is, auto-completion may depend on full code analysis.
Jun 18 2012
On Tuesday, 19 June 2012 at 01:47:27 UTC, Timon Gehr wrote:Parsing is not a huge issue. Depending on how powerful the language is, auto-completion may depend on full code analysis.Yep, pegged runs at compile time.
Jun 19 2012
On Monday, 18 June 2012 at 17:54:40 UTC, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.I don't even understand all this rage about asynchronicity: if the program has nothing to do until it reads the data, asynchronicity won't help you in the slightest. Anyway, everything is stuck while the device performs DMA.
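The one place asynchronous reads can pay off is pipelining: lex file N while file N+1 is in flight. A sketch, assuming a hypothetical lex function:

    // Sketch: overlap reading the next file with lexing the current one.
    import std.file : read;
    import std.parallelism : task;

    void lex(const(void)[] source) { /* hypothetical lexer */ }

    void compileAll(string[] files)
    {
        if (files.length == 0) return;
        void[] delegate() job = () => read(files[0]);
        auto pending = task(job);
        pending.executeInNewThread();            // read file 0 in the background
        foreach (i; 0 .. files.length)
        {
            auto source = pending.yieldForce();  // wait for file i to arrive
            if (i + 1 < files.length)
            {
                auto next = files[i + 1];        // value for the background task
                job = () => read(next);
                pending = task(job);
                pending.executeInNewThread();    // start reading file i+1
            }
            lex(source);                         // lexing overlaps the read
        }
    }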
Jun 19 2012
On 19.06.2012 09:43, Kagamin wrote:On Monday, 18 June 2012 at 17:54:40 UTC, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.I don't even understand all this rage about asynchronicity: if the program has nothing to do until it reads the data,the lexing and parsing can run asynchronously - it will be faster on multiple cores because there is no dependency between separate lexing/parsing threads - so why lex/parse in sequence?asynchronicity won't help you in the slightest. Anyway, everything is stuck while the device performs DMA.yeah, down to the hardware level - but there are caches etc. out there - it's not as if multithreaded file reading is always as fast as synchronous reading, nor is asynchronous file reading always faster - it's somewhere in between :)
Jun 19 2012
On 18.06.2012 19:53, Walter Bright wrote:On 6/18/2012 6:07 AM, Don Clugston wrote:so you started your lexing and parsing in separate threads for each file - where was synchronization needed? Have you measured which parts of the code make it behave like synchronous reading - or is it the file reading itself?On 17/06/12 00:37, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?
Jun 19 2012
On 18/06/2012 19:53, Walter Bright wrote:On 6/18/2012 6:07 AM, Don Clugston wrote:It is kind of religious. We need data.On 17/06/12 00:37, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?
Jun 19 2012
On Mon, 18 Jun 2012 13:53:43 -0400, Walter Bright <newshound2 digitalmars.com> wrote:On 6/18/2012 6:07 AM, Don Clugston wrote:I have found that my project, which has a huge number of symbols (and large ones), compiles much slower than I would expect. Perhaps you have forgotten about this issue: http://d.puremagic.com/issues/show_bug.cgi?id=4900 Maybe fixing this still doesn't help parsing, not sure. -SteveOn 17/06/12 00:37, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?
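For anyone who wants to see symbol sizes blow up first hand, chained range templates make it easy to observe, since each wrapper's mangled name embeds the mangling of everything inside it:

    // Each .map! wraps the previous type, so the mangled name of the
    // final type embeds the manglings of all the inner ones.
    import std.algorithm : map;
    import std.range : iota;
    import std.stdio;

    void main()
    {
        auto r = iota(10)
            .map!(a => a + 1)
            .map!(a => a * 2)
            .map!(a => a - 3);
        writeln(typeof(r).mangleof.length);   // grows sharply with nesting
    }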
Jun 25 2012
On Mon, 18 Jun 2012 19:53:43 +0200, Walter Bright <newshound2 digitalmars.com> wrote:On 6/18/2012 6:07 AM, Don Clugston wrote:Lexing definitely takes a big part of debug compilation time. I haven't profiled the compiler for some time now, but here are some thoughts.
- speeding up the identifier hash table: there was always a profile spike at StringTable::lookup, though it has shrunk since you increased the bucket count
- memory mapping the source file: this saves a copy for UTF-8 sources and is by far the fastest way to read a source file
- parallel reading/parsing: doesn't help much if most of the source files are read during import semantic
I'm regularly hitting other bottlenecks, so I don't think that lexing is the main one. When compiling std.range with unittests, for example, more than 50% of the compile time is spent checking for existing template instantiations, using O(N^2)/2 compares of template arguments. If we managed to fix http://d.puremagic.com/issues/show_bug.cgi?id=7469 we could efficiently use the mangled name as the key.On 17/06/12 00:37, Walter Bright wrote:Yeah, but I can't escape that lingering feeling that lexing is slow. I was fairly disappointed that asynchronously reading the source files didn't have a measurable effect most of the time.On 6/14/2012 1:03 AM, Don Clugston wrote:But you argued in your blog that C++ parsing is inherently slow, and you've fixed those problems in the design of D. And as far as I can tell, you were extremely successful! Parsing in D is very, very fast.Nothing recent, it's mostly from my C++ compiler testing.It is for debug builds.Iain's data indicates that it's only a few % of the time taken on semantic1(). Do you have data that shows otherwise?
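On the memory-mapping point, std.mmfile already wraps the OS facility; a minimal sketch:

    // Sketch: map a source file read-only and lex it in place, no copy.
    import std.mmfile : MmFile;

    void lexFile(string fileName)
    {
        auto mm = new MmFile(fileName);        // read-only mapping by default
        auto src = cast(const(char)[]) mm[];   // whole file, zero-copy
        // ... run the lexer over src; keep mm alive while src is in use ...
    }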
Jun 25 2012