
digitalmars.D.announce - DMD library available as DUB package

Jacob Carlborg <doob me.com> writes:
During the dconf hackathon I set out to create a DUB package for DMD to 
be used as a library. This has finally been merged [1] and is available 
here [2]. It contains the lexer and the parser.

A minimal example:

#!/usr/bin/env dub
/++ dub.sdl:
name "dmd_lexer_example"
dependency "dmd" version="~master"
+/

void main()
{
     import ddmd.lexer;
     import ddmd.tokens;
     import std.stdio;

     immutable sourceCode = "void test() {} // foobar";
     scope lexer = new Lexer("test", sourceCode.ptr, 0,
         sourceCode.length, 0, 0);

     while (lexer.nextToken != TOKeof)
         writeln(lexer.token.value);
}

[1] https://github.com/dlang/dmd/pull/6771
[2] http://code.dlang.org/packages/dmd

-- 
/Jacob Carlborg
Jul 18
Suliman <evermind live.ru> writes:
Could you explain where it can be helpful?
Jul 18
Dukc <ajieskola gmail.com> writes:
On Tuesday, 18 July 2017 at 12:35:10 UTC, Suliman wrote:
 Could you explain where it can be helpful?
For tools, such as source code formatters. They do not have to write the parsers themselves if they use a library such as this one.
Jul 18
Jacob Carlborg <doob me.com> writes:
On 2017-07-18 14:35, Suliman wrote:
 Could you explain where it can be helpful?
As Dukc said, for tools that need to analyze D source code.

-- 
/Jacob Carlborg
Jul 18
Meta <jared771 gmail.com> writes:
On Tuesday, 18 July 2017 at 12:07:27 UTC, Jacob Carlborg wrote:
 During the dconf hackathon I set out to create a DUB package 
 for DMD to be used as a library. This has finally been merged 
 [1] and is available here [2]. It contains the lexer and the 
 parser.

Nice, I was not aware that DMD as a library was so close to being a reality.
Jul 18
NotSpooky <zoteman94 gmail.com> writes:
On Tuesday, 18 July 2017 at 12:07:27 UTC, Jacob Carlborg wrote:
 During the dconf hackathon I set out to create a DUB package 
 for DMD to be used as a library. This has finally been merged 
 [1] and is available here [2]. It contains the lexer and the 
 parser.

Awesome, was waiting for this.
Jul 18
Andrea Fontana <nospam example.com> writes:
On Tuesday, 18 July 2017 at 12:07:27 UTC, Jacob Carlborg wrote:
 During the dconf hackathon I set out to create a DUB package 
 for DMD to be used as a library. This has finally been merged 
 [1] and is available here [2]. It contains the lexer and the 
 parser.
Great news!! I thought it was not ready yet.
Jul 19
Johan Engelen <j j.nl> writes:
On Tuesday, 18 July 2017 at 12:07:27 UTC, Jacob Carlborg wrote:
 During the dconf hackathon I set out to create a DUB package 
 for DMD to be used as a library. This has finally been merged 
 [1] and is available here [2]. It contains the lexer and the 
 parser.
This is great news of course! But I have some bad news ;-)
Now that the Lexer is nicely separated, it is very easy for me to test-drive libFuzzer+AddressSanitizer on the lexer and... Expect many bug reports in the next days.

I am testing this code:
```
void fuzzDMDLexer(const(char*) data, size_t length)
{
    scope lexer = new Lexer("test", data, 0, length, false, false);
    lexer.nextToken;
    do
    {
        auto drop = lexer.token.value;
    } while (lexer.nextToken != TOKeof);
}
```

A short list of heap-overflow memory access bugs (params data and length are consistent):
1. length == 0
2. data == "\n" (line feed, 0xa)
3. data == "only_ascii*" (nothing following the "*" is the problem)
4. data == "%%"
5. data == "*ô"
6. data == "\nÜÜÜ"
7. data == "\x0a''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''"
8. data == ")\xf7"

`void scan(Token* t)` is to blame for most of the bugs I found so far. See e.g. line 980 that causes bug 3:
https://github.com/dlang/dmd/blob/154aa1bfd36333a8777d571e39690511e670bfcf/src/ddmd/lexer.d#L979-L980

Example of stacktrace (bug 8):
```
==11222==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000952 at pc 0x0001028915b5 bp 0x7fff5d3941f0 sp 0x7fff5d3941e8
READ of size 1 at 0x602000000952 thread T0
    #0 0x1028915b4 in _D4ddmd5lexer5Lexer9decodeUTFMFZk lexer.d:2314
    #1 0x102887cae in _D4ddmd5lexer5Lexer4scanMFPS4ddmd6tokens5TokenZv lexer.d:1019
    #2 0x102875089 in _D4ddmd5lexer5Lexer9nextTokenMFZE4ddmd6tokens3TOK lexer.d:222
    #3 0x1028c5d20 in _D9fuzzlexer12fuzzDMDLexerFxPhmZv fuzzlexer.d:31
```

I am very excited to see the fuzzer+asan working so nicely! :-)

Johan
Jul 30
Johan Engelen <j j.nl> writes:
On Sunday, 30 July 2017 at 23:41:40 UTC, Johan Engelen wrote:
 On Tuesday, 18 July 2017 at 12:07:27 UTC, Jacob Carlborg wrote:
 During the dconf hackathon I set out to create a DUB package 
 for DMD to be used as a library. This has finally been merged 
 [1] and is available here [2]. It contains the lexer and the 
 parser.
 This is great news of course! But I have some bad news ;-) Now that the Lexer is nicely separated, it is very easy for me to test-drive libFuzzer+AddressSanitizer on the lexer and... Expect many bug reports in the next days.
OK, this wasn't entirely fair.
1. I didn't read the API: the buffer needs to be null-terminated.
2. With a fix [1] to prevent reading beyond the input buffer, I have yet to find a new bug.

The fuzzer is running now... I wonder how long it takes to find the next bug, if any.

-Johan

[1] https://github.com/dlang/dmd/pull/7050
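
For anyone feeding arbitrary byte ranges to the lexer, the null-termination requirement mentioned above could be met by copying the input first. This is only a minimal sketch: `withSentinel` is a hypothetical helper, not part of DMD, and it assumes that a single trailing '\0' is the sentinel the lexer expects.

```d
/// Hypothetical helper (not part of DMD): copies `length` bytes starting at
/// `data` into a fresh buffer and appends one '\0' sentinel byte.
char[] withSentinel(const(char)* data, size_t length)
{
    auto buf = new char[length + 1];
    buf[0 .. length] = data[0 .. length]; // copy the payload unchanged
    buf[length] = '\0';                   // sentinel after the last input byte
    return buf;
}

void main()
{
    // A slice of a larger string is not followed by '\0' in memory,
    // which is similar to what a fuzzer hands to its target.
    string source = "only_ascii*trailing";
    auto input = source[0 .. 11]; // "only_ascii*"

    auto buf = withSentinel(input.ptr, input.length);
    assert(buf.length == input.length + 1);
    assert(buf[0 .. $ - 1] == input);
    assert(buf[$ - 1] == '\0');

    // Assumption: the copy could then be passed to the lexer the same way
    // as in the posts above, for example:
    //   scope lexer = new Lexer("test", buf.ptr, 0, input.length, false, false);
}
```

The copy costs an allocation per input, which is usually acceptable in a fuzz target; the end offset passed to the lexer would stay at the original length so the sentinel itself is not treated as source text.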
Jul 31