
digitalmars.D.announce - commonmark-d: A fast CommonMark and Github Flavoured Markdown parser,

Guillaume Piolat <first.last gmail.com> writes:
Hello,

commonmark-d is a D translation of MD4C, a fast SAX-like Markdown 
parser.
MD4C achieves remarkable parsing speed by building no AST 
and using memory carefully.

I chose to translate an existing parser because parsing Markdown is 
much more involved than I first thought. The D translation largely 
preserves the speed benefits of MD4C.


Usage:

     // Parse CommonMark, generate HTML
     import commonmarkd;
     string html = convertMarkdownToHTML(markdown);

Key Performance Numbers:
     - commonmark-d compiles 3x faster than dmarkdown and 40x 
faster than hunt-markdown.
     - commonmark-d parses Markdown 2x faster than dmarkdown and 
15x faster than hunt-markdown (see GitHub for benchmark details)

I haven't measured memory usage of either compile time or run 
time, but I feel like it's also better.

Available now on DUB: http://code.dlang.org/packages/commonmark-d
GitHub page: https://github.com/p0nce/commonmark-d
Sep 30 2019
Mike Parker <aldacron gmail.com> writes:
On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat 
wrote:
 Hello,

 commonmark-d is a D translation of MD4C, a fast SAX-like 
 Markdown parser.
Thumbs up!
Oct 01 2019
Dennis <dkorpel gmail.com> writes:
Cool!

On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat 
wrote:
 Key Performance Numbers:
Have you compared it with the original C code from MD4C?
Oct 01 2019
Guillaume Piolat <first.last gmail.com> writes:
On Tuesday, 1 October 2019 at 11:37:00 UTC, Dennis wrote:
 Cool!

 On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat 
 wrote:
 Key Performance Numbers:
Have you compared it with the original C code from MD4C?
No. It's entirely possible that there is a small difference. However, most of the code is nothrow @nogc and only uses the GC to allocate the output buffer (the growth strategy might matter there). I don't expect much difference, but yeah, I haven't tested :)
Oct 01 2019
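
To make the remark about the output buffer's growth strategy concrete, here is a minimal sketch of a GC-backed buffer that doubles its capacity when full. It is not taken from commonmark-d or MD4C: the `OutputBuffer` name, its members, and the starting capacity of 64 bytes are all assumptions chosen for illustration. Whether the growth policy affects the benchmark numbers is untested, as the post says.

```d
/// Hypothetical growable output buffer; not commonmark-d's actual code.
/// The backing storage comes from the GC.
struct OutputBuffer
{
    private char[] storage; // GC-allocated backing array
    private size_t used;    // number of bytes written so far

    /// Appends `s`, doubling the capacity whenever it runs out.
    void put(const(char)[] s)
    {
        immutable needed = used + s.length;
        if (needed > storage.length)
        {
            // Assumed policy: start at 64 bytes, then grow geometrically.
            size_t capacity = storage.length ? storage.length : 64;
            while (capacity < needed)
                capacity *= 2;
            storage.length = capacity; // the only place that may allocate
        }
        storage[used .. needed] = s[];
        used = needed;
    }

    /// The bytes written so far.
    const(char)[] data() const
    {
        return storage[0 .. used];
    }
}
```

With geometric growth the number of reallocations is logarithmic in the output size, which is why the choice of growth factor could plausibly show up in a benchmark against the C original.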
bachmeier <no spam.net> writes:
This is really nice. The examples show only conversion to HTML. Is there an easy way to get the intermediate output and convert to PDF through LaTeX, to org-mode, etc., or to customize the HTML conversion? One use case that is easy with Pandoc is to copy just the code from Markdown into its own source file as a simple form of literate programming.
Oct 01 2019
Guillaume Piolat <first.last gmail.com> writes:
On Tuesday, 1 October 2019 at 16:02:47 UTC, bachmeier wrote:
This is really nice. The examples show only conversion to html. Is there an easy way to get the intermediate output and convert to PDF through latex, to org-mode, etc., or to change the html conversion? One use case that is easy with Pandoc is to copy just the code from markdown into its own source file as a simple form of literate programming.
MD4C is a push parser without an AST, so you have to give it callbacks to generate any kind of intermediate output. You'd have to make md_parse public in commonmark-d; it is a C-style API.

My long-term goal is indeed super fast conversion of Markdown to PDF. Now that we have the CommonMark parser and the PDF generation, I just need the time to manage layout. Possibly making a minimal browser is a better route, dunno.
Oct 01 2019
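
To illustrate what "push parser with callbacks" means for the literate-programming use case raised above, here is a minimal sketch that reports fenced code blocks to caller-supplied delegates instead of building a tree. It does not use MD4C's or commonmark-d's API: `CodeBlockHandlers`, `pushFencedCode` and `extractCode` are hypothetical names, and the line-based scanner only recognizes backtick fences, which is far short of a real CommonMark parser.

```d
import std.algorithm.searching : startsWith;
import std.array : appender;
import std.string : lineSplitter, strip;

/// Hypothetical callback set, loosely modelled on a SAX-style push parser.
struct CodeBlockHandlers
{
    void delegate(const(char)[] info) enterCodeBlock; // info string after the fence
    void delegate(const(char)[] line) codeLine;       // one line of code
    void delegate() leaveCodeBlock;
}

/// Scans `markdown` line by line and pushes fenced code blocks to `h`.
/// Simplification: any line starting with three backticks toggles a block.
void pushFencedCode(const(char)[] markdown, CodeBlockHandlers h)
{
    bool inside = false;
    foreach (line; markdown.lineSplitter)
    {
        auto trimmed = line.strip;
        if (trimmed.startsWith("```"))
        {
            if (!inside)
                h.enterCodeBlock(trimmed[3 .. $].strip);
            else
                h.leaveCodeBlock();
            inside = !inside;
        }
        else if (inside)
        {
            h.codeLine(line);
        }
    }
}

/// Collects only the code lines, as a simple form of literate programming.
string extractCode(const(char)[] markdown)
{
    auto sink = appender!string();
    CodeBlockHandlers h;
    h.enterCodeBlock = delegate(const(char)[] info) { };
    h.codeLine = delegate(const(char)[] line) { sink.put(line); sink.put('\n'); };
    h.leaveCodeBlock = delegate() { };
    pushFencedCode(markdown, h);
    return sink.data;
}
```

The caller decides what to do with each event, so the same scanner could feed an HTML writer, a LaTeX writer, or a plain source file without any intermediate tree. The cost is the one mentioned in the thread: without an AST, anything that needs to look at the document as a whole has to be assembled by the callbacks themselves.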
zoujiaqing <zoujiaqing gmail.com> writes:
On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat 
wrote:
 Hello,

 I haven't measured memory usage of either compile time or run 
 time, but I feel like it's also better.
Thanks, I like this project. Because hunt-markdown has a strictly abstract design, its performance is not particularly good :)
Oct 02 2019
Guillaume Piolat <first.last gmail.com> writes:
On Wednesday, 2 October 2019 at 09:33:03 UTC, zoujiaqing wrote:
 On Monday, 30 September 2019 at 23:06:42 UTC, Guillaume Piolat 
 wrote:
 Hello,

 I haven't measured memory usage of either compile time or run 
 time, but I feel like it's also better.
Thanks, I like this project. Because hunt-markdown is strictly abstract in design, the performance is not particularly good:)
I wanted to use hunt-markdown but was thinking it could use a bit less RAM :) Translations tend to look like their originals. I'd be very happy if you would consider commonmark-d for your use case. Having no AST is less flexible but has nice properties.
Oct 02 2019
LocoDelPueblo <fdp-dyna-hum nowhere.mx> writes:
dmarkdown was actually extracted from vibe-d a few years ago, mostly for a piece of software called "harbored-mod", to add support for Markdown in DDoc comments, so the vibe-d Markdown module should still be in the same order of magnitude of "sub-optimal-ity". For conversions from Markdown to HTML in a static context (i.e. not a server), I'd just use Pandoc. dmarkdown had some bugs. Maybe they are fixed in the newest vibe-d, since the fork you compare to was basically stillborn.
Oct 03 2019
zoujiaqing <zoujiaqing gmail.com> writes:
On Thursday, 3 October 2019 at 08:19:12 UTC, LocoDelPueblo wrote:
 dmarkdown was actually extracted from vibe-d a few years
 
But it is not compatible with CommonMark syntax.
Oct 04 2019