
c++.stlsoft - Gauging Interest in Parsing Library

reply "christopher diggins" <cdiggins videotron.ca> writes:
Greetings All,

I have finished a first version of a public-domain CFG R-D parser library, 
the YARD (yet another recursive descent) parser. Would there be any interest 
in including this library in the STLSoft, assuming of course that I assured 
it satisfied the technical requirements? More information is available at 
http://yard-parser.sf.net

TIA

-- 
Christopher Diggins
http://www.cdiggins.com
http://www.heron-language.com 
Dec 26 2004
parent reply "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
Chris

Always interested in all manner of things, especially when they're up to 
technical requirements. :-)

Specifically, parsing's something I'm very amateurish on, so your 
proposition sounds appealing on first listen because it sounds like I 
could learn more.

More objectively, parsing's definitely something that STLSoft does not 
yet provide - other than straightforward tokenisation (via 
string_tokeniser<>) - so sounds like it may answer a need.

I'll try and have a gander at YARD later in the week.

(FYI: STLSoft is 100% header only, and is always going to stay that way. 
However, I'm working on an affiliate structure - so I can 'release' some 
Synesis / STLSoft interoperation code - whereby things that need to work 
with arbitrary external libraries would live in the 'stlsoft::x' 
namespace. Details are a little sketchy atm, but it's important to note 
that (i) the 100% header only rule is never going to change, but (ii) 
we're going to find pragmatic ways to workaround it when 
suitable/necessary.)

Cheers

Matthew
Dec 26 2004
parent reply "christopher diggins" <cdiggins videotron.ca> writes:
"Matthew" <admin stlsoft.dot.dot.dot.dot.org> wrote in message 
news:cqo6l3$dgt$1 digitaldaemon.com...
 Chris

 Always interested in all manner of things, especially when they're up to 
 technical requirements. :-)

 Specifically, parsing's something I'm very amateurish on, so your 
 proposition sounds appealing on first listen because it sounds like I 
 could learn more.

 More objectively, parsing's definitely something that STLSoft does not yet 
 provide - other than straightforward tokenisation (via 
 string_tokeniser<>) - so sounds like it may answer a need.
 I'll try and have a gander at YARD later in the week.
Great!
 (FYI: STLSoft is 100% header only, and is always going to stay that way. 
 However, I'm working on an affiliate structure - so I can 'release' some 
 Synesis / STLSoft interoperation code - whereby things that need to work 
 with arbitrary external libraries would live in the 'stlsoft::x' 
 namespace. Details are a little sketchy atm, but it's important to note 
 that (i) the 100% header only rule is never going to change, but (ii) 
 we're going to find pragmatic ways to workaround it when 
 suitable/necessary.)
YARD is also 100% header. Not something you want in a lager, but definitely 
an attractive feature in a library ;-)

FYI: YARD has only been tested on Visual C++ 7.1 and GCC 3.4.1. Some other 
compilers are likely going to complain because it uses a lot of 
metaprogramming.

-- 
Christopher Diggins
http://www.cdiggins.com
http://www.heron-language.com
Dec 26 2004
next sibling parent "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
 (FYI: STLSoft is 100% header only, and is always going to stay that 
 way. However, I'm working on an affiliate structure - so I can 
 'release' some Synesis / STLSoft interoperation code - whereby things 
 that need to work with arbitrary external libraries would live in the 
 'stlsoft::x' namespace. Details are a little sketchy atm, but it's 
 important to note that (i) the 100% header only rule is never going 
 to change, but (ii) we're going to find pragmatic ways to workaround 
 it when suitable/necessary.)
YARD is also 100% header. Not something you want in a lager, but definitely an attractive feature in a library ;-)
Good!
 FYI: YARD has only been tested on Visual C++ 7.1 and GCC 3.4.1. Some 
 other compilers are likely going to complain because it uses a lot of 
 metaprogramming.
Then you've come to the right place ... ;)
Dec 27 2004
prev sibling parent reply "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
Chris

Still pushed for time, but still interested. Any chance of you pointing 
me in the direction of a couple of succinct samples showing the 
usefulness of YARD? :-)

Cheers

Matthew
Jan 31 2005
parent reply "christopher diggins" <cdiggins videotron.ca> writes:
"Matthew" <admin stlsoft.dot.dot.dot.dot.org> wrote in message 
news:ctmfbu$203i$1 digitaldaemon.com...
 Chris

 Still pushed for time, but still interested. Any chance of you pointing me 
 in the direction of a couple of succinct samples showing the usefulness of 
 YARD? :-)

 Cheers
Hi Matthew,

Most definitely, but unfortunately it won't be for at least a couple of 
weeks. I am currently completely drowning here. I'll make an announcement 
here, when it is ready. I am glad to hear you are still interested!

Best,
Christopher
Feb 01 2005
parent reply "christopher diggins" <cdiggins videotron.ca> writes:
Hi Matthew,

YARD version 2.0 has just been released at http://www.ootl.org/yard/ . There 
is a sample cpp to html program at 
http://www.ootl.org/yard/examples/cpp_to_html.hpp.htm (the code was used to 
generate the pretty source code). I am still working on a comprehensive test 
suite which will contain numerous succinct examples. Sorry about the delay.

Christopher Diggins
http://www.ootl.org
Feb 16 2005
next sibling parent "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
No worries. Things here are still progressing like a weak snail pushing 
its way through a particularly obdurate bowl of molasses. ;)
Feb 16 2005
prev sibling next sibling parent reply "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
The example'd benefit from some examples of input and output data
Feb 16 2005
parent "christopher diggins" <cdiggins videotron.ca> writes:
"Matthew" <admin stlsoft.dot.dot.dot.dot.org> wrote in message 
news:cv0db7$13nu$1 digitaldaemon.com...
 The example'd benefit from some examples of input and output data
I will follow your advice and show input and output data. I actually just 
recently ended up overhauling the library, trimming some fat. I am still 
working on some better examples.

Here is another example of YARD, which converts http://xxx to HTML 
hyperlinks (i.e. adds <a href= tags):

  namespace http_replacer
  {
    int main() {
      ParserInput stream(cin);
      while (!stream.AtEnd()) {
        // remember where this match attempt starts
        DefaultParser::iter_type pos = stream.GetPos();
        if (!TextGrammar::Uri::Accept<DefaultParser>(stream)) {
          // no URI here: copy one element through unchanged
          cout << stream.GetElem();
          stream.GotoNext();
        }
        else {
          // a URI was matched: wrap the matched text in an anchor tag
          string s(pos, stream.GetPos());
          cout << "<a href='" << s << "'>" << s << "</a>";
        }
      }
      return 0;
    }
  }

The grammatical rule TextGrammar::Uri is defined in part with the following 
rules:

  struct HttpPrefix_string {
    static char const* GetString() { return "http://"; }
  };

  struct HyperLinkChar :
    CharSet<
      char_set_union<
        ident_next_set,
        char_set_list<'+','-','?','%','/','\\','_','.'>
      >
    >
  { };

  struct Uri :
    bnf_and<
      StringNoCase<HttpPrefix_string>,
      bnf_plus<HyperLinkChar>
    >
  { };

There is perhaps a single line of code which might particularly interest you 
in my code formatting tool:

  fstream in;
  fstream out;
  //...
  in > Filter(cpp_to_html::main) > Filter(http_replacer::main) > out;

Does this line make intuitive sense to you? I'll give you a hint: > is 
intended as a redirection/pipe operator.

So what do you think?
(1) Yawn
(2) Good for you Christopher [patronizing pat on the head]
(3) Lay off of the chemicals
(4) I gotta have that code!

-- 
Christopher Diggins
Object Oriented Template Library (OOTL)
http://www.ootl.org
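
For readers without YARD to hand, the scan-and-wrap loop in the post can be approximated in plain C++. This is only a sketch under assumptions: match_uri, is_link_char and linkify are hypothetical names, the matcher is hand-written rather than built from YARD grammar rules, and the contents of ident_next_set are guessed to be letters, digits and underscore.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumed character set: letters, digits, and the punctuation listed in the
// post's char_set_list. What ident_next_set really contains is a guess.
bool is_link_char(char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    return std::isalnum(u) != 0 ||
           std::string("+-?%/\\_.").find(c) != std::string::npos;
}

// Hypothetical stand-in for TextGrammar::Uri. Tries to match "http://"
// (case-insensitively) followed by one or more link characters at `pos`.
// Returns the index one past the match, or `pos` itself when nothing matches.
// Precondition: pos < text.size().
std::size_t match_uri(const std::string& text, std::size_t pos) {
    static const std::string prefix = "http://";
    if (text.size() - pos < prefix.size()) return pos;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[pos + i]);
        if (std::tolower(c) != prefix[i]) return pos;
    }
    const std::size_t start = pos + prefix.size();
    std::size_t end = start;
    while (end < text.size() && is_link_char(text[end])) ++end;
    return end == start ? pos : end;  // "one or more" link characters required
}

// Copies `text`, wrapping every matched URI in an anchor tag.
std::string linkify(const std::string& text) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t end = match_uri(text, pos);
        if (end == pos) {
            out += text[pos++];  // no URI here: copy one character through
        } else {
            const std::string s = text.substr(pos, end - pos);
            out += "<a href='" + s + "'>" + s + "</a>";
            pos = end;
        }
    }
    return out;
}
```

Calling linkify on a line of text returns the same text with each http:// run wrapped in an anchor. Note that ':' is not in the assumed character set, so a port number would end the match early.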
Feb 19 2005
prev sibling parent reply "Zz" <Zz Zz.com> writes:
It really needs some examples.

I tried compiling with CodeWarrior 9.3 and I got a lot of errors. I'll try
compiling with Visual Studio tomorrow.

Out of curiosity, is #include <string.h> in yard_input.hpp what you really
meant, or did you intend to use #include <string>?

Since you've tried it, how is the performance? I'm asking since I saw that
you are using iostreams, and each time I see iostreams and parsing being used
together I shudder.

Looks interesting otherwise.

Feb 21 2005
next sibling parent reply "Matthew" <admin.hat stlsoft.dot.org> writes:
 It really needs some examples.

 I tried compiling with CodeWarrior 9.3 and I got a lot of errors, I'll try
 compiling with Visual Studio tomorrow.
If it, or any other library, is incorporated into, or made to work with, STLSoft it'll have to work with a large number of compilers. The good news is that I've a lot of experience at doing this, so I doubt it'd take much work to help Chris to get YARD to do so.
 Out of curiosity is #include <string.h> in yard_input.hpp what you really
 meant or did you intend to use #include <string>.

 Since you've tried it how is the performance? I'm asking since I saw that
 you are using iostreams and each time I see iostreams and parsing being used
 together I shudder.
LOL. A wise comment. But I assume - haven't had much time to delve thus far - that Chris is concentrating on correctness first, and portability and performance next. (There's a lot of scope for making things faster, such as string views, custom allocators, etc. etc. But I do share your aversion to IOStreams when it comes to notions of 'performance'.)
 Looks interesting otherwise.
Being a bit of a duffer about parsing, I can't offer any informed opinion on this, but it's interesting to hear another opinion.

CheerZz. ;)

Matthew
Feb 21 2005
parent "christopher diggins" <cdiggins videotron.ca> writes:
"Matthew" <admin.hat stlsoft.dot.org> wrote in message 
news:cvejve$201v$1 digitaldaemon.com...

 If it, or any other library, is incorporated into, or made to work with, 
 STLSoft it'll have to work with a large number of compilers. The good news 
 is that I've a lot of experience at doing this, so I doubt it'd take much 
 work to help Chris to get YARD to do so.
Great!
 But I assume - haven't had much time to delve thus far - that Chris is 
 concentrating on correctness first,
You bet: for(;;) testing();
 Looks interesting otherwise.
 Being a bit of a duffer about parsing, I can't offer any informed opinion on this, but it's interesting to hear another opinion.
I am going to whet your taste buds again. Check out http://www.ootl.org/char_set/ . This is a part of the parsing library which I just isolated and separated (and generified a bit more). This should give a good foundation for what the rest of the library is about, Matthew. I would like to hear how this code works out for you.

CD
Feb 23 2005
prev sibling parent reply "christopher diggins" <cdiggins videotron.ca> writes:
Hi zz,

"Zz" <Zz Zz.com> wrote in message news:cvdujs$17gp$1 digitaldaemon.com...
 It really needs some examples.

 I tried compiling with CodeWarrior 9.3 and I got a lot of errors, I'll try
 compiling with Visual Studio tomorrow.
You might want to wait a bit. The library went through some significant changes recently and I have a new version that is not yet uploaded.
 Out of curiosity is #include <string.h> in yard_input.hpp what you really
 meant or did you intend to use #include <string>.
Big gaffe on my part.
 Since you've tried it how is the performance? I'm asking since I saw that
 you are using iostreams and each time I see iostreams and parsing being
 used together I shudder.
The input is generic, it can be whatever you want. You can parse a raw memory block if you want. Then it goes like a bat out of hell.
 Looks interesting otherwise.
Thanks!
Feb 23 2005
next sibling parent reply "Zz" <Zz Zz.com> writes:
Hi Chris,


 You might want to wait a bit. The library went through some significant
 changes recently and I have a new version that is not yet uploaded.
I'm patient.
 Out of curiosity is #include <string.h> in yard_input.hpp what you really
 meant or did you intend to use #include <string>.
The input is generic, it can be whatever you want. You can parse a raw memory block if you want. Then it goes like a bat out of hell.
We on a regular basis handle files of about 1.5GB, so loading the whole file 
is out of the question. On the other hand, loading it in chunks is okay so 
long as C++ iostreams are not used and plain fread is used instead.

Now for some suggestions.

When prototyping a parser I usually use rdp, since it generates rd parsers in 
C that are human readable and editable.
http://www.dcs.rhbnc.ac.uk/research/languages/projects/rdp.shtml

It has a very interesting concept called Iterator BNF that follows this rule:

  (valid subproduction) lo @ hi token

The @ is the operator. Something like

  ('body') 2 @ 4 'separator'

would match the following:

1: body separator body
2: body separator body separator body
3: body separator body separator body separator body

It can also be used to describe regular EBNF patterns at the same time.

Just thought you might find the above interesting, since you seem interested 
in parsing and metaprogramming.

Zz
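
The bounded repetition Zz describes (between lo and hi bodies, joined by a separator token) can be illustrated with a small hand-written matcher. This is a sketch only: match_iterated is a hypothetical function working on pre-split tokens, it is not taken from rdp or YARD, and it assumes the match is greedy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Greedily matches  body (separator body)*  at the start of `tokens`,
// accepting between `lo` and `hi` bodies (inclusive).
// Returns the number of tokens consumed, or -1 if fewer than `lo` bodies match.
int match_iterated(const std::vector<std::string>& tokens,
                   const std::string& body, const std::string& separator,
                   int lo, int hi) {
    std::size_t pos = 0;
    int bodies = 0;
    while (bodies < hi) {
        // Every body after the first must be preceded by a separator.
        const std::size_t need = (bodies == 0) ? 1 : 2;
        if (pos + need > tokens.size()) break;
        if (bodies > 0 && tokens[pos] != separator) break;
        if (tokens[pos + need - 1] != body) break;
        pos += need;
        ++bodies;
    }
    return bodies >= lo ? static_cast<int>(pos) : -1;
}
```

With lo = 2 and hi = 4 this accepts the three sequences listed in the post, rejects a lone body, and stops consuming after the fourth body when more are present.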
Feb 23 2005
parent reply "christopher diggins" <cdiggins videotron.ca> writes:
Hi ZZ,

I really appreciate your feedback, and involvement!

Concerning the large file parsing issue, what would be required is for 
someone to write a custom ParserInput type (see 
http://www.ootl.org/yard/#parser ). I haven't written a file reading input 
type which uses fread, as I think it may be outside the scope of the 
library. I don't know how hard it is for others to fill in the gaps. Would 
you be willing to write an efficient file reading type and share it with us?
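
For readers wondering what such an fread-based input might look like, here is a minimal sketch. The class name ChunkedFileInput and its constructor are hypothetical; the member names AtEnd, GetElem and GotoNext are borrowed from the http_replacer example earlier in the thread, and the actual requirements of a YARD ParserInput type are not shown here, so treat this as an illustration of chunked reading only.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical forward-only input that refills a fixed-size buffer with fread.
// It does not own the FILE*; the caller opens and closes it.
// Assumes chunk_size > 0.
class ChunkedFileInput {
public:
    explicit ChunkedFileInput(std::FILE* file, std::size_t chunk_size = 65536)
        : file_(file), buffer_(chunk_size), pos_(0), len_(0) {
        Refill();
    }

    // True once the last chunk has been consumed and fread returns nothing.
    bool AtEnd() const { return pos_ == len_; }

    // Current character. Precondition: !AtEnd().
    char GetElem() const { return buffer_[pos_]; }

    // Advances one character, loading the next chunk when this one runs out.
    void GotoNext() {
        if (++pos_ == len_) Refill();
    }

private:
    void Refill() {
        len_ = std::fread(buffer_.data(), 1, buffer_.size(), file_);
        pos_ = 0;
    }

    std::FILE* file_;
    std::vector<char> buffer_;
    std::size_t pos_;
    std::size_t len_;
};
```

One design caveat: this sketch discards each chunk once it is consumed, so it cannot rewind. A backtracking parser that saves a position and later returns to it would need to keep buffered data back to the earliest saved position.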

Unfortunately I could not read the rdp documentation (no post-script 
reader). I could not understand the examples you shared with me.

Let me know whether you can get the yard examples working. Thanks!

-- 
Christopher Diggins
Object Oriented Template Library (OOTL)
http://www.ootl.org 
Feb 28 2005
parent "Zz" <Zz Zz.com> writes:
Hi,

Comments are inline.

"christopher diggins" <cdiggins videotron.ca> wrote in message
news:cvvgi0$1an$1 digitaldaemon.com...
 Hi ZZ,

 I really appreciate your feedback, and involvement!
thanks.
 Concerning the large file parsing issue, what would be required is for
 someone to write a custom ParserInput type (see
 http://www.ootl.org/yard/#parser ). I haven't written a file reading input
 type which uses fread, as I think it may be outside the scope of the
 library. I don't know how hard it is for others to fill in the gaps. Would
 you be willing to write an efficient file reading type and share it with
 us?
The method we use is a subset of the one mentioned in a paper by Waite in '86, "The cost of Lexical analysis". The same method, but a little bit different, is used in the lcc compiler for the lexical analysis and is documented in their book "A Retargetable C Compiler: Design and Implementation"; the sample chapter they give online actually describes this section:
http://www.cs.princeton.edu/software/lcc/doc/06.pdf

The same method for reading is also used in Boost WAVE when compiled with the WAVE_USE_RE2C_CPP_LEXER option; this lexer is fast fast fast and can beat hand-rolled lexers. Boost xpressive was supposed to be at least on par with re2c when completed, and it would generate a statically bound lexer for Spirit or be used alone.
 Unfortunately I could not read the rdp documentation (no post-script
 reader). I could not understand the examples you shared with me.
You can download Ghostscript to read the documents. (PostScript is one of the things we parse.)
 Let me know whether you can get the yard examples working. Thanks!
I'll try them out with CodeWarrior soon.

Zz.
Mar 02 2005
prev sibling parent reply "christopher diggins" <cdiggins videotron.ca> writes:
"christopher diggins" <cdiggins videotron.ca> wrote in message 
news:cviprq$1v92$1 digitaldaemon.com...
 Hi zz,

 "Zz" <Zz Zz.com> wrote in message news:cvdujs$17gp$1 digitaldaemon.com...
 It really needs some examples.

 I tried compiling with CodeWarrior 9.3 and I got a lot of errors, I'll
 try compiling with Visual Studio tomorrow.
 You might want to wait a bit. The library went through some significant changes recently and I have a new version that is not yet uploaded.
I finally released a new version of the YARD parser ( http://www.sf.net/yard_parser/ ) and rewrote the documentation ( http://www.ootl.org/yard/ ). Comments and suggestions are appreciated.

-- 
Christopher Diggins
Object Oriented Template Library (OOTL)
http://www.ootl.org
Feb 27 2005
parent reply "christopher diggins" <cdiggins videotron.ca> writes:
I found a big problem with the latest release: I am accidentally using 
Microsoft extensions. I will be fixing that ASAP.

-- 
Christopher Diggins
Object Oriented Template Library (OOTL)
http://www.ootl.org 
Feb 28 2005
parent reply "Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
Easily done. ;)

Chris

I really do want to get in and have a look at YARD, but I'm still crying 
out for some examples. I quickly perused the latest URL posted 
yesterday, but failed to see any examples. Am I a thickie, or are they 
still missing/light?

Cheers

Matthew
Feb 28 2005
parent "christopher diggins" <cdiggins videotron.ca> writes:
"Matthew" <admin stlsoft.dot.dot.dot.dot.org> wrote in message 
news:d0061h$rda$2 digitaldaemon.com...
 Easily done. ;)

 Chris

 I really do want to get in and have a look at YARD, but I'm still crying 
 out for some examples. I quickly perused the latest URL posted yesterday, 
 but failed to see any examples. Am I a thickie, or are they still 
 missing/light?
Sorry, they just aren't obvious. Check out the following source files:

http://www.ootl.org/yard/examples/simple_example.hpp.htm
http://www.ootl.org/yard/examples/include_replacer.hpp.htm
http://www.ootl.org/yard/examples/http_to_href.hpp.htm
http://www.ootl.org/yard/examples/cpp_to_html.hpp.htm
http://www.ootl.org/yard/tests/yard_tests.hpp.htm

CD
Feb 28 2005