digitalmars.D.learn - Pegged: spaces

Michelle Long <HappyDance321 gmail.com> writes:
Is Pegged supposed to consume whitespace automatically?

I have some text like "abvdfs dfddf"

and I have made some rules to divide the two parts by a space.

The sub-rules are complex, but none of them contains a space literal (' '); 
they only contain spaces that separate the sub-rules.

The parser, though, is essentially ignoring the space.

Sometimes it seems to work with certain rule constructions, and 
other times it doesn't.

Basically none of my sub-rules have any space and the main rule is

A ' '+ B

Where A attempts to parse the first half and B the second half.

But A consumes the whole string!

A does not consume any spaces though! (no . usage or ' ' in the 
rule definitions that A uses)


I'd like to be able to limit the application of the rule to a 
substring, which would sort of emulate splitting the string.

(!' ':A) ' '+ B

hypothetically would parse each char for A but terminate the rule 
when it encounters a space, before A gets to see the space. Is this 
possible with Pegged?

It's sort of a lookahead, but it has to be done for each character, 
rather than (!' ' A), which would only check the first character 
and then continue on with A.
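
In standard PEG notation, a per-character guard is usually written by putting the negative lookahead inside the repetition, for example (!' ' .)+ instead of (!' ' A). The sketch below is a small Python model of that idea, not Pegged itself; all helper names (lit, not_, plus, split_two and so on) are hypothetical and exist only for illustration. It assumes the first half can be treated as "any run of non-space characters".

```python
# Minimal PEG-style matchers. Each matcher takes (text, pos) and returns the
# new position on success or None on failure. Hypothetical helpers, not Pegged API.

def lit(s):
    def m(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return m

def any_char(text, pos):
    # Models '.': consume exactly one character if one is available.
    return pos + 1 if pos < len(text) else None

def not_(p):
    # Negative lookahead '!p': succeeds without consuming when p fails.
    def m(text, pos):
        return pos if p(text, pos) is None else None
    return m

def seq(*ps):
    def m(text, pos):
        for p in ps:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return m

def plus(p):
    # One or more repetitions, greedy and without backtracking (PEG semantics).
    def m(text, pos):
        pos = p(text, pos)
        if pos is None:
            return None
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return m

space = lit(" ")
non_space = seq(not_(space), any_char)  # (!' ' .)  guard checked per character
word = plus(non_space)                  # (!' ' .)+
spaces = plus(space)                    # ' '+

def split_two(text):
    """Match  word ' '+ word  and return the two halves, or None."""
    end_a = word(text, 0)
    if end_a is None:
        return None
    start_b = spaces(text, end_a)
    if start_b is None:
        return None
    end_b = word(text, start_b)
    if end_b is None:
        return None
    return text[:end_a], text[start_b:end_b]

print(split_two("abvdfs dfddf"))
```

Because the lookahead runs before every single character, the first half stops at the space no matter how long it is. If A is a more complex rule, the same guard would have to be pushed into whichever repetition inside A consumes characters.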
Oct 25 2018
Michelle Long <HappyDance321 gmail.com> writes:
Ignores spaces: <-

Doesn't: <

Concatenates results: <~
Oct 25 2018
drug <drug2004 bk.ru> writes:
25.10.2018 23:34, Michelle Long wrote:
 Ignores spaces: <-
 
 Doesn't: <
 
 Concatenates results: <~
 
 
Thank you for sharing your results!
Oct 26 2018
Michelle Long <HappyDance321 gmail.com> writes:
On Friday, 26 October 2018 at 07:36:50 UTC, drug wrote:
 25.10.2018 23:34, Michelle Long wrote:
 Ignores spaces: <-
 
 Doesn't: <
 
 Concatenates results: <~
 
 
Thank you for sharing your results!
I got it backwards when posting:

/*
<   (space arrow) consumes spaces before, between and after elements.
<-  does not consume spaces.
<~  (squiggly arrow) concatenates the captures on the right-hand side of the arrow.
<:  (colon arrow) drops the entire rule result (useful to ignore comments, for example).
<^  (keep arrow) applies the 'keep' operator to all subelements in a rule.

/   binary operator: ordered choice (matches the first rule; if that fails, matches the next).
|   binary operator: longest-match alternation (the alternative with the longest match wins).
:   prefix that discards the match from the result but still requires it to succeed.
*/

The list is not complete; maybe I will update it.

What would be really cool is an autogrammar generator! Somehow it looks at text and figures out the grammar. It might require some human interaction, but it could work out the rules that generate the specific grammar. Maybe a neural net could do it? Train it enough and it could be fairly accurate, so a human just has to fix up small cases.

E.g., get a few million lines of C++ source code, pass them to the generator, and it pops out a grammar! It should be possible since the mapping is usually 1 to 1 (for PEG grammars at least).
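
To make the difference between < and <-, and between / and |, concrete, here is a small Python model of the behaviour described in the operator list above. It is a sketch of the semantics as stated in this post, not Pegged's implementation; the helper names (seq_basic, seq_spaced, ordered, longest) are hypothetical, and the set of characters treated as blanks is an assumption.

```python
# Matchers take (text, pos) and return the new position, or None on failure.
# Hypothetical helpers for illustration only; not Pegged API.

def lit(s):
    def m(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return m

def skip_spaces(text, pos):
    # Assumption: blanks are space, tab, CR and LF.
    while pos < len(text) and text[pos] in " \t\r\n":
        pos += 1
    return pos

def seq_basic(*ps):
    # Models '<-': elements must follow each other directly.
    def m(text, pos):
        for p in ps:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return m

def seq_spaced(*ps):
    # Models '<': blanks are skipped before, between and after elements.
    def m(text, pos):
        pos = skip_spaces(text, pos)
        for p in ps:
            pos = p(text, pos)
            if pos is None:
                return None
            pos = skip_spaces(text, pos)
        return pos
    return m

def ordered(*ps):
    # Models '/': the first alternative that matches wins.
    def m(text, pos):
        for p in ps:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return m

def longest(*ps):
    # Models '|': every alternative is tried and the longest match wins.
    def m(text, pos):
        results = [r for r in (p(text, pos) for p in ps) if r is not None]
        return max(results) if results else None
    return m

a, b = lit("a"), lit("b")
print(seq_basic(a, b)("a b", 0))                  # None: the space blocks '<-'
print(seq_spaced(a, b)("a b ", 0))                # 4: '<' skips all the blanks
print(ordered(lit("in"), lit("int"))("int", 0))   # 2: 'in' is tried first
print(longest(lit("in"), lit("int"))("int", 0))   # 3: 'int' is longer
```

In this model, a rule written with the spaced sequence silently swallows the separator, which is consistent with a rule A appearing to consume the whole string when it was declared with the space arrow.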
Oct 27 2018
Mark Fisher <logicfish gmail.com> writes:
On Saturday, 27 October 2018 at 14:21:51 UTC, Michelle Long wrote:
 What would be really cool if one could have an autogrammar 
 generator! Somehow it looks at text and figures out the 
 grammar. Might require some human interaction but can figure 
 out the rules that will generate the specific grammars. Maybe 
 neural net could do it? Train it enough and it could be fairly 
 accurate and a human just has to fix up small cases.

 e.g., get a few million lines of C++ source code, pass in to 
 the generator and it pops out a grammar for it! Should be 
 possible since it's usually 1 to 1(for peg grammars at least).
Something like Eclipse's Xtext would be nice, where parts of the grammar are attached to OOP features in the code.
Oct 27 2018