digitalmars.D - DLang Spec rewrite (?)

reply "Borden" <2013 bordenrhodes.com> writes:
Good afternoon, all,

I would still like to compile the D Lang Spec into EPUB (and 
possibly other formats) but, as we discussed in these threads:

http://forum.dlang.org/thread/bsbdpjyjubfxvmecwhjl forum.dlang.org
http://forum.dlang.org/thread/uzdngvjzexukbgkxdzpi forum.dlang.org

having the D Lang Specification written in DDoc macros is making 
it extremely difficult to work with.

I ask, therefore, what opposition would there be to me rewriting 
the DLang Spec files into another format that will be easier to 
parse and compile for the website, PDF, Latex, eBook and other 
formats? If the answer is 'minimal', 'go ahead' or 'it's your 
funeral', then my follow-up question is 'what format would be the 
easiest to write, debug and maintain?'

For greater clarity, I am NOT proposing to rewrite the 
DDoc-generated library documentation or any other pages outside 
of the spec. In the makefile, they are defined as the files 
covered in $(SPEC_ROOT).

With regards,
May 25 2013
next sibling parent reply "Borden" <2013 bordenrhodes.com> writes:
I hasten to add that I don't mean to criticise the original 
writers of the DLang Spec for writing it in DDoc macros. So far, 
I've found the documentation fairly easy to follow (as plain 
text) and so I don't want to lose any of that should the spec be 
rewritten.

It's also possible (although, in my opinion, less preferable) to 
keep the spec written in DDoc macros but reformatted to allow for 
easier conversion to other formats...
May 25 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/25/13 2:16 PM, Borden wrote:
 I hasten to add that I don't mean to criticise the original writers of
 the DLang Spec for writing it in DDoc macros. So far, I've found the
 documentation fairly easy to follow (as plain text) and so I don't want
 to lose any of that should the spec be rewritten.

 It's also possible (although, in my opinion, less preferable) to keep
 the spec written in DDoc macros but reformatted to allow for easier
 conversion to other formats...

My attitude on DDoc has evolved in threes:

3 minutes: "wtf is this crap"
3 hours: "this sucks"
3 days: "grumble I'll make do with this although it totally sucks"
3 months: "this is pretty darn good"

To generate several formats from one source, a macro system is needed. One interesting thing I figured about macro systems is they're all dirty - they can't be really considered "languages" because they intermix the programming part with the very output generated.

So, what macro system would you use? (Actual question.) Look at m4 - it won't win any beauty contests, either, and it's enormously complicated. DDoc is simple for what it does, it has somehow hit a sweet spot.

Andrei
May 25 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/25/13 8:56 PM, Borden wrote:
 My contention is that, for the purposes of writing lengthy, non-code
 documentation like the DLang spec (I'm not referring to any other
 documentation or pages on the site), enclosing the entire exposition in
 macros has made the source too inflexible for me to work with without
 awkward workarounds or having to write my own parser. Again, the idea is
 to use the features of HTML5 and compile the DLang spec into an ePub
 document that I can read on my brand-new Kobo.

This is a worthy goal. We manage to generate mobi files for the spec (and Phobos in a pull request), is the ebook format very different? Andrei
May 25 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/25/13 9:22 PM, Borden wrote:
 1) Allowing sections to be defined using == Heading == or === Heading
 === instead of $(HEADING ) or variants. The advantage that Wiki syntax
 has over macro-syntax is that it automatically works out the section
 nesting (which is essential for building tables of contents in things
 such as, hint hint, eBooks) whereas macros can only do it if the
 subheadings are nested as arguments.

Not getting this at all. You can define in DDoc things like H1, H2, etc. or whatever you want. Besides, you are proposing a bunch of... just sugar aimed at reading the text as is. That's not part of DDoc's charter. Besides, it does not add power (you can do the same with macros) and it makes everything awfully complicated. Do you want to be the guy writing the parser for all that sugar? If ddoc has anything going for it, it's simplicity of syntax. It has like 5 syntactic rules in total. Parsing ddoc is quite simple. What vexes me is that all the sugar you propose goes against what you opened with, which was:
 I ask, therefore, what opposition would there be to me rewriting the
 DLang Spec files into another format that will be easier to parse and
 compile for the website, PDF, Latex, eBook and other formats?

I really don't understand. Far as I can tell you are trying to accomplish a well-defined goal: compile the spec in ebook format. Then every step you're sketching on the way there takes you just away from your goal. We can generate LaTeX from ddoc (there's a pull request for doing it even better) and I can hypothesize that the shortest path between where we are and what you're trying to accomplish is a few dozens of macro definitions. Did you try doing that and failed? Andrei
May 25 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/25/2013 9:15 PM, Borden wrote:
 3) Consider, for example, this part from abi.dd:
 $(GRAMMAR
 $(I MangledName):
      $(B _D) $(I QualifiedName) $(I Type)
      $(B _D) $(I QualifiedName) $(B M) $(I Type)

 $(I QualifiedName):
      $(I SymbolName)
      $(I SymbolName) $(I QualifiedName)

 $(I SymbolName):
      $(I LName)
      $(I TemplateInstanceName)
 )
 Say I want to style this using a description list, the <dl> tag. That's easy
 enough, but now how do I tell DDoc to tag the $(I) macros using <dt> and <dd>
 tags?

You wouldn't use the I macro. You'd write a macro that reflected the structure - you might want to look at the TABLE, TROW, THEAD, THX and TDX macros to see how tables are generated in a flexible, structured manner.
May 25 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/25/2013 5:28 PM, Andrei Alexandrescu wrote:
 My attitude on DDoc has evolved in threes:

 3 minutes: "wtf is this crap"
 3 hours: "this sucks"
 3 days: "grumble I'll make do with this although it totally sucks"
 3 months: "this is pretty darn good"

Thanks for the chuckle!
 DDoc is simple for what it does, it has somehow hit a sweet spot.

It's not totally random. I've designed one macro language before (ABEL), and have implemented 3 (ABEL, Make, and C preprocessor), so I knew what I wanted. Ddoc is very similar to Make's macro system. BTW, the C preprocessor takes the cake for being both horrendously complicated (most implementations take about 10 years to get right) and woefully inadequate.
May 25 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 2:03 AM, H. S. Teoh wrote:
 I don't know, to me DDoc is still lacking a major feature: a mechanism
 for per-character translation. The problem is that many output formats
 have a different scheme of metacharacters, and some (most notably LaTeX)
 require special transcription of certain characters. Right now, the only
 way to handle this correctly in DDoc is very painful: write macros for
 every special character and logical entity (like mdash, nbsp, and the
 like), which makes it very hard to write. Your text would look like:

 	$(T)his is Mr$(DOT)$(NBSP)T$(APOS)s $(DOLLAR)0$(DOT)02
 	recip$(EACUTE)$(MDASH)as seen on TV$(DOT)

 This problem is mostly evaded when you're targeting a single output
 format. Once you start targeting more than a single output format, the
 number of required macros grows exponentially. Making the DDoc source
 targetable to *arbitrary* output formats requires practically wrapping
 every character inside a macro, which is impractical.

 To work around this problem with the current version of DDoc, you'd need
 an external utility to do the transcriptions for you, which is a hassle.

ESCAPES has been recently defined to partially fix that. Also, LaTeX has about the same limitation. Someone defined an "ActiveTeX" derivative in which each character was active (and therefore potentially definable as a macro). As far as I know it didn't catch on, which may be a sign that people were okay without that capability. Andrei
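
The external transcription step mentioned in the quoted post could look roughly like the following. This is only a sketch in Python, not how Ddoc's ESCAPES macro is implemented: the ESCAPE_TABLES mapping and the transcribe helper are hypothetical names, and the tables cover just a handful of characters for illustration.

```python
# Hypothetical per-target character tables; real ones would be much larger.
ESCAPE_TABLES = {
    "html": {"<": "&lt;", ">": "&gt;", "&": "&amp;"},
    "latex": {"$": r"\$", "&": r"\&", "%": r"\%", "é": r"\'e"},
}

def transcribe(text, target):
    """Replace each character that is special in `target` with its escaped form."""
    table = str.maketrans(ESCAPE_TABLES[target])
    return text.translate(table)

sample = "recipé & $0.02"
print(transcribe(sample, "html"))   # é is left as-is for HTML
print(transcribe(sample, "latex"))  # é becomes \'e for LaTeX
```

The point of the table-per-target layout is that the source text can stay in plain UTF-8, and only the output stage needs to know which characters a given format treats as special.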
May 26 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 4:02 PM, H. S. Teoh wrote:
 On Sun, May 26, 2013 at 03:03:30AM -0400, Andrei Alexandrescu wrote:
 ESCAPES has been recently defined to partially fix that.

 Is it working now?

Yes.
 Oh? I thought TeX already had the capability. Well, at least, you could
 redefine the default escape character "\" to be basically anything,
 including a letter, so you can achieve strange things that way. I'm not
 saying that's a good design though.

I think you can configure things that way, but by default most characters are not active.
 What I'm more concerned with was how to write DDocs that target output
 formats with incompatible metacharacters or different foreign character
 encodings. For example, if the docs contained a character like é, I'd
 like to be able to specify that it should be translated to \'e when
 targeting LaTeX, and left as-is in HTML, for example. I *could* define a
 macro $(EACUTE) for this purpose, of course, but it makes writing DDocs
 rather painful (why should I resort to $(EACUTE) if the DDoc input is
 already UTF-8 and can already represent such a character directly?).

Agreed.
 Another annoyance, that somebody else already mentioned, is how to wrap
 paragraphs in $(P ...) correctly, as is required for (X)HTML. Currently
 we only have linebreaks, which do not reliably translate to <p> and
 </p> with the correct nesting. I've tried to hack around that but still
 cannot get it working correctly in all possible cases. This is rather
 disappointing, since DDoc itself already defines what a paragraph is (or
 at least claims to), yet it doesn't easily lend itself to correct <p>
 nesting. One shouldn't have to dictate the manual use of $(P) in code
 docs in order to generate correct output.

Yah, paragraph breaks are special. LaTeX dedicates them a lot of attention (inserts \parbreak for two \n\n, collapses several consecutive \parbreak occurrences into one etc). Probably ddoc could do better at paragraphs.
 So in short, DDoc as it stands is quite a nice, clean, well-designed
 macro expansion system, but it falls a bit short of being a nice
 *documentation* generation system.

Agreed. Andrei
May 26 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, May 25, 2013 20:10:53 Borden wrote:
 Good afternoon, all,
 
 I would still like to compile the D Lang Spec into EPUB (and
 possibly other formats) but, as we discussed in these threads:
 
 http://forum.dlang.org/thread/bsbdpjyjubfxvmecwhjl forum.dlang.org
 http://forum.dlang.org/thread/uzdngvjzexukbgkxdzpi forum.dlang.org
 
 having the D Lang Specification written in DDoc macros is making
 it extremely difficult to work with.
 
 I ask, therefore, what opposition would there be to me rewriting
 the DLang Spec files into another format that will be easier to
 parse and compile for the website, PDF, Latex, eBook and other
 formats? If the answer is 'minimal', 'go ahead' or 'it's your
 funeral', then my follow-up question is 'what format would be the
 easiest to write, debug and maintain?'
 
 For greater clarity, I am NOT proposing to rewrite the
 DDoc-generated library documentation or any other pages outside
 of the spec. In the makefile, they are defined as the files
 covered in $(SPEC_ROOT).
 
 With regards,

Can you please give concrete examples of what doesn't work with ddoc? On the whole, I find ddoc to work extremely well. Depending on what your problem is, it may be the case that the macros in question just need to be rearranged or redesigned. Or maybe we could add a fairly simple feature to ddoc to solve the problem.

Certainly, my natural reaction is to be against rewriting any of dlang.org in something other than ddoc. It's all in ddoc right now, so it's quite consistent, and aside from you, I'm not aware of anyone complaining about it any time recently.

- Jonathan M Davis
May 25 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/25/2013 8:55 PM, Jonathan M Davis wrote:
 3) Again using LINK2, if I were to delete the LINK2= line from
 doc.ddoc and forget to readd it, my experience is that dmd -D
 will quietly drop instances of $(LINK2) without telling me.

 Then perhaps dmd should be fixed so that it complains. That's a quality of implementation issue and probably easily fixed.

It's quite deliberate, is not a QoI issue, and doesn't need to be fixed.
 4) Again using the same example, if LINK2 gets defined in
 multiple DDoc files, how do I know for certain which definition
 it calls when dmd runs against the files?

 Again. That's a QoI issue. We can probably make the compiler give a warning or error in that case.

Again, this is deliberate. Macros are set up so that the last one overrides all the previous ones, enabling a hierarchy of them using ddoc files. It's a simple form of 'inheritance'.
May 25 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/25/2013 9:59 PM, Borden wrote:
 On Sunday, 26 May 2013 at 04:57:12 UTC, Borden wrote:
 On Sunday, 26 May 2013 at 04:30:46 UTC, Walter Bright wrote:
 Again, this is deliberate. Macros are set up so that the last one overrides
 all the previous ones, enabling a hierarchy of them using ddoc files. It's a
 simple form of 'inheritance'.

 And perhaps this point could be clarified (and, when I next attack the source I'll test it). I have one.ddoc two.ddoc and src.dd. In src.dd, I use $(MY_MACRO x). one.ddoc has the line MY_MACRO=<p>Called one on $1</p>; two.ddoc has the line MY_MACRO=<p>Called two on $1</p>. So, I now run dmd -o- -D one.ddoc two.ddoc src.dd. What does src.html say?

 and by $1 I mean, of course, $0.

The lexically last definition of MY_MACRO is used.
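
The override rule described above can be modelled in a few lines. The sketch below is a toy Python model, not dmd's implementation: the load_macros and expand helpers are hypothetical, the "files" are in-memory strings, and only flat (non-nested) calls with $0 substitution are handled.

```python
import re

def load_macros(ddoc_texts):
    """Merge NAME=VALUE definitions in order; later definitions override earlier ones."""
    macros = {}
    for text in ddoc_texts:
        for line in text.splitlines():
            if "=" in line:
                name, value = line.split("=", 1)
                macros[name.strip()] = value.strip()
    return macros

def expand(source, macros):
    """Expand flat $(NAME args) calls, substituting the whole argument text for $0.
    Undefined macros expand to nothing, as the thread reports for dmd -D."""
    def repl(match):
        name, args = match.group(1), match.group(2).strip()
        return macros.get(name, "").replace("$0", args)
    return re.sub(r"\$\((\w+)([^()]*)\)", repl, source)

one = "MY_MACRO=<p>Called one on $0</p>"   # stands in for one.ddoc
two = "MY_MACRO=<p>Called two on $0</p>"   # stands in for two.ddoc
macros = load_macros([one, two])
result = expand("$(MY_MACRO x)", macros)
print(result)
```

Under this model, listing a more specific macro file after a general one is enough to override just the definitions that differ.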
May 25 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/25/2013 10:34 PM, Jonathan M Davis wrote:
 My main complaint about ddoc is actually not a complaint about ddoc but about
 html. I find it very annoying to have to put $(P ) around every paragraph. Stuff
 like LaTeX does that automatically based on blank lines, which is way better
 IMHO, but if you're targeting HTML, then unfortunately, you need to mark
 paragraphs. The only way to fix that with regards to ddoc would be to make it
 so that ddoc understood that blank lines meant new paragraphs and inserted
 <p></p> appropriately, when generating html, but that would make it so that
 ddoc was less general, and there might be other negatives to that I haven't
 thought of. So, we just get to deal with $(P ) I guess.

The issue with implied paragraph breaks is that then ddoc would have to get a lot smarter to avoid putting $(P ) around everything with a blank line, and then you are already down the path of creating a markup language, not a macro language.
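
To make the difficulty concrete, here is a deliberately naive Python sketch of implied paragraph breaks. It is not anything dmd does: wrap_paragraphs is a hypothetical helper, and its rule for skipping blocks (anything that already starts with a macro call) is a crude assumption chosen only for illustration.

```python
import re

def wrap_paragraphs(text):
    """Wrap blank-line-separated blocks in $(P ...), leaving blocks that
    already start with a macro call untouched (a deliberately crude rule)."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    wrapped = []
    for block in blocks:
        if block.startswith("$("):
            wrapped.append(block)
        else:
            wrapped.append("$(P " + block + ")")
    return "\n\n".join(wrapped)

doc = "First paragraph.\n\n$(TABLE rows)\n\nSecond paragraph,\nstill going."
print(wrap_paragraphs(doc))
```

The crude rule already misfires on a paragraph that merely begins with an inline macro such as $(B dmd), which would be left unwrapped. Getting that right means teaching the tool which macros are block-level, and that is the step towards a markup language described above.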
May 26 2013
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 4:20 PM, H. S. Teoh wrote:
 Wait, why not just make DDoc wrap it in $(P ) instead of<p></p>? That
 way, output formats that don't care can simply define $(P) to be the
 text followed by a line break, and you're done.

I thought it already does that.

git grep -i '<p>' **/*.{c,h}
src/doc.c:P = <p>$0</p>\n\

Andrei
May 26 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 9:04 PM, Borden wrote:
 Before we get too off topic in this thread, is there demand for an
 xhtml5.ddoc file? If so, I'd like to make some changes to the other DDoc
 files as to minimise code reuse and minimise ambiguity in 'inherited'
 macro definitions. I'm willing to put in the time but I can't do it alone.

 If there's no demand, that's OK, too, and I'll put the matter to rest.

I think it would be great. In particular, an ebook format would be good. You may want to wait until https://github.com/D-Programming-Language/dlang.org/pull/271 is in. It systematizes macros a lot and it may offer answers to many of your questions. Andrei
May 26 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 10:45 PM, Borden wrote:
 1) doc.ddoc and html.ddoc define many of the macros that I need, but
 some of them I'll need to redefine for HTML5. Walter's response to how
 dmd resolves 'macro inheritance' doesn't clarify for me whether I should
 override the non-HTML5-compliant macros or rewrite the whole file. I
 hope it's not the latter.

Just define the macros that differ and when compiling docs do this:

dmd $FLAGS doc.ddoc html.ddoc html5.ddoc myfile.dd

That way the macros defined in html5.ddoc will override those in the previous files.
 Also, I don't understand the difference between doc.ddoc and html.ddoc -
 what is each file supposed to do, exactly?

doc.ddoc is the general skeleton file for defining the online documentation. html.ddoc contains HTML-specific macros only, without having anything to do with our site's specific format.
 2) Once I have my xhtml5.ddoc, it won't compile the .dd sources correctly
 because many of the .dd files aren't written in a manner where simple
 macro expansion will generate HTML5 compliant code. To solve this, I'll
 need guidance on how to change the .dd files to get xhtml.ddoc to work
 without breaking the other files.

 To this end it would be most helpful to develop a standard list of
 macros to use in the DLang spec sources and edit the non-conforming .dd
 files to follow it. It seems right now that the source files define
 whatever macros they like and leaves the onus on figuring out what each
 means on the .ddoc files.

Yup, you got your work cut out for you. Then again, wait until that diff is merged. It fixes a bunch of problems. Andrei
May 26 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/27/13 12:09 AM, Borden wrote:
 On Monday, 27 May 2013 at 03:32:54 UTC, Andrei Alexandrescu wrote:
 doc.ddoc is the general skeleton file for defining the online
 documentation. html.ddoc contains HTML-specific macros only, without
 having anything to do with our site's specific format.

 For greater clarity, html.ddoc will produce a generic, HTML-compliant file. In contrast, doc.ddoc will add all of the dlang.org-specific decorations and boilerplate?

No. Think of html.ddoc as a library of macros for HTML. They lack the "main" file and other things. Andrei
May 26 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/27/13 6:48 PM, Borden wrote:
 Oh, and another thing: XHTML adopts the XML practice of only defining
 the lt, gt and amp entities and no others (like nbsp, mdash, accented,
 or non-Latin characters).

 Since Unicode is, by and large, universal, I've read that the
 recommended practice for including characters not on a standard US
 keyboard is to copy them from a character map and save the file in a
 Unicode encoding. I intend to follow this guidance in writing the
 (x)html.ddoc template.

 As such, should I keep the existing 'entity' macros or use the Unicode
 characters in the DLang spec source files? I imagine that Andrei will
 immediately comment that .tex files are supposed to be in ASCII.
 Suggestions?

The LaTeX configuration won't use your ddoc template. Knock yourself out. Andrei
May 27 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/27/13 9:32 PM, Jonathan M Davis wrote:
 On Monday, May 27, 2013 21:29:41 Andrei Alexandrescu wrote:
 On 5/27/13 6:48 PM, Borden wrote:
 Oh, and another thing: XHTML adopts the XML practice of only defining
 the lt, gt and amp entities and no others (like nbsp, mdash, accented,
 or non-Latin characters).

 Since Unicode is, by and large, universal, I've read that the
 recommended practice for including characters not on a standard US
 keyboard is to copy them from a character map and save the file in a
 Unicode encoding. I intend to follow this guidance in writing the
 (x)html.ddoc template.

 As such, should I keep the existing 'entity' macros or use the Unicode
 characters in the DLang spec source files? I imagine that Andrei will
 immediately comment that .tex files are supposed to be in ASCII.
 Suggestions?

 The LaTeX configuration won't use your ddoc template. Knock yourself out.

 Yes, but he was wondering if he could change the .dd files to use Unicode characters directly instead of macros, which _would_ affect the LaTeX configuration.

Prolly that wouldn't be a good idea. Macros are the traditional level of indirection that solve all problems... Andrei
May 27 2013
prev sibling parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Borden" <2013 bordenrhodes.com> wrote in message 
news:qglzffgfawrzjguvttus forum.dlang.org...
 I would still like to work on compiling the DLangSpec into HTML5, but I've 
 noticed that pull request 271 hasn't been touched in over 4 months. 
 Further, I sent in a pull request to move the DLangSpec source files into 
 their own folders and haven't gotten so much as a 'worst pull request 
 ever' in response.

 I fully appreciate that people are very busy - including me - so I want to 
 know if there's anything I can do to help things along. At least with 
 respect to pull request 271, is there anything that I can do to help get 
 it merged into master so I can get working on the HTML5 DDoc?

To be honest, you just have to keep bugging people. I mostly review compiler pulls, and I am much much more likely to review something that shows up in my inbox than something that sits patiently in the list. If you make enough noise somebody will eventually reply.
Jun 29 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/22/2013 11:15 PM, Borden wrote:
 Ping! I'm just bumping this thread to see where the status of integrating pull
 request 271 is and whether there's anything I can do to expedite matters. I've
 noticed that there are some changes to dlang.org's website source. Are these
 changes working towards HTML 5 compliance? (or, at least, the part of HTML 5
 that probably won't change).

271 is stuck at the moment because it can't be auto-merged. https://github.com/D-Programming-Language/dlang.org/pull/271 (BTW, including links to what you're referring to is a helpful practice and makes it much easier for others to weigh in.)
Jul 23 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Saturday, 25 May 2013 at 23:28:46 UTC, Jonathan M Davis wrote:
 aside from you, I'm not aware of anyone complaining about it 
 any time recently.

Good evening, Jonathan,

I'm not sure whether you mean that nobody's complained recently about the spec being in DDoc lately, because, as in my first link, I found that more people disliked the macros feature of DDoc (2 + me) than liked it (0).

To answer your question, regarding concrete examples, these issues are in the context of wanting to translate the DLangSpec pages into HTML5 so that I can compile them into an ePUB document:

1) the DLangSpec files (pick any one) use SECTION# macros, where # is a number. In the DDoc conversion files to HTML, these SECTION# macros convert to <h#> tags and encase the contents within them. However, I'm not aware of any facility within DDoc, short of hand-writing a parser, to allow these SECTION# macros to be nested in order to take advantage of HTML5's <section> tags.

2) The macros are not self-documenting. For example, consider $(LNAME2 pointers, Pointers) in arrays.dd. The easiest way, I know, to figure out what $(LNAME2) means is to read the posix.mak to see that arrays.dd gets pumped through ddoc.dd. Now, a search through doc.ddoc to find the declaration LINK2=<a href="$1">$+</a> at last tells me that argument 1 is the path to the link and everything that follows that is the text to appear in the link. The point is that, as I struggle through modifying the existing .ddoc templates to compile to HTML5, I need to keep flipping back and forth between the source and the .ddoc to make sure that anything I'm redefining I'm doing correctly.

3) Again using LINK2, if I were to delete the LINK2= line from doc.ddoc and forget to readd it, my experience is that dmd -D will quietly drop instances of $(LINK2) without telling me.

4) Again using the same example, if LINK2 gets defined in multiple DDoc files, how do I know for certain which definition it calls when dmd runs against the files?

5) I find that a lot of the DLangSpec is written from an HTML point of view, so maybe it just needs rewriting to make the macros descriptive. For example, consider $(B dmd) and $(B -gc) on lines 881-882 of abi.dd. By default, these get converted into <b>dmd</b> and <b>-gc</b>. Say I want commands (like dmd) to be bolded but I want command-line arguments not to be bolded. There's no way to write B= to single out some $(B)s and not others. Now, I know the knee-jerk response is "Yes, but HTML works the exact same way." That's true, but CSS *does* give you a bunch of selectors to cherry-pick, say, "only the <b> tags of class X" or "the element with this id." Meaning that all I have to do is find the Bs I want to change and add a class="" without having to worry about updating any of the other Bs. Is there a simple way to do this in DDoc?

These are just a few of the observations that I have. As I said in the other threads, my ePUB effort has ground to a halt because I find that I'm fighting to read the spec sources rather than figuring out how to produce clean and pretty HTML5 code that can get compiled into an ePUB.
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 00:28:05 UTC, Andrei Alexandrescu wrote:
 To generate several formats from one source, a macro system is 
 needed. One interesting thing I figured about macro systems is 
 they're all dirty - they can't be really considered "languages" 
 because they intermix the programming part with the very output 
 generated. So, what macro system would you use? (Actual 
 question.) Look at m4 - it won't win any beauty contests, 
 either, and it's enormously complicated. DDoc is simple for 
 what it does, it has somehow hit a sweet spot.

Good evening, Professor,

I'm not arguing with the macro system or proposing a replacement. I think, for what it's designed to do, it works perfectly well and, you're right, is somewhat faster and cleaner than XML tags. (and this is from someone who's biased in favour of HTML tags)

My contention is that, for the purposes of writing lengthy, non-code documentation like the DLang spec (I'm not referring to any other documentation or pages on the site), enclosing the entire exposition in macros has made the source too inflexible for me to work with without awkward workarounds or having to write my own parser. Again, the idea is to use the features of HTML5 and compile the DLang spec into an ePub document that I can read on my brand-new Kobo.

Therefore, I'm not proposing a radical overhaul of DDoc or recommending that all known DDoc be recoded in HTML, as I mention in my other forum threads. Rather, I'm mentioning that, for my purposes, the DLang Spec source is too difficult for me to work with in its current state and I'm offering to recode it in a language that will allow me to accomplish what I want to do; keep the DLang spec easy to read, write and maintain; and tread on as few toes as possible.
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
I want to keep this discussion focussed on the DLang spec source 
code. If we want to debate the features of DDoc, we should do it 
in another thread.

However, as not to appear full of criticism but short of ideas, I'm 
going to break my own rule and suggest, at least for the purposes 
of solving some of the issues I've run into with the DLang spec 
source, that integrating some wiki-markup into DDoc may help. For 
example:

1) Allowing sections to be defined using == Heading == or === 
Heading === instead of $(HEADING ) or variants. The advantage 
that Wiki syntax has over macro-syntax is that it automatically 
works out the section nesting (which is essential for building 
tables of contents in things such as, hint hint, eBooks) whereas 
macros can only do it if the subheadings are nested as arguments.
1a) Using ==Headings== and the existing /** */ code standards, 
DDoc could have a predefined $(TOC) macro which would 
auto-generate the TOC. /** */ would form the main headings and 
==Heading== would be the subheadings, prettily nested when 
formatted.
2) Adopting Latex's rule that a double line break means a new 
paragraph. This will effectively make the $(P) macros rampant in 
the DLang spec documentation unnecessary.
3) Defining tables using the +---+ syntax. I know that this will 
be unpopular due to the existing /++ documentation code rules 
(and thus is open to alternatives). However, one must admire how 
simply Wiki markup has elegantly solved a problem that Latex and 
XML haven't.
4) Using either * or - to indicate bullet points, similar to Wiki 
markup. Again, I know that it'll have to be coded as not to 
confuse the parser with /** */ and operands.
4a) Maybe #) to indicate ordered lists? (again, similar to Wiki 
markup)
5) Use the [[Link|Link name]] instead of $(Link) macros to 
cross-link. By default, Link would be a reference to some other 
DDoc and allow links to be handled automagically.

How will this all help the DLang spec? Well, if the spec could be 
rewritten entirely in a /** */ block, with reasonable macro use, 
then couldn't it be parsed more readily into the necessary 
formats? Wouldn't it also make the source more readable and 
editable without all of the nested parentheses? Wouldn't the 
syntax be self-documenting?

Anyway, just throwing stuff against the wall to see what sticks...
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 01:22:17 UTC, Borden wrote:
 2) Adopting Latex's rule that a double line break means a new 
 paragraph. This will effectively make the $(P) macros rampant 
 in the DLang spec documentation unnecessary.

Oops. I realised that this has already been done. OK, so I guess the question is why does the DLang spec need $(P) macros? How could it be rewritten without them and let dmd worry about them?
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 01:57:16 UTC, Andrei Alexandrescu wrote:
 This is a worthy goal. We manage to generate mobi files for the 
 spec (and Phobos in a pull request), is the ebook format very 
 different?

 Andrei

Good evening, Professor,

I'm still working through the ePUB standard, but, from what I can tell, the two are very similar. According to Wikipedia, ePub obsoletes Mobi (despite Amazon stubbornly enforcing Mobi (or some variant) for its Kindle). They both rely on similar standards, including OPF and NCX file formats, and ePUB uses XHTML and zip compression, thereby avoiding reinventing the wheel to the extent possible (although one would think that Docbook would make more sense for, you know, eBooks).

Actually, in generating my first ePUB of the DLang spec, I simply used the same mobi input files, copied-and-pasted two boilerplate ePUB files and zipped the whole thing up. Aside from some egregious formatting on my Kobo, it works. Of course, I want to take the time to generate a proper ePUB 3 file and use as much of the standard as possible to advantage.

Naturally, the goal is to extend whatever scripts I write to generate a DLang spec ePub to generate DDoc documentation eBooks, too, so I'm not entirely a one-issue person...
May 25 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, May 26, 2013 02:44:30 Borden wrote:
 On Saturday, 25 May 2013 at 23:28:46 UTC, Jonathan M Davis wrote:
 aside from you, I'm not aware of anyone complaining about it
 any time recently.

Good evening, Jonathan,

I'm not sure whether you mean that nobody's complained recently about the spec being in DDoc, because, as in my first link, I found that more people disliked the macros feature of DDoc (2 + me) than liked it (0).

AFAIK, your recent posts on ddoc are the first that anyone has complained about it in quite some time. There are plenty of folks who want various improvements to the online documentation, but that doesn't necessarily require doing anything to ddoc, and it's rarely the case that someone complains about ddoc itself.
 To answer your question, regarding concrete examples, these
 issues are in the context of wanting to translate the DLangSpec
 pages into HTML5 so that I can compile them into an ePUB document:
 
 1) the DLangSpec files (pick any one) use SECTION# macros, where
 # is a number. In the DDoc conversion files to HTML, these
 SECTION# macros convert to <h#> tags and encase the contents
 within them. However, I'm not aware of any facility within DDoc,
 short of hand-writing a parser, to allow these SECTION# macros to
 be nested in order to take advantage of HTML5's <section> tags.

Normally, you'd nest things by nesting macros. e.g.

$(NESTED stuff $(NESTED more stuff $(NESTED yet more stuff) $(NESTED other stuff)))

But I'm afraid that my understanding of html (and particularly html 5) is limited enough that you would have to give explicit code samples for me to see what you can't convert.

However, ddoc should allow you to do pretty much anything that involves simply transforming the content of a macro to something else. The macro takes a set of arguments and then creates something new with them by rearranging them and adding stuff around them and the like. So, $(MACRO foo, bar, fiddly) can become <foo><fiddly>bar</fiddly></foo> or foo!bar.fiddly or whatever other combination of textual replacement and reordering that you want to do. It's things that require counting what's there or generating something elsewhere in the document based on macros (such as a table of contents or an index) which it can't do on its own (though it would be trivial to have another program read the ddoc and manipulate it to create sections for table of contents and the like if you want to do that with a document that you're writing with ddoc).

The exact set of macros used with the online documentation may very well be too specific to html 4, and it may be that the macros will have to be rewritten and moved around in the documentation, but the macro system itself will almost certainly do what you want.
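As a rough illustration of the textual replacement described above, here is a minimal Python sketch of a Ddoc-like expander. It is a toy model written for this thread, not dmd's implementation: the function names are made up, whitespace around comma-separated arguments is simply stripped, and undefined macros expand to nothing (the silent behaviour discussed elsewhere in the thread). It supports $0 (all of the argument text), $1 to $9 (individual arguments) and $+ (everything after the first argument).

```python
import re

def split_top_level(text):
    """Split on commas that are not inside nested parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts

def find_close(text, open_pos):
    """Return the index of the ')' matching the '(' at open_pos, or -1."""
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1

def apply_macro(name, argtext, macros):
    """Substitute the arguments into the macro body, then expand the result."""
    body = macros.get(name, "")  # assumption: undefined macros vanish silently
    parts = split_top_level(argtext)
    args = [p.strip() for p in parts]
    rest = ",".join(parts[1:]).strip()

    def substitute(match):
        key = match.group(1)
        if key == "0":
            return argtext
        if key == "+":
            return rest
        index = int(key) - 1
        return args[index] if index < len(args) else ""

    return expand(re.sub(r"\$([0-9+])", substitute, body), macros)

def expand(text, macros):
    """Expand every $(NAME args) call in text using the macros dictionary."""
    out, i = [], 0
    while i < len(text):
        if text.startswith("$(", i):
            close = find_close(text, i + 1)
            if close != -1:
                pieces = text[i + 2:close].split(None, 1)
                name = pieces[0] if pieces else ""
                argtext = pieces[1].strip() if len(pieces) > 1 else ""
                out.append(apply_macro(name, argtext, macros))
                i = close + 1
                continue
        out.append(text[i])  # unbalanced or ordinary text is copied as-is
        i += 1
    return "".join(out)

html_style = {"MACRO": "<$1><$3>$2</$3></$1>"}
other_style = {"MACRO": "$1!$2.$3"}
source = "$(MACRO foo, bar, fiddly)"
print(expand(source, html_style))
print(expand(source, other_style))
```

The same source line produces either output purely by swapping the macro table, which is the property being argued for. What the sketch cannot do, just like the macro system it imitates, is count sections or collect headings for a table of contents.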
 2) The macros are not self-documenting. For example, consider
 $(LNAME2 pointers, Pointers) in arrays.dd. The easiest way, I
 know, to figure out what $(LNAME2) means is to read the posix.mak
 to see that arrays.dd gets pumped through ddoc.dd. Now, a search
 through doc.ddoc to find the declaration LINK2=<a
 href="$1">$+</a> at last tells me that argument 1 is the path to
 the link and everything that follows that is the text to appear
 in the link. The point is that, as I struggle through modifying
 the existing .ddoc templates to compile to HTML5, I need to keep
 flipping back and forth between the source and the .ddoc to make
 sure that anything I'm redefining I'm doing correctly.

The same goes for any function name. You frequently have to look them up to see exactly what they do. If they have better names, that helps, but how self-documenting a macro is is completely up to how well it was named. That has nothing to do with ddoc itself.
 3) Again using LINK2, if I were to delete the LINK2= line from
 doc.ddoc and forget to readd it, my experience is that dmd -D
 will quietly drop instances of $(LINK2) without telling me.

Then perhaps dmd should be fixed so that it complains. That's a quality of implementation issue and probably easily fixed.
 4) Again using the same example, if LINK2 gets defined in
 multiple DDoc files, how do I know for certain which definition
 it calls when dmd runs against the files?

Again. That's a QoI issue. We can probably make the compiler give a warning or error in that case.
 5) I find that a lot of the DLangSpec is written from an HTML
 point of view, so maybe it just needs rewriting to make the
 macros descriptive. For example, consider $(B dmd) and $(B -gc)
 on lines 881-882 of abi.dd. By default, these get converted into
 <b>dmd</b> and <b>-gc</b> Say I want commands (like dmd) to be
 bolded but I want command-line arguments not to be bolded.
 There's no way to write B= to single out some $(B)s and not
 others. Now, I know the knee-jerk response is "Yes, but HTML
 works the exact same way." That's true, but CSS *does* give you a
 bunch of selectors to cherry-pick, say, "only the <b> tags of
 class X" or "the element with this id." Meaning that all I have
 to do is find the Bs I want to change and add a class="" without
 having to worry about updating any of the other Bs. Is there a
 simple way to do this in DDoc?

Ddoc is pure macros. If you want different stuff to be treated differently, you need different macros. But css can be used just fine with the html generated by ddoc, so if css supports what you're looking for, then I would think that you'd be able to do it with css.

Regardless, the exact set of macros used with dlang.org are definitely targeted at html, so it's quite possible that we'll need a different set of macros to get it to convert to other formats more easily (though it already translates to one ebook format as well as latex - though Andrei is working on improving the latex stuff). So, my guess is that this is not a problem with ddoc but rather a problem with how ddoc is being used.

- Jonathan M Davis
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
Good evening, Professor,

On Sunday, 26 May 2013 at 02:05:55 UTC, Andrei Alexandrescu wrote:
 What vexes me is that all the sugar you propose goes against 
 what you opened with...

I'm not trying to cause any offence, and I apologise if any of my phrasing or comments are construed that way. I know that I can be a little bit terse at times but I hope it's taken in the best possible way because I have full respect for the design and implementation of the language.

On Sunday, 26 May 2013 at 02:05:55 UTC, Andrei Alexandrescu wrote:
 I can hypothesize that the shortest  path between where we are
 and what you're trying to accomplish is a few dozens of macro
 definitions. Did you try doing that and failed?

Indeed, and the attempt has run into some snags:

1) One of the first problems I ran into was coming up with rules for spec.dd. My original objective was to enclose the $(TOC) macro in <nav> tags, consistent both with the HTML5 spec and ePUB3. However, <p> tags are not allowed within the <nav> tags, but I also don't want to strip out the explanatory information. I don't know how to define TOC to keep the $(P) macro outside of the <nav> element which will enclose the TOCENTRY items.

2) Consider, for example, parsing arrays.dd (my comment can be easily applied to any other file). Unless I'm miscounting parentheses, $(H4) macros are not being used within $(H3) macros. Therefore, how do I get DDoc to parse the file so that it ends up with nested <section> tags? For example:

    <section><h3>Dynamic Arrays</h3>
    <section><h4>Array Declarations</h4>
    Content
    </section></section>

3) Consider, for example, this part from abi.dd:

    $(GRAMMAR
    $(I MangledName):
        $(B _D) $(I QualifiedName) $(I Type)
        $(B _D) $(I QualifiedName) $(B M) $(I Type)

    $(I QualifiedName):
        $(I SymbolName)
        $(I SymbolName) $(I QualifiedName)

    $(I SymbolName):
        $(I LName)
        $(I TemplateInstanceName)
    )

Say I want to style this using a description list, the <dl> tag. That's easy enough, but now how do I tell DDoc to tag the $(I) macros using <dt> and <dd> tags?

4) Furthermore (still referring to the example above, because the issue applies to other areas), how do I tell DDoc that $(I)s within a $(GRAMMAR) macro are to be formatted using description list syntax, but keep the other $(I) macros as regular <i> elements?

5) The link-related macros appear, by and large, to use relative URLs. If I'm compiling only the DLang Spec into an ePUB, the standard, I believe, requires that the links be resolvable. That's easy enough if the relative URL in question points to another page in the spec. However, if the link points to another page on the website or a library document, which (for now) won't be in the ePUB, is the only way to identify and fix those links going to be by hand?

These are the problems that I've run into thus far. I'm doing my best to appreciate the design and theory of DDoc, but maybe it's too much of a paradigm shift for me and I end up fighting the macros?
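One way to picture the nesting asked about in point 2 is as a post-processing pass over the flat HTML that the existing macros already produce. The Python sketch below is only an illustration under stated assumptions: the function name nest_sections is invented here, headings are assumed to be plain <h1> to <h6> tags without attributes, and nothing like this exists in dmd or in the dlang.org makefiles as far as this thread shows.

```python
import re

# Matches <h1>...</h1> through <h6>...</h6>; assumes headings carry no attributes.
HEADING = re.compile(r"<h([1-6])>(.*?)</h\1>", re.S)

def nest_sections(html):
    """Wrap each heading and the content that follows it in a <section>,
    nesting deeper headings inside the sections of shallower ones."""
    out = []
    open_levels = []  # heading levels of the <section> elements still open
    pos = 0
    for match in HEADING.finditer(html):
        out.append(html[pos:match.start()])
        level = int(match.group(1))
        # Close every open section at the same or a deeper level.
        while open_levels and open_levels[-1] >= level:
            out.append("</section>")
            open_levels.pop()
        out.append("<section>" + match.group(0))
        open_levels.append(level)
        pos = match.end()
    out.append(html[pos:])
    out.extend("</section>" for _ in open_levels)
    return "".join(out)

flat = "<h3>Dynamic Arrays</h3><h4>Array Declarations</h4>Content"
print(nest_sections(flat))
```

The stack of open heading levels is exactly the "counting" that a pure textual macro cannot express, which is why the sketch runs after macro expansion rather than trying to redefine H3 and H4.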
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 03:51:48 UTC, Walter Bright wrote:
 It's not totally random. I've designed one macro language 
 before (ABEL), and have implemented 3 (ABEL, Make, and C 
 preprocessor), so I knew what I wanted. Ddoc is very similar to 
 Make's macro system.

 BTW, the C preprocessor takes the cake for being both 
 horrendously complicated (most implementations take about 10 
 years to get right) and woefully inadequate.

Good evening, Walter,

I noticed the similarities between DDoc macros and Make immediately. Again, I think the documentation system you designed is excellent. The 'sugar' I suggested in an earlier post seemed, at least to me, in line with the general 'common sense syntax' that you implemented elsewhere - such as with defining code, variables, dates, authors, paragraphs, etc.

Then again, you're dealing with an (aspiring) accountant, not a computer scientist, so I only have experience in trying to make complex things look pretty and not caring about all of that optimisation and implementation stuff!

Again, I have nothing but the highest respect for the work you've done. Think of it this way: if I didn't admire D so much, I wouldn't be so determined to get its documentation onto eReaders!
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 03:56:08 UTC, Jonathan M Davis wrote:
 AFAIK, your recent posts on ddoc are the first that anyone has 
 complained about it in quite some time. There are plenty of
 folks who want various improvements to the online documentation,
 but that doesn't necessarily require doing anything to ddoc, and it's
 rarely the case that someone complains about ddoc itself.

That's fair, and it has probably only come up now because I've decided - granted, with very little experience in DDoc - to kick the proverbial hornets' nest by diving head-first into it and doing crazy things with the source.

On Sunday, 26 May 2013 at 03:56:08 UTC, Jonathan M Davis wrote:
 Normally, you'd nest things by nesting macros. e.g.

 $(NESTED stuff $(NESTED more stuff $(NESTED yet more stuff) 
 $(NESTED other
 stuff)))

Indeed. I suppose that, in addition to my grievances about error checking, I'm suggesting that DDoc should include parenthesis closure checking?

On Sunday, 26 May 2013 at 03:56:08 UTC, Jonathan M Davis wrote:
 However, ddoc should allow you to do pretty much anything that 
 involves simply transforming the content of a macro to 
 something else. The macro takes a set of arguments and then
 creates something new with them by rearranging them and adding 
 stuff around them and the like.

And maybe that's what my biggest frustration with the macros is (or at least how they're implemented in the DLang spec): they read like an abstracted wrapper for HTML, and someone like me immediately yearns for the extra features that got simplified out.

Say, for example, we have a $(B) macro, and I want some of them to have ids or classes and others not (since my eventual CSS file will have special formatting rules for them). To add this functionality, I would have to find all of the $(B)s and rewrite them to say $(B id, class, content). For each one that I miss, I'm going to have an empty <b> element with its id set to the content!

On Sunday, 26 May 2013 at 03:56:08 UTC, Jonathan M Davis wrote:
 The exact set of macros used with the online documentation may 
 very well be too specific to html 4, and it may be that the
 macros will have to be rewritten and moved around in the
 documentation

And now I think we're getting to the heart of the problem. I might have a more favourable opinion of macros if they were more descriptive of the content in the DLang spec source files. That would allow a fair bit more flexibility. Still, I don't think there's any avoiding that any macros requiring nested formatting or special parameters will require them to be written with their intended output formats 'in mind' to work correctly. And, of course, I am very reluctant to attack any of this lest I start breaking the website, Latex or PDF generation in getting the DLang spec 'HTML5 ready'. Hence, why I'm at a standstill.
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 04:30:46 UTC, Walter Bright wrote:
 Again, this is deliberate. Macros are set up so that the last 
 one overrides all the previous ones, enabling a hierarchy of 
 them using ddoc files. It's a simple form of 'inheritance'.

And perhaps this point could be clarified (and, when I next attack the source, I'll test it). I have one.ddoc, two.ddoc and src.dd. In src.dd, I use $(MY_MACRO x). one.ddoc has the line MY_MACRO=<p>Called one on $1</p>; two.ddoc has the line MY_MACRO=<p>Called two on $1</p>. So, I now run dmd -o- -D one.ddoc two.ddoc src.dd. What does src.html say?
May 25 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 04:57:12 UTC, Borden wrote:
 On Sunday, 26 May 2013 at 04:30:46 UTC, Walter Bright wrote:
 Again, this is deliberate. Macros are set up so that the last 
 one overrides all the previous ones, enabling a hierarchy of 
 them using ddoc files. It's a simple form of 'inheritance'.

And perhaps this point could be clarified (and, when I next attack the source, I'll test it). I have one.ddoc, two.ddoc and src.dd. In src.dd, I use $(MY_MACRO x). one.ddoc has the line MY_MACRO=<p>Called one on $1</p>; two.ddoc has the line MY_MACRO=<p>Called two on $1</p>. So, I now run dmd -o- -D one.ddoc two.ddoc src.dd. What does src.html say?

and by $1 I mean, of course, $0.
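For what it's worth, the rule as Walter states it ("the last one overrides all the previous ones") can be modelled in a few lines of Python. This is a sketch of the stated rule only, not a test of dmd: the helper names load_macros and expand are made up, each definition is assumed to sit on a single NAME=body line, and only $0 and non-nested macro calls are handled.

```python
import re

def load_macros(*ddoc_texts):
    """Read NAME=body lines from each text in order; later definitions win."""
    macros = {}
    for text in ddoc_texts:
        for line in text.splitlines():
            name, sep, body = line.partition("=")
            if sep and name.strip().isidentifier():
                macros[name.strip()] = body.strip()
    return macros

def expand(src, macros):
    """Expand non-nested $(NAME args) calls, replacing $0 with the argument text."""
    def replace(match):
        body = macros.get(match.group(1), "")
        return body.replace("$0", match.group(2).strip())
    return re.sub(r"\$\((\w+)\s*([^()]*)\)", replace, src)

one_ddoc = "MY_MACRO=<p>Called one on $0</p>"
two_ddoc = "MY_MACRO=<p>Called two on $0</p>"
src_dd = "$(MY_MACRO x)"

# Files are loaded in command-line order, so two.ddoc's definition replaces one.ddoc's.
print(expand(src_dd, load_macros(one_ddoc, two_ddoc)))
```

Under that model the definition from two.ddoc is the one used. Whether dmd's output for the exact command line above matches is still something to confirm by running it.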
May 25 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, May 25, 2013 21:30:44 Walter Bright wrote:
 On 5/25/2013 8:55 PM, Jonathan M Davis wrote:
 3) Again using LINK2, if I were to delete the LINK2= line from
 doc.ddoc and forget to readd it, my experience is that dmd -D
 will quietly drop instances of $(LINK2) without telling me.

Then perhaps dmd should be fixed so that it complains. That's a quality of implementation issue and probably easily fixed.

It's quite deliberate, is not a QoI issue, and doesn't need to be fixed.

Hmmmm. Because it's designed with the idea that you can make multiple passes? Well, regardless of why, the fact that it doesn't give an error doesn't harm ddoc's expressiveness. So, questions of whether ddoc is powerful enough or expressive enough to do something (which appears to be the thrust of Borden's complaints) aren't affected by it.

My main complaint about ddoc is actually not a complaint about ddoc but about html. I find it very annoying to have to put $(P ) around every paragraph. Stuff like LaTeX does that automatically based on blank lines, which is way better IMHO, but if you're targeting HTML, then unfortunately, you need to mark paragraphs. The only way to fix that with regards to ddoc would be to make it so that ddoc understood that blank lines meant new paragraphs and inserted <p></p> appropriately, when generating html, but that would make it so that ddoc was less general, and there might be other negatives to that I haven't thought of. So, we just get to deal with $(P ) I guess.

And it's easy enough to write a program to handle the stuff that ddoc _can't_ do (like generate a table of contents from all of your CHAPTER tags) that ddoc's limitations really aren't a big deal, and its flexibility is fantastic.

- Jonathan M Davis
May 25 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, May 25, 2013 at 08:28:06PM -0400, Andrei Alexandrescu wrote:
[...]
 My attitude on DDoc has evolved in threes:
 
 3 minutes: "wtf is this crap"
 3 hours: "this sucks"
 3 days: "grumble I'll make do with this although it totally sucks"
 3 months: "this is pretty darn good"

LOL... Though for me, I think I stopped at the third step (or slightly past that).
 To generate several formats from one source, a macro system is
 needed. One interesting thing I figured about macro systems is
 they're all dirty - they can't be really considered "languages"
 because they intermix the programming part with the very output
 generated. So, what macro system would you use? (Actual question.)
 Look at m4 - it won't win any beauty contests, either, and it's
 enormously complicated. DDoc is simple for what it does, it has
 somehow hit a sweet spot.

I don't know, to me DDoc is still lacking a major feature: a mechanism for per-character translation. The problem is that many output formats have a different scheme of metacharacters, and some (most notably LaTeX) require special transcription of certain characters. Right now, the only way to handle this correctly in DDoc is very painful: write macros for every special character and logical entity (like mdash, nbsp, and the like), which makes it very hard to write. Your text would look like:

	$(T)his is Mr$(DOT)$(NBSP)T$(APOS)s $(DOLLAR)0$(DOT)02
	recip$(EACUTE)$(MDASH)as seen on TV$(DOT)

This problem is mostly evaded when you're targeting a single output format. Once you start targeting more than a single output format, the number of required macros grows exponentially. Making the DDoc source targetable to *arbitrary* output formats requires practically wrapping every character inside a macro, which is impractical.

To work around this problem with the current version of DDoc, you'd need an external utility to do the transcriptions for you, which is a hassle.

T

-- Real men don't take backups. They put their source on a public FTP-server and let the world mirror it. -- Linus Torvalds
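The kind of external transcription utility mentioned above could be as small as a per-format character table. The Python sketch below is a hypothetical illustration, not an existing tool: the table contents and the function name transcribe are assumptions, and it expects to be handed plain text only, since escaping "$" for LaTeX would otherwise mangle the $(MACRO ...) syntax itself.

```python
# Per-format character tables; the entries are illustrative, not exhaustive.
TABLES = {
    "latex": str.maketrans({
        "\u00e9": "\\'e",   # e with acute accent
        "\u2014": "---",    # em dash
        "\u00a0": "~",      # non-breaking space
        "$": "\\$",
        "%": "\\%",
        "&": "\\&",
        "#": "\\#",
        "_": "\\_",
    }),
    "html": str.maketrans({
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
    }),
}

def transcribe(text, fmt):
    """Translate each character of plain text for the given output format."""
    return text.translate(TABLES[fmt])

sample = "Mr.\u00a0T's $0.02 recip\u00e9\u2014as seen on TV."
print(transcribe(sample, "latex"))
print(transcribe(sample, "html"))
```

The source text stays ordinary UTF-8 and only the table changes per target, which avoids wrapping every character in a macro. The hard part that the sketch leaves out is deciding which stretches of a .dd file are plain text and which are macro syntax.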
May 25 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, May 26, 2013 00:32:01 Walter Bright wrote:
 On 5/25/2013 10:34 PM, Jonathan M Davis wrote:
 My main complaint about ddoc is actually not a complaint about ddoc but
 about html. I find it very annoying to have to put $(P ) around every
 paragraph. Stuff like LaTeX does that automatically based on blank lines,
 which is way better IMHO, but if you're targeting HTML, then
 unfortunately, you need to mark paragraphs. The only way to fix that with
 regards to ddoc would be to make it so that ddoc understood that blank
 lines meant new paragraphs and inserted <p></p> appropriately, when
 generating html, but that would make it so that ddoc was less general,
 and there might be other negatives to that I haven't thought of. So, we
 just get to deal with $(P ) I guess.

The issue with implied paragraph breaks is that then ddoc would have to get a lot smarter to avoid putting $(P ) around everything with a blank line, and then you are already down the path of creating a markup language, not a macro language.

Which is why I'm not pushing for any changes in that regard. For some of the stuff that I'm writing in ddoc right now, I considered having the program that does the build add the $(P) macros for me but decided that it was better to just suck it up and use $(P) rather than risk problems with code blocks with blank lines in them and whatnot (I'm using a D program to do the build because it's easier than writing a makefile, and I needed a program to generate the table of contents and index anyway, since ddoc can't do that).

So, I'm not sure what the best solution with regards to $(P) is, and for the moment, it looks like it's just better to put up with it, but it does end up being my #1 annoyance when dealing with ddoc.

- Jonathan M Davis
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Sunday, 26 May 2013 at 06:43:46 UTC, Jonathan M Davis wrote:
 So, questions of whether ddoc is powerful enough or
 expressive enough to do something (which appears to be the 
 thrust of Borden's complaints) aren't affected by it.

How I'd rewrite DDoc from scratch as its own markup language is not quite what I'm trying to get at in this thread. What I've gathered from Walter's responses, if I've understood correctly, is that the idea behind DDoc is to provide the simplest rules-based formatting scheme possible for the purposes of generating documentation at the same time one compiles code. I just want to make sure that I understand what I'm working with.

My 'complaint' - although I would prefer to have my observations about difficulties working with a markup system be called 'observations' - is that the current body of text files which comprise the DLang spec source cannot be easily compiled into clean, well-formed, XHTML5-compliant files from which I can build an ePUB file.

To solve this problem, and based on responses I got to previous related threads, I offered in my first post to translate the DLang spec files into a markup designed for documentation. This idea was promptly refuted as being unwelcome effort as, it was explained, the DDoc spec is written in a way which is both sufficient for its purposes and is independent of any particular markup language.

I am willing to keep working with the DDoc macros to try to get them to output the XHTML5 files that I want. However, before I can continue, I need guidance on:

a) how I can modify the DLang spec files to enable me to translate them into the HTML5 files that I need; and
b) how to avoid breaking existing compilation into other formats (such as Latex, PDF, HTML4, etc.)

(I apologise if my message came across as hostile. It's rather late where I am and I wanted to get this into the aether before I went to bed. I don't mean any insult if anything I've written could be interpreted that way)
May 26 2013
prev sibling next sibling parent "Juan Manuel Cabo" <juanmanuel.cabo gmail.com> writes:
On Sunday, 26 May 2013 at 08:09:16 UTC, Borden wrote:
 [...]
 My 'complaint' - although I would prefer to have my 
 observations about difficulties working with a markup system be 
 called 'observations' - is that the current body of text files 
 which comprise the DLang spec source cannot be easily compiled 
 into clean, well-formed, XHTML5-compliant files from which I 
 can build an ePUB file.
 [...]

Maybe you can automatically convert HTML to XHTML, and then apply an XSLT transformation. You mentioned somewhere that you needed something like a CSS transformation to target a <p> inside another element. You could do that with XSLT.

To convert from HTML to XHTML you could use the following:

http://www.codeproject.com/Articles/10792/Convert-HTML-to-XHTML-and-Clean-Unnecessary-Tags-a

It is made in C#, though if it works, I guess it could be ported to D.

Also, you could use Adam D. Ruppe's XML DOM classes, which, though I'm not sure, seem to tolerate HTML4:

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff

(grab dom.d and characterencoding.d from there).

Or maybe the next generation xml library for D which will be reviewed for inclusion, which supports XPATH queries:

http://dsource.org/projects/xmlp

--jm
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
Thank you for the suggestions, Juan.

For the purposes of generating a single set of XHTML5 documents, 
your advice would work. What I'm trying to do, however, is update 
the makefiles for the website source so that ePUB files become a 
target.

I worry, therefore, that pumping the DLang spec through several 
conversions will give me less control over the resulting output. 
More importantly, though, it will make the makefiles less 
portable because anyone who wishes to use them will have to 
install all of the dependencies, so I'm trying to avoid that.

Again, what I'd ideally like to do is write an xhtml5.ddoc file 
which will give all of the necessary macro definitions to compile 
the DLang spec into tidy XHTML5 files. Unless I'm mistaken, I 
don't think that this is a very unreasonable goal.

However, I could be wrong and a solution like the one you suggest 
may be the only way to do this...
May 26 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, May 26, 2013 at 03:03:30AM -0400, Andrei Alexandrescu wrote:
 On 5/26/13 2:03 AM, H. S. Teoh wrote:
I don't know, to me DDoc is still lacking a major feature: a
mechanism for per-character translation. The problem is that many
output formats have a different scheme of metacharacters, and some
(most notably LaTeX) require special transcription of certain
characters. Right now, the only way to handle this correctly in DDoc
is very painful: write macros for every special character and logical
entity (like mdash, nbsp, and the like), which makes it very hard to
write. Your text would look like:

	$(T)his is Mr$(DOT)$(NBSP)T$(APOS)s $(DOLLAR)0$(DOT)02
	recip$(EACUTE)$(MDASH)as seen on TV$(DOT)

This problem is mostly evaded when you're targeting a single output
format. Once you start targeting more than a single output format,
the number of required macros grows exponentially. Making the DDoc
source targetable to *arbitrary* output formats requires practically
wrapping every character inside a macro, which is impractical.

To work around this problem with the current version of DDoc, you'd
need an external utility to do the transcriptions for you, which is a
hassle.

ESCAPES has been recently defined to partially fix that.

Is it working now?
 Also, LaTeX has about the same limitation. Someone defined an
 "ActiveTeX" derivative in which each character was active (and
 therefore potentially definable as a macro). As far as I know it
 didn't catch up, which may be a sign that people were okay without
 that capability.

Oh? I thought TeX already had the capability. Well, at least, you could redefine the default escape character "\" to be basically anything, including a letter, so you can achieve strange things that way. I'm not saying that's a good design though.

What I'm more concerned with is how to write DDocs that target output formats with incompatible metacharacters or different foreign character encodings. For example, if the docs contained a character like é, I'd like to be able to specify that it should be translated to \'e when targeting LaTeX, and left as-is in HTML. I *could* define a macro $(EACUTE) for this purpose, of course, but it makes writing DDocs rather painful (why should I resort to $(EACUTE) if the DDoc input is already UTF-8 and can already represent such a character directly?).

Another annoyance, that somebody else already mentioned, is how to wrap paragraphs in $(P ...) correctly, as is required for (X)HTML. Currently we only have linebreaks, which do not reliably translate to <p> and </p> with the correct nesting. I've tried to hack around that but still cannot get it working correctly in all possible cases. This is rather disappointing, since DDoc itself already defines what a paragraph is (or at least claims to), yet it doesn't easily lend itself to correct <p> nesting. One shouldn't have to dictate the manual use of $(P) in code docs in order to generate correct output.

So in short, DDoc as it stands is quite a nice, clean, well-designed macro expansion system, but it falls a bit short of being a nice *documentation* generation system.

T

-- Blunt statements really don't have a point.
May 26 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, May 25, 2013 at 10:34:30PM -0700, Jonathan M Davis wrote:
[...]
 My main complaint about ddoc is actually not a complaint about ddoc
 but about html. I find it very annoying to have to put $(P ) around
 every paragraph. Stuff like LaTeX does that automatically based on
 blank lines, which is way better IMHO, but if you're targeting HTML,
 then unfortunately, you need to mark paragraphs. The only way to fix
 that with regards to ddoc would be to make it so that ddoc understood
 that blank lines meant new paragraphs and inserted <p></p>
 appropriately, when generating html, but that would make it so that
 ddoc was less general, and there might be other negatives to that I
 haven't thought of. So, we just get to deal with $(P ) I guess.

Wait, why not just make DDoc wrap it in $(P ) instead of <p></p>? That way, output formats that don't care can simply define $(P) to be the text followed by a line break, and you're done.

T

-- Today's society is one of specialization: as you grow, you learn more and more about less and less. Eventually, you know everything about nothing.
May 26 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, May 26, 2013 13:20:41 H. S. Teoh wrote:
 On Sat, May 25, 2013 at 10:34:30PM -0700, Jonathan M Davis wrote:
 [...]
 
 My main complaint about ddoc is actually not a complaint about ddoc
 but about html. I find it very annoying to have to put $(P ) around
 every paragraph. Stuff like LaTeX does that automatically based on
 blank lines, which is way better IMHO, but if you're targeting HTML,
 then unfortunately, you need to mark paragraphs. The only way to fix
 that with regards to ddoc would be to make it so that ddoc understood
 that blank lines meant new paragraphs and inserted <p></p>
 appropriately, when generating html, but that would make it so that
 ddoc was less general, and there might be other negatives to that I
 haven't thought of. So, we just get to deal with $(P ) I guess.

[...] Wait, why not just make DDoc wrap it in $(P ) instead of <p></p>? That way, output formats that don't care can simply define $(P) to be the text followed by a line break, and you're done.

I don't follow. The issue is that right now I have to do

--------
$(P Here is my paragraph.)

$(P Here is another paragraph.)
--------

Whereas in something like latex, I'd just do

--------
Here is my paragraph.

Here is another paragraph.
--------

When ddoc is run, the $(P content) gets translated to <p>content</p> in html, and into my second example for latex. But what I want to be able to do is write the second example and have html end up with <p>content</p>. And _that_ doesn't work, because it would require that dmd know about <p> and insert it for me instead of ddoc just being pure macros.

It would be simple enough to run a program over the .dd file before running it through dmd in order to add the $(P) macros where appropriate, but then I have to worry about getting the logic right on that, and things like code examples could screw with that (you wouldn't want to insert $(P) into code examples). So, while it's quite feasible, I'm just putting up with $(P) for now. But we can't have a general ddoc solution for this without changing how ddoc works in a way that makes it so that it's not just a macro system anymore. Without another program to massage the ddoc in your file first, you're stuck with $(P).

- Jonathan M Davis
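The preprocessing pass described above (add $(P) around blank-line-separated paragraphs, but leave code examples alone) might look roughly like the following Python sketch. It is an illustration only: the function name is invented, code sections are assumed to be delimited by lines consisting of three or more hyphens, and it deliberately ignores the harder cases, such as lines that are already macro calls or paragraphs with unbalanced parentheses.

```python
import re

# A line made only of three or more hyphens is taken to open or close a code section.
CODE_FENCE = re.compile(r"^\s*-{3,}\s*$")

def add_paragraph_macros(source):
    """Wrap each blank-line-separated paragraph in $(P ...), skipping code sections."""
    out, para = [], []
    in_code = False

    def flush():
        # Emit the paragraph collected so far, if any.
        if para:
            out.append("$(P " + "\n".join(para) + ")")
            para.clear()

    for line in source.splitlines():
        if CODE_FENCE.match(line):
            flush()
            in_code = not in_code
            out.append(line)
        elif in_code:
            out.append(line)      # code lines, including blank ones, pass through
        elif line.strip() == "":
            flush()
            out.append(line)
        else:
            para.append(line)
    flush()
    return "\n".join(out)

text = "Here is my paragraph.\n\nHere is another paragraph.\n---\nint x;\n\nint y;\n---"
print(add_paragraph_macros(text))
```

The blank line inside the code section is passed through untouched, which is the case that made this feel risky. The sketch does not settle whether such logic belongs in a build script or in ddoc itself, which is the real question being debated here.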
May 26 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 8:33 PM, Jonathan M Davis wrote:
 But we can't have a general
 ddoc solution for this without changing how ddoc works in a way that makes it
 so that it's not just a macro system anymore.

I totally think we can. All ddoc has to do is insert some macro call automatically under certain conditions. That wouldn't make it less general because you get to define macros any way you wanna.

Andrei
May 26 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/26/13 10:12 PM, Jonathan M Davis wrote:
 On Sunday, May 26, 2013 22:08:44 Andrei Alexandrescu wrote:
 On 5/26/13 8:33 PM, Jonathan M Davis wrote:
 But we can't have a general
 ddoc solution for this without changing how ddoc works in a way that makes
 it so that it's not just a macro system anymore.

I totally think we can. All ddoc has to do is insert some macro call automatically under certain conditions. That wouldn't make it less general because you get to define macros any way you wanna.

Well, if we can do it, great. I hate having to use $(P). But it does involve dmd inserting stuff on its own based on the format for the text rather than using macro expansion (much as what it inserts would presumably be macros to expand). However, as long as we can do that, it should be quite feasible.

Already does that on ---- as noted. True, that makes ddoc less simple.

Andrei
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
Before we get too off topic in this thread, is there demand for 
an xhtml5.ddoc file? If so, I'd like to make some changes to the 
other DDoc files so as to minimise code duplication and minimise ambiguity 
in 'inherited' macro definitions. I'm willing to put in the time 
but I can't do it alone.

If there's no demand, that's OK, too, and I'll put the matter to 
rest.
May 26 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, May 27, 2013 at 03:04:44AM +0200, Borden wrote:
 Before we get too off topic in this thread, is there demand for an
 xhtml5.ddoc file? If so, I'd like to make some changes to the other
 DDoc files so as to minimise code duplication and minimise ambiguity in
 'inherited' macro definitions. I'm willing to put in the time but I
 can't do it alone.
 
 If there's no demand, that's OK, too, and I'll put the matter to
 rest.

I'm interested in something that will make Ddoc produce properly-nested tags for paragraphs. I'm not *too* concerned whether it will be HTML or XHTML, but I do care that it should be possible to correctly nest things without needing to manually do that with explicit macros in doc comments.

T

--
Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? -- Michael Beibl
May 26 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, May 26, 2013 22:08:44 Andrei Alexandrescu wrote:
 On 5/26/13 8:33 PM, Jonathan M Davis wrote:
 But we can't have a general
 ddoc solution for this without changing how ddoc works in a way that makes
 it so that it's not just a macro system anymore.

I totally think we can. All ddoc has to do is insert some macro call automatically under certain conditions. That wouldn't make it less general because you get to define macros any way you wanna.

Well, if we can do it, great. I hate having to use $(P). But it does involve dmd inserting stuff on its own based on the format for the text rather than using macro expansion (much as what it inserts would presumably be macros to expand). However, as long as we can do that, it should be quite feasible.

- Jonathan M Davis
May 26 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, May 26, 2013 22:10:58 Andrei Alexandrescu wrote:
 On 5/26/13 9:04 PM, Borden wrote:
 Before we get too off topic in this thread, is there demand for an
 xhtml5.ddoc file? If so, I'd like to make some changes to the other DDoc
 files so as to minimise code duplication and minimise ambiguity in 'inherited'
 macro definitions. I'm willing to put in the time but I can't do it alone.
 
 If there's no demand, that's OK, too, and I'll put the matter to rest.

I think it would be great. In particular, an ebook format would be good. You may want to wait until https://github.com/D-Programming-Language/dlang.org/pull/271 is in. It systematizes macros a lot and it may offer answers to many of your questions.

What's required for that to be merged? Someone to review it? I actually don't have commit privileges to dlang.org (even though all of the newer Phobos committers seem to), so the most that I can do is look it over. But I've generally ignored the dlang.org repo, since I don't have commit rights, and these days I do a poor enough job of reviewing druntime and Phobos pull requests as it is.

- Jonathan M Davis
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Monday, 27 May 2013 at 02:11:00 UTC, Andrei Alexandrescu wrote:
 I think it would be great. In particular, an ebook format would 
 be good.

 You may want to wait until 
 https://github.com/D-Programming-Language/dlang.org/pull/271 is 
 in. It systematizes macros a lot and it may offer answers to 
 many of your questions.

 Andrei

I appreciate the direct answer to my question, Professor. I would start anyway, in my own source copy, checking the existing .ddoc files and updating, in the few places necessary, the tags from HTML4 to HTML5 - most of these changes are to the HEAD section, anyway, and shouldn't require changes.

There are two problems that I've already run into, which I'll need experienced help with:

1) doc.ddoc and html.ddoc define many of the macros that I need, but some of them I'll need to redefine for HTML5. Walter's response on how dmd resolves 'macro inheritance' doesn't clarify for me whether I should override the non-HTML5-compliant macros or rewrite the whole file. I hope it's not the latter. Also, I don't understand the difference between doc.ddoc and html.ddoc - what is each file supposed to do, exactly?

2) Once I have my xhtml5.ddoc, it won't compile the .dd sources correctly because many of the .dd files aren't written in a manner where simple macro expansion will generate HTML5-compliant code. To solve this, I'll need guidance on how to change the .dd files to get xhtml5.ddoc to work without breaking the other files.

To this end it would be most helpful to develop a standard list of macros to use in the DLang spec sources and edit the non-conforming .dd files to follow it. It seems right now that the source files define whatever macros they like and leave the onus of figuring out what each means on the .ddoc files.
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Monday, 27 May 2013 at 03:32:54 UTC, Andrei Alexandrescu wrote:
 Yup, you got your work cut for you. Then again, wait til that 
 diff is merged. It fixes a bunch of problems.

That's OK. As long as I have some guidance on what to do I should manage. This effort isn't entirely selfless - part of tidying up the DLang spec is to help me learn D, too.
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Monday, 27 May 2013 at 03:32:54 UTC, Andrei Alexandrescu wrote:
 doc.ddoc is the general skeleton file for defining the online 
 documentation. html.ddoc contains HTML-specific macros only, 
 without having anything to do with our site's specific format.

For greater clarity, html.ddoc will produce a generic, HTML-compliant file. In contrast, doc.ddoc will add all of the dlang.org-specific decorations and boilerplate? That being the case, would it make more sense for me to upgrade html.ddoc to HTML5 (since it's in candidate rec status over at W3C)?
May 26 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
Oh, and another thing: XHTML adopts the XML practice of only 
defining the lt, gt and amp entities and no others (like nbsp, 
mdash, accented, or non-Latin characters).

Since Unicode is, by and large, universal, I've read that the 
recommended practice for including characters not on a standard 
US keyboard is to copy them from a character map and save the 
file in a Unicode encoding. I intend to follow this guidance in 
writing the (x)html.ddoc template.

As such, should I keep the existing 'entity' macros or use the 
Unicode characters in the DLang spec source files? I imagine that 
Andrei will immediately comment that .tex files are supposed to 
be in ASCII. Suggestions?
May 27 2013
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, May 28, 2013 00:48:02 Borden wrote:
 Oh, and another thing: XHTML adopts the XML practice of only
 defining the lt, gt and amp entities and no others (like nbsp,
 mdash, accented, or non-Latin characters).
 
 Since Unicode is, by and large, universal, I've read that the
 recommended practice for including characters not on a standard
 US keyboard is to copy them from a character map and save the
 file in a Unicode encoding. I intend to follow this guidance in
 writing the (x)html.ddoc template.
 
 As such, should I keep the existing 'entity' macros or use the
 Unicode characters in the DLang spec source files? I imagine that
 Andrei will immediately comment that .tex files are supposed to
 be in ASCII. Suggestions?

Well, it's more user-friendly to have macros for Unicode than having to figure out how to input the actual Unicode character in there (since it's not on the keyboard), and it's trivial to have the macro expand to the actual character, so I'd think that it would be more user-friendly to just use the macros, especially if we're already using them. And if LaTeX has to be ASCII (I don't know if it has to be or not), then that's all the more reason to not use Unicode directly. But regardless, if we're already using macros, why bother changing it? Just change what the macros convert to in the XHTML generation.

- Jonathan M Davis
May 27 2013
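
As a rough illustration of "just change what the macros convert to": XML defines only a handful of named entities, but numeric character references such as &#8212; are valid in XHTML, so an xhtml.ddoc could keep the entity-style macro names and expand them to numeric references. The Python sketch below generates such NAME = value definitions from the standard html.entities table. The macro names it produces (NBSP, MDASH and so on) are hypothetical and not taken from the existing .ddoc files.

```python
from html.entities import name2codepoint


def entity_macros(names):
    """Return one 'MACRO = &#N;' definition per named HTML entity."""
    lines = []
    for name in names:
        codepoint = name2codepoint[name]  # e.g. 'mdash' maps to 8212
        # Hypothetical naming scheme: the entity name in upper case.
        lines.append(f"{name.upper()} = &#{codepoint};")
    return "\n".join(lines)


print(entity_macros(["nbsp", "mdash", "eacute"]))
```

Upper-casing is only one possible naming scheme; it would collide for pairs such as eacute and Eacute, so a real list would need distinct names for those.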
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, May 27, 2013 at 05:30:27PM -0700, Jonathan M Davis wrote:
[...]
 Well, it's more user-friendly to have macros for Unicode than having
 to figure out how to input the actual Unicode character in there
 (since it's not on the keyboard), and it's trivial to turn the macro
 into the actual character with the macro, so I'd think that it would
 be more user-friendly to just use the macros, especially if we're
 already using them. And if LaTeX has to be ASCII (I don't know if it
 has to be or not), then that's all the more reason to not use Unicode
 directly. But regardless, if we're already using macros, why bother
 changing it? Just change what the macros convert to in the XHTML
 generation.

Plain vanilla LaTeX assumes ASCII input, and will do odd things if fed 8-bit data (much less UTF-8). I think macros for HTML entities are the way to go, given the current setup.

However, it is not a straightforward 1-to-1 mapping between &entity; and macro; to truly support LaTeX properly, one should be aware of some of its idiosyncrasies. For example, in Unicode, a character like ẃ can be represented by w *followed* by a combining diacritic; in LaTeX, however, the combining diacritic must *precede* the modified character (that is, \'w). So such characters should be represented by a single macro, say $(WACUTE), rather than w followed by a general $(ACUTE), which will be impossible to translate to LaTeX correctly.

LaTeX also has some special sequences for different kinds of spacings: an abbreviation like "Mr." requires the interspersing space to be escaped, i.e., "Mr.\ X", otherwise it will treat the "." as a sentence terminator and give it an overly-wide space in the output. This may make it a bit annoying to write in Ddoc, though, 'cos you'll need a macro of some sort to indicate this non-terminating ".".

The correct way to represent quotation marks in LaTeX is `` and '' for double quotes, and ` and ' for single quotes. Writing " or ' will still work, but it will just be ugly in the output.

If there are math formulae involved, then they need to be enclosed with $, for example: "This sentence contains $2+2=4$ words." Inside math formulae, a slightly different syntax is used, but for the purposes of Ddoc, I think that can probably be ignored for now.

A bunch of metacharacters need to be escaped; I can't recall the list off the top of my head, but they include at the very least: ~ # $ % ^ & { } _ \

The escape sequences required for these metacharacters are not all obvious; for example, \\ is NOT an escaped backslash, it's a linebreak. I forgot what a literal backslash is... And \^ is NOT a literal caret; it's a circumflex accent on the next letter; ditto with \~. Though IIRC \$ does represent a literal $.

So, some care is required to make things work correctly. :)

T

--
It is impossible to make anything foolproof because fools are so ingenious. -- Sammy
May 27 2013
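
For reference, the escaping rules discussed in the post above can be sketched in Python as below. This is an illustrative helper, not part of any Ddoc or dlang.org tooling, and the name latex_escape is made up. It uses the standard LaTeX text-mode commands \textbackslash, \textasciitilde and \textasciicircum for the three characters that cannot simply be prefixed with a backslash, and it covers only the ten metacharacters listed in the post (not quotation marks, spacing after abbreviations, or math mode).

```python
import re

# Replacement for each LaTeX metacharacter in ordinary text mode.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",   # "\\" would be a line break, not a backslash
    "~": r"\textasciitilde{}",   # "\~" would be a tilde accent
    "^": r"\textasciicircum{}",  # "\^" would be a circumflex accent
    "#": r"\#",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "{": r"\{",
    "}": r"\}",
    "_": r"\_",
}

# One character class matching any of the metacharacters above.
_METACHARS = re.compile("[" + re.escape("".join(LATEX_ESCAPES)) + "]")


def latex_escape(text):
    """Escape LaTeX metacharacters in a single pass over the text."""
    return _METACHARS.sub(lambda m: LATEX_ESCAPES[m.group(0)], text)
```

Doing the substitution in one pass matters: a chain of str.replace calls would re-escape the backslashes and braces introduced by earlier replacements.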
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, May 27, 2013 21:29:41 Andrei Alexandrescu wrote:
 On 5/27/13 6:48 PM, Borden wrote:
 Oh, and another thing: XHTML adopts the XML practice of only defining
 the lt, gt and amp entities and no others (like nbsp, mdash, accented,
 or non-Latin characters).
 
 Since Unicode is, by and large, universal, I've read that the
 recommended practice for including characters not on a standard US
 keyboard is to copy them from a character map and save the file in a
 Unicode encoding. I intend to follow this guidance in writing the
 (x)html.ddoc template.
 
 As such, should I keep the existing 'entity' macros or use the Unicode
 characters in the DLang spec source files? I imagine that Andrei will
 immediately comment that .tex files are supposed to be in ASCII.
 Suggestions?

The LaTeX configuration won't use your ddoc template. Knock yourself out.

Yes, but he was wondering if he could change the .dd files to use Unicode characters directly instead of macros, which _would_ affect the LaTeX configuration.

- Jonathan M Davis
May 27 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
Yep, and that seems like a bad idea, so I'll just update the 
macros in the xhtml.ddoc file.
May 27 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Monday, 27 May 2013 at 02:11:00 UTC, Andrei Alexandrescu wrote:
 You may want to wait until 
 https://github.com/D-Programming-Language/dlang.org/pull/271 is 
 in. It systematizes macros a lot and it may offer answers to 
 many of your questions.

 Andrei

Professor, what sort of feedback would help get that pull expedited through? I downloaded it, and it compiled for me, but beyond that I don't know what to look for.
May 28 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Monday, 27 May 2013 at 02:11:00 UTC, Andrei Alexandrescu wrote:
 I think it would be great. In particular, an ebook format would 
 be good.

 You may want to wait until 
 https://github.com/D-Programming-Language/dlang.org/pull/271 is 
 in. It systematizes macros a lot and it may offer answers to 
 many of your questions.

 Andrei

I would still like to work on compiling the DLangSpec into HTML5, but I've noticed that pull request 271 hasn't been touched in over 4 months. Further, I sent in a pull request to move the DLangSpec source files into their own folders and haven't gotten so much as a 'worst pull request ever' in response. I fully appreciate that people are very busy - including me - so I want to know if there's anything I can do to help things along. At least with respect to pull request 271, is there anything that I can do to help get it merged into master so I can get working on the HTML5 DDoc?
Jun 28 2013
prev sibling next sibling parent "Borden" <2013 bordenrhodes.com> writes:
On Saturday, 29 June 2013 at 11:33:16 UTC, Daniel Murphy wrote:
 To be honest, you just have to keep bugging people.  I mostly 
 review
 compiler pulls, and I am much much more likely to review 
 something that
 shows up in my inbox than something that sits patiently in the 
 list.  If you
 make enough noise somebody will eventually reply.

Sigh, I know. I just don't want to get on anybody's bad side. Maybe I can do that just by keeping this bumped...
Jun 29 2013
prev sibling parent "Borden" <2013 bordenrhodes.com> writes:
Ping! I'm just bumping this thread to see what the status of 
integrating pull request 271 is and whether there's anything I 
can do to expedite matters. I've noticed that there are some 
changes to dlang.org's website source. Are these changes working 
towards HTML 5 compliance? (or, at least, the part of HTML 5 that 
probably won't change).
Jul 22 2013