digitalmars.D.announce - New web newsreader - requesting participation

Adam Ruppe <destructionator gmail.com> writes:
In the other newsgroup, I've been talking about a little
web news program I've been writing as a spinoff of the
potential new homepage idea.

It's to the point where it is usable, but still kinda buggy:

http://arsdnet.net/d-web-site/nntp/thread-index?newsgroup=digitalmars.D

Source code: http://arsdnet.net/d-web-site/nntp.d

NOTE: it does /not/ automatically check for new posts. I have
to manually trigger that right now (I don't want it annoying
the news server automatically while still in the testing phase.)

It will lazily load a message on demand though if you know
its message ID:
http://arsdnet.net/d-web-site/nntp/get-message

Get it from the Message-ID header in the post.



Anyway, here's the features:

a) It isn't god-awful slow. The PHP web news currently on digital
mars, as best as I can tell, actually polls the news server every
time you go to its index! This does aggressive local caching.

b) It actually lets you select text...

OK, if I list every annoyance with the current web news, I'll
never stop. Moving on to new things:

c) It tries to convert news posts to HTML, so the paragraphs
wrap to the browser, links work, quotes are put into the proper
tags for indentation, and it tries to auto-detect D code and
put it in a <pre> block - which my javascript can make inline
editable and runnable. Example:

http://arsdnet.net/d-web-site/nntp/get-message?newsgroup=digitalmars.D&messageId=%3Cmailman.1085.1296409409.4748.digitalmars-d%40puremagic.com%3E

With script disabled, you'll see the code in a different colored
block. With script enabled, you'll see an Edit button there
too.

d) It tries to convert HTML emails back to plain text. (Ironically,
so it can turn it back to html...) This gives uniformity across
the various mime types. Similarly, if the type is
multipart/alternative, it will only show the text version.

e) It also makes an attempt to preserve deliberate whitespace,
for things like ASCII art or purposefully short lines. If it
can't make heads or tails of it, it bails out and shows the
original message in a <pre> block for human consumption.

f) Tries to be fast and lean.

g) Written in D!

h) Already-read messages are tracked by your browser - if the link
is visited, it shows up in a different color.
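
Items d) and e) describe picking the text version of a multipart/alternative message and falling back to stripped HTML. A minimal sketch of that selection follows, in Python rather than D and not taken from nntp.d. The function names are hypothetical, and decoding everything as UTF-8 is an assumption made only to keep the example short.

```python
from email import message_from_string
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Very rough HTML -> text conversion: keep text nodes, drop markup."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def plain_body(raw_message):
    """Prefer the text/plain part; fall back to HTML stripped of its tags."""
    msg = message_from_string(raw_message)
    html_fallback = None
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "text/plain":
            # Assumption: treat the payload as UTF-8, replacing bad bytes.
            return part.get_payload(decode=True).decode("utf-8", "replace").strip()
        if ctype == "text/html" and html_fallback is None:
            html_fallback = part.get_payload(decode=True).decode("utf-8", "replace")
    return html_to_text(html_fallback) if html_fallback else ""
```

Whatever `plain_body` returns would then go through the same text-to-HTML step as an ordinary plain-text post, which is what gives the uniformity across MIME types.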

Coming as I find time:

a) References to bugzilla entries should be automatically
converted to links.

b) Viewing threads by date or by threaded view.

c) Posting with the option of automatic quoting.

d) Syntax highlighting of D code in posts.

e) Maybe, maybe links to documentation of functions referenced,
   if I can find a good way to get them automatically. Integration
   with my dpldocs.info site is the way I'd do it.

f) Any more ideas? I'm reluctant to add too much, but if I like
   an idea - or if you want to write the code :) - I'll be open
   to adding it.
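
For item a) in this list, one possible shape of the bugzilla auto-linking is sketched below in Python. This is not the author's implementation: the tracker URL is a placeholder, and the set of phrases recognized ("bug 123", "issue #123", "bugzilla 123") is an assumption.

```python
import re
from html import escape

# Placeholder URL pattern; substitute whatever tracker the site really uses.
BUG_URL = "https://bugs.example.org/show_bug.cgi?id={id}"

# Assumed phrasing: "bug 123", "bugzilla #123", "issue 123" (case-insensitive).
_BUG_REF = re.compile(r"\b(bug(?:zilla)?|issue)\s+#?(\d+)\b", re.IGNORECASE)


def link_bug_refs(text, url_template=BUG_URL):
    """HTML-escape the text, then wrap bug references in <a> tags."""
    def repl(match):
        url = url_template.format(id=match.group(2))
        return '<a href="{}">{}</a>'.format(url, match.group(0))
    return _BUG_REF.sub(repl, escape(text))
```

Escaping before substitution keeps the generated `<a>` tags from being escaped themselves, and the leading `\b` stops words such as "debug 12" from matching.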


Known bugs:

Lots of content types aren't handled right and it ignores
character encoding.

It doesn't always recognize code. This would be ok, but if it
flags most of a snippet as code and leaves one line out, it
confuses the reader. Example:

http://arsdnet.net/d-web-site/nntp/get-message?newsgroup=digitalmars.D&messageId=%3Cii4lbj%242bes%241%40digitalmars.com%3E

(Look for "auto str =")

The reason for this is it detects code lines by looking for
semicolons and open braces. It will call something a generic
<pre> if there's a lot of whitespace in it - figuring it is
probably ASCII art (if it thinks the whitespace has human
significance, it tries to preserve it), but it still isn't
a perfect detection function.
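
A rough Python sketch of the heuristic just described, to make the failure mode concrete. It is not the code from nntp.d: the function names, the "half the lines" rule and the 0.4 whitespace ratio are all assumptions.

```python
def is_code_line(line):
    """A line counts as code if, ignoring trailing spaces, it ends in ';' or '{'."""
    return line.rstrip().endswith((";", "{"))


def classify_paragraph(paragraph, whitespace_ratio=0.4):
    """Return 'code', 'pre' or 'text' for one blank-line-separated paragraph."""
    lines = [l for l in paragraph.splitlines() if l.strip()]
    if not lines:
        return "text"
    # Assumed rule: at least half of the lines must look like code.
    if sum(is_code_line(l) for l in lines) * 2 >= len(lines):
        return "code"
    # Lots of whitespace suggests deliberate layout such as ASCII art.
    total = sum(len(l) for l in lines)
    spaces = sum(l.count(" ") + l.count("\t") for l in lines)
    if spaces / total > whitespace_ratio:
        return "pre"
    return "text"
```

A statement split across lines shows the bug: `is_code_line("auto str =")` is false because the line ends in neither `;` nor `{`, so a line-by-line detector can drop it from the surrounding block.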

I'm open to ideas. We want to detect code, but not flag
regular English text.



I'm also open to graphical styling ideas. I put up a dark
theme here because the white was hurting my eyes, but I change
my mind on whether I like light or dark almost at random. (Depends
on the room's lighting conditions I think). But I didn't do any
more graphic setup other than the max-width.

Multiple color schemes is an idea I like.



BTW, as a fun fact, this post is about 1/4th the size of the
entire nntp.d code file!
Jan 30 2011
"Nick Sabalausky" <a a.a> writes:
"Adam Ruppe" <destructionator gmail.com> wrote in message 
news:ii592i$c09$1 digitalmars.com...
 c) It tries to convert news posts to HTML, so the paragraphs
 wrap to the browser, links work, quotes are put into the proper
 tags for indentation, and it tries to auto-detect D code and
 put it in a <pre> block - which my javascript can make inline
 editable and runnable. Example:

 http://arsdnet.net/d-web-site/nntp/get-message?
 newsgroup=digitalmars.D&messageId=%
 3Cmailman.1085.1296409409.4748.digitalmars-d%40puremagic.com%3E

 With script disabled, you'll see the code in a different colored
 block. With script enabled, you'll see an Edit button there
 too.

That's really cool.
 d) It tries to convert HTML emails back to plain text. (Ironically,
 so it can turn it back to html...)

I love that on so many different levels :)
 h) Already read messages is tracked by your browser - if the link
 is visited, it puts up a different color url.

It's amazing how often people seem to forget that feature exists. That was introduced in what, Mosaic? Sometimes I think I'm the only one in the world who ever uses the "a:visited" CSS. Not that I feel strongly about it, but hey.
 It doesn't always recognize code. This would be ok, but if it
 sees one line as code but doesn't include one of them, it would
 confuse the reader. Example:

 http://arsdnet.net/d-web-site/nntp/get-message?
 newsgroup=digitalmars.D&messageId=%3Cii4lbj%242bes%241%
 40digitalmars.com%3E

 (Look for "auto str =")

Ha! I broke your algorithm! Oh, speaking of fuzzy detection algorithms, it seems to think that the "//" comment tokens are URLs (very, very short URLs ;) ).
 The reason for this is it detects code lines by looking for
 semicolons and open braces. It will call something a generic
 <pre> if there's a lot of whitespace in it - figuring it is
 probably ascii art (if it thinks the whitespace has human
 significance, it tries to preserve it), but it still isn't
 a perfect detection function.

 I'm open to ideas. We want to detect code, but not flag
 regular English text.

One very rough idea: Take each paragraph (i.e., each block of text that's separated by a full newline). Run it through a D lexer. If it has at most, say, 1 lexical error per line (on average), then assume it's intended as D code. If multiple consecutive paragraphs are flagged as D code, consider them all part of the same code block. After all, D's supposed to be fast to lex (and to parse for that matter), and you'd only need to do it once and cache the result. Maybe it could even be tied into some syntax highlighting. Maybe use DDMD (we could use more people on DDMD anyway - Koroskin doesn't seem to have had time for it lately...neither have I for that matter...).

Actually, what could also be interesting would be an "English parser". Obviously true full-fledged English semantic processing is out of reach ATM, but I wonder if something could be made that acts "good enough" as a mere English-*detector*. Or a general natural-language detector. Could be an interesting project at the very least.
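
The lexer idea can be tried out with a toy tokenizer. The Python sketch below is only an illustration: the token grammar is a small assumption-laden stand-in for a real D lexer (DDMD is not used), and all names are hypothetical.

```python
import re
from itertools import groupby

# A deliberately tiny token grammar, NOT a real D lexer.
_TOKEN = re.compile(r"""
    \s+                              # whitespace
  | //[^\n]*                         # line comment
  | "(?:\\.|[^"\\\n])*"              # string literal
  | '(?:\\.|[^'\\\n])'               # character literal
  | [A-Za-z_]\w*                     # identifier or keyword
  | \d+(?:\.\d+)?                    # number
  | [-+*/%=<>!&|^~?:;,.(){}\[\]$@#]  # operators and punctuation
""", re.VERBOSE)


def lexical_errors(line):
    """Count characters in `line` that start no token of the toy grammar."""
    pos, errors = 0, 0
    while pos < len(line):
        match = _TOKEN.match(line, pos)
        if match:
            pos = match.end()
        else:
            errors += 1
            pos += 1
    return errors


def flag_code_paragraphs(text, max_errors_per_line=1.0):
    """Split on blank lines; return (paragraph, looks_like_code) pairs."""
    results = []
    for para in re.split(r"\n\s*\n", text.strip()):
        lines = para.splitlines()
        if not lines:
            continue
        errors = sum(lexical_errors(l) for l in lines)
        results.append((para, errors / len(lines) <= max_errors_per_line))
    return results


def code_blocks(text):
    """Merge consecutive flagged paragraphs into single code blocks."""
    blocks = []
    for is_code, group in groupby(flag_code_paragraphs(text), key=lambda r: r[1]):
        if is_code:
            blocks.append("\n\n".join(para for para, _ in group))
    return blocks
```

With this toy grammar a paragraph like "plain words only" lexes without a single error and is flagged as code, so a lexical check alone is not enough to separate code from prose.
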
 I'm also open to graphical styling ideas. I put up a dark
 theme here because the white was hurting my eyes, but I change
 on if I like light or dark almost at random. (Depends on the room's
 lighting conditions I think). But I didn't do any more graphic
 setup other than the max-width.

I like to use dark themes for my own stuff for the same reasons. But then I always end up going with bright-ish themes for public stuff because I know I'm in the minority on that. (I'm not really trying to suggest one way or another, just commenting.)
 BTW, as a fun fact, this post is about 1/4th the size of the
 entire nntp.d code file!

Viva la D!
Jan 30 2011
Adam Ruppe <destructionator gmail.com> writes:
Nick Sabalausky wrote:
 It's amazing how often people seem to forget [a:visited] exists.

Yeah, it boggles my mind - I personally find it incredibly useful. But every design I get for clients invariably has visited colors purposefully indistinguishable from regular links. Another thing that breaks it for a lot of people is URLs that randomly change ever so slightly, or don't change at all, which throws a wrench in caching too. I blame AJAX. (Cue someone saying "AJAX doesn't need to break it!" Yeah, I know.)

Speaking of caching, that's something I want to work here, but there's one problem with that: checking for replies means the page's contents might actually change. I figure I'll set the cache expires date to coincide with the next newnews check. New posts won't show up immediately anywhere, but it'll be a little faster to navigate around in the meantime. (I'm thinking about a 30 minute check on .D and .learn, and a one hour check on .announce, since it's slower moving anyway.)
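
The cache-expiry plan could look roughly like this. It is a Python sketch rather than the D code, the function names and the lookup table are hypothetical, and the intervals are just the ones floated above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Intervals proposed in the post; the table itself is illustrative.
CHECK_INTERVAL = {
    "digitalmars.D": timedelta(minutes=30),
    "digitalmars.D.learn": timedelta(minutes=30),
    "digitalmars.D.announce": timedelta(hours=1),
}


def next_check(last_check, newsgroup):
    """When the next newnews poll for this group is due."""
    return last_check + CHECK_INTERVAL[newsgroup]


def cache_headers(last_check, newsgroup, now=None):
    """HTTP headers telling the browser the page is good until the next check."""
    now = now or datetime.now(timezone.utc)
    expires = next_check(last_check, newsgroup)
    max_age = max(0, int((expires - now).total_seconds()))
    return {
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": "max-age={}".format(max_age),
    }
```

Both timestamps must be timezone-aware UTC values; `format_datetime(..., usegmt=True)` rejects anything else.
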
 Oh, speaking of fuzzy detection algorithms, it seems to think
 that the "//"
 comment tokens are URLs (very, very short URLs ;) ).

Yea, looks like std.regex.url kinda sucks. It flagged that, but it didn't match paths in website links. (Maybe I'm doing it wrong?)
 One very rough idea: Take each paragraph (ie, each block of text
 that's separated by a full newline). Run it through a D lexer. If
 it has at most, say, 1 lexical error per line (on average), then
 assume it's intended as D code.

I don't think that will work because a lot of regular sentences would register as a series of variable names. It'd probably have to try at least a rudimentary parse. (For comparison, consider a jumble of English words. Each piece is a word, so no problem there, but without understanding what they mean, you can't tell if it is a meaningful sentence or not.)
 Actually, what could also be interesting would be an "english
 parser". Obviously true full-fledged english semantic processing
 is out-of-reach ATM, but I wonder if something could be made that
 acts "good enough" as a mere english-*detector*. Or a general
 natural-language-detector.

I did put a very primitive thing like this in there: it checks for ". " when guessing if it's code or not when not sure. My reasoning is that while periods are common in both, in code it is usually followed by a method name, whereas in English, we usually put a space in there. I sometimes write ".\n" in code, but ". " is pretty rare in my own usage, outside comments.

Another thing I considered was to check the frequency of capitalized words vs punctuation, or for balanced brackets and stuff like that. Natural language uses a lot of capital letters right after spaces. Code is more likely to be camelCased. There's some crossover ("McDonald's" could flag either way), but looking for bizarre symbols like parens, operators, etc. might disambiguate it. However, "line[$-1] == ';'" and friends were so much simpler and so far, seem to give good enough results, so I let it stay at that.
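
The signals mentioned here can be combined into a single score. The Python sketch below is only an illustration of that idea: the weights are made up, the function name is hypothetical, and none of it is from nntp.d.

```python
import re

_CAP_AFTER_SPACE = re.compile(r"(?:^|\s)[A-Z][a-z]")   # "The", " It"
_CAMEL = re.compile(r"\b[a-z]+[A-Z]\w*")               # "getLine"
_SYMBOLS = re.compile(r"[(){}\[\];=<>+*/&|]")          # parens, operators


def english_score(line):
    """Positive means 'probably English', negative means 'probably code'."""
    score = 0
    score += 2 * line.count(". ")            # period followed by a space
    score += len(_CAP_AFTER_SPACE.findall(line))
    score -= len(_CAMEL.findall(line))
    score -= len(_SYMBOLS.findall(line))
    if line.rstrip().endswith((";", "{")):   # the line[$-1] == ';' style check
        score -= 3
    return score
```

The sign of the result is the verdict; lines scoring near zero are the ambiguous ones a simpler last-character check would have to guess at.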
Jan 31 2011
Trass3r <un known.com> writes:
OT:

 c) It tries to convert news posts to HTML, so the paragraphs
 wrap to the browser, links work, quotes are put into the proper
 tags for indentation, and it tries to auto-detect D code and
 put it in a <pre> block - which my javascript can make inline
 editable and runnable. Example:
 
 http://arsdnet.net/d-web-site/nntp/get-message?
 newsgroup=digitalmars.D&messageId=%
 3Cmailman.1085.1296409409.4748.digitalmars-d%40puremagic.com%3E

I accidentally used http://arsdnet.net/d-web-site/nntp/get-message?%20%20newsgroup=digitalmars.D&messageId=%3Cmailman.1085.1296409409.4748.digitalmars-d%40puremagic.com%3E (note the %20%20 before newsgroup) So it showed me some Get Message form with <mailman.1085.1296409409.4748.digitalmars-d puremagic.com> in the message id field. If I click on Get Message then: object.Exception: invalid newsgroup ---------------- /var/www/htdocs/d-web-site/nntp(immutable(char)[] nntp.sanitizeNewsgroupName(immutable(char)[])) [0x80ba73b] /var/www/htdocs/d-web-site/nntp(arsd nntp.Newsreader.getMessage(immutable(char)[], immutable(char)[])) [0x80b84a8] /var/www/htdocs/d-web-site/nntp(_D4arsd3web42__T17prepareReflectionTS4nntp10NewsreaderZ17prepareReflectionFC4arsd3cgi3CgiZPS4arsd3web14ReflectionInfo1499__T15generateWrapperS1425_D4nntp10Newsreader10getMessageFAyaAyaZC4arsd8database1349__T16SimpleDataObjectVAyaa5_706f737473TS4arsd8database1267__T21StructFromCreateTableVAyaa607_0a09435245415445205441424c4520706f73747320280a09092d2d20616c6c206f6620746865736520617265204d6573736167652d49442076616c7565730a09092d2d204649584d453a2074686973206973206c6961626c6520746f20626520736c6f6f6f6f6f6f77206173207468652064622067726f77730a09096d6573736167654964205641524348415228363029205052494d415259204b45592c0a0909696e5265706c79546f2056415243484152283630292c0a0909746872656164526f6f742056415243484152283630292c0a0a09092d2d20746865206e756d65726963206964656e7469666965722c206966207765206b6e6f772069740a090961727469636c65496420494e54454745522c0a0a090964617465506f7374656420424947494e54204e4f54204e554c4c2c0a0a09096e65777367726f7570205641524348415228343029204e4f54204e554c4c2c0a0a0909617574686f72205641524348415228383029204e4f54204e554c4c2c0a09097375626a65637420564152434841522831323029204e4f54204e554c4c2c0a0a09096d65737361676520544558540a092920454e47494e453d496e6e6f44422044454641554c5420434841525345543d757466383b0a0a09435245415445205441424c45206173736f727465645f6461746120280a0909696420494e5445474552204155544f5f494e4352
454d454e542c0a0a09096e616d652056415243484152283830292c0a090976616c75652056415243484152283830292c0a0a09095052494d415259204b4559286964290a09292044454641554c5420434841525345543d757466383b0aVAyaa5_706f737473Z21StructFromCreateTableZ16SimpleDataObjectTS4nntp10NewsreaderVxAyaa10_6765744d657373616765Z15generateWrapperMFPS4arsd3web14ReflectionInfoZDFC4arsd3cgi3CgixHAyaAAyaxAyaZAya7wrapperMFC4arsd3cgi3CgixHAy AAyaxAyaZAya+0x1dd) [0x80be21d] /var/www/htdocs/d-web-site/nntp(_D4arsd3web3runFC4arsd3cgi3CgiPS4arsd3web14Ref ectionInfoZv+0x384) [0x80bca68] /var/www/htdocs/d-web-site/nntp(_Dmain+0x2b) [0x80b9b33] /var/www/htdocs/d-web-site/nntp(extern (C) int rt.dmain2.main(int, char**)) [0x80e4a36] /var/www/htdocs/d-web-site/nntp(extern (C) int rt.dmain2.main(int, char**)) [0x80e4990] /var/www/htdocs/d-web-site/nntp(extern (C) int rt.dmain2.main(int, char**)) [0x80e4a7a] /var/www/htdocs/d-web-site/nntp(extern (C) int rt.dmain2.main(int, char**)) [0x80e4990] /var/www/htdocs/d-web-site/nntp(main+0x96) [0x80e4936] /lib/libc.so.6(__libc_start_main+0xe6) [0xf741db86] /var/www/htdocs/d-web-site/nntp() [0x80b8291] Strange thing is, most functions are properly demangled but 2 aren't. Is this a (known) bug?
Jan 31 2011
Adam Ruppe <destructionator gmail.com> writes:
Trass3r wrote:
 So it showed me some Get Message form with
 <mailman.1085.1296409409.4748.digitalmars-d puremagic.com>
 in the message id field.

That, by the way, is one of the background features of web.d. If there are insufficient parameters to call a function (" newsgroup" != "newsgroup" so it thought it wasn't an argument to the function) it automatically generates a form based on the func's args, auto-fills what it knows, and lets you fill in the rest. The idea there was to define a basic website by doing nothing more than listing some function prototypes. While I find it pretty cool, its "one size fits all" approach is actually fairly useless in practice, alas.

Anyway:
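
The form-generation behaviour can be imitated in a few lines. The sketch below uses Python's runtime `inspect` module, whereas web.d presumably works from D's compile-time reflection, so treat it as an analogy only. `get_message` and `call_or_form` are hypothetical names.

```python
import inspect
from html import escape


def get_message(newsgroup, messageId):
    """Stand-in for the real handler; only the signature matters here."""
    return "message {} from {}".format(messageId, newsgroup)


def call_or_form(func, query):
    """Call func if query supplies every parameter, else return an HTML form."""
    params = list(inspect.signature(func).parameters)
    if all(name in query for name in params):
        return func(**{name: query[name] for name in params})
    fields = []
    for name in params:
        # Pre-fill whatever the request did supply.
        value = escape(query.get(name, ""), quote=True)
        fields.append(
            '<label>{0} <input name="{0}" value="{1}"></label>'.format(name, value))
    return '<form method="get">{}<button>{}</button></form>'.format(
        "".join(fields), func.__name__)
```

A query whose key is `"  newsgroup"` (the decoded `%20%20newsgroup`) does not match the parameter `newsgroup`, so the call falls through to the form with only `messageId` filled in.
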
 Strange thing is, most functions are properly demangled but 2
 aren't.
 Is this a (known) bug?

Yes, core.demangle can't do some symbols because DMD applies a one-way hash to them once they reach a certain length because such long symbols tend to break linkers.
Jan 31 2011
Trass3r <un known.com> writes:
 Strange thing is, most functions are properly demangled but 2
 aren't.
 Is this a (known) bug?

Yes, core.demangle can't do some symbols because DMD applies a one-way hash to them once they reach a certain length because such long symbols tend to break linkers.

Ah I see, but what about the short one: _D4arsd3web3runFC4arsd3cgi3CgiPS4arsd3web14ReflectionInfoZv
Jan 31 2011
Adam Ruppe <destructionator gmail.com> writes:
 Ah I see, but what about the short one:

Might be a bug in core.demangle (passing it to the function directly didn't work either). I'm not sure though.
Jan 31 2011
foobar <foo bar.com> writes:
This is great work! Looks SO much better than what we have right now. I'd implement the following filters/parsers for text posts:
1. common human markup such as: _foo_ (underline), *foo* (bold) etc,
2. parse BBCode.
The NG could standardize on BBCode or some other light-weight markup going forward to make this even more straightforward.
Jan 31 2011
Daniel Gibson <metalcaedes gmail.com> writes:
On 31.01.2011 11:25, foobar wrote:
This is great work! looks SO much better than what we have right now. I'd implement the following filters/parsers for text posts: 1. common human markup such as: _foo_ (underline), *foo* (bold) etc, 2. parse BBCode. The NG could standardize on BBcode or some other light-weight marking going forward to make this even more straight forward.

No BBCode please, I'd still like to be able to (properly) view the posts in Thunderbird. Else we could entirely switch to phpBB or something like that instead of using an NNTP server.
Jan 31 2011
"Nick Sabalausky" <a a.a> writes:
"foobar" <foo bar.com> wrote in message 
news:ii62n0$1r3i$1 digitalmars.com...
 1. common human markup such as: _foo_ (underline), *foo* (bold) etc,

I've never been much of a fan of that. Actually that's one of the things I didn't like about Thunderbird when I tried it: it kept replacing *'s and _'s with formatting even when I was in the supposed "plaintext" mode. Of course, if the *'s and _'s stay intact when the text is bolded/etc, then I can't say I'd care quite so much.
Jan 31 2011
Daniel Gibson <metalcaedes gmail.com> writes:
On 31.01.2011 13:19, Nick Sabalausky wrote:
 "foobar" <foo bar.com> wrote in message
 news:ii62n0$1r3i$1 digitalmars.com...
 1. common human markup such as: _foo_ (underline), *foo* (bold) etc,

I've never been much of a fan of that. Actually that's one of the things I didn't like about Thunderbird when I tried it: it kept replacing *'s and _'s with formatting even when I was in the supposed "plaintext" mode. Of course, if the *'s and _'s stay intact when the text is bolded/etc, then I can't say I'd care quite so much.

This is exactly what my Thunderbird (okay, it's Icedove really) does. The *, _ and / stay intact, but the text is bolded/underlined/italicized.
Jan 31 2011
Adam Ruppe <destructionator gmail.com> writes:
foobar wrote:
 1. common human markup such as: _foo_ (underline), *foo* (bold) etc,

Yeah, that's a pretty good idea. I agree with the others that it should keep the text symbols (especially since I've seen these algorithms wrongly flag things *a lot*) but a basic implementation is ok.
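
A basic implementation that keeps the text symbols might look like the Python sketch below. The regexes and the tag choices (`<b>` for stars, `<u>` for underscores) are assumptions for illustration, not an existing filter.

```python
import re
from html import escape

# Each pattern requires a non-word character (or line edge) outside the
# markers and a letter or digit just inside them, so "2*3*4", "a * b * c"
# and snake_case_name are left alone.
_MARKUP = [
    (re.compile(r"(?<![\w*])\*([^\W_][^*\n]*?[^\W_]|[^\W_])\*(?![\w*])"), "b"),
    (re.compile(r"(?<!\w)_([^\W_][^_\n]*?[^\W_]|[^\W_])_(?!\w)"), "u"),
]


def render_markup(text):
    """Escape text for HTML, then wrap *foo* and _foo_ without eating the symbols."""
    html = escape(text)
    for pattern, tag in _MARKUP:
        # group(0) includes the * or _ characters, so they stay visible.
        html = pattern.sub(
            lambda m, t=tag: "<{0}>{1}</{0}>".format(t, m.group(0)), html)
    return html
```

Because the markers stay in the output, a wrongly flagged span only costs some stray bold or underline; the text itself is unchanged.
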
 2. parse BBCode.

This probably isn't a good idea... unless it is a web-input-only filter. So posts pulled off the news server are treated as plain text - no BBCode parsing is attempted. But posts made through the website may be parsed, and converted to plain text before being forwarded to the news server. (Note that I use my beloved mutt mail client for reading the newsgroups myself, so anything that would break plain text email browsing is a no.)

I already have pretty decent bbcode -> html and html -> text functions in my bag of toys, so regular participants never need to know what kind of input was used. It would let web users feel more at home without impacting everyone else.

The only downside I see is if people think bbcode is accepted, someone might write it in their newsreader or email client, where it won't be parsed. I don't want the groups to get filled up with bizarre markup everywhere, but the kind of users who use email clients and newsreaders probably won't make that mistake anyway. So yeah, let's give it a try for web posting and see if it works out.
Jan 31 2011
foobar <foo bar.com> writes:
Just to clarify, I don't want text posts to be filled with lots of markup either. BBCode was just an example of a light-weight markup which is familiar to web-based forum users. Other options could be Markdown and reStructuredText. Basically whatever is light-weight enough to not bother text-mode users and is also useful enough when parsed by your web reader to convert code into those awesome "compile & run" boxes.

We could also support just a tiny subset of BBCode (just the [code] tag), so that code snippets would be identified without a fuzzy guessing algorithm.
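
Supporting only the [code] tag needs very little machinery. A hypothetical Python sketch (the function name and the segment format are assumptions):

```python
import re

_CODE_TAG = re.compile(r"\[code\](.*?)\[/code\]", re.DOTALL | re.IGNORECASE)


def split_code_blocks(post):
    """Return ('text' | 'code', content) segments in document order."""
    segments, pos = [], 0
    for match in _CODE_TAG.finditer(post):
        if match.start() > pos:
            segments.append(("text", post[pos:match.start()]))
        # Drop the newlines that sit right next to the tags.
        segments.append(("code", match.group(1).strip("\n")))
        pos = match.end()
    if pos < len(post):
        segments.append(("text", post[pos:]))
    return segments
```

Only the "code" segments would become runnable boxes; "text" segments go through the normal plain-text rendering, and an unclosed [code] tag simply stays as text.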
Jan 31 2011
Stephan Soller <stephan.soller helionweb.de> writes:
Nice newsreader! Fast and does what it needs to do, and written in D. I 
like that. :)

I'm currently writing an NNTP web frontend (reading and posting) for my 
university. However it's written in PHP so it's not really fitting for a 
new D homepage. But I'm curious how you do web programming with D. Do 
you use CGI? How do you do all the HTTP stuff (parsing form data, etc.) 
and templating?

But back to the NNTP reader:

# HTML formatting

The work you put into formatting messages as HTML is impressive. The 
autodetection of source code could really come in handy. I found 
[Markdown][1] to work relatively well with ordinary mails so the syntax 
might contain a few good ideas for e.g. quotes, links, lists, etc.

[1]: http://daringfireball.net/projects/markdown/syntax

# Topic list

Right now you display the newest few messages on the newsgroup. Building 
a topic list gets quite a bit more complex. To get a proper topic list 
with pagination etc. I query the overview information of all (!) 
messages in a newsgroup with the "over" command (the digitalmars.com 
server supports the older "xover" which is the same). This contains the 
message ID and the references header which can be used to build a 
message tree. All messages on the root level of the tree are topics and 
it's easy to get the number of replies and the latest reply. It's a bit 
tricky sometimes but all other algorithms I came up with tend to lose 
some messages (e.g. if the topic post is deleted) or were even slower. 
The overview also contains the subject and from header and some other 
useful stuff. I suppose the current newsreader does something similar 
without caching and this might be the reason why it is so slow.

This message tree and the overview information however can be cached 
very easily. The tree can also be extended on the fly, e.g. check for 
new messages with the newnews command and add them to the tree. This 
might require some locking but at least in PHP flock() was sufficient 
for that.
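
To make the tree idea concrete, here is a minimal sketch in D of turning overview data into a topic list. It is not taken from either newsreader: the `Overview` struct, `Node` class, `buildTopics` and `countReplies` are hypothetical names, and it assumes the overview lines have already been fetched and split into fields. A message whose ancestors are all missing is treated as a topic of its own, which is one way to avoid losing replies when the topic post was deleted.

```d
import std.array : split;
import std.stdio : writeln;

/// The overview fields needed for threading (hypothetical struct).
struct Overview
{
    string messageId;
    string subject;
    string references; // space-separated message IDs, oldest first
}

/// One message in the tree, with its direct replies.
class Node
{
    Overview info;
    Node[] replies;
    this(Overview info) { this.info = info; }
}

/// Builds the tree and returns the root-level nodes, i.e. the topics.
Node[] buildTopics(Overview[] entries)
{
    Node[string] byId;
    foreach (e; entries)
        byId[e.messageId] = new Node(e);

    Node[] topics;
    foreach (e; entries)
    {
        Node parent = null;
        // Walk the references from newest to oldest and attach to the
        // nearest ancestor that is still available.
        foreach_reverse (r; e.references.split())
        {
            if (r == e.messageId)
                continue; // ignore a message that references itself
            if (auto p = r in byId)
            {
                parent = *p;
                break;
            }
        }
        if (parent is null)
            topics ~= byId[e.messageId];
        else
            parent.replies ~= byId[e.messageId];
    }
    return topics;
}

/// Counts all replies below a node, at any depth.
size_t countReplies(Node n)
{
    size_t total = n.replies.length;
    foreach (r; n.replies)
        total += countReplies(r);
    return total;
}

void main()
{
    auto topics = buildTopics([
        Overview("<a>", "Topic", ""),
        Overview("<b>", "Re: Topic", "<a>"),
        Overview("<c>", "Re: Topic", "<a> <b>"),
        Overview("<d>", "Re: Gone", "<deleted>"),
    ]);
    foreach (t; topics)
        writeln(t.info.subject, " (", countReplies(t), " replies)");
}
```

Since the result is just a map plus a list of roots, it can be serialized for the cache and extended later by running the same attach step for each new message.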

# Cache invalidation

The problem with the message tree cache or cached messages in general is 
the invalidation. Looks like the digitalmars news server does not delete 
that many messages so this might not be much of a problem. How do you 
handle this right now?

# D website

I took a look at your current version of the D website 
(http://arsdnet.net/d-web-site/). I really like the layout. Looks good 
to get started with D. Just two small things:

- The compile and run button is a bit of a security risk. I was able to 
read the /etc/passwd file for example. Maybe it's possible to lock down 
the compiled binaries with SELinux. Denial of service attacks (e.g. 
endless loops) might still be a problem though. We built an "online D 
compiler" for a presentation at our university but didn't publish it 
because of these concerns.
- If you only display mails in the announcements which do not have a 
"References" header you will only get mails that started a new topic. 
This will filter out replies.


If you want some help I could do some stuff. I'm a bit short on time 
right now but since I'm building an NNTP reader in PHP anyway I might be 
able to help out with your D NNTP reader. I can also help with HTML and 
CSS stuff if you want: support for older browsers and older IE versions 
if there is much traffic with these browsers, or some minor design stuff 
(I'm not that much of a designer though). I might also start to look 
into SELinux…

Happy programming
Stephan Soller


Jan 31 2011
next sibling parent Adam Ruppe <destructionator gmail.com> writes:
 But I'm curious how you do web programming with D. Do you use CGI?

Yes, for most of my apps (some use a homegrown HTTP server instead, if persistence is necessary). The module is here: http://arsdnet.net/dcode/cgi.d

That same module works with standard CGI and with the embedded HTTP server, just with different constructors. The default one reads CGI variables, and the alternative takes the HTTP headers and body fed to it from the network class.
 How do you do all the HTTP stuff (parsing form data, etc.)

You can see in the code that it's pretty straightforward. With the CGI standard, the webserver passes you data through stdin and environment variables.

For GET and COOKIE variables, you check the relevant environment variable (QUERY_STRING and HTTP_COOKIE, respectively), then url decode them and use the resulting string arrays.

For POST, you first check the CONTENT_TYPE and CONTENT_LENGTH environment variables, then pull in data from stdin (same as any simple program, except you know the length you want too). The content type can be one of many options. Regular forms are application/x-www-form-urlencoded and you decode them identically to the query string.

My class puts them into an associative array, similar to PHP:

    immutable string[string] get;
    immutable string[string] post;
    immutable string[string] cookies;

Names can also be repeated in a web form. PHP does this with a naming convention: if you put [] after the name in the form, it loads up a dynamic array in the field. So name="mything[]", repeated, becomes $_POST["mything"], which is an Array.

I did it differently - there's simply an alternative variable to access them:

    immutable string[][string] getArray; // ditto for post

The names are preserved from the form exactly. This is the lowest level access: ?key=value is there as getArray["key"][0] == "value". I don't try to follow PHP's convention. (As you can see, getArray["key"][0] is always usable. But since I find repeated names relatively rare, I also offer plain get["key"] as a shortcut to it.)

Where PHP uses globals for this, I used class members. So you'd actually write:

    Cgi cgi = new Cgi();
    cgi.get["key"];

And so on.

That handles strings, but there are other content types too. The most common alternative is used for file uploads. File upload forms have a content type of multipart/form-data, which is a MIME style encoding, similar to email attachments.

The content type gives a boundary string. You search stdin for the boundary, then read some part headers, and finally the data, ending with the boundary string again. This continues until you hit the closing delimiter, "--" ~ boundary ~ "--".

Field names are no longer given by key=value like in urlencoding. They are passed as a field header, after the boundary, before the content. The original filename for file uploads is passed the same way.

The CGI class takes care of all this for you, loading up the same associative arrays you get with a normal form. If there are files uploaded though, you access them through:

    cgi.files["name_from_form"]

Which returns an UploadedFile struct. It includes the metadata passed along and the file's contents as a byte array.

(You can expect this wouldn't work for very large files. That's probably why PHP uses a temporary file, but I find that such a hassle that I wanted to avoid it. My class currently simply rejects files that are too big, since I've not needed to solve that problem yet! All my apps only accept small files to upload anyway, little spreadsheet attachments, photos, etc., all of which easily fit in memory.)

Anyway, saving a file is as simple as:

    std.file.write("some name", cgi.files["myfile"].content);

You can also use the member strings filename and contentType of that UploadedFile struct to get more info.

Writing response data back to the user's browser is a simple case of writing things to stdout. First come headers, then data. I abstracted this with the class too:

    cgi.write(); // writes response data, like php's echo

For headers, there are some specific functions to do it, or a generic header() method that works just like PHP's.

    cgi.setResponseLocation("/"); // does a 302 redirect
    cgi.setResponseContentType("image/png"); // tell the browser a png is coming
    cgi.write("hello!"); // write data

See the cgi.d file for details and more.

The reason I provide these instead of just letting the user code use writefln() or whatever directly is:

a) Isolate them from handling the headers. It isn't hard to do, but it is easy to make mistakes and it's a bit tedious. The class takes care of it for you every time.

b) writefln() won't work in the embedded server environment. cgi.write, on the other hand, will. (It implements this via a delegate passed to the constructor. It passes your data to the delegate, which is responsible for forwarding it to the network.)

Embedded server headers are slightly different than CGI headers too. The helper functions keep these changes from affecting user code. Switching from CGI to embedded server, if you use the class, is often as simple as changing the constructor call, keeping the rest of the code unchanged.

In theory, FastCGI or other protocols could be added through additional constructors too. I haven't done this myself though because plain old CGI is both well supported and quite fast, despite its reputation.

(I think CGI got the blame for Perl's slowness more than its own weaknesses. Yes, its startup and some parts of the output are slower than something like mod_php, but the program itself still runs as fast as it runs. For D, that means it blows PHP's speed out of the water. Startup time tends to have a disproportionate impact on benchmarks, because those benchmarks don't actually do anything interesting! Once your program does something useful, the time it spends doing real work will quickly outgrow the startup time, so that slight initial delay becomes irrelevant in the overall result.)
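
For readers who want to see the GET decoding step in isolation, here is a minimal sketch. It is not the code from cgi.d: `parseUrlEncoded` is a hypothetical helper, and it only assumes what is described above, namely that QUERY_STRING holds key=value pairs joined by & and that repeated names should end up in a string[][string].

```d
import std.array : replace, split;
import std.process : environment;
import std.stdio : writeln;
import std.string : indexOf;
import std.uri : decodeComponent;

/// Decodes "a=1&b=2" style data into name -> list of values.
/// Hypothetical helper, not part of cgi.d.
string[][string] parseUrlEncoded(string data)
{
    string[][string] result;
    foreach (pair; data.split("&"))
    {
        if (pair.length == 0)
            continue;
        auto eq = pair.indexOf('=');
        string key = eq == -1 ? pair : pair[0 .. eq];
        string value = eq == -1 ? "" : pair[eq + 1 .. $];
        // '+' means a space in form encoding, so it has to be
        // translated before the percent escapes are decoded.
        key = decodeComponent(key.replace("+", " "));
        value = decodeComponent(value.replace("+", " "));
        result[key] ~= value; // repeated names accumulate
    }
    return result;
}

void main()
{
    // Under CGI the web server sets QUERY_STRING; when the program
    // is run by hand it is usually unset, so default to empty.
    auto getArray = parseUrlEncoded(environment.get("QUERY_STRING", ""));
    foreach (key, values; getArray)
        writeln(key, " = ", values);
}
```

The same function would serve for a urlencoded POST body once CONTENT_LENGTH bytes have been read from stdin. Note that decodeComponent throws on malformed escapes, so real code would want to catch that.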
 and templating?

I have two methods that I use together: a TemplatedDocument class, and a plain old (well, extended) DOM style Document class. TemplatedDocument extends Document, so everything about the latter applies to the former.

You start with a well-formed HTML file. This might be built out of the text of several files, e.g.:

    auto document = new TemplatedDocument(
        std.file.readText("header.html") ~
        std.file.readText("mypage.html") ~
        std.file.readText("footer.html"));

Now you have a DOM object that you can grow or modify with your content. Building a tree with the standard DOM is tedious, but I have some extensions to help with that.

    document.getElementById("some-holder").innerText = my_name;

The innerText method, borrowed from Internet Explorer, is one of the most useful. You can get or set plain text, with the object taking care of encoding.

Alternatively, the HTML file might have a placeholder:

    <h1>{$title}</h1>

And you fill those into the document via a simple AA:

    document.vars["title"] = "My cool site & stuff";

Document vars are automatically encoded for HTML before being output. The only way to put raw html in the output is to use the innerHTML DOM method and friends or to use the innerRawSource extension. (innerHTML may try to check for well-formedness. innerRawSource just takes your word for it and doesn't attempt to build an object tree.)

This is meant to ensure the easy way is the correct way. Explicit encoding or decoding is rarely necessary.

My template class offers no way to loop. All it does is that placeholder variable thing. To loop, do it yourself. One of my pages defines a custom html tag:

    <repeat times="10"> hello! </repeat>

Then you can implement it in the code like so:

    foreach(e; document.getElementsByTagName("repeat")) {
        string html;
        foreach(i; 0 .. to!int(e.times)) // attributes are accessible like in javascript
            html ~= e.innerHTML; // get the inner contents
        e.outerHTML = html; // replace this tag with those contents
    }

(For fancier modifications, there's also a getElementsBySelector, letting you do CSS style loops.)

However, I'm more likely to just build those portions with the DOM, with the template saying where it goes:

    <div id="messages-holder"></div>

    auto holder = document.getElementById("messages-holder");
    foreach(message; messages) {
        // a shortcut method to create a child, set its text, and append it all in one
        holder.addChild("p", message);
    }

Some people believe this is no better than putting html output in your code as strings; it basically does the same thing. But I don't agree this is a big problem:

a) You can still keep it separate with functions. See nntp.d for an example of this in practice. getMessage() returns a Post object. getMessagePage() takes a Post object and returns a Document.

(Note this is handled automatically by the FancyMain mixin, defined in web.d, used in nntp.d. I've been mostly describing the lower level classes in this post: cgi.d and dom.d. web.d builds upon them to automate a lot of common tasks and to try to force more MVC separation.

For an example of why this is cool, check this out:

http://arsdnet.net/d-web-site/nntp/get-message?newsgroup=digitalmars.D&messageId=<ii4993%241l5b%241%40digitalmars.com>&format=json

See the "&format=json" at the end? It outputs that message object as json instead of the HTML page! There's no code in nntp.d to do this - it's handled automatically by web.d. There's a variety of formats available. Try table, xml, string, and html too. They don't all work as well here because Post is a class rather than a struct (web.d currently works much better with simple structs than with fancy structs or classes), but in the future or in other projects, they would work too.)

Back to templates in general, I also think HTML itself really is neither layout nor style. Layout is done by the skeleton html files, not the in-code DOM, and style is done with CSS. For example, nntp.d doesn't do inline styles. It just describes the data with tags and attributes, letting CSS finish the job of colors and other such details.

My large work project currently has 6 skins for it, all written independently of the code. None of the D had to be changed, despite me using the DOM to build out content loops.

The DOM extensions also save huge amounts of time. Take this:

    auto form = cast(Form) document.getElementById("my-form");
    foreach(k, v; cgi.post)
        form.setValue(k, v);

That takes all the POST variables and sets them in the given form. Whether the form is built out of inputs, radio boxes, selects, textareas - it doesn't matter. The Form class abstracts it all away to a uniform interface. (If it doesn't find a matching field in the given HTML, it automatically appends an <input type="hidden"> with the given values.)

Having to write:

    <input type="text" name="something" value="<?= $something ?>" />

Oops, I didn't htmlEntitiesEncode that something, XSS time!

    <select name="Something">
        <option value="a" <?php if($a == "a") echo "selected=\"selected\"";?>>My Option</option>
        <option value="b" <?php if($a == "b") echo "selected=\"selected\"";?>>My Option</option>
    </select>

Sucks ass compared to

    form.setValue("Something", "a");

That alone infinitely outweighs any counter argument I've heard to my use of a customized DOM.

I've been typing this for a long time, so I'm going to break up comments on the rest of your post into a separate message.
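
The "encode by default" idea behind the {$name} placeholders can be shown without any DOM at all. The following is only a sketch under that assumption: `htmlEncode` and `fillTemplate` are hypothetical stand-ins working on plain strings, not the TemplatedDocument implementation from dom.d.

```d
import std.array : replace;
import std.stdio : writeln;

/// Escapes the characters that are special in HTML text and
/// attribute values (hypothetical helper).
string htmlEncode(string s)
{
    // '&' goes first so the entities added below are not re-escaped.
    return s.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;");
}

/// Fills {$name} placeholders, encoding every value on the way in.
string fillTemplate(string html, string[string] vars)
{
    foreach (name, value; vars)
        html = html.replace("{$" ~ name ~ "}", htmlEncode(value));
    return html;
}

void main()
{
    auto page = fillTemplate("<h1>{$title}</h1>",
                             ["title": "My cool site & stuff"]);
    writeln(page); // prints: <h1>My cool site &amp; stuff</h1>
}
```

Because the only entry point encodes its input, forgetting to escape is not possible through this path; raw HTML would need a separate, deliberately named function, which mirrors the innerHTML / innerRawSource split described above.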
Jan 31 2011
prev sibling next sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Stephan Soller wrote:
 Cache invalidation
 How do you handle this right now?

I don't. My program assumes that once it has a message, it never needs to look to the server for it again. (This is probably because of my own experience with mailing lists - I use the mailing list interface to the newsgroup for reading. With them, once the email is sent, it isn't going to change. I just assumed the newsgroup worked the same way...)
 D website
  I really like the layout.

The credit for that goes to Christopher Bergqvist. See the thread "Suggestion: New D front page" in the main newsgroup. He posted a png outlining his idea and I just ran with it :)
 The compile and run button is a bit of a security risk. I was able
 to read the /etc/passwd file for example.

Yeah, but that's normal on a multi user linux system. It doesn't really break anything. But, I moved the compile and run program to a separate VM to further limit it. If you read that entire filesystem, it doesn't really matter - it's an out of the box Slackware install. There's nothing sensitive or private on it at all. (Like its domain name says, it is completely expendable info!)
 Denial of service attacks (e.g.
 endless loops) might still be a problem though.

I think this is solved with my use of setrlimit. If a process eats more than 5 seconds of CPU time, the operating system kills it. The limits are also set to 16 MB of RAM, 16 kb files, 3 forks, and a bunch of other things. (This might be interesting to test some programs - it will actually get out of memory exceptions pretty easily!)

Write access is also limited to a single directory, in addition to that individual size limit. Filling up the disk shouldn't be possible.

The operating system firewall prevents most network activity, incoming and outgoing. You can play with sockets, but only if they are working with localhost, and even then, they aren't allowed to access the ssh port. Running a spam bot off it is impossible.

More than this, the VM is also limited. I set its memory and CPU limits to about 1/5 the resources of the physical server. So if you did manage to get root and max out your program, it won't have a significant impact on the other things running with it (all low traffic websites). An external firewall serves as layer 2 to protect against spambots.

Finally, I did a VM snapshot after setting it up. I'm considering running a scheduled script on my computer to blank and reset that VM every night. Then, if you got root and worked around my other restrictions, it'd be a temporary victory anyway, just until I revert the snapshot again.

All in all, I think I have a pretty safe setup. If I'm proven wrong, plan B is to use the ideone API instead.
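
As an illustration of the setrlimit part only, here is a minimal sketch using the POSIX bindings in druntime. The numbers are the ones quoted above; which resource constants the real setup uses is not stated, so mapping "16 MB of RAM" to RLIMIT_AS and "16 kb files" to RLIMIT_FSIZE is an assumption, and `limit` and `applySandboxLimits` are hypothetical names. The fork limit is left out because RLIMIT_NPROC is not part of POSIX.

```d
import core.sys.posix.sys.resource;
import std.exception : errnoEnforce;
import std.stdio : writeln;

/// Sets both the soft and the hard limit for one resource
/// (hypothetical helper). A lowered hard limit cannot be raised
/// again by an unprivileged process.
void limit(int resource, ulong value)
{
    rlimit lim;
    lim.rlim_cur = cast(rlim_t) value;
    lim.rlim_max = cast(rlim_t) value;
    errnoEnforce(setrlimit(resource, &lim) == 0, "setrlimit failed");
}

/// Meant to run in the child process, after fork and before exec
/// of the untrusted program, so the limits apply to that program.
void applySandboxLimits()
{
    limit(RLIMIT_CPU, 5);                // seconds of CPU time
    limit(RLIMIT_AS, 16 * 1024 * 1024);  // bytes of address space (assumed mapping)
    limit(RLIMIT_FSIZE, 16 * 1024);      // bytes per written file (assumed mapping)
}

void main()
{
    // Only the CPU limit is applied here, so this demo process
    // keeps enough memory to finish printing.
    limit(RLIMIT_CPU, 5);

    rlimit current;
    errnoEnforce(getrlimit(RLIMIT_CPU, &current) == 0, "getrlimit failed");
    writeln("CPU limit in seconds: ", current.rlim_cur);
}
```

When the CPU limit is exceeded the kernel signals the process, which matches the "operating system kills it" behaviour described above; the memory limit instead shows up inside the program as failed allocations.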
 If you only display mails in the announcements which do not have a
 "References" header you will only get mails that started a new topic.
 This will filter out replies.

Yes, that's what I wanted. The idea is to show a feed of new things coming out, rather than new replies on old ideas. This way, the homepage shows the most variety.
 Happy programming

Thanks! If I have any questions, I'll be sure to ask. I've gotta get back to my real work soon though (stupid Monday) so finishing this will probably have to wait until next weekend.
Jan 31 2011
parent Stephan Soller <stephan.soller helionweb.de> writes:
Quite some impressive stuff. Actually I'm somewhat blown away. Looks 
like I'm going to try CGI with D in the near future. :)

The compile & run functionality looks very solid. Sorry for assuming bad 
security. setrlimit, extra VM, internal and external firewalls… looks 
like it's as solid as it can get.

Happy programming
Stephan


Feb 02 2011
prev sibling next sibling parent Trass3r <un known.com> writes:
Very interesting stuff.
May D kick php out of business ;)
Jan 31 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Word wrapping, please!

Looks cool so far.
Jan 31 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Adam Ruppe wrote:
 In the other newsgroup, I've been talking about a little
 web news program I've been writing as a spinoff of the
 potential new homepage idea.

That is great news. I've been wanting to do one for years! I haven't looked much at yours yet, but here's my ideas anyway :-)

1. Can use web interface or nntp interface
2. web interface looks sort of like reddit, i.e. all posts on a thread
3. users can post anonymously
4. web interface supports logins - logged in users can vote up or down on posts
5. web interface can mark posts as read or unread - fixing my beef with reddit that there's no reasonable way to scan a thread for new posts
6. an easy way for moderators to delete spam
7. runs on 64 bit FreeBSD (what the Digital Mars server runs on), yes, I know that means I have to get 64 bit dmd on FreeBSD working!

I can contribute the code that generates the D archive pages from the news postings.
Jan 31 2011
next sibling parent reply Trass3r <un known.com> writes:
btw Gour, neither web interface can display your messages, they appear empty.
Feb 01 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Trass3r" <un known.com> wrote in message 
news:ii8n1u$qoq$1 digitalmars.com...
 btw Gour, neither web interface can display your messages, they appear 
 empty.

That's interesting. For me, in Outlook Express, his messages show up as blank too, *but* the message does show up as an attachment (With a filename matching the regex "ATT[0-9]+\.txt"). There's a couple other people too whose messages also show the same way for me: "Jerome M. Berger" and Russel Winder.
Feb 01 2011
parent "Nick Sabalausky" <a a.a> writes:
"Denis Koroskin" <2korden gmail.com> wrote in message 
news:op.vp79ixtao7cclz korden-pc...
 On Tue, 01 Feb 2011 14:49:47 +0300, Nick Sabalausky <a a.a> wrote:

 "Trass3r" <un known.com> wrote in message
 news:ii8n1u$qoq$1 digitalmars.com...
 btw Gour, neither web interface can display your messages, they appear
 empty.

That's interesting. For me, in Outlook Express, his messages show up as blank too, *but* the message does show up as an attachment (With a filename matching the regex "ATT[0-9]+\.txt"). There's a couple other people too whose messages also show the same way for me: "Jerome M. Berger" and Russel Winder.

Shows up perfectly fine here, in Opera, and headers look well-formed, too. I recommend upgrading your IE6-grade newsreader to something more advanced :p

Heh, yea, I do need something better. But last time I looked I wasn't able to find anything I liked much. I've been hoping to find the time to just make one :)
Feb 01 2011
prev sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Trass3r Wrote:

 btw Gour, neither web interface can display your messages, they appear empty.

I have a special reader for his and the other's posts: http://news.gmane.org/gmane.comp.lang.d.announce Note that I continue to use Web-News because the threaded view at the bottom of every message is highly beneficial, and the main page will break from the top thread to show new posts within a given thread. These two things have made Web-News preferable to the other alternatives.
Feb 01 2011
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 01 Feb 2011 14:49:47 +0300, Nick Sabalausky <a a.a> wrote:

 "Trass3r" <un known.com> wrote in message
 news:ii8n1u$qoq$1 digitalmars.com...
 btw Gour, neither web interface can display your messages, they appear
 empty.

That's interesting. For me, in Outlook Express, his messages show up as blank too, *but* the message does show up as an attachment (With a filename matching the regex "ATT[0-9]+\.txt"). There's a couple other people too whose messages also show the same way for me: "Jerome M. Berger" and Russel Winder.

Shows up perfectly fine here, in Opera, and headers look well-formed, too. I recommend upgrading your IE6-grade newsreader to something more advanced :p
Feb 01 2011
prev sibling next sibling parent reply Eric Poggel <dnewsgroup2 yage3d.net> writes:
On 1/31/2011 5:28 PM, Walter Bright wrote:
 Adam Ruppe wrote:
 In the other newsgroup, I've been talking about a little
 web news program I've been writing as a spinoff of the
 potential new homepage idea.

That is great news. I've been wanting to do one for years! I haven't looked much at yours yet, but here's my ideas anyway :-) 1. Can use web interface or nntp interface 2. web interface looks sort of like reddit, i.e. all posts on a thread 3. users can post anonymously 4. web interfaces supports logins - logged in users can vote up or down on posts 5. web interface can mark posts as read or unread - fixing my beef with reddit that there's no reasonable way to scan a thread for new posts 6. an easy way for moderators to delete spam 7. runs on 64 bit FreeBSD (what the Digital Mars server runs on), yes, I know that means I have to get 64 bit dmd on FreeBSD working! I can contribute the code that generates the D archive pages from the news postings.

I hate to mention this in light of Adam's work, but Reddit is open source--why not run our own deployment of it for D? It seems that these changes would require minimal changes to the code base, except for nntp access. But I guess I don't understand the benefits of it over a web-based solution.
Feb 03 2011
parent reply Adam Ruppe <destructionator gmail.com> writes:
Eric Poggel wrote:
 I hate to mention this in light of Adam's work, but Reddit is open
 source--why not run our own deployment of it for D?

I *really* dislike tree style interfaces. I find them incredibly hard to navigate. Of course, I'm fairly unlikely to use the web interface much anyway (whether mine or someone else's - I prefer my mail client most of the time), but still, it would be nice if it didn't suck.

Anyway, I did a little more work on my thing this morning: http://arsdnet.net/d-web-site/nntp/thread-index?newsgroup=digitalmars.D

There are now [Tree] and [Linear] links on the right to view the whole thread at once. Any ideas on how to improve that? I copied a few basic elements of reddit style sites, but I'm thinking that view works best for very short messages.
Feb 03 2011
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 03.02.2011 22:00, schrieb Lars T. Kyllingstad:
 On Thu, 03 Feb 2011 17:16:58 +0000, Adam Ruppe wrote:
 
 Eric Poggel wrote:
 I hate to mention this in light of Adam's work, but Reddit is open
 source--why not run our own deployment of it for D?

I *really* dislike tree style interfaces. I find them incredibly hard to navigate. Of course, I'm fairly unlikely to use the web interface much anyway (whether mine or someone else - I prefer my mail client most the time), but still, it would be nice if it didn't suck. Anyway, I did a little more work on my thing this morning: http://arsdnet.net/d-web-site/nntp/thread-index?newsgroup=digitalmars.D There's now [Tree] and [Linear] links on the right to view the whole thread at once. Any ideas on how to improve that? I copied a few basic elements of reddit style sites, but I'm thinking that view works best for very short messages.

I agree. Subject, author and date should be shown in a tree view, but you should never display more than one message body at a time. The average message on this forum is far too long for that. -Lars

I find it annoying to open each message in a thread manually. I prefer a fully expanded thread with all bodies (or maybe partially expanded by subthreads or something when it's too big). This makes reading longer threads much easier.

I haven't found a non-web-based news/mail client that does this yet, but going to the next message with 'n' in Thunderbird certainly is less painful than clicking the next message I want to read on a website.

Cheers,
- Daniel
Feb 03 2011
prev sibling next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Adam Ruppe Wrote:

 Eric Poggel wrote:
 I hate to mention this in light of Adam's work, but Reddit is open
 source--why not run our own deployment of it for D?

I *really* dislike tree style interfaces. I find them incredibly hard to navigate.

I've usually already read major branches and just need the latest posts way down there. I don't care too much how it is displayed but I want to be able to follow a thread. And this was one of Walter's complaints about reddit.
 Any ideas on how to improve that? I copied a few basic elements
 of reddit style sites, but I'm thinking that view works best
 for very short messages.

If you are going to do a tree view, you're going to have to make it more like reddit: only nest so deep. A reddit frontend for digitalmars.D would be interesting. But there are major issues that would need to be resolved for it to be good as a news reader.
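
To make "only nest so deep" concrete, here is a minimal sketch. It is not reddit's code and not Adam's; the `Post` type, the `render` helper and the depth cap of 3 are all made up for illustration. The idea is simply that replies keep their order, but the indent level stops growing once it hits the cap.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 3  # assumed cap; pick whatever reads well on screen


@dataclass
class Post:
    # Hypothetical minimal post: just an author and its direct replies.
    author: str
    replies: list = field(default_factory=list)


def render(post, depth=0, max_depth=MAX_DEPTH):
    """Return (indent_level, author) rows in thread order.

    The indent level follows the reply depth but never exceeds max_depth,
    so very deep sub-threads are shown flat at the deepest allowed level.
    """
    rows = [(min(depth, max_depth), post.author)]
    for reply in post.replies:
        rows.extend(render(reply, depth + 1, max_depth))
    return rows
```

With a cap of 2, a chain a -> b -> c -> d -> e would come out at indent levels 0, 1, 2, 2, 2.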
Feb 03 2011
prev sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Well, it posted, but evidently still has a few bugs. As you can see,
the newlines got butchered with the real data and some headers
didn't come out right.

Newlines have been the hardest thing in all of this. They sometimes
matter in plain text, but sometimes are just an artifact of wrapping.
They sometimes matter in bbcode, but sometimes should be collapsed
into the surrounding area.

And, of course, they matter to the NNTP network talk. But, it looks
like my code is just collapsing them a little too often.

Other than that, it seems to have worked well, and the reading code
for UTF-8 and quoted-printable did good work. One more weekend and
we should be set with all the fundamentals. Then, wrap up the
gravy and style and I'll call it done.
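
For anyone curious what the quoted-printable step involves, here is a small sketch. It uses Python's standard `quopri` module rather than the D code in nntp.d, and `decode_body` is just a name chosen for this example. Quoted-printable writes `=XX` for a byte given in hex, and a lone `=` at the end of a line is a soft break that joins the line with the next one.

```python
import quopri


def decode_body(raw: bytes, charset: str = "utf-8") -> str:
    """Decode a quoted-printable body, then decode the bytes as text.

    `charset` should come from the article's Content-Type header;
    utf-8 is only an assumed default here.
    """
    return quopri.decodestring(raw).decode(charset)
```

Run over a body containing `server=3Dnews.digitalmars.com`, this should give back `server=news.digitalmars.com`, with any soft line breaks removed.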
Feb 06 2011
parent Adam Ruppe <destructionator gmail.com> writes:
 But, it looks like my code is just collapsing them a little too often.

LOL: the reason it worked in my tests but not for the live post?

\n\n != \r\n\r\n

Stupid line ending bullshit. But with that fixed, I think all my woes are gone... I'll try the headers again later, but that should be fixed too.

Take a look:

http://arsdnet.net/d-web-site/nntp/get-message?newsgroup=digitalmars.D.announce&messageId=%3Cii592i%24c09%241%40digitalmars.com%3E

You can see the reply form at the bottom, auto-filling bbcode from the original plain text message.

Feel free to post as much as you want as long as you do *not* commit to the news server - don't want to spam it until I'm a little more confident in the code... But if you post to the local database, that's ok. Be aware I delete it from time to time so don't put anything you want to keep.

You might try posting some bbcode, then hit View Original in the upper right. That's the plain text version - what would be posted to the server. It's fairly beautiful.

btw, the pendulum of my brain swung back to light backgrounds, so it is right now black on white again. I kept the grey code too. Soon enough, I figure I'll just make it user selectable, defaulting to not specifying colors at all, so your browser defaults work.
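
The `\n\n != \r\n\r\n` problem is easy to reproduce. The sketch below is only an illustration of that class of bug, not the actual fix in nntp.d, and `split_message` is a hypothetical helper. Test data typed into an editor usually has bare `\n` line endings, while articles coming off the wire use `\r\n`, so a search for `"\n\n"` never finds the blank line in a live post. Normalizing the line endings first makes both cases behave the same.

```python
def split_message(raw: str):
    """Split a raw article into (headers, body) at the first blank line.

    Line endings are normalized to "\n" first, so input with "\r\n"
    (wire format) and input with "\n" (local test data) split alike.
    """
    normalized = raw.replace("\r\n", "\n")
    headers, _, body = normalized.partition("\n\n")
    return headers, body


# The naive check fails on wire-format text: there is no "\n\n" in it.
assert "\n\n" not in "Subject: hi\r\n\r\nbody"
```

When sending back to the server, the reverse conversion to `\r\n` would still be needed.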
Feb 06 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 03 Feb 2011 17:16:58 +0000, Adam Ruppe wrote:

 Eric Poggel wrote:
 I hate to mention this in light of Adam's work, but Reddit is open
 source--why not run our own deployment of it for D?

I *really* dislike tree style interfaces. I find them incredibly hard to navigate. Of course, I'm fairly unlikely to use the web interface much anyway (whether mine or someone else - I prefer my mail client most the time), but still, it would be nice if it didn't suck. Anyway, I did a little more work on my thing this morning: http://arsdnet.net/d-web-site/nntp/thread-index?newsgroup=digitalmars.D There's now [Tree] and [Linear] links on the right to view the whole thread at once. Any ideas on how to improve that? I copied a few basic elements of reddit style sites, but I'm thinking that view works best for very short messages.

I agree. Subject, author and date should be shown in a tree view, but you should never display more than one message body at a time. The average message on this forum is far too long for that. -Lars
Feb 03 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
8. Search functionality

digitalmars uses google for searching the NG archive, but I've no idea
how to do custom searches. I.e. I'd like to search for a keyword in
the topic title only, how would I do that?
Jan 31 2011
prev sibling next sibling parent Gour <gour atmarama.net> writes:
On Mon, 31 Jan 2011 14:28:37 -0800
Walter Bright <newshound2 digitalmars.com> wrote:

 7. runs on 64 bit FreeBSD (what the Digital Mars server runs on),
 yes, I know that means I have to get 64 bit dmd on FreeBSD working!

You've made my day. :-)

Sincerely,
Gour

--
Gour | Hlapicina, Croatia | GPG key: CDBF17CA
Jan 31 2011
prev sibling next sibling parent reply Trass3r <un known.com> writes:
Speaking of newsgroup web interface, interestingly while the main D site points
to this crappy reader:
http://www.digitalmars.com/pnews/indexing.php?server=news.digitalmars.com&group=digitalmars.D.announce
there still is a hidden one which is much better imho:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce
Feb 01 2011
next sibling parent dennis luehring <dl.soluz gmx.net> writes:
On 01.02.2011 09:37, Trass3r wrote:
 Speaking of newsgroup web interface, interestingly while the main D site
points to this crappy reader:
http://www.digitalmars.com/pnews/indexing.php?server=news.digitalmars.com&group=digitalmars.D.announce
 there still is a hidden one which is much better imho:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce

And there is the updated 1.6.4 version, which looks a little better:

http://sourceforge.net/projects/web-news/

Why not use this one and style it like the D main page?
Feb 01 2011
prev sibling next sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Trass3r Wrote:

 Speaking of newsgroup web interface, interestingly while the main D site
points to this crappy reader:
http://www.digitalmars.com/pnews/indexing.php?server=news.digitalmars.com&group=digitalmars.D.announce
 there still is a hidden one which is much better imho:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce

Wow, this one is a lot better than the other one! And its proper name is "Web-News". Now I feel bad for calling the other one web news, it's like slandering this much better program. I still think we can do better though!
Feb 01 2011
parent dennis luehring <dl.soluz gmx.net> writes:
On 01.02.2011 14:42, Adam Ruppe wrote:
 Trass3r Wrote:

  Speaking of newsgroup web interface, interestingly while the main D site
points to this crappy reader:
http://www.digitalmars.com/pnews/indexing.php?server=news.digitalmars.com&group=digitalmars.D.announce
  there still is a hidden one which is much better imho:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce

Wow, this one is a lot better than the other one! And its proper name is "Web-News". Now I feel bad for calling the other one web news, it's like slandering this much better program. I still think we can do better though!

But wouldn't it be better to use the newer frontend now, and then switch to your implementation once your development is finished (I mean the feature-stable release, not the very first alpha)?
Feb 01 2011
prev sibling next sibling parent Andrew Wiley <debio264 gmail.com> writes:
On Tue, Feb 1, 2011 at 9:44 AM, Andrej Mitrovic
<andrej.mitrovich gmail.com> wrote:

 On 2/1/11, Trass3r <un known.com> wrote:
 Speaking of newsgroup web interface, interestingly while the main D site
 points to this crappy reader:

 there still is a hidden one which is much better imho:


That one has horrible bugs. You'll click on a topic, then try to read a reply, and it shoots you to some random topic 4+ years ago. Happens all the time. I only use it to post to NG since using Gmail directly doesn't show up my own posts (this is a known gmail bug).

I'm not sure what you mean. I have my Gmail account subscribed to the mailing lists, and everything seems fine?
Feb 01 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/1/11, Andrew Wiley <debio264 gmail.com> wrote:
 I'm not sure what you mean. I have my Gmail account subscribed to the
 mailing lists, and everything seems fine?

When you start a new topic it doesn't show up in Gmail. Well, maybe they've fixed that recently? I haven't tried in a while, but it didn't work before.
Feb 01 2011
prev sibling next sibling parent Andrew Wiley <debio264 gmail.com> writes:
On Tue, Feb 1, 2011 at 10:06 AM, Andrej Mitrovic
<andrej.mitrovich gmail.com> wrote:

 On 2/1/11, Andrew Wiley <debio264 gmail.com> wrote:
 I'm not sure what you mean. I have my Gmail account subscribed to the
 mailing lists, and everything seems fine?

When you start a new topic it doesn't show up in Gmail. Well, maybe they've fixed that recently? I haven't tried in a while, but it didn't work before.

The email doesn't show up in the inbox until someone replies. This behavior makes sense to me, at least, because sent messages go to "Sent Mail" and don't appear in the inbox until they become conversations between multiple people.

If that frustrates you, well, it's how email clients work because you want to separate the ongoing discussions from the send-and-forget messages.
Feb 01 2011
prev sibling next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Andrej Mitrovic Wrote:

 That one has horrible bugs. You'll click on a topic, then try to read
 a reply, and it shoots you to some random topic 4+ years ago. Happens
 all the time. I only use it to post to NG since using Gmail directly
 doesn't show up my own posts (this is a known gmail bug).

I've only had this issue if I change the newsgroup in another tab/window.
Feb 01 2011
prev sibling parent reply Trass3r <un known.com> writes:
 That one has horrible bugs. You'll click on a topic, then try to read
 a reply, and it shoots you to some random topic 4+ years ago. Happens
 all the time. I only use it to post to NG since using Gmail directly
 doesn't show up my own posts (this is a known gmail bug).

Didn't occur to me so far.
Feb 01 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Trass3r" <un known.com> wrote in message 
news:ii9dlt$2548$1 digitalmars.com...
 That one has horrible bugs. You'll click on a topic, then try to read
 a reply, and it shoots you to some random topic 4+ years ago. Happens
 all the time. I only use it to post to NG since using Gmail directly
 doesn't show up my own posts (this is a known gmail bug).

Didn't occur to me so far.

Happened to me every time I used it. It has been a long time since I last tried, though.
Feb 01 2011
prev sibling parent Kagamin <spam here.lot> writes:
Trass3r Wrote:

 That one has horrible bugs. You'll click on a topic, then try to read
 a reply, and it shoots you to some random topic 4+ years ago. Happens
 all the time. I only use it to post to NG since using Gmail directly
 doesn't show up my own posts (this is a known gmail bug).

Didn't occur to me so far.

After the cookie times out, it resets the newsgroup to digitalmars.D but keeps the post ids, which are group-specific. As .D has many more posts than the other groups, fetching a post with a small id from it returns a very old post.
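
A small sketch of that failure mode, with made-up data. This is not Web-News code; the article numbers, the texts and the function names are all hypothetical. Because article numbers only mean something together with their group, a lookup that takes the group from an expired session and falls back to a default will quietly return a different article.

```python
# Hypothetical store: the same article number exists in two groups.
ARTICLES = {
    ("digitalmars.D", 120): "very old post from years ago",
    ("digitalmars.D.announce", 120): "recent announcement",
}


def lookup(group, artnum):
    """Fetch an article by its (group, number) pair."""
    return ARTICLES.get((group, artnum))


def session_lookup(session, artnum, default_group="digitalmars.D"):
    # Bug pattern: the group lives only in the session. Once the session
    # is gone, the default group is used with the old article number.
    return lookup(session.get("group", default_group), artnum)


def url_lookup(group, artnum):
    # Safer pattern: the group travels with the number (e.g. in the link),
    # so an expired session cannot change which article is meant.
    return lookup(group, artnum)
```

With an empty session, `session_lookup({}, 120)` lands on the old digitalmars.D post, while `url_lookup("digitalmars.D.announce", 120)` still finds the announcement.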
Feb 03 2011
prev sibling next sibling parent Gour <gour atmarama.net> writes:
On Tue, 01 Feb 2011 05:25:02 -0500
Trass3r <un known.com> wrote:

 btw Gour, neither web interface can display your messages, they
 appear empty.

Hmmm...what about:

http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D.announce&artnum=20045

Sincerely,
Gour

--
Gour | Hlapicina, Croatia | GPG key: CDBF17CA
Feb 01 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/1/11, Trass3r <un known.com> wrote:
 Speaking of newsgroup web interface, interestingly while the main D site
 points to this crappy reader:
 http://www.digitalmars.com/pnews/indexing.php?server=news.digitalmars.com&group=digitalmars.D.announce
 there still is a hidden one which is much better imho:
 http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce

That one has horrible bugs. You'll click on a topic, then try to read a reply, and it shoots you to some random topic 4+ years ago. Happens all the time. I only use it to post to NG since using Gmail directly doesn't show up my own posts (this is a known gmail bug).
Feb 01 2011