digitalmars.D.bugs - [Issue 786] New: the \ EndOfFile EscapeSequence in double-quoted strings doesn't work
- d-bugmail puremagic.com (47/47) Jan 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (15/15) Jan 03 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (6/6) Jan 06 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (13/13) Feb 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (4/4) Feb 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (5/5) Feb 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (5/5) Feb 03 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (6/6) Feb 03 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (8/8) Feb 04 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
http://d.puremagic.com/issues/show_bug.cgi?id=786

           Summary: the \ EndOfFile EscapeSequence in double-quoted strings doesn't work
           Product: D
           Version: 0.178
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Keywords: rejects-valid, spec
          Severity: normal
          Priority: P3
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: thecybershadow gmail.com

Spec non-conformance, I believe.

Spec: http://www.digitalmars.com/d/lex.html#StringLiteral

Program:

    void main()
    {
        char[] eof_literal = "\"; // the character after the backslash is \u001A, as per the specs
    }

Compiler output:

    C:\...>dmd lexical.d
    lexical.d(3): unterminated string constant starting at lexical.d(3)
    lexical.d(3): semicolon expected, not 'EOF'
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement
    lexical.d(3): found 'EOF' instead of statement

(that's 19 repeating lines)
Jan 02 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com

------- Comment #1 from smjg iname.com 2007-01-03 04:01 -------
"End of File

    EndOfFile:
        physical end of the file
        \u0000
        \u001A
"

AIUI, locating the end of the code conceptually happens before tokenization. But indeed, the spec isn't crystal clear on this.
Jan 03 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

------- Comment #2 from thomas-dloop kuehne.cn 2007-01-06 15:46 -------
Intermingling EOF detection with tokenisation would require quite a few changes within DMD, and it makes no sense to me, as it would allow reading past the physical end of the file.
Jan 06 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

------- Comment #3 from bugzilla digitalmars.com 2007-02-02 21:34 -------
0x1A is listed in lex.html as 'end of file', which trumps any token. I think the spec is reasonably clear on this: "The source text is terminated by whichever comes first." The reason for this is that some (old) text editors put out a 0x1A to mark end of file.

Not a bug.
Feb 02 2007
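The rule quoted in comment #3 ("The source text is terminated by whichever comes first") can be illustrated with a small sketch. This is not DMD's code; sourceText is a hypothetical helper written for this illustration, and it assumes the whole file is already in memory as a string. It only shows why the closing quote after the 0x1A byte never reaches the tokenizer.

```d
/// Hypothetical helper, not taken from DMD: returns the part of `src`
/// that precedes the first EndOfFile marker (\u0000 or \u001A).
/// If neither marker occurs, the physical end of the buffer ends the text.
string sourceText(string src)
{
    foreach (i, char c; src)
    {
        if (c == '\0' || c == '\x1A')
            return src[0 .. i]; // everything from the marker onward is dropped
    }
    return src;
}

unittest
{
    // The literal from the report: quote, backslash, 0x1A, quote, semicolon.
    // After truncation only the opening quote and the backslash remain,
    // which is why the string is seen as unterminated.
    assert(sourceText("\"\\\x1A\";") == "\"\\");
    // Without a marker, the text is returned unchanged.
    assert(sourceText("abc") == "abc");
}
```

Under this reading the end of the source is located before any token is formed, which matches the view in comment #1 that finding the end of the code conceptually happens before tokenization.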
http://d.puremagic.com/issues/show_bug.cgi?id=786

------- Comment #4 from thecybershadow gmail.com 2007-02-02 21:37 -------
In that case, why is "\ EndOfFile" listed as a valid EscapeSequence token?
Feb 02 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

------- Comment #5 from bugzilla digitalmars.com 2007-02-02 23:19 -------
If a \ is the last character in a file, the escape sequence will resolve to the \ character; that's what that is for.
Feb 02 2007
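The behaviour described in comment #5 can be sketched as follows. This is an illustration only, not DMD's lexer: decodeEscape is a hypothetical function, it handles just a handful of escapes, and it assumes the source text has already been cut at the end-of-file position, so "end of file" simply means running out of characters.

```d
/// Hypothetical sketch, not taken from DMD. Decodes the escape that starts
/// at text[pos] (which must be a backslash) and advances `pos` past it.
char decodeEscape(string text, ref size_t pos)
{
    assert(pos < text.length && text[pos] == '\\');
    ++pos; // skip the backslash
    if (pos >= text.length)
        return '\\'; // "\ EndOfFile": nothing follows, so yield the backslash itself
    char c = text[pos++];
    switch (c)
    {
        case 'n':  return '\n';
        case 't':  return '\t';
        case '"':  return '"';
        case '\\': return '\\';
        default:   return c; // a real lexer would report an undefined escape here
    }
}

unittest
{
    size_t p = 0;
    assert(decodeEscape("\\", p) == '\\' && p == 1);  // backslash is the last character
    size_t q = 0;
    assert(decodeEscape("\\n", q) == '\n' && q == 2); // ordinary escape
}
```

Note that even when the escape resolves to a backslash, the enclosing string literal still has no closing quote, so an "unterminated string constant" error like the one in the report would still be expected.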
http://d.puremagic.com/issues/show_bug.cgi?id=786

------- Comment #6 from smjg iname.com 2007-02-03 08:10 -------
But a StringLiteral can never be the last token of a syntactically valid D source file, or can it?
Feb 03 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

------- Comment #7 from bugzilla digitalmars.com 2007-02-03 12:13 -------
Currently, no, it can't, hence the error message about semicolon expected instead of EOF. But the lexer doesn't (and shouldn't) know syntax, it just knows tokens.
Feb 03 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

------- Comment #8 from smjg iname.com 2007-02-04 07:00 -------
Exactly. So really,

    EscapeSequence:
        \ EndOfFile

has no effect except perhaps on what error message the compiler throws. Moreover, UIMS the spec gives no meaning to this EscapeSequence form. Which is probably why we're all asking.
Feb 04 2007