digitalmars.D.bugs - [Issue 786] New: the \ EndOfFile EscapeSequence in double-quoted strings doesn't work
- d-bugmail puremagic.com (47/47) Jan 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (15/15) Jan 03 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (6/6) Jan 06 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (13/13) Feb 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (4/4) Feb 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (5/5) Feb 02 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (5/5) Feb 03 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (6/6) Feb 03 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
- d-bugmail puremagic.com (8/8) Feb 04 2007 http://d.puremagic.com/issues/show_bug.cgi?id=786
http://d.puremagic.com/issues/show_bug.cgi?id=786

           Summary: the \ EndOfFile EscapeSequence in double-quoted strings doesn't work
           Product: D
           Version: 0.178
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Keywords: rejects-valid, spec
          Severity: normal
          Priority: P3
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: thecybershadow gmail.com

Spec non-conformance, I believe.

Spec: http://www.digitalmars.com/d/lex.html#StringLiteral

Program:

void main()
{
    char[] eof_literal = "\"; // the character after the backslash is \u001A, as per the specs
}

Compiler output:

C:\...>dmd lexical.d
lexical.d(3): unterminated string constant starting at lexical.d(3)
lexical.d(3): semicolon expected, not 'EOF'
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement
lexical.d(3): found 'EOF' instead of statement

(that's 19 repeating lines)

--
Jan 02 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com

"End of File

EndOfFile:
    physical end of the file
    \u0000
    \u001A
"

AIUI, locating the end of the code conceptually happens before tokenization. But indeed, the spec isn't crystal clear on this.

--
Jan 03 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

Intermingling EOF detection with tokenisation would require quite a few changes within DMD, and it makes no sense to me, as it would allow reading past the physical end of the file.

--
Jan 06 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786

bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

0x1A is listed in lex.html as 'end of file', which trumps any token. I think the spec is reasonably clear on this: "The source text is terminated by whichever comes first." The reason for this is that some (old) text editors put out a 0x1A to mark end of file. Not a bug.

--
Feb 02 2007
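To make the resolution above concrete, here is a minimal sketch in D of "the source text is terminated by whichever comes first". It is not DMD's lexer code: the function name truncateAtEndOfFile is hypothetical, and the sketch assumes the end of the source is located on the raw text before any tokens are formed, as the earlier replies describe.

```d
/// Hypothetical helper, not taken from DMD: returns the part of `source`
/// that precedes the first end-of-file marker (\u0000 or \u001A), or all
/// of `source` if it contains neither.
const(char)[] truncateAtEndOfFile(const(char)[] source)
{
    foreach (i, c; source)
    {
        // Either marker ends the source text, whatever token it sits in.
        if (c == '\0' || c == '\x1A')
            return source[0 .. i];
    }
    return source; // the physical end of the file comes first
}

unittest
{
    // Text without a marker is left untouched.
    assert(truncateAtEndOfFile("int x;") == "int x;");

    // The statement from the report: everything from 0x1A onwards is cut,
    // so the string literal never sees its closing quote.
    assert(truncateAtEndOfFile("char[] s = \"\\\x1A\";") == "char[] s = \"\\");
}
```

Under that assumption, the text seen by the tokenizer ends right after the backslash, which matches the "unterminated string constant" error in the original report. The unittest block can be run with dmd -unittest -main -run on the file.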
http://d.puremagic.com/issues/show_bug.cgi?id=786 In that case, why is "\ EndOfFile" listed as a valid EscapeSequence token? --
Feb 02 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786 If a \ is the last character in a file, the escape sequence will resolve to the \ character, that's what that is for. --
Feb 02 2007
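As an illustration of the behaviour described in the reply above, the following D sketch decodes one escape sequence and yields the backslash itself when nothing but the end of the source follows it. The function decodeEscape is hypothetical and handles only a few escapes; how DMD actually implements this is an assumption, not something the thread shows.

```d
/// Hypothetical helper, not taken from DMD: decodes the escape sequence
/// that starts at source[pos], which must be a backslash, and advances
/// `pos` past the characters it consumed.
char decodeEscape(const(char)[] source, ref size_t pos)
{
    assert(pos < source.length && source[pos] == '\\');
    ++pos; // skip the backslash

    // "\ EndOfFile": the backslash is the last character of the source
    // text, either physically or because an end-of-file marker follows.
    if (pos == source.length || source[pos] == '\0' || source[pos] == '\x1A')
        return '\\';

    immutable c = source[pos++];
    switch (c)
    {
        case 'n':  return '\n';
        case 't':  return '\t';
        case '\\': return '\\';
        case '"':  return '"';
        default:   return c; // simplification: other escapes are not decoded
    }
}

unittest
{
    size_t pos = 2;
    assert(decodeEscape("ab\\", pos) == '\\'); // backslash at end of input
    assert(pos == 3);

    pos = 0;
    assert(decodeEscape("\\n", pos) == '\n');  // an ordinary escape
    assert(pos == 2);
}
```

Even in this sketch the string literal is still unterminated once the escape has been decoded, so the rule can only change which error is reported, not whether the program is accepted.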
http://d.puremagic.com/issues/show_bug.cgi?id=786 But a StringLiteral can never be the last token of a syntactically valid D source file, or can it? --
Feb 03 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786 Currently, no, it can't, hence the error message about semicolon expected instead of EOF. But the lexer doesn't (and shouldn't) know syntax, it just knows tokens. --
Feb 03 2007
http://d.puremagic.com/issues/show_bug.cgi?id=786 Exactly. So really, EscapeSequence: \ EndOfFile has no effect except perhaps on what error message the compiler throws. Moreover, UIMS the spec gives no meaning to this EscapeSequence form. Which is probably why we're all asking. --
Feb 04 2007