digitalmars.D.bugs - [Issue 4389] New: CTFE: Appending dchar to empty string creates invalid string.
- d-bugmail puremagic.com (42/42) Jun 25 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (63/63) Nov 23 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (6/6) Nov 23 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (22/22) Nov 23 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (10/10) Dec 27 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (11/11) Dec 27 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (60/60) Jan 20 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4389
- d-bugmail puremagic.com (12/12) Feb 06 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4389
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Summary: CTFE: Appending dchar to empty string creates invalid string. Product: D Version: D2 Platform: All OS/Version: Windows Status: NEW Severity: critical Priority: P2 Component: DMD AssignedTo: nobody puremagic.com ReportedBy: yebblies gmail.com With dmd2.047 the following code will create a string which is actually invalid utf-32. The problem goes away if you start with a non-empty string. This can occur easily when appending is used in conjunction with the new string front() and back() functions that return dchar. test case: import std.stdio; string get() { string s; dchar c = 'D'; s ~= c; s ~= c; return s; } void main() { enum s = get(); // Call 'get' with ctfe writeln('"', s, '"'); // outputs "D " auto p = (s.ptr); // take pointer to defeat bounds checking auto s2 = p[0..8]; writeln('"', s2, '"'); // outputs "D D " auto s3 = get(); // At compile time, this works as expected writeln('"', s3, '"'); // outputs "DD" } -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jun 25 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Don <clugdbug yahoo.com.au> changed: What |Removed |Added ---------------------------------------------------------------------------- Keywords| |ice-on-valid-code Summary|CTFE: Appending dchar to |ICE(constfold.c, |empty string creates |expression.c), or wrong |invalid string. |code: string~=dchar in CTFE Actually constant folding of string ~= dchar is ALWAYS wrong, even if the string is non-empty. All these tests fail. It can also cause an ICE(constfold.c) if you attempt a further concatenation, or ICE(expression.c) if it is mixed in (bug 3490). It happens because string ~= dchar is treated specially in the backend, but not in constfold.c. --------- int bug4389() { string s; dchar c = '\u2348'; s ~= c; assert(s.length==3); dchar d = 'D'; s ~= d; assert(s.length==4); s = ""; s ~= c; assert(s.length==3); s ~= d; assert(s.length==4); string z; wchar w = '\u0300'; z ~= w; assert(z.length==2); z = ""; z ~= w; assert(z.length==2); return 1; } static assert(bug4389()); // ICE(constfold.c) int ice4389() { string s; dchar c = '\u2348'; s ~= c; s = s ~ "xxx"; return 1; } static assert(ice4389()); // ICE(expression.c) string ice4390() { string s; dchar c = '`'; s ~= c; s ~= c; return s; } static assert(mixin(ice4390()) == ``); -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 23 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4389 *** Issue 4390 has been marked as a duplicate of this issue. *** -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 23 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Created an attachment (id=827) Patch against D2 svn 767 Although the patch is relatively long, it is very mundane. The patch adds these functions to utf.c (they are copied from druntime). These convert single chars from UTF32 to UTF8 or UTF16. void utf_encodeChar(unsigned char *s, dchar_t c); void utf_encodeWchar(unsigned short *s, dchar_t c); int utf_codeLengthChar(dchar_t c); int utf_codeLengthWchar(dchar_t c); Two other wrapper functions provide the interface that is actually used: int utf_codeLength(int sz, dchar_t c); void utf_encode(int sz, void *s, dchar_t c); Then, these functions are used in Cat() in constfold.c Instead of assuming a len of 1, it uses utf_codeLength; and instead of a memcpy, it uses utf_encode. The cases which are patched in Cat() are: TOKnull ~ dchar, dchar ~ TOKnull, TOKstring ~ dchar. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 23 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Walter Bright <bugzilla digitalmars.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |bugzilla digitalmars.com 18:54:58 PST --- The patch should work in D1, but does not. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Dec 27 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Walter Bright <bugzilla digitalmars.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Version|D2 |D1 & D2 19:27:54 PST --- http://www.dsource.org/projects/dmd/changeset/825 Leaving bug open until D1 is fixed. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Dec 27 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Fails on D1 because dchar, wchar implicitly convert to char, even if they don't fit. PATCH: expression.c, CatAssignExp::semantic(), line 8574. The two 'else' clauses need to be swapped, so that we check for an array~char before we check for implicit conversion. { // Append array e2 = e2->castTo(sc, e1->type); type = e1->type; e = this; } ----------- SWAP THIS SECTION WITH THE NEXT ONE else if ((tb1->ty == Tarray) && e2->implicitConvTo(tb1next) ) { // Append element e2 = e2->castTo(sc, tb1next); type = e1->type; e = this; } ------- else if (tb1->ty == Tarray && (tb1next->ty == Tchar || tb1next->ty == Twchar) && e2->implicitConvTo(Type::tdchar) ) { // Append dchar to char[] or wchar[] e2 = e2->castTo(sc, Type::tdchar); type = e1->type; e = this; /* Do not allow appending wchar to char[] because if wchar happens * to be a surrogate pair, nothing good can result. */ } ----------- else { if (tb1 != Type::terror && tb2 != Type::terror) error("cannot append type %s to type %s", tb2->toChars(), tb1->toChars()); e = new ErrorExp(); } ALTERNATE PATCH: Incidentally, if dchar->char is disallowed, all DMD tests still pass. But this would probably cause existing valid code to break, so probably the first solution is better. mtype.c, line 1570, MATCH TypeBasic::implicitConvTo(Type *to) + if ((ty == Tdchar || ty == Twchar) && to->ty == Tchar) + return MATCHnomatch; + if (ty == Tdchar && to->ty == Twchar) + return MATCHnomatch; } else if (flags & TFLAGSfloating) { // Disallow implicit conversion of floating point to integer if (tob->flags & TFLAGSintegral) return MATCHnomatch; -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jan 20 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4389 Don <clugdbug yahoo.com.au> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED D1 fixed as well: https://github.com/D-Programming-Language/dmd/commit/2b51411e9e8b7ca3e62747ffb6f96c94089cebcc -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 06 2011