www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Encoding problems with dsss.exe and implib.exe

reply Vitaly Kulich <vit_klich list.ru> writes:
For some months dsss.exe and implib.exe worked correctly on my machine. Now
one of other programs has updated registry  settings, I guess, and as a result
when using dsss I got
'4invalid UTF-8 sequence',
 when using implib with the arguments string
"implib/system A.lib A.dll "
there is the output:
" IMPLIB: error IM4601: unrecognized option '/system'; option ignored ";
though, when using dmd itself everything is strangely  OK.
The question is simple:
how to restore now all the previous (good) settings on Windows XP Professional
(32 bits)? There might be an inner format changed,
since even text converted with Notepad++ into UTF-8
seems to be not good...
Jan 24 2011
next sibling parent Vitaly Kulich <vit_klich list.ru> writes:
I have to add the following to my post:

In Phobos for dmd version 1 there is only one
obvious source of this exception. Namely, it is
function 'decode' in the std.utf module.
Here is its listing:

/***************
 * Decodes and returns character starting at s[idx]. idx is advanced past the
 * decoded character. If the character is not well formed, a UtfException is
 * thrown and idx remains unchanged.
 */

dchar decode(char[] s, inout size_t idx)
    in
    {
	assert(idx >= 0 && idx < s.length);
    }
    out (result)
    {
	assert(isValidDchar(result));
    }
    body
    {
	size_t len = s.length;
	dchar V;
	size_t i = idx;
	char u = s[i];

	if (u & 0x80)
	{   uint n;
	    char u2;

	    /* The following encodings are valid, except for the 5 and 6 byte
	     * combinations:
	     *	0xxxxxxx
	     *	110xxxxx 10xxxxxx
	     *	1110xxxx 10xxxxxx 10xxxxxx
	     *	11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
	     *	111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
	     *	1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
	     */
	    for (n = 1; ; n++)
	    {
		if (n > 4)
		    goto Lerr;		// only do the first 4 of 6 encodings
		if (((u << n) & 0x80) == 0)
		{
		    if (n == 1)
			goto Lerr;
		    break;
		}
	    }

	    // Pick off (7 - n) significant bits of B from first byte of octet
	    V = cast(dchar)(u & ((1 << (7 - n)) - 1));

	    if (i + (n - 1) >= len)
		goto Lerr;			// off end of string

	    /* The following combinations are overlong, and illegal:
	     *	1100000x (10xxxxxx)
	     *	11100000 100xxxxx (10xxxxxx)
	     *	11110000 1000xxxx (10xxxxxx 10xxxxxx)
	     *	11111000 10000xxx (10xxxxxx 10xxxxxx 10xxxxxx)
	     *	11111100 100000xx (10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx)
	     */
	    u2 = s[i + 1];
	    if ((u & 0xFE) == 0xC0 ||
		(u == 0xE0 && (u2 & 0xE0) == 0x80) ||
		(u == 0xF0 && (u2 & 0xF0) == 0x80) ||
		(u == 0xF8 && (u2 & 0xF8) == 0x80) ||
		(u == 0xFC && (u2 & 0xFC) == 0x80))
		goto Lerr;			// overlong combination

	    for (uint j = 1; j != n; j++)
	    {
		u = s[i + j];
		if ((u & 0xC0) != 0x80)
		    goto Lerr;			// trailing bytes are 10xxxxxx
		V = (V << 6) | (u & 0x3F);
	    }
	    if (!isValidDchar(V))
		goto Lerr;
	    i += n;
	}
	else
	{
	    V = cast(dchar) u;
	    i++;
	}

	idx = i;
	return V;

      Lerr:
	//printf("\ndecode: idx = %d, i = %d, length = %d s = \n'%.*s'\n%x\n'%.*s'\n",
idx, i, s.length, s, s[i], s[i .. length]);
	throw new UtfException("4invalid UTF-8 sequence", i);
    }

In no other place was found text "4invalid UTF-8 sequence",
therefore, this function needs a revision.
So, I myself answered to the question that concerns the dsss behavior,
as dsss is written in D. But the strange behaviour of implib still undefined.
Jan 24 2011
prev sibling parent Kagamin <spam here.lot> writes:
Vitaly Kulich Wrote:

 For some months dsss.exe and implib.exe worked correctly on my machine. Now
 one of other programs has updated registry  settings, I guess
Photoshop encoding fix?
Jan 24 2011