
digitalmars.D - MBCS character code support

reply "Katayama Hirofumi MZ" <katayama.hirofumi.mz gmail.com> writes:
Hello, everyone!

I would like support for multibyte character strings (MBCS) in encodings other
than Unicode (UTF-8 and UTF-16).

Could you make the D compiler emit every string literal in Shift_JIS or EUC-JP
when a specific command-line option is given?

Shift_JIS and EUC-JP are Japanese character encodings.
May 15 2012
next sibling parent "Katayama Hirofumi MZ" <katayama.hirofumi.mz gmail.com> writes:
You can convert UTF-8 to Shift_JIS with the following code.


/* Linux, FreeBSD or UNIX */
#include <iconv.h>
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen */

iconv_t g_icUTF8toSJIS;

char *convert_utf8_to_sjis(char *in)
{
     char *out, *p_in, *p_out;
     size_t in_size, out_size;

     in_size = strlen(in);
     out_size = in_size;
     out = (char *)malloc(out_size + 1);
     if (out == NULL)
         return NULL;

     p_in = in;
     p_out = out;
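     /* iconv returns (size_t)-1 on invalid or unconvertible input */
     if (iconv(g_icUTF8toSJIS, &p_in, &in_size, &p_out, &out_size) == (size_t)-1) {
         free(out);
         return NULL;
     }
     *p_out = 0;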

     return out;
}


int main(void)
{
     char *out;
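     /* iconv_open takes (tocode, fromcode): convert to SJIS from UTF-8 */
     g_icUTF8toSJIS = iconv_open("SJIS", "UTF-8");
     if (g_icUTF8toSJIS == (iconv_t)-1) {
         return 1;  /* this conversion is not available */
     }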
     ...
     out = convert_utf8_to_sjis(...);
     ...
     free(out);
     ...

     iconv_close(g_icUTF8toSJIS);
     return 0;
}

/* Windows */
#include <windows.h>
#include <stdlib.h>   /* malloc, free */

char *UTF8toSJIS(char *utf8)
{
     WCHAR *wide;   /* UTF-16 intermediate buffer */
     char *sjis;
     int size;

     size = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, 0, 0);
     wide = (WCHAR *)malloc((size + 1) * sizeof(WCHAR));
     if (wide == NULL)
         return NULL;
     MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, size);

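     /* CP_ACP is the system ANSI code page; it is Shift_JIS (code page 932)
        only on Japanese Windows. */
     size = WideCharToMultiByte(CP_ACP, 0, wide, -1, 0, 0, 0, 0);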
     sjis = malloc(size * 2 + 1);
     if (sjis == NULL) {
         free(wide);
         return NULL;
     }

     WideCharToMultiByte(CP_ACP, 0, wide, -1, sjis, size, 0, 0);
     free(wide);

     return sjis;
}

int main(void)
{
     char *out;
     ...
     out = UTF8toSJIS(...);
     ...
     free(out);
     ...
     return 0;
}
May 15 2012
prev sibling next sibling parent reply "Katayama Hirofumi MZ" <katayama.hirofumi.mz gmail.com> writes:
All Japanese users, and other Asian users as well, want native MBCS support.
Please let the D compiler generate Shift_JIS for string literals.
May 15 2012
parent reply Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 16-05-2012 06:04, Katayama Hirofumi MZ wrote:
 All Japanese users, and other Asian users as well, want native MBCS support.
 Please let the D compiler generate Shift_JIS for string literals.

I really do not understand why you want to use Shift-JIS. Unicode has long superseded all these magical encodings used all over the world. Why oppose a unified encoding?

-- 
- Alex
May 15 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 06:18, Katayama Hirofumi MZ wrote:
 On Wednesday, 16 May 2012 at 04:12:04 UTC, Alex Rønne Petersen wrote:
 I really do not understand why you want to use Shift-JIS. Unicode has
 long superseded all these magical encodings used all over the world.
 Why oppose a unified encoding?

 Windows 9x has no Unicode support; it uses the native MBCS encoding instead. So if a D program on Windows can use only UTF-8, the programmer has to make the program convert its strings to Shift_JIS.

D does not support Windows versions older than Windows 2000.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
May 15 2012
parent Denis Shelomovskij <verylonglogin.reg gmail.com> writes:
16.05.2012 8:26, Alex Rønne Petersen wrote:
 On 16-05-2012 06:18, Katayama Hirofumi MZ wrote:
 On Wednesday, 16 May 2012 at 04:12:04 UTC, Alex Rønne Petersen wrote:
 I really do not understand why you want to use Shift-JIS. Unicode has
 long superseded all these magical encodings used all over the world.
 Why oppose a unified encoding?

 Windows 9x has no Unicode support; it uses the native MBCS encoding instead. So if a D program on Windows can use only UTF-8, the programmer has to make the program convert its strings to Shift_JIS.

 D does not support Windows versions older than Windows 2000.

D2 has not supported Windows 2000 for a long time:
http://d.puremagic.com/issues/show_bug.cgi?id=6024
https://github.com/D-Programming-Language/druntime/pull/212

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
May 19 2012
prev sibling next sibling parent "Katayama Hirofumi MZ" <katayama.hirofumi.mz gmail.com> writes:
On Wednesday, 16 May 2012 at 04:12:04 UTC, Alex Rønne Petersen 
wrote:
 I really do not understand why you want to use Shift-JIS. 
 Unicode has long superseded all these magical encodings used 
 all over the world. Why oppose a unified encoding?

Windows 9x has no Unicode support; it uses the native MBCS encoding instead. So if a D program on Windows can use only UTF-8, the programmer has to make the program convert its strings to Shift_JIS.
May 15 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/15/2012 7:12 PM, Katayama Hirofumi MZ wrote:
 Hello, everyone!
 
 I would like support for multibyte character strings (MBCS) in encodings other
 than Unicode (UTF-8 and UTF-16).
 
 Could you make the D compiler emit every string literal in Shift_JIS or EUC-JP
 when a specific command-line option is given?
 
 Shift_JIS and EUC-JP are Japanese character encodings.

I'm familiar with Shift-JIS from the C compiler days. D is designed to internally be all UTF-8, and the runtime code all assumes UTF-8. I recommend programming in such a way that user input is converted from Shift-JIS to UTF-8, then all processing is done in terms of UTF-8, then the output is converted to Shift-JIS.
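
The convert-at-the-boundaries approach described above could look roughly like the following. This is only a minimal sketch in C using POSIX iconv, to match the earlier snippet in this thread. The helper names `convert` and `utf8_length` are hypothetical, and the sketch assumes the platform's iconv recognizes the encoding name "SHIFT_JIS".

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated string between two
   encodings. Returns a malloc'd NUL-terminated string, or NULL on failure. */
static char *convert(const char *to, const char *from, const char *in)
{
    assert(in != NULL);

    iconv_t cd = iconv_open(to, from);     /* argument order: (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(in);
    size_t out_left = in_left * 4;         /* assumed generous bound for SJIS <-> UTF-8 */
    char *out = malloc(out_left + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *p_in = (char *)in;               /* iconv wants char **, but only reads the input */
    char *p_out = out;
    size_t rc = iconv(cd, &p_in, &in_left, &p_out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {                /* invalid or unconvertible input */
        free(out);
        return NULL;
    }
    *p_out = '\0';
    return out;
}

/* Example of processing done purely in UTF-8: count code points by
   skipping continuation bytes (bit pattern 10xxxxxx). */
static size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}
```

A program would call `convert("UTF-8", "SHIFT_JIS", input)` as soon as it reads input, work only on the UTF-8 result (as `utf8_length` does), and call `convert("SHIFT_JIS", "UTF-8", result)` just before writing output, freeing each returned buffer.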
May 16 2012
next sibling parent reply "Katayama Hirofumi MZ" <katayama.hirofumi.mz gmail.com> writes:
Can D convert strings at compile time?
May 20 2012
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/20/2012 10:14 PM, Katayama Hirofumi MZ wrote:
 Can D convert strings at compile time?

Yes, you can write a CTFE function to do it.
May 20 2012
prev sibling parent "Katayama Hirofumi MZ" <katayama.hirofumi.mz gmail.com> writes:
I made a source code converter for Shift_JIS.
That settles it.

http://ime.nu/katahiromz.web.fc2.com/d/mbconvd.html
May 23 2012
prev sibling parent "Kagamin" <spam here.lot> writes:
On Wednesday, 16 May 2012 at 04:19:00 UTC, Katayama Hirofumi MZ 
wrote:
 On Windows 9x, there is no Unicode support.

http://msdn.microsoft.com/en-us/goglobal/bb688166
May 16 2012