digitalmars.D.learn - how to string → uint* ?

reply "xky" <mozirikan gmail.com> writes:
hello. :-)
when i was using DerelictSFML2( 
http://code.dlang.org/packages/derelict-sfml2 ), i got this 
problem.

CSFML doc had 'setUnicodeString':
--------------------------------------------------------------------------------
CSFML_GRAPHICS_API void sfText_setUnicodeString  ( sfText *  text,
   const sfUint32 *  string
  )
--------------------------------------------------------------------------------
*'sfUint32' is the same as 'unsigned int'.


how do i convert string → uint*? i just tried this, but it does not work.
--------------------------------------------------------------------------------
string test = "안녕, こんにちは";
string* test_p = &test;
sfUint32* uintObject = cast(sfUint32*)test_p;
sfText_setUnicodeString( ****, uintObject );
--------------------------------------------------------------------------------

thanks, :)
Jun 27 2015
next sibling parent reply "anonymous" <anonymous example.com> writes:
On Sunday, 28 June 2015 at 01:57:46 UTC, xky wrote:
 [...]
Don't try casting just because you guess it could maybe work.

The best documentation for setUnicodeString I could find is this:
https://github.com/SFML/CSFML/blob/master/include/SFML/Graphics/Text.h#L243
which is pretty bad. It doesn't say if "unicode" is UTF8/16/32. `sfUint32` hints at UTF32, so I'll go with that.

A D `string` is in UTF8, so you'll have to convert it to a `dstring` which is in UTF32 (UTF16 would be `wstring`). You can use std.conv.to for that:
----
import std.conv : to;

string test = "안녕, こんにちは";
dstring test32 = test.to!dstring;
----

Alternatively, you can simply make `test` a dstring from the start:
----
dstring test = "안녕, こんにちは";
----

The documentation also doesn't say if the string has to be null-terminated. Since the function doesn't take a length, I'm assuming that it has to be. D strings (of all varieties) are generally not null-terminated. String literals are, though. So if you're passing a hard-coded string, you're fine. But if the string is user input or otherwise generated at run time, you have to add a null character at the end. There's std.string.toStringz for that, but I'm afraid it's for UTF8 only. So you'd have to go into the details yourself. I'm continuing with `test` as above, which is null-terminated, because it's from a literal.

Alright, the data is properly set up (hopefully). Now we need to get a `const sfUint32*` out of the `dstring`. I'm assuming `sfUint32` is just an alias for `uint`. So we need a `const uint*`.

By convention, when a function takes a pointer and says it's a string, the pointer points to the first character of the string. `dstring` is an alias for `immutable(dchar)[]`, i.e. a dynamic array of `immutable dchar`s. Dynamic arrays have the `.ptr` property which is a pointer to the first element; exactly what we need.

`test.ptr` is an `immutable(dchar)*` though, not a `const uint*`. `dchar` and `uint` have the same size. And any bit-pattern is valid for a `uint`, so `dchar`s can be reinterpreted as `uint`s without problem. There's std.string.representation which does just that. Combining `representation` with `.ptr` you get an `immutable(uint)*` which implicitly converts to `const uint*`:
----
import std.string : representation;

dstring test = "안녕, こんにちは";
sfText_setUnicodeString( ****, test.representation.ptr);
----

That's it.

Another gripe about the documentation, though: It also doesn't say if the pointer has to be persistent or not. The string data may be garbage collected once it goes out of scope, so I'm betting on the pointer not having to be persistent, here.
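
For a string that is only known at run time, the steps above (convert to UTF32, add the terminator yourself, reinterpret as `uint`s, take `.ptr`) could be sketched roughly as follows. This is only an illustration under the same assumptions as the post (UTF32, null-terminated, pointer not retained). `toUtf32z` and `countUntilNull` are hypothetical helpers, not part of CSFML or Derelict; `countUntilNull` merely stands in for the C function so the sketch runs without the library.

```d
import std.conv : to;
import std.string : representation;

// Hypothetical stand-in for a C function that reads a null-terminated
// UTF-32 string; it counts the code units before the terminating zero.
size_t countUntilNull(const(uint)* p)
{
    size_t n = 0;
    while (p[n] != 0)
        ++n;
    return n;
}

// Hypothetical helper: convert a run-time UTF-8 string to UTF-32 and
// append an explicit terminator, since only literals carry one implicitly.
immutable(uint)[] toUtf32z(string s)
{
    dstring d = s.to!dstring ~ "\0"d;
    return d.representation; // reinterpret the dchars as uints
}

void main()
{
    string test = "안녕, こんにちは";
    auto buf = toUtf32z(test);  // keep `buf` alive while the pointer is in use
    const(uint)* p = buf.ptr;   // immutable(uint)* converts implicitly
    assert(countUntilNull(p) == 9); // 2 + 2 + 5 code points
}
```

As far as I know, std.utf.toUTFz (e.g. `toUTFz!(const(dchar)*)(test)`) covers the same ground for non-UTF8 targets, but check the Phobos documentation for your compiler version before relying on it.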
Jun 28 2015
parent "xky" <mozirikan gmail.com> writes:
On Sunday, 28 June 2015 at 10:00:37 UTC, anonymous wrote:
 On Sunday, 28 June 2015 at 01:57:46 UTC, xky wrote:
 [...]
 Don't try casting just because you guess it could maybe work. The best documentation for setUnicodeString I could find is this: https://github.com/SFML/CSFML/blob/master/include/SFML/Graphics/Text.h#L243 which is pretty bad. [...]
Thank you everybody for answering me! ^_^
Jun 28 2015
prev sibling parent reply "Marc Schütz" <schuetzm gmx.net> writes:
In addition to what anonymous said, you might want to raise a bug 
report with Derelict, because the function signatures are 
arguably wrong, though that depends on whether Derelict wants to 
provide a strict one-to-one mapping of the C code, or one that is 
already somewhat adapted to D:

https://github.com/DerelictOrg/DerelictSFML2/blob/master/source/derelict/sfml2/graphics.d#L521-L522

alias da_sfText_setString = void function( sfText*,const( char )* );

The documentation says that this is for ANSI strings, but `char` 
in D is defined to be a *UTF8* code unit. Instead, the type 
should be `const(ubyte)*`.

alias da_sfText_setUnicodeString = void function( sfText*,const( sfUint32 )* );

Probably better to use `const(dchar)*` here.
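
A rough sketch of what the `const(dchar)*` variant would mean for callers: the `.ptr` of a `dstring` could be passed directly, without going through `representation`. `lengthZ` below is a hypothetical stand-in for a function declared with that parameter type; it is not the Derelict binding. The sketch also relies on the literal being followed by a terminating zero, as stated earlier in the thread.

```d
// Hypothetical stand-in for a binding declared with const(dchar)*;
// it counts code points up to the terminating zero.
size_t lengthZ(const(dchar)* p)
{
    size_t n = 0;
    while (p[n] != 0)
        ++n;
    return n;
}

void main()
{
    dstring test = "안녕, こんにちは"; // literal, so a zero follows the data
    // immutable(dchar)* converts implicitly to const(dchar)*.
    assert(lengthZ(test.ptr) == test.length);
}
```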
Jun 28 2015
parent reply Mike Parker <aldacron gmail.com> writes:
On 6/28/2015 7:08 PM, "Marc Schütz" <schuetzm gmx.net> wrote:
 In addition to what anonymous said, you might want to raise a bug report
 with Derelict, because the function signatures are arguable wrong,
 though that depends on whether Derelict wants to provide a strict
 one-to-one mapping of the C code, or one that is already somewhat
 adapted to D:

 https://github.com/DerelictOrg/DerelictSFML2/blob/master/source/derelict/sfml2/graphics.d#L521-L522


 alias da_sfText_setString = void function( sfText*,const( char )* );

 The documentation says that this is for ANSI strings, but `char` in D is
 defined to be a *UTF8* code unit. Instead, the type should be
 `const(ubyte)*`.
I've been mapping D char to C char in Derelict packages for 11 years. It's also what's recommended on the page about interfacing to C[1]. Although I do understand your point, I'm curious if anyone is actually taking the ubyte approach these days? Or has anyone actually encountered a problem with the char->char mapping?
 alias da_sfText_setUnicodeString = void function( sfText*,const( sfUint32 )* );

 Probably better to use `const(dchar)*` here.
Agreed. That's what I've been doing in recent additions.

[1] http://dlang.org/interfaceToC.html
Jun 28 2015
parent "Marc Schütz" <schuetzm gmx.net> writes:
On Sunday, 28 June 2015 at 10:29:48 UTC, Mike Parker wrote:
 [...]
 I've been mapping D char to C char in Derelict packages for 11 years. It's also what's recommended on the page about interfacing to C[1]. Although I do understand your point, I'm curious if anyone is actually taking the ubyte approach these days? Or has anyone actually encountered a problem with the char->char mapping?
Invalid UTF8 in strings currently throws on decoding, but that is being changed to asserts, because invalid UTF8 is against the specification and therefore (the consequence of) a bug somewhere in the program. One reason for the change is to make decoding `@nogc`; the other is simply correctness: a variable of type `char[]` should be a guarantee that it contains valid UTF8. I don't know of any actual problems, but if they happen, they will soon be Errors instead of Exceptions.
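
To illustrate the concern, here is a small sketch of how ANSI (here assumed to be Latin-1) bytes behave once they are viewed as `char`s. It relies on std.utf.validate throwing a `UTFException` for malformed input, which is the behaviour I know from current Phobos; `isValidUtf8` is a hypothetical helper written for this example only.

```d
import std.utf : validate, UTFException;

// Hypothetical helper: view raw bytes as UTF-8 code units and report
// whether they form a valid UTF-8 sequence.
bool isValidUtf8(const(ubyte)[] bytes)
{
    auto s = cast(const(char)[]) bytes; // reinterpretation, no copy
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // "café" in Latin-1: 0xE9 is 'é' there, but in UTF-8 it starts a
    // multi-byte sequence that is cut off by the end of the data.
    const(ubyte)[] latin1 = [0x63, 0x61, 0x66, 0xE9];
    const(ubyte)[] ascii = [0x63, 0x61, 0x66];

    assert(!isValidUtf8(latin1));
    assert(isValidUtf8(ascii));
}
```

Kept as `ubyte`s, the same data makes no promise about its encoding, which is the argument for `const(ubyte)*` in the ANSI signature.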
Jun 28 2015