
digitalmars.D.learn - Write UTF-8 bytes directly to stack buffer

reply Chris Piker <chris hoopjump.com> writes:
Hi D

There are quite a few string, array and range functions in phobos 
so I'm getting confused as to the right way to encode string data 
as UTF-8 directly into a stack buffer while keeping track of the 
write point.

I have some output packets I'm building up in a tight loop.  For 
speed I'm using the a priori knowledge that output packets will 
never be larger than 64K.   So what's the best way to do this:

```d
ubyte[65536] buf;
ubyte[] usable_buf = buf;

// part of some tight loop, how to create function writef_utf8 ?
foreach (input_thing; things) {
    usable_buf.writef_utf8!"format str"(input_thing.fieldA, input_thing.fieldB);
}

size_t used = buf.length - usable_buf.length;
stdout.write(buf[0 .. used]);
```
Mar 10 2022
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Mar 10, 2022 at 05:39:34PM +0000, Chris Piker via Digitalmars-d-learn
wrote:
 Hi D
 
 There are quite a few string, array and range functions in phobos so
 I'm getting confused as to the right way to encode string data as
 UTF-8 directly into a stack buffer while keeping track of the write
 point.
 
 I have some output packets I'm building up in a tight loop.  For speed
 I'm using the a priori knowledge that output packets will never be
 larger than 64K.   So what's the best way to do this:
 
 ```d
 ubyte[65536] buf;
 ubyte[] usable_buf = buf;
 
   // part of some tight loop, how to create function writef_utf8 ?
   foreach(input_thing; things){
      usable_buf.writef_utf8!"format str"(input_thing.fieldA,
input_thing.fieldB);
   }
 
 size_t used = buf.length - usable_buf.length;
 stdout.write(buf[0.. used]);
 
 ```
Probably what you're looking for is std.format.formattedWrite. For example:

```d
import std;
void main() {
	ubyte[65536] buf;
	char[] usable_buf = cast(char[]) buf[];
	usable_buf.formattedWrite!"Blah %d blah %s"(123, "Это UTF-8 строка.");
	auto used = buf.length - usable_buf.length;
	writefln("%(%02X %)", buf[0 .. used]);
}
```

D strings are UTF-8 by default, so for the most part, you don't need to worry about it.

T

-- 
Guns don't kill people. Bullets do.
Mar 10 2022
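
Applied to the loop from the original question, the `formattedWrite` suggestion above might look like the following sketch. The `Thing` struct, the `fillPacket` helper and the `"%d:%s\n"` format string are hypothetical stand-ins for the poster's `input_thing` and `"format str"`; they are not taken from the thread. It assumes a Phobos recent enough to put strings into a `char[]` slice, and it does no overflow handling of its own.

```d
import std.format : formattedWrite;
import std.stdio : stdout;

// Hypothetical record type standing in for the question's `input_thing`.
struct Thing
{
    int fieldA;
    string fieldB;
}

// Formats each record into `buf` and returns the slice that was filled.
// The format string "%d:%s\n" is only an illustration.
char[] fillPacket(char[] buf, Thing[] things)
{
    char[] remaining = buf; // its start is the current write point
    foreach (t; things)
        remaining.formattedWrite!"%d:%s\n"(t.fieldA, t.fieldB);
    return buf[0 .. $ - remaining.length];
}

void main()
{
    char[65536] buf; // stack buffer, as in the question
    auto things = [Thing(1, "один"), Thing(2, "two")];
    char[] packet = fillPacket(buf[], things);
    stdout.rawWrite(packet); // emits the UTF-8 bytes unchanged
}
```

`rawWrite` is used for the final output because `stdout.write` on a `ubyte[]` slice would print the array as a formatted list of numbers instead of the raw bytes.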
parent reply Chris Piker <chris hoopjump.com> writes:
On Thursday, 10 March 2022 at 17:59:33 UTC, H. S. Teoh wrote:
 Probably what you're looking for is std.format.formattedWrite. 
 For example:

 ```d
 import std;
 void main() {
 	ubyte[65536] buf;
 	char[] usable_buf = cast(char[]) buf[];
 	usable_buf.formattedWrite!"Blah %d blah %s"(123, "Это UTF-8 строка.");
 	auto used = buf.length - usable_buf.length;
 	writefln("%(%02X %)", buf[0 .. used]);
 }
 ```
Hey thanks! That does work with recent versions of dmd+phobos, but doesn't work in gdc-10. For some reason it produces this error:

```d
error: static assert "Cannot put a const(char)[] into a char[]."
```

Is there a workaround involving `.representation`, as alluded to in this [thread](https://forum.dlang.org/post/zmehmpithifbgfuefchv forum.dlang.org)?

To get around the issue I built gdc-11.2 from source code at the GNU site, but the old version of phobos is still included, so no dice.
Mar 12 2022
parent Brian Callahan <bcallah openbsd.org> writes:
On Sunday, 13 March 2022 at 07:55:01 UTC, Chris Piker wrote:
 Hey thanks!  That does work with recent versions of dmd+phobos, 
 but doesn't work in gdc-10.  For some reason it produces this 
 error:

 ```d
 error: static assert  "Cannot put a const(char)[] into a char[]."
 ```

 Is there a workaround involving `.representation`, as alluded 
 to in this 
 [thread](https://forum.dlang.org/post/zmehmpithifbgfuefchv forum.dlang.org)?

 To get around the issue I built gdc-11.2 from source code at 
 the GNU site but the old version of phobos is still included, 
 so no dice.
Build the latest gdc snapshot: https://mirrors.concertpass.com/gcc/snapshots/12-20220306/

~Brian
Mar 13 2022
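
If rebuilding the compiler is not an option, one possible alternative (not proposed in the thread, and not a `.representation` trick) is `std.format.sformat`, which formats into a caller-supplied `char[]` and returns the slice it filled. The sketch below tracks the write point with a plain offset. `Thing`, `appendRecord` and the format string are hypothetical, and whether this avoids the static assert under the Phobos bundled with gdc-10 is an assumption that has not been tested here.

```d
import std.format : sformat;
import std.stdio : stdout;

// Hypothetical record type standing in for the question's `input_thing`.
struct Thing
{
    int fieldA;
    string fieldB;
}

// Appends one formatted record at offset `used` and returns the new offset.
size_t appendRecord(char[] buf, size_t used, Thing t)
{
    // sformat writes into the slice it is given and returns the part it filled.
    return used + sformat(buf[used .. $], "%d:%s\n", t.fieldA, t.fieldB).length;
}

void main()
{
    char[65536] buf; // stack buffer, as in the question
    size_t used = 0;
    foreach (t; [Thing(1, "один"), Thing(2, "two")])
        used = appendRecord(buf[], used, t);
    stdout.rawWrite(buf[0 .. used]); // emits the UTF-8 bytes unchanged
}
```

Because the offset is an ordinary `size_t`, this version does not rely on the formatting function advancing a slice passed by reference.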